Steven Rostedt [Wed, 10 Nov 2010 11:56:12 +0000 (12:56 +0100)]
tracing: Fix recursive user stack trace
The user stack trace can fault when examining the trace. Which
would call the do_page_fault handler, which would trace again,
which would do the user stack trace, which would fault and call
do_page_fault again ...
Thus this is causing a recursive bug. We need to have a recursion
detector here.
[ Resubmitted by Jiri Olsa ]
[ Eric Dumazet recommended using __this_cpu_* instead of __get_cpu_* ]
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <
1289390172-9730-3-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Steven Rostedt [Thu, 11 Nov 2010 03:29:49 +0000 (22:29 -0500)]
tracing: Force arch_local_irq_* notrace for paravirt
When running ktest.pl randconfig tests, I would sometimes trigger
a lockdep annotation bug (possible reason: unannotated irqs-on).
This triggering happened right after function tracer self test was
executed. After doing a config bisect I found that this was caused with
having function tracer, paravirt guest, prove locking, and rcu torture
all enabled.
The rcu torture just enhanced the likelyhood of triggering the bug.
Prove locking was needed, since it was the thing that was bugging.
Function tracer would trace and disable interrupts in all sorts
of funny places.
paravirt guest would turn arch_local_irq_* into functions that would
be traced.
Besides the fact that tracing arch_local_irq_* is just a bad idea,
this is what is happening.
The bug happened simply in the local_irq_restore() code:
if (raw_irqs_disabled_flags(flags)) { \
raw_local_irq_restore(flags); \
trace_hardirqs_off(); \
} else { \
trace_hardirqs_on(); \
raw_local_irq_restore(flags); \
} \
The raw_local_irq_restore() was defined as arch_local_irq_restore().
Now imagine, we are about to enable interrupts. We go into the else
case and call trace_hardirqs_on() which tells lockdep that we are enabling
interrupts, so it sets the current->hardirqs_enabled = 1.
Then we call raw_local_irq_restore() which calls arch_local_irq_restore()
which gets traced!
Now in the function tracer we disable interrupts with local_irq_save().
This is fine, but flags is stored that we have interrupts disabled.
When the function tracer calls local_irq_restore() it does it, but this
time with flags set as disabled, so we go into the if () path.
This keeps interrupts disabled and calls trace_hardirqs_off() which
sets current->hardirqs_enabled = 0.
When the tracer is finished and proceeds with the original code,
we enable interrupts but leave current->hardirqs_enabled as 0. Which
now breaks lockdeps internal processing.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Steven Rostedt [Thu, 11 Nov 2010 03:19:24 +0000 (22:19 -0500)]
tracing: Fix module use of trace_bprintk()
On use of trace_printk() there's a macro that determines if the format
is static or a variable. If it is static, it defaults to __trace_bprintk()
otherwise it uses __trace_printk().
A while ago, Lai Jiangshan added __trace_bprintk(). In that patch, we
discussed a way to allow modules to use it. The difference between
__trace_bprintk() and __trace_printk() is that for faster processing,
just the format and args are stored in the trace instead of running
it through a sprintf function. In order to do this, the format used
by the __trace_bprintk() had to be persistent.
See commit
1ba28e02a18cbdbea123836f6c98efb09cbf59ec
The problem comes with trace_bprintk() where the module is unloaded.
The pointer left in the buffer is still pointing to the format.
To solve this issue, the formats in the module were copied into kernel
core. If the same format was used, they would use the same copy (to prevent
memory leak). This all worked well until we tried to merge everything.
At the time this was written, Lai Jiangshan, Frederic Weisbecker,
Ingo Molnar and myself were all touching the same code. When this was
merged, we lost the part of it that was in module.c. This kept out the
copying of the formats and unloading the module could cause bad pointers
left in the ring buffer.
This patch adds back (with updates required for current kernel) the
module code that sets up the necessary pointers.
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Peter Zijlstra [Mon, 1 Nov 2010 17:52:05 +0000 (18:52 +0100)]
perf, amd: Use kmalloc_node(,__GFP_ZERO) for northbridge structure allocation
Jasper suggested we use the zeroing capability of the allocators
instead of calling memset ourselves. Add node affinity while we're at
it.
Reported-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephane Eranian [Tue, 26 Oct 2010 14:08:01 +0000 (16:08 +0200)]
perf_events: Fix time tracking in samples
This patch corrects time tracking in samples. Without this patch
both time_enabled and time_running are bogus when user asks for
PERF_SAMPLE_READ.
One uses PERF_SAMPLE_READ to sample the values of other counters
in each sample. Because of multiplexing, it is necessary to know
both time_enabled, time_running to be able to scale counts correctly.
In this second version of the patch, we maintain a shadow
copy of ctx->time which allows us to compute ctx->time without
calling update_context_time() from NMI context. We avoid the
issue that update_context_time() must always be called with
ctx->lock held.
We do not keep shadow copies of the other event timings
because if the lead event is overflowing then it is active
and thus it's been scheduled in via event_sched_in() in
which case neither tstamp_stopped, tstamp_running can be modified.
This timing logic only applies to samples when PERF_SAMPLE_READ
is used.
Note that this patch does not address timing issues related
to sampling inheritance between tasks. This will be addressed
in a future patch.
With this patch, the libpfm4 example task_smpl now reports
correct counts (shown on 2.4GHz Core 2):
$ task_smpl -p
2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears noploop 5
noploop for 5 seconds
IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
2,400,000,254 unhalted_core_cycles:u (33)
2,399,273,744 instructions_retired:u (34)
53,340 baclears (35)
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <
4cc6e14b.
1e07e30a.256e.5190@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tom Zanussi [Wed, 10 Nov 2010 14:20:45 +0000 (08:20 -0600)]
perf trace: update usage
Update usage to reflect the different perf trace variants.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Tom Zanussi [Wed, 10 Nov 2010 14:19:35 +0000 (08:19 -0600)]
perf trace: update Documentation with new perf trace variants
Add documentation describing new 'perf trace' command changes
e.g. <command> handling and live-mode/top variants.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Tom Zanussi [Wed, 10 Nov 2010 14:16:51 +0000 (08:16 -0600)]
perf trace: live-mode command-line cleanup
This patch attempts to make the perf trace command-line for live-mode
commands more user-friendly and consistent with other perf commands.
The main change it makes is to allow <commands> to be run as part of
perf trace live-mode commands, as other perf commands do, instead of
the system-wide traces they're currently hard-coded to by the shell
scripts.
With this patch, the following live-mode trace now works as expected:
$ perf trace rw-by-pid ls -al
The previous system-wide behavior for this command would still be
available by explicitly specifying -a:
$ perf trace rw-by-pid -a ls -al
and if no <command> is specified, the output is also system-wide:
$ perf trace rw-by-pid
Because live-mode requires both record and report steps to be invoked,
it isn't always possible to know which args to send to the report and
which to send to the record steps - mainly this is the case for report
scripts with optional args - in those cases it would be necessary to
use separate 'perf trace record' and 'perf trace report' steps.
For example:
$ perf trace syscall-counts ls
Here we can't decide whether ls should be passed as a param to the
syscall-counts script or whether we should invoke ls as a <command>.
In these cases, we just say that we'll ignore optional script params
and always interpret the extra arguments as a <command>.
If the user instead wants the other interpretation, that can be
accomplished by using separate record and report commands explicitly:
$ perf trace record syscall-counts
$ perf trace report syscall-counts ls
So the rules that this patch implements, which seem to make the most
intuitive sense for live-mode commands:
- for commands with optional args and commands with no args, no args
are sent to the report script, all are sent to the record step
- for 'top' commands i.e. that end with 'top', <commands> can't be
used - all extra args are send to the report script as params
- for commands with required args, the n required args are taken to be
the first n args after the script name and sent to the report
script, and the rest are sent to the record step
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Tom Zanussi [Wed, 10 Nov 2010 14:15:43 +0000 (08:15 -0600)]
perf trace record: handle commands correctly
Because the perf-trace shell scripts hard-coded the use of the
perf-record system-wide param, a perf trace record session was always
system wide, even if it was given a command.
If given a command, perf trace record now only records the events for
the command, as users expect.
If no command is given, or if the '-a' option is used, the recorded
events are system-wide, as before.
root@tropicana:~# perf trace record syscall-counts ls -al
root@tropicana:~# perf trace
ls-23152 [000] 39984.890387: sys_enter: NR 12 (0, 0, 0, 0, 0, 0)
ls-23152 [000] 39984.890404: sys_enter: NR 9 (0, 0, 0, 0, 0, 0)
root@tropicana:~# perf trace record syscall-counts -a ls -al
root@tropicana:~# perf trace
npviewer.bin-22297 [000] 39831.102709: sys_enter: NR 168 (0, 0, 0, 0, 0, 0)
ls-23111 [000] 39831.107679: sys_enter: NR 59 (0, 0, 0, 0, 0, 0)
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Tom Zanussi [Wed, 10 Nov 2010 14:11:30 +0000 (08:11 -0600)]
perf record: make the record options available outside perf record
Other perf commands that invoke perf record, such as perf trace, may
want to reuse the options used by perf record.
This makes them non-static and renames them to avoid clashes with
other 'options' variables.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Tom Zanussi [Wed, 10 Nov 2010 14:08:20 +0000 (08:08 -0600)]
perf trace scripting: remove system-wide param from shell scripts
Including -a unconditionally when recording doesn't allow for the
option of running scripts without it. Future patches will add add it
back if needed at run-time.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Tom Zanussi [Wed, 10 Nov 2010 13:52:32 +0000 (07:52 -0600)]
perf trace scripting: fix some small memory leaks and missing error checks
Free the other two fields of script_desc which somehow got overlooked,
free malloc'ed args in case exec fails, and add missing checks for
failed mallocs.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Corey Ashford [Tue, 9 Nov 2010 02:20:45 +0000 (18:20 -0800)]
perf: Fix usages of profile_cpu in builtin-top.c to use cpu_list
profile_cpu was left over from an earlier implementation that
supported running perf top on a single CPU. profile_cpu was no
longer set by any switch and usages of it resulted in dead code.
Instead, convert the code to use cpu_list, which is set by the
-C <cpu_list> option.
Also improved the printing of nr_cpus and cpu_list by correcting
the plurals.
Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: a.p.zijlstra@chello.nl
Cc: acme@redhat.com
LKML-Reference: <
1289269245-9388-1-git-send-email-cjashfor@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cyrill Gorcunov [Sat, 6 Nov 2010 08:47:24 +0000 (11:47 +0300)]
perf, ui: Eliminate stack-smashing protection compiler complaint
The gcc complains about small auto-var strings being allocated from stack space.
Make them const to avoid this:
| CC util/ui/util.o
| cc1: warnings being treated as errors
| util/ui/util.c: In function ‘ui__dialog_yesno’:
| util/ui/util.c:108: error: not protecting function: no buffer at least 8 bytes long
| make: *** [util/ui/util.o] Error 1
The real bug is in the newtWinChoice() ABI - but that's an
externality we cannot fix here, so we use this workaround.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <
20101106084724.GA5956@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Tue, 9 Nov 2010 18:34:48 +0000 (10:34 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix a memleak in cifs_setattr_nounix()
cifs: make cifs_ioctl handle NULL filp->private_data correctly
Pekka Enberg [Mon, 8 Nov 2010 19:29:07 +0000 (21:29 +0200)]
perf_events: Fix perf_counter_mmap() hook in mprotect()
As pointed out by Linus, commit dab5855 ("perf_counter: Add mmap event hooks to
mprotect()") is fundamentally wrong as mprotect_fixup() can free 'vma' due to
merging. Fix the problem by moving perf_event_mmap() hook to
mprotect_fixup().
Note: there's another successful return path from mprotect_fixup() if old
flags equal to new flags. We don't, however, need to call
perf_event_mmap() there because 'perf' already knows the VMA is
executable.
Reported-by: Dave Jones <davej@redhat.com>
Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Suresh Jayaraman [Tue, 9 Nov 2010 06:57:41 +0000 (12:27 +0530)]
cifs: fix a memleak in cifs_setattr_nounix()
Andrew Hendry reported a kmemleak warning in 2.6.37-rc1 while editing a
text file with gedit over cifs.
unreferenced object 0xffff88022ee08b40 (size 32):
comm "gedit", pid 2524, jiffies
4300160388 (age 2633.655s)
hex dump (first 32 bytes):
5c 2e 67 6f 75 74 70 75 74 73 74 72 65 61 6d 2d \.goutputstream-
35 42 41 53 4c 56 00 de 09 00 00 00 2c 26 78 ee 5BASLV......,&x.
backtrace:
[<
ffffffff81504a4d>] kmemleak_alloc+0x2d/0x60
[<
ffffffff81136e13>] __kmalloc+0xe3/0x1d0
[<
ffffffffa0313db0>] build_path_from_dentry+0xf0/0x230 [cifs]
[<
ffffffffa031ae1e>] cifs_setattr+0x9e/0x770 [cifs]
[<
ffffffff8115fe90>] notify_change+0x170/0x2e0
[<
ffffffff81145ceb>] sys_fchmod+0x10b/0x140
[<
ffffffff8100c172>] system_call_fastpath+0x16/0x1b
[<
ffffffffffffffff>] 0xffffffffffffffff
The commit
1025774c that removed inode_setattr() seems to have introduced this
memleak by returning early without freeing 'full_path'.
Reported-by: Andrew Hendry <andrew.hendry@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Linus Torvalds [Tue, 9 Nov 2010 02:30:11 +0000 (18:30 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
kernel: Constify temporary variable in roundup()
Tetsuo Handa [Mon, 8 Nov 2010 02:20:49 +0000 (11:20 +0900)]
kernel: Constify temporary variable in roundup()
Fix build error with GCC 3.x caused by commit
b28efd54
"kernel: roundup should only reference arguments once" by constifying
temporary variable used in that macro.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Meelis Roos [Mon, 8 Nov 2010 21:38:14 +0000 (13:38 -0800)]
sparc: fix openpromfs compile
Fix openpromfs compilation by adding a missing semicolon in
fs/openpromfs/inode.c openprom_mount().
Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 8 Nov 2010 19:54:53 +0000 (11:54 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Add new ext4 inode tracepoints
ext4: Don't call sb_issue_discard() in ext4_free_blocks()
ext4: do not try to grab the s_umount semaphore in ext4_quota_off
ext4: fix potential race when freeing ext4_io_page structures
ext4: handle writeback of inodes which are being freed
ext4: initialize the percpu counters before replaying the journal
ext4: "ret" may be used uninitialized in ext4_lazyinit_thread()
ext4: fix lazyinit hang after removing request
Jeff Layton [Mon, 8 Nov 2010 12:28:32 +0000 (07:28 -0500)]
cifs: make cifs_ioctl handle NULL filp->private_data correctly
Commit
13cfb7334e made cifs_ioctl use the tlink attached to the
cifsFileInfo for a filp. This ignores the case of an open directory
however, which in CIFS can have a NULL private_data until a readdir
is done on it.
This patch re-adds the NULL pointer checks that were removed in commit
50ae28f01 and moves the setting of tcon and "caps" variables lower.
Long term, a better fix would be to establish a f_op->open routine for
directories that populates that field at open time, but that requires
some other changes to how readdir calls are handled.
Reported-by: Kjell Rune Skaaraas <kjella79@yahoo.no>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Linus Torvalds [Mon, 8 Nov 2010 18:55:29 +0000 (10:55 -0800)]
Merge git://git./linux/kernel/git/gregkh/tty-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
TTY: move .gitignore from drivers/char/ to drivers/tty/vt/
TTY: create drivers/tty/vt and move the vt code there
TTY: create drivers/tty and move the tty core files there
Linus Torvalds [Mon, 8 Nov 2010 18:54:49 +0000 (10:54 -0800)]
Merge branch 'staging-linus' of git://git./linux/kernel/git/gregkh/staging-next-2.6
* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6:
Staging: ath6kl: remove empty files that mess with 'distclean'
staging: ath6kl: Fixing the driver to use modified mmc_host structure
Staging: solo6x10: fix build problem
Linus Torvalds [Mon, 8 Nov 2010 18:54:23 +0000 (10:54 -0800)]
Merge branch 'rmobile-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-2.6
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
mmc: sh_mmcif: Convert extern inline to static inline.
ARM: mach-shmobile: Allow GPIO chips to register IRQ mappings.
ARM: mach-shmobile: fix sh7372 after a recent clock framework rework
ARM: mach-shmobile: include drivers/sh/Kconfig
ARM: mach-shmobile: ap4evb: Add HDMI sound support
ARM: mach-shmobile: clock-sh7372: Add FSIDIV clock support
ARM: shmobile: remove sh_timer_config clk member
Linus Torvalds [Mon, 8 Nov 2010 18:53:21 +0000 (10:53 -0800)]
Merge branch 'sh-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: clkfwk: Fix up checkpatch warnings.
sh: make some needlessly global sh7724 clocks static
sh: add clk_round_parent() to optimize parent clock rate
sh: Simplify phys_addr_mask()/PTE_PHYS_MASK for 29/32-bit.
sh: nommu: Support building without an uncached mapping.
sh: nommu: use 32-bit phys mode.
sh: mach-se: Fix up SE7206 no ioport build.
sh: intc: Update for single IRQ reservation helper.
sh: clkfwk: Fix up rate rounding error handling.
sh: mach-se: Rip out superfluous 7751 PIO routines.
sh: mach-se: Rip out superfluous 770x PIO routines.
sh: mach-edosk7705: Kill off machtype, consolidate board def.
sh: mach-edosk7705: update for this century, kill off PIO trapping.
sh: mach-se: Rip out superfluous 7206 PIO routines.
sh: mach-systemh: Kill off dead board.
sh: mach-snapgear: Kill off machtype, consolidate board def.
sh: mach-snapgear: Rip out superfluous PIO routines.
sh: mach-microdev: SuperIO-relative ioport mapping.
Theodore Ts'o [Mon, 8 Nov 2010 18:51:33 +0000 (13:51 -0500)]
ext4: Add new ext4 inode tracepoints
Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and
ext4_begin_ordered_truncate()
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Theodore Ts'o [Mon, 8 Nov 2010 18:49:33 +0000 (13:49 -0500)]
ext4: Don't call sb_issue_discard() in ext4_free_blocks()
Commit
5c521830cf (ext4: Support discard requests when running in
no-journal mode) attempts to add sb_issue_discard() for data blocks
(in data=writeback mode) and in no-journal mode. Unfortunately, this
no longer works, because in commit
dd3932eddf (block: remove
BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
interface, and there are times when we call ext4_free_blocks() when we
are are holding a spinlock, or are otherwise in an atomic context.
For now, I've removed the call to sb_issue_discard() to prevent a
deadlock or (if spinlock debugging is enabled) failures like this:
BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
Call Trace:
[<
ffffffff810397ce>] __schedule_bug+0x5e/0x70
[<
ffffffff81403110>] schedule+0x950/0xa70
[<
ffffffff81060bad>] ? insert_work+0x7d/0x90
[<
ffffffff81060fbd>] ? queue_work_on+0x1d/0x30
[<
ffffffff81061127>] ? queue_work+0x37/0x60
[<
ffffffff8140377d>] schedule_timeout+0x21d/0x360
[<
ffffffff812031c3>] ? generic_make_request+0x2c3/0x540
[<
ffffffff81402680>] wait_for_common+0xc0/0x150
[<
ffffffff81041490>] ? default_wake_function+0x0/0x10
[<
ffffffff812034bc>] ? submit_bio+0x7c/0x100
[<
ffffffff810680a0>] ? wake_bit_function+0x0/0x40
[<
ffffffff814027b8>] wait_for_completion+0x18/0x20
[<
ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210
[<
ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60
[<
ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120
[<
ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70
[<
ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30
[<
ffffffff81191618>] ext4_truncate+0x178/0x5d0
[<
ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280
[<
ffffffff810d8976>] vmtruncate+0x56/0x70
[<
ffffffff811925cb>] ext4_setattr+0x14b/0x460
[<
ffffffff811319e4>] notify_change+0x194/0x380
[<
ffffffff81117f80>] do_truncate+0x60/0x90
[<
ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20
[<
ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20
[<
ffffffff81127539>] do_last+0x5d9/0x770
[<
ffffffff811278bd>] do_filp_open+0x1ed/0x680
[<
ffffffff8140644f>] ? page_fault+0x1f/0x30
[<
ffffffff81132bfc>] ? alloc_fd+0xec/0x140
[<
ffffffff81118db1>] do_sys_open+0x61/0x120
[<
ffffffff81118e8b>] sys_open+0x1b/0x20
[<
ffffffff81002e6b>] system_call_fastpath+0x16/0x1b
https://bugzilla.kernel.org/show_bug.cgi?id=22302
Reported-by: Mathias Burén <mathias.buren@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: jiayingz@google.com
Dmitry Monakhov [Mon, 8 Nov 2010 18:47:33 +0000 (13:47 -0500)]
ext4: do not try to grab the s_umount semaphore in ext4_quota_off
It's not needed to sync the filesystem, and it fixes a lock_dep complaint.
Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Theodore Ts'o [Mon, 8 Nov 2010 18:45:33 +0000 (13:45 -0500)]
ext4: fix potential race when freeing ext4_io_page structures
Use an atomic_t and make sure we don't free the structure while we
might still be submitting I/O for that page.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Theodore Ts'o [Mon, 8 Nov 2010 18:43:33 +0000 (13:43 -0500)]
ext4: handle writeback of inodes which are being freed
The following BUG can occur when an inode which is getting freed when
it still has dirty pages outstanding, and it gets deleted (in this
because it was the target of a rename). In ordered mode, we need to
make sure the data pages are written just in case we crash before the
rename (or unlink) is committed. If the inode is being freed then
when we try to igrab the inode, we end up tripping the BUG_ON at
fs/ext4/page-io.c:146.
To solve this problem, we need to keep track of the number of io
callbacks which are pending, and avoid destroying the inode until they
have all been completed. That way we don't have to bump the inode
count to keep the inode from being destroyed; an approach which
doesn't work because the count could have already been dropped down to
zero before the inode writeback has started (at which point we're not
allowed to bump the count back up to 1, since it's already started
getting freed).
Thanks to Dave Chinner for suggesting this approach, which is also
used by XFS.
kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
Call Trace:
[<
ffffffff811075b1>] ext4_bio_write_page+0x172/0x307
[<
ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b
[<
ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2
[<
ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5
[<
ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac
[<
ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d
[<
ffffffff81087910>] do_writepages+0x1c/0x25
[<
ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d
[<
ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10
[<
ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
[<
ffffffff8110615d>] ext4_evict_inode+0x57/0x24c
[<
ffffffff810c14a3>] evict+0x22/0x92
[<
ffffffff810c1a3d>] iput+0x212/0x249
[<
ffffffff810bdf16>] dentry_iput+0xa1/0xb9
[<
ffffffff810bdf6b>] d_kill+0x3d/0x5d
[<
ffffffff810be613>] dput+0x13a/0x147
[<
ffffffff810b990d>] sys_renameat+0x1b5/0x258
[<
ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c
[<
ffffffff810b2950>] ? cp_new_stat+0xde/0xea
[<
ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38
[<
ffffffff810b99c6>] sys_rename+0x16/0x18
[<
ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Nick Bowler <nbowler@elliptictech.com>
Paul Mundt [Mon, 8 Nov 2010 00:51:41 +0000 (09:51 +0900)]
Merge branch 'rmobile/core' into rmobile-fixes-for-linus
Paul Mundt [Mon, 8 Nov 2010 00:42:43 +0000 (09:42 +0900)]
Merge branches 'sh/pio-death', 'sh/nommu', 'sh/clkfwk', 'sh/core' and 'sh/intc-extension' into sh-fixes-for-linus
Paul Mundt [Mon, 8 Nov 2010 00:40:23 +0000 (09:40 +0900)]
sh: clkfwk: Fix up checkpatch warnings.
The clk_round_parent() change introduced various checkpatch warnings,
tidy them up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Guennadi Liakhovetski [Thu, 4 Nov 2010 14:14:29 +0000 (14:14 +0000)]
sh: make some needlessly global sh7724 clocks static
These clocks are currently only used inside one .c file and are not
declared in any headers, therefore having them global is useless.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Guennadi Liakhovetski [Tue, 2 Nov 2010 11:27:24 +0000 (11:27 +0000)]
sh: add clk_round_parent() to optimize parent clock rate
Sometimes it is possible and reasonable to adjust the parent clock rate to
improve precision of the child clock, e.g., if the child clock has no siblings.
clk_round_parent() is a new addition to the SH clock-framework API, that
implements such an optimization for child clocks with divisors, taking all
integer values in a range.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Greg Kroah-Hartman [Sat, 6 Nov 2010 18:27:04 +0000 (11:27 -0700)]
Staging: ath6kl: remove empty files that mess with 'distclean'
These two .h files would get removed from the tree when doing
make distclean
It turns out they are not needed at all, so just delete them which fixes
people's git trees when doing development.
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Vivek Goyal [Sat, 6 Nov 2010 12:16:05 +0000 (08:16 -0400)]
floppy: fix another use-after-free
While scanning the floopy code due to
c093ee4f07f4 ("floppy: fix
use-after-free in module load failure path"), I found one more instance
of trying to access disk->queue pointer after doing put_disk() on
gendisk. For some reason , floppy moule still loads/unloads fine. The
object is probably still around with right pointer values.
o There seems to be one more instance of trying to cleanup the request
queue after we have called put_disk() on associated gendisk.
o This fix is more out of code inspection. Even without this fix for
some reason I am able to load/unload floppy module without any
issues.
o Floppy module loads/unloads fine after the fix.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Kroah-Hartman [Sat, 6 Nov 2010 05:18:23 +0000 (22:18 -0700)]
TTY: move .gitignore from drivers/char/ to drivers/tty/vt/
The autogenerated files (consolemap_deftbl.c and defkeymap.c) need to
be ignored by git, so move the .gitignore file that was doing it to the
properly location now that the files have moved as well.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linus Torvalds [Sat, 6 Nov 2010 01:57:04 +0000 (18:57 -0700)]
ipw2x00: remove the right /proc/net entry
Commit
27ae60f8f7aa ("ipw2x00: replace "ieee80211" with "libipw" where
appropriate") changed DRV_NAME to be "libipw", but didn't properly fix
up the places where it was used to specify the name for the /proc/net/
directory.
For backwards compatibility reasons, that directory name remained
"ieee80211", but due to the DRV_NAME change, the error case printouts
and the cleanup functions now used "libipw" instead. Which made it all
fail badly.
For example, on module unload as reported by Randy:
WARNING: at fs/proc/generic.c:816 remove_proc_entry+0x156/0x35e()
name 'libipw'
because it's trying to unregister a /proc directory that obviously
doesn't even exist.
Clean it all up to use DRV_PROCNAME for the actual /proc directory name.
Reported-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Pavel Roskin <proski@gnu.org>
Cc: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 6 Nov 2010 00:49:22 +0000 (17:49 -0700)]
Merge branch 'kvm-updates/2.6.37' of git://git./virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: BookE: Load the lower half of MSR
KVM: PPC: BookE: fix sleep with interrupts disabled
KVM: PPC: e500: Call kvm_vcpu_uninit() before kvmppc_e500_tlb_uninit().
PPC: KVM: Book E doesn't have __end_interrupts.
KVM: x86: Issue smp_call_function_many with preemption disabled
KVM: x86: fix information leak to userland
KVM: PPC: fix information leak to userland
KVM: MMU: fix rmap_remove on non present sptes
KVM: Write protect memory after slot swap
Linus Torvalds [Sat, 6 Nov 2010 00:45:59 +0000 (17:45 -0700)]
floppy: fix use-after-free in module load failure path
Commit
488211844e0c ("floppy: switch to one queue per drive instead of
sharing a queue") introduced a use-after-free. We do "put_disk()" on
the disk device _before_ we then clean up the queue associated with that
disk.
Move the put_disk() down to avoid dereferencing a free'd data structure.
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Reported-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Daney [Fri, 5 Nov 2010 23:17:39 +0000 (16:17 -0700)]
watchdog: Fix section mismatch and potential undefined behavior.
Commit
d9ca07a05ce1 ("watchdog: Avoid kernel crash when disabling
watchdog") introduces a section mismatch.
Now that we reference no_watchdog from non-__init code it can no longer
be __initdata.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Nov 2010 22:25:48 +0000 (15:25 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits)
inet_diag: Make sure we actually run the same bytecode we audited.
netlink: Make nlmsg_find_attr take a const nlmsghdr*.
fib: fib_result_assign() should not change fib refcounts
netfilter: ip6_tables: fix information leak to userspace
cls_cgroup: Fix crash on module unload
memory corruption in X.25 facilities parsing
net dst: fix percpu_counter list corruption and poison overwritten
rds: Remove kfreed tcp conn from list
rds: Lost locking in loop connection freeing
de2104x: fix panic on load
atl1 : fix panic on load
netxen: remove unused firmware exports
caif: Remove noisy printout when disconnecting caif socket
caif: SPI-driver bugfix - incorrect padding.
caif: Bugfix for socket priority, bindtodev and dbg channel.
smsc911x: Set Ethernet EEPROM size to supported device's size
ipv4: netfilter: ip_tables: fix information leak to userland
ipv4: netfilter: arp_tables: fix information leak to userland
cxgb4vf: remove call to stop TX queues at load time.
cxgb4: remove call to stop TX queues at load time.
...
Linus Torvalds [Fri, 5 Nov 2010 21:17:22 +0000 (14:17 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: ohci: fix race when reading count in AR descriptor
firewire: ohci: avoid reallocation of AR buffers
firewire: ohci: fix race in AR split packet handling
firewire: ohci: fix buffer overflow in AR split packet handling
Linus Torvalds [Fri, 5 Nov 2010 21:17:01 +0000 (14:17 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer
cifs: dereferencing first then checking
cifs: trivial comment fix: tlink_tree is now a rbtree
[CIFS] Cleanup unused variable build warning
cifs: convert tlink_tree to a rbtree
cifs: store pointer to master tlink in superblock (try #2)
cifs: trivial doc fix: note setlease implemented
CIFS: Add cifs_set_oplock_level
FS: cifs, remove unneeded NULL tests
Oleg Nesterov [Fri, 5 Nov 2010 15:53:42 +0000 (16:53 +0100)]
posix-cpu-timers: workaround to suppress the problems with mt exec
posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal->cpu_timers list.
But, it also assumes that timer->it.cpu.task is always the group
leader, and thus the dead ->task means the dead thread group.
This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.
It is not simple to fix this bug correctly. First of all, I think
that timer->it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.
Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not the fix, just the
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.
In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Nov 2010 21:15:17 +0000 (14:15 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon: (ltc4261) Fix error message format
hwmon: (ltc4261) Add missing newline in debug message
Pavel Shilovsky [Wed, 3 Nov 2010 07:58:57 +0000 (10:58 +0300)]
cifs: make cifs_set_oplock_level() take a cifsInodeInfo pointer
All the callers already have a pointer to struct cifsInodeInfo. Use it.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Jean Delvare [Fri, 5 Nov 2010 14:59:29 +0000 (10:59 -0400)]
hwmon: (ltc4261) Fix error message format
adapter->id is deprecated and not set by any adapter driver, so this
was certainly not what the author wanted to use. adapter->nr maybe,
but as dev_err() already includes this value, as well as the client's
address, there's no point repeating them. Better print a simple error
message in plain English words.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Guenter Roeck [Tue, 26 Oct 2010 22:59:21 +0000 (15:59 -0700)]
hwmon: (ltc4261) Add missing newline in debug message
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Linus Torvalds [Fri, 5 Nov 2010 16:52:25 +0000 (09:52 -0700)]
Merge git://git./linux/kernel/git/cmetcalf/linux-tile
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: mark "hardwall" device as non-seekable
asm-generic/stat.h: support 64-bit file time_t for stat()
arch/tile: don't allow user code to set the PL via ptrace or signal return
arch/tile: correct double syscall restart for nested signals
arch/tile: avoid __must_check warning on one strict_strtol check
arch/tile: bomb raw_local_irq_ to arch_local_irq_
arch/tile: complete migration to new kmap_atomic scheme
Randy Dunlap [Thu, 4 Nov 2010 17:28:00 +0000 (10:28 -0700)]
leds-net5501: taints kernel, add license
Add MODULE_LICENSE() that matches file comments so that kernel
is not tainted.
leds_net5501: module license 'unspecified' taints kernel.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Scott Wood [Thu, 30 Sep 2010 19:31:27 +0000 (14:31 -0500)]
KVM: PPC: BookE: Load the lower half of MSR
This was preventing the guest from setting any bits in the
hardware MSR which aren't forced on, such as MSR[SPE].
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Thu, 30 Sep 2010 19:28:50 +0000 (14:28 -0500)]
KVM: PPC: BookE: fix sleep with interrupts disabled
It is not legal to call mutex_lock() with interrupts disabled.
This will assert with debug checks enabled.
If there's a real need to disable interrupts here, it could be done
after the mutex is acquired -- but I don't see why it's needed at all.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Tue, 5 Oct 2010 19:22:41 +0000 (14:22 -0500)]
KVM: PPC: e500: Call kvm_vcpu_uninit() before kvmppc_e500_tlb_uninit().
The VCPU uninit calls some TLB functions, and the TLB uninit function
frees the memory used by them.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scott Wood [Mon, 18 Oct 2010 22:35:48 +0000 (17:35 -0500)]
PPC: KVM: Book E doesn't have __end_interrupts.
Fix an unresolved symbol with CONFIG_KVM_GUEST plus CONFIG_RELOCATABLE on
Book E.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Jan Kiszka [Mon, 1 Nov 2010 13:01:13 +0000 (14:01 +0100)]
KVM: x86: Issue smp_call_function_many with preemption disabled
smp_call_function_many is specified to be called only with preemption
disabled. Fulfill this requirement.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Vasiliy Kulikov [Sat, 30 Oct 2010 18:54:47 +0000 (22:54 +0400)]
KVM: x86: fix information leak to userland
Structures kvm_vcpu_events, kvm_debugregs, kvm_pit_state2 and
kvm_clock_data are copied to userland with some padding and reserved
fields unitialized. It leads to leaking of contents of kernel stack
memory. We have to initialize them to zero.
In patch v1 Jan Kiszka suggested to fill reserved fields with zeros
instead of memset'ting the whole struct. It makes sense as these
fields are explicitly marked as padding. No more fields need zeroing.
KVM-Stable-Tag.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Vasiliy Kulikov [Sat, 30 Oct 2010 09:04:24 +0000 (13:04 +0400)]
KVM: PPC: fix information leak to userland
Structure kvm_ppc_pvinfo is copied to userland with flags and
pad fields unitialized. It leads to leaking of contents of
kernel stack memory.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Mon, 25 Oct 2010 13:58:22 +0000 (11:58 -0200)]
KVM: MMU: fix rmap_remove on non present sptes
drop_spte should not attempt to rmap_remove a non present shadow pte.
This fixes a BUG_ON seen on kvm-autotest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Lucas Meneghel Rodrigues <lmr@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Michael S. Tsirkin [Mon, 25 Oct 2010 01:21:24 +0000 (03:21 +0200)]
KVM: Write protect memory after slot swap
I have observed the following bug trigger:
1. userspace calls GET_DIRTY_LOG
2. kvm_mmu_slot_remove_write_access is called and makes a page ro
3. page fault happens and makes the page writeable
fault is logged in the bitmap appropriately
4. kvm_vm_ioctl_get_dirty_log swaps slot pointers
a lot of time passes
5. guest writes into the page
6. userspace calls GET_DIRTY_LOG
At point (5), bitmap is clean and page is writeable,
thus, guest modification of memory is not logged
and GET_DIRTY_LOG returns an empty bitmap.
The rule is that all pages are either dirty in the current bitmap,
or write-protected, which is violated here.
It seems that just moving kvm_mmu_slot_remove_write_access down
to after the slot pointer swap should fix this bug.
KVM-Stable-Tag.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jesper Juhl [Thu, 4 Nov 2010 20:44:41 +0000 (21:44 +0100)]
Clean up relay_alloc_page_array() slightly by using vzalloc rather than vmalloc and memset
We can optimize kernel/relay.c::relay_alloc_page_array() slightly by
using vzalloc. The patch makes these changes:
- use vzalloc instead of vmalloc+memset.
- remove redundant local variable 'array'.
- declare local 'pa_size' as const.
Cuts down nicely on both source and object-code size.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Kroah-Hartman [Thu, 4 Nov 2010 19:50:47 +0000 (12:50 -0700)]
TTY: create drivers/tty/vt and move the vt code there
The vt and other related code is moved into the drivers/tty/vt directory.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Greg Kroah-Hartman [Thu, 4 Nov 2010 18:10:29 +0000 (11:10 -0700)]
TTY: create drivers/tty and move the tty core files there
The tty code should be in its own subdirectory and not in the char
driver with all of the cruft that is currently there.
Based on work done by Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linus Torvalds [Fri, 5 Nov 2010 14:54:40 +0000 (07:54 -0700)]
Merge branch 'for-linus-fixes' of git://git./linux/kernel/git/gerg/m68knommu
* 'for-linus-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k, m68knommu: Do not include linux/hardirq.h in asm/irqflags.h
m68knommu: add back in declaration of do_IRQ
Jeff Layton [Tue, 2 Nov 2010 20:22:50 +0000 (16:22 -0400)]
cifs: dereferencing first then checking
This patch is based on Dan's original patch. His original description is
below:
Smatch complained about a couple checking for NULL after dereferencing
bugs. I'm not super familiar with the code so I did the conservative
thing and move the dereferences after the checks.
The dereferences in cifs_lock() and cifs_fsync() were added in
ba00ba64cf0 "cifs: make various routines use the cifsFileInfo->tcon
pointer". The dereference in find_writable_file() was added in
6508d904e6f "cifs: have find_readable/writable_file filter by fsuid".
The comments there say it's possible to trigger the NULL dereference
under stress.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Suresh Jayaraman [Wed, 3 Nov 2010 05:23:49 +0000 (10:53 +0530)]
cifs: trivial comment fix: tlink_tree is now a rbtree
Noticed while reviewing (late) the rbtree conversion patchset (which has been merged
already).
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Nelson Elhage [Wed, 3 Nov 2010 16:35:41 +0000 (16:35 +0000)]
inet_diag: Make sure we actually run the same bytecode we audited.
We were using nlmsg_find_attr() to look up the bytecode by attribute when
auditing, but then just using the first attribute when actually running
bytecode. So, if we received a message with two attribute elements, where only
the second had type INET_DIAG_REQ_BYTECODE, we would validate and run different
bytecode strings.
Fix this by consistently using nlmsg_find_attr everywhere.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nelson Elhage [Wed, 3 Nov 2010 16:35:40 +0000 (16:35 +0000)]
netlink: Make nlmsg_find_attr take a const nlmsghdr*.
This will let us use it on a nlmsghdr stored inside a netlink_callback.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 4 Nov 2010 01:21:39 +0000 (01:21 +0000)]
fib: fib_result_assign() should not change fib refcounts
After commit
ebc0ffae5 (RCU conversion of fib_lookup()),
fib_result_assign() should not change fib refcounts anymore.
Thanks to Michael who did the bisection and bug report.
Reported-by: Michael Ellerman <michael@ellerman.id.au>
Tested-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Mundt [Thu, 4 Nov 2010 03:51:08 +0000 (12:51 +0900)]
sh: Simplify phys_addr_mask()/PTE_PHYS_MASK for 29/32-bit.
Given that __in_29bit_mode() is a constant for the non-PMB case, we can
simply use the PMB-facing version of phys_addr_mask() and drop the other
variants.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 4 Nov 2010 03:46:19 +0000 (12:46 +0900)]
sh: nommu: Support building without an uncached mapping.
Now that nommu selects 32BIT we run in to the situation where SH-2A
supports an uncached identity mapping by way of the BSC, while the SH-2
does not. This provides stubs for the PC manglers and tidies up some of
the system*.h mess in the process.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 4 Nov 2010 03:32:24 +0000 (12:32 +0900)]
sh: nommu: use 32-bit phys mode.
The nommu code has regressed somewhat in that 29BIT gets set for the
SH-2/2A configs regardless of the fact that they are really 32BIT sans
MMU or PMB. This does a bit of tidying to get nommu properly selecting
32BIT as it was before.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 4 Nov 2010 03:29:00 +0000 (12:29 +0900)]
sh: mach-se: Fix up SE7206 no ioport build.
There was a leftover inw() used here that really just wants to be a
__raw_readw() instead. Convert it over.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 4 Nov 2010 03:21:25 +0000 (12:21 +0900)]
mmc: sh_mmcif: Convert extern inline to static inline.
Presently the extern inline case results in a compiler warning on ARM due
to the memory barrier definition used in the I/O routines. These
ultimately all want to be static inline anyways, so just convert them all
in place.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 4 Nov 2010 03:19:11 +0000 (12:19 +0900)]
ARM: mach-shmobile: Allow GPIO chips to register IRQ mappings.
As non-PFC chips are added that may support IRQs, pass through to the
generic helper. This follows the the SH change.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Jan Engelhardt [Thu, 4 Nov 2010 01:55:39 +0000 (18:55 -0700)]
netfilter: ip6_tables: fix information leak to userspace
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 4 Nov 2010 01:52:32 +0000 (18:52 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/kaber/nf-2.6
Herbert Xu [Wed, 3 Nov 2010 13:31:05 +0000 (13:31 +0000)]
cls_cgroup: Fix crash on module unload
Somewhere along the lines net_cls_subsys_id became a macro when
cls_cgroup is built as a module. Not only did it make cls_cgroup
completely useless, it also causes it to crash on module unload.
This patch fixes this by removing that macro.
Thanks to Eric Dumazet for diagnosing this problem.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
andrew hendry [Wed, 3 Nov 2010 12:54:53 +0000 (12:54 +0000)]
memory corruption in X.25 facilities parsing
Signed-of-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaotian Feng [Tue, 2 Nov 2010 16:11:05 +0000 (16:11 +0000)]
net dst: fix percpu_counter list corruption and poison overwritten
There're some percpu_counter list corruption and poison overwritten warnings
in recent kernel, which is resulted by
fc66f95c.
commit
fc66f95c switches to use percpu_counter, in ip6_route_net_init, kernel
init the percpu_counter for dst entries, but, the percpu_counter is never destroyed
in ip6_route_net_exit. So if the related data is freed by kernel, the freed percpu_counter
is still on the list, then if we insert/remove other percpu_counter, list corruption
resulted. Also, if the insert/remove option modifies the ->prev,->next pointer of
the freed value, the poison overwritten is resulted then.
With the following patch, the percpu_counter list corruption and poison overwritten
warnings disappeared.
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 2 Nov 2010 01:54:01 +0000 (01:54 +0000)]
rds: Remove kfreed tcp conn from list
All the rds_tcp_connection objects are stored list, but when
being freed it should be removed from there.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 2 Nov 2010 01:52:05 +0000 (01:52 +0000)]
rds: Lost locking in loop connection freeing
The conn is removed from list in there and this requires
proper lock protection.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 3 Nov 2010 12:25:32 +0000 (12:25 +0000)]
de2104x: fix panic on load
Its now illegal to call netif_stop_queue() before register_netdev()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 3 Nov 2010 12:11:21 +0000 (12:11 +0000)]
atl1 : fix panic on load
Its now illegal to call netif_stop_queue() before register_netdev()
Reported-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amerigo Wang [Tue, 2 Nov 2010 18:25:31 +0000 (18:25 +0000)]
netxen: remove unused firmware exports
Quote from Amit Salecha:
"Actually I was not updated, NX_UNIFIED_ROMIMAGE_NAME (phanfw.bin) is already
submitted and its present in linux-firmware.git.
I will get back to you on NX_P2_MN_ROMIMAGE_NAME, NX_P3_CT_ROMIMAGE_NAME and
NX_P3_MN_ROMIMAGE_NAME. Whether this will be submitted ?"
We have to remove these, otherwise we will get wrong info from modinfo.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Amit Kumar Salecha <amit.salecha@qlogic.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Cc: Narender Kumar <narender.kumar@qlogic.com>
Acked-by: Amit Kumar Salecha <amit.salecha@qlogic.com>--
Signed-off-by: David S. Miller <davem@davemloft.net>
sjur.brandeland@stericsson.com [Wed, 3 Nov 2010 10:19:25 +0000 (10:19 +0000)]
caif: Remove noisy printout when disconnecting caif socket
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sjur Brændeland [Mon, 1 Nov 2010 11:52:48 +0000 (11:52 +0000)]
caif: SPI-driver bugfix - incorrect padding.
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
André Carvalho de Matos [Mon, 1 Nov 2010 11:52:47 +0000 (11:52 +0000)]
caif: Bugfix for socket priority, bindtodev and dbg channel.
Changes:
o Bugfix: SO_PRIORITY for SOL_SOCKET could not be handled
in caif's setsockopt, using the struct sock attribute priority instead.
o Bugfix: SO_BINDTODEVICE for SOL_SOCKET could not be handled
in caif's setsockopt, using the struct sock attribute ifindex instead.
o Wrong assert statement for RFM layer segmentation.
o CAIF Debug channels was not working over SPI, caif_payload_info
containing padding info must be initialized.
o Check on pointer before dereferencing when unregister dev in caif_dev.c
Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Faith [Mon, 1 Nov 2010 11:30:08 +0000 (11:30 +0000)]
smsc911x: Set Ethernet EEPROM size to supported device's size
The SMSC911x supports 128 x 8-bit EEPROMs. Increase the EEPROM size
so more than just the MAC address can be stored.
Signed-off-by: John Faith <jfaith7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wu Fengguang [Wed, 3 Nov 2010 17:56:49 +0000 (01:56 +0800)]
vmstat: fix offset calculation on void*
Fix regression introduced by commit
79da826aee6 ("writeback: report
dirty thresholds in /proc/vmstat").
The incorrect pointer arithmetic can result in problems like this:
BUG: unable to handle kernel paging request at
07c06d16
IP: [<
c050c336>] strnlen+0x6/0x20
Call Trace:
[<
c050a249>] ? string+0x39/0xe0
[<
c042be6b>] ? __wake_up_common+0x4b/0x80
[<
c050afcc>] ? vsnprintf+0x1ec/0x380
[<
c04b380e>] ? seq_printf+0x2e/0x60
[<
c04829a6>] ? vmstat_show+0x26/0x30
[<
c04b3bb6>] ? seq_read+0xa6/0x380
[<
c04b3b10>] ? seq_read+0x0/0x380
[<
c04d5d2f>] ? proc_reg_read+0x5f/0x90
[<
c049c4a1>] ? vfs_read+0xa1/0x140
[<
c04d5cd0>] ? proc_reg_read+0x0/0x90
[<
c049c981>] ? sys_read+0x41/0x70
[<
c0402bd0>] ? sysenter_do_call+0x12/0x26
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michael Rubin <mrubin@google.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 3 Nov 2010 17:44:55 +0000 (13:44 -0400)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: tpa6130a2: Get rid of compile warning from tpa6130a2_power
ALSA: hda - MacBookAir3,1(3,2) alsa support
ASoC: fix the building issue of missing codec field in 'struct snd_soc_card'
ALSA: usb-audio - Support for Power/Status LED on Creative USB X-Fi S51
ALSA: asihpi - Unsafe memory management when allocating control cache
ASoC: Update WARN uses in wm_hubs
ASoC: Include cx20442 to SND_SOC_ALL_CODECS
ASoC: Fix SND_SOC_ALL_CODECS typo for jz4740
ASoC: Remove volatility from WM8900 POWER1 register
ALSA: lx6464es - make 1 bit signed bitfield unsigned
ALSA: cs46xx memory management fixes for cs46xx_dsp_spos_create()
ALSA: usb - driver neglects kmalloc return value check and may deref NULL
ASoC: tpa6130a2: Fix unbalanced regulator disables
ASoC: tlv320dac33: Mode1 FIFO auto configuration fix
ASoC: tlv320dac33: Limit the US_TO_SAMPLES macro
ASoC: tlv320dac33: Error handling for broken chip
ASoC: Check return value of struct_strtoul() in pmdown_time_set()
Theodore Ts'o [Wed, 3 Nov 2010 16:03:21 +0000 (12:03 -0400)]
ext4: initialize the percpu counters before replaying the journal
We now initialize the percpu counters before replaying the journal,
but after the journal, we recalculate the global counters, to deal
with the possibility of the per-blockgroup counts getting updated by
the journal replay.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Takashi Iwai [Wed, 3 Nov 2010 14:51:26 +0000 (15:51 +0100)]
Merge branch 'fix/asoc' into for-linus
Jarkko Nikula [Wed, 3 Nov 2010 14:39:00 +0000 (16:39 +0200)]
ASoC: tpa6130a2: Get rid of compile warning from tpa6130a2_power
Patch "ASoC: tpa6130a2: Fix unbalanced regulator disables" introduced a
compiler warning "‘ret’ may be used uninitialized in this function".
Initialize ret to zero to get rid of it and making sure that the function
does not return any random error code when the code is falling through.
Signed-off-by: Jarkko Nikula <jhnikula@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Vipin Mehta [Sat, 18 Sep 2010 01:45:38 +0000 (18:45 -0700)]
staging: ath6kl: Fixing the driver to use modified mmc_host structure
A recent change in the mmc_host structure removed the distinction
between hw and phys segments (
58cb50c20fde6059f3f8db4466a1bd4d1fff999c)
Changing the driver to use the modified structure.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Vipin Mehta <vmehta@atheros.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Takashi Iwai [Wed, 3 Nov 2010 12:56:08 +0000 (13:56 +0100)]
Merge branch 'for-2.6.37' of git://git./linux/kernel/git/lrg/asoc-2.6 into fix/asoc
Vasiliy Kulikov [Wed, 3 Nov 2010 07:45:06 +0000 (08:45 +0100)]
ipv4: netfilter: ip_tables: fix information leak to userland
Structure ipt_getinfo is copied to userland with the field "name"
that has the last elements unitialized. It leads to leaking of
contents of kernel stack memory.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Vasiliy Kulikov [Wed, 3 Nov 2010 07:44:12 +0000 (08:44 +0100)]
ipv4: netfilter: arp_tables: fix information leak to userland
Structure arpt_getinfo is copied to userland with the field "name"
that has the last elements unitialized. It leads to leaking of
contents of kernel stack memory.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>