Linus Torvalds [Wed, 31 Dec 2008 01:25:29 +0000 (17:25 -0800)]
Merge branch 'agp-next' of git://git./linux/kernel/git/airlied/agp-2.6
* 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp/intel: Fix broken ® symbol in device name.
agp/intel: add support for G41 chipset
Linus Torvalds [Wed, 31 Dec 2008 01:23:31 +0000 (17:23 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (98 commits)
sparc: move select of ARCH_SUPPORTS_MSI
sparc: drop SUN_IO
sparc: unify sections.h
sparc: use .data.init_task section for init_thread_union
sparc: fix array overrun check in of_device_64.c
sparc: unify module.c
sparc64: prepare module_64.c for unification
sparc64: use bit neutral Elf symbols
sparc: unify module.h
sparc: introduce CONFIG_BITS
sparc: fix hardirq.h removal fallout
sparc64: do not export pus_fs_struct
sparc: use sparc64 version of scatterlist.h
sparc: Commonize memcmp assembler.
sparc: Unify strlen assembler.
sparc: Add asm/asm.h
sparc: Kill memcmp_32.S code which has been ifdef'd out for centuries.
sparc: replace for_each_cpu_mask_nr with for_each_cpu
sparc: fix sparse warnings in irq_32.c
sparc: add include guards to kernel.h
...
Linus Torvalds [Wed, 31 Dec 2008 01:20:05 +0000 (17:20 -0800)]
Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits)
bio: get rid of bio_vec clearing
bounce: don't rely on a zeroed bio_vec list
cciss: simplify parameters to deregister_disk function
cfq-iosched: fix race between exiting queue and exiting task
loop: Do not call loop_unplug for not configured loop device.
loop: Flush possible running bios when loop device is released.
alpha: remove dead BIO_VMERGE_BOUNDARY
Get rid of CONFIG_LSF
block: make blk_softirq_init() static
block: use min_not_zero in blk_queue_stack_limits
block: add one-hit cache for disk partition lookup
cfq-iosched: remove limit of dispatch depth of max 4 times quantum
nbd: tell the block layer that it is not a rotational device
block: get rid of elevator_t typedef
aio: make the lookup_ioctx() lockless
bio: add support for inlining a number of bio_vecs inside the bio
bio: allow individual slabs in the bio_set
bio: move the slab pointer inside the bio_set
bio: only mempool back the largest bio_vec slab cache
block: don't use plugging on SSD devices
...
Linus Torvalds [Wed, 31 Dec 2008 00:20:19 +0000 (16:20 -0800)]
Merge branch 'irq-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, sparseirq: clean up Kconfig entry
x86: turn CONFIG_SPARSE_IRQ off by default
sparseirq: fix numa_migrate_irq_desc dependency and comments
sparseirq: add kernel-doc notation for new member in irq_desc, -v2
locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
sparseirq, xen: make sure irq_desc is allocated for interrupts
sparseirq: fix !SMP building, #2
x86, sparseirq: move irq_desc according to smp_affinity, v7
proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
sparse irqs: add irqnr.h to the user headers list
sparse irqs: handle !GENIRQ platforms
sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
sparseirq: fix Alpha build failure
sparseirq: fix typo in !CONFIG_IO_APIC case
x86, MSI: pass irq_cfg and irq_desc
x86: MSI start irq numbering from nr_irqs_gsi
x86: use NR_IRQS_LEGACY
sparse irq_desc[] array: core kernel and x86 changes
genirq: record IRQ_LEVEL in irq_desc[]
irq.h: remove padding from irq_desc on 64bits
Linus Torvalds [Wed, 31 Dec 2008 00:16:21 +0000 (16:16 -0800)]
Merge branch 'timers-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix warning in kernel/hrtimer.c
x86: make sure we really have an hpet mapping before using it
x86: enable HPET on Fujitsu u9200
linux/timex.h: cleanup for userspace
posix-timers: simplify de_thread()->exit_itimers() path
posix-timers: check ->it_signal instead of ->it_pid to validate the timer
posix-timers: use "struct pid*" instead of "struct task_struct*"
nohz: suppress needless timer reprogramming
clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
nohz: no softirq pending warnings for offline cpus
hrtimer: removing all ur callback modes, fix
hrtimer: removing all ur callback modes, fix hotplug
hrtimer: removing all ur callback modes
x86: correct link to HPET timer specification
rtc-cmos: export second NVRAM bank
Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
manually.
Linus Torvalds [Wed, 31 Dec 2008 00:10:19 +0000 (16:10 -0800)]
Merge branch 'core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
stacktrace: provide save_stack_trace_tsk() weak alias
rcu: provide RCU options on non-preempt architectures too
printk: fix discarding message when recursion_bug
futex: clean up futex_(un)lock_pi fault handling
"Tree RCU": scalable classic RCU implementation
futex: rename field in futex_q to clarify single waiter semantics
x86/swiotlb: add default swiotlb_arch_range_needs_mapping
x86/swiotlb: add default phys<->bus conversion
x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
x86: add swiotlb allocation functions
swiotlb: consolidate swiotlb info message printing
swiotlb: support bouncing of HighMem pages
swiotlb: factor out copy to/from device
swiotlb: add arch hook to force mapping
swiotlb: allow architectures to override phys<->bus<->phys conversions
swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
rcu: fix rcutorture behavior during reboot
resources: skip sanity check of busy resources
swiotlb: move some definitions to header
swiotlb: allow architectures to override swiotlb pool allocation
...
Fix up trivial conflicts in
arch/x86/kernel/Makefile
arch/x86/mm/init_32.c
include/linux/hardirq.h
as per Ingo's suggestions.
Jens Axboe [Tue, 23 Dec 2008 11:46:21 +0000 (12:46 +0100)]
bio: get rid of bio_vec clearing
We don't need to clear the memory used for adding bio_vec entries,
since nobody should be looking at uninitialized members. Any valid
use should be below bio->bi_vcnt, and the members up until that count
must be valid since they were added through bio_add_page().
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Tue, 23 Dec 2008 11:44:19 +0000 (12:44 +0100)]
bounce: don't rely on a zeroed bio_vec list
__blk_queue_bounce() relies on a zeroed bio_vec list, since it looks
up arbitrary indexes in the allocated bio. The block layer only
guarantees that added entries are valid, so clear memory after alloc.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Stephen M. Cameron [Thu, 18 Dec 2008 13:55:51 +0000 (14:55 +0100)]
cciss: simplify parameters to deregister_disk function
Simplify parameters to deregister_disk function.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Mon, 15 Dec 2008 20:19:25 +0000 (21:19 +0100)]
cfq-iosched: fix race between exiting queue and exiting task
Original patch from Nikanth Karthikesan <knikanth@suse.de>
When a queue exits the queue lock is taken and cfq_exit_queue() would free all
the cic's associated with the queue.
But when a task exits, cfq_exit_io_context() gets cic one by one and then
locks the associated queue to call __cfq_exit_single_io_context. It looks like
between getting a cic from the ioc and locking the queue, the queue might have
exited on another cpu.
Fix this by rechecking the cfq_io_context queue key inside the queue lock
again, and not calling into __cfq_exit_single_io_context() if somebody
beat us to it.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Milan Broz [Fri, 12 Dec 2008 13:50:49 +0000 (14:50 +0100)]
loop: Do not call loop_unplug for not configured loop device.
The loop_unplug() function expects that the mapping is set
and lo->lo_backing_file is not NULL.
Unfortunately loop_set_fd() sets the request queue unplug function,
but loop_clr_fd() doesn't clear it.
The loop device allows opening a non-configured loop in some situations.
If unplug is called on the request queue, the loop module oopses because
of the missing lo_backing_file.
Simple reproducer:
losetup /dev/loop0 /xxx
losetup -d /dev/loop0
dmsetup create x --table "0 1 linear /dev/loop0 0"
EIP is at loop_unplug+0x1d/0x3b
...
Call Trace:
blk_unplug+0x57/0x5e
dm_table_unplug_all+0x34/0x77 [dm_mod]
destroy_inode+0x27/0x38
generic_delete_inode+0xd5/0xd9
iput+0x4b/0x4e
dm_resume+0xca/0xfe [dm_mod]
dev_suspend+0x143/0x165 [dm_mod]
dm_ctl_ioctl+0x18e/0x1cf [dm_mod]
dev_suspend+0x0/0x165 [dm_mod]
dm_ctl_ioctl+0x0/0x1cf [dm_mod]
vfs_ioctl+0x22/0x69
do_vfs_ioctl+0x39d/0x3c7
trace_hardirqs_on+0xb/0xd
remove_vma+0x50/0x56
do_munmap+0x21c/0x237
sys_ioctl+0x2c/0x45
sysenter_do_call+0x12/0x31
Several reports here
http://www.kerneloops.org/search.php?search=loop_unplug
Fix it by simply clearing the unplug function together with
removing the backing file.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Milan Broz [Fri, 12 Dec 2008 13:48:27 +0000 (14:48 +0100)]
loop: Flush possible running bios when loop device is released.
When there are still queued bios and the reference count
drops to zero, the loop device must flush all queued bios.
Otherwise the caller can close the device while some bios
are still running, and a later endio() call oopses when it
uses an unallocated mempool.
This happens for example when running dm-crypt over loop,
here is typical oops backtrace:
Oops: 0000 [#1] PREEMPT SMP
EIP is at mempool_free+0x12/0x6b
...
crypt_dec_pending+0x50/0x54 [dm_crypt]
crypt_endio+0x9f/0xa7 [dm_crypt]
crypt_endio+0x0/0xa7 [dm_crypt]
bio_endio+0x2b/0x2e
loop_thread+0x37a/0x3b1
do_lo_send_aops+0x0/0x165
autoremove_wake_function+0x0/0x33
loop_thread+0x0/0x3b1
kthread+0x3b/0x61
kthread+0x0/0x61
kernel_thread_helper+0x7/0x10
(But crash is reproducible with different dm targets
running over loop device too.)
Patch fixes it by flushing the bios in release call,
reusing the flush mechanism for switching backing store.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
FUJITA Tomonori [Fri, 12 Dec 2008 08:52:35 +0000 (09:52 +0100)]
alpha: remove dead BIO_VMERGE_BOUNDARY
The block layer dropped the virtual merge feature
(b8b3e16cfe6435d961f6aaebcfd52a1ff2a988c5). The BIO_VMERGE_BOUNDARY
definition is meaningless now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 12 Dec 2008 08:51:16 +0000 (09:51 +0100)]
Get rid of CONFIG_LSF
We have two separate config entries for large devices/files. One
is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF
that handles large files. This doesn't make a lot of sense, you typically
want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording
to indicate that it covers both.
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Roel Kluin [Wed, 10 Dec 2008 14:47:33 +0000 (15:47 +0100)]
block: make blk_softirq_init() static
Sparse asked whether these could be static.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
FUJITA Tomonori [Thu, 4 Dec 2008 07:56:35 +0000 (08:56 +0100)]
block: use min_not_zero in blk_queue_stack_limits
Zero is invalid for max_phys_segments, max_hw_segments, and
max_segment_size. It's better to use min_not_zero instead of
min. min() works though (because commit 0e435ac makes sure that
these values are set to the default values, non zero, if a queue is
initialized properly).
With this patch, blk_queue_stack_limits does almost the same thing
that dm's combine_restrictions_low() does. I think that it's easy to
remove dm's combine_restrictions_low.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
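To illustrate why zero must not win the comparison when limits are stacked, here is a minimal user-space sketch. The struct limits type and the function names are hypothetical stand-ins for illustration only, not the kernel's definitions; the three fields are the ones named in the message above.

```c
#include <assert.h>

/* Plain minimum: a zero ("unset") limit always wins, which is wrong here. */
unsigned int min_uint(unsigned int l, unsigned int r)
{
    return l < r ? l : r;
}

/* Hypothetical stand-in for min_not_zero(): zero means "no value set",
 * so it is ignored unless both sides are zero. */
unsigned int min_not_zero_uint(unsigned int l, unsigned int r)
{
    if (l == 0)
        return r;
    if (r == 0)
        return l;
    return l < r ? l : r;
}

/* Hypothetical, reduced set of queue limits. */
struct limits {
    unsigned int max_phys_segments;
    unsigned int max_hw_segments;
    unsigned int max_segment_size;
};

/* Fold the limits of a bottom device "b" into the top device "t". */
void stack_limits(struct limits *t, const struct limits *b)
{
    assert(t != 0 && b != 0);
    t->max_phys_segments = min_not_zero_uint(t->max_phys_segments,
                                             b->max_phys_segments);
    t->max_hw_segments = min_not_zero_uint(t->max_hw_segments,
                                           b->max_hw_segments);
    t->max_segment_size = min_not_zero_uint(t->max_segment_size,
                                            b->max_segment_size);
}
```

With a plain minimum, a single unset field on either device would collapse the stacked limit to zero; the not-zero variant keeps the other device's value instead.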
Jens Axboe [Fri, 24 Oct 2008 10:52:42 +0000 (12:52 +0200)]
block: add one-hit cache for disk partition lookup
disk_map_sector_rcu() returns a partition from a sector offset,
which we use for IO statistics on a per-partition basis. The
lookup itself is an O(N) list lookup, where N is the number of
partitions. This actually hurts performance quite a bit, even
on the lower end partitions. On higher numbered partitions,
it can get pretty bad.
Solve this by adding a one-hit cache for partition lookup.
This makes the lookup O(1) for the case where we do most IO to
one partition. Even for mixed partition workloads, amortized cost
is pretty close to O(1) since the natural IO batching makes the
one-hit cache last for lots of IOs.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
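The one-hit cache idea can be sketched in user space as follows. This is only an illustration under simplifying assumptions: the struct names, the array-based partition table, and the scans counter are hypothetical, there is no RCU protection, and a miss returns NULL rather than falling back to a whole-disk entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition: covers sectors [start, start + len). */
struct part {
    unsigned long long start;
    unsigned long long len;
};

/* Hypothetical disk with a one-entry lookup cache. */
struct disk {
    const struct part *parts;
    size_t nparts;
    const struct part *last_lookup; /* the one-hit cache */
    unsigned long scans;            /* demo counter: number of O(N) scans */
};

static int part_contains(const struct part *p, unsigned long long sector)
{
    return sector >= p->start && sector - p->start < p->len;
}

/* Map a sector to its partition, trying the cached entry first. */
const struct part *map_sector(struct disk *d, unsigned long long sector)
{
    size_t i;

    assert(d != NULL);

    /* Fast path: O(1) when consecutive IOs hit the same partition. */
    if (d->last_lookup && part_contains(d->last_lookup, sector))
        return d->last_lookup;

    /* Slow path: linear scan, then remember the hit. */
    d->scans++;
    for (i = 0; i < d->nparts; i++) {
        if (part_contains(&d->parts[i], sector)) {
            d->last_lookup = &d->parts[i];
            return d->last_lookup;
        }
    }
    return NULL;
}
```

A run of lookups that land in the same partition costs one scan in total, which is the batching effect the message describes.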
Jens Axboe [Mon, 20 Oct 2008 13:44:28 +0000 (15:44 +0200)]
cfq-iosched: remove limit of dispatch depth of max 4 times quantum
This basically limits the hardware queue depth to 4*quantum at any
point in time, which is 16 with the default settings. As CFQ uses
other means to shrink the hardware queue when necessary in the first
place, there's really no need for this extra heuristic. Additionally,
it ends up hurting performance in some cases.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 31 Oct 2008 09:06:37 +0000 (10:06 +0100)]
nbd: tell the block layer that it is not a rotational device
Then we can get rid of that manual elevator type fiddling.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 31 Oct 2008 09:05:07 +0000 (10:05 +0100)]
block: get rid of elevator_t typedef
Just use struct elevator_queue everywhere instead.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Tue, 9 Dec 2008 07:11:22 +0000 (08:11 +0100)]
aio: make the lookup_ioctx() lockless
The mm->ioctx_list is currently protected by a reader-writer lock,
so we always grab that lock on the read side for doing ioctx
lookups. As the workload is extremely reader biased, turn this into
an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
the rwlock and use a spinlock for providing update side exclusion.
There's usually only 1 entry on this list, so it doesn't make sense
to look into fancier data structures.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Tue, 23 Dec 2008 11:42:54 +0000 (12:42 +0100)]
bio: add support for inlining a number of bio_vecs inside the bio
When we go and allocate a bio for IO, we actually do two allocations.
One for the bio itself, and one for the bi_io_vec that holds the
actual pages we are interested in.
This feature inlines a definable amount of io vecs inside the bio
itself, so we eliminate the bio_vec array allocation for IO's up
to a certain size. It defaults to 4 vecs, which is typically 16k
of IO.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
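A rough user-space sketch of the inlining idea, using a C99 flexible array member. All names here (io_desc, io_vec, INLINE_VECS) are hypothetical and the real bio allocation path differs; the point is only that small requests need one allocation instead of two. The "16k" figure corresponds to 4 vecs of one page each, assuming 4 KiB pages.

```c
#include <assert.h>
#include <stdlib.h>

#define INLINE_VECS 4 /* assumed default, matching the message above */

/* Hypothetical equivalent of one bio_vec entry. */
struct io_vec {
    void *page;
    unsigned int len;
    unsigned int offset;
};

/* Hypothetical descriptor with room for inline vecs at its tail. */
struct io_desc {
    unsigned short max_vecs;
    unsigned short vcnt;
    struct io_vec *vecs;         /* inline_vecs or a separate array */
    struct io_vec inline_vecs[]; /* flexible array member */
};

/* Allocate a descriptor; small requests use a single allocation. */
struct io_desc *io_desc_alloc(unsigned short nr_vecs)
{
    size_t inline_n = nr_vecs <= INLINE_VECS ? INLINE_VECS : 0;
    struct io_desc *d;

    d = calloc(1, sizeof(*d) + inline_n * sizeof(struct io_vec));
    if (!d)
        return NULL;

    if (inline_n) {
        d->vecs = d->inline_vecs;
        d->max_vecs = INLINE_VECS;
    } else {
        /* Large request: fall back to a second allocation. */
        d->vecs = calloc(nr_vecs, sizeof(struct io_vec));
        if (!d->vecs) {
            free(d);
            return NULL;
        }
        d->max_vecs = nr_vecs;
    }
    return d;
}

int io_desc_is_inline(const struct io_desc *d)
{
    assert(d != NULL);
    return d->vecs == d->inline_vecs;
}

void io_desc_free(struct io_desc *d)
{
    if (!d)
        return;
    if (d->vecs != d->inline_vecs)
        free(d->vecs);
    free(d);
}
```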
Jens Axboe [Wed, 10 Dec 2008 14:35:05 +0000 (15:35 +0100)]
bio: allow individual slabs in the bio_set
Instead of having a global bio slab cache, add a reference to one
in each bio_set that is created. This allows for personalized slabs
in each bio_set, so that they can have bios of different sizes.
This means we can personalize the bios we return. File systems may
want to embed the bio inside another structure, to avoid allocating
more items (and stuffing them in ->bi_private) after they get a bio.
Or we may want to embed a number of bio_vecs directly at the end
of a bio, to avoid doing two allocations to return a bio. This is now
possible.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Wed, 22 Oct 2008 18:32:58 +0000 (20:32 +0200)]
bio: move the slab pointer inside the bio_set
In preparation for adding differently sized bios.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 11 Dec 2008 10:53:43 +0000 (11:53 +0100)]
bio: only mempool back the largest bio_vec slab cache
We only very rarely need the mempool backing, so it makes sense to
get rid of all but one of the mempools in a bio_set. So keep the
largest bio_vec count mempool so we can always honor the largest
allocation, and "upgrade" callers that fail.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 17 Oct 2008 11:58:29 +0000 (13:58 +0200)]
block: don't use plugging on SSD devices
We just want to hand the first bits of IO to the device as fast
as possible. Gains a few percent on the IOPS rate.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:07 +0000 (13:32 +0900)]
block: fix empty barrier on write-through w/ ordered tag
Empty barrier on write-through (or no cache) w/ ordered tag has no
command to execute and without any command to execute ordered tag is
never issued to the device and the ordering is never achieved. Force
draining for such cases.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:06 +0000 (13:32 +0900)]
block: simplify empty barrier implementation
Empty barrier required special handling in __elv_next_request() to
complete it without letting the low level driver see it.
With previous changes, barrier code is now flexible enough to skip the
BAR step using the same barrier sequence selection mechanism. Drop
the special handling and mask off q->ordered from start_ordered().
Remove blk_empty_barrier() test which now has no user.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:05 +0000 (13:32 +0900)]
block: make barrier completion more robust
Barrier completion had the following assumptions.
* start_ordered() couldn't finish the whole sequence properly. If all
actions are to be skipped, q->ordseq is set correctly but the actual
completion was never triggered thus hanging the barrier request.
* Drain completion in elv_complete_request() assumed that there's
always at least one request in the queue when drain completes.
Both assumptions are true but these assumptions need to be removed to
improve empty barrier implementation. This patch makes the following
changes.
* Make start_ordered() use blk_ordered_complete_seq() to mark skipped
steps complete and notify __elv_next_request() that it should fetch
the next request if the whole barrier has completed inside
start_ordered().
* Make drain completion path in elv_complete_request() check whether
the queue is empty. Empty queue also indicates drain completion.
* While at it, convert 0/1 return from blk_do_ordered() to false/true.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:04 +0000 (13:32 +0900)]
block: make every barrier action optional
In all barrier sequences, the barrier write itself was always assumed
to be issued and thus didn't have corresponding control flag. This
patch adds QUEUE_ORDERED_DO_BAR and unifies action mask handling in
start_ordered() such that any barrier action can be skipped.
This patch doesn't introduce any visible behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:03 +0000 (13:32 +0900)]
block: remove duplicate or unused barrier/discard error paths
* Because barrier mode can be changed dynamically, whether barrier is
supported or not can be determined only when actually issuing the
barrier and there is no point in checking it earlier. Drop barrier
support check in generic_make_request() and __make_request(), and
update comment around the support check in blk_do_ordered().
* There is no reason to check discard support in both
generic_make_request() and __make_request(). Drop the check in
__make_request(). While at it, move error action block to the end
of the function and add unlikely() to q existence test.
* Barrier request, be it empty or not, is never passed to low level
driver and thus it's meaningless to try to copy back req->sector to
bio->bi_sector on error. In addition, the notion of failed sector
doesn't make any sense for empty barrier to begin with. Drop the
code block from __end_that_request_first().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 28 Nov 2008 04:32:02 +0000 (13:32 +0900)]
block: reorganize QUEUE_ORDERED_* constants
Separate out ordering type (drain,) and action masks (preflush,
postflush, fua) from visible ordering mode selectors
(QUEUE_ORDERED_*). Ordering types are now named QUEUE_ORDERED_BY_*
while action masks are named QUEUE_ORDERED_DO_*.
This change is necessary to add QUEUE_ORDERED_DO_BAR and make it
optional to improve empty barrier implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Richard Kennedy [Wed, 3 Dec 2008 11:41:40 +0000 (12:41 +0100)]
block: reorder struct bio to remove padding on 64bit
Remove 8 bytes of padding from struct bio which also removes 16 bytes from
struct bio_pair to make it 248 bytes. bio_pair then fits into one fewer
cache lines & into a smaller slab.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
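The effect of member reordering can be shown with a small stand-alone example. The structs below are hypothetical and far smaller than struct bio; on a typical 64-bit ABI (8-byte pointers, 4-byte int) the reordered struct is expected to be 8 bytes smaller, and a pair of them 16 bytes smaller, mirroring the bio and bio_pair numbers above.

```c
#include <assert.h>
#include <stddef.h>

/* Members interleaved so that each int is followed by padding
 * on ABIs where pointers are more strictly aligned than int. */
struct padded {
    int a;
    void *p;
    int b;
    void *q;
};

/* Same members, pointers first, the two ints share one slot. */
struct reordered {
    void *p;
    void *q;
    int a;
    int b;
};

/* A container holding two of each, analogous to a pair structure. */
struct pair_padded {
    struct padded first, second;
};

struct pair_reordered {
    struct reordered first, second;
};

/* Bytes saved per struct by reordering (0 where there was no padding). */
size_t padding_saved(void)
{
    assert(sizeof(struct reordered) <= sizeof(struct padded));
    return sizeof(struct padded) - sizeof(struct reordered);
}

/* Bytes saved for the two-element container. */
size_t pair_padding_saved(void)
{
    return sizeof(struct pair_padded) - sizeof(struct pair_reordered);
}
```

The exact sizes depend on the target ABI, so the savings should be measured (for example with pahole) rather than assumed.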
Cheng Renquan [Wed, 3 Dec 2008 11:41:39 +0000 (12:41 +0100)]
block: use cancel_work_sync() instead of kblockd_flush_work()
After many improvements on kblockd_flush_work, it is now identical to
cancel_work_sync, so a direct call to cancel_work_sync is suggested.
The only difference is that cancel_work_sync is a GPL symbol,
so no non-GPL modules anymore.
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Keith Mannthey [Tue, 25 Nov 2008 09:24:35 +0000 (10:24 +0100)]
block: Suppress Buffer I/O errors when SCSI REQ_QUIET flag set
Allow the scsi request REQ_QUIET flag to be propagated to the buffer
file system layer. The basic idea is to pass the flag from the scsi
request to the bio (block IO) and then to the buffer layer. The buffer
layer can then suppress needless printks.
This patch declutters the kernel log by removing the 40-50 (per lun)
buffer io error messages seen during a boot in my multipath setup. There
is a good chance any real errors will be missed in the "noise" in the
logs without this patch.
During boot I see blocks of messages like
"
__ratelimit: 211 callbacks suppressed
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242847
Buffer I/O error on device sdm, logical block 1
Buffer I/O error on device sdm, logical block 5242878
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242872
"
in my logs.
My disk environment is multipath fiber channel using the SCSI_DH_RDAC
code and multipathd. This topology includes an "active" and "ghost"
path for each lun. IOs to the "ghost" path will never complete and the
SCSI layer, via the scsi device handler rdac code, quickly returns the IOs
to these paths and sets the REQ_QUIET scsi flag to suppress the scsi
layer messages.
I want to extend the QUIET behavior to include the buffer file
system layer to deal with these errors as well. I have been running this
patch for a while now on several boxes without issue. A few runs of
bonnie++ show no noticeable difference in performance in my setup.
Thanks to John Stultz for the quiet_error finalization.
Submitted-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Wu Fengguang [Tue, 25 Nov 2008 08:08:39 +0000 (09:08 +0100)]
block: don't take lock on changing ra_pages
There's no need to take queue_lock or kernel_lock when modifying
bdi->ra_pages. So remove them. Also remove out of date comment for
queue_max_sectors_store().
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Nikanth Karthikesan [Mon, 24 Nov 2008 09:46:29 +0000 (10:46 +0100)]
Documentation: remove reference to ll_rw_blk.c and moved drivers/block/elevator.c
The drivers/block/ll_rw_block.c has been split and organized in the block/
directory, and also drivers/block/elevator.c has been moved to the block/
directory. Update Documentation/block/biodoc.txt accordingly
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Qinghuang Feng [Mon, 24 Nov 2008 09:43:36 +0000 (10:43 +0100)]
block/blk-tag.c: cleanup kernel-doc
There is no argument named @tags in blk_init_tags,
remove its comment.
Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 20 Nov 2008 08:46:09 +0000 (09:46 +0100)]
cciss: switch to using hlist for command list management
This both cleans up the code and also helps detect the spurious case
of an attempt to remove a command from a queue it doesn't belong
to.
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Nikanth Karthikesan [Wed, 19 Nov 2008 09:20:23 +0000 (10:20 +0100)]
Do not free io context when taking recursive faults in do_exit
When taking recursive faults in do_exit, if the io_context is not null,
exit_io_context() is being called. But it might decrement the refcount
more than once. It is better to leave this task alone.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Marcin Slusarz [Sun, 16 Nov 2008 18:06:37 +0000 (19:06 +0100)]
cdrom: reduce stack usage of mmc_ioctl_dvd_read_struct
1. kmalloc 192 bytes in dvd_read_bca (which is inlined into dvd_read_struct)
2. Pass struct packet_command to all dvd_read_* functions.
Checkstack output:
Before: mmc_ioctl_dvd_read_struct: 280
After: mmc_ioctl_dvd_read_struct: 56
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Marcin Slusarz [Sun, 16 Nov 2008 18:04:47 +0000 (19:04 +0100)]
cdrom: split mmc_ioctl to lower stack usage
Checkstack output:
Before:
mmc_ioctl: 584
After:
mmc_ioctl_dvd_read_struct: 280
mmc_ioctl_cdrom_subchannel: 152
mmc_ioctl_cdrom_read_data: 120
mmc_ioctl_cdrom_volume: 104
mmc_ioctl_cdrom_read_audio: 104
(mmc_ioctl is inlined into cdrom_ioctl - 104 bytes)
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Milton Miller [Mon, 17 Nov 2008 12:10:34 +0000 (13:10 +0100)]
scsi-ioctl: use clock_t <> jiffies
Convert the timeout ioctl scaling to use the clock_t functions
which are much more accurate with some USER_HZ vs HZ combinations.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
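One way scaling between USER_HZ ticks and jiffies can lose accuracy is by dividing the two rates before multiplying. The sketch below is a user-space illustration only: the function names are hypothetical, HZ and USER_HZ are passed in as parameters, and it does not reproduce the kernel's clock_t_to_jiffies() (which also guards against overflow).

```c
#include <assert.h>

/* Naive scaling: the integer ratio hz / user_hz is truncated first,
 * so any rate that is not an exact multiple of user_hz is wrong. */
unsigned long naive_to_jiffies(unsigned long ticks,
                               unsigned int hz, unsigned int user_hz)
{
    assert(user_hz != 0);
    return ticks * (hz / user_hz);
}

/* More accurate scaling: multiply in 64 bits first, then divide. */
unsigned long scaled_to_jiffies(unsigned long ticks,
                                unsigned int hz, unsigned int user_hz)
{
    assert(user_hz != 0);
    return (unsigned long)(((unsigned long long)ticks * hz) / user_hz);
}
```

For example, with an assumed HZ of 250 and USER_HZ of 100, a one-second timeout of 100 ticks becomes 200 jiffies with the naive form but 250 with the multiply-first form; when HZ is an exact multiple of USER_HZ the two agree.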
Jens Axboe [Wed, 19 Nov 2008 13:38:39 +0000 (14:38 +0100)]
block: leave the request timeout timer running even on an empty list
For sync IO, we'll often do them serialized. This means we'll be touching
the queue timer for every IO, as opposed to only occasionally like we
do for queued IO. Instead of deleting the timer when the last request
is removed, just let it continue running. If a new request comes up soon
we then don't have to re-add the timer. If no new requests arrive,
the timer will expire without side effect later.
This improves high iops sync IO by ~1%.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 30 Oct 2008 07:53:02 +0000 (08:53 +0100)]
block: add comment in blk_rq_timed_out() about why next can not be 0
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
malahal@us.ibm.com [Thu, 30 Oct 2008 07:51:58 +0000 (08:51 +0100)]
block: optimizations in blk_rq_timed_out_timer()
Now the rq->deadline can't be zero if the request is in the
timeout_list, so there is no need to have next_set. There is no need to
access a request's deadline field if blk_rq_timed_out is called on it.
Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fernando Luis Vázquez Cao [Mon, 27 Oct 2008 09:45:54 +0000 (18:45 +0900)]
xen-blkfront: set queue paravirt flag
Xen's blkfront sets noop as the default I/O scheduler at initialization
time to avoid elevator overheads such as idling, but with the advent of
basic disk profiling capabilities this is not necessary anymore. We
should just tell the block layer that we are a paravirt front-end driver
and the elevator will automatically make the necessary adjustments.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fernando Luis Vázquez Cao [Mon, 27 Oct 2008 09:45:15 +0000 (18:45 +0900)]
virtio_blk: set queue paravirt flag
As a paravirt front-end driver, virtio_blk is not a rotational device so
we want to avoid idling in AS/CFQ. Tell the block layer about this.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fernando Luis Vázquez Cao [Mon, 27 Oct 2008 09:44:46 +0000 (18:44 +0900)]
block: add queue flag for paravirt frontend drivers
As is the case with SSD devices, we do not want to idle in AS/CFQ when
the block device is a paravirt front-end driver. This patch adds a flag
(QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
virtio_blk and xen-blkfront to indicate a paravirtualized device.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
David S. Miller [Mon, 29 Dec 2008 04:19:47 +0000 (20:19 -0800)]
Merge branch 'master' of /linux/kernel/git/torvalds/linux-2.6
Conflicts:
arch/sparc64/kernel/idprom.c
David Howells [Mon, 29 Dec 2008 00:41:51 +0000 (00:41 +0000)]
KEYS: Fix variable uninitialisation warnings
Fix variable uninitialisation warnings introduced in:
commit 8bbf4976b59fc9fc2861e79cab7beb3f6d647640
Author: David Howells <dhowells@redhat.com>
Date: Fri Nov 14 10:39:14 2008 +1100
KEYS: Alter use of key instantiation link-to-keyring argument
As:
security/keys/keyctl.c: In function 'keyctl_negate_key':
security/keys/keyctl.c:976: warning: 'dest_keyring' may be used uninitialized in this function
security/keys/keyctl.c: In function 'keyctl_instantiate_key':
security/keys/keyctl.c:898: warning: 'dest_keyring' may be used uninitialized in this function
Some versions of gcc notice that get_instantiation_key() doesn't always set
*_dest_keyring, but fail to observe that if this happens then *_dest_keyring
will not be read by the caller.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
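The warning pattern described above can be reproduced in miniature. The sketch below is hypothetical (the function names and the string value are invented for illustration) and shows one common way to quiet such a warning, namely initialising the variable in the caller; the actual patch may have taken a different approach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup: writes *dest only when it succeeds (returns 0). */
int get_dest(int id, const char **dest)
{
    assert(dest != NULL);
    if (id < 0)
        return -1; /* *dest deliberately left untouched on failure */
    *dest = "keyring";
    return 0;
}

/* Caller reads dest only on the success path, so the code is correct
 * either way; the explicit NULL only helps compilers that cannot see
 * the correlation between the return value and the write to *dest. */
int use_dest(int id)
{
    const char *dest = NULL;

    if (get_dest(id, &dest) < 0)
        return -1;
    return dest != NULL;
}
```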
Linus Torvalds [Mon, 29 Dec 2008 00:54:33 +0000 (16:54 -0800)]
Merge branch 'next' of git://git./linux/kernel/git/paulus/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits)
powerpc/44x: Support 16K/64K base page sizes on 44x
powerpc: Force memory size to be a multiple of PAGE_SIZE
powerpc/32: Wire up the trampoline code for kdump
powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
powerpc/32: Allow __ioremap on RAM addresses for kdump kernel
powerpc/32: Setup OF properties for kdump
powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
powerpc: Prepare xmon_save_regs for use with kdump
powerpc: Remove default kexec/crash_kernel ops assignments
powerpc: Make default kexec/crash_kernel ops implicit
powerpc: Setup OF properties for ppc32 kexec
powerpc/pseries: Fix cpu hotplug
powerpc: Fix KVM build on ppc440
powerpc/cell: add QPACE as a separate Cell platform
powerpc/cell: fix build breakage with CONFIG_SPUFS disabled
powerpc/mpc5200: fix error paths in PSC UART probe function
powerpc/mpc5200: add rts/cts handling in PSC UART driver
powerpc/mpc5200: Make PSC UART driver update serial errors counters
powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver
powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver
...
Fix trivial conflict in drivers/char/Makefile as per Paul's directions
Stephen Rothwell [Sun, 28 Dec 2008 23:46:13 +0000 (10:46 +1100)]
net: ehea NAPI interface cleanup fix
Commit 908a7a16b852ffd618a9127be8d62432182d81b4 ("net: Remove unused
netdev arg from some NAPI interfaces") missed two spots.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen Rothwell [Wed, 3 Dec 2008 02:49:23 +0000 (13:49 +1100)]
cifs: update for new IP4/6 address printing
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Anholt [Tue, 23 Dec 2008 02:56:27 +0000 (18:56 -0800)]
agp/intel: Fix broken ® symbol in device name.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Zhenyu Wang [Mon, 17 Nov 2008 06:39:00 +0000 (14:39 +0800)]
agp/intel: add support for G41 chipset
Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Linus Torvalds [Sun, 28 Dec 2008 23:15:08 +0000 (15:15 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
smackfs: check for allocation failures in smk_set_access()
Linus Torvalds [Sun, 28 Dec 2008 23:13:48 +0000 (15:13 -0800)]
Merge git://git./linux/kernel/git/sam/kbuild-next
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (25 commits)
allow stripping of generated symbols under CONFIG_KALLSYMS_ALL
kbuild: strip generated symbols from *.ko
kbuild: simplify use of genksyms
kernel-doc: check for extra kernel-doc notations
kbuild: add headerdep used to detect inclusion cycles in header files
kbuild: fix string equality testing in tags.sh
kbuild: fix make tags/cscope
kbuild: fix make incompatibility
kbuild: remove TAR_IGNORE
setlocalversion: add git-svn support
setlocalversion: print correct subversion revision
scripts: improve the decodecode script
scripts/package: allow custom options to rpm
genksyms: allow to ignore symbol checksum changes
genksyms: track symbol checksum changes
tags and cscope support really belongs in a shell script
kconfig: fix options to check-lxdialog.sh
kbuild: gen_init_cpio expands shell variables in file names
remove bashisms from scripts/extract-ikconfig
kbuild: teach mkmakfile to be silent
...
Linus Torvalds [Sun, 28 Dec 2008 23:12:35 +0000 (15:12 -0800)]
Merge git://git./linux/kernel/git/wim/linux-2.6-nvram
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-nvram:
[PATCH] nvram - convert PRINT_PROC to seq_file
[PATCH] nvram - CodingStyle
James Morris [Sun, 28 Dec 2008 22:57:38 +0000 (09:57 +1100)]
Merge branch 'next' into for-linus
Ilya Yanok [Thu, 11 Dec 2008 01:55:41 +0000 (04:55 +0300)]
powerpc/44x: Support 16K/64K base page sizes on 44x
This adds support for 16k and 64k page sizes on PowerPC 44x processors.
The PGDIR table is much smaller than a page when using 16k or 64k
pages (512 and 32 bytes respectively) so we allocate the PGDIR with
kzalloc() instead of __get_free_pages().
One PTE table covers rather a large memory area when using 16k or 64k
pages (32MB or 512MB respectively), so we can easily put FIXMAP and
PKMAP in the area covered by one PTE table.
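The sizes quoted above can be reproduced with a little arithmetic. The
sketch below assumes 8-byte PTEs, 4-byte PGDIR entries, a PTE table that
fills exactly one page, and a 4 GiB address space. These parameters are
assumptions chosen because they match the figures in this message; they
are not taken from the patch itself.

```c
#include <stdint.h>

#define PTE_SIZE       8u	/* assumed bytes per PTE */
#define PGD_ENTRY_SIZE 4u	/* assumed bytes per PGDIR entry */

/* Bytes of address space mapped by one PTE table that fills one page. */
uint64_t pte_table_coverage(uint64_t page_size)
{
	return (page_size / PTE_SIZE) * page_size;
}

/* Bytes needed for a PGDIR spanning an assumed 4 GiB address space. */
uint64_t pgdir_bytes(uint64_t page_size)
{
	uint64_t address_space = 1ull << 32;

	return (address_space / pte_table_coverage(page_size)) * PGD_ENTRY_SIZE;
}
```

Under these assumptions, 16k pages give a 32MB PTE table span and a
512-byte PGDIR, and 64k pages give a 512MB span and a 32-byte PGDIR.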
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Vladimir Panfilov <pvr@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Hollis Blanchard [Wed, 26 Nov 2008 16:19:26 +0000 (10:19 -0600)]
powerpc: Force memory size to be a multiple of PAGE_SIZE
Ensure that total memory size is page-aligned, because otherwise
mark_bootmem() gets upset.
This error case was triggered by using 64 KiB pages in the kernel
while arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory
by 4096 (to work around a chip bug that affects the last 256 bytes of
physical memory).
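A minimal sketch of the alignment step, assuming the size is rounded down
and that the page size is a power of two. round_down_to_page() is a
hypothetical helper name, not necessarily what the patch uses.

```c
#include <stdint.h>

/* Round size down to a multiple of page_size (a power of two). */
uint64_t round_down_to_page(uint64_t size, uint64_t page_size)
{
	return size & ~(page_size - 1);
}
```

For example, 256 MiB minus 4096 bytes is still a multiple of 4 KiB, but
with 64 KiB pages it rounds down to 256 MiB minus 64 KiB.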
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Linus Torvalds [Sun, 28 Dec 2008 20:54:07 +0000 (12:54 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: use the new byteorder headers
fbcon: Protect free_irq() by MACH_IS_ATARI check
fbcon: remove broken mac vbl handler
m68k: fix trigraph ignored warning in setox.S
macfb annotations and compiler warning fix
m68k: mac baboon interrupt enable/disable
m68k: machw.h cleanup
m68k: Mac via cleanup and commentry
m68k: Reinstate mac rtc
Linus Torvalds [Sun, 28 Dec 2008 20:49:40 +0000 (12:49 -0800)]
Merge git://git./linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
net: Allow dependancies of FDDI & Tokenring to be modular.
igb: Fix build warning when DCA is disabled.
net: Fix warning fallout from recent NAPI interface changes.
gro: Fix potential use after free
sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
sfc: When disabling the NIC, close the device rather than unregistering it
sfc: SFT9001: Add cable diagnostics
sfc: Add support for multiple PHY self-tests
sfc: Merge top-level functions for self-tests
sfc: Clean up PHY mode management in loopback self-test
sfc: Fix unreliable link detection in some loopback modes
sfc: Generate unique names for per-NIC workqueues
802.3ad: use standard ethhdr instead of ad_header
802.3ad: generalize out mac address initializer
802.3ad: initialize ports LACPDU from const initializer
802.3ad: remove typedef around ad_system
802.3ad: turn ports is_individual into a bool
802.3ad: turn ports is_enabled into a bool
802.3ad: make ntt bool
ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
...
Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
Linus Torvalds [Sun, 28 Dec 2008 20:37:14 +0000 (12:37 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (31 commits)
[CIFS] Remove redundant test
[CIFS] make sure that DFS pathnames are properly formed
Remove an already-checked error condition in SendReceiveBlockingLock
Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition
Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition
[CIFS] Streamline SendReceive[2] by using "goto out:" in an error condition
Slightly streamline SendReceive[2]
Check the return value of cifs_sign_smb[2]
[CIFS] Cleanup: Move the check for too large R/W requests
[CIFS] Slightly simplify wait_for_free_request(), remove an unnecessary "else" branch
Simplify allocate_mid() slightly: Remove some unnecessary "else" branches
[CIFS] In SendReceive, move consistency check out of the mutexed region
cifs: store password in tcon
cifs: have calc_lanman_hash take more granular args
cifs: zero out session password before freeing it
cifs: fix wait_for_response to time out sleeping processes correctly
[CIFS] Can not mount with prefixpath if root directory of share is inaccessible
[CIFS] various minor cleanups pointed out by checkpatch script
[CIFS] fix typo
[CIFS] remove sparse warning
...
Fix trivial conflict in fs/cifs/cifs_fs_sb.h due to comment changes for
the CIFS_MOUNT_xyz bit definitions between cifs updates and security
updates.
Linus Torvalds [Sun, 28 Dec 2008 20:33:59 +0000 (12:33 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
IB/mlx4: Set ownership bit correctly when copying CQEs during CQ resize
RDMA/nes: Remove tx_free_list
RDMA/cma: Add IPv6 support
RDMA/addr: Add support for translating IPv6 addresses
mlx4_core: Delete incorrect comment
mlx4_core: Add support for multiple completion event vectors
IB/iser: Avoid recv buffer exhaustion caused by unexpected PDUs
IB/ehca: Remove redundant test of vpage
IB/ehca: Replace modulus operations in flush error completion path
IB/ipath: Add locking for interrupt use of ipath_pd contexts vs free
IB/ipath: Fix spi_pioindex value
IB/ipath: Only do 1X workaround on rev1 chips
IB/ipath: Don't count IB symbol and link errors unless link is UP
IB/ipath: Check return value of dma_map_single()
IB/ipath: Fix PSN of send WQEs after an RDMA read resend
RDMA/nes: Cleanup warnings
RDMA/nes: Add loopback check to make_cm_node()
RDMA/nes: Check cqp_avail_reqs is empty after locking the list
RDMA/nes: Fix TCP compliance test failures
RDMA/nes: Forward packets for a new connection with stale APBVT entry
...
Linus Torvalds [Sun, 28 Dec 2008 20:33:21 +0000 (12:33 -0800)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (85 commits)
[S390] provide documentation for hvc_iucv kernel parameter.
[S390] convert ctcm printks to dev_xxx and pr_xxx macros.
[S390] convert zfcp printks to pr_xxx macros.
[S390] convert vmlogrdr printks to pr_xxx macros.
[S390] convert zfcp dumper printks to pr_xxx macros.
[S390] convert cpu related printks to pr_xxx macros.
[S390] convert qeth printks to dev_xxx and pr_xxx macros.
[S390] convert sclp printks to pr_xxx macros.
[S390] convert iucv printks to dev_xxx and pr_xxx macros.
[S390] convert ap_bus printks to pr_xxx macros.
[S390] convert dcssblk and extmem printks messages to pr_xxx macros.
[S390] convert monwriter printks to pr_xxx macros.
[S390] convert s390 debug feature printks to pr_xxx macros.
[S390] convert monreader printks to pr_xxx macros.
[S390] convert appldata printks to pr_xxx macros.
[S390] convert setup printks to pr_xxx macros.
[S390] convert hypfs printks to pr_xxx macros.
[S390] convert time printks to pr_xxx macros.
[S390] convert cpacf printks to pr_xxx macros.
[S390] convert cio printks to pr_xxx macros.
...
Linus Torvalds [Sun, 28 Dec 2008 20:27:58 +0000 (12:27 -0800)]
Merge branch 'sched-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
sched: fix warning in fs/proc/base.c
schedstat: consolidate per-task cpu runtime stats
sched: use RCU variant of list traversal in for_each_leaf_rt_rq()
sched, cpuacct: export percpu cpuacct cgroup stats
sched, cpuacct: refactoring cpuusage_read / cpuusage_write
sched: optimize update_curr()
sched: fix wakeup preemption clock
sched: add missing arch_update_cpu_topology() call
sched: let arch_update_cpu_topology indicate if topology changed
sched: idle_balance() does not call load_balance_newidle()
sched: fix sd_parent_degenerate on non-numa smp machine
sched: add uid information to sched_debug for CONFIG_USER_SCHED
sched: move double_unlock_balance() higher
sched: update comment for move_task_off_dead_cpu
sched: fix inconsistency when redistribute per-cpu tg->cfs_rq shares
sched/rt: removed unneeded defintion
sched: add hierarchical accounting to cpu accounting controller
sched: include group statistics in /proc/sched_debug
sched: rename SCHED_NO_NO_OMIT_FRAME_POINTER => SCHED_OMIT_FRAME_POINTER
sched: clean up SCHED_CPUMASK_ALLOC
...
Linus Torvalds [Sun, 28 Dec 2008 20:21:10 +0000 (12:21 -0800)]
Merge branch 'tracing-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
sched, trace: update trace_sched_wakeup()
tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Revert "x86: disable X86_PTRACE_BTS"
ring-buffer: prevent false positive warning
ring-buffer: fix dangling commit race
ftrace: enable format arguments checking
x86, bts: memory accounting
x86, bts: add fork and exit handling
ftrace: introduce tracing_reset_online_cpus() helper
tracing: fix warnings in kernel/trace/trace_sched_switch.c
tracing: fix warning in kernel/trace/trace.c
tracing/ring-buffer: remove unused ring_buffer size
trace: fix task state printout
ftrace: add not to regex on filtering functions
trace: better use of stack_trace_enabled for boot up code
trace: add a way to enable or disable the stack tracer
x86: entry_64 - introduce FTRACE_ frame macro v2
tracing/ftrace: add the printk-msg-only option
tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
x86, bts: correctly report invalid bts records
...
Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
Linus Torvalds [Sun, 28 Dec 2008 20:07:57 +0000 (12:07 -0800)]
Merge branch 'x86-core-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (246 commits)
x86: traps.c replace #if CONFIG_X86_32 with #ifdef CONFIG_X86_32
x86: PAT: fix address types in track_pfn_vma_new()
x86: prioritize the FPU traps for the error code
x86: PAT: pfnmap documentation update changes
x86: PAT: move track untrack pfnmap stubs to asm-generic
x86: PAT: remove follow_pfnmap_pte in favor of follow_phys
x86: PAT: modify follow_phys to return phys_addr prot and return value
x86: PAT: clarify is_linear_pfn_mapping() interface
x86: ia32_signal: remove unnecessary declaration
x86: common.c boot_cpu_stack and boot_exception_stacks should be static
x86: fix intel x86_64 llc_shared_map/cpu_llc_id anomolies
x86: fix warning in arch/x86/kernel/microcode_amd.c
x86: ia32.h: remove unused struct sigfram32 and rt_sigframe32
x86: asm-offset_64: use rt_sigframe_ia32
x86: sigframe.h: include headers for dependency
x86: traps.c declare functions before they get used
x86: PAT: update documentation to cover pgprot and remap_pfn related changes - v3
x86: PAT: add pgprot_writecombine() interface for drivers - v3
x86: PAT: change pgprot_noncached to uc_minus instead of strong uc - v3
x86: PAT: implement track/untrack of pfnmap regions for x86 - v3
...
Linus Torvalds [Sun, 28 Dec 2008 19:43:54 +0000 (11:43 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (105 commits)
SELinux: don't check permissions for kernel mounts
security: pass mount flags to security_sb_kern_mount()
SELinux: correctly detect proc filesystems of the form "proc/foo"
Audit: Log TIOCSTI
user namespaces: document CFS behavior
user namespaces: require cap_set{ug}id for CLONE_NEWUSER
user namespaces: let user_ns be cloned with fairsched
CRED: fix sparse warnings
User namespaces: use the current_user_ns() macro
User namespaces: set of cleanups (v2)
nfsctl: add headers for credentials
coda: fix creds reference
capabilities: define get_vfs_caps_from_disk when file caps are not enabled
CRED: Allow kernel services to override LSM settings for task actions
CRED: Add a kernel_service object class to SELinux
CRED: Differentiate objective and effective subjective credentials on a task
CRED: Documentation
CRED: Use creds in file structs
CRED: Prettify commoncap.c
CRED: Make execve() take advantage of copy-on-write credentials
...
Linus Torvalds [Sun, 28 Dec 2008 19:43:22 +0000 (11:43 -0800)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (57 commits)
crypto: aes - Precompute tables
crypto: talitos - Ack done interrupt in isr instead of tasklet
crypto: testmgr - Correct comment about deflate parameters
crypto: salsa20 - Remove private wrappers around various operations
crypto: des3_ede - permit weak keys unless REQ_WEAK_KEY set
crypto: sha512 - Switch to shash
crypto: sha512 - Move message schedule W[80] to static percpu area
crypto: michael_mic - Switch to shash
crypto: wp512 - Switch to shash
crypto: tgr192 - Switch to shash
crypto: sha256 - Switch to shash
crypto: md5 - Switch to shash
crypto: md4 - Switch to shash
crypto: sha1 - Switch to shash
crypto: rmd320 - Switch to shash
crypto: rmd256 - Switch to shash
crypto: rmd160 - Switch to shash
crypto: rmd128 - Switch to shash
crypto: null - Switch to shash
crypto: hash - Make setkey optional
...
Linus Torvalds [Sun, 28 Dec 2008 19:41:32 +0000 (11:41 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (367 commits)
ALSA: ASoC: fix a typo in omp-pcm.c
ASoC: Fix DSP formats in SSM2602 audio codec
ASoC: Fix incorrect DSP format in OMAP McBSP DAI and affected drivers
ALSA: hda: fix incorrect mixer index values for 92hd83xx
ALSA: hda: dinput_mux check
ALSA: hda - Add quirk for another HP dv7
ALSA: ASoC - Add missing __devexit annotation to wm8350.c
ALSA: ASoc: DaVinci: davinci-evm use dsp_b mode
ALSA: ASoC: DaVinci: i2s, evm, pass same value to codec and cpu_dai
ALSA: ASoC: tlv320aic3x add dsp_a
ALSA: ASoC: DaVinci: document I2S limitations
ALSA: ASoC: DaVinci: davinci-i2s clean up
ALSA: ASoC: DaVinci: davinci-i2s clean up
ALSA: ASoC: DaVinci: davinci-i2s add comments to explain polarity
ALSA: ASoC: DaVinci: davinvi-evm, make requests explicit
ALSA: ca0106 - disable 44.1kHz capture
ALSA: ca0106 - Add missing card->private_data initialization
ALSA: ca0106 - Check ac97 availability at PM
ALSA: hda - Power up always when no jack detection is available
ALSA: hda - Fix unused variable warnings in patch_sigmatel.c
...
Linus Torvalds [Sun, 28 Dec 2008 19:39:19 +0000 (11:39 -0800)]
Merge git://git./linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (132 commits)
sh: oprofile: Fix up the module build.
sh: add UIO support for JPU on SH7722.
serial: sh-sci: Fix up port pinmux for SH7366.
sh: mach-rsk: Use uImage generation by default for rsk7201/7203.
sh: mach-sh03: Fix up pata_platform build breakage.
sh: enable deferred io LCDC on Migo-R
video: sh_mobile_lcdcfb deferred io support
video: deferred io with physically contiguous memory
video: deferred io cleanup
video: fix deferred io fsync()
sh: add LCDC interrupt configuration to AP325 and Migo-R
sh_mobile_lcdc: use FB_SYS helpers instead of FB_CFB
sh: split coherent pages
sh: dma: Kill off ISA DMA wrapper.
sh: Conditionalize the code dumper on CONFIG_DUMP_CODE.
sh: Kill off the unused SH_ALPHANUMERIC debug option.
sh: Enable skipping of bss on debug platforms for sh32 also.
doc: Update sh cpufreq documentation.
sh: mrshpc_setup_windows() needs to be inline.
serial: sh-sci: sci_poll_get_char() is only used by CONFIG_CONSOLE_POLL.
...
Harvey Harrison [Tue, 18 Nov 2008 19:45:23 +0000 (20:45 +0100)]
m68k: use the new byteorder headers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Geert Uytterhoeven [Tue, 18 Nov 2008 19:45:23 +0000 (20:45 +0100)]
fbcon: Protect free_irq() by MACH_IS_ATARI check
Add missing check for Atari in free_irq() call, which could cause problems on
multi-platform m68k kernels.
Reported-by: Brad Boyer <flar@allandria.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:23 +0000 (20:45 +0100)]
fbcon: remove broken mac vbl handler
Remove the Mac VBL interrupt code as it doesn't work properly and
doesn't bring any benefit when fixed. Also remove unused
DEFAULT_CURSOR_BLINK_RATE macro and irqres variable.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:22 +0000 (20:45 +0100)]
m68k: fix trigraph ignored warning in setox.S
Fix the warning: trigraph ??/ ignored, use -trigraphs to enable
caused by the recent removal of -traditional option.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:22 +0000 (20:45 +0100)]
macfb annotations and compiler warning fix
Add some __iomem annotations. Remove some volatile qualifiers to fix
several compiler warnings: "passing arg 1 of `iounmap' discards qualifiers
from pointer target type".
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:21 +0000 (20:45 +0100)]
m68k: mac baboon interrupt enable/disable
No-one seems to know how to mask individual baboon interrupts, so we just
mask the umbrella IRQ. This will work as long as only the IDE driver uses
the baboon chip (it can't deadlock). Use mac_enable_irq/mac_disable_irq
rather than enable_irq/disable_irq because the latter routines count the
depth of nested calls which triggers a warning and call trace because
IRQ_NUBUS_C is enabled twice in a row (once when the baboon handler is
registered, and once when the IDE IRQ is registered).
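The depth counting mentioned above can be illustrated with a toy model.
This is not the kernel's implementation: struct toy_irq and both functions
are hypothetical, and the warning is reduced to a counter.

```c
/* Toy model: depth 0 means enabled; each disable nests one level deeper. */
struct toy_irq {
	int depth;
	int warnings;
};

void toy_disable_irq(struct toy_irq *irq)
{
	irq->depth++;
}

void toy_enable_irq(struct toy_irq *irq)
{
	if (irq->depth == 0) {
		irq->warnings++;	/* unbalanced enable: already enabled */
		return;
	}
	irq->depth--;
}
```

Starting from depth 1 (disabled once), two enables in a row leave the IRQ
enabled but record one warning, which is the situation described for
IRQ_NUBUS_C.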
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:20 +0000 (20:45 +0100)]
m68k: machw.h cleanup
Remove some more cruft from machw.h and drop the #include where it isn't
needed.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:20 +0000 (20:45 +0100)]
m68k: Mac via cleanup and commentry
No behavioural changes, just cleanups and better documentation.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Finn Thain [Tue, 18 Nov 2008 19:45:20 +0000 (20:45 +0100)]
m68k: Reinstate mac rtc
Reinstate the Mac hardware clock for CUDA ADB and Mac II ADB models.
It doesn't work properly on Mac IIsi ADB and PMU ADB yet, so leave them
out.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Dave Jones [Sun, 28 Dec 2008 04:43:48 +0000 (20:43 -0800)]
net: Allow dependancies of FDDI & Tokenring to be modular.
I noticed it isn't possible to build token ring & fddi drivers
without forcing LLC and a bunch of other things to be
built-in. For distro kernels, this means carrying a chunk of
code in the vmlinuz, even if the user doesn't use those protocols.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sat, 27 Dec 2008 08:56:29 +0000 (00:56 -0800)]
sparc: move select of ARCH_SUPPORTS_MSI
It is counterintuitive to have the select listed
as part of the PCI option.
Move the select to the SPARC64 specific part of the config.
PCI_MSI has a dependency on PCI so it does no harm to have
it always selected.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sat, 27 Dec 2008 08:55:45 +0000 (00:55 -0800)]
sparc: drop SUN_IO
SUN_IO is always 'y', so drop it, thus killing an ifdef/endif pair.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sat, 27 Dec 2008 08:35:12 +0000 (00:35 -0800)]
sparc: unify sections.h
While doing this use standard names for start/end
so we could use definitions straight from asm-generic
for all the typical symbols.
This also allowed us to drop the use of PROVIDE in the linker
script so sparc is less non-standard in this area.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sat, 27 Dec 2008 08:34:41 +0000 (00:34 -0800)]
sparc: use .data.init_task section for init_thread_union
Use a dedicated aligned section for the init_thread_union
variable and declare this section in vmlinux.lds.
This aligns sparc with most other architectures. Eventually this allows
the init_task bits to be unified across all architectures.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Reif [Fri, 26 Dec 2008 23:39:11 +0000 (15:39 -0800)]
sparc: fix array overrun check in of_device_64.c
Do the array length check and fixup before copying the array.
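The ordering matters because a copy with an unchecked length writes past
the destination. A minimal sketch of check-then-copy follows; MAX_IRQS and
copy_irqs() are hypothetical and are not the names used in of_device_64.c.

```c
#include <string.h>

#define MAX_IRQS 4	/* hypothetical capacity of the destination array */

/* Clamp n to the destination capacity first, then copy. Returns the count copied. */
size_t copy_irqs(unsigned int dst[MAX_IRQS], const unsigned int *src, size_t n)
{
	if (n > MAX_IRQS)
		n = MAX_IRQS;	/* length check and fixup come before the copy */
	memcpy(dst, src, n * sizeof(*src));
	return n;
}
```

Doing the clamp after the memcpy() would leave the overrun in place and
only correct the recorded length.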
Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Fri, 26 Dec 2008 23:38:17 +0000 (15:38 -0800)]
sparc: unify module.c
o Copy module_64.c to module.c
o Add all sparc specific bits to module.c
o delete module_32.c
o update Makefile
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Fri, 26 Dec 2008 23:37:24 +0000 (15:37 -0800)]
sparc64: prepare module_64.c for unification
o Introduce a helper function
o Combine sparc64 specific case values
o add ifdef's around sparc64 code snippets
Note: The ifdef around the BUG_ON is highly questionable
but for now the safe approach was taken
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Fri, 26 Dec 2008 23:36:29 +0000 (15:36 -0800)]
sparc64: use bit neutral Elf symbols
To prepare for unification use the bit neutral versions of
the Elf types defined by asm/module.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Fri, 26 Dec 2008 23:35:41 +0000 (15:35 -0800)]
sparc: unify module.h
Use some preprocessor magic in combination with the
newly introduced CONFIG_BITS to unify module.h.
A few additional symbols are added as they are needed in a follow-up patch
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Fri, 26 Dec 2008 23:35:16 +0000 (15:35 -0800)]
sparc: introduce CONFIG_BITS
CONFIG_BITS is set to 32 for sparc32
and 64 for sparc64.
This allows us to use this symbol in for example header files
to ease unification of sparc32 and sparc64.
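One way a header could use such a symbol is sketched below. toy_addr_t is
a hypothetical type for illustration only, and the fallback #define stands
in for the value the build configuration would normally supply.

```c
#include <stdint.h>

#ifndef CONFIG_BITS
#define CONFIG_BITS 64	/* stand-in: normally set by the kernel config */
#endif

/* Select a width-specific type from a single shared header. */
#if CONFIG_BITS == 32
typedef uint32_t toy_addr_t;
#elif CONFIG_BITS == 64
typedef uint64_t toy_addr_t;
#else
#error "CONFIG_BITS must be 32 or 64"
#endif
```

A numeric symbol lets one header serve both variants with a comparison
instead of two separate per-architecture files.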
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Fri, 26 Dec 2008 23:33:07 +0000 (15:33 -0800)]
sparc: fix hardirq.h removal fallout
When hardirq.h is removed from asm-generic/local.h a few
bits fail to build. Fix these upfront.
Reported by Alexey Dobriyan.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Dec 2008 23:13:55 +0000 (15:13 -0800)]
igb: Fix build warning when DCA is disabled.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Dec 2008 23:10:00 +0000 (15:10 -0800)]
net: Fix warning fallout from recent NAPI interface changes.
When we removed the network device argument from several
NAPI interfaces in 908a7a16b852ffd618a9127be8d62432182d81b4
("net: Remove unused netdev arg from some NAPI interfaces.")
several drivers started getting unused variable warnings.
This fixes those up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 26 Dec 2008 22:57:42 +0000 (14:57 -0800)]
gro: Fix potential use after free
The initial skb may have been freed after napi_gro_complete in
napi_gro_receive if it was merged into an existing packet. Thus
we cannot check same_flow (which indicates whether it was merged)
after calling napi_gro_complete.
This patch fixes this by saving the same_flow status before the
call to napi_gro_complete.
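The shape of the fix can be shown with a toy example. struct toy_pkt,
toy_complete() and toy_receive() are hypothetical and do not mirror the
real napi_gro_* signatures; the point is only that the flag is copied
before the call that may free the object.

```c
#include <stdbool.h>
#include <stdlib.h>

struct toy_pkt {
	bool same_flow;	/* set when the packet was merged into another */
};

/* Stand-in for a completion step that may free the packet. */
void toy_complete(struct toy_pkt *pkt)
{
	if (pkt->same_flow)
		free(pkt);	/* merged packets are released here */
}

/* Reports whether pkt was merged without touching pkt after completion. */
bool toy_receive(struct toy_pkt *pkt)
{
	bool merged = pkt->same_flow;	/* save before pkt may be freed */

	toy_complete(pkt);
	return merged;			/* reads the saved copy, not pkt */
}
```

Reading pkt->same_flow after toy_complete() would be a use after free
whenever the packet had been merged.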
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 26 Dec 2008 21:49:25 +0000 (13:49 -0800)]
sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
When AN is enabled and the link is down the speed/duplex control bits
will not be meaningful. Use the advertising bits instead, and mask
them with the LPA bits if and only if AN is complete (as before).
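The rule described above reduces to a small bitmask operation. In the
sketch below negotiated_caps() is a hypothetical helper and the bit values
passed to it are arbitrary; neither reflects the driver's real register
layout.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Start from our own advertised capabilities; narrow them to what the
 * link partner also advertises (LPA) only once autonegotiation is complete.
 */
uint32_t negotiated_caps(uint32_t advertising, uint32_t lpa, bool an_complete)
{
	uint32_t caps = advertising;

	if (an_complete)
		caps &= lpa;
	return caps;
}
```

While the link is down the result is simply the advertised set, so the
reported speed/duplex no longer depends on stale control bits.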
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 26 Dec 2008 21:48:51 +0000 (13:48 -0800)]
sfc: When disabling the NIC, close the device rather than unregistering it
This should reduce user confusion and may also aid recovery (ioctls
will still be available).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>