Márton Németh [Sat, 9 Jan 2010 14:10:34 +0000 (15:10 +0100)]
mtd: nand: make PCI device id constant
The id_table field of the struct pci_driver is constant in <linux/pci.h>
so it is worth to make cafe_nand_tbl also constant.
The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
struct I1 {
...
const struct I2 *x;
...
};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
struct I1 y = {
.x = E,
};
@c@
identifier r.I2;
identifier s.E;
@@
const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+ const
struct I2 E[] = ...;
// </smpl>
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Andrey Yurovsky [Thu, 17 Dec 2009 20:31:20 +0000 (12:31 -0800)]
mtd: nandsim: fix spelling
s/nanodeconds/nanoseconds
Signed-off-by: Andrey Yurovsky <yurovsky@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Peter Huewe [Thu, 7 Jan 2010 01:51:13 +0000 (02:51 +0100)]
mtd: orion_nand: Fix build failure caused by typo
Commit
e99030609e27abff7e1a868cb56384c678b09984 ("mtd: orion_nand.c: add
error handling and use resource_size()") introduced a build error -- it
assigns something to a undeclared variable 'err', whereas the rest of
the code uses 'ret' for this task.
This patch fixes this typo and thus removes the build failure.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Reviewed-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
H Hartley Sweeten [Tue, 5 Jan 2010 21:59:58 +0000 (14:59 -0700)]
mtd: Remove now-defunct ts7250 nand driver
The ts72xx platform has been updated to use the generic platform nand
driver (plat_nand.c). This removes the now-defunct ts7250.c nand driver.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Matthieu Crapet <mcrapet@gmail.com>
Cc: Jesse Off <joff@embeddedARM.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Tue, 5 Jan 2010 21:59:56 +0000 (14:59 -0700)]
mtd: Update ep93xx/ts72xx to use generic platform nand driver
Update the ts72xx platform's nand driver support.
This changes the ts72xx platform from using a custom nand driver
(ts7250.c) to the generic platform nand driver (plat_nand.c).
Tested on TS-7250 with 32MiB NAND.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Matthieu Crapet <mcrapet@gmail.com>
Cc: Jesse Off <joff@embeddedARM.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Sat, 2 Jan 2010 04:35:54 +0000 (20:35 -0800)]
MTD DocBook: fix ioremap return type
ioremap() returns a void __iomem * not an unsigned long. Update the
Documentation file to reflect this.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Julia Lawall [Tue, 29 Dec 2009 19:15:23 +0000 (20:15 +0100)]
mtd: physmap_of: Correct the size argument to kzalloc
mtd_list has type struct mtd_info **, not struct mtd_info *, so the
elements of the array should have pointer type, not structure type.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@disable sizeof_type_expr@
type T;
T **x;
@@
x =
<+...sizeof(
- T
+ *x
)...+>
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 1 Jan 2010 12:16:47 +0000 (12:16 +0000)]
mtd: nand: rename w90p910_nand.c to nuc900_nand.c
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Wan ZongShun [Fri, 1 Jan 2010 10:03:47 +0000 (18:03 +0800)]
ARM: NUC900: rename mtd nand driver name
Due to I have renamed the platform_device.name,so this patch changes
this nand driver platform_driver name.
Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:59:27 +0000 (16:59 -0500)]
mtd: drivers/mtd/nand/sh_flctl.c: use resource_size()
Use resource_size().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 22:11:44 +0000 (17:11 -0500)]
mtd: tmio_nand.c: use dev_get_platdata() and resource_size()
Remove unnecessary casts and use dev_get_platdata() to retrieve the
struct mfd_cell data from the platform.
Use resource_size() for the ioremap()'s.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:56:22 +0000 (16:56 -0500)]
mtd: drivers/mtd/nand/s3c2410.c: use resource_size()
Use resource_size().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:48:34 +0000 (16:48 -0500)]
mtd: orion_nand.c: add error handling and use resource_size()
Use platform_get_resource() to fetch the memory resource and
add error handling for when it is missing. Use resource_size()
for the ioremap().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:36:37 +0000 (16:36 -0500)]
mtd: nomadik_nand.c: use resource_size()
Use resource_size().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:30:46 +0000 (16:30 -0500)]
mtd: drivers/mtd/nand/gpio.c: use resource_size()
Use resource_size().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:22:49 +0000 (16:22 -0500)]
mtd: fls_upm.c: use resource_size()
Use resource_size().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:19:44 +0000 (16:19 -0500)]
mtd: fsl_elbc_nand.c: user resource_size()
Use resource_size().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:13:13 +0000 (16:13 -0500)]
mtd: davinci_nand.c: use resource_size()
The ioremap'ed sizes are off by 1; use resource_size() for correct value.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:08:37 +0000 (16:08 -0500)]
mtd: au1550nd.c: remove unnecessary casts
Remove unnecessary casts for p_nand, it is already a void __iomem *.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
H Hartley Sweeten [Mon, 14 Dec 2009 21:04:07 +0000 (16:04 -0500)]
mtd: au1550nd.c: use kzalloc()
Use kzalloc() instead of kmalloc()/memset().
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
KOSAKI Motohiro [Tue, 22 Dec 2009 03:15:43 +0000 (03:15 +0000)]
kmsg_dump: Dump on crash_kexec as well
crash_kexec gets called before kmsg_dump(KMSG_DUMP_OOPS) if
panic_on_oops is set, so the kernel log buffer is not stored
for this case.
This patch adds a KMSG_DUMP_KEXEC dump type which gets called
when crash_kexec() is invoked. To avoid getting double dumps,
the old KMSG_DUMP_PANIC is moved below crash_kexec(). The
mtdoops driver is modified to handle KMSG_DUMP_KEXEC in the
same way as a panic.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Linus Torvalds [Wed, 16 Dec 2009 18:23:43 +0000 (10:23 -0800)]
Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (90 commits)
jffs2: Fix long-standing bug with symlink garbage collection.
mtd: OneNAND: Fix test of unsigned in onenand_otp_walk()
mtd: cfi_cmdset_0002, fix lock imbalance
Revert "mtd: move mxcnd_remove to .exit.text"
mtd: m25p80: add support for Macronix MX25L4005A
kmsg_dump: fix build for CONFIG_PRINTK=n
mtd: nandsim: add support for 4KiB pages
mtd: mtdoops: refactor as a kmsg_dumper
mtd: mtdoops: make record size configurable
mtd: mtdoops: limit the maximum mtd partition size
mtd: mtdoops: keep track of used/unused pages in an array
mtd: mtdoops: several minor cleanups
core: Add kernel message dumper to call on oopses and panics
mtd: add ARM pismo support
mtd: pxa3xx_nand: Fix PIO data transfer
mtd: nand: fix multi-chip suspend problem
mtd: add support for switching old SST chips into QRY mode
mtd: fix M29W800D dev_id and uaddr
mtd: don't use PF_MEMALLOC
mtd: Add bad block table overrides to Davinci NAND driver
...
Fixed up conflicts (mostly trivial) in
drivers/mtd/devices/m25p80.c
drivers/mtd/maps/pcmciamtd.c
drivers/mtd/nand/pxa3xx_nand.c
kernel/printk.c
Linus Torvalds [Wed, 16 Dec 2009 18:12:07 +0000 (10:12 -0800)]
Merge branch 'next' of git://git./linux/kernel/git/kyle/parisc-2.6
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
parisc: Fixup last users of irq_chip->typename
parisc: convert /proc/pdc/{lcd,led} to seq_file
parisc: Convert BUG() to use unreachable()
parisc: Replace old style lock init in smp.c
parisc: use sort() instead of home-made implementation (v2)
parisc: add CALLER_ADDR{0-6} macros
parisc: remove unused IRQSTAT_SIRQ_PEND and IRQSTAT_SZ defines
parisc: remove duplicated #include
Linus Torvalds [Wed, 16 Dec 2009 18:11:38 +0000 (10:11 -0800)]
Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
implement early_io{re,un}map for ia64
Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls"
intel-iommu: ignore page table validation in pass through mode
intel-iommu: Fix oops with intel_iommu=igfx_off
intel-iommu: Check for an RMRR which ends before it starts.
intel-iommu: Apply BIOS sanity checks for interrupt remapping too.
intel-iommu: Detect DMAR in hyperspace at probe time.
dmar: Fix build failure without NUMA, warn on bogus RHSA tables and don't abort
iommu: Allocate dma-remapping structures using numa locality info
intr_remap: Allocate intr-remapping table using numa locality info
dmar: Allocate queued invalidation structure using numa locality info
dmar: support for parsing Remapping Hardware Static Affinity structure
Linus Torvalds [Wed, 16 Dec 2009 18:09:43 +0000 (10:09 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
edac, mce, amd: silence GART TLB errors
edac, mce: correct corenum reporting
Linus Torvalds [Wed, 16 Dec 2009 18:09:16 +0000 (10:09 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (116 commits)
V4L/DVB (13698): pms: replace asm/uaccess.h to linux/uaccess.h
V4L/DVB (13690): radio/si470x: #include <sched.h>
V4L/DVB (13688): au8522: modify the attributes of local filter coefficients
V4L/DVB (13687): cx231xx: use NULL when pointer is needed
V4L/DVB: Davinci VPFE Capture: remove unused #include <linux/version.h>
V4L/DVB (13685): Correct code taking the size of a pointer
V4L/DVB (13684): Fix some cut-and-paste noise in dib0090.h
V4L/DVB (13683): sanio-ms: clean up init, exit and id_table
V4L/DVB (13682): dib8000: make some constant static
V4L/DVB: lgs8gxx: Use shifts rather than multiply/divide when possible
V4L/DVB (13680b): DocBook/media: create links for included sources
V4L/DVB (13680a): DocBook/media: copy images after building HTML
V4L/DVB (13678): Add support for yet another DvbWorld, TeVii and Prof USB devices
V4L/DVB (13676): configurable IRQ mode on NetUP Dual DVB-S2 CI; IRQ from CAM processing (CI interface works faster)
V4L/DVB (13674): stv090x: Add DiSEqC envelope mode
V4L/DVB (13673): lnbp21: Implement 22 kHz tone control
V4L/DVB (13671): sh_mobile_ceu_camera: Remove frame size page alignment
V4L/DVB (13670): soc-camera: Add mt9t112 camera driver
V4L/DVB (13669): tw9910: Add sync polarity support
V4L/DVB (13668): tw9910: remove cropping
...
Linus Torvalds [Wed, 16 Dec 2009 18:06:39 +0000 (10:06 -0800)]
Merge branch 'akpm'
* akpm: (173 commits)
genalloc: use bitmap_find_next_zero_area
ia64: use bitmap_find_next_zero_area
sparc: use bitmap_find_next_zero_area
mlx4: use bitmap_find_next_zero_area
isp1362-hcd: use bitmap_find_next_zero_area
iommu-helper: use bitmap library
bitmap: introduce bitmap_set, bitmap_clear, bitmap_find_next_zero_area
qnx4: use hweight8
qnx4fs: remove remains of the (defunct) write support
resource: constify arg to resource_size() and resource_type()
gru: send cross partition interrupts using the gru
gru: function to generate chipset IPI values
gru: update driver version number
gru: improve GRU TLB dropin statistics
gru: fix GRU interrupt race at deallocate
gru: add hugepage support
gru: fix bug in allocation of kernel contexts
gru: update GRU structures to match latest hardware spec
gru: check for correct GRU chiplet assignment
gru: remove stray local_irq_enable
...
Borislav Petkov [Tue, 15 Dec 2009 15:03:53 +0000 (16:03 +0100)]
edac, mce, amd: silence GART TLB errors
Although reporting of benign GART TLB errors is disabled in
__mcheck_cpu_apply_quirks, those are still being logged, and, as a
result, trip up amd64_edac. Pull up reporting check so that machines
with loaded edac module bail out early and don't spit fragments into
dmesg.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Akinobu Mita [Wed, 16 Dec 2009 00:48:31 +0000 (16:48 -0800)]
genalloc: use bitmap_find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:30 +0000 (16:48 -0800)]
ia64: use bitmap_find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:30 +0000 (16:48 -0800)]
sparc: use bitmap_find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:29 +0000 (16:48 -0800)]
mlx4: use bitmap_find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Roland Dreier <rolandd@cisco.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:28 +0000 (16:48 -0800)]
isp1362-hcd: use bitmap_find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Lothar Wassmann <LW@KARO-electronics.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:28 +0000 (16:48 -0800)]
iommu-helper: use bitmap library
Use bitmap library and kill some unused iommu helper functions.
1. s/iommu_area_free/bitmap_clear/
2. s/iommu_area_reserve/bitmap_set/
3. Use bitmap_find_next_zero_area instead of find_next_zero_area
This cannot be simple substitution because find_next_zero_area
doesn't check the last bit of the limit in bitmap
4. Remove iommu_area_free, iommu_area_reserve, and find_next_zero_area
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:25 +0000 (16:48 -0800)]
bitmap: introduce bitmap_set, bitmap_clear, bitmap_find_next_zero_area
This introduces new bitmap functions:
bitmap_set: Set specified bit area
bitmap_clear: Clear specified bit area
bitmap_find_next_zero_area: Find free bit area
These are mostly stolen from iommu helper. The differences are:
- Use find_next_bit instead of doing test_bit for each bit
- Rewrite bitmap_set and bitmap_clear
Instead of setting or clearing for each bit.
- Check the last bit of the limit
iommu-helper doesn't want to find such area
- The return value if there is no zero area
find_next_zero_area in iommu helper: returns -1
bitmap_find_next_zero_area: return >= bitmap size
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Lothar Wassmann <LW@KARO-electronics.de>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Wed, 16 Dec 2009 00:48:24 +0000 (16:48 -0800)]
qnx4: use hweight8
Use hweight8 instead of counting for each bit
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Anders Larsen <al@alarsen.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anders Larsen [Wed, 16 Dec 2009 00:48:23 +0000 (16:48 -0800)]
qnx4fs: remove remains of the (defunct) write support
commit
945ffe54bbd56ceed62de3b908800fd7c6ffb284 ("qnx4: remove write support") removed the (defunct)
write support but missed a chunk of related, dead code.
Signed-off-by: Anders Larsen <al@alarsen.net>
Cc: Jiri Kosina <jkosina@suse.cz>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Wed, 16 Dec 2009 00:48:21 +0000 (16:48 -0800)]
resource: constify arg to resource_size() and resource_type()
resource_size() doesn't change the resource it operates on, so the res
parameter can be marked const. Same for resource_type().
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:21 +0000 (16:48 -0800)]
gru: send cross partition interrupts using the gru
GRU Message queue instructions are used to deliver messages to other SSIs
within the numalink domain. In most cases, a single GRU mesq instruction
will deliver both the message AND an interrupt to notify the other SSI
that a messsage is present. In some cases, however, the interrupt must be
sent explicitly.
To improve resilency, the GRU driver should send these explicit interrupts
using the GRU to write the remote chipset register. Current code sends
the interrupt using a cpu instruction to write the chipset register.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:20 +0000 (16:48 -0800)]
gru: function to generate chipset IPI values
Create a function to generate the value that is written to the UV hub MMR
to cause an IPI interrupt to be sent. The function will be used in the
GRU message queue error recovery code that sends IPIs to nodes in remote
partitions.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:19 +0000 (16:48 -0800)]
gru: update driver version number
Update the version number of the GRU driver.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:18 +0000 (16:48 -0800)]
gru: improve GRU TLB dropin statistics
Update the TLB dropin statistics kept for each GRU context. Count TLB
dropins separate from the misses - some misses do not result in a TLB
dropin. Some of the diagnostics need both counts.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:18 +0000 (16:48 -0800)]
gru: fix GRU interrupt race at deallocate
Fix a race where an interrupt could be received for a GRU context that has
been deallocated.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:17 +0000 (16:48 -0800)]
gru: add hugepage support
Add support for hugepages. Easier than I originally thought.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:17 +0000 (16:48 -0800)]
gru: fix bug in allocation of kernel contexts
Fix a bug in the assignment of GRU contexts used for kernel functions. If
a sleep occurs on the wait for a semaphore, the thread could switch cpus
and allocate resources on the wrong blade.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:16 +0000 (16:48 -0800)]
gru: update GRU structures to match latest hardware spec
Add a few new definitions for chipset MMR field names. This matches rev
0.7 of the hardware spec.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:15 +0000 (16:48 -0800)]
gru: check for correct GRU chiplet assignment
Simplify the code that checks for correct assignment of GRU contexts to
users.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:15 +0000 (16:48 -0800)]
gru: remove stray local_irq_enable
Remove a stray local_irq_enable() in the GRU TLB dropin code.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:14 +0000 (16:48 -0800)]
gru: add symbolic names for GRU error code
Use symbol names instead of numbers for error return values for the vtop
functions.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:14 +0000 (16:48 -0800)]
gru: fix bug in exception handling
Fix a GRU driver bug converting a CBR address to the context that contains
the CBR. The conversion is rarely done so performance does not matter.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:13 +0000 (16:48 -0800)]
gru: preload tlb for bcopy instructions
Add anticipatory TLB dropins for GRU TLB misses that occur on BCOPY
instructions that copy large amounts of data.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:12 +0000 (16:48 -0800)]
gru: expicitly set instruction status to active
Explicitly set GRU instructions to "ACTIVE". This eliminates the need for
barriers that would have been necessary to prevent reading the instruction
"status" field before the GRU had actually started the instruction.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:12 +0000 (16:48 -0800)]
gru: add additional GRU statistics
Add additional GRU statistics & debug messages.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:11 +0000 (16:48 -0800)]
gru: update irq infrastructure
Update the GRU irq allocate/free functions to use the latest upstream
infrastructure.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:11 +0000 (16:48 -0800)]
gru: fix prefetch and speculation bugs
Fix several bugs related to prefetch, ordering & speculation:
- GRU cch_allocate() instruction causes cacheable memory
to be created. Add a barriers to prevent speculation
from prefetching data before it exists.
- Add memory barriers before cache-flush instructions to ensure
that previously stored data is included in the line flushed to memory.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:10 +0000 (16:48 -0800)]
gru: check for valid vma
Fix bug caused by failure to allocate a GRU gts structure. The old code
failed to handle the case where the vma was invalid.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:09 +0000 (16:48 -0800)]
gru: add test for gru_copy_gpa
Improve existing driver self-tests. Add a new debugging test to the SGI
GRU driver for verifying the global GRU copy function.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:09 +0000 (16:48 -0800)]
gru: add debug option for cache flushing
Add a debug option to the SGI GRU driver for flushing GRU cache lines from
memory. In theory this is not needed but it is useful for debugging.
This has no use by end users.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:08 +0000 (16:48 -0800)]
gru: handle failures to mmu_notifier_register
Under some conditions, mmu_notifier_register() will fail to register a
mmu_notifier. Fix the GRU driver to correctly handle these failures.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:07 +0000 (16:48 -0800)]
gru: support 64-bit GRU addresses
Increase the maximum address supported by the SGI GRU driver to a full 64
bits. Note that GRU addresses are not always the same as socket virtual
addresses. Sockets may not necessarily support the full 64 bits.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:07 +0000 (16:48 -0800)]
gru: improve messages for malfunctioning GRUs
Improve error messages for malfunctioning GRUs. Identify the type of
instruction that is failing.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:06 +0000 (16:48 -0800)]
gru: fix bug in module unload
Fix bug in module unload. Previous code was not correctly deleting the
files in /proc.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:06 +0000 (16:48 -0800)]
gru: allow users to specify gru chiplet 3
This patch builds on the infrastructure introduced in the patches that
allow user specification of GRU blades & chiplets for context allocation.
This patch simplifies the algorithms for migrating GRU contexts between
blades.
No new functionality is introduced.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:05 +0000 (16:48 -0800)]
gru: allow users to specify gru chiplet 2
Add support to the GRU driver to allow users to specify the blade &
chiplet for allocation of GRU contexts. Add new statistics for context
loading/unloading/retargeting. Also deleted a few GRU stats that were no
longer being unused.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:04 +0000 (16:48 -0800)]
gru: allow users to specify gru chiplet 1
Add table & user request infrastructure that is needed to allow users to
specify the blade and chiplet for allocation of GRU contexts. Use of this
information is in a subsequent patch.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:03 +0000 (16:48 -0800)]
gru: handle blades without memory
Do not use alloc_pages_exact_node() to allocate GRU tables. If a blade
has no local memory, nid will be -1. Use alloc_pages_node() instead.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:03 +0000 (16:48 -0800)]
gru: fix istatus race in GRU tlb dropin
TLB dropins require updates to the CBR instruction istatus field. This is
needed to resolve race conditions in the chip.
The code currently uses the user address of the CBR. This works but opens
up additional endcases related to stealing of contexts and accessing the
CBR from tasks that do not have access to the user address space. (Some
of this non-user task access is debug code that is not currently being
pushed to the community).
User CBRs are also directly accessible using the kernel mapping of the
CBR. Change the TLB dropin code to use the the kernel mapping of the CBR.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:02 +0000 (16:48 -0800)]
gru: add comments raised in previous code reviews
Add comments from previous code reviews. The comments help explain some
of the more esoteric aspects of the driver.
Move a free() to the other side of an unlock.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Steiner [Wed, 16 Dec 2009 00:48:01 +0000 (16:48 -0800)]
gru: initial GRU based on blade topology
Change the GRU initialization code to initialize based on blade topology
instead of node topology. The result is the same but blade-based
initialization is cleaner.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:48:00 +0000 (16:48 -0800)]
UV - XPC: pass nasid instead of nid to gru_create_message_queue
Currently, the UV xpc code is passing nid to the gru_create_message_queue
instead of nasid as it expects.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:59 +0000 (16:47 -0800)]
x86: uv: XPC receive message reuse triggers invalid BUG_ON()
This was a difficult bug to trip. XPC was in the middle of sending an
acknowledgement for a received message.
In xpc_received_payload_uv():
.
ret = xpc_send_gru_msg(ch->sn.uv.cached_notify_gru_mq_desc, msg,
sizeof(struct xpc_notify_mq_msghdr_uv));
if (ret != xpSuccess)
XPC_DEACTIVATE_PARTITION(&xpc_partitions[ch->partid], ret);
msg->hdr.msg_slot_number += ch->remote_nentries;
at the point in xpc_send_gru_msg() where the hardware has dispatched the
acknowledgement, the remote side is able to reuse the message structure
and send a message with a different slot number. This problem is made
worse by interrupts.
The adjustment of msg_slot_number and the BUG_ON in
xpc_handle_notify_mq_msg_uv() which verifies the msg_slot_number is
consistent are only used for debug purposes. Since a fix for this that
preserves the debug functionality would either have to infringe upon the
payload or allocate another structure just for debug, I decided to remove
it entirely.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:58 +0000 (16:47 -0800)]
X86: uv: xpc_make_first_contact hang due to not accepting ACTIVE state
Many times while the initial connection is being made, the contacted
partition will send back both the ACTIVATING and the ACTIVE
remote_act_state changes in very close succescion. The 1/4 second delay
in the make first contact loop is large enough to nearly always miss the
ACTIVATING state change.
Since either state indicates the remote partition has acknowledged our
state change, accept either.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:57 +0000 (16:47 -0800)]
x86: uv: xpc NULL deref when mesq becomes empty
Under heavy load conditions, our set of xpc messages may become exhausted.
The code handles this correctly with the exception of the management code
which hits a NULL pointer dereference.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:56 +0000 (16:47 -0800)]
x86: uv: update XPC to handle updated BIOS interface
The UV BIOS has moved the location of some of their pointers to the
"partition reserved page" from memory into a uv hub MMR. The GRU does not
support bcopy operations from MMR space so we need to special case the MMR
addresses using VLOAD operations.
Additionally, the BIOS call for registering a message queue watchlist has
removed the 'blade' value and eliminated the structure that was being
passed in. This is also reflected in this patch.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:55 +0000 (16:47 -0800)]
X86: uv: implement a gru_read_gpa kernel function
The BIOS has decided to store a pointer to the partition reserved page in
a scratch MMR. The GRU is only able to read an MMR using a vload
instruction. The gru_read_gpa() function will implemented.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:54 +0000 (16:47 -0800)]
x86: uv: introduce uv_gpa_is_mmr
Provide a mechanism for determining if a global physical address is
pointing to a UV hub MMR.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:53 +0000 (16:47 -0800)]
x86: uv: xpc needs to provide an abstraction for uv_gpa
Provide an SGI SN2/UV agnositic method for converting a global physical
address into a socket physical address.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Wed, 16 Dec 2009 00:47:52 +0000 (16:47 -0800)]
x86: uv: introduce a means to translate from gpa -> socket_paddr
The UV BIOS has been updated to implement some of our interface
functionality differently than originally expected. These patches update
the kernel to the bios implementation and include a few minor bug fixes
which prevent us from doing significant testing on real hardware.
This patch:
For SGI UV systems, translate from a global physical address back to a
socket physical address. This does nothing to ensure the socket physical
address is actually addressable by the kernel. That is the responsibility
of the user of the function.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Wed, 16 Dec 2009 00:47:50 +0000 (16:47 -0800)]
direct-io: cleanup blockdev_direct_IO locking
Currently the locking in blockdev_direct_IO is a mess, we have three
different locking types and very confusing checks for some of them. The
most complicated one is DIO_OWN_LOCKING for reads, which happens to not
actually be used.
This patch gets rid of the DIO_OWN_LOCKING - as mentioned above the read
case is unused anyway, and the write side is almost identical to
DIO_NO_LOCKING. The difference is that DIO_NO_LOCKING always sets the
create argument for the get_blocks callback to zero, but we can easily
move that to the actual get_blocks callbacks. There are four users of the
DIO_NO_LOCKING mode: gfs already ignores the create argument and thus is
fine with the new version, ocfs2 only errors out if create were ever set,
and we can remove this dead code now, the block device code only ever uses
create for an error message if we are fully beyond the device which can
never happen, and last but not least XFS will need the new behavour for
writes.
Now we can replace the lock_type variable with a flags one, where no flag
means the DIO_NO_LOCKING behaviour and DIO_LOCKING is kept as the first
flag. Separate out the check for not allowing to fill holes into a
separate flag, although for now both flags always get set at the same
time.
Also revamp the documentation of the locking scheme to actually make
sense.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Elder <aelder@sgi.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Moyer [Wed, 16 Dec 2009 00:47:49 +0000 (16:47 -0800)]
dio: don't zero out the pages array inside struct dio
Intel reported a performance regression caused by the following commit:
commit
848c4dd5153c7a0de55470ce99a8e13a63b4703f
Author: Zach Brown <zach.brown@oracle.com>
Date: Mon Aug 20 17:12:01 2007 -0700
dio: zero struct dio with kzalloc instead of manually
This patch uses kzalloc to zero all of struct dio rather than
manually trying to track which fields we rely on being zero. It
passed aio+dio stress testing and some bug regression testing on
ext3.
This patch was introduced by Linus in the conversation that lead up
to Badari's minimal fix to manually zero .map_bh.b_state in commit:
6a648fa72161d1f6468dabd96c5d3c0db04f598a
It makes the code a bit smaller. Maybe a couple fewer cachelines to
load, if we're lucky:
text data bss dec hex filename
3285925 568506 1304616 5159047 4eb887 vmlinux
3285797 568506 1304616 5158919 4eb807 vmlinux.patched
I was unable to measure a stable difference in the number of cpu
cycles spent in blockdev_direct_IO() when pushing aio+dio 256K reads
at ~340MB/s.
So the resulting intent of the patch isn't a performance gain but to
avoid exposing ourselves to the risk of finding another field like
.map_bh.b_state where we rely on zeroing but don't enforce it in the
code.
Zach surmised that zeroing out the page array was what caused most of
the problem, and suggested the approach taken in the attached patch for
resolving the issue. Intel re-tested with this patch and saw a 0.6%
performance gain (the original regression was 0.5%).
[akpm@linux-foundation.org: add comment]
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shaohua Li [Wed, 16 Dec 2009 00:47:47 +0000 (16:47 -0800)]
aio: remove unused field
Don't know the reason, but it appears ki_wait field of iocb never gets used.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Wed, 16 Dec 2009 00:47:46 +0000 (16:47 -0800)]
FS-Cache: Avoid maybe-used-uninitialised warning on variable
Andrew Morton's compiler sees the following warning in FS-Cache:
fs/fscache/object-list.c: In function 'fscache_objlist_lookup':
fs/fscache/object-list.c:94: warning: 'obj' may be used uninitialized in this function
which my compiler doesn't. This is a false positive as obj can only be
used in the comparison against minobj if minobj has been set to something
other than NULL, but for that to happen, obj has to be first set to
something.
Deal with this by preclearing obj too.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Amerigo Wang [Wed, 16 Dec 2009 00:47:46 +0000 (16:47 -0800)]
kexec: premit reduction of the reserved memory size
Implement shrinking the reserved memory for crash kernel, if it is more
than enough.
For example, if you have already reserved 128M, now you just want 100M,
you can do:
# echo $((100*1024*1024)) > /sys/kernel/kexec_crash_size
Note, you can only do this before loading the crash kernel.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@redhat.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Wed, 16 Dec 2009 00:47:45 +0000 (16:47 -0800)]
parport_pc.c: use correct length in strncmp
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Beulich [Wed, 16 Dec 2009 00:47:43 +0000 (16:47 -0800)]
dma-mapping: fix off-by-one error in dma_capable()
dma_mask is, when interpreted as address, the last valid byte, and hence
comparison msut also be done using the last valid of the buffer in
question.
Also fix the open-coded instances in lib/swiotlb.c.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nils Carlson [Wed, 16 Dec 2009 00:47:42 +0000 (16:47 -0800)]
edac: i5100 add 6 ranks per channel
Add support for 6 ranks per channel to the i5100 chipset. I have tested
the patch as far as possible with correctible errors and things appear
good. The DIMM mapping is correct for our board, but boards may differ.
Signed-off-by: Nils Carlson <nils.carlson@ludd.ltu.se>
Acked-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nils Carlson [Wed, 16 Dec 2009 00:47:42 +0000 (16:47 -0800)]
edac: i5100 add scrubbing
Addscrubbing to the i5100 chipset. The i5100 chipset only supports one
scrubbing rate, which is not constant but dependent on memory load. The
rate returned by this driver is an estimate based on some experimentation,
but is substantially closer to the truth than the speed supplied in the
documentation.
Also, scrubbing is done once, and then a done-bit is set. This means that
to accomplish continuous scrubbing a re-enabling mechanism must be used.
I have created the simplest possible such mechanism in the form of a
work-queue which will check every five minutes. This interval is quite
arbitrary but should be sufficient for all sizes of system memory.
Signed-off-by: Nils Carlson <nils.carlson@ludd.ltu.se>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nils Carlson [Wed, 16 Dec 2009 00:47:40 +0000 (16:47 -0800)]
edac: i5100 clean controller to channel terms
The i5100 driver uses the word controller instead of channel in a lot of
places, this is simply a cleanup of the patch.
Signed-off-by: Nils Carlson <nils.carlson@ludd.ltu.se>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
André Goddard Rosa [Wed, 16 Dec 2009 00:47:40 +0000 (16:47 -0800)]
pid: reduce code size by using a pointer to iterate over array
It decreases code size by 16 bytes on my gcc 4.4.1 on Core 2:
text data bss dec hex filename
4314 2216 8 6538 198a kernel/pid.o-BEFORE
4298 2216 8 6522 197a kernel/pid.o-AFTER
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
André Goddard Rosa [Wed, 16 Dec 2009 00:47:39 +0000 (16:47 -0800)]
pid: tighten pidmap spinlock critical section by removing kfree()
Avoid calling kfree() under pidmap spinlock, calling it afterwards.
Normally kfree() is fast, but sometimes it can be slow, so avoid
calling it under the spinlock if we can do it.
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Wed, 16 Dec 2009 00:47:37 +0000 (16:47 -0800)]
elf: kill USE_ELF_CORE_DUMP
Currently all architectures but microblaze unconditionally define
USE_ELF_CORE_DUMP. The microblaze omission seems like an error to me, so
let's kill this ifdef and make sure we are the same everywhere.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: <linux-arch@vger.kernel.org>
Cc: Michal Simek <michal.simek@petalogix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Julia Lawall [Wed, 16 Dec 2009 00:47:37 +0000 (16:47 -0800)]
drivers/char/ipmi: Use KCS_IDLE_STATE
KCS_IDLE and KCS_IDLE state have the same value, but in this function the
constants ending in _STATE are compared to the state variable.
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Core Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Amerigo Wang [Wed, 16 Dec 2009 00:47:36 +0000 (16:47 -0800)]
ipc: HARD_MSGMAX should be higher not lower on 64bit
We have HARD_MSGMAX lower on 64bit than on 32bit, since usually 64bit
machines have more memory than 32bit machines.
Making it higher on 64bit seems reasonable, and keep the original number
on 32bit.
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Amerigo Wang [Wed, 16 Dec 2009 00:47:35 +0000 (16:47 -0800)]
ipc: remove unreachable code in sem.c
This line is unreachable, remove it.
[akpm@linux-foundation.org: remove unneeded initialisation of `err']
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Wed, 16 Dec 2009 00:47:34 +0000 (16:47 -0800)]
ipc/sem.c: optimize single sops when semval is zero
If multiple simple decrements on the same semaphore are pending, then the
current code scans all decrement operations, even if the semaphore value
is already 0.
The patch optimizes that: if the semaphore value is 0, then there is no
need to scan the q->alter entries.
Note that this is a common case: It happens if 100 decrements by one are
pending and now an increment by one increases the semaphore value from 0
to 1. Without this patch, all 100 entries are scanned. With the patch,
only one entry is scanned, then woken up. Then the new rule triggers and
the scanning is aborted, without looking at the remaining 99 tasks.
With this patch, single sop increment/decrement by 1 are now O(1).
(same as with Nick's patch)
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Wed, 16 Dec 2009 00:47:33 +0000 (16:47 -0800)]
ipc/sem.c: optimize single semop operations
sysv sem has the concept of semaphore arrays that consist out of multiple
semaphores. Atomic operations that affect multiple semaphores are
supported.
The patch optimizes single semaphore operation calls that affect only one
semaphore: It's not necessary to scan all pending operations, it is
sufficient to scan the per-semaphore list.
The idea is from Nick Piggin version of an ipc sem improvement, the
implementation is different: The code tries to keep as much common code as
possible.
As the result, the patch is simpler, but optimizes fewer cases.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Wed, 16 Dec 2009 00:47:32 +0000 (16:47 -0800)]
ipc/sem.c: add a per-semaphore pending list
Based on Nick's findings:
sysv sem has the concept of semaphore arrays that consist out of multiple
semaphores. Atomic operations that affect multiple semaphores are
supported.
The patch is the first step for optimizing simple, single semaphore
operations: In addition to the global list of all pending operations, a
2nd, per-semaphore list with the simple operations is added.
Note: this patch does not make sense by itself, the new list is used
nowhere.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Wed, 16 Dec 2009 00:47:31 +0000 (16:47 -0800)]
ipc/sem.c: optimize if semops fail
Reduce the amount of scanning of the list of pending semaphore operations:
If try_atomic_semop failed, then no changes were applied. Thus no need to
restart.
Additionally, this patch correct an incorrect comment: It's possible to
wait for arbitrary semaphore values (do a dec by <x>, wait-for-zero, inc
by <x> in one atomic operation)
Both changes are from Nick Piggin, the patch is the result of a different
split of the individual changes.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Wed, 16 Dec 2009 00:47:30 +0000 (16:47 -0800)]
ipc/sem.c: sem preempt improve
The strange sysv semaphore wakeup scheme has a kind of busy-wait lock
involved, which could deadlock if preemption is enabled during the "lock".
It is an implementation detail (due to a spinlock being held) that this is
actually the case. However if "spinlocks" are made preemptible, or if the
sem lock is changed to a sleeping lock for example, then the wakeup would
become buggy. So this might be a bugfix for -rt kernels.
Imagine waker being preempted by wakee and never clearing IN_WAKEUP -- if
wakee has higher RT priority then there is a priority inversion deadlock.
Even if there is not a priority inversion to cause a deadlock, then there
is still time wasted spinning.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Wed, 16 Dec 2009 00:47:29 +0000 (16:47 -0800)]
ipc/sem.c: sem use list operations
Replace the handcoded list operations in update_queue() with the standard
list_for_each_entry macros.
list_for_each_entry_safe() must be used, because list entries can
disappear immediately uppon the wakeup event.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>