Grant C. Likely [Thu, 19 Jan 2006 08:13:20 +0000 (01:13 -0700)]
[PATCH] powerpc: Add Virtex-4 FX to cpu table
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant C. Likely [Thu, 19 Jan 2006 08:13:12 +0000 (01:13 -0700)]
[PATCH] powerpc: Add ML300 defconfig
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant C. Likely [Thu, 19 Jan 2006 08:13:03 +0000 (01:13 -0700)]
[PATCH] powerpc: Migrate ML300 reference design to the platform bus
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant C. Likely [Thu, 19 Jan 2006 08:12:48 +0000 (01:12 -0700)]
[PATCH] powerpc: Migrate Xilinx Vertex support from the OCP bus to the platfom bus.
This patch only deals with the serial port definitions as there is no
support for any other xilinx IP cores in the kernel tree at the moment.
Board specific configuration moved out of virtex.[ch] and into the
xparameters.h wrapper.
This also prepares for the transition to the flattened device tree model.
When the bootloader provides a device tree generated from an xparameters.h
files, the kernel will no longer need xparameters/*. The platform bus will
get populated with data from the device tree, and the device drivers will
be automatically connected to the devices. Only the bootloader (or
ppcboot) will need xparameters directly.
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant C. Likely [Thu, 19 Jan 2006 08:12:40 +0000 (01:12 -0700)]
[PATCH] powerpc: Make Virtex-II Pro support generic for all Virtex devices
The PPC405 hard core is used in both the Virtex-II Pro and Virtex 4 FX
FPGAs. This patch cleans up the Virtex naming convention to reflect more
than just the Virtex-II Pro.
Rename files virtex-ii_pro.[ch] to virtex.[ch]
Rename config value VIRTEX_II_PRO to XILINX_VIRTEX
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant C. Likely [Thu, 19 Jan 2006 08:12:32 +0000 (01:12 -0700)]
[PATCH] powerpc: Move xparameters.h into xilinx virtex device specific path
xparameters should not be needed by anything but virtex platform code.
Move it from include/asm-ppc/ to platforms/4xx/xparameters/
This is preparing for work to remove xparameters from the dependancy tree
for most c files. xparam changes should not cause a recompile of the world.
Instead, drivers should get device info from the platform bus (populated
by the boot code)
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Nathan Lynch [Tue, 7 Feb 2006 04:44:23 +0000 (22:44 -0600)]
[PATCH] powerpc: avoid timer interrupt replay effect when onlining cpu
When a cpu is hotplug-onlined, if we don't set per_cpu(last_jiffy) to
something sane, timer_interrupt will execute its while loop for every
tick missed since the cpu was last online (or since the system was
booted, if we're adding a new cpu). This can cause weird hangs, ssh
sessions dropping, and we can even go xmon if we take a global IPI at
the wrong time.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Neuling [Mon, 6 Feb 2006 23:58:21 +0000 (10:58 +1100)]
[PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
We call unregister_vpa but we don't check to see if the hypervisor
supports this.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Anton Blanchard <anton@samba.org>
--
arch/powerpc/platforms/pseries/setup.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Becky Bruce [Mon, 6 Feb 2006 20:26:31 +0000 (14:26 -0600)]
[PATCH] documentation/powerpc: add bus-frequency property to SOC node
Updated SOC node definition in documentation to include bus-frequency
property. Also extended mdio example to match specification.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Kumar Gala <galak@gate.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Tue, 7 Feb 2006 02:26:14 +0000 (13:26 +1100)]
[PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
Since
404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.
This can explode if we take the decrementer interrupt while we're in a module,
because the toc pointer in r2 will be the module's toc pointer.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Mon, 6 Feb 2006 02:24:53 +0000 (13:24 +1100)]
[PATCH] powerpc: Cleanup, consolidating icache dirtying logic
The code to mark a page as icache dirty (so that it will later be
icache-dcache flushed when we try to execute from it) is duplicated in
three places: flush_dcache_page() does this marking and nothing else,
but clear_user_page() and copy_user_page() duplicate it, since those
functions make the page icache dirty themselves.
This patch makes those other functions call flush_dcache_page()
instead, so the logic's all in one place. This will make life less
confusing if we ever need to tweak the details of the the lazy icache
flush mechanism.
arch/powerpc/mm/mem.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Jesper Juhl [Sat, 4 Feb 2006 19:35:59 +0000 (20:35 +0100)]
[PATCH] Don't check pointer for NULL before passing it to kfree [arch/powerpc/kernel/rtas_flash.c]
Checking a pointer for NULL before passing it to kfree is pointless, kfree
does its own NULL checking of input.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Sat, 4 Feb 2006 12:33:46 +0000 (13:33 +0100)]
[PATCH] powerpc: fix compile warning in udbg_init_maple_realmode
arch/powerpc/kernel/udbg_16550.c: In function `udbg_init_maple_realmode':
arch/powerpc/kernel/udbg_16550.c:162: warning: assignment from incompatible pointer type
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Sat, 4 Feb 2006 11:55:41 +0000 (12:55 +0100)]
[PATCH] powerpc: add refcounting to setup_peg2 and of_get_pci_address
setup_peg2 must do some refcounting.
of_get_pci_address may need to drop the node
Pegasos l2cr : L2 cache was not active, activating
PCI bus 0 controlled by pci at
80000000
Badness in kref_get at /home/olaf/kernel/olh/ppc64/linux-2.6.16-rc2-olh/lib/kref.c:32
Call Trace:
[
C037BD00] [
C0007934] show_stack+0x5c/0x184 (unreliable)
[
C037BD30] [
C000E068] program_check_exception+0x184/0x584
[
C037BD90] [
C000F5F0] ret_from_except_full+0x0/0x4c
--- Exception: 700 at kref_get+0xc/0x24
LR = of_node_get+0x24/0x3c
[
C037BE50] [
C004FD94] __pte_alloc_kernel+0x64/0x80 (unreliable)
[
C037BE70] [
C000CA18] of_get_parent+0x34/0x58
[
C037BE90] [
C0009B18] of_get_address+0x24/0x174
[
C037BED0] [
C000A108] of_address_to_resource+0x24/0x68
[
C037BF00] [
C038B128] chrp_find_bridges+0x114/0x470
[
C037BF90] [
C038AE48] chrp_setup_arch+0x1fc/0x32c
[
C037BFB0] [
C03849B0] setup_arch+0x144/0x188
[
C037BFD0] [
C037C45C] start_kernel+0x34/0x1a8
[
C037BFF0] [
000037A0] 0x37a0
Badness in kref_get at /home/olaf/kernel/olh/ppc64/linux-2.6.16-rc2-olh/lib/kref.c:32
Call Trace:
[
C037BC90] [
C0007934] show_stack+0x5c/0x184 (unreliable)
[
C037BCC0] [
C000E068] program_check_exception+0x184/0x584
[
C037BD20] [
C000F5F0] ret_from_except_full+0x0/0x4c
--- Exception: 700 at kref_get+0xc/0x24
LR = of_node_get+0x24/0x3c
[
C037BDE0] [
00000000] 0x0 (unreliable)
[
C037BE00] [
C000CA18] of_get_parent+0x34/0x58
[
C037BE20] [
C0009CE8] of_translate_address+0x2c/0x2fc
[
C037BEA0] [
C0009FE8] __of_address_to_resource+0x30/0xc4
[
C037BED0] [
C000A130] of_address_to_resource+0x4c/0x68
[
C037BF00] [
C038B128] chrp_find_bridges+0x114/0x470
[
C037BF90] [
C038AE48] chrp_setup_arch+0x1fc/0x32c
[
C037BFB0] [
C03849B0] setup_arch+0x144/0x188
[
C037BFD0] [
C037C45C] start_kernel+0x34/0x1a8
[
C037BFF0] [
000037A0] 0x37a0
PCI bus 0 controlled by pci at
c0000000
Top of RAM: 0x10000000, Total RAM: 0x10000000
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Sat, 4 Feb 2006 11:44:56 +0000 (12:44 +0100)]
[PATCH] powerpc: remove pointer/integer confusion in of_find_node_by_name
remove pointer/integer confusion
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Sat, 4 Feb 2006 10:05:33 +0000 (11:05 +0100)]
[PATCH] powerpc: restore clock speed in /proc/cpuinfo
Use generic_calibrate_decr to restore missing clock: speed in /proc/cpuinfo
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Olaf Hering [Sat, 4 Feb 2006 09:34:56 +0000 (10:34 +0100)]
[PATCH] powerpc: remove pointer/integer confusion in generic_calibrate_decr
remove pointer/integer confusion
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Fri, 3 Feb 2006 08:05:47 +0000 (19:05 +1100)]
[PATCH] powerpc: Don't overwrite flat device tree with kdump kernel
It's possible for prom_init to allocate the flat device tree inside the
kdump crash kernel region. If this happens, when we load the kdump kernel we
overwrite the flattened device tree, which is bad.
We could make prom_init try and avoid allocating inside the crash kernel
region, but then we run into issues if the crash kernel region uses all the
space inside the RMO. The easiest solution is to move the flat device tree
once we're running in the kernel.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Dave C Boutcher [Fri, 3 Feb 2006 07:18:36 +0000 (01:18 -0600)]
[PATCH] powerpc: remove useless call to touch_softlockup_watchdog
It turns out that we can't stop the watchdog from
triggering here. If we touch the timer (which just uses the current jiffie
value) before we enable interrupts, it does nothing because jiffies
are not mass-updated until after we enable interrupts. If we touch the
timer after we enable interrupts, its too late because the softlockup
watchdog will already have triggered. The touch_softlockup_watchdog
call removed below does nothing.
Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Dave C Boutcher [Fri, 3 Feb 2006 07:18:39 +0000 (01:18 -0600)]
[PATCH] powerpc: prod all processors after ibm,suspend-me
We need to prod everyone here since this is the only CPU that is
guaranteed to be running after the ibm,suspend-me RTAS call returns.
Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Dave C Boutcher [Fri, 3 Feb 2006 07:18:46 +0000 (01:18 -0600)]
[PATCH] powerpc: return correct rtas status from ibm,suspend-me
Correctly return the status from the RTAS call. rtas_call expects
to return the status as a return value.
Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Tue, 31 Jan 2006 06:17:47 +0000 (17:17 +1100)]
[PATCH] powerpc: Fix !SMP build of rtas.c
arch/powerpc/kernel/rtas.c is getting hvcall.h via spinlock.h, but when we're
building for UP we don't include spinlock.h.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Jake Moilanen [Tue, 31 Jan 2006 03:51:54 +0000 (21:51 -0600)]
[PATCH] powerpc: IOMMU SG paranoia
This addresses two items, which are unlikely to be hit if we
trust drivers.
The first is moving a memory barrier below where the vmerged SG count
is passed back, but before the list is set to end. If those
instructions were reordered, there could be an issue in iommu_unmap_sg().
The second is making sure we terminate the list on the failure case of
iommu_map_sg(). If a driver does not look at the failure return code,
it could pass a ill-formed SG list to iommu_unmap_sg().
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Wed, 25 Jan 2006 08:48:48 +0000 (21:48 +1300)]
[PATCH] powerpc: Refuse to boot a kdump kernel via OF
You can't boot a kdump kernel via OF, not reliably anyway, the kernel being at
32 MB conflicts with the zImage wrapper etc. and it blows up.
It's trivial to check in prom_init though, and this is early enough that we can
actually drop back to OF where a reset-all will get you going again, which is
kinda nice. I think this should go in for 2.6.16.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Wed, 25 Jan 2006 08:31:26 +0000 (21:31 +1300)]
[PATCH] powerpc: Make sure we don't create empty lmb regions
To prevent problems later in boot, make sure we don't create zero-size lmb
regions.
I've checked all the callers, and at the moment no one should ever hit this.
All callers use a constant size, or they check the computed size before they
call us.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Wed, 25 Jan 2006 08:31:25 +0000 (21:31 +1300)]
[PATCH] powerpc: Don't allocate zero bytes in finish_device_tree()
In prom.c we run finish_node() on allnodes twice. The first time we just
calculate how much memory we'll need, the second time we do the actual work.
If the calculation stage determines that we need 0 bytes, then we should skip
the lmb allocation. Although an alloc of zero will work, it has been seen to
lead to a BUG_ON() in reserve_bootmem() on at least one machine.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Marcelo Tosatti [Mon, 23 Jan 2006 15:57:06 +0000 (13:57 -0200)]
[PATCH] powerpc/8xx: last two 8MB D-TLB entries are incorrectly set
The last two 8MB TLB entries are being incorrectly set by initial_mmu on 8xx.
The first entry is written with the same virtual/physical address, which
renders it invalid:
BDI>rms 792 0x00001e00
BDI>rms 824 1
BDI>rds 824
SPR 824 : 0xc08000c0 -
1065353024
BDI>rds 825
SPR 825 : 0xc0800de0 -
1065349664
BDI>rds 826
SPR 826 : 0x00000000 0
And the second entry, in addition, does not have its TLB index set
correctly.
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Geoff Levand [Tue, 24 Jan 2006 01:37:11 +0000 (17:37 -0800)]
[PATCH] powerpc: Fix spufs initialization sequence.
This is a small fix to get the spufs init sequence right.
init_spu_base() in spu_base.c should be called (via
module_init(init_spu_base)) before spufs_init() (via
module_init(spufs_init)) in spufs/inode.c gets called.
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Tue, 7 Feb 2006 02:55:30 +0000 (13:55 +1100)]
powerpc/64: Fix bug in setting floating-point exception mode
When loading up the FPU, we were using a 'ld' (load doubleword)
instruction to get the FP exception mode from the thread_struct,
but it's only an int field. This changes the ld to lwz (load
word and zero-extend).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Mon, 6 Feb 2006 23:43:36 +0000 (10:43 +1100)]
Merge ../linux-2.6
Greg KH [Sun, 5 Feb 2006 22:16:08 +0000 (14:16 -0800)]
[PATCH] USB: Fix GPL markings on usb core functions.
I thought we had fixed up all non-gpl USB drivers, and was wrong to do
this.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sun, 5 Feb 2006 19:26:38 +0000 (11:26 -0800)]
mm/slab.c (non-NUMA): Fix compile warning and clean up code
The non-NUMA case would do an unmatched "free_alien_cache()" on an alien
pointer that had never been allocated.
It might not matter from a code generation standpoint (since in the
non-NUMA case, the code doesn't actually _do_ anything), but it not only
results in a compiler warning, it's really really ugly too.
Fix the compiler warning by just having a matching dummy allocation.
That also avoids an unnecessary #ifdef in the code.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sun, 5 Feb 2006 19:10:54 +0000 (11:10 -0800)]
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Sun, 5 Feb 2006 19:10:29 +0000 (11:10 -0800)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
Robb, Sam [Sun, 5 Feb 2006 07:28:06 +0000 (23:28 -0800)]
[PATCH] kconfig: detect if -lintl is needed when linking conf,mconf
On a system where libintl.h is present, but the NLS functionality is
supplied by a separate library instead of the system C library, an attempt
to "make config" or "make menuconfig" will fail with link errors, ex:
scripts/kconfig/mconf.o:mconf.c:(.text+0xf63): undefined reference to
`_libintl_gettext'
This patch attempts to correct the problem by detecting whether or not NLS
support requires linking with libintl.
Signed-off-by: Samuel J Robb <sam.robb@timesys.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Sun, 5 Feb 2006 07:28:05 +0000 (23:28 -0800)]
[PATCH] i386: HIGHMEM64G must depend on X86_CMPXCHG64
Due to the usage of set_64bit in include/asm-i386/pgtable-3level.h,
HIGHMEM64G must depend on X86_CMPXCHG64.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Takashi Iwai [Sun, 5 Feb 2006 07:28:05 +0000 (23:28 -0800)]
[PATCH] Fix "value computed is not used" compile warnings with gcc-4.1
Fix gcc4.1 compile warnings "value computed is not used" with
set_current_state() and set_task_state() on i386/SMP and x86-64.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chuck Ebbert [Sun, 5 Feb 2006 07:28:04 +0000 (23:28 -0800)]
[PATCH] i386: print kernel version in register dumps
Show first field of kernel version in register dumps like x86_64 does.
Changes output from e.g.:
(2.6.16-rc1)
to:
(2.6.16-rc1 #12)
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chuck Ebbert [Sun, 5 Feb 2006 07:28:03 +0000 (23:28 -0800)]
[PATCH] i386 cpu hotplug: don't access freed memory
i386 CPU init code accesses freed init memory when booting a newly-started
processor after CPU hotplug. The cpu_devs array is searched to find the
vendor and it contains pointers to freed data.
Fix that by:
1. Zeroing entries for freed vendor data after bootup.
2. Changing Transmeta, NSC and UMC to all __init[data].
3. Printing a warning (once only) and setting this_cpu
to a safe default when the vendor is not found.
This does not change behavior for AMD systems. They were broken already
but no error was reported.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ulrich Drepper [Sun, 5 Feb 2006 07:28:02 +0000 (23:28 -0800)]
[PATCH] namei.c: unlock missing in error case
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trond Myklebust [Sun, 5 Feb 2006 07:28:01 +0000 (23:28 -0800)]
[PATCH] VFS: Ensure LOOKUP_CONTINUE flag is preserved by link_path_walk()
When walking a path, the LOOKUP_CONTINUE flag is used by some filesystems
(for instance NFS) in order to determine whether or not it is looking up
the last component of the path. It this is the case, it may have to look
at the intent information in order to perform various tasks such as atomic
open.
A problem currently occurs when link_path_walk() hits a symlink. In this
case LOOKUP_CONTINUE may be cleared prematurely when we hit the end of the
path passed by __vfs_follow_link() (i.e. the end of the symlink path)
rather than when we hit the end of the path passed by the user.
The solution is to have link_path_walk() clear LOOKUP_CONTINUE if and only
if that flag was unset when we entered the function.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ravikiran G Thirumalai [Sun, 5 Feb 2006 07:27:59 +0000 (23:27 -0800)]
[PATCH] NUMA slab locking fixes: fix cpu down and up locking
This fixes locking and bugs in cpu_down and cpu_up paths of the NUMA slab
allocator. Sonny Rao <sonny@burdell.org> reported problems sometime back on
POWER5 boxes, when the last cpu on the nodes were being offlined. We could
not reproduce the same on x86_64 because the cpumask (node_to_cpumask) was not
being updated on cpu down. Since that issue is now fixed, we can reproduce
Sonny's problems on x86_64 NUMA, and here is the fix.
The problem earlier was on CPU_DOWN, if it was the last cpu on the node to go
down, the array_caches (shared, alien) and the kmem_list3 of the node were
being freed (kfree) with the kmem_list3 lock held. If the l3 or the
array_caches were to come from the same cache being cleared, we hit on
badness.
This patch cleans up the locking in cpu_up and cpu_down path. We cannot
really free l3 on cpu down because, there is no node offlining yet and even
though a cpu is not yet up, node local memory can be allocated for it. So l3s
are usually allocated at keme_cache_create and destroyed at
kmem_cache_destroy. Hence, we don't need cachep->spinlock protection to get
to the cachep->nodelist[nodeid] either.
Patch survived onlining and offlining on a 4 core 2 node Tyan box with a 4
dbench process running all the time.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ravikiran G Thirumalai [Sun, 5 Feb 2006 07:27:58 +0000 (23:27 -0800)]
[PATCH] NUMA slab locking fixes: irq disabling from cahep->spinlock to l3 lock
Earlier, we had to disable on chip interrupts while taking the
cachep->spinlock because, at cache_grow, on every addition of a slab to a slab
cache, we incremented colour_next which was protected by the cachep->spinlock,
and cache_grow could occur at interrupt context. Since, now we protect the
per-node colour_next with the node's list_lock, we do not need to disable on
chip interrupts while taking the per-cache spinlock, but we just need to
disable interrupts when taking the per-node kmem_list3 list_lock.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ravikiran G Thirumalai [Sun, 5 Feb 2006 07:27:56 +0000 (23:27 -0800)]
[PATCH] NUMA slab locking fixes: move color_next to l3
colour_next is used as an index to add a colouring offset to a new slab in the
cache (colour_off * colour_next). Now with the NUMA aware slab allocator, it
makes sense to colour slabs added on the same node sequentially with
colour_next.
This patch moves the colouring index "colour_next" per-node by placing it on
kmem_list3 rather than kmem_cache.
This also helps simplify locking for CPU up and down paths.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Christoph Lameter [Sun, 5 Feb 2006 07:27:55 +0000 (23:27 -0800)]
[PATCH] hugetlb: add comment explaining reasons for Bus Errors
I just spent some time researching a Bus Error. Turns out that the huge
page fault handler can return VM_FAULT_SIGBUS for various conditions where
no huge page is available.
Add a note explaining the reasoning in the source.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Sun, 5 Feb 2006 07:27:54 +0000 (23:27 -0800)]
[PATCH] jbd: fix transaction batching
Ben points out that:
When writing files out using O_SYNC, jbd's 1 jiffy delay results in a
significant drop in throughput as the disk sits idle. The patch below
results in a 4-5x performance improvement (from 6.5MB/s to ~24-30MB/s on my
IDE test box) when writing out files using O_SYNC.
So optimise the batching code by omitting it entirely if the process which is
doing a sync write is the same as the one which did the most recent sync
write. If that's true, we're unlikely to get any other processes joining the
transaction.
(Has been in -mm for ages - it took me a long time to get on to performance
testing it)
Numbers, on write-cache-disabled IDE:
/usr/bin/time -p synctest -n 10 -uf -t 1 -p 1 dir-name
Unpatched:
40 seconds
Patched:
35 seconds
Batching disabled:
35 seconds
This is the problematic single-process-doing-fsync case. With multiple
fsyncing processes the numbers are AFACIT unaltered by the patch.
Aside: performance testing and instrumentation shows that the transaction
batching almost doesn't help (testing with synctest -n 1 -uf -t 100 -p 10
dir-name on non-writeback-caching IDE). This is because by the time one
process is running a synchronous commit, a bunch of other processes already
have a transaction handle open, so they're all going to batch into the same
transaction anyway.
The batching seems to offer maybe 5-10% speedup with this workload, but I'm
pretty sure it was more important than that when it was first developed 4-odd
years ago...
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Sun, 5 Feb 2006 07:27:51 +0000 (23:27 -0800)]
[PATCH] reiserfs_get_acl() build fix
With CONFIG_REISERFS_FS_XATTR=y, CONFIG_REISERFS_FS_POSIX_ACL=n:
fs/reiserfs/xattr.c: In function `reiserfs_check_acl':
fs/reiserfs/xattr.c:1330: called object is not a function
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Hugh Dickins [Sun, 5 Feb 2006 07:27:51 +0000 (23:27 -0800)]
[PATCH] x86: fix stack trace facility level
dump_stack() on page allocation failure presently has an irritating habit
of shouting just "====" at everyone: please stop it.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stephen Smalley [Sun, 5 Feb 2006 07:27:50 +0000 (23:27 -0800)]
[PATCH] selinux: require SECURITY_NETWORK
Make SELinux depend on SECURITY_NETWORK (which depends on SECURITY), as it
requires the socket hooks for proper operation even in the local case.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dave Jones [Sun, 5 Feb 2006 07:27:49 +0000 (23:27 -0800)]
[PATCH] missing license tag in intermodule
It may suck something awful, but it shouldn't taint the kernel.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Phillip Susi [Sun, 5 Feb 2006 07:27:48 +0000 (23:27 -0800)]
[PATCH] pktcdvd: Allow larger packets
The pktcdvd driver uses a compile time macro constant to define the maximum
supported packet length. I changed this from 32 sectors to 128 sectors
because that allows over 100 MB of additional usable space on a 700 MB cdrw,
and increases throughput.
Note that you need a modified cdrwtool program that can format a CDRW disc
with larger packets to benefit from this change.
Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Osterlund [Sun, 5 Feb 2006 07:27:47 +0000 (23:27 -0800)]
[PATCH] pktcdvd: Don't waste kernel memory
Allocate memory for read-gathering at open time, when it is known just how
much memory is needed. This avoids wasting kernel memory when the real packet
size is smaller than the maximum packet size supported by the driver. This is
always the case when using DVD discs.
Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Sun, 5 Feb 2006 07:27:45 +0000 (23:27 -0800)]
[PATCH] Let CDROM_PKTCDVD_WCACHE depend on EXPERIMENTAL
Unless the help text is outdated, this seems to be logical.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Osterlund [Sun, 5 Feb 2006 07:27:45 +0000 (23:27 -0800)]
[PATCH] pktcdvd: remove version string
The version information is not useful for a driver that is maintained in
Linus' kernel tree.
Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Phillip Susi [Sun, 5 Feb 2006 07:27:44 +0000 (23:27 -0800)]
[PATCH] pktcdvd: Fix overflow for discs with large packets
The pktcdvd driver was using an 8 bit field to store the packet length
obtained from the disc track info. This causes it to overflow packet length
values of 128KB or more. I changed the field to 32 bits to fix this.
The pktcdvd driver defaulted to its maximum allowed packet length when it
detected a 0 in the track info field. I changed this to fail the operation
and refuse to access the media. This seems more sane than attempting to
access it with a value that almost certainly will not work.
Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chuck Ebbert [Sun, 5 Feb 2006 07:27:42 +0000 (23:27 -0800)]
[PATCH] sched: only print migration_cost once per boot
migration_cost prints after every CPU hotplug event. Make it print only
once at boot.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stephen Smalley [Sun, 5 Feb 2006 07:27:42 +0000 (23:27 -0800)]
[PATCH] MAINTAINERS/CREDITS: Update SELinux contact info
Update my contact info. Please apply.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Miklos Szeredi [Sun, 5 Feb 2006 07:27:40 +0000 (23:27 -0800)]
[PATCH] fuse: fix request_end() vs fuse_reset_request() race
The last fix for this function in fact opened up a much more often
triggering race.
It was uncommented tricky code, that was buggy. Add comment, make it less
tricky and fix bug.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Markus Lidel [Sun, 5 Feb 2006 07:27:39 +0000 (23:27 -0800)]
[PATCH] Fix i2o_scsi oops on abort
Fix http://bugzilla.kernel.org/show_bug.cgi?id=5923
When a scsi command failed, an oops would result.
Back-to-back SMART queries would make the Seagate drives unhappy. The
second SMART query would timeout, and the command would be aborted.
Acked-by: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Kenny Simpson <theonetruekenny@yahoo.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tejun Heo [Sun, 5 Feb 2006 07:27:38 +0000 (23:27 -0800)]
[PATCH] block: request_queue->ordcolor must not be flipped on SOFTBARRIER
q->ordcolor must not be flipped on SOFTBARRIER.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jens Axboe [Sun, 5 Feb 2006 07:27:38 +0000 (23:27 -0800)]
[PATCH] fix ordering on requeued request drainage
Previously, if a fs request which was being drained failed and got
requeued, blk_do_ordered() didn't allow it to be reissued, which causes
queue stall. This patch makes blk_do_ordered() use the sequence of each
request to determine whether a request can be issued or not. This fixes
the bug and simplifies code.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Dumazet [Sun, 5 Feb 2006 07:27:36 +0000 (23:27 -0800)]
[PATCH] percpu data: only iterate over possible CPUs
percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
cpudata, instead of allocating memory only for possible cpus.
As a preparation for changing that, we need to convert various 0 -> NR_CPUS
loops to use for_each_cpu().
(The above only applies to users of asm-generic/percpu.h. powerpc has gone it
alone and is presently only allocating memory for present CPUs, so it's
currently corrupting memory).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@suse.de>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: William Irwin <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sun, 5 Feb 2006 18:51:57 +0000 (10:51 -0800)]
Revert "[PATCH] x86_64: Fix the node cpumask of a cpu going down"
This reverts commit
10f4dc8b27ac42f930ac55adb8c521264dc997f8.
Quoth Andi Kleen:
"Kiran decided that it makes the problem worse than it was before.
Fixing it fully requires more work which is too much for 2.6.16. So
please revert that commit for now."
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David S. Miller [Sat, 4 Feb 2006 10:49:23 +0000 (02:49 -0800)]
[SPARC64]: Update defconfig.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 4 Feb 2006 10:49:03 +0000 (02:49 -0800)]
[SPARC64]: Add .gitignore file for sparc64 boot images.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:19:46 +0000 (02:19 -0800)]
[NETFILTER]: Fix check whether dst_entry needs to be released after NAT
After DNAT the original dst_entry needs to be released if present
so the packet doesn't skip input routing with its new address. The
current check for DNAT in ip_nat_in is reversed and checks for SNAT.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:19:09 +0000 (02:19 -0800)]
[NETFILTER]: Prepare {ipt,ip6t}_policy match for x_tables unification
The IPv4 and IPv6 version of the policy match are identical besides address
comparison and the data structure used for userspace communication. Unify
the data structures to break compatiblity now (before it is released), so
we can port it to x_tables in 2.6.17.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:17:55 +0000 (02:17 -0800)]
[NETFILTER]: Fix ip6t_policy address matching
Fix two bugs in ip6t_policy address matching:
- misorder arguments to ip6_masked_addrcmp, mask must be the second argument
- inversion incorrectly applied to the entire expression instead of just
the address comparison
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:17:26 +0000 (02:17 -0800)]
[NETFILTER]: Check policy length in policy match strict mode
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kirill Korotaev [Sat, 4 Feb 2006 10:16:56 +0000 (02:16 -0800)]
[NETFILTER]: Fix possible overflow in netfilters do_replace()
netfilter's do_replace() can overflow on addition within SMP_ALIGN()
and/or on multiplication by NR_CPUS, resulting in a buffer overflow on
the copy_from_user(). In practice, the overflow on addition is
triggerable on all systems, whereas the multiplication one might require
much physical memory to be present due to the check above. Either is
sufficient to overwrite arbitrary amounts of kernel memory.
I really hate adding the same check to all 4 versions of do_replace(),
but the code is duplicate...
Found by Solar Designer during security audit of OpenVZ.org
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-Off-By: Solar Designer <solar@openwall.com>
Signed-off-by: Patrck McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samir Bellabes [Sat, 4 Feb 2006 10:16:06 +0000 (02:16 -0800)]
[NETFILTER]: nf_conntrack: fix incorrect memset() size in FTP helper
This memset() is executing with a bad size. According to Yasuyuki Kozakai,
this memset() can be deleted, as 'ftp' is declared in global area.
Signed-off-by: Samir Bellabes <sbellabes@mandriva.com>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sat, 4 Feb 2006 10:15:36 +0000 (02:15 -0800)]
[NETFILTER]: iptables: fix typos in ipt_connbytes.h
Fix some typos that make iptables userspace compilation fail.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:14:51 +0000 (02:14 -0800)]
[NETFILTER]: Fix missing src port initialization in tftp expectation mask
Reported by David Ahern <dahern@avaya.com>, netfilter bugzilla #426.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:14:24 +0000 (02:14 -0800)]
[NETFILTER]: nfnetlink_queue: fix packet marking over netlink
The packet marked is the netlink skb, not the queued skb.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Sat, 4 Feb 2006 10:13:57 +0000 (02:13 -0800)]
[NETFILTER]: Fix undersized skb allocation in ipt_ULOG/ebt_ulog/nfnetlink_log
The skb allocated is always of size nlbufsize, even if that is smaller than
the size needed for the current packet.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Holger Eitzenberger [Sat, 4 Feb 2006 10:13:14 +0000 (02:13 -0800)]
[NETFILTER]: ULOG/nfnetlink_log: Use better default value for 'nlbufsiz'
Performance tests showed that ULOG may fail on heavy loaded systems
because of failed order-N allocations (N >= 1).
The default value of 4096 is not optimal in the sense that it actually
allocates _two_ contigous physical pages. Reasoning: ULOG uses
alloc_skb(), which adds another ~300 bytes for skb_shared_info.
This patch sets the default value to NLMSG_GOODSIZE and adds some
documentation at the top.
Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yasuyuki Kozakai [Sat, 4 Feb 2006 10:12:14 +0000 (02:12 -0800)]
[NETFILTER]: nf_conntrack: check address family when finding protocol module
__nf_conntrack_{l3}proto_find() doesn't check the passed protocol family,
then it's possible to touch out of the array which has only AF_MAX items.
Spotted by Pablo Neira Ayuso.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Sat, 4 Feb 2006 10:11:41 +0000 (02:11 -0800)]
[NETFILTER]: ctnetlink: add MODULE_ALIAS for expectation subsystem
Add load-on-demand support for expectation request. eg. conntrack -L expect
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcus Sundberg [Sat, 4 Feb 2006 10:11:09 +0000 (02:11 -0800)]
[NETFILTER]: ctnetlink: Fix subsystem used for expectation events
The ctnetlink expectation events should use the NFNL_SUBSYS_CTNETLINK_EXP
subsystem, not NFNL_SUBSYS_CTNETLINK.
Signed-off-by: Marcus Sundberg <marcus@ingate.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sat, 4 Feb 2006 10:09:34 +0000 (02:09 -0800)]
[ICMP]: Fix extra dst release when ip_options_echo fails
When two ip_route_output_key lookups in icmp_send were combined I
forgot to change the error path for ip_options_echo to not drop the
dst reference since it now sits before the dst lookup. To fix it we
simply jump past the ip_rt_put call.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Fri, 3 Feb 2006 20:51:59 +0000 (21:51 +0100)]
[PATCH] x86_64: IOMMU printk cleanup
This patch contains a printk reorder to remove the current problem of
displaying "PCI-DMA: Disabling IOMMU." and then "PCI-DMA: using GART
IOMMU" 20 lines later in dmesg.
It also constains a printk reorder in swiotlb to state swiotlb
enablement prior to describing the location of the bounce buffers, and a
printk reorder to state gart enablement prior to describing the
aperature.
Also constains a whitespace cleanup in arch/x86_64/kernel/setup.c
Tested (along with patch 2/2) on dual opteron with gart enabled,
iommu=soft, and iommu=off.
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:56 +0000 (21:51 +0100)]
[PATCH] x86_64: Let impossible CPUs point to reference per cpu data
Hack for 2.6.16. In 2.6.17 all code that uses NR_CPUs should
be audited and changed to only touch possible CPUs.
Don't mark the reference per cpu data init data (so it stays
around after boot) and point all impossible CPUs to it. This way
they reference some valid - although shared memory. Usually
this is only initialization like INIT_LIST_HEADs and there
won't be races because these CPUs never run. Still somewhat hackish.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:53 +0000 (21:51 +0100)]
[PATCH] i386/x86-64: Don't ack the APIC for bad interrupts when the APIC is not enabled
It's bad juju to touch the APIC when it hasn't been enabled.
I also moved ack_bad_irq for x86-64 out of line following i386.
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ashok Raj [Fri, 3 Feb 2006 20:51:50 +0000 (21:51 +0100)]
[PATCH] x86_64: Dont record local apic ids when they are disabled in MADT
Some broken BIOS's had processors disabled, but
same apic id as a valid processor. This causes
acpi_processor_start() to think this disabled
cpu is ok, and croak. So we dont record bad
apicid's anymore.
http://bugzilla.kernel.org/show_bug.cgi?id=5930
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jan Beulich [Fri, 3 Feb 2006 20:51:47 +0000 (21:51 +0100)]
[PATCH] x86_64: minor odering correction to dump_pagetable()
Checking of the validity of pointers should be consistently done before
dereferencing the pointer.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jan Beulich [Fri, 3 Feb 2006 20:51:44 +0000 (21:51 +0100)]
[PATCH] x86_64: small fix for CFI annotations
Conditionalize two unwind directives to match other similarly
conditional code.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Cc: Jim Houston <jim.houston@ccur.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:41 +0000 (21:51 +0100)]
[PATCH] x86_64: Calibrate APIC timer using PM timer
On some broken motherboards (at least one NForce3 based AMD64 laptop)
the PIT timer runs at a incorrect frequency. This patch adds a new
option "apicpmtimer" that allows to use the APIC timer and calibrate it
using the PMTimer. It requires the earlier patch that allows to run the
main timer from the APIC.
Specifying apicpmtimer implies apicmaintimer.
The option defaults to off for now.
I tested it on a few systems and the resulting APIC timer frequencies
were usually a bit off, but always <1%, which should be tolerable.
TBD figure out heuristic to enable this automatically on the affected
systems TBD perhaps do it on all NForce3s or using DMI?
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:38 +0000 (21:51 +0100)]
[PATCH] x86_64: Don't allow kprobes on __switch_to
kprobes cannot deal with the funny calling conventions when it
runs on a different stack when it returns. If someone wants
to instrument context switch they can add a probe to schedule()
instead.
Cc: jkenisto@us.ibm.com, prasanna@in.ibm.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Zach Brown [Fri, 3 Feb 2006 20:51:35 +0000 (21:51 +0100)]
[PATCH] x86_64: align per-cpu section to configured cache bytes
Align the start of the per-cpu section to the configured number of bytes in a
cache line. This stops a BUG_ON() from triggering in load_module() when
DEFINE_PER_CPU() is used in a module and the section isn't cacheline-aligned.
Rusty also found this and sent a patch in a while ago
(http://lkml.org/lkml/2004/10/19/17), I don't know what came of that.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kevin VanMaren [Fri, 3 Feb 2006 20:51:32 +0000 (21:51 +0100)]
[PATCH] x86_64: When allocation of merged SG lists fails in the IOMMU don't merge
[ AK: I redid Kevin's fix to be simpler, but the idea and original
analysis of the problem is from Kevin]
This avoid allocation failures on some SATA systems like Nvidia CK8
when the IOMMU gets fragmented. Modern SATA devices have quite large queues
(128 entries) and the FS with ext2/3 is good enough now that it often
passes whole 128 page sg lists down to the driver. These require
512K of continuous free space in the IOMMU aperture to map when merged.
When the IOMMU is fragmented this could lead to spurious IO errors
due to failing mappings.
Short term fix is to just try to map the SG list again unmerged
page by page - this way fragmentation doesn't matter anymore.
The code for that was already there, but it just wasn't enabled for the
merge case.
According to Kevin at least the Nvidia device doesn't seem to benefit
from merging much anyways, so the only slowdown is from trying
to do an unnecessary merge attempt.
Kevin plans to implement better fragmentation avoidance in the future,
but that wouldn't be 2.6.16 material.
TBD: should add some statistic counters to count how often that really
happens.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:29 +0000 (21:51 +0100)]
[PATCH] x86_64: Fix zero mcfg entry workaround on x86-64
I broke this earlier when moving the patch from i386 to x86-64.
Need to return the virtual address here, not the physical address.
This fixes some boot time crashes on x86-64.
Cc: gregkh@suse.de
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:26 +0000 (21:51 +0100)]
[PATCH] x86_64: Do more checking in the SRAT header code
- Check if the processor/memory affinity entries are long enough
according to the ACPI 3.0 spec.
- Ignore memory affinity entries that define a zero length region.
All based on BIOS issues found in the field @)
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ashok Raj [Fri, 3 Feb 2006 20:51:23 +0000 (21:51 +0100)]
[PATCH] x86_64: data/functions wrongly marked as __init with cpu hotplug.
attached patch is 2 more cases i found via running the reference_init.pl
script. These were easy to spot just knowing the file names. There is
one another about init/main.c that i cant exactly zero in. (partly
because i dont know how to interpret the data thats spewed out of the tool).
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shaohua Li [Fri, 3 Feb 2006 20:51:20 +0000 (21:51 +0100)]
[PATCH] x86_64: mark two routines as __cpuinit
SIgned-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:17 +0000 (21:51 +0100)]
[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing
Might fix boot failures on systems with empty PXMs in SRAT
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chen, Kenneth W [Fri, 3 Feb 2006 20:51:14 +0000 (21:51 +0100)]
[PATCH] x86_64: Fix memory policy build without CONFIG_HUGETLBFS
> mm/mempolicy.c: In function `huge_zonelist':
> mm/mempolicy.c:1045: error: `HPAGE_SHIFT' undeclared (first use in this function)
> mm/mempolicy.c:1045: error: (Each undeclared identifier is reported only once
> mm/mempolicy.c:1045: error: for each function it appears in.)
> make[1]: *** [mm/mempolicy.o] Error 1
Need to wrap huge_zonelist function with CONFIG_HUGETLBFS.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:11 +0000 (21:51 +0100)]
[PATCH] x86_64: Remove rogue default y in EDAC Kconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:08 +0000 (21:51 +0100)]
[PATCH] x86_64: Remove CONFIG_INIT_DEBUG
It has been enabled by default for some time now and is cheap enough
so it doesn't matter anyways.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ravikiran G Thirumalai [Fri, 3 Feb 2006 20:51:05 +0000 (21:51 +0100)]
[PATCH] x86_64: Fix the node cpumask of a cpu going down
Currently, x86_64 and ia64 arches do not clear the corresponding bits
in the node's cpumask when a cpu goes down or cpu bring up is cancelled.
This is buggy since there are pieces of common code where the cpumask is
checked in the cpu down code path to decide on things (like in the slab
down path). PPC does the right thing, but x86_64 and ia64 don't (This
was the reason Sonny hit upon a slab bug during cpu offline on ppc and
could not reproduce on other arches). This patch fixes it for x86_64.
I won't attempt ia64 as I cannot test it.
Credit for spotting this should go to Alok.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Fri, 3 Feb 2006 20:51:02 +0000 (21:51 +0100)]
[PATCH] x86_64: Undo the earlier changes to remove unrolled copy/memset functions
They cause quite bad performance regressions on Netburst
This is temporary until we can get new optimized functions
for these CPUs.
This undoes changes that were done in 2.6.15 and in 2.6.16-rc1,
essentially bringing the code back to 2.6.14 level. Only change
is I renamed the X86_FEATURE_K8_C flag to X86_FEATURE_REP_GOOD
and fixed the check for the flag and also fixed some comments.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>