review.tizen.org Git - kernel/kernel-generic.git/log

Linus Torvalds [Fri, 9 May 2008 01:41:48 +0000 (18:41 -0700)]

Revert "PCI: remove default PCI expansion ROM memory allocation"

This reverts commit 9f8daccaa05c14e5643bdd4faf5aed9cc8e6f11e, which was
reported to break X startup (xf86-video-ati-6.8.0). See

http://bugs.freedesktop.org/show_bug.cgi?id=15523

for details.

Reported-by: Laurence Withers <l@lwithers.me.uk>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linus Torvalds [Thu, 8 May 2008 18:31:07 +0000 (11:31 -0700)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes:
  sched: fix weight calculations
  semaphore: fix

Linus Torvalds [Thu, 8 May 2008 17:58:45 +0000 (10:58 -0700)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  [ALSA] soc at91 minor bug fixes
  [ALSA] soc - at91-pcm - Fix line wrapping
  pcspkr: fix dependancies

Huang Weiyi [Thu, 8 May 2008 14:48:31 +0000 (22:48 +0800)]

Remove duplicated include in net/sunrpc/svc.c

<linux/sched.h> was included twice.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Huang Weiyi [Thu, 8 May 2008 14:36:27 +0000 (22:36 +0800)]

fs/proc/task_mmu.c: remove duplicated include files

Removed duplicated include files <linux/ptrace.h> and <linux/seq_file.h> in
fs/proc/task_mmu.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Ingo Molnar [Wed, 30 Apr 2008 07:48:07 +0000 (09:48 +0200)]

Fix drivers/media build for modular builds

Fix allmodconfig build bug introduced in latest -git by commit
7c91f0624a9 ("V4L/DVB(7767): Move tuners to common/tuners"):

  LD      kernel/built-in.o
  LD      drivers/built-in.o
  ld: drivers/media/built-in.o: No such file: No such file or directory

which happens if all media drivers are modular:

  http://redhat.com/~mingo/misc/config-Wed_Apr_30_09_24_48_CEST_2008.bad

In that case there's no obj-y rule connecting all the built-in.o files and
the link tree breaks.

The fix is to add a guaranteed obj-y rule for the core vmlinux to build
(which results in an empty object file if all media drivers are modular).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linus Torvalds [Thu, 8 May 2008 17:50:34 +0000 (10:50 -0700)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/ehca: Wait for async events to finish before destroying QP
  IB/ipath: Fix SDMA error recovery in absence of link status change
  IB/ipath: Need to always request and handle PIO avail interrupts
  IB/ipath: Fix count of packets received by kernel
  IB/ipath: Return the correct opcode for RDMA WRITE with immediate
  IB/ipath: Fix bug that can leave sends disabled after freeze recovery
  IB/ipath: Only increment SSN if WQE is put on send queue
  IB/ipath: Only warn about prototype chip during init
  RDMA/cxgb3: Fix severe limit on userspace memory registration size
  RDMA/cxgb3: Don't add PBL memory to gen_pool in chunks

David Howells [Wed, 7 May 2008 14:31:54 +0000 (15:31 +0100)]

MN10300: Make cpu_relax() invoke barrier()

Make cpu_relax() invoke barrier() to be the same as other arches.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linus Torvalds [Thu, 8 May 2008 17:48:36 +0000 (10:48 -0700)]

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Revert "relay: fix splice problem"
  docbook: fix bio missing parameter
  block: use unitialized_var() in bio_alloc_bioset()
  block: avoid duplicate calls to get_part() in disk stat code
  cfq-iosched: make io priorities inherit CPU scheduling class as well as nice
  block: optimize generic_unplug_device()
  block: get rid of likely/unlikely predictions in merge logic
  vfs: splice remove_suid() cleanup
  cfq-iosched: fix RCU race in the cfq io_context destructor handling
  block: adjust tagging function queue bit locking
  block: sysfs store function needs to grab queue_lock and use queue_flag_*()

Linus Torvalds [Thu, 8 May 2008 17:48:03 +0000 (10:48 -0700)]

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: Fix memory corruption when fs mounted with noadinicb option
  udf: Make udf exportable
  udf: fs/udf/partition.c:udf_get_pblock() mustn't be inline

Linus Torvalds [Thu, 8 May 2008 17:47:39 +0000 (10:47 -0700)]

Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] guest page hinting light
  [S390] tty3270: fix put_char fail/success conversion.
  [S390] compat ptrace cleanup
  [S390] s390mach compile warning
  [S390] cio: Fix parsing mechanism for blacklisted devices.
  [S390] cio: Remove cio_msg kernel parameter.
  [S390] s390-kvm: leave sie context on work. Removes preemption requirement
  [S390] s390: Optimize user and work TIF check

Andrew Morton [Wed, 7 May 2008 03:42:42 +0000 (20:42 -0700)]

drivers/scsi/dpt_i2o.c: fix build on alpha

alpha:

drivers/scsi/dpt_i2o.c:1997: error: implicit declaration of function 'adpt_alpha_info'
drivers/scsi/dpt_i2o.c: At top level:
drivers/scsi/dpt_i2o.c:2032: warning: conflicting types for 'adpt_alpha_info'
drivers/scsi/dpt_i2o.c:2032: error: static declaration of 'adpt_alpha_info' follows non-static declaration
drivers/scsi/dpt_i2o.c:1997: error: previous implicit declaration of 'adpt_alpha_info' was here

Due to a copy-n-paste error in drivers/scsi/dpti.h.

Fix that up and remove some of the many daft static-declarations-in-a-header
which this driver enjoys.

Cc: Miquel van Smoorenburg <miquels@cistron.nl>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Paul Menage [Wed, 7 May 2008 03:42:41 +0000 (20:42 -0700)]

Fix cpuset sched_relax_domain_level control file

Due to a merge conflict, the sched_relax_domain_level control file was marked
as being handled by cpuset_read/write_u64, but the code to handle it was
actually in cpuset_common_file_read/write.

Since the value being written/read is in fact a signed integer, it should be
treated as such; this patch adds cpuset_read/write_s64 functions, and uses
them to handle the sched_relax_domain_level file.

With this patch, the sched_relax_domain_level can be read and written, and the
correct contents seen/updated.
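
To see why the signedness matters, here is a minimal userspace sketch. It is
not the cpuset code: the helper names are hypothetical, and -1 is used only as
an example of a negative level. It shows how the same stored value looks when
formatted through an unsigned 64-bit path and through a signed one.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers, not kernel APIs: format a stored level for a reader. */

/* Unsigned path: a negative level is reinterpreted as a huge positive number. */
static int show_as_u64(char *buf, size_t len, int64_t level)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, (uint64_t)level);
}

/* Signed path: the level is printed as the integer it really is. */
static int show_as_s64(char *buf, size_t len, int64_t level)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRId64, level);
}
```

With a level of -1, the unsigned helper produces a 20-digit number while the
signed helper produces "-1", which is the behaviour a signed control file needs.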

Signed-off-by: Paul Menage <menage@google.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Benjamin Herrenschmidt [Wed, 7 May 2008 03:42:39 +0000 (20:42 -0700)]

slub: fix atomic usage in any_slab_objects()

any_slab_objects() does an atomic_read on an atomic_long_t; this
fixes it to use atomic_long_read instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Ulrich Drepper [Wed, 7 May 2008 03:42:38 +0000 (20:42 -0700)]

sys_pipe(): fix file descriptor leaks

Remember to close the files if copy_to_user() failed.

Spotted by dm.n9107@gmail.com.
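
The pattern can be sketched in userspace as follows. This is only an analogue
of the kernel fix: copy_fn is a hypothetical stand-in for copy_to_user(), and
pipe_checked(), copy_ok(), copy_fail() and last_fds are names invented for the
illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for copy_to_user(): returns 0 on success. */
typedef int (*copy_fn)(int dst[2], const int src[2]);

static int copy_ok(int dst[2], const int src[2])
{
	dst[0] = src[0];
	dst[1] = src[1];
	return 0;
}

static int copy_fail(int dst[2], const int src[2])
{
	(void)dst;
	(void)src;
	return 1; /* simulate a faulting destination buffer */
}

/* Last descriptors created; kept only so the failure path can be inspected. */
static int last_fds[2] = { -1, -1 };

/* Create a pipe and hand both ends to the caller, or leak nothing on failure. */
static int pipe_checked(int user_fds[2], copy_fn copy)
{
	int fds[2];

	assert(user_fds != NULL && copy != NULL);
	if (pipe(fds) != 0)
		return -1;
	last_fds[0] = fds[0];
	last_fds[1] = fds[1];

	if (copy(user_fds, fds) != 0) {
		/* The caller never learned the numbers, so close both ends here. */
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	return 0;
}
```

Without the two close() calls, every failed copy would leave two open
descriptors that the caller has no way to name or release.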

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: DM <dm.n9107@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Samuel Thibault [Wed, 7 May 2008 03:42:37 +0000 (20:42 -0700)]

vt: fix canonical input in UTF-8 mode

For proper TTY canonical-mode support, for example, the IUTF8 termios flag
has to be set as appropriate. Linux previously did not set that flag for VT TTYs.

This patch fixes that by activating it according to the current mode of the
VT, and sets the default value according to the vt.default_utf8 parameter.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Mattia Dongili [Wed, 7 May 2008 03:42:35 +0000 (20:42 -0700)]

usb/asix: add Buffalo LUA-U2-GT 10/100/1000

The USB net adapter Buffalo LUA-U2-GT (0411:006e) carries an AX88178 chip.
Tested on the above HW.

Signed-off-by: Mattia Dongili <malattia@linux.it>
Acked-by: David Hollis <dhollis@davehollis.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Andrew Morton [Wed, 7 May 2008 03:42:35 +0000 (20:42 -0700)]

sx.c: fix printk warnings on sparc32

drivers/char/sx.c: In function 'sx_set_real_termios':
drivers/char/sx.c:973: warning: format '%u' expects type 'unsigned int', but argument 2 has type 'long unsigned int'
drivers/char/sx.c:999: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'tcflag_t'
drivers/char/sx.c:1012: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'tcflag_t'

sparc32 seems to use weird types for its tty things.

[ Fine by me but this is ancient debug and most of the debug in sx just
wants deleting eventually. - Alan ]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

WANG Cong [Wed, 7 May 2008 03:42:33 +0000 (20:42 -0700)]

uml: fix inconsistence due to tty_operation change

The return type of 'put_char' in 'struct tty_operations' has changed from
'void' to 'int'. Adapting to it also silences compiler warnings.

Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Harvey Harrison [Wed, 7 May 2008 03:42:32 +0000 (20:42 -0700)]

misc: fix integer as NULL pointer warnings

drivers/md/raid10.c:889:17: warning: Using plain integer as NULL pointer
drivers/media/video/cx18/cx18-driver.c:616:12: warning: Using plain integer as NULL pointer
sound/oss/kahlua.c:70:12: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Steven Rostedt [Wed, 7 May 2008 03:42:31 +0000 (20:42 -0700)]

fix irq flags for iuu_phoenix.c

The file drivers/usb/serial/iuu_phoenix.c uses "int" for flags. This can
cause hard-to-find bugs on some architectures. This patch converts the flags
to use "long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel where
checks are done to ensure all flags are of size sizeof(long).
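
A minimal sketch of the failure mode, assuming an architecture whose saved
interrupt state is 64 bits wide. The type and function names are hypothetical,
and fixed-width types are used so the truncation is visible on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for saved interrupt state on an architecture whose flags word
 * is 64 bits wide (an assumption made for this illustration). */
typedef uint64_t wide_flags_t;

/* Buggy pattern: the saved state passes through a 32-bit variable. */
static wide_flags_t roundtrip_through_int(wide_flags_t saved)
{
	uint32_t flags = (uint32_t)saved; /* upper 32 bits are discarded here */
	return flags;
}

/* Fixed pattern: the variable is as wide as the saved state. */
static wide_flags_t roundtrip_through_long(wide_flags_t saved)
{
	wide_flags_t flags = saved;
	return flags;
}

/* The kind of size check described above, written as an expression. */
#define FLAGS_SIZE_OK(var) (sizeof(var) == sizeof(unsigned long))

/* Example use of the check on a correctly typed variable. */
static void check_flags_variable(void)
{
	unsigned long flags = 0;

	assert(FLAGS_SIZE_OK(flags));
	(void)flags;
}
```

Any state bit that lives above bit 31 is silently lost in the first variant,
which is why such bugs only show up on some architectures.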

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Steven Rostedt [Wed, 7 May 2008 03:42:30 +0000 (20:42 -0700)]

fix irq flags in rtc-ds1511

The file drivers/rtc/rtc-ds1511.c uses "int" for flags. This can cause
hard-to-find bugs on some architectures. This patch converts the flags to use
"long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel where
checks are done to ensure all flags are of size sizeof(long).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Steven Rostedt [Wed, 7 May 2008 03:42:29 +0000 (20:42 -0700)]

fix irq flags in saa7134

Some files in the drivers/media/video/saa7134 directory use "int" for flags.
This can cause hard-to-find bugs on some architectures. This patch converts
the flags to use "long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel where
checks are done to ensure all flags are of size sizeof(long).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Steven Rostedt [Wed, 7 May 2008 03:42:28 +0000 (20:42 -0700)]

fix irq flags in mac80211 code

A file in the net/mac80211 directory uses "int" for flags. This can cause
hard-to-find bugs on some architectures. This patch converts the flags to use
"long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel where
checks are done to ensure all flags are of size sizeof(long).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Tetsuo Handa [Wed, 7 May 2008 03:42:27 +0000 (20:42 -0700)]

serial: access after NULL check in uart_flush_buffer()

I noticed that

  static void uart_flush_buffer(struct tty_struct *tty)
  {
          struct uart_state *state = tty->driver_data;
          struct uart_port *port = state->port;
          unsigned long flags;

          /*
           * This means you called this function _after_ the port was
           * closed.  No cookie for you.
           */
          if (!state || !state->info) {
                  WARN_ON(1);
                  return;
          }

is too late for checking state != NULL, since state->port has already been
dereferenced by then.
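
One way to order the check correctly is sketched below. The structures are
simplified stand-ins with only the fields needed here, and
flush_buffer_checked() is a hypothetical name; this is not the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields used here. */
struct uart_port { int flushed; };
struct uart_info { int unused; };
struct uart_state { struct uart_port *port; struct uart_info *info; };
struct tty_struct { void *driver_data; };

/* Returns 0 on success, -1 if the port was already closed. */
static int flush_buffer_checked(struct tty_struct *tty)
{
	struct uart_state *state;
	struct uart_port *port;

	assert(tty != NULL);
	state = tty->driver_data;

	/* Validate state before touching any of its members. */
	if (!state || !state->info)
		return -1;

	port = state->port; /* safe: state is known to be non-NULL */
	port->flushed = 1;
	return 0;
}
```

The only change from the quoted code is that the state->port load moves below
the NULL check, so a closed port is rejected instead of dereferenced.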

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Samuel Thibault [Wed, 7 May 2008 03:42:26 +0000 (20:42 -0700)]

Kconfig: improved help for CONFIG_ACCESSIBILITY

Add a small explanation of what accessibility is.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Mike Galbraith [Thu, 8 May 2008 15:00:42 +0000 (17:00 +0200)]

sched: fix weight calculations

The conversion between virtual and real time is as follows:

dvt = rw/w * dt <=> dt = w/rw * dvt

Since we want the fair sleeper granularity to be in real time, we actually
need to do:

dvt = - rw/w * l
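
The conversions above can be written out as a small sketch. The function names
and any sample values are made up, rw and w are treated only as two positive
weights as in the formulas, and integer arithmetic is used for simplicity.

```c
#include <assert.h>

/* dvt = rw/w * dt */
static long real_to_virtual(long dt, long rw, long w)
{
	assert(w > 0);
	return dt * rw / w;
}

/* dt = w/rw * dvt */
static long virtual_to_real(long dvt, long rw, long w)
{
	assert(rw > 0);
	return dvt * w / rw;
}

/* dvt = -rw/w * l: a real-time granularity l expressed as a virtual-time offset. */
static long granularity_to_virtual(long l, long rw, long w)
{
	return -real_to_virtual(l, rw, w);
}
```

For example, with rw = 3072 and w = 1024, a real-time delta of 4 maps to a
virtual delta of 12 and back to 4, and a granularity of 4 maps to -12.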

This bug could be related to the regression reported by Yanmin Zhang:

| Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has lots
| of regressions with 2.6.26-rc1:
|
| 1) 8-core stoakley: 28%;
| 2) 16-core tigerton: 20%;
| 3) Itanium Montvale: 50%.

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ingo Molnar [Thu, 8 May 2008 09:53:48 +0000 (11:53 +0200)]

semaphore: fix

Yanmin Zhang reported:

| Comparing with kernel 2.6.25, AIM7 (use tmpfs) has more th
| regression under 2.6.26-rc1 on my 8-core stoakley, 16-core tigerton,
| and Itanium Montecito. Bisect located the patch below:
|
| 64ac24e738823161693bf791f87adc802cf529ff is first bad commit
| commit 64ac24e738823161693bf791f87adc802cf529ff
| Author: Matthew Wilcox <matthew@wil.cx>
| Date:   Fri Mar 7 21:55:58 2008 -0500
|
|     Generic semaphore implementation
|
| After I manually reverted the patch against 2.6.26-rc1 while fixing
| lots of conflicts/errors, aim7 regression became less than 2%.

i reproduced the AIM7 workload and can confirm Yanmin's findings that
-.26-rc1 regresses over .25 - by over 67% here.

Looking at the workload i found and fixed what i believe to be the real
bug causing the AIM7 regression: it was inefficient wakeup / scheduling
/ locking behavior of the new generic semaphore code, causing suboptimal
performance.

The problem comes from the following code. The new semaphore code does
this on down():

        spin_lock_irqsave(&sem->lock, flags);
        if (likely(sem->count > 0))
                sem->count--;
        else
                __down(sem);
        spin_unlock_irqrestore(&sem->lock, flags);

and this on up():

        spin_lock_irqsave(&sem->lock, flags);
        if (likely(list_empty(&sem->wait_list)))
                sem->count++;
        else
                __up(sem);
        spin_unlock_irqrestore(&sem->lock, flags);

where __up() does:

        list_del(&waiter->list);
        waiter->up = 1;
        wake_up_process(waiter->task);

and where __down() does this in essence:

        list_add_tail(&waiter.list, &sem->wait_list);
        waiter.task = task;
        waiter.up = 0;
        for (;;) {
                [...]
                spin_unlock_irq(&sem->lock);
                timeout = schedule_timeout(timeout);
                spin_lock_irq(&sem->lock);
                if (waiter.up)
                        return 0;
        }

the fastpath looks good and obvious, but note the following property of
the contended path: if there's a task on the ->wait_list, the up() of
the current owner will "pass over" ownership to that waiting task, in a
wake-one manner, via the waiter->up flag and by removing the waiter from
the wait list.

That is all well and good in principle, but as implemented in
kernel/semaphore.c it also creates a nasty, hidden source of contention!

The contention comes from the following property of the new semaphore
code: the new owner owns the semaphore exclusively, even if it is not
running yet.

So if the old owner, even if just a few instructions later, does a
down() [lock_kernel()] again, it will be blocked and will have to wait
on the new owner to eventually be scheduled (possibly on another CPU)!
Or if another task gets to lock_kernel() sooner than the "new owner"
scheduled, it will be blocked unnecessarily and for a very long time
when there are 2000 tasks running.

I.e. the implementation of the new semaphores code does wake-one and
lock ownership in a very restrictive way - it does not allow
opportunistic re-locking of the lock at all and keeps the scheduler from
picking task order intelligently.

This kind of scheduling, with 2000 AIM7 processes running, creates awful
cross-scheduling between those 2000 tasks, causes reduced parallelism, a
throttled runqueue length and a lot of idle time. With an increasing number
of CPUs it causes exponentially worse behavior in AIM7, as a newly woken
new-owner task becomes less and less likely to actually run anytime
soon.

Note that it takes just a tiny bit of contention for the 'new-semaphore
catastrophe' to happen: the wakeup latencies get added to whatever small
contention there is, and quickly snowball out of control!

I believe Yanmin's findings and numbers support this analysis too.

The best fix for this problem is to use the same scheduling logic that
the kernel/mutex.c code uses: keep the wake-one behavior (that is OK and
wanted because we do not want to over-schedule), but also allow
opportunistic locking of the lock even if a wakee is already "in
flight".
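
The difference between the two policies can be shown with a toy,
single-threaded model. It has no real blocking, spinlocks or scheduler, and
every name in it is hypothetical. It is a sketch of the idea, not the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum wake_policy { HANDOFF, OPPORTUNISTIC };

struct toy_sem {
	int count;    /* free slots */
	int waiters;  /* tasks sleeping on the wait list */
	int reserved; /* HANDOFF only: slots owned by woken tasks not yet running */
};

/* A task that is running right now tries to take the semaphore. */
static bool toy_try_down(struct toy_sem *s)
{
	if (s->count > 0) {
		s->count--;
		return true;
	}
	s->waiters++; /* a real implementation would sleep here */
	return false;
}

/* Release the semaphore, waking at most one waiter. */
static void toy_up(struct toy_sem *s, enum wake_policy policy)
{
	if (s->waiters > 0 && policy == HANDOFF) {
		s->waiters--;
		s->reserved++; /* ownership passes to the sleeper; count stays 0 */
		return;
	}
	/* OPPORTUNISTIC: the slot is free for whoever runs first; one waiter
	 * would also be woken here and would retry once it gets the CPU. */
	s->count++;
}

/* Can the releasing task re-take the semaphore before the wakee has run? */
static bool relock_before_wakee_runs(enum wake_policy policy)
{
	struct toy_sem s = { 1, 0, 0 };
	bool ok;

	ok = toy_try_down(&s); /* task A takes the semaphore */
	assert(ok);
	ok = toy_try_down(&s); /* task B finds it busy and queues up */
	assert(!ok);
	toy_up(&s, policy);    /* A releases; B is woken but not running yet */
	return toy_try_down(&s); /* A immediately asks again */
}
```

Under HANDOFF the last call fails even though nobody is running inside the
critical section; under OPPORTUNISTIC it succeeds and the wakee simply retries.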

The patch below implements this new logic. With this patch applied the
AIM7 regression is largely fixed on my quad testbox:

  # v2.6.25 vanilla:
  ..................
  Tasks   Jobs/Min        JTI     Real    CPU     Jobs/sec/task
  2000    56096.4         91      207.5   789.7   0.4675
  2000    55894.4         94      208.2   792.7   0.4658

  # v2.6.26-rc1-166-gc0a1811 vanilla:
  ...................................
  Tasks   Jobs/Min        JTI     Real    CPU     Jobs/sec/task
  2000    33230.6         83      350.3   784.5   0.2769
  2000    31778.1         86      366.3   783.6   0.2648

  # v2.6.26-rc1-166-gc0a1811 + semaphore-speedup:
  ...............................................
  Tasks   Jobs/Min        JTI     Real    CPU     Jobs/sec/task
  2000    55707.1         92      209.0   795.6   0.4642
  2000    55704.4         96      209.0   796.0   0.4642

i.e. a 67% speedup. We are now back to within 1% of the v2.6.25
performance levels and have zero idle time during the test, as expected.
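
These percentages can be checked against the first Jobs/Min row of each table
above. The helper below is a hypothetical convenience, not part of any tool
used in the measurements.

```c
#include <assert.h>

/* Relative change from before to after, in percent. */
static double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* First "Jobs/Min" row of each table in the message. */
static const double jobs_v2625 = 56096.4;   /* v2.6.25 vanilla */
static const double jobs_rc1 = 33230.6;     /* v2.6.26-rc1-166-gc0a1811 vanilla */
static const double jobs_patched = 55707.1; /* same tree + semaphore-speedup */
```

percent_change(jobs_rc1, jobs_patched) comes out at roughly 67.6, and
percent_change(jobs_v2625, jobs_patched) at roughly -0.7, consistent with the
figures quoted.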

Btw., interactivity also improved dramatically with the fix - for
example console-switching became almost instantaneous during this
workload (which after all is running 2000 tasks at once!); without the
patch it was stuck for a minute at times.

There's another nice side-effect of this speedup patch, the new generic
semaphore code got even smaller:

   text    data     bss     dec     hex filename
   1241       0       0    1241     4d9 semaphore.o.before
   1207       0       0    1207     4b7 semaphore.o.after

(because the waiter.up complication got removed.)

Longer-term we should look into using the mutex code for the generic
semaphore code as well - but it's not easy due to legacies and it's
outside of the scope of v2.6.26 and outside the scope of this patch as
well.

Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Jens Axboe [Thu, 8 May 2008 12:06:19 +0000 (14:06 +0200)]

Revert "relay: fix splice problem"

This reverts commit c3270e577c18b3d0e984c3371493205a4807db9d.

Patrik Sevallius [Thu, 8 May 2008 12:04:08 +0000 (14:04 +0200)]

[ALSA] soc at91 minor bug fixes

Found these two bugs while browsing through the code. The first one is
a cut-n-paste bug: instead of disabling the clock when request_irq()
fails, it enabled it once more. The second one fixes a debug printout:
AT91_SSC_IER is write-only, while AT91_SSC_IMR is readable (the printed
string actually says imr).

Frank Mandarino was busy so he asked me to send these to this list.

/Patrik

Signed-off-by: Patrik Sevallius <patrik.sevallius@enea.com>
Acked-by: Frank Mandarino <fmandarino@endrelia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Mark Brown [Thu, 8 May 2008 12:03:30 +0000 (14:03 +0200)]

[ALSA] soc - at91-pcm - Fix line wrapping

There's more checkpatch stuff to fix in the driver, this just fixes the
minimum required for the following patch to be clean.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Linus Torvalds [Thu, 8 May 2008 00:04:49 +0000 (17:04 -0700)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Fix fork/clone/vfork system call restart.
  sparc: Fix mmap VA span checking.

David S. Miller [Wed, 7 May 2008 23:21:28 +0000 (16:21 -0700)]

sparc: Fix fork/clone/vfork system call restart.

We clobber %i1 as well as %i0 for these system calls,
because they give two return values.

Therefore, on error, we have to restore %i1 properly
or else the restart explodes since it uses the wrong
arguments.

This fixes glibc's nptl/tst-eintr1.c testcase.

Signed-off-by: David S. Miller <davem@davemloft.net>

Auke Kok [Wed, 7 May 2008 20:42:33 +0000 (13:42 -0700)]

[MAINTAINERS] New maintainer for Intel ethernet adapters

I'm handing over maintainership to Jeff Kirsher and moving on
to other Linux/Open Source work within Intel. Good luck to Jeff ;)

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Stefan Roscher [Wed, 7 May 2008 18:35:06 +0000 (11:35 -0700)]

IB/ehca: Wait for async events to finish before destroying QP

This is necessary because, in a multicore environment, a race between
uverbs async handler and destroy QP could occur.

Signed-off-by: Stefan Roscher <stefan.roscher at de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

John Gregor [Wed, 7 May 2008 18:01:10 +0000 (11:01 -0700)]

IB/ipath: Fix SDMA error recovery in absence of link status change

What's fixed:

    in ipath_cancel_sends()

        We need to unconditionally set ABORTING.  So, swap the tests
        so the set_bit() isn't shadowed by the &&.

        If we've disarmed the piobufs, then we need to unconditionally
        set DISARMED.  So, move it out from the overly protective if
        at the bottom.

    in sdma_abort_task()

        Abort_task was written knowing that the SDMA engine would always
        be reset (and restarted) on error.  A recent change broke that
        fundamental assumption by taking the restart portion and making
        it conditional on a link status change.  But, SDMA can go boom
        without a link status change in some conditions.
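
The first point, a set_bit() shadowed by &&, is ordinary C short-circuit
evaluation. A minimal sketch follows; TOY_ABORTING, toy_test_and_set() and the
other_condition parameter are hypothetical stand-ins, not the driver's
identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ABORTING 0x1u /* hypothetical status bit for this illustration */

/* Sets bit in *status and reports whether it was already set. */
static bool toy_test_and_set(unsigned int *status, unsigned int bit)
{
	bool was_set;

	assert(status != NULL);
	was_set = (*status & bit) != 0;
	*status |= bit;
	return was_set;
}

/* Shadowed form: if other_condition is false, && short-circuits and the
 * bit is never set. */
static bool start_abort_shadowed(unsigned int *status, bool other_condition)
{
	return other_condition && !toy_test_and_set(status, TOY_ABORTING);
}

/* Swapped form: the bit is always set first, then the other test applies. */
static bool start_abort_swapped(unsigned int *status, bool other_condition)
{
	return !toy_test_and_set(status, TOY_ABORTING) && other_condition;
}
```

Both functions return the same truth value; they differ only in whether the
side effect happens when the other condition is false.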

Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Dave Olson [Wed, 7 May 2008 18:00:15 +0000 (11:00 -0700)]

IB/ipath: Need to always request and handle PIO avail interrupts

Now that we always use PIO for vl15 on 7220, we could get stuck forever
if we happened to run out of PIO buffers from the verbs code, because
the setup code wouldn't run; the interrupt was also ignored if SDMA was
supported. We also have to reduce the pio update threshold if we have
fewer kernel buffers than the existing threshold.

Clean up the initialization a bit to get ordering safer and more
sensible, and use the existing ipath_chg_kernavail call to do init,
rather than doing it separately.

Drop unnecessary clearing of pio buffer on pio parity error.

Drop incorrect updating of pioavailshadow when exiting freeze mode
(software state may not match chip state if buffer has been allocated
and not yet written).

If we couldn't get a kernel buffer for a while, make sure we are
in sync with hardware, mainly to handle the exiting freeze case.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Michael Albaugh [Wed, 7 May 2008 17:59:23 +0000 (10:59 -0700)]

IB/ipath: Fix count of packets received by kernel

The loop in ipath_kreceive() that processes packets increments the
loop-index 'i' once too often, because the exit condition does not
depend on it, and is checked after the increment. By adding a check for
!last to the iterator in the for loop, we correct that in a way that is
not so likely to be re-broken by changes in the loop body.
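
The off-by-one can be reproduced with a small sketch. It assumes the counter
starts at 1 and is read as the packet count after the loop; the function names
and loop body are hypothetical simplifications of the real receive loop.

```c
#include <assert.h>

/* Exit is signalled by 'last' from inside the body, so a plain i++ still
 * runs once after the final packet. */
static int count_with_plain_increment(int npkts)
{
	int i, last, handled = 0;

	assert(npkts >= 1);
	for (last = 0, i = 1; !last; i++) {
		handled++; /* process one packet */
		if (handled == npkts)
			last = 1;
	}
	return i; /* npkts + 1: one too many */
}

/* The iterator only advances while another iteration will follow. */
static int count_with_guarded_increment(int npkts)
{
	int i, last, handled = 0;

	assert(npkts >= 1);
	for (last = 0, i = 1; !last; i += !last) {
		handled++; /* process one packet */
		if (handled == npkts)
			last = 1;
	}
	return i; /* npkts */
}
```

Putting the guard in the iterator keeps the count right no matter where in the
body 'last' gets set, which is the robustness the message refers to.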

Signed-off-by: Michael Albaugh <micheal.albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Ralph Campbell [Wed, 7 May 2008 17:58:50 +0000 (10:58 -0700)]

IB/ipath: Return the correct opcode for RDMA WRITE with immediate

This patch fixes a bug in the RC responder which generates a completion
entry with the wrong opcode when an RDMA WRITE with immediate is received.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Dave Olson [Wed, 7 May 2008 17:57:48 +0000 (10:57 -0700)]

IB/ipath: Fix bug that can leave sends disabled after freeze recovery

The semantics of cancel_sends changed, but the code using it was missed.
Don't leave sends and pioavail updates disabled, and add a comment as to
why the force update is needed.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Ralph Campbell [Wed, 7 May 2008 17:57:14 +0000 (10:57 -0700)]

IB/ipath: Only increment SSN if WQE is put on send queue

If a send work request has immediate errors and is not put on the
send queue, we shouldn't update any of the QP state.

The increment of the SSN wasn't obeying this.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Michael Albaugh [Wed, 7 May 2008 17:56:47 +0000 (10:56 -0700)]

IB/ipath: Only warn about prototype chip during init

We warn about prototype chips, but the function that checks for
support is also called as a result of a get_portinfo request, which
can clutter the logs.

Restrict warning to only appear during initialization.

Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Randy Dunlap [Wed, 30 Apr 2008 07:08:54 +0000 (09:08 +0200)]

docbook: fix bio missing parameter

Fix fs/bio.c kernel-doc parameter warning:
Warning(linux-2.6.25-git14//fs/bio.c:972): No description found for parameter 'reading'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Jens Axboe [Wed, 7 May 2008 11:26:27 +0000 (13:26 +0200)]

block: use unitialized_var() in bio_alloc_bioset()

Better than setting idx to some random value and it silences the
same bogus gcc warning.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Stas Sergeev [Wed, 7 May 2008 10:39:56 +0000 (12:39 +0200)]

pcspkr: fix dependancies

Fix pcspkr dependencies: make the pcspkr platform
drivers depend on a platform device, and
not the other way around.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
CC: Vojtech Pavlik <vojtech@suse.cz>
CC: Michael Opdenacker <michael-lists@free-electrons.com>
[fixed for 2.6.26-rc1 by tiwai]
Signed-off-by: Takashi Iwai <tiwai@suse.de>

David S. Miller [Wed, 7 May 2008 09:24:28 +0000 (02:24 -0700)]

sparc: Fix mmap VA span checking.

We should not conditionalize VA range checks on MAP_FIXED.

Signed-off-by: David S. Miller <davem@davemloft.net>

Jens Axboe [Wed, 7 May 2008 08:15:46 +0000 (10:15 +0200)]

block: avoid duplicate calls to get_part() in disk stat code

get_part() is fairly expensive, as it loops over the partitions in O(N)
time to find the right one. To make matters worse, in lots of normal IO
paths we end up looking up the partition twice. Change the stat add code
to accept a passed-in partition instead.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Jens Axboe [Wed, 7 May 2008 07:51:23 +0000 (09:51 +0200)]

cfq-iosched: make io priorities inherit CPU scheduling class as well as nice

We currently set all processes to the best-effort scheduling class,
regardless of what CPU scheduling class they belong to. Improve that
so that we correctly track idle and rt scheduling classes as well.
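
A minimal sketch of such a mapping, assuming that the realtime CPU policies map to the rt IO class and the idle policy to the idle IO class. The enum names below are illustrative stand-ins, not the kernel's constants.

```c
/* Stand-in constants for the sketch; not taken from kernel headers. */
enum cpu_policy { POLICY_NORMAL, POLICY_FIFO, POLICY_RR, POLICY_IDLE };
enum io_class { IO_CLASS_RT = 1, IO_CLASS_BE = 2, IO_CLASS_IDLE = 3 };

/* Derive the default IO scheduling class from the CPU scheduling policy
 * instead of always answering "best effort". */
enum io_class io_class_from_policy(enum cpu_policy policy)
{
	switch (policy) {
	case POLICY_IDLE:
		return IO_CLASS_IDLE;
	case POLICY_FIFO:
	case POLICY_RR:
		return IO_CLASS_RT;
	default:
		return IO_CLASS_BE;
	}
}
```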

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Jan Kara [Tue, 6 May 2008 16:26:17 +0000 (18:26 +0200)]

udf: Fix memory corruption when fs mounted with noadinicb option

When a UDF filesystem is mounted with the noadinicb mount option, it
can happen that we extend an empty directory with a block. The code in
udf_add_entry() didn't account for this possibility and used
uninitialized data, leading to memory and filesystem corruption.
Add a check for whether the file already has some extents before
operating on them.

Signed-off-by: Jan Kara <jack@suse.cz>

commit | commitdiff | tree

Rasmus Rohde [Wed, 30 Apr 2008 15:22:06 +0000 (17:22 +0200)]

udf: Make udf exportable

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Rasmus Rohde <rohde@duff.dk>
Signed-off-by: Jan Kara <jack@suse.cz>

commit | commitdiff | tree

Jens Axboe [Wed, 7 May 2008 07:48:17 +0000 (09:48 +0200)]

block: optimize generic_unplug_device()

Original patch from Mikulas Patocka <mpatocka@redhat.com>

Mike Anderson was doing an OLTP benchmark on a computer with 48 physical
disks mapped to one logical device via device mapper.

He found that there was a slowdown on request_queue->lock in function
generic_unplug_device. The slowdown is caused by the fact that when some
code calls unplug on the device mapper, device mapper calls unplug on all
physical disks. These unplug calls take the lock, find that the queue is
already unplugged, release the lock and exit.

With the below patch, performance of the benchmark was increased by 18%
(the whole OLTP application, not just block layer microbenchmarks).

So I'm submitting this patch for upstream. I think the patch is correct,
because when several threads call plug and unplug simultaneously, it is
unspecified whether the queue is plugged or not (so the patch can't make
this worse). And the caller that plugged the queue should unplug it
anyway (if it doesn't, there's a 3ms timeout).
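
The pattern described here, peeking at the plugged state before taking the lock, can be sketched in userspace C as follows. This is an illustration under assumptions, not the block layer code: `struct queue`, `queue_plug()` and `queue_unplug()` are hypothetical, a pthread mutex stands in for request_queue->lock, and the counters exist only to make the effect visible.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct queue {
	pthread_mutex_t lock;
	atomic_bool plugged;
	int lock_acquisitions;	/* how often the lock was taken */
	int dispatches;		/* how often we really unplugged */
};

void queue_plug(struct queue *q)
{
	pthread_mutex_lock(&q->lock);
	atomic_store_explicit(&q->plugged, true, memory_order_relaxed);
	pthread_mutex_unlock(&q->lock);
}

void queue_unplug(struct queue *q)
{
	/* Unlocked peek: an already unplugged queue needs no work,
	 * so skip the lock entirely. */
	if (!atomic_load_explicit(&q->plugged, memory_order_relaxed))
		return;

	pthread_mutex_lock(&q->lock);
	q->lock_acquisitions++;
	/* Recheck under the lock; another thread may have won the race. */
	if (atomic_load_explicit(&q->plugged, memory_order_relaxed)) {
		atomic_store_explicit(&q->plugged, false, memory_order_relaxed);
		q->dispatches++;
	}
	pthread_mutex_unlock(&q->lock);
}
```

With 48 underlying queues that are mostly unplugged, most calls in such a scheme return after the first load and never touch the lock.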

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Jens Axboe [Wed, 7 May 2008 07:33:55 +0000 (09:33 +0200)]

block: get rid of likely/unlikely predictions in merge logic

They tend to depend a lot on the workload, so not a clear-cut
likely or unlikely fit.
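
For context, a small sketch of what these annotations are, assuming the usual GCC/Clang `__builtin_expect` based definitions. The two functions are hypothetical and only show that the hint changes code layout, never the result.

```c
/* Assumed definitions, relying on the GCC/Clang builtin. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hinted variant: the compiler lays out code for one expected outcome. */
int try_merge_hinted(int mergeable)
{
	if (unlikely(!mergeable))
		return 0;
	return 1;
}

/* Plain variant: no assumption about the workload is baked in. */
int try_merge_plain(int mergeable)
{
	if (!mergeable)
		return 0;
	return 1;
}
```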

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Miklos Szeredi [Wed, 7 May 2008 07:22:39 +0000 (09:22 +0200)]

vfs: splice remove_suid() cleanup

generic_file_splice_write() duplicates remove_suid() just because it
doesn't hold i_mutex. But it grabs i_mutex inside splice_from_pipe()
anyway, so this is rather pointless.

Move locking to generic_file_splice_write() and call remove_suid() and
__splice_from_pipe() instead.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Jens Axboe [Wed, 7 May 2008 07:17:12 +0000 (09:17 +0200)]

cfq-iosched: fix RCU race in the cfq io_context destructor handling

put_io_context() drops the RCU read lock before calling into cfq_dtor(),
however we need to hold off freeing there before grabbing and
dereferencing the first object on the list.

So extend the rcu_read_lock() scope to cover the calling of cfq_dtor(),
and optimize cfq_free_io_context() to use a new variant for
call_for_each_cic() that assumes the RCU read lock is already held.

Hit in the wild by Alexey Dobriyan <adobriyan@gmail.com>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Jens Axboe [Wed, 7 May 2008 07:27:43 +0000 (09:27 +0200)]

block: adjust tagging function queue bit locking

For most initialization purposes, calling blk_queue_init_tags() without
the queue lock held is OK. Only if called for resizing an existing map
must the lock be held. Ditto for tag cleanup, the maps are reference
counted.

So switch the general queue flag setting to the unlocked variant, but
retain the locked variant for resizing.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Martin Schwidefsky [Wed, 7 May 2008 07:22:59 +0000 (09:22 +0200)]

[S390] guest page hinting light

Use the existing arch_alloc_page/arch_free_page callbacks to do
the guest page state transitions between stable and unused.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Heiko Carstens [Wed, 7 May 2008 07:22:58 +0000 (09:22 +0200)]

[S390] tty3270: fix put_char fail/success conversion.

The wrong function got converted ;)

CC drivers/s390/char/tty3270.o
drivers/s390/char/tty3270.c:1747:
warning: initialization from incompatible pointer type

Acked-by: Alan Cox <alan@redhat.com>
Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Roland McGrath [Wed, 7 May 2008 07:22:57 +0000 (09:22 +0200)]

[S390] compat ptrace cleanup

This removes redundant arch code for generic ptrace requests
already handled by ptrace_request and compat_ptrace_request.
It simplifies things to just have the standard entry points,
and use the generic compat_sys_ptrace.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Martin Schwidefsky [Wed, 7 May 2008 07:22:56 +0000 (09:22 +0200)]

[S390] s390mach compile warning

Fix the following compile warning:

drivers/s390/s390mach.c: In function 's390_collect_crw_info':
drivers/s390/s390mach.c:77: warning: ignoring return value of 'down_interruptibl

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Michael Ernst [Wed, 7 May 2008 07:22:55 +0000 (09:22 +0200)]

[S390] cio: Fix parsing mechanism for blacklisted devices.

New format cssid.ssid.devno is now parsed correctly.
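
A hedged sketch of parsing a single cssid.ssid.devno triple, assuming three dot-separated hexadecimal fields. `struct dev_id`, `parse_dev_id()` and the 16-bit limit on devno are assumptions made for the illustration; the real cio parser also handles ranges and other syntax not shown here.

```c
#include <stdio.h>

struct dev_id { unsigned int cssid, ssid, devno; };

/* Returns 0 on success, -1 on malformed input. The devno limit
 * (<= 0xffff) is an assumption of this sketch. */
int parse_dev_id(const char *s, struct dev_id *id)
{
	int used = 0;

	if (sscanf(s, "%x.%x.%x%n", &id->cssid, &id->ssid,
		   &id->devno, &used) != 3)
		return -1;
	if (s[used] != '\0')
		return -1;	/* trailing characters after devno */
	if (id->devno > 0xffff)
		return -1;
	return 0;
}
```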

Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Michael Ernst [Wed, 7 May 2008 07:22:54 +0000 (09:22 +0200)]

[S390] cio: Remove cio_msg kernel parameter.

The only sporadically used CIO_DEBUG messages are replaced by ordinary
CIO_MSG_EVENT messages. The CIO_MSG_EVENT messages debug levels are
consolidated.

Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Christian Borntraeger [Wed, 7 May 2008 07:22:53 +0000 (09:22 +0200)]

[S390] s390-kvm: leave sie context on work. Removes preemption requirement

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

This patch fixes a bug with cpu bound guest on kvm-s390. Sometimes it
was impossible to deliver a signal to a spinning guest. We used
preemption as a circumvention. The preemption notifiers called
vcpu_load, which checked for pending signals and triggered a host
intercept. But even with preemption, a sigkill was not delivered
immediately.

This patch changes the low level host interrupt handler to check for the
SIE instruction, if TIF_WORK is set. In that case we change the
instruction pointer of the return PSW to rerun the vcpu_run loop. The kvm
code sees an intercept reason 0 if that happens. This patch adds accounting
for these types of intercept as well.

The advantages:
- works with and without preemption
- signals are delivered immediately
- much better host latencies without preemption

Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Martin Schwidefsky [Wed, 7 May 2008 07:22:52 +0000 (09:22 +0200)]

[S390] s390: Optimize user and work TIF check

On return from syscall or interrupt, we have to check if we return to
userspace (likely) and if there is work to do (less likely) to decide
if we handle the work. We can optimize this check: we first check for
the less likely work case and then check for userspace.
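
The reordering can be pictured in C, although the real change is in the s390 assembler entry code. `WORK_PENDING` and `should_handle_work()` are hypothetical names for this sketch.

```c
#define WORK_PENDING 0x1u	/* stand-in for the TIF work mask */

/* Work is handled only if work is pending AND we return to userspace.
 * Testing the rarely true condition first lets the common case leave
 * after a single test. */
int should_handle_work(unsigned int flags, int returning_to_user)
{
	if (!(flags & WORK_PENDING))	/* usually true: done after one test */
		return 0;
	return returning_to_user != 0;
}
```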

This patch is also a preparation for an additional patch, that fixes a bug
in KVM dealing with cpu bound guests.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit | commitdiff | tree

Jens Axboe [Wed, 7 May 2008 07:09:39 +0000 (09:09 +0200)]

block: sysfs store function needs to grab queue_lock and use queue_flag_*()

Concurrency isn't a big deal here since we have requests in flight
at this point, but do the locked variant to set a better example.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

commit | commitdiff | tree

Linus Torvalds [Wed, 7 May 2008 01:18:43 +0000 (18:18 -0700)]

Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix initrd regression.
  usb: Sparc build fix, make USB_ISP1760_OF depend on PPC_OF
  sparc64: remove online_page()
  sparc64: use compat_sys_utimes instead of home-grown local copy.
  sbus: Fix bpp driver build.
  sparc video: make blank use proper constant
  Revert "[SPARC64]: Wrap SMP IPIs with irq_enter()/irq_exit()."
  sparc: tcx.c remove unnecessary function

commit | commitdiff | tree

Linus Torvalds [Wed, 7 May 2008 00:09:27 +0000 (17:09 -0700)]

Revert "uml: fix gcc problem"

This reverts commit 22eecde2f9034764a3fd095eecfa3adfb8ec9a98. Uli
reports that it breaks UML on x86-64 with the Fedora 8 gcc (gcc 4.1.2),
causing a crash on startup. See

http://marc.info/?l=linux-kernel&m=121011722806093&w=2

for a trace.

Reported-by: Ulrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Roland Dreier [Tue, 6 May 2008 22:56:22 +0000 (15:56 -0700)]

RDMA/cxgb3: Fix severe limit on userspace memory registration size

Currently, iw_cxgb3 is severely limited on the amount of userspace
memory that can be registered in a single memory region, which
causes big problems for applications that expect to be able to
register 100s of MB.

The problem is that the driver uses a single kmalloc()ed buffer to
hold the physical buffer list (PBL) for the entire memory region
during registration, which means that 8 bytes of contiguous memory are
required for each page of memory being registered. For example, a 64
MB registration will require 128 KB of contiguous memory with 4 KB
pages, and it is unlikely that such an allocation will succeed on a busy
system.

This is purely a driver problem: the temporary page list buffer is not
needed by the hardware, so we can fix this by writing the PBL to the
hardware in page-sized chunks rather than all at once. We do this by
splitting the memory registration operation up into several steps:

- Allocate PBL space in adapter memory for the full registration
- Copy PBL to adapter memory in chunks
- Allocate STag and enable memory region
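
The sizes involved can be checked with a few lines of C. This is only the arithmetic from the description (8 bytes of PBL per 4 KB page, written in page-sized chunks); `pbl_bytes()` and `pbl_chunks()` are hypothetical helpers, not driver functions.

```c
#include <stdint.h>

#define PAGE_SZ		4096u	/* 4 KB pages, as in the example above */
#define PBL_ENTRY_SZ	8u	/* bytes of PBL per registered page */

/* Bytes of PBL needed to register `len` bytes of memory. */
uint64_t pbl_bytes(uint64_t len)
{
	uint64_t pages = (len + PAGE_SZ - 1) / PAGE_SZ;

	return pages * PBL_ENTRY_SZ;
}

/* Number of page-sized writes needed to copy that PBL to the adapter. */
uint64_t pbl_chunks(uint64_t len)
{
	return (pbl_bytes(len) + PAGE_SZ - 1) / PAGE_SZ;
}
```

For a 64 MB registration this gives a 128 KB PBL, which the chunked scheme would write as 32 page-sized pieces instead of one contiguous buffer.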

This also allows several other cleanups to the __cxio_tpt_op()
interface and related parts of the driver.

This change leaves the reregister memory region and memory window
operations broken, but they already didn't work due to other
longstanding bugs, so fixing them will be left to a later patch.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

commit | commitdiff | tree

David S. Miller [Tue, 6 May 2008 22:19:54 +0000 (15:19 -0700)]

sparc64: Fix initrd regression.

We die because we forget to convert initrd_start and
initrd_end to virtual addresses.

Reported by Mikael Pettersson

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Tue, 6 May 2008 22:15:12 +0000 (15:15 -0700)]

usb: Sparc build fix, make USB_ISP1760_OF depend on PPC_OF

Sparc doesn't have some of the OF interfaces this driver
wants to use.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Roland Dreier [Tue, 6 May 2008 22:03:38 +0000 (15:03 -0700)]

RDMA/cxgb3: Don't add PBL memory to gen_pool in chunks

Current iw_cxgb3 code adds PBL memory to the driver's gen_pool in 2 MB
chunks. This limits the largest single allocation that can be done to
the same size, which means that with 4 KB pages, each of which takes 8
bytes of PBL memory, the largest memory region that can be allocated
is 1 GB (256K PBL entries * 4 KB/entry).

Remove this limit by adding all the PBL memory in a single gen_pool
chunk, if possible. Add code that falls back to smaller chunks if
gen_pool_add() fails, which can happen if there is not sufficient
contiguous lowmem for the internal gen_pool bitmap.
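
The fallback strategy can be sketched as below, assuming a callback that behaves like gen_pool_add() in returning 0 on success. Everything here is hypothetical (`add_fn`, `pool_add_range()`, the `MIN_CHUNK` floor, and the `limited_add()` demo callback that rejects chunks above a size limit); it is not the iw_cxgb3 code.

```c
#include <stdint.h>

#define MIN_CHUNK 4096u	/* hypothetical floor for this sketch */

/* Stand-in for gen_pool_add(): returns 0 on success. */
typedef int (*add_fn)(uint64_t start, uint64_t size, void *ctx);

/* Add [start, start + total) to a pool, preferring one big chunk and
 * halving the chunk size whenever an add fails. Returns the number of
 * chunks added, or -1 if even MIN_CHUNK-sized adds fail. */
int pool_add_range(uint64_t start, uint64_t total, add_fn add, void *ctx)
{
	uint64_t end = start + total;
	uint64_t chunk = total;
	int added = 0;

	while (start < end) {
		if (chunk > end - start)
			chunk = end - start;
		if (add(start, chunk, ctx) != 0) {
			if (chunk <= MIN_CHUNK)
				return -1;
			chunk /= 2;	/* retry with a smaller chunk */
			continue;
		}
		start += chunk;
		added++;
	}
	return added;
}

/* Demo callback: fails for any chunk larger than *(uint64_t *)ctx. */
int limited_add(uint64_t start, uint64_t size, void *ctx)
{
	(void)start;
	return size > *(uint64_t *)ctx ? -1 : 0;
}
```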

Signed-off-by: Roland Dreier <rolandd@cisco.com>

commit | commitdiff | tree

OGAWA Hirofumi [Tue, 6 May 2008 19:02:53 +0000 (04:02 +0900)]

Fix bogus warning in sysdev_driver_register()

if ((drv->entry.next != drv->entry.prev) ||
(drv->entry.next != NULL)) {

warns even when list_empty(&drv->entry) is true.
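
To see why the quoted condition is bogus, here is a self-contained userspace sketch with a minimal list head. `init_list_head()`, `list_is_empty()` and `old_check_warns()` are stand-ins written for this illustration.

```c
#include <stddef.h>
#include <stdbool.h>

struct list_head { struct list_head *next, *prev; };

/* An initialized, empty list head points at itself. */
void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* The condition quoted above: true means "warn". */
bool old_check_warns(const struct list_head *e)
{
	return (e->next != e->prev) || (e->next != NULL);
}
```

For an initialized entry, next equals prev but is not NULL, so the second half of the condition fires even though the entry is on no list. Only an all-zero entry passes the old check.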

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Greg KH <gregkh@suse.de>
Cc: Len Brown <lenb@kernel.org>
[ Version 2 totally redone based on suggestions from Linus & Greg ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Tue, 6 May 2008 20:13:37 +0000 (13:13 -0700)]

VFS: fix unused variable warning

Commit 33dcdac2df54e66c447ae03f58c95c7251aa5649 ("kill ->put_inode")
removed the final use of i_op->put_inode, but left the now totally
unused "op" variable in iput().

Get rid of it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Hugh Dickins [Tue, 6 May 2008 19:49:23 +0000 (20:49 +0100)]

x86: fix PAE pmd_bad bootup warning

Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32.

That came from 9fc34113f6880b215cbea4e7017fc818700384c2 x86: debug pmd_bad();
but we understand now that the typecasting was wrong for PAE in the previous
version: pagetable pages above 4GB looked bad and stopped Arjan from booting.

And revert that cded932b75ab0a5f9181ee3da34a0a488d1a14fd x86: fix pmd_bad
and pud_bad to support huge pages.  It was the wrong way round: we shouldn't
weaken every pmd_bad and pud_bad check to let huge pages slip through - in
part they check that we _don't_ have a huge page where it's not expected.

Put the x86 pmd_bad() and pud_bad() definitions back to what they have long
been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking
junk in the upper word is good; and x86_64 should follow x86_32's stricter
comparison, to stop thinking any subset of required bits is good); but that
should be a later patch.

Fix Hans' good observation that follow_page() will never find pmd_huge()
because that would have already failed the pmd_bad test: test pmd_huge in
between the pmd_none and pmd_bad tests.  Tighten x86's pmd_huge() check?
No, once it's a hugepage entry, it can get quite far from a good pmd: for
example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits.

However... though follow_page() contains this and another test for huge
pages, so it's nice to keep it working on them, where does it actually get
called on a huge page?  get_user_pages() checks is_vm_hugetlb_page(vma) to
call alternative hugetlb processing, as do unmap_vmas() and others.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Chua <jeff.chua.linux@gmail.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Tue, 6 May 2008 18:39:57 +0000 (11:39 -0700)]

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] fix SMP ordering hole in fcntl_setlk()
  [PATCH] kill ->put_inode
  [PATCH] fix reservation discarding in affs

commit | commitdiff | tree

Al Viro [Tue, 6 May 2008 17:58:34 +0000 (13:58 -0400)]

[PATCH] fix SMP ordering hole in fcntl_setlk()

fcntl_setlk()/close() race prevention has a subtle hole - we need to
make sure that if we *do* have an fcntl/close race on SMP box, the
access to descriptor table and inode->i_flock won't get reordered.

As it is, we get STORE inode->i_flock, LOAD descriptor table entry vs.
STORE descriptor table entry, LOAD inode->i_flock with not a single
lock in common on both sides. We do have BKL around the first STORE,
but check in locks_remove_posix() is outside of BKL and for a good
reason - we don't want BKL on common path of close(2).

Solution is to hold ->file_lock around fcheck() in there; that orders
us wrt removal from descriptor table that preceded locks_remove_posix()
on close path and we either come first (in which case eviction will be
handled by the close side) or we'll see the effect of close and do
eviction ourselves. Note that even though it's read-only access,
we do need ->file_lock here - rcu_read_lock() won't be enough to
order the things.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Christoph Hellwig [Tue, 29 Apr 2008 15:46:26 +0000 (17:46 +0200)]

[PATCH] kill ->put_inode

And with that last patch to affs killing the last put_inode instance we
can finally, after many years of transition kill this racy and awkward
interface.

(It's kinda funny that even the description in
Documentation/filesystems/vfs.txt was entirely wrong..)

Also remove a very misleading comment above the definition of
struct super_operations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Roman Zippel [Tue, 29 Apr 2008 15:02:20 +0000 (17:02 +0200)]

[PATCH] fix reservation discarding in affs

- remove affs_put_inode, so preallocations aren't discarded unnecessarily
  often.
- remove affs_drop_inode, it's called with a spinlock held, so it can't
  use a mutex.
- make i_opencnt atomic
- avoid direct b_count manipulations
- a few allocation failure fixes, so that these are more gracefully
  handled now.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Linus Torvalds [Tue, 6 May 2008 16:17:03 +0000 (09:17 -0700)]

Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (27 commits)
  pata_atiixp: Don't disable
  sata_inic162x: update intro comment, up the version and drop EXPERIMENTAL
  sata_inic162x: add cardbus support
  sata_inic162x: kill now unused SFF related stuff
  sata_inic162x: use IDMA for ATAPI commands
  sata_inic162x: use IDMA for non DMA ATA commands
  sata_inic162x: kill now unused bmdma related stuff
  sata_inic162x: use IDMA for ATA_PROT_DMA
  sata_inic162x: update TF read handling
  sata_inic162x: add / update constants
  sata_inic162x: misc clean ups
  sata_mv use hweight16() for bit counting (V2)
  sata_mv NCQ-EH for FIS-based switching
  sata_mv delayed eh handling
  libata: export ata_eh_analyze_ncq_error
  sata_mv new mv_port_intr function
  sata_mv fix mv_host_intr bug for hc_irq_cause
  sata_mv NCQ and SError fixes for mv_err_intr
  sata_mv rearrange mv_config_fbs
  sata_mv errata workaround for sata25 part 1
  ...

commit | commitdiff | tree

Alan Cox [Fri, 2 May 2008 22:13:39 +0000 (15:13 -0700)]

pata_atiixp: Don't disable

A couple of distributions (Fedora, Ubuntu) were having weird problems with the
ATI IXP series PATA controllers being reported as simplex.  At the heart of
the problem is that both distros ignored the recommendations to load pata_acpi
and ata_generic *AFTER* specific host drivers.

The underlying cause however is that if you D3 and then D0 an ATI IXP it
helpfully throws away some configuration and won't let you rewrite it.

Add checks to ata_generic and pata_acpi to pin ATIIXP devices.  Possibly the
real answer here is to quirk them and pin them, but right now we can't do that
before they've been pcim_enable()'d by a driver.

I'm indebted to David Gero for this.  His bug report not only reported the
problem but identified the cause correctly and he had tested the right values
to prove what was going on.

[If you backport this for 2.6.24 you will need to pull in the 2.6.25
removal of the bogus WARN_ON() in pcim_enable_device()]

Signed-off-by: Alan Cox <alan@redhat.com>
Tested-by: David Gero <davidg@havidave.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:17 +0000 (16:35 +0900)]

sata_inic162x: update intro comment, up the version and drop EXPERIMENTAL

sata_inic162x is now ready for production use. Bump the version,
explain what's working and what's not and drop EXPERIMENTAL.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:16 +0000 (16:35 +0900)]

sata_inic162x: add cardbus support

When attached to cardbus, mmio region is at BAR 1. Other than that,
everything else is the same. Add support for it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:15 +0000 (16:35 +0900)]

sata_inic162x: kill now unused SFF related stuff

sata_inic162x now doesn't use any SFF features.  Remove all SFF
related stuff.

* Mask unsolicited ATA interrupts.  This removes our primary source of
  spurious interrupts and spurious interrupt handling can be tightened
  up.  There's no need to clear ATA interrupts by reading status
  register either.

* Don't dance with IDMA_CTL_ATA_NIEN and simplify accesses to
  IDMA_CTL.

* Inherit from sata_port_ops instead of ata_sff_port_ops.

* Don't initialize or use ioaddr.  There's no need to map BAR0-4
  anymore.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:14 +0000 (16:35 +0900)]

sata_inic162x: use IDMA for ATAPI commands

Use IDMA for ATAPI commands.  Write and some misc commands time out
when executed using ATAPI_PROT_DMA but ATAPI_PROT_PIO works fine.  As
PIO is driven by DMA too, it doesn't make any noticeable difference
for native SATA devices.  inic_check_atapi_dma() is implemented to
force PIO for those ATAPI commands.

After this change, sata_inic162x issues all commands using IDMA.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:13 +0000 (16:35 +0900)]

sata_inic162x: use IDMA for non DMA ATA commands

Use IDMA for PIO and non-data commands. This allows sata_inic162x to
safely drive LBA48 devices. Kill inic_dev_config() which contains
code to reject LBA48 devices.

With this change, status checking in inic_qc_issue() to avoid hard
lock up after hotplug can go away too.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:12 +0000 (16:35 +0900)]

sata_inic162x: kill now unused bmdma related stuff

sata_inic162x doesn't use BMDMA anymore. Kill bmdma related stuff.

* prdctl manipulation

* port IRQ mask manipulation

* inherit ATA_BASE_SHT instead of ATA_BMDMA_SHT

* BMDMA methods

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:11 +0000 (16:35 +0900)]

sata_inic162x: use IDMA for ATA_PROT_DMA

The modified driver on the initio site gives enough clues on how to use IDMA.
Use IDMA for ATA_PROT_DMA.

* LBA48 now works as long as it uses DMA (LBA48 devices still aren't
  allowed as it can destroy data if PIO is used for any reason).

* No need to mask IRQs for read DMAs as IDMA_DONE is properly raised
  after transfer to memory is actually completed.  There will be some
  spurious interrupts but host_intr will handle it correctly and
  manipulating port IRQ mask interacts badly with the other port for
  some reason, so command type dependent port IRQ masking is not used
  anymore.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Thu, 1 May 2008 14:55:58 +0000 (23:55 +0900)]

sata_inic162x: update TF read handling

inic162x can't reliably read back TF or at least we don't know how to
do it yet.  The only values which seem reliable are status and error.
This patch updates access to TF.

* implement inic_tf_read() which reads the TF area in mmio area

* implement custom inic_qc_fill_rtf() which only returns true if
  status indicates device error.  it'll be returning bogus addresses
  for device errors but it'll be able to report why it failed at
  least.

* implement custom inic_check_ready() and use ata_wait_after_reset()
  instead of the SFF version.

* use inic_tf_read() for classification.

This is not perfect but it fixes hotplug detection failure and at
least makes the driver report 0's instead of random garbage while
reporting valid status and error for device errors.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:09 +0000 (16:35 +0900)]

sata_inic162x: add / update constants

* add a bunch of constants, most are from the datasheet, a few
undocumented ones are from initio's modified driver

* HCTL_PWRDWN is bit 12 not 13

This is in preparation of further inic162x updates.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Tejun Heo [Wed, 30 Apr 2008 07:35:08 +0000 (16:35 +0900)]

sata_inic162x: misc clean ups

* use larger indents for structure member definitions

* kill unused variable @addr in inic_scr_write()

* kill unnecessary flushes in inic_freeze/thaw()

* kill buggy explicit kfree() on devres managed port private data

This is in preparation of further inic162x updates.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 18:02:28 +0000 (14:02 -0400)]

sata_mv use hweight16() for bit counting (V2)

Some tidying as suggested by Grant Grundler.

Nuke local bit-counting function from sata_mv in favour of using hweight16().
Also add a short explanation for the 15msec timeout used when waiting for empty/idle.
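
For readers unfamiliar with the term, hweight16() returns the Hamming weight (number of set bits) of a 16-bit value. A portable sketch of that computation follows; `popcount16()` is a hypothetical name and this is not the kernel's implementation.

```c
#include <stdint.h>

/* Count the set bits in a 16-bit word. */
unsigned int popcount16(uint16_t w)
{
	unsigned int n = 0;

	while (w) {
		w &= (uint16_t)(w - 1);	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```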

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:16:20 +0000 (02:16 -0400)]

sata_mv NCQ-EH for FIS-based switching

Convert sata_mv's EH for FIS-based switching (FBS) over to the
sequence recommended by Marvell. This enables us to catch/analyze
multiple failed links on a port-multiplier when using NCQ.

To do this, we clear the ERR_DEV bit in the EDMA Halt-Conditions register,
so that the EDMA engine doesn't self-disable on the first NCQ error.

Our EH code sets the MV_PP_FLAG_DELAYED_EH flag to prevent new commands
being queued while we await completion of all outstanding NCQ commands
on all links of the failed PM.

The SATA Test Control register tells us which links have failed,
so we must only wait for any other active links to finish up
before we stop the EDMA and run the .error_handler afterward.

The patch also includes skeleton code for handling of non-NCQ FBS operation.
This is more for documentation purposes right now, as that mode is not yet
enabled in sata_mv.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:15:37 +0000 (02:15 -0400)]

sata_mv delayed eh handling

Introduce a new "delayed error handling" mechanism in sata_mv,
to enable us to eventually deal with multiple simultaneous NCQ
failures on a single host link when a PM is present.

This involves a port flag (MV_PP_FLAG_DELAYED_EH) to prevent new
commands being queued, and a pmp bitmap to indicate which pmp links
had NCQ errors.

The new mv_pmp_error_handler() uses those values to invoke
ata_eh_analyze_ncq_error() on each failed link, prior to freezing
the port and passing control to sata_pmp_error_handler().

This is based upon a strategy suggested by Tejun.

For now, we just implement the delayed mechanism.
The next patch in this series will add the multiple-NCQ EH code
to take advantage of it.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:14:53 +0000 (02:14 -0400)]

libata: export ata_eh_analyze_ncq_error

Export ata_eh_analyze_ncq_error() for subsequent use by sata_mv,
as suggested by Tejun.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:14:02 +0000 (02:14 -0400)]

sata_mv new mv_port_intr function

Separate out the inner loop body of mv_host_intr()
into its own function called mv_port_intr().

This should help maintainability.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:13:27 +0000 (02:13 -0400)]

sata_mv fix mv_host_intr bug for hc_irq_cause

Remove the unwanted reads of hc_irq_cause from mv_host_intr(),
thereby removing a bug whereby we were not always reading it when needed.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:12:34 +0000 (02:12 -0400)]

sata_mv NCQ and SError fixes for mv_err_intr

Sigh. Undo some earlier changes to mv_port_intr(),
so that we now read/clear SError again in all cases.

Arrange the top of the function to be as close as possible
to what we need for a later update (in this series) for ERR_DEV handling.

Fix things so that libata-eh can attempt a READ_LOG_EXT_10H
in response to a failed NCQ command, by just doing a local
mv_eh_freeze() rather than ata_port_freeze().

This will now fully handle NCQ errors much of the time,
but more fixes are needed for FBS/PMP, and for certain chip errata.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:11:45 +0000 (02:11 -0400)]

sata_mv rearrange mv_config_fbs

Rearrange mv_config_fbs() to more closely follow the (corrected) datasheet
recommendations for NCQ and FIS-based switching (FBS).

Also, maintain a port flag to let us know when FBS is enabled.
We will make more use of that flag later in this patch series.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:10:56 +0000 (02:10 -0400)]

sata_mv errata workaround for sata25 part 1

Part 1 of workaround for errata "sata#25" for the 60x1 series
(the second half of this errata workaround is still in development).

Bit22 of the GPIO port has to be set "on" when in NCQ mode.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:10:02 +0000 (02:10 -0400)]

sata_mv new mv_qc_defer method

The EDMA engine cannot tolerate a mix of NCQ/non-NCQ commands,
and cannot be used for PIO at all. So we need to prevent libata
from trying to feed us such mixtures.

Introduce mv_qc_defer() for this purpose, and use it for all chip versions.
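
A simplified sketch of such a deferral policy, under the assumption that a new command may be issued only when the port is idle or when all in-flight commands are of the same EDMA-capable kind. The types and the `qc_defer()` function below are hypothetical and much simpler than the real mv_qc_defer().

```c
enum cmd_kind { KIND_NCQ, KIND_DMA, KIND_PIO };
enum { QC_OK = 0, QC_DEFER = 1 };

struct port {
	enum cmd_kind active_kind;	/* kind of the in-flight commands */
	int active_count;		/* number of in-flight commands */
};

/* Decide whether a new command may be issued now or must wait. */
int qc_defer(const struct port *p, enum cmd_kind kind)
{
	if (p->active_count == 0)
		return QC_OK;		/* idle port: anything goes */
	if (kind == KIND_PIO || p->active_kind == KIND_PIO)
		return QC_DEFER;	/* PIO never shares the port */
	/* Never mix NCQ with non-NCQ commands. */
	return kind == p->active_kind ? QC_OK : QC_DEFER;
}
```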

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

commit | commitdiff | tree

Mark Lord [Fri, 2 May 2008 06:09:14 +0000 (02:09 -0400)]

sata_mv wait for empty+idle

When performing EH, it is recommended to wait for the EDMA engine
to empty out requests-in-progress before disabling EDMA.

Introduce code to poll the EDMA_STATUS register for idle/empty bits
before disabling EDMA. For non-EH operation, this will normally exit
without delay, other than the register read.
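
The polling described above might look roughly like this. The bit positions, the `read_status_fn` callback type, `wait_empty_idle()` and the `demo_status()` fake register are all assumptions made for the sketch; a real driver would read the EDMA status register and delay between polls.

```c
#include <stdint.h>

#define ST_EMPTY (1u << 0)	/* hypothetical bit positions */
#define ST_IDLE  (1u << 1)

typedef uint32_t (*read_status_fn)(void *ctx);

/* Poll until both bits are set or max_polls reads have been made.
 * Returns 0 when empty+idle was seen, -1 on timeout. */
int wait_empty_idle(read_status_fn rd, void *ctx, int max_polls)
{
	const uint32_t want = ST_EMPTY | ST_IDLE;
	int i;

	for (i = 0; i < max_polls; i++) {
		if ((rd(ctx) & want) == want)
			return 0;
		/* a real driver would delay briefly here */
	}
	return -1;
}

/* Demo register: reports idle only after *ctx further reads. */
uint32_t demo_status(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return ST_EMPTY;
	}
	return ST_EMPTY | ST_IDLE;
}
```

When the engine is already idle, the first read succeeds and the loop exits immediately, which matches the "without delay, other than the register read" case.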

A later series of patches may focus on eliminating this and various
other register reads (when possible) throughout the driver,
but for now we're focussing on solid reliability.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
