Mike Snitzer [Fri, 6 Sep 2013 15:36:01 +0000 (11:36 -0400)]
dm stripe: silence a couple sparse warnings
Eliminate the following sparse warnings:
drivers/md/dm-stripe.c:443:12: warning: symbol 'dm_stripe_init' was not declared. Should it be static?
drivers/md/dm-stripe.c:456:6: warning: symbol 'dm_stripe_exit' was not declared. Should it be static?
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mikulas Patocka [Fri, 16 Aug 2013 14:54:23 +0000 (10:54 -0400)]
dm: add statistics support
Support the collection of I/O statistics on user-defined regions of
a DM device. If no regions are defined no statistics are collected so
there isn't any performance impact. Only bio-based DM devices are
currently supported.
Each user-defined region specifies a starting sector, length and step.
Individual statistics will be collected for each step-sized area within
the range specified.
The I/O statistics counters for each step-sized area of a region are
in the same format as /sys/block/*/stat or /proc/diskstats but extra
counters (12 and 13) are provided: total time spent reading and
writing in milliseconds. All these counters may be accessed by sending
the @stats_print message to the appropriate DM device via dmsetup.
The creation of DM statistics will allocate memory via kmalloc or
fallback to using vmalloc space. At most, 1/4 of the overall system
memory may be allocated by DM statistics. The admin can see how much
memory is used by reading
/sys/module/dm_mod/parameters/stats_current_allocated_bytes
See Documentation/device-mapper/statistics.txt for more details.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Thu, 22 Aug 2013 13:56:18 +0000 (09:56 -0400)]
dm thin: always return -ENOSPC if no_free_space is set
If pool has 'no_free_space' set it means a previous allocation already
determined the pool has no free space (and failed that allocation with
-ENOSPC). By always returning -ENOSPC if 'no_free_space' is set, we do
not allow the pool to oscillate between allocating blocks and then not.
But a side-effect of this determinism is that if a user wants to be able
to allocate new blocks they'll need to reload the pool's table (to clear
the 'no_free_space' flag). This reload will happen automatically if the
pool's data volume is resized. But if the user takes action to free a
lot of space by deleting snapshot volumes, etc the pool will no longer
allow data allocations to continue without an intervening table reload.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Wed, 28 Aug 2013 00:03:00 +0000 (20:03 -0400)]
dm ioctl: cleanup error handling in table_load
Make use of common cleanup code.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Tue, 27 Aug 2013 22:57:03 +0000 (18:57 -0400)]
dm ioctl: increase granularity of type_lock when loading table
Hold the mapped device's type_lock before calling populate_table() since
it is where the table's type is determined based on the specified
targets. There is no need to allow concurrent table loads to race to
establish the table's targets or type.
This eliminates the need to grab the lock in dm_table_set_type().
Also verify that the type_lock is held in both dm_set_md_type() and
dm_get_md_type().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Alasdair Kergon [Thu, 29 Aug 2013 15:37:45 +0000 (16:37 +0100)]
dm ioctl: prevent rename to empty name or uuid
A device-mapper device must always have a name consisting of a non-empty
string. If the device also has a uuid, this similarly must not be an
empty string.
The DM_DEV_CREATE ioctl enforces these rules when the device is created,
but this patch is needed to enforce them when DM_DEV_RENAME is used to
change the name or uuid.
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Mike Snitzer [Wed, 21 Aug 2013 21:40:11 +0000 (17:40 -0400)]
dm thin: set pool read-only if breaking_sharing fails block allocation
break_sharing() now handles an arbitrary alloc_data_block() error
the same way as provision_block(): marks pool read-only and errors the
cell.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Wed, 21 Aug 2013 21:30:40 +0000 (17:30 -0400)]
dm thin: prefix pool error messages with pool device name
Useful to know which pool is experiencing the error.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Thu, 22 Aug 2013 22:21:38 +0000 (18:21 -0400)]
dm: allow error target to replace bio-based and request-based targets
It may be useful to switch a request-based table to the "error" target.
Enhance the DM core to allow a hybrid target_type which is capable of
handling either bios (via .map) or requests (via .map_rq).
Add a request-based map function (.map_rq) to the "error" target_type;
making it DM's first hybrid target. Train dm_table_set_type() to prefer
the mapped device's established type (request-based or bio-based). If
the mapped device doesn't have an established type default to making the
table with the hybrid target(s) bio-based.
Tested 'dmsetup wipe_table' to work on both bio-based and request-based
devices.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Tue, 20 Aug 2013 19:05:17 +0000 (15:05 -0400)]
math64: New separate div64_u64_rem helper
Commit
f792685006274a850e6cc0ea9ade275ccdfc90bc ("math64: New
div64_u64_rem helper") implemented div64_u64 in terms of div64_u64_rem.
But div64_u64_rem was removed because it slowed down div64_u64 (and
there were no other users of div64_u64_rem).
Device Mapper's I/O statistics support has a need for div64_u64_rem;
reintroduce this helper as a separate method that doesn't slow down
div64_u64, especially on 32-bit systems.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Joe Thornber [Fri, 9 Aug 2013 12:04:56 +0000 (13:04 +0100)]
dm space map: optimise sm_ll_dec and sm_ll_inc
Prior to this patch these methods did a lookup followed by an insert.
Instead they now call a common mutate function that adjusts the value
according to a callback function. This avoids traversing the data
structures twice and hence improves performance.
Also factor out sm_ll_lookup_big_ref_count() for use by both
sm_ll_lookup() and sm_ll_mutate().
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Joe Thornber [Fri, 9 Aug 2013 11:59:30 +0000 (12:59 +0100)]
dm btree: prefetch child nodes when walking tree for a dm_btree_del
dm-btree now takes advantage of dm-bufio's ability to prefetch data via
dm_bm_prefetch(). Prior to this change many btree node visits were
causing a synchronous read.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Joe Thornber [Fri, 9 Aug 2013 11:48:42 +0000 (12:48 +0100)]
dm btree: use pop_frame in dm_btree_del to cleanup code
Remove a visited leaf straight away from the stack, rather than
marking all it's children as visited and letting it get removed on the
next iteration. May also offer a micro optimisation in dm_btree_del.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Mike Snitzer [Fri, 16 Aug 2013 14:54:21 +0000 (10:54 -0400)]
dm cache: eliminate holes in cache structure
Reorder members in the cache structure to eliminate 6 out of 7 holes
(reclaiming 24 bytes). Also, the 'worker' and 'waker' members no longer
straddle cachelines.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Mike Snitzer [Tue, 20 Aug 2013 19:02:41 +0000 (15:02 -0400)]
dm cache: fix stacking of geometry limits
Do not blindly override the queue limits (specifically io_min and
io_opt). Allow traditional stacking of these limits if io_opt is a
factor of the cache's data block size.
Without this patch mkfs.xfs does not recognize the cache device's
provided limits as a useful geometry (e.g. raid) so these hints are
ignored. This was due to setting io_min to a useless value.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Mike Snitzer [Tue, 20 Aug 2013 19:02:41 +0000 (15:02 -0400)]
dm thin: fix stacking of geometry limits
Do not blindly override the queue limits (specifically io_min and
io_opt). Allow traditional stacking of these limits if io_opt is a
factor of the thin-pool's data block size.
Without this patch mkfs.xfs does not recognize the thin device's
provided limits as a useful geometry (e.g. raid) so these hints are
ignored. This was due to setting io_min to a useless value.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Carlos Maiolino [Tue, 23 Jul 2013 14:03:16 +0000 (11:03 -0300)]
dm thin: add data block size limits to Documentation
Inform users that the data block size can't be any arbitrary number,
i.e. its value must be between 64KB and 1GB. Also, it should be a
multiple of 64KB.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Mike Snitzer [Fri, 16 Aug 2013 14:54:19 +0000 (10:54 -0400)]
dm cache: add data block size limits to code and Documentation
Place upper bound on the cache's data block size (1GB).
Inform users that the data block size can't be any arbitrary number,
i.e. its value must be between 32KB and 1GB. Also, it should be a
multiple of 32KB.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Mike Snitzer [Fri, 16 Aug 2013 14:54:20 +0000 (10:54 -0400)]
dm cache: document metadata device is exclussive to a cache
A cache's metadata device may not be shared by multiple cache devices.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Tejun Heo [Tue, 30 Jul 2013 12:40:21 +0000 (08:40 -0400)]
dm: stop using WQ_NON_REENTRANT
dbf2576e37 ("workqueue: make all workqueues non-reentrant") made
WQ_NON_REENTRANT no-op and the flag is going away. Remove its usages.
This patch doesn't introduce any behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Geert Uytterhoeven [Fri, 26 Jul 2013 07:57:31 +0000 (09:57 +0200)]
dm cache: avoid conflicting remove_mapping() in mq policy
On sparc32, which includes <linux/swap.h> from <asm/pgtable_32.h>:
drivers/md/dm-cache-policy-mq.c:962:13: error: conflicting types for 'remove_mapping'
include/linux/swap.h:285:12: note: previous declaration of 'remove_mapping' was here
As mq_remove_mapping() already exists, and the local remove_mapping() is
used only once, inline it manually to avoid the conflict.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Linus Torvalds [Mon, 12 Aug 2013 01:04:20 +0000 (18:04 -0700)]
Linux 3.11-rc5
Linus Torvalds [Sun, 11 Aug 2013 23:32:26 +0000 (16:32 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is three bug fixes: An fnic warning caused by sleeping under a
lock, a major regression with our updated WRITE SAME/UNMAP logic which
caused tons of USB devices (and one RAID card) to cease to function
and a megaraid_sas firmware initialisation problem which causes kdump
failures"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] Don't attempt to send extended INQUIRY command if skip_vpd_pages is set
[SCSI] fnic: BUG: sleeping function called from invalid context during probe
[SCSI] megaraid_sas: megaraid_sas driver init fails in kdump kernel
Linus Torvalds [Sun, 11 Aug 2013 19:12:39 +0000 (12:12 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"This includes small series from Michael Neuling to fix a couple of
nasty remaining problems with the new Power8 support, also targeted at
stable 3.10, without which some new userspace accessible registers
aren't properly context switched, and in some case, can be clobbered
by the user of transactional memory.
Along with that, a few slightly more minor things, such as a missing
Kconfig option to enable handling of denorm exceptions when not
running under a hypervisor (or userspace will randomly crash when
hitting denorms with the vector unit), some nasty bugs in the new
pstore oops code, and other simple bug fixes worth having in now.
Note: I picked up the two powerpc KVM fixes as Alex Graf asked me to
handle KVM bits while he is on vacation. However I'll let him decide
whether they should go to -stable or not when he is back"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
powerpc: Save the TAR register earlier
powerpc: Fix context switch DSCR on POWER8
powerpc: Rework setting up H/FSCR bit definitions
powerpc: Fix hypervisor facility unavaliable vector number
powerpc/kvm/book3s_pr: Return appropriate error when allocation fails
powerpc/kvm: Add signed type cast for comparation
powerpc/eeh: Add missing procfs entry for PowerNV
powerpc/pseries: Add backward compatibilty to read old kernel oops-log
powerpc/pseries: Fix buffer overflow when reading from pstore
powerpc: On POWERNV enable PPC_DENORMALISATION by default
Linus Torvalds [Sun, 11 Aug 2013 19:11:33 +0000 (12:11 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull s390 kvm fixes from Paolo Bonzini:
"Two fixes for s390"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: fix pfmf non-quiescing control handling
KVM: s390: move kvm_guest_enter,exit closer to sie
Linus Torvalds [Sun, 11 Aug 2013 19:10:47 +0000 (12:10 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Some driver bugfixes for the I2C subsystem"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mv64xxx: Document the newly introduced allwinner compatible
i2c: Fix Kontron PLD prescaler calculation
i2c: i2c-mxs: Use DMA mode even for small transfers
Linus Torvalds [Sat, 10 Aug 2013 22:21:47 +0000 (15:21 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"These are assorted fixes, mostly from Josef nailing down xfstests
runs. Zach also has a long standing fix for problems with readdir
wrapping f_pos (or ctx->pos)
These patches were spread out over different bases, so I rebased
things on top of rc4 and retested overnight"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: don't loop on large offsets in readdir
Btrfs: check to see if root_list is empty before adding it to dead roots
Btrfs: release both paths before logging dir/changed extents
Btrfs: allow splitting of hole em's when dropping extent cache
Btrfs: make sure the backref walker catches all refs to our extent
Btrfs: fix backref walking when we hit a compressed extent
Btrfs: do not offset physical if we're compressed
Btrfs: fix extent buffer leak after backref walking
Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents
btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified
Linus Torvalds [Sat, 10 Aug 2013 22:20:37 +0000 (15:20 -0700)]
Merge tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Stable patch for lockd to fix Oopses due to inappropriate calls to
utsname()->nodename
- Stable patches for sunrpc to fix Oopses on shutdown when using
AF_LOCAL sockets with rpcbind
- Fix memory leak and error checking issues in nfs4_proc_lookup_mountpoint
- Fix a regression with the sync mount option failing to work for nfs4
mounts
- Fix a writeback performance issue when doing cache invalidation
- Remove an incorrect call to nfs_setsecurity in nfs_fhget
* tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix up nfs4_proc_lookup_mountpoint
NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget()
NFSv4: Fix the sync mount option for nfs4 mounts
NFS: Fix writeback performance issue on cache invalidation
SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister
SUNRPC: Don't auto-disconnect from the local rpcbind socket
LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs
Linus Torvalds [Sat, 10 Aug 2013 22:19:58 +0000 (15:19 -0700)]
Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Some fixes for a 4.1 feature that in retrospect probably should have
waited for 3.12.... But it appears to be working now"
* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID
nfsd4: Fix MACH_CRED NULL dereference
Linus Torvalds [Sat, 10 Aug 2013 20:00:56 +0000 (13:00 -0700)]
Merge tag 'sound-3.11' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A couple of USB-audio fixes that should also go to stable kernels"
* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: do not trust too-big wMaxPacketSize values
ALSA: 6fire: fix DMA issues with URB transfer_buffer usage
Linus Torvalds [Sat, 10 Aug 2013 16:00:51 +0000 (09:00 -0700)]
Merge tag 'staging-3.11-rc5' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are 3 small fixes for staging/IIO drivers for 3.11-rc5. Nothing
huge, two IIO driver fixes, and a zcache fix. All of these have been
in linux-next for a while"
* tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: zcache: fix "zcache=" kernel parameter
iio: ti_am335x_adc: Fix wrong samples received on 1st read
iio:trigger: Fix use_count race condition
Linus Torvalds [Sat, 10 Aug 2013 16:00:21 +0000 (09:00 -0700)]
Merge tag 'usb-3.11-rc5' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are 3 small USB fixes for 3.11-rc5.
One is a fix that the ChromeOS developers ran into on some Intel
hardware, one is a build fix, and the last is a MAINTAINERS update to
help people figure out where to send USB network driver patches.
All of these have been in linux-next for a while"
* tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
MAINTAINERS: Add separate section for USB NETWORKING DRIVERS
usb: xhci: add missing dma-mapping.h includes
usb: core: don't try to reset_device() a port that got just disconnected
Zach Brown [Thu, 11 Jul 2013 23:19:42 +0000 (16:19 -0700)]
btrfs: don't loop on large offsets in readdir
When btrfs readdir() hits the last entry it sets the readdir offset to a
huge value to stop buggy apps from breaking when the same name is
returned by readdir() with concurrent rename()s.
But unconditionally setting the offset to INT_MAX causes readdir() to
loop returning any entries with offsets past INT_MAX. It only takes a
few hours of constant file creation and removal to create entries past
INT_MAX.
So let's set the huge offset to LLONG_MAX if the last entry has already
overflowed 32bit loff_t. Without large offsets behaviour is identical.
With large offsets 64bit apps will work and 32bit apps will be no more
broken than they currently are if they see large offsets.
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Thu, 25 Jul 2013 19:11:47 +0000 (15:11 -0400)]
Btrfs: check to see if root_list is empty before adding it to dead roots
A user reported a panic when running with autodefrag and deleting snapshots.
This is because we could end up trying to add the root to the dead roots list
twice. To fix this check to see if we are empty before adding ourselves to the
dead roots list. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Mon, 22 Jul 2013 16:54:30 +0000 (12:54 -0400)]
Btrfs: release both paths before logging dir/changed extents
The ceph guys tripped over this bug where we were still holding onto the
original path that we used to copy the inode with when logging. This is based
on Chris's fix which was reported to fix the problem. We need to drop the paths
in two cases anyway so just move the drop up so that we don't have duplicate
code. Thanks,
Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Thu, 11 Jul 2013 14:34:59 +0000 (10:34 -0400)]
Btrfs: allow splitting of hole em's when dropping extent cache
I noticed while running multi-threaded fsync tests that sometimes fsck would
complain about an improper gap. This happens because we fail to add a hole
extent to the file, which was happening when we'd split a hole EM because
btrfs_drop_extent_cache was just discarding the whole em instead of splitting
it. So this patch fixes this by allowing us to split a hole em properly, which
means that added holes actually get logged properly and we no longer see this
fsck error. Thankfully we're tolerant of these sort of problems so a user would
not see any adverse effects of this bug, other than fsck complaining. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 5 Jul 2013 18:03:47 +0000 (14:03 -0400)]
Btrfs: make sure the backref walker catches all refs to our extent
Because we don't mess with the offset into the extent for compressed we will
properly find both extents for this case
[extent a][extent b][rest of extent a]
but because we already added a ref for the front half we won't add the inode
information for the second half. This causes us to leak that memory and not
print out the other offset when we do logical-resolve. So fix this by calling
ulist_add_merge and then add our eie to the existing entry if there is one.
With this patch we get both offsets out of logical-resolve. With this and the
other 2 patches I've sent we now pass btrfs/276 on my vm with compress-force=lzo
set. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 5 Jul 2013 17:58:19 +0000 (13:58 -0400)]
Btrfs: fix backref walking when we hit a compressed extent
If you do btrfs inspect-internal logical-resolve on a compressed extent that has
been partly overwritten it won't find anything. This is because we try and
match the extent offset we've searched for based on the extent offset in the
data extent entry. However this doesn't work for compressed extents because the
offsets are for the uncompressed size, not the compressed size. So instead only
do this check if we are not compressed, that way we can get an actual entry for
the physical offset rather than nothing for compressed. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 5 Jul 2013 17:52:51 +0000 (13:52 -0400)]
Btrfs: do not offset physical if we're compressed
xfstest btrfs/276 was freaking out on slower boxes partly because fiemap was
offsetting the physical based on the extent offset. This is perfectly fine with
uncompressed extents, however the extent offset is into the uncompressed area,
not the compressed. So we can return a physical value that isn't at all within
the area we have allocated on disk. Fix this by returning the start of the
extent if it is compressed no matter what the offset. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Liu Bo [Wed, 3 Jul 2013 06:40:44 +0000 (14:40 +0800)]
Btrfs: fix extent buffer leak after backref walking
commit
47fb091fb787420cd195e66f162737401cce023f(Btrfs: fix unlock after free on rewinded tree blocks)
takes an extra increment on the reference of allocated dummy extent buffer, so now we
cannot free this dummy one, and end up with extent buffer leak.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Liu Bo [Mon, 1 Jul 2013 14:13:26 +0000 (22:13 +0800)]
Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents
For partial extents, snapshot-aware defrag does not work as expected,
since
a) we use the wrong logical offset to search for parents, which should be
disk_bytenr + extent_offset, not just disk_bytenr,
b) 'offset' returned by the backref walking just refers to key.offset, not
the 'offset' stored in btrfs_extent_data_ref which is
(key.offset - extent_offset).
The reproducer:
$ mkfs.btrfs sda
$ mount sda /mnt
$ btrfs sub create /mnt/sub
$ for i in `seq 5 -1 1`; do dd if=/dev/zero of=/mnt/sub/foo bs=5k count=1 seek=$i conv=notrunc oflag=sync; done
$ btrfs sub snap /mnt/sub /mnt/snap1
$ btrfs sub snap /mnt/sub /mnt/snap2
$ sync; btrfs filesystem defrag /mnt/sub/foo;
$ umount /mnt
$ btrfs-debug-tree sda (Here we can check whether the defrag operation is snapshot-awared.
This addresses the above two problems.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jie Liu [Fri, 28 Jun 2013 05:15:52 +0000 (13:15 +0800)]
btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified
Create a small file and fallocate it to a big size with
FALLOC_FL_KEEP_SIZE option, then truncate it back to the
small size again, the disk free space is not changed back
in this case. i.e,
total 4
-rw-r--r-- 1 root root 512 Jun 28 11:35 test
Filesystem Size Used Avail Use% Mounted on
....
/dev/sdb1 8.0G 56K 7.2G 1% /mnt
-rw-r--r-- 1 root root 512 Jun 28 11:35 /mnt/test
Filesystem Size Used Avail Use% Mounted on
....
/dev/sdb1 8.0G 5.1G 2.2G 70% /mnt
Filesystem Size Used Avail Use% Mounted on
....
/dev/sdb1 8.0G 5.1G 2.2G 70% /mnt
With this fix, the truncated up space is back as:
Filesystem Size Used Avail Use% Mounted on
....
/dev/sdb1 8.0G 56K 7.2G 1% /mnt
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Linus Torvalds [Fri, 9 Aug 2013 22:07:19 +0000 (15:07 -0700)]
Merge tag 'pm+acpi-3.11-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- ACPI-based memory hotplug stopped working after a recent change,
because it's not possible to associate sufficiently many "physical"
devices with one ACPI device object due to an artificial limit. Fix
from Rafael J Wysocki removes that limit and makes memory hotplug
work again.
- A change made in 3.9 uncovered a bug in the ACPI processor driver
preventing NUMA nodes from being put offline due to an ordering
issue. Fix from Yasuaki Ishimatsu changes the ordering to make
things work again.
- One of the recent ACPI video commits (that hasn't been reverted so
far) uncovered a bug in the code handling quirky BIOSes that caused
some Asus machines to boot with backlight completely off which made
it quite difficult to use them afterward. Fix from Felipe Contreras
improves the quirk to cover this particular case correctly.
- A cpufreq user space interface change made in 3.10 inadvertently
renamed the ignore_nice_load sysfs attribute to ignore_nice which
resulted in some confusion. Fix from Viresh Kumar changes the name
back to ignore_nice_load.
- An initialization ordering change made in 3.9 broke cpufreq on
loongson2 boards. Fix from Aaro Koskinen restores the correct
initialization ordering there.
- Fix breakage resulting from a mistake made in 3.9 and causing the
detection of some graphics adapters (that were detected correctly
before) to fail. There are two objects representing the same PCIe
port in the affected systems' ACPI tables and both appear as
"enabled" and we are expected to guess which one to use. We used to
choose the right one before by pure luck, but when we tried to
address another similar corner case, the luck went away. This time
we try to make our guessing a bit more educated which is reported to
work on those systems.
- The /proc/acpi/wakeup interface code is missing some locking which
may lead to breakage if that file is written or read during hotplug
of wakeup devices. That should be rare but still possible, so it's
better to start using the appropriate locking there.
* tag 'pm+acpi-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: Try harder to resolve _ADR collisions for bridges
cpufreq: rename ignore_nice as ignore_nice_load
cpufreq: loongson2: fix regression related to clock management
ACPI / processor: move try_offline_node() after acpi_unmap_lsapic()
ACPI: Drop physical_node_id_bitmap from struct acpi_device
ACPI / PM: Walk physical_node_list under physical_node_lock
ACPI / video: improve quirk check in acpi_video_bqc_quirk()
Linus Torvalds [Fri, 9 Aug 2013 22:06:17 +0000 (15:06 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix bug in adt7470 driver which causes it to fail writing fan speed
limits"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (adt7470) Fix incorrect return code check
Linus Torvalds [Fri, 9 Aug 2013 22:04:09 +0000 (15:04 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Some driver fixes (em28xx, coda, usbtv, s5p, hdpvr and ml86v7667) and
a fix for media DocBook"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] em28xx: fix assignment of the eeprom data
[media] hdpvr: fix iteration over uninitialized lists in hdpvr_probe()
[media] usbtv: fix dependency
[media] usbtv: Throw corrupted frames away
[media] usbtv: Fix deinterlacing
[media] v4l2: added missing mutex.h include to v4l2-ctrls.h
[media] DocBook: upgrade media_api DocBook version to 4.2
[media] ml86v7667: fix compile warning: 'ret' set but not used
[media] s5p-g2d: Fix registration failure
[media] media: coda: Fix DT driver data pointer for i.MX27
[media] s5p-mfc: Fix input/output format reporting
Linus Torvalds [Fri, 9 Aug 2013 18:53:06 +0000 (11:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID fix from Jiri Kosina:
"Revert of a patch which breaks enumeration workaround in
hid-logitech-dj"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
Revert "HID: hid-logitech-dj: querying_devices was never set"
Linus Torvalds [Fri, 9 Aug 2013 18:52:34 +0000 (11:52 -0700)]
Merge tag 'fbdev-fixes-3.11-rc5' of git://git./linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
- omapdss: compilation fix and DVI fix for PandaBoard
- mxsfb: fix colors when using 18bit LCD bus
* tag 'fbdev-fixes-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
ARM: OMAP: dss-common: fix Panda's DVI DDC channel
video: mxsfb: fix color settings for 18bit data bus and 32bpp
OMAPDSS: analog-tv-connector: compile fix
Linus Torvalds [Fri, 9 Aug 2013 18:51:29 +0000 (11:51 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Mostly radeon, more fixes for dynamic power management which is is off
by default for this release anyways, but there are a large number of
testers, so I'd like to keep merging the fixes.
Otherwise, radeon UVD fixes affecting suspend/resume regressions, i915
regression fixes, one for your mac mini, ast, mgag200, cirrus ttm fix
and one regression fix in the core"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (25 commits)
drm: Don't pass negative delta to ktime_sub_ns()
drm/radeon: make missing smc ucode non-fatal
drm/radeon/dpm: require rlc for dpm
drm/radeon/cik: use a mutex to properly lock srbm instanced registers
drm/radeon: remove unnecessary unpin
drm/radeon: add more UVD CS checking
drm/radeon: stop sending invalid UVD destroy msg
drm/radeon: only save UVD bo when we have open handles
drm/radeon: always program the MC on startup
drm/radeon: fix audio dto calculation on DCE3+ (v3)
drm/radeon/dpm: disable sclk ss on rv6xx
drm/radeon: fix halting UVD
drm/radeon/dpm: adjust power state properly for UVD on SI
drm/radeon/dpm: fix spread spectrum setup (v2)
drm/radeon/dpm: adjust thermal protection requirements
drm/radeon: select audio dto based on encoder id for DCE3
drm/radeon: properly handle pm on gpu reset
drm/i915: do not disable backlight on vgaswitcheroo switch off
drm/i915: Don't call encoder's get_config unless encoder is active
drm/i915: avoid brightness overflow when doing scale
...
Oleg Nesterov [Fri, 9 Aug 2013 15:19:13 +0000 (17:19 +0200)]
dlm: kill the unnecessary and wrong device_close()->recalc_sigpending()
device_close()->recalc_sigpending() is not needed, sigprocmask() takes
care of TIF_SIGPENDING correctly.
And without ->siglock it is racy and wrong, it can wrongly clear
TIF_SIGPENDING and miss a signal.
But even with this patch device_close() is still buggy:
1. sigprocmask() should not be used, we have set_task_blocked(),
but this is minor.
2. We should never block SIGKILL or SIGSTOP, and this is what
the code tries to do.
3. This can't protect against SIGKILL or SIGSTOP anyway. Another
thread can do signal_wake_up(), say, do_signal_stop() or
complete_signal() or debugger.
4. sigprocmask(SIG_BLOCK, allsigs) doesn't necessarily clears
TIF_SIGPENDING, say, freezing() or ->jobctl.
5. device_write() looks equally wrong by the same reason.
Looks like, this tries to protect some wait_event_interruptible() logic
from signals, it should be turned into uninterruptible wait. Or we need
to implement something like signals_stop/start for such a use-case.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Kosina [Fri, 9 Aug 2013 09:34:19 +0000 (11:34 +0200)]
Revert "HID: hid-logitech-dj: querying_devices was never set"
This reverts commit
407a2c2a4d85100c8c67953e4bac2f4a6c942335.
Explanation provided by Benjamin Tissoires:
Commit "HID: hid-logitech-dj, querying_devices was never set" activate
a flag which guarantees that we do not ask the receiver for too many
enumeration. When the flag is set, each following enumeration call is
discarded (the usb request is not forwarded to the receiver). The flag
is then released when the driver receive a pairing information event,
which normally follows the enumeration request.
However, the USB3 bug makes the driver think the enumeration request
has been forwarded to the receiver. However, it is actually not the
case because the USB stack returns -EPIPE. So, when a new unknown
device appears, the workaround consisting in asking for a new
enumeration is not working anymore: this new enumeration is discarded
because of the flag, which is never reset.
A solution could be to trigger a timeout before releasing it, but for
now, let's just revert the patch.
Reported-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Tested-by: Sune Mølgaard <sune@molgaard.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Michael Neuling [Fri, 9 Aug 2013 07:29:31 +0000 (17:29 +1000)]
powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
If a transaction is rolled back, the Target Address Register (TAR), Processor
Priority Register (PPR) and Data Stream Control Register (DSCR) should be
restored to the checkpointed values before the transaction began. Any changes
to these SPRs inside the transaction should not be visible in the abort
handler.
Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR. If
we preempt a processes inside a transaction which has modified any of these, on
process restore, that same transaction may be aborted we but we won't see the
checkpointed versions of these SPRs.
This adds checkpointed versions of these SPRs to the thread_struct and adds the
save/restore of these three SPRs to the treclaim/trechkpt code.
Without this if any of these SPRs are modified during a transaction, users may
incorrectly see a speculated SPR value even if the transaction is aborted.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Fri, 9 Aug 2013 07:29:30 +0000 (17:29 +1000)]
powerpc: Save the TAR register earlier
This moves us to save the Target Address Register (TAR) a earlier in
__switch_to. It introduces a new function save_tar() to do this.
We need to save the TAR earlier as we will overwrite it in the transactional
memory reclaim/recheckpoint path. We are going to do this in a subsequent
patch which will fix saving the TAR register when it's modified inside a
transaction.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Fri, 9 Aug 2013 07:29:29 +0000 (17:29 +1000)]
powerpc: Fix context switch DSCR on POWER8
POWER8 allows the DSCR to be accessed directly from userspace via a new SPR
number 0x3 (Rather than 0x11. DSCR SPR number 0x11 is still used on POWER8 but
like POWER7, is only accessible in HV and OS modes). Currently, we allow this
by setting H/FSCR DSCR bit on boot.
Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
that it knows to no longer restore the system wide version of DSCR on context
switch (ie. to set thread.dscr_inherit).
This clears the H/FSCR DSCR bit initially. If a process then accesses the DSCR
(via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in
facility_unavailable_exception().
We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
the thread.dscr_inherit.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Fri, 9 Aug 2013 07:29:28 +0000 (17:29 +1000)]
powerpc: Rework setting up H/FSCR bit definitions
This reworks the Facility Status and Control Regsiter (FSCR) config bit
definitions so that we can access the bit numbers. This is needed for a
subsequent patch to fix the userspace DSCR handling.
HFSCR and FSCR bit definitions are the same, so reuse them.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Fri, 9 Aug 2013 07:29:27 +0000 (17:29 +1000)]
powerpc: Fix hypervisor facility unavaliable vector number
Currently if we take hypervisor facility unavaliable (from 0xf80/0x4f80) we
mark it as an OS facility unavaliable (0xf60) as the two share the same code
path.
The becomes a problem in facility_unavailable_exception() as we aren't able to
see the hypervisor facility unavailable exceptions.
Below fixes this by duplication the required macros.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Thadeu Lima de Souza Cascardo [Wed, 17 Jul 2013 15:10:29 +0000 (12:10 -0300)]
powerpc/kvm/book3s_pr: Return appropriate error when allocation fails
err was overwritten by a previous function call, and checked to be 0. If
the following page allocation fails, 0 is going to be returned instead
of -ENOMEM.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Chen Gang [Mon, 22 Jul 2013 06:32:35 +0000 (14:32 +0800)]
powerpc/kvm: Add signed type cast for comparation
'rmls' is 'unsigned long', lpcr_rmls() will return negative number when
failure occurs, so it need a type cast for comparing.
'lpid' is 'unsigned long', kvmppc_alloc_lpid() return negative number
when failure occurs, so it need a type cast for comparing.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mike Qiu [Wed, 7 Aug 2013 02:25:14 +0000 (22:25 -0400)]
powerpc/eeh: Add missing procfs entry for PowerNV
The procfs entry for global statistics has been missed on PowerNV
platform and the patch is going to add that.
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aruna Balakrishnaiah [Thu, 8 Aug 2013 17:04:00 +0000 (22:34 +0530)]
powerpc/pseries: Add backward compatibilty to read old kernel oops-log
Older kernels has just length information in their header. Handle it
while reading old kernel oops log from pstore.
Applies on top of powerpc/pseries: Fix buffer overflow when reading from pstore
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aruna Balakrishnaiah [Thu, 8 Aug 2013 17:03:49 +0000 (22:33 +0530)]
powerpc/pseries: Fix buffer overflow when reading from pstore
When reading from pstore there is a buffer overflow during decompression
due to the header added in unzip_oops. Remove unzip_oops and call
pstore_decompress directly in nvram_pstore_read. Allocate buffer of size
report_length of the oops header as header will not be deallocated in pstore.
Since we have 'openssl' command line tool to decompress the compressed data,
dump the compressed data in case decompression fails instead of not dumping
anything.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Wed, 31 Jul 2013 06:31:26 +0000 (16:31 +1000)]
powerpc: On POWERNV enable PPC_DENORMALISATION by default
We want PPC_DENORMALISATION enabled when POWERNV is enabled,
so update the Kconfig.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Dave Airlie [Thu, 8 Aug 2013 23:09:37 +0000 (09:09 +1000)]
Merge tag 'drm-intel-fixes-2013-08-08' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Daniel writes:
A few bugfixes for serious stuff and regressions. Highlight is the
reinstated hack to keep the i915 backlight on when running on an optimus
machine, this prevents black screens especially with some radeon muxed
platforms. And the patch to quiet dmesg on Linus' old mac mini ;-)
* tag 'drm-intel-fixes-2013-08-08' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: do not disable backlight on vgaswitcheroo switch off
drm/i915: Don't call encoder's get_config unless encoder is active
drm/i915: avoid brightness overflow when doing scale
drm/i915: update last_vblank when disabling the power well
drm/i915: fix gen4 digital port hotplug definitions
Linus Torvalds [Thu, 8 Aug 2013 20:11:53 +0000 (13:11 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 build fix from Hans-Christian Egtvedt.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: boards/atngw100/mrmt.c: fix building error
Oleg Nesterov [Thu, 8 Aug 2013 16:55:32 +0000 (18:55 +0200)]
userns: limit the maximum depth of user_namespace->parent chain
Ensure that user_namespace->parent chain can't grow too much.
Currently we use the hardroded 32 as limit.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Curt Brune [Thu, 8 Aug 2013 19:11:03 +0000 (12:11 -0700)]
hwmon: (adt7470) Fix incorrect return code check
In adt7470_write_word_data(), which writes two bytes using
i2c_smbus_write_byte_data(), the return codes are incorrectly AND-ed
together when they should be OR-ed together.
The return code of i2c_smbus_write_byte_data() is zero for success.
The upshot is only the first byte was ever written to the hardware.
The 2nd byte was never written out.
I noticed that trying to set the fan speed limits was not working
correctly on my system. Setting the fan speed limits is the only
code that uses adt7470_write_word_data(). After making the change
the limit settings work and the alarms work also.
Signed-off-by: Curt Brune <curt@cumulusnetworks.com>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Thu, 8 Aug 2013 16:38:19 +0000 (09:38 -0700)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o.
Misc ext4 fixes, delayed by Ted moving mail servers and email getting
marked as spam due to bad spf records.
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add WARN_ON to check the length of allocated blocks
ext4: fix retry handling in ext4_ext_truncate()
ext4: destroy ext4_es_cachep on module unload
ext4: make sure group number is bumped after a inode allocation race
Linus Torvalds [Thu, 8 Aug 2013 16:36:38 +0000 (09:36 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security
Pull security layer fix from James Morris:
"Smack casting fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: IPv6 casting error fix for 3.11
Linus Torvalds [Thu, 8 Aug 2013 16:34:40 +0000 (09:34 -0700)]
Merge tag 'regulator-v3.11-rc4' of git://git./linux/kernel/git/broonie/regulator
Pull regulator DT binding fixes from Mark Brown:
"A couple of fixes to bring the DT binding documentation for Palmas
into sync with the code"
* tag 'regulator-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: palmas-pmic: doc: remove ti,tstep
regulator: palmas-pmic: doc: fix typo for sleep-mode
Linus Torvalds [Thu, 8 Aug 2013 16:34:04 +0000 (09:34 -0700)]
Merge tag 'regmap-v3.11-rc4' of git://git./linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two things here, one is a fix for a nasty issue where we were failing
to sync the last register in a block when using raw writes and the
other fixes a missing header for the !REGMAP stubs so that we don't
rely on implicit includes in that case"
* tag 'regmap-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Add missing header for !CONFIG_REGMAP stubs
regmap: cache: Make sure to sync the last register in a block
Linus Torvalds [Thu, 8 Aug 2013 16:33:27 +0000 (09:33 -0700)]
Merge tag 'spi-v3.11-rc4' of git://git./linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
"Just one update for SPI, a simple fix to the davinci driver to correct
the direction for which DMA is mapped following the dmaengine
conversion"
* tag 'spi-v3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-davinci: Fix direction in dma_map_single()
Linus Torvalds [Thu, 8 Aug 2013 16:32:20 +0000 (09:32 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull virtio fixes from Rusty Russell:
"More virtio console fixes than I'm happy with, but all real issues,
and all CC:stable.."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
virtio-scsi: Fix virtqueue affinity setup
virtio: console: return -ENODEV on all read operations after unplug
virtio: console: fix raising SIGIO after port unplug
virtio: console: clean up port data immediately at time of unplug
virtio: console: fix race in port_fops_open() and port unplug
virtio: console: fix race with port unplug and open/close
virtio/console: Add pipe_lock/unlock for splice_write
virtio/console: Quit from splice_write if pipe->nrbufs is 0
Linus Torvalds [Thu, 8 Aug 2013 16:28:08 +0000 (09:28 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Kevin Hilman:
- MSM: GPIO fixes (includes old code removal)
- OMAP: earlyprintk regression, AM33xx cpgmac PM regression
- OMAP5: urgent fix for potentially harmful voltage regulator values
- Renesas: gpio-keys fix, fix SD card detection, fix shdma calculation
error
- STi: critical SMP boot fix
- tegra: DTS fix for usb-phy
- a couple MAINTAINERS updates
(Arnd is on paternity leave, Kevin is stepping up to help arm-soc
maintenance)
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: add TI Keystone ARM platform
MAINTAINERS: delete Srinidhi from ux500
ARM: tegra: enable ULPI phy on Colibri T20
ARM: STi: remove sti_secondary_start from INIT section.
ARM: STi: Fix cpu nodes with correct device_type.
ARM: shmobile: lager: do not annotate gpio_buttons as __initdata
ARM: shmobile: BOCK-W: fix SDHI0 PFC settings
shdma: fixup sh_dmae_get_partial() calculation error
ARM: OMAP2+: hwmod: AM335x: fix cpgmac address space
ARM: OMAP2+: hwmod: rt address space index for DT
ARM: OMAP2+: Sync hwmod state with the pm_runtime and omap_device state
ARM: OMAP2+: Avoid idling memory controllers with no drivers
ARM: OMAP2+: hwmod: Fix a crash in _setup_reset() with DEBUG_LL
ARM: dts: omap5-uevm: update optional/unused regulator configurations
ARM: dts: omap5-uevm: fix regulator configurations mandatory for SoC
ARM: dts: omap5-uevm: document regulator signals used on the actual board
ARM: msm: Consolidate gpiomux for older architectures
ARM: shmobile: armadillo800eva: Don't request GPIO 166 in board code
ARM: msm: dts: Fix the gpio register address for msm8960
Linus Torvalds [Thu, 8 Aug 2013 16:06:37 +0000 (09:06 -0700)]
Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0"
This reverts commit
318df36e57c0ca9f2146660d41ff28e8650af423.
This commit caused Steven Rostedt's hackbench runs to run out of memory
due to a leak. As noted by Joonsoo Kim, it is buggy in the following
scenario:
"I guess, you may set 0 to all kmem caches's cpu_partial via sysfs,
doesn't it?
In this case, memory leak is possible in following case. Code flow of
possible leak is follwing case.
* in __slab_free()
1. (!new.inuse || !prior) && !was_frozen
2. !kmem_cache_debug && !prior
3. new.frozen = 1
4. after cmpxchg_double_slab, run the (!n) case with new.frozen=1
5. with this patch, put_cpu_partial() doesn't do anything,
because this cache's cpu_partial is 0
6. return
In step 5, leak occur"
And Steven does indeed have cpu_partial set to 0 due to RT testing.
Joonsoo is cooking up a patch, but everybody agrees that reverting this
for now is the right thing to do.
Reported-and-bisected-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cong Ding [Sat, 27 Jul 2013 23:07:51 +0000 (19:07 -0400)]
avr32: boards/atngw100/mrmt.c: fix building error
there is an additional "{", which causes building error.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Clemens Ladisch [Thu, 8 Aug 2013 09:24:55 +0000 (11:24 +0200)]
ALSA: usb-audio: do not trust too-big wMaxPacketSize values
The driver used to assume that the streaming endpoint's wMaxPacketSize
value would be an indication of how much data the endpoint expects or
sends, and compute the number of packets per URB using this value.
However, the Focusrite Scarlett 2i4 declares a value of 1024 bytes,
while only about 88 or 44 bytes are be actually used. This discrepancy
would result in URBs with far too few packets, which would not work
correctly on the EHCI driver.
To get correct URBs, use wMaxPacketSize only as an upper limit on the
packet size.
Reported-by: James Stone <jamesmstone@gmail.com>
Tested-by: James Stone <jamesmstone@gmail.com>
Cc: <stable@vger.kernel.org> # 2.6.35+
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Trond Myklebust [Thu, 8 Aug 2013 00:38:07 +0000 (20:38 -0400)]
NFSv4: Fix up nfs4_proc_lookup_mountpoint
Currently, we do not check the return value of client = rpc_clone_client(),
nor do we shut down the resulting cloned rpc_clnt in the case where a
NFS4ERR_WRONGSEC has caused nfs4_proc_lookup_common() to replace the
original value of 'client' (causing a memory leak).
Fix both issues and simplify the code by moving the call to
rpc_clone_client() until after nfs4_proc_lookup_common() has
done its business.
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Michel Dänzer [Wed, 12 Jun 2013 09:58:44 +0000 (11:58 +0200)]
drm: Don't pass negative delta to ktime_sub_ns()
It takes an unsigned value. This happens not to blow up on 64-bit
architectures, but it does on 32-bit, causing
drm_calc_vbltimestamp_from_scanoutpos() to calculate totally bogus
timestamps for vblank events. Which in turn causes e.g. gnome-shell to
hang after a DPMS off cycle with current xf86-video-ati Git.
[airlied: regression introduced in drm: use monotonic time in drm_calc_vbltimestamp_from_scanoutpos]
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59339
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59836
Tested-by: shui yangwei <yangweix.shui@intel.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 7 Aug 2013 23:47:02 +0000 (09:47 +1000)]
Merge branch 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux
Some more radeon fixes. Mostly dpm and uvd fixes. Fixes hangs
with dpm on more rv6xx asics, and fixes suspend and resume with UVD.
* 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: make missing smc ucode non-fatal
drm/radeon/dpm: require rlc for dpm
drm/radeon/cik: use a mutex to properly lock srbm instanced registers
drm/radeon: remove unnecessary unpin
drm/radeon: add more UVD CS checking
drm/radeon: stop sending invalid UVD destroy msg
drm/radeon: only save UVD bo when we have open handles
drm/radeon: always program the MC on startup
drm/radeon: fix audio dto calculation on DCE3+ (v3)
drm/radeon/dpm: disable sclk ss on rv6xx
drm/radeon: fix halting UVD
drm/radeon/dpm: adjust power state properly for UVD on SI
drm/radeon/dpm: fix spread spectrum setup (v2)
drm/radeon/dpm: adjust thermal protection requirements
drm/radeon: select audio dto based on encoder id for DCE3
drm/radeon: properly handle pm on gpu reset
Alex Deucher [Wed, 7 Aug 2013 20:09:08 +0000 (16:09 -0400)]
drm/radeon: make missing smc ucode non-fatal
The smc ucode is required for dpm (dynamic power
management), but if it's missing just skip dpm setup
and don't disable acceleration.
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=67876
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 6 Aug 2013 17:34:00 +0000 (13:34 -0400)]
drm/radeon/dpm: require rlc for dpm
The rlc is required for dpm to work properly, so if
the rlc ucode is missing, don't enable dpm. Enabling
dpm without the rlc enabled can result in hangs.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 6 Aug 2013 16:40:16 +0000 (12:40 -0400)]
drm/radeon/cik: use a mutex to properly lock srbm instanced registers
We need proper locking in the driver when accessing instanced
registers on CIK.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 5 Aug 2013 12:10:58 +0000 (14:10 +0200)]
drm/radeon: remove unnecessary unpin
We don't pin the BO on allocation, so don't unpin it on free.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 5 Aug 2013 12:10:57 +0000 (14:10 +0200)]
drm/radeon: add more UVD CS checking
Improve error handling in case userspace sends us
an invalid command buffer.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 5 Aug 2013 12:10:56 +0000 (14:10 +0200)]
drm/radeon: stop sending invalid UVD destroy msg
We also need to check the handle.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 5 Aug 2013 12:10:55 +0000 (14:10 +0200)]
drm/radeon: only save UVD bo when we have open handles
Otherwise just reinitialize from scratch on resume,
and so make it more likely to succeed.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Sun, 4 Aug 2013 16:13:17 +0000 (12:13 -0400)]
drm/radeon: always program the MC on startup
For r6xx+ asics. This mirrors the behavior of pre-r6xx
asics. We need to program the MC even if something
else in startup() fails. Failure to do so results in
an unusable GPU.
Based on a fix from: Mark Kettenis <kettenis@openbsd.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Tue, 30 Jul 2013 21:31:07 +0000 (17:31 -0400)]
drm/radeon: fix audio dto calculation on DCE3+ (v3)
Need to set the wallclock ratio and adjust the phase
and module registers appropriately. May fix problems
with audio timing at certain display timings.
v2: properly handle clocks below 24mhz
v3: rebase r600 changes
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 1 Aug 2013 18:35:02 +0000 (14:35 -0400)]
drm/radeon/dpm: disable sclk ss on rv6xx
Enabling spread spectrum on the engine clock
leads to hangs on some asics.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66963
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 1 Aug 2013 15:34:07 +0000 (17:34 +0200)]
drm/radeon: fix halting UVD
Removing the clock/power or resetting the VCPU can cause
hangs if that happens in the middle of a register write.
Stall the memory and register bus before putting the VCPU
into reset. Keep it in reset when unloading the module or
suspending.
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 1 Aug 2013 15:54:07 +0000 (11:54 -0400)]
drm/radeon/dpm: adjust power state properly for UVD on SI
There are some hardware issue with reclocking on SI when
UVD is active, so use a stable power state when UVD is
active. Fixes possible hangs and performance issues when
using UVD on SI.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 31 Jul 2013 22:32:33 +0000 (18:32 -0400)]
drm/radeon/dpm: fix spread spectrum setup (v2)
Need to check for engine and memory clock ss separately
and only enable dynamic ss if either of them are found.
This should fix systems which have a ss table, but do
not have entries for engine or memory. On those systems
we may enable dynamic spread spectrum without enabling
it on the engine or memory clocks which can lead to a
hang in some cases.
fixes some systems reported here:
https://bugs.freedesktop.org/show_bug.cgi?id=66963
v2: fix typo
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 31 Jul 2013 16:41:35 +0000 (12:41 -0400)]
drm/radeon/dpm: adjust thermal protection requirements
On rv770 and newer, clock gating is not required
for thermal protection. The only requirement is that
the design utilizes a thermal sensor.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 29 Jul 2013 22:56:13 +0000 (18:56 -0400)]
drm/radeon: select audio dto based on encoder id for DCE3
There are two audio dtos on radeon asics that you can
select between. Normally, dto0 is used for hdmi and
dto1 for DP, but it seems that the dto is somehow
tied to the encoders on DCE3 asics.
fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=67435
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Wed, 31 Jul 2013 13:16:42 +0000 (09:16 -0400)]
drm/radeon: properly handle pm on gpu reset
When we reset the GPU, we need to properly tear
down power management before reseting the GPU and then
set it back up again after reset. Add the missing
radeon_pm_[suspend|resume] calls to the gpu reset
function.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Trond Myklebust [Wed, 7 Aug 2013 16:17:19 +0000 (12:17 -0400)]
NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget()
We only need to call it on the creation of the inode.
Reported-by: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Steve Dickson <SteveD@redhat.com>
Cc: Dave Quigley <dpquigl@davequigley.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Scott Mayhew [Wed, 31 Jul 2013 14:01:41 +0000 (10:01 -0400)]
NFSv4: Fix the sync mount option for nfs4 mounts
The sync mount option stopped working for NFSv4 mounts after commit
c02d7adf8c5429727a98bad1d039bccad4c61c50 (NFSv4: Replace nfs4_path_walk() with
FS path lookup in a private namespace). If MS_SYNCHRONOUS is set in the
super_block that we're cloning from, then it should be set in the new
super_block as well.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 5 Aug 2013 17:26:31 +0000 (13:26 -0400)]
NFS: Fix writeback performance issue on cache invalidation
If a cache invalidation is triggered, and we happen to have a lot of
writebacks cached at the time, then the call to invalidate_inode_pages2()
will end up calling ->launder_page() on each and every dirty page in order
to sync its contents to disk, thus defeating write coalescing.
The following patch ensures that we try to sync the inode to disk before
calling invalidate_inode_pages2() so that we do the writeback as efficiently
as possible.
Reported-by: William Dauchy <william@gandi.net>
Reported-by: Pascal Bouchareine <pascal@gandi.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: William Dauchy <william@gandi.net>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Trond Myklebust [Mon, 5 Aug 2013 20:04:47 +0000 (16:04 -0400)]
SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister
If rpcbind causes our connection to the AF_LOCAL socket to close after
we've registered a service, then we want to be careful about reconnecting
since the mount namespace may have changed.
By simply refusing to reconnect the AF_LOCAL socket in the case of
unregister, we avoid the need to somehow save the mount namespace. While
this may lead to some services not unregistering properly, it should
be safe.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.9.x
Rafael J. Wysocki [Wed, 7 Aug 2013 20:55:38 +0000 (22:55 +0200)]
Merge branch 'pm-fixes'
* pm-fixes:
cpufreq: rename ignore_nice as ignore_nice_load
cpufreq: loongson2: fix regression related to clock management
Rafael J. Wysocki [Wed, 7 Aug 2013 20:55:27 +0000 (22:55 +0200)]
Merge branch 'acpi-fixes'
* acpi-fixes:
ACPI: Try harder to resolve _ADR collisions for bridges
ACPI / processor: move try_offline_node() after acpi_unmap_lsapic()
ACPI: Drop physical_node_id_bitmap from struct acpi_device
ACPI / PM: Walk physical_node_list under physical_node_lock
ACPI / video: improve quirk check in acpi_video_bqc_quirk()