platform/kernel/linux-rpi.git
7 years agoxfs: scrub directory parent pointers
Darrick J. Wong [Wed, 18 Oct 2017 04:37:46 +0000 (21:37 -0700)]
xfs: scrub directory parent pointers

Scrub parent pointers, sort of.  For directories, we can ride the
'..' entry up to the parent to confirm that there's at most one
dentry that points back to this directory.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub symbolic links
Darrick J. Wong [Wed, 18 Oct 2017 04:37:45 +0000 (21:37 -0700)]
xfs: scrub symbolic links

Create the infrastructure to scrub symbolic link data.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub extended attributes
Darrick J. Wong [Wed, 18 Oct 2017 04:37:45 +0000 (21:37 -0700)]
xfs: scrub extended attributes

Scrub the hash tree, keys, and values in an extended attribute structure.
Refactor the attribute code to use the transaction if the caller supplied
one to avoid buffer deadocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub directory freespace
Darrick J. Wong [Wed, 18 Oct 2017 04:37:44 +0000 (21:37 -0700)]
xfs: scrub directory freespace

Check the free space information in a directory.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub directory metadata
Darrick J. Wong [Wed, 18 Oct 2017 04:37:44 +0000 (21:37 -0700)]
xfs: scrub directory metadata

Scrub the hash tree and all the entries in a directory.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub directory/attribute btrees
Darrick J. Wong [Wed, 18 Oct 2017 04:37:43 +0000 (21:37 -0700)]
xfs: scrub directory/attribute btrees

Provide a way to check the shape and scrub the hashes and records
in a directory or extended attribute btree.  These are helper functions
for the directory & attribute scrubbers in subsequent patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[fengguang: remove unneeded variable to store return value]
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub inode block mappings
Darrick J. Wong [Wed, 18 Oct 2017 04:37:43 +0000 (21:37 -0700)]
xfs: scrub inode block mappings

Scrub an individual inode's block mappings to make sure they make sense.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub inodes
Darrick J. Wong [Wed, 18 Oct 2017 04:37:42 +0000 (21:37 -0700)]
xfs: scrub inodes

Scrub the fields within an inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub refcount btrees
Darrick J. Wong [Wed, 18 Oct 2017 04:37:41 +0000 (21:37 -0700)]
xfs: scrub refcount btrees

Plumb in the pieces necessary to check the refcount btree.  If rmap is
available, check the reference count by performing an interval query
against the rmapbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub rmap btrees
Darrick J. Wong [Wed, 18 Oct 2017 04:37:41 +0000 (21:37 -0700)]
xfs: scrub rmap btrees

Check the reverse mapping records to make sure that the contents
make sense.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub inode btrees
Darrick J. Wong [Wed, 18 Oct 2017 04:37:40 +0000 (21:37 -0700)]
xfs: scrub inode btrees

Check the records of the inode btrees to make sure that the values
make sense given the inode records themselves.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub free space btrees
Darrick J. Wong [Wed, 18 Oct 2017 04:37:40 +0000 (21:37 -0700)]
xfs: scrub free space btrees

Check the extent records free space btrees to ensure that the values
look sane.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub the AGI
Darrick J. Wong [Wed, 18 Oct 2017 04:37:39 +0000 (21:37 -0700)]
xfs: scrub the AGI

Add a forgotten check to the AGI verifier, then wire up the scrub
infrastructure to check the AGI contents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub AGF and AGFL
Darrick J. Wong [Wed, 18 Oct 2017 04:37:39 +0000 (21:37 -0700)]
xfs: scrub AGF and AGFL

Check the block references in the AGF and AGFL headers to make sure
they make sense.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub the secondary superblocks
Darrick J. Wong [Wed, 18 Oct 2017 04:37:38 +0000 (21:37 -0700)]
xfs: scrub the secondary superblocks

Ensure that the geometry presented in the backup superblocks matches
the primary superblock so that repair can recover the filesystem if
that primary gets corrupted.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: create helpers to scan an allocation group
Darrick J. Wong [Wed, 18 Oct 2017 04:37:38 +0000 (21:37 -0700)]
xfs: create helpers to scan an allocation group

Add some helpers to enable us to lock an AG's headers, create btree
cursors for all btrees in that allocation group, and clean up
afterwards.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub btree keys and records
Darrick J. Wong [Wed, 18 Oct 2017 04:37:37 +0000 (21:37 -0700)]
xfs: scrub btree keys and records

Add to the btree scrubber the ability to check that the keys and
records are in the right order and actually call out to our record
iterator to do actual checking of the records.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: scrub the shape of a metadata btree
Darrick J. Wong [Wed, 18 Oct 2017 04:37:37 +0000 (21:37 -0700)]
xfs: scrub the shape of a metadata btree

Create a function that can check the shape of a btree -- each block
passes basic inspection and all the pointers look ok.  In the next patch
we'll add the ability to check the actual keys and records stored within
the btree.  Add some helper functions so that we report detailed scrub
errors in a uniform manner in dmesg.  These are helper functions for
subsequent patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: create helpers to scrub a metadata btree
Darrick J. Wong [Wed, 18 Oct 2017 04:37:37 +0000 (21:37 -0700)]
xfs: create helpers to scrub a metadata btree

Create helper functions and tracepoints to deal with errors while
scrubbing a metadata btree.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: create helpers to record and deal with scrub problems
Darrick J. Wong [Wed, 18 Oct 2017 04:37:36 +0000 (21:37 -0700)]
xfs: create helpers to record and deal with scrub problems

Create helper functions to record crc and corruption problems, and
deal with any other runtime errors that arise.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: probe the scrub ioctl
Darrick J. Wong [Wed, 18 Oct 2017 04:37:36 +0000 (21:37 -0700)]
xfs: probe the scrub ioctl

Create a probe scrubber with id 0.  This will be used by xfs_scrub to
probe the kernel's abilities to scrub (and repair) the metadata.  We do
this by validating the ioctl inputs from userspace, preparing the
filesystem for a scrub (or a repair) operation, and immediately
returning to userspace.  Userspace can use the returned errno and
structure state to decide (in broad terms) if scrub/repair are
supported by the running kernel.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: dispatch metadata scrub subcommands
Darrick J. Wong [Wed, 18 Oct 2017 04:37:35 +0000 (21:37 -0700)]
xfs: dispatch metadata scrub subcommands

Create structures needed to hold scrubbing context and dispatch incoming
commands to the individual scrubbers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: create an ioctl to scrub AG metadata
Darrick J. Wong [Wed, 18 Oct 2017 04:37:34 +0000 (21:37 -0700)]
xfs: create an ioctl to scrub AG metadata

Create an ioctl that can be used to scrub internal filesystem metadata.
The new ioctl takes the metadata type, an (optional) AG number, an
(optional) inode number and generation, and a flags argument.  This will
be used by the upcoming XFS online scrub tool.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: create inode pointer verifiers
Darrick J. Wong [Wed, 18 Oct 2017 04:37:34 +0000 (21:37 -0700)]
xfs: create inode pointer verifiers

Create some helper functions to check that inode pointers point to
somewhere within the filesystem and not at the static AG metadata.
Move xfs_internal_inum and create a directory inode check function.
We will use these functions in scrub and elsewhere.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: refactor btree block header checking functions
Darrick J. Wong [Wed, 18 Oct 2017 04:37:33 +0000 (21:37 -0700)]
xfs: refactor btree block header checking functions

Refactor the btree block header checks to have an internal function that
returns the address of the failing check without logging errors.  The
scrubber will call the internal function, while the external version
will maintain the current logging behavior.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: refactor btree pointer checks
Darrick J. Wong [Wed, 18 Oct 2017 04:37:33 +0000 (21:37 -0700)]
xfs: refactor btree pointer checks

Refactor the btree pointer checks so that we can call them from the
scrub code without logging errors to dmesg.  Preserve the existing error
reporting for regular operations.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: create block pointer check functions
Darrick J. Wong [Wed, 18 Oct 2017 04:37:32 +0000 (21:37 -0700)]
xfs: create block pointer check functions

Create some helper functions to check that a block pointer points
within the filesystem (or AG) and doesn't point at static metadata.
We will use this for scrub.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: return a distinct error code value for IGET_INCORE cache misses
Darrick J. Wong [Wed, 18 Oct 2017 04:37:32 +0000 (21:37 -0700)]
xfs: return a distinct error code value for IGET_INCORE cache misses

For an XFS_IGET_INCORE iget operation, if the inode isn't in the cache,
return ENODATA so that we don't confuse it with the pre-existing ENOENT
cases (inode is in cache, but freed).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
7 years agoxfs: buffer lru reference count error injection tag
Brian Foster [Tue, 17 Oct 2017 21:16:29 +0000 (14:16 -0700)]
xfs: buffer lru reference count error injection tag

XFS uses a fixed reference count for certain types of buffers in the
internal LRU cache. These reference counts dictate how aggressively
certain buffers are reclaimed vs. others. While the reference counts
implements priority across different buffer types, all buffers
(other than uncached buffers) are typically cached for at least one
reclaim cycle.

We've had at least one bug recently that has been hidden by a
released buffer sitting around in the LRU. Users hitting the problem
were able to reproduce under enough memory pressure to cause
aggressive reclaim in a particular window of time.

To support future xfstests cases, add an error injection tag to
hardcode the buffer reference count to zero. When enabled, this
bypasses caching of associated buffers and facilitates test cases
that depend on this behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: fail if xattr inactivation hits a hole
Brian Foster [Tue, 17 Oct 2017 21:16:28 +0000 (14:16 -0700)]
xfs: fail if xattr inactivation hits a hole

The child buffer read in xfs_attr3_node_inactive() should never
reach a hole in the attr fork. If this occurs, it is likely due to a
bug. Prior to commit cd87d867 ("xfs: don't crash on unexpected holes
in dir/attr btrees"), this would result in a crash. Now that the
crash has been fixed, this is a silent failure.

Pass -1 to xfs_da3_node_read() from xfs_da3_node_inactive() to
indicate that reading from a hole is an error. This logs an error to
syslog and fails the inode inactivation, leaving the inode on the AG
unlinked list until removed by xfs_repair (or log recovery). Also
update the subsequent code to reflect that the read now returns a
non-NULL buffer or an error.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: check kthread_should_stop() after the setting of task state
Hou Tao [Tue, 17 Oct 2017 21:16:28 +0000 (14:16 -0700)]
xfs: check kthread_should_stop() after the setting of task state

A umount hang is possible when a race occurs between the umount
process and the xfsaild kthread. The following sequences outline
the race:

    xfsaild: kthread_should_stop()
     => return false, so xfsaild continue

    umount: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
    => by kthread_stop()
    umount: wake_up_process()
    => because xfsaild is still running, so 0 is returned

    xfsaild: __set_current_state(TASK_INTERRUPTIBLE)
    xfsaild: schedule()
    => now, xfsaild will wait indefinitely

    umount: wait_for_completion()
    => and umount will hang

To fix that, we need to check kthread_should_stop() after we set
the task state, so the xfsaild will either see the stop bit and
exit or the task state is reset to runnable by wake_up_process()
such that it isn't scheduled out indefinitely and detects the stop
bit at the next iteration.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: remove xfs_bmbt_get_state
Christoph Hellwig [Tue, 17 Oct 2017 21:16:28 +0000 (14:16 -0700)]
xfs: remove xfs_bmbt_get_state

Unused after the big bmap refactor.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: remove all xfs_bmbt_set_* helpers except for xfs_bmbt_set_all
Christoph Hellwig [Tue, 17 Oct 2017 21:16:27 +0000 (14:16 -0700)]
xfs: remove all xfs_bmbt_set_* helpers except for xfs_bmbt_set_all

Unused after the big bmap refactor.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: replace xfs_bmbt_lookup_ge with xfs_bmbt_lookup_first
Christoph Hellwig [Tue, 17 Oct 2017 21:16:27 +0000 (14:16 -0700)]
xfs: replace xfs_bmbt_lookup_ge with xfs_bmbt_lookup_first

We only use xfs_bmbt_lookup_ge to look up the first bmap record in an
inode, so replace xfs_bmbt_lookup_ge with a special purpose helper that
is a bit more descriptive.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: pass a struct xfs_bmbt_irec to xfs_bmbt_lookup_eq
Christoph Hellwig [Tue, 17 Oct 2017 21:16:26 +0000 (14:16 -0700)]
xfs: pass a struct xfs_bmbt_irec to xfs_bmbt_lookup_eq

Now that we've massaged the callers into the right form we can always
pass the actual extent record instead of the individual fields.

As an additional benefit the btree cursor will now be prepoulated with
the correct extent state instead of having to fix it up later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: pass a struct xfs_bmbt_irec to xfs_bmbt_update
Christoph Hellwig [Tue, 17 Oct 2017 21:16:26 +0000 (14:16 -0700)]
xfs: pass a struct xfs_bmbt_irec to xfs_bmbt_update

Now that we've massaged the callers into the right form we can always
pass the actual extent record instead of the individual fields.

With that xfs_bmbt_disk_set_allf can go away, and xfs_bmbt_disk_set_all
can be merged into the former implementation of xfs_bmbt_disk_set_allf.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: refactor xfs_bmap_add_extent_unwritten_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:25 +0000 (14:16 -0700)]
xfs: refactor xfs_bmap_add_extent_unwritten_real

Use xfs_iext_get_extent to find, and xfs_iext_update_extent to update
entries in the in-core extent list.  This isolates the function from
the detailed layout of the extent list, and generally makes the code
a lot more readable.

Also get rid of the oldext and newext variables as using the extent
records is a lot more descriptive.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: refactor delalloc accounting in xfs_bmap_add_extent_delay_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:25 +0000 (14:16 -0700)]
xfs: refactor delalloc accounting in xfs_bmap_add_extent_delay_real

Account for all changes to the delalloc reservation in da_new, and use a
single call xfs_mod_fdblocks to reserve/free blocks, including always
checking for an error.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: refactor xfs_bmap_add_extent_delay_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:24 +0000 (14:16 -0700)]
xfs: refactor xfs_bmap_add_extent_delay_real

Use xfs_iext_get_extent to find, and xfs_iext_update_extent to update
entries in the in-core extent list.  This isolates the function from
the detailed layout of the extent list, and generally makes the code
a lot more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: refactor xfs_bmap_add_extent_hole_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:24 +0000 (14:16 -0700)]
xfs: refactor xfs_bmap_add_extent_hole_real

Use xfs_iext_update_extent to update entries in the in-core extent list.
This isolates the function from the detailed layout of the extent list,
and generally makes the code a lot more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: refactor xfs_bmap_add_extent_hole_delay
Christoph Hellwig [Tue, 17 Oct 2017 21:16:23 +0000 (14:16 -0700)]
xfs: refactor xfs_bmap_add_extent_hole_delay

Use xfs_iext_get_extent to find, and xfs_iext_update_extent to update
entries in the in-core extent list.  This isolates the function from
the detailed layout of the extent list, and generally makes the code
a lot more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: refactor xfs_del_extent_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:23 +0000 (14:16 -0700)]
xfs: refactor xfs_del_extent_real

Use xfs_iext_update_extent to update entries in the in-core extent list.
This isolates the function from the detailed layout of the extent list,
and generally makes the code a lot more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: use the state defines in xfs_bmap_del_extent_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:23 +0000 (14:16 -0700)]
xfs: use the state defines in xfs_bmap_del_extent_real

Use the same defines as the other extent add and delete helpers, which
both improves code readability and trace point output.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: use correct state defines in xfs_bmap_del_extent_{cow,delay}
Christoph Hellwig [Tue, 17 Oct 2017 21:16:22 +0000 (14:16 -0700)]
xfs: use correct state defines in xfs_bmap_del_extent_{cow,delay}

Use the _FILLING values to match the usage in the xfs_bmap_add_extent_*
helpers.  No change in behavior, just better naming in the code and
tracepoint output.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: move some more code into xfs_bmap_del_extent_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:22 +0000 (14:16 -0700)]
xfs: move some more code into xfs_bmap_del_extent_real

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: use xfs_bmap_del_extent_delay for the data fork as well
Christoph Hellwig [Tue, 17 Oct 2017 21:16:21 +0000 (14:16 -0700)]
xfs: use xfs_bmap_del_extent_delay for the data fork as well

And remove the delalloc code from xfs_bmap_del_extent, which gets renamed
to xfs_bmap_del_extent_real to fit the naming scheme used by the other
xfs_bmap_{add,del}_extent_* routines.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: rename bno to end in __xfs_bunmapi
Christoph Hellwig [Tue, 17 Oct 2017 21:16:21 +0000 (14:16 -0700)]
xfs: rename bno to end in __xfs_bunmapi

Rename the bno variable that's used as the end of the range in
__xfs_bunmapi to end, which better describes it.  Additionally change
the start variable which takes the initial value of bno to be the
function parameter itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: don't set XFS_BTCUR_BPRV_WASDEL in xfs_bunmapi
Christoph Hellwig [Tue, 17 Oct 2017 21:16:20 +0000 (14:16 -0700)]
xfs: don't set XFS_BTCUR_BPRV_WASDEL in xfs_bunmapi

The XFS_BTCUR_BPRV_WASDEL flag is supposed to indicate that we are
converting a delayed allocation to a real one, which isn't the case
in xfs_bunmapi.  Setting it could theoretically lead to misaccounting
here, but it's unlikely that we ever hit it in practice.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: use xfs_iext_get_extent instead of open coding it
Christoph Hellwig [Tue, 17 Oct 2017 21:16:20 +0000 (14:16 -0700)]
xfs: use xfs_iext_get_extent instead of open coding it

This avoids exposure to details of the extent list implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real
Christoph Hellwig [Tue, 17 Oct 2017 21:16:19 +0000 (14:16 -0700)]
xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real

There was one spot in xfs_bmap_add_extent_unwritten_real that didn't use the
passed in new extent state but always converted to normal, leading to wrong
behavior when converting from normal to unwritten.

Only found by code inspection, it seems like this code path to move partial
extent from written to unwritten while merging it with the next extent is
rarely exercised.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: simplify the xfs_getbmap interface
Christoph Hellwig [Tue, 17 Oct 2017 21:16:19 +0000 (14:16 -0700)]
xfs: simplify the xfs_getbmap interface

Instead of passing in a formatter callback allocate the bmap buffer
in the caller and process the entries there.  Additionally replace
the in-kernel buffer with a new much smaller structure, and unify
the implementation of the different ioctls in a single function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoxfs: rewrite getbmap using the xfs_iext_* helpers
Christoph Hellwig [Tue, 17 Oct 2017 21:16:18 +0000 (14:16 -0700)]
xfs: rewrite getbmap using the xfs_iext_* helpers

Currently getbmap uses xfs_bmapi_read to query the extent map, and then
fixes up various bits that are eventually reported to userspace.

This patch instead rewrites it to use xfs_iext_lookup_extent and
xfs_iext_get_extent to iteratively process the extent map.  This not
only avoids the need to allocate a map for the returned xfs_bmbt_irec
structures but also greatly simplified the code.

There are two intentional behavior changes compared to the old code:

 - the current code reports unwritten extents that don't directly border
   a written one as unwritten even when not passing the BMV_IF_PREALLOC
   option, contrary to the documentation.  The new code requires the
   BMV_IF_PREALLOC flag to report the unwrittent extent bit.
 - The new code does never merges consecutive extents, unlike the old
   code that sometimes does it based on the boundaries of the
   xfs_bmapi_read calls.  Note that the extent merging behavior was
   entirely undocumented.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Thu, 26 Oct 2017 21:04:14 +0000 (23:04 +0200)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rdma fix from Doug Ledford:
 "Fix an oops issue in the new RDMA netlink code"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag

7 years agoRevert "apparmor: add base infastructure for socket mediation"
Linus Torvalds [Thu, 26 Oct 2017 17:35:35 +0000 (19:35 +0200)]
Revert "apparmor: add base infastructure for socket mediation"

This reverts commit 651e28c5537abb39076d3949fb7618536f1d242e.

This caused a regression:
 "The specific problem is that dnsmasq refuses to start on openSUSE Leap
  42.2.  The specific cause is that and attempt to open a PF_LOCAL socket
  gets EACCES.  This means that networking doesn't function on a system
  with a 4.14-rc2 system."

Sadly, the developers involved seemed to be in denial for several weeks
about this, delaying the revert.  This has not been a good release for
the security subsystem, and this area needs to change development
practices.

Reported-and-bisected-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge tag 'pm-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 26 Oct 2017 17:10:39 +0000 (19:10 +0200)]
Merge tag 'pm-4.14-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "This fixes a device power management quality of service (PM QoS)
  framework implementation issue causing 'no restriction' requests for
  device resume latency, including 'no restriction' set by user space,
  to effectively override requests with specific device resume latency
  requirements.

  It is late in the cycle, but the bug in question is in the 'user space
  can trigger unexpected behavior' category and the fix is
  stable-candidate, so here it goes"

* tag 'pm-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / QoS: Fix device resume latency PM QoS

7 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 26 Oct 2017 15:08:48 +0000 (17:08 +0200)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few select fixes that should go into this series. Mainly for NVMe,
  but also a single stable fix for nbd from Josef"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nbd: handle interrupted sendmsg with a sndtimeo set
  nvme-rdma: Fix error status return in tagset allocation failure
  nvme-rdma: Fix possible double free in reconnect flow
  nvmet: synchronize sqhd update
  nvme-fc: retry initial controller connections 3 times
  nvme-fc: fix iowait hang

7 years agoMerge tag 'spi-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Thu, 26 Oct 2017 15:06:35 +0000 (17:06 +0200)]
Merge tag 'spi-fix-v4.14-rc5' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "There are a bunch of device specific fixes (more than I'd like, I've
  been lax sending these) plus one important core fix for the conversion
  to use an IDR for bus number allocation which avoids issues with
  collisions when some but not all of the buses in the system have a
  fixed bus number specified.

  The Armada changes are rather large, specificially "spi: armada-3700:
  Fix padding when sending not 4-byte aligned data", but it's a storage
  corruption issue and there's things like indentation changes which
  make it look bigger than it really is. It's been cooking in -next for
  quite a while now and is part of the reason for the delay"

* tag 'spi-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers
  spi: bcm-qspi: Fix use after free in bcm_qspi_probe() in error path
  spi: a3700: Return correct value on timeout detection
  spi: uapi: spidev: add missing ioctl header
  spi: stm32: Fix logical error in stm32_spi_prepare_mbr()
  spi: armada-3700: Fix padding when sending not 4-byte aligned data
  spi: armada-3700: Fix failing commands with quad-SPI

7 years agoMerge tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-client
Linus Torvalds [Thu, 26 Oct 2017 15:04:20 +0000 (17:04 +0200)]
Merge tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A small lock imbalance fix, marked for stable"

* tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-client:
  ceph: unlock dangling spinlock in try_flush_caps()

7 years agoMerge tag 'xfs-4.14-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Thu, 26 Oct 2017 06:45:40 +0000 (08:45 +0200)]
Merge tag 'xfs-4.14-fixes-7' of git://git./fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Here's (hopefully) the last bugfix for 4.14:

   - Rework nowait locking code to reduce locking overhead penalty"

* tag 'xfs-4.14-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix AIM7 regression

7 years agoMerge tag 'hwmon-for-linus-v4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 26 Oct 2017 06:11:44 +0000 (08:11 +0200)]
Merge tag 'hwmon-for-linus-v4.14-rc7' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - Fix initial temperature readings for TMP102

 - Fix timeouts in DA9052 driver by increasing its sampling rate

* tag 'hwmon-for-linus-v4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (tmp102) Fix first temperature reading
  hwmon: (da9052) Increase sample rate when using TSI

7 years agoMerge tag 'sound-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 26 Oct 2017 06:02:42 +0000 (08:02 +0200)]
Merge tag 'sound-4.14-rc7' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just two HD-audio fixups for a recent Realtek codec model. It's pretty
  safe to apply (and unsurprisingly boring)"

* tag 'sound-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - fix headset mic problem for Dell machines with alc236
  ALSA: hda/realtek - Add support for ALC236/ALC3204

7 years agoRDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag
Michael J. Ruhl [Tue, 24 Oct 2017 12:41:01 +0000 (08:41 -0400)]
RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag

rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
or the .doit() callback.  The check is done with this check:

if (flags & NLM_F_DUMP) ...

The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).

When an RDMA_NL_LS message (response) is received, the bit used for
indicating an error is the same bit as NLM_F_ROOT.

NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.

ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
occurs in the service.  The current code then misinterprets the
NLM_F_DUMP bit and trys to call the .dump() callback.

If the .dump() callback for the specified request is not available
(which is true for the RDMA_NL_LS messages) the following Oops occurs:

[ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
   (null)
[ 4555.969046] IP:           (null)
[ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
[ 4555.979287] Oops: 0010 [#1] SMP
[ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm
ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod
dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd
glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core mei_me
lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si ipmi_devintf
ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace
sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
pps_core drm dca libata i2c_algo_bit i2c_core
[ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
4.14.0-rc2+ #6
[ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS
SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
[ 4556.087967] RIP: 0010:          (null)
[ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
[ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
0000000000000000
[ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
ffff881056362000
[ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
0000000000001100
[ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
ffff881056362000
[ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
0000000000000ec0
[ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
knlGS:0000000000000000
[ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
00000000001606e0
[ 4556.161419] Call Trace:
[ 4556.164167]  ? netlink_dump+0x12c/0x290
[ 4556.168468]  __netlink_dump_start+0x186/0x1f0
[ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
[ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
[ 4556.183604]  netlink_unicast+0x181/0x240
[ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
[ 4556.192392]  sock_sendmsg+0x38/0x50
[ 4556.196299]  SYSC_sendto+0x102/0x190
[ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
[ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
[ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
[ 4556.215442]  SyS_sendto+0xe/0x10
[ 4556.219060]  do_syscall_64+0x67/0x1b0
[ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
[ 4556.228328] RIP: 0033:0x7fe0b9db2a63
[ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX:
000000000000002c
[ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
00007fe0b9db2a63
[ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
000000000000000d
[ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
000000000000000c
[ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
00007ffc55edc280
[ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
0000000000000001
[ 4556.282368] Code:  Bad RIP value.
[ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
[ 4556.293013] CR2: 0000000000000000
[ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
[ 4556.305465] Kernel panic - not syncing: Fatal exception
[ 4556.313786] Kernel Offset: disabled
[ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
[ 4556.328960] ------------[ cut here ]------------

Special case RDMA_NL_LS response messages to call the appropriate
callback.

Additionally, make sure that the .dump() callback is not NULL
before calling it.

Fixes: 647c75ac59a48a54 ("RDMA/netlink: Convert LS to doit callback")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
7 years agoMerge remote-tracking branches 'spi/fix/armada', 'spi/fix/idr', 'spi/fix/qspi', ...
Mark Brown [Wed, 25 Oct 2017 12:06:34 +0000 (14:06 +0200)]
Merge remote-tracking branches 'spi/fix/armada', 'spi/fix/idr', 'spi/fix/qspi', 'spi/fix/stm32' and 'spi/fix/uapi' into spi-linus

7 years agoceph: unlock dangling spinlock in try_flush_caps()
Jeff Layton [Thu, 19 Oct 2017 12:52:58 +0000 (08:52 -0400)]
ceph: unlock dangling spinlock in try_flush_caps()

sparse warns:

  fs/ceph/caps.c:2042:9: warning: context imbalance in 'try_flush_caps' - wrong count at exit

We need to exit this function with the lock unlocked, but a couple of
cases leave it locked.

Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
7 years agoMerge tag 'nfs-for-4.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Wed, 25 Oct 2017 04:46:43 +0000 (06:46 +0200)]
Merge tag 'nfs-for-4.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Fix a list corruption in xprt_release()

 - Fix a workqueue lockdep warning due to unsafe use of
   cancel_work_sync()

* tag 'nfs-for-4.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Destroy transport from the system workqueue
  SUNRPC: fix a list corruption issue in xprt_release()

7 years agonbd: handle interrupted sendmsg with a sndtimeo set
Josef Bacik [Tue, 24 Oct 2017 19:57:18 +0000 (15:57 -0400)]
nbd: handle interrupted sendmsg with a sndtimeo set

If you do not set sk_sndtimeo you will get -ERESTARTSYS if there is a
pending signal when you enter sendmsg, which we handle properly.
However if you set a timeout for your commands we'll set sk_sndtimeo to
that timeout, which means that sendmsg will start returning -EINTR
instead of -ERESTARTSYS.  Fix this by checking either cases and doing
the correct thing.

Cc: stable@vger.kernel.org
Fixes: dc88e34d69d8 ("nbd: set sk->sk_sndtimeo for our sockets")
Reported-and-tested-by: Daniel Xu <dlxu@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Tue, 24 Oct 2017 16:51:59 +0000 (18:51 +0200)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "PPC fixes for potential host oops and hangs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Add more barriers in XIVE load/unload code
  KVM: PPC: Book3S: Protect kvmppc_gpa_to_ua() with SRCU
  KVM: PPC: Book3S HV: POWER9 more doorbell fixes
  KVM: PPC: Fix oops when checking KVM_CAP_PPC_HTM

7 years agoPM / QoS: Fix device resume latency PM QoS
Rafael J. Wysocki [Tue, 24 Oct 2017 13:20:45 +0000 (15:20 +0200)]
PM / QoS: Fix device resume latency PM QoS

The special value of 0 for device resume latency PM QoS means
"no restriction", but there are two problems with that.

First, device resume latency PM QoS requests with 0 as the
value are always put in front of requests with positive
values in the priority lists used internally by the PM QoS
framework, causing 0 to be chosen as an effective constraint
value.  However, that 0 is then interpreted as "no restriction"
effectively overriding the other requests with specific
restrictions which is incorrect.

Second, the users of device resume latency PM QoS have no
way to specify that *any* resume latency at all should be
avoided, which is an artificial limitation in general.

To address these issues, modify device resume latency PM QoS to
use S32_MAX as the "no constraint" value and 0 as the "no
latency at all" one and rework its users (the cpuidle menu
governor, the genpd QoS governor and the runtime PM framework)
to follow these changes.

Also add a special "n/a" value to the corresponding user space I/F
to allow user space to indicate that it cannot accept any resume
latencies at all for the given device.

Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alex Shi <alex.shi@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
7 years agohwmon: (tmp102) Fix first temperature reading
Guenter Roeck [Tue, 24 Oct 2017 00:36:03 +0000 (17:36 -0700)]
hwmon: (tmp102) Fix first temperature reading

Commit 3d8f7a89a197 ("hwmon: (tmp102) Improve handling of initial read
delay") reduced the initial temperature read delay and made it dependent
on the chip's shutdown mode. If the chip was not in shutdown mode at probe,
the read delay no longer applies.

This ignores the fact that the chip initialization changes the temperature
sensor resolution, and that the temperature register values change when
the resolution is changed. As a result, the reported temperature is twice
as high as the real temperature until the first temperature conversion
after the configuration change is complete. This can result in unexpected
behavior and, worst case, in a system shutdown. To fix the problem,
let's just always wait for a conversion to complete before reporting
a temperature.

Fixes: 3d8f7a89a197 ("hwmon: (tmp102) Improve handling of initial read delay")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197167
Reported-by: Ralf Goebel <ralf.goebel@imago-technologies.com>
Cc: Ralf Goebel <ralf.goebel@imago-technologies.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
7 years agoALSA: hda - fix headset mic problem for Dell machines with alc236
Hui Wang [Tue, 24 Oct 2017 08:53:34 +0000 (16:53 +0800)]
ALSA: hda - fix headset mic problem for Dell machines with alc236

We have several Dell laptops which use the codec alc236, the headset
mic can't work on these machines. Following the commit 736f20a70, we
add the pin cfg table to make the headset mic work.

Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
7 years agoxfs: fix AIM7 regression
Christoph Hellwig [Tue, 24 Oct 2017 01:31:50 +0000 (18:31 -0700)]
xfs: fix AIM7 regression

Apparently our current rwsem code doesn't like doing the trylock, then
lock for real scheme.  So change our read/write methods to just do the
trylock for the RWF_NOWAIT case.  This fixes a ~25% regression in
AIM7.

Fixes: 91f9943e ("fs: support RWF_NOWAIT for buffered reads")
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
7 years agoMerge tag 'platform-drivers-x86-v4.14-3' of git://git.infradead.org/linux-platform...
Linus Torvalds [Mon, 23 Oct 2017 17:43:30 +0000 (13:43 -0400)]
Merge tag 'platform-drivers-x86-v4.14-3' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver fixes from Darren Hart:
 "Use a spin_lock instead of mutex in atomic context. The devm_ fix is a
  dependency. Summary:

  intel_pmc_ipc:
   - Use spin_lock to protect GCR updates
   - Use devm_* calls in driver probe function"

* tag 'platform-drivers-x86-v4.14-3' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: intel_pmc_ipc: Use spin_lock to protect GCR updates
  platform/x86: intel_pmc_ipc: Use devm_* calls in driver probe function

7 years agoplatform/x86: intel_pmc_ipc: Use spin_lock to protect GCR updates
Kuppuswamy Sathyanarayanan [Sat, 7 Oct 2017 22:19:51 +0000 (15:19 -0700)]
platform/x86: intel_pmc_ipc: Use spin_lock to protect GCR updates

Currently, update_no_reboot_bit() function implemented in this driver
uses mutex_lock() to protect its register updates. But this function is
called with in atomic context in iTCO_wdt_start() and iTCO_wdt_stop()
functions in iTCO_wdt.c driver, which in turn causes "sleeping into
atomic context" issue. This patch fixes this issue by replacing the
mutex_lock() with spin_lock() to protect the GCR read/write/update APIs.

Fixes: 9d855d4 ("platform/x86: intel_pmc_ipc: Fix iTCO_wdt GCS memory mapping failure")
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kupuswamy@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
7 years agoplatform/x86: intel_pmc_ipc: Use devm_* calls in driver probe function
Kuppuswamy Sathyanarayanan [Tue, 5 Sep 2017 05:37:21 +0000 (22:37 -0700)]
platform/x86: intel_pmc_ipc: Use devm_* calls in driver probe function

This patch cleans up unnecessary free/alloc calls in ipc_plat_probe(),
ipc_pci_probe() and ipc_plat_get_res() functions by using devm_*
calls.

This patch also adds proper error handling for failure cases in
ipc_pci_probe() function.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
[andy: fixed style issues, missed devm_free_irq(), removed unnecessary log message]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
7 years agoMerge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Mon, 23 Oct 2017 15:24:52 +0000 (11:24 -0400)]
Merge branch 'for-4.14-fixes' of git://git./linux/kernel/git/tj/wq

Pull workqueue fix from Tejun Heo:
 "This is a fix for an old bug in workqueue. Workqueue used a mutex to
  arbitrate who gets to be the manager of a pool. When the manager role
  gets released, the mutex gets unlocked while holding the pool's
  irqsafe spinlock. This can lead to deadlocks as mutex's internal
  spinlock isn't irqsafe. This got discovered by recent fixes to mutex
  lockdep annotations.

  The fix is a bit invasive for rc6 but if anything were wrong with the
  fix it would likely have already blown up in -next, and we want the
  fix in -stable anyway"

* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: replace pool->manager_arb mutex with a flag

7 years agoMerge tag 'pinctrl-v4.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Mon, 23 Oct 2017 14:36:04 +0000 (10:36 -0400)]
Merge tag 'pinctrl-v4.14-4' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Two last minute fixes for pin controllers, both regressions in
  specific drivers:

   - Fix a touchpad pin control issue on the AMD affecting Asus laptops

   - Fix an interrupt handling regression on the MCP23s08"

* tag 'pinctrl-v4.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: mcp23s08: fix interrupt handling regression
  pinctrl/amd: fix masking of GPIO interrupts

7 years agoMerge tag 'regulator-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 23 Oct 2017 14:32:59 +0000 (10:32 -0400)]
Merge tag 'regulator-fix-v4.14-rc5' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A couple of small driver specific bug fixes that have been collected
  since the merge window"

* tag 'regulator-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: rn5t618: Do not index regulator_desc arrays by id
  regulator: axp20x: Fix poly-phase bit offset for AXP803 DCDC5/6

7 years agoLinux 4.14-rc6 v4.14-rc6
Linus Torvalds [Mon, 23 Oct 2017 10:49:47 +0000 (06:49 -0400)]
Linux 4.14-rc6

7 years agoMerge tag 'staging-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Mon, 23 Oct 2017 10:37:16 +0000 (06:37 -0400)]
Merge tag 'staging-4.14-rc6' of git://git./linux/kernel/git/gregkh/staging

Pull staging and IIO fixes from Greg KH:
 "Here are a small number of patches to resolve some reported IIO and a
  staging driver problem. Nothing major here, full details are in the
  shortlog below.

  All have been in linux-next with no reported issues"

* tag 'staging-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: bcm2835-audio: Fix memory corruption
  iio: adc: at91-sama5d2_adc: fix probe error on missing trigger property
  iio: adc: dln2-adc: fix build error
  iio: dummy: events: Add missing break
  staging: iio: ade7759: fix signed extension bug on shift of a u8
  iio: pressure: zpa2326: Remove always-true check which confuses gcc
  iio: proximity: as3935: noise detection + threshold changes

7 years agoMerge tag 'char-misc-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Mon, 23 Oct 2017 10:35:01 +0000 (06:35 -0400)]
Merge tag 'char-misc-4.14-rc6' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are four small fixes for 4.14-rc6.

  Three of them are binder driver fixes for reported issues, and the
  last one is a hyperv driver bugfix. Nothing major, but good fixes to
  get into 4.14-final.

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  android: binder: Fix null ptr dereference in debug msg
  android: binder: Don't get mm from task
  vmbus: hvsock: add proper sync for vmbus_hvsock_device_unregister()
  binder: call poll_wait() unconditionally.

7 years agoMerge tag 'usb-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Mon, 23 Oct 2017 10:33:05 +0000 (06:33 -0400)]
Merge tag 'usb-4.14-rc6' of git://git./linux/kernel/git/gregkh/usb

Pull USB/PHY fixes from Greg KH:
 "Here are a small number of USB and PHY driver fixes for 4.14-rc6

  There is the usual musb and xhci fixes in here, as well as some needed
  phy patches. Also is a nasty regression fix for usbfs that has started
  to hit a lot of people using virtual machines.

  All of these have been in linux-next with no reported problems"

* tag 'usb-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits)
  usb: hub: Allow reset retry for USB2 devices on connect bounce
  USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor()
  MAINTAINERS: fix git tree url for musb module
  usb: quirks: add quirk for WORLDE MINI MIDI keyboard
  usb: musb: sunxi: Explicitly release USB PHY on exit
  usb: musb: Check for host-mode using is_host_active() on reset interrupt
  usb: musb: musb_cppi41: Configure the number of channels for DA8xx
  usb: musb: musb_cppi41: Fix cppi41_set_dma_mode() for DA8xx
  usb: musb: musb_cppi41: Fix the address of teardown and autoreq registers
  USB: musb: fix late external abort on suspend
  USB: musb: fix session-bit runtime-PM quirk
  usb: cdc_acm: Add quirk for Elatec TWN3
  USB: devio: Revert "USB: devio: Don't corrupt user memory"
  usb: xhci: Handle error condition in xhci_stop_device()
  usb: xhci: Reset halted endpoint if trb is noop
  xhci: Cleanup current_cmd in xhci_cleanup_command_queue()
  xhci: Identify USB 3.1 capable hosts by their port protocol capability
  USB: serial: metro-usb: add MS7820 device id
  phy: rockchip-typec: Check for errors from tcphy_phy_init()
  phy: rockchip-typec: Don't set the aux voltage swing to 400 mV
  ...

7 years agoMerge remote-tracking branches 'regulator/fix/axp20x' and 'regulator/fix/rn5t618...
Mark Brown [Mon, 23 Oct 2017 09:46:30 +0000 (11:46 +0200)]
Merge remote-tracking branches 'regulator/fix/axp20x' and 'regulator/fix/rn5t618' into regulator-linus

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sun, 22 Oct 2017 20:19:12 +0000 (16:19 -0400)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fix from Dmitry Torokhov:
 "A fix for a broken commit in the previous pull breaking automatic
  module loading of input handlers, such ad evdev"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: do not use property bits when generating module alias

7 years agoInput: do not use property bits when generating module alias
Dmitry Torokhov [Sun, 22 Oct 2017 18:42:29 +0000 (11:42 -0700)]
Input: do not use property bits when generating module alias

The commit 8724ecb07229 ("Input: allow matching device IDs on property
bits") started using property bits when generating module aliases for input
handlers, but did not adjust the generation of MODALIAS attribute on input
device uevents, breaking automatic module loading. Given that no handler
currently uses property bits in their module tables, let's revert this part
of the commit for now.

Reported-by: Damien Wyart <damien.wyart@gmail.com>
Tested-by: Damien Wyart <damien.wyart@gmail.com>
Fixes: 8724ecb07229 ("Input: allow matching device IDs on property bits")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Oct 2017 10:58:23 +0000 (06:58 -0400)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A couple of fixes addressing the following issues:

   - The last polishing for the TLB code, removing the last BUG_ON() and
     the debug file along with tidying up the lazy TLB code.

   - Prevent triple fault on 1st Gen. 486 caused by stupidly calling the
     early IDT setup after the first function which causes a fault which
     should be caught by the exception table.

   - Limit the mmap of /dev/mem to valid addresses

   - Prevent late microcode loading on Broadwell X

   - Remove a redundant assignment in the cache info code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Limit mmap() of /dev/mem to valid physical addresses
  x86/mm: Remove debug/x86/tlb_defer_switch_to_init_mm
  x86/mm: Tidy up "x86/mm: Flush more aggressively in lazy TLB mode"
  x86/mm/64: Remove the last VM_BUG_ON() from the TLB code
  x86/microcode/intel: Disable late loading on model 79
  x86/idt: Initialize early IDT before cr4_init_shadow()
  x86/cpu/intel_cacheinfo: Remove redundant assignment to 'this_leaf'

7 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Oct 2017 10:56:25 +0000 (06:56 -0400)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "A single fix to make the cs5535 clock event driver robust agaist
  spurious interrupts"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents/drivers/cs5535: Improve resilience to spurious interrupts

7 years agoMerge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Oct 2017 10:54:42 +0000 (06:54 -0400)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull smp/hotplug fix from Thomas Gleixner:
 "The recent rework of the callback invocation missed to cleanup the
  leftovers of the operation, so under certain circumstances a
  subsequent CPU hotplug operation accesses stale data and crashes.
  Clean it up."

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Reset node state after operation

7 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Oct 2017 10:52:53 +0000 (06:52 -0400)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "A series of fixes for perf tooling:

   - Make xyarray return the X/Y size correctly which fixes a crash in
     the exit code.

   - Fix the libc path in test so it works not only on Debian/Ubuntu
     correctly

   - Check for eBPF file existance and output a useful error message
     instead of failing to compile a non existant file

   - Make sure perf_hpp_fmt is not longer references before freeing it

   - Use list_del_init() in the histogram code to prevent a crash when
     the already deleted element is deleted again

   - Remove the leftovers of the removed '-l' option

   - Add reviewer entries to the MAINTAINERS file"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf test shell trace+probe_libc_inet_pton.sh: Be compatible with Debian/Ubuntu
  perf xyarray: Fix wrong processing when closing evsel fd
  perf buildid-list: Fix crash when processing PERF_RECORD_NAMESPACE
  perf record: Fix documentation for a inexistent option '-l'
  perf tools: Add long time reviewers to MAINTAINERS
  perf tools: Check wether the eBPF file exists in event parsing
  perf hists: Add extra integrity checks to fmt_free()
  perf hists: Fix crash in perf_hpp__reset_output_field()

7 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Oct 2017 10:42:58 +0000 (06:42 -0400)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of small fixes mostly in the irq drivers area:

   - Make the tango irq chip work correctly, which requires a new
     function in the generiq irq chip implementation

   - A set of updates to the GIC-V3 ITS driver removing a bogus BUG_ON()
     and parsing the VCPU table size correctly"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: generic chip: remove irq_gc_mask_disable_reg_and_ack()
  irqchip/tango: Use irq_gc_mask_disable_and_ack_set
  genirq: generic chip: Add irq_gc_mask_disable_and_ack_set()
  irqchip/gic-v3-its: Add missing changes to support 52bit physical address
  irqchip/gic-v3-its: Fix the incorrect parsing of VCPU table size
  irqchip/gic-v3-its: Fix the incorrect BUG_ON in its_init_vpe_domain()
  DT: arm,gic-v3: Update the ITS size in the examples

7 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Oct 2017 10:39:58 +0000 (06:39 -0400)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull objtool fix from Thomas Gleixner:
 "Plug a memory leak in the instruction decoder"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix memory leak in decode_instructions()

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 22 Oct 2017 02:44:48 +0000 (22:44 -0400)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "A little more than usual this time around. Been travelling, so that is
  part of it.

  Anyways, here are the highlights:

   1) Deal with memcontrol races wrt. listener dismantle, from Eric
      Dumazet.

   2) Handle page allocation failures properly in nfp driver, from Jaku
      Kicinski.

   3) Fix memory leaks in macsec, from Sabrina Dubroca.

   4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault.

   5) Several fixes in bnxt_en driver, including preventing potential
      NVRAM parameter corruption from Michael Chan.

   6) Fix for KRACK attacks in wireless, from Johannes Berg.

   7) rtnetlink event generation fixes from Xin Long.

   8) Deadlock in mlxsw driver, from Ido Schimmel.

   9) Disallow arithmetic operations on context pointers in bpf, from
      Jakub Kicinski.

  10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from
      Xin Long.

  11) Only TCP is supported for sockmap, make that explicit with a
      check, from John Fastabend.

  12) Fix IP options state races in DCCP and TCP, from Eric Dumazet.

  13) Fix panic in packet_getsockopt(), also from Eric Dumazet.

  14) Add missing locked in hv_sock layer, from Dexuan Cui.

  15) Various aquantia bug fixes, including several statistics handling
      cures. From Igor Russkikh et al.

  16) Fix arithmetic overflow in devmap code, from John Fastabend.

  17) Fix busted socket memory accounting when we get a fault in the tcp
      zero copy paths. From Willem de Bruijn.

  18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
  stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
  ipv6: flowlabel: do not leave opt->tot_len with garbage
  of_mdio: Fix broken PHY IRQ in case of probe deferral
  textsearch: fix typos in library helpers
  rxrpc: Don't release call mutex on error pointer
  net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
  net: stmmac: Fix stmmac_get_rx_hwtstamp()
  net: stmmac: Add missing call to dev_kfree_skb()
  mlxsw: spectrum_router: Configure TIGCR on init
  mlxsw: reg: Add Tunneling IPinIP General Configuration Register
  net: ethtool: remove error check for legacy setting transceiver type
  soreuseport: fix initialization race
  net: bridge: fix returning of vlan range op errors
  sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
  bpf: add test cases to bpf selftests to cover all access tests
  bpf: fix pattern matches for direct packet access
  bpf: fix off by one for range markings with L{T, E} patterns
  bpf: devmap fix arithmetic overflow in bitmap_size calculation
  net: aquantia: Bad udp rate on default interrupt coalescing
  net: aquantia: Enable coalescing management via ethtool interface
  ...

7 years agostmmac: Don't access tx_q->dirty_tx before netif_tx_lock
Bernd Edlinger [Sat, 21 Oct 2017 06:51:30 +0000 (06:51 +0000)]
stmmac: Don't access tx_q->dirty_tx before netif_tx_lock

This is the possible reason for different hard to reproduce
problems on my ARMv7-SMP test system.

The symptoms are in recent kernels imprecise external aborts,
and in older kernels various kinds of network stalls and
unexpected page allocation failures.

My testing indicates that the trouble started between v4.5 and v4.6
and prevails up to v4.14.

Using the dirty_tx before acquiring the spin lock is clearly
wrong and was first introduced with v4.6.

Fixes: e3ad57c96715 ("stmmac: review RX/TX ring management")

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv6: flowlabel: do not leave opt->tot_len with garbage
Eric Dumazet [Sat, 21 Oct 2017 19:26:23 +0000 (12:26 -0700)]
ipv6: flowlabel: do not leave opt->tot_len with garbage

When syzkaller team brought us a C repro for the crash [1] that
had been reported many times in the past, I finally could find
the root cause.

If FlowLabel info is merged by fl6_merge_options(), we leave
part of the opt_space storage provided by udp/raw/l2tp with random value
in opt_space.tot_len, unless a control message was provided at sendmsg()
time.

Then ip6_setup_cork() would use this random value to perform a kzalloc()
call. Undefined behavior and crashes.

Fix is to properly set tot_len in fl6_merge_options()

At the same time, we can also avoid consuming memory and cpu cycles
to clear it, if every option is copied via a kmemdup(). This is the
change in ip6_setup_cork().

[1]
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 6613 Comm: syz-executor0 Not tainted 4.14.0-rc4+ #127
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801cb64a100 task.stack: ffff8801cc350000
RIP: 0010:ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168
RSP: 0018:ffff8801cc357550 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: ffff8801cc357748 RCX: 0000000000000010
RDX: 0000000000000002 RSI: ffffffff842bd1d9 RDI: 0000000000000014
RBP: ffff8801cc357620 R08: ffff8801cb17f380 R09: ffff8801cc357b10
R10: ffff8801cb64a100 R11: 0000000000000000 R12: ffff8801cc357ab0
R13: ffff8801cc357b10 R14: 0000000000000000 R15: ffff8801c3bbf0c0
FS:  00007f9c5c459700(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020324000 CR3: 00000001d1cf2000 CR4: 00000000001406f0
DR0: 0000000020001010 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
 ip6_make_skb+0x282/0x530 net/ipv6/ip6_output.c:1729
 udpv6_sendmsg+0x2769/0x3380 net/ipv6/udp.c:1340
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 SYSC_sendto+0x358/0x5a0 net/socket.c:1750
 SyS_sendto+0x40/0x50 net/socket.c:1718
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4520a9
RSP: 002b:00007f9c5c458c08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004520a9
RDX: 0000000000000001 RSI: 0000000020fd1000 RDI: 0000000000000016
RBP: 0000000000000086 R08: 0000000020e0afe4 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004bb1ee
R13: 00000000ffffffff R14: 0000000000000016 R15: 0000000000000029
Code: e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 ea 0f 00 00 48 8d 79 04 48 b8 00 00 00 00 00 fc ff df 45 8b 74 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
RIP: ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168 RSP: ffff8801cc357550

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoof_mdio: Fix broken PHY IRQ in case of probe deferral
Geert Uytterhoeven [Wed, 18 Oct 2017 11:54:03 +0000 (13:54 +0200)]
of_mdio: Fix broken PHY IRQ in case of probe deferral

If an Ethernet PHY is initialized before the interrupt controller it is
connected to, a message like the following is printed:

    irq: no irq domain found for /interrupt-controller@e61c0000 !

However, the actual error is ignored, leading to a non-functional (POLL)
PHY interrupt later:

    Micrel KSZ8041RNLI ee700000.ethernet-ffffffff:01: attached PHY driver [Micrel KSZ8041RNLI] (mii_bus:phy_addr=ee700000.ethernet-ffffffff:01, irq=POLL)

Depending on whether the PHY driver will fall back to polling, Ethernet
may or may not work.

To fix this:
  1. Switch of_mdiobus_register_phy() from irq_of_parse_and_map() to
     of_irq_get().
     Unlike the former, the latter returns -EPROBE_DEFER if the
     interrupt controller is not yet available, so this condition can be
     detected.
     Other errors are handled the same as before, i.e. use the passed
     mdio->irq[addr] as interrupt.
  2. Propagate and handle errors from of_mdiobus_register_phy() and
     of_mdiobus_register_device().

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotextsearch: fix typos in library helpers
Randy Dunlap [Fri, 20 Oct 2017 19:15:52 +0000 (12:15 -0700)]
textsearch: fix typos in library helpers

Fix spellos (typos) in textsearch library helpers.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agorxrpc: Don't release call mutex on error pointer
David Howells [Fri, 20 Oct 2017 16:01:22 +0000 (17:01 +0100)]
rxrpc: Don't release call mutex on error pointer

Don't release call mutex at the end of rxrpc_kernel_begin_call() if the
call pointer actually holds an error value.

Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'stmmac-hw-tstamp-fixes'
David S. Miller [Sun, 22 Oct 2017 01:50:40 +0000 (02:50 +0100)]
Merge branch 'stmmac-hw-tstamp-fixes'

Jose Abreu says:

====================
net: stmmac: Fix HW timestamping

Three fixes for HW timestamping feature, all of them for RX side.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: Prevent infinite loop in get_rx_timestamp_status()
Jose Abreu [Fri, 20 Oct 2017 13:37:36 +0000 (14:37 +0100)]
net: stmmac: Prevent infinite loop in get_rx_timestamp_status()

Prevent infinite loop by correctly setting the loop condition to
break when i == 10.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: Fix stmmac_get_rx_hwtstamp()
Jose Abreu [Fri, 20 Oct 2017 13:37:35 +0000 (14:37 +0100)]
net: stmmac: Fix stmmac_get_rx_hwtstamp()

When using GMAC4 the valid timestamp is from CTX next desc but
we are passing the previous desc to get_rx_timestamp_status()
callback.

Fix this and while at it rework a little bit the function logic.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: Add missing call to dev_kfree_skb()
Jose Abreu [Fri, 20 Oct 2017 13:37:34 +0000 (14:37 +0100)]
net: stmmac: Add missing call to dev_kfree_skb()

When RX HW timestamp is enabled and a frame is discarded we are
not freeing the skb but instead only setting to NULL the entry.

Add a call to dev_kfree_skb_any() so that skb entry is correctly
freed.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>