Brian Foster [Thu, 12 Jul 2018 05:26:30 +0000 (22:26 -0700)]
xfs: remove xfs_alloc_arg firstblock field
The xfs_alloc_arg.firstblock field is used to control the starting
agno for an allocation. The structure already carries a pointer to
the transaction, which carries the current firstblock value.
Remove the field and access ->t_firstblock directly in the
allocation code.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:29 +0000 (22:26 -0700)]
xfs: remove xfs_btree_cur private firstblock field
The bmbt cursor private structure has a firstblock field that is
used to maintain locking order on bmbt allocations. The field holds
an actual firstblock value (as opposed to a pointer), so it is
initialized on cursor creation, updated on allocation and then the
value is transferred back to the source before the cursor is
destroyed.
This value is always transferred from and back to the ->t_firstblock
field. Since xfs_btree_cur already carries a reference to the
transaction, we can remove this field from xfs_btree_cur and the
associated copying. The bmbt allocations will update the value in
the transaction directly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:29 +0000 (22:26 -0700)]
xfs: remove bmap format helpers firstblock params
The bmap format helpers receive firstblock via ->t_firstblock. Drop
the param and access it directly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:28 +0000 (22:26 -0700)]
xfs: remove bmap extent add helper firstblock params
The add extent helpers all receive firstblock via ->t_firstblock.
Drop the parameter and access it directly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:28 +0000 (22:26 -0700)]
xfs: remove xfs_bmalloca firstblock field
The xfs_bmalloca.firstblock field carries the firstblock value from
the transaction into the bmap infrastructure. It's initialized in
one place from ->t_firstblock, so drop the field and access
->t_firstblock directly throughout the bmap code.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:27 +0000 (22:26 -0700)]
xfs: use ->t_firstblock in bmap extent split
Also remove the unnecessary xfs_bmap_split_extent_at() parameter.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:27 +0000 (22:26 -0700)]
xfs: remove bmap insert/collapse firstblock param
The only callers pass ->t_firstblock.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:25 +0000 (22:26 -0700)]
xfs: remove xfs_bunmapi() firstblock param
All callers pass ->t_firstblock from the current transaction.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:25 +0000 (22:26 -0700)]
xfs: remove xfs_bmapi_write() firstblock param
All callers pass ->t_firstblock from the current transaction.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:24 +0000 (22:26 -0700)]
xfs: use ->t_firstblock in insert/collapse range
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:24 +0000 (22:26 -0700)]
xfs: use ->t_firstblock in xfs_bmapi_remap()
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:23 +0000 (22:26 -0700)]
xfs: use ->t_firstblock for all xfs_bunmapi() callers
Convert all xfs_bunmapi() callers to ->t_firstblock.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:23 +0000 (22:26 -0700)]
xfs: use ->t_firstblock for all xfs_bmapi_write() callers
Convert all xfs_bmapi_write() users to ->t_firstblock.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:22 +0000 (22:26 -0700)]
xfs: use ->t_firstblock in xattr ops
Similar to the dirops code, the xattr code uses an on-stack
firstblock variable for the various operations. This code rolls the
underlying transaction in various places, however, which means we
cannot simply replace the local firstblock vars with ->t_firstblock.
Doing so (without further changes) would invalidate the memory
pointed to by xfs_da_args.firstblock as soon as the first
transaction rolls.
To avoid this problem, remove xfs_da_args.firstblock and replace all
such accesses with ->t_firstblock at the same time. This ensures
that accesses to the current firstblock always occur through the
current transaction rather than a potentially invalid xfs_da_args
pointer.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:21 +0000 (22:26 -0700)]
xfs: use ->t_firstblock in attrfork add
Note that this codepath is a user of struct xfs_da_args. Switch it
over to ->t_firstblock in preparation to remove
xfs_da_args.firstblock.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:21 +0000 (22:26 -0700)]
xfs: remove firstblock param from xfs dir ops
All callers of the xfs_dir_*() functions pass ->t_firstblock as the
firstblock parameter. Drop the parameter and access ->t_firstblock
directly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:20 +0000 (22:26 -0700)]
xfs: use ->t_firstblock in dir ops
Callers of the xfs_dir_*() functions currently pass an on-stack
firstblock variable. While the dirops infrastructure carries a
pointer to this variable, it never rolls the transaction and so it
is safe to use ->t_firstblock instead.
Fix up the various xfs_dir_*() callers to use ->t_firstblock. Also
remove the unnecessary parameter for xfs_cross_rename().
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:20 +0000 (22:26 -0700)]
xfs: add firstblock field to xfs_trans
A firstblock var is typically allocated and initialized along with
xfs_defer_ops structures and passed around independent from the
associated transaction. To facilitate combining the two, add an
optional ->t_firstblock field to xfs_trans that can be used in place
of an on-stack variable.
The firstblock value follows the lifetime of the transaction, so
initialize it on allocation and when a transaction rolls.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:19 +0000 (22:26 -0700)]
xfs: allow null firstblock in xfs_bmapi_write() when tp is null
xfs_bmapi_write() always expects a valid firstblock pointer. It
immediately dereferences the pointer to help determine how to
initialize the bma.minleft field. The remaining accesses are
related to modifying btree format forks, which is only relevant for
!COW fork callers.
The reflink code passes a NULL transaction to xfs_bmapi_write() in a
couple places that do COW fork unwritten conversion. The purpose of
the firstblock field is to track the first block allocation in the
current transaction, so technically firstblock should not be
required for these callers either.
Tweak xfs_bmapi_write() to initialize the bma correctly without
accessing the firstblock pointer if no transaction is provided in
the first place. Update the reflink callers to pass NULL instead of
otherwise unused firstblock references.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:19 +0000 (22:26 -0700)]
xfs: refactor dfops init to attach to transaction
Most callers of xfs_defer_init() immediately attach the dfops
structure to a transaction. Add a transaction parameter to eliminate
much of this boilerplate code. This also helps self-document the
fact that many codepaths now expect a dfops pointer implicitly via
xfs_trans->t_dfops.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:18 +0000 (22:26 -0700)]
xfs: use ->t_dfops in reflink cow recover path
Use ->t_dfops of the leftover COW reservation cleanup transaction.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:18 +0000 (22:26 -0700)]
xfs: use ->t_dfops in cancel cow blocks operation
Use ->t_dfops of the transaction from the caller. Reset it before we
return to avoid leaks of local stack memory.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:17 +0000 (22:26 -0700)]
xfs: use ->t_dfops for rmap extent swap operations
xfs_swap_extent_rmap() uses a local dfops instance with a
transaction from the caller. Since there is only one caller, pull
the dfops structure into the caller and attach it to the
transaction. This avoids the need to clear ->t_dfops to prevent
invalid stack memory access.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:17 +0000 (22:26 -0700)]
xfs: remove unused btree cursor bc_private.a.dfops field
The xfs_btree_cur.bc_private.a.dfops field is only ever initialized
by the refcountbt cursor init function. The only caller of that
function with a non-NULL dfops is from deferred completion context,
which already has attached to ->t_dfops.
In addition to that, the only actual reference of a.dfops is the
cursor duplication function, which means the field is effectively
unused.
Remove the dfops field from the bc_private.a union. Any future users
can acquire the dfops from the transaction. This patch does not
change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:16 +0000 (22:26 -0700)]
xfs: remove xfs_btree_cur bmbt dfops field
All assignments of xfs_btree_cur.bc_private.b.dfops originate from
->t_dfops. Replace accesses of the former with the latter and remove
the unnecessary field. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:16 +0000 (22:26 -0700)]
xfs: remove dfops param from internal bmap extent helpers
All callers of the various bmap extent helpers now use ->t_dfops.
Remove the unnecessary dfops params and access ->t_dfops directly.
This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:15 +0000 (22:26 -0700)]
xfs: use ->t_dfops for collapse/insert range operations
Use ->t_dfops for the collapse and insert range transactions. These
are the only callers of the respective bmap helpers, so replace the
unnecessary dfops parameters with direct accesses to ->t_dfops.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:14 +0000 (22:26 -0700)]
xfs: remove struct xfs_bmalloca dfops field
Now that bma.dfops is only assigned from ->t_dfops, replace all
accesses to the former with the latter and remove the unnecessary
field. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:14 +0000 (22:26 -0700)]
xfs: remove xfs_bmapi_remap() dfops param
All xfs_bmapi_remap() callers already use ->t_dfops. Note that
deferred completion context unconditionally sets ->t_dfops if it
hasn't already been set by the caller. Remove the unnecessary
parameter and access ->t_dfops directly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:13 +0000 (22:26 -0700)]
xfs: remove xfs_bunmapi() dfops param
Now that all xfs_bunmapi() callers use ->t_dfops, remove the
unnecessary parameter and access ->t_dfops directly. This patch does
not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:13 +0000 (22:26 -0700)]
xfs: use ->t_dfops for all xfs_bunmapi() callers
Use ->t_dfops for all remaining xfs_bunmapi() callers. This prepares
the latter to no longer require a dfops parameter.
Note that xfs_itruncate_extents_flags() associates a local dfops
with a transaction provided from the caller. Since there are
multiple callers, set and reset ->t_dfops before the function
returns to avoid exposure of stack memory to the caller.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:12 +0000 (22:26 -0700)]
xfs: remove xfs_bmapi_write() dfops param
Now that all callers use ->t_dfops, the xfs_bmapi_write() dfops
parameter is no longer necessary. Remove it and access ->t_dfops
directly. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:12 +0000 (22:26 -0700)]
xfs: use ->t_dfops for all xfs_bmapi_write() callers
Attach ->t_dfops for all remaining callers of xfs_bmapi_write().
This prepares the latter to no longer require a separate dfops
parameter.
Note that xfs_symlink() already uses ->t_dfops. Fix up the local
references for consistency.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:11 +0000 (22:26 -0700)]
xfs: use ->t_dfops in dqalloc transaction
xfs_dquot_disk_alloc() receives a transaction from the caller and
passes a local dfops along to xfs_bmapi_write(). If we attach this
dfops to the transaction, we have to make sure to clear it before
returning to avoid invalid access of stack memory.
Since xfs_qm_dqread_alloc() is the only caller, pull dfops into the
caller and attach it to the transaction to eliminate this pattern
entirely.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:11 +0000 (22:26 -0700)]
xfs: replace xfs_da_args->dfops accesses with ->t_dfops and remove
Now that xfs_da_args->dfops is always assigned from a ->t_dfops
pointer (or one that is immediately attached), replace all
downstream accesses of the former with the latter and remove the
field from struct xfs_da_args.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:10 +0000 (22:26 -0700)]
xfs: use ->t_dfops in extent split tx and remove param
Attach the local dfops to ->t_dfops of the extent split transaction.
Since this is the only caller of xfs_bmap_split_extent_at(), remove
the dfops parameter as well.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:10 +0000 (22:26 -0700)]
xfs: remove dfops param in attr fork add path
Now that the attribute fork add tx carries dfops along with the
transaction, it is unnecessary to pass it down the stack. Remove the
dfops parameter and access ->t_dfops directly where necessary. This
patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:09 +0000 (22:26 -0700)]
xfs: use ->t_dfops for attr set/remove operations
Attach the local dfops to the transaction allocated for xattr add
and remove operations. Add an earlier initialization in
xfs_attr_remove() to ensure the structure is valid if it remains
unused at transaction commit time.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:09 +0000 (22:26 -0700)]
xfs: use ->t_dfops for recovery of [b|c]ui log items
Log recovery passes down a central dfops structure to recovery
handlers for bui and cui log items. Each of these handlers allocates
and commits a transaction and defers any remaining operations to be
completed by the main recovery sequence.
Since dfops outlives the transaction in this context, set and clear
->t_dfops appropriately such that the *_finish_item() paths and
below (i.e., xfs_bmapi*()) can expect to find the dfops in the
transaction without it being committed with the dfops attached. This
is required because transaction commit expects that an associated
dfops is finished and in this context the dfops may be populated at
commit time.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:08 +0000 (22:26 -0700)]
xfs: remove dfops param from high level dirname calls
All callers of the directory create, rename and remove interfaces
already associate the dfops with the transaction. Drop the dfops
parameters in these calls in preparation for further cleanups in the
layers below. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:07 +0000 (22:26 -0700)]
xfs: remove dfops parameter from ifree call stack
The inode free callchain starting in xfs_inactive_ifree() already
associates its dfops with the transaction. It still passes the dfops
on the stack down through xfs_difree_inobt(), however.
Clean up the call stack and reference dfops directly from the
transaction. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:07 +0000 (22:26 -0700)]
xfs: rename xfs_trans ->t_agfl_dfops to ->t_dfops
The ->t_agfl_dfops field is currently used to defer agfl block frees
from associated transaction contexts. While all known problematic
contexts have already been updated to use ->t_agfl_dfops, the
broader goal is defer agfl frees from all callers that already use a
deferred operations structure. Further, the transaction field
facilitates a good amount of code clean up where the transaction and
dfops have historically been passed down through the stack
separately.
Rename the field to something more generic to prepare to use it as
such throughout XFS. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Brian Foster [Thu, 12 Jul 2018 05:26:06 +0000 (22:26 -0700)]
xfs: cow unwritten conversion uses uninitialized dfops
A couple COW fork unwritten extent conversion helpers pass an
uninitialized dfops pointer to xfs_bmapi_write(). This does not
cause problems because conversion does not use a transaction or the
dfops structure for the COW fork. Drop the uninitialized usage of
dfops in these codepaths and pass NULL along to xfs_bmapi_write()
instead.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:06 +0000 (22:26 -0700)]
xfs: update my copyrights for the writeback and iomap code
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:05 +0000 (22:26 -0700)]
xfs: add support for sub-pagesize writeback without buffer_heads
Switch to using the iomap_page structure for checking sub-page uptodate
status and track sub-page I/O completion status, and remove large
quantities of boilerplate code working around buffer heads.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:05 +0000 (22:26 -0700)]
iomap: add support for sub-pagesize buffered I/O without buffer heads
After already supporting a simple implementation of buffered writes for
the blocksize == PAGE_SIZE case in the last commit this adds full support
even for smaller block sizes. There are three bits of per-block
information in the buffer_head structure that really matter for the iomap
read and write path:
- uptodate status (BH_uptodate)
- marked as currently under read I/O (BH_Async_Read)
- marked as currently under write I/O (BH_Async_Write)
Instead of having new per-block structures this now adds a per-page
structure called struct iomap_page to track this information in a slightly
different form:
- a bitmap for the per-block uptodate status. For worst case of a 64k
page size system this bitmap needs to contain 128 bits. For the
typical 4k page size case it only needs 8 bits, although we still
need a full unsigned long due to the way the atomic bitmap API works.
- two atomic_t counters are used to track the outstanding read and write
counts
There is quite a bit of boilerplate code as the buffered I/O path uses
various helper methods, but the actual code is very straight forward.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:04 +0000 (22:26 -0700)]
xfs: allow writeback on pages without buffer heads
Disable the IOMAP_F_BUFFER_HEAD flag on file systems with a block size
equal to the page size, and deal with pages without buffer heads in
writeback. Thanks to the previous refactoring this is basically trivial
now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:04 +0000 (22:26 -0700)]
xfs: refactor the tail of xfs_writepage_map
Rejuggle how we deal with the different error vs non-error and have
ioends vs not have ioend cases to keep the fast path streamlined, and
the duplicate code at a minimum.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:03 +0000 (22:26 -0700)]
xfs: remove xfs_start_page_writeback
This helper only has two callers, one of them with a constant error
argument. Remove it to make pending changes to the code a little easier.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:03 +0000 (22:26 -0700)]
xfs: move all writeback buffer_head manipulation into xfs_map_at_offset
This keeps it in a single place so it can be made otional more easily.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:02 +0000 (22:26 -0700)]
xfs: don't look at buffer heads in xfs_add_to_ioend
Calculate all information for the bio based on the passed in information
without requiring a buffer_head structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:02 +0000 (22:26 -0700)]
xfs: remove the imap_valid flag
Simplify the way we check for a valid imap - we know we have a valid
mapping after xfs_map_blocks returned successfully, and we know we can
call xfs_imap_valid on any imap, as it will always fail on a
zero-initialized map.
We can also remove the xfs_imap_valid function and fold it into
xfs_map_blocks now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:02 +0000 (22:26 -0700)]
xfs: simplify xfs_map_blocks by using xfs_iext_lookup_extent directly
xfs_bmapi_read adds zero value in xfs_map_blocks. Replace it with a
direct call to the low-level extent lookup function.
Note that we now always pass a 0 length to the trace points as we ask
for an unspecified len.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:01 +0000 (22:26 -0700)]
xfs: remove xfs_reflink_find_cow_mapping
We only have one caller left, and open coding the simple extent list
lookup in it allows us to make the code both more understandable and
reuse calculations and variables already present.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:01 +0000 (22:26 -0700)]
xfs: remove the now unused XFS_BMAPI_IGSTATE flag
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Dave Chinner [Thu, 12 Jul 2018 05:26:00 +0000 (22:26 -0700)]
xfs: make xfs_writepage_map extent map centric
xfs_writepage_map() iterates over the bufferheads on a page to decide
what sort of IO to do and what actions to take. However, when it comes
to reflink and deciding when it needs to execute a COW operation, we no
longer look at the bufferhead state but instead we ignore than and look
up internal state held in the COW fork extent list.
This means xfs_writepage_map() is somewhat confused. It does stuff, then
ignores it, then tries to handle the impedence mismatch by shovelling the
results inside the existing mapping code. It works, but it's a bit of a
mess and it makes it hard to fix the cached map bug that the writepage
code currently has.
To unify the two different mechanisms, we first have to choose a direction.
That's already been set - we're de-emphasising bufferheads so they are no
longer a control structure as we need to do taht to allow for eventual
removal. Hence we need to move away from looking at bufferhead state to
determine what operations we need to perform.
We can't completely get rid of bufferheads yet - they do contain some
state that is absolutely necessary, such as whether that part of the page
contains valid data or not (buffer_uptodate()). Other state in the
bufferhead is redundant:
BH_dirty - the page is dirty, so we can ignore this and just
write it
BH_delay - we have delalloc extent info in the DATA fork extent
tree
BH_unwritten - same as BH_delay
BH_mapped - indicates we've already used it once for IO and it is
mapped to a disk address. Needs to be ignored for COW
blocks.
The BH_mapped flag is an interesting case - it's supposed to indicate that
it's already mapped to disk and so we can just use it "as is". In theory,
we don't even have to do an extent lookup to find where to write it too,
but we have to do that anyway to determine we are actually writing over a
valid extent. Hence it's not even serving the purpose of avoiding a an
extent lookup during writeback, and so we can pretty much ignore it.
Especially as we have to ignore it for COW operations...
Therefore, use the extent map as the source of information to tell us
what actions we need to take and what sort of IO we should perform. The
first step is to have xfs_map_blocks() set the io type according to what
it looks up. This means it can easily handle both normal overwrite and
COW cases. The only thing we also need to add is the ability to return
hole mappings.
We need to return and cache hole mappings now for the case of multiple
blocks per page. We no longer use the BH_mapped to indicate a block over
a hole, so we have to get that info from xfs_map_blocks(). We cache it so
that holes that span two pages don't need separate lookups. This allows us
to avoid ever doing write IO over a hole, too.
Now that we have xfs_map_blocks() returning both a cached map and the type
of IO we need to perform, we can rewrite xfs_writepage_map() to drop all
the bufferhead control. It's also much simplified because it doesn't need
to explicitly handle COW operations. Instead of iterating bufferheads, it
iterates blocks within the page and then looks up what per-block state is
required from the appropriate bufferhead. It then validates the cached
map, and if it's not valid, we get a new map. If we don't get a valid map
or it's over a hole, we skip the block.
At this point, we have to remap the bufferhead via xfs_map_at_offset().
As previously noted, we had to do this even if the buffer was already
mapped as the mapping would be stale for XFS_IO_DELALLOC, XFS_IO_UNWRITTEN
and XFS_IO_COW IO types. With xfs_map_blocks() now controlling the type,
even XFS_IO_OVERWRITE types need remapping, as converted-but-not-yet-
written delalloc extents beyond EOF can be reported at XFS_IO_OVERWRITE.
Bufferheads that span such regions still need their BH_Delay flags cleared
and their block numbers calculated, so we now unconditionally map each
bufferhead before submission.
But wait! There's more - remember the old "treat unwritten extents as
holes on read" hack? Yeah, that means we can have a dirty page with
unmapped, unwritten bufferheads that contain data! What makes these so
special is that the unwritten "hole" bufferheads do not have a valid block
device pointer, so if we attempt to write them xfs_add_to_ioend() blows
up. So we make xfs_map_at_offset() do the "realtime or data device"
lookup from the inode and ignore what was or wasn't put into the
bufferhead when the buffer was instantiated.
The astute reader will have realised by now that this code treats
unwritten extents in multiple-blocks-per-page situations differently.
If we get any combination of unwritten blocks on a dirty page that contain
valid data in the page, we're going to convert them to real extents. This
can actually be a win, because it means that pages with interleaving
unwritten and written blocks will get converted to a single written extent
with zeros replacing the interspersed unwritten blocks. This is actually
good for reducing extent list and conversion overhead, and it means we
issue a contiguous IO instead of lots of little ones. The downside is
that we use up a little extra IO bandwidth. Neither of these seem like a
bad thing given that spinning disks are seek sensitive, and SSDs/pmem have
bandwidth to burn and the lower Io latency/CPU overhead of fewer, larger
IOs will result in better performance on them...
As a result of all this, the only state we actually care about from the
bufferhead is a single flag - BH_Uptodate. We still use the bufferhead to
pass some information to the bio via xfs_add_to_ioend(), but that is
trivial to separate and pass explicitly. This means we really only need
1 bit of state per block per page from the buffered write path in the
writeback path. Everything else we do with the bufferhead is purely to
make the buffered IO front end continue to work correctly. i.e we've
pretty much marginalised bufferheads in the writeback path completely.
Signed-off-By: Dave Chinner <dchinner@redhat.com>
[hch: forward port, refactor and split off bits into other commits]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:26:00 +0000 (22:26 -0700)]
xfs: rename the offset variable in xfs_writepage_map
Calling it file_offset makes the usage more clear, especially with
a new poffset variable that will be added soon for the offset inside
the page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:59 +0000 (22:25 -0700)]
xfs: remove xfs_map_cow
We can handle the existing cow mapping case as a special case directly
in xfs_writepage_map, and share code for allocating delalloc blocks
with regular I/O in xfs_map_blocks. This means we need to always
call xfs_map_blocks for reflink inodes, but we can still skip most of
the work if it turns out that there is no COW mapping overlapping the
current block.
As a subtle detail we need to start caching holes in the wpc to deal
with the case of COW reservations between EOF. But we'll need that
infrastructure later anyway, so this is no big deal.
Based on a patch from Dave Chinner.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:59 +0000 (22:25 -0700)]
xfs: remove xfs_reflink_trim_irec_to_next_cow
We already have to check for overlapping COW extents everytime we
come back to a page in xfs_writepage_map / xfs_map_cow, so this
additional trim is not required.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:59 +0000 (22:25 -0700)]
xfs: don't use XFS_BMAPI_IGSTATE in xfs_map_blocks
We want to be able to use the extent state as a reliably indicator for
the type of I/O, and stop using the buffer head state. For this we
need to stop using the XFS_BMAPI_IGSTATE so that we don't see merged
extents of different types.
Based on a patch from Dave Chinner.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:58 +0000 (22:25 -0700)]
xfs: don't clear imap_valid for a non-uptodate buffers
Finding a buffer that isn't uptodate doesn't invalidate the mapping for
any given block. The last_sector check will already take care of starting
another ioend as soon as we find any non-update buffer, and if the current
mapping doesn't include the next uptodate buffer the xfs_imap_valid check
will take care of it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:58 +0000 (22:25 -0700)]
xfs: do not set the page uptodate in xfs_writepage_map
We already track the page uptodate status based on the buffer uptodate
status, which is updated whenever reading or zeroing blocks.
This code has been there since commit a ptool commit in 2002, which
claims to:
"merge" the 2.4 fsx fix for block size < page size to 2.5. This needed
major changes to actually fit.
and isn't present in other writepage implementations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:57 +0000 (22:25 -0700)]
xfs: move locking into xfs_bmap_punch_delalloc_range
Both callers want the same looking, so do it only once.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:57 +0000 (22:25 -0700)]
xfs: simplify xfs_aops_discard_page
Instead of looking at the buffer heads to see if a block is delalloc just
call xfs_bmap_punch_delalloc_range on the whole page - this will leave
any non-delalloc block intact and handle the iteration for us. As a side
effect one more place stops caring about buffer heads and we can remove the
xfs_check_page_type function entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Christoph Hellwig [Thu, 12 Jul 2018 05:25:56 +0000 (22:25 -0700)]
xfs: use iomap for blocksize == PAGE_SIZE readpage and readpages
For file systems with a block size that equals the page size we never do
partial reads, so we can use the buffer_head-less iomap versions of
readpage and readpages without conflicting with the buffer_head structures
create later in write_begin.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Darrick J. Wong [Thu, 12 Jul 2018 05:24:40 +0000 (22:24 -0700)]
Merge branch 'iomap-4.19-merge' into xfs-4.19-merge
Linus Torvalds [Sun, 8 Jul 2018 23:34:02 +0000 (16:34 -0700)]
Linux 4.18-rc4
Linus Torvalds [Sun, 8 Jul 2018 21:12:46 +0000 (14:12 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A small collection of fixes, sort of the usual at this point, all for
i.MX or OMAP:
- Enable ULPI drivers on i.MX to avoid a hang
- Pinctrl fix for touchscreen on i.MX51 ZII RDU1
- Fixes for ethernet clock references on am3517
- mmc0 write protect detection fix for am335x
- kzalloc->kcalloc conversion in an OMAP driver
- USB metastability fix for USB on dra7
- Fix touchscreen wakeup on am437x"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: imx_v4_v5_defconfig: Select ULPI support
ARM: imx_v6_v7_defconfig: Select ULPI support
ARM: dts: omap3: Fix am3517 mdio and emac clock references
ARM: dts: am335x-bone-common: Fix mmc0 Write Protect
bus: ti-sysc: Use 2-factor allocator arguments
ARM: dts: dra7: Disable metastability workaround for USB2
ARM: dts: imx51-zii-rdu1: fix touchscreen pinctrl
ARM: dts: am437x: make edt-ft5x06 a wakeup source
Linus Torvalds [Sun, 8 Jul 2018 20:56:25 +0000 (13:56 -0700)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner:
"Two small fixes correcting the handling of SSB mitigations on AMD
processors"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR
x86/bugs: Update when to check for the LS_CFG SSBD mitigation
Linus Torvalds [Sun, 8 Jul 2018 20:26:55 +0000 (13:26 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
- Prevent an out-of-bounds access in mtrr_write()
- Break a circular dependency in the new hyperv IPI acceleration code
- Address the build breakage related to inline functions by enforcing
gnu_inline and explicitly bringing native_save_fl() out of line,
which also adds a set of _ARM_ARG macros which provide 32/64bit
safety.
- Initialize the shadow CR4 per cpu variable before using it.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mtrr: Don't copy out-of-bounds data in mtrr_write
x86/hyper-v: Fix the circular dependency in IPI enlightenment
x86/paravirt: Make native_save_fl() extern inline
x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>
compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations
x86/mm/32: Initialize the CR4 shadow before __flush_tlb_all()
Linus Torvalds [Sun, 8 Jul 2018 19:41:23 +0000 (12:41 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
- The hopefully final fix for the reported race problems in
kthread_parkme(). The previous attempt still left a hole and was
partially wrong.
- Plug a race in the remote tick mechanism which triggers a warning
about updates not being done correctly. That's a false positive if
the race condition is hit as the remote CPU is idle. Plug it by
checking the condition again when holding run queue lock.
- Fix a bug in the utilization estimation of a run queue which causes
the estimation to be 0 when a run queue is throttled.
- Advance the global expiration of the period timer when the timer is
restarted after a idle period. Otherwise the expiry time is stale and
the timer fires prematurely.
- Cure the drift between the bandwidth timer and the runqueue
accounting, which leads to bogus throttling of runqueues
- Place the call to cpufreq_update_util() correctly so the function
will observe the correct number of running RT tasks and not a stale
one.
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread, sched/core: Fix kthread_parkme() (again...)
sched/util_est: Fix util_est_dequeue() for throttled cfs_rq
sched/fair: Advance global expiration when period timer is restarted
sched/fair: Fix bandwidth timer clock drift condition
sched/rt: Fix call to cpufreq_update_util()
sched/nohz: Skip remote tick on idle task entirely
Linus Torvalds [Sun, 8 Jul 2018 18:57:40 +0000 (11:57 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull objtool fix from Thomas Gleixner:
"A single fix for objtool to address a bug in handling the cold
subfunction detection for aliased functions which was added recently.
The bug causes objtool to enter an infinite loop"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Support GCC 8 '-fnoreorder-functions'
Linus Torvalds [Sun, 8 Jul 2018 18:29:14 +0000 (11:29 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- add missing RETs in x86 aegis/morus
- fix build error in arm speck
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86 - Add missing RETs
crypto: arm/speck - fix building in Thumb2 mode
Linus Torvalds [Sun, 8 Jul 2018 18:10:30 +0000 (11:10 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
"Bug fixes for ext4; most of which relate to vulnerabilities where a
maliciously crafted file system image can result in a kernel OOPS or
hang.
At least one fix addresses an inline data bug could be triggered by
userspace without the need of a crafted file system (although it does
require that the inline data feature be enabled)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: check superblock mapped prior to committing
ext4: add more mount time checks of the superblock
ext4: add more inode number paranoia checks
ext4: avoid running out of journal credits when appending to an inline file
jbd2: don't mark block as modified if the handle is out of credits
ext4: never move the system.data xattr out of the inode body
ext4: clear i_data in ext4_inode_info when removing inline data
ext4: include the illegal physical block in the bad map ext4_error msg
ext4: verify the depth of extent tree in ext4_find_extent()
ext4: only look at the bg_flags field if it is valid
ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
ext4: always check block group bounds in ext4_init_block_bitmap()
ext4: always verify the magic number in xattr blocks
ext4: add corruption check in ext4_xattr_set_entry()
ext4: add warn_on_error mount option
Linus Torvalds [Sun, 8 Jul 2018 17:55:21 +0000 (10:55 -0700)]
Merge tag 'pci-v4.18-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Fix a use-after-free in the endpoint code (Dan Carpenter)
- Stop defaulting CONFIG_PCIE_DW_PLAT_HOST to yes (Geert Uytterhoeven)
- Fix an nfp regression caused by a change in how we limit the number
of VFs we can enable (Jakub Kicinski)
- Fix failure path cleanup issues in the new R-Car gen3 PHY support
(Marek Vasut)
- Fix leaks of OF nodes in faraday, xilinx-nwl, xilinx (Nicholas Mc
Guire)
* tag 'pci-v4.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
nfp: stop limiting VFs to 0
PCI/IOV: Reset total_VFs limit after detaching PF driver
PCI: faraday: Add missing of_node_put()
PCI: xilinx-nwl: Add missing of_node_put()
PCI: xilinx: Add missing of_node_put()
PCI: endpoint: Use after free in pci_epf_unregister_driver()
PCI: controller: dwc: Do not let PCIE_DW_PLAT_HOST default to yes
PCI: rcar: Clean up PHY init on failure
PCI: rcar: Shut the PHY down in failpath
Linus Torvalds [Sun, 8 Jul 2018 01:31:34 +0000 (18:31 -0700)]
Merge tag '4.18-rc3-smb3fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Five smb3/cifs fixes for stable (including for some leaks and memory
overwrites) and also a few fixes for recent regressions in packet
signing.
Additional testing at the recent SMB3 test event, and some good work
by Paulo and others spotted the issues fixed here. In addition to my
xfstest runs on these, Aurelien and Stefano did additional test runs
to verify this set"
* tag '4.18-rc3-smb3fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix stack out-of-bounds in smb{2,3}_create_lease_buf()
cifs: Fix infinite loop when using hard mount option
cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
cifs: Fix memory leak in smb2_set_ea()
cifs: fix SMB1 breakage
cifs: Fix validation of signed data in smb2
cifs: Fix validation of signed data in smb3+
cifs: Fix use after free of a mid_q_entry
Linus Torvalds [Sun, 8 Jul 2018 00:55:16 +0000 (17:55 -0700)]
Merge tag 'dma-mapping-4.18-3' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Revert an incorrect dma-mapping commit for 4.18-rc"
* tag 'dma-mapping-4.18-3' of git://git.infradead.org/users/hch/dma-mapping:
Revert "iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent()"
Linus Torvalds [Sun, 8 Jul 2018 00:29:08 +0000 (17:29 -0700)]
Merge tag 'dmaengine-fix-4.18-rc4' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"We have few odd driver fixes and one email update change for you this
time:
- Driver fixes for k3dma (off by one), pl330 (burst residue
granularity) and omap-dma (incorrect residue_granularity)
- Sinan's email update"
* tag 'dmaengine-fix-4.18-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: k3dma: Off by one in k3_of_dma_simple_xlate()
dmaengine: pl330: report BURST residue granularity
MAINTAINERS: Update email-id of Sinan Kaya
dmaengine: ti: omap-dma: Fix OMAP1510 incorrect residue_granularity
Linus Torvalds [Sun, 8 Jul 2018 00:15:38 +0000 (17:15 -0700)]
Merge tag 'for-linus-4.18-2' of git://github.com/cminyard/linux-ipmi
Pull IPMI fixes from Corey Minyard:
"A couple of small fixes: one to the BMC side of things that fixes an
interrupt issue, and one oops fix if init fails in a certain way on
the client driver"
* tag 'for-linus-4.18-2' of git://github.com/cminyard/linux-ipmi:
ipmi: kcs_bmc: fix IRQ exception if the channel is not open
ipmi: Cleanup oops on initialization failure
Linus Torvalds [Sat, 7 Jul 2018 17:51:25 +0000 (10:51 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 LDFLAGS clean-up from Catalin Marinas:
- use aarch64elf instead of aarch64linux
- move endianness options to LDFLAGS instead from LD
- remove no-op '-p' linker flag
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: remove no-op -p linker flag
arm64: add endianness option to LDFLAGS instead of LD
arm64: Use aarch64elf and aarch64elfb emulation mode variants
Jann Horn [Fri, 6 Jul 2018 21:50:03 +0000 (23:50 +0200)]
x86/mtrr: Don't copy out-of-bounds data in mtrr_write
Don't access the provided buffer out of bounds - this can cause a kernel
out-of-bounds read when invoked through sys_splice() or other things that
use kernel_write()/__kernel_write().
Fixes:
7f8ec5a4f01a ("x86/mtrr: Convert to use strncpy_from_user() helper")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180706215003.156702-1-jannh@google.com
Linus Torvalds [Sat, 7 Jul 2018 02:45:47 +0000 (19:45 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is two minor bug fixes (aacraid, target) and a fix for a
potential exploit in the way sg handles teardown"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sg: mitigate read/write abuse
scsi: aacraid: Fix PD performance regression over incorrect qd being set
scsi: target: Fix truncated PR-in ReadKeys response
Linus Torvalds [Sat, 7 Jul 2018 02:13:42 +0000 (19:13 -0700)]
Merge tag 'for-linus-
20180706' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Two minor fixes for this series:
- add LOOP_SET_BLOCK_SIZE as compat ioctl (Evan Green)
- drbd use-after-free fix (Lars Ellenberg)"
* tag 'for-linus-
20180706' of git://git.kernel.dk/linux-block:
loop: Add LOOP_SET_BLOCK_SIZE in compat ioctl
drbd: fix access after free
Linus Torvalds [Fri, 6 Jul 2018 19:32:17 +0000 (12:32 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"The usual collection of driver fixlets:
- build cleanup/fix for the sunxi makefile that tried to save size
but failed and prevented dead code elimination from working
- two Davinci clk driver fixes for a typo causing build failures in
different configurations and an error check that checks the wrong
variable.
- undo the DT ABI breaking imx6ul binding header shuffle that got
merged this cycle"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
dt-bindings: clock: imx6ul: Do not change the clock definition order
clk: davinci: fix a typo (which leads to build failures)
clk: davinci: cfgchip: testing the wrong variable
clk: sunxi-ng: replace lib-y with obj-y
Linus Torvalds [Fri, 6 Jul 2018 19:23:53 +0000 (12:23 -0700)]
Merge tag 'vfio-v4.18-rc4' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- Make vfio-pci IGD extensions optional via Kconfig (Alex Williamson)
- Remove unused and soon to be removed map_atomic callback from mbochs
sample driver, add unmap callback to avoid dmabuf leaks (Gerd
Hoffmann)
- Fix usage of get_user_pages_longterm() (Jason Gunthorpe)
- Fix sample mbochs driver vm_operations_struct.fault return type
(Souptick Joarder)
* tag 'vfio-v4.18-rc4' of git://github.com/awilliam/linux-vfio:
sample/vfio-mdev: Change return type to vm_fault_t
vfio: Use get_user_pages_longterm correctly
sample/mdev/mbochs: add mbochs_kunmap_dmabuf
sample/mdev/mbochs: remove mbochs_kmap_atomic_dmabuf
vfio/pci: Make IGD support a configurable option
Linus Torvalds [Fri, 6 Jul 2018 16:14:34 +0000 (09:14 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A few more changes for v4.18:
- wire up the two new system calls io_pgetevents and rseq
- fix a register corruption in the expolines code for machines
without EXRL
- drastically reduce the memory utilization of the dasd driver
- fix reference counting for KVM page table pages"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: wire up rseq system call
s390: wire up io_pgetevents system call
s390/mm: fix refcount usage for 4K pgste
s390/dasd: reduce the default queue depth and nr of hardware queues
s390: Correct register corruption in critical section cleanup
K. Y. Srinivasan [Tue, 3 Jul 2018 23:01:55 +0000 (16:01 -0700)]
x86/hyper-v: Fix the circular dependency in IPI enlightenment
The IPI hypercalls depend on being able to map the Linux notion of CPU ID
to the hypervisor's notion of the CPU ID. The array hv_vp_index[] provides
this mapping. Code for populating this array depends on the IPI functionality.
Break this circular dependency.
[ tglx: Use a proper define instead of '-1' with a u32 variable as pointed
out by Vitaly ]
Fixes:
68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments")
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Cc: gregkh@linuxfoundation.org
Cc: devel@linuxdriverproject.org
Cc: olaf@aepfle.de
Cc: apw@canonical.com
Cc: jasowang@redhat.com
Cc: hpa@zytor.com
Cc: sthemmin@microsoft.com
Cc: Michael.H.Kelley@microsoft.com
Cc: vkuznets@redhat.com
Link: https://lkml.kernel.org/r/20180703230155.15160-1-kys@linuxonhyperv.com
Linus Torvalds [Fri, 6 Jul 2018 02:43:29 +0000 (19:43 -0700)]
Merge tag 'drm-fixes-2018-07-06' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This is the drm fixes for rc4.
It's a bit larger than I'd like but the exynos cleanups are pretty
mechanical, and I'd rather have them in sooner rather than later so we
can avoid too much conflicts around them. The non-mechanincal exynos
changes are mostly fixes for new feature recently introduced.
Apart from the exynos updates, we have:
i915:
- GVT and GGTT mapping fixes
amdgpu:
- fix HDMI2.0 4K@60 Hz regression
- Hotplug fixes for dual-GPU laptops to make power management better
- misc vega12 bios fixes, a race fix and some typos.
sii8620 bridge:
- small fixes around mode setting
core:
- use kvzalloc to allocate blob property memory"
* tag 'drm-fixes-2018-07-06' of git://anongit.freedesktop.org/drm/drm: (34 commits)
drm/amd/display: add a check for display depth validity
drm/amd/display: adding ycbcr420 pixel encoding for hdmi
drm/udl: fix display corruption of the last line
drm/bridge/sii8620: Fix link mode selection
drm/bridge/sii8620: Fix display of packed pixel modes
drm/bridge/sii8620: Send AVI infoframe in all MHL versions
drm/amdgpu: fix user fence write race condition
drm/i915: Try GGTT mmapping whole object as partial
drm/amdgpu/pm: fix display count in non-DC path
drm/amdgpu: fix swapped emit_ib_size in vce3
drm: Use kvzalloc for allocating blob property memory
drm/i915/gvt: changed DDI mode emulation type
drm/i915/gvt: fix a bug of partially write ggtt enties
drm/exynos: Replace drm_dev_unref with drm_dev_put
drm/exynos: Replace drm_gem_object_unreference_unlocked with put function
drm/exynos: Replace drm_framebuffer_{un/reference} with put,get functions
drm/exynos: ipp: use correct enum type
drm/exynos: decon5433: Fix WINCONx reset value
drm/exynos: decon5433: Fix per-plane global alpha for XRGB modes
drm/exynos: fimc: Use real buffer width for configuring the hardware
...
Linus Torvalds [Fri, 6 Jul 2018 02:29:07 +0000 (19:29 -0700)]
Merge tag 'trace-v4.18-rc3' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes and cleanups from Steven Rostedt:
"While cleaning out my INBOX, I found a few patches that were lost in
the noise. These are minor bug fixes and clean ups. Those include:
- avoid a string overflow
- code that didn't match the comment (but should)
- a small code optimization (use of a conditional)
- quiet printf warnings
- nuke unused code
- fix function graph interrupt annotation"
* tag 'trace-v4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix missing return symbol in function_graph output
ftrace: Nuke clear_ftrace_function
tracing: Use __printf markup to silence compiler
tracing: Optimize trace_buffer_iter() logic
tracing: Make create_filter() code match the comments
tracing: Avoid string overflow
Dave Airlie [Fri, 6 Jul 2018 00:46:58 +0000 (10:46 +1000)]
Merge tag 'exynos-drm-fixes-for-v4.18-rc4' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes
Fixups
- Fix several problems to IPPv2 merged to mainline recentely.
. An align problem of width size that IPP driver incorrectly
calculated the real buffer size.
. Horizontal and vertical flip problem.
. Per-plane global alpha for XRGB modes.
. Incorrect variant of the YUV modes.
- Fix plane overlapping problem.
. The stange order of overlapping planes on XRGB modes
by setting global alpha value to maximum value.
Cleanup
- Rename a enum type, drm_ipp_size_id, to one specific to Exynos,
drm_exynos_ipp_limit_type.
- Replace {un/reference} with {put,get} functions.
. it replaces several reference/unreference functions with Linux
kernel nameing standard.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1530512041-21392-1-git-send-email-inki.dae@samsung.com
Dave Airlie [Fri, 6 Jul 2018 00:44:35 +0000 (10:44 +1000)]
Merge branch 'drm-fixes-4.18' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
- Fix an HDMI 2.0 4k@60 regression
- Hotplug fixes for PX/HG laptops
- Fixes for vbios changes in vega12
- Fix a race in the user fence code
- Fix a couple of misc typos
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705155206.2752-1-alexander.deucher@amd.com
Dave Airlie [Fri, 6 Jul 2018 00:44:04 +0000 (10:44 +1000)]
Merge tag 'drm-intel-fixes-2018-07-05' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
A couple of GVT fixes, and a GGTT mmapping fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8736wxq35t.fsf@intel.com
Dave Airlie [Fri, 6 Jul 2018 00:41:12 +0000 (10:41 +1000)]
Merge tag 'drm-misc-fixes-2018-07-05' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Fixes for v4.18-rc4:
- A few small fixes for the sii8620 bridge.
- Allocate blob property memory using kvzalloc instead of kmalloc.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4267636e-bb7c-8f69-eeff-12e045b3e7e1@linux.intel.com
Olof Johansson [Thu, 5 Jul 2018 21:59:20 +0000 (14:59 -0700)]
Merge tag 'omap-for-v4.18/fixes-signed' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omap for v4.18-rc cycle
Few dts fixes for regressions for various SoCs and
devices for touchscreen wake, dra7 USB quirk, pinmux
for beaglebone mmc, and emac clock.
Also included is a change for ti-sysc to use kcalloc
that Kees wanted to get into v4.18 as that's the last
one he wanted to fix for improved defense against
allocation overflows.
* tag 'omap-for-v4.18/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap3: Fix am3517 mdio and emac clock references
ARM: dts: am335x-bone-common: Fix mmc0 Write Protect
bus: ti-sysc: Use 2-factor allocator arguments
ARM: dts: dra7: Disable metastability workaround for USB2
ARM: dts: am437x: make edt-ft5x06 a wakeup source
Signed-off-by: Olof Johansson <olof@lixom.net>
Linus Torvalds [Wed, 4 Jul 2018 00:10:19 +0000 (17:10 -0700)]
Fix up non-directory creation in SGID directories
sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid. This is historically used for
group-shared directories.
But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).
Reported-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Thu, 5 Jul 2018 19:29:55 +0000 (13:29 -0600)]
Revert "iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent()"
This commit may cause a less than required dma mask to be used for
some allocations, which apparently leads to module load failures for
iwlwifi sometimes.
This reverts commit
d657c5c73ca987214a6f9436e435b34fc60f332a.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Fabio Coatti <fabio.coatti@gmail.com>
Tested-by: Fabio Coatti <fabio.coatti@gmail.com>
Stefano Brivio [Thu, 5 Jul 2018 13:10:02 +0000 (15:10 +0200)]
cifs: Fix stack out-of-bounds in smb{2,3}_create_lease_buf()
smb{2,3}_create_lease_buf() store a lease key in the lease
context for later usage on a lease break.
In most paths, the key is currently sourced from data that
happens to be on the stack near local variables for oplock in
SMB2_open() callers, e.g. from open_shroot(), whereas
smb2_open_file() properly allocates space on its stack for it.
The address of those local variables holding the oplock is then
passed to create_lease_buf handlers via SMB2_open(), and 16
bytes near oplock are used. This causes a stack out-of-bounds
access as reported by KASAN on SMB2.1 and SMB3 mounts (first
out-of-bounds access is shown here):
[ 111.528823] BUG: KASAN: stack-out-of-bounds in smb3_create_lease_buf+0x399/0x3b0 [cifs]
[ 111.530815] Read of size 8 at addr
ffff88010829f249 by task mount.cifs/985
[ 111.532838] CPU: 3 PID: 985 Comm: mount.cifs Not tainted 4.18.0-rc3+ #91
[ 111.534656] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 111.536838] Call Trace:
[ 111.537528] dump_stack+0xc2/0x16b
[ 111.540890] print_address_description+0x6a/0x270
[ 111.542185] kasan_report+0x258/0x380
[ 111.544701] smb3_create_lease_buf+0x399/0x3b0 [cifs]
[ 111.546134] SMB2_open+0x1ef8/0x4b70 [cifs]
[ 111.575883] open_shroot+0x339/0x550 [cifs]
[ 111.591969] smb3_qfs_tcon+0x32c/0x1e60 [cifs]
[ 111.617405] cifs_mount+0x4f3/0x2fc0 [cifs]
[ 111.674332] cifs_smb3_do_mount+0x263/0xf10 [cifs]
[ 111.677915] mount_fs+0x55/0x2b0
[ 111.679504] vfs_kern_mount.part.22+0xaa/0x430
[ 111.684511] do_mount+0xc40/0x2660
[ 111.698301] ksys_mount+0x80/0xd0
[ 111.701541] do_syscall_64+0x14e/0x4b0
[ 111.711807] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 111.713665] RIP: 0033:0x7f372385b5fa
[ 111.715311] Code: 48 8b 0d 99 78 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 66 78 2c 00 f7 d8 64 89 01 48
[ 111.720330] RSP: 002b:
00007ffff27049d8 EFLAGS:
00000206 ORIG_RAX:
00000000000000a5
[ 111.722601] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f372385b5fa
[ 111.724842] RDX:
000055c2ecdc73b2 RSI:
000055c2ecdc73f9 RDI:
00007ffff270580f
[ 111.727083] RBP:
00007ffff2705804 R08:
000055c2ee976060 R09:
0000000000001000
[ 111.729319] R10:
0000000000000000 R11:
0000000000000206 R12:
00007f3723f4d000
[ 111.731615] R13:
000055c2ee976060 R14:
00007f3723f4f90f R15:
0000000000000000
[ 111.735448] The buggy address belongs to the page:
[ 111.737420] page:
ffffea000420a7c0 count:0 mapcount:0 mapping:
0000000000000000 index:0x0
[ 111.739890] flags: 0x17ffffc0000000()
[ 111.741750] raw:
0017ffffc0000000 0000000000000000 dead000000000200 0000000000000000
[ 111.744216] raw:
0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 111.746679] page dumped because: kasan: bad access detected
[ 111.750482] Memory state around the buggy address:
[ 111.752562]
ffff88010829f100: 00 f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00
[ 111.754991]
ffff88010829f180: 00 00 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
[ 111.757401] >
ffff88010829f200: 00 00 00 00 00 f1 f1 f1 f1 01 f2 f2 f2 f2 f2 f2
[ 111.759801] ^
[ 111.762034]
ffff88010829f280: f2 02 f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00
[ 111.764486]
ffff88010829f300: f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 111.766913] ==================================================================
Lease keys are however already generated and stored in fid data
on open and create paths: pass them down to the lease context
creation handlers and use them.
Suggested-by: Aurélien Aptel <aaptel@suse.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Fixes:
b8c32dbb0deb ("CIFS: Request SMB2.1 leases")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Thu, 5 Jul 2018 16:46:34 +0000 (13:46 -0300)]
cifs: Fix infinite loop when using hard mount option
For every request we send, whether it is SMB1 or SMB2+, we attempt to
reconnect tcon (cifs_reconnect_tcon or smb2_reconnect) before carrying
out the request.
So, while server->tcpStatus != CifsNeedReconnect, we wait for the
reconnection to succeed on wait_event_interruptible_timeout(). If it
returns, that means that either the condition was evaluated to true, or
timeout elapsed, or it was interrupted by a signal.
Since we're not handling the case where the process woke up due to a
received signal (-ERESTARTSYS), the next call to
wait_event_interruptible_timeout() will _always_ fail and we end up
looping forever inside either cifs_reconnect_tcon() or smb2_reconnect().
Here's an example of how to trigger that:
$ mount.cifs //foo/share /mnt/test -o
username=foo,password=foo,vers=1.0,hard
(break connection to server before executing bellow cmd)
$ stat -f /mnt/test & sleep 140
[1] 2511
$ ps -aux -q 2511
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2511 0.0 0.0 12892 1008 pts/0 S 12:24 0:00 stat -f
/mnt/test
$ kill -9 2511
(wait for a while; process is stuck in the kernel)
$ ps -aux -q 2511
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2511 83.2 0.0 12892 1008 pts/0 R 12:24 30:01 stat -f
/mnt/test
By using 'hard' mount point means that cifs.ko will keep retrying
indefinitely, however we must allow the process to be killed otherwise
it would hang the system.
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Cc: stable@vger.kernel.org
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Stefano Brivio [Thu, 5 Jul 2018 09:46:42 +0000 (11:46 +0200)]
cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
A "small" CIFS buffer is not big enough in general to hold a
setacl request for SMB2, and we end up overflowing the buffer in
send_set_info(). For instance:
# mount.cifs //127.0.0.1/test /mnt/test -o username=test,password=test,nounix,cifsacl
# touch /mnt/test/acltest
# getcifsacl /mnt/test/acltest
REVISION:0x1
CONTROL:0x9004
OWNER:S-1-5-21-
2926364953-
924364008-
418108241-1000
GROUP:S-1-22-2-1001
ACL:S-1-5-21-
2926364953-
924364008-
418108241-1000:ALLOWED/0x0/0x1e01ff
ACL:S-1-22-2-1001:ALLOWED/0x0/R
ACL:S-1-22-2-1001:ALLOWED/0x0/R
ACL:S-1-5-21-
2926364953-
924364008-
418108241-1000:ALLOWED/0x0/0x1e01ff
ACL:S-1-1-0:ALLOWED/0x0/R
# setcifsacl -a "ACL:S-1-22-2-1004:ALLOWED/0x0/R" /mnt/test/acltest
this setacl will cause the following KASAN splat:
[ 330.777927] BUG: KASAN: slab-out-of-bounds in send_set_info+0x4dd/0xc20 [cifs]
[ 330.779696] Write of size 696 at addr
ffff88010d5e2860 by task setcifsacl/1012
[ 330.781882] CPU: 1 PID: 1012 Comm: setcifsacl Not tainted 4.18.0-rc2+ #2
[ 330.783140] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 330.784395] Call Trace:
[ 330.784789] dump_stack+0xc2/0x16b
[ 330.786777] print_address_description+0x6a/0x270
[ 330.787520] kasan_report+0x258/0x380
[ 330.788845] memcpy+0x34/0x50
[ 330.789369] send_set_info+0x4dd/0xc20 [cifs]
[ 330.799511] SMB2_set_acl+0x76/0xa0 [cifs]
[ 330.801395] set_smb2_acl+0x7ac/0xf30 [cifs]
[ 330.830888] cifs_xattr_set+0x963/0xe40 [cifs]
[ 330.840367] __vfs_setxattr+0x84/0xb0
[ 330.842060] __vfs_setxattr_noperm+0xe6/0x370
[ 330.843848] vfs_setxattr+0xc2/0xd0
[ 330.845519] setxattr+0x258/0x320
[ 330.859211] path_setxattr+0x15b/0x1b0
[ 330.864392] __x64_sys_setxattr+0xc0/0x160
[ 330.866133] do_syscall_64+0x14e/0x4b0
[ 330.876631] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 330.878503] RIP: 0033:0x7ff2e507db0a
[ 330.880151] Code: 48 8b 0d 89 93 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 bc 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 56 93 2c 00 f7 d8 64 89 01 48
[ 330.885358] RSP: 002b:
00007ffdc4903c18 EFLAGS:
00000246 ORIG_RAX:
00000000000000bc
[ 330.887733] RAX:
ffffffffffffffda RBX:
000055d1170de140 RCX:
00007ff2e507db0a
[ 330.890067] RDX:
000055d1170de7d0 RSI:
000055d115b39184 RDI:
00007ffdc4904818
[ 330.892410] RBP:
0000000000000001 R08:
0000000000000000 R09:
000055d1170de7e4
[ 330.894785] R10:
00000000000002b8 R11:
0000000000000246 R12:
0000000000000007
[ 330.897148] R13:
000055d1170de0c0 R14:
0000000000000008 R15:
000055d1170de550
[ 330.901057] Allocated by task 1012:
[ 330.902888] kasan_kmalloc+0xa0/0xd0
[ 330.904714] kmem_cache_alloc+0xc8/0x1d0
[ 330.906615] mempool_alloc+0x11e/0x380
[ 330.908496] cifs_small_buf_get+0x35/0x60 [cifs]
[ 330.910510] smb2_plain_req_init+0x4a/0xd60 [cifs]
[ 330.912551] send_set_info+0x198/0xc20 [cifs]
[ 330.914535] SMB2_set_acl+0x76/0xa0 [cifs]
[ 330.916465] set_smb2_acl+0x7ac/0xf30 [cifs]
[ 330.918453] cifs_xattr_set+0x963/0xe40 [cifs]
[ 330.920426] __vfs_setxattr+0x84/0xb0
[ 330.922284] __vfs_setxattr_noperm+0xe6/0x370
[ 330.924213] vfs_setxattr+0xc2/0xd0
[ 330.926008] setxattr+0x258/0x320
[ 330.927762] path_setxattr+0x15b/0x1b0
[ 330.929592] __x64_sys_setxattr+0xc0/0x160
[ 330.931459] do_syscall_64+0x14e/0x4b0
[ 330.933314] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 330.936843] Freed by task 0:
[ 330.938588] (stack is not available)
[ 330.941886] The buggy address belongs to the object at
ffff88010d5e2800
which belongs to the cache cifs_small_rq of size 448
[ 330.946362] The buggy address is located 96 bytes inside of
448-byte region [
ffff88010d5e2800,
ffff88010d5e29c0)
[ 330.950722] The buggy address belongs to the page:
[ 330.952789] page:
ffffea0004357880 count:1 mapcount:0 mapping:
ffff880108fdca80 index:0x0 compound_mapcount: 0
[ 330.955665] flags: 0x17ffffc0008100(slab|head)
[ 330.957760] raw:
0017ffffc0008100 dead000000000100 dead000000000200 ffff880108fdca80
[ 330.960356] raw:
0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[ 330.963005] page dumped because: kasan: bad access detected
[ 330.967039] Memory state around the buggy address:
[ 330.969255]
ffff88010d5e2880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 330.971833]
ffff88010d5e2900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 330.974397] >
ffff88010d5e2980: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[ 330.976956] ^
[ 330.979226]
ffff88010d5e2a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 330.981755]
ffff88010d5e2a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 330.984225] ==================================================================
Fix this by allocating a regular CIFS buffer in
smb2_plain_req_init() if the request command is SMB2_SET_INFO.
Reported-by: Jianhong Yin <jiyin@redhat.com>
Fixes:
366ed846df60 ("cifs: Use smb 2 - 3 and cifsacl mount options setacl function")
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-and-tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Wed, 4 Jul 2018 17:16:16 +0000 (14:16 -0300)]
cifs: Fix memory leak in smb2_set_ea()
This patch fixes a memory leak when doing a setxattr(2) in SMB2+.
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>