platform/upstream/kernel-adaptation-pc.git
Trond Myklebust [Wed, 7 Mar 2012 21:39:06 +0000 (16:39 -0500)]
NFSv4: Return the delegation if the server returns NFS4ERR_OPENMODE

If a setattr() fails because of an NFS4ERR_OPENMODE error, it is
probably due to us holding a read delegation. Ensure that the
recovery routines return that delegation in this case.

Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Trond Myklebust [Wed, 7 Mar 2012 18:49:12 +0000 (13:49 -0500)]
NFSv4: Don't free the nfs4_lock_state until after the release_lockowner

Otherwise we can end up with sequence id problems if the client reuses
the owner_id before the server has processed the release_lockowner

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Andy Adamson [Wed, 7 Mar 2012 15:49:41 +0000 (10:49 -0500)]
NFSv4.1 handle DS stateid errors

Handle DS READ and WRITE stateid errors by recovering the stateid on the MDS.

NFS4ERR_OLD_STATEID is ignored as the client always sends a
state sequenceid of zero for DS READ and WRITE stateids.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Wed, 7 Mar 2012 02:58:20 +0000 (21:58 -0500)]
NFS: add fh_crc to debug output

Print the filehandle crc in two debug messages

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Wed, 7 Mar 2012 01:46:43 +0000 (20:46 -0500)]
NFS: add filehandle crc for debug display

Match wireshark's CRC-32 hash for easier debugging
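As a rough user-space illustration of such a hash, the sketch below computes the common CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF) over an opaque buffer. That this is the variant Wireshark displays is an assumption here, and fh_crc32() is a hypothetical name, not the kernel helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: standard CRC-32 over an opaque filehandle buffer. */
static uint32_t fh_crc32(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* initial value */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        /* Process one bit at a time, least significant bit first. */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;                             /* final XOR */
}
```

A bitwise loop keeps the sketch short; a table-driven version would be the usual choice where speed matters.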

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 5 Mar 2012 16:40:12 +0000 (11:40 -0500)]
NFSv4: Add a helper encode_uint64

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 5 Mar 2012 16:27:16 +0000 (11:27 -0500)]
NFSv4: More xdr cleanups

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 5 Mar 2012 01:49:32 +0000 (20:49 -0500)]
NFSv4: Cleanup - convert more functions to use encode_op_hdr

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 2 Mar 2012 22:14:31 +0000 (17:14 -0500)]
NFS: Fix nfs4_verifier memory alignment

Clean up due to code review.

The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.

Fix this by using a __be32 array to generate a verifier's contents,
and then byte-copy the contents into the verifier field.  The contents
of a verifier, for all intents and purposes, are opaque bytes.  Only
local code that generates a verifier need know the actual content and
format.  Everyone else compares the full byte array for exact
equality.

Also, sizeof(nfs4_verifier) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier.  The two are not interchangeable, even if they happen to
have the same value.
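The pattern described above can be sketched in user space roughly as follows. The names struct verifier, VERIFIER_SIZE, make_verifier() and verifier_equal() are hypothetical stand-ins, and plain uint32_t words are used where the kernel code would use __be32 values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VERIFIER_SIZE 8   /* octets in an encoded verifier (assumed name) */

struct verifier {
    char data[VERIFIER_SIZE];   /* opaque bytes, no alignment guarantee */
};

/* Build the contents in an aligned array, then byte-copy them across. */
static void make_verifier(struct verifier *v, uint32_t hi, uint32_t lo)
{
    uint32_t words[2] = { hi, lo };

    assert(sizeof(words) == sizeof(v->data));
    memcpy(v->data, words, sizeof(v->data));
}

/* Every other user compares the full byte array for exact equality. */
static int verifier_equal(const struct verifier *a, const struct verifier *b)
{
    return memcmp(a->data, b->data, sizeof(a->data)) == 0;
}
```

Nothing here ever casts the char array to a wider pointer type, so the alignment of the data field no longer matters.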

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:57 +0000 (18:13 -0500)]
NFSv4: Add a encode op helper

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:57 +0000 (18:13 -0500)]
NFSv4: Add a helper for encoding NFSv4 sequence ids

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:57 +0000 (18:13 -0500)]
NFSv4: Minor clean ups for encode_string()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:57 +0000 (18:13 -0500)]
NFSv4: Simplify the struct nfs4_stateid

Replace the union with the common struct stateid4 as defined in both
RFC3530 and RFC5661. This makes it easier to access the sequence id,
which will again make implementing support for parallel OPEN calls
easier.
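For orientation, the stateid4 layout in the RFCs is a 32-bit seqid followed by 12 opaque bytes. A minimal user-space sketch of that shape, with hypothetical names (struct stateid4_sketch and its helpers are not the kernel definitions), might look like this:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STATEID_OTHER_SIZE 12

/* Mirrors the RFC layout: a sequence id plus 12 opaque bytes. */
struct stateid4_sketch {
    uint32_t seqid;                           /* big-endian on the wire */
    unsigned char other[STATEID_OTHER_SIZE];
};

/* With a plain struct, the sequence id is a direct field access. */
static uint32_t stateid_seqid(const struct stateid4_sketch *s)
{
    return s->seqid;
}

/* Copy both fields in one assignment. */
static void stateid_copy(struct stateid4_sketch *dst,
                         const struct stateid4_sketch *src)
{
    *dst = *src;
}

/* Compare only the opaque part, ignoring the sequence id. */
static int stateid_same_other(const struct stateid4_sketch *a,
                              const struct stateid4_sketch *b)
{
    return memcmp(a->other, b->other, sizeof(a->other)) == 0;
}
```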

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Add helpers for basic copying of stateids

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Rename nfs4_copy_stateid()

It is really a function for selecting the correct stateid to use in a
read or write situation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Add a helper for encoding stateids

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Add a helper for encoding opaque data

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Rename encode_stateid() to encode_open_stateid()

The current version of encode_stateid really only applies to open stateids.
You can't use it for locks, delegations or layouts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4: Further clean-ups of delegation stateid validation

Change the name to reflect what we're really doing: testing two
stateids for whether or not they match according to the rules in
RFC3530 and RFC5661.
Move the code from callback_proc.c to nfs4proc.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:13:56 +0000 (18:13 -0500)]
NFSv4.1: Fix matching of the stateids when returning a delegation

nfs41_validate_delegation_stateid is broken if we supply a stateid with
a non-zero sequence id. Instead of trying to match the sequence id,
the function assumes that we always want to error. While this is
true for a delegation callback, it is not true in general.
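One way to read the intended behaviour is sketched below in user-space C. It assumes that a zero sequence id in the supplied stateid means "do not compare sequence ids" and that a non-zero one must match exactly. The struct and function names are hypothetical, and this is not the kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct stateid_sketch {
    uint32_t seqid;
    unsigned char other[12];
};

/*
 * Returns 1 when 'candidate' refers to the same state as 'held'.
 * Assumption: a zero seqid in 'candidate' acts as a wildcard, while a
 * non-zero seqid has to match instead of being rejected outright.
 */
static int stateid_matches(const struct stateid_sketch *held,
                           const struct stateid_sketch *candidate)
{
    if (memcmp(held->other, candidate->other, sizeof(held->other)) != 0)
        return 0;
    if (candidate->seqid != 0 && candidate->seqid != held->seqid)
        return 0;
    return 1;
}
```

A caller that must insist on a zero sequence id, such as a callback handler, can check that separately before calling the matcher.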

Also fix a typo in nfs4_callback_recall.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 6 Mar 2012 00:56:44 +0000 (19:56 -0500)]
NFS: Properly handle the case where the delegation is revoked

If we know that the delegation stateid is bad or revoked, we need to
remove that delegation as soon as possible, and then mark all the
stateids that relied on that delegation for recovery. We cannot use
the delegation as part of the recovery process.

Also note that NFSv4.1 uses a different error code (NFS4ERR_DELEG_REVOKED)
to indicate that the delegation was revoked.

Finally, ensure that setlk() and setattr() can both recover safely from
a revoked delegation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Trond Myklebust [Tue, 6 Mar 2012 15:14:35 +0000 (10:14 -0500)]
NFS: Fix a typo in _nfs_display_fhandle

The check for 'fh == NULL' needs to come _before_ we dereference
fh.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 4 Mar 2012 23:12:57 +0000 (18:12 -0500)]
NFS: Fix a compile issue when !CONFIG_NFS_V4_1

The attempt to display the implementation ID needs to be conditional on
whether or not CONFIG_NFS_V4_1 is defined

Reported-by: Bryan Schumaker <Bryan.Schumaker@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Mon, 5 Mar 2012 19:58:15 +0000 (14:58 -0500)]
NFS: Undo changes to idmap.h

When compiled without NFS v4 configured these functions won't be defined
and the compiler will yell.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 3 Mar 2012 20:04:15 +0000 (15:04 -0500)]
Merge commit 'nfs-for-3.3-4' into nfs-for-next

Conflicts:
fs/nfs/nfs4proc.c

Back-merge of the upstream kernel in order to fix a conflict with the
slotid type conversion and implementation id patches...

Chuck Lever [Fri, 2 Mar 2012 21:58:56 +0000 (16:58 -0500)]
NFS: Reduce debugging noise from encode_compound_hdr

Get rid of

  encode_compound: tag=

when XDR debugging is enabled.  The current Linux client never sets
compound tags.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:02:05 +0000 (17:02 -0500)]
NFS: Request fh_expire_type attribute in "server caps" operation

The fh_expire_type file attribute is a filesystem wide attribute that
consists of flags that indicate what characteristics file handles
on this FSID have.

Our client doesn't support volatile file handles.  It should find
out early (say, at mount time) whether the server is going to play
shenanigans with file handles during a migration.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:57 +0000 (17:01 -0500)]
NFS: Introduce NFS_ATTR_FATTR_V4_LOCATIONS

The Linux NFS client must distinguish between referral events (which
it currently supports) and migration events (which it does not yet
support).

In both types of events, an fs_locations array is returned.  But upper
layers, not the XDR layer, should make the distinction between a
referral and a migration.  There really isn't a way for an XDR decoder
function to distinguish the two, in general.

Slightly adjust the FATTR flags returned by decode_fs_locations()
to set NFS_ATTR_FATTR_V4_LOCATIONS only if a non-empty locations
array was returned from the server.  Then have logic in nfs4proc.c
distinguish whether the locations array is for a referral or
something else.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:48 +0000 (17:01 -0500)]
NFS: Simplify arguments of encode_renew()

Clean up: pass just the clientid4 to encode_renew().  This enables it
to be used by callers who might not have a full nfs_client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:31 +0000 (17:01 -0500)]
NFS: Add a client-side function to display NFS file handles

For debugging, introduce a simplistic function to print NFS file
handles on the system console.  The main function is hooked into the
dprintk debugging facility, but you can directly call the helper,
_nfs_display_fhandle(), if you want to print a handle unconditionally.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:23 +0000 (17:01 -0500)]
NFS: Make clientaddr= optional

For NFSv4 mounts, the clientaddr= mount option has always been
required.  Now we have rpc_localaddr() in the kernel, which was
modeled after the same logic in the mount.nfs command that constructs
the clientaddr= mount option.  If user space doesn't provide a
clientaddr= mount option, the kernel can now construct its own.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:01:14 +0000 (17:01 -0500)]
SUNRPC: Add API to acquire source address

NFSv4.0 clients must send endpoint information for their callback
service to NFSv4.0 servers during their first contact with a server.
Traditionally on Linux, user space provides the callback endpoint IP
address via the "clientaddr=" mount option.

During an NFSv4 migration event, it is possible that an FSID may be
migrated to a destination server that is accessible via a different
source IP address than the source server was.  The client must update
callback endpoint information on the destination server so that it can
maintain leases and allow delegation.

Without a new "clientaddr=" option from user space, however, the
kernel itself must construct an appropriate IP address for the
callback update.  Provide an API in the RPC client for upper layer
RPC consumers to acquire a source address for a remote.

The mechanism used by the mount.nfs command is copied: set up a
connected UDP socket to the designated remote, then scrape the source
address off the socket.  We are careful to select the correct network
namespace when setting up the temporary UDP socket.
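The connected-UDP-socket technique can be illustrated in user space with ordinary sockets. The sketch below is IPv4 only, uses an arbitrary port (no datagram is ever sent), and local_addr_for() is a hypothetical name. It is neither the kernel API nor the mount.nfs code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: fills buf with the local IPv4 source address the
 * host would use to reach remote_ip. Returns 0 on success, -1 on error.
 */
static int local_addr_for(const char *remote_ip, char *buf, socklen_t buflen)
{
    struct sockaddr_in remote, local;
    socklen_t len = sizeof(local);
    int ret = -1;
    int fd;

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_port = htons(2049);       /* arbitrary; nothing is sent */
    if (inet_pton(AF_INET, remote_ip, &remote.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* connect() on a UDP socket only selects a route and source address. */
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) == 0 &&
        getsockname(fd, (struct sockaddr *)&local, &len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, buf, buflen) != NULL)
        ret = 0;

    close(fd);
    return ret;
}
```

Because the socket is created in the caller's network namespace, the address returned is the one visible from that namespace.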

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 1 Mar 2012 22:01:05 +0000 (17:01 -0500)]
SUNRPC: Move clnt->cl_server into struct rpc_xprt

When the cl_xprt field is updated, the cl_server field will also have
to change.  Since the contents of cl_server follow the remote endpoint
of cl_xprt, just move that field to the rpc_xprt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: simplify check_gss_callback_principal(), whitespace changes ]
[ cel: forward ported to 3.4 ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 1 Mar 2012 22:00:56 +0000 (17:00 -0500)]
SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt field

A migration event will replace the rpc_xprt used by an rpc_clnt.  To
ensure this can be done safely, all references to cl_xprt must now use
a form of rcu_dereference().

Special care is taken with rpc_peeraddr2str(), which returns a pointer
to memory whose lifetime is the same as the rpc_xprt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: fix lockdep splats and layering violations ]
[ cel: forward ported to 3.4 ]
[ cel: remove rpc_max_reqs(), add rpc_net_ns() ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:00:40 +0000 (17:00 -0500)]
NFS: Add debugging messages to NFSv4's CLOSE procedure

CLOSE is new with NFSv4.  Sometimes it's important to know the timing
of this operation compared to things like lease renewal.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:00:31 +0000 (17:00 -0500)]
NFS: Clean up debugging in decode_pathname()

I noticed recently that decode_attr_fs_locations() is not generating
very pretty debugging output.  The pathname components each appear on
a separate line of output, though that does not appear to be the
intended display behavior.  The preferred way to generate continued
lines of output on the console is to use pr_cont().

Note that incoming pathname4 components contain a string that is not
necessarily NUL-terminated.  I did actually see some trailing garbage
on the console.  In addition to correcting the line continuation
problem, add a string precision format specifier to ensure that each
component string is displayed properly, and that vsnprintf() does
not Oops.
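The effect of a precision specifier is easy to show in user space. In the sketch below, format_component() is a hypothetical helper and the trailing "/" separator is an assumption made for the example. "%.*s" reads at most len bytes, so the source buffer does not need a terminating NUL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a length-delimited component that is not NUL-terminated. */
static int format_component(char *out, size_t outlen,
                            const char *data, int len)
{
    assert(len >= 0);
    /* The precision bounds how many bytes of 'data' are read. */
    return snprintf(out, outlen, "%.*s/", len, data);
}
```

Given an 8-byte buffer holding "exportXY" with no terminator and a length of 6, this writes "export/" and never reads the trailing bytes.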

Someone pointed out that allowing incoming network data to possibly
generate a console line of unbounded length may not be such a good
idea.  Since this output will rarely be enabled, and there is a hard
upper bound (NFS4_PATHNAME_MAXCOMPONENTS) in our implementation, this
is probably not a major concern.

It might be useful to additionally sanity-check the length of each
incoming component, however.  RFC 3530bis15 does not suggest a maximum
number of UTF-8 characters per component for either the pathname4 or
component4 types.  However, we could invent one that is appropriate
for our implementation.

Another possibility is to scrap all of this and print these pathnames
in upper layers after a reasonable amount of sanity checking in the
XDR layer.  This would give us an opportunity to allocate a full
buffer so that the whole pathname would be output via a single
dprintk.

Introduced by commit 7aaa0b3b: "NFSv4: convert fs-locations-components
to conform to RFC3530," (June 9, 2006).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 1 Mar 2012 22:00:23 +0000 (17:00 -0500)]
NFS: Make nfs_cache_array.size a signed integer

Eliminate a number of implicit type casts in comparisons, and these
compiler warnings:

fs/nfs/dir.c: In function ‘nfs_readdir_clear_array’:
fs/nfs/dir.c:264:16: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
fs/nfs/dir.c: In function ‘nfs_readdir_search_for_cookie’:
fs/nfs/dir.c:352:16: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
fs/nfs/dir.c: In function ‘nfs_do_filldir’:
fs/nfs/dir.c:769:38: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
fs/nfs/dir.c:780:9: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
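Why the compiler flags these comparisons can be seen in a small stand-alone sketch. The function names are hypothetical, and the explicit cast spells out the conversion C performs implicitly when an int is compared with an unsigned int.

```c
#include <assert.h>

/* Mixed comparison: the int operand is converted to unsigned first. */
static int less_mixed(int i, unsigned int size)
{
    return (unsigned int)i < size;   /* -1 becomes UINT_MAX here */
}

/* Both operands signed: the comparison behaves arithmetically. */
static int less_signed(int i, int size)
{
    return i < size;
}
```

With i = -1 and a size of 5, less_mixed() reports false while less_signed() reports true. Giving both operands the same signedness removes that surprise along with the warning.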

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 2 Mar 2012 19:06:39 +0000 (14:06 -0500)]
NFS: Consolidate the parsing of the '-ov4.x' and '-overs=4.x' mount options

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 2 Mar 2012 19:00:20 +0000 (14:00 -0500)]
NFS: Ensure we display the minor version correctly in /proc/mounts etc.

The 'minorversion' mount option is now deprecated, so we need to display
the minor version number in the 'vers=' format.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 2 Mar 2012 18:59:49 +0000 (13:59 -0500)]
NFS: Extend the -overs= mount option to allow 4.x minorversions

Allow the user to mount an NFSv4.0 or NFSv4.1 partition using a
standard syntax of '-overs=4.0', or '-overs=4.1' rather than the
more cumbersome '-overs=4,minorversion=1'.
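As an illustration of the syntax only, the sketch below splits a vers= value such as "4" or "4.1" into major and minor numbers. parse_vers() is a hypothetical user-space helper, not the kernel's mount option parser, and it does not check which versions are actually supported.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical parser for the value of a vers= option.
 * Accepts "N" or "N.M" in decimal; a missing minor defaults to 0.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_vers(const char *value, unsigned int *major,
                      unsigned int *minor)
{
    char *end;
    unsigned long v;

    if (*value < '0' || *value > '9')
        return -1;
    v = strtoul(value, &end, 10);
    *major = (unsigned int)v;
    *minor = 0;
    if (*end == '\0')
        return 0;                      /* plain "N" */
    if (*end != '.' || end[1] < '0' || end[1] > '9')
        return -1;
    v = strtoul(end + 1, &end, 10);
    if (*end != '\0')
        return -1;                     /* trailing garbage */
    *minor = (unsigned int)v;
    return 0;
}
```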

See also the earlier patch by Dros Adamson, which added the
Linux-specific syntax '-ov4.0', '-ov4.1'.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Fri, 17 Feb 2012 20:20:26 +0000 (15:20 -0500)]
NFSv4: parse and display server implementation ids

Shows the implementation ids in /proc/self/mountstats.  This doesn't break
the nfs-utils mountstats tool.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Fri, 17 Feb 2012 20:20:25 +0000 (15:20 -0500)]
NFSv4: fix server_scope memory leak

server_scope would never be freed if nfs4_check_cl_exchange_flags() returned
non-zero

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Fri, 17 Feb 2012 20:20:24 +0000 (15:20 -0500)]
NFSv4: Send implementation id with exchange_id

Send the nfs implementation id in EXCHANGE_ID requests unless the module
parameter nfs.send_implementation_id is 0.

This adds a CONFIG variable for the nii_domain that defaults to "kernel.org".

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Fri, 24 Feb 2012 19:14:51 +0000 (14:14 -0500)]
NFS: Store the legacy idmapper result in the keyring

This patch removes the old hashmap-based caching and instead uses a
"request key actor" to place an upcall to the legacy idmapper rather
than going through /sbin/request-key.  This will only be used as a
fallback if /etc/request-key.conf isn't configured to use nfsidmap.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Fri, 24 Feb 2012 19:14:50 +0000 (14:14 -0500)]
Created a function for setting timeouts on keys

The keyctl_set_timeout function isn't exported to other parts of the
kernel, but I want to use it for the NFS idmapper.  I already have the
key, but I wanted a generic way to set the timeout.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 1 Mar 2012 16:17:50 +0000 (11:17 -0500)]
NFSv4.1: Get rid of NFS4CLNT_LAYOUTRECALL

The NFS4CLNT_LAYOUTRECALL bit is a long-term impediment to scalability. It
basically stops all other recalls by a given server once any layout recall
is requested.

If the recall is for a different file, then we don't care.
If the recall applies to the same file, then we're in one of two situations:
Either we are in the case of a replay of an existing request, in which case
the session is supposed to deal with matters, or we are dealing with a
completely different request, in which case we should just try to process
it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 1 Mar 2012 16:17:47 +0000 (11:17 -0500)]
NFSv4.1: Get rid of redundant NFS4CLNT_LAYOUTRECALL tests

The NFS4CLNT_LAYOUTRECALL tests in pnfs_layout_process and
pnfs_update_layout are redundant.

In the case of a bulk layout recall, we're always testing for
the NFS_LAYOUT_BULK_RECALL flag anyway.
In the case of a file or segment recall, the call to
pnfs_set_layout_stateid() updates the layout_header 'barrier'
sequence id, which triggers the test in pnfs_layoutgets_blocked()
and is less race-prone than NFS4CLNT_LAYOUTRECALL anyway.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Stanislav Kinsbursky [Mon, 27 Feb 2012 18:05:54 +0000 (22:05 +0400)]
SUNRPC: move waitq from RPC pipe to RPC inode

Currently, the wait queue used for polling of RPC pipe changes from user space
is a part of the RPC pipe. But the pipe data itself can be released on NFS umount
prior to the dentry-inode pair connected to it (in case this pair is held open by
some process).
This is not a problem for almost all pipe users, because all PipeFS file
operations check the pipe reference prior to using it.
Except eventfd. This thing registers itself with the "poll" file operation and
thus has a reference to the pipe wait queue. This leads to oopses on destroying
the eventfd after NFS umount (as rpc_idmapd does), since no pipe data is left at
that point.
The solution is to move the wait queue from the pipe data to the internal RPC
inode data. This looks more logical, because this wait queue is used only by
user-space processes, which already hold an inode reference.

Note: upcalls have to get pipe->dentry prior to dereferencing the wait queue to
make sure that the mount point won't disappear from underneath us.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Stanislav Kinsbursky [Mon, 27 Feb 2012 18:05:45 +0000 (22:05 +0400)]
SUNRPC: check RPC inode's pipe reference before dereferencing

There are 2 tightly bound objects: pipe data (created for kernel needs, has a
reference to the dentry, which depends on PipeFS mount/umount) and the PipeFS
dentry/inode pair (created on mount for user-space needs). Each may or may not
hold a valid reference to the other, independently.
This means that we have to make sure that the pipe->dentry reference is valid on
upcalls, and the dentry->pipe reference is valid on downcalls. The latter check is
absent - my fault.
IOW, a PipeFS dentry can be opened by some process (rpc.idmapd for example), but
its pipe data can belong to an NFS mount which was already unmounted, so the
pipe data was destroyed.
To fix this, the pipe reference has to be set to NULL on rpc_unlink() and checked
in PipeFS file operations instead of the pipe->dentry check.

Note: the PipeFS "poll" file operation will be updated in the next patch, because
its logic is more complicated.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Stanislav Kinsbursky [Mon, 27 Feb 2012 18:05:37 +0000 (22:05 +0400)]
NFS: release per-net clients lock before calling PipeFS dentries creation

v3:
1) Lookup for client is performed from the beginning of the list on each PipeFS
event handling operation.

Lockdep is sad otherwise, because inode mutex is taken on PipeFS dentry
creation, which can be called on mount notification, where this per-net client
lock is taken on clients list walk.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Stanislav Kinsbursky [Mon, 27 Feb 2012 18:05:29 +0000 (22:05 +0400)]
SUNRPC: release per-net clients lock before calling PipeFS dentries creation

v3:
1) Lookup for client is performed from the beginning of the list on each PipeFS
event handling operation.

Lockdep is sad otherwise, because inode mutex is taken on PipeFS dentry
creation, which can be called on mount notification, where this per-net client
lock is taken on clients list walk.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 26 Feb 2012 22:34:22 +0000 (17:34 -0500)]
NFSv4.1: Don't call nfs4_deviceid_purge_client() unless we're NFSv4.1

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 19 Feb 2012 07:46:49 +0000 (08:46 +0100)]
NFS: Ensure struct nfs_client holds a reference to the net namespace

Otherwise we have no guarantee that the net namespace won't just
disappear from underneath us once the task that created it
is destroyed.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Trond Myklebust [Sun, 19 Feb 2012 07:44:07 +0000 (08:44 +0100)]
NFS: Ensure that the nfs_client 'net' field is always set

Currently, the nfs_parsed_mount_data->net field is initialised in
the nfs_parse_mount_options() function, which means that it only
gets set if we're using text based mounts. The legacy binary
mount interface is therefore broken.

Fix is to initialise the ->net field in nfs_alloc_parsed_mount_data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Weston Andros Adamson [Thu, 16 Feb 2012 16:17:05 +0000 (11:17 -0500)]
NFSv4: fix server_scope memory leak

server_scope would never be freed if nfs4_check_cl_exchange_flags() returned
non-zero

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 15 Feb 2012 01:33:19 +0000 (20:33 -0500)]
NFSv4.1: Fix a NFSv4.1 session initialisation regression

Commit aacd553 (NFSv4.1: cleanup init and reset of session slot tables)
introduces a regression in the session initialisation code. New tables
now find their sequence ids initialised to 0, rather than the mandated
value of 1 (see RFC5661).

Fix the problem by merging nfs4_reset_slot_table() and nfs4_init_slot_table().
Since the tbl->max_slots is initialised to 0, the test in
nfs4_reset_slot_table for max_reqs != tbl->max_slots will automatically
pass for an empty table.
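The invariant being restored is simply that every slot in a freshly set-up table starts at sequence id 1. A minimal user-space sketch, with hypothetical names (struct slot_sketch and init_slots() are not the kernel structures), could be:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct slot_sketch {
    uint32_t seq_nr;   /* sequence id for the next request on this slot */
};

/* Initialise or reset every slot so that its first request uses seqid 1. */
static void init_slots(struct slot_sketch *slots, size_t nslots)
{
    assert(slots != NULL || nslots == 0);
    for (size_t i = 0; i < nslots; i++)
        slots[i].seq_nr = 1;
}
```

Using one routine for both initialisation and reset means a new table cannot miss this step.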

Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Fri, 17 Feb 2012 18:15:24 +0000 (13:15 -0500)]
NFS: include filelayout DS rpc stats in mountstats

Include RPC statistics from all data servers in /proc/self/mountstats for pNFS
filelayout mounts.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Andy Adamson [Fri, 17 Feb 2012 18:05:23 +0000 (13:05 -0500)]
NFSv4.1 set highest_used_slotid to NFS4_NO_SLOT

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Wed, 15 Feb 2012 21:35:17 +0000 (16:35 -0500)]
nfs: Clean up debugging in nfs_follow_mountpoint()

Clean up: Fix a debugging message which had an obsolete function name
in it (nfs_follow_mountpoint).

Introduced by commit 36d43a43 "NFS: Use d_automount() rather than
abusing follow_link()" (January 14, 2011)

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Wed, 15 Feb 2012 21:35:08 +0000 (16:35 -0500)]
SUNRPC: Use KERN_DEFAULT for debugging printk's

Our dprintk() debugging facility doesn't specify any verbosity level
for its printk() calls, but it should.

The default verbosity for printk's is KERN_DEFAULT.  You might argue
that these are debugging printk's and thus the verbosity should be
KERN_DEBUG.  That would mean that to see NFS and SUNRPC debugging
output an admin would also have to boost the syslog verbosity, which
would be insufferably noisy.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoSUNRPC: add sending,pending queue and max slot to xprt stats
Andy Adamson [Tue, 14 Feb 2012 21:19:18 +0000 (16:19 -0500)]
SUNRPC: add sending,pending queue and max slot to xprt stats

With static RPC slots, the xprt backlog queue stats were useful in showing
when the transport (TCP) was starved by lack of RPC slots. The new dynamic
RPC slot code, commit d9ba131d8f58c0d2ff5029e7002ab43f913b36f9, always
provides an RPC slot and so only uses the xprt backlog queue when the
tcp_max_slot_table_entries value has been hit or when an allocation error
occurs. All requests are now placed on the xprt sending or pending queues, which
need to be monitored for debugging.

The max_slot stat shows the maximum number of dynamic RPC slots reached, which is
useful when debugging performance issues.

Add the new fields at the end of the mountstats xprt stanza so that existing
mountstats tools still output the previous values correctly and ignore the new
fields. Bump NFS_IOSTATS_VERS.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
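
Illustration (not part of the commit): a small sketch of why appending fields at the end of a stanza keeps older readers working. The function demo_parse_fields is hypothetical and is not taken from any mountstats tool; it assumes a reader that consumes a fixed number of leading numeric fields and ignores whatever follows.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse up to `want` whitespace-separated unsigned decimal fields from
 * `line` into `out`. Any further fields on the line are ignored.
 * Returns the number of fields actually parsed.
 */
size_t demo_parse_fields(const char *line, unsigned long long *out, size_t want)
{
    size_t n = 0;
    const char *p = line;

    while (n < want) {
        char *end;
        unsigned long long v = strtoull(p, &end, 10);

        if (end == p)   /* nothing numeric left on the line */
            break;
        out[n++] = v;
        p = end;
    }
    return n;
}
```

A reader written for the old stanza asks for the old field count and never sees the appended values; a newer reader asks for more fields and can detect an older kernel from the shorter return count.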
12 years agoSUNRPC: init per-net rpcbind spinlock
Stanislav Kinsbursky [Thu, 16 Feb 2012 13:42:12 +0000 (17:42 +0400)]
SUNRPC: init per-net rpcbind spinlock

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agonfs41: Verify channel's attributes accordingly to RFC v2
Vitaliy Gusev [Wed, 15 Feb 2012 15:38:25 +0000 (19:38 +0400)]
nfs41: Verify channel's attributes accordingly to RFC v2

 ca_maxoperations:

      For the backchannel, the server MUST
      NOT change the value the client offers.  For the fore channel,
      the server MAY change the requested value.

  ca_maxrequests:

       For the backchannel, the server MUST NOT change the
       value the client offers.  For the fore channel, the server MAY
       change the requested value.

Signed-off-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
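
Illustration (not part of the commit): a minimal sketch of the rule quoted above for ca_maxoperations and ca_maxrequests. The type and function names are hypothetical. It encodes only what the quoted text states; a real client may apply further sanity checks to the fore channel values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the negotiated channel attributes. */
struct demo_chan_attrs {
    uint32_t max_ops;    /* ca_maxoperations */
    uint32_t max_reqs;   /* ca_maxrequests */
};

/*
 * Returns true if the server's reply is acceptable under the quoted rule:
 * on the backchannel the server must echo the client's values unchanged,
 * on the fore channel it may change them.
 */
bool demo_verify_channel_attrs(const struct demo_chan_attrs *sent,
                               const struct demo_chan_attrs *rcvd,
                               bool backchannel)
{
    if (!backchannel)
        return true;
    return rcvd->max_ops == sent->max_ops &&
           rcvd->max_reqs == sent->max_reqs;
}
```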
12 years agoNFS: dont allow minorversion= opt when vers != 4
Weston Andros Adamson [Wed, 1 Feb 2012 19:06:41 +0000 (14:06 -0500)]
NFS: dont allow minorversion= opt when vers != 4

Don't allow invalid 'vers' and 'minorversion' combinations in mount options,
such as "vers=3,minorversion=1".

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
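
Illustration (not part of the commit): the check described above, reduced to a hypothetical predicate. The function name and its parameters are assumptions for the sketch, not the kernel's option parser.

```c
#include <stdbool.h>

/*
 * Returns true if the combination of mount options is acceptable:
 * a minorversion= option only makes sense together with vers=4.
 */
bool demo_mount_versions_ok(unsigned int vers, bool minorversion_given)
{
    if (minorversion_given && vers != 4)
        return false;
    return true;
}
```

With this predicate, "vers=3,minorversion=1" is rejected, while "vers=4,minorversion=1" and a plain "vers=3" are accepted.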
12 years agoSUNRPC: Ensure that we can trace waitqueues when !defined(CONFIG_SYSCTL)
Trond Myklebust [Thu, 9 Feb 2012 03:01:15 +0000 (22:01 -0500)]
SUNRPC: Ensure that we can trace waitqueues when !defined(CONFIG_SYSCTL)

The tracepoint code relies on the queue->name being defined in order to
be able to display the name of the waitqueue on which an RPC task is
sleeping.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
12 years agoNFSv4: Further reduce the footprint of the idmapper
Trond Myklebust [Wed, 8 Feb 2012 18:39:15 +0000 (13:39 -0500)]
NFSv4: Further reduce the footprint of the idmapper

Don't allocate the legacy idmapper tables until we actually need
them.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
12 years agoNFSv4: The idmapper now depends on keyring functionality
Trond Myklebust [Wed, 8 Feb 2012 18:21:38 +0000 (13:21 -0500)]
NFSv4: The idmapper now depends on keyring functionality

Add the appropriate 'select KEYS' to the NFSv4 Kconfig entry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFSv4: Reduce the footprint of the idmapper
Trond Myklebust [Tue, 7 Feb 2012 19:59:05 +0000 (14:59 -0500)]
NFSv4: Reduce the footprint of the idmapper

Instead of pre-allocating the storage for all the strings, we can
significantly reduce the size of that table by doing the allocation
when we do the downcall.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
12 years agoNFS: add mount options 'v4.0' and 'v4.1'
Weston Andros Adamson [Tue, 7 Feb 2012 16:49:11 +0000 (11:49 -0500)]
NFS: add mount options 'v4.0' and 'v4.1'

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: fix nfs4_find_client_sessionid() arguments list
Stanislav Kinsbursky [Tue, 7 Feb 2012 15:53:19 +0000 (19:53 +0400)]
NFS: fix nfs4_find_client_sessionid() arguments list

The code does not compile when CONFIG_NFS_V4_1 is not set.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: Initialise the nfs_net->nfs_client_lock
Trond Myklebust [Tue, 7 Feb 2012 05:05:11 +0000 (00:05 -0500)]
NFS: Initialise the nfs_net->nfs_client_lock

Ensure that we initialise the nfs_net->nfs_client_lock spinlock.
Also ensure that nfs_server_remove_lists() doesn't try to
dereference server->nfs_client before that is initialised.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
12 years agoLockd: shutdown NLM hosts in network namespace context
Stanislav Kinsbursky [Tue, 31 Jan 2012 11:08:29 +0000 (15:08 +0400)]
Lockd: shutdown NLM hosts in network namespace context

Lockd is now managed in network namespace context. This patch introduces a
network-namespace-aware shutdown of NLM hosts, used when per-net Lockd
resources are released.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoLockD: make NSM network namespace aware
Stanislav Kinsbursky [Tue, 31 Jan 2012 11:08:21 +0000 (15:08 +0400)]
LockD: make NSM network namespace aware

NLM hosts are network namespace aware now,
so NSM has to take that into account.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoLockD: make nlm hosts network namespace aware
Stanislav Kinsbursky [Tue, 31 Jan 2012 11:08:13 +0000 (15:08 +0400)]
LockD: make nlm hosts network namespace aware

This object depends on the RPC client, and thus on the network namespace.
So let's do its allocation and lookup in network namespace context.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoLockd: per-net up and down routines introduced
Stanislav Kinsbursky [Tue, 31 Jan 2012 11:08:05 +0000 (15:08 +0400)]
Lockd: per-net up and down routines introduced

This patch introduces per-net Lockd initialization and destruction routines.
The logic is the same as in the global Lockd up and down routines. This is
probably not the best solution, but at least it is clear.
The per-net "up" routine is called only if lockd is already running. If the
per-net resources are not allocated yet, the service is registered with the
local portmapper and the lockd sockets are created.
The per-net "down" routine is called on every lockd_down() call if the global
users counter is not zero.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoLockd: pernet usage counter introduced
Stanislav Kinsbursky [Tue, 31 Jan 2012 11:07:57 +0000 (15:07 +0400)]
Lockd: pernet usage counter introduced

Lockd is going to be shared between network namespaces, i.e. it will be able
to handle lock requests from different network namespaces. This means that
network-namespace-related resources have to be allocated not once (as now),
but for every network namespace context from which the service is requested to
operate.
This patch implements per-net user accounting for Lockd. A new per-net counter
is used to determine when the per-net resources have to be freed.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
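
Illustration (not part of the commit): a sketch of per-namespace user accounting in the spirit of the message above. The struct and function names are hypothetical, locking is omitted, and the "resources" are reduced to a flag. The assumption is that resources are set up for the first user in a namespace and released when the last user goes away.

```c
/* Hypothetical per-network-namespace bookkeeping. */
struct demo_lockd_net {
    unsigned int users;        /* users of the service in this namespace */
    int resources_allocated;   /* stand-in for sockets, registrations, ... */
};

/* Take a reference; the first user triggers per-net resource setup. */
void demo_lockd_net_get(struct demo_lockd_net *ln)
{
    if (ln->users++ == 0)
        ln->resources_allocated = 1;
}

/* Drop a reference; the last user triggers per-net resource release. */
void demo_lockd_net_put(struct demo_lockd_net *ln)
{
    if (ln->users == 0)
        return;   /* unbalanced put, nothing to release */
    if (--ln->users == 0)
        ln->resources_allocated = 0;
}
```

Each namespace carries its own counter, so releasing the last user in one namespace does not affect resources held on behalf of another.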
12 years agoLockd: create permanent lockd sockets in current network namespace
Stanislav Kinsbursky [Tue, 31 Jan 2012 11:07:48 +0000 (15:07 +0400)]
Lockd: create permanent lockd sockets in current network namespace

This patch parametrizes the routine that creates the permanent Lockd sockets
by network namespace context.
It also replaces the hard-coded init_net with the current network namespace
context in the Lockd socket creation routines.
This approach looks safe, because Lockd is created during an NFS mount (or NFS
server start), and thus the socket is required in exactly the current network
namespace context. At the same time, it means that the Lockd sockets inherit
the network namespace of the first Lockd requester. This issue will be fixed
in later patches of the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoSUNRPC: service shutdown function in network namespace context introduced
Stanislav Kinsbursky [Tue, 31 Jan 2012 10:09:25 +0000 (14:09 +0400)]
SUNRPC: service shutdown function in network namespace context introduced

This function is sufficient for releasing the resources allocated for a network
namespace context when a service is shared between namespaces.
IOW, each service "user" (LockD, NFSd, etc.) that wants to share a service
between network namespaces has to release the related resources with the
function introduced in this patch, instead of performing a service shutdown
(provided, of course, that the service is already shared at the moment of
release).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoSUNRPC: service destruction in network namespace context
Stanislav Kinsbursky [Tue, 31 Jan 2012 10:09:17 +0000 (14:09 +0400)]
SUNRPC: service destruction in network namespace context

v2: Added a comment to the BUG_ON()s in svc_destroy() to make the code clearer.

This patch introduces a network namespace filter for the service destruction
function.
Nothing special here: it performs exactly the same operations, but only for
transports in the passed network namespace context.
BTW, the BUG_ON() checks for empty service transport lists were restored in
the svc_destroy() function. This is because of switching from the generic
svc_close_all() to the network-namespace-dependent svc_close_net().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoSUNRPC: clear svc transports lists helper introduced
Stanislav Kinsbursky [Tue, 31 Jan 2012 10:09:08 +0000 (14:09 +0400)]
SUNRPC: clear svc transports lists helper introduced

This patch moves the deletion of service transports from the service socket
lists into a separate function.
This is a precursor patch, which will be useful for the service shutdown in
network namespace context introduced later in the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoSUNRPC: clear svc pools lists helper introduced
Stanislav Kinsbursky [Tue, 31 Jan 2012 10:09:00 +0000 (14:09 +0400)]
SUNRPC: clear svc pools lists helper introduced

This patch moves the removal of service transports from the pools' ready lists
into a separate function. The clearing is now also done with the
list_for_each_entry_safe() helper.
This is a precursor patch, which will be useful for the service shutdown in
network namespace context introduced later in the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFSv4.1: Add a module parameter to set the number of session slots
Trond Myklebust [Tue, 7 Feb 2012 00:50:40 +0000 (19:50 -0500)]
NFSv4.1: Add a module parameter to set the number of session slots

Add the module parameter 'max_session_slots' to set the initial number
of slots that the NFSv4.1 client will attempt to negotiate with the
server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFSv4.1: Convert slotid from u8 to u32
Trond Myklebust [Tue, 7 Feb 2012 00:38:51 +0000 (19:38 -0500)]
NFSv4.1: Convert slotid from u8 to u32

It is perfectly legal to negotiate up to 2^32-1 slots in the protocol,
and with 10GigE, we are already seeing that 255 slots is far too limiting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
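
Illustration (not part of the commit): a tiny sketch of what goes wrong when a slot id that the protocol allows to be 32 bits wide is stored in an 8-bit field. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Store a protocol-level (32-bit) slot id in an 8-bit field, as a u8
 * slotid would. Values above 255 wrap modulo 256 and so alias lower
 * slot ids.
 */
uint8_t demo_slotid_as_u8(uint32_t slotid)
{
    return (uint8_t)slotid;
}
```

For example, slot id 256 would be stored as 0 and slot id 300 as 44, which is why the wider type is needed once more than 255 slots are negotiated.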
12 years agoNFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID
Trond Myklebust [Thu, 9 Feb 2012 20:31:36 +0000 (15:31 -0500)]
NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID

To ensure that we don't just reuse the bad delegation when we attempt to
recover the nfs4_state that received the bad stateid error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
12 years agoNFS: build fixed in case of NFS_USE_NEW_IDMAPPER is undefined
Stanislav Kinsbursky [Thu, 26 Jan 2012 11:11:41 +0000 (15:11 +0400)]
NFS: build fixed in case of NFS_USE_NEW_IDMAPPER is undefined

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: pass transport net to rpc_pton() while parse server name
Stanislav Kinsbursky [Thu, 26 Jan 2012 11:12:05 +0000 (15:12 +0400)]
NFS: pass transport net to rpc_pton() while parse server name

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: pass current net to rpc_pton() while parsing mount options
Stanislav Kinsbursky [Thu, 26 Jan 2012 11:11:57 +0000 (15:11 +0400)]
NFS: pass current net to rpc_pton() while parsing mount options

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: search for client session id in proper network namespace
Stanislav Kinsbursky [Thu, 26 Jan 2012 11:11:49 +0000 (15:11 +0400)]
NFS: search for client session id in proper network namespace

The network namespace is taken from the request transport and passed as part
of the cb_process_state structure.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: pass proper net rpc_pton() in nfs_dns_resolve_name()
Stanislav Kinsbursky [Thu, 26 Jan 2012 11:11:33 +0000 (15:11 +0400)]
NFS: pass proper net rpc_pton() in nfs_dns_resolve_name()

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: make nfs_client_lock per net ns
Stanislav Kinsbursky [Mon, 23 Jan 2012 17:26:31 +0000 (17:26 +0000)]
NFS: make nfs_client_lock per net ns

This patch makes nfs_client_lock allocated per network namespace. All items it
protects are already network namespace aware.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: make cb_ident_idr per net ns
Stanislav Kinsbursky [Mon, 23 Jan 2012 17:26:22 +0000 (17:26 +0000)]
NFS: make cb_ident_idr per net ns

This patch makes the ID infrastructure network namespace aware. This was done
mainly because of nfs_client_lock, which should be per network
namespace, but which protects the NFS client IDs.

NOTE: The NFS client's net pointer has to be set prior to ID initialization,
so the assignment was moved accordingly.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: make nfs_volume_list per net ns
Stanislav Kinsbursky [Mon, 23 Jan 2012 17:26:14 +0000 (17:26 +0000)]
NFS: make nfs_volume_list per net ns

This patch splits the global list of NFS servers into a per-net-ns array of
lists. This looks stricter and clearer.
BTW, this patch also makes the content of "/proc/fs/nfsfs/volumes" depend on
the pid namespace of the /proc mount owner. See below for details.

NOTE: a few words about how the per-network-namespace display of the
/proc/fs/nfsfs/ entries was done. It is a little bit tricky and not the best
it could be. But it's cheap (a proper fix for /proc containerization is a hard
nut to crack).
The idea is simple: take the proper network namespace from the nsproxy of the
pid namespace child reaper of the /proc/ mount creator.
This actually means that if there are 2 containers with different net
namespaces sharing a pid namespace, then a read of the /proc/fs/nfsfs/ entries
will always return content taken from the net namespace of the pid namespace
creator task (and thus the second namespace's set will be invisible).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: make nfs_client_list per net ns
Stanislav Kinsbursky [Mon, 23 Jan 2012 17:26:05 +0000 (17:26 +0000)]
NFS: make nfs_client_list per net ns

This patch splits the global list of NFS clients into a per-net-ns array of
lists. This looks stricter and clearer.
BTW, this patch also makes the content of the "/proc/fs/nfsfs/servers" entry
depend on the pid namespace of the /proc mount owner. See below for details.

NOTE: a few words about how the per-network-namespace display of the
/proc/fs/nfsfs/ entries was done. It is a little bit tricky and not the best
it could be. But it's cheap (a proper fix for /proc containerization is a hard
nut to crack).
The idea is simple: take the proper network namespace from the nsproxy of the
pid namespace child reaper of the /proc/ mount creator.
This actually means that if there are 2 containers with different net
namespaces sharing a pid namespace, then a read of the /proc/fs/nfsfs/ entries
will always return content taken from the net namespace of the pid namespace
creator task (and thus the second namespace's set will be invisible).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: Update idmapper documentation
Bryan Schumaker [Thu, 26 Jan 2012 21:54:25 +0000 (16:54 -0500)]
NFS: Update idmapper documentation

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: Keep idmapper include files in one place
Bryan Schumaker [Thu, 26 Jan 2012 21:54:24 +0000 (16:54 -0500)]
NFS: Keep idmapper include files in one place

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: Fall back on old idmapper if request_key() fails
Bryan Schumaker [Thu, 26 Jan 2012 21:54:23 +0000 (16:54 -0500)]
NFS: Fall back on old idmapper if request_key() fails

This patch removes the CONFIG_NFS_USE_NEW_IDMAPPER compile option.
First, the idmapper will attempt to map the id using /sbin/request-key
and nfsidmap.  If this fails (if /etc/request-key.conf is not configured
properly) then the idmapper will call the legacy code to perform the
mapping.  I left a comment stating where the legacy code begins to make
it easier for somebody to remove in the future.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
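
Illustration (not part of the commit): a sketch of the try-then-fall-back flow described above. All names here are hypothetical, and the two lookup functions are stubs standing in for the keyring-based path and the legacy path; neither reflects the kernel's actual interfaces.

```c
/* A mapping backend: returns 0 and fills *id on success, non-zero on failure. */
typedef int (*demo_map_fn)(const char *name, unsigned int *id);

/* Stub for the keyring path; fails as if request-key were not configured. */
int demo_keyring_unconfigured(const char *name, unsigned int *id)
{
    (void)name;
    (void)id;
    return -1;
}

/* Stub for the legacy path; maps every name to a fixed, made-up id. */
int demo_legacy_lookup(const char *name, unsigned int *id)
{
    (void)name;
    *id = 1000;
    return 0;
}

/* Try the primary backend first and fall back to the legacy one on failure. */
int demo_map_name(const char *name, unsigned int *id,
                  demo_map_fn primary, demo_map_fn legacy)
{
    if (primary(name, id) == 0)
        return 0;
    /* Legacy fallback begins here. */
    return legacy(name, id);
}
```

Keeping the fallback behind a single, clearly marked call site mirrors the commit's stated aim of making the legacy code easy to remove later.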
12 years agoNFS: Fix comparison between DS address lists
Weston Andros Adamson [Fri, 3 Feb 2012 20:45:40 +0000 (15:45 -0500)]
NFS: Fix comparison between DS address lists

data_server_cache entries should only be treated as the same if the address
list hasn't changed.

A new entry will be made when an MDS changes an address list (as seen by
GETDEVINFO). The old entry will be freed once all references are gone.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
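
Illustration (not part of the commit): a sketch of treating two cache entries as the same only when their address lists match. The function name is hypothetical, addresses are represented as plain strings, and the comparison is order-sensitive; all of these are simplifying assumptions rather than the kernel's actual comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true only if both lists have the same length and the same
 * addresses in the same order.
 */
bool demo_same_addr_list(const char *const *a, size_t na,
                         const char *const *b, size_t nb)
{
    if (na != nb)
        return false;
    for (size_t i = 0; i < na; i++) {
        if (strcmp(a[i], b[i]) != 0)
            return false;
    }
    return true;
}
```

Under this rule a changed address list never matches the old entry, so a lookup misses and a new entry is created while the old one lives on until its references are dropped.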
12 years agoNFS: start printks w/ NFS: even if __func__ shown
Weston Andros Adamson [Thu, 26 Jan 2012 18:32:23 +0000 (13:32 -0500)]
NFS: start printks w/ NFS: even if __func__ shown

This patch addresses printks that have some context to show that they are
from fs/nfs/, but for the sake of consistency now start with NFS:

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoNFS: printks in fs/nfs/ should start with NFS:
Weston Andros Adamson [Thu, 26 Jan 2012 18:32:22 +0000 (13:32 -0500)]
NFS: printks in fs/nfs/ should start with NFS:

Messages like "Got error -10052 from the server on DESTROY_SESSION. Session
has been destroyed regardless" can be confusing to users who aren't very
familiar with NFS.

NOTE: This patch ignores any printks() that start by printing __func__ - that
will be in a separate patch.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
12 years agoSUNRPC: remove an unneeded NULL check in xprt_connect()
Dan Carpenter [Wed, 1 Feb 2012 07:46:20 +0000 (10:46 +0300)]
SUNRPC: remove an unneeded NULL check in xprt_connect()

We check "task->tk_rqstp" and then we dereference it without checking on
the next line.  The only caller is call_connect() and that has a check
which prevents it from calling xprt_connect() with a NULL.

                if (task->tk_status < 0)
                        return;

If "task->tk_rqstp" were NULL then "tk_status" would be -EAGAIN.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>