profile/common/kernel-common.git
12 years agonfsd4: fix, consolidate client_has_state
J. Bruce Fields [Tue, 29 May 2012 20:37:44 +0000 (16:37 -0400)]
nfsd4: fix, consolidate client_has_state

Whoops: first, I reimplemented the already-existing has_resources
without noticing; second, I got the test backwards.  I did pick a better
name, though.  Combine the two....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: don't remove rebooted client record until confirmation
J. Bruce Fields [Tue, 29 May 2012 18:44:28 +0000 (14:44 -0400)]
nfsd4: don't remove rebooted client record until confirmation

In the NFSv4.1 client-reboot case we're currently removing the client's
previous state in exchange_id.  That's wrong--we should be waiting till
the confirming create_session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: remove some dprintk's and a comment
J. Bruce Fields [Tue, 29 May 2012 18:26:30 +0000 (14:26 -0400)]
nfsd4: remove some dprintk's and a comment

The comment is redundant, and if we really want dprintk's here they'd
probably be better in the common (check_slot_seqid) code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: return "real" sequence id in confirmed case
J. Bruce Fields [Sat, 26 May 2012 01:40:23 +0000 (21:40 -0400)]
nfsd4: return "real" sequence id in confirmed case

The client should ignore the returned sequence_id in the case where the
CONFIRMED flag is set on an exchange_id reply--and in the unconfirmed
case "1" is always the right response.  So it shouldn't actually matter
what we return here.

We could continue returning 1 just to catch clients ignoring the spec
here, but I'd rather be generous.  Other things equal, returning the
existing sequence_id seems more informative.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: fix exchange_id to return confirm flag
J. Bruce Fields [Sat, 26 May 2012 01:24:40 +0000 (21:24 -0400)]
nfsd4: fix exchange_id to return confirm flag

Otherwise nfsd4_set_ex_flags writes over the return flags.

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: clarify that renewing expired client is a bug
J. Bruce Fields [Fri, 4 May 2012 18:57:52 +0000 (14:57 -0400)]
nfsd4: clarify that renewing expired client is a bug

This can't happen:
- cl_time is zeroed only by unhash_client_locked, which is only
  ever called under both the state lock and the client lock.
- every caller of renew_client() should have looked up a
  (non-expired) client and then called renew_client() all
  without dropping the state lock.
- the only other caller of renew_client_locked() is
  release_session_client(), which first checks under the
  client_lock that the cl_time is nonzero.

So make it clear that this is a bug, not something we handle.  I can't
quite bring myself to make this a BUG(), though, as there are a lot of
renew_client() callers, and returning here is probably safer than a
BUG().

We'll consider making it a BUG() after some more cleanup.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
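
The "complain but return" choice described in the commit above can be sketched in userspace C. This is only an illustration under stated assumptions: struct client_sketch and renew_client_sketch() are hypothetical stand-ins, not the kernel's nfs4_client or renew_client(), and no locking is modelled.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical, simplified stand-in for a client record. */
struct client_sketch {
    time_t cl_time;   /* 0 means "already unhashed/expired" */
    int    renewals;  /* illustration only */
};

/* Returns 0 on success, -1 if asked to renew an expired client. */
static int renew_client_sketch(struct client_sketch *clp, time_t now)
{
    if (clp->cl_time == 0) {
        /* Should be impossible: report it as a bug, but keep running. */
        fprintf(stderr, "renewing expired client: this is a bug\n");
        return -1;
    }
    clp->cl_time = now;
    clp->renewals++;
    return 0;
}
```

Returning early leaves the expired record untouched, which is the "probably safer than a BUG()" trade-off the message describes.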
12 years agonfsd4: simpler ordering of setclientid_confirm checks
J. Bruce Fields [Sat, 19 May 2012 17:55:22 +0000 (13:55 -0400)]
nfsd4: simpler ordering of setclientid_confirm checks

The cases here divide into two main categories:

- if there's an unconfirmed record with a matching verifier,
  then this is a "normal", successful case: we're either creating
  a new client, or updating an existing one.
- otherwise, this is a weird case: a replay, or a server reboot.

Reordering to reflect that makes the code a bit more concise and the
logic a lot easier to understand.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
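
The two-category ordering described in the commit above might look roughly like this. It is a sketch, not the nfsd code: the record type, the verifier field and the outcome names are all hypothetical, and the "weird" cases (replay, server reboot) are deliberately not told apart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome labels for illustration. */
enum confirm_outcome_sketch {
    CONFIRM_NEW_CLIENT,  /* normal: creating a new client */
    CONFIRM_UPDATE,      /* normal: updating an existing client */
    CONFIRM_WEIRD        /* replay or server reboot */
};

/* Hypothetical, simplified client record. */
struct record_sketch {
    unsigned long long verifier;
};

static enum confirm_outcome_sketch
classify_confirm_sketch(const struct record_sketch *conf,
                        const struct record_sketch *unconf,
                        unsigned long long verf)
{
    /* Test for the normal case first: unconfirmed record, matching verifier. */
    if (unconf && unconf->verifier == verf)
        return conf ? CONFIRM_UPDATE : CONFIRM_NEW_CLIENT;

    /* Everything else falls into the weird category. */
    return CONFIRM_WEIRD;
}
```

Checking the common case up front leaves a single fall-through for the unusual ones, which is the conciseness the message refers to.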
12 years agonfsd4: setclientid: remove pointless assignment
J. Bruce Fields [Wed, 23 May 2012 15:38:38 +0000 (11:38 -0400)]
nfsd4: setclientid: remove pointless assignment

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: fix error return in non-matching-creds case
J. Bruce Fields [Sat, 19 May 2012 14:05:58 +0000 (10:05 -0400)]
nfsd4: fix error return in non-matching-creds case

Note CLID_INUSE is for the case where two clients are trying to use the
same client-provided long-form client identifier.  But what we're
looking at here is the server-returned shorthand client id--if those
clash there's a bug somewhere.

Fix the error return, pull the check out into common code, and do the
check unconditionally in all cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: fix setclientid_confirm same_cred check
J. Bruce Fields [Sat, 19 May 2012 02:42:16 +0000 (22:42 -0400)]
nfsd4: fix setclientid_confirm same_cred check

New clients are created only by nfsd4_setclientid(), which always gives
any new client a unique clientid.  The only exception is in the
"callback update" case, in which case it may create an unconfirmed
client with the same clientid as a confirmed client.  In that case it
also checks that the confirmed client has the same credential.

Therefore, it is pointless for setclientid_confirm to check whether a
confirmed and unconfirmed client with the same clientid have matching
credentials--they're guaranteed to.

Instead, it should be checking whether the credential on the
setclientid_confirm matches either of those.  Otherwise, it could be
anyone sending the setclientid_confirm.  Granted, I can't see why anyone
would, but still it's probably safer to check.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: merge 3 setclientid cases to 2
J. Bruce Fields [Sat, 19 May 2012 02:23:42 +0000 (22:23 -0400)]
nfsd4: merge 3 setclientid cases to 2

Boy, is this simpler.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: pull out common code from setclientid cases
J. Bruce Fields [Sat, 19 May 2012 02:06:41 +0000 (22:06 -0400)]
nfsd4: pull out common code from setclientid cases

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: merge last two setclientid cases
J. Bruce Fields [Sat, 19 May 2012 02:00:38 +0000 (22:00 -0400)]
nfsd4: merge last two setclientid cases

The code here is mostly the same.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: setclientid/confirm comment cleanup
J. Bruce Fields [Sat, 19 May 2012 01:54:19 +0000 (21:54 -0400)]
nfsd4: setclientid/confirm comment cleanup

Be a little more concise.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: setclientid remove unnecessary terms from a logical expression
J. Bruce Fields [Sat, 19 May 2012 01:34:55 +0000 (21:34 -0400)]
nfsd4: setclientid remove unnecessary terms from a logical expression

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: move rq_flavor into svc_cred
J. Bruce Fields [Tue, 15 May 2012 02:06:49 +0000 (22:06 -0400)]
nfsd4: move rq_flavor into svc_cred

Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: stricter cred comparison for setclientid/exchange_id
J. Bruce Fields [Tue, 15 May 2012 01:20:54 +0000 (21:20 -0400)]
nfsd4: stricter cred comparison for setclientid/exchange_id

The typical setclientid or exchange_id will probably be performed with a
credential that maps to either root or nobody, so comparing just uid's
is unlikely to be useful.  So, use everything else we can get our hands
on.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
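
To illustrate what "use everything else we can get our hands on" could mean in the commit above, here is a userspace sketch that compares more than the uid. The struct and its fields are hypothetical stand-ins for svc_cred; the separate rq_flavor commit in this log would add the security flavor to the same comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified credential. */
struct cred_sketch {
    unsigned        uid, gid;
    size_t          ngroups;
    const unsigned *groups;
    const char     *principal;  /* may be NULL when no principal is known */
};

static bool same_groups_sketch(const struct cred_sketch *a,
                               const struct cred_sketch *b)
{
    if (a->ngroups != b->ngroups)
        return false;
    for (size_t i = 0; i < a->ngroups; i++)
        if (a->groups[i] != b->groups[i])
            return false;
    return true;
}

static bool same_creds_sketch(const struct cred_sketch *a,
                              const struct cred_sketch *b)
{
    /* uid alone is weak: most callers map to root or nobody. */
    if (a->uid != b->uid || a->gid != b->gid || !same_groups_sketch(a, b))
        return false;
    /* Two missing principals match; one missing principal does not. */
    if (!a->principal || !b->principal)
        return a->principal == b->principal;
    return strcmp(a->principal, b->principal) == 0;
}
```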
12 years agonfsd4: move principal name into svc_cred
J. Bruce Fields [Mon, 14 May 2012 23:55:22 +0000 (19:55 -0400)]
nfsd4: move principal name into svc_cred

Instead of keeping the principal name associated with a request in a
structure that's private to auth_gss and using an accessor function,
move it to svc_cred.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: allow removing clients not holding state
J. Bruce Fields [Mon, 14 May 2012 19:57:23 +0000 (15:57 -0400)]
nfsd4: allow removing clients not holding state

RFC 5661 actually says we should allow an exchange_id to remove a
matching client, even if the exchange_id comes from a different
principal, *if* the victim client lacks any state.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: rearrange exchange_id logic to simplify
J. Bruce Fields [Sun, 13 May 2012 00:37:23 +0000 (20:37 -0400)]
nfsd4: rearrange exchange_id logic to simplify

Minor cleanup: it's simpler to have separate code paths for the update
and non-update cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: exchange_id cleanup: comments
J. Bruce Fields [Mon, 14 May 2012 13:47:11 +0000 (09:47 -0400)]
nfsd4: exchange_id cleanup: comments

Make these comments a bit more concise and uniform.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: exchange_id cleanup: local shorthands for repeated tests
J. Bruce Fields [Mon, 14 May 2012 13:08:10 +0000 (09:08 -0400)]
nfsd4: exchange_id cleanup: local shorthands for repeated tests

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: allow an EXCHANGE_ID to kill a 4.0 client
J. Bruce Fields [Sun, 13 May 2012 01:32:30 +0000 (21:32 -0400)]
nfsd4: allow an EXCHANGE_ID to kill a 4.0 client

Following rfc 5661 section 2.4.1, we can permit a 4.1 client to remove
an established 4.0 client's state.

(But we don't allow updates.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: exchange_id: check creds before killing confirmed client
J. Bruce Fields [Sun, 13 May 2012 01:08:41 +0000 (21:08 -0400)]
nfsd4: exchange_id: check creds before killing confirmed client

We mustn't allow a client to destroy another client with established
state unless it has the right credential.

And some minor cleanup.

(Note: our comparison of credentials is actually pretty bogus currently;
that will need to be fixed in another patch.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: exchange_id error cleanup
J. Bruce Fields [Sun, 13 May 2012 00:53:20 +0000 (20:53 -0400)]
nfsd4: exchange_id error cleanup

There's no point to the dprintk here as the main proc_compound loop
already does this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: exchange_id has a pointless copy
J. Bruce Fields [Fri, 4 May 2012 19:16:06 +0000 (15:16 -0400)]
nfsd4: exchange_id has a pointless copy

We just verified above that these two verifiers are already the same.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agosvcrpc: fix a comment typo
J. Bruce Fields [Wed, 16 May 2012 21:14:14 +0000 (17:14 -0400)]
svcrpc: fix a comment typo

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: return 0 on reads of fault injection files
Weston Andros Adamson [Thu, 10 May 2012 19:31:10 +0000 (15:31 -0400)]
nfsd: return 0 on reads of fault injection files

debugfs read operations were returning the contents of an uninitialized u64.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
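
The fix described above amounts to initializing the value handed back to the reader. A minimal userspace sketch, assuming a getter-style callback; the name inject_get_sketch() and its signature are illustrative and not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "get" callback: always report 0 rather than stack garbage. */
static int inject_get_sketch(void *data, uint64_t *val)
{
    (void)data;   /* unused in this sketch */
    *val = 0;     /* without this store the caller would see an uninitialized value */
    return 0;
}
```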
12 years agonfsd: wrap all accesses to st_deny_bmap
Jeff Layton [Fri, 11 May 2012 13:45:14 +0000 (09:45 -0400)]
nfsd: wrap all accesses to st_deny_bmap

Handle the st_deny_bmap in a similar fashion to the st_access_bmap. Add
accessor functions and use those instead of bare bitops.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: wrap accesses to st_access_bmap
Jeff Layton [Fri, 11 May 2012 13:45:13 +0000 (09:45 -0400)]
nfsd: wrap accesses to st_access_bmap

Currently, we do this for the most part with "bare" bitops, but
eventually we'll need to expand the share mode code to handle access
and deny modes on other nodes.

In order to facilitate that code in the future, move to some generic
accessor functions. For now, these are mostly static inlines, but
eventually we'll want to move these to "real" functions that are
able to handle multi-node configurations or have a way to "swap in"
new operations to be done in lieu of or in conjunction with these
atomic bitops.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
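
As a rough picture of what "generic accessor functions" instead of bare bitops can look like, consider the sketch below. The struct and the function names are hypothetical, and plain shifts stand in for the kernel's bit operations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified open stateid holding only the access bitmap. */
struct stateid_sketch {
    unsigned long st_access_bmap;
};

/* Callers go through these wrappers and never touch the bitmap directly. */
static inline void set_access_sketch(unsigned access, struct stateid_sketch *stp)
{
    stp->st_access_bmap |= 1UL << access;
}

static inline void clear_access_sketch(unsigned access, struct stateid_sketch *stp)
{
    stp->st_access_bmap &= ~(1UL << access);
}

static inline bool test_access_sketch(unsigned access, const struct stateid_sketch *stp)
{
    return (stp->st_access_bmap & (1UL << access)) != 0;
}
```

Because every access is funnelled through three functions, their bodies could later be swapped for something multi-node aware without touching the callers, which is the motivation the message gives.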
12 years agonfsd: make test_share a bool return
Jeff Layton [Fri, 11 May 2012 13:45:12 +0000 (09:45 -0400)]
nfsd: make test_share a bool return

All of the callers treat the return that way already.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: consolidate set_access and set_deny
Jeff Layton [Fri, 11 May 2012 13:45:11 +0000 (09:45 -0400)]
nfsd: consolidate set_access and set_deny

These functions are identical. Also, rename them to bmap_to_share_mode
to better reflect what they do, and have them just return the result
instead of passing in a pointer to the storage location.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
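
The "return the result instead of passing in a pointer" change can be sketched as follows. This assumes the bitmap has one bit per share mode value 1..3 (bit i set means mode i was requested); that layout is an assumption of the sketch, not something stated in the message.

```c
#include <assert.h>

/* Fold a per-mode bitmap into a single share mode and return it by value. */
static unsigned int bmap_to_share_mode_sketch(unsigned long bmap)
{
    unsigned int mode = 0;

    for (unsigned int i = 1; i < 4; i++)
        if (bmap & (1UL << i))
            mode |= i;   /* OR together every mode recorded in the bitmap */
    return mode;
}
```

One function serves both the access and the deny bitmap, since the computation is identical for the two.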
12 years agoNFSD: SETCLIENTID_CONFIRM returns NFS4ERR_CLID_INUSE too often
Chuck Lever [Tue, 15 May 2012 21:42:08 +0000 (17:42 -0400)]
NFSD: SETCLIENTID_CONFIRM returns NFS4ERR_CLID_INUSE too often

According to RFC 3530bis, the only items SETCLIENTID_CONFIRM processing
should be concerned with are the clientid, clientid verifier, and
principal.  The client's IP address is not supposed to be interesting.

And, NFS4ERR_CLID_INUSE is meant only for principal mismatches.

I triggered this logic with a prototype UCS client -- one that
uses the same nfs_client_id4 string for all servers.  The client
mounted our server via its IPv4, then via its IPv6 address.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLockD: add debug message to start and stop functions
Stanislav Kinsbursky [Wed, 25 Apr 2012 14:23:16 +0000 (18:23 +0400)]
LockD: add debug message to start and stop functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLockD: service start function introduced
Stanislav Kinsbursky [Wed, 25 Apr 2012 14:23:09 +0000 (18:23 +0400)]
LockD: service start function introduced

This is just a code move, which from my POV makes the code look better.
I.e. now on start we have 3 different stages:
1) Service creation.
2) Service per-net data allocation.
3) Service start.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLockD: move global usage counter manipulation from error path
Stanislav Kinsbursky [Wed, 25 Apr 2012 14:23:02 +0000 (18:23 +0400)]
LockD: move global usage counter manipulation from error path

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLockD: service creation function introduced
Stanislav Kinsbursky [Wed, 25 Apr 2012 14:22:54 +0000 (18:22 +0400)]
LockD: service creation function introduced

This function creates the service if it doesn't exist, or increases the usage
counter if it does, and returns a pointer to it.  The usage counter will
be dropped by svc_destroy() later in lockd_up().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
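
The "create it, or take another reference if it exists" pattern from the commit above, reduced to a userspace sketch. The names are hypothetical, there is no locking, and malloc-style allocation stands in for the real service setup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified service object with a usage counter. */
struct service_sketch {
    int users;
};

static struct service_sketch *the_service;  /* single global instance */

/* Return the service, creating it on first use; NULL on allocation failure. */
static struct service_sketch *service_get_or_create(void)
{
    if (the_service) {
        the_service->users++;          /* already exists: just take a reference */
        return the_service;
    }
    the_service = calloc(1, sizeof(*the_service));
    if (the_service)
        the_service->users = 1;        /* creator holds the first reference */
    return the_service;
}

/* Drop one reference; the last put destroys the service. */
static void service_put(struct service_sketch *s)
{
    if (--s->users == 0) {
        free(s);
        the_service = NULL;
    }
}
```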
12 years agoLockD: use existing per-net data function on service creation
Stanislav Kinsbursky [Wed, 25 Apr 2012 14:22:47 +0000 (18:22 +0400)]
LockD: use existing per-net data function on service creation

This patch also replaces svc_rpcb_setup() with svc_bind().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLockD: pass service to per-net up and down functions
Stanislav Kinsbursky [Wed, 25 Apr 2012 14:22:40 +0000 (18:22 +0400)]
LockD: pass service to per-net up and down functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agosunrpc: do array overrun check in svc_recv before allocating pages
Jeff Layton [Fri, 4 May 2012 15:44:12 +0000 (11:44 -0400)]
sunrpc: do array overrun check in svc_recv before allocating pages

There's little point in waiting until after we allocate all of the pages
to see if we're going to overrun the array. In the event that this
calculation is really off we could end up scribbling over a bunch of
memory and make it tougher to debug.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
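
The reordering argued for above, checking the bound before doing any allocation, might be sketched like this. The constants, names and failure handling are assumptions of the sketch and do not mirror svc_recv().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_PAGES 8      /* hypothetical array capacity */
#define SKETCH_PAGE_SIZE 4096   /* hypothetical page size */

/*
 * Fill pages[] with enough buffers for a message of max_mesg bytes.
 * Returns the number of pages allocated, or -1 on error.
 */
static int alloc_pages_sketch(void *pages[SKETCH_MAX_PAGES], size_t max_mesg)
{
    size_t needed = max_mesg / SKETCH_PAGE_SIZE + 1;
    size_t i;

    /* Bounds check first: nothing has been allocated or written yet. */
    if (needed > SKETCH_MAX_PAGES)
        return -1;

    for (i = 0; i < needed; i++) {
        pages[i] = malloc(SKETCH_PAGE_SIZE);
        if (!pages[i]) {
            /* Undo the partial allocation before reporting failure. */
            while (i > 0) {
                free(pages[--i]);
                pages[i] = NULL;
            }
            return -1;
        }
    }
    return (int)needed;
}
```

With the check at the top, a miscalculated size is rejected while the array is still untouched, rather than after the loop has already written past its end.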
12 years agoSUNRPC: move per-net operations from svc_destroy()
Stanislav Kinsbursky [Fri, 4 May 2012 08:49:41 +0000 (12:49 +0400)]
SUNRPC: move per-net operations from svc_destroy()

The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.

Notes:

1) For NFS server this patch looks ugly (sorry for that). But this
place will be rewritten soon during NFSd containerization.

2) LockD per-net counter increase in lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoSUNRPC: new svc_bind() routine introduced
Stanislav Kinsbursky [Wed, 2 May 2012 12:08:38 +0000 (16:08 +0400)]
SUNRPC: new svc_bind() routine introduced

This new routine is responsible for service registration in a specified
network context.

The idea is to separate service creation from per-net operations.

Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agorpc: handle rotated gss data for Windows interoperability
J. Bruce Fields [Thu, 12 Apr 2012 00:08:45 +0000 (20:08 -0400)]
rpc: handle rotated gss data for Windows interoperability

The data in Kerberos gss tokens can be rotated.  But we were lazy and
rejected any nonzero rotation value.  It wasn't necessary for the
implementations we were testing against at the time.

But it appears that Windows does use a nonzero value here.

So, implement rotation to bring ourselves into compliance with the spec
and to interoperate with Windows.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
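
For readers unfamiliar with the rotation mentioned above: as I understand RFC 4121, the token carries a right-rotation count, so a receiver undoes it by rotating the data left by the same amount. The sketch below does that on a flat byte array with the three-reversal trick; the kernel patch works on its own buffer structures, so this is only an illustration of the idea, with hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the bytes in the half-open range [lo, hi). */
static void reverse_sketch(unsigned char *buf, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        unsigned char tmp = buf[lo];
        buf[lo++] = buf[--hi];
        buf[hi] = tmp;
    }
}

/* Rotate buf left by shift bytes, in place; shift may exceed len. */
static void rotate_left_sketch(unsigned char *buf, size_t len, size_t shift)
{
    if (len == 0)
        return;
    shift %= len;
    if (shift == 0)
        return;
    reverse_sketch(buf, 0, shift);    /* reverse the head */
    reverse_sketch(buf, shift, len);  /* reverse the tail */
    reverse_sketch(buf, 0, len);      /* reverse everything */
}
```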
12 years agonfsd: add IPv6 addr escaping to fs_location hosts
Weston Andros Adamson [Tue, 24 Apr 2012 15:07:59 +0000 (11:07 -0400)]
nfsd: add IPv6 addr escaping to fs_location hosts

The fs_location->hosts list is split on colons, but this doesn't work when
IPv6 addresses are used (they contain colons).
This patch adds the function nfsd4_encode_components_esc() to
allow the caller to specify escape characters when splitting on 'sep'.
In order to fix referrals, this patch must be used with the mountd patch
that similarly fixes IPv6 [] escaping.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
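
The splitting problem described above can be illustrated with a small helper that finds the end of the next component while ignoring separators inside an escaped region. This is a sketch with a hypothetical name and signature; it is not nfsd4_encode_components_esc(), and it only locates boundaries without encoding anything or stripping the brackets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the first component of s, where components are
 * separated by sep, except that sep does not split between esc_open
 * and esc_close (for example '[' and ']' around an IPv6 address).
 */
static size_t next_component_sketch(const char *s, char sep,
                                    char esc_open, char esc_close)
{
    size_t i;
    int escaped = 0;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == esc_open)
            escaped = 1;
        else if (s[i] == esc_close)
            escaped = 0;
        else if (s[i] == sep && !escaped)
            break;   /* an unescaped separator ends the component */
    }
    return i;
}
```

With ':' as the separator and '[' / ']' as escapes, "[fe80::1]:host2" yields a first component of nine characters, the whole bracketed address, instead of stopping at the first colon inside it.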
12 years agonfsd4: fix change attribute endianness
J. Bruce Fields [Wed, 25 Apr 2012 22:11:04 +0000 (18:11 -0400)]
nfsd4: fix change attribute endianness

Though actually this doesn't matter much, as NFSv4.0 clients are
required to treat the change attribute as opaque.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: fix free_stateid return endianness
J. Bruce Fields [Wed, 25 Apr 2012 22:04:54 +0000 (18:04 -0400)]
nfsd4: fix free_stateid return endianness

Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: int/__be32 fixes
J. Bruce Fields [Wed, 25 Apr 2012 21:58:50 +0000 (17:58 -0400)]
nfsd4: int/__be32 fixes

In each of these cases there's a simple unambiguous correct choice, and
no actual bug.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: preserve __user annotation on cld downcall msg
J. Bruce Fields [Wed, 25 Apr 2012 20:58:05 +0000 (16:58 -0400)]
nfsd4: preserve __user annotation on cld downcall msg

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd4: fix missing "static"
J. Bruce Fields [Wed, 25 Apr 2012 20:56:22 +0000 (16:56 -0400)]
nfsd4: fix missing "static"

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: state.c should include current_stateid.h
J. Bruce Fields [Wed, 25 Apr 2012 20:49:18 +0000 (16:49 -0400)]
nfsd: state.c should include current_stateid.h

OK, admittedly I'm mainly just trying to shut sparse up.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek
Jeff Layton [Wed, 25 Apr 2012 19:30:00 +0000 (15:30 -0400)]
nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek

They're equivalent, but SEEK_SET is more informative...

Cc: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoSUNRPC: split upcall function to extract reusable parts
Simo Sorce [Tue, 17 Apr 2012 13:39:06 +0000 (09:39 -0400)]
SUNRPC: split upcall function to extract reusable parts

This is needed to share code between the current server upcall mechanism
and the new gssproxy upcall mechanism introduced in a following patch.

Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: allocate id-to-name and name-to-id caches in per-net operations.
Stanislav Kinsbursky [Wed, 11 Apr 2012 13:33:05 +0000 (17:33 +0400)]
nfsd: allocate id-to-name and name-to-id caches in per-net operations.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: make name-to-id cache allocated per network namespace context
Stanislav Kinsbursky [Wed, 11 Apr 2012 13:32:58 +0000 (17:32 +0400)]
nfsd: make name-to-id cache allocated per network namespace context

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: make id-to-name cache allocated per network namespace context
Stanislav Kinsbursky [Wed, 11 Apr 2012 13:32:51 +0000 (17:32 +0400)]
nfsd: make id-to-name cache allocated per network namespace context

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: pass network context to idmap init/exit functions
Stanislav Kinsbursky [Wed, 11 Apr 2012 13:32:44 +0000 (17:32 +0400)]
nfsd: pass network context to idmap init/exit functions

These functions will be called from per-net operations.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: allocate export and expkey caches in per-net operations.
Stanislav Kinsbursky [Wed, 11 Apr 2012 11:13:35 +0000 (15:13 +0400)]
nfsd: allocate export and expkey caches in per-net operations.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: make expkey cache allocated per network namespace context
Stanislav Kinsbursky [Wed, 11 Apr 2012 11:13:28 +0000 (15:13 +0400)]
nfsd: make expkey cache allocated per network namespace context

This patch also changes the svcauth_unix_purge() function: it now takes a network
namespace as a parameter, so the loop over all networks is replaced by a single
call to purge the ip map cache.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: make export cache allocated per network namespace context
Stanislav Kinsbursky [Wed, 11 Apr 2012 11:13:21 +0000 (15:13 +0400)]
nfsd: make export cache allocated per network namespace context

This patch also changes prototypes of nfsd_export_flush() and exp_rootfh():
network namespace parameter added.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: pass pointer to export cache down to stack wherever possible.
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:42 +0000 (19:09 +0400)]
nfsd: pass pointer to export cache down to stack wherever possible.

This cache will be per-net soon. And it's easier to get the pointer to the desired
per-net instance only once and then pass it down instead of discovering it in
every place where it is required.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: pass network context to export caches init/shutdown routines
Stanislav Kinsbursky [Wed, 11 Apr 2012 11:13:14 +0000 (15:13 +0400)]
nfsd: pass network context to export caches init/shutdown routines

These functions will be called from per-net operations.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLockd: pass network namespace to creation and destruction routines
Stanislav Kinsbursky [Thu, 29 Mar 2012 14:54:33 +0000 (18:54 +0400)]
Lockd: pass network namespace to creation and destruction routines

v2: dereference of most probably already released nlm_host removed in
nlmclnt_done() and reclaimer().

These routines are called from the lock reclaimer() kernel thread. This thread
works in the "init_net" network context and currently relies on the presence of the
lockd thread and its per-net resources. Thus lockd_up() and lockd_down() can't rely
on the current network context. So let's pass the correct one into them.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoNFSd: remove hard-coded dereferences to name-to-id and id-to-name caches
Stanislav Kinsbursky [Thu, 29 Mar 2012 15:34:16 +0000 (19:34 +0400)]
NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches

These dereferences to global static caches are redundant. They also prevent
converting these caches into per-net ones. So this patch is a cleanup and a precursor
of the patch set which will make them per-net.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: pass pointer to expkey cache down to stack wherever possible.
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:50 +0000 (19:09 +0400)]
nfsd: pass pointer to expkey cache down to stack wherever possible.

This cache will be per-net soon. And it's easier to get the pointer to the desired
per-net instance only once and then pass it down instead of discovering it in
every place where it is required.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: use hash table from cache detail in nfsd export seq ops
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:35 +0000 (19:09 +0400)]
nfsd: use hash table from cache detail in nfsd export seq ops

The hard-coded reference is redundant and would prevent making the caches per net ns.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: pass svc_export_cache pointer as private data to "exports" seq file ops
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:29 +0000 (19:09 +0400)]
nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops

The global svc_export_cache is going to be replaced with a per-net instance, so
prepare the ground for it.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: use exp_put() for svc_export_cache put
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:22 +0000 (19:09 +0400)]
nfsd: use exp_put() for svc_export_cache put

This patch replaces cache_put() call for svc_export_cache by exp_put() call.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: use cache detail pointer from svc_export structure on cache put
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:15 +0000 (19:09 +0400)]
nfsd: use cache detail pointer from svc_export structure on cache put

Hard-coded pointer is redundant now and can be replaced.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: add link to owner cache detail to svc_export structure
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:08 +0000 (19:09 +0400)]
nfsd: add link to owner cache detail to svc_export structure

Without information about the owning cache detail, there is no way to find out
which per-net cache detail an export belongs to.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: use passed cache_detail pointer expkey_parse()
Stanislav Kinsbursky [Wed, 28 Mar 2012 15:09:01 +0000 (19:09 +0400)]
nfsd: use passed cache_detail pointer expkey_parse()

Use of the hard-coded svc_expkey_cache pointer in expkey_parse() looks redundant.
Moreover, the global cache will be replaced with a per-net instance soon.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: don't use locks_in_grace to determine whether to call nfs4_grace_end
Jeff Layton [Tue, 10 Apr 2012 15:08:48 +0000 (11:08 -0400)]
nfsd: don't use locks_in_grace to determine whether to call nfs4_grace_end

It's possible that lockd or another lock manager might still be on the
list after we call nfsd4_end_grace. If the laundromat thread runs
again at that point, then we could end up calling nfsd4_end_grace more
than once.

That's not only inefficient, but calling nfsd4_recdir_purge_old more
than once could be problematic. Fix this by adding a new global
"grace_ended" flag and use that to determine whether we've already
called nfsd4_grace_end.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
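
The run-once guard described above, in miniature. The flag name follows the message; the function name and the counter that stands in for the one-time cleanup work are hypothetical, and the sketch ignores locking.

```c
#include <assert.h>
#include <stdbool.h>

static bool grace_ended;   /* set once the grace period has been ended */
static int  purge_calls;   /* stands in for the one-time cleanup work */

/* Safe to call repeatedly; only the first call does the cleanup. */
static void end_grace_sketch(void)
{
    if (grace_ended)
        return;
    grace_ended = true;
    purge_calls++;
}
```

Keying on a dedicated flag, rather than on whether any lock manager is still in grace, keeps the decision independent of what other lock managers are doing.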
12 years agosvcauth: remove unused define
Simo Sorce [Thu, 29 Mar 2012 23:18:19 +0000 (19:18 -0400)]
svcauth: remove unused define

Signed-off-by: Simo Sorce <simo@redhat.com>
12 years agonfsd: trivial: remove unused variable from nfsd4_lock
Jeff Layton [Fri, 30 Mar 2012 13:46:21 +0000 (09:46 -0400)]
nfsd: trivial: remove unused variable from nfsd4_lock

..."fp" is set but never used.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agonfsd: don't fail unchecked creates of non-special files
J. Bruce Fields [Mon, 9 Apr 2012 22:06:49 +0000 (18:06 -0400)]
nfsd: don't fail unchecked creates of non-special files

Allow a v3 unchecked open of a non-regular file to succeed as if it were a
lookup; typically a client in such a case will want to fall back on a
local open, so succeeding and giving it the filehandle is more useful
than failing with nfserr_exist, which makes it appear that nothing at
all exists by that name.

Similarly for v4, on an open-create, return the same errors we would on
an attempt to open a non-regular file, instead of returning
nfserr_exist.

This fixes a problem found doing a v4 open of a symlink with
O_RDONLY|O_CREAT, which resulted in the current client returning EEXIST.

Thanks also to Trond for analysis.

Cc: stable@kernel.org
Reported-by: Orion Poplawski <orion@cora.nwra.com>
Tested-by: Orion Poplawski <orion@cora.nwra.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
12 years agoLinux 3.4-rc2 v3.4-rc2
Linus Torvalds [Sun, 8 Apr 2012 01:30:41 +0000 (18:30 -0700)]
Linux 3.4-rc2

12 years agoMerge tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Linus Torvalds [Sat, 7 Apr 2012 16:56:00 +0000 (09:56 -0700)]
Merge tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull two more small regmap fixes from Mark Brown:
 - Now we have users for it that aren't running Android it turns out
   that regcache_sync_region() is much more useful to drivers if it's
   exported for use by modules.  Who knew?
 - Make sure we don't divide by zero when doing debugfs dumps of
   rbtrees, not visible up until now because everything was providing at
   least some cache on startup.

* tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: prevent division by zero in rbtree_show
  regmap: Export regcache_sync_region()

12 years agoMerge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 7 Apr 2012 16:53:33 +0000 (09:53 -0700)]
Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull a few KVM fixes from Avi Kivity:
 "A bunch of powerpc KVM fixes, a guest and a host RCU fix (unrelated),
  and a small build fix."

* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Resolve RCU vs. async page fault problem
  KVM: VMX: vmx_set_cr0 expects kvm->srcu locked
  KVM: PMU: Fix integer constant is too large warning in kvm_pmu_set_msr()
  KVM: PPC: Book3S: PR: Fix preemption
  KVM: PPC: Save/Restore CR over vcpu_run
  KVM: PPC: Book3S HV: Save and restore CR in __kvmppc_vcore_entry
  KVM: PPC: Book3S HV: Fix kvm_alloc_linear in case where no linears exist
  KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code

12 years agoMerge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Linus Torvalds [Sat, 7 Apr 2012 16:52:46 +0000 (09:52 -0700)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

Pull SuperH fixes from Paul Mundt.

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
  sh: fix clock-sh7757 for the latest sh_mobile_sdhi driver
  serial: sh-sci: use serial_port_in/out vs sci_in/out.
  sh: vsyscall: Fix up .eh_frame generation.
  sh: dma: Fix up device attribute mismatch from sysdev fallout.
  sh: dwarf unwinder depends on SHcompact.
  sh: fix up fallout from system.h disintegration.

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Sat, 7 Apr 2012 16:51:36 +0000 (09:51 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security layer fixlet from James Morris.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  sysctl: fix write access to dmesg_restrict/kptr_restrict

12 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Linus Torvalds [Sat, 7 Apr 2012 02:56:04 +0000 (19:56 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull ACPI & Power Management patches from Len Brown:
 "Two fixes for cpuidle merge-window changes, plus a URL fix in
  MAINTAINERS"

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  MAINTAINERS: Update git url for ACPI
  cpuidle: Fix panic in CPU off-lining with no idle driver
  ACPI processor: Use safe_halt() rather than halt() in acpi_idle_play_dead()

12 years agoMerge branch '3.4-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab...
Linus Torvalds [Sat, 7 Apr 2012 02:54:26 +0000 (19:54 -0700)]
Merge branch '3.4-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull target fixes from Nicholas Bellinger:
 "Pull two tcm_fc fabric related fixes for -rc2:

  Note that both have been CC'ed to stable, and patch #1 is the
  important one that addresses a memory corruption bug related to FC
  exchange timeouts + command abort.

  Thanks again to MDR for tracking down this issue!"

* '3.4-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  tcm_fc: Do not free tpg structure during wq allocation failure
  tcm_fc: Add abort flag for gracefully handling exchange timeout

12 years agotcm_fc: Do not free tpg structure during wq allocation failure
Mark Rustad [Tue, 3 Apr 2012 17:24:52 +0000 (10:24 -0700)]
tcm_fc: Do not free tpg structure during wq allocation failure

Avoid freeing a registered tpg structure if an alloc_workqueue call
fails.  This fixes a bug where the failure was leaking memory associated
with se_portal_group setup during the original core_tpg_register() call.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Acked-by: Kiran Patil <Kiran.patil@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
12 years agotcm_fc: Add abort flag for gracefully handling exchange timeout
Mark Rustad [Tue, 3 Apr 2012 17:24:41 +0000 (10:24 -0700)]
tcm_fc: Add abort flag for gracefully handling exchange timeout

Add abort flag and use it to terminate processing when an exchange
is timed out or is reset. The abort flag is used in place of the
transport_generic_free_cmd function call in the reset and timeout
cases, because calling that function in that context would free
memory that was in use. The aborted flag allows the lifetime to
be managed in a more normal way, while truncating the processing.

This change eliminates a source of memory corruption which
manifested in a variety of ugly ways.

(nab: Drop unused struct fc_exch *ep in ft_recv_seq)

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Acked-by: Kiran Patil <Kiran.patil@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
12 years agoMerge branches 'idle-fix' and 'misc' into release
Len Brown [Sat, 7 Apr 2012 01:48:59 +0000 (21:48 -0400)]
Merge branches 'idle-fix' and 'misc' into release

12 years agoMAINTAINERS: Update git url for ACPI
Igor Murzov [Fri, 30 Mar 2012 18:40:12 +0000 (22:40 +0400)]
MAINTAINERS: Update git url for ACPI

Signed-off-by: Igor Murzov <e-mail@date.by>
Signed-off-by: Len Brown <len.brown@intel.com>
12 years agoMerge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux...
Linus Torvalds [Sat, 7 Apr 2012 00:56:20 +0000 (17:56 -0700)]
Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile bug fixes from Chris Metcalf:
 "This includes Paul Gortmaker's change to fix the <asm/system.h>
  disintegration issues on tile, a fix to unbreak the tilepro ethernet
  driver, and a backlog of bugfix-only changes from internal Tilera
  development over the last few months.

  They have all been to LKML and on linux-next for the last few days.
  The EDAC change to MAINTAINERS is an oddity but discussion on the
  linux-edac list suggested I ask you to pull that change through my
  tree since they don't have a tree to pull edac changes from at the
  moment."

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (39 commits)
  drivers/net/ethernet/tile: fix netdev_alloc_skb() bombing
  MAINTAINERS: update EDAC information
  tilepro ethernet driver: fix a few minor issues
  tile-srom.c driver: minor code cleanup
  edac: say "TILEGx" not "TILEPro" for the tilegx edac driver
  arch/tile: avoid accidentally unmasking NMI-type interrupt accidentally
  arch/tile: remove bogus performance optimization
  arch/tile: return SIGBUS for addresses that are unaligned AND invalid
  arch/tile: fix finv_buffer_remote() for tilegx
  arch/tile: use atomic exchange in arch_write_unlock()
  arch/tile: stop mentioning the "kvm" subdirectory
  arch/tile: export the page_home() function.
  arch/tile: fix pointer cast in cacheflush.c
  arch/tile: fix single-stepping over swint1 instructions on tilegx
  arch/tile: implement panic_smp_self_stop()
  arch/tile: add "nop" after "nap" to help GX idle power draw
  arch/tile: use proper memparse() for "maxmem" options
  arch/tile: fix up locking in pgtable.c slightly
  arch/tile: don't leak kernel memory when we unload modules
  arch/tile: fix bug in delay_backoff()
  ...

12 years agoMerge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 7 Apr 2012 00:54:53 +0000 (17:54 -0700)]
Merge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull xen fixes from Konrad Rzeszutek Wilk:
 "Two fixes for regressions:
   * one is a workaround that will be removed in v3.5 with proper fix in
     the tip/x86 tree,
   * the other is to fix drivers to load on PV (a previous patch made
     them only load in PVonHVM mode).

  The rest are just minor fixes in the various drivers and some cleanup
  in the core code."

* tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
  xen/pciback: fix XEN_PCI_OP_enable_msix result
  xen/smp: Remove unnecessary call to smp_processor_id()
  xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'
  xen: only check xen_platform_pci_unplug if hvm

12 years agoMerge tag 'mmc-fixes-for-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 7 Apr 2012 00:22:23 +0000 (17:22 -0700)]
Merge tag 'mmc-fixes-for-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
 - Disable use of MSI in sdhci-pci, which caused multiple chipsets to
   stop working in 3.4-rc1.  I'll wait to turn this on again until we
   have a chipset whitelist for it.
 - Fix a libertas SDIO powered-resume regression introduced in 3.3;
   thanks to Neil Brown and Rafael Wysocki for this fix.
 - Fix module reloading on omap_hsmmc.
 - Stop trusting the spec/card's specified maximum data timeout length,
   and use three seconds instead.  Previously we used 300ms.

Also cleanups and fixes for s3c, atmel, sh_mmcif and omap_hsmmc.

* tag 'mmc-fixes-for-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (28 commits)
  mmc: use really long write timeout to deal with crappy cards
  mmc: sdhci-dove: Fix compile error by including module.h
  mmc: Prevent 1.8V switch for SD hosts that don't support UHS modes.
  Revert "mmc: sdhci-pci: Add MSI support"
  Revert "mmc: sdhci-pci: add quirks for broken MSI on O2Micro controllers"
  mmc: core: fix power class selection
  mmc: omap_hsmmc: fix module re-insertion
  mmc: omap_hsmmc: convert to module_platform_driver
  mmc: omap_hsmmc: make it behave well as a module
  mmc: omap_hsmmc: trivial cleanups
  mmc: omap_hsmmc: context save after enabling runtime pm
  mmc: omap_hsmmc: use runtime put sync in probe error patch
  mmc: sdio: Use empty system suspend/resume callbacks at the bus level
  mmc: bus: print bus speed mode of UHS-I card
  mmc: sdhci-pci: add quirks for broken MSI on O2Micro controllers
  mmc: sh_mmcif: Simplify calculation of mmc->f_min
  mmc: sh_mmcif: mmc->f_max should be half of the bus clock
  mmc: sh_mmcif: double clock speed
  mmc: block: Remove use of mmc_blk_set_blksize
  mmc: atmel-mci: add support for odd clock dividers
  ...

12 years agoMake the "word-at-a-time" helper functions more commonly usable
Linus Torvalds [Fri, 6 Apr 2012 20:54:56 +0000 (13:54 -0700)]
Make the "word-at-a-time" helper functions more commonly usable

I have a new optimized x86 "strncpy_from_user()" that will use these
same helper functions for all the same reasons the name lookup code uses
them.  This is preparation for that.

This moves them into an architecture-specific header file.  It's
architecture-specific for two reasons:

 - some of the functions are likely to want architecture-specific
   implementations.  Even if the current code happens to be "generic" in
   the sense that it should work on any little-endian machine, it's
   likely that the "multiply by a big constant and shift" implementation
   is less than optimal for an architecture that has a guaranteed fast
   bit count instruction, for example.

 - I expect that if architectures like sparc want to start playing
   around with this, we'll need to abstract out a few more details (in
   particular the actual unaligned accesses).  So we're likely to have
   more architecture-specific stuff if non-x86 architectures start using
   this.

   (and if it turns out that non-x86 architectures don't start using
   this, then having it in an architecture-specific header is still the
   right thing to do, of course)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
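
The "multiply by a big constant and shift" idea mentioned above can be modelled outside the kernel. The following Python sketch is one common formulation of word-at-a-time zero-byte detection, assuming 64-bit little-endian words; the function names and the specific constants are illustrative choices for this sketch, not taken from the commit itself.

```python
MASK64 = (1 << 64) - 1
ONE_BYTES = 0x0101010101010101   # 0x01 in every byte
HIGH_BITS = 0x8080808080808080   # 0x80 in every byte
COUNT_MUL = 0x0001020304050608   # multiplier used to count leading 0xff bytes


def has_zero(word):
    """Non-zero if any byte of the 64-bit word is zero.

    The lowest set bit marks the first zero byte; higher bits may be
    false positives caused by borrows, so only the lowest one is used.
    """
    return (word - ONE_BYTES) & ~word & HIGH_BITS & MASK64


def first_zero_byte(word):
    """Index (0..7) of the first zero byte in little-endian order, or None."""
    bits = has_zero(word)
    if bits == 0:
        return None
    # Keep only the bits below the first marker: 0xff for each earlier byte.
    below = ((bits - 1) & ~bits) >> 7
    # Multiply and shift: the top byte of the product is the byte count.
    return ((below * COUNT_MUL) & MASK64) >> 56


def c_strlen(buf):
    """Length up to the first NUL byte, scanning eight bytes at a time."""
    for off in range(0, len(buf), 8):
        # Pad a short final chunk with non-zero bytes so padding never matches.
        chunk = buf[off:off + 8].ljust(8, b"\xff")
        idx = first_zero_byte(int.from_bytes(chunk, "little"))
        if idx is not None:
            return off + idx
    return len(buf)
```

An architecture with a fast bit-count or find-first-set instruction could replace the multiply in first_zero_byte(), which is the kind of per-architecture variation the message anticipates.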
12 years agocpuidle: Fix panic in CPU off-lining with no idle driver
Toshi Kani [Sun, 1 Apr 2012 03:37:02 +0000 (21:37 -0600)]
cpuidle: Fix panic in CPU off-lining with no idle driver

Fix a NULL pointer dereference panic in cpuidle_play_dead() during
CPU off-lining when no cpuidle driver is registered.  A cpuidle
driver may be registered at boot-time based on CPU type.  This patch
allows an off-lined CPU to enter HLT-based idle in this condition.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Boris Ostrovsky <boris.ostrovsky@amd.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 6 Apr 2012 17:37:38 +0000 (10:37 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking updates from David Miller:

 1) Fix inaccuracies in network driver interface documentation, from Ben
    Hutchings.

 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.

 3) Compile warning, locking, and refcounting fixes in netfilter's
    xt_CT, from Pablo Neira Ayuso.

 4) phonet sendmsg needs to validate user length just like any other
    datagram protocol, fix from Sasha Levin.

 5) Ipv6 multicast code uses wrong loop index, from RongQing Li.

 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
    and Yuval Mintz.

 7) mlx4 erroneously allocates 4 pages at a time, regardless of page
    size, fix from Thadeu Lima de Souza Cascardo.

 8) SCTP socket option wasn't extended in a backwards compatible way,
    fix from Thomas Graf.

 9) Add missing address change event emissions to bonding, from Shlomo
    Pongratz.

10) /proc/net/dev regressed because it uses a private offset to track
    where we are in the hash table, but this doesn't track the offset
    pullback that the seq_file code does resulting in some entries being
    missed in large dumps.

    Fix from Eric Dumazet.

11) do_tcp_sendpage() unloads the send queue way too fast, because it
    invokes tcp_push() when it shouldn't.  Let the natural sequence
    generated by the splice paths, and the associated MSG_MORE
    settings, guide the tcp_push() calls.

    Otherwise what goes out of TCP is spaghetti and doesn't batch
    effectively into GSO/TSO clusters.

    From Eric Dumazet.

12) Once we put a SKB into either the netlink receiver's queue or a
    socket error queue, it can be consumed and freed up, therefore we
    cannot touch it after queueing it like that.

    Fixes from Eric Dumazet.

13) PPP has this annoying behavior in that for every transmit call it
    immediately stops the TX queue, then calls down into the next layer
    to transmit the PPP frame.

    But if that next layer can take it immediately, it just un-stops the
    TX queue right before returning from the transmit method.

    Besides being useless work, it makes several facilities unusable, in
    particular things like the equalizers.  Well behaved devices should
    only stop the TX queue when they really are full, and in PPP's case
    when it gets backlogged to the downstream device.

    David Woodhouse therefore fixed PPP to not stop the TX queue until
    its downstream can't take data any more.

14) IFF_UNICAST_FLT got accidentally lost in some recent stmmac driver
    changes, re-add.  From Marc Kleine-Budde.

15) Fix link flaps in ixgbe, from Eric W. Multanen.

16) Descriptor writeback fixes in e1000e from Matthew Vick.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  net: fix a race in sock_queue_err_skb()
  netlink: fix races after skb queueing
  doc, net: Update ndo_start_xmit return type and values
  doc, net: Remove instruction to set net_device::trans_start
  doc, net: Update netdev operation names
  doc, net: Update documentation of synchronisation for TX multiqueue
  doc, net: Remove obsolete reference to dev->poll
  ethtool: Remove exception to the requirement of holding RTNL lock
  MAINTAINERS: update for Marvell Ethernet drivers
  bonding: properly unset current_arp_slave on slave link up
  phonet: Check input from user before allocating
  tcp: tcp_sendpages() should call tcp_push() once
  ipv6: fix array index in ip6_mc_add_src()
  mlx4: allocate just enough pages instead of always 4 pages
  stmmac: re-add IFF_UNICAST_FLT for dwmac1000
  bnx2x: Clear MDC/MDIO warning message
  bnx2x: Fix BCM57711+BCM84823 link issue
  bnx2x: Clear BCM84833 LED after fan failure
  bnx2x: Fix BCM84833 PHY FW version presentation
  bnx2x: Fix link issue for BCM8727 boards.
  ...

12 years agoxen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
Jan Beulich [Mon, 2 Apr 2012 14:22:39 +0000 (15:22 +0100)]
xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success

The original XenoLinux code has always had things this way, and for
compatibility reasons (in particular with a subsequent pciback
adjustment) upstream Linux should behave the same way (allowing for two
distinct error indications to be returned by the backend).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years agoxen/pciback: fix XEN_PCI_OP_enable_msix result
Jan Beulich [Mon, 2 Apr 2012 14:32:22 +0000 (15:32 +0100)]
xen/pciback: fix XEN_PCI_OP_enable_msix result

Prior to 2.6.19 and as of 2.6.31, pci_enable_msix() can return a
positive value to indicate the number of vectors (less than the amount
requested) that can be set up for a given device. Returning this as an
operation value (secondary result) is fine, but (primary) operation
results are expected to be negative (error) or zero (success) according
to the protocol. With the frontend fixed to match the XenoLinux
behavior, the backend can now validly return zero (success) here,
passing the upper limit on the number of vectors in op->value.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
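
The result convention described above can be sketched as a small mapping. This is a Python model under stated assumptions, not the pciback code: enable_msix_reply() is a hypothetical name, the returned pair stands for the primary result and op->value, and the value reported in the zero and error cases is a placeholder because the message does not state it.

```python
import errno


def enable_msix_reply(result):
    """Map a pci_enable_msix()-style result to (primary_result, op_value).

    result < 0  : error, passed through as the primary result
    result == 0 : success
    result > 0  : fewer vectors available than requested; reported as
                  success (0) with the available count in op_value
    """
    if result > 0:
        return 0, result
    # Assumption: op_value is 0 when there is no vector count to report.
    return result, 0
```

For example, enable_msix_reply(4) yields (0, 4), so the primary result stays within the protocol's "negative or zero" range while the vector limit travels in the secondary value.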
12 years agoxen/smp: Remove unnecessary call to smp_processor_id()
Srivatsa S. Bhat [Thu, 22 Mar 2012 12:59:24 +0000 (18:29 +0530)]
xen/smp: Remove unnecessary call to smp_processor_id()

There is an extra and unnecessary call to smp_processor_id()
in cpu_bringup(). Remove it.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years agoxen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic...
Konrad Rzeszutek Wilk [Tue, 20 Mar 2012 19:04:18 +0000 (15:04 -0400)]
xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'

The above mentioned patch checks the IOAPIC and if it contains
-1, then it unmaps said IOAPIC. But under Xen we get this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
PGD 0
Oops: 0002 [#1] SMP
CPU 0
Modules linked in:

Pid: 1, comm: swapper/0 Not tainted 3.2.10-3.fc16.x86_64 #1 Dell Inc. Inspiron
1525                  /0U990C
RIP: e030:[<ffffffff8134e51f>]  [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
RSP: e02b: ffff8800d42cbb70  EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000ffffffef RCX: 0000000000000001
RDX: 0000000000000040 RSI: 00000000ffffffef RDI: 0000000000000001
RBP: ffff8800d42cbb80 R08: ffff8800d6400000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffef
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000010
FS:  0000000000000000(0000) GS:ffff8800df5fe000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0:000000008005003b
CR2: 0000000000000040 CR3: 0000000001a05000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 1, threadinfo ffff8800d42ca000, task ffff8800d42d0000)
Stack:
 00000000ffffffef 0000000000000010 ffff8800d42cbbe0 ffffffff8134f157
 ffffffff8100a9b2 ffffffff8182ffd1 00000000000000a0 00000000829e7384
 0000000000000002 0000000000000010 00000000ffffffff 0000000000000000
Call Trace:
 [<ffffffff8134f157>] xen_bind_pirq_gsi_to_irq+0x87/0x230
 [<ffffffff8100a9b2>] ? check_events+0x12+0x20
 [<ffffffff814bab42>] xen_register_pirq+0x82/0xe0
 [<ffffffff814bac1a>] xen_register_gsi.part.2+0x4a/0xd0
 [<ffffffff814bacc0>] acpi_register_gsi_xen+0x20/0x30
 [<ffffffff8103036f>] acpi_register_gsi+0xf/0x20
 [<ffffffff8131abdb>] acpi_pci_irq_enable+0x12e/0x202
 [<ffffffff814bc849>] pcibios_enable_device+0x39/0x40
 [<ffffffff812dc7ab>] do_pci_enable_device+0x4b/0x70
 [<ffffffff812dc878>] __pci_enable_device_flags+0xa8/0xf0
 [<ffffffff812dc8d3>] pci_enable_device+0x13/0x20

We die because acpi_get_override_irq() is called to obtain the polarity and
trigger for the IRQs. That function calls mp_find_ioapics to get the
'struct ioapic' structure, which along with mp_irq[x] is used to figure out
the default values and the polarity/trigger overrides. Since mp_find_ioapics
now returns -1 [because the IOAPIC is filled with 0xffffffff],
acpi_get_override_irq() stops looking up the proper INT_SRV_OVR in mp_irq[x]
and we can't install the SCI interrupt.

The proper fix for this is going in v3.5 and adds an x86_io_apic_ops
struct so that platforms can override it. But for v3.4 let's carry this
work-around. This patch does that by providing a slightly different variant
of the fake IOAPIC entries.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years agoxen: only check xen_platform_pci_unplug if hvm
Igor Mammedov [Tue, 27 Mar 2012 17:31:08 +0000 (19:31 +0200)]
xen: only check xen_platform_pci_unplug if hvm

commit b9136d207f08
  xen: initialize platform-pci even if xen_emul_unplug=never

breaks blkfront/netfront: they are not loaded because
xen_platform_pci_unplug=0, and it is never set for a PV guest.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years agonet: fix a race in sock_queue_err_skb()
Eric Dumazet [Fri, 6 Apr 2012 08:49:10 +0000 (10:49 +0200)]
net: fix a race in sock_queue_err_skb()

As soon as an skb is queued into socket error queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetlink: fix races after skb queueing
Eric Dumazet [Thu, 5 Apr 2012 22:17:46 +0000 (22:17 +0000)]
netlink: fix races after skb queueing

As soon as an skb is queued into socket receive_queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
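
Both this fix and the sock_queue_err_skb() fix before it follow the same rule: read whatever is needed from the buffer before handing it to the queue. A toy Python model of that rule follows; Skb, InstantConsumerQueue, queue_racy() and queue_safe() are hypothetical names, and the "free" is simulated by clearing the object, since Python has no real use after free.

```python
class Skb:
    """Toy stand-in for a socket buffer."""

    def __init__(self, length):
        self.len = length
        self.freed = False


class InstantConsumerQueue:
    """Queue whose consumer runs as soon as a buffer is enqueued.

    This models the worst case of the race: another thread consumes and
    frees the buffer before the enqueue call returns.
    """

    def __init__(self):
        self.consumed_lengths = []

    def enqueue(self, skb):
        self.consumed_lengths.append(skb.len)
        skb.freed = True
        skb.len = None  # simulated free: contents are no longer valid


def queue_racy(queue, skb):
    queue.enqueue(skb)
    return skb.len      # reads the buffer after giving up ownership


def queue_safe(queue, skb):
    length = skb.len    # capture everything needed first
    queue.enqueue(skb)  # after this line the buffer must not be touched
    return length
```

In this model queue_safe() still returns the original length, while queue_racy() returns the cleared value; in C the racy version would read freed memory instead.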
12 years agodoc, net: Update ndo_start_xmit return type and values
Ben Hutchings [Thu, 5 Apr 2012 14:40:25 +0000 (14:40 +0000)]
doc, net: Update ndo_start_xmit return type and values

Commit dc1f8bf68b311b1537cb65893430b6796118498a ('netdev: change
transmit to limited range type') changed the required return type and
9a1654ba0b50402a6bd03c7b0fe9b0200a5ea7b1 ('net: Optimize
hard_start_xmit() return checking') changed the valid numerical
return values.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agodoc, net: Remove instruction to set net_device::trans_start
Ben Hutchings [Thu, 5 Apr 2012 14:40:06 +0000 (14:40 +0000)]
doc, net: Remove instruction to set net_device::trans_start

Commit 08baf561083bc27a953aa087dd8a664bb2b88e8e ('net:
txq_trans_update() helper') made it unnecessary for most drivers to
set net_device::trans_start (or netdev_queue::trans_start).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>