platform/kernel/linux-rpi.git
15 years agonfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
Benny Halevy [Mon, 6 Apr 2009 09:00:36 +0000 (12:00 +0300)]
nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc

Fixes the following compiler error:
fs/nfsd/nfssvc.c: In function 'set_max_drc':
fs/nfsd/nfssvc.c:240: error: 'NFSD_DRC_SIZE_SHIFT' undeclared

CONFIG_NFSD_V4 is not set

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: Documentation/filesystems/nfs41-server.txt
Benny Halevy [Fri, 3 Apr 2009 05:29:20 +0000 (08:29 +0300)]
nfsd41: Documentation/filesystems/nfs41-server.txt

Initial nfs41 server write up describing the status of the linux
server implementation.

[nfsd41: document unenforced nfs41 compound ordering rules.]
[get rid of CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: CREATE_EXCLUSIVE4_1
Benny Halevy [Fri, 3 Apr 2009 05:29:17 +0000 (08:29 +0300)]
nfsd41: CREATE_EXCLUSIVE4_1

Implement the CREATE_EXCLUSIVE4_1 open mode conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

This mode allows the client to atomically create a file
if it doesn't exist while setting some of its attributes.

It must be implemented if the server supports persistent
reply cache and/or pnfs.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: SUPPATTR_EXCLCREAT attribute
Benny Halevy [Fri, 3 Apr 2009 05:29:14 +0000 (08:29 +0300)]
nfsd41: SUPPATTR_EXCLCREAT attribute

Return bitmask for supported EXCLUSIVE4_1 create attributes.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: support for 3-word long attribute bitmask
Andy Adamson [Fri, 3 Apr 2009 05:29:11 +0000 (08:29 +0300)]
nfsd41: support for 3-word long attribute bitmask

Also, use client minorversion to generate supported attrs

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
Benny Halevy [Fri, 3 Apr 2009 05:29:08 +0000 (08:29 +0300)]
nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify

_nfsd4_verify currently skips 3 words from the encoded buffer begining.
With support for 3-word attr bitmaps in nfsd41, nfsd4_encode_fattr
may encode 1, 2, or 3 words, and not always 2 as it used to be, hence
we need to find out where to skip using the encoded bitmap length.

Note: This patch may be applied over pre-nfsd41 nfsd.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: pass writable attrs mask to nfsd4_decode_fattr
Benny Halevy [Fri, 3 Apr 2009 05:29:05 +0000 (08:29 +0300)]
nfsd41: pass writable attrs mask to nfsd4_decode_fattr

In preparation for EXCLUSIVE4_1

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: provide support for minor version 1 at rpc level
Marc Eshel [Fri, 3 Apr 2009 05:29:02 +0000 (08:29 +0300)]
nfsd41: provide support for minor version 1 at rpc level

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
Benny Halevy [Fri, 3 Apr 2009 05:28:59 +0000 (08:28 +0300)]
nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions

Support enabling and disabling nfsv4.1 via /proc/fs/nfsd/versions
by writing the strings "+4.1" or "-4.1" correspondingly.

Use user mode nfs-utils (rpc.nfsd option) to enable.
This will allow us to get rid of CONFIG_NFSD_V4_1

[nfsd41: disable support for minorversion by default]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
Andy Adamson [Fri, 3 Apr 2009 05:28:56 +0000 (08:28 +0300)]
nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap

Separate the access bits from the want bits and enable __set_bit to
work correctly with st_access_bmap.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: access_valid
Andy Adamson [Fri, 3 Apr 2009 05:28:53 +0000 (08:28 +0300)]
nfsd41: access_valid

For nfs41, the open share flags are used also for
delegation "wants" and "signals".  Check that they are valid.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: clientid handling
Andy Adamson [Fri, 3 Apr 2009 05:28:50 +0000 (08:28 +0300)]
nfsd41: clientid handling

Extract the clientid from sessionid to set the op_clientid on open.
Verify that the clid for other stateful ops is zero for minorversion != 0
Do all other checks for stateful ops without sessions.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[fixed whitespace indent]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from nfsd4_open]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: check encode size for sessions maxresponse cached
Andy Adamson [Fri, 3 Apr 2009 05:28:48 +0000 (08:28 +0300)]
nfsd41: check encode size for sessions maxresponse cached

Calculate the space the compound response has taken after encoding the current
operation.

pad: add on 8 bytes for the next operation's op_code and status so that
there is room to cache a failure on the next operation.

Compare this length to the session se_fmaxresp_cached and return
nfserr_rep_too_big_to_cache if the length is too large.

Our se_fmaxresp_cached will always be a multiple of PAGE_SIZE, and so
will be at least a page and will therefore hold the xdr_buf head.

Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd41: non-page DRC for solo sequence responses]
[fixed nfsd4_check_drc_limit cosmetics]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfsd4_check_drc_limit]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: stateid handling
Andy Adamson [Fri, 3 Apr 2009 05:28:45 +0000 (08:28 +0300)]
nfsd41: stateid handling

When sessions are used, stateful operation sequenceid and stateid handling
are not used. When sessions are used,  on the first open set the seqid to 1,
mark state confirmed and skip seqid processing.

When sessionas are used the stateid generation number is ignored when it is zero
whereas without sessions bad_stateid or stale stateid is returned.

Add flags to propagate session use to all stateful ops and down to
check_stateid_generation.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd4_has_session should return a boolean, not u32]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: pass nfsd4_compoundres * to nfsd4_process_open1]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_stateid_op]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_seqid_op]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
Benny Halevy [Fri, 3 Apr 2009 05:28:41 +0000 (08:28 +0300)]
nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op

Currently we only use cstate->current_fh,
will also be used by nfsd41 code.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: destroy_session operation
Benny Halevy [Fri, 3 Apr 2009 05:28:38 +0000 (08:28 +0300)]
nfsd41: destroy_session operation

Implement the destory_session operation confoming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

[use sessionid_lock spin lock]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: non-page DRC for solo sequence responses
Andy Adamson [Fri, 3 Apr 2009 05:28:35 +0000 (08:28 +0300)]
nfsd41: non-page DRC for solo sequence responses

A session inactivity time compound (lease renewal) or a compound where the
sequence operation has sa_cachethis set to FALSE do not require any pages
to be held in the v4.1 DRC. This is because struct nfsd4_slot is already
caching the session information.

Add logic to the nfs41 server to not cache response pages for solo sequence
responses.

Return nfserr_replay_uncached_rep on the operation following the sequence
operation when sa_cachethis is FALSE.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfsd4_replay_cache_entry]
[nfsd41: rename nfsd4_no_page_in_cache]
[nfsd41 rename nfsd4_enc_no_page_replay]
[nfsd41 nfsd4_is_solo_sequence]
[nfsd41 change nfsd4_not_cached return]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed return type to bool]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 drop parens in nfsd4_is_solo_sequence call]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed "== 0" to "!"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: Add a create session replay cache
Andy Adamson [Fri, 3 Apr 2009 05:28:32 +0000 (08:28 +0300)]
nfsd41: Add a create session replay cache

Replace the nfs4_client cl_seqid field with a single struct nfs41_slot used
for the create session replay cache.

The CREATE_SESSION slot sets the sl_session pointer to NULL. Otherwise, the
slot and it's replay cache are used just like the session slots.

Fix unconfirmed create_session replay response by initializing the
create_session slot sequence id to 0.

A future patch will set the CREATE_SESSION cache when a SEQUENCE operation
preceeds the CREATE_SESSION operation. This compound is currently only cached
in the session slot table.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: revert portion of nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netpp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: create_session operation
Andy Adamson [Fri, 3 Apr 2009 05:28:28 +0000 (08:28 +0300)]
nfsd41: create_session operation

Implement the create_session operation confoming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Look up the client id (generated by the server on exchange_id,
given by the client on create_session).
If neither a confirmed or unconfirmed client is found
then the client id is stale
If a confirmed cilent is found (i.e. we already received
create_session for it) then compare the sequence id
to determine if it's a replay or possibly a mis-ordered rpc.
If the seqid is in order, update the confirmed client seqid
and procedd with updating the session parameters.

If an unconfirmed client_id is found then verify the creds
and seqid.  If both match move the client id to confirmed state
and proceed with processing the create_session.

Currently, we do not support persistent sessions, and RDMA.

alloc_init_session generates a new sessionid and creates
a session structure.

NFSD_PAGES_PER_SLOT is used for the max response cached calculation, and for
the counting of DRC pages using the hard limits set in struct srv_serv.

A note on NFSD_PAGES_PER_SLOT:

Other patches in this series allow for NFSD_PAGES_PER_SLOT + 1 pages to be
cached in a DRC slot when the response size is less than NFSD_PAGES_PER_SLOT *
PAGE_SIZE but xdr_buf pages are used. e.g. a READDIR operation will encode a
small amount of data in the xdr_buf head, and then the READDIR in the xdr_buf
pages.  So, the hard limit calculation use of pages by a session is
underestimated by the number of cached operations using the xdr_buf pages.

Yet another patch caches no pages for the solo sequence operation, or any
compound where cache_this is False.  So the hard limit calculation use of
pages by a session is overestimated by the number of these operations in the
cache.

TODO: improve resource pre-allocation and negotiate session
parameters accordingly.  Respect and possibly adjust
backchannel attributes.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
[nfsd41: remove headerpadsz from channel attributes]
Our client and server only support a headerpadsz of 0.
[nfsd41: use DRC limits in fore channel init]
[nfsd41: do not change CREATE_SESSION back channel attrs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[use sessionid_lock spin lock]
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from alloc_init_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[simplify nfsd4_encode_create_session error handling]
[nfsd41: fix comment style in init_forechannel_attrs]
[nfsd41: allocate struct nfsd4_session and slot table in one piece]
[nfsd41: no need to INIT_LIST_HEAD in alloc_init_session just prior to list_add]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: clear DRC cache on free_session
Andy Adamson [Fri, 3 Apr 2009 05:28:25 +0000 (08:28 +0300)]
nfsd41: clear DRC cache on free_session

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: nfsd DRC logic
Andy Adamson [Fri, 3 Apr 2009 05:28:22 +0000 (08:28 +0300)]
nfsd41: nfsd DRC logic

Replay a request in nfsd4_sequence.
Add a minorversion to struct nfsd4_compound_state.

Pass the current slot to nfs4svc_encode_compound res via struct
nfsd4_compoundres to set an NFSv4.1 DRC entry.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfs4svc_encode_compoundres]
[nfsd41 replace nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: hard page limit for DRC
Andy Adamson [Fri, 3 Apr 2009 05:28:18 +0000 (08:28 +0300)]
nfsd41: hard page limit for DRC

Use no more than 1/128th of the number of free pages at nfsd startup for the
v4.1 DRC.

This is an arbitrary default which should probably end up under the control
of an administrator.

Signed-off-by: Andy Adamson <andros@netapp.com>
[moved added fields in struct svc_serv under CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[fix set_max_drc calculation of sv_drc_max_pages]
[moved NFSD_DRC_SIZE_SHIFT's declaration up in header file]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: DRC save, restore, and clear functions
Andy Adamson [Fri, 3 Apr 2009 05:28:15 +0000 (08:28 +0300)]
nfsd41: DRC save, restore, and clear functions

Cache all the result pages, including the rpc header in rq_respages[0],
for a request in the slot table cache entry.

Cache the statp pointer from nfsd_dispatch which points into rq_respages[0]
just past the rpc header. When setting a cache entry, calculate and save the
length of the nfs data minus the rpc header for rq_respages[0].

When replaying a cache entry, replace the cached rpc header with the
replayed request rpc result header, unless there is not enough room in the
cached results first page. In that case, use the cached rpc header.

The sessions fore channel maxresponse size cached is set to NFSD_PAGES_PER_SLOT
* PAGE_SIZE. For compounds we are cacheing with operations such as READDIR
that use the xdr_buf->pages to hold data, we choose to cache the extra page of
data rather than copying data from xdr_buf->pages into the xdr_buf->head page.

[nfsd41: limit cache to maxresponsesize_cached]
[nfsd41: mv nfsd4_set_statp under CONFIG_NFSD_V4_1]
[nfsd41: rename nfsd4_move_pages]
[nfsd41: rename page_no variable]
[nfsd41: rename nfsd4_set_cache_entry]
[nfsd41: fix nfsd41_copy_replay_data comment]
[nfsd41: add to nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: enforce NFS4ERR_SEQUENCE_POS operation order rules for minorversion != 0...
Andy Adamson [Fri, 3 Apr 2009 05:28:12 +0000 (08:28 +0300)]
nfsd41: enforce NFS4ERR_SEQUENCE_POS operation order rules for minorversion != 0 only.

Signed-off-by: Andy Adamson<andros@netapp.com>
[nfsd41: do not verify nfserr_sequence_pos for minorversion 0]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: sequence operation
Benny Halevy [Fri, 3 Apr 2009 05:28:08 +0000 (08:28 +0300)]
nfsd41: sequence operation

Implement the sequence operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Check for stale clientid (as derived from the sessionid).
Enforce slotid range and exactly-once semantics using
the slotid and seqid.

If everything went well renew the client lease and
mark the slot INPROGRESS.

Add a struct nfsd4_slot pointer to struct nfsd4_compound_state.
To be used for sessions DRC replay.

[nfsd41: rename sequence catchthis to cachethis]
Signed-off-by: Andy Adamson<andros@netapp.com>
[pulled some code to set cstate->slot from "nfsd DRC logic"]
[use sessionid_lock spin lock]
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd: add a struct nfsd4_slot pointer to struct nfsd4_compound_state]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: add nfsd4_session pointer to nfsd4_compound_state]
[nfsd41: set cstate session]
[nfsd41: use cstate session in nfsd4_sequence]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[simplify nfsd4_encode_sequence error handling]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: match clientid establishment method
Andy Adamson [Fri, 3 Apr 2009 05:28:05 +0000 (08:28 +0300)]
nfsd41: match clientid establishment method

We need to distinguish between client names provided by NFSv4.0 clients
SETCLIENTID and those provided by NFSv4.1 via EXCHANGE_ID when looking
up the clientid by string.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd41: use boolean values for use_exchange_id argument]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: simplify match_clientid_establishment logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: exchange_id operation
Andy Adamson [Fri, 3 Apr 2009 05:28:01 +0000 (08:28 +0300)]
nfsd41: exchange_id operation

Implement the exchange_id operation confoming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-28

Based on the client provided name, hash a client id.
If a confirmed one is found, compare the op's creds and
verifier.  If the creds match and the verifier is different
then expire the old client (client re-incarnated), otherwise,
if both match, assume it's a replay and ignore it.

If an unconfirmed client is found, then copy the new creds
and verifer if need update, otherwise assume replay.

The client is moved to a confirmed state on create_session.

In the nfs41 branch set the exchange_id flags to
EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_SUPP_MOVED_REFER
(pNFS is not supported, Referrals are supported,
Migration is not.).

Address various scenarios from section 18.35 of the spec:

1. Check for EXCHGID4_FLAG_UPD_CONFIRMED_REC_A and set
   EXCHGID4_FLAG_CONFIRMED_R as appropriate.

2. Return error codes per 18.35.4 scenarios.

3. Update client records or generate new client ids depending on
   scenario.

Note: 18.35.4 case 3 probably still needs revisiting.  The handling
seems not quite right.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamosn <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use utsname for major_id (and copy to server_scope)]
[nfsd41: fix handling of various exchange id scenarios]
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse use of EXCHGID4_INVAL_FLAG_MASK_A]
[simplify nfsd4_encode_exchange_id error handling]
[nfsd41: embed an xdr_netobj in nfsd4_exchange_id]
[nfsd41: return nfserr_serverfault for spa_how == SP4_MACH_CRED]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: proc stubs
Andy Adamson [Fri, 3 Apr 2009 05:27:58 +0000 (08:27 +0300)]
nfsd41: proc stubs

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: xdr infrastructure
Andy Adamson [Fri, 3 Apr 2009 05:27:55 +0000 (08:27 +0300)]
nfsd41: xdr infrastructure

Define nfsd41_dec_ops vector and add it to nfsd4_minorversion for
minorversion 1.

Note: nfsd4_enc_ops vector is shared for v4.0 and v4.1
since we don't need to filter out obsolete ops as this is
done in the decoding phase.

exchange_id, create_session, destroy_session, and sequence ops are
implemented as stubs returning nfserr_opnotsupp at this stage.

[was nfsd41: xdr stubs]
[get rid of CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: sessionid hashing
Marc Eshel [Fri, 3 Apr 2009 05:27:52 +0000 (08:27 +0300)]
nfsd41: sessionid hashing

Simple sessionid hashing using its monotonically increasing sequence number.

Locking considerations:
sessionid_hashtbl access is controlled by the sessionid_lock spin lock.
It must be taken for insert, delete, and lookup.
nfsd4_sequence looks up the session id and if the session is found,
it calls nfsd4_get_session (still under the sessionid_lock).
nfsd4_destroy_session calls nfsd4_put_session after unhashing
it, so when the session's kref reaches zero it's going to get freed.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[we don't use a prime for sessionid hash table size]
[use sessionid_lock spin lock]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: release_session when client is expired
Marc Eshel [Fri, 3 Apr 2009 05:27:49 +0000 (08:27 +0300)]
nfsd41: release_session when client is expired

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[add CONFIG_NFSD_V4_1 to fix v4.0 regression bug]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: introduce nfs4_client cl_sessions list
Marc Eshel [Fri, 3 Apr 2009 05:27:46 +0000 (08:27 +0300)]
nfsd41: introduce nfs4_client cl_sessions list

[get rid of CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: sessions basic data types
Andy Adamson [Fri, 3 Apr 2009 05:27:43 +0000 (08:27 +0300)]
nfsd41: sessions basic data types

This patch provides basic data structures representing the nfs41
sessions and slots, plus helpers for keeping a reference count
on the session and freeing it.

Note that our server only support a headerpadsz of 0 and
it ignores backchannel attributes at the moment.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: remove headerpadsz from channel attributes]
[nfsd41: embed nfsd4_channel in nfsd4_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from nfsd4_slot]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd41: define nfs41 error codes
Marc Eshel [Fri, 3 Apr 2009 05:27:40 +0000 (08:27 +0300)]
nfsd41: define nfs41 error codes

Define all error code present in
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: clean up error code definitions]
[nfsd41: change NFSERR_REPLAY_ME]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfs41: common protocol definitions
Benny Halevy [Fri, 3 Apr 2009 05:27:36 +0000 (08:27 +0300)]
nfs41: common protocol definitions

Define all NFSv4.1 common operation and error code constants.

Note that some of the definitions are used by both the nfs41 client
and the server code. This patch is duplicated in the nfs41 and nfsd41
sessions patchset.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: add exchange id flags]
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[removed server-only hunk changing NFSERR_REPLAY_ME]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: add SEQ4_XX to nfs41-common-protocol]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: generic error code update]
[nfs41: reverse EXCHGID4_INVAL_FLAG_MASK_{A,R}]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: don't use the deferral service, return NFS4ERR_DELAY
Andy Adamson [Fri, 3 Apr 2009 05:27:32 +0000 (08:27 +0300)]
nfsd: don't use the deferral service, return NFS4ERR_DELAY

On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.

Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.

Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: remove nfsd4_ops array size
Benny Halevy [Sat, 28 Mar 2009 08:32:05 +0000 (11:32 +0300)]
nfsd: remove nfsd4_ops array size

There's no need for it.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: embed nfsd4_current_state in nfsd4_compoundres
Andy Adamson [Sat, 28 Mar 2009 08:30:52 +0000 (11:30 +0300)]
nfsd: embed nfsd4_current_state in nfsd4_compoundres

Remove the allocation of struct nfsd4_compound_state.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoFix a build warning about leaking CONFIG_NFSD to userspace.
Greg Banks [Thu, 26 Mar 2009 15:32:47 +0000 (02:32 +1100)]
Fix a build warning about leaking CONFIG_NFSD to userspace.

Fix a build warning about leaking CONFIG_NFSD to userspace.

The nfsd_stats data structure does not need to be available to
userspace; no kernel interface uses it.  So move it inside #ifdef
__KERNEL__ and the warning goes away.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoDocument /proc/fs/nfsd/pool_stats
Greg Banks [Thu, 26 Mar 2009 06:45:27 +0000 (17:45 +1100)]
Document /proc/fs/nfsd/pool_stats

Document the format and semantics of the /proc/fs/nfsd/pool_stats file.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agosunrpc/svc.c: Remove unused line 'rqstp->rq_server = serv;' in svc_process
ideawu [Thu, 26 Mar 2009 04:55:29 +0000 (12:55 +0800)]
sunrpc/svc.c: Remove unused line 'rqstp->rq_server = serv;' in svc_process

There is no need to set rqstp->rq_server to serv, while serv is initialized as rqstp->rq_server at previous line. And between these two lines, there is no change to rqstp->rq_server.

Signed-off-by: ideawu <ideawu@163.com>
Reviewed-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoSUNRPC: Clean up static inline functions in svc_xprt.h
Chuck Lever [Thu, 12 Mar 2009 16:07:14 +0000 (12:07 -0400)]
SUNRPC: Clean up static inline functions in svc_xprt.h

Clean up:  Enable the use of const arguments in higher level svc_ APIs
by adding const to the arguments of the helper functions in svc_xprt.h

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoInconsistent setattr behaviour
Sachin S. Prabhu [Mon, 23 Feb 2009 16:22:03 +0000 (16:22 +0000)]
Inconsistent setattr behaviour

There is an inconsistency seen in the behaviour of nfs compared to other local
filesystems on linux when changing owner or group of a directory. If the
directory has SUID/SGID flags set, on changing owner or group on the directory,
the flags are stripped off on nfs. These flags are maintained on other
filesystems such as ext3.

To reproduce on a nfs share or local filesystem, run the following commands
mkdir test; chmod +s+g test; chown user1 test; ls -ld test

On the nfs share, the flags are stripped and the output seen is
drwxr-xr-x 2 user1 root 4096 Feb 23  2009 test

On other local filesystems(ex: ext3), the flags are not stripped and the output
seen is
drwsr-sr-x 2 user1 root 4096 Feb 23 13:57 test

chown_common() called from sys_chown() will only strip the flags if the inode is
not a directory.
static int chown_common(struct dentry * dentry, uid_t user, gid_t group)
{
..
        if (!S_ISDIR(inode->i_mode))
                newattrs.ia_valid |=
                        ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
..
}

See: http://www.opengroup.org/onlinepubs/7990989775/xsh/chown.html

"If the path argument refers to a regular file, the set-user-ID (S_ISUID) and
set-group-ID (S_ISGID) bits of the file mode are cleared upon successful return
from chown(), unless the call is made by a process with appropriate privileges,
in which case it is implementation-dependent whether these bits are altered. If
chown() is successfully invoked on a file that is not a regular file, these
bits may be cleared. These bits are defined in <sys/stat.h>."

The behaviour as it stands does not appear to violate POSIX.  However the
actions performed are inconsistent when comparing ext3 and nfs.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agosvcrpc: take advantage of tcp autotuning
Olga Kornievskaia [Tue, 21 Oct 2008 18:13:47 +0000 (14:13 -0400)]
svcrpc: take advantage of tcp autotuning

Allow the NFSv4 server to make use of TCP autotuning behaviour, which
was previously disabled by setting the sk_userlocks variable.

Set the receive buffers to be big enough to receive the whole RPC
request, and set this for the listening socket, not the accept socket.

Remove the code that readjusts the receive/send buffer sizes for the
accepted socket. Previously this code was used to influence the TCP
window management behaviour, which is no longer needed when autotuning
is enabled.

This can improve IO bandwidth on networks with high bandwidth-delay
products, where a large tcp window is required.  It also simplifies
performance tuning, since getting adequate tcp buffers previously
required increasing the number of nfsd threads.

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Cc: Jim Rees <rees@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: don't check ip address in setclientid
J. Bruce Fields [Wed, 18 Mar 2009 19:06:26 +0000 (15:06 -0400)]
nfsd4: don't check ip address in setclientid

The spec allows clients to change ip address, so we shouldn't be
requiring that setclientid always come from the same address.  For
example, a client could reboot and get a new dhcpd address, but still
present the same clientid to the server.  In that case the server should
revoke the client's previous state and allow it to continue, instead of
(as it currently does) returning a CLID_INUSE error.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoknfsd: add file to export stats about nfsd pools
Greg Banks [Tue, 13 Jan 2009 10:26:36 +0000 (21:26 +1100)]
knfsd: add file to export stats about nfsd pools

Add /proc/fs/nfsd/pool_stats to export to userspace various
statistics about the operation of rpc server thread pools.

This patch is based on a forward-ported version of
knfsd-add-pool-thread-stats which has been shipping in the SGI
"Enhanced NFS" product since 2006 and which was previously
posted:

http://article.gmane.org/gmane.linux.nfs/10375

It has also been updated thus:

 * moved EXPORT_SYMBOL() to near the function it exports
 * made the new struct struct seq_operations const
 * used SEQ_START_TOKEN instead of ((void *)1)
 * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
   printk in svc_pool_stats_*()" by Harshula Jayasuriya.
 * merged fix from SGI PV 964001 "Crash reading pool_stats before
   nfsds are started".

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoknfsd: avoid overloading the CPU scheduler with enormous load averages
Greg Banks [Tue, 13 Jan 2009 10:26:35 +0000 (21:26 +1100)]
knfsd: avoid overloading the CPU scheduler with enormous load averages

Avoid overloading the CPU scheduler with enormous load averages
when handling high call-rate NFS loads.  When the knfsd bottom half
is made aware of an incoming call by the socket layer, it tries to
choose an nfsd thread and wake it up.  As long as there are idle
threads, one will be woken up.

If there are lot of nfsd threads (a sensible configuration when
the server is disk-bound or is running an HSM), there will be many
more nfsd threads than CPUs to run them.  Under a high call-rate
low service-time workload, the result is that almost every nfsd is
runnable, but only a handful are actually able to run.  This situation
causes two significant problems:

1. The CPU scheduler takes over 10% of each CPU, which is robbing
   the nfsd threads of valuable CPU time.

2. At a high enough load, the nfsd threads starve userspace threads
   of CPU time, to the point where daemons like portmap and rpc.mountd
   do not schedule for tens of seconds at a time.  Clients attempting
   to mount an NFS filesystem timeout at the very first step (opening
   a TCP connection to portmap) because portmap cannot wake up from
   select() and call accept() in time.

Disclaimer: these effects were observed on a SLES9 kernel, modern
kernels' schedulers may behave more gracefully.

The solution is simple: keep in each svc_pool a counter of the number
of threads which have been woken but have not yet run, and do not wake
any more if that count reaches an arbitrary small threshold.

Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e. recursive directory
listing) workload reading from an i386 RH9 install image (161480
regular files in 10841 directories) on the server.  That tree is small
enough to fill in the server's RAM so no disk traffic was involved.
This setup gives a sustained call rate in excess of 60000 calls/sec
before being CPU-bound on the server.  The server was running 128 nfsds.

Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
Load average was over 120 before the patch, and 20.9 after.

This patch is a forward-ported version of knfsd-avoid-nfsd-overload
which has been shipping in the SGI "Enhanced NFS" product since 2006.
It has been posted before:

http://article.gmane.org/gmane.linux.nfs/10374

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoknfsd: remove the nfsd thread busy histogram
Greg Banks [Tue, 13 Jan 2009 10:26:34 +0000 (21:26 +1100)]
knfsd: remove the nfsd thread busy histogram

Stop gathering the data that feeds the 'th' line in /proc/net/rpc/nfsd
because the questionable data provided is not worth the scalability
impact of calculating it.  Instead, always report zeroes.  The current
approach suffers from three major issues:

1. update_thread_usage() increments buckets by call service
   time or call arrival time...in jiffies.  On lightly loaded
   machines, call service times are usually < 1 jiffy; on
   heavily loaded machines call arrival times will be << 1 jiffy.
   So a large portion of the updates to the buckets are rounded
   down to zero, and the histogram is undercounting.

2. As seen previously on the nfs mailing list, the format in which
   the histogram is presented is cryptic, difficult to explain,
   and difficult to use.

3. Updating the histogram requires taking a global spinlock and
   dirtying the global variables nfsd_last_call, nfsd_busy, and
   nfsdstats *twice* on every RPC call, which is a significant
   scaling limitation.

Testing on a 4 CPU 4 NIC Altix using 4 IRIX clients each doing
1K streaming reads at full line rate, shows the stats update code
(inlined into nfsd()) takes about 1.7% of each CPU.  This patch drops
the contribution from nfsd() into the profile noise.

This patch is a forward-ported version of knfsd-remove-nfsd-threadstats
which has been shipping in the SGI "Enhanced NFS" product since 2006.
In that time, exactly one customer has noticed that the threadstats
were missing.  It has been previously posted:

http://article.gmane.org/gmane.linux.nfs/10376

and more recently requested to be posted again.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove redundant check from nfsd4_open
J. Bruce Fields [Sat, 14 Mar 2009 20:38:41 +0000 (16:38 -0400)]
nfsd4: remove redundant check from nfsd4_open

Note that we already checked for this invalid case at the top of this
function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: don't do lookup within readdir in recovery code
J. Bruce Fields [Fri, 13 Mar 2009 20:02:59 +0000 (16:02 -0400)]
nfsd4: don't do lookup within readdir in recovery code

The main nfsd code was recently modified to no longer do lookups from
withing the readdir callback, to avoid locking problems on certain
filesystems.

This (rather hacky, and overdue for replacement) NFSv4 recovery code has
the same problem.  Fix it to build up a list of names (instead of
dentries) and do the lookups afterwards.

Reported symptoms were a deadlock in the xfs code (called from
nfsd4_recdir_load), with /var/lib/nfs on xfs.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reported-by: David Warren <warren@atmos.washington.edu>
15 years agonfsd4: support putpubfh operation
J. Bruce Fields [Mon, 9 Mar 2009 16:17:29 +0000 (12:17 -0400)]
nfsd4: support putpubfh operation

Currently putpubfh returns NFSERR_OPNOTSUPP, which isn't actually
allowed for v4.  The right error is probably NFSERR_NOTSUPP.

But let's just implement it; though rarely seen, it can be used by
Solaris (with a special mount option), is mandated by the rfc, and is
trivial for us to support.

Thanks to Yang Hongyang for pointing out the original problem, and to
Mike Eisler, Tom Talpey, Trond Myklebust, and Dave Noveck for further
argument....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoShort write in nfsd becomes a full write to the client
David Shaw [Fri, 6 Mar 2009 01:16:14 +0000 (20:16 -0500)]
Short write in nfsd becomes a full write to the client

If a filesystem being written to via NFS returns a short write count
(as opposed to an error) to nfsd, nfsd treats that as a success for
the entire write, rather than the short count that actually succeeded.

For example, given a 8192 byte write, if the underlying filesystem
only writes 4096 bytes, nfsd will ack back to the nfs client that all
8192 bytes were written.  The nfs client does have retry logic for
short writes, but this is never called as the client is told the
complete write succeeded.

There are probably other ways it could happen, but in my case it
happened with a fuse (filesystem in userspace) filesystem which can
rather easily have a partial write.

Here is a patch to properly return the short write count to the
client.

Signed-off-by: David Shaw <dshaw@jabberwocky.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: return nfsv4 error code nfserr_notsupp rather than nfsv[23]'s nfserr_opnotsupp
Benny Halevy [Wed, 4 Mar 2009 21:06:06 +0000 (23:06 +0200)]
NFSD: return nfsv4 error code nfserr_notsupp rather than nfsv[23]'s nfserr_opnotsupp

Thanks for Bill Baker at sun.com for catching this
at Connectathon 2009.

This bug was introduced in 2.6.27

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: move rpc_client setup to a separate function
J. Bruce Fields [Mon, 23 Feb 2009 00:43:45 +0000 (16:43 -0800)]
nfsd4: move rpc_client setup to a separate function

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: fix do_probe_callback errors
J. Bruce Fields [Sun, 22 Feb 2009 23:52:13 +0000 (15:52 -0800)]
nfsd4: fix do_probe_callback errors

The errors returned aren't used.  Just return 0 and make them available
to a dprintk().  Also, consistently use -ERRNO errors instead of nfs
errors.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
15 years agonfsd4: remove use of mutex for file_hashtable
J. Bruce Fields [Sun, 22 Feb 2009 22:51:34 +0000 (14:51 -0800)]
nfsd4: remove use of mutex for file_hashtable

As part of reducing the scope of the client_mutex, and in order to
remove the need for mutexes from the callback code (so that callbacks
can be done as asynchronous rpc calls), move manipulations of the
file_hashtable under the recall_lock.

Update the relevant comments while we're here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Alexandros Batsakis <batsakis@netapp.com>
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
15 years agonfsd4: put_nfs4_client does not require state lock
J. Bruce Fields [Sat, 21 Feb 2009 23:39:54 +0000 (15:39 -0800)]
nfsd4: put_nfs4_client does not require state lock

Since free_client() is guaranteed to only be called once, and to only
touch the client structure itself (not any common data structures), it
has no need for the state lock.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Alexandros Batsakis <batsakis@netapp.com>
15 years agonfsd4: rename io_during_grace_disallowed
J. Bruce Fields [Sat, 21 Feb 2009 23:23:01 +0000 (15:23 -0800)]
nfsd4: rename io_during_grace_disallowed

Use a slightly clearer, more concise name.  Also removed unused
argument.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove unused CHECK_FH flag
J. Bruce Fields [Sat, 21 Feb 2009 21:36:16 +0000 (13:36 -0800)]
nfsd4: remove unused CHECK_FH flag

All users now pass this, so it's meaningless.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: fail when delegreturn gets a non-delegation stateid
J. Bruce Fields [Sat, 21 Feb 2009 21:32:28 +0000 (13:32 -0800)]
nfsd4: fail when delegreturn gets a non-delegation stateid

Previous cleanup reveals an obvious (though harmless) bug: when
delegreturn gets a stateid that isn't for a delegation, it should return
an error rather than doing nothing.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: separate delegreturn case from preprocess_stateid_op
J. Bruce Fields [Sat, 21 Feb 2009 21:29:14 +0000 (13:29 -0800)]
nfsd4: separate delegreturn case from preprocess_stateid_op

Delegreturn is enough a special case for preprocess_stateid_op to
warrant just open-coding it in delegreturn.

There should be no change in behavior here; we're just reshuffling code.

Thanks to Yang Hongyang for catching a critical typo.

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: add a helper function to decide if stateid is delegation
J. Bruce Fields [Sat, 21 Feb 2009 21:17:19 +0000 (13:17 -0800)]
nfsd4: add a helper function to decide if stateid is delegation

Make this check self-documenting.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove some dprintk's
J. Bruce Fields [Sat, 21 Feb 2009 20:13:24 +0000 (12:13 -0800)]
nfsd4: remove some dprintk's

I can't recall ever seeing these printk's used to debug a problem.  I'll
happily put them back if we see a case where they'd be useful.  (Though
if we do that the find_XXX() errors would probably be better
reported in find_XXX() functions themselves.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove unneeded local variable
J. Bruce Fields [Sat, 21 Feb 2009 19:27:30 +0000 (11:27 -0800)]
nfsd4: remove unneeded local variable

We no longer need stidp.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove redundant "if" in nfs4_preprocess_stateid_op
J. Bruce Fields [Sat, 21 Feb 2009 19:14:43 +0000 (11:14 -0800)]
nfsd4: remove redundant "if" in nfs4_preprocess_stateid_op

Note that we exit this first big "if" with stp == NULL if and only if we
took the first branch; therefore, the second "if" is redundant, and we
can just combine the two, simplifying the logic.

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: move check_stateid_generation check
J. Bruce Fields [Sat, 21 Feb 2009 19:11:50 +0000 (11:11 -0800)]
nfsd4: move check_stateid_generation check

No change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: trivial preprocess_stateid_op cleanup
J. Bruce Fields [Sat, 21 Feb 2009 18:40:22 +0000 (10:40 -0800)]
nfsd4: trivial preprocess_stateid_op cleanup

Remove a couple redundant comments, adjust style; no change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfs: replace uses of __constant_{endian}
Harvey Harrison [Thu, 12 Feb 2009 01:16:58 +0000 (17:16 -0800)]
nfs: replace uses of __constant_{endian}

The base versions handle constant folding now, none of these headers
are exported to userspace, so the __ prefixed versions are not
necessary.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd(v2/v3): fix the failure of creation from HPUX client
wengang wang [Tue, 10 Feb 2009 03:27:51 +0000 (11:27 +0800)]
nfsd(v2/v3): fix the failure of creation from HPUX client

sometimes HPUX nfs client sends a create request to linux nfs server(v2/v3).
the dump of the request is like:
    obj_attributes
        mode: value follows
            set_it: value follows (1)
            mode: 00
        uid: no value
            set_it: no value (0)
        gid: value follows
            set_it: value follows (1)
            gid: 8030
        size: value follows
            set_it: value follows (1)
            size: 0
        atime: don't change
            set_it: don't change (0)
        mtime: don't change
            set_it: don't change (0)

note that mode is 00(havs no rwx privilege even for the owner) and it requires
to set size to 0.

as current nfsd(v2/v3) implementation, the server does mainly 2 steps:
1) creates the file in mode specified by calling vfs_create().
2) sets attributes for the file by calling nfsd_setattr().

at step 2), it finally calls file system specific setattr() function which may
fail when checking permission because changing size needs WRITE privilege but
it has none since mode is 000.

for this case, a new file created, we may simply ignore the request of
setting size to 0, so that WRITE privilege is not needed and the open
succeeds.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
--
 vfs.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agolockd: clean up blocking lock cases of nlsmvc_lock()
Miklos Szeredi [Mon, 9 Feb 2009 17:30:43 +0000 (12:30 -0500)]
lockd: clean up blocking lock cases of nlsmvc_lock()

No change in behavior, just rearranging the switch so that we break out
of the switch if and only if we're in the wait case.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: lock state around put client and delegation in nfsd4_cb_recall
Alexandros Batsakis [Fri, 19 Dec 2008 03:55:16 +0000 (19:55 -0800)]
nfsd: lock state around put client and delegation in nfsd4_cb_recall

not having the state locked before putting the client/delegation causes a bug.
Also removed the comment from the function header about the state being already locked

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: use helper for copying delegation filehandle
J. Bruce Fields [Mon, 2 Feb 2009 22:30:51 +0000 (17:30 -0500)]
nfsd4: use helper for copying delegation filehandle

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: use helper for copying filehandles for replay
J. Bruce Fields [Mon, 2 Feb 2009 22:23:10 +0000 (17:23 -0500)]
nfsd4: use helper for copying filehandles for replay

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: fix misplaced comment
J. Bruce Fields [Mon, 2 Feb 2009 22:04:03 +0000 (17:04 -0500)]
nfsd4: fix misplaced comment

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd: clarify exclusive create bitmask result.
J. Bruce Fields [Mon, 2 Feb 2009 20:12:27 +0000 (15:12 -0500)]
nfsd: clarify exclusive create bitmask result.

The use of |= is confusing--the bitmask is always initialized to zero in
this case, so we're effectively just doing an assignment here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd : Define NFSD only when FILE_LOCKING is enabled
Manish Katiyar [Thu, 29 Jan 2009 15:23:57 +0000 (20:53 +0530)]
nfsd : Define NFSD only when FILE_LOCKING is enabled

Enable NFSD only when FILE_LOCKING is enabled, since we don't want to
support NFSD without FILE_LOCKING.

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agoNFSD: cleanup for nfs3proc.c
Qinghuang Feng [Sun, 11 Jan 2009 19:13:53 +0000 (03:13 +0800)]
NFSD: cleanup for nfs3proc.c

MSDOS_SUPER_MAGIC is defined in <linux/magic.h>,
so use MSDOS_SUPER_MAGIC directly.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: split open/lockowner release code
J. Bruce Fields [Sun, 11 Jan 2009 20:24:04 +0000 (15:24 -0500)]
nfsd4: split open/lockowner release code

The caller always knows specifically whether it's releasing a lockowner
or an openowner, and the code is simpler if we use separate functions
(and the apparent recursion is gone).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: remove a forward declaration
J. Bruce Fields [Sun, 11 Jan 2009 19:37:31 +0000 (14:37 -0500)]
nfsd4: remove a forward declaration

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
15 years agonfsd4: split lockstateid/openstateid release logic
J. Bruce Fields [Sun, 11 Jan 2009 19:27:17 +0000 (14:27 -0500)]
nfsd4: split lockstateid/openstateid release logic

The flags here attempt to make the code more general, but I find it
actually just adds confusion.

I think it's clearer to separate the logic for the open and lock cases
entirely.  And eventually we may want to separate the stateowner and
stateid types as well, as many of the fields aren't shared between the
lock and open cases.

Also move to eliminate forward references.

Start with the stateid's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
Linus Torvalds [Wed, 18 Mar 2009 16:34:17 +0000 (09:34 -0700)]
Merge git://git./linux/kernel/git/gregkh/staging-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
  Staging: benet: remove driver now that it is merged in drivers/net/

15 years agoMerge branch 'for-2.6.29' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Wed, 18 Mar 2009 16:27:20 +0000 (09:27 -0700)]
Merge branch 'for-2.6.29' of git://linux-nfs.org/~bfields/linux

* 'for-2.6.29' of git://linux-nfs.org/~bfields/linux:
  nfsd: nfsd should drop CAP_MKNOD for non-root
  NFSD: provide encode routine for OP_OPENATTR

15 years agoStaging: benet: remove driver now that it is merged in drivers/net/
Greg Kroah-Hartman [Wed, 18 Mar 2009 16:22:17 +0000 (09:22 -0700)]
Staging: benet: remove driver now that it is merged in drivers/net/

The benet driver is now in the proper place in drivers/net/benet, so we
can remove the staging version.

Acked-by: Sathya Perla <sathyap@serverengines.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Wed, 18 Mar 2009 16:05:40 +0000 (09:05 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/ps3: ps3_defconfig updates
  powerpc/mm: Respect _PAGE_COHERENT on classic ppc32 SW
  powerpc/5200: Enable CPU_FTR_NEED_COHERENT for MPC52xx
  ps3/block: Replace mtd/ps3vram by block/ps3vram

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Wed, 18 Mar 2009 16:04:25 +0000 (09:04 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  module: fix refptr allocation and release order

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Wed, 18 Mar 2009 16:03:18 +0000 (09:03 -0700)]
Merge git://git./linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: storage: Unusual USB device Prolific 2507 variation added
  USB: Add device id for Option GTM380 to option driver
  USB: Add Vendor/Product ID for new CDMA U727 to option driver
  USB: Updated unusual-devs entry for USB mass storage on Nokia 6233
  USB: Option: let cdc-acm handle Sony Ericsson F3507g / Dell 5530
  USB: EHCI: expedite unlinks when the root hub is suspended
  USB: EHCI: Fix isochronous URB leak
  USB: option.c: add ZTE 622 modem device
  USB: wusbcore/wa-xfer, fix lock imbalance
  USB: misc/vstusb, fix lock imbalance
  USB: misc/adutux, fix lock imbalance
  USB: image/mdc800, fix lock imbalance
  USB: atm/cxacru, fix lock imbalance
  USB: unusual_devs: Add support for GI 0431 SD-Card interface
  USB: serial: new cp2101 device id
  USB: serial: ftdi: enable UART detection on gnICE JTAG adaptors blacklist interface0
  USB: serial: add FTDI USB/Serial converter devices
  USB: usbfs: keep async URBs until the device file is closed
  USB: usbtmc: add protocol 1 support
  USB: usbtmc: fix stupid bug in open()

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Wed, 18 Mar 2009 14:39:11 +0000 (07:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: Fix vunmap and free order in snd_free_sgbuf_pages()
  ALSA: mixart, fix lock imbalance
  ALSA: pcm_oss, fix locking typo
  ALSA: oss-mixer - Fixes recording gain control
  ALSA: hda - Workaround for buggy DMA position on ATI controllers
  ALSA: hda - Fix DMA mask for ATI controllers
  ALSA: opl3sa2 - Fix NULL dereference when suspending snd_opl3sa2

15 years agoMerge branch 'fix/opl3sa2-suspend' into for-linus
Takashi Iwai [Wed, 18 Mar 2009 07:04:36 +0000 (08:04 +0100)]
Merge branch 'fix/opl3sa2-suspend' into for-linus

15 years agoMerge branch 'fix/hda' into for-linus
Takashi Iwai [Wed, 18 Mar 2009 07:04:16 +0000 (08:04 +0100)]
Merge branch 'fix/hda' into for-linus

15 years agoALSA: Fix vunmap and free order in snd_free_sgbuf_pages()
Takashi Iwai [Tue, 17 Mar 2009 13:00:06 +0000 (14:00 +0100)]
ALSA: Fix vunmap and free order in snd_free_sgbuf_pages()

In snd_free_sgbuf_pags(), vunmap() is called after releasing the SG
pages, and it causes errors on Xen as Xen manages the pages
differently.  Although no significant errors have been reported on
the actual hardware, this order should be fixed other way round,
first vunmap() then free pages.

Cc: Jan Beulich <jbeulich@novell.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: mixart, fix lock imbalance
Jiri Slaby [Wed, 11 Mar 2009 19:11:41 +0000 (20:11 +0100)]
ALSA: mixart, fix lock imbalance

There is an omitted unlock in one snd_mixart_hw_params fail path. Fix it.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: pcm_oss, fix locking typo
Jiri Slaby [Wed, 11 Mar 2009 19:11:40 +0000 (20:11 +0100)]
ALSA: pcm_oss, fix locking typo

s/mutex_lock/mutex_unlock/ on 2 fail paths in snd_pcm_oss_proc_write.
Probably a typo, lock should be unlocked when leaving the function.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: oss-mixer - Fixes recording gain control
Viral Mehta [Tue, 10 Mar 2009 14:43:18 +0000 (15:43 +0100)]
ALSA: oss-mixer - Fixes recording gain control

At the time of initialization, SNDRV_MIXER_OSS_PRESENT_PVOLUME bit is not
set for MIC (slot 7).
So, the same should not be checked when an application tries to do gain
control for audio recording devices.

Just check slot->present for SNDRV_MIXER_OSS_PRESENT_CVOLUME independently.
Verified with a simple application which opens /dev/dsp for recording and
/dev/mixer for volume control.

Have tested two usb audio mic devices.

Signed-off-by: Viral Mehta <viral.mehta@einfochips.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: hda - Workaround for buggy DMA position on ATI controllers
Takashi Iwai [Tue, 17 Mar 2009 06:49:14 +0000 (07:49 +0100)]
ALSA: hda - Workaround for buggy DMA position on ATI controllers

The position-buffer on ATI controllers are unreliable as well as
on VIA chips, thus the same workaround for DMA position reading as
VIA is useful for ATI.

Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: hda - Fix DMA mask for ATI controllers
Takashi Iwai [Tue, 17 Mar 2009 06:47:18 +0000 (07:47 +0100)]
ALSA: hda - Fix DMA mask for ATI controllers

ATI controllers (at least some SB0600 models) appear buggy to handle
64bit DMA.  As a workaround, reset GCAP bit0 and let the driver to
use only 32bit DMA on these controllers.

Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Linus Torvalds [Wed, 18 Mar 2009 03:55:40 +0000 (20:55 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix bb_prealloc_list corruption due to wrong group locking
  ext4: fix bogus BUG_ONs in in mballoc code
  ext4: Print the find_group_flex() warning only once
  ext4: fix header check in ext4_ext_search_right() for deep extent trees.

15 years agopowerpc/ps3: ps3_defconfig updates
Geoff Levand [Fri, 13 Mar 2009 06:52:22 +0000 (06:52 +0000)]
powerpc/ps3: ps3_defconfig updates

Update ps3_defconfig.

Sets these options:

  CONFIG_PS3_VRAM=m
  CONFIG_BLK_DEV_DM=m
  CONFIG_USB_HIDDEV=y
  CONFIG_EXT4_FS=y

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
15 years agoMerge commit 'gcl/merge' into merge
Benjamin Herrenschmidt [Wed, 18 Mar 2009 02:16:30 +0000 (13:16 +1100)]
Merge commit 'gcl/merge' into merge

15 years agomodule: fix refptr allocation and release order
Masami Hiramatsu [Mon, 16 Mar 2009 22:13:36 +0000 (18:13 -0400)]
module: fix refptr allocation and release order

Impact: fix ref-after-free crash on failed module load

Fix refptr bug: Change refptr allocation and release order not to access a module
data structure pointed by 'mod' after freeing mod->module_core.
This bug will cause kernel panic(e.g. failed to find undefined symbols).

This bug was reported on systemtap bugzilla.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=9927

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
15 years agoUSB: storage: Unusual USB device Prolific 2507 variation added
Thomas Bartosik [Mon, 16 Mar 2009 15:04:38 +0000 (16:04 +0100)]
USB: storage: Unusual USB device Prolific 2507 variation added

The "c-enter" USB to Toshiba 1.8" IDE enclosure needs special treatment
to work flawlessly. This patch is absolutely trivial, as the integrated
USB-IDE bridge is already identified to be an "unusual" device, only the
bcdDevice is different (lower) to the bcdDeviceMin already included in
the kernel.
It is a Prolific 2507 bridge.

T:  Bus=02 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=067b ProdID=2507 Rev= 0.01
S:  Manufacturer=Prolific Technology Inc.
S:  Product=ATAPI-6 Bridge Controller
S:  SerialNumber=00000272
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Signed-off-by: Thomas Bartosik <tbartdev@gmx-topmail.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>