History log of /freebsd-current/sys/fs/nfsclient/nfs_clstate.c
Revision Date Author Comments
# 8efba70d 25-Apr-2024 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Revert part of commit 196787f79e67

Commit 196787f79e67 erroneously assumed that the client code for
Open/Claim_deleg_cur_FH was broken, but it was not.
It was actually wireshark that was broken and indicated
that the correct XDR was bogus.

This reverts the part of 196787f79e67 that changed the arguments for
Open/Claim_deleg_cur_FH.

Found during the IETF bakeathon testing event this week.

MFC after: 3 days


# 501bdf30 06-Nov-2023 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: newnfs_copycred() cannot be called when a mutex is held

Since newnfs_copycred() calls crsetgroups() which in turn calls
crextend() which might do a malloc(M_WAITOK), newnfs_copycred()
cannot be called with a mutex held. Fortunately, the malloc()
call is rarely done, since XU_GROUPS is 16 and the NFS client
uses a maximum of 17 (only 17 groups will cause the malloc() to
be called). Further, it is only a problem if the malloc() tries
to sleep(). As such, this bug does not seem to have caused
problems in practice.

This patch fixes the one place in the NFS client where
newnfs_copycred() is called while a mutex is held by moving the
call to after where the mutex is released.

Found by inspection while working on an experimental patch.

MFC after: 2 weeks


# 196787f7 20-Oct-2023 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Use Claim_Null_FH and Claim_Deleg_Cur_FH

For NFSv4.1/4.2, there are two new options for the Open operation.
These two options use the file handle for the file instead of the
file handle for the directory plus a file name. By doing so, the
client code is simplified (it no longer needs the "nfsv4node" structure
attached to the NFS vnode). It also avoids problems caused by another
NFS client (or process running locally in the NFS server) doing a
rename or remove of the file name between the Lookup and Open.

Unfortunately, there was a bug (fixed recently by commit X)
in the NFS server which mis-parsed the Claim_Deleg_Cur_FH
arguments. To allow this patch to work with the broken FreeBSD
NFSv4.1/4.2 server, NFSMNTP_BUGGYFBSDSRV is defined and is set
when a correctly formatted Claim_Deleg_Cur_FH fails with NFSERR_EXPIRED.
(This is what the old, broken NFS server does, since it erroneously
uses the Getattr arguments as a stateID.) Once this flag is set,
the client fills in a stateID, to make the broken NFS server happy.

Tested at a recent IETF NFSv4 Bakeathon.

MFC after: 1 month


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# dff31ae1 09-Jul-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Move nfsrpc_destroysession into nfscommon

This patch moves nfsrpc_destroysession() into nfscommon.ko
and also modifies its arguments slightly. This will allow
the function to be called from nfsv4_sequencelookup() in
a future commit.

This patch should not result in a semantics change.

PR: 260011
MFC after: 2 weeks


# 746974c0 23-Jun-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Clean up the code by not using the vnode_vtype() macro

The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
avoid using it to clean up the code.

This commit should not result in a semantics change.


# 3c4266ed 17-Jun-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Clean up the code by removing unused arguments

The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port. For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c and
nfs_clstate.c.

This commit should not result in a semantics change.


# 56b64e28 02-Jun-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Do not flush when a write delegation is held

When a NFSv4 byte range write lock is unlocked, all
data modifications need to be flushed to the server
to satisfy the coherency requirements for byte range
locking. However, if a write delegation for the
file is held by the client, flushing is not required,
since no other NFSv4 client can have the file NFSv4
Opened.

Found by inspection as suggested by a similar change
that was done to the Linux NFSv4 client.


# 425e5c73 27-May-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Do not handle NFSERR_BADSESSION in operation code

The NFSERR_BADSESSION reply from a NFSv4.1/4.2 server
is handled by newnfs_request(). It should not be handled
separately after newnfs_request() has returned.

These two cases were spotted during code inspection.
One of them should only redo what newnfs_request() already
did by the same "nfscl" thread. The other might have
resulted in recovery being done twice, but the code is
only used for "pnfs" mounts, so that would be rare.
Also, since NFSERR_BADSESSION should only be replied by
a server after the server reboots, this would be extremely
rare.

MFC after: 2 weeks


# 1cedb4ea 25-Feb-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix a use after free in nfscl_cleanupkext()

ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
owp->nfsow_owner. If we free that particular
owner structure, than in subsequent comparisons
"own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
without the NFSCLSTATE mutex held, this could race with
nfscl_cleanupkext().
This could happen when the exclusive lock is held
on the client, such as when delegations are being returned
or when recovering from NFSERR_EXPIRED.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() so that it will never free more
than the first matching element. Normally there should only
be one element in each list with a matching open/lock owner
anyhow (but there might be a bug that results in a duplicate).
This should guarantee that the FOREACH_SAFE loops in
nfscl_cleanupkext() are adequate.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
and nfscl_freelockowner(), if it is not already held.
This serializes all of these calls with the ones done in
nfscl_cleanup_common().

Reported by: ler
Reviewed by: markj
Tested by: cy
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34334


# 06148d22 24-Feb-2022 Rick Macklem <rmacklem@FreeBSD.org>

Revert "nfscl: Fix a use after free in nfscl_cleanupkext()"

This reverts commit dd08b84e35b6fdee0df5fd0e0533cd361dec71cb.

cy@ reported a problem caused by this patch. He will be
testing an alternate patch, but I'm reverting this one.


# dd08b84e 22-Feb-2022 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix a use after free in nfscl_cleanupkext()

ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
owp->nfsow_owner. If we free that particular
owner structure, than in subsequent comparisons
"own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
without the NFSCLSTATE mutex held, this could race with
nfscl_cleanupkext().
This could happen when the exclusive lock is held
on the client, such as when delegations are being returned.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() to return whether or not a
free was done.
When a free was done, do a goto to restart the loop, instead
of using FOREACH_SAFE, which was not safe in this case.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
and nfscl_freelockowner(), if it not already held.
This serializes all of these calls with the ones done in
nfscl_cleanup_common().

Reported by: ler
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34334


# e0861304 15-Dec-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Handle CB_SEQUENCE not first op correctly

The check for "not first operation" in CB_SEQUENCE
was done after the slot, etc. was updated. This patch
moves the check to the beginning of CB_SEQUENCE
processing.

While here, also fix the check for "no CB_SEQUENCE operation first"
by moving the check to the beginning of callback operation parsing,
since the check was in a couple of the other operations, but
not all of them.

Reported by: rtm@lcs.mit.edu
Tested by: rtm@lcs.mit.edu
PR: 260412
MFC after: 2 weeks


# d9931c25 09-Dec-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Sanity check the callback tag length

The sanity check for tag length in a callback request
was broken in two ways:

It checked for a negative value, but not a large positive
value.

It did not set taglen to -1, to indicate to the code that
it should not be used.

This patch fixes both of these issues.

Reported by: rtm@lcs.mit.edu
Tested by: rtm@lcs.mit.edu
PR: 260266
MFC after: 2 weeks


# ce9676de 12-Nov-2021 Rick Macklem <rmacklem@FreeBSD.org>

pNFS: Add nfsstats counters for number of Layouts

For pNFS, Layouts are issued by the server to indicate
where a file's data resides on the DS(s). This patch
adds counters for how many layouts are allocated to
the nfsstatsv1 structure, using two reserved fields.

MFC after: 2 weeks


# f5d5164f 05-Nov-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix two more cases for forced dismount

Although I was not able to cause a failure during testing, there
are places in nfscl_removedeleg() and nfscl_renamedeleg() where
I think a forced dismount could get hung. This patch fixes those.

This patch only affects forced dismount and only if the NFSv4
server is issuing delegations to the client.

Found by code inspection.

MFC after: 2 weeks


# 44122258 03-Nov-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix use after free for forced dismount

When a forced dismount is done and delegations are being
issued by the server (disabled by default for FreeBSD
servers), the delegation structure is free'd before the
loop calling vflush(). This could result in a use after
free of the delegation structure.

This patch changes the code so that the delegation
structures are not free'd until after the vflush()
loop for forced dismounts.

Found during a recent IETF NFSv4 working group testing event.

MFC after: 2 weeks


# 331883a2 02-Nov-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Check for a forced dismount in nfscl_getref()

The nfscl_getref() function is called within nfscl_doiods() when
the NFSv4.1/4.2 pNFS client is doing I/O on a DS. As such,
nfscl_getref() needs to check for a forced dismount.
This patch adds that check.

Found during a recent IETF NFSv4 working group testing event.

MFC after: 2 weeks


# d5d2ce1c 31-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Do pNFS layout return_on_close synchronously

For pNFS servers that specify that Layouts are to be returned
upon close, they may expect that LayoutReturn to happen before
the associated Close.

This patch modifies the NFSv4.1/4.2 pNFS client so that this
is done. This only affects a pNFS mount against a non-FreeBSD
NFSv4.1/4.2 server that specifies return_on_close in LayoutGet
replies.

Found during a recent IETF NFSv4 working group testing event.

MFC after: 2 weeks


# dc6dd769 29-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Use NFSMNTP_DELEGISSUED in two more functions

Commit 5e5ca4c8fc53 added a NFSMNTP_DELEGISSUED flag to indicate when
a delegation has been issued to the mount. For the common case
where an NFSv4 server is not issuing delegations, this flag
can be checked to avoid acquisition of the NFSCLSTATEMUTEX.

This patch adds checks for NFSMNTP_DELEGISSUED being set
to two more functions.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.

MFC after: 2 week


# 52dee2bc 18-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Handle NFSv4.1/4.2 Close RPC NFSERR_DELAY replies better

Without this patch, if a NFSv4.1/4.2 server replies NFSERR_DELAY to
a Close operation, the client loops retrying the Close while holding
a shared lock on the clientID. This shared lock blocks returns of
delegations, even though the server has issued a CB_RECALL to request
the delegation return.

This patch delays doing a retry of a Close that received a reply of
NFSERR_DELAY until after the shared lock on the clientID is released,
for NFSv4.1/4.2. To fix this for NFSv4.0 would be very difficult and
since the only known NFSv4 server to reply NFSERR_DELAY to Close only
does NFSv4.1/4.2, this fix is hoped to be sufficient.

This problem was detected during a recent IETF working group NFSv4
testing event.

MFC after: 2 week


# e2aab5e2 16-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Move release of the clientID lock into nfscl_doclose()

This patch moves release of the shared clientID lock from nfsrpc_close()
just after the nfscl_doclose() call to the end of nfscl_doclose() call.
This does make the code cleaner, since the shared lock is acquired at
the beginning of nfscl_doclose(). The only semantics change is that
the code no longer drops and reaquires the NFSCLSTATELOCK() mutex,
which I do not believe will have a negative effect on the NFSv4 client.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after: 2 week


# 77c595ce 15-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Add an argument to nfscl_tryclose()

This patch adds a new argument to nfscl_tryclose() to indicate
whether or not it should loop when a NFSERR_DELAY reply is received
from the NFSv4 server. Since this new argument is always passed in
as "true" at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after: 2 week


# 6495766a 14-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Restructure nfscl_freeopen() slightly

This patch factors the unlinking of the nfsclopen structure out of
nfscl_freeopen() into a separate function called nfscl_unlinkopen().
It also adds a new argument to nfscl_freeopen() to conditionally do
the unlink. Since this new argument is always passed in as "true"
at this time, no semantics change should occur.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

MFC after: 2 week


# 24af0fcd 13-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Make nfscl_getlayout() acquire the correct pNFS layout

Without this patch, if a pNFS read layout has already been acquired
for a file, writes would be redirected to the Metadata Server (MDS),
because nfscl_getlayout() would not acquire a read/write layout for
the file. This happened because there was no "mode" argument to
nfscl_getlayout() to indicate whether reading or writing was being done.
Since doing I/O through the Metadata Server is not encouraged for some
pNFS servers, it is preferable to get a read/write layout for writes
instead of redirecting the write to the MDS.

This patch adds a access mode argument to nfscl_getlayout() and
nfsrpc_getlayout(), so that nfscl_getlayout() knows to acquire a read/write
layout for writing, even if a read layout has already been acquired.
This patch only affects NFSv4.1/4.2 client behaviour when pNFS ("pnfs" mount
option against a server that supports pNFS) is in use.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after: 2 week


# b82168e6 12-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix another deadlock related to the NFSv4 clientID lock

Without this patch, it is possible to hang the NFSv4 client,
when a rename/remove is being done on a file where the client
holds a delegation, if pNFS is being used. For a delegation
to be returned, dirty data blocks must be flushed to the NFSv4
server. When pNFS is in use, a shared lock on the clientID
must be acquired while doing a write to the DS(s).
However, if rename/remove is doing the delegation return
an exclusive lock will be acquired on the clientID, preventing
the write to the DS(s) from acquiring a shared lock on the clientID.

This patch stops rename/remove from doing a delegation return
if pNFS is enabled. Since doing delegation return in the same
compound as rename/remove is only an optimization, not doing
so should not cause problems.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after: 1 week


# 120b20bd 11-Oct-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix a deadlock related to the NFSv4 clientID lock

Without this patch, it is possible for a process doing an NFSv4
Open/create of a file to block to allow another process
to acquire the exclusive lock on the clientID when holding
a shared lock on the clientID. As such, both processes
deadlock, with one wanting the exclusive lock, while the
other holds the shared lock. This deadlock is unlikely to occur
unless delegations are in use on the NFSv4 mount.

This patch fixes the problem by not deferring to the process
waiting for the exclusive lock when a shared lock (reference cnt)
is already held by the process.

This problem was detected during a recent NFSv4 interoperability
testing event held by the IETF working group.

MFC after: 1 week


# 62c5be4a 26-Sep-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Add a check for "has acquired a delegation" to nfscl_removedeleg()

Commit 5e5ca4c8fc53 added a flag to a NFSv4 mount point that is set when
the first delegation is acquired from the NFSv4 server.

For a common case where delegations are not being issued by the
NFSv4 server, the nfscl_removedeleg() code acquires the mutex lock for
open/lock state, finds the delegation list empty, then just unlocks the
mutex and returns. This patch adds a check of the flag to avoid the
need to acquire the mutex for this common case.

This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.

This commit should not affect the high level semantics of delegation
handling.

MFC after: 2 weeks


# 3ad1e1c1 11-Aug-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2

This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2
NFS client, which can be used by nfs_lookup() so that a
subsequent Open RPC is not required.
It uses the cn_flags OPENREAD, OPENWRITE added by commit c18c74a87c15.
This reduced the number of RPCs by about 15% for a kernel
build over NFS.

For now, use of Lookup+Open is only done when the "oneopenown"
mount option is used. It may be possible for Lookup+Open to
be used for non-oneopenown NFSv4.1/4.2 mounts, but that will
require extensive further testing to determine if it works.

While here, I've added the changes to the nfscommon module
that are needed to implement the Deallocate NFSv4.2 operation.
This avoids needing another cycle of changes to the internal
KAPI between the NFS modules.

This commit has changed the internal KAPI between the NFS
modules and, as such, all need to be rebuilt from sources.
I have not bumped __FreeBSD_version, since it was bumped a
few days ago.


# efea1bc1 28-Jul-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Cache an open stateid for the "oneopenown" mount option

For NFSv4.1/4.2, if the "oneopenown" mount option is used,
there is, at most, only one open stateid for each NFS vnode.
When an open stateid for a file is acquired, set a pointer to
the open structure in the NFS vnode. This pointer can be used to
acquire the open stateid without searching the open linked list
when the following is true:
- No delegations have been issued for the file. Since delegations
can outlive an NFS vnode for a file, use the global
NFSMNTP_DELEGISSUED flag on the mount to determine this.
- No lock stateid has been issued for the file. To determine
this, a new NFS vnode flag called NMIGHTBELOCKED is set when a lock
stateid is issued, which can then be tested.

When this open structure pointer can be used, it avoids the need to
acquire the NFSCLSTATELOCK() and searching the open structure list for
an open. The NFSCLSTATELOCK() can be highly contended when there are
a lot of opens issued for the NFSv4.1/4.2 mount.

This patch only affects NFSv4.1/4.2 mounts when the "oneopenown"
mount option is used.

MFC after: 2 weeks


# 54ff3b39 28-Jul-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Set correct lockowner for "oneopenown" mount option

For NFSv4.1/4.2, the client may use either an open, lock or
delegation stateid as the stateid argument for an I/O operation.
RFC 5661 defines an order of preference of delegation, then lock
and finally open stateid for the argument, although NFSv4.1/4.2
servers are expected to handle any stateid type.

For the "oneopenown" mount option, the lock owner was not being
correctly generated and, as such, the I/O operation would use an
open stateid, even when a lock stateid existed. Although this
did not and should not affect an NFSv4.1/4.2 server's behaviour,
this patch makes the behaviour for "oneopenown" the same as when
the mount option is not specified.

Found during inspection of packet captures. No failure during
testing against NFSv4.1/4.2 servers of the unpatched code occurred.

MFC after: 2 weeks


# 7685f834 19-Jul-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts

For NFSv4.1/4.2, the client may set the "seqid" field of the
stateid to 0 in RPC requests. This indicates to the server that
it should not check the "seqid" or return NFSERR_OLDSTATEID if the
"seqid" value is not up to date w.r.t. Open/Lock operations
on the stateid. This "seqid" is incremented by the NFSv4 server
for each Open/OpenDowngrade/Lock/Locku operation done on the stateid.

Since a failure return of NFSERR_OLDSTATEID is of no use to
the client for I/O operations, it makes sense to set "seqid"
to 0 for the stateid argument for I/O operations.
This avoids server failure replies of NFSERR_OLDSTATEID,
although I am not aware of any case where this failure occurs.

This makes the FreeBSD NFSv4.1/4.2 client compatible with the
Linux NFSv4.1/4.2 client.

MFC after: 2 weeks


# a145cf3f 24-Jun-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Change the default minor version for NFSv4 mounts

When NFSv4.1 support was added to the client, the implementation was
still experimental and, as such, the default minor version was set to 0.
Since the NFSv4.1 client implementation is now believed to be solid
and the NFSv4.1/4.2 protocol is significantly better than NFSv4.0,
I beieve that NFSv4.1/4.2 should be used where possible.

This patch changes the default minor version for NFSv4 to be the highest
minor version supported by the NFSv4 server. If a specific minor version
is desired, the "minorversion" mount option can be used to override
this default. This is compatible with the Linux NFSv4 client behaviour.

This was discussed on freebsd-current@ in mid-May 2021 under
the subject "changing the default NFSv4 minor version" and
the consensus seemed to be support for this change.
It also appeared that changing this for FreeBSD 13.1 was
not considered a POLA violation, so long as UPDATING
and RELNOTES entries were made for it.

MFC after: 2 weeks


# aed98fa5 15-Jun-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Make NFSv4.0 client acquisition NFSv4.1/4.2 compatible

When the NFSv4.0 client was implemented, acquisition of a clientid
via SetClientID/SetClientIDConfirm was done upon the first Open,
since that was when it was needed. NFSv4.1/4.2 acquires the clientid
during mount (via ExchangeID/CreateSession), since the associated
session is required during mount.

This patch modifies the NFSv4.0 mount so that it acquires the
clientid during mount. This simplifies the code and makes it
easy to implement "find the highest minor version supported by
the NFSv4 server", which will be done for the default minorversion
in a future commit.
The "start_renewthread" argument for nfscl_getcl() is replaced
by "tryminvers", which will be used by the aforementioned
future commit.

MFC after: 2 weeks


# 5e5ca4c8 09-Jun-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Add a "has acquired a delegation" flag for delegations

A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently.

For a common case where delegations are not being issued by the
NFSv4 server, the code acquires the mutex lock for open/lock state,
finds the delegation list empty and just unlocks the mutex and returns.
This patch adds an NFS mount point flag that is set when a delegation
is issued for the mount. Then the patched code checks for this flag
before acquiring the open/lock mutex, avoiding the need to acquire
the lock for the case where delegations are not being issued by the
NFSv4 server.
This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.

This commit should not affect the high level semantics of delegation
handling.

MFC after: 2 weeks


# 96b40b89 27-May-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Use hash lists to improve expected search performance for opens

A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently. When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.

Commit 3f7e14ad9345 added a hash table of lists hashed on file handle
for the opens. This patch uses the hash lists for searching for
a matching open based of file handle instead of an exhaustive
linear search of all opens.
This change appears to be performance neutral for a small number
of opens, but should improve expected performance for a large
number of opens.

This commit should not affect the high level semantics of open
handling.

MFC after: 2 weeks


# 724072ab 25-May-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Use hash lists to improve expected search performance for opens

A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently. When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.

Commit 3f7e14ad9345 added a hash table of lists hashed on file handle
for the opens. This patch uses the hash lists for searching for
a matching open based of file handle instead of an exhaustive
linear search of all opens.
This change appears to be performance neutral for a small number
of opens, but should improve expected performance for a large
number of opens. This patch also moves any found match to the front
of the hash list, to try and maintain the hash lists in recently
used ordering (least recently used at the end of the list).

This commit should not affect the high level semantics of open
handling.

MFC after: 2 weeks


# 3f7e14ad 22-May-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Add hash lists for the NFSv4 opens

A problem was reported via email, where a large (130000+) accumulation
of NFSv4 opens on an NFSv4 mount caused significant lock contention
on the mutex used to protect the client mount's open/lock state.
Although the root cause for the accumulation of opens was not
resolved, it is obvious that the NFSv4 client is not designed to
handle 100000+ opens efficiently. When searching for an open,
usually for a match by file handle, a linear search of all opens
is done.

This patch adds a table of hash lists for the opens, hashed on
file handle. This table will be used by future commits to
search for an open based on file handle more efficiently.

MFC after: 2 weeks


# c28cb257 19-May-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: Fix NFSv4.1/4.2 mount recovery from an expired lease

The most difficult NFSv4 client recovery case happens when the
lease has expired on the server. For NFSv4.0, the client will
receive a NFSERR_EXPIRED reply from the server to indicate this
has happened.
For NFSv4.1/4.2, most RPCs have a Sequence operation and, as such,
the client will receive a NFSERR_BADSESSION reply when the lease
has expired for these RPCs. The client will then call nfscl_recover()
to handle the NFSERR_BADSESSION reply. However, for the expired lease
case, the first reclaim Open will fail with NFSERR_NOGRACE.

This patch recognizes this case and calls nfscl_expireclient()
to handle the recovery from an expired lease.

This patch only affects NFSv4.1/4.2 mounts when the lease
expires on the server, due to a network partitioning that
exceeds the lease duration or similar.

MFC after: 2 weeks


# f6fec55f 27-Apr-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: add check for NULL clp and forced dismounts to nfscl_delegreturnvp()

Commit aad780464fad added a function called nfscl_delegreturnvp()
to return delegations during the NFS VOP_RECLAIM().
The function erroneously assumed that nm_clp would
be non-NULL. It will be NULL for NFSV4.0 mounts until
a regular file is opened. It will also be NULL during
vflush() in nfs_unmount() for a forced dismount.

This patch adds a check for clp == NULL to fix this.

Also, since it makes no sense to call nfscl_delegreturnvp()
during a forced dismount, the patch adds a check for that
case and does not do the call during forced dismounts.

PR: 255436
Reported by: ish@amail.plala.or.jp
MFC after: 2 weeks


# aad78046 25-Apr-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: return delegations in the NFS VOP_RECLAIM()

After a vnode is recycled it can no longer be
acquired via vfs_hash_get() and, as such,
a delegation for the vnode cannot be recalled.

In the unlikely event that a delegation still
exists when the vnode is being recycled, return
the delegation since it will no longer be
recallable.

Until you have this patch in your NFSv4 client,
you should consider avoiding the use of delegations.

MFC after: 2 weeks


# 02695ea8 25-Apr-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfscl: fix delegation recall when the file is not open

Without this patch, if a NFSv4 server recalled a
delegation when the file is not open, the renew
thread would block in the NFS VOP_INACTIVE()
trying to acquire the client state lock that it
already holds.

This patch fixes the problem by delaying the
vrele() call until after the client state
lock is released.

This bug has been in the NFSv4 client for
a long time, but since it only affects
delegation when recalled due to another
client opening the file, it got missed
during previous testing.

Until you have this patch in your client,
you should avoid the use of delegations.

MFC after: 2 weeks


# 4e6c2a1e 01-Apr-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfsv4 client: factor loop contents out into a separate function

Commit fdc9b2d50fe9 replaced a couple of while loops with LIST_FOREACH()
loops. This patch factors the body of that loop out into a separate
function called nfscl_checkown().
This prepares the code for future changes to use a hash table of
lists for open searches via file handle.

This patch should not result in a semantics change.

MFC after: 2 weeks


# fdc9b2d5 29-Mar-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfsv4 client: replace while loops with LIST_FOREACH() loops

This patch replaces a couple of while() loops with LIST_FOREACH() loops.
While here, declare a couple of variables "bool".
I think LIST_FOREACH() is preferred and makes the code more readable.
This also prepares the code for future changes to use a hash table of
lists for open searches via file handle.

This patch should not result in a semantics change.

MFC after: 2 weeks


# e61b29ab 29-Mar-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfsv4.1/4.2 client: fix handling of delegations for "oneopenown" mnt option

If a delegation for a file has been acquired, the "oneopenown" option
was ignored when the local open was issued. This could result in multiple
openowners/opens for a file, that would be transferred to the server
when the delegation was recalled.
This would not be serious, but could result in more than one openowner.
Since the Amazon/EFS does not issue delegations, this probably never
occurs in practice.
Spotted during code inspection.

This small patch fixes the code so that it checks for "oneopenown"
when doing client local opens on a delegation.

MFC after: 2 weeks


# 82ee386c 23-Mar-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfsv4 client: fix forced dismount when sleeping in the renew thread

During a recent NFSv4 testing event a test server caused a hang
where "umount -N" failed. The renew thread was sleeping on "nfsv4lck"
and the "umount" was sleeping, waiting for the renew thread to
terminate.

This is the second of two patches that is hoped to fix the renew thread
so that it will terminate when "umount -N" is done on the mount.

This patch adds a 5second timeout on the msleep()s and checks for
the forced dismount flag so that the renew thread will
wake up and see the forced dismount flag. Normally a wakeup()
will occur in less than 5seconds, but if a premature return from
msleep() does occur, it will simply loop around and msleep() again.
The patch also adds the "mp" argument to nfsv4_lock() so that it
will return when the forced dismount flag is set.

While here, replace the nfsmsleep() wrapper that was used for portability
with the actual msleep() call.

MFC after: 2 weeks


# fd232a21 18-Mar-2021 Rick Macklem <rmacklem@FreeBSD.org>

nfsv4 pnfs client: fix updating of the layout stateid.seqid

During a recent NFSv4 testing event a test server was replying
NFSERR_OLDSTATEID for layout stateids presented to the server
for LayoutReturn operations. Upon rereading RFC5661, it was
apparent that the FreeBSD NFSv4.1/4.2 pNFS client did not
maintain the seqid field of the layout stateid correctly.

This patch is believed to correct the problem. Tested against
a FreeBSD pNFS server with diagnostics added to check the stateid's
seqid did not indicate problems. Unfortunately, testing aginst
this server will not happen in the near future, so the fix may
not be correct yet.

MFC after: 2 weeks


# 586ee69f 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

fs: clean up empty lines in .c and .h files


# eea79fde 17-Jun-2020 Alan Somers <asomers@FreeBSD.org>

Remove vfs_statfs and vnode_mount macros from NFS

These macro definitions are no longer needed as the NFS OSX port is long
dead. The vfs_statfs macro conflicts with the vfsops field of the same
name.

Submitted by: shivank@
Reviewed by: rmacklem
MFC after: 2 weeks
Sponsored by: Google, Inc. (GSoC 2020)
Differential Revision: https://reviews.freebsd.org/D25263


# b9cc3262 12-May-2020 Ryan Moeller <freqlabs@FreeBSD.org>

nfs: Remove APPLESTATIC macro

It is no longer useful.

Reviewed by: rmacklem
Approved by: mav (mentor)
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D24811


# 32033b3d 08-May-2020 Ryan Moeller <freqlabs@FreeBSD.org>

Remove APPLEKEXT ifndefs

They are no longer useful.

Reviewed by: rmacklem
Approved by: mav (mentor)
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D24752


# 897d7d45 22-Apr-2020 Rick Macklem <rmacklem@FreeBSD.org>

Make the NFSv4.n client's recovery from NFSERR_BADSESSION RFC5661 conformant.

RFC5661 specifies that a client's recovery upon receipt of NFSERR_BADSESSION
should first consist of a CreateSession operation using the extant ClientID.
If that fails, then a full recovery beginning with the ExchangeID operation
is to be done.
Without this patch, the FreeBSD client did not attempt the CreateSession
operation with the extant ClientID and went directly to a full recovery
beginning with ExchangeID. I have had this patch several years, but since
no extant NFSv4.n server required the CreateSession with extant ClientID,
I have never committed it.
I an committing it now, since I suspect some future NFSv4.n server will
require this and it should not negatively impact recovery for extant NFSv4.n
servers, since they should all return NFSERR_STATECLIENTID for this first
CreateSession.

The patched client has been tested for recovery against both the FreeBSD
and Linux NFSv4.n servers and no problems have been observed.

MFC after: 1 month


# c057a378 12-Dec-2019 Rick Macklem <rmacklem@FreeBSD.org>

Add support for NFSv4.2 to the NFS client and server.

This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes
(RFC-8276) to the NFS client and server.
NFSv4.2 is comprised of several optional features that can be supported
in addition to NFSv4.1. This patch adds the following optional features:
- posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED)
- posix_fallocate()
- intra server file range copying via the copy_file_range(2) syscall
--> Avoiding data tranfer over the wire to/from the NFS client.
- lseek(SEEK_DATA/SEEK_HOLE)
- Extended attribute syscalls for "user" namespace attributes as defined
by RFC-8276.

Although this patch is fairly large, it should not affect support for
the other versions of NFS. However it does add two new sysctls that allow
a sysadmin to limit which minor versions of NFSv4 a server supports, allowing
a sysadmin to disable NFSv4.2.

Unfortunately, when the NFS stats structure was last revised, it was assumed
that there would be no additional operations added beyond what was
specified in RFC-7862. However RFC-8276 did add additional operations,
forcing the NFS stats structure to revised again. It now has extra unused
entries in all arrays, so that future extensions to NFSv4.2 can be
accomodated without revising this structure again.

A future commit will update nfsstat(1) to report counts for the new NFSv4.2
specific operations/procedures.

This patch affects the internal interface between the nfscommon, nfscl and
nfsd modules and, as such, they all must be upgraded simultaneously.
I will do a version bump (although arguably not needed), due to this.

This code has survived a "make universe" but has not been built with a
recent GCC. If you encounter build problems, please email me.

Relnotes: yes


# eeb1f3ed 14-Apr-2019 Rick Macklem <rmacklem@FreeBSD.org>

Fix the NFSv4 client to safely find processes.

r340744 broke the NFSv4 client, because it replaced pfind_locked() with a
call to pfind(), since pfind() acquires the sx lock for the pid hash and
the NFSv4 already holds a mutex when it does the call.
The patch fixes the problem by recreating a pfind_any_locked() and adding the
functions pidhash_slockall() and pidhash_sunlockall to acquire/release
all of the pid hash locks.
These functions are then used by the NFSv4 client instead of acquiring
the allproc_lock and calling pfind().

Reviewed by: kib, mjg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19887


# c9172fb4 18-Feb-2019 Edward Tomasz Napierala <trasz@FreeBSD.org>

Work around the "nfscl: bad open cnt on server" assertion
that can happen when rerooting into NFSv4 rootfs with kernel
built with INVARIANTS.

I've talked to rmacklem@ (back in 2017), and while the root cause
is still unknown, the case guarded by assertion (nfscl_doclose()
being called from VOP_INACTIVE) is believed to be safe, and the
whole thing seems to run just fine.

Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA, AFRL


# 743d5281 30-Jul-2018 Rick Macklem <rmacklem@FreeBSD.org>

Silence newer gcc warnings.

Newer versions of gcc generate "set, but not used" warnings.
Add __unused macros to silence these warnings.
Although the variables are not being used, they are values parsed from
arguments to callback RPCs that might be needed in the future.

Requested by: mmacy


# 33f4bcca 27-Jul-2018 Eitan Adler <eadler@FreeBSD.org>

Use https over http for FreeBSD pages


# 5da38824 15-Jul-2018 Rick Macklem <rmacklem@FreeBSD.org>

Shut down the TCP connection to a DS in the pNFS client when Renew fails.

When a NFSv4.1 client mount using pNFS detects a failure trying to do a
Renew (actually just a Sequence operation), the code would simply try
again and again and again every 30sec.
This would tie up the "nfscl" thread, which should also be doing other
things like Renews on other DSs and the MDS.
This patch adds code which closes down the TCP connection and marks it
defunct when Renew detects an failure to communicate with the DS, so
further Renews will not be attempted until a new working TCP connection to
the DS is established.
It also makes the call to nfscl_cancelreqs() unconditional, since
nfscl_cancelreqs() checks the NFSCLDS_SAMECONN flag and does so while holding
the lock.
This fix only applies to the NFSv4.1 client whne using pNFS and without it
the only effect would have been an "nfscl" thread busy doing Renew attempts
on an unresponsive DS.

MFC after: 2 weeks


# 89c64a3a 14-Jul-2018 Rick Macklem <rmacklem@FreeBSD.org>

Fix the pNFS client when mirrors aren't on the same machine.

Without this patch, the client side NFSv4.1 pNFS code erroneously did writes
and commits to both DS mirrors using the TCP connection of the first one.
For my test setup this worked, since I have both DSs running on the same
machine, but it would have failed when the DSs are on separate machines.
This patch fixes the code to use the correct TCP connection for each DS.
This patch should only affect the NFSv4.1 client when using "pnfs" mounts
to mirrored DSs.

MFC after: 2 weeks


# 0e7bd20b 13-Jul-2018 Rick Macklem <rmacklem@FreeBSD.org>

Close down the TCP connection to a pNFS DS when it is disabled.

So long as the TCP connection to a pNFS DS isn't shared with other DSs,
it can be closed down when the DS is being disabled in the pNFS client.
This causes any RPCs in progress to fail.
This patch only affects the NFSv4.1 pNFS client when errors occur
while doing I/O on a DS.

MFC after: 2 weeks


# 2e35b8fe 22-Jun-2018 Rick Macklem <rmacklem@FreeBSD.org>

Change the NFSv4.1 pNFS client so that it returns the DS error in layoutreturn.

When the NFSv4.1 pNFS client gets an error for a DS I/O operation using a
Flexible File layout, it returns the layout with an error.
This patch changes the code slightly, so that it returns the layout for all
errors except EACCES and lets the MDS decide what to do based on the error.
It also makes a couple of changes to nfscl_layoutrecall() to ensure that
the first layoutreturn(s) will have the error in the reply.
Plus, the patch adds a wakeup() so that the "nfscl" thread won't wait 1sec
before doing the LayoutReturn.
Tested against the pNFS service.
This patch should not affect non-pNFS use of the client.
The unused "dsp" argument will be used by a future patch that disables the
connection to the DS when possible.

MFC after: 2 weeks


# 2bad6424 17-Jun-2018 Rick Macklem <rmacklem@FreeBSD.org>

Make the pNFS NFSv4.1 client return a Flexible File layout upon error.

The Flexible File layout LayoutReturn operation has argument fields where
an I/O error encountered when attempting I/O on a DS can be reported back
to the MDS.
This patch adds code to the client to do this for the Flexible File layout
mirrored case.
This patch should only affect mounts using the "pnfs" option against servers
that support the Flexible File layout.

MFC after: 2 weeks


# 90d2dfab 12-Jun-2018 Rick Macklem <rmacklem@FreeBSD.org>

Merge the pNFS server code from projects/pnfs-planb-server into head.

This code merge adds a pNFS service to the NFSv4.1 server. Although it is
a large commit it should not affect behaviour for a non-pNFS NFS server.
Some documentation on how this works can be found at:
http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt
and will hopefully be turned into a proper document soon.
This is a merge of the kernel code. Userland and man page changes will
come soon, once the dust settles on this merge.
It has passed a "make universe", so I hope it will not cause build problems.
It also adds NFSv4.1 server support for the "current stateid".

Here is a brief overview of the pNFS service:
A pNFS service separates the Read/Write oeprations from all the other NFSv4.1
Metadata operations. It is hoped that this separation allows a pNFS service
to be configured that exceeds the limits of a single NFS server for either
storage capacity and/or I/O bandwidth.
It is possible to configure mirroring within the data servers (DSs) so that
the data storage file for an MDS file will be mirrored on two or more of
the DSs.
When this is used, failure of a DS will not stop the pNFS service and a
failed DS can be recovered once repaired while the pNFS service continues
to operate. Although two way mirroring would be the norm, it is possible
to set a mirroring level of up to four or the number of DSs, whichever is
less.
The Metadata server will always be a single point of failure,
just as a single NFS server is.

A Plan B pNFS service consists of a single MetaData Server (MDS) and K
Data Servers (DS), all of which are recent FreeBSD systems.
Clients will mount the MDS as they would a single NFS server.
When files are created, the MDS creates a file tree identical to what a
single NFS server creates, except that all the regular (VREG) files will
be empty. As such, if you look at the exported tree on the MDS directly
on the MDS server (not via an NFS mount), the files will all be of size 0.
Each of these files will also have two extended attributes in the system
attribute name space:
pnfsd.dsfile - This extended attrbute stores the information that
the MDS needs to find the data storage file(s) on DS(s) for this file.
pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime
and Change attributes for the file, so that the MDS doesn't need to
acquire the attributes from the DS for every Getattr operation.
For each regular (VREG) file, the MDS creates a data storage file on one
(or more if mirroring is enabled) of the DSs in one of the "dsNN"
subdirectories. The name of this file is the file handle
of the file on the MDS in hexadecimal so that the name is unique.
The DSs use subdirectories named "ds0" to "dsN" so that no one directory
gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize
on the MDS, with the default being 20.
For production servers that will store a lot of files, this value should
probably be much larger.
It can be increased when the "nfsd" daemon is not running on the MDS,
once the "dsK" directories are created.

For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces
of information to the client that allows it to do I/O directly to the DS.
DeviceInfo - This is relatively static information that defines what a DS
is. The critical bits of information returned by the FreeBSD
server is the IP address of the DS and, for the Flexible
File layout, that NFSv4.1 is to be used and that it is
"tightly coupled".
There is a "deviceid" which identifies the DeviceInfo.
Layout - This is per file and can be recalled by the server when it
is no longer valid. For the FreeBSD server, there is support
for two types of layout, call File and Flexible File layout.
Both allow the client to do I/O on the DS via NFSv4.1 I/O
operations. The Flexible File layout is a more recent variant
that allows specification of mirrors, where the client is
expected to do writes to all mirrors to maintain them in a
consistent state. The Flexible File layout also allows the
client to report I/O errors for a DS back to the MDS.
The Flexible File layout supports two variants referred to as
"tightly coupled" vs "loosely coupled". The FreeBSD server always
uses the "tightly coupled" variant where the client uses the
same credentials to do I/O on the DS as it would on the MDS.
For the "loosely coupled" variant, the layout specifies a
synthetic user/group that the client uses to do I/O on the DS.
The FreeBSD server does not do striping and always returns
layouts for the entire file. The critical information in a layout
is Read vs Read/Writea and DeviceID(s) that identify which
DS(s) the data is stored on.

At this time, the MDS generates File Layout layouts to NFSv4.1 clients
that know how to do pNFS for the non-mirrored DS case unless the sysctl
vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File
layouts are generated.
The mirrored DS configuration always generates Flexible File layouts.
For NFS clients that do not support NFSv4.1 pNFS, all I/O operations
are done against the MDS which acts as a proxy for the appropriate DS(s).
When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy.
If the DS is on the same machine, the MDS/DS will do the RPC on the DS as
a proxy and so on, until the machine runs out of some resource, such as
session slots or mbufs.
As such, DSs must be separate systems from the MDS.

Tested by: james.rose@framestore.com
Relnotes: yes


# 80977534 10-Jun-2018 Rick Macklem <rmacklem@FreeBSD.org>

Add checks for the Flexible File layout to LayoutRecall callbacks.

The Flexible File layout case wasn't handled by LayoutRecall callbacks
because it just checked for File layout and returned NFSERR_NOMATCHLAYOUT
otherwise. This patch adds the Flexible File layout handling.
Found during testing of the pNFS server.

MFC after: 2 weeks


# 260785fe 26-May-2018 Rick Macklem <rmacklem@FreeBSD.org>

Fix the sleep event for layout recall.

The sleep for I/O completion during an NFSv4.1 pNFS layout recall used
the wrong event value and could result in the "[nfscl]" thread hung
for the mount.
This patch fixes the event to be the correct.
This bug will only affect NFSv4.1 pnfs mounts and only when the server
does a layout recall callback, so it won't affect many. Without the patch,
a mount without the "pnfs" option will avoid the problem.
Found during testing of the pNFS server.

MFC after: 1 week


# 222daa42 25-Jan-2018 Conrad Meyer <cem@FreeBSD.org>

style: Remove remaining deprecated MALLOC/FREE macros

Mechanically replace uses of MALLOC/FREE with appropriate invocations of
malloc(9) / free(9) (a series of sed expressions). Something like:

* MALLOC(a, b, ... -> a = malloc(...
* FREE( -> free(
* free((caddr_t) -> free(

No functional change.

For now, punt on modifying contrib ipfilter code, leaving a definition of
the macro in its KMALLOC().

Reported by: jhb
Reviewed by: cy, imp, markj, rmacklem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14035


# 151ba793 24-Dec-2017 Alexander Kabaev <kan@FreeBSD.org>

Do pass removing some write-only variables from the kernel.

This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385


# d63027b6 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/fs: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# b949cc41 05-Oct-2017 Rick Macklem <rmacklem@FreeBSD.org>

Add Flex File Layout support to the NFSv4.1 pNFS client.

This patch adds support for the Flexible File Layout to the pNFS client.
Although the patch is rather large, it should only affect NFS mounts
using the "pnfs" option against pNFS servers that do not support File
Layout.
There are still a couple of things missing from the Flexible File Layout
client implementation:
- The code does not yet do a LayoutReturn with I/O error stats when
I/O error(s) occur when attempting to do I/O on a DS.
This will be fixed in a future commit, since it is important for the
MDS to know that I/O on a DS is failing.
- The current code does writes and commits to mirror DSs serially.
Making them happen concurrently will be done in a future commit,
after discussion on freebsd-current@ on the best way to do this.
- The code does not handle NFSv4.0 DSs. Since there is no extant pNFS
server that implements NFSv4.0 DSs and NFSv4.1 DSs makes more sense
now, I don't intend to implement this until there is a need for it.
There is support for NFSv4.1 and NFSv3 DSs.


# bd290946 27-Sep-2017 Rick Macklem <rmacklem@FreeBSD.org>

Fix a memory leak that occurred in the pNFS client.

When a "pnfs" NFSv4.1 mount was unmounted, it didn't free up the layouts
and deviceinfo structures. This leak only affects "pnfs" mounts and only
when the mount is umounted.
Found while testing the pNFS Flexible File layout client code.

MFC after: 2 weeks


# b0932afa 19-Sep-2017 Rick Macklem <rmacklem@FreeBSD.org>

Simplify nfsrpc_layoutreturn() args.

Simplify nfsrpc_layoutreturn() args. in preparation for the addition
of Flex File layout support, since File layout uses a 0 length field.
Flex Files does use a longer field, but that will be added in a
subsequent commit.


# ab118d04 19-Sep-2017 Rick Macklem <rmacklem@FreeBSD.org>

Simplify nfsrpc_layoutcommit() args.

Simplify nfsrpc_layoutcommit() args. in preparation for the addition
of Flex File layout support, since it also uses a 0 length field.


# 16f300fa 27-Jul-2017 Rick Macklem <rmacklem@FreeBSD.org>

Replace the checks for MNTK_UNMOUNTF with a macro that does the same thing.

This patch defines a macro that checks for MNTK_UNMOUNTF and replaces
explicit checks with this macro. It has no effect on semantics, but
prepares the code for a future patch where there will also be a
NFS specific flag for "forced dismount about to occur".

Suggested by: kib
MFC after: 2 weeks


# 6d7963ec 21-Jun-2017 Rick Macklem <rmacklem@FreeBSD.org>

Ensure that the credentials field of the NFSv4 client open structure is
initialized.

bdrewery@ has reported panics "newnfs_copycred: negative nfsc_ngroups".
The only way I can see that this occurs is that the credentials field of
the open structure gets used before being filled in.
I am not sure quite how this happens, but for the file create case, the
code is serialized via the vnode lock on the directory. If, somehow, a
link to the same file gets created just after file creation, this might
occur.

This patch ensures that the credentials field is initialized to a reasonable
set of credentials before the structure is linked into any list, so I
this should ensure it is initialized before use.
I am committing the patch now, since bdrewery@ notes that the panics
are intermittent and it may be months before he knows if the patch fixes
his problem.

Reported by: bdrewery
MFC after: 2 weeks


# ad81354c 26-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Fix handling of a NFSv4.1 callback reply from the session cache.

The nfsv4_seqsession() call returns NFSERR_REPLYFROMCACHE when it has a
reply in the session, due to a requestor retry. The code erroneously
assumed a return of 0 for this case. This patch fixes this and adds
a KASSERT(). This would be an extremely rare occurrence. It was found
during code inspection during the pNFS server development.

MFC after: 2 weeks


# 6406db24 23-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Make the NFSv4 client to use a write open for reading if allowed by the server.

An NFSv4 server has the option of allowing a Read to be done using a Write
Open. If this is not allowed, the server will return NFSERR_OPENMODE.
This patch attempts the read with a write open and then disables this
if the server replies NFSERR_OPENMODE.
This change will avoid some uses of the special stateids. This will be
useful for pNFS/DS Reads, since they cannot use special stateids.
It will also be useful for any NFSv4 server that does not support reading
via the special stateids. It has been tested against both types of NFSv4 server.

MFC after: 2 weeks


# 4e47dd18 22-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Fix the NFSv4.1/pNFS client return layout on close.

The "return layout on close" case in the pNFS client was badly broken.
Fortunately, extant pNFS servers that I have tested against do not
do this. This patch fixes it. It also changes the way the layout stateid.seqid
is set for LayoutReturn. I think this change is correct w.r.t. the RFC,
but I am not 100% sure.
This was found during recent testing of the pNFS server under development.

MFC after: 2 weeks


# 6d4377c1 14-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Remove unused "cred" argument to ncl_flush().

The "cred" argument of ncl_flush() is unused and it was confusing to have
the code passing in NULL for this argument in some cases. This patch deletes
this argument.
There is no semantic change because of this patch.

MFC after: 2 weeks


# 037a2012 13-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Add an NFSv4.1 mount option for "use one openowner".

Some NFSv4.1 servers such as AmazonEFS can only support a small fixed number
of open_owner4s. This patch adds a mount option called "oneopenown" that
can be used for NFSv4.1 mounts to make the client do all Opens with the
same open_owner4 string. This option can only be used with NFSv4.1 and
may not work correctly when Delegations are is use.

Reported by: cperciva
Tested by: cperciva
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D8988


# 7d9b62e1 11-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Don't throw away Open state when a NFSv4.1 client recovery fails.

If the ExchangeID/CreateSession operations done by an NFSv4.1 client
after the server crashes/reboots fails, it is possible that some process/thread
is waiting for an open_owner lock. If the client state is free'd, this
can cause a crash.
This would not normally happen, but has been observed on a mount of the
AmazonEFS service.

Reported by: cperciva
Tested by: cperciva
PR: 216086
MFC after: 2 weeks


# 9e507a6a 11-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

During a server crash recovery, fix the NFSv4.1 client for a NFSERR_BADSESSION
during recovery.

If the NFSv4.1 client gets a NFSv4.1 NFSERR_BADSESSION reply to an Open/Lock
operation while recovering from the server crash/reboot, allow the opens
to be retained for a subsequent recovery attempt. Since NFSv4.1 servers
should only reply NFSERR_BADSESSION after a crash/reboot that has lost
state, this case should almost never happen.
However, for the AmazonEFS file service, this has been observed when
the client does a fresh TCP connection for RPCs.

Reported by: cperciva
Tested by: cperciva
PR: 216088
MFC after: 2 weeks


# 8efffe80 09-Apr-2017 Rick Macklem <rmacklem@FreeBSD.org>

Avoid starvation of the server crash recovery thread for the NFSv4 client.

This patch gives a requestor of the exclusive lock on the client state
in the NFSv4 client priority over shared lock requestors. This avoids
the server crash recovery thread being starved out by other threads doing
RPCs.

Tested by: cperciva
PR: 216087
MFC after: 2 weeks


# b2fc0141 23-Dec-2016 Rick Macklem <rmacklem@FreeBSD.org>

Fix NFSv4.1 client recovery from NFS4ERR_BAD_SESSION errors.

For most NFSv4.1 servers, a NFS4ERR_BAD_SESSION error is a rare failure
that indicates that the server has lost session/open/lock state.
However, recent testing by cperciva@ against the AmazonEFS server found
several problems with client recovery from this due to it generating this
failure frequently.
Briefly, the problems fixed are:
- If all session slots were in use at the time of the failure, some processes
would continue to loop waiting for a slot on the old session forever.
- If an RPC that doesn't use open/lock state failed with NFS4ERR_BAD_SESSION,
it would fail the RPC/syscall instead of initiating recovery and then
looping to retry the RPC.
- If a successful reply to an RPC for an old session wasn't processed
until after a new session was created for a NFS4ERR_BAD_SESSION error,
it would erroneously update the new session and corrupt it.
- The use of the first element of the session list in the nfs mount
structure (which is always the current metadata session) was slightly
racey. With changes for the above problems it became more racey, so all
uses of this head pointer was wrapped with a NFSLOCKMNT()/NFSUNLOCKMNT().
- Although the kernel malloc() usually allocates more bytes than requested
and, as such, this wouldn't have caused problems, the allocation of a
session structure was 1 byte smaller than it should have been.
(Null termination byte for the string not included in byte count.)

There are probably still problems with a pNFS data server that fails
with NFS4ERR_BAD_SESSION, but I have no server that does this to test
against (the AmazonEFS server doesn't do pNFS), so I can't fix these yet.

Although this patch is fairly large, it should only affect the handling
of NFS4ERR_BAD_SESSION error replies from an NFSv4.1 server.
Thanks go to cperciva@ for the extension testing he did to help isolate/fix
these problems.

Reported by: cperciva
Tested by: cperciva
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D8745


# 1b819cf2 12-Aug-2016 Rick Macklem <rmacklem@FreeBSD.org>

Update the nfsstats structure to include the changes needed by
the patch in D1626 plus changes so that it includes counts for
NFSv4.1 (and the draft of NFSv4.2).
Also, make all the counts uint64_t and add a vers field at the
beginning, so that future revisions can easily be implemented.
There is code in place to handle the old vesion of the nfsstats
structure for backwards binary compatibility.

Subsequent commits will update nfsstat(8) to use the new fields.

Submitted by: will (earlier version)
Reviewed by: ken
MFC after: 1 month
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D1626


# a96c9b30 29-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

NFS: spelling fixes on comments.

No funcional change.


# 74b8d63d 10-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

Cleanup unnecessary semicolons from the kernel.

Found with devel/coccinelle.


# c59e4cc3 01-Jul-2014 Rick Macklem <rmacklem@FreeBSD.org>

Merge the NFSv4.1 server code in projects/nfsv4.1-server over
into head. The code is not believed to have any effect
on the semantics of non-NFSv4.1 server behaviour.
It is a rather large merge, but I am hoping that there will
not be any regressions for the NFS server.

MFC after: 1 month


# 2d5f8359 28-Jun-2014 Rick Macklem <rmacklem@FreeBSD.org>

There might be a potential race condition for the NFSv4 client
when a newly created file has another open done on it that
update the open mode. This patch moves the code that updates
the open mode up into the block where the mutex is held to
ensure this cannot happen. No bug caused by this potential
race has been observed, but this fix is a safety belt to ensure
it cannot happen.

MFC after: 2 weeks


# b921158a 23-Dec-2013 Rick Macklem <rmacklem@FreeBSD.org>

The NFSv4 client was passing both the p and cred arguments to
nfsv4_fillattr() as NULLs for the Getattr callback. This caused
nfsv4_fillattr() to not fill in the Change attribute for the reply.
I believe this was a violation of the RFC, but had little effect on
server behaviour. This patch passes a non-NULL p argument to fix this.

MFC after: 1 week


# 6b8fe5d5 23-Dec-2013 Rick Macklem <rmacklem@FreeBSD.org>

The NFSv4.1 client didn't return NFSv4.1 specific error codes
for the Getattr and Recall callbacks. This patch fixes it.
Since the NFSv4.1 specific error codes would only happen for
abnormal circumstances, this patch has little effect, in practice.

MFC after: 1 week


# 2e6a4b0c 22-Jun-2013 Rick Macklem <rmacklem@FreeBSD.org>

Fix r252074 so that it builds on 64bit arches.


# 1dd95a04 21-Jun-2013 Rick Macklem <rmacklem@FreeBSD.org>

The NFSv4.1 LayoutCommit operation requires a valid offset and length.
(0, 0 is not sufficient) This patch a loop for each file layout, using
the offset, length of each file layout in a separate LayoutCommit.


# b96f7e0a 20-Feb-2013 Warner Losh <imp@FreeBSD.org>

The request queue is already locked, so we don't need the splsofclock/splx
here to note future work.


# a89a2c8b 25-Jan-2013 John Baldwin <jhb@FreeBSD.org>

Further cleanups to use of timestamps in NFS:
- Use NFSD_MONOSEC (which maps to time_uptime) instead of the seconds
portion of wall-time stamps to manage timeouts on events.
- Remove unused nd_starttime from the per-request structure in the new
NFS server.
- Use nanotime() for the modification time on a delegation to get as
precise a time as possible.
- Use time_second instead of extracting the second from a call to
getmicrotime().

Submitted by: bde (3)
Reviewed by: bde, rmacklem
MFC after: 2 weeks


# 1f60bfd8 08-Dec-2012 Rick Macklem <rmacklem@FreeBSD.org>

Move the NFSv4.1 client patches over from projects/nfsv4.1-client
to head. I don't think the NFS client behaviour will change unless
the new "minorversion=1" mount option is used. It includes basic
NFSv4.1 support plus support for pNFS using the Files Layout only.
All problems detecting during an NFSv4.1 Bakeathon testing event
in June 2012 have been resolved in this code and it has been tested
against the NFSv4.1 server available to me.
Although not reviewed, I believe that kib@ has looked at it.


# 8c9c3223 07-Feb-2012 Rick Macklem <rmacklem@FreeBSD.org>

r228827 fixed a problem where copying of NFSv4 open credentials into
a credential structure would corrupt it. This happened when the
p argument was != NULL. However, I now realize that the copying of
open credentials should only happen for p == NULL, since that indicates
that it is a read-ahead or write-behind. This patch fixes this.
After this commit, r228827 could be reverted, but I think the code is
clearer and safer with the patch, so I am going to leave it in.
Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have
changed the credentials of the caller. This would have happened if
the process doing the VOP_SETATTR() did not have the file open, but
some other process running as a different uid had the file open for writing
at the same time.

MFC after: 5 days


# 7a2e4d80 02-Dec-2011 Rick Macklem <rmacklem@FreeBSD.org>

Post r223774, the NFSv4 client no longer has multiple instances
of the same lock_owner4 string. As such, the handling of cleanup
of lock_owners could be simplified. This simplification permitted
the client to do a ReleaseLockOwner operation when the process that
the lock_owner4 string represents, has exited. This permits the
server to release any storage related to the lock_owner4 string
before the associated open is closed. Without this change, it
is possible to exhaust a server's storage when a long running
process opens a file and then many child processes do locking
on the file, because the open doesn't get closed. A similar patch
was applied to the Linux NFSv4 client recently so that it wouldn't
exhaust a server's storage.

Reviewed by: zack
MFC after: 2 weeks


# f9340edf 21-Nov-2011 Rick Macklem <rmacklem@FreeBSD.org>

Clean up some cruft in the NFSv4 client left over from the
OpenBSD port, so that it is more readable. No logic change
is made by this commit.

MFC after: 2 weeks


# d57a9d5f 19-Nov-2011 Rick Macklem <rmacklem@FreeBSD.org>

Since the nfscl_cleanup() function isn't used by the FreeBSD NFSv4 client,
delete the code and fix up the related comments. This should not have
any functional effect on the client.

MFC after: 2 weeks


# 2f27585e 19-Nov-2011 Rick Macklem <rmacklem@FreeBSD.org>

Post r223774 the NFSv4 client never uses the linked list with the
head nfsc_defunctlockowner. This patch simply removes the code that
loops through this always empty list, since the code no longer does
anything useful. It should not have any effect on the client's
behaviour.

MFC after: 2 weeks


# 305a0c91 12-Jul-2011 Rick Macklem <rmacklem@FreeBSD.org>

r222389 introduced a case where the NFSv4 client could
loop in nfscl_getcl() when a forced dismount is in progress,
because nfsv4_lock() will return 0 without sleeping when
MNTK_UNMOUNTF is set.
This patch fixes it so it won't loop calling nfsv4_lock()
for this case.

MFC after: 2 weeks


# 98a7b279 04-Jul-2011 Rick Macklem <rmacklem@FreeBSD.org>

The algorithm used by nfscl_getopen() could have resulted in
multiple instances of the same lock_owner when a process both
inherited an open file descriptor plus opened the same file itself.
Since some NFSv4 servers cannot handle multiple instances of
the same lock_owner string, this patch changes the algorithm
used by nfscl_getopen() in the new NFSv4 client to keep that
from happening. The new algorithm is simpler, since there is
no longer any need to ascend the process's parentage tree because
all NFSv4 Closes for a file are done at VOP_INACTIVE()/VOP_RECLAIM(),
making the Opens indistinct w.r.t. use with Lock Ops.
This problem was discovered at the recent NFSv4 interoperability
Bakeathon.

MFC after: 2 weeks


# 1171f21d 03-Jul-2011 Rick Macklem <rmacklem@FreeBSD.org>

Modify the new NFSv4 client so that it appends a file handle
to the lock_owner4 string that goes on the wire. Also, add
code to do a ReleaseLockOwner Op on the lock_owner4 string
before a Close. Apparently not all NFSv4 servers handle multiple
instances of the same lock_owner4 string, at least not in a
compatible way. This patch avoids having multiple instances,
except for one unusual case, which will be fixed by a future commit.
Found at the recent NFSv4 interoperability Bakeathon.

Tested by: tdh at excfb.com
MFC after: 2 weeks


# f8f4e256 05-Jun-2011 Rick Macklem <rmacklem@FreeBSD.org>

The new NFSv4 client was erroneously using "p" instead of
"p_leader" for the "id" for POSIX byte range locking. I think
this would only have affected processes created by rfork(2)
with the RFTHREAD flag specified. This patch fixes that by
passing the "id" down through the various functions from
nfs_advlock().

MFC after: 2 weeks


# ff29f3b2 27-May-2011 Rick Macklem <rmacklem@FreeBSD.org>

Fix the new NFS client so that it handles NFSv4 state
correctly during a forced dismount. This required that
the exclusive and shared (refcnt) sleep lock functions check
for MNTK_UMOUNTF before sleeping, so that they won't block
while nfscl_umount() is getting rid of the state. As
such, a "struct mount *" argument was added to the locking
functions. I believe the only remaining case where a forced
dismount can get hung in the kernel is when a thread is
already attempting to do a TCP connect to a dead server
when the krpc client structure called nr_client is NULL.
This will only happen just after a "mount -u" with options
that force a new TCP connection is done, so it shouldn't
be a problem in practice.

MFC after: 2 weeks


# a8bafa5d 18-Apr-2011 Rick Macklem <rmacklem@FreeBSD.org>

Revert r220761 since, as kib@ pointed out, the case of
adding the check to nfsrpc_close() isn't useful. Also,
the check in nfscl_getcl() must be more involved, since
it needs to check before and after the acquisition of
the refcnt on nfsc_lock, while the mutex that protects
the client state data is held.


# be8b35ed 17-Apr-2011 Rick Macklem <rmacklem@FreeBSD.org>

Add checks for MNTK_UNMOUNTF at the beginning of three
functions, so that threads don't get stuck in them during
a forced dismount. nfs_sync/VFS_SYNC() needs this, since it is
called by dounmount() before VFS_UNMOUNT(). The nfscl_nget()
case makes sure that a thread doing an VOP_OPEN() or
VOP_ADVLOCK() call doesn't get blocked before attempting
the RPC. Attempting RPCs don't block, since they all
fail once a forced dismount is in progress.
The third one at the beginning of nfsrpc_close()
is done so threads don't get blocked while doing VOP_INACTIVE()
as the vnodes are cleared out.
With these three changes plus a change to the umount(1)
command so that it doesn't do "sync()" for the forced case
seem to make forced dismounts work for the experimental NFS
client.

MFC after: 2 weeks


# a09001a8 14-Apr-2011 Rick Macklem <rmacklem@FreeBSD.org>

Fix the experimental NFSv4 server so that it uses VOP_PATHCONF()
to determine if a file system supports NFSv4 ACLs. Since
VOP_PATHCONF() must be called with a locked vnode, the function
is called before nfsvno_fillattr() and the result is passed in
as an extra argument.

MFC after: 2 weeks


# 07c0c166 14-Apr-2011 Rick Macklem <rmacklem@FreeBSD.org>

Modify the experimental NFSv4 server so that it handles
crossing of server mount points properly. The functions
nfsvno_fillattr() and nfsv4_fillattr() were modified to
take the extra arguments that are the mount point, a flag
to indicate that it is a file system root and the mounted
on fileno. The mount point argument needs to be busy when
nfsvno_fillattr() is called, since the vp argument is not
locked.

Reviewed by: kib
MFC after: 2 weeks


# c5dd9d8c 26-Oct-2010 Rick Macklem <rmacklem@FreeBSD.org>

Add a flag to the experimental NFSv4 client to indicate when
delegations are being returned for reasons other than a Recall.
Also, re-organize nfscl_recalldeleg() slightly, so that it leaves
clearing NMODIFIED to the ncl_flush() call and invalidates the
attribute cache after flushing. It is hoped that these changes
might fix the problem others have seen when using the NFSv4
client with delegations enabled, since I can't reliably reproduce
the problem. These changes only affect the client when doing NFSv4
mounts with delegations enabled.

MFC after: 10 days


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# e3649d5a 02-Aug-2010 Rick Macklem <rmacklem@FreeBSD.org>

Modify the return value for nfscl_mustflush() from boolean_t,
which I mistakenly thought was correct w.r.t. style(9), back
to int and add the checks for != 0. This is just a stylistic
modification.

MFC after: 1 week


# 5813b99c 17-Jul-2010 Rick Macklem <rmacklem@FreeBSD.org>

Change the nfscl_mustflush() function in the experimental NFSv4
client to return a boolean_t in order to make it more compatible
with style(9).

MFC after: 2 weeks


# 227b9ebe 30-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r207170
An NFSv4 server will reply NFSERR_GRACE for non-recovery RPCs
during the grace period after startup. This grace period must
be at least the lease duration, which is typically 1-2 minutes.
It seems prudent for the experimental NFS client to wait a few
seconds before retrying such an RPC, so that the server isn't
flooded with non-recovery RPCs during recovery. This patch adds
an argument to nfs_catnap() to implement a 5 second delay
for this case.


# 52d8bf89 29-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r207082
When the experimental NFS client is handling an NFSv4 server reboot
with delegations enabled, the recovery could fail if the renew
thread is trying to return a delegation, since it will not do the
recovery. This patch fixes the above by having nfscl_recalldeleg()
fail with the I/O operations returning EIO, so that they will be
attempted later. Most of the patch consists of adding an argument
to various functions to indicate the delegation recall case where
this needs to be done.


# 91f671cf 26-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r206880
For the experimental NFS client doing an NFSv4 mount,
set the NFSCLFLAGS_RECVRINPROG while doing recovery from an expired
lease in a manner similar to r206818 for server reboot recovery.
This will prevent the function that acquires stateids for I/O
operations from acquiring out of date stateids during recovery.
Also, fix up mutex locking on the nfsc_flags field.


# a862e504 24-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r206818
Avoid extraneous recovery cycles in the experimental NFS client
when an NFSv4 server reboots, by doing two things.
1 - Make the function that acquires a stateid for I/O operations
block until recovery is complete, so that it doesn't acquire
out of date stateids.
2 - Only allow a recovery once every 1/2 of a lease duration, since
the NFSv4 server must provide a recovery grace period of at
least a lease duration. This should avoid recoveries caused
by an out of date stateid that was acquired for an I/O op.
just before a recovery cycle started.


# 23f929df 24-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

An NFSv4 server will reply NFSERR_GRACE for non-recovery RPCs
during the grace period after startup. This grace period must
be at least the lease duration, which is typically 1-2 minutes.
It seems prudent for the experimental NFS client to wait a few
seconds before retrying such an RPC, so that the server isn't
flooded with non-recovery RPCs during recovery. This patch adds
an argument to nfs_catnap() to implement a 5 second delay
for this case.

MFC after: 1 week


# bf129106 22-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r206690
Add mutex lock calls to 2 cases in the experimental NFS client's
renew thread where they were missing.


# 4a9c979c 22-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r206688
The experimental NFS client was not filling in recovery credentials
for opens done locally in the client when a delegation for the file
was held. This could cause the client to crash in crsetgroups() when
recovering from a server crash/reboot. This patch fills in the
recovery credentials for this case, in order to avoid the client crash.
Also, add KASSERT()s to the credential copy functions, to catch any
other cases where the credentials aren't filled in correctly.


# 67c5c2d2 22-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

When the experimental NFS client is handling an NFSv4 server reboot
with delegations enabled, the recovery could fail if the renew
thread is trying to return a delegation, since it will not do the
recovery. This patch fixes the above by having nfscl_recalldeleg()
fail with the I/O operations returning EIO, so that they will be
attempted later. Most of the patch consists of adding an argument
to various functions to indicate the delegation recall case where
this needs to be done.

MFC after: 1 week


# a318bc27 19-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

For the experimental NFS client doing an NFSv4 mount,
set the NFSCLFLAGS_RECVRINPROG while doing recovery from an expired
lease in a manner similar to r206818 for server reboot recovery.
This will prevent the function that acquires stateids for I/O
operations from acquiring out of date stateids during recovery.
Also, fix up mutex locking on the nfsc_flags field.

MFC after: 1 week


# 7ea710b3 18-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

Avoid extraneous recovery cycles in the experimental NFS client
when an NFSv4 server reboots, by doing two things.
1 - Make the function that acquires a stateid for I/O operations
block until recovery is complete, so that it doesn't acquire
out of date stateids.
2 - Only allow a recovery once every 1/2 of a lease duration, since
the NFSv4 server must provide a recovery grace period of at
least a lease duration. This should avoid recoveries caused
by an out of date stateid that was acquired for an I/O op.
just before a recovery cycle started.

MFC after: 1 week


# 0ac68bd3 15-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

Add mutex lock calls to 2 cases in the experimental NFS client's
renew thread where they were missing.

MFC after: 1 week


# 55909abf 15-Apr-2010 Rick Macklem <rmacklem@FreeBSD.org>

The experimental NFS client was not filling in recovery credentials
for opens done locally in the client when a delegation for the file
was held. This could cause the client to crash in crsetgroups() when
recovering from a server crash/reboot. This patch fills in the
recovery credentials for this case, in order to avoid the client crash.
Also, add KASSERT()s to the credential copy functions, to catch any
other cases where the credentials aren't filled in correctly.

MFC after: 1 week


# 05cb5487 17-Jan-2010 Rick Macklem <rmacklem@FreeBSD.org>

MFC: r201439
Fix three related problems in the experimental nfs client when
checking for conflicts w.r.t. byte range locks for NFSv4.
1 - Return 0 instead of EACCES when a conflict is found, for F_GETLK.
2 - Check for "same file" when checking for a conflict.
3 - Don't check for a conflict for the F_UNLCK case.


# b04b4acd 03-Jan-2010 Rick Macklem <rmacklem@FreeBSD.org>

Fix three related problems in the experimental nfs client when
checking for conflicts w.r.t. byte range locks for NFSv4.
1 - Return 0 instead of EACCES when a conflict is found, for F_GETLK.
2 - Check for "same file" when checking for a conflict.
3 - Don't check for a conflict for the F_UNLCK case.


# d2129b50 31-Dec-2009 Jaakko Heinonen <jh@FreeBSD.org>

MFC r198289:

Fix comment typos.

Approved by: trasz (mentor)


# 417612e0 20-Oct-2009 Jaakko Heinonen <jh@FreeBSD.org>

Fix comment typos.

Reviewed by: rmacklem
Approved by: trasz (mentor)


# 7f1968ba 22-Jul-2009 Rick Macklem <rmacklem@FreeBSD.org>

When using an NFSv4 mount in the experimental nfs client with delegations
being issued from the server, there was a case where an Open issued locally
based on the delegation would be released before the associated vnode
became inactive. If the delegation was recalled after the open was released,
an Open against the server would not have been acquired and subsequent I/O
operations would need to use the special stateid of all zeros. This patch
fixes that case.

Approved by: re (kensmith), kib (mentor)


# 9ca27b56 09-Jul-2009 Rick Macklem <rmacklem@FreeBSD.org>

Since the nfscl_getclose() function both decremented open counts and,
optionally, created a separate list of NFSv4 opens to be closed, it
was possible for the associated OpenOwner to be free'd before the Open
was closed. The problem was that the Open was taken off the OpenOwner
list before the Close RPC was done and OpenOwners can be free'd once the
list is empty. This patch separates out the case of doing the Close RPC
into a separate function called nfscl_doclose() and simplifies nfsrpc_doclose()
so that it closes a single open instead of a list of them. This avoids
removing the Open from the OpenOwner list before doing the Close RPC.

Approved by: re (kensmith), kib (mentor)


# 01de879a 13-Jun-2009 Jamie Gritton <jamie@FreeBSD.org>

Use getcredhostuuid instead of accessing the prison directly.

Approved by: bz (mentor)


# 2fa32154 08-Jun-2009 Rick Macklem <rmacklem@FreeBSD.org>

Fix nfscl_getcl() so that it doesn't crash when it is called to
do an NFSv4 Close operation with the cred argument NULL. Also,
clarify what NULL arguments mean in the function's comment.

Approved by: kib (mentor)


# 76ca6f88 29-May-2009 Jamie Gritton <jamie@FreeBSD.org>

Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex. Jails may
have their own host information, or they may inherit it from the
parent/system. The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL. The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by: bz (mentor)


# 47a59856 18-May-2009 Rick Macklem <rmacklem@FreeBSD.org>

Change the experimental NFSv4 client so that it does not do
the NFSv4 Close operations until ncl_inactive(). This is
necessary so that the Open StateIDs are available for doing
I/O on mmap'd files after VOP_CLOSE(). I also changed some
indentation for the nfscl_getclose() function.

Approved by: kib (mentor)


# 9ec7b004 04-May-2009 Rick Macklem <rmacklem@FreeBSD.org>

Add the experimental nfs subtree to the kernel, that includes
support for NFSv4 as well as NFSv2 and 3.
It lives in 3 subdirs under sys/fs:
nfs - functions that are common to the client and server
nfsclient - a mutation of sys/nfsclient that call generic functions
to do RPCs and handle state. As such, it retains the
buffer cache handling characteristics and vnode semantics that
are found in sys/nfsclient, for the most part.
nfsserver - the server. It includes a DRC designed specifically for
NFSv4, that is used instead of the generic DRC in sys/rpc.
The build glue will be checked in later, so at this point, it
consists of 3 new subdirs that should not affect kernel building.

Approved by: kib (mentor)