#
cc760de2 |
|
11-Jan-2024 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Only update atime for Copy when noatime is not specified Commit 57ce37f9dcd0 modified the NFSv4.2 Copy operation so that it will update atime on the infd file whenever possible. This is done by adding a Setattr of TimeAccess for the input file. This patch disables this change for the case of an NFSv4.2 mount with the "noatime" mount option, which avoids the additional Setattr of TimeAccess operation. MFC after: 1 week
|
#
b484bcd5 |
|
22-Dec-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Fix handling of a copyout() error reply If vfs.nfs.nfs_directio_enable is set non-zero (the default is zero) and a file on an NFS mount is read after being opened with O_DIRECT | O_RDONLY, a call to nfsm_mbufuio() calls copyout() without checking for an error return. If copyout() returns EFAULT, this would not work correctly. Only the call path VOP_READ()->ncl_readrpc()->nfsrpc_read()->nfsrpc_readrpc() will do this and the error return for EFAULT will be returned back to VOP_READ(). This patch adds the error check to nfsm_mbufuio(). Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D43160
|
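The fix described in this entry comes down to returning the copy routine's error to the caller instead of dropping it. A minimal userland sketch of that pattern follows; `fake_copyout()` and `copy_chunks()` are hypothetical names made up for illustration (the first stands in for the kernel's copyout(), the second for a chunked copy loop), and this is not the actual nfsm_mbufuio() code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copyout(): fails with EFAULT on a NULL destination. */
int
fake_copyout(const void *src, void *dst, size_t len)
{
	if (dst == NULL)
		return (EFAULT);
	memcpy(dst, src, len);
	return (0);
}

/*
 * Copy len bytes in pieces of at most chunk bytes.  The loop stops at
 * the first failure and hands the error back to the caller.
 */
int
copy_chunks(const char *src, char *dst, size_t len, size_t chunk)
{
	size_t n;
	int error;

	if (chunk == 0)
		return (EINVAL);
	while (len > 0) {
		n = len < chunk ? len : chunk;
		error = fake_copyout(src, dst, n);
		if (error != 0)
			return (error);	/* the kind of check the patch adds */
		src += n;
		dst += n;
		len -= n;
	}
	return (0);
}
```

With a valid buffer the call returns 0 and the data arrives intact; with a bad destination the caller sees EFAULT rather than a silent short copy.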
#
57ce37f9 |
|
18-Oct-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Make NFSv4.2 Copy set atime on infd RFC7862 does not specify infile atime behaviour when a NFSv4.2 Copy operation is performed. Since the collective opinion of a mailing list discussion (on freebsd-hackers@) seemed to indicate that copy_file_range(2) should update atime on the infd, even if there is no data copied, this patch attempts to ensure that behaviour. For Copy, it precedes the Copy operation with a Setattr of TimeAccess_Set (NFSv4 speak for atime) for the invp. For the case where no data will be copied, it does a Setattr RPC to set TimeAccess_Set for the invp. A __FreeBSD_version bump will be done as a separate commit, since this patch changes the internal interface between the nfscommon and nfscl modules. MFC after: 1 month
|
#
db7257ef |
|
17-Oct-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Fix a server crash PR#274346 reports a crash which appears to be caused by a NULL default session being destroyed. This patch should avoid the crash. Tested by: Joshua Kinard <freebsd@kumba.dev> PR: 274346 MFC after: 2 weeks
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
ba8cc6d7 |
|
12-Mar-2023 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: use __enum_uint8 for vtype and vstate This whacks hackery around only reading v_type once. Bump __FreeBSD_version to 1400093
|
#
4adb28c0 |
|
07-Apr-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Fix support for doing Null RPCs Although the NFS client does not currently perform Null RPCs, this fix is needed if/when it might do so. Found during testing of experimental code that uses Null RPCs to maintain/monitor TCP connections for "nconnect" mounts. MFC after: 3 months
|
#
f4179ad4 |
|
01-Apr-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscommon: Add support for an NFSv4 operation bitmap NFSv4.1/4.2 uses operation bitmaps for various operations, such as the SP4_MACH_CRED case for ExchangeID. This patch adds support for operation bitmaps so that support for SP4_MACH_CRED can be added to the NFSv4.1/4.2 server in a future commit. This commit should not change any NFSv4.1/4.2 semantics. MFC after: 3 months
|
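An operation bitmap of the kind this entry mentions can be pictured as one bit per NFSv4 operation number, packed into 32-bit words. The sketch below assumes a hypothetical `struct opbitmap` and helper names invented for illustration; the word count is also an assumption, and none of this is taken from the nfscommon sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define	OPBITMAP_WORDS	3	/* hypothetical: room for 96 operation numbers */

struct opbitmap {
	uint32_t bits[OPBITMAP_WORDS];
};

/* Clear every bit in the bitmap. */
void
opbitmap_zero(struct opbitmap *bm)
{
	memset(bm, 0, sizeof(*bm));
}

/* Mark operation number op as present. */
void
opbitmap_set(struct opbitmap *bm, unsigned int op)
{
	assert(op < OPBITMAP_WORDS * 32);
	bm->bits[op / 32] |= (uint32_t)1 << (op % 32);
}

/* Report whether operation number op is present; out-of-range is "no". */
bool
opbitmap_isset(const struct opbitmap *bm, unsigned int op)
{
	if (op >= OPBITMAP_WORDS * 32)
		return (false);
	return ((bm->bits[op / 32] & ((uint32_t)1 << (op % 32))) != 0);
}
```

Operation 42, for example, lands in word 1 at bit 10, so a peer that exchanges the raw words sees the same membership without any per-operation encoding.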
#
695d87ba |
|
28-Mar-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Make coverity happy Coverity does not like code that checks a function's return value sometimes. Add "(void)" in front of the function when the return value does not matter to try and make it happy. A recent commit deleted "(void)"s in front of nfsm_fhtom(). This commit puts them back in. Reported by: emaste MFC after: 3 months
|
#
896516e5 |
|
16-Mar-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a new NFSv4.1/4.2 mount option for Kerberized mounts Without this patch, a Kerberized NFSv4.1/4.2 mount must provide a Kerberos credential for the client at mount time. This credential is typically referred to as a "machine credential". It can be created one of two ways: - The user (usually root) has a valid TGT at the time the mount is done and this becomes the machine credential. There are two problems with this. 1 - The user doing the mount must have a valid TGT for a user principal at mount time. As such, the mount cannot be put in fstab(5) or similar. 2 - When the TGT expires, the mount breaks. - The client machine has a service principal in its default keytab file and this service principal (typically called a host-based initiator credential) is used as the machine credential. There are problems with this approach as well: 1 - There is a certain amount of administrative overhead creating the service principal for the NFS client, creating a keytab entry for this principal and then copying the keytab entry into the client's default keytab file via some secure means. 2 - The NFS client must have a fixed, well known, DNS name, since that FQDN is in the service principal name as the instance. This patch uses a feature of NFSv4.1/4.2 called SP4_NONE, which allows the state maintenance operations to be performed by any authentication mechanism, to do these operations via AUTH_SYS instead of RPCSEC_GSS (Kerberos). As such, neither of the above mechanisms is needed. It is hoped that this option will encourage adoption of Kerberized NFS mounts using TLS, to provide a more secure NFS mount. This new NFSv4.1/4.2 mount option, called "syskrb5" must be used with "sec=krb5[ip]" to avoid the need for either of the above Kerberos setups to be done by the client. Note that all file access/modification operations still require users on the NFS client to have a valid TGT recognized by the NFSv4.1/4.2 server. 
As such, this option allows, at most, a malicious client to do some sort of DOS attack. Although not required, use of "tls" with this new option is encouraged, since it provides on-the-wire encryption plus, optionally, client identity verification via a X.509 certificate provided to the server during TLS handshake. Alternately, "sec=krb5p" does provide on-the-wire encryption of file data. A mount_nfs(8) man page update will be done in a separate commit. Discussed on: freebsd-current@ MFC after: 3 months
|
#
f0db2b60 |
|
14-Feb-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Continue adding macros so nfsd can run in a vnet prison Commit 7344856e3a6d added a lot of macros that will front end vnet macros so that nfsd(8) can run in vnet prison. This patch adds some more, to allow the nfsuserd(8) daemon to run in vnet prison, once the macros map to vnet ones. This is the last commit for NFSD_VNET_xxx macros, but there are still some for KRPC_VNET_xxx and KGSS_VNET_xx to allow the rpc.tlsservd(8) and gssd(8) daemons to run in a vnet prison. MFC after: 3 months
|
#
bf312482 |
|
08-Nov-2022 |
Gordon Bergling <gbe@FreeBSD.org> |
nfs: Fix common typos in source code comments - s/attrbute/attribute/ MFC after: 3 days
|
#
8b43388c |
|
23-Sep-2022 |
Zhenlei Huang <zlei.huang@gmail.com> |
nfscl: Fix parameter order in the calls to MGET(). Reviewed by: imp, rmacklem Differential Revision: https://reviews.freebsd.org/D36644
|
#
117cea02 |
|
28-Aug-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Fix setup of Sequence when all slots marked bad Commit 40ada74ee1da modified the NFSv4.1/4.2 client so that it would issue a DestroySession to the server when all session slots are marked bad. Once this is done, the Sequence operation should get a NFSERR_BADSESSION reply from the server. Without this patch, the code was setting ND_HASSLOTID when, in fact, there was no slot marked in use by nfsv4_sequencelookup(). This would result in the code freeing a slot not in use. The effect of this was minimal, since the session was already destroyed. This patch fixes the code so that it does not set ND_HASSLOTID for this case. MFC after: 2 weeks
|
#
40ada74e |
|
09-Jul-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add optional support for slots marked bad This patch adds support for session slots marked bad to nfsv4_sequencelookup(). An additional boolean argument indicates if the check for slots marked bad should be done. The "cred" argument added to nfscl_reqstart() by commit 326bcf9394c7 is now passed into nfsv4_setsequence() so that it can optionally set the boolean argument for nfsv4_sequencelookup(). When optionally enabled, nfsv4_setsequence() will do a DestroySession when all slots are marked bad. Since the code that marks slots bad is not yet committed, this patch should not result in a semantics change. PR: 260011 MFC after: 2 weeks
|
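The slot selection described in this entry can be sketched with two bitmasks, one for slots in use and one for slots marked bad. The function name, the return codes and the 64-slot limit below are assumptions made for this illustration; the real logic lives in nfsv4_sequencelookup() and may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

#define	SLOT_NONE	(-1)	/* hypothetical: no usable slot right now */
#define	SLOT_ALLBAD	(-2)	/* hypothetical: every slot is marked bad */

/*
 * Return the lowest slot that is neither in use nor marked bad.
 * If all nslots slots are marked bad, report that instead, so the
 * caller can destroy the session.
 */
int
pick_slot(uint64_t inuse, uint64_t bad, int nslots)
{
	uint64_t all, bit;
	int i;

	assert(nslots > 0 && nslots <= 64);
	all = nslots == 64 ? UINT64_MAX : (((uint64_t)1 << nslots) - 1);
	if ((bad & all) == all)
		return (SLOT_ALLBAD);
	for (i = 0; i < nslots; i++) {
		bit = (uint64_t)1 << i;
		if ((inuse & bit) == 0 && (bad & bit) == 0)
			return (i);
	}
	return (SLOT_NONE);
}
```

A caller would treat `SLOT_NONE` as "wait for a slot to be freed" and `SLOT_ALLBAD` as the trigger for the DestroySession path.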
#
dff31ae1 |
|
09-Jul-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Move nfsrpc_destroysession into nfscommon This patch moves nfsrpc_destroysession() into nfscommon.ko and also modifies its arguments slightly. This will allow the function to be called from nfsv4_sequencelookup() in a future commit. This patch should not result in a semantics change. PR: 260011 MFC after: 2 weeks
|
#
326bcf93 |
|
08-Jul-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a cred argument to nfscl_reqstart() To deal with broken session slots caused by the use of the "soft" and/or "intr" mount options, nfsv4_sequencelookup() will be modified to track the potentially broken session slots. Then, when all session slots are potentially broken, do a DeleteSession operation, so that the NFSv4 server will reply NFSERR_BADSESSION to uses of the session. These changes will be done in future commits. However, to do the DeleteSession RPC, a "cred" argument is needed for nfscl_reqstart(). This patch adds this argument, which is unused at this time. If the argument is NULL, it indicates that DeleteSession should not be done (usually because the RPC does not use sessions). This patch should not cause any semantics change. PR: 260011 MFC after: 2 weeks
|
#
1ebc14c9 |
|
24-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscommon: Clean up the code by not using the vnode_vtype() macro The vnode_vtype() macro was used to make the code compatible with Mac OSX, for the Mac OSX port. For FreeBSD, this macro just obscured the code, so avoid using it to clean up the code. This commit should not result in a semantics change.
|
#
6d25ea6d |
|
18-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing #if(n)def APPLE The definition of "APPLE" was used by the Mac OSX port. For FreeBSD, this definition is never used, so remove the references to it to clean up the code. This commit should not result in a semantics change.
|
#
ef4edb70 |
|
04-May-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Add a sanity check for Owner/OwnerGroup string length Robert Morris reported that, if a client sends an absurdly large Owner/OwnerGroup string, the kernel malloc() for the large size string can block forever. This patch adds a sanity limit for Owner/OwnerGroup string length. Since the RFCs do not specify any limit and FreeBSD can handle a group name greater than 1Kbyte, the limit is set at a generous 10Kbytes. Reported by: rtm@lcs.mit.edu PR: 260546 MFC after: 2 weeks
|
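The sanity limit in this entry is a bounds check made before any allocation sized by a client-supplied length. A minimal sketch, assuming a hypothetical constant `OWNER_MAXLEN` set to the 10Kbyte figure from the commit message and hypothetical helper names; rejecting negative lengths is an assumption of the sketch, not something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define	OWNER_MAXLEN	(10 * 1024)	/* hypothetical name; 10Kbytes per the commit */

/* Accept only lengths that are non-negative and within the limit. */
bool
owner_len_ok(int32_t len)
{
	return (len >= 0 && len <= OWNER_MAXLEN);
}

/*
 * Allocate a buffer for an owner string of the given wire length,
 * plus a terminating NUL.  Oversized requests are refused before
 * malloc() is ever called.
 */
char *
owner_alloc(int32_t len)
{
	if (!owner_len_ok(len))
		return (NULL);
	return (malloc((size_t)len + 1));
}
```

The point of checking first is that the allocator never sees the absurd size, so it cannot block or fail on behalf of a misbehaving client.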
#
21de450a |
|
08-Apr-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add support for a NFSv4 AppendWrite RPC For IO_APPEND VOP_WRITE()s, the code first does a Getattr RPC to acquire the file's size, before it can do the Write RPC. Although NFS does not have an append write operation, an NFSv4 compound can use a Verify operation to check that the client's notion of the file's size is correct, followed by the Write operation. This patch modifies nfscl_wcc_data() to optionally acquire the file's size, for use with an AppendWrite. Although the "stuff" arguments are always NULL (these were used for the Mac OSX port and should be cleared out someday), make the argument to nfscl_wcc_data() explicitly NULL for clarity. This patch does not cause any semantics change until the AppendWrite is added in a future commit.
|
#
330aa8ac |
|
05-Apr-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add support for a NFSv4 AppendWrite RPC For IO_APPEND VOP_WRITE()s, the code first does a Getattr RPC to acquire the file's size, before it can do the Write RPC. Although NFS does not have an append write operation, an NFSv4 compound can use a Verify operation to check that the client's notion of the file's size is correct before doing the Write operation. This patch prepares the NFSv4 client for such an RPC, which will be added in a future commit. This patch does not cause any semantics change.
|
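The Verify-then-Write idea in this entry relies on a compound stopping at its first failing operation. The toy model below is a sketch under stated assumptions: `struct toy_file`, the `ST_*` codes and all function names are hypothetical, and an in-memory buffer stands in for the server's file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	FILE_MAX	64	/* hypothetical capacity of the toy file */

enum status { ST_OK = 0, ST_NOT_SAME, ST_NOSPC };

struct toy_file {
	char	data[FILE_MAX];
	size_t	size;
};

/* Verify step: succeed only if the caller's idea of the size is current. */
enum status
op_verify_size(const struct toy_file *f, size_t expected)
{
	return (f->size == expected ? ST_OK : ST_NOT_SAME);
}

/* Write step: store len bytes at offset off, growing the file if needed. */
enum status
op_write(struct toy_file *f, size_t off, const char *buf, size_t len)
{
	if (off > FILE_MAX || len > FILE_MAX - off)
		return (ST_NOSPC);
	memcpy(f->data + off, buf, len);
	if (off + len > f->size)
		f->size = off + len;
	return (ST_OK);
}

/* Compound: operations run in order and stop at the first failure. */
enum status
append_write(struct toy_file *f, size_t cached_size, const char *buf,
    size_t len)
{
	enum status st;

	st = op_verify_size(f, cached_size);
	if (st != ST_OK)
		return (st);	/* stale size: the Write is never attempted */
	return (op_write(f, cached_size, buf, len));
}
```

When the cached size is stale the file is left untouched, which is what lets a client skip the separate Getattr round trip in the common case and fall back to it only on a mismatch.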
#
a91a5784 |
|
11-Jan-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Do not accept audit/alarm ACEs for the NFSv4 server The UFS and ZFS file systems only support Allow/Deny ACEs in the NFSv4 ACLs. This patch does not allow the server to parse Audit/Alarm ACEs. The NFSv4 client is still allowed to parse Audit/Alarm ACEs, since non-FreeBSD NFSv4 servers may use them. This patch should not have a significant effect, since the UFS and ZFS file systems will not handle these ACEs anyhow. It simply serves as an additional "safety belt" for the NFSv4 server. MFC after: 2 weeks
|
#
5da9b3b0 |
|
11-Jan-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
Revert "nfscommon: Add arguments for support of the dacl attribute" This reverts commit 0fa074b53e7c22157dcb41aaa25a33abc8118f26. I now see that the implementation of the "dacl" operation requires the NFSv4 server to do "automatic inheritance" and I do not plan on doing this. As such, this patch is harmless, but unneeded.
|
#
3455c738 |
|
09-Jan-2022 |
Alexander Motin <mav@FreeBSD.org> |
nfsd: Reduce callouts rate. Before this, callouts were scheduled twice a second even if nfsd was never used. This reduces the rate to ~1Hz and only after nfsd is first started. MFC after: 2 weeks
|
#
0fa074b5 |
|
26-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscommon: Add arguments for support of the dacl attribute NFSv4.1/4.2 has an alternative to the acl attribute, called dacl, that includes support for the ACL_ENTRY_INHERITED flag, called NFSV4ACE_INHERITED in NFSv4. This patch adds a dacl argument to nfsrv_buildacl(), nfsrv_dissectacl() and nfsrv_dissectace(), so that they will handle NFSV4ACE_INHERITED when dacl == true. Since these functions are always called with dacl == false for this patch, semantics should not have changed. A future patch will add support for dacl. MFC after: 2 weeks
|
#
2d90ef47 |
|
04-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Fix Verify for attributes like FilesAvail When the Verify operation calls nfsv4_loadattr(), it provides the "struct statfs" information that can be used for doing a compare for FilesAvail, FilesFree, FilesTotal, SpaceAvail, SpaceFree and SpaceTotal. However, the code erroneously used the "struct nfsstatfs *" argument that is NULL. This patch fixes these cases to use the correct argument structure. For the case of FilesAvail, the code in nfsv4_fillattr() was factored out into a separate function called nfsv4_filesavail(), so that it can be called from nfsv4_loadattr() as well as nfsv4_fillattr(). In fact, most of the code in nfsv4_filesavail() is old OpenBSD code that does not build/run on FreeBSD, but I left it in place, in case it is of some use someday. I am not aware of any extant NFSv4 client that does Verify on these attributes. Reported by: rtm@lcs.mit.edu Tested by: rtm@lcs.mit.edu PR: 260176 MFC after: 2 weeks
|
#
480be96e |
|
04-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Sanity check the Layouttype count Reported by: rtm@lcs.mit.edu Tested by: rtm@lcs.mit.edu PR: 260155 MFC after: 2 weeks
|
#
db0ac6de |
|
02-Dec-2021 |
Cy Schubert <cy@FreeBSD.org> |
Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816" This reverts commit 266f97b5e9a7958e365e78288616a459b40d924a, reversing changes made to a10253cffea84c0c980a36ba6776b00ed96c3e3b. A mismerge of a merge to catch up to main resulted in files being committed which should not have been.
|
#
fd020f19 |
|
01-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Sanity check the ACL attribute When an ACL is presented to the NFSv4 server in Setattr or Verify, parsing of the ACL assumed a sane acecnt and sane sizes for the "who" strings. This patch adds sanity checks for these. The patch also fixes handling of an error return from nfsrv_dissectacl() for one broken case. Reported by: rtm@lcs.mit.edu Tested by: rtm@lcs.mit.edu PR: 260111 MFC after: 2 weeks
|
#
638b90a1 |
|
28-Nov-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfs: Quiet a few "unused" warnings For most of these warnings, the variable is loaded with data parsed out of an RPC message. In case the data is useful in the future, I just marked these with __unused.
|
#
44744f75 |
|
11-Nov-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a LayoutError RPC for NFSv4.2 pNFS mounts If a pNFS server's DS runs out of disk space, it replies NFSERR_NOSPC to the client doing writing. For the Linux client, it then sends a LayoutError RPC to the MDS server to tell it about the error. This patch adds the same to the FreeBSD NFSv4.2 pNFS client, to maintain Linux compatible behaviour, particularly for non-FreeBSD pNFS servers. MFC after: 2 weeks
|
#
d70ca5b0 |
|
08-Nov-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Fix f_bavail and f_ffree for NFSv4 when negative Since the NFS Space_available and Files_available are unsigned, the NFSv3 server sets them to 0 when negative, so that they do not appear to be large positive values for non-FreeBSD clients. This patch fixes the NFSv4 server to do the same. Found during a recent IETF NFSv4 working group testing event. MFC after: 2 weeks
|
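The problem this entry describes is the usual signed-to-unsigned conversion hazard: a small negative count becomes an enormous positive one on the wire. A minimal sketch of the clamp, using a hypothetical helper name rather than the server's actual code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a signed "available" count to the unsigned value sent to
 * the client, reporting 0 instead of letting a negative value wrap.
 */
uint64_t
clamp_to_unsigned(int64_t avail)
{
	return (avail < 0 ? 0 : (uint64_t)avail);
}
```

Without the clamp, a value of -1 converts to 18446744073709551615, which a client would read as nearly unlimited free space.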
#
a4667e09 |
|
19-Oct-2021 |
Mark Johnston <markj@FreeBSD.org> |
Convert vm_page_alloc() callers to use vm_page_alloc_noobj(). Remove page zeroing code from consumers and stop specifying VM_ALLOC_NOOBJ. In a few places, also convert an allocation loop to simply use VM_ALLOC_WAITOK. Similarly, convert vm_page_alloc_domain() callers. Note that callers are now responsible for assigning the pindex. Reviewed by: alc, hselasky, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31986
|
#
55089ef4 |
|
11-Sep-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Make vfs.nfs.maxcopyrange larger by default As of commit 103b207536f9, the NFSv4.2 server will limit the size of a Copy operation based upon a 1 second timeout. The Linux 5.2 kernel server also limits Copy operation size to 4Mbytes. As such, the NFSv4.2 client can attempt a large Copy without resulting in a long RPC RTT for these servers. This patch changes vfs.nfs.maxcopyrange to 64bits and sets the default to the maximum possible size of SSIZE_MAX, since a larger size makes the Copy operation more efficient and allows for copying to complete with fewer RPCs. The sysctl may need to be made smaller for other non-FreeBSD NFSv4.2 servers. MFC after: 2 weeks
|
#
3ad1e1c1 |
|
11-Aug-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2 This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2 NFS client, which can be used by nfs_lookup() so that a subsequent Open RPC is not required. It uses the cn_flags OPENREAD, OPENWRITE added by commit c18c74a87c15. This reduced the number of RPCs by about 15% for a kernel build over NFS. For now, use of Lookup+Open is only done when the "oneopenown" mount option is used. It may be possible for Lookup+Open to be used for non-oneopenown NFSv4.1/4.2 mounts, but that will require extensive further testing to determine if it works. While here, I've added the changes to the nfscommon module that are needed to implement the Deallocate NFSv4.2 operation. This avoids needing another cycle of changes to the internal KAPI between the NFS modules. This commit has changed the internal KAPI between the NFS modules and, as such, all need to be rebuilt from sources. I have not bumped __FreeBSD_version, since it was bumped a few days ago.
|
#
ee29e6f3 |
|
16-Jul-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Add sysctl to set maximum I/O size up to 1Mbyte Since MAXPHYS now allows the FreeBSD NFS client to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio so that the maximum NFS server I/O size can be set up to 1Mbyte. The Linux NFS client can also do 1Mbyte I/O operations. The default of 128Kbytes for the maximum I/O size has not been changed for two reasons: - kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O - The limited benchmarking I can do actually shows a drop in I/O rate when the I/O size is above 256Kbytes. However, daveb@spectralogic.com reports seeing an increase in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client. Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30826
|
#
1e0a518d |
|
08-Jul-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a Linux compatible "nconnect" mount option Linux has had an "nconnect" NFS mount option for some time. It specifies that N (up to 16) TCP connections are to be created for a mount, instead of just one TCP connection. A discussion on freebsd-net@ indicated that this could improve client<-->server network bandwidth, if either the client or server have one of the following: - multiple network ports aggregated together with lagg/lacp. - a fast NIC that is using multiple queues It does result in using more IP port#s and might increase server peak load for a client. One difference from the Linux implementation is that this implementation uses the first TCP connection for all RPCs composed of small messages and uses the additional TCP connections for RPCs that normally have large messages (Read/Readdir/Write). The Linux implementation spreads all RPCs across all TCP connections in a round robin fashion, whereas this implementation spreads Read/Readdir/Write across the additional TCP connections in a round robin fashion. Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30970
|
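The connection selection policy described in this entry (first connection for small RPCs, round robin over the remaining ones for Read/Readdir/Write) can be sketched as follows. The types and the `pick_conn()` name are hypothetical and chosen only for illustration; the actual selection is done inside the kernel RPC code and is not reproduced here.

```c
#include <assert.h>

enum rpc_kind { RPC_SMALL, RPC_LARGE };	/* LARGE: Read/Readdir/Write */

struct conn_picker {
	unsigned int nconnect;	/* total TCP connections, at least 1 */
	unsigned int next;	/* round robin cursor over connections 1..N-1 */
};

/*
 * Return the index of the connection to use.  Connection 0 carries
 * all small RPCs; large RPCs rotate over the additional connections.
 */
unsigned int
pick_conn(struct conn_picker *p, enum rpc_kind kind)
{
	unsigned int c;

	assert(p->nconnect >= 1);
	if (kind == RPC_SMALL || p->nconnect == 1)
		return (0);
	c = 1 + p->next;
	p->next = (p->next + 1) % (p->nconnect - 1);
	return (c);
}
```

With three connections, successive large RPCs use connections 1, 2, 1, 2 and so on, while small RPCs in between always go to connection 0 and do not advance the cursor.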
#
947bd247 |
|
30-May-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Add support for the NFSv4.1/4.2 Secinfo_no_name operation The Linux client is now attempting to use the Secinfo_no_name operation for NFSv4.1/4.2 mounts. Although it does not seem to mind the NFSERR_NOTSUPP reply, adding support for it seems reasonable. I also noticed that "savflag" needed to be 64bits in nfsrvd_secinfo() since nd_flag is now 64bits, so I changed the declaration of it there. I also added code to set "vp" NULL after performing Secinfo/Secinfo_no_name, since these operations consume the current FH, which is represented by "vp" in nfsrvd_compound(). Fixing when the server replies NFSERR_WRONGSEC so that it conforms to RFC5661 Sec. 2.6 still needs to be done in a future commit. MFC after: 2 weeks
|
#
dd02d9d6 |
|
07-May-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add support for va_birthtime to NFSv4 There is a NFSv4 file attribute called TimeCreate that can be used for va_birthtime. r362175 added some support for use of TimeCreate. This patch completes support of va_birthtime by adding support for setting this attribute to the server. It also enables the client to acquire and set the attribute for a NFSv4 server that supports the attribute. Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30156
|
#
87597731 |
|
26-Apr-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: fix the slot sequence# when a callback fails Commit 4281bfec3628 patched the server so that the callback session slot would be free'd for reuse when a callback attempt fails. However, this can often result in the sequence# for the session slot to be advanced such that the client end will reply NFSERR_SEQMISORDERED. To avoid the NFSERR_SEQMISORDERED client reply, this patch negates the sequence# advance for the case where the callback has failed. The common case is a failed back channel, where the callback cannot be sent to the client, and not advancing the sequence# is correct for this case. For the uncommon case where the client's reply to the callback is lost, not advancing the sequence# will indicate to the client that the next callback is a retry and not a new callback. But, since the FreeBSD server always sets "csa_cachethis" false in the callback sequence operation, a retry and a new callback should be handled the same way by the client, so this should not matter. Until you have this patch in your NFSv4.1/4.2 server, you should consider avoiding the use of delegations. Even with this patch, interoperation with the Linux NFSv4.1/4.2 client in kernel versions prior to 5.3 can result in frequent 15second delays if delegations are enabled. This occurs because, for kernels prior to 5.3, the Linux client does a TCP reconnect every time it sees multiple concurrent callbacks and then it takes 15seconds to recover the back channel after doing so. MFC after: 2 weeks
|
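The sequence# handling in this entry can be pictured as "advance on use, undo the advance if the callback never got through". The sketch below uses a hypothetical `struct cb_slot` and helper names invented for illustration; it models only the counter, not the server's session code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cb_slot {
	uint32_t seq;	/* last sequence# handed out for this slot */
	bool	 inuse;
};

/* Take the slot and return the sequence# to send with the callback. */
uint32_t
cb_slot_acquire(struct cb_slot *s)
{
	assert(!s->inuse);
	s->inuse = true;
	s->seq++;
	return (s->seq);
}

/*
 * Free the slot for reuse.  If the callback failed, negate the
 * advance so the next callback reuses the same sequence#.
 */
void
cb_slot_release(struct cb_slot *s, bool failed)
{
	assert(s->inuse);
	if (failed)
		s->seq--;
	s->inuse = false;
}
```

After a failed attempt the next acquire returns the same sequence# again, which is what keeps the client from seeing a gap and replying NFSERR_SEQMISORDERED.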
#
78ffcb86 |
|
19-Apr-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscommon: fix function name in comment MFC after: 2 weeks
|
#
34256484 |
|
15-Apr-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
Revert "nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661" This reverts commit 9edaceca8165e2864267547311daf145bb520270. It turns out that the Linux client intentionally does an NFSv4.1 RPC with only a Sequence operation in it and with "seqid + 1" for the slot. This is used to re-synchronize the slot's seqid and the client expects the NFS4ERR_SEQ_MISORDERED error reply. As such, revert the patch, so that the server remains RFC5661 compliant.
|
#
9edaceca |
|
11-Apr-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661 Recent testing of network partitioning a FreeBSD NFSv4.1 server from a Linux NFSv4.1 client identified problems with both the FreeBSD server and Linux client. Sometimes, after some Linux NFSv4.1/4.2 clients establish a new TCP connection, they will advance the sequence number for a session slot by 2 instead of 1. RFC5661 specifies that a server should reply NFS4ERR_SEQ_MISORDERED for this case. This might result in a system call error in the client and seems to disable future use of the slot by the client. Since advancing the sequence number by 2 seems harmless, allow this case if vfs.nfs.linuxseqsesshack is non-zero. Note that, if the order of RPCs is actually reversed, a subsequent RPC with a smaller sequence number value for the slot will be received. This will result in a NFS4ERR_SEQ_MISORDERED reply. This has not been observed during testing. Setting vfs.nfs.linuxseqsesshack to 0 will provide RFC5661 compliant behaviour. This fix affects the fairly rare case where a NFSv4 Linux client does a TCP reconnect and then apparently erroneously increments the sequence number for the session slot twice during the reconnect cycle. PR: 254816 MFC after: 2 weeks
|
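The slot check this entry relaxes can be sketched as a comparison between the slot's stored sequence number and the one in the request. The enum, the function name and the `linux_hack` parameter (standing in for a non-zero vfs.nfs.linuxseqsesshack) are assumptions made for this illustration and are not the server's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum seq_result { SEQ_NEW, SEQ_REPLAY, SEQ_MISORDERED };

/*
 * Classify a request for a session slot.  slot_seq is the sequence
 * number of the last request handled on the slot, req_seq the one in
 * the incoming Sequence operation.  Arithmetic wraps modulo 2^32.
 */
enum seq_result
check_slot_seq(uint32_t slot_seq, uint32_t req_seq, bool linux_hack)
{
	if (req_seq == slot_seq)
		return (SEQ_REPLAY);	/* retry: answer from the cached reply */
	if (req_seq == (uint32_t)(slot_seq + 1))
		return (SEQ_NEW);
	if (linux_hack && req_seq == (uint32_t)(slot_seq + 2))
		return (SEQ_NEW);	/* tolerated only when the tunable is set */
	return (SEQ_MISORDERED);
}
```

With the flag clear, an advance by 2 is classified as misordered, matching the RFC5661 behaviour the message describes; a smaller sequence number is misordered in either mode.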
#
7763814f |
|
11-Apr-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsv4 client: do the BindConnectionToSession as required During a recent testing event, it was reported that the NFSv4.1/4.2 server erroneously bound the back channel to a new TCP connection. RFC5661 specifies that the fore channel is implicitly bound to a new TCP connection when an RPC with Sequence (almost any of them) is done on it. For the back channel to be bound to the new TCP connection, an explicit BindConnectionToSession must be done as the first RPC on the new connection. Since new TCP connections are created by the "reconnect" layer (sys/rpc/clnt_rc.c) of the krpc, this patch adds an optional upcall done by the krpc whenever a new connection is created. The patch also adds the specific upcall function that does a BindConnectionToSession and configures the krpc to call it when required. This is necessary for correct interoperability with NFSv4.1/NFSv4.2 servers when the nfscbd daemon is running. If doing NFSv4.1/NFSv4.2 mounts without this patch, it is recommended that the nfscbd daemon not be running and that the "pnfs" mount option not be specified. PR: 254840 Comments by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29475
|
#
22cefe3d |
|
10-Apr-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: fix replies from session cache for multiple retries Recent testing of network partitioning a FreeBSD NFSv4.1 server from a Linux NFSv4.1 client identified problems with both the FreeBSD server and Linux client. Commit 05a39c2c1c18 fixed replying with the cached reply in the session slot if the request has the same session slot sequence#. However, the code uses the reply and, as such, will fail for a subsequent retry of the RPC. A subsequent retry would be an extremely rare event, but this patch fixes this, so long as m_copym(..M_NOWAIT) does not fail, which should also be a rare event. This fix affects the exceedingly rare case where a NFSv4 client retries a non-idempotent RPC, such as a lock operation, multiple times. Note that retries only occur after the client has needed to create a new TCP connection, with a new TCP connection for each retry. MFC after: 2 weeks
|
#
5f742d38 |
|
19-Mar-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsv4 client: fix forced dismount when sleeping on nfsv4lck During a recent NFSv4 testing event a test server caused a hang where "umount -N" failed. The renew thread was sleeping on "nfsv4lck" and the "umount" was sleeping, waiting for the renew thread to terminate. This is the first of two patches that are hoped to fix the renew thread so that it will terminate when "umount -N" is done on the mount. nfsv4_lock() checks for forced dismount, but only after it wakes up from msleep(). Without this patch, a wakeup() call was required. This patch adds a 1second timeout on the msleep(), so that it will wake up and see the forced dismount flag. Normally a wakeup() will occur in less than 1second, but if a premature return from msleep() does occur, it will simply loop around and msleep() again. While here, replace the nfsmsleep() wrapper that was used for portability with the actual msleep() call and make the same change for nfsv4_getref(). MFC after: 2 weeks
|
#
52e63ec2 |
|
17-Dec-2020 |
Brooks Davis <brooks@FreeBSD.org> |
VFS_QUOTACTL: Remove needless casts of arg The argument is a void * so there's no need to cast it to caddr_t. Update documentation to match function declaration. Reviewed by: freqlabs Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D27093
|
#
586ee69f |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fs: clean up empty lines in .c and .h files
|
#
808306dd |
|
17-Aug-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Delete the unused "use_ext" argument to nfscl_reqstart(). This is a partial revert of r363210, since the "use_ext" argument added by that commit is not actually useful. This patch should not result in any semantics change.
|
#
02511d21 |
|
10-Aug-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add an argument to newnfs_connect() that indicates use TLS for the connection. For NFSv4.0, the server creates a server->client TCP connection for callbacks. If the client mount on the server is using TLS, enable TLS for this callback TCP connection. TLS connections from clients will not be supported until the kernel RPC changes are committed. Since this changes the internal ABI between the NFS kernel modules that will require a version bump, delete newnfs_trimtrailing(), which is no longer used. Since LCL_TLSCB is not yet set, these changes should not have any semantic effect at this time.
|
#
194d8704 |
|
26-Jul-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFSv4 client so that it checks for support of TimeCreate before trying to set it. r362490 added support for setting of the TimeCreate (va_birthtime) attribute, but it does so without checking to see if the server supports the attribute. This could result in NFSERR_ATTRNOTSUPP error replies to the Setattr operation. This patch adds code to check that the server supports TimeCreate before attempting to do a Setattr of it to avoid these error returns.
|
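The check described in this commit amounts to testing a bit in the server's supported-attributes bitmap before including the attribute in a Setattr. A minimal sketch, assuming a three-word bitmap and the attribute number 50 for time_create as listed in the NFSv4 RFCs. All `sketch_` names are hypothetical and are not the FreeBSD macros.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ATTR_WORDS	3	/* assumed bitmap size, in 32-bit words */
#define SKETCH_ATTR_TIMECREATE	50	/* time_create attribute number (assumed) */

struct sketch_attrbits {
	uint32_t bits[SKETCH_ATTR_WORDS];
};

bool
sketch_attr_isset(const struct sketch_attrbits *b, unsigned int attr)
{
	if (attr >= SKETCH_ATTR_WORDS * 32)
		return (false);
	return ((b->bits[attr / 32] & (UINT32_C(1) << (attr % 32))) != 0);
}

void
sketch_attr_set(struct sketch_attrbits *b, unsigned int attr)
{
	if (attr < SKETCH_ATTR_WORDS * 32)
		b->bits[attr / 32] |= UINT32_C(1) << (attr % 32);
}

void
sketch_attr_clear(struct sketch_attrbits *b, unsigned int attr)
{
	if (attr < SKETCH_ATTR_WORDS * 32)
		b->bits[attr / 32] &= ~(UINT32_C(1) << (attr % 32));
}

/*
 * Drop TimeCreate from the attributes the client wants to set when the
 * server did not advertise it.  Returns true if it will still be sent.
 */
bool
sketch_prune_timecreate(const struct sketch_attrbits *supported,
    struct sketch_attrbits *wanted)
{
	if (!sketch_attr_isset(supported, SKETCH_ATTR_TIMECREATE))
		sketch_attr_clear(wanted, SKETCH_ATTR_TIMECREATE);
	return (sketch_attr_isset(wanted, SKETCH_ATTR_TIMECREATE));
}
```

Pruning on the client side avoids a round trip that would only come back with an attribute-not-supported error.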
#
022346fa |
|
06-Jul-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for ext_pgs mbufs to nfsrvd_rephead(). This is another in the series of commits that add support to the NFS client and server for building RPC messages in ext_pgs mbufs with anonymous pages. This is useful so that the entire mbuf list does not need to be copied before calling sosend() when NFS over TLS is enabled. Since ND_EXTPG is never set yet, there is no semantic change at this time.
|
#
34fc29e0 |
|
05-Jul-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for ext_pgs mbufs to nfsm_strtom(). Also, add a new function nfsm_add_ext_pgs() which will either add a page or add a new ext_pgs mbuf with a page to the mbuf list. Used by nfsm_strtom(). This is another in the series of commits that add support to the NFS client and server for building RPC messages in ext_pgs mbufs with anonymous pages. This is useful so that the entire mbuf list does not need to be copied before calling sosend() when NFS over TLS is enabled. Since ND_EXTPG is never set yet, there is no semantic change at this time.
|
#
dccb5806 |
|
03-Jul-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for ext_pgs mbufs to nfscl_reqstart() and nfsm_set(). This is another in the series of commits that add support to the NFS client and server for building RPC messages in ext_pgs mbufs with anonymous pages. This is useful so that the entire mbuf list does not need to be copied before calling sosend() when NFS over TLS is enabled. Since ND_EXTPG is never set yet, there is no semantic change at this time.
|
#
4476c1de |
|
25-Jun-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a boolean argument to nfscl_reqstart() to indicate that ext_pgs mbufs should be used. For KERN_TLS (and possibly some other future network interface) the mbuf list passed into sosend() must be ext_pgs mbufs. The krpc could simply copy all the mbuf data into ext_pgs mbufs before calling sosend(), but that would be inefficient for large RPC messages. This patch adds an argument to nfscl_reqstart() to indicate that it should fill the RPC message into ext_pgs mbufs. It also adds fields to "struct nfsrv_descript" needed for building NFS RPC messages in ext_pgs mbufs, along with new flags for this. Since the argument is always "false", this commit should not result in any semantic change. However, this commit prepares the code for future commits that will add support for building of NFS RPC messages in ext_pgs mbufs.
|
#
c07782e1 |
|
22-Jun-2020 |
Doug Rabson <dfr@FreeBSD.org> |
Add some missing parts for supporting va_birthtime. Reviewed by: rmacklem
|
#
eea79fde |
|
17-Jun-2020 |
Alan Somers <asomers@FreeBSD.org> |
Remove vfs_statfs and vnode_mount macros from NFS These macro definitions are no longer needed as the NFS OSX port is long dead. The vfs_statfs macro conflicts with the vfsops field of the same name. Submitted by: shivank@ Reviewed by: rmacklem MFC after: 2 weeks Sponsored by: Google, Inc. (GSoC 2020) Differential Revision: https://reviews.freebsd.org/D25263
|
#
3900c114 |
|
14-Jun-2020 |
Doug Rabson <dfr@FreeBSD.org> |
Add support for the timecreate attribute This maps to the va_birthtime VFS attribute.
|
#
3d7650f0 |
|
17-May-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a function nfsm_set() to initialize "struct nfsrv_descript" for building mbuf lists. This function is currently trivial, but that will change when support for building NFS messages in ext_pgs mbufs is added. Adding support for ext_pgs mbufs is needed for KERN_TLS, which will be used to implement nfs-over-tls.
|
#
b9cc3262 |
|
12-May-2020 |
Ryan Moeller <freqlabs@FreeBSD.org> |
nfs: Remove APPLESTATIC macro It is no longer useful. Reviewed by: rmacklem Approved by: mav (mentor) MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D24811
|
#
32033b3d |
|
08-May-2020 |
Ryan Moeller <freqlabs@FreeBSD.org> |
Remove APPLEKEXT ifndefs They are no longer useful. Reviewed by: rmacklem Approved by: mav (mentor) MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D24752
|
#
04d6c514 |
|
05-May-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Delete unused function newnfs_trimleading. The NFS function called newnfs_trimleading() has not been used by the code in a long time. To give you a clue, it still had a K&R style function declaration. Delete it, since it is just cruft, as a part of the NFS mbuf handling cleanup in preparation for adding ext_pgs mbuf support. The ext_pgs mbuf support for the build/send side is needed by nfs-over-tls.
|
#
3973ef1d |
|
04-May-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Revert r360514, to avoid unnecessary churn of the sources. r360514 prepared the NFS code for changes to handle ext_pgs mbufs on the receive side. However, at this time, KERN_TLS does not pass ext_pgs mbufs up through soreceive(). As such, at this time, only the send/build side of the NFS mbuf code needs to handle ext_pgs mbufs. Revert r360514 since the rather extensive changes required for receive side ext_pgs mbufs are not yet needed. This avoids unnecessary churn of the sources.
|
#
0c9cd5ca |
|
30-Apr-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Factor some code out of nfsm_dissct() into separate functions. Factoring some of the code in nfsm_dissct() out into separate functions allows these functions to be used elsewhere in the NFS mbuf handling code. Other uses of these functions will be done in future commits. It also makes it easier to add support for ext_pgs mbufs, which is needed for nfs-over-tls under development in base/projects/nfs-over-tls. Although the algorithm in nfsm_dissct() is somewhat re-written by this patch, the semantics of nfsm_dissct() should not have changed.
|
#
e4a458bb |
|
24-Apr-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Remove Mac OS/X macros that did nothing for FreeBSD. The macros CAST_USER_ADDR_T() and CAST_DOWN() were used for the Mac OS/X port. The first of these macros was a no-op for FreeBSD and the second is no longer used. This patch gets rid of them. It also deletes the "mbuf_t" typedef, which is no longer used in the FreeBSD code, from nfskpiport.h. This patch should not change semantics.
|
#
ae070589 |
|
17-Apr-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace all instances of the typedef mbuf_t with "struct mbuf *". The typedef mbuf_t was used for the Mac OS/X port of the code long ago. Since this port is no longer used and the use of mbuf_t obscures what the code does (and is not consistent with style(9)), it is no longer needed. This patch replaces all instances of mbuf_t with "struct mbuf *", so that it is no longer used. This patch should not result in any semantic change.
|
#
c948a17a |
|
09-Apr-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace mbuf macros with the code they would generate in the NFS code. When the code was ported to Mac OS/X, mbuf handling functions were converted to using the Mac OS/X accessor functions. For FreeBSD, they are a simple set of macros in sys/fs/nfs/nfskpiport.h. Since porting to Mac OS/X is no longer a consideration, replacement of these macros with the code generated by them makes the code more readable. When support for external page mbufs is added as needed by the KERN_TLS, the patch becomes simpler if done without the macros. This patch should not result in any semantic change. This conversion will be committed one file at a time.
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
c057a378 |
|
12-Dec-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for NFSv4.2 to the NFS client and server. This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes (RFC-8276) to the NFS client and server. NFSv4.2 is comprised of several optional features that can be supported in addition to NFSv4.1. This patch adds the following optional features: - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED) - posix_fallocate() - intra server file range copying via the copy_file_range(2) syscall --> Avoiding data transfer over the wire to/from the NFS client. - lseek(SEEK_DATA/SEEK_HOLE) - Extended attribute syscalls for "user" namespace attributes as defined by RFC-8276. Although this patch is fairly large, it should not affect support for the other versions of NFS. However it does add two new sysctls that allow a sysadmin to limit which minor versions of NFSv4 a server supports, allowing a sysadmin to disable NFSv4.2. Unfortunately, when the NFS stats structure was last revised, it was assumed that there would be no additional operations added beyond what was specified in RFC-7862. However RFC-8276 did add additional operations, forcing the NFS stats structure to be revised again. It now has extra unused entries in all arrays, so that future extensions to NFSv4.2 can be accommodated without revising this structure again. A future commit will update nfsstat(1) to report counts for the new NFSv4.2 specific operations/procedures. This patch affects the internal interface between the nfscommon, nfscl and nfsd modules and, as such, they all must be upgraded simultaneously. I will do a version bump (although arguably not needed), due to this. This code has survived a "make universe" but has not been built with a recent GCC. If you encounter build problems, please email me. Relnotes: yes
|
#
e1cda5ee |
|
28-Nov-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix two races while handling nfsuserd daemon start/stop. A crash was reported where the nr_client field was NULL during an upcall to the nfsuserd daemon. Since nr_client == NULL only occurs when the nfsuserd daemon is being shut down, it appeared to be caused by a race between doing an upcall and the daemon shutting down. By inspection two races were identified: 1 - The nfsrv_nfsuserd variable is used to indicate whether or not the daemon is running. However it did not handle the intermediate phase where the daemon is starting or stopping. This was fixed by making nfsrv_nfsuserd tri-state and having the functions that are called during start/stop obey the intermediate state. 2 - nfsrv_nfsuserd was checked to see that the daemon was running at the beginning of an upcall, but nothing prevented the daemon from being shut down while an upcall was still in progress. This race probably caused the crash. The patch fixes this by adding a count of upcalls in progress and having the shut down function delay until this count goes to zero before getting rid of nr_client and related data used by an upcall. Tested by: avg (Panzura QA) Reported by: avg Reviewed by: avg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22377
|
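The two fixes in this commit, a tri-state "running" variable and a count of upcalls in progress, can be sketched as a small state machine. This is an illustrative sketch only: the names are hypothetical, locking is omitted (the kernel would hold a mutex around every access), and where the real shutdown path sleeps until the count drains, the sketch just reports whether teardown is safe yet.

```c
#include <stdbool.h>

/* Hypothetical tri-state, replacing a plain running/not-running flag. */
enum sketch_state {
	SKETCH_NOTRUNNING,
	SKETCH_STARTSTOP,	/* daemon is starting up or shutting down */
	SKETCH_RUNNING
};

struct sketch_daemon {
	enum sketch_state state;
	int upcalls;		/* upcalls currently in progress */
};

/* Begin an upcall; refused unless the daemon is fully running. */
bool
sketch_upcall_begin(struct sketch_daemon *d)
{
	if (d->state != SKETCH_RUNNING)
		return (false);
	d->upcalls++;
	return (true);
}

void
sketch_upcall_end(struct sketch_daemon *d)
{
	d->upcalls--;
}

/* First half of shutdown: stop admitting new upcalls. */
bool
sketch_stop_begin(struct sketch_daemon *d)
{
	if (d->state != SKETCH_RUNNING)
		return (false);
	d->state = SKETCH_STARTSTOP;
	return (true);
}

/* Second half: tear down only once no upcall can still touch the data. */
bool
sketch_stop_finish(struct sketch_daemon *d)
{
	if (d->state != SKETCH_STARTSTOP || d->upcalls > 0)
		return (false);
	d->state = SKETCH_NOTRUNNING;
	return (true);
}
```

The intermediate state closes the first race (no new upcalls during start/stop) and the counter closes the second (no teardown under an active upcall).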
#
a6f77c9a |
|
21-Apr-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add #ifdef INET as requested by bz@.
|
#
ea5776ec |
|
18-Apr-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFSv4.0 server so that it does not support NFSv4.1 attributes. During inspection of a packet trace, I noticed that an NFSv4.0 mount reported that it supported attributes that are only defined for NFSv4.1. In practice, this bug appears to be benign, since NFSv4.0 clients will not use attributes that were added for NFSv4.1. However, this was not correct and this patch fixes the NFSv4.0 server so that it only supports attributes defined for NFSv4.0. It also adds a definition for NFSv4.1 attributes that can only be set, although it is only defined as 0 for now. This is anticipation of the addition of support for the NFSv4.1 mode+mask attribute soon. MFC after: 2 weeks
|
#
80405bcf |
|
06-Apr-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add INET6 support for the upcalls to the nfsuserd daemon. The kernel code uses UDP to do upcalls to the nfsuserd(8) daemon to get updates to the username<->uid and groupname<->gid mappings. A change to AF_LOCAL last year had to be reverted, since it could result in vnode locking issues on the AF_LOCAL socket. This patch adds INET6 support and the required #ifdef INET and INET6 to the code. Requested by: bz PR: 205193 Reviewed by: bz, rgrimes MFC after: 2 weeks Differential Revision: http://reviews.freebsd.org/D19218
|
#
02c8dd7d |
|
04-Apr-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Revert r320698, since the related userland changes were reverted by r338192. r338192 reverted the changes to nfsuserd so that it could use an AF_LOCAL socket, since it resulted in a vnode locking panic(). Post r338192 nfsuserd daemons use the old AF_INET socket for upcalls and do not use these kernel changes. I left them in for a while, so that nfsuserd daemons built from head sources between r320757 (Jul. 6, 2017) and r338192 (Aug. 22, 2018) would need them by default. This only affects head, since the changes were never MFC'd. I will add an UPDATING entry, since an nfsuserd daemon built from head sources between r320757 and r338192 will not run unless the "-use-udpsock" option is specified. (This command line option is only in the affected revisions of the nfsuserd daemon.) I suspect few will be affected by this, since most who run systems built from head sources (not stable or releases) will have rebuilt their nfsuserd daemon from sources post r338192 (Aug. 22, 2018) This is being reverted in preparation for an update to include AF_INET6 support to the code.
|
#
2df8bd90 |
|
12-Mar-2019 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Drop unused 'p' argument to nfsv4_strtogid(). MFC after: 2 weeks Sponsored by: DARPA, AFRL
|
#
c703cba8 |
|
12-Mar-2019 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Drop unused 'p' argument to nfsv4_gidtostr(). MFC after: 2 weeks Sponsored by: DARPA, AFRL
|
#
0658ac39 |
|
12-Mar-2019 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Drop unused 'p' argument to nfsv4_strtouid(). MFC after: 2 weeks Sponsored by: DARPA, AFRL
|
#
0f86b94a |
|
12-Mar-2019 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Drop unused 'p' argument to nfsv4_uidtostr(). MFC after: 2 weeks Sponsored by: DARPA, AFRL
|
#
f32bf292 |
|
12-Mar-2019 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Drop unused 'p' argument to nfsrv_getuser(). Reviewed by: rmacklem MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D19455
|
#
cc426dd3 |
|
11-Dec-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
Remove unused argument to priv_check_cred. Patch mostly generated with coccinelle: @@ expression E1,E2; @@ - priv_check_cred(E1,E2,0) + priv_check_cred(E1,E2) Sponsored by: The FreeBSD Foundation
|
#
778f2983 |
|
19-Nov-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsm_advance() would panic() when the offs argument was negative. The code assumed that this would indicate a corrupted mbuf chain, but it could simply be caused by bogus RPC message data. This patch replaces the panic() with a printf() plus error return. MFC after: 1 week
|
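The change described here, rejecting a negative offset with a message and an error instead of panicking, can be illustrated on a flat buffer. A minimal sketch under stated assumptions: `struct sketch_cursor` and `sketch_advance()` are hypothetical, a single buffer replaces the mbuf chain, and `EINVAL` stands in for whatever error the kernel function returns.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical parse position within a received message. */
struct sketch_cursor {
	const unsigned char *buf;
	size_t len;	/* total bytes in buf */
	size_t pos;	/* current parse offset */
};

/*
 * Advance the cursor by offs bytes.  Untrusted message data can produce a
 * negative or oversized offset, so both cases are reported as errors
 * rather than treated as internal corruption.
 */
int
sketch_advance(struct sketch_cursor *c, long offs)
{
	if (offs < 0) {
		fprintf(stderr, "sketch_advance: negative offset %ld\n", offs);
		return (EINVAL);
	}
	if ((size_t)offs > c->len - c->pos)
		return (EINVAL);
	c->pos += (size_t)offs;
	return (0);
}
```

Returning an error lets the caller drop the one malformed RPC while the rest of the system keeps running.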
#
3e5ba2e1 |
|
17-Aug-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix LORs between vn_start_write() and vn_lock() in the pNFS server. When coding the pNFS server, I added several vn_start_write() calls done while the vnode was locked, not realizing I had introduced LORs and possible deadlock when an exported file system on the MDS is suspended. This patch fixes this by removing the added vn_start_write() calls and modifying the code so that the extant vn_start_write() call before the NFS RPC/operation is done when needed by the pNFS server. Flags are changed so that LayoutCommit and LayoutReturn now get a vn_start_write() done for them. When the pNFS server is enabled, the code now also changes the flags for Getattr, so that the vn_start_write() is done for Getattr, since it may need to do a vn_set_extattr(). The nfs_writerpc flag array was made global to the NFS server and renamed nfsrv_writerpc, which is consistent naming for globals in the NFS server. Thanks go to kib@ for reporting that doing vn_start_write() while the vnode is locked results in a LOR. This patch only affects the behaviour of the pNFS server.
|
#
2f32675c |
|
02-Jul-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add an optional feature to the pNFS server. Without this patch, the pNFS server distributes the data storage files across all of the specified DSs. A tester noted that it would be nice if a system administrator could control which DSs are used to store the file data for a given exported MDS file system. This patch adds the kernel support to do this. It also makes a slight semantic change to nfsv4_findmirror(), since some uses of it no longer require that the DS being searched for have a current mirror. A patch that will be committed in a few minutes will modify the nfsd daemon to support this feature. The patch should only affect sites using the pNFS server (specified via the "-p" command line option for nfsd). Suggested by: james.rose@framestore.com
|
#
9f4c522e |
|
22-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Set the slotid and ND_HASSLOTID flag for NFSv4.1 sequenced operations. Most NFSv4.1 compound RPCs start with a Sequence operation. For these cases, save the slotid and note that it is saved by setting ND_HASSLOTID. This is used by r335568 to free up the session slot and disable it. MFC after: 2 weeks
|
#
c338c94d |
|
14-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Move four functions in nfscl.ko to nfscommon.ko. Four functions nfscl_reqstart(), nfscl_fillsattr(), nfsm_stateidtom() and nfsmnt_mdssession() are now called from within the nfsd. As such, they needed to be moved from nfscl.ko to nfscommon.ko so that nfsd.ko would load when nfscl.ko wasn't loaded. Reported by: herbert@gojira.at
|
#
90d2dfab |
|
12-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Merge the pNFS server code from projects/pnfs-planb-server into head. This code merge adds a pNFS service to the NFSv4.1 server. Although it is a large commit it should not affect behaviour for a non-pNFS NFS server. Some documentation on how this works can be found at: http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt and will hopefully be turned into a proper document soon. This is a merge of the kernel code. Userland and man page changes will come soon, once the dust settles on this merge. It has passed a "make universe", so I hope it will not cause build problems. It also adds NFSv4.1 server support for the "current stateid". Here is a brief overview of the pNFS service: A pNFS service separates the Read/Write operations from all the other NFSv4.1 Metadata operations. It is hoped that this separation allows a pNFS service to be configured that exceeds the limits of a single NFS server for either storage capacity and/or I/O bandwidth. It is possible to configure mirroring within the data servers (DSs) so that the data storage file for an MDS file will be mirrored on two or more of the DSs. When this is used, failure of a DS will not stop the pNFS service and a failed DS can be recovered once repaired while the pNFS service continues to operate. Although two way mirroring would be the norm, it is possible to set a mirroring level of up to four or the number of DSs, whichever is less. The Metadata server will always be a single point of failure, just as a single NFS server is. A Plan B pNFS service consists of a single MetaData Server (MDS) and K Data Servers (DS), all of which are recent FreeBSD systems. Clients will mount the MDS as they would a single NFS server. When files are created, the MDS creates a file tree identical to what a single NFS server creates, except that all the regular (VREG) files will be empty. As such, if you look at the exported tree on the MDS directly on the MDS server (not via an NFS mount), the files will all be of size 0. 
Each of these files will also have two extended attributes in the system attribute name space: pnfsd.dsfile - This extended attribute stores the information that the MDS needs to find the data storage file(s) on DS(s) for this file. pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime and Change attributes for the file, so that the MDS doesn't need to acquire the attributes from the DS for every Getattr operation. For each regular (VREG) file, the MDS creates a data storage file on one (or more if mirroring is enabled) of the DSs in one of the "dsNN" subdirectories. The name of this file is the file handle of the file on the MDS in hexadecimal so that the name is unique. The DSs use subdirectories named "ds0" to "dsN" so that no one directory gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize on the MDS, with the default being 20. For production servers that will store a lot of files, this value should probably be much larger. It can be increased when the "nfsd" daemon is not running on the MDS, once the "dsK" directories are created. For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces of information to the client that allows it to do I/O directly to the DS. DeviceInfo - This is relatively static information that defines what a DS is. The critical bits of information returned by the FreeBSD server are the IP address of the DS and, for the Flexible File layout, that NFSv4.1 is to be used and that it is "tightly coupled". There is a "deviceid" which identifies the DeviceInfo. Layout - This is per file and can be recalled by the server when it is no longer valid. For the FreeBSD server, there is support for two types of layout, called File and Flexible File layout. Both allow the client to do I/O on the DS via NFSv4.1 I/O operations. 
The Flexible File layout is a more recent variant that allows specification of mirrors, where the client is expected to do writes to all mirrors to maintain them in a consistent state. The Flexible File layout also allows the client to report I/O errors for a DS back to the MDS. The Flexible File layout supports two variants referred to as "tightly coupled" vs "loosely coupled". The FreeBSD server always uses the "tightly coupled" variant where the client uses the same credentials to do I/O on the DS as it would on the MDS. For the "loosely coupled" variant, the layout specifies a synthetic user/group that the client uses to do I/O on the DS. The FreeBSD server does not do striping and always returns layouts for the entire file. The critical information in a layout is Read vs Read/Write and DeviceID(s) that identify which DS(s) the data is stored on. At this time, the MDS generates File Layout layouts to NFSv4.1 clients that know how to do pNFS for the non-mirrored DS case unless the sysctl vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File layouts are generated. The mirrored DS configuration always generates Flexible File layouts. For NFS clients that do not support NFSv4.1 pNFS, all I/O operations are done against the MDS which acts as a proxy for the appropriate DS(s). When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy. If the DS is on the same machine, the MDS/DS will do the RPC on the DS as a proxy and so on, until the machine runs out of some resource, such as session slots or mbufs. As such, DSs must be separate systems from the MDS. Tested by: james.rose@framestore.com Relnotes: yes
|
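The overview above says the data storage file is named after the MDS file handle written in hexadecimal. A minimal sketch of that encoding follows. The function name and the choice of lower-case digits are assumptions, and how a "dsNN" subdirectory is picked is not stated in the commit message, so it is not modelled here.

```c
#include <stddef.h>

/*
 * Write the bytes of a file handle as a NUL-terminated hexadecimal string.
 * Returns 0 on success, or -1 if the output buffer is too small.
 * Lower-case digits are an assumption, not taken from the FreeBSD code.
 */
int
sketch_fh_to_name(const unsigned char *fh, size_t fhlen, char *out,
    size_t outlen)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	if (outlen < fhlen * 2 + 1)
		return (-1);
	for (i = 0; i < fhlen; i++) {
		out[2 * i] = digits[fh[i] >> 4];
		out[2 * i + 1] = digits[fh[i] & 0x0f];
	}
	out[fhlen * 2] = '\0';
	return (0);
}
```

Because a file handle identifies exactly one file on the MDS, the resulting name cannot collide with that of another file's data storage file.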
#
9442a64e |
|
01-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add the BindConnectiontoSession operation to the NFSv4.1 server. Under some fairly unusual circumstances, the Linux NFSv4.1 client is doing a BindConnectiontoSession operation for TCP connections. It is also used by the ESXi6.5 NFSv4.1 client. This patch adds this operation to the NFSv4.1 server. Reported by: andreas.nagy@frequentis.com Tested by: andreas.nagy@frequentis.com MFC after: 2 weeks
|
#
b97b91b5 |
|
25-Jan-2018 |
Conrad Meyer <cem@FreeBSD.org> |
nfs: Remove NFSSOCKADDRALLOC, NFSSOCKADDRFREE macros They were just thin wrappers over malloc(9) w/ M_ZERO and free(9). Discussed with: rmacklem, markj Sponsored by: Dell EMC Isilon
|
#
222daa42 |
|
25-Jan-2018 |
Conrad Meyer <cem@FreeBSD.org> |
style: Remove remaining deprecated MALLOC/FREE macros Mechanically replace uses of MALLOC/FREE with appropriate invocations of malloc(9) / free(9) (a series of sed expressions). Something like: * MALLOC(a, b, ... -> a = malloc(... * FREE( -> free( * free((caddr_t) -> free( No functional change. For now, punt on modifying contrib ipfilter code, leaving a definition of the macro in its KMALLOC(). Reported by: jhb Reviewed by: cy, imp, markj, rmacklem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14035
|
#
151ba793 |
|
24-Dec-2017 |
Alexander Kabaev <kan@FreeBSD.org> |
Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385
|
#
55384243 |
|
19-Dec-2017 |
John Baldwin <jhb@FreeBSD.org> |
Replace one more LINK_MAX with NFS_LINK_MAX missed in r326991. Sponsored by: Chelsio Communications
|
#
a0a073b1 |
|
19-Dec-2017 |
John Baldwin <jhb@FreeBSD.org> |
Update NFS to handle larger link counts post ino64. - Define a NFS_LINK_MAX as UINT32_MAX to match the wire protocol. - Use NFS_LINK_MAX instead of LINK_MAX as the fallback value reported for a PATHCONF RPC by the NFS server. - Use NFS_LINK_MAX instead of LINK_MAX as the default value reported by the NFS client pathconf() if not overridden by the NFS server. - When reading the link count out of an RPC reply, read the full 32 bits instead of the lower 16 bits. Reviewed by: rmacklem (earlier version) Sponsored by: Chelsio Communications
|
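The last item in this commit, reading all 32 bits of the link count from the reply, can be shown with a small decoder. This is a sketch with hypothetical names. It assumes only that the wire value is a 32-bit big-endian word, as in XDR, and contrasts the full read with the old truncation to the low 16 bits.

```c
#include <stdint.h>

#define SKETCH_NFS_LINK_MAX	UINT32_MAX	/* matches the 32-bit wire field */

/* Decode one big-endian 32-bit word from an RPC reply. */
uint32_t
sketch_xdr_to_u32(const unsigned char *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* New behaviour: keep the whole 32-bit link count. */
uint32_t
sketch_nlink_full(const unsigned char *p)
{
	return (sketch_xdr_to_u32(p));
}

/* Old behaviour: only the lower 16 bits survive. */
uint16_t
sketch_nlink_low16(const unsigned char *p)
{
	return ((uint16_t)sketch_xdr_to_u32(p));
}
```

For a wire value of 0x00010005, the full read gives 65541 while the 16-bit read gives 5, which is the kind of silent wrap the commit removes.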
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
be3d32ad |
|
28-Sep-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change nfsv4_getipaddr() and nfsrpc_fillsa() to not use sockaddr_storage. This patch changes nfsv4_getipaddr() and nfsrpc_fillsa() to use a sockaddr_in * and sockaddr_in6 * instead of sockaddr_storage, to avoid allocating the latter on the stack. It also moves the nfsrpc_fillsa() call to after the completion of parsing of the DeviceInfo reply from the server. This patch is in preparation for addition of Flex File Layout support in a future commit. It only affects the "pnfs" NFSv4.1 client mount option and should not have changed its semantics.
|
#
16f300fa |
|
27-Jul-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace the checks for MNTK_UNMOUNTF with a macro that does the same thing. This patch defines a macro that checks for MNTK_UNMOUNTF and replaces explicit checks with this macro. It has no effect on semantics, but prepares the code for a future patch where there will also be a NFS specific flag for "forced dismount about to occur". Suggested by: kib MFC after: 2 weeks
|
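Replacing scattered flag tests with one macro, as this commit describes, might look like the sketch below. The macro name, the structure and the flag value are all hypothetical. The point is only that a later patch can extend the single definition (for example with an NFS-specific "about to be dismounted" flag) without touching any caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MNTK_UNMOUNTF	0x00000001u	/* assumed flag value */

/* Hypothetical mount structure carrying kernel flags. */
struct sketch_mount {
	uint32_t kern_flag;
};

/* One place that defines what "forced dismount in progress" means. */
#define SKETCH_FORCEDISM(mp) \
	(((mp)->kern_flag & SKETCH_MNTK_UNMOUNTF) != 0)

/* Example caller: give up on an RPC when a forced dismount is under way. */
bool
sketch_should_abort_rpc(const struct sketch_mount *mp)
{
	return (SKETCH_FORCEDISM(mp));
}
```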
#
1d2fef9b |
|
19-Jul-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Rename vfs.nfsd.enable_uidtostring to vfs.nfs.enable_uidtostring. It applies to both NFS client and NFS server, and is useful for both. This is different from vfs.nfsd.enable_stringtouid, which is specific to server side. Reviewed by: rmacklem@ MFC after: 2 weeks Sponsored by: DARPA, AFRL
|
#
25d694a6 |
|
05-Jul-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for AF_LOCAL socket upcalls to the nfsuserd daemon. This patch adds support for AF_LOCAL socket upcalls to an nfsuserd daemon that supports them. A future patch to the nfsuserd daemon will use AF_LOCAL sockets to avoid a problem when using upcalls to 127.0.0.1 if jails are in use. Suggested by: dfr PR: 205193
|
#
3c264086 |
|
27-Jun-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Revert part of r320359, as suggested by rmacklem@. That case is only used for nfsuserd -manage-gids and shouldn't depend on sysctl. MFC after: 2 weeks Sponsored by: DARPA, AFRL
|
#
6a3450e1 |
|
26-Jun-2017 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Add vfs.nfsd.nfsd_enable_uidtostring, which works just like vfs.nfsd.nfsd_enable_stringtouid, but in reverse - when set to 1, it forces the NFSv4 server to return numeric UIDs and GIDs instead of "user@domain" strings. This helps with clients that can't translate returned identifiers, eg when rerooting. The same can be achieved by just never running nfsuserd(8), but the sysctl is useful to toggle the behaviour back and forth without rebooting. Reviewed by: rmacklem (earlier version) MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D11326
|
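The effect of this sysctl on an owner string can be sketched as a formatting choice. The function and its parameters are hypothetical. The fallback to a numeric string when no name mapping is available is an assumption, based on the remark that never running nfsuserd(8) gives the same result.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format an NFSv4 owner string.  With numeric set (the sysctl enabled), or
 * with no known name, emit the uid as a decimal string; otherwise emit
 * "name@domain".  Returns 0 on success, -1 if the buffer is too small.
 */
int
sketch_owner_string(unsigned int uid, const char *name, const char *domain,
    bool numeric, char *out, size_t outlen)
{
	int n;

	if (numeric || name == NULL)
		n = snprintf(out, outlen, "%u", uid);
	else
		n = snprintf(out, outlen, "%s@%s", name, domain);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	return (0);
}
```

A client that cannot translate "alice@example.org", for example while rerooting, can still interpret the string "1001".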
#
a351e99c |
|
24-Jun-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add two new compound RPCs to the NFSv4.1/pNFS client. When the NFSv4.1 client is doing pNFS, it needs to get an Open and a Layout for every file it will be doing I/O on. The current code does two separate RPCs to get these. This patch adds two new compounds that do both the Open and LayoutGet in the same RPC, reducing the RPC count. It also factors out the code that sets up and parses the LayoutGet operation into separate functions, so that the code doesn't get duplicated for these new RPCs. This patch is fairly large, but should only affect the NFSv4.1 client when the "pnfs" option is specified. PR: 219550 MFC after: 2 weeks
|
#
95ac7f1a |
|
18-Jun-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFS client/server so that it actually uses the 64bit ino_t filenos. The code still doesn't use d_off. That will come in a future commit. The code also removes the checks for servers returning a fileno that doesn't fit in 32bits, since that should work ok now. Bump __FreeBSD_version since this patch changes the interface between the NFS kernel modules. Reviewed by: kib
|
#
8c1d0d9c |
|
21-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Set default uid/gid to nobody/nogroup for NFSv4 mapping. The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon. However, they were 0 until the nfsuserd(8) was run. Since it is possible to use NFSv4 without running the nfsuserd(8) daemon, set them to nobody/nogroup initially. Without this patch, the values would be set by the nfsuserd(8) daemon and left changed even if the nfsuserd(8) daemon was killed. The default values of 0 meant that setting a group to "wheel" would fail even when done by root. It also adds a definition of GID_NOGROUP to sys/conf.h. Discussed on: freebsd-current@ MFC after: 2 weeks
|
#
b843ada7 |
|
21-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Revert r317240. I didn't realize there were defined constants for uid/gid values in sys/conf.h. I will do another commit using those.
|
#
1350db17 |
|
20-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Set default uid/gid to nobody/nogroup for NFSv4 mapping. The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon. However, they were 0 until the nfsuserd(8) was run. Since it is possible to use NFSv4 without running the nfsuserd(8) daemon, set them to nobody/nogroup initially. Without this patch, the values would be set by the nfsuserd(8) daemon and left changed even if the nfsuserd(8) daemon was killed. Also, the default values of 0 meant that setting a group to "wheel" would fail even when done by root and this patch fixes this issue. MFC after: 2 weeks
|
#
fb556791 |
|
10-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Set initial values for nfsstatfs in the NFSv4 client. The AmazonEFS NFSv4.1 server does not support the FILES_FREE and FILES_TOTAL attributes. As such, an NFSv4.1 mount to the server would return garbage for these values. This patch initializes the fields of the nfsstatfs structure, so that "df" and friends will at least return consistent bogus values. This patch should have effect when mounting other NFSv4.1 servers. Reported by: cperciva MFC after: 2 weeks
|
#
2242bc81 |
|
09-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFSv4.1 client for NFSERR_BADSESSION recovery via ReclaimComplete. For the ReclaimComplete operation, the RPC layer should not loop on NFSERR_BADSESSION. If it does, the recovery thread (nfscl) can get stuck looping and will not do a recovery. This patch fixes it so it does not loop. This bug only affects NFSv4.1 and only when a server reboots. Tested by: cperciva PR: 215886 MFC after: 2 weeks
|
#
fbbd9655 |
|
28-Feb-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
2f304845 |
|
05-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not allocate struct statfs on kernel stack. Right now the size of the structure is 472 bytes on amd64, which is already large, and stack allocations are undesirable. With the ino64 work, MNAMELEN is increased to 1024, which will make it impossible to have struct statfs on the stack. Extracted from: ino64 work by gleb Discussed with: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
b2fc0141 |
|
23-Dec-2016 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix NFSv4.1 client recovery from NFS4ERR_BAD_SESSION errors. For most NFSv4.1 servers, a NFS4ERR_BAD_SESSION error is a rare failure that indicates that the server has lost session/open/lock state. However, recent testing by cperciva@ against the AmazonEFS server found several problems with client recovery from this due to it generating this failure frequently. Briefly, the problems fixed are:
- If all session slots were in use at the time of the failure, some processes would continue to loop waiting for a slot on the old session forever.
- If an RPC that doesn't use open/lock state failed with NFS4ERR_BAD_SESSION, it would fail the RPC/syscall instead of initiating recovery and then looping to retry the RPC.
- If a successful reply to an RPC for an old session wasn't processed until after a new session was created for a NFS4ERR_BAD_SESSION error, it would erroneously update the new session and corrupt it.
- The use of the first element of the session list in the nfs mount structure (which is always the current metadata session) was slightly racy. With changes for the above problems it became more racy, so all uses of this head pointer were wrapped with a NFSLOCKMNT()/NFSUNLOCKMNT().
- Although the kernel malloc() usually allocates more bytes than requested and, as such, this wouldn't have caused problems, the allocation of a session structure was 1 byte smaller than it should have been. (Null termination byte for the string not included in byte count.)
There are probably still problems with a pNFS data server that fails with NFS4ERR_BAD_SESSION, but I have no server that does this to test against (the AmazonEFS server doesn't do pNFS), so I can't fix these yet. Although this patch is fairly large, it should only affect the handling of NFS4ERR_BAD_SESSION error replies from an NFSv4.1 server. Thanks go to cperciva@ for the extensive testing he did to help isolate/fix these problems.
Reported by: cperciva Tested by: cperciva MFC after: 3 months Differential Revision: https://reviews.freebsd.org/D8745
|
#
63659ba6 |
|
15-Nov-2016 |
Colin Percival <cperciva@FreeBSD.org> |
Reduce NFS "NFSv4( mounted on)? fileid > 32bits" log spam. Rather than printing a warning for every time we receive a fileid > 2^32 from the NFS server, count warnings and print at most one of each warning type per minute, e.g.,
Nov 15 05:17:34 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (24730 occurrences)
Nov 15 05:17:56 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (178 occurrences)
Nov 15 05:18:53 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (7582 occurrences)
Nov 15 05:18:58 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (23 occurrences)
A buildworld with an NFS mounted /usr/obj can otherwise result in hundreds of thousands of lines being printed, which seems unnecessarily verbose. When ino_t becomes a 64-bit type, these printfs will no longer be needed (and the problems associated with truncating 64-bit fileids to generate 32-bit inode numbers will also go away). Reviewed by: rmacklem MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8523
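The count-and-print-once-per-interval idea described in this message can be modelled in user space as follows. This is only a sketch in Python, not the kernel change itself: the class name `RateLimitedWarning`, the injectable `clock` parameter and the choice to print on the very first occurrence are assumptions made for illustration.

```python
import time


class RateLimitedWarning:
    """Count events and emit at most one summary line per interval.

    Hypothetical user-space model; the names here are not from the kernel.
    """

    def __init__(self, message, interval=60.0, clock=time.monotonic):
        self.message = message
        self.interval = interval
        self.clock = clock          # injectable so the behaviour can be tested
        self.count = 0              # occurrences since the last printed line
        self.last_emit = None       # time of the last printed line, if any

    def hit(self):
        """Record one occurrence; return the line to log, or None."""
        self.count += 1
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.interval:
            return None             # still inside the quiet period
        line = f"{self.message} ({self.count} occurrences)"
        self.count = 0
        self.last_emit = now
        return line
```

One counter per warning type would be kept, so that the "fileid" and "mounted on fileid" messages are limited independently, as in the sample log lines.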
|
#
8edac6ee |
|
06-May-2016 |
Ed Maste <emaste@FreeBSD.org> |
Add nid_namelen bounds check to nfssvc system call This is only allowed by root and only used by the nfs daemon, which should not provide an incorrect value. However, it's still good practice to validate data provided by userland. PR: 206626 Reported by: CTurt <cturt@hardenedbsd.org> Reviewed by: rmacklem MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D6201
|
#
74b8d63d |
|
10-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.
|
#
65171ebb |
|
01-Dec-2015 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the memory leak that occurs when the nfscommon.ko module is unloaded. This leak was introduced by r291527. Since the nfscommon.ko module is rarely unloaded, this leak would not have been much of an issue. MFC after: 2 weeks
|
#
84be7e09 |
|
30-Nov-2015 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add kernel support to the NFS server for the "-manage-gids" option that will be added to the nfsuserd daemon in a future commit. It modifies the cache used by NFSv4 for name<-->id translation (both username/uid and group/gid) to support this. When "-manage-gids" is set, the server looks up each uid for the RPC and uses the list of groups cached in the server instead of the list of groups provided in the RPC request. The cached group list is acquired for the cache by the nfsuserd daemon via getgrouplist(3). This avoids the 16 groups limit for the list in the RPC request. Since the cache is now used for every RPC when "-manage-gids" is enabled, the code also modifies the cache to use a separate mutex for each hash list instead of a single global mutex. Suggested by: jpaetzel Tested by: jpaetzel MFC after: 2 weeks
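A rough user-space model of the two ideas in this message (one lock per hash list, and preferring the cached group list over the one in the RPC) might look like the sketch below. All names (`BucketLockedCache`, `effective_groups`, the bucket count) are hypothetical, and falling back to the RPC's own list when a uid is not cached is an assumption, not something the message states.

```python
import threading


class BucketLockedCache:
    """uid -> group list cache with a separate lock for each hash bucket."""

    def __init__(self, nbuckets=16):
        self._buckets = [dict() for _ in range(nbuckets)]
        self._locks = [threading.Lock() for _ in range(nbuckets)]

    def _index(self, uid):
        return uid % len(self._buckets)

    def put(self, uid, groups):
        i = self._index(uid)
        with self._locks[i]:        # only this bucket is serialized
            self._buckets[i][uid] = tuple(groups)

    def get(self, uid):
        i = self._index(uid)
        with self._locks[i]:
            return self._buckets[i].get(uid)


def effective_groups(cache, uid, rpc_groups, manage_gids):
    """Pick the group list to use for a request.

    Assumption: when the uid is not cached, use the list from the RPC.
    """
    if manage_gids:
        cached = cache.get(uid)
        if cached is not None:
            return cached
    return tuple(rpc_groups)
```

With a single global lock every lookup would contend; splitting the lock per bucket lets lookups for uids in different buckets proceed concurrently, which matters once the cache is consulted on every RPC.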
|
#
6d659a5d |
|
19-Dec-2014 |
Benno Rice <benno@FreeBSD.org> |
Adjust the test of a KASSERT to better match the intent. This assertion was added in r246213 as a guard against corrupted mbufs arriving from drivers, the key distinguishing factor of said mbufs being that they had a negative length. Given we're in a while loop specifically designed to skip over zero-length mbufs, panicking on a zero-length mbuf seems incorrect. No objection from: kib
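The distinction can be illustrated with a small Python model of a loop that skips empty buffers. It is a sketch only: the list of integer lengths stands in for an mbuf chain, and `skip_empty` is a hypothetical name.

```python
def skip_empty(lengths):
    """Return the index of the first non-empty buffer, or None.

    `lengths` models the length field of each buffer in a chain.
    """
    for index, length in enumerate(lengths):
        # Only a negative length indicates corruption; zero is legal
        # and is exactly what this loop exists to skip.
        if length < 0:
            raise ValueError(f"corrupt buffer: negative length {length}")
        if length > 0:
            return index
    return None
```

Rejecting `length <= 0` instead would fail on the very buffers the loop is meant to step over.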
|
#
d8a5961f |
|
02-Oct-2014 |
Marcelo Araujo <araujo@FreeBSD.org> |
Fix failures and warnings reported by newpynfs20090424 test tool. This fix addresses only issues with the pynfs reports; none of these issues are known to create problems for extant real clients. Submitted by: Bart Hsiao <bart.hsiao@gmail.com> Reworked by: myself Reviewed by: rmacklem Approved by: rmacklem Sponsored by: QNAP Systems Inc.
|
#
c59e4cc3 |
|
01-Jul-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server. MFC after: 1 month
|
#
ca20bd92 |
|
02-May-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
The new draft specification for NFSv4.0 specifies that a server should either accept owner and owner_group strings that are just the digits of the uid/gid or return NFS4ERR_BADOWNER. This patch adds a sysctl vfs.nfsd.enable_stringtouid, which can be set to make the server accept numeric strings. It also ensures that NFS4ERR_BADOWNER is returned if numeric uid/gid strings are not enabled. This fixes the server for recent Linux nfs4 clients that use numeric uid/gid strings by default. Reported and tested by: craigyk@gmail.com MFC after: 2 weeks
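The accept-or-reject rule described here can be sketched in Python as below. This is an illustrative model, not the server code: `BadOwner` stands in for an NFS4ERR_BADOWNER reply, `name_map` is a hypothetical name-to-uid table, and rejecting unknown non-numeric names is an assumption.

```python
class BadOwner(Exception):
    """Stands in for an NFS4ERR_BADOWNER reply."""


def owner_to_uid(owner, name_map, enable_stringtouid):
    """Translate an owner string to a uid.

    A string made up only of ASCII digits is taken as the uid itself when
    enable_stringtouid is set, and rejected otherwise.
    """
    if owner.isascii() and owner.isdigit():
        if enable_stringtouid:
            return int(owner)
        raise BadOwner(owner)
    try:
        return name_map[owner]
    except KeyError:
        # Assumption: an unknown name is also reported as a bad owner.
        raise BadOwner(owner) from None
```

The `isascii()` guard is there because `str.isdigit()` alone also accepts non-ASCII digit characters, which `int()` may not parse.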
|
#
a6f8e64e |
|
18-Apr-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the Lookup RPC for NFSv4 so that it acquires directory attributes. This allows the client to cache directory names when they are looked up, reducing the Lookup RPC count by about 40% for software builds. MFC after: 2 weeks
|
#
b921158a |
|
23-Dec-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
The NFSv4 client was passing both the p and cred arguments to nfsv4_fillattr() as NULLs for the Getattr callback. This caused nfsv4_fillattr() to not fill in the Change attribute for the reply. I believe this was a violation of the RFC, but had little effect on server behaviour. This patch passes a non-NULL p argument to fix this. MFC after: 1 week
|
#
42b6336a |
|
09-Nov-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix an NFSv4.1 client specific case where a forced dismount would hang. The hang occurred in nfsv4_setsequence() when it couldn't find an available session slot and is fixed by checking for a forced dismount in progress and just returning for this case. MFC after: 1 month
|
#
a36b76a7 |
|
20-Jul-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
The NFSv4 server incorrectly assumed that the high order words of the attribute bitmap argument would be non-zero. This caused an interoperability problem for a recent patch to the Linux NFSv4 client. The Linux folks have changed their patch to avoid this, but this patch fixes the problem on the server. Reported and tested by: Andre Heider (a.heider@gmail.com) MFC after: 3 days
|
#
d96b98a3 |
|
17-Apr-2013 |
Kenneth D. Merry <ken@FreeBSD.org> |
Revamp the old NFS server's File Handle Affinity (FHA) code so that it will work with either the old or new server. The FHA code keeps a cache of currently active file handles for NFSv2 and v3 requests, so that read and write requests for the same file are directed to the same group of threads (reads) or thread (writes). It does not currently work for NFSv4 requests. They are more complex, and will take more work to support. This improves read-ahead performance, especially with ZFS, if the FHA tuning parameters are configured appropriately. Without the FHA code, concurrent reads that are part of a sequential read from a file will be directed to separate NFS threads. This has the effect of confusing the ZFS zfetch (prefetch) code and makes sequential reads significantly slower with clients like Linux that do a lot of prefetching. The FHA code has also been updated to direct write requests to nearby file offsets to the same thread in the same way it batches reads, and the FHA code will now also send writes to multiple threads when needed. This improves sequential write performance in ZFS, because writes to a file are now more ordered. Since NFS writes (generally less than 64K) are smaller than the typical ZFS record size (usually 128K), out of order NFS writes to the same block can trigger a read in ZFS. Sending them down the same thread increases the odds of their being in order. In order for multiple write threads per file in the FHA code to be useful, writes in the NFS server have been changed to use a LK_SHARED vnode lock, and upgrade that to LK_EXCLUSIVE if the filesystem doesn't allow multiple writers to a file at once. ZFS is currently the only filesystem that allows multiple writers to a file, because it has internal file range locking. This change does not affect the NFSv4 code. This improves random write performance to a single file in ZFS, since we can now have multiple writers inside ZFS at one time. 
I have changed the default tuning parameters to a 22 bit (4MB) window size (from 256K) and unlimited commands per thread as a result of my benchmarking with ZFS. The FHA code has been updated to allow configuring the tuning parameters from loader tunable variables in addition to sysctl variables. The read offset window calculation has been slightly modified as well. Instead of having separate bins, each file handle has a rolling window of bin_shift size. This minimizes glitches in throughput when shifting from one bin to another. sys/conf/files: Add nfs_fha_new.c and nfs_fha_old.c. Compile nfs_fha.c when either the old or the new NFS server is built. sys/fs/nfs/nfsport.h, sys/fs/nfs/nfs_commonport.c: Bring in changes from Rick Macklem to newnfs_realign that allow it to operate in blocking (M_WAITOK) or non-blocking (M_NOWAIT) mode. sys/fs/nfs/nfs_commonsubs.c, sys/fs/nfs/nfs_var.h: Bring in a change from Rick Macklem to allow telling nfsm_dissect() whether or not to wait for mallocs. sys/fs/nfs/nfsm_subs.h: Bring in changes from Rick Macklem to create a new nfsm_dissect_nonblock() inline function and NFSM_DISSECT_NONBLOCK() macro. sys/fs/nfs/nfs_commonkrpc.c, sys/fs/nfsclient/nfs_clkrpc.c: Add the malloc wait flag to a newnfs_realign() call. sys/fs/nfsserver/nfs_nfsdkrpc.c: Setup the new NFS server's RPC thread pool so that it will call the FHA code. Add the malloc flag argument to newnfs_realign(). Unstaticize newnfs_nfsv3_procid[] so that we can use it in the FHA code. sys/fs/nfsserver/nfs_nfsdsocket.c: In nfsrvd_dorpc(), add NFSPROC_WRITE to the list of RPC types that use the LK_SHARED lock type. sys/fs/nfsserver/nfs_nfsdport.c: In nfsd_fhtovp(), if we're starting a write, check to see whether the underlying filesystem supports shared writes. If not, upgrade the lock type from LK_SHARED to LK_EXCLUSIVE. sys/nfsserver/nfs_fha.c: Remove all code that is specific to the NFS server implementation. 
Anything that is server-specific is now accessed through a callback supplied by that server's FHA shim in the new softc. There are now separate sysctls and tunables for the FHA implementations for the old and new NFS servers. The new NFS server has its tunables under vfs.nfsd.fha, the old NFS server's tunables are under vfs.nfsrv.fha as before. In fha_extract_info(), use callouts for all server-specific code. Getting file handles and offsets is now done in the individual server's shim module. In fha_hash_entry_choose_thread(), change the way we decide whether two reads are in proximity to each other. Previously, the calculation was a simple shift operation to see whether the offsets were in the same power of 2 bucket. The issue was that there would be a bucket (and therefore thread) transition, even if the reads were in close proximity. When there is a thread transition, reads wind up going somewhat out of order, and ZFS gets confused. The new calculation simply tries to see whether the offsets are within 1 << bin_shift of each other. If they are, the reads will be sent to the same thread. The effect of this change is that for sequential reads, if the client doesn't exceed the max_reqs_per_nfsd parameter and the bin_shift is set to a reasonable value (22, or 4MB works well in my tests), the reads in any sequential stream will largely be confined to a single thread. Change fha_assign() so that it takes a softc argument. It is now called from the individual server's shim code, which will pass in the softc. Change fhe_stats_sysctl() so that it takes a softc parameter. It is now called from the individual server's shim code. Add the current offset to the list of things printed out about each active thread. Change the num_reads and num_writes counters in the fha_hash_entry structure to 32-bit values, and rename them num_rw and num_exclusive, respectively, to reflect their changed usage. 
Add an enable sysctl and tunable that allows the user to disable the FHA code (when vfs.XXX.fha.enable = 0). This is useful for before/after performance comparisons. nfs_fha.h: Move most structure definitions out of nfs_fha.c and into the header file, so that the individual server shims can see them. Change the default bin_shift to 22 (4MB) instead of 18 (256K). Allow unlimited commands per thread. sys/nfsserver/nfs_fha_old.c, sys/nfsserver/nfs_fha_old.h, sys/fs/nfsserver/nfs_fha_new.c, sys/fs/nfsserver/nfs_fha_new.h: Add shims for the old and new NFS servers to interface with the FHA code, and callbacks for the The shims contain all of the code and definitions that are specific to the NFS servers. They setup the server-specific callbacks and set the server name for the sysctl and loader tunable variables. sys/nfsserver/nfs_srvkrpc.c: Configure the RPC code to call fhaold_assign() instead of fha_assign(). sys/modules/nfsd/Makefile: Add nfs_fha.c and nfs_fha_new.c. sys/modules/nfsserver/Makefile: Add nfs_fha_old.c. Reviewed by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks
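The change to the read proximity test in fha_hash_entry_choose_thread() can be illustrated with a small Python model. It is a sketch based only on the description above: the function names are hypothetical, and whether the real comparison is strict or inclusive is an assumption.

```python
def same_bucket(offset_a, offset_b, bin_shift):
    """Old test: both offsets fall into the same power-of-2 bucket."""
    return (offset_a >> bin_shift) == (offset_b >> bin_shift)


def within_window(offset_a, offset_b, bin_shift):
    """New test: the offsets are within 1 << bin_shift of each other.

    Assumption: the comparison is strict (<); the message does not say.
    """
    return abs(offset_a - offset_b) < (1 << bin_shift)


# Two reads one byte apart that straddle a 256K boundary (bin_shift = 18).
a, b = (1 << 18) - 1, 1 << 18
old_result = same_bucket(a, b, 18)      # False: bucket transition
new_result = within_window(a, b, 18)    # True: still treated as adjacent
```

With the old test, adjacent reads that cross a bucket boundary move to another thread even though they are sequential; the rolling window keeps them together, which matches the stated goal of confining a sequential stream to one thread.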
|
#
dd603523 |
|
01-Feb-2013 |
Konstantin Belousov <kib@FreeBSD.org> |
Assert that the mbuf in the chain has sane length. The proper place for this check is somewhere in the network code, but this assertion has already proven useful in catching what seem to be driver bugs causing NFS to scramble random memory. Discussed with: rmacklem MFC after: 1 week
|
#
5055536e |
|
16-Jan-2013 |
John Baldwin <jhb@FreeBSD.org> |
Use the VA_UTIMES_NULL flag to detect when NULL was passed to utimes() instead of comparing the desired time against the current time as a heuristic. Reviewed by: rmacklem MFC after: 1 week
|
#
1f60bfd8 |
|
08-Dec-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
Move the NFSv4.1 client patches over from projects/nfsv4.1-client to head. I don't think the NFS client behaviour will change unless the new "minorversion=1" mount option is used. It includes basic NFSv4.1 support plus support for pNFS using the Files Layout only. All problems detected during an NFSv4.1 Bakeathon testing event in June 2012 have been resolved in this code and it has been tested against the NFSv4.1 server available to me. Although not reviewed, I believe that kib@ has looked at it.
|
#
eb1b1807 |
|
05-Dec-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually
|
#
c52005a3 |
|
19-Sep-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the NFSv4 client so that it can handle owner and owner_group strings that consist entirely of digits, interpreting them as the uid/gid number. This change was needed since new (>= 3.3) Linux servers reply with these strings by default. This change is mandated by the rfc3530bis draft. Reported on freebsd-stable@ under the Subject heading "Problem with Linux >= 3.3 as NFSv4 server" by Norbert Aschendorff on Aug. 20, 2012. Tested by: norbert.aschendorff at yahoo.de Reviewed by: jhb MFC after: 2 weeks
|
#
f7258644 |
|
07-Jan-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
opt_inet6.h was missing from some files in the new NFS subsystem. The effect of this was, for clients mounted via inet6 addresses, that the DRC cache would never have a hit in the server. It also broke NFSv4 callbacks when an inet6 address was the only one available in the client. This patch fixes the above, plus deletes opt_inet6.h from a couple of files it is not needed for. MFC after: 2 weeks
|
#
061c683c |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Revert revision 224079 as Rick pointed out that I would be calling VOP_PATHCONF without the vnode lock held. Implicitly approved by: zml (mentor)
|
#
a9285ae5 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Add DEXITCODE plumbing to NFS. Isilon has the concept of an in-memory exit-code ring that saves the last exit code of a function and allows for stack tracing. This is very helpful when debugging tough issues. This patch is essentially a no-op for BSD at this point, until we upstream the dexitcode logic itself. The patch adds DEXITCODE calls to every NFS function that returns an errno error code. A number of code paths were also reorganized to have single exit paths, to reduce code duplication. Submitted by: David Kwan <dkwan@isilon.com> Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
a9989634 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Simple find/replace of VOP_UNLOCK -> NFSVOPUNLOCK. This is done so that NFSVOPUNLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
98f234f3 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Simple find/replace of vn_lock -> NFSVOPLOCK. This is done so that NFSVOPLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
51c099f5 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Change loadattr and fillattr to ask the file system for the pathconf variable. Small modification where VOP_PATHCONF was being called directly. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
b008a72c |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Small acl patch to return the aclerror that comes back from nfsrv_dissectacl(). This fixes a problem where ATTRNOTSUPP was being returned instead of BADOWNER. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
ff29f3b2 |
|
27-May-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the new NFS client so that it handles NFSv4 state correctly during a forced dismount. This required that the exclusive and shared (refcnt) sleep lock functions check for MNTK_UMOUNTF before sleeping, so that they won't block while nfscl_umount() is getting rid of the state. As such, a "struct mount *" argument was added to the locking functions. I believe the only remaining case where a forced dismount can get hung in the kernel is when a thread is already attempting to do a TCP connect to a dead server when the krpc client structure called nr_client is NULL. This will only happen just after a "mount -u" with options that force a new TCP connection is done, so it shouldn't be a problem in practice. MFC after: 2 weeks
|
#
a09001a8 |
|
14-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the experimental NFSv4 server so that it uses VOP_PATHCONF() to determine if a file system supports NFSv4 ACLs. Since VOP_PATHCONF() must be called with a locked vnode, the function is called before nfsvno_fillattr() and the result is passed in as an extra argument. MFC after: 2 weeks
|
#
07c0c166 |
|
14-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the experimental NFSv4 server so that it handles crossing of server mount points properly. The functions nfsvno_fillattr() and nfsv4_fillattr() were modified to take the extra arguments that are the mount point, a flag to indicate that it is a file system root and the mounted on fileno. The mount point argument needs to be busy when nfsvno_fillattr() is called, since the vp argument is not locked. Reviewed by: kib MFC after: 2 weeks
|
#
8207db3e |
|
18-Jan-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the experimental NFSv4 server so that it uses VOP_ACCESSX() to check for VREAD_ACL instead of VOP_ACCESS(). MFC after: 3 days
|
#
5a12538b |
|
01-Jan-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for shared vnode locks for the Read operation in the experimental NFSv4 server. Reviewed by: kib MFC after: 2 weeks
|
#
17891d00 |
|
25-Dec-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the experimental NFS server so that it uses LK_SHARED for RPC operations when it can. Since VFS_FHTOVP() currently always gets an exclusively locked vnode and is usually called at the beginning of each RPC, the RPCs for a given vnode will still be serialized. As such, passing a lock type argument to VFS_FHTOVP() would be preferable to doing the vn_lock() with LK_DOWNGRADE after the VFS_FHTOVP() call. Reviewed by: kib MFC after: 2 weeks
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
2ec3f925 |
|
28-Aug-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
The timer routine in the experimental NFS server did not acquire the correct mutex when checking nfsv4root_lock. Although this could be fixed by adding mutex lock/unlock calls, zack.kirsch at isilon.com suggested a better fix that uses a non-blocking acquisition of a reference count on nfsv4root_lock. This fix allows the weird NFSLOCKSTATE(); NFSUNLOCKSTATE(); synchronization to be deleted. This patch applies this fix. Tested by: zack.kirsch at isilon.com MFC after: 2 weeks
|
#
066adacf |
|
13-Apr-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
MFC: r205941 This patch should fix handling of byte range locks locally on the server for the experimental nfs server. When enabled by setting vfs.newnfs.locallocks_enable to non-zero, the experimental nfs server will now acquire byte range locks on the file on behalf of NFSv4 clients, such that lock conflicts between the NFSv4 clients and processes running locally on the server, will be recognized and handled correctly.
|
#
a43fcbe3 |
|
30-Mar-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
This patch should fix handling of byte range locks locally on the server for the experimental nfs server. When enabled by setting vfs.newnfs.locallocks_enable to non-zero, the experimental nfs server will now acquire byte range locks on the file on behalf of NFSv4 clients, such that lock conflicts between the NFSv4 clients and processes running locally on the server, will be recognized and handled correctly. MFC after: 2 weeks
|
#
74991298 |
|
03-Dec-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Remove unneeded ifdefs. Reviewed by: rmacklem
|
#
c3e22f83 |
|
26-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the experimental nfs subsystem so that it builds with the current NFSv4 ACLs, as defined in sys/acl.h. It still needs a way to test a mount point for NFSv4 ACL support before it will work. Until then, the NFSHASNFS4ACL() macro just always returns 0. Approved by: kib (mentor)
|
#
dfd233ed |
|
11-May-2009 |
Attilio Rao <attilio@FreeBSD.org> |
Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and related parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (e.g. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. The VFS KPI is heavily changed by this commit, so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.
|
#
9ec7b004 |
|
04-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add the experimental nfs subtree to the kernel, that includes support for NFSv4 as well as NFSv2 and 3. It lives in 3 subdirs under sys/fs:
  nfs - functions that are common to the client and server
  nfsclient - a mutation of sys/nfsclient that calls generic functions to do RPCs and handle state. As such, it retains the buffer cache handling characteristics and vnode semantics that are found in sys/nfsclient, for the most part.
  nfsserver - the server. It includes a DRC designed specifically for NFSv4, that is used instead of the generic DRC in sys/rpc.
The build glue will be checked in later, so at this point, it consists of 3 new subdirs that should not affect kernel building. Approved by: kib (mentor)
|