Cross Reference: /freebsd-current/sys/rpc/clnt

History log of /freebsd-current/sys/rpc/clnt_vc.c
Revision	Date	Author	Comments
# 4ba444de	26-Apr-2024	Rick Macklem <rmacklem@FreeBSD.org>	krpc: Ref cnt the client structures for TLS upcalls A crash occurred during testing, where the client structures had already been free'd when the upcall thread tried to lock them. This patch acquires a reference count on both of the structures and these are released when the upcall is done, so that the structures cannot be free'd prematurely. This happened because the testing is done over a very slow vpn. Found during a IETF bakeathon testing event this week. MFC after: 5 days
# e205fd31	09-Apr-2024	Gleb Smirnoff <glebius@FreeBSD.org>	rpc: use new macros to lock socket buffers Fixes: d80a97def9a1db6f07f5d2e68f7ad62b27918947
# f79a8585	30-Jan-2024	Gleb Smirnoff <glebius@FreeBSD.org>	sockets: garbage collect SS_ISCONFIRMING Fixes: 8df32b19dee92b5eaa4b488ae78dca6accfcb38e
# 29363fb4	23-Nov-2023	Warner Losh <imp@FreeBSD.org>	sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
# 685dc743	16-Aug-2023	Warner Losh <imp@FreeBSD.org>	sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s]__FBSDID$"\$FreeBSD\$"$;?\s*\n/
# 82512c17	14-Oct-2022	Rick Macklem <rmacklem@FreeBSD.org>	clnt_vc.c: Replace msleep() with pause() to avoid assert panic An msleep() in clnt_vc.c used a global "fake_wchan" wchan argument along with the mutex in a CLIENT structure. As such, it was possible to use different mutexes for the same wchan and cause a panic assert. Since this is in a rarely executed code path, the assert panic was only recently observed. Since "fake_wchan" never gets a wakeup, this msleep() can be replaced with a pause() to avoid the panic assert, which is what this patch does. Reviewed by: kib, markj MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D36977
# 0b4f2ab0	15-May-2022	Rick Macklem <rmacklem@FreeBSD.org>	krpc: Fix NFS-over-TLS for KTLS1.3 When NFS-over-TLS uses KTLS1.3, the client can receive post-handshake handshake records. These records can be safely thown away, but are not handled correctly via the rpctls_ct_handlerecord() upcall to the daemon. Commit 373511338d95 changed soreceive_generic() so that it will only return ENXIO for Alert records when MSG_TLSAPPDATA is specified. As such, the post-handshake handshake records will be returned to the krpc. This patch modifies the krpc so that it will throw these records away, which seems sufficient to make NFS-over-TLS work with KTLS1.3. This change has no effect on the use of KTLS1.2, since it does not generate post-handshake handshake records. MFC after: 2 weeks
# 43283184	12-May-2022	Gleb Smirnoff <glebius@FreeBSD.org>	sockets: use socket buffer mutexes in struct socket directly Since c67f3b8b78e the sockbuf mutexes belong to the containing socket, and socket buffers just point to it. In 74a68313b50 macros that access this mutex directly were added. Go over the core socket code and eliminate code that reaches the mutex by dereferencing the sockbuf compatibility pointer. This change requires a KPI change, as some functions were given the sockbuf pointer only without any hint if it is a receive or send buffer. This change doesn't cover the whole kernel, many protocols still use compatibility pointers internally. However, it allows operation of a protocol that doesn't use them. Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35152
# 77bc5890	04-Apr-2022	Warner Losh <imp@FreeBSD.org>	clnt_vc_destroy: eliminiate write only variable stat Sponsored by: Netflix
# e3ba94d4	09-Nov-2021	John Baldwin <jhb@FreeBSD.org>	Don't require the socket lock for sorele(). Previously, sorele() always required the socket lock and dropped the lock if the released reference was not the last reference. Many callers locked the socket lock just before calling sorele() resulting in a wasted lock/unlock when not dropping the last reference. Move the previous implementation of sorele() into a new sorele_locked() function and use it instead of sorele() for various places in uipc_socket.c that called sorele() while already holding the socket lock. The sorele() macro now uses refcount_release_if_not_last() try to drop the socket reference without locking the socket. If that shortcut fails, it locks the socket and calls sorele_locked(). Reviewed by: kib, markj Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D32741
# 20d728b5	09-Jul-2021	Mark Johnston <markj@FreeBSD.org>	rpc: Make function tables const No functional change intended. MFC after: 1 week Sponsored by: The FreeBSD Foundation
# ab0c29af	21-Aug-2020	Rick Macklem <rmacklem@FreeBSD.org>	Add TLS support to the kernel RPC. An internet draft titled "Towards Remote Procedure Call Encryption By Default" describes how TLS is to be used for Sun RPC, with NFS as an intended use case. This patch adds client and server support for this to the kernel RPC, using KERN_TLS and upcalls to daemons for the handshake, peer reset and other non-application data record cases. The upcalls to the daemons use three fields to uniquely identify the TCP connection. They are the time.tv_sec, time.tv_usec of the connection establshment, plus a 64bit sequence number. The time fields avoid problems with re-use of the sequence number after a daemon restart. For the server side, once a Null RPC with AUTH_TLS is received, kernel reception on the socket is blocked and an upcall to the rpctlssd(8) daemon is done to perform the TLS handshake. Upon completion, the completion status of the handshake is stored in xp_tls as flag bits and the reply to the Null RPC is sent. For the client, if CLSET_TLS has been set, a new TCP connection will send the Null RPC with AUTH_TLS to initiate the handshake. The client kernel RPC code will then block kernel I/O on the socket and do an upcall to the rpctlscd(8) daemon to perform the handshake. If the upcall is successful, ct_rcvstate will be maintained to indicate if/when an upcall is being done. If non-application data records are received, the code does an upcall to the appropriate daemon, which will do a SSL_read() of 0 length to handle the record(s). When the socket is being shut down, upcalls are done to the daemons, so that they can perform SSL_shutdown() calls to perform the "peer reset". The rpctlssd(8) and rpctlscd(8) daemons require a patched version of the openssl library and, as such, will not be committed to head at this time. Although the changes done by this patch are fairly numerous, there should be no semantics change to the kernel RPC at this time. A future commit to the NFS code will optionally enable use of TLS for NFS.
# b94b9a80	20-Jun-2020	Rick Macklem <rmacklem@FreeBSD.org>	Fix up a comment added by r362455.
# 4302e8b6	20-Jun-2020	Rick Macklem <rmacklem@FreeBSD.org>	Modify the way the client side krpc does soreceive() for TCP. Without this patch, clnt_vc_soupcall() first does a soreceive() for 4 bytes (the Sun RPC over TCP record mark) and then soreceive(s) for the RPC message. This first soreceive() almost always results in an mbuf allocation, since having the 4byte record mark in a separate mbuf in the socket rcv queue is unlikely. This is somewhat inefficient and rather odd. It also will not work for the ktls rx, since the latter returns a TLS record for each soreceive(). This patch replaces the above with code similar to what the server side of the krpc does for TCP, where it does a soreceive() for as much data as possible and then parses RPC messages out of the received data. A new field of the TCP socket structure called ct_raw is the list of received mbufs that the RPC message(s) are parsed from. I think this results in cleaner code and is needed for support of nfs-over-tls. It also fixes the code for the case where a server sends an RPC message in multiple RPC message fragments. Although this is allowed by RFC5531, no extant NFS server does this. However, it is probably good to fix this in case some future NFS server does do this.
# 51369649	20-Nov-2017	Pedro F. Giffuni <pfg@FreeBSD.org>	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
# dfd174d6	06-May-2017	Rick Macklem <rmacklem@FreeBSD.org>	Fix the client side krpc from doing TCP reconnects for ERESTART from sosend(). When sosend() replies ERESTART in the client side krpc, it indicates that the RPC message hasn't yet been sent and that the send queue is full or locked while a signal is posted for the process. Without this patch, this would result in a RPC_CANTSEND reply from clnt_vc_call(), which would cause clnt_reconnect_call() to create a new TCP transport connection. For most NFS servers, this wasn't a serious problem, although it did imply retries of outstanding RPCs, which could possibly have missed the DRC. For an NFSv4.1 mount to AmazonEFS, this caused a serious problem, since AmazonEFS often didn't retain the NFSv4.1 session and would reply with NFS4ERR_BAD_SESSION. This implies to the client a crash/reboot which requires open/lock state recovery. Three options were considered to fix this: - Return the ERESTART all the way up to the system call boundary and then have the system call redone. This is fraught with risk, due to convoluted code paths, asynchronous I/O RPCs etc. cperciva@ worked on this, but it is still a work in prgress and may not be feasible. - Set SB_NOINTR for the socket buffer. This fixes the problem, but makes the sosend() completely non interruptible, which kib@ considered inappropriate. It also would break forced dismount when a thread was blocked in sosend(). - Modify the retry loop in clnt_vc_call(), so that it loops for this case for up to 15sec. Testing showed that the sosend() usually succeeded by the 2nd retry. The extreme case observed was 111 loop iterations, or about 100msec of delay. This third alternative is what is implemented in this patch, since the change is: - localized - straightforward - forced dismount is not broken by it. This patch has been tested by cperciva@ extensively against AmazonEFS. Reported by: cperciva Tested by: cperciva MFC after: 2 weeks
# 34f1fddb	10-Apr-2017	Rick Macklem <rmacklem@FreeBSD.org>	Fix a crash during unmount of an NFSv4.1 mount. Larry Rosenman reported a crash on freebsd-current@ which was caused by a premature release of the krpc backchannel socket structure. I believe this was caused by a race between the SVC_RELEASE() in clnt_vc.c and the xprt_unregister() in the higher layer (clnt_rc.c), which tried to lock the mutex in the xprt structure and crashed. This patch fixes this by removing the xprt_unregister() in the clnt_vc layer and allowing this to always be done by the clnt_rc (higher reconnect layer). Reported by: ler@lerctr.org Tested by: ler@letctr.org MFC after: 2 weeks
# 7d3db235	11-Jul-2016	Enji Cooper <ngie@FreeBSD.org>	Deobfuscate cleanup path in clnt_vc_create(..) Similar to r300836, r301800, and r302550, cl and ct will always be non-NULL as they're allocated using the mem_alloc routines, which always use `malloc(..., M_WAITOK)`. MFC after: 1 week Reported by: Coverity CID: 1007342 Sponsored by: EMC / Isilon Storage Division
# 6244c6e7	05-May-2016	Pedro F. Giffuni <pfg@FreeBSD.org>	sys/rpc: minor spelling fixes. No functional change.
# cfa6009e	12-Nov-2014	Gleb Smirnoff <glebius@FreeBSD.org>	In preparation of merging projects/sendfile, transform bare access to sb_cc member of struct sockbuf to a couple of inline functions: sbavail() and sbused() Right now they are equal, but once notion of "not ready socket buffer data", will be checked in, they are going to be different. Sponsored by: Netflix Sponsored by: Nginx, Inc.
# c3e2c655	02-May-2014	Christian Brueffer <brueffer@FreeBSD.org>	Properly free resources in case of error. CID: 1007032 Found with: Coverity Prevent(tm) MFC after: 2 weeks
# a6132f60	24-Dec-2013	Dimitry Andric <dim@FreeBSD.org>	Remove some unused static const strings under sys/rpc, which have never been used since the initial commit (r177633). MFC after: 3 days
# 2e322d37	25-Nov-2013	Hiroki Sato <hrs@FreeBSD.org>	Replace Sun RPC license in TI-RPC library with a 3-clause BSD license, with the explicit permission of Sun Microsystems in 2009.
# 3b14c753	13-Mar-2013	John Baldwin <jhb@FreeBSD.org>	Revert 195703 and 195821 as this special stop handling in NFS is now implemented via VFCF_SBDRY rather than passing PBDRY to individual sleep calls.
# bd54830b	11-Mar-2013	Gleb Smirnoff <glebius@FreeBSD.org>	Use m_get(), m_gethdr() and m_getcl() instead of historic macros. Sponsored by: Nginx, Inc.
# e2adc47d	07-Dec-2012	Rick Macklem <rmacklem@FreeBSD.org>	Add support for backchannels to the kernel RPC. Backchannels are used by NFSv4.1 for callbacks. A backchannel is a connection established by the client, but used for RPCs done by the server on the client (callbacks). As a result, this patch mixes some client side calls in the server side and vice versa. Some definitions in the .c files were extracted out into a file called krpc.h, so that they could be included in multiple .c files. This code has been in projects/nfsv4.1-client for some time. Although no one has given it a formal review, I believe kib@ has taken a look at it.
# eb1b1807	05-Dec-2012	Gleb Smirnoff <glebius@FreeBSD.org>	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually
# 0c2222ba	02-Oct-2012	Pedro F. Giffuni <pfg@FreeBSD.org>	libtirpc: be sure to free cl_netid and cl_tp When creating a client with clnt_tli_create, it uses strdup to copy strings for these fields if nconf is passed in. clnt_dg_destroy frees these strings already. Make sure clnt_vc_destroy frees them in the same way. This change matches the reference (OpenSolaris) implementation. Tested by: David Wolfskill Obtained from: Bull GNU/Linux NFSv4 Project (libtirpc) MFC after: 2 weeks
# c148237d	23-Sep-2012	Pedro F. Giffuni <pfg@FreeBSD.org>	Partial revert of r239963: The following change caused rpc.lockd to exit after startup: ____ libtirpc: be sure to free cl_netid and cl_tp When creating a client with clnt_tli_create, it uses strdup to copy strings for these fields if nconf is passed in. clnt_dg_destroy frees these strings already. Make sure clnt_vc_destroy frees them in the same way. ____ MFC after: 3 days Reported by: David Wolfskill Tested by: David Wolfskill
# 43981b6c	31-Aug-2012	Pedro F. Giffuni <pfg@FreeBSD.org>	Bring some changes from Bull's NFSv4 libtirpc implementation. We especifically ignored the glibc compatibility changes but this should help interaction with Solaris and Linux. ____ Fixed infinite loop in svc_run() author Steve Dickson Tue, 10 Jun 2008 12:35:52 -0500 (13:35 -0400) Fixed infinite loop in svc_run() ____ __rpc_taddr2uaddr_af() assumes the netbuf to always have a non-zero data. This is a bad assumption and can lead to a seg-fault. This patch adds a check for zero length and returns NULL when found. author Steve Dickson Mon, 27 Oct 2008 11:46:54 -0500 (12:46 -0400) ____ Changed clnt_spcreateerror() to return clearer and more concise error messages. author Steve Dickson Thu, 20 Nov 2008 08:55:31 -0500 (08:55 -0500) ____ Converted all uid and gid variables of the type uid_t and gid_t. author Steve Dickson Wed, 28 Jan 2009 12:44:46 -0500 (12:44 -0500) ____ libtirpc: set r_netid and r_owner in __rpcb_findaddr_timed These fields in the rpcbind GETADDR call are being passed uninitialized to CLNT_CALL. In the case of x86_64 at least, this usually leads to a segfault. On x86, it sometimes causes segfaults and other times causes garbage to be sent on the wire. rpcbind generally ignores the r_owner field for calls that come in over the wire, so it really doesn't matter what we send in that slot. We just need to send something. The reference implementation from Sun seems to send a blank string. Have ours follow suit. author Jeff Layton Fri, 13 Mar 2009 11:44:16 -0500 (12:44 -0400) ____ libtirpc: be sure to free cl_netid and cl_tp When creating a client with clnt_tli_create, it uses strdup to copy strings for these fields if nconf is passed in. clnt_dg_destroy frees these strings already. Make sure clnt_vc_destroy frees them in the same way. author Jeff Layton Fri, 13 Mar 2009 11:47:36 -0500 (12:47 -0400) Obtained from: Bull GNU/Linux NFSv4 Project MFC after: 3 weeks
# 7b67bd9f	27-Apr-2011	Rick Macklem <rmacklem@FreeBSD.org>	This patch is believed to fix a problem in the kernel rpc for non-interruptible NFS mounts, where a kernel thread will seem to be stuck sleeping on "rpccon". The msleep() in clnt_vc_create() that was waiting to a TCP connect to complete would return ERESTART, since PCATCH was specified. Then the tsleep() in clnt_reconnect_call() would sleep for 1 second and then try again and again and... The patch changes the msleep() in clnt_vc_create() so it only sets the PCATCH flag for interruptible cases. Tested by: pho Reviewed by: jhb MFC after: 2 weeks
# 5e8eb3cd	12-Apr-2011	Rick Macklem <rmacklem@FreeBSD.org>	Fix a couple of mbuf leaks introduced by r217242. I do not believe that these leaks had a practical impact, since the situations in which they would have occurred would have been extremely rare. MFC after: 2 weeks
# 1fb51a12	16-Feb-2011	Bjoern A. Zeeb <bz@FreeBSD.org>	Mfp4 CH=177274,177280,177284-177285,177297,177324-177325 VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147. While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix. The current expectations are documented at the beginning of uipc_socket.c along with the other information there. Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks
# 2a1e0fb4	10-Jan-2011	Rick Macklem <rmacklem@FreeBSD.org>	Fix a bug in the client side krpc where it was, sometimes erroneously, assumed that 4 bytes of data were in the first mbuf of a list by replacing the bcopy() with m_copydata(). Also, replace the uses of m_pullup(), which can fail for reasons other than not enough data, with m_copydata(). For the cases where it isn't known that there is enough data in the mbuf list, check first via m_len and m_length(). This is believed to fix a problem reported by dpd at dpdtech.com and george+freebsd at m5p.com. Reviewed by: jhb MFC after: 8 days
# a7d5f7eb	19-Oct-2010	Jamie Gritton <jamie@FreeBSD.org>	A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
# cec077bc	12-Oct-2010	Rick Macklem <rmacklem@FreeBSD.org>	Fix the krpc so that it can handle NFSv3,UDP mounts with a read/write data size greater than 8192. Since soreserve(so, 2561024, 2561024) would always fail for the default value of sb_max, modify clnt_dg.c so that it uses the calculated values and checks for an error return from soreserve(). Also, add a check for error return from soreserve() to clnt_vc.c and change __rpc_get_t_size() to use sb_max_adj instead of the bogus maxsize == 256*1024. PR: kern/150910 Reviewed by: jhb MFC after: 2 weeks
# 43bd8298	15-Nov-2009	Rick Macklem <rmacklem@FreeBSD.org>	MFC: r199053 Add a check for the connection being shut down to the krpc client just before queuing a request for the connection. The code already had a check for the connection being shut down while the request was queued, but not one for the shut down having been initiated by the server before the request was in the queue. This fixes some cases of problems w.r.t. reconnecting to a NFS server that drops inactive TCP connections. Tested by: Olaf Seibert, Daniel Braniss Reviewed by: dfr
# f9917533	08-Nov-2009	Rick Macklem <rmacklem@FreeBSD.org>	Add a check for the connection being shut down to the krpc client just before queuing a request for the connection. The code already had a check for the connection being shut down while the request was queued, but not one for the shut down having been initiated by the server before the request was in the queue. This appears to fix the problem of slow reconnects against an NFS server that drops inactive connections reported by Olaf Seibert, but does not fix the case where the FreeBSD client generates RST segments at about the same time as ACKs. This is still a problem that is being investigated. This patch does not cause a regression for this case. Tested by: Olaf Seibert, Daniel Braniss Reviewed by: dfr MFC after: 5 days
# 83864c81	28-Aug-2009	Marko Zec <zec@FreeBSD.org>	MFC r196503: Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet context inside the RPC code. Temporarily set td's cred to mount's cred before calling socreate() via __rpc_nconf2socket(). Submitted by: rmacklem (in part) Reviewed by: rmacklem, rwatson Discussed with: dfr, bz Approved by: re (rwatson), julian (mentor) Approved by: re (rwatson)
# 0348c661	24-Aug-2009	Marko Zec <zec@FreeBSD.org>	Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet context inside the RPC code. Temporarily set td's cred to mount's cred before calling socreate() via __rpc_nconf2socket(). Submitted by: rmacklem (in part) Reviewed by: rmacklem, rwatson Discussed with: dfr, bz Approved by: re (rwatson), julian (mentor) MFC after: 3 days
# b35687df	14-Jul-2009	Konstantin Belousov <kib@FreeBSD.org>	Use PBDRY flag for msleep(9) in NFS and NLM when sleeping thread owns kernel resources that block other threads, like vnode locks. The SIGSTOP sent to such thread (process, rather) shall not stop it until thread releases the resources. Tested by: pho Reviewed by: jhb Approved by: re (kensmith)
# 3144f812	04-Jun-2009	Rick Macklem <rmacklem@FreeBSD.org>	Fix upcall races in the client side krpc. For the client side upcall, holding SOCKBUF_LOCK() isn't sufficient to guarantee that there is no upcall in progress, since SOCKBUF_LOCK() is released/re-acquired in the upcall. An upcall reference counter was added to the upcall structure that is incremented at the beginning of the upcall and decremented at the end of the upcall. As such, a reference count == 0 when holding the SOCKBUF_LOCK() guarantees there is no upcall in progress. Add a function that is called just after soupcall_clear(), which waits until the reference count == 0. Also, move the mtx_destroy() down to after soupcall_clear(), so that the mutex is not destroyed before upcalls are done. Reviewed by: dfr, jhb Tested by: pho Approved by: kib (mentor)
# 74fb0ba7	01-Jun-2009	John Baldwin <jhb@FreeBSD.org>	Rework socket upcalls to close some races with setup/teardown of upcalls. - Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held; however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then the soisconnected() will be invoked on the socket after the socket buffer lock is dropped. - A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear(). - To simplify locking, each socket buffer now has a separate upcall. - When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED. - All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket. - The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowlege of socket upcall internals aside from their return value. - The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked. - The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call. Discussed with: rwatson, rmacklem
# a9ccfd56	11-Nov-2008	Doug Rabson <dfr@FreeBSD.org>	Add a missing call to mtx_destroy().
# a9148abd	03-Nov-2008	Doug Rabson <dfr@FreeBSD.org>	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month
# d7f03759	19-Oct-2008	Ulf Lilleengen <lulf@FreeBSD.org>	- Import the HEAD csup code which is the basis for the cvsmode work.
# c675522f	26-Jun-2008	Doug Rabson <dfr@FreeBSD.org>	Re-implement the client side of rpc.lockd in the kernel. This implementation provides the correct semantics for flock(2) style locks which are used by the lockf(1) command line tool and the pidfile(3) library. It also implements recovery from server restarts and ensures that dirty cache blocks are written to the server before obtaining locks (allowing multiple clients to use file locking to safely share data). Sponsored by: Isilon Systems PR: 94256 MFC after: 2 weeks
# ee31b83a	28-Mar-2008	Doug Rabson <dfr@FreeBSD.org>	Minor changes to improve compatibility with older FreeBSD releases.
# dfdcada3	26-Mar-2008	Doug Rabson <dfr@FreeBSD.org>	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks