Cross Reference: /freebsd-current/sys/rpc/svc.h

History log of /freebsd-current/sys/rpc/svc.h
Revision	Date	Author	Comments
# a16ff32f	20-Mar-2024	John Baldwin <jhb@FreeBSD.org>	NFS: Request use of TCP_USE_DDP for in-kernel TCP sockets Since this is an optimization, ignore failures to enable the option. For the server side, defer enabling DDP until the first non-NULLPROC RPC is received. This allows TLS handling (which uses NULLPROC RPCs) to enable TLS offload first. Reviewed by: rmacklem Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44002
# 29363fb4	23-Nov-2023	Warner Losh <imp@FreeBSD.org>	sys: Remove ancient SCCS tags. Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
# 2ff63af9	16-Aug-2023	Warner Losh <imp@FreeBSD.org>	sys: Remove $FreeBSD$: one-line .h pattern Remove /^\s\+\s\$FreeBSD\$.$\n/
# 564ed8e8	22-Aug-2022	Rick Macklem <rmacklem@FreeBSD.org>	nfsd: Allow multiple instances of rpc.tlsservd During a discussion with someone working on NFS-over-TLS for a non-FreeBSD platform, we agreed that a single server daemon for TLS handshakes could become a bottleneck when an NFS server first boots, if many concurrent NFS-over-TLS connections are attempted. This patch modifies the kernel RPC code so that it can handle multiple rpc.tlsservd daemons. A separate commit currently under review as D35886 for the rpc.tlsservd daemon.
# bcd0e31d	28-Dec-2021	John Baldwin <jhb@FreeBSD.org>	sys/rpc: Use C99 fixed-width integer types. No functional change. Reviewed by: imp, emaste Differential Revision: https://reviews.freebsd.org/D33640
# 631504fb	03-Sep-2021	Gordon Bergling <gbe@FreeBSD.org>	Fix a common typo in source code comments - s/existant/existent/ MFC after: 3 days
# 20d728b5	09-Jul-2021	Mark Johnston <markj@FreeBSD.org>	rpc: Make function tables const No functional change intended. MFC after: 1 week Sponsored by: The FreeBSD Foundation
# e1a907a2	11-Jun-2021	Rick Macklem <rmacklem@FreeBSD.org>	krpc: Acquire ref count of CLIENT for backchannel use Michael Dexter <editor@callfortesting.org> reported a crash in FreeNAS, where the first argument to clnt_bck_svccall() was no longer valid. This argument is a pointer to the callback CLIENT structure, which is free'd when the associated NFSv4 ClientID is free'd. This appears to have occurred because a callback reply was still in the socket receive queue when the CLIENT structure was free'd. This patch acquires a reference count on the CLIENT that is not CLNT_RELEASE()'d until the socket structure is destroyed. This should guarantee that the CLIENT structure is still valid when clnt_bck_svccall() is called. It also adds a check for closed or closing to clnt_bck_svccall() so that it will not process the callback RPC reply message after the ClientID is free'd. Comments by: mav MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30153
# ab0c29af	21-Aug-2020	Rick Macklem <rmacklem@FreeBSD.org>	Add TLS support to the kernel RPC. An internet draft titled "Towards Remote Procedure Call Encryption By Default" describes how TLS is to be used for Sun RPC, with NFS as an intended use case. This patch adds client and server support for this to the kernel RPC, using KERN_TLS and upcalls to daemons for the handshake, peer reset and other non-application data record cases. The upcalls to the daemons use three fields to uniquely identify the TCP connection. They are the time.tv_sec, time.tv_usec of the connection establshment, plus a 64bit sequence number. The time fields avoid problems with re-use of the sequence number after a daemon restart. For the server side, once a Null RPC with AUTH_TLS is received, kernel reception on the socket is blocked and an upcall to the rpctlssd(8) daemon is done to perform the TLS handshake. Upon completion, the completion status of the handshake is stored in xp_tls as flag bits and the reply to the Null RPC is sent. For the client, if CLSET_TLS has been set, a new TCP connection will send the Null RPC with AUTH_TLS to initiate the handshake. The client kernel RPC code will then block kernel I/O on the socket and do an upcall to the rpctlscd(8) daemon to perform the handshake. If the upcall is successful, ct_rcvstate will be maintained to indicate if/when an upcall is being done. If non-application data records are received, the code does an upcall to the appropriate daemon, which will do a SSL_read() of 0 length to handle the record(s). When the socket is being shut down, upcalls are done to the daemons, so that they can perform SSL_shutdown() calls to perform the "peer reset". The rpctlssd(8) and rpctlscd(8) daemons require a patched version of the openssl library and, as such, will not be committed to head at this time. Although the changes done by this patch are fairly numerous, there should be no semantics change to the kernel RPC at this time. A future commit to the NFS code will optionally enable use of TLS for NFS.
# 51369649	20-Nov-2017	Pedro F. Giffuni <pfg@FreeBSD.org>	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
# 90f90687	14-Feb-2017	Andriy Gapon <avg@FreeBSD.org>	add svcpool_close to handle killed nfsd threads This patch adds a new function to the server krpc called svcpool_close(). It is similar to svcpool_destroy(), but does not free the data structures, so that the pool can be used again. This function is then used instead of svcpool_destroy(), svcpool_create() when the nfsd threads are killed. PR: 204340 Reported by: Panzura Approved by: rmacklem Obtained from: rmacklem MFC after: 1 week
# 6244c6e7	05-May-2016	Pedro F. Giffuni <pfg@FreeBSD.org>	sys/rpc: minor spelling fixes. No functional change.
# 3c42b5bf	31-Mar-2015	Garrett Wollman <wollman@FreeBSD.org>	Fix overflow bugs in and remove obsolete limit from kernel RPC implementation. The kernel RPC code, which is responsible for the low-level scheduling of incoming NFS requests, contains a throttling mechanism that prevents too much kernel memory from being tied up by NFS requests that are being serviced. When the throttle is engaged, the RPC layer stops servicing incoming NFS sockets, resulting ultimately in backpressure on the clients (if they're using TCP). However, this is a very heavy-handed mechanism as it prevents all clients from making any requests, regardless of how heavy or light they are. (Thus, when engaged, the throttle often prevents clients from even mounting the filesystem.) The throttle mechanism applies specifically to requests that have been received by the RPC layer (from a TCP or UDP socket) and are queued waiting to be serviced by one of the nfsd threads; it does not limit the amount of backlog in the socket buffers. The original implementation limited the total bytes of queued requests to the minimum of a quarter of (nmbclusters * MCLBYTES) and 45 MiB. The former limit seems reasonable, since requests queued in the socket buffers and replies being constructed to the requests in progress will all require some amount of network memory, but the 45 MiB limit is plainly ridiculous for modern memory sizes: when running 256 service threads on a busy server, 45 MiB would result in just a single maximum-sized NFS3PROC_WRITE queued per thread before throttling. Removing this limit exposed integer-overflow bugs in the original computation, and related bugs in the routines that actually account for the amount of traffic enqueued for service threads. The old implementation also attempted to reduce accounting overhead by batching updates until each queue is fully drained, but this is prone to livelock, resulting in repeated accumulate-throttle-drain cycles on a busy server. Various data types are changed to long or unsigned long; explicit 64-bit types are not used due to the unavailability of 64-bit atomics on many 32-bit platforms, but those platforms also cannot support nmbclusters large enough to cause overflow. This code (in a 10.1 kernel) is presently running on production NFS servers at CSAIL. Summary of this revision: * Removes 45 MiB limit on requests queued for nfsd service threads * Fixes integer-overflow and signedness bugs * Avoids unnecessary throttling by not deferring accounting for completed requests Differential Revision: https://reviews.freebsd.org/D2165 Reviewed by: rmacklem, mav MFC after: 30 days Relnotes: yes Sponsored by: MIT Computer Science & Artificial Intelligence Laboratory
# c59e4cc3	01-Jul-2014	Rick Macklem <rmacklem@FreeBSD.org>	Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server. MFC after: 1 month
# b563304c	08-Jun-2014	Alexander Motin <mav@FreeBSD.org>	Split RPC pool threads into number of smaller semi-isolated groups. Old design with unified thread pool was good from the point of thread utilization. But single pool-wide mutex became huge congestion point for systems with many CPUs. To reduce the congestion create several thread groups within a pool (one group for every 6 CPUs and 12 threads), each group with own mutex. Each connection during its registration is assigned to one of the groups in round-robin fashion. File affinify code may still move requests between the groups, but otherwise groups are self-contained. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
# b5d7fb73	08-Jun-2014	Alexander Motin <mav@FreeBSD.org>	Remove st_idle variable, duplicating st_xprt. MFC after: 2 weeks
# b776fb2d	08-Jun-2014	Alexander Motin <mav@FreeBSD.org>	Introduce new per-thread lock to protect the list of requests. This allows to slightly simplify svc_run_internal() code: if we processed all the requests in a queue, then we know that new one will not appear. MFC after: 2 weeks
# bcea84bd	08-Jan-2014	Peter Wemm <peter@FreeBSD.org>	Don't expose svc_loss_reg / _unreg to userland as they're kernel-only additions from r260229 and the SVCPOOL type doesn't exist in userland.
# 0979970a	05-Jan-2014	Alexander Motin <mav@FreeBSD.org>	Fix NULL dereference panic on UDP requests introduced in r260229.
# c809a67a	04-Jan-2014	Alexander Motin <mav@FreeBSD.org>	Replace locks added in r260229 to protect sequence counters with atomics. New algorithm does not create additional lock congestion, while some races it includes should not be a problem. Those races may keep requests in DRC cache for some more time by returning ACK position smaller then actual, but it still should be able to drop thems when proper ACK finally read. Races of the original algorithm based on TCP seq number were worse because they happened when reply sequence number were recorded. After that even correctly read ACKs could not clean DRC sometimes.
# d473bac7	03-Jan-2014	Alexander Motin <mav@FreeBSD.org>	Rework NFS Duplicate Request Cache cleanup logic. - Introduce additional hash to group requests by hash of sockref. This allows to process TCP acknowledgements without looping though all the cache, and as result allows to do it every time. - Indroduce additional callbacks to notify application layer about sockets disconnection. Without this last few requests processed just before socket disconnection never processed their ACKs and stuck in cache for many hours. - Implement transport-specific method for tracking reply acknowledgements. New implementation does not cross multiple stack layers to get the data and does not have race conditions that previously made some requests stuck in cache. This could be done more efficiently at sockbuf layer, but that would broke some KBIs, while I don't know other consumers for it aside NFS. - Instead of traversing all DRC twice per request, run cleaning only once per request, and except in some conditions traverse only single hash slot at a time. Together this limits NFS DRC growth only to situations of real connectivity problems. If network is working well, and so all replies are acknowledged, cache remains almost empty even after hours of heavy load. Without this change on the same test cache was growing to many thousand requests even with perfectly working local network. As another result this reduces CPU time spent on the DRC handling during SPEC NFS benchmark from about 10% to 0.5%. Sponsored by: iXsystems, Inc.
# f8fb069d	30-Dec-2013	Alexander Motin <mav@FreeBSD.org>	Move most of NFS file handle affinity code out of the heavily congested global RPC thread pool lock and protect it with own set of locks. On synthetic benchmarks this improves peak NFS request rate by 40%.
# 5c42b9dc	29-Dec-2013	Alexander Motin <mav@FreeBSD.org>	Introduce xprt_inactive_self() -- variant for use when sure that port is assigned to thread. For example, withing receive handlers. In that case the function reduces to single assignment and can avoid locking.
# ba981145	20-Dec-2013	Alexander Motin <mav@FreeBSD.org>	Remove several linear list traversals per request from RPC server code. Do not insert active ports into pool->sp_active list if they are success- fully assigned to some thread. This makes that list include only ports that really require attention, and so traversal can be reduced to simple taking the first one. Remove idle thread from pool->sp_idlethreads list when assigning some work (port of requests) to it. That again makes possible to replace list traversals with simple taking the first element.
# 2e322d37	25-Nov-2013	Hiroki Sato <hrs@FreeBSD.org>	Replace Sun RPC license in TI-RPC library with a 3-clause BSD license, with the explicit permission of Sun Microsystems in 2009.
# e2adc47d	07-Dec-2012	Rick Macklem <rmacklem@FreeBSD.org>	Add support for backchannels to the kernel RPC. Backchannels are used by NFSv4.1 for callbacks. A backchannel is a connection established by the client, but used for RPCs done by the server on the client (callbacks). As a result, this patch mixes some client side calls in the server side and vice versa. Some definitions in the .c files were extracted out into a file called krpc.h, so that they could be included in multiple .c files. This code has been in projects/nfsv4.1-client for some time. Although no one has given it a formal review, I believe kib@ has taken a look at it.
# a7d5f7eb	19-Oct-2010	Jamie Gritton <jamie@FreeBSD.org>	A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
# a4fa5e6d	04-Jun-2009	Rick Macklem <rmacklem@FreeBSD.org>	Fix two races in the server side krpc w.r.t upcalls: Add a flag so that soupcall_clear() is only called once to cancel an upcall. Move the test for xprt_registered in the upcall down to after the mtx_lock() of the pool mutex, to catch the case where it is unregistered while the upcall is waiting for the mutex. Also, move the mtx_destroy() of the pool mutex to after SVC_RELEASE(), so that it isn't destroyed before the upcalls are disabled. Reviewed by: dfr, jhb Tested by: pho Approved by: kib (mentor)
# 201e7488	16-Apr-2009	Rick Macklem <rmacklem@FreeBSD.org>	Added a field to the SVCXPRT structure that the nfsv4 server can use to identify if the socket is the same one that a cached request came in on. It is set by nfsrvd_addsock() to a unique value generated by incrementing an unsigned 64bit static variable for each assignment and then the value of xp_sockref is tested to see if it is equal to the value that was saved with the cached reply. Submitted by: rmacklem Reviewed by: dfr Approved by: kib (mentor)
# a9148abd	03-Nov-2008	Doug Rabson <dfr@FreeBSD.org>	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month
# d7f03759	19-Oct-2008	Ulf Lilleengen <lulf@FreeBSD.org>	- Import the HEAD csup code which is the basis for the cvsmode work.
# dfdcada3	26-Mar-2008	Doug Rabson <dfr@FreeBSD.org>	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks