History log of /freebsd-10-stable/sys/nfs/nfs_lock.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 227293 07-Nov-2011 ed

Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

This means that their use is restricted to a single C file.


# 216931 03-Jan-2011 rmacklem

Fix the nlm so that it no longer depends on the regular
nfs client and, as such, can be loaded for the experimental
nfs client without the regular client.

Reviewed by: jhb
MFC after: 2 weeks


# 214048 18-Oct-2010 rmacklem

Modify the NFS clients and the NLM so that the NLM can be used
by both clients. Since the NLM uses various fields of the
nfsmount structure, those fields were extracted and put in a
separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h.
This structure also has a function pointer for a function that
extracts the required information from the mount point and nfs vnode
for that particular client, for information stored differently by the
clients.

Reviewed by: jhb
MFC after: 2 weeks


# 210455 24-Jul-2010 rmacklem

Move sys/nfsclient/nfs_lock.c into sys/nfs and build it as a separate
module that can be used by both the regular and experimental nfs
clients. This fixes the problem reported by jh@ where /dev/nfslock
would be registered twice when both nfs clients were used.
I also defined the size of the lm_fh field to be the correct value,
as it should be the maximum size of an NFSv3 file handle.

Reviewed by: jh
MFC after: 2 weeks


# 195202 30-Jun-2009 dfr

Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.

Approved by: re


# 192578 22-May-2009 rwatson

Remove the unmaintained University of Michigan NFSv4 client from 8.x
prior to 8.0-RELEASE. Rick Macklem's new and more feature-rich NFSv234
client and server are replacing it.

Discussed with: rmacklem


# 184214 23-Oct-2008 des

Fix a number of style issues in the MALLOC / FREE commit. I've tried to
be careful not to fix anything that was already broken; the NFSv4 code is
particularly bad in this respect.


# 184205 23-Oct-2008 des

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# 178243 16-Apr-2008 kib

Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
with the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by: kris
Tested by: kris, pho
Discussed with: jeff, dfr
MFC after: 2 weeks


# 177633 26-Mar-2008 dfr

Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed
off to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
it will detect deadlocks caused by a lock request that covers more
than one blocking request. As required by the NLM protocol, all
deadlock detection happens synchronously - a user is guaranteed that
if a lock request isn't rejected immediately, the lock will
eventually be granted. The old system allowed for a 'deferred
deadlock' condition where a blocked lock request could wake up and
find that some other deadlock-causing lock owner had beaten them to
the lock.

* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks
for mutual exclusion. Local processes have no fairness advantage
compared to remote processes when contending to lock a region that
has just been unlocked - the local lock manager enforces a strict
first-come first-served model for both local and remote lockers.

Sponsored by: Isilon Systems
PR: 95247 107555 115524 116679
MFC after: 2 weeks


# 168931 21-Apr-2007 rwatson

Attempt to rationalize NFS privileges:

- Replace PRIV_NFSD with PRIV_NFS_DAEMON, add PRIV_NFS_LOCKD.

- Use PRIV_NFS_DAEMON in the NFS server.

- In the NFS client, move the privilege check from nfslockdans(), which
occurs every time a write is performed on /dev/nfslock, and instead do it
in nfslock_open() just once. This allows us to avoid checking the saved
uid for root, and just use the effective on open. Use PRIV_NFS_LOCKD.


# 161371 16-Aug-2006 thomas

Fix typos in comment.


# 154316 13-Jan-2006 rwatson

In nfs_dolock(), GC now under-used ioflg, rendered obsolete when we moved
from using a fifo to talk to rpc.lockd to using a special device node.

Noticed by: Coverity Prevent analysis tool
MFC after: 3 days


# 151897 31-Oct-2005 rwatson

Normalize a significant number of kernel malloc type names:

- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.


# 151695 26-Oct-2005 glebius

- Fix leak of struct nlminfo on process exit.
- Fix malloc type collision, that made the above problem
difficult to understand.

Reported by: Vladimir Sharun <sharun ukr.net>


# 138430 06-Dec-2004 phk

For reasons unknown, the nfs locking code used a fifo to send requests to
userland and a dedicated system call to get replies.

The vnode-bypass of fifos broke this into a panic.

Ditch all the magic and create a device /dev/nfslock instead, and
use that for both directions apart from the shorter path, this is
also faster because the device driver runs Giant free using the
vnode bypass.

Noticed by: marcel


# 122698 14-Nov-2003 alfred

University of Michigan's Citi NFSv4 kernel client code.

Submitted by: Jim Rees <rees@umich.edu>


# 118094 27-Jul-2003 phk

Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.


# 116185 11-Jun-2003 rwatson

Add the comment I meant to add about not passing in PCATCH to the
tsleep(). Note the XXX.


# 115415 30-May-2003 rwatson

rpc.lockd stability workaround: remove PCATCH from the tsleep() in
nfs_lock.c. Right now, if we permit a signal to interrupt the sleep,
we will slip the lock and no process on that client, the server, or
any other client will be able to acquire the lock. This can happen,
for example, if a user hits Ctrl-C or Ctrl-T while a process is
waiting for the lock. By removing PCATCH, we prevent that from
happening, at the cost of not permitting a user-requested lock abort:
also nasty. However, a user interface bug might be preferable to a
serious semantic bug, so we go with that for now.

We need to teach the rpc.lockd/kernel protocol how to abort lock
requests, and rpc.lockd how to handle aborted lock requests; patches
for the kernel bit are floating around, but no rpc.lockd bit yet.

Approved by: re (scottl)


# 114434 01-May-2003 des

Instead of recording the Unix time in a process when it starts, record the
uptime. Where necessary, convert it back to Unix time by adding boottime
to it. This fixes a potential problem in the accounting code, which would
compute the elapsed time incorrectly if the Unix time was stepped during
the lifetime of the process.


# 114216 29-Apr-2003 kan

Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on: standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>


# 112685 26-Mar-2003 rwatson

Add O_NONBLOCK to the vn_open_cred() flags for NFS client locking when
opening the POSIX fifo; convert ENXIO error returns to EOPNOTSUPP.

This improves handling of the case where the /var/run/lock fifo exists
but there is no listener: we immediately return EOPNOTSUPP rather
than blocking until a listener turns up. This could occur during a
diskless boot before rpc.lockd is loaded, or if the lock file persists
across a reboot following the disabling of rpc.lockd. This may have
suddenly started to occur due to fifo blocking fixes--previously it
looks like attempts to read on a fifo with no listener would time out
due to insufficient resources.

Reviewed by: alfred


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 107104 20-Nov-2002 alfred

reapply 1.26 through 1.28.

Approved by: re


# 107101 20-Nov-2002 alfred

forgot about 5.x freeze, backout 1.26 through 1.28 pending re@ appoval.


# 107100 20-Nov-2002 alfred

remove useless casts, unused macros and cleanup a line wrap.


# 107099 20-Nov-2002 alfred

comment and untwist error return logic


# 107098 20-Nov-2002 alfred

Remove an outdated comment complaining about exporting struct ucred
to userspace, I fixed it a while ago.


# 101947 15-Aug-2002 alfred

Remove a case of exposing 'struct ucred' to userspace. Use a struct xucred
for LOCKD_MSG instead.

Requested by: rwatson


# 101941 15-Aug-2002 rwatson

In order to better support flexible and extensible access control,
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:

- Change fo_read() and fo_write() to accept "active_cred" instead of
"cred", and change the semantics of consumers of fo_read() and
fo_write() to pass the active credential of the thread requesting
an operation rather than the cached file cred. The cached file
cred is still available in fo_read() and fo_write() consumers
via fp->f_cred. These changes largely in sys_generic.c.

For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:

- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
pipe_read/write() now authorize MAC using active_cred rather
than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
VOP_READ/WRITE() with fp->f_cred

Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred. Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not. If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.

Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.

These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.

Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 101744 12-Aug-2002 rwatson

Pass IO_NOMACCHECK to vn_rdwr() in the following checks to prevent
enforcement of MAC policy on the read or write operations:

- In ext2fs, don't enforce MAC on loop-back reads and writes supporting
directory read operations in lookup(), directory modifications in
rename(), directory write operations in mkdir(), symlink write
operations in symlink().

- In the NFS client locking code, perform vn_rdwr() on the NFS locking
socket without enforcing MAC, since the write is done on behalf of
the kernel NFS implementation rather than the user process.

- In UFS, don't enforce MAC on loop-back reads and writes supporting
directory read operations in lookup(), and symlink write operations
in symlink().

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 100134 15-Jul-2002 alfred

Add IPv6 support.

Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr>


# 93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 91420 27-Feb-2002 jhb

Use thread0.td_ucred instead of proc0.p_ucred. This change is cosmetic
and isn't strictly required. However, it lowers the number of false
positives found when grep'ing the kernel sources for p_ucred to ensure
proper locking.


# 91406 27-Feb-2002 jhb

Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.


# 86363 14-Nov-2001 rwatson

o Modify nfslockdans() to accept a thread reference instead of a proc
reference: with td->td_ucred, it will be desirable to authorize
based on td->td_ucred, rather than p->p_ucred.
o Since the same variable 'p' was later used with pfind() on the target
process for the wakeup, introduce a new local variable 'targetp'
to use instead.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 86278 11-Nov-2001 alfred

turn vn_open() into a wrapper around vn_open_cred() which allows
one to perform a vn_open using temporary/other/fake credentials.

Modify the nfs client side locking code to use vn_open_cred() passing
proc0's ucred instead of the old way which was to temporary raise
privs while running vn_open(). This should close the race hopefully.


# 85398 24-Oct-2001 rwatson

o Note an additional potential problem here: LOCKD_MSG directly exports
struct ucred to userland. In 5.0-CURRENT, it is desirable to instead
export struct xucred, as ucred contains mutexes, pointers, and other
kernel evil. I'll add it to my work queue.


# 85370 23-Oct-2001 rwatson

o Add two comments identifying problems with the current nfs_lock.c
implementation, so that the information doesn't get lost.
(1) /var/run/lock is looked up relative to the current thread's root
directory, but it's not clear that's desirable.
(2) A race condition associated with live credential modification on
a shared credential is present when privilege is granted for
the purposes of talking to /var/run/lock.


# 83651 18-Sep-2001 peter

Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 82213 23-Aug-2001 ache

Stupid error from my side in prev. commit: || -> &&


# 82204 23-Aug-2001 ache

Implement l_len<0 per POSIX check.
Check for valid l_whence too.


# 82194 23-Aug-2001 ache

Even better move: suppose that server is able to handle SEEK_END,
so check arguments for all but not SEEK_END case, leaving SEEK_END
handling for server


# 82193 23-Aug-2001 ache

Apparently SEEK_END locking not supported by NFS. Previous variant
returns EINVAL in that case, change it to EOPNOTSUPP.


# 82190 23-Aug-2001 ache

Move <machine/*> after <sys/*>

Pointed by: bde


# 82174 23-Aug-2001 ache

adv. lock:
detect off_t overflow _before_ it occurse and return EOVERFLOW instead of
EINVAL


# 77563 31-May-2001 jake

Unlock the process returned from pfind() if it does not return NULL.
This fixes a witness lock violation for nfssvc returning with locks
held.

Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr>
PR: kern/27776


# 77183 25-May-2001 rwatson

o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
pcred->pc_uidinfo, which was associated with the real uid, only rename
it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
original macro that pointed.
p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
we figure out locking and optimizations; generally speaking, this
means moving to a structure like this:
newcred = crdup(oldcred);
...
p->p_ucred = newcred;
crfree(oldcred);
It's not race-free, but better than nothing. There are also races
in sys_process.c, all inter-process authorization, fork, exec, and
exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
allocation.
o Clean up ktrcanset() to take into account changes, and move to using
suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
calls to better document current behavior. In a couple of places,
current behavior is a little questionable and we need to check
POSIX.1 to make sure it's "right". More commenting work still
remains to be done.
o Update credential management calls, such as crfree(), to take into
account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
change_euid()
change_egid()
change_ruid()
change_rgid()
change_svuid()
change_svgid()
In each case, the call now acts on a credential not a process, and as
such no longer requires more complicated process locking/etc. They
now assume the caller will do any necessary allocation of an
exclusive credential reference. Each is commented to document its
reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
processes and pcreds. Note that this authorization, as well as
CANSIGIO(), needs to be updated to use the p_cansignal() and
p_cansched() centralized authorization routines, as they currently
do not take into account some desirable restrictions that are handled
by the centralized routines, as well as being inconsistent with other
similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from: TrustedBSD Project
Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit


# 76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# 76117 29-Apr-2001 grog

Revert consequences of changes to mount.h, part 2.

Requested by: bde


# 75858 23-Apr-2001 grog

Correct #includes to work with fixed sys/mount.h.


# 75631 17-Apr-2001 alfred

Implement client side NFS locks.

Obtained from: BSD/os
Import Ok'd by: mckusick, jkh, motd on builder.freebsd.org