History log of /freebsd-current/lib/libprocstat/libprocstat.c
Revision Date Author Comments
# 1a8d1764 29-Mar-2024 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: fully retire inp_ppcb pointer

Before a protocol specific control block started to embed inpcb in self
(see 0aa120d52f3c, e68b3792440c, 483fe96511ec) this pointer used to point
at it.

Retain kf_sock_inpcb field in the struct kinfo_file in <sys/user.h>. The
exp-run detected a minimal use of the field in ports:
* sysutils/lsof - patched upstream
* net-mgmt/netdata - patch accepted upstream
* emulators/qemu-user-static - upstream master branch seems not using
the field anymore
We can keep the field around for some time, but eventually it may be
reused for something else.

PR: 277659 (exp-run)
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D44491


# a2f733ab 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

lib: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 248fe3d3 16-Oct-2023 Brooks Davis <brooks@FreeBSD.org>

libprocstat: improve conditional for 32-bit compat

Include support for translating 32-bit auxv vectors on non-64-bit
platforms that aren't riscv (which has no 32-bit ABI support and
probably never will).

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42201


# 8f06fabe 16-Oct-2023 Brooks Davis <brooks@FreeBSD.org>

libprocstat: copy all the 32-bit auxv entries

Use source struct size not the destination struct size so we copy all
the auxv entries, not just the first half of them.

Fix a style issue on an adjacent line.

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42200


# 72a4ee26 16-Oct-2023 Brooks Davis <brooks@FreeBSD.org>

libprocstat: make sv_name not static

Making this variable static makes is_elf32_sysctl() and callers thread
unsafe.

Use a less absurd length for sv_name. The longest name in the system is
"FreeBSD ELF64 V2" which tips the scales at 16+1 bytes. We'll almost
certainly have other problems if we exceed 32 characters.

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42199


# 9735cc0e 16-Oct-2023 Brooks Davis <brooks@FreeBSD.org>

libprocstat: simplify auxv value conversion

Avoid a weird dance through the union and treat all 32-bit values as
unsigned integers. This avoids sign extension of flags and userspace
pointers.

Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42198


# ccac440f 02-Oct-2023 Brooks Davis <brooks@FreeBSD.org>

libprocstat: style: space after switch

Style demands a space after the switch keyword.

Noticed reviewing code in CheriBSD that propagated the style bug.

Reported by: markj
Sponsored by: DARPA
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D42041


# 1d386b48 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 039d1496 03-Apr-2022 Konstantin Belousov <kib@FreeBSD.org>

libprocstat: add procstat_getadvlock(3)

For now, only for sysctl target. This is not a new situation, for
instance kstacks also work for sysctl only.

Reviewed by: markj, rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D34756


# 7a9423d6 02-Dec-2021 Konstantin Belousov <kib@FreeBSD.org>

procstat_getfiles_sysctl: do not require non-null ki_fd

ki_fd is legitimately NULL when 32bit process requests process data
from 64bit host kernel. The field is not used by the code for sysctl
case; procstat_getfiles_kvm() checks ki_fd.

PR: 260174
Reported by: Damjan Jovanovic <damjan.jov@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 0ea3e4a2 02-Dec-2021 Konstantin Belousov <kib@FreeBSD.org>

Style

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 427f12f1 27-May-2021 Eric van Gyzen <vangyzen@FreeBSD.org>

libprocstat kstack: fix race with thread creation

When collecting kernel stacks for a target process, if the process
adds a thread between the two calls to sysctl, ignore the additional
threads. Previously, procstat would print only a useless error
message. Now, it prints a consistent snapshot of the stacks.
We know that snapshot is already stale, but it could still be stale
even with a more complex fix to reallocate and retry, so such a fix
is hardly worth the effort.

Reported by: Daniel.Mitchell@emc.com
MFC after: 1 week
Sponsored by: Dell EMC Isilon


# d485c77f 18-Feb-2021 Konstantin Belousov <kib@FreeBSD.org>

Remove #define _KERNEL hacks from libprocstat

Make sys/buf.h, sys/pipe.h, sys/fs/devfs/devfs*.h headers usable in
userspace, assuming that the consumer has an idea what it is for.
Unhide more material from sys/mount.h and sys/ufs/ufs/inode.h,
sys/ufs/ufs/ufsmount.h for consumption of userspace tools, with the
same caveat.

Remove unacceptable hack from usr.sbin/makefs which relied on sys/buf.h
being unusable in userspace, where it override struct buf with its own
definition. Instead, provide struct m_buf and struct m_vnode and adapt
code to use local variants.

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D28679


# 67af9aba 23-Dec-2020 Konstantin Belousov <kib@FreeBSD.org>

Decode and report native eventfd descriptors from libprocstat and procstat.

Submitted by: greg@unrelenting.technology
Reviewed by: markj (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D26668


# 688f8b82 24-Nov-2020 John Baldwin <jhb@FreeBSD.org>

Remove the cloned file descriptors for /dev/crypto.

Crypto file descriptors were added in the original OCF import as a way
to provide per-open data (specifically the list of symmetric
sessions). However, this gives a bit of a confusing API where one has
to open /dev/crypto and then invoke an ioctl to obtain a second file
descriptor. This also does not match the API used with /dev/crypto on
other BSDs or with Linux's /dev/crypto driver.

Character devices have gained support for per-open data via cdevpriv
since OCF was imported, so use cdevpriv to simplify the userland API
by permitting ioctls directly on /dev/crypto descriptors.

To provide backwards compatibility, CRIOGET now opens another
/dev/crypto descriptor via kern_openat() rather than dup'ing the
existing file descriptor. This preserves prior semantics in case
CRIOGET is invoked multiple times on a single file descriptor.

Reviewed by: markj
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27302


# 85078b85 17-Nov-2020 Conrad Meyer <cem@FreeBSD.org>

Split out cwd/root/jail, cmask state from filedesc table

No functional change intended.

Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).

__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.

Reviewed by: kib
Discussed with: markj, mjg
Differential Revision: https://reviews.freebsd.org/D27037


# 9e5787d2 24-Aug-2020 Matt Macy <mmacy@FreeBSD.org>

Merge OpenZFS support in to HEAD.

The primary benefit is maintaining a completely shared
code base with the community allowing FreeBSD to receive
new features sooner and with less effort.

I would advise against doing 'zpool upgrade'
or creating indispensable pools using new
features until this change has had a month+
to soak.

Work on merging FreeBSD support in to what was
at the time "ZFS on Linux" began in August 2018.
I first publicly proposed transitioning FreeBSD
to (new) OpenZFS on December 18th, 2018. FreeBSD
support in OpenZFS was finally completed in December
2019. A CFT for downstreaming OpenZFS support in
to FreeBSD was first issued on July 8th. All issues
that were reported have been addressed or, for
a couple of less critical matters there are
pull requests in progress with OpenZFS. iXsystems
has tested and dogfooded extensively internally.
The TrueNAS 12 release is based on OpenZFS with
some additional features that have not yet made
it upstream.

Improvements include:
project quotas, encrypted datasets,
allocation classes, vectorized raidz,
vectorized checksums, various command line
improvements, zstd compression.

Thanks to those who have helped along the way:
Ryan Moeller, Allan Jude, Zack Welch, and many
others.

Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25872


# f1221c59 15-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

libprocstat: fix kvm filedesc access after introduction of fdescenttbl


# e165a15b 21-May-2020 Andriy Gapon <avg@FreeBSD.org>

libprocstat: fix reading of file descriptor table via kvm

This seems to have been broken since r247602 (from year 2013!).
Can be easily tested with
fstat -N /boot/kernel/kernel -M /var/crash/vmcore.last

MFC after: 1 week
Sponsored by: Panzura


# d2222aa0e 07-Mar-2020 Mateusz Guzik <mjg@FreeBSD.org>

fd: use smr for managing struct pwd

This has a side effect of eliminating filedesc slock/sunlock during path
lookup, which in turn removes contention vs concurrent modifications to the fd
table.

Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D23889


# 8d03b99b 01-Mar-2020 Mateusz Guzik <mjg@FreeBSD.org>

fd: move vnodes out of filedesc into a dedicated structure

The new structure is copy-on-write. With the assumption that path lookups are
significantly more frequent than chdirs and chrooting this is a win.

This provides stable root and jail root vnodes without the need to reference
them on lookup, which in turn means less work on globally shared structures.
Note this also happens to fix a bug where jail vnode was never referenced,
meaning subsequent access on lookup could run into use-after-free.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23884


# 630cb9c5 06-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

procstat: read lo_name instead of now removed v_tag


# 8b75b1ad 08-Dec-2019 Doug Moore <dougm@FreeBSD.org>

Define a vm_map method for user-space for advancing from a map entry
to its successor in cases where examining a map entry requires a
helper like kvm_read_all. Use that method, with kvm_read_all, to fix
procstat_getfiles_kvm, which tries to find the successor now without
using such a helper. This addresses a problem introduced by r355491.

Reviewed by: markj (previous version)
Discussed with: kib
Differential Revision: https://reviews.freebsd.org/D22728


# 7c065540 07-Dec-2019 Doug Moore <dougm@FreeBSD.org>

Fix a type error in fixing libprocstat to be compatible with vm_map changes.

Approved by: markj
Differential Revision: https://reviews.freebsd.org/D22726


# 99b1d4c1 07-Dec-2019 Doug Moore <dougm@FreeBSD.org>

r355491 broke compilation of libprocstat.c. Change that code to use
new methods for accessing first, next map entries.

Approved by: kib
Differential Revision: https://reviews.freebsd.org/D22725


# a66732de 03-Dec-2018 Konstantin Belousov <kib@FreeBSD.org>

Print type designator 'D' for the KF_TYPE_DEV files.

No type-specific data is provided by the kernel.

Sponsored by: Mellanox Technologies
MFC after: 1 week


# 9b207441 27-May-2018 Eric van Gyzen <vangyzen@FreeBSD.org>

libprocstat: fix memory leak

Free the rlimits array on the happy path in procstat_getrlimit_core().

Reported by: Coverity
CID: 1373328
Sponsored by: Dell EMC


# df57947f 18-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

spdx: initial adoption of licensing ID tags.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes: yes
Differential Revision: https://reviews.freebsd.org/D13133


# 3cfa7c6e 03-Oct-2017 Edward Tomasz Napierala <trasz@FreeBSD.org>

Make procstat(1) recognize process descriptors, so that it shows
"P" instead of "?" in "procstat -af" output. Note that there are
still a few more DTYPE_* kinds we don't decode yet.

Reported by: rwatson
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D12426


# 0e229f34 02-Oct-2017 Gleb Smirnoff <glebius@FreeBSD.org>

Hide struct socket and struct unpcb from the userland.

Violators may define _WANT_SOCKET and _WANT_UNPCB respectively and
are not guaranteed for stability of the structures. The violators
list is the the usual one: libprocstat(3) and netstat(1) internally
and lsof in ports.

In struct xunpcb remove the inclusion of kernel structure and add
a bunch of spare fields. The xsocket already has socket not included,
but add there spares as well. Embed xsockbuf into xsocket.

Sort declarations in sys/socketvar.h to separate kernel only from
userland available ones.

PR: 221820 (exp-run)


# a2ae08e7 27-Jun-2017 Enji Cooper <ngie@FreeBSD.org>

procstat_getptlwpinfo(..): clarify the fact that KVM/SYSCTL support
isn't supported

This will make the error message reported in bug 220023 a bit more
intuitive for end-users that don't have access to the source code to
decode the procstat->type argument.

MFC after: 1 month
MFC with: r316286
PR: 220023


# 95b97895 26-May-2017 Conrad Meyer <cem@FreeBSD.org>

procstat(1): Add TCP socket send/recv buffer size

Add TCP socket send and receive buffer size to procstat -f output.

Reviewed by: kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D10689


# 69921123 23-May-2017 Konstantin Belousov <kib@FreeBSD.org>

Commit the 64-bit inode project.

Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment. Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.

ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks. Unfortunately, not everything can be
fixed, especially outside the base system. For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.

Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.

Struct xvnode changed layout, no compat shims are provided.

For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.

Update note: strictly follow the instructions in UPDATING. Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.

Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver. Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion were done by Konstantin Belousov (kib).

Sponsored by: The FreeBSD Foundation (emaste, kib)
Differential revision: https://reviews.freebsd.org/D10439


# a4ba6502 30-Mar-2017 Tycho Nightingale <tychon@FreeBSD.org>

Reorder includes to placate MIPS build.

Reported by: markj
Sponsored by: Dell EMC Isilon


# 86be94fc 30-Mar-2017 Tycho Nightingale <tychon@FreeBSD.org>

Add support for capturing 'struct ptrace_lwpinfo' for signals
resulting in a process dumping core in the corefile.

Also extend procstat to view select members of 'struct ptrace_lwpinfo'
from the contents of the note.

Sponsored by: Dell EMC Isilon


# cc65eb4e 21-Mar-2017 Gleb Smirnoff <glebius@FreeBSD.org>

Hide struct inpcb, struct tcpcb from the userland.

This is a painful change, but it is needed. On the one hand, we avoid
modifying them, and this slows down some ideas, on the other hand we still
eventually modify them and tools like netstat(1) never work on next version of
FreeBSD. We maintain a ton of spares in them, and we already got some ifdef
hell at the end of tcpcb.

Details:
- Hide struct inpcb, struct tcpcb under _KERNEL || _WANT_FOO.
- Make struct xinpcb, struct xtcpcb pure API structures, not including
kernel structures inpcb and tcpcb inside. Export into these structures
the fields from inpcb and tcpcb that are known to be used, and put there
a ton of spare space.
- Make kernel and userland utilities compilable after these changes.
- Bump __FreeBSD_version.

Reviewed by: rrs, gnn
Differential Revision: D10018


# 75b6a179 08-Jan-2017 Enji Cooper <ngie@FreeBSD.org>

Use nitems({mib,name}) instead of hardcoding their value

MFC after: 3 days


# e6b95927 06-Oct-2015 Conrad Meyer <cem@FreeBSD.org>

Fix core corruption caused by race in note_procstat_vmmap

This fix is spiritually similar to r287442 and was discovered thanks to
the KASSERT added in that revision.

NT_PROCSTAT_VMMAP output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' vm map
via vn_fullpath. As vnodes may move during coredump, this is racy.

We do not remove the race, only prevent it from causing coredump
corruption.

- Add a sysctl, kern.coredump_pack_vmmapinfo, to allow users to disable
kinfo packing for PROCSTAT_VMMAP notes. This avoids VMMAP corruption
and truncation, even if names change, at the cost of up to PATH_MAX
bytes per mapped object. The new sysctl is documented in core.5.

- Fix note_procstat_vmmap to self-limit in the second pass. This
addresses corruption, at the cost of sometimes producing a truncated
result.

- Fix PROCSTAT_VMMAP consumers libutil (and libprocstat, via copy-paste)
to grok the new zero padding.

Reported by: pho (https://people.freebsd.org/~pho/stress/log/datamove4-2.txt)
Relnotes: yes
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3824


# 14bdbaf2 03-Sep-2015 Conrad Meyer <cem@FreeBSD.org>

Detect badly behaved coredump note helpers

Coredump notes depend on being able to invoke dump routines twice; once
in a dry-run mode to get the size of the note, and another to actually
emit the note to the corefile.

When a note helper emits a different length section the second time
around than the length it requested the first time, the kernel produces
a corrupt coredump.

NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' fd table
via vn_fullpath. As vnodes may move around during dump, this is racy.

So:

- Detect badly behaved notes in putnote() and pad underfilled notes.

- Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to
exercise the NT_PROCSTAT_FILES corruption. It simply picks random
lengths to expand or truncate paths to in fo_fill_kinfo_vnode().

- Add a sysctl, kern.coredump_pack_fileinfo, to allow users to
disable kinfo packing for PROCSTAT_FILES notes. This should avoid
both FILES note corruption and truncation, even if filenames change,
at the cost of about 1 kiB in padding bloat per open fd. Document
the new sysctl in core.5.

- Fix note_procstat_files to self-limit in the 2nd pass. Since
sometimes this will result in a short write, pad up to our advertised
size. This addresses note corruption, at the risk of sometimes
truncating the last several fd info entries.

- Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the
zero padding.

With suggestions from: bjk, jhb, kib, wblock
Approved by: markj (mentor)
Relnotes: yes
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3548


# a770b4d8 01-Jun-2015 Marcelo Araujo <araujo@FreeBSD.org>

Remove unused variable spotted by clang.

Differential Revision: D2685
Reviewed by: rodrigc, stas


# b881b8be 16-Mar-2014 Robert Watson <rwatson@FreeBSD.org>

Update most userspace consumers of capability.h to use capsicum.h instead.

auditdistd is not updated as I will make the change upstream and then do a
vendor import sometime in the next week or two.

MFC after: 3 weeks


# 09b46be1 02-Mar-2014 Robert Watson <rwatson@FreeBSD.org>

When querying a process's umask via sysctl in libprocstat(), don't
print a warning if EPERM is returned as this is an expected failure
mode rather than error -- similar to current handling of ESRCH.
This makes the output of 'procstat -as' vastly more palatable.

MFC after: 3 days
Sponsored by: DARPA, AFRL


# 772f6645 09-Oct-2013 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Handle the cases where NULL is passed as cap_rightsp to the
filestat_new_entry() function.

Reported by: Alex Kozlov <spam@rm-rf.kiev.ua>
Approved by: re (gjb)


# 7008be5b 04-Sep-2013 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

struct cap_rights {
uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)

#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);

bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

cap_rights_t rights;

cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by: The FreeBSD Foundation


# 237abf0c 28-Jun-2013 Davide Italiano <davide@FreeBSD.org>

- Trim an unused and bogus Makefile for mount_smbfs.
- Reconnect with some minor modifications, in particular now selsocket()
internals are adapted to use sbintime units after recent'ish calloutng
switch.


# 608203fd 11-Jun-2013 John Baldwin <jhb@FreeBSD.org>

Borrow the algorithm from kvm_getprocs() to fix procstat_getprocs() to
handle the case where the process tables grows in between the calls to
fetch the size and fetch the table.

MFC after: 1 week


# 467112b4 08-May-2013 Mikolaj Golub <trociny@FreeBSD.org>

Make errbuf optional, so if a caller is not interested in an error
message she can pass NULL (procstat(1) already does this).

MFC after: 2 weeks


# 958aa575 03-May-2013 John Baldwin <jhb@FreeBSD.org>

Similar to 233760 and 236717, export some more useful info about the
kernel-based POSIX semaphore descriptors to userland via procstat(1) and
fstat(1):
- Change sem file descriptors to track the pathname they are associated
with and add a ksem_info() method to copy the path out to a
caller-supplied buffer.
- Use the fo_stat() method of shared memory objects and ksem_info() to
export the path, mode, and value of a semaphore via struct kinfo_file.
- Add a struct semstat to the libprocstat(3) interface along with a
procstat_get_sem_info() to export the mode and value of a semaphore.
- Teach fstat about semaphores and to display their path, mode, and value.

MFC after: 2 weeks


# dd70ad64 01-May-2013 Mikolaj Golub <trociny@FreeBSD.org>

procstat_getpathname: for kvm method, instead of returning the error
that the method is not supported, return an empty string.

This looks more handy for callers like procstat(1), which will not
abort after the failed call and still output some useful information.

MFC after: 3 weeks


# 1f84c47e 01-May-2013 Mikolaj Golub <trociny@FreeBSD.org>

KVM method support for procstat_getgroups, procstat_getumask,
procstat_getrlimit, and procstat_getosrel.

MFC after: 3 weeks


# 89358231 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getkstack function to dump kernel stacks of a process.

MFC after: 1 month


# 2ff020d3 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getauxv function to retrieve a process auxiliary vector.

MFC after: 1 month


# 4482b5e3 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Extend libprocstat with functions to retrieve process command line
arguments and environment variables.

Suggested by: stas
Reviewed by: jhb and stas (initial version)
MFC after: 1 month


# eec6cb1c 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getosrel function to retrieve a process osrel info.

MFC after: 1 month


# 4cdf9796 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getpathname function to retrieve a process executable.

MFC after: 1 month


# 7cc0ebfd 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getrlimit function to retrieve a process resource limits info.

MFC after: 1 month


# 5b9bcba9 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getumask function to retrieve a process umask.

MFC after: 1 month


# 7f1d14e6 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getgroups function to retrieve process groups.

MFC after: 1 month


# 39680c7b 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Add procstat_getvmmap function to get VM layout of a process.

MFC after: 1 month


# 7153ad2b 20-Apr-2013 Mikolaj Golub <trociny@FreeBSD.org>

Make libprocstat(3) extract procstat notes from a process core file.

PR: kern/173723
Suggested by: jhb
Glanced by: kib
MFC after: 1 month


# 2609222a 01-Mar-2013 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Merge Capsicum overhaul:

- Capability is no longer separate descriptor type. Now every descriptor
has set of its own capability rights.

- The cap_new(2) system call is left, but it is no longer documented and
should not be used in new code.

- The new syscall cap_rights_limit(2) should be used instead of
cap_new(2), which limits capability rights of the given descriptor
without creating a new one.

- The cap_getrights(2) syscall is renamed to cap_rights_get(2).

- If CAP_IOCTL capability right is present we can further reduce allowed
ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
ioctls can be retrived with cap_ioctls_get(2) syscall.

- If CAP_FCNTL capability right is present we can further reduce fcntls
that can be used with the new cap_fcntls_limit(2) syscall and retrive
them with cap_fcntls_get(2).

- To support ioctl and fcntl white-listing the filedesc structure was
heavly modified.

- The audit subsystem, kdump and procstat tools were updated to
recognize new syscalls.

- Capability rights were revised and eventhough I tried hard to provide
backward API and ABI compatibility there are some incompatible changes
that are described in detail below:

CAP_CREATE old behaviour:
- Allow for openat(2)+O_CREAT.
- Allow for linkat(2).
- Allow for symlinkat(2).
CAP_CREATE new behaviour:
- Allow for openat(2)+O_CREAT.

Added CAP_LINKAT:
- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
- Allow to be target for renameat(2).

Added CAP_SYMLINKAT:
- Allow for symlinkat(2).

Removed CAP_DELETE. Old behaviour:
- Allow for unlinkat(2) when removing non-directory object.
- Allow to be source for renameat(2).

Removed CAP_RMDIR. Old behaviour:
- Allow for unlinkat(2) when removing directory.

Added CAP_RENAMEAT:
- Required for source directory for the renameat(2) syscall.

Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
- Allow for unlinkat(2) on any object.
- Required if target of renameat(2) exists and will be removed by this
call.

Removed CAP_MAPEXEC.

CAP_MMAP old behaviour:
- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
PROT_WRITE.
CAP_MMAP new behaviour:
- Allow for mmap(2)+PROT_NONE.

Added CAP_MMAP_R:
- Allow for mmap(PROT_READ).
Added CAP_MMAP_W:
- Allow for mmap(PROT_WRITE).
Added CAP_MMAP_X:
- Allow for mmap(PROT_EXEC).
Added CAP_MMAP_RW:
- Allow for mmap(PROT_READ | PROT_WRITE).
Added CAP_MMAP_RX:
- Allow for mmap(PROT_READ | PROT_EXEC).
Added CAP_MMAP_WX:
- Allow for mmap(PROT_WRITE | PROT_EXEC).
Added CAP_MMAP_RWX:
- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

Renamed CAP_MKDIR to CAP_MKDIRAT.
Renamed CAP_MKFIFO to CAP_MKFIFOAT.
Renamed CAP_MKNODE to CAP_MKNODEAT.

CAP_READ old behaviour:
- Allow pread(2).
- Disallow read(2), readv(2) (if there is no CAP_SEEK).
CAP_READ new behaviour:
- Allow read(2), readv(2).
- Disallow pread(2) (CAP_SEEK was also required).

CAP_WRITE old behaviour:
- Allow pwrite(2).
- Disallow write(2), writev(2) (if there is no CAP_SEEK).
CAP_WRITE new behaviour:
- Allow write(2), writev(2).
- Disallow pwrite(2) (CAP_SEEK was also required).

Added convinient defines:

#define CAP_PREAD (CAP_SEEK | CAP_READ)
#define CAP_PWRITE (CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_R (CAP_MMAP | CAP_SEEK | CAP_READ)
#define CAP_MMAP_W (CAP_MMAP | CAP_SEEK | CAP_WRITE)
#define CAP_MMAP_X (CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
#define CAP_MMAP_RW (CAP_MMAP_R | CAP_MMAP_W)
#define CAP_MMAP_RX (CAP_MMAP_R | CAP_MMAP_X)
#define CAP_MMAP_WX (CAP_MMAP_W | CAP_MMAP_X)
#define CAP_MMAP_RWX (CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
#define CAP_RECV CAP_READ
#define CAP_SEND CAP_WRITE

#define CAP_SOCK_CLIENT \
(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
#define CAP_SOCK_SERVER \
(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
CAP_SETSOCKOPT | CAP_SHUTDOWN)

Added defines for backward API compatibility:

#define CAP_MAPEXEC CAP_MMAP_X
#define CAP_DELETE CAP_UNLINKAT
#define CAP_MKDIR CAP_MKDIRAT
#define CAP_RMDIR CAP_UNLINKAT
#define CAP_MKFIFO CAP_MKFIFOAT
#define CAP_MKNOD CAP_MKNODAT
#define CAP_SOCK_ALL (CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by: The FreeBSD Foundation
Reviewed by: Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with: rwatson, benl, jonathan
ABI compatibility discussed with: kib


# 2e564269 17-Oct-2012 Attilio Rao <attilio@FreeBSD.org>

Disconnect non-MPSAFE SMBFS from the build in preparation for dropping
GIANT from VFS. In addition, disconnect also netsmb, which is a base
requirement for SMBFS.

In the while SMBFS regular users can use FUSE interface and smbnetfs
port to work with their SMBFS partitions.

Also, there are ongoing efforts by vendor to support in-kernel smbfs,
so there are good chances that it will get relinked once properly locked.

This is not targeted for MFC.


# a42ac676 17-Oct-2012 Attilio Rao <attilio@FreeBSD.org>

Disconnect non-MPSAFE NTFS from the build in preparation for dropping
GIANT from VFS. This code is particulary broken and fragile and other
in-kernel implementations around, found in other operating systems,
don't really seem clean and solid enough to be imported at all.
If someone wants to reconsider in-kernel NTFS implementation for
inclusion again, a fair effort for completely fixing and cleaning it
up is expected.

In the while NTFS regular users can use FUSE interface and ntfs-3g
port to work with their NTFS partitions.

This is not targeted for MFC.


# e6116d5b 17-Oct-2012 Attilio Rao <attilio@FreeBSD.org>

Disconnect non-MPSAFE NWFS from the build in preparation for dropping
GIANT from VFS. In addition, disconnect also netncp, which is a base
requirement for NWFS.

In the possibility of a future maintenance of the code and later
readd to the FreeBSD base, maybe we should think about a better location
for netncp. I'm not entirely sure the / top location is actually right,
however I will let network people to comment on that more specifically.

This is not targeted for MFC.


# 391bdfb8 06-Oct-2012 Andriy Gapon <avg@FreeBSD.org>

procstat_getprocs: honor kvm_getprocs interface - cnt is signed

MFC after: 10 days


# a9f5e425 07-Jun-2012 John Baldwin <jhb@FreeBSD.org>

Teach procstat_get_shm_info_kvm() how to fetch the pathname of a SHM file
descriptor from a core and set it in fts->fs_path.

MFC after: 1 week


# e506e182 01-Apr-2012 John Baldwin <jhb@FreeBSD.org>

Export some more useful info about shared memory objects to userland
via procstat(1) and fstat(1):
- Change shm file descriptors to track the pathname they are associated
with and add a shm_path() method to copy the path out to a caller-supplied
buffer.
- Use the fo_stat() method of shared memory objects and shm_path() to
export the path, mode, and size of a shared memory object via
struct kinfo_file.
- Add a struct shmstat to the libprocstat(3) interface along with a
procstat_get_shm_info() to export the mode and size of a shared memory
object.
- Change procstat to always print out the path for a given object if it
is valid.
- Teach fstat about shared memory objects and to display their path,
mode, and size.

MFC after: 2 weeks


# d57486e2 13-Aug-2011 Robert Watson <rwatson@FreeBSD.org>

Updates to libprocstat(3) and procstat(1) to allow monitoring Capsicum
capability mode and capabilities.

Right now no attempt is made to unwrap capabilities when operating on
a crashdump, so further refinement is required.

Approved by: re (bz)
Sponsored by: Google Inc


# 279a233d 18-Jun-2011 Jilles Tjoelker <jilles@FreeBSD.org>

libprocstat: For MAP_PRIVATE, do not consider the file open for writing.

If a file is mapped with with MAP_PRIVATE, no write permission is required
and changes do not end up in the file. Therefore, tools like fuser and fstat
should not show the file as open for writing.

The protection as displayed by procstat -v still includes write in this
case, and shows 'C' for copy-on-write.


# 80905c35 18-Jun-2011 Jilles Tjoelker <jilles@FreeBSD.org>

libprocstat: Fix typo in error messages.


# b0766d1f 18-Jun-2011 Jilles Tjoelker <jilles@FreeBSD.org>

libprocstat: Remove spaces between function name and open parenthesis.


# 65869214 18-Jun-2011 Jilles Tjoelker <jilles@FreeBSD.org>

libprocstat: Correct format for size_t (should be %zu, not %zd).


# 5f301949 18-Jun-2011 Ben Laurie <benl@FreeBSD.org>

Fix clang warnings.

Approved by: philip (mentor)


# d7b666ae 18-May-2011 Sergey Kandaurov <pluknet@FreeBSD.org>

Release allocated memory in procstat_close().

Reviewed by: stass


# 72266aa3 14-May-2011 Stanislav Sedov <stas@FreeBSD.org>

- Whitespace fix.


# 59ea47d0 12-May-2011 Stanislav Sedov <stas@FreeBSD.org>

- Don't try to build NWFS support module if NCP/IPX is disabled in the build.
- Rename ZFS definition to LIBPROCSTAT_ZFS to be consistent with NWFS and to
prevent possible collisions.

Reported by: many


# 0daf62d9 12-May-2011 Stanislav Sedov <stas@FreeBSD.org>

- Commit work from libprocstat project. These patches add support for runtime
file and processes information retrieval from the running kernel via sysctl
in the form of new library, libprocstat. The library also supports KVM backend
for analyzing memory crash dumps. Both procstat(1) and fstat(1) utilities have
been modified to take advantage of the library (as the bonus point the fstat(1)
utility no longer need superuser privileges to operate), and the procstat(1)
utility is now able to display information from memory dumps as well.

The newly introduced fuser(1) utility also uses this library and able to operate
via sysctl and kvm backends.

The library is by no means complete (e.g. KVM backend is missing vnode name
resolution routines, and there're no manpages for the library itself) so I
plan to improve it further. I'm commiting it so it will get wider exposure
and review.

We won't be able to MFC this work as it relies on changes in HEAD, which
was introduced some time ago, that break kernel ABI. OTOH we may be able
to merge the library with KVM backend if we really need it there.

Discussed with: rwatson