History log of /freebsd-10-stable/sys/kern/kern_proc.c
Revision Date Author Comments
# 328571 29-Jan-2018 jhb

MFC 327561:
Report offset relative to the backing object for kinfo_vmentry structures.

For the pathname reported in kinfo_vmentry structures (kve_path), the
sysctl handlers walk the object chain to find the bottom-most VM object.
This permits a COW mapping of a file with dirty pages to report the
pathname of the originally mapped file. Do the same for the object
offset (kve_offset) computing a cumulative offset during the same object
walk so that the reported offset is relative to the reported pathname.

Note that ptrace(PT_VM_ENTRY) already returns a cumulative offset
rather than the raw offset of the VM map entry.

Note also that this does not affect procstat -v output (even structured
output) since that output does not include the kve_offset field.

Sponsored by: DARPA / AFRL
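
The object walk described in this entry can be modelled in a few lines of
userland C. This is only a sketch under stated assumptions: struct
model_object, struct model_result and model_resolve() are hypothetical names
invented for illustration and are not the kernel's vm_object API. It shows
the offset accumulating at each step down the chain, so that the final value
is relative to the bottom-most object whose pathname is reported.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one VM object in a shadow chain. */
struct model_object {
	struct model_object *backing;	/* next object down, or NULL */
	uint64_t backing_offset;	/* offset of this object in `backing` */
	const char *path;		/* set only on a file-backed object */
};

/* What the model reports for one map entry. */
struct model_result {
	const char *path;
	uint64_t offset;
};

/*
 * Walk from the top object of a mapping to the bottom-most object,
 * adding each level's offset into its backing object on the way down.
 * The returned offset is therefore relative to the returned path.
 */
static struct model_result
model_resolve(const struct model_object *top, uint64_t entry_offset)
{
	struct model_result res = { NULL, entry_offset };
	const struct model_object *obj = top;

	if (obj == NULL)
		return (res);
	while (obj->backing != NULL) {
		res.offset += obj->backing_offset;
		obj = obj->backing;
	}
	res.path = obj->path;
	return (res);
}
```

In this model a COW shadow object placed 0x3000 bytes into a file, mapped
with an entry offset of 0x1000, resolves to the file's path with a
cumulative offset of 0x4000.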


# 326243 27-Nov-2017 delphij

MFC r325755: Be more careful when doing calculation with request from
userland.


# 324641 15-Oct-2017 brooks

MFC r320999:

Add 32-bit compat for kinfo_proc's ki_tdaddr.

This appears to have been an oversight in r213536.

Reviewed by: markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11521


# 315260 14-Mar-2017 hselasky

MFC r313941:

Make sure the thread constructor and destructor eventhandlers are
called for all threads belonging to a process. Currently the first
thread in a process is kept around as an optimisation step and is
never freed. Because the first thread in a process is never freed
nor allocated, its destructor and constructor callbacks are never
called, which means per-thread structures allocated by dtrace and the
Linux emulation layers, for example, might be present for threads which
don't need these structures.

This patch adds a thread construction and destruction call for the
first thread in a process.

Tested: dtrace, linux emulation
Reviewed by: kib
Sponsored by: Mellanox Technologies


# 310121 15-Dec-2016 vangyzen

MFC r309676

Export the whole thread name in kinfo_proc

kinfo_proc::ki_tdname is three characters shorter than
thread::td_name. Add a ki_moretdname field for these three
extra characters. Add the new field to kinfo_proc32, as well.
Update all in-tree consumers to read the new field and assemble
the full name, except for lldb's HostThreadFreeBSD.cpp, which
I will handle separately. Bump __FreeBSD_version.

Sponsored by: Dell EMC
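
A consumer that wants the full thread name concatenates the two fields. The
sketch below is a userland model, not the real struct kinfo_proc: the
struct, the function and the MODEL_* constants are hypothetical, with the
lengths chosen only so that the second field holds the three extra
characters mentioned above.

```c
#include <stdio.h>

#define MODEL_TDNAMLEN	16	/* assumed length of the short name */
#define MODEL_MAXCOMLEN	19	/* assumed length of the full name */

/* Hypothetical stand-in for the two name fields of kinfo_proc. */
struct model_kinfo {
	char tdname[MODEL_TDNAMLEN + 1];
	char moretdname[MODEL_MAXCOMLEN - MODEL_TDNAMLEN + 1];
};

/*
 * Assemble the full thread name from the short name and its
 * continuation. Both fields are NUL-terminated, so a short name
 * simply leaves the continuation empty.
 */
static void
model_full_tdname(const struct model_kinfo *ki, char *buf, size_t buflen)
{
	snprintf(buf, buflen, "%s%s", ki->tdname, ki->moretdname);
}
```

A destination buffer of MODEL_MAXCOMLEN + 1 bytes is large enough for any
name the model can hold.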


# 302237 27-Jun-2016 bdrewery

MFC r292384:

Fix style issues around existing SDT probes.

** Changes to sys/netinet/in_kdtrace.c and sys/netinet/in_kdtrace.h skipped.


# 298588 25-Apr-2016 markj

MFC r298173:
Use a loop instead of a goto in sysctl_kern_proc_kstack().


# 296731 12-Mar-2016 kib

MFC r295391:
Remove the assert which outlived its usefulness.


# 295454 09-Feb-2016 jhb

MFC 287442,287537,288944:
Fix corruption of coredumps due to procstat notes changing size during
coredump generation. The changes in r287442 required some reworking
since the 'fo_fill_kinfo' file op does not exist in stable/10.

287442:
Detect badly behaved coredump note helpers

Coredump notes depend on being able to invoke dump routines twice; once
in a dry-run mode to get the size of the note, and another to actually
emit the note to the corefile.

When a note helper emits a different length section the second time
around than the length it requested the first time, the kernel produces
a corrupt coredump.

NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' fd table
via vn_fullpath. As vnodes may move around during dump, this is racy.

So:

- Detect badly behaved notes in putnote() and pad underfilled notes.

- Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to
exercise the NT_PROCSTAT_FILES corruption. It simply picks random
lengths to expand or truncate paths to in fo_fill_kinfo_vnode().

- Add a sysctl, kern.coredump_pack_fileinfo, to allow users to
disable kinfo packing for PROCSTAT_FILES notes. This should avoid
both FILES note corruption and truncation, even if filenames change,
at the cost of about 1 kiB in padding bloat per open fd. Document
the new sysctl in core.5.

- Fix note_procstat_files to self-limit in the 2nd pass. Since
sometimes this will result in a short write, pad up to our advertised
size. This addresses note corruption, at the risk of sometimes
truncating the last several fd info entries.

- Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the
zero padding.
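
The pad-or-truncate behaviour described in the list above can be sketched as
follows. This is a simplified userland model under stated assumptions:
model_note_fn, model_path_note() and model_putnote() are hypothetical names
and do not match the kernel's putnote() signature. The helper is run a
second time with the size advertised by the dry run as its limit, and any
shortfall is filled with zeros so the note always has the advertised length.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical note helper: copies at most `cap` bytes into `buf`
 * (when buf is non-NULL) and returns the length it wanted to emit.
 */
typedef size_t (*model_note_fn)(void *arg, char *buf, size_t cap);

/* Example helper whose output length follows a path string. */
static size_t
model_path_note(void *arg, char *buf, size_t cap)
{
	const char *path = arg;
	size_t len = strlen(path);

	if (buf != NULL)
		memcpy(buf, path, len < cap ? len : cap);
	return (len);
}

/*
 * Second pass: emit exactly `advertised` bytes, the size reported by
 * the dry run. Longer output is truncated, shorter output is padded.
 */
static size_t
model_putnote(model_note_fn fn, void *arg, char *out, size_t advertised)
{
	size_t wrote = fn(arg, out, advertised);

	if (wrote > advertised)
		wrote = advertised;	/* helper was limited to `advertised` */
	if (wrote < advertised)
		memset(out + wrote, 0, advertised - wrote);
	return (advertised);
}
```

A consumer of such a note has to tolerate trailing zero bytes, which is the
reason the libutil and libprocstat readers were updated in the same change.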

287537:
Follow-up to r287442: Move sysctl to compiled-once file

Avoid duplicate sysctl nodes.

288944:
Fix core corruption caused by race in note_procstat_vmmap

This fix is spiritually similar to r287442 and was discovered thanks to
the KASSERT added in that revision.

NT_PROCSTAT_VMMAP output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' vm map
via vn_fullpath. As vnodes may move during coredump, this is racy.

We do not remove the race, only prevent it from causing coredump
corruption.

- Add a sysctl, kern.coredump_pack_vmmapinfo, to allow users to disable
kinfo packing for PROCSTAT_VMMAP notes. This avoids VMMAP corruption
and truncation, even if names change, at the cost of up to PATH_MAX
bytes per mapped object. The new sysctl is documented in core.5.

- Fix note_procstat_vmmap to self-limit in the second pass. This
addresses corruption, at the cost of sometimes producing a truncated
result.

- Fix PROCSTAT_VMMAP consumers libutil (and libprocstat, via copy-paste)
to grok the new zero padding.

Approved by: re (gjb)


# 294283 18-Jan-2016 jhb

MFC 290728:
Export various helper variables describing the layout and size of
certain kernel structures for use by debuggers. This mostly aids
in examining cores from a kernel without debug symbols as a debugger
can infer these values if debug symbols are available.

One set of variables describes the layout of 'struct linker_file' to
walk the list of loaded kernel modules.

A second set of variables describes the layout of 'struct proc' and
'struct thread' to walk the list of processes in the kernel and the
threads in each process.

The 'pcb_size' variable is used to index into the stoppcbs[] array.

The 'vm_maxuser_address' is used to distinguish kernel virtual addresses
from user addresses. This doesn't have to be perfect, and
'vm_maxuser_address' is a cheap and simple way to differentiate kernel
pointers from simple values like TIDs and PIDs.

While here, annotate the fields in struct pcb used by kgdb on amd64
and i386 to note that their ABI should be preserved. Annotations for
other platforms will be added in the future.


# 293473 09-Jan-2016 dchagin

To facilitate an upcoming Linuxulator merge, partially MFC r275121
(by kib). Only the syntax changes from r275121 are merged; the
PROC_*LOCK() macros still lock the same proc spinlock.

The process spin lock currently has the following distinct uses:

- Thread life cycle, in particular counting the threads in
the process, and interlocking with the process mutex and thread lock.
The main reason for this is that turnstile locks come after thread
locks, so you cannot, e.g., unlock a blockable mutex (think process
mutex) while owning a thread lock.

- Virtual and profiling itimers, since timer activation is done
from the clock interrupt context. Replace the p_slock by p_itimmtx
and PROC_ITIMLOCK().

- Profiling code (profil(2)), for a similar reason. Replace the p_slock
by p_profmtx and PROC_PROFLOCK().

- Resource usage accounting. The need for the spinlock there is subtle;
my understanding is that the spinlock blocks context switching for the
current thread, which prevents td_runtime and similar fields from
changing (updates are done in mi_switch()). Replace the p_slock
by p_statmtx and PROC_STATLOCK().

Discussed with: kib


# 293314 07-Jan-2016 mjg

MFC r292440:

proc: fix a race which could result in dereference of bad p_pgrp pointer on
fork

During fork, the p_startcopy - p_endcopy area of a process is populated with
bcopy with only the proc lock held. Another forking thread can find such a
process and proceed to access p_pgrp, which is included in said area.

Fix the problem by moving the field outside that area. It is properly
assigned later.


# 289798 23-Oct-2015 avg

MFC r288336: save some bytes by using more concise SDT_PROBE<n>


# 288967 06-Oct-2015 jhb

MFC 287864:
When a process group leader exits, all of the processes in the group are
sent SIGHUP and SIGCONT if any of the processes are stopped. Currently this
behavior is triggered for any type of process stop including ptrace() stops
and transient stops for single threading during exit() and execve().
Thus, if a debugger is attached to a process in a group when the leader
exits, the entire group can be HUPed. Instead, only send the signals if a
process in the group is stopped due to SIGSTOP.
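
The narrowed condition can be illustrated with a small model. It is a sketch
only: enum model_stop and model_should_hup() are hypothetical and stand in
for the kernel's per-process stop state, which is not reproduced here. The
group is signalled only when some member is stopped by a stop signal;
ptrace() stops and single-threading stops no longer count.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reasons a process in the group may be stopped. */
enum model_stop {
	MODEL_RUNNING,		/* not stopped */
	MODEL_STOP_SIGNAL,	/* stopped by a stop signal such as SIGSTOP */
	MODEL_STOP_TRACE,	/* stopped by ptrace() */
	MODEL_STOP_SINGLE	/* transient single-threading stop */
};

/*
 * Decide whether SIGHUP and SIGCONT should be sent to the group when
 * its leader exits: true only if a member is stopped by a signal.
 */
static bool
model_should_hup(const enum model_stop *members, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (members[i] == MODEL_STOP_SIGNAL)
			return (true);
	}
	return (false);
}
```

With this rule, a group whose only stopped member is held by a debugger is
left alone when the leader exits.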


# 288499 02-Oct-2015 vangyzen

MFC r283924

Provide vnode in memory map info for files on tmpfs

When providing memory map information to userland, populate the vnode pointer
for tmpfs files. Set the memory mapping to appear as a vnode type, to match
FreeBSD 9 behavior.

This fixes the use of tmpfs files with the dtrace pid provider,
procstat -v, procfs, linprocfs, pmc (pmcstat), and ptrace (PT_VM_ENTRY).

Submitted by: Eric Badger <eric@badgerio.us> (initial revision)
Obtained from: Dell Inc.
PR: 198431


# 286222 03-Aug-2015 trasz

MFC r282086:

Make setproctitle(3) work in Capsicum capability mode. This lets
ctld(8) child processes indicate the initiator address and name in
their titles, similar to what iscsid(8) child processes do.

PR: 181352
Sponsored by: The FreeBSD Foundation


# 279926 12-Mar-2015 kib

MFC r272566:
Convert -1 from sbuf_bcat() to ENOMEM.


# 276272 26-Dec-2014 kib

MFC r275745:
Add facility to stop all userspace processes.

MFC r275753:
Fix gcc build.

MFC r275820:
Add missed break.


# 270264 21-Aug-2014 kib

MFC r269656:
Implement and use proc_realparent(9).

MFC r270024 (by markj):
Correct the order of arguments passed to LIST_INSERT_AFTER().

For merge, the p_treeflag member of struct proc was moved to the end
of the structure, to keep KBI intact.


# 269367 01-Aug-2014 kib

MFC r269205:
Simplify the expression.


# 269073 24-Jul-2014 kib

MFC r268466:
Calculate the number of resident pages by looking in the object chain
backing the region. Add a knob to disable the residency calculation
entirely.

MFC r268490:
Unconditionally initialize addr to handle the case of changed map
timestamp while the map is unlocked.

MFC r268711:
Change the calculation of the kinfo_vmentry field kve_private_resident
to reflect its name.

MFC r268712:
Followup to r268466.
- Move the code to calculate the resident count into a separate function.
This reduces the indent level and makes the operation of the
vmmap_skip_res_cnt tunable clearer.
- Optimize the calculation of the resident page count for a map entry.
Skip directly to the next lowest available index and page among the
whole shadow chain.
- Restore the use of pmap_incore(9), only to verify that the current
mapping is indeed a superpage.
- Note the issue with the invalid pages.


# 262228 19-Feb-2014 jhb

MFC 261780:
Expose OBJT_MGTDEVICE VM objects used for GEM/TTM with drm2 as an
explicit object type.


# 260817 17-Jan-2014 avg

MFC r258622: dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE


# 258885 03-Dec-2013 kib

MFC r258661:
Add sysctl KERN_PROC_SIGTRAMP to retrieve signal trampoline location for the
given process.

Approved by: re (gjb)

