History log of /freebsd-current/sys/sys/protosw.h
Revision Date Author Comments
# b925d719 07-May-2024 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: garbage collect PRCOREQUESTS and stale comment

The code deleted predates FreeBSD history. The comment deleted is 99%
outdated. Why KAME decided to use these constants instead of normal ones
is also lost in centuries.


# 289bee16 16-Jan-2024 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: remove dom_dispose and PR_RIGHTS

Passing file descriptors (rights) via sockets is a feature specific to
PF_UNIX only, so fully isolate the logic into uipc_usrreq.c.

Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D43414


# 5bba2728 16-Jan-2024 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: make pr_shutdown fully protocol specific method

Disassemble a one-for-all soshutdown() into protocol specific methods.
This creates a small amount of copy & paste, but makes code a lot more
self documented, as protocol specific method would execute only the code
that is relevant to that protocol and nothing else. This also fixes a
couple recent regressions and reduces risk of future regressions. The
extended KPI for the new pr_shutdown removes need for the extra pr_flush
which was added for the sake of SCTP which could not perform its shutdown
properly with the old one. Particularly for SCTP this change streamlines
a lot of code.

Some notes on why certain parts of code were copied or were not to certain
protocols:
* The (SS_ISCONNECTED | SS_ISCONNECTING | SS_ISDISCONNECTING) check is
needed only for those protocols that may be connected or disconnected.
* The above reduces into only SS_ISCONNECTED for those protocols that
always connect instantly.
* The ENOTCONN and continue processing hack is left only for datagram
protocols.
* The SOLISTENING(so) block is copied to those protocols that listen(2).
* sorflush() on SHUT_RD is copied almost to every protocol, but that
will be refactored later.
* wakeup(&so->so_timeo) is copied to protocols that can make a non-instant
connect(2), can SO_LINGER or can accept(2).

There are three protocols (netgraph(4), Bluetooth, SDP) that did not have
pr_shutdown, but old soshutdown() would still perform sorflush() on
SHUT_RD for them and also wakeup(9). Those protocols partially supported
shutdown(2) returning EOPNOTSUP for SHUT_WR/SHUT_RDWR, now they fully lost
shutdown(2) support. I'm pretty sure netgraph(4) and Bluetooth are okay
about that and SDP is almost abandoned anyway.

Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D43413


# 0598824c 12-Jan-2024 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: remove unneeded include


# 0fac350c 30-Nov-2023 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: don't malloc/free sockaddr memory on getpeername/getsockname

Just as was done for accept(2) in cfb1e92912b4, use the same approach
for two simpler syscalls that return socket addresses. Although
these two syscalls aren't performance critical, this change generalizes
some code between the 3 syscalls, trimming code size.

Following example of accept(2), provide VNET-aware and INVARIANT-checking
wrappers sopeeraddr() and sosockaddr() around protosw methods.

Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D42694


# cfb1e929 30-Nov-2023 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: don't malloc/free sockaddr memory on accept(2)

Let the accept functions provide stack memory for protocols to fill it in.
Generic code should provide sockaddr_storage, specialized code may provide
smaller structure.

While rewriting accept(2) make 'addrlen' a true in/out parameter, reporting
required length in case if provided length was insufficient. Our manual
page accept(2) and POSIX don't explicitly require that, but one can read
the text as they do. Linux also does that. Update tests accordingly.

Reviewed by: rscheff, tuexen, zlei, dchagin
Differential Revision: https://reviews.freebsd.org/D42635
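
The in/out 'addrlen' behaviour described in the commit above can be
sketched in userland C roughly as follows. This is an illustration only,
not the kernel code from the commit; copy_sockaddr_out() is a hypothetical
helper name invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a FreeBSD function): copy at most *addrlen
 * bytes of the address 'sa' (whose full length is 'salen') into 'dst',
 * then store the full required length in *addrlen so that the caller
 * can detect truncation by comparing it with the size it passed in.
 */
static void
copy_sockaddr_out(void *dst, socklen_t *addrlen,
    const struct sockaddr *sa, socklen_t salen)
{
	socklen_t n;

	assert(dst != NULL && addrlen != NULL && sa != NULL);
	n = (*addrlen < salen) ? *addrlen : salen;
	memcpy(dst, sa, n);
	*addrlen = salen;	/* report the required length */
}
```

A caller that passes a 4 byte buffer for a 128 byte address gets the first
4 bytes and sees *addrlen set to 128 on return.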


# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 2ff63af9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .h pattern

Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/


# fcb3f813 03-Oct-2022 Gleb Smirnoff <glebius@FreeBSD.org>

netinet*: remove PRC_ constants and streamline ICMP processing

In the original design of the network stack, the protocol control
input method pr_ctlinput was used to notify the protocols about two very
different kinds of events: internal system events and receipt of
ICMP messages from outside. These events were coded with PRC_ codes.
Today these methods are removed from the protosw(9) and are isolated
to IPv4 and IPv6 stacks and are called only from icmp*_input(). The
PRC_ codes now just create a shim layer between ICMP codes and errors
or actions taken by protocols.

- Change ipproto_ctlinput_t to pass just pointer to ICMP header. This
allows protocols to not deduce it from the internal IP header.
- Change ip6proto_ctlinput_t to pass just struct ip6ctlparam pointer.
It has all the information needed to the protocols. In the structure,
change ip6c_finaldst fields to sockaddr_in6. The reason is that
icmp6_input() already has this address wrapped in sockaddr, and the
protocols want this address as sockaddr.
- For UDP tunneling control input, as well as for IPSEC control input,
change the prototypes to accept a transparent union of either ICMP
header pointer or struct ip6ctlparam pointer.
- In icmp_input() and icmp6_input() do only validation of ICMP header and
count bad packets. The translation of ICMP codes to errors/actions is
done by protocols.
- Provide icmp_errmap() and icmp6_errmap() as substitute to inetctlerrmap,
inet6ctlerrmap arrays.
- In protocol ctlinput methods either trust what icmp_errmap() recommends,
or do our own logic based on the ICMP header.

Differential revision: https://reviews.freebsd.org/D36731


# f6696856 27-Sep-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

protocols: make socket buffers ioctl handler changeable

Allow to set custom per-protocol handlers for the socket buffers
ioctls by introducing pr_setsbopt callback with the default value
set to the currently-used sbsetopt().

Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D36746


# 24af7808 30-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: repair protocol selection logic in socket(2)

Pointy hat to: glebius
Fixes: 61f7427f02a307d28af674a12c45dd546e3898e4


# 61f7427f 30-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: cleanup protocols that existed merely to provide pr_input

Since 4.4BSD the protosw was used to implement socket types created
by socket(2) syscall and at the same time to demultiplex incoming IPv4
datagrams (later copied to IPv6). This story ended with 78b1fc05b20.

These entries (e.g. IPPROTO_ICMP) in inetsw were added to catch
packets in ip_input(), but they would also be returned by pffindproto()
if user says socket(AF_INET, SOCK_RAW, IPPROTO_ICMP). Thus, for raw
sockets to work correctly, all the entries were pointing at raw_usrreq
differentiating only in the value of pr_protocol.

With 78b1fc05b20 all these entries are no longer needed, as ip_protox
is independent of protosw. Any socket syscall requesting SOCK_RAW type
would end up with rip_protosw. And this protosw has its pr_protocol
set to 0, allowing to mark socket with any protocol.

For IPv6 raw socket the change required two small fixes:
o Validate user provided protocol value
o Always use protocol number stored in inp in rip6_attach, instead
of protosw value, which is now always 0.

Differential revision: https://reviews.freebsd.org/D36380


# e7d02be1 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: refactor protosw and domain static declaration and load

o Assert that every protosw has pr_attach. Now this structure is
only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw. This was suggested
in 1996 by wollman@ (see 7b187005d18ef), and later reiterated
in 2006 by rwatson@ (see 6fbb9cf860dcd).
o Make struct domain hold a variable sized array of protosw pointers.
For most protocols these pointers are initialized statically.
Those domains that may have loadable protocols have spacers. IPv4
and IPv6 have 8 spacers each (andre@ dff3237ee54ea).
o For inetsw and inet6sw leave a comment noting that many protosw
entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36232


# d9f6ac88 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: retire PRU_ flags and their char names

For many years only TCP debugging used them, but relatively recently
TCP DTrace probes also started to use them. Move their declarations
into tcp_debug.h, but start including tcp_debug.h unconditionally,
so that compilation with DTrace and without TCPDEBUG is possible.


# 81a34d37 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: retire pr_drain and use EVENTHANDLER(9) directly

The method was called for two different conditions: 1) the VM layer is
low on pages or 2) one of the UMA zones of the mbuf allocator is exhausted.
This change turns 2) into a new event handler, but all affected network
subsystems are modified to subscribe to both, so this change shall not
bring functional changes under different low memory situations.

There were three subsystems still using pr_drain: TCP, SCTP and frag6.
The latter had its protosw entry for the only reason to register its
pr_drain method.

Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36164


# 1922eb3e 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: retire pr_slowtimo and pr_fasttimo

They were useful many years ago, when the callwheel was not efficient,
and the kernel tried to have as little callout entries scheduled as
possible.

Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36163


# 78b1fc05 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: separate pr_input and pr_ctlinput out of protosw

The protosw KPI historically has implemented two quite orthogonal
things: protocols that implement a certain kind of socket, and
protocols that are IPv4/IPv6 protocols. These two things do not
have a one-to-one correspondence. The pr_input and pr_ctlinput methods
were utilized only in IP protocols. This strange duality required
IP protocols that don't have a socket to declare protosw, e.g.
carp(4). On the other hand developers of socket protocols thought
that they need to define pr_input/pr_ctlinput always, which led to
strange dead code, e.g. div_input() or sdp_ctlinput().

With this change pr_input and pr_ctlinput as part of protosw disappear
and IPv4/IPv6 get their private single level protocol switch table
ip_protox[] and ip6_protox[] respectively, pointing at array of
ipproto_input_t functions. The pr_ctlinput that was used for
control input coming from the network (ICMP, ICMPv6) is now represented
by ip_ctlprotox[] and ip6_ctlprotox[].

ipproto_register() becomes the only official way to register in the
table. Those protocols that were always static and unlikely anybody
is interested in making them loadable, are now registered by ip_init(),
ip6_init(). An IP protocol that considers itself unloadable shall
register itself within its own private SYSINIT().

Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36157
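
A single level protocol switch table of the kind described in the commit
above can be sketched in userland C as follows. This is a minimal
illustration under assumptions: every demo_* name is hypothetical, and the
handler type and error codes are not those of the real ipproto_input_t or
ipproto_register().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	DEMO_PROTO_MAX	256	/* IP protocol numbers are 8 bits wide */

/* Hypothetical handler type; the real ipproto_input_t is different. */
typedef int demo_input_t(const void *pkt, size_t len);

/* One slot per protocol number; NULL means "nothing registered". */
static demo_input_t *demo_protox[DEMO_PROTO_MAX];

/* Register a handler, refusing bad arguments and occupied slots. */
static int
demo_register(int proto, demo_input_t *fn)
{
	if (proto < 0 || proto >= DEMO_PROTO_MAX || fn == NULL)
		return (EINVAL);
	if (demo_protox[proto] != NULL)
		return (EEXIST);
	demo_protox[proto] = fn;
	return (0);
}

/* Demultiplex one packet by protocol number. */
static int
demo_dispatch(int proto, const void *pkt, size_t len)
{
	assert(pkt != NULL);
	if (proto < 0 || proto >= DEMO_PROTO_MAX ||
	    demo_protox[proto] == NULL)
		return (EPROTONOSUPPORT);
	return (demo_protox[proto](pkt, len));
}

/* Example handler: only counts the bytes it was handed. */
static size_t demo_bytes_seen;

static int
demo_count_input(const void *pkt, size_t len)
{
	(void)pkt;
	demo_bytes_seen += len;
	return (0);
}
```

The point of the design is that the table is indexed directly by protocol
number and knows nothing about sockets, so a protocol without a socket
type needs only a handler, not a whole protosw entry.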


# 489482e2 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

ipsec: isolate knowledge about protocols that are last header

Retire PR_LASTHDR protosw flag.

Reviewed by: ae
Differential revision: https://reviews.freebsd.org/D36155


# f277746e 12-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: change prototype for pr_control

For some reason protosw.h is used during world compilation and userland
is not aware of caddr_t, a relic from the first version of C. Broken
buildworld is good reason to get rid of yet another caddr_t in kernel.

Fixes: 886fc1e80490fb03e72e306774766cbb2c733ac6


# 948f31d7 12-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

netinet: do not broadcast PRC_REDIRECT_HOST on ICMP redirect

This is an expensive and useless call. It has been useless since Alexander
melifaro@ moved the forwarding table to nexthops with passive invalidation.
What happens now is that cached route in an inpcb would get invalidated
on next ip_output().

These were the last users of pfctlinput(), so garbage collect it.

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36156


# 886fc1e8 12-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: provide prototypes for all protocol switch methods

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36153


# 8c77967e 11-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: retire pr_output method

The only place to execute this method was raw_usend(). Only those
protocols that used raw socket were able to actually enter that method.
All pr_output assignments being deleted by this commit were dead code
for many years.

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36126


# b8103ca7 11-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

netinet: get interface event notifications directly via EVENTHANDLER(9)

The old mechanism of getting them via domains/protocols control input
is a relic from the previous century, when nothing like EVENTHANDLER(9)
existed yet. Retire PRC_IFDOWN/PRC_IFUP as netinet was the only one
to use them.

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36116


# a4fc4142 24-Jun-2022 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: enable protocol specific socket buffers

Split struct sockbuf into common shared fields and protocol specific
union, where protocols are free to implement whatever buffer they
want. Such protocols should mark themselves with PR_SOCKBUF and are
expected to initialize their buffers in their pr_attach and tear
them down in pr_detach.

Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35299


# 89128ff3 03-Jan-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protocols: init with standard SYSINIT(9) or VNET_SYSINIT

The historical BSD network stack loop that rolls over domains and
over protocols has no advantages over more modern SYSINIT(9).
While doing the sweep, split global and per-VNET initializers.

Getting rid of pr_init allows to achieve several things:
o Get rid of ifdef's that protect against double foo_init() when
both INET and INET6 are compiled in.
o Isolate initializers statically to the module they init.
o Makes code easier to understand and maintain.

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D33537


# e0a17c3f 15-Aug-2021 Mateusz Guzik <mjg@FreeBSD.org>

uipc: create dedicated lists for fast and slow timeout callbacks

This avoids having to walk all possible protocols only to check if they
have one (vast majority does not).

Original patch by kevans@.

Reviewed by: kevans
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 27457983 07-Apr-2021 Mark Johnston <markj@FreeBSD.org>

capsicum: Limit socket operations in capability mode

Capsicum did not prevent certain privileged networking operations,
specifically creation of raw sockets and network configuration ioctls.
However, these facilities can be used to circumvent some of the
restrictions that capability mode is supposed to enforce.

Add capability mode checks to disallow network configuration ioctls and
creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET
internet sockets.

Reviewed by: oshogbo
Discussed with: emaste
Reported by: manu
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29423


# a9839c4a 07-Aug-2020 Bjoern A. Zeeb <bz@FreeBSD.org>

IPV6_PKTINFO support for v4-mapped IPv6 sockets

When using v4-mapped IPv6 sockets with IPV6_PKTINFO we do not
respect the given v4-mapped src address on the IPv4 socket.
Implement the needed functionality. This allows single-socket
UDP applications (such as OpenVPN) to work better on FreeBSD.

Requested by: Gert Doering (gert greenie.net), pfsense
Tested by: Gert Doering (gert greenie.net)
Reviewed by: melifaro
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24135


# 237c1f93 15-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Remove pfctlinput2(). It came from KAME and had never ever been in use.


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber clause 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistence on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# 3f58662d 01-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

The pr_destroy field does not allow us to run the teardown code in a
specific order. VNET_SYSUNINITs however are doing exactly that.
Thus remove the VIMAGE conditional field from the domain(9) protosw
structure and replace it with VNET_SYSUNINITs.
This also allows us to change some order and to make the teardown functions
file local static.
Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use
internally.

Slightly reshuffle the SI_SUB_* fields in kernel.h and add new ones, e.g.,
for pfil consumers (firewalls), partially for this commit and for others
to come.

Reviewed by: gnn, tuexen (sctp), jhb (kernel.h)
Obtained from: projects/vnet
MFC after: 2 weeks
X-MFC: do not remove pr_destroy
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6652


# 8722384b 29-Apr-2016 John Baldwin <jhb@FreeBSD.org>

Introduce a new protocol hook pru_aio_queue.

This allows a protocol to claim individual AIO requests instead of using
the default socket AIO handling.

Sponsored by: Chelsio Communications


# 651e4e6a 30-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Merge from projects/sendfile: extend protocols API to support
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().

Sponsored by: Nginx, Inc.
Sponsored by: Netflix


# d1f79a3b 10-Nov-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Remove kernel handling of ICMP_SOURCEQUENCH.
It hasn't been used for a very long time.
Additionally, it was deprecated by RFC 6633.


# 73d76e77 14-Aug-2014 Kevin Lo <kevlo@FreeBSD.org>

Change pr_output's prototype to avoid the need for explicit casts.
This is a follow up to r269699.

Phabric: D564
Reviewed by: jhb


# 8f5a8818 07-Aug-2014 Kevin Lo <kevlo@FreeBSD.org>

Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have
only one protocol switch structure that is shared between ipv4 and ipv6.

Phabric: D476
Reviewed by: jhb


# 7493f24e 02-Mar-2013 Pawel Jakub Dawidek <pjd@FreeBSD.org>

- Implement two new system calls:

int bindat(int fd, int s, const struct sockaddr *addr, socklen_t addrlen);
int connectat(int fd, int s, const struct sockaddr *name, socklen_t namelen);

which allow to bind and connect respectively to a UNIX domain socket with a
path relative to the directory associated with the given file descriptor 'fd'.

- Add manual pages for the new syscalls.

- Make the new syscalls available for processes in capability mode sandbox.

- Add capability rights CAP_BINDAT and CAP_CONNECTAT that have to be present on
the directory descriptor for the syscalls to work.

- Update audit(4) to support those two new syscalls and to handle path
in sockaddr_un structure relative to the given directory descriptor.

- Update procstat(1) to recognize the new capability rights.

- Document the new capability rights in cap_rights_limit(2).

Sponsored by: The FreeBSD Foundation
Discussed with: rwatson, jilles, kib, des


# e5ed2130 18-Feb-2013 Pawel Jakub Dawidek <pjd@FreeBSD.org>

More white-space cleanups.

Reported by: zont (the first one)


# cbc9087c 17-Feb-2013 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Remove trailing spaces.


# b08d12d9 06-Dec-2012 Kevin Lo <kevlo@FreeBSD.org>

- according to POSIX, make socket(2) return EAFNOSUPPORT rather than
EPROTONOSUPPORT if the address family is not supported.
- introduce pffinddomain() to find a domain by family and use it as
appropriate.

Reviewed by: glebius
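
The error distinction made by the commit above can be sketched as a two
step lookup: find the domain first, then the protocol within it. This is
an illustrative userland sketch, not the kernel code; the demo_* names and
the tiny tables are hypothetical, and demo_finddomain() only mirrors the
idea of pffinddomain().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down domain: a family plus its protocol numbers. */
struct demo_domain {
	int		dom_family;
	const int	*dom_protos;
	size_t		dom_nprotos;
};

static const int demo_inet_protos[] = { 6, 17 };
static const struct demo_domain demo_domains[] = {
	{ 2, demo_inet_protos, 2 },
};

/* Find a domain by address family, or return NULL. */
static const struct demo_domain *
demo_finddomain(int family)
{
	size_t i;

	for (i = 0; i < sizeof(demo_domains) / sizeof(demo_domains[0]); i++)
		if (demo_domains[i].dom_family == family)
			return (&demo_domains[i]);
	return (NULL);
}

/* Return 0 if the family/protocol pair exists, else the matching errno. */
static int
demo_socket_check(int family, int proto)
{
	const struct demo_domain *dp;
	size_t i;

	dp = demo_finddomain(family);
	if (dp == NULL)
		return (EAFNOSUPPORT);		/* unknown address family */
	assert(dp->dom_protos != NULL);
	for (i = 0; i < dp->dom_nprotos; i++)
		if (dp->dom_protos[i] == proto)
			return (0);
	return (EPROTONOSUPPORT);		/* family known, protocol not */
}
```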


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# bc29160d 08-Jun-2009 Marko Zec <zec@FreeBSD.org>

Introduce an infrastructure for dismantling vnet instances.

Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.

While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementally fixed over the next weeks in
smaller incremental commits.

Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.

Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)


# cb6ff36d 06-Jan-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Further cleanup protosw.h:
- Remove unused typedefs to avoid confusion and ease in merging
ip6protosw with protosw.
- Correct a few comments.
- Remove most of a comment about usrreq. [1]
- Use tabs instead of spaces for consistency.

Submitted by: rwatson [1]
Reviewed by: rwatson
MFC after: 3 weeks


# a2ff111c 05-Jan-2009 Randall Stewart <rrs@FreeBSD.org>

Add the missing PRU_FLUSH and 'FLUSH' defines noticed
by rwatson. Opps..


# 15d657fd 04-Jan-2009 Robert Watson <rwatson@FreeBSD.org>

Remove now-unused pr_ousrreq from struct protosw. It may not have been
used since the last millennium.


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 26b74636 02-Oct-2008 Robert Watson <rwatson@FreeBSD.org>

Remove __Break_the_struct_layout_for_now field from struct pr_usrreqs,
added in FreeBSD 6.x to break the binary layout of the data structure
during a conversion to C99 sparse structure initialization. Probably
should have been removed before 7.0, but 8.0 will do.


# cf71e438 14-Apr-2008 Randall Stewart <rrs@FreeBSD.org>

Add pru_flush routine so a transport can
flush itself during Shutdown

MFC after: 1 week


# b0668f71 24-Jul-2006 Robert Watson <rwatson@FreeBSD.org>

soreceive_generic(), and sopoll_generic(). Add new functions sosend(),
soreceive(), and sopoll(), which are wrappers for pru_sosend,
pru_soreceive, and pru_sopoll, and are now used universally by socket
consumers rather than either directly invoking the old so*() functions
or directly invoking the protocol switch method (about an even split
prior to this commit).

This completes an architectural change that was begun in 1996 to permit
protocols to provide substitute implementations, as now used by UDP.
Consumers now uniformly invoke sosend(), soreceive(), and sopoll() to
perform these operations on sockets -- in particular, distributed file
systems and socket system calls.

Architectural head nod: sam, gnn, wollman


# 7455136b 14-Jul-2006 Robert Watson <rwatson@FreeBSD.org>

Define prototype for pru_close, which in the future will notify the
protocol of a socket close event distinct from a detach event, which
will (in a future commit) become aligned with pru_abort, which will
also be a notification of close prior to detach. Add prurequests event
for close, as well as patch up some existing missing ones.


# 5908c617 11-Jul-2006 Robert Watson <rwatson@FreeBSD.org>

Several protocol switch functions (pru_abort, pru_detach, pru_sosetlabel)
return void, so don't implement no-op versions of these functions.
Instead, consistently check if those switch pointers are NULL before
invoking them.
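
The convention described in the commit above (leave a void method NULL and
have the caller check it) might look like this in a userland sketch. The
demo_* types are hypothetical and much smaller than the real struct
pr_usrreqs and struct socket.

```c
#include <assert.h>
#include <stddef.h>

struct demo_socket;

/* Hypothetical, trimmed-down method table; not the real struct pr_usrreqs. */
struct demo_usrreqs {
	void	(*pru_abort)(struct demo_socket *);
	void	(*pru_detach)(struct demo_socket *);
};

struct demo_socket {
	const struct demo_usrreqs *so_methods;
	int	so_detached;
};

/* Example protocol method: records that it ran. */
static void
demo_mark_detached(struct demo_socket *so)
{
	so->so_detached = 1;
}

/* Socket-layer wrapper: a void method may be NULL, so check first. */
static void
demo_sodetach(struct demo_socket *so)
{
	assert(so != NULL && so->so_methods != NULL);
	if (so->so_methods->pru_detach != NULL)
		so->so_methods->pru_detach(so);
}
```

A protocol with nothing to do on detach simply leaves the pointer unset
instead of supplying an empty function.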


# 7ffadf35 16-Jun-2006 Robert Watson <rwatson@FreeBSD.org>

Remove extra blank line below comment.

MFC after: 1 week


# 6fbb9cf8 07-Jun-2006 Robert Watson <rwatson@FreeBSD.org>

Update comments in struct protosw to reflect changing times:

- Between 1996 and 1997, wollman eliminated pr_usrreq() and replaced it
with direct function pointers. Update comment to reflect these changes.

- In 2003, I added pru_sosetlabel(). Update comment to reflect this
change.

MFC after: 1 week


# bc725eaf 01-Apr-2006 Robert Watson <rwatson@FreeBSD.org>

Change protocol switch method pru_detach() so that it returns void
rather than an error. Detaches do not "fail", they either occur or
the protocol flags SS_PROTOREF to take ownership of the socket.

soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF. so_pcb is now entirely owned and
managed by the protocol code. Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.

Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.

In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.

netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit. In their current state they may leak
memory or panic.

MFC after: 3 months


# ac45e92f 01-Apr-2006 Robert Watson <rwatson@FreeBSD.org>

Change protocol switch pru_abort() API so that it returns void rather
than an int, as an error here is not meaningful. Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.

This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they are not fully
updated by this commit. This will be corrected shortly in followup
commits to these components.

MFC after: 3 months


# a5c0b80e 15-Mar-2006 Robert Watson <rwatson@FreeBSD.org>

Back out accidentally committed protosw.h:1.49. One of those days. It
will be recommitted with the remainder of the change in the next day or
two.

Submitted by: thompsa


# cf4f9f6d 15-Mar-2006 Robert Watson <rwatson@FreeBSD.org>

Correct spelling of 0x4000 in previous commit. This one line change from
a 42k patch seemed easier to retype than apply, but apparently not. :-)

Submitted by: pjd


# d374e81e 30-Oct-2005 Robert Watson <rwatson@FreeBSD.org>

Push the assignment of a new or updated so_qlimit from solisten()
following the protocol pru_listen() call to solisten_proto(), so
that it occurs under the socket lock acquisition that also sets
SO_ACCEPTCONN. This requires passing the new backlog parameter
to the protocol, which also allows the protocol to be aware of
changes in queue limit should it wish to do something about the
new queue limit. This continues a move towards the socket layer
acting as a library for the protocol.

Bump __FreeBSD_version due to a change in the in-kernel protocol
interface. This change has been tested with IPv4 and UNIX domain
sockets, but not other protocols.


# c948b4bc 11-Aug-2005 David E. O'Brien <obrien@FreeBSD.org>

Embellish comment.

Submitted by: Rostislav Krasny <rosti.bsd@gmail.com>


# 31793d59 10-Aug-2005 David E. O'Brien <obrien@FreeBSD.org>

Match IPv6 and use a static struct pr_usrreqs nousrreqs.


# 756d52a1 08-Nov-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Initialize struct pr_usrreqs in new/sparse style and fill in common
default elements in net_init_domain().

This makes it possible to grep these structures and see any bogosities.
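
The "sparse style" mentioned in the commit above is C99 designated
initialization. A minimal userland sketch of the pattern, with defaults
filled in afterwards, could look like this. The demo_* names are
hypothetical; the real default filling lives in net_init_domain() and
covers many more methods.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down method table. */
struct demo_usrreqs {
	int	(*pru_attach)(void);
	int	(*pru_connect)(void);
	int	(*pru_listen)(void);
};

static int
demo_notsupp(void)
{
	return (EOPNOTSUPP);
}

static int
demo_attach(void)
{
	return (0);
}

/* Sparse initializer: only the methods this protocol implements. */
static struct demo_usrreqs demo_reqs = {
	.pru_attach =	demo_attach,
};

/* Fill every method left NULL with the common "not supported" default. */
#define	DEMO_DEFAULT(pu, f)	do {					\
	if ((pu)->f == NULL)						\
		(pu)->f = demo_notsupp;					\
} while (0)

static void
demo_init_defaults(struct demo_usrreqs *pu)
{
	assert(pu != NULL);
	DEMO_DEFAULT(pu, pru_attach);
	DEMO_DEFAULT(pu, pru_connect);
	DEMO_DEFAULT(pu, pru_listen);
}
```

Because each assignment names its field, a grep for ".pru_listen" finds
every protocol that implements the method.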


# 312c75c3 19-Oct-2004 Andre Oppermann <andre@FreeBSD.org>

Support for dynamically loadable and unloadable protocols within existing protocol
families.

The protosw[] array of any particular protocol family ("domain") is of fixed size
defined at compile time. This made it impossible to dynamically add or remove any
protocols to or from it. We work around this by introducing so called SPACER's
which are embedded into the protosw[] array at compile time. The SPACER's have
a special protocol number (32767) to indicate the fact that they are SPACER's but
are otherwise NULL. Only as many protocols can be dynamically loaded as SPACER's
are provided in the protosw[] structure.

The pr_usrreqs structure is treated more special and contains pointers to dummy
functions only returning EOPNOTSUPP. This is needed because the use of those
functions pointers is usually not checked within the kernel because until now it
was assumed to be a valid function pointer. Instead of fixing all potential
callers we just return a proper error code.

Two new functions provide a clean API to register and unregister a protocol. The
register function expects a pointer to a valid and complete struct protosw including
a pointer to struct pr_usrreqs provided by the caller. Upon successful registration
the pr_init() function will be called to finish initialization of the protocol. The
unregister function restores the SPACER in place of the protocol again. It is the
responsibility of the caller to ensure proper closing of all sockets and freeing
of memory allocation by the unloading protocol.

sys/protosw.h

o Define generic PROTO_SPACER to be 32767
o Prototypes for all pru_*_notsupp() functions
o Prototypes for pf_proto_[un]register() functions

kern/uipc_domain.c

o Global struct pr_usrreqs nousrreqs containing valid pointers to the
pru_*_notsupp() functions
o New functions pf_proto_[un]register()

kern/uipc_socket2.c

o New function bodies for all pru_*_notsupp() functions
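
The SPACER mechanism described in the commit above can be sketched in
userland C as follows. This is an illustration under assumptions, not the
code of pf_proto_register(): the demo_* names, the three-field structure
and the choice of error codes are hypothetical; only the spacer value
32767 is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	DEMO_PROTO_SPACER	32767	/* value from the commit message */

/* Hypothetical, trimmed-down protocol switch entry. */
struct demo_protosw {
	int		pr_type;
	int		pr_protocol;
	const char	*pr_name;
};

/* Fixed-size table: two static protocols and two spacers. */
static struct demo_protosw demo_sw[] = {
	{ .pr_type = 1, .pr_protocol = 6, .pr_name = "tcp" },
	{ .pr_type = 2, .pr_protocol = 17, .pr_name = "udp" },
	{ .pr_protocol = DEMO_PROTO_SPACER },
	{ .pr_protocol = DEMO_PROTO_SPACER },
};
#define	DEMO_NSW	(sizeof(demo_sw) / sizeof(demo_sw[0]))

/* Copy a new protocol into the first free spacer slot. */
static int
demo_proto_register(const struct demo_protosw *npr)
{
	struct demo_protosw *fpr = NULL;
	size_t i;

	assert(npr != NULL);
	if (npr->pr_protocol == DEMO_PROTO_SPACER)
		return (EINVAL);
	for (i = 0; i < DEMO_NSW; i++) {
		if (demo_sw[i].pr_protocol == DEMO_PROTO_SPACER) {
			if (fpr == NULL)
				fpr = &demo_sw[i];
		} else if (demo_sw[i].pr_type == npr->pr_type &&
		    demo_sw[i].pr_protocol == npr->pr_protocol)
			return (EEXIST);	/* already present */
	}
	if (fpr == NULL)
		return (ENOMEM);		/* no spacer left */
	*fpr = *npr;
	return (0);
}

/* Put a spacer back in place of a registered protocol. */
static int
demo_proto_unregister(int type, int protocol)
{
	size_t i;

	if (protocol == DEMO_PROTO_SPACER)
		return (EINVAL);
	for (i = 0; i < DEMO_NSW; i++) {
		if (demo_sw[i].pr_type == type &&
		    demo_sw[i].pr_protocol == protocol) {
			demo_sw[i] = (struct demo_protosw){
				.pr_protocol = DEMO_PROTO_SPACER,
			};
			return (0);
		}
	}
	return (ENOENT);
}
```

With two spacers in the table, a third registration fails until one of the
loaded protocols is unregistered and its slot becomes a spacer again.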


# 82c6e879 06-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# a557af22 17-Nov-2003 Robert Watson <rwatson@FreeBSD.org>

Introduce a MAC label reference in 'struct inpcb', which caches
the MAC label referenced from 'struct socket' in the IPv4 and
IPv6-based protocols. This permits MAC labels to be checked during
network delivery operations without dereferencing inp->inp_socket
to get to so->so_label, which will eventually avoid our having to
grab the socket lock during delivery at the network layer.

This change introduces 'struct inpcb' as a labeled object to the
MAC Framework, along with the normal circus of entry points:
initialization, creation from socket, destruction, as well as a
delivery access control check.

For most policies, the inpcb label will simply be a cache of the
socket label, so a new protocol switch method is introduced,
pr_sosetlabel() to notify protocols that the socket layer label
has been updated so that the cache can be updated while holding
appropriate locks. Most protocols implement this using
pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use
the worker function in_pcbsosetlabel(), which calls into the
MAC Framework to perform a cache update.
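
The caching scheme described above can be sketched roughly as follows. All
demo_* names are hypothetical, the label is reduced to a single integer, and
the "packet level must not exceed the label level" rule is an invented
placeholder policy; the sketch only shows the socket layer notifying the
protocol so the per-PCB cache can be refreshed, and the delivery check then
reading the cache without touching the socket.

```c
#include <assert.h>
#include <stddef.h>

struct demo_label { int level; };	/* stand-in for a MAC label */

struct demo_socket;

struct demo_inpcb {
	struct demo_socket	*inp_socket;
	struct demo_label	 inp_label;	/* cached copy of so_label */
};

struct demo_socket {
	struct demo_label	 so_label;
	void			*so_pcb;
	void			(*pr_sosetlabel)(struct demo_socket *);
};

/* Protocols without a cached label have nothing to update. */
static void
demo_sosetlabel_null(struct demo_socket *so)
{
	(void)so;
}

/* inpcb-based protocols copy the socket label into the PCB. */
static void
demo_pcbsosetlabel(struct demo_socket *so)
{
	struct demo_inpcb *inp = so->so_pcb;

	assert(inp != NULL);
	inp->inp_label = so->so_label;
}

/* Socket layer: change the label, then notify the protocol. */
static void
demo_relabel(struct demo_socket *so, int level)
{
	so->so_label.level = level;
	so->pr_sosetlabel(so);
}

/* Delivery path: consults only the cache, never inp->inp_socket. */
static int
demo_deliver_check(const struct demo_inpcb *inp, int packet_level)
{
	return (packet_level <= inp->inp_label.level);
}
```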

Biba, LOMAC, and MLS implement these entry points, as do the stub
policy and the test policy.

Reviewed by: sam, bms
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 134ea224 23-Sep-2003 Sam Leffler <sam@FreeBSD.org>

o update PFIL_HOOKS support to current API used by netbsd
o revamp IPv4+IPv6+bridge usage to match API changes
o remove pfil_head instances from protosw entries (no longer used)
o add locking
o bump FreeBSD version for 3rd party modules

Heavy lifting by: "Max Laier" <max@love2party.net>
Supported by: FreeBSD Foundation
Obtained from: NetBSD (bits of pfil.h and pfil.c)


# 9d5abbdd 01-Jan-2003 Jens Schweikhardt <schweikh@FreeBSD.org>

Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.


# e88894d3 16-Aug-2002 Alfred Perlstein <alfred@FreeBSD.org>

make the strings for tcptimers, tanames and prurequests const to silence
warnings.


# c58eb46e 23-Mar-2002 Bruce Evans <bde@FreeBSD.org>

Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses. Switch to KNF
formatting and/or rewrap the whole prototype in some cases.


# 789f12fe 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 969d6001 02-Sep-2001 Julian Elischer <julian@FreeBSD.org>

add another prototype and a couple of stopgaps for the in_protosw variant.


# 2b6a0c4f 10-Aug-2001 Julian Elischer <julian@FreeBSD.org>

Make the protoswitch definitions checkable in the same way that
cdevsw entries have been for a long time.
Discover that we now have two versions of the same structure.
I will shoot one of them shortly when I figure out why someone thinks
they need it. (And I can prove they don't)
(netinet/ipprotosw.h should GO AWAY)


# 33841545 10-Jun-2001 Hajimu UMEMOTO <ume@FreeBSD.org>

Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz, and some
critical problems found after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
- The definitions of SADB_* in sys/net/pfkeyv2.h are still different
from the RFC2407/IANA assignment because of a binary compatibility
issue. It should be fixed under 5-CURRENT.
- The ip6po_m member of struct ip6_pktopts is no longer used. But, it
is still there because of a binary compatibility issue. It should
be removed under 5-CURRENT.

Reviewed by: itojun
Obtained from: KAME
MFC after: 3 weeks


# 90fcbbd6 18-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unneeded loop increment in src/sys/netinet/in_pcb.c:in_pcbnotify

Add new PRC_UNREACH_ADMIN_PROHIB in sys/sys/protosw.h

Remove condition on TCP in src/sys/netinet/ip_icmp.c:icmp_input

In src/sys/netinet/ip_icmp.c:icmp_input set code = PRC_UNREACH_ADMIN_PROHIB
or PRC_UNREACH_HOST for all unreachables except ICMP_UNREACH_NEEDFRAG

Rename sysctl icmp_admin_prohib_like_rst to icmp_unreach_like_rst
to reflect the fact that we also react to ICMP unreachables that
are not administratively prohibited. Also update the comments to
reflect this.

In sys/netinet/tcp_subr.c:tcp_ctlinput add code to treat
PRC_UNREACH_ADMIN_PROHIB and PRC_UNREACH_HOST differently.
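
The classification described above might look roughly like this. It is a
sketch, not the icmp_input() code: the demo_* enums are hypothetical, the
numeric ICMP codes are the standard destination-unreachable codes, and two
details are assumptions not stated in this log: that codes 9, 10 and 13 are the
ones treated as administratively prohibited, and that "needs fragmentation" is
reported as a message-size event.

```c
#include <assert.h>

/* Standard ICMP destination-unreachable codes (subset). */
enum demo_icmp_unreach {
	DEMO_UNREACH_NET = 0,
	DEMO_UNREACH_HOST = 1,
	DEMO_UNREACH_PROTOCOL = 2,
	DEMO_UNREACH_PORT = 3,
	DEMO_UNREACH_NEEDFRAG = 4,
	DEMO_UNREACH_NET_PROHIB = 9,
	DEMO_UNREACH_HOST_PROHIB = 10,
	DEMO_UNREACH_FILTER_PROHIB = 13
};

/* Hypothetical stand-ins for the PRC_* control-input codes. */
enum demo_prc {
	DEMO_PRC_MSGSIZE,
	DEMO_PRC_UNREACH_HOST,
	DEMO_PRC_UNREACH_ADMIN_PROHIB
};

static enum demo_prc
demo_map_unreach(int icmp_code)
{
	assert(icmp_code >= 0);
	switch (icmp_code) {
	case DEMO_UNREACH_NEEDFRAG:
		/* Assumption: handled separately as a size problem. */
		return (DEMO_PRC_MSGSIZE);
	case DEMO_UNREACH_NET_PROHIB:
	case DEMO_UNREACH_HOST_PROHIB:
	case DEMO_UNREACH_FILTER_PROHIB:
		/* Assumption: these are the "admin prohibited" codes. */
		return (DEMO_PRC_UNREACH_ADMIN_PROHIB);
	default:
		/* Every other unreachable collapses to "host". */
		return (DEMO_PRC_UNREACH_HOST);
	}
}
```

A consumer such as a TCP control-input handler can then branch on the two
unreachable results instead of on raw ICMP codes.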

PR: 23986
Submitted by: Jesper Skriver <jesper@skriver.dk>


# 6c292daf 16-Aug-2000 Darren Reed <darrenr@FreeBSD.org>

backout previous change for now


# 48f0e051 16-Aug-2000 Darren Reed <darrenr@FreeBSD.org>

add extern for inetsw


# c4ac87ea 31-Jul-2000 Darren Reed <darrenr@FreeBSD.org>

activate pfil_hooks and convert ipfilter to use it


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistent with the other
BSDs, which made this change quite some time ago. More commits to come.
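
A small sketch of why the guard name matters. An identifier that starts with an
underscore followed by an uppercase letter is reserved for the implementation
in C, so a header may safely test _KERNEL, whereas KERNEL belongs to the
application. The DEMO_* macro and the function are hypothetical; the "header"
and the "application" are shown in one file for brevity.

```c
#include <assert.h>

/* Application code: free to define KERNEL for its own purposes. */
#define KERNEL 1

/* Public header part: kernel-only declarations are guarded by _KERNEL. */
#ifdef _KERNEL
#define DEMO_KERNEL_DECLS_VISIBLE 1
#else
#define DEMO_KERNEL_DECLS_VISIBLE 0
#endif

/* Reports whether the kernel-only section leaked into this build. */
static int
demo_kernel_decls_visible(void)
{
	assert(KERNEL == 1);	/* the application's macro is untouched */
	return (DEMO_KERNEL_DECLS_VISIBLE);
}
```

In a normal userland build the function returns 0 even though the application
defined KERNEL; with the old #ifdef KERNEL guard the kernel-only section would
have been exposed.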


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# b0acefa8 20-Jan-1999 Bill Fenner <fenner@FreeBSD.org>

Add a flag, passed to pru_send routines, PRUS_MORETOCOME. This
flag means that there is more data to be put into the socket buffer.
Use it in TCP to reduce the interaction between mbuf sizes and the
Nagle algorithm.
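
A toy model of the flag's effect, under loud assumptions: this is neither the
Nagle algorithm nor the TCP output path, and DEMO_MORETOCOME, DEMO_MSS and
struct demo_conn are hypothetical. It only illustrates the stated intent: when
the caller promises more data, a short trailing segment is held back so that
the way the data was chopped into chunks matters less.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MORETOCOME	0x1	/* stand-in for PRUS_MORETOCOME */
#define DEMO_MSS	1000	/* hypothetical full segment size */

struct demo_conn {
	size_t	queued;		/* bytes waiting in the send buffer */
	int	segments;	/* segments emitted so far */
	int	short_segments;	/* of those, how many were undersized */
};

static void
demo_send(struct demo_conn *c, size_t len, int flags)
{
	assert(c != NULL);
	c->queued += len;

	/* Full-sized segments can always go out. */
	while (c->queued >= DEMO_MSS) {
		c->queued -= DEMO_MSS;
		c->segments++;
	}

	/* A short tail is held back while more data is promised. */
	if (c->queued > 0 && (flags & DEMO_MORETOCOME) == 0) {
		c->queued = 0;
		c->segments++;
		c->short_segments++;
	}
}
```

Sending 3000 bytes as two 1500-byte chunks produces three full segments when
the first chunk carries the flag, and four segments (two of them short) when
neither does.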

Based on: "Justin C. Walker" <justin@apple.com>'s description of Apple's
fix for this problem.


# cfe8b629 22-Aug-1998 Garrett Wollman <wollman@FreeBSD.org>

Yow! Completely change the way socket options are handled, eliminating
another specialized mbuf type in the process. Also clean up some
of the cruft surrounding IPFW, multicast routing, RSVP, and other
ill-explored corners.


# ecbb00a2 07-Jun-1998 Doug Rabson <dfr@FreeBSD.org>

This commit fixes various 64bit portability problems required for
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
in line with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.


# 8bcc577e 01-Feb-1998 Bruce Evans <bde@FreeBSD.org>

Forward declare more structs that are used in prototypes here - don't
depend on <sys/types.h> forward declaring common ones.
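
A short illustration of the idiom, with hypothetical demo_* names. In C, a
struct tag that first appears inside a prototype's parameter list has prototype
scope only, so it does not refer to the structure defined later; declaring the
tags at file scope first avoids that.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declarations: the tags now have file scope. */
struct demo_proc;
struct demo_socket;

/* The prototype can mention the incomplete types by pointer. */
int	demo_attach(struct demo_socket *so, struct demo_proc *p);

/* Full definitions may come later, or live in other headers. */
struct demo_proc { int pid; };
struct demo_socket { int owner; };

int
demo_attach(struct demo_socket *so, struct demo_proc *p)
{
	assert(so != NULL && p != NULL);
	so->owner = p->pid;	/* record which process attached */
	return (0);
}
```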


# cb3453e8 21-Dec-1997 Bruce Evans <bde@FreeBSD.org>

Moved some declarations from <sys/socket.h> to the correct places, and
fixed everything that depended on them being misplaced.


# 3a74593f 13-Sep-1997 Peter Wemm <peter@FreeBSD.org>

Update interfaces for poll()


# 57bf258e 16-Aug-1997 Garrett Wollman <wollman@FreeBSD.org>

Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs. (Socket buffers are the one exception.) A number
of kernel APIs needed to get fixed in order to make this happen. Also,
fix three protocol families which kept PCBs in mbufs to now malloc them
instead. Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.


# 33f8d1fe 27-May-1997 Philippe Charnier <charnier@FreeBSD.org>

prevent `struct proc' from being declared inside parameter list.
PR: kern/3548


# 9f907986 24-May-1997 Peter Wemm <peter@FreeBSD.org>

Attempt to convert the ip_divert code to use the new-style protocol request
switch. I needed 'LINT' to compile for other reasons so I kinda got the
blood on my hands. Note: I don't know how to test this, I don't know if
it works correctly.


# a29f300e 27-Apr-1997 Garrett Wollman <wollman@FreeBSD.org>

The long-awaited mega-massive-network-code- cleanup. Part I.

This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so that they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call. Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken. I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.


# 7b187005 14-Mar-1997 Garrett Wollman <wollman@FreeBSD.org>

Add protoswitch entries for shortcut send/receive. Correct
a few misleading comments, and move all the struct tag
forward declarations to be in one place.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# c20ac029 18-Feb-1997 Garrett Wollman <wollman@FreeBSD.org>

Declare the new generic EOPNOTSUPP routines.


# 176395b2 12-Feb-1997 Garrett Wollman <wollman@FreeBSD.org>

Implement PRC_IFUP a la PRC_IFDOWN so that protocols know when an interface
has come back up (and can reverse actions taken as a result of downing).


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 2c37256e 11-Jul-1996 Garrett Wollman <wollman@FreeBSD.org>

Modify the kernel to use the new pr_usrreqs interface rather than the old
pr_usrreq mechanism which was poorly designed and error-prone. This
commit renames pr_usrreq to pr_ousrreq so that old code which depended on it
would break in an obvious manner. This commit also implements the new
interface for TCP, although the old function is left as an example
(#ifdef'ed out). This commit ALSO fixes a longstanding bug in the
TCP timer processing (introduced by davidg on 1995/04/12) which caused
timer processing on a TCB to always stop after a single timer had
expired (because it misinterpreted the return value from tcp_usrreq()
to indicate that the TCB had been deleted). Finally, some code
related to polling has been deleted from if.c because it is not
relevant to -current and doesn't look at all like my current code.


# 1e4ad9ce 09-Jul-1996 Garrett Wollman <wollman@FreeBSD.org>

This is a proposal-in-code for a substantial modification of the way
the high kernel calls into a protocol stack to perform requests on the
user's behalf. We replace the pr_usrreq() entry in struct protosw with a
pointer to a structure containing pointers to functions which implement
the various requests; each function is declared with the correct type and
number of arguments. (This is unlike the current scheme in which a quarter
of the requests take arguments of type other than (struct mbuf *) and the
difference is papered over with casts.) There are a few benefits to this
new scheme:

1) Arguments are passed with their correct types, and null-pointer dummies
are no longer necessary.

2) There should be slightly better caching effects from eliminating
the proximity to extraneous code and the switch in pr_usrreq().

3) It becomes much easier to change the types of the arguments to something
other than `struct mbuf *' (e.g., pushing the work of sosend() into
the protocol as advocated by Van Jacobson).
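
The contrast between the two calling conventions can be sketched like this. All
demo_* names and the two-request set are hypothetical and far smaller than the
real interface; the sketch only shows a single switch-based entry point that
smuggles an address through an mbuf pointer, next to a table of functions that
each take correctly typed arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_mbuf { int len; };
struct demo_sockaddr { int port; };
struct demo_socket { int bound_port; int sent; };

/* Old style: one entry point, a request code, and casts. */
enum { DEMO_PRU_BIND, DEMO_PRU_SEND };

static int
demo_old_usrreq(struct demo_socket *so, int req, struct demo_mbuf *m,
    struct demo_mbuf *nam)
{
	assert(so != NULL);
	switch (req) {
	case DEMO_PRU_BIND:
		/* The address arrives disguised as an mbuf pointer. */
		so->bound_port = ((struct demo_sockaddr *)(void *)nam)->port;
		return (0);
	case DEMO_PRU_SEND:
		so->sent += m->len;
		return (0);
	default:
		return (EOPNOTSUPP);
	}
}

/* New style: one correctly typed function per request. */
struct demo_usrreqs {
	int	(*pru_bind)(struct demo_socket *, const struct demo_sockaddr *);
	int	(*pru_send)(struct demo_socket *, const struct demo_mbuf *);
};

static int
demo_bind(struct demo_socket *so, const struct demo_sockaddr *sa)
{
	so->bound_port = sa->port;
	return (0);
}

static int
demo_send(struct demo_socket *so, const struct demo_mbuf *m)
{
	so->sent += m->len;
	return (0);
}

static const struct demo_usrreqs demo_reqs = {
	.pru_bind = demo_bind,
	.pru_send = demo_send,
};
```

With the table, passing the wrong argument type is a compile-time diagnostic
rather than something hidden behind a cast, and no null placeholder arguments
are needed.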

There is one principal drawback: existing protocol stacks need to
be modified. This is alleviated by compatibility code in
uipc_socket2.c and uipc_domain.c which emulates the new interface
in terms of the old and vice versa.

This idea is not original to me. I read about what Jacobson did
in one of his papers and have tried to implement the first steps
towards something like that here. Much work remains to be done.


# b62d102c 15-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Uniformized pr_ctlinput protosw functions. The third arg is now `void
*' instead of caddr_t and it isn't optional (it never was). Most of the
netipx (and netns) pr_ctlinput functions abuse the second arg instead of
using the third arg but fixing this is beyond the scope of this round
of changes.


# 512fef80 20-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Completed function declarations and/or added prototypes.


# 9989d2c4 19-Nov-1995 Poul-Henning Kamp <phk@FreeBSD.org>

Add a couple of the easy prototypes.


# 24e16f2e 06-Feb-1995 Garrett Wollman <wollman@FreeBSD.org>

Merge in the socket-level support for Transaction TCP from the OLAH_TTCP
branch.

Submitted by: Andras Olah <olah@cs.utwente.nl>


# b4a8d575 08-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Added prototypes here and there. Moved pfctlinput into socket.h.


# 44df8ef6 07-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Prototypes of today. Brought to you by a 28 minute transit time on BART :-)

(For the SF-unaware: I ride the BART (The Bay-area subway) for half an hour
each way to work. I use the time to shut up gcc -Wall on my handbook).


# af9da405 20-Aug-1994 Paul Richards <paul@FreeBSD.org>

Made them all idempotent.


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources