History log of /freebsd-current/sys/netinet/sctp_pcb.c
Revision Date Author Comments
# 42aeb8d4 10-May-2024 Michael Tuexen <tuexen@FreeBSD.org>

sctp: store vtag expire time as time_t
Reported by: Coverity Scan
CID: 1492525
CID: 1493239
MFC after: 3 days


# 9d8a3718 10-May-2024 Michael Tuexen <tuexen@FreeBSD.org>

sctp: store cookie secret change time as time_t
Reported by: Coverity Scan
CID: 1492349
CID: 1493281
MFC after: 3 days


# f79a8585 30-Jan-2024 Gleb Smirnoff <glebius@FreeBSD.org>

sockets: garbage collect SS_ISCONFIRMING

Fixes: 8df32b19dee92b5eaa4b488ae78dca6accfcb38e


# dac91eb7 08-Oct-2023 Zhenlei Huang <zlei@FreeBSD.org>

sctp: Various fixes for loader tunables

The following sysctl variables are actually loader tunables. Add sysctl
flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly.

1. net.inet.sctp.tcbhashsize
2. net.inet.sctp.pcbhashsize
3. net.inet.sctp.chunkscale

The loader tunables 'net.inet.sctp.tcbhashsize' and 'net.inet.sctp.chunkscale'
are only used during vnet initialization, thus it makes no sense to make them
writable tunables.

Validate the values of loader tunables on vnet initialization, and reset them
to their defaults if invalid to prevent potential kernel panics.

Reviewed by: tuexen, #transport, #network
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42007


# 4f14d4b6 18-Aug-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: cleanup handling of graceful shutdown of the peer

Don't handle a graceful shutdown of the peer as an implicit signal
that all partial messages are complete. First, this is not implemented
correctly and second this should not be done by the peer. It is more
appropriate to handle this as a protocol violation.
Remove the incorrect code and leave detecting the protocol violation
and its handling in a followup commit.

MFC after: 1 week


# c3179e66 18-Aug-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: cleanup cdefs.h include


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 749a7fb5 13-Aug-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: cleanup

Do not put a variable in the stcb for passing it to a function.
Just use a parameter of the function. No functional change intended.

MFC after: 1 week


# 6cb8b3b5 13-Aug-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: use consistent names for locking macros

While there, add also a macro for an assert. Will be used shortly.
No functional change intended.

MFC after: 1 week


# c6207881 28-Jul-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: keep sb_acc and sb_ccc in sync

PR: 260116
MFC after: 1 week


# 52640d61 22-Jul-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: update zero checksum support

Implement support for the error detection method identifier.
MFC after: 2 weeks


# 04ede367 03-May-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: only start shutdown guard timer when sending SHUTDOWN chunk

The intention is to protect against a malicious peer not following the
shutdown procedures.

MFC after: 1 week


# 4a2b92d9 09-Mar-2023 Michael Tuexen <tuexen@FreeBSD.org>

sctp: initial implementation of draft-tuexen-tsvwg-sctp-zero-checksum


# f83db644 06-Nov-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: minor changes due to upstreaming of Gleb's recent changes


# 81a34d37 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: retire pr_drain and use EVENTHANDLER(9) directly

The method was called for two different conditions: 1) the VM layer is
low on pages or 2) one of the UMA zones of the mbuf allocator is exhausted.
This change moves 2) into a new event handler, but all affected network
subsystems are modified to subscribe to both, so this change shall not
bring functional changes under different low memory situations.

There were three subsystems still using pr_drain: TCP, SCTP and frag6.
The latter had its protosw entry for the only reason to register its
pr_drain method.

Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36164


# 24e13a49 26-Jul-2022 Dimitry Andric <dim@FreeBSD.org>

Adjust sctp_drain() definition to avoid clang 15 warning

With clang 15, the following -Werror warning is produced:

sys/netinet/sctp_pcb.c:6946:11: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
sctp_drain()
^
void

This is because sctp_drain() is declared with a (void) argument list,
but defined with an empty argument list. Make the definition match the
declaration.

MFC after: 3 days
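
The mismatch described above can be illustrated with a minimal sketch. The names demo_drain and demo_drain_calls are hypothetical and are not taken from sctp_pcb.c; the point is only that a definition written with an empty argument list "()" lacks a prototype, while "(void)" matches a prototyped declaration.

```c
/* Declaration with a prototype: the function takes no arguments. */
void demo_drain(void);

/* Hypothetical counter so the effect of a call is observable. */
int demo_drain_calls;

/*
 * Definition that matches the declaration. Writing "demo_drain()" here
 * instead would define the function without a prototype, which is the
 * pattern -Wstrict-prototypes complains about.
 */
void
demo_drain(void)
{
	demo_drain_calls++;
}
```

Both spellings behave the same at run time in this sketch; the difference is only what the compiler can check at call sites.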


# a5c2009d 03-Jun-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: improve handling of sctp inpcb flags

Use an atomic operation when the inp is not write locked.

Reported by: syzbot+bf27083e9a3f8fde8b4d@syzkaller.appspotmail.com
MFC after: 3 days
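
A minimal sketch of the idea, assuming C11 <stdatomic.h> rather than the kernel's own atomic(9) primitives. The structure, flag values, and helper names below are hypothetical and only illustrate updating a flag word atomically when no write lock protects it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bits, not the real SCTP PCB flags. */
#define DEMO_FLAG_SOCKET_GONE	0x1u
#define DEMO_FLAG_CONNECTED	0x2u

/* Hypothetical stand-in for a PCB with a flag word. */
struct demo_pcb {
	_Atomic uint32_t flags;
};

/* Set bits with a single atomic read-modify-write, so no update is lost. */
static inline void
demo_flag_set(struct demo_pcb *p, uint32_t f)
{
	atomic_fetch_or(&p->flags, f);
}

/* Clear bits atomically. */
static inline void
demo_flag_clear(struct demo_pcb *p, uint32_t f)
{
	atomic_fetch_and(&p->flags, ~f);
}

/* Test whether any of the given bits is set. */
static inline int
demo_flag_is_set(struct demo_pcb *p, uint32_t f)
{
	return ((atomic_load(&p->flags) & f) != 0);
}
```

A plain "flags |= f" is a separate load and store, so two unlocked writers can overwrite each other's bits; the atomic variant avoids that without taking the lock.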


# edc5b6ea 13-May-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: use sb_avail() when accessing sb_acc for reading

This is a cleanup to simplify a patch for PR 260116.

PR: 260116
MFC after: 3 days


# 9b2a35b3 13-May-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: improve consistency

No functional change intended.

MFC after: 3 days


# eeba2221 15-Apr-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: don't keep a pointer to a freed stcb around

Reported by: syzbot+b9ef06efdae7cb9ee414@syzkaller.appspotmail.com
Reported by: syzbot+b1e4793e0e6b25b0d510@syzkaller.appspotmail.com
MFC after: 3 days


# 3c3d77bd 07-Apr-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: use variable names in a consistent way

No functional change intended.

MFC after: 3 days


# 5ac91821 28-Mar-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: get rid of stcb send lock

Just use the stcb lock instead to simplify locking.

Reported by: syzbot+d00b202063150f85b110@syzkaller.appspotmail.com
Reported by: syzbot+87f268a0a6d2d6383306@syzkaller.appspotmail.com
MFC after: 3 days


# 274a0e4a 18-Feb-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: cleanup, no functional change intended.

MFC after: 3 days


# 3ca204c9 17-Feb-2022 Michael Tuexen <tuexen@FreeBSD.org>

sctp: remove unused parameter

MFC after: 3 days


# afad340a 03-Jan-2022 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: garbage collect INP_LOCK_INIT(), used only once in sctp

Reviewed by: tuexen
Differential revision: https://reviews.freebsd.org/D33543


# 2de2ae33 30-Dec-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: improve sctp_pathmtu_adjustment()

Allow the resending of DATA chunks to be controlled by the caller,
which allows retiring sctp_mtu_size_reset() in a separate commit.
Also improve the computation of the overhead and use 32-bit integers
consistently.
Thanks to Timo Voelker for pointing me to the code.

MFC after: 3 days


# 989453da 27-Dec-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: cleanup the SCTP_MAXSEG socket option.

This patch makes the handling of the SCTP_MAXSEG socket option
compliant with RFC 6458 (SCTP socket API) and fixes an issue
found by syzkaller.

Reported by: syzbot+a2791b89ab99121e3333@syzkaller.appspotmail.com
MFC after: 3 days


# 54912d47 04-Dec-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: unbreak NOINET6 builds.

PR: 260119
Reported by: kostikbel
MFC after: 1 week


# d79676fb 03-Dec-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: inherit IP level socket options from listening socket

Ensure that TTL and TOS values set on a listener get inherited
by the accepted sockets.

PR: 260119
MFC after: 1 week


# 3c1ba6f3 25-Nov-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: improve consistency, no functional change intended


# 762ae0ec 21-Sep-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: Simplify stream scheduler usage

Callers are getting the stcb send lock, so just KASSERT that.
No need to signal this when calling stream scheduler functions.
No functional change intended.

MFC after: 1 week


# 4181fa2a 12-Sep-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: minor cleanup, no functional change

MFC after: 1 week


# 2d5c48ec 11-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Tighten up locking around sctp_aloc_assoc()

All callers of sctp_aloc_assoc() mark the PCB as connected after a
successful call (for one-to-one-style sockets). In all cases this is
done without the PCB lock, so the PCB's flags can be corrupted. We also
do not atomically check whether a one-to-one-style socket is a listening
socket, which violates various assumptions in solisten_proto().

We need to hold the PCB lock across all of sctp_aloc_assoc() to fix
this. In order to do that without introducing lock order reversals, we
have to hold the global info lock as well.

So:
- Convert sctp_aloc_assoc() so that the inp and info locks are
consistently held. It returns with the association lock held, as
before.
- Fix an apparent bug where we failed to remove an association from a
global hash if sctp_add_remote_addr() fails.
- sctp_select_a_tag() is called when initializing an association, and it
acquires the global info lock. To avoid lock recursion, push locking
into its callers.
- Introduce sctp_aloc_assoc_connected(), which atomically checks for a
listening socket and sets SCTP_PCB_FLAGS_CONNECTED.

There is still one edge case in sctp_process_cookie_new() where we do
not update PCB/socket state correctly.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31908


# 3ea2cdd4 09-Sep-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: add explicit cast, no functional change intended

MFC after: 3 days


# 4250aa11 09-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Clear assoc socket references when freeing a PCB

This restores behaviour present in the first import of SCTP. Commit
ceaad40ae729dea2c5d8ffcfdd45bb96fb8969d2 commented this out and commit
62fb761ff28bb184a2543e539dd689fefd5d3246 removed it. However, once
sctp_inpcb_free() returns, the socket reference is gone no matter what,
so we need to clear it.

Reported by: syzbot+30dd69297fcbc5f0e10a@syzkaller.appspotmail.com
Reported by: syzbot+7b2f9d4bcac1c9569291@syzkaller.appspotmail.com
Reported by: syzbot+ed3e651f7d040af480a6@syzkaller.appspotmail.com
Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31886


# 58a7bf12 08-Sep-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: cleanup timewait handling for vtags

MFC after: 1 week


# ee473117 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Fix a lock order reversal in sctp_swap_inpcb_for_listen()

When port reuse is enabled in a one-to-one-style socket, sctp_listen()
may call sctp_swap_inpcb_for_listen() to move the PCB out of the "TCP
pool". In so doing it will drop the PCB lock, yielding an LOR since we
now hold several socket locks. Reorder sctp_listen() so that it
performs this operation before beginning the conversion to a listening
socket. Also modify sctp_swap_inpcb_for_listen() to return with PCB
write-locked, since that's what sctp_listen() expects now.

Reviewed by: tuexen
Fixes: bd4a39cc93d9 ("socket: Properly interlock when transitioning to a listening socket")
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31879


# 6e3af632 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Fix lock recursion in sctp_swap_inpcb_for_listen()

After commit bd4a39cc93d9 we now hold the global inp info lock across
the call to sctp_swap_inpcb_for_listen(), which attempts to acquire it
again. Since sctp_swap_inpcb_for_listen()'s sole caller is
sctp_listen(), we can simply change it to not try to acquire the lock.

Reported by: syzbot+a76b19ea2f8e1190c451@syzkaller.appspotmail.com
Reported by: syzbot+a1b6cef257ad145b7187@syzkaller.appspotmail.com
Reviewed by: tuexen
Fixes: bd4a39cc93d9 ("socket: Properly interlock when transitioning to a listening socket")
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31878


# aab1d593 08-Sep-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: minor cleanups, no functional change intended


# c17b531b 07-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Fix races around sctp_inpcb_free()

sctp_close() and sctp_abort() disassociate the PCB from its socket.
As a part of this, they attempt to free the PCB, which may end up
lingering. Fix some bugs in this area:

- For some reason, sctp_close() and sctp_abort() set
SCTP_PCB_FLAGS_SOCKET_GONE using an atomic compare-and-set without the
PCB lock held. This is racy since sctp_flags is normally updated
without atomics, using the PCB lock to synchronize. So, the update
can be lost, which can cause all sorts of races with other SCTP
components which look for the _GONE flag. Fix the problem simply by
acquiring the PCB lock in order to set the flag. Note that we have to
drop and re-acquire the lock again in sctp_inpcb_free(), but I don't
see a good way around that for now. If it's a real problem, the _GONE
flag could be split out of sctp_flags and into a dedicated sctp_inpcb
field.
- In sctp_inpcb_free(), load sctp_socket after acquiring the PCB lock,
to avoid possible races with parallel sctp_inpcb_free() calls.
- Add an assertion to sctp_inpcb_free() to verify that _ALLGONE is not set.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31811


# d35be50f 01-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Hold association locks across socket wakeups when freeing

At this point we do not hold the inpcb lock, so the only thing holding
the socket reference live is the TCB lock, which needs to be acquired by
sctp_inpcb_free() in order to destroy associations. Defer the unlock
until after we dereference the socket reference.

Reported by: syzbot+1d0f2c4675de76a4cf1e@syzkaller.appspotmail.com
Reported by: syzbot+fabee77954fe69d3a5ad@syzkaller.appspotmail.com
Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31754


# 65f30a39 01-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Release the socket reference when detaching an association

Later in sctp_free_assoc(), when we clean up chunk lists,
sctp_free_spbufspace() is used to reset the byte count in the socket
send buffer. However, if the PCB is going away, the socket may already
have been detached from the PCB, in which case this becomes a
use-after-free. Clear the socket reference from the association before detaching
it from the PCB, if the PCB has already lost its socket reference.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31753


# 457abbb8 01-Sep-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Implement sctp_inpcb_bind_locked()

This will be used by sctp_listen() to avoid dropping locks when
performing an implicit bind. No functional change intended.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31757


# 4a36122b 31-Aug-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Fix racy UNBOUND flag check in sctp_inpcb_bind()

SCTP needs to avoid binding a given socket twice. The check used to
avoid this is racy since neither the inpcb lock nor the global info lock
is held. Fix it by synchronizing using the global info lock. In
particular, sctp_inpcb_bind() may drop the inpcb lock in some cases, but
the info lock is sufficient to prevent double insertion into PCB hash
tables.

Reported by: syzbot+548a8560d959669d0e12@syzkaller.appspotmail.com
Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31734


# 2496d812 31-Aug-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Simplify the free port search in sctp_inpcb_bind()

Eliminate a flag variable and reduce indentation. No functional change
intended.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31733


# 93908fce 31-Aug-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Avoid unnecessary refcount bumps in sctp_inpcb_bind()

We only drop the inp lock when binding to a specific port. So, only
acquire an extra reference when required. This simplifies error
handling a bit.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31732


# 0d29e4bc 31-Aug-2021 Mark Johnston <markj@FreeBSD.org>

sctp: Remove always-false checks in sctp_inpcb_bind()

No functional change intended.

Reviewed by: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31731


# 105b68b4 09-Jul-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: Fix errno in case of association setup failures

Do not always report ETIMEDOUT, but only when appropriate. In
other cases report ECONNABORTED.

MFC after: 3 days


# c7f048ab 27-Jun-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: initialize sequence numbers for ECN correctly

MFC after: 3 days
Reported by: Junseok Yang (for the userland stack)


# 8b3d0f64 02-May-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: improve address list scanning

If the alternate address has to be removed, force the stack to
find a new one, if it is still needed.

MFC after: 3 days


# 5a50eb65 12-Mar-2021 John Baldwin <jhb@FreeBSD.org>

Don't pass RFPROC to kproc_create(), it is redundant.

Reviewed by: tuexen, kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29206


# 5ac83902 21-Feb-2021 Michael Tuexen <tuexen@FreeBSD.org>

sctp: clear a pointer to a net which will be removed

MFC after: 3 days


# b954d816 06-Oct-2020 Michael Tuexen <tuexen@FreeBSD.org>

Ensure variables are initialized before used.

MFC after: 3 days


# 6176f9d6 06-Oct-2020 Michael Tuexen <tuexen@FreeBSD.org>

Remove dead stores reported by clang static code analysis

MFC after: 3 days


# b15f5411 29-Sep-2020 Michael Tuexen <tuexen@FreeBSD.org>

Improve the input validation and processing of cookies.
This avoids setting the association in an inconsistent
state, which could result in a use-after-free situation.
This can be triggered by a malicious peer, if the peer
can modify the cookie without the local endpoint recognizing
it.
Thanks to Ned Williamson for reporting the issue.

MFC after: 3 days


# fbc6840b 28-Sep-2020 Michael Tuexen <tuexen@FreeBSD.org>

Minor cleanup.

MFC after: 3 days


# b6db274d 23-Sep-2020 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes.

MFC after: 3 days


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# f5d30f7f 16-Aug-2020 Michael Tuexen <tuexen@FreeBSD.org>

Improve the handling of concurrent send() calls for SCTP sockets,
especially when having the explicit EOR mode enabled.

Reported by: Megan2013678@protonmail.com
Reported by: syzbot+bc02585076c3cc977f9b@syzkaller.appspotmail.com
MFC after: 3 days


# 205f3e15 23-Jul-2020 Michael Tuexen <tuexen@FreeBSD.org>

Clear the pointer to the socket when closing it also in case of
an ungraceful operation.
This fixes a use-after-free bug found and reported by Taylor
Brandstetter of Google by testing the userland stack.

MFC after: 1 week


# 8745f898 18-Jul-2020 Michael Tuexen <tuexen@FreeBSD.org>

Add reference counts for inp/stcb/net when timers are running.
This avoids a use-after-free reported for the userland stack.
Thanks to Taylor Brandstetter for suggesting a patch for
the userland stack.

MFC after: 1 week


# 05bceec6 18-Jul-2020 Michael Tuexen <tuexen@FreeBSD.org>

Remove code which is not needed.

MFC after: 1 week


# 7f0ad227 17-Jul-2020 Michael Tuexen <tuexen@FreeBSD.org>

Improve the locking of address lists by adding some asserts and
rearranging the addition of address such that the lock is not
given up during checking and adding.

MFC after: 1 week


# e6db509d 22-Jun-2020 Mark Johnston <markj@FreeBSD.org>

Move the definition of SCTP's system_base_info into sctp_crc32.c.

This file is the only SCTP source file compiled into the kernel when
SCTP_SUPPORT is configured. sctp_delayed_checksum() references a couple
of counters defined in system_base_info, so the change allows these
counters to be referenced in a kernel compiled without "options SCTP".

Submitted by: tuexen
MFC with: r362338


# d60bdf85 13-Jun-2020 Michael Tuexen <tuexen@FreeBSD.org>

Remove usage of empty macro.

MFC after: 1 week


# 2f9e6db0 12-Jun-2020 Michael Tuexen <tuexen@FreeBSD.org>

More cleanups due to ifdef cleanup done upstream

MFC after: 1 week


# 28397ac1 11-Jun-2020 Michael Tuexen <tuexen@FreeBSD.org>

Non-functional changes due to upstream cleanup.

MFC after: 1 week


# 5fb132ab 08-Jun-2020 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace cleanups and removal of a stale comment.

MFC after: 1 week


# 3f53d622 06-Jun-2020 Michael Tuexen <tuexen@FreeBSD.org>

Fix typo in comment.

Submitted by Orgad Shaneh for the userland stack.
MFC after: 1 week


# 2cf33471 06-Jun-2020 Michael Tuexen <tuexen@FreeBSD.org>

Non-functional changes due to cleanup (upstream removing of Panda support)
of the code

MFC after: 1 week


# 999f86d6 19-May-2020 Michael Tuexen <tuexen@FreeBSD.org>

Replace snprintf() by SCTP_SNPRINTF() and let SCTP_SNPRINTF() map
to snprintf() on FreeBSD. This allows checking for failures of snprintf()
on platforms other than the FreeBSD kernel.
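
The mapping described above can be sketched as follows. This is not the SCTP_SNPRINTF definition from the SCTP sources; DEMO_SNPRINTF, DEMO_SNPRINTF_CANNOT_FAIL, and demo_checked_snprintf are hypothetical names, and emptying the buffer on failure is an assumed policy chosen for illustration.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical checked wrapper: if formatting fails (negative return),
 * leave an empty string in the buffer instead of undefined contents.
 */
static int
demo_checked_snprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0 && size > 0)
		buf[0] = '\0';
	return (n);
}

#ifdef DEMO_SNPRINTF_CANNOT_FAIL
/* Platform where snprintf() cannot fail: map straight through. */
#define DEMO_SNPRINTF(...)	snprintf(__VA_ARGS__)
#else
/* Other platforms: route through the checked wrapper. */
#define DEMO_SNPRINTF(...)	demo_checked_snprintf(__VA_ARGS__)
#endif
```

Callers write DEMO_SNPRINTF(buf, sizeof(buf), "%d", value) on every platform, and the per-platform policy lives in one place.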


# 821bae7c 19-May-2020 Michael Tuexen <tuexen@FreeBSD.org>

Revert r361209:

cem noted that on FreeBSD snprintf() can not fail and code should not
check for that.

A followup commit will replace the usage of snprintf() in the SCTP
sources with a variadic macro SCTP_SNPRINTF, which will simply map to
snprintf() on FreeBSD and do a checking similar to r361209 on
other platforms.


# bca18028 18-May-2020 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup, no functional change intended.

MFC after: 3 days


# e708e2a4 18-May-2020 Michael Tuexen <tuexen@FreeBSD.org>

Handle failures of snprintf().

MFC after: 3 days


# da8c34c3 17-May-2020 Michael Tuexen <tuexen@FreeBSD.org>

Non-functional changes, cleanups.

MFC after: 3 days


# 983066f0 25-Apr-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Convert route caching to nexthop caching.

This change is built on top of nexthop objects introduced in r359823.

Nexthops are separate datastructures, containing all necessary information
to perform packet forwarding such as gateway interface and mtu. Nexthops
are shared among the routes, providing more pre-computed cache-efficient
data while requiring less memory. Splitting the LPM code and the attached
data solves multiple long-standing problems in the routing layer,
drastically reduces the coupling with other parts of the stack and allows
faster lookup algorithms to be introduced transparently.

Route caching was (re)introduced to minimise (slow) routing lookups, allowing
for notably better performance for large TCP senders. Caching works by
acquiring rtentry reference, which is protected by per-rtentry mutex.
If the routing table is changed (checked by comparing the rtable generation id)
or link goes down, cache record gets withdrawn.

Nexthops have the same reference counting interface, backed by refcount(9).
This change merely replaces rtentry with the actual forwarding nexthop as a
cached object, which is mostly mechanical. Other moving parts like cache
cleanup on rtable change remain the same.

Differential Revision: https://reviews.freebsd.org/D24340


# 25ec3553 28-Mar-2020 Michael Tuexen <tuexen@FreeBSD.org>

Handle integer overflows correctly when converting msecs and secs to
ticks and vice versa.
These issues were caught by recently added panic() calls on INVARIANTS
systems.

Reported by: syzbot+b44787b4be7096cd1590@syzkaller.appspotmail.com
Reported by: syzbot+35f82d22805c1e899685@syzkaller.appspotmail.com
MFC after: 1 week
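
One way to guard such a conversion is sketched below. It is not the code from this commit; demo_msecs_to_ticks and its hz parameter are hypothetical, and the sketch assumes that saturating at UINT32_MAX is an acceptable result when the tick count does not fit.

```c
#include <stdint.h>

/*
 * Convert milliseconds to ticks for a clock running at hz ticks per
 * second. The multiplication is done in 64 bits so it cannot wrap, the
 * result is rounded up, and values that do not fit in 32 bits saturate.
 */
uint32_t
demo_msecs_to_ticks(uint32_t msecs, uint32_t hz)
{
	uint64_t ticks;

	ticks = ((uint64_t)msecs * hz + 999) / 1000;
	if (ticks > UINT32_MAX)
		return (UINT32_MAX);
	return ((uint32_t)ticks);
}
```

Doing the same arithmetic in 32 bits would wrap for large inputs, for example msecs near UINT32_MAX with hz = 1000, and yield a much smaller tick count than intended.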


# 24187cfe 25-Mar-2020 Michael Tuexen <tuexen@FreeBSD.org>

Revert https://svnweb.freebsd.org/changeset/base/357829

This introduces a regression reported by koobs@ when running a Python
test suite on a loaded system.

This patch resulted in a failing accept() call, when the association
was setup and gracefully shutdown by the peer before accept was called.
So the following packetdrill script would fail:

+0.0 socket(..., SOCK_STREAM, IPPROTO_SCTP) = 3
+0.0 bind(3, ..., ...) = 0
+0.0 listen(3, 1) = 0
+0.0 < sctp: INIT[flgs=0, tag=1, a_rwnd=15000, os=1, is=1, tsn=1]
+0.0 > sctp: INIT_ACK[flgs=0, tag=2, a_rwnd=..., os=..., is=..., tsn=1, ...]
+0.1 < sctp: COOKIE_ECHO[flgs=0, len=..., val=...]
+0.0 > sctp: COOKIE_ACK[flgs=0]
+0.0 < sctp: DATA[flgs=BE, len=116, tsn=1, sid=0, ssn=0, ppid=0]
+0.0 > sctp: SACK[flgs=0, cum_tsn=1, a_rwnd=..., gaps=[], dups=[]]
+0.0 < sctp: SHUTDOWN[flgs=0, cum_tsn=0]
+0.0 > sctp: SHUTDOWN_ACK[flgs=0]
+0.0 < sctp: SHUTDOWN_COMPLETE[flgs=0]
+0.0 accept(3, ..., ...) = 4
+0.0 close(3) = 0
+0.0 recv(4, ..., 4096, 0) = 100
+0.0 recv(4, ..., 4096, 0) = 0
+0.0 close(4) = 0

Reported by: koobs@


# 6fb7b4fb 19-Mar-2020 Michael Tuexen <tuexen@FreeBSD.org>

Consistently provide arguments for timer start and stop routines.
This is another step in cleaning up timer handling.
MFC after: 1 week


# 56ccb48f 12-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Don't panic under INVARIANTS when we can't allocate memory for storing
a vtag in time wait.
This issue was found by running syzkaller.

MFC after: 1 week


# ca3de626 12-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Mark the socket as disconnected when freeing the association the first
time.
This issue was found by running syzkaller.

MFC after: 1 week


# 8803350d 11-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Revert https://svnweb.freebsd.org/changeset/base/357761

This was suggested by cem@


# 9803f01c 11-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Don't start an SCTP timer using a net, which has been removed.

Submitted by: Taylor Brandstetter
MFC after: 1 week


# 95d27478 11-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Use an int instead of a bool variable, since bool is not supported
on all platforms the stack is running on in userland.


# 6a34ec63 09-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Stop the PMTU and HB timer when removing a net, not when freeing it.

Submitted by: Taylor Brandstetter
MFC after: 1 week


# 5555400a 09-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup timer handling.

Submitted by: Taylor Brandstetter
MFC after: 1 week


# f799ff82 04-Feb-2020 Michael Tuexen <tuexen@FreeBSD.org>

Remove unused timer.

Submitted by: Taylor Brandstetter


# 4b66d476 05-Jan-2020 Michael Tuexen <tuexen@FreeBSD.org>

Return -1 consistently if an error occurs.

MFC after: 1 week


# 6088175a 20-Dec-2019 Michael Tuexen <tuexen@FreeBSD.org>

Improve input validation for some parameters having a too small
reported length.

Thanks to Natalie Silvanovich from Google for finding one of these
issues in the SCTP userland stack and reporting it.

MFC after: 1 week


# 671d68fa 13-Oct-2019 Mark Johnston <markj@FreeBSD.org>

Move SCTP DTrace probe definitions into a .c file.

Previously they were defined in a header which was included exactly
once. Change this to follow the usual practice of putting definitions
in C files. No functional change intended.

Discussed with: tuexen
MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# e30a1788 31-Aug-2019 Michael Tuexen <tuexen@FreeBSD.org>

Improve function definition.

MFC after: 3 days


# 94962f6b 05-Aug-2019 Michael Tuexen <tuexen@FreeBSD.org>

Improve consistency. No functional change.

MFC after: 3 days


# 0ecd976e 02-Aug-2019 Bjoern A. Zeeb <bz@FreeBSD.org>

IPv6 cleanup: kernel

Finish what was started a few years ago and harmonize IPv6 and IPv4
kernel names. We are down to very few places now that it is feasible
to do the change for everything remaining without causing too much disturbance.

Remove "aliases" for IPv6 names which confusingly could indicate
that we are talking about a different data structure or field or
have two fields, one for each address family.
Try to follow common conventions used in FreeBSD.

* Rename sin6p to sin6 as that is how it is spelt in most places.
* Remove "aliases" (#defines) for:
- in6pcb which really is an inpcb and nothing separate
- sotoin6pcb which is sotoinpcb (as per above)
- in6p_sp which is inp_sp
- in6p_flowinfo which is inp_flow
* Try to use ia6 for in6_addr rather than in6p.
* With all these gone also rename the in6p variables to inp as
that is what we call it in most of the network stack including
parts of netinet6.

The reasons behind this cleanup are that we try to further
unify netinet and netinet6 code where possible and that people
will be less likely to ignore one or the other protocol family when
making code changes, as they may not have spotted places due to different
names for the same thing.

No functional changes.

Discussed with: tuexen (SCTP changes)
MFC after: 3 months
Sponsored by: Netflix


# 25fa310a 15-Jul-2019 Michael Tuexen <tuexen@FreeBSD.org>

Fix socket state handling when freeing an SCTP endpoint.

This issue was found by running syzkaller.

MFC after: 1 week


# 8a956abe 13-Jul-2019 Michael Tuexen <tuexen@FreeBSD.org>

When calling sctp_initialize_auth_params(), the inp must have at
least a read lock. To avoid more complex locking dances, just
call it in sctp_aloc_assoc() when the write lock is still held.

Reported by: syzbot+08a486f7e6966f1c3cfb@syzkaller.appspotmail.com
MFC after: 1 week


# 689ed089 25-Mar-2019 Michael Tuexen <tuexen@FreeBSD.org>

Improve locking when tearing down an SCTP association.
This is joint work with rrs@ and the issue was found by
syzkaller.

MFC after: 1 week


# be62c88b 03-Mar-2019 Michael Tuexen <tuexen@FreeBSD.org>

Allocate an association id and register the stcb while holding the lock.
This avoids a race where stcbs can be found, which are not completely
initialized.

This was found by running syzkaller.

MFC after: 3 days


# 1a0b0216 21-Aug-2018 Michael Tuexen <tuexen@FreeBSD.org>

Refactor the SHUTDOWN_PENDING state handling.

This is not a functional change but a preparation for the upcoming
DTrace support. It is necessary to change the state in one
logical operation, even if it involves clearing the sub state
SHUTDOWN_PENDING.

MFC after: 1 month


# 839d21d6 13-Aug-2018 Michael Tuexen <tuexen@FreeBSD.org>

Use the stcb instead of the asoc in state macros.

This is not a functional change. Just a preparation for upcoming
dtrace state change provider support.


# 61a21880 13-Aug-2018 Michael Tuexen <tuexen@FreeBSD.org>

Consistently use the macros to modify the assoc state.

No functional change.


# 0053ed28 19-Jul-2018 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes due to changes in indent.


# b0471b4b 19-Jul-2018 Michael Tuexen <tuexen@FreeBSD.org>

Revert https://svnweb.freebsd.org/changeset/base/336503
since I also ran the export script with different parameters.


# 7679e49d 19-Jul-2018 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes due to change of indent.


# 13500cbb 02-Jun-2018 Michael Tuexen <tuexen@FreeBSD.org>

Don't overflow a buffer if we receive an INIT or INIT-ACK chunk
without a RANDOM parameter but with a CHUNKS or HMAC-ALGO parameter.
Please note that sending this combination violates the specification.

Thanks to Ronald E. Crane for reporting the issue for the userland
stack.

MFC after: 3 days


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# 28a6adde 03-Nov-2017 Michael Tuexen <tuexen@FreeBSD.org>

Allow the setting of the MTU for future paths using an SCTP socket option.
This functionality was missing.

MFC after: 1 week


# afb908da 22-Sep-2017 Michael Tuexen <tuexen@FreeBSD.org>

Add missing socket lock.

MFC after: 1 week


# 2c62ba73 20-Sep-2017 Michael Tuexen <tuexen@FreeBSD.org>

Protect the address workqueue timer by a mutex.

MFC after: 1 week


# 0c4622da 09-Sep-2017 Michael Tuexen <tuexen@FreeBSD.org>

Silence a Coverity warning from scanning the usrsctp library.

MFC after: 3 days


# 5ba7f91f 19-Jul-2017 Michael Tuexen <tuexen@FreeBSD.org>

Use memset/memcpy instead of bzero/bcopy.

Just use one variant instead of both. Use the memset/memcpy
ones since they cause fewer problems in cross-platform deployment.

MFC after: 1 week


# 28cd0699 18-Jul-2017 Michael Tuexen <tuexen@FreeBSD.org>

Fix the accounting and add code to detect errors in accounting.
Joint work with rrs@
MFC after: 1 week


# f4358911 23-Jun-2017 Michael Tuexen <tuexen@FreeBSD.org>

Handle sctp_get_next_param() in a consistent way.

This addresses an issue found by Felix Weinrank using libfuzz.
While there, also use consistent naming.

MFC after: 3 days


# 12d8a8e7 08-Jun-2017 Gleb Smirnoff <glebius@FreeBSD.org>

The desired lock here is socket buffer, not socket.

Right now they match, but won't in future.


# 5d08768a 26-May-2017 Michael Tuexen <tuexen@FreeBSD.org>

Use the SCTP_PCB_FLAGS_ACCEPTING flags to check for listeners.

While there, use a macro for checking the listen state to allow for
easier changes if required.

This is done to help glebius@ with his listen changes.


# 10e0318a 29-Apr-2017 Michael Tuexen <tuexen@FreeBSD.org>

Allow SCTP to use the hostcache.

This patch allows the MTU stored in the hostcache to be used as an
initial value for SCTP paths. When an ICMP PTB message is received,
store the MTU in the hostcache.

MFC after: 1 week


# 627c036f 13-Feb-2017 Andrey V. Elsukov <ae@FreeBSD.org>

Remove IPsec related PCB code from SCTP.

The inpcb structure has inp_sp pointer that is initialized by
ipsec_init_pcbpolicy() function. This pointer keeps storage for IPsec
security policies associated with a specific socket.
An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket
options to configure these security policies. Then ip[6]_output()
uses inpcb pointer to specify that an outgoing packet is associated
with some socket. And IPSEC_OUTPUT() method can use a security policy
stored in the inp_sp. For inbound packet the protocol-specific input
routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms
to inbound security policy configured in the inpcb.

SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends
packets. Thus IPSEC_OUTPUT() method does not consider such packets as
associated with some socket and can not apply security policies
from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY()
method is called from protocol-specific input routine, it can specify
inpcb pointer and associated with socket inbound policy will be
checked. But there are two problems:
1. Such a check is asymmetric, because we can not apply security policy
from inpcb for outgoing packet.
2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and
access to inp_sp is protected. But for SCTP this is not correct,
because SCTP uses own locks to protect inpcb.

To fix these problems remove IPsec related PCB code from SCTP.
This implies that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options
will not be applicable to SCTP sockets. To be able to correctly check
inbound security policies for SCTP, mark its protocol header with
the PR_LASTHDR flag.

Reported by: tuexen
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D9538


# fcf59617 06-Feb-2017 Andrey V. Elsukov <ae@FreeBSD.org>

Merge projects/ipsec into head/.

Small summary
-------------

o Almost all IPsec related code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
option IPSEC_SUPPORT added. It enables support for loading
and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
support was removed. Added TCP/UDP checksum handling for
inbound packets that were decapsulated by transport mode SAs.
setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
built as part of ipsec.ko module (or with IPSEC kernel).
It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
methods. The only one header file <netipsec/ipsec_support.h>
should be included to declare all the needed things to work
with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be closer to the RFC.
o PF_KEY SADB was reworked:
- now all security associations stored in the single SPI namespace,
and all SAs MUST have unique SPI.
- several hash tables added to speed up lookups in SADB.
- SADB now uses rmlock to protect access, and concurrent threads
can do SA lookups at the same time.
- many PF_KEY message handlers were reworked to reflect changes
in SADB.
- SADB_UPDATE message was extended to support new PF_KEY headers:
SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
avoid locking protection for ipsecrequest. Now we support
only limited number (4) of bundled SAs, but they are supported
for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
check for full history of applied IPsec transforms.
o Reference counting rules for security policies and security
associations were changed. The proper SA locking added into xform
code.
o xform code was also changed. Now it is possible to unregister xforms.
tdb_xxx structures were changed and renamed to reflect changes in
SADB/SPDB, and changed rules for locking and refcounting.

Reviewed by: gnn, wblock
Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9352


# b7b84c0e 26-Dec-2016 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes.

The toolchain for processing the sources has been updated. No functional
change.

MFC after: 3 days


# 49656eef 07-Dec-2016 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup the names of SSN, SID, TSN, FSN, PPID and MID.

This made a couple of bugs visible in handling SSN wrap-arounds
when using DATA chunks. Now bulk transfer seems to work fine...
This fixes the issue reported in
https://github.com/sctplab/usrsctp/issues/111

MFC after: 1 week


# 859422cc 13-Oct-2016 Michael Tuexen <tuexen@FreeBSD.org>

Mark the socket as un-writable when it is 1-to-1 and the SCTP association
is freed.

MFC after: 1 month


# 4c7fb0cf 13-Oct-2016 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes.

MFC after: 1 month


# 4d58b0c3 06-Aug-2016 Michael Tuexen <tuexen@FreeBSD.org>

Remove stream queue entry consistently from wheel.
While there, improve the handling of drain.

MFC after: 3 days


# d1ea5fa9 05-Aug-2016 Michael Tuexen <tuexen@FreeBSD.org>

Fix various bugs in relation to the I-DATA chunk support

This is joint work with rrs.

MFC after: 3 days


# 36ad8372 06-Jun-2016 Sepherosa Ziehau <sephe@FreeBSD.org>

net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties

Reviewed by: hps, erj, tuexen
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6688


# cd0a4ff6 02-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

netinet/sctp*: minor spelling fixes in comments.

No functional change.

Reviewed by: tuexen


# ec70917f 01-May-2016 Michael Tuexen <tuexen@FreeBSD.org>

When a client uses UDP encapsulation and lists IP addresses in the INIT
chunk, enable UDP encapsulation for all those addresses.
This helps clients using a userland stack to support multihoming if
they are not behind a NAT.

MFC after: 1 week


# 7154bf4a 30-Apr-2016 Michael Tuexen <tuexen@FreeBSD.org>

Add the UDP encaps port as a parameter to sctp_add_remote_addr().

This is currently only a code change without any functional
change. But this allows setting the remote encapsulation port
in a more detailed way, which will be provided in a follow-up
commit.

MFC after: 1 week


# 44249214 07-Apr-2016 Randall Stewart <rrs@FreeBSD.org>

This is work done by Michael Tuexen and myself at the IETF. This
adds the new I-Data (Interleaved Data) message. This allows a user
to have complete freedom from Head Of Line blocking that
was previously there due to the inability to send multiple large
messages without the TSNs being in sequence. The code has been
tested with Michael's various packetdrill scripts as well as
inter-networking between the IETF's location in Argentina and Germany.


# 76f8482a 28-Mar-2016 Michael Tuexen <tuexen@FreeBSD.org>

Restrict local addresses until they are acked by the peer.

MFC after: 1 week


# 64a3a630 19-Feb-2016 Michael Tuexen <tuexen@FreeBSD.org>

Use the SCTP level pointer, not the interface level.

MFC after: 3 days


# 467f0d55 16-Feb-2016 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes.


# 2b1c7de4 16-Feb-2016 Michael Tuexen <tuexen@FreeBSD.org>

Improve the teardown of the SCTP stack.

Obtained from: bz@
MFC after: 1 week


# 79b67faa 28-Jan-2016 Michael Tuexen <tuexen@FreeBSD.org>

Always look in the TCP pool.
This fixes issues with a restarting peer when the listening
1-to-1 style socket is closed.

MFC after: 3 days


# d30c4f99 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Noisy comments (not sure if the static would be valid for all SCTP
implementations).

Reorder some cleanup just to match the general order we normally use.

Sponsored by: The FreeBSD Foundation


# 27a01c6c 23-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Try to catch a couple of SCTP teardown race conditions.
Saw all the printfs already.

Note: not sure the atomics are needed but without them, the condition
would never trigger, and we'd still see panics (which could have been
due to the insert race). Will work my way backwards in case this stays
stable.

Sponsored by: The FreeBSD Foundation


# 1f12da0e 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Just checkpoint the WIP in order to be able to make the tree update
easier. Note: this is currently not in a usable state as certain
teardown parts are not called and the DOMAIN rework is missing.
More to come soon and find its way to head.

Obtained from: P4 //depot/user/bz/vimage/...
Sponsored by: The FreeBSD Foundation


# c7e732ae 14-Jan-2016 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug in INIT handling on accepted 1-to-1 style sockets when the
listener is closed.
This fix allows the following packetdrill test to pass:
// Setup a connected, blocking 1-to-1 style socket
+0.0 socket(..., SOCK_STREAM, IPPROTO_SCTP) = 3
// Check the handshake with an empty(!) cookie
+0.0 bind(3, ..., ...) = 0
+0.0 listen(3, 1) = 0
+0.0 < sctp: INIT[flgs=0, tag=1, a_rwnd=1500, os=1, is=1, tsn=1]
+0.0 > sctp: INIT_ACK[flgs=0, tag=2, a_rwnd=..., os=..., is=..., tsn=1, ...]
+0.0 < sctp: COOKIE_ECHO[flgs=0, len=..., val=...]
+0.0 > sctp: COOKIE_ACK[flgs=0]
+0.0 accept(3, ..., ...) = 4
+0.0 close(3) = 0
// Inject an INIT chunk and expect an INIT-ACK
+0.0 < sctp: INIT[flgs=0, tag=3, a_rwnd=1500, os=1, is=1, tsn=1]
+0.0 > sctp: INIT_ACK[flgs=0, tag=..., a_rwnd=..., os=..., is=..., tsn=..., ...]

MFC after: 3 days


# c979034b 06-Dec-2015 Michael Tuexen <tuexen@FreeBSD.org>

Fix the allocation of outgoing streams:
* When processing a cookie, use the number of
streams announced in the INIT-ACK.
* When sending an INIT-ACK for an existing
association, use the value from the association,
not from the end-point.

MFC after: 1 week


# 3bf2363d 21-Nov-2015 Michael Tuexen <tuexen@FreeBSD.org>

Fix the handling of IPSec policies in the SCTP stack. At least
make sure they are not leaked...

MFC after: 1 week


# e5d23883 21-Nov-2015 Michael Tuexen <tuexen@FreeBSD.org>

Revert part of r291137 which seems correct, but does not fix the
resource problem I'm currently hunting down.

MFC after: 1 week
X-MFC with: 291137


# 8ca16419 21-Nov-2015 Michael Tuexen <tuexen@FreeBSD.org>

Don't send SHUTDOWN chunk when the association is in a front state
and the application calls shutdown(..., SHUT_WR) or
shutdown(..., SHUT_RDWR).

MFC after: 1 week.


# 6e9c45e0 19-Oct-2015 Michael Tuexen <tuexen@FreeBSD.org>

Use __func__ instead of __FUNCTION__.

This allows compiling the userland stack without errors using gcc5.
Thanks to saghul for making me aware and providing the patch.

MFC after: 1 week


# 267dbe63 27-Jul-2015 Michael Tuexen <tuexen@FreeBSD.org>

Provide consistent error causes whenever an ABORT chunk is sent.

MFC after: 1 week


# d089f9b9 17-Jun-2015 Michael Tuexen <tuexen@FreeBSD.org>

Add FIB support for SCTP.
This fixes https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200379

MFC after: 3 days


# d60568d7 28-May-2015 Michael Tuexen <tuexen@FreeBSD.org>

Retire SCTP_DONT_DO_PRIVADDR_SCOPE which was never defined.

MFC after: 3 days


# b7d130be 28-May-2015 Michael Tuexen <tuexen@FreeBSD.org>

Fix and cleanup the debug information. This has no user-visible changes.
Thanks to Irene Ruengeler for providing a patch.

MFC after: 3 days


# 0426123f 24-Mar-2015 Michael Tuexen <tuexen@FreeBSD.org>

Fix two bugs which resulted in a screwed up end point list:
* Use a safe way to walk through a list while manipulating it.
* Have the appropriate locks in place.
Joint work with rrs@

MFC after: 3 days


# 59b6d5be 10-Mar-2015 Michael Tuexen <tuexen@FreeBSD.org>

Add a SCTP socket option to limit the cwnd for each path.

MFC after: 1 month


# e88f89a3 11-Jan-2015 Michael Tuexen <tuexen@FreeBSD.org>

Remove dead code.

Reported by: Coverity
CID: 748665
MFC after: 1 week


# d3cfd430 11-Jan-2015 Michael Tuexen <tuexen@FreeBSD.org>

Remove dead code.

Reported by: Coverity
CID: 748666
MFC after: 1 week


# 142a4d9e 17-Dec-2014 Michael Tuexen <tuexen@FreeBSD.org>

Add a missing break.

Reported by: Coverity
CID: 1232014
MFC after: 3 days


# 457b4b88 04-Dec-2014 Michael Tuexen <tuexen@FreeBSD.org>

This is the SCTP specific companion of
https://svnweb.freebsd.org/changeset/base/275358
which was provided by Hans Petter Selasky.


# 4e88d37a 02-Dec-2014 Michael Tuexen <tuexen@FreeBSD.org>

Do the renaming of sb_cc to sb_ccc in a way with fewer code changes by
using a macro.
This is an alternate approach to
https://svnweb.freebsd.org/changeset/base/275326
which is easier to handle upstream.

Discussed with: rrs, glebius


# 0f9d0a73 29-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Merge from projects/sendfile:

o Introduce a notion of "not ready" mbufs in socket buffers. These
mbufs are now being populated by some I/O in background and are
referenced outside. This has the following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- Neither can an mbuf that is behind a "not ready" one in the queue.
- If the socket buffer is flushed, then "not ready" mbufs shouldn't be
freed.

o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
The sb_ccc stands for "claimed character count", or "committed
character count". And the sb_acc is "available character count".
Consumers of socket buffer API shouldn't already access them directly,
but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
search.
o New function sbready() is provided to activate certain amount of mbufs
in a socket buffer.

A special note on SCTP:
SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs. Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them. The new
notion of "not ready" data isn't supported by SCTP. Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does. This was discussed with rrs@.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 47b80412 16-Sep-2014 Michael Tuexen <tuexen@FreeBSD.org>

Use a consistent type for the number of HMAC algorithms.
This fixes a bug which resulted in a warning on the userland
stack, when compiled on Windows.
Thanks to Peter Kasting from Google for reporting the issue and
providing a potential fix.

MFC after: 3 days


# ad234e3c 07-Sep-2014 Michael Tuexen <tuexen@FreeBSD.org>

Address warnings generated by the clang analyzer.

MFC after: 1 week


# 24aaac8d 07-Sep-2014 Michael Tuexen <tuexen@FreeBSD.org>

Use union sctp_sockstore instead of struct sockaddr_storage. This
eliminates some warnings when building in userland.
Thanks to Patrick Laimbock for reporting this issue.
Remove also some unnecessary casts.
There should be no functional change.

MFC after: 1 week


# 24110da0 06-Sep-2014 Michael Tuexen <tuexen@FreeBSD.org>

Fix a leak of an address, if the address is scheduled for removal
and the stack is torn down.
Thanks to Peter Bostroem and Jiayang Liu from Google for reporting the
issue.

MFC after: 1 week


# 97a0ca5b 12-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Change SCTP sysctl from auth_disable to auth_enable. This is
consistent with other similar sysctl variables used in SCTP.


# c79bec9c 12-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the SCTP_AUTH_SUPPORTED and SCTP_ASCONF_SUPPORTED
socket options. Add also a sysctl to control the support of ASCONF.

MFC after: 1 week


# 317e00ef 04-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the SCTP_RECONFIG_SUPPORTED and the corresponding
sysctl controlling the negotiation of the RE-CONFIG extension.

MFC after: 3 days


# cb9b8e6f 03-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the SCTP_PKTDROP_SUPPORTED socket option and
the corresponding sysctl variable.
The default is off, since the specification is not an RFC yet.

MFC after: 1 week


# caea9879 03-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Add SCTP socket option SCTP_NRSACK_SUPPORTED to control the
NRSACK extension. The default will still be off, since it
is not an RFC (yet).
Changing the sysctl name will be in a separate commit.

MFC after: 1 week


# dd973b0e 02-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the SCTP_PR_SUPPORTED socket option as specified in
http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-prpolicies
Add also a sysctl controlling the default of the end-points.

MFC after: 1 week


# f342355a 02-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup the ECN configuration handling and provide an SCTP socket
option for controlling ECN on future associations and get the
status on current associations.
A similar pattern will be used for controlling SCTP extensions in
upcoming commits.


# 47aac6fa 01-Aug-2014 Michael Tuexen <tuexen@FreeBSD.org>

Remove the asconf_auth_nochk sysctl. This was off by default and only
existed to be able to test with non-compliant peers a long time ago.


# f64a0b06 11-Jul-2014 Michael Tuexen <tuexen@FreeBSD.org>

Bugfix: When a remote address was added to an endpoint,
a source address was selected and cached, but it was not
stored that it was cached. This resulted in selecting
different source addresses for the INIT-ACK and COOKIE-ACK
when possible.
Thanks to Niu Zhixiong for reporting the issue.

MFC after: 1 week


# 4474d71a 11-Jul-2014 Michael Tuexen <tuexen@FreeBSD.org>

Integrate upstream changes.

MFC after: 1 week


# 6ba22f19 20-Jun-2014 Michael Tuexen <tuexen@FreeBSD.org>

Honor jails for unbound SCTP sockets when selecting source addresses,
reporting IP-addresses to the peer during the handshake, adding
addresses to the host, reporting the addresses via the sysctl
interface (used by netstat, for example) and reporting the
addresses to the application via socket options.
This issue was reported by Bernd Walter.

MFC after: 3 days


# 4aa74d8b 06-May-2014 Michael Tuexen <tuexen@FreeBSD.org>

Remove unused code. This is triggered by the bug report of Sylvestre Ledru
which deals with useless code in the userland stack:
https://bugzilla.mozilla.org/show_bug.cgi?id=1003929

MFC after: 3 days


# 9ba5b6b7 29-Mar-2014 Michael Tuexen <tuexen@FreeBSD.org>

Handle an edge case of address management similar to TCP.
This needs to be reconsidered when the address handling
will be reimplemented.
The patch is from rrs@.

MFC after: 3 days


# ff1ffd74 15-Mar-2014 Michael Tuexen <tuexen@FreeBSD.org>

* Provide information in error causes in ASCII instead of
proprietary binary format.
* Add support for a diagnostic information error cause.
The code is sysctlable and the default is 0, which
means it is not sent.

This is joint work with rrs@.

MFC after: 1 week


# c302aeb1 29-Nov-2013 Michael Tuexen <tuexen@FreeBSD.org>

In
http://svnweb.freebsd.org/changeset/base/258221
I introduced a bug which initialized global locks
whenever the SCTP stack initialized. This was fixed in
http://svnweb.freebsd.org/changeset/base/258574
by rodrigc@. He just initialized the locks for
the default vnet. This fix reverts to the old
behaviour before r258221, which explicitly makes
sure it is only called once, because this works also on
other platforms.
MFC after: 3 days
X-MFC with: r258574.


# c0c61281 25-Nov-2013 Craig Rodrigues <rodrigc@FreeBSD.org>

Only initialize some mutexes for the default VNET.

In r208160, sctp_it_ctl was made a global variable, across all VNETs.
However, sctp_init() is called for every VNET that is created. This results
in the same global mutexes which are part of sctp_it_ctl being initialized. This can result
in crashes if many jails are created.

To reproduce the problem:
(1) Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS,
INVARIANTS.
(2) Run this command in a loop:
jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo

(see http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html )

Witness will warn about the same mutex being initialized.

Fix the problem by only initializing these mutexes in the default VNET.


# dcb3fc4c 16-Nov-2013 Michael Tuexen <tuexen@FreeBSD.org>

When determining if an address belongs to an stcb, take the address family
into account for wildcard bound endpoints.

MFC after: 3 days


# f4f34bde 16-Nov-2013 Michael Tuexen <tuexen@FreeBSD.org>

Cleanups which result in fixes which have been made upstream
and were partially suggested by Andrew Galante.
There is no functional change in FreeBSD.

MFC after: 3 days


# 3b3d05d7 03-Nov-2013 Michael Tuexen <tuexen@FreeBSD.org>

Unlock the lock before destroying it.
This issue was reported by Andrew Galante.

MFC after: 3 days


# b54ddf22 02-Nov-2013 Michael Tuexen <tuexen@FreeBSD.org>

Changes from upstream to improve compilation when INET or INET6
or none of them is defined.

MFC after: 3 days


# daac3e7d 28-Oct-2013 Michael Tuexen <tuexen@FreeBSD.org>

Fix compilation if SCTP_DONT_DO_PRIVADDR_SCOPE is defined.
The issue was reported by Andrew Galante.

MFC after: 3 days


# ee1ccd92 05-Jul-2013 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug where only 2048 streams were usable even though more than
2048 streams were negotiated on the wire. While there, remove the
hard coded limit of 2048 streams.

MFC after: 3 days


# 56f778aa 03-Jul-2013 Michael Tuexen <tuexen@FreeBSD.org>

Code cleanups.

MFC after: 3 days


# 3457ccda 10-May-2013 Michael Tuexen <tuexen@FreeBSD.org>

Honor the net.inet6.ip6.v6only sysctl variable and the IPV6_V6ONLY
socket option for SCTP sockets in the same way as for UDP or TCP
sockets.

MFC after: 2 weeks


# 2416af26 11-Feb-2013 Michael Tuexen <tuexen@FreeBSD.org>

Send the adaptation layer indication only if set by the user.

MFC after: 3 days
Discussed with: rrs


# c53f854a 11-Feb-2013 Michael Tuexen <tuexen@FreeBSD.org>

Don't send kernel provided information in the User Initiated
ABORT cause, since the user can also provide this kind of
information. So the receiver doesn't know who provided the
information.
While there: Fix a bug where the stack would send a malformed
ABORT chunk when using a send() call with SCTP_ABORT|SCTP_SENDALL
flags.

MFC after: 3 days


# f0d44a49 10-Feb-2013 Michael Tuexen <tuexen@FreeBSD.org>

Make sure that received packets for removed addresses are handled
consistently. While there, make variable names consistent.

MFC after: 3 days


# a1cb341b 09-Feb-2013 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup the handling of address scopes. Announce in the INIT/INIT-ACK
only the supported address types. While there, do some whitespace
cleanups.

MFC after: 1 week


# c39cfa1f 09-Feb-2013 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug where HEARTBEATs were still sent in SHUTDOWN_SENT or
SHUTDOWN_ACK_SENT state. While there, make the corresponding
code consistent.

MFC after: 1 week


# 72c123a8 27-Dec-2012 Michael Tuexen <tuexen@FreeBSD.org>

Minor cleanups of debug messages.

MFC after: 3 days


# eb1b1807 05-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


# e3976bb8 26-Nov-2012 Michael Tuexen <tuexen@FreeBSD.org>

Find the endpoint for an incoming packet also if the endpoint
comes from sctp_peeloff().

MFC after: 3 days


# 325c8c46 16-Nov-2012 Michael Tuexen <tuexen@FreeBSD.org>

Get the accounting working. We now have counters how many
chunks for each SCTP outgoing stream are in the send and
sent queue.
While there, improve the naming of NR-SACK related constants
recently introduced.

MFC after: 1 week


# a7ad6026 07-Nov-2012 Michael Tuexen <tuexen@FreeBSD.org>

Add per outgoing stream accounting for chunks in the send
and sent queue. This provides no functional change, but is
a preparation for an upcoming stream reset improvement.
Done with rrs@.

MFC after: 1 week


# e06f3469 23-Sep-2012 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace change.

MFC after: 3 days


# 8225a9bc 09-Sep-2012 Michael Tuexen <tuexen@FreeBSD.org>

Whitespace changes.

MFC after: 10 days


# dd294dce 05-Sep-2012 Michael Tuexen <tuexen@FreeBSD.org>

Using %p in a format string requires a void *.

MFC after: 10 days


# 63c6726e 05-Aug-2012 Michael Tuexen <tuexen@FreeBSD.org>

Fix a refcount issue. The callee only decrements if stcb is NULL.

MFC after: 3 days
Discussed with: rrs


# 83220851 04-Aug-2012 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug reported by Simon L. B. Nielsen:
If an SCTP endpoint receives an ASCONF with a wildcard
lookup address and incorrect verification tag, the system
crashes.

MFC after: 3 days.


# d07b2ac6 17-Jul-2012 Michael Tuexen <tuexen@FreeBSD.org>

Fix a refcount bug when freeing an association.
While there: Change code to be consistent.
Discussed with rrs@.
MFC after: 3 days


# e0e00a4d 15-Jul-2012 Michael Tuexen <tuexen@FreeBSD.org>

#ifdef INET and INET6 consistently. This also fixes a bug, where
it was done wrong.

MFC after: 3 days


# b5e0cd79 14-Jul-2012 Michael Tuexen <tuexen@FreeBSD.org>

Use case for selecting the address family (as in other places).

MFC after: 3 days


# b1754ad1 28-Jun-2012 Michael Tuexen <tuexen@FreeBSD.org>

Pass the src and dst address of a received packet explicitly around.

MFC after: 3 days


# 2faa5be5 03-Jun-2012 Michael Tuexen <tuexen@FreeBSD.org>

Remove code which is not needed.

MFC after: 3 days


# 807aad63 23-May-2012 Michael Tuexen <tuexen@FreeBSD.org>

Use consistent text at the beginning of the files.

MFC after: 3 days


# 1edc9dba 13-May-2012 Michael Tuexen <tuexen@FreeBSD.org>

Provide in the SCTP_SEND_FAILED and SCTP_SEND_FAILED_EVENT notifications
the correct ssf_error or ssfe_error as required by RFC 6458.

MFC after: 3 days


# a2b42326 12-May-2012 Michael Tuexen <tuexen@FreeBSD.org>

Provide in the association change notification the received ABORT chunk
in case of SCTP_COMM_LOST or SCTP_CANT_STR_ASSOC as required by RFC 6458.

MFC after: 3 days


# 3f826ed2 06-May-2012 Michael Tuexen <tuexen@FreeBSD.org>

Remove debug code.

MFC after: 3 days


# cd3fd531 04-May-2012 Michael Tuexen <tuexen@FreeBSD.org>

Use SCTP_PRINTF() instead of printf() in all SCTP sources.

MFC after: 3 days


# 60990c0c 27-Dec-2011 Michael Tuexen <tuexen@FreeBSD.org>

Address issues found by clang. While there, fix also some style
issues.

MFC after: 3 months.


# 7215cc1b 17-Dec-2011 Michael Tuexen <tuexen@FreeBSD.org>

Fix unused parameter warnings.
While there, fix some whitespace issues.

MFC after: 3 months.


# a56569ba 28-Nov-2011 Michael Tuexen <tuexen@FreeBSD.org>

Remove debug code.

MFC after: 1 month.


# c9c58059 20-Nov-2011 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the SCTP_REMOTE_UDP_ENCAPS_PORT socket option.
Retire the now unused sctp_udp_tunneling_for_client_enable
sysctl variable.

MFC after: 3 months.


# 36311411 18-Nov-2011 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup comparison of interface names.

MFC after: 1 month.


# a62e467a 15-Nov-2011 Michael Tuexen <tuexen@FreeBSD.org>

Set the MTU of a path to an appropriate value if the interface MTU
can't be determined.

MFC after: 3 days.


# eb20220e 13-Nov-2011 Michael Tuexen <tuexen@FreeBSD.org>

Don't copy uninitialized memory. Also simplify the comparison
of interface names.

MFC after: 3 days.


# dc81ec89 07-Nov-2011 Michael Tuexen <tuexen@FreeBSD.org>

When loading addresses from INITs, always use the correct
local address.

MFC after: 3 days.


# 7ffa2290 27-Oct-2011 Michael Tuexen <tuexen@FreeBSD.org>

When adding a new remote address using sctp_add_remote_addr(),
return the correct net if requested.

MFC after: 3 days.


# 3d2443cc 09-Oct-2011 Michael Tuexen <tuexen@FreeBSD.org>

When moving an stcb to a new inp and we copy over the list of
bound addresses, update the last used address pointer.
If not, it might result in a crash if the old inp goes away.

MFC after: 3 days.


# 629749b6 09-Oct-2011 Michael Tuexen <tuexen@FreeBSD.org>

Update the inp stored in a HB-timer when moving an stcb to a new inp.
Use only this stored inp when processing a HB timeout.
This fixes a bug which results in a crash.

MFC after: 3 days.


# 80c79bbe 17-Sep-2011 Michael Tuexen <tuexen@FreeBSD.org>

Fix the enabling/disabling of Heartbeats and path MTU
discovery when using the SCTP_PEER_ADDR_PARAMS socket option.
Approved by: re
MFC after: 1 month.


# 92776dfd 15-Sep-2011 Michael Tuexen <tuexen@FreeBSD.org>

Make sure that SCTP rejects broadcast, multicast and wildcard addresses
as remote addresses.

Approved by: re
MFC after: 1 month.


# c55b70ce 14-Sep-2011 Michael Tuexen <tuexen@FreeBSD.org>

Ensure that 1-to-1 style SCTP sockets can only be connected once.
Allow implicit setup also for 1-to-1 style sockets as described
in the latest version of the socket API ID.

Approved by: re
MFC after: 1 month


# 58bdb691 14-Sep-2011 Michael Tuexen <tuexen@FreeBSD.org>

Fix the handling of the flowlabel and DSCP value in the SCTP_PEER_ADDR_PARAMS
socket option.
Honor the net.inet6.ip6.auto_flowlabel sysctl setting.

Approved by: re (bz)
MFC after: 1 month.


# b10f2dc8 14-Aug-2011 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the spp_dscp field in the SCTP_PEER_ADDR_PARAMS
socket option. Backwards compatibility is provided by still
supporting the spp_ipv4_tos field.

Approved by: re@
MFC after: 2 months.


# ca85e948 03-Aug-2011 Michael Tuexen <tuexen@FreeBSD.org>

The result of a joint work between rrs@ and myself at the IETF:
* Decouple the path supervision using a separate HB timer per path.
* Add support for potentially failed state.
* Bring back RTO.min to 1 second.
* Accept packets on IP-addresses already announced via an ASCONF
* While there: do some cleanups.

Approved by: re@
MFC after: 2 months.


# 1a3b5ce2 12-Jul-2011 Michael Tuexen <tuexen@FreeBSD.org>

Don't check for SOCK_DGRAM anymore. Also remove multicast
related code which is not necessary anymore.


# e2e7c62e 15-Jun-2011 Michael Tuexen <tuexen@FreeBSD.org>

Add support for the newly added SCTP API.
In particular add support for:
* SCTP_SNDINFO, SCTP_PRINFO, SCTP_AUTHINFO, SCTP_DSTADDRV4, and
SCTP_DSTADDRV6 cmsgs.
* SCTP_NXTINFO and SCTP_RCVINFO cmsgs.
* SCTP_EVENT, SCTP_RECVRCVINFO, SCTP_RECVNXTINFO and SCTP_DEFAULT_SNDINFO
socket options.
* Special association ids (SCTP_FUTURE_ASSOC, ...)
* sctp_recvv() and sctp_sendv() functions.

MFC after: 1 month.


# 689e6a5f 08-May-2011 Michael Tuexen <tuexen@FreeBSD.org>

Fix a locking issue showing up on Mac OS X when subscribing to
authentication events. DTLS/SCTP renegotiations trigger the bug.

MFC after: 2 weeks.


# e6194c2e 30-Apr-2011 Michael Tuexen <tuexen@FreeBSD.org>

Improve compilation of SCTP code without INET support.
Some bugs were fixed while doing this:
* ASCONF-ACK messages might use wrong port number when using
IPv6.
* Checking for additional addresses takes the correct address
into account and also does not do more comparisons than
necessary.

This patch is based on one received from bz@ who was
sponsored by The FreeBSD Foundation and iXsystems.

MFC after: 1 week


# f79aab18 08-Mar-2011 Randall Stewart <rrs@FreeBSD.org>

Tunes and fixes the new DC-CC to seem to hit the
right mix. Still may need some tweaks but it
appears to almost not give away too much to an
RFC2581 flow, but can really minimize the amount of
buffers used in the net.

MFC after: 3 months


# 299108c5 26-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

Improvements to CC modules:
1) Add four new points that allow you to get more information
to cc algo's
2) Fix the case where the user changes the module on an existing TCB; in
such a case, the initialization module needs to be called on all nets.
3) Move htcp_cc structure to a union that other modules can use.
4) Add 5th point for get/set socket options for cc_module specific options

MFC after: 2 months


# 4c97400f 07-Feb-2011 Michael Tuexen <tuexen@FreeBSD.org>

Fix bugs related to M_FLOWID:
* Store the flowid when receiving an SCTP/IPv6 packet.
* Store the flowid when receiving an SCTP packet with wrong CRC.
* Initialize flowid correctly.
* Put test code under INVARIANTS.
MFC after: 3 months.


# 73403d414 07-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

1) Track when flowid does get set.
MFC after: 3 months


# a4ae38f1 05-Feb-2011 Michael Tuexen <tuexen@FreeBSD.org>

Add support for M_FLOWID.


# 0071ee5e 04-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

1) Fix cpu mapping per JB's suggestions
2) Fix it so INIT's don't always end up on CPU0

MFC after: 3 months


# c446091b 03-Feb-2011 Michael Tuexen <tuexen@FreeBSD.org>

Make sure that changing the ECN sysctl does not affect
existing associations and endpoints.

MFC after: 3 months.


# dec0177d 03-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

1) Move per John Baldwin to mp_maxid
2) Some signed/unsigned errors found by Mac OS compiler (from Michael)
3) a couple of copyright updates on the affected files.

MFC after: 3 months


# ae26e0a4 03-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

Fix the per CPU stats so that:
1) They don't use the giant "MAX_CPU" define and instead
are allocated dynamically based on mp_ncpus
2) Will zero with the netstat -z -s -p sctp
3) Will be properly handled by both the sctp_init and finish
(the multi-net stuff was incorrectly bzero'ing in sctp_init
the wrong size.. the bzero is now moved to the right places).
And of course the free is put in at the very end.

MFC after: 3 Months


# bfc46083 03-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

Adds an experimental option to create a pool of
threads. These serve as input threads and are queued
packets based on the V-tag number. This is similar to
what a modern card can do with queues for TCP... but
alas modern cards know nothing about SCTP.

MFC after: 3 months (maybe)


# 899288ae 02-Feb-2011 Randall Stewart <rrs@FreeBSD.org>

1) Allow a chunk to track the cwnd it was at when sent.
2) Add separate max-bursts for retransmit and hb. These
are set to sysctlable values but not settable via the
socket api. This makes sure we don't blast out HB's or
fast-retransmits.
3) Determine on the first data transmission on a net if
its local-lan (by being under or over a RTT). This
can later be used to think about different algorithms
based on locallan vs big-i (experimental)
4) The cwnd should NOT be allowed to grow when an ECNEcho
is seen (TCP has this same bug). We fix this in SCTP
so an ECNe being seen prevents an advance of cwnd.
5) CWR's should not be sent multiple times to the
same network, instead just updating the TSN being
transmitted if needed.

MFC after: 1 Month


# 493d8e5a 31-Jan-2011 Randall Stewart <rrs@FreeBSD.org>

More ECN fixes:
1) We now remove ECN-Nonce since it will no longer continue as a I-D
2) Eliminate last_tsn_echo, this tied us to an assoc not the net
and thus we were not doing m-homing on the ECN-Echo senders side right.
3) Increment the count going out even if the TSN is lower in the pending
ECN-Echo, this way the receiver knows exactly how many packets were
marked even with network re-ordering
4) Fix so we DO NOT stop doing delayed sack if a ECN Echo is in queue
MFC after: 1 month


# a21779f0 29-Jan-2011 Randall Stewart <rrs@FreeBSD.org>

Fixes to ECN in SCTP.
1) ECN was on an association basis, this is incorrect and
will not work with CMT or for that matter if the user
is sending to multiple addresses. This commit makes
ECN on a per path basis.
2) Adopt the new format for the ECN internet draft. This also
maintains compatibility with old format chunks as well.
3) Keep track of the real time of a RTT down to micro seconds.
For some future conditional features (for like a data center
this is good information to have).
MFC after: 1 month


# f7a77f6f 23-Jan-2011 Michael Tuexen <tuexen@FreeBSD.org>

Add stream scheduling support.
This work is based on a patch received from Robin Seggelmann.

MFC after: 3 months.


# 0e9a9c10 19-Jan-2011 Michael Tuexen <tuexen@FreeBSD.org>

Cleanup the management of CC functions.

MFC after: 3 months.


# 20b07a4d 30-Dec-2010 Michael Tuexen <tuexen@FreeBSD.org>

Define and use SCTP_SSN_GE, SCTP_SSN_GT, SCTP_TSN_GE, SCTP_TSN_GT macros
and use them instead of the generic compare_with_wrap.
Retire compare_with_wrap.

MFC after: 3 months.
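
The wrap-aware comparison that macros of this kind provide can be sketched
as below. This is only an illustration of serial-number arithmetic on 32-bit
TSNs; the names tsn_gt() and tsn_ge() are hypothetical and are not the
definitions used in the SCTP sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if TSN a comes "after" TSN b, modulo 2^32. */
static bool
tsn_gt(uint32_t a, uint32_t b)
{
	/*
	 * a is ahead of b when the forward distance from b to a is
	 * non-zero and smaller than half of the sequence space.
	 */
	uint32_t d = a - b;

	return (d != 0 && d < 0x80000000U);
}

/* Hypothetical helper: true if a is equal to or "after" b. */
static bool
tsn_ge(uint32_t a, uint32_t b)
{
	return (a == b || tsn_gt(a, b));
}
```

With this sketch, tsn_gt(0, 0xffffffff) is true because the sequence number
has wrapped, while a plain unsigned comparison would report the opposite.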


# 4a9ef3f8 30-Dec-2010 Michael Tuexen <tuexen@FreeBSD.org>

Code cleanup: Use LIST_FOREACH, LIST_FOREACH_SAFE, TAILQ_FOREACH,
TAILQ_FOREACH_SAFE where appropriate.
No functional change.

MFC after: 3 months.


# 6324ca61 25-Nov-2010 Randall Stewart <rrs@FreeBSD.org>

Adds new dtrace for cwnd functions and lays
groundwork for future dtrace points (rwnd flightsize etc).

MFC after: 2 months


# 27387dac 12-Nov-2010 Michael Tuexen <tuexen@FreeBSD.org>

Fix a locking issue reported by brucec@ affecting
1-to-1 style sockets which have not yet been
accepted.

MFC after: 3 days.


# 034b88b0 09-Nov-2010 Michael Tuexen <tuexen@FreeBSD.org>

Improve the scalability by using the local and remote port when
putting inps in the tcpephash.

MFC after: 3 days.
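
The idea of keying a hash bucket on both ports can be sketched as below.
This is an assumption-laden illustration, not the hash used by the SCTP
code: port_pair_bucket(), the multiplicative constant, and the power-of-two
table mask are all hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick a bucket from the local and remote port.
 * table_mask is assumed to be (table size - 1) for a power-of-two table.
 */
static uint32_t
port_pair_bucket(uint16_t lport, uint16_t rport, uint32_t table_mask)
{
	/* Combine both ports so that endpoints sharing a local port are
	 * spread over the table according to their remote port. */
	uint32_t key = ((uint32_t)lport << 16) | rport;

	/* Simple multiplicative mix; the high bits are the best mixed. */
	key *= 2654435761U;
	return ((key >> 16) & table_mask);
}
```

Hashing on the local port alone would place every connection to a popular
listening port in one chain; adding the remote port is what spreads them.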


# b1ce21c6 09-Nov-2010 Rebecca Cran <brucec@FreeBSD.org>

Fix typos.

PR: bin/148894
Submitted by: olgeni


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# f8faf20c 19-Sep-2010 Michael Tuexen <tuexen@FreeBSD.org>

Fix a locking issue which shows up when the code is used
on Mac OS X.

MFC after: 2 weeks.


# b3f7949d 15-Sep-2010 Michael Tuexen <tuexen@FreeBSD.org>

Remove old debug code.

MFC after: 2 weeks.


# 9eea4a2d 15-Sep-2010 Michael Tuexen <tuexen@FreeBSD.org>

Delay the assignment of a path for DATA chunk until they hit
the sent_queue. Honor a given path when the SCTP_ADDR_OVER
flag is set.

MFC after: 2 weeks.


# 52129fcd 05-Sep-2010 Randall Stewart <rrs@FreeBSD.org>

Fix some CLANG warnings. One clang warning is left
due to the fact that it's bogus.. nam->sa_family will
not change from AF_INET6 to AF_INET (but clang
thinks it does ;-D)


# fc048708 01-Sep-2010 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug which results in peer IPv4 addresses a.b.c.d with 224<=d<=239
incorrectly being detected as multicast addresses on little endian systems.

MFC after: 2 weeks
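
A bug of this shape typically arises when the multicast range test is
applied to an address that is still in network byte order. The sketch below
shows a byte-order-safe check; ipv4_is_multicast() is a hypothetical helper
written for illustration and is not the code changed by this commit.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: addr_net is an IPv4 address in network byte order. */
static bool
ipv4_is_multicast(uint32_t addr_net)
{
	/*
	 * Convert to host order first, so that the top four bits are the
	 * leading bits of the first octet on every platform.
	 */
	uint32_t addr_host = ntohl(addr_net);

	/* 224.0.0.0/4 is the IPv4 multicast range. */
	return ((addr_host & 0xf0000000U) == 0xe0000000U);
}
```

Without the ntohl() call, a little endian host would test the last octet
instead of the first, which matches the symptom described above.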


# 20083c2e 28-Aug-2010 Michael Tuexen <tuexen@FreeBSD.org>

Fix the switching on/off of CMT using sysctl and socket option.
Fix the switching on/off of PF and NR-SACKs using sysctl.
Add minor improvement in handling malloc failures.
Improve the address checks when sending.

MFC after: 4 weeks


# 8db924de 26-Jul-2010 Randall Stewart <rrs@FreeBSD.org>

Make sure that we report chunks if a socket
still exists that were not sent. In either
case carefully remove the data if it does not
get taken by the reporting routines.

MFC after: 2 weeks


# 606c58db 02-Jul-2010 Randall Stewart <rrs@FreeBSD.org>

Fix a bug that WILL cause a panic. Basically
a read-lock is being called to check the vtag-timewait cache.
Then in two cases (where a vtag is bad i.e. in the time-wait
state) the write-unlock is called NOT the read-unlock. Under
conditions where lots of associations are coming and going
this will cause the system to panic at some point.

MFC after: 3 days


# 370d524f 24-Jun-2010 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug I introduced in r209470.

MFC after: 3 days


# 749c49ac 23-Jun-2010 Michael Tuexen <tuexen@FreeBSD.org>

* Implement sctp_does_stcb_own_this_addr() correctly. It was taking the
wrong side into account.
* sctp_findassociation_ep_addr() must check the local address if available.
This fixes a bug where ABORT chunks were accepted even in the case where
the local was not owned by the endpoint.
Thanks to brucec for pointing out a bug in my first version of the fix.
MFC after: 3 days


# 5483bc18 22-Jun-2010 Michael Tuexen <tuexen@FreeBSD.org>

MFC 209264
* Fix a bug where the length of the ASCONF-ACK was calculated wrong due
to using an uninitialized variable.
* Fix a bug where a NULL pointer was dereferenced when interfaces
come and go at a high rate.
* Fix a bug where inps were not deregistered from iterators.
* Fix a race condition in freeing an association.
* Fix a refcount problem related to the iterator.
Each of the above bugs results in a panic. It shows up when
interfaces come and go at a high rate.

Approved by: re


# fc066a61 14-Jun-2010 Michael Tuexen <tuexen@FreeBSD.org>

* Fix a bug where the length of the ASCONF-ACK was calculated wrong due
to using an uninitialized variable.
* Fix a bug where a NULL pointer was dereferenced when interfaces
come and go at a high rate.
* Fix a bug where inps were not deregistered from iterators.
* Fix a race condition in freeing an association.
* Fix a refcount problem related to the iterator.
Each of the above bugs results in a panic. It shows up when
interfaces come and go at a high rate.

Obtained from: rrs (partly)
MFC after: 3 days


# cd89751f 11-Jun-2010 Michael Tuexen <tuexen@FreeBSD.org>

MFC 209029

3 Fixes -
a) There was a case where a ICMP message could cause
us to return leaving a stuck lock on an stcb.
b) The iterator needed some tweaks to fix its lock
ordering.
c) The ITERATOR_LOCK is no longer needed in the freeing
of a stcb. Now that the timer based one is gone we don't
have a multiple resume situation. Add to that that there
was somewhere a path out of the freeing of an assoc that
did NOT release the iterator_lock.. it was time to clean
this old code up and in the process fix the lock bug.

Approved by: re (bz)


# ec4c19fc 10-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

3 Fixes -
a) There was a case where a ICMP message could cause
us to return leaving a stuck lock on an stcb.
b) The iterator needed some tweaks to fix its lock
ordering.
c) The ITERATOR_LOCK is no longer needed in the freeing
of a stcb. Now that the timer based one is gone we don't
have a multiple resume situation. Add to that that there
was somewhere a path out of the freeing of an assoc that
did NOT release the iterator_lock.. it was time to clean
this old code up and in the process fix the lock bug.

MFC after: 1 week


# 2a0266f7 10-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

MFC:
Fix a number of bugs and race conditions.
r208160: Bring back of the iterator thread. It now properly handles VNETS
having only one thread. The old timer based code was full of
LOR's and other issues.

r208852: Cleanup bug. Basically when an un-accepted socket was hanging on a
closed listener, we would leak the inp never cleaning it up

r208853: Enhance the use under invariants of the audit for locks function
and fix a bug where a close collision with a cookie being processed
would cause a crash.

r208854: Use the proper increment macros when working with the
sent_queue_retran_cnt

r208855: Align comments properly, Fix a bug where we were NOT looking at the
resend markings for control chunks and also not decrementing the
retran count which caused extra calls to retransmission. Also add
a valid no locks call to the output routine.

r208856: Spacing issues in auth/bsd addr.

r208857: Get rid of a windows ifdef that somehow leaked in

r208863: Missing error leg returns in some failure cases

r208864: LOR fix between the iterator and sctp_inpcb_close

r208874: Don't call the sctp_inpcb_free from abort an association since you
don't know what locks you hold and a timer will take care of the
situation when the gone flag is set

r208875: sctp_inpcb_free bug - a socket under the right situation could get
stuck (from the accept queue) and never start the proper cleanup
timer)

r208876: Further enhance invariant lock validation, Fix a bug where a closed
socket and a INIT-ACK could collide and cause a crash

r208878: Clear up another bug in sctp_inpcb_free where we would end up due
to a race in freeing hit a destroy of a contended lock.

r208879: Optimize the cleanup and make some additional fixes in the sysctl
code so that it won't reference a GONE INP and crash us

r208883 & r208891: Fix so we don't open a hole between a sock lock and a call
to socantrcvmore.. we could before hit a race that would kill the
socket underneath us leading to a crash

r208897: CUM-ACK calculation was messed up. So basically large message got
broken from the original NR_sack integration.

r208902: Make sure that we don't move a bit to the NR array that is behind
the cum-ack

r208952: Use both bit maps to calculate the cum-ack.

r208953: Fix bug having to do with freeing an sctp_inpcb_free().
1) make sure not to remove the flag until you get the lock again.
2) make sure all log_closing calls hold the lock.
3) Release all the locks when everything is done and call callout_drain
not callout_stop..

r208970: Fix some places on user allocation of a new sctp_inpcb where we run
out of resource that we make sure to NULL the so_pcb pointer.
Approved by: re - (bz@freebsd.org)


# 41291ef0 09-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Found by Michael. In cases where we run
out of memory (no more inp space) we don't
properly NULL the INP on return.

Obtained from: tuexen
MFC after: 3 Days


# b3a44e46 09-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Fix several bugs all having to do with freeing an
sctp_inpcb:
1) Make sure not to remove the flag on the PCB until
after the close() caller is back in control with the
lock. Otherwise a quickly freeing assoc could kill the
inpcb and cause a panic.

2) Make sure all calls to log_closing have not released
the locks before calling the log function, we don't
want the logging function to crash us due to a freed
inpcb.

3) Make sure that when we get to the end, we release all
locks (after removing them from view) and as long as
we are NOT the inp-kill timer removing the inp, call
the callout_drain() function so a racing timer won't
later call in and cause a racing crash.
MFC after: 1 week


# b9771f04 07-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Oops... my bad.. we don't need a SOCK_UNLOCK() after
calling socantrcvmore_locked() since it will unlock
the lock for you.

MFC after: 1 week


# 9ed1e280 06-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Fix so we call socantrcvmore_locked so we
don't see a race where we unlock to call
the non-locked version and have the socket
go away.

MFC after: 1 week


# 8ce4a9a2 06-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

1) Optimize the cleanup and don't always depend on
the timer. This is done by considering the locks
we will destroy and if they are contended we consider
it the same as a reference count being up. Fixing this
appears to cleanup another crash that was appearing with
all the timers where the socket buf lock got corrupted.

2) Fix the sysctl code to take a lot more care when looking
at INP's that are in the GONE or ALLGONE state.

MFC after: 1 week


# 0c7dc840 06-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Ok, yet another bug in killing off all the hundreds
of apitesters.. Basically we end up with attempting
to destroy a lock that's contended on. A cookie echo
arrives at the same time that the close is happening.
The close gets the lock but the cookie echo has already
passed the check for the gone flag and is then locked
waiting on the create lock.. when we go to destroy it
bam. For now we do the timer destroy for all calls
to close.. We can probably optimize this later so that
we check what's being contended on and if there is contention
then do the timer thing. but this is probably safest since
the inp has been removed from all lists and references and
only the timer can find it.. once the locks are released all
other places will instantly see the GONE flag and bail (that's
what the change in sctp_input is one place that was lacking
the bail code).

MFC after: 1 week


# 7c82e9fa 06-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Fix a bug in the sctp_inpcb_free. Basically if the socket
was setup to do an abortive close an association that was
in the accept_queue could get stuck and never freed. Now
we properly start the kill timer on the socket and turn
off the flag (same thing we do for the graceful close method).
MFC after: 1 week


# 2c6b25b4 05-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

Hopefully this fixes a LOR by making
so we only hold the iterator lock during
updates to the iterators work.

MFC after: 1 week


# 62fb761f 05-Jun-2010 Randall Stewart <rrs@FreeBSD.org>

This fixes a bug in the close up of a socket that
had un-accepted assoc's. Basically the assoc (and inp)
would get stuck and never get cleaned up.

MFC after: 1 week


# f7517433 16-May-2010 Randall Stewart <rrs@FreeBSD.org>

This adds back the Iterator to the sctp
code base. We now properly have ONE thread
that services all VNET's. Also we purge out
the old timer based iterator code which had
multiple LOR's and other issues.

MFC after: 3 days


# 93c3efa7 16-May-2010 Randall Stewart <rrs@FreeBSD.org>

MFC of 207924:

This fixes a bug with the one-2-one model socket when a
user sets up a socket to a server sends data and closes
the socket before the server has called accept(). It used
to NOT work at all. Now we add a flag to the assoc and
defer assoc cleanup so that the accept will succeed


# 88a7eb29 11-May-2010 Randall Stewart <rrs@FreeBSD.org>

This fixes a bug with the one-2-one model socket when a
user sets up a socket to a server sends data and closes
the socket before the server has called accept(). It used
to NOT work at all. Now we add a flag to the assoc and
defer assoc cleanup so that the accept will succeed.


# 17f2eabb 16-Apr-2010 Randall Stewart <rrs@FreeBSD.org>

MFC of 206137

This is Part III of the great IETF hack-a-thon to fix
the NR-Sack code. (the last one on the cpu options
was a lull.. i.e MFC 205629).. still 2 more to go.


# 07072810 16-Apr-2010 Randall Stewart <rrs@FreeBSD.org>

MFC of 205629

Adds the option of separating out the sctp stats per
processor. This will be refined further and is definitely
exploratory (which is why it's an option) i.e. making it
allocate the actual number of processors is coming ;-D.


# b5c16493 03-Apr-2010 Michael Tuexen <tuexen@FreeBSD.org>

* Fix some race condition in SACK/NR-SACK processing.
* Fix handling of mapping arrays when draining mbufs or processing
FORWARD-TSN chunks.
* Cleanup code (no duplicate code anymore for SACKs and NR-SACKs).
Part of this code was developed together with rrs.
MFC after: 2 weeks.


# ff014514 24-Mar-2010 Randall Stewart <rrs@FreeBSD.org>

Adds the option of keeping per-cpu statistics in SCTP. This
may be useful since it gets rid of atomics but I want it to
remain an option until I can do further testing on if it really
speeds things up.


# b93b253d 24-Jan-2010 Michael Tuexen <tuexen@FreeBSD.org>

MFC 202449:

Get rid of support of an old version of the SCTP-AUTH draft.
Get rid of unused MD5 code.


# 06ee5047 17-Jan-2010 Michael Tuexen <tuexen@FreeBSD.org>

MFC 201523

Correct usage of parentheses.


# 33dabcc0 17-Jan-2010 Michael Tuexen <tuexen@FreeBSD.org>

MFC 199437

Use always LIST_EMPTY instead of sometime SCTP_LIST_EMPTY,
which is defined as LIST_EMPTY.


# 5661a9ed 16-Jan-2010 Michael Tuexen <tuexen@FreeBSD.org>

Get rid of support of an old version of the SCTP-AUTH draft.
Get rid of unused MD5 code.

MFC after: 1 week


# f5366806 04-Jan-2010 Michael Tuexen <tuexen@FreeBSD.org>

Correct usage of parentheses.

PR: kern/142066
Approved by: rrs (mentor)
Obtained from: Henning Petersen, Bruce Cran.
MFC after: 2 weeks


# cf19fced 07-Dec-2009 Michael Tuexen <tuexen@FreeBSD.org>

MFC 197288,197326,197327,197328,197342,197914,197929,
197955,199365,199370,199371,199373,199866
This MFCs all SCTP/VNET relevant fixes from head.

Approved by: rrs (mentor)


# 83fc1165 17-Nov-2009 Michael Tuexen <tuexen@FreeBSD.org>

Use always LIST_EMPTY instead of sometime SCTP_LIST_EMPTY,
which is defined as LIST_EMPTY.

Approved by: rrs (mentor)
MFC after: 1 month


# b6c57802 17-Nov-2009 Michael Tuexen <tuexen@FreeBSD.org>

Fix a memory leak when destroying an SCTP stack.
Clean up sctp_pcb_finish().
Approved by: rrs (mentor)
MFC after: 1 month


# f71e78a1 10-Oct-2009 Michael Tuexen <tuexen@FreeBSD.org>

Fix a race condition where a mutex was destroyed while sleeping on it.
Found while analyzing a report from julian. It might fix his bug.
Approved by: rrs (mentor)
MFC after: 3 days


# 4b6492f5 20-Sep-2009 Michael Tuexen <tuexen@FreeBSD.org>

Fix handling of sctp_drain().

Approved by: rrs (mentor)
MFC after: 2 months


# 30c3a843 19-Sep-2009 Michael Tuexen <tuexen@FreeBSD.org>

Fix the disabling of sctp_drain().

Approved by: rrs (mentor)
MFC after: 1 month.


# 8518270e 19-Sep-2009 Michael Tuexen <tuexen@FreeBSD.org>

Get SCTP working in combination with VIMAGE.
Contains code from bz.
Approved by: rrs (mentor)
MFC after: 1 month.


# 482444b4 17-Sep-2009 Randall Stewart <rrs@FreeBSD.org>

Support for VNET in SCTP (hopefully)


# 04a34c6c 16-Sep-2009 Michael Tuexen <tuexen@FreeBSD.org>

Fixes two bugs:
1) A lock issue, if we ever had to try again
we would double lock the INP lock.
2) We were allowing (at wrap) associd 0... which really
we cannot allow since 0 normally means in most socket
API calls that we are wishing to affect something on
the INP not TCB.

Approved by: re, rrs (mentor)


# f3d06a3c 13-Sep-2009 Randall Stewart <rrs@FreeBSD.org>

Fixes two bugs:
1) A lock issue, if we ever had to try again
we would double lock the INP lock.
2) We were allowing (at wrap) associd 0... which really
we cannot allow since 0 normally means in most socket
API calls that we are wishing to affect something on
the INP not TCB.

MFC after: 1 week
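
The second fix amounts to skipping the reserved value when the id counter
wraps. A minimal sketch of that idea follows; struct id_alloc and
next_assoc_id() are hypothetical names, and the sketch leaves out the
check for ids that are still in use.

```c
#include <stdint.h>

/* Hypothetical allocator state; not the kernel's data structure. */
struct id_alloc {
	uint32_t last_id;
};

/* Return the next association id, never handing out the reserved 0. */
static uint32_t
next_assoc_id(struct id_alloc *a)
{
	a->last_id++;
	if (a->last_id == 0) {
		/* The counter wrapped: 0 means "the endpoint", so skip it. */
		a->last_id = 1;
	}
	return (a->last_id);
}
```

Starting from last_id = 0xfffffffe, successive calls return 0xffffffff, 1,
and then 2, so callers never see an association id of 0.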


# ca007251 15-Aug-2009 Michael Tuexen <tuexen@FreeBSD.org>

MFC r196260.
* Fix a bug where PR-SCTP settings are ignored when using implicit
association setup.
* Fix a bug where messages with illegal stream ids are not deleted.
* Fix a crash when reporting back unsent messages from the send_queue.
* Fix a bug related to INIT retransmission when the socket is already
closed.
* Fix a bug where associations were stalled when partial delivery API
was enabled.
* Fix a bug where the receive buffer size was smaller than the
partial_delivery_point.

Approved by: re, rrs (mentor)


# 810ec536 15-Aug-2009 Michael Tuexen <tuexen@FreeBSD.org>

* Fix a bug where PR-SCTP settings are ignored when using implicit
association setup.
* Fix a bug where messages with illegal stream ids are not deleted.
* Fix a crash when reporting back unsent messages from the send_queue.
* Fix a bug related to INIT retransmission when the socket is already
closed.
* Fix a bug where associations were stalled when partial delivery API
was enabled.
* Fix a bug where the receive buffer size was smaller than the
partial_delivery_point.

Approved by: re, rrs (mentor)
MFC after: One day.


# a16ccdce 30-May-2009 Randall Stewart <rrs@FreeBSD.org>

Adds missing sysctl to manage the vtag_time_wait time. This will
even allow disabling time-wait altogether if you set the value
to 0 (not advisable actually). The default remains the same
i.e. 60 seconds.


# bf1be571 30-May-2009 Randall Stewart <rrs@FreeBSD.org>

Fix a small memory leak from the nr-sack code - the mapping array
was not being freed at term of association. Also get rid of
the MICHAELS_EXP code.


# 8933fa13 04-Apr-2009 Randall Stewart <rrs@FreeBSD.org>

Many bug fixes (from the IETF hack-fest):
- PR-SCTP had major issues when skipping through a multi-part message.
o Did not look at socket buffer.
o Did not properly handle the reassembly queue.
o The MARKED segments could interfere and un-skip a chunk causing
a problem with the proper FWD-TSN.
o No FR of FWD-TSN's was being done.
- NR-Sack code was basically disabled. It needed fixes that
never got into the real code.
- CMT code had issues when the two paths were NOT the same b/w. We
found a few small bugs, but also the critical one here was not
dividing the rwnd amongst the paths.

Obtained from: Michael Tuexen and myself at the IETF hack-fest ;-)


# ea44232b 20-Feb-2009 Randall Stewart <rrs@FreeBSD.org>

Add the add-stream capability. Still needs more
testing..

MFC after: 1 month


# c3b8c73c 13-Feb-2009 Randall Stewart <rrs@FreeBSD.org>

Have the jail code use the error returned to pass not constant
errors.
Obtained from: jamie@freebsd.org


# a99b6783 03-Feb-2009 Randall Stewart <rrs@FreeBSD.org>

- Cleanup checksum code.
- Prepare for CRC offloading, add MIB counters (RS/MT).
- Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT).
- Bugfix: Handle close() with SO_LINGER correctly when notifications
are generated during the close() call(MT).
- Bugfix: Generate DRY event when sender is dry during subscription.
Only for 1-to-1 style sockets (RS/MT)
- Bugfix: Put vtags for the correct amount of time into time-wait (MT).
- Bugfix: Clear vtag entries correctly on expiration (MT).
- Bugfix: shutdown() indicates ENOTCONN when called for unconnected
1-to-1 style sockets (MT).
- Bugfix: In sctp Auth code (PL).
- Add support for devices that support SCTP csum offload (igb).
- Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS)
Obtained from: With help from Peter Lei and Michael Tuexen


# 385195c0 10-Dec-2008 Marko Zec <zec@FreeBSD.org>

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instantiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 830d754d 06-Dec-2008 Randall Stewart <rrs@FreeBSD.org>

Code from the hack-session known as the IETF (and a
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less stringent requirements in non-uniqueness.
o don't pre-hash them when you issue one in a cookie.
o Allow duplicates and use addresses and ports to
discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
is still experimental and needs more extensive testing with the
Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
if the adv-peer-ack point progressed. However we still make sure
a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
unreliable but not abandoned yet.

With the help of: Michael Tuexen and Peter Lei :-)
MFC after: 4 weeks


# 413628a7 29-Nov-2008 Bjoern A. Zeeb <bz@FreeBSD.org>

MFp4:
Bring in updated jail support from bz_jail branch.

This enhances the current jail implementation to permit multiple
addresses per jail. In addition to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..

SCTP support was updated and supports IPv6 in jails as well.

Cpuset support permits jails to be bound to specific processor
sets after creation.

Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.

DDB 'show jails' command was added to aid debugging.

Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.

Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.

Bump __FreeBSD_version for the aforementioned and in kernel changes.

Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
on cluster machines as well as all the testers and people
who provided feedback the last months on freebsd-jail and
other channels.
- My employer, CK Software GmbH, for the support so I could work on this.

Reviewed by: (see above)
MFC after: 3 months (this is just so that I get the mail)
X-MFC Before: 7.2-RELEASE if possible


# 6974bd9e 27-Nov-2008 Bjoern A. Zeeb <bz@FreeBSD.org>

Unify ipsec[46]_delete_pcbpolicy in ipsec_delete_pcbpolicy.
Ignoring different names because of macros (in6pcb, in6p_sp) and
inp vs. in6p variable name both functions were entirely identical.

Reviewed by: rwatson (as part of a larger changeset)
MFC after: 6 weeks (*)
(*) possibly need to leave a stub wrappers in 7 to keep the symbols.


# a1e13272 12-Nov-2008 Randall Stewart <rrs@FreeBSD.org>

-Improvement: Add '\n' on debug output in sctp_lower_sosend().
-Improvement: panic() on INVARIANTS kernels if memory allocation
fails for a tagblock in sctp_add_vtag_to_timewait().
-Bugfix: Protect code in sctp_is_in_timewait() by
SCTP_INP_INFO_WLOCK/SCTP_INP_INFO_WUNLOCK.
-Cleanup: Get rid of unused variable now in sctp_init_asoc().
-Bugfix: Reuse the correct vtag in sctp_add_vtag_to_timewait().
-Cleanup: Get rid of unused constant SCTP_TIME_WAIT_SHORT
in sctp_constants.h.
-Improvement: Use all hash buckets of the vtag hash table.
-Cleanup: Get rid of then unused constant SCTP_STACK_VTAG_HASH_SIZE_A.
-Bugfix: Handle SHUTDOWN;SACK packet correctly.
-Bugfix: Last TSN in a gap ack block was not being "ack'd"
in the internal scoreboard.
Obtained from: (with help from Michael Tuexen)


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 4a16c2c8 27-Aug-2008 Randall Stewart <rrs@FreeBSD.org>

- When we close a socket with pending assoc's that are still
shutting down, NULL out the socket pointer so we won't
ever refer to a dead socket.

Obtained from: Neil Wilson


# 603724d3 17-Aug-2008 Bjoern A. Zeeb <bz@FreeBSD.org>

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


# 6d9e8f2b 31-Jul-2008 Randall Stewart <rrs@FreeBSD.org>

Adds support for the SCTP_PORT_REUSE option
Fixes a refcount bug found in the process

Obtained from: With the help of Michael Tuexen


# fc14de76 09-Jul-2008 Randall Stewart <rrs@FreeBSD.org>

1) Adds the rest of the VIMAGE change macros
2) Adds some __UserSpace__ on some of the common defines that
the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
need to a) trim off the data chunk headers, if present, and
b) make sure the frag bit is communicated properly for the
msgs coming off the stream queues... i.e. we see if some
of the msg has been taken.

Obtained from: jeli contributed the VIMAGE changes on this pass Thanks Julian!


# 97a7b90f 14-Jun-2008 Randall Stewart <rrs@FreeBSD.org>

More prep for Vimage:
- only one function to destroy an SCTP stack sctp_finish()
- Make it so this function also arranges for any threads
created by the image to do a kthread_exit()


# b3f1ea41 14-Jun-2008 Randall Stewart <rrs@FreeBSD.org>

- Macro-izes the packed declaration in all headers.
- Vimage prep - these are major restructures to move
all global variables to be accessed via a macro or two.
The variables all go into a single structure.
- Asconf address addition tweaks (add_or_del Interfaces)
- Fix rwnd calculation to be more conservative.
- Support SACK_IMMEDIATE flag to skip delayed sack
by demand of peer.
- Comment updates in the sack mapping calculations
- Invariants panic added.
- Pre-support for UDP tunneling (we can do this on
MAC but will need added support from UDP to
get a "pipe" of UDP packets in).
- clear trace buffer sysctl added when local tracing on.

Note the majority of this huge patch is all the vimage prep stuff :-)


# c54a18d2 20-May-2008 Randall Stewart <rrs@FreeBSD.org>

- Adds support for the multi-asconf (From Kozuka-san)
- Adds some prepwork (Not all yet) for vimage in particular
support the delete the sctppcbinfo.xx structs. There is
still a leak in here if it were to be called plus we still
need the regrouping (From Me and Michael Tuexen)
- Adds support for UDP tunneling. For BSD there is no
socket yet setup so its disabled, but major argument
changes are in here to encompass the passing of the port
number (zero when you don't have a udp tunnel, the default
for BSD). Will add some hooks in UDP here shortly (discussed
with Robert) that will allow easy tunneling. (Mainly from
Peter Lei and Michael Tuexen with some BSD work from me :-D)
- Some ease for windows, evidently leave is reserved by their
compiler; move label leave: -> out:

MFC after: 1 week


# 5e2c2d87 16-Apr-2008 Randall Stewart <rrs@FreeBSD.org>

Allow SCTP to compile without INET6.
PR: 116816
Obtained from tuexen@fh-muenster.de:
MFC after: 2 weeks


# 7a846e9a 22-Feb-2008 Randall Stewart <rrs@FreeBSD.org>

Fixes a memory leak when VRF's are in play.

Submitted by: Prasad Narasimha (snprasad@cisco.com)
Reviewed by: rrs


# 3ca1bcee 28-Jan-2008 Randall Stewart <rrs@FreeBSD.org>

- Fix a comment about prison.
- Fix it so the VRF is captured while locks are held.
MFC after: 1 week


# fb8fb8f8 30-Oct-2007 Randall Stewart <rrs@FreeBSD.org>

- Change the Time Wait of vtags value to match the cookie-life
- Select a tag gains ability to optionally save new tags
off in the timewait system.
- When looking up associations do not give back a stcb that
is in the about-to-be-freed state, and instead continue
looking for other candidates.
- New function to query to see if value is in time-wait.
- Timewait had a time comparison error that caused very
few vtags to actually stay in time-wait.
- When setting tags in time-wait, we now use the time
requested NOT a fixed constant value.
- sstat now gets the proper associd when we do the query.
- When we process an association, we expect the tag chosen
(if we have one from a cookie) to be in time-wait. Before
we would NOT allow the assoc up by checking if its good.
In theory this should have caused almost all assoc not
to come up except for the time-comparison bug above (this
bug was hidden by the time comparison bug :-D).
- Don't save tags for nonce values in the time-wait cache
since these are used only during cookie collisions and do
not matter if they are unique or not.
MFC after: 1 week


# b201f536 16-Oct-2007 Randall Stewart <rrs@FreeBSD.org>

- fix sctp_ifn initial refcount issue (prevents deletion)
- fix a bug during cookie collision that prevented an
association from coming up in a specific restart case.
- Fix it so the shutdown-pending flag gets removed (this is
more for correctness than needed) when we enter shutdown-sent
or shutdown-ack-sent states.
- Fix a bug that caused the receiver to sometimes NOT send
a SACK when a duplicate TSN arrived. Without this fix
it was possible for the association to fall down if the
- Deleted primary destination is also stored when SCTP_MOBILITY_BASE.
(Previously, it is stored when only SCTP_MOBILITY_FASTHANDOFF)
- Fix a locking issue where we might call send_initiate_ack() and
incorrectly state the lock held/not held. Also fix it so that
when we release the lock the inp cannot be deleted on us.
- Add the debug option that can cause the stack to panic instead
of aborting an assoc. This does not and should never show up
in options but is useful for debugging unexpected aborts.
- Add cumack_log sent to track sending cumack information for
the debug case where we are running a special log per assoc.
- Added extra () around sctp_sbspace macro to avoid compile warnings.
MFC after: 1 week


# 8d3b5e7a 06-Oct-2007 Randall Stewart <rrs@FreeBSD.org>

- Fix the one-2-one model to properly do a socantrecv()
Approved by: re@freeBSD.org (Ken Smith)


# d55b0b1b 30-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- Bug fix managing congestion parameter on immediate
retransmission by handover event (fast mobility code)
- Fixed problem of mobility code which is caused by remaining
parameters in the deleted primary destination.
- Add a missing lock. When a peer sends an INIT, and while we
are processing it to send an INIT-ACK the socket is closed,
we did not hold a lock to keep the socket from going away.
Add protection for this case.
- Fix so that arwnd always uses the minimal rwnd if the user
has set the socket buffer smaller. Found this when the test
org decided to see what happens when you set in a rwnd of 10
bytes (which is not allowed per RFC .. 4k is minimum).
- Fixes so a cookie-echo ootb will NOT cause an abort to
be sent. This was happening in a MPI collision case.
- Examined all panics and unless there was no recovery, moved
any that were not already to INVARIANTS.

Approved by: re@freebsd.org (gnn)


# baf3da66 20-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- fix (global) address handling in the presence of duplicates, the
last interface should own the address, but the current code
fumbles the handoff. This fixes that.
- move address related debugs to PCB4 and add additional ones to
help in debugging address problems.

Approved by: re@freebsd.org (K Smith)


# c99efcf6 18-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- The address lock is changed to a rwlock. This
also involves macro changes to have a RLOCK and a WLOCK
and placing the correct version within the code.
- The INP-INFO lock is changed to a rwlock.
- When sctp_shutdown() is called on Mac OS X, the socket lock is held.
So call sctp_chunk_output with SCTP_SO_LOCKED and
not SCTP_SO_NOT_LOCKED.
- Add SCTP_IPI_ADDR_[RW]LOCK and SCTP_IPI_ADDR_[RW]UNLOCK for Mac OS X.
- u_int64_t -> uint64_t
- add missing addr unlock for error return path
Approved by: re@freebsd.org (K Smith)


# b27a6b7d 13-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- DF bit was on for COOKIE-ECHO chunks. This is
incorrect and should be OFF letting IP fragment
large cookie-echos.
- Rename sysctl variable logging to log_level.
- Fix description of sysctl variable stats.
- Add sysctl variable log to make sctp_log readable via sysctl
mechanism (this is by compile switch and targets non KTR platforms or
when someone wants to do performance wise tracing).
- Removed debug code

Approved by: re@freebsd.org (B Mah)


# 04ee05e8 13-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- Incorrect error EAGAIN returned for invalid send on a locked
stream (using EEOR mode). Changed to EINVAL (in sctp_output.c)
- Static analysis comments added
- fix in mobility code to return a value (static analysis found).
- sctp6_notify function made visible instead of
static (this is needed for Panda).

Approved by: re@freebsd.org (B Mah)


# 851b7298 08-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- send call has a reference to uio->uio_resid in
the recent send code, but uio may be NULL on sendfile
calls. Change to use sndlen variable.
- EMSGSIZE is not being returned in non-blocking mode
and needs a small tweak to look if the msg would
ever fit when returning EWOULDBLOCK.
- FWD-TSN has a bug in stream processing which could
cause a panic. This is a follow on to the codenomicon
fix.
- PDAPI level 1 and 2 do not work unless the reader
gets his returned buffer full. Fix so we can break
out when at level 1 or 2.
- Fix fast-handoff features to copy across properly on
accepted sockets
- Fix sctp_peeloff() system call when no true system call
exists to screen arguments for errors. In cases where a
real system call exists the system call itself does this.
- Fix raddr leak in recent add-ip code change for bundled
asconfs (even when non-bundled asconfs are received)
- Make sure ipi_addr lock is held when walking global addr
list. Need to change this lock type to a rwlock().
- Add don't wake flag on both input and output when the
socket is closing.
- When deleting an address verify the interface is correct
before allowing the delete to process. This protects panda
and unnumbered.
- Clean up old sysctl stuff and get rid of the old Open/Net
BSD structures.
- Add a function to watch the ranges in the sysctl sets.
- When appending in the reassembly queue, validate that
the assoc has not gone to about to be freed. If so
(in the middle) abort out. Note this especially affects
MAC I think due to the lock/unlock they do (or with
LOCK testing in place).
- Netstat patch to get rid of warnings.
- Make sure that no data gets queued to inactive/unconfirmed
destinations. This especially affects CMT but also has an
impact on regular SCTP as well.
- During init collision when we detect seq number out
of sync we need to treat it like Case C and discard
the cookie (no invariant needed here).
- Atomic access to the random store.
- When we declare a vtag good, we need to shove it
into the time wait hash to prevent further use. When
the tag is put into the assoc hash, we need to remove it
from the twait hash (where it will surely be). This prevents
duplicate tag assignments.
- Move decr-ref count to better protect sysctl out of
data.
- ltrace error corrections in sctp6_usrreq.c
- Add hook for interface up/down to be sent to us.
- Make sysctl() exported structures independent of processor
architecture.
- Fix route and src addr cache clearing for delete address case.
- Make sure address marked SCTP_DEL_IP_ADDRESS is never selected
as src addr.
- in icmp handling fixed so we actually look at the icmp codes
to figure out what to do.
- Modified mobility code.
Reception of DELETE IP ADDRESS for a primary destination and
SET PRIMARY for a new primary destination is used for
retransmission trigger to the new primary destination.
Also, in this case, destination of chunks in send_queue are
changed to the new primary destination.
- Fix so that we disallow sending by mbuf to ever have EEOR
mode set upon it.

Approved by: re@freebsd.org (B Mah)


# ceaad40a 08-Sep-2007 Randall Stewart <rrs@FreeBSD.org>

- Locking compatibility changes. This involves adding
additional flags to many function calls. The flags only
get used in BSD when we compile with lock testing. These
flags allow apple to escape the "giant" lock it holds on
the socket and have more fine-grained locking in the NKE.
It also allows us to test (with witness) the locking used
by apple via a compile switch (manually applied).

Approved by: re@freebsd.org(B Mah)


# 2afb3e84 26-Aug-2007 Randall Stewart <rrs@FreeBSD.org>

- During shutdown pending, when the last sack came in and
the last message on the send stream was "null" but still
there, a state we allow, we could get hung and not clean
it up and wait for the shutdown guard timer to clear the
association without a graceful close. Fix this so that
we properly clean up.
- Added support for Multiple ASCONF per new RFC. We only
(so far) accept input of these and cannot yet generate
a multi-asconf.
- Sysctl'd support for experimental Fast Handover feature. Always
disabled unless sysctl or socket option changes to enable.
- Error case in add-ip where the peer supports AUTH and ADD-IP
but does NOT require AUTH of ASCONF/ASCONF-ACK. We need to
ABORT in this case.
- According to the Kyoto summit of socket api developers
(Solaris, Linux, BSD). We need to have:
o non-eeor mode messages be atomic - Fixed
o Allow implicit setup of an assoc in 1-2-1 model if
using the sctp_**() send calls - Fixed
o Get rid of HAVE_XXX declarations - Done
o add a sctp_pr_policy in hole in sndrcvinfo structure - Done
o add a PR_SCTP_POLICY_VALID type flag - yet to-do in a future patch!
- Optimize sctp6 calls to reuse code in sctp_usrreq. Also optimize
when we close sending out the data and disabling Nagle.
- Change key concatenation order to match the auth RFC
- When sending OOTB shutdown_complete always do csum.
- Don't send PKT-DROP to a PKT-DROP
- For abort chunks, always do the checksum, same as for
shutdown-complete.
- inpcb_free front state had a bug where in queue
data could wedge an assoc. We need to just abandon
ones in front states (free_assoc).
- If a peer sends us a 64k abort, we would try to
assemble a response packet which may be larger than
64k. This then would be dropped by IP. Instead make
a "minimum" size for us 64k-2k (we want at least
2k for our initack). If we receive such an init
discard it early without all the processing.
- When we peel off we must increment the tcb ref count
to keep it from being freed from underneath us.
- handling fwd-tsn had bugs that caused memory overwrites
when given faulty data, fixed so can't happen and we
also stop at the first bad stream no.
- Fixed so comm-up generates the adaption indication.
- peeloff did not get the hmac params copied.
- fix it so we lock the addr list when doing src-addr selection
(in future we need to use a multi-reader/one writer lock here)
- During lowlevel output, we could end up with a _l_addr set
to null if the iterator is calling the output routine. This
means we would possibly crash when we gather the MTU info.
Fix so we only do the gather where we have a src address
cached.
- we need to be sure to set abort flag on conn state when
we receive an abort.
- peeloff could leak a socket. Moved code so the close will
find the socket if the peeloff fails (uipc_syscalls.c)

Approved by: re@freebsd.org(Ken Smith)


# c4739e2f 23-Aug-2007 Randall Stewart <rrs@FreeBSD.org>

- Fix address add handling to clear cached routes and source addresses
when peer acks the add in case the routing table changes.
- Fix sctp_lower_sosend to send shutdown chunk for mbuf send
case when sndlen = 0 and sinfoflag = SCTP_EOF
- Fix sctp_lower_sosend for SCTP_ABORT mbuf send case with null data,
So that it does not send the "null" data mbuf out and cause
it to get freed twice.
- Fix so auto-asconf sysctl actually affects the socket's asconf state.
- Do not allow SCTP_AUTO_ASCONF option to be used on subset bound sockets.
- Memset bug in sctp_output.c (arguments were reversed) submitted
found and reported by Dave Jones (davej@codemonkey.org.uk).
- PD-API point needs to be invoked >= not just > to conform to socket api
draft this fixes sctp_indata.c in the two places need to be >=.
- move M_NOTIFICATION to use M_PROTO5.
- PEER_ADDR_PARAMS did not fail properly if you specify an address
that is not in the association with a valid assoc_id. This meant
you got or set the stcb level values instead of the destination
you thought you were going to get/set. Now validate if the
stcb is non-null and the net is NULL that the sa_family is
set and the address is unspecified otherwise return an error.
- The thread based iterator could crash if associations were freed
at the exact time it was running. rework the worker thread to
use the increment/decrement to prevent this and no longer use
the markers that the timer based iterator uses.
- Fix the memleak in sctp_add_addr_to_vrf() for the case when it is
detected that ifa is already pointing to a ifn.
- Fix it so that if someone is so insane that they drop the
send window below the minimal add mark, they still can send.
- Changed all state for associations to use mask safe macro.
- During front states in association freeing in sctp_inpcbfree, we
had a locking problem where locks were not in place where they
should have been.
- Free association calls were not testing the return value in
sctp_inpcb_free() properly... others should be cast void returns
where we don't care about the return value.
- If a reference count is held on an assoc, even from the "force free"
we should not do the actual free.. but instead let the timer
free it.
- When we enter sctp_input(), if the SCTP_ASOC_ABOUT_TO_BE_FREED
flag is set, we must NOT process the packet but handle it like
ootb. This is because while freeing an assoc we release the
locks to get all the higher order locks so we can purge all
the hash tables. This leaves a hole if a packet comes in
just at that point. Now sctp_common_input_processing() will
call the ootb code in such a case.
- Change MBUF M_NOTIFICATION to use M_PROTO5 (per Sam L). This makes
it so we don't have a conflict (I think this is a coverity change).
We made this change AFTER some conversation and looking to make sure
that M_PROTO5 does not have a problem between SCTP and the 802.11
stuff (which is the only other place its used).
- Fixed lock order reversal and missing atomic protection around
locked_tcb during association lookup and the 1-2-1 model.
- Added debug to source address selection.
- V6 output must always do checksum even for loopback.
- Remove more locks around inp that are not needed for an atomically
added/subtracted ref count.
- slight optimization in the way we zero the array in sctp_sack_check()
- It was possible to respond to a ABORT() with bad checksum with
a PKT-DROP. This led to a PKT-DROP/ABORT war. Add code to NOT
send a PKT-DROP to any ABORT().
- Add an option for local logging (useful for macintosh or when
you need better performing during debugging). Note no commands
are here to get the log info, you must just use kgdb.
- The timer code needs to be aware of if it needs to call
sctp_sack_check() to slide the maps and adjust the cum-ack.
This is because it may be out of sync cum-ack wise.
- Added threshold management logging.
- If the user picked just the right size, that just filled the send
window minus one mtu, we would enter a forever loop not copying and
at the same time not blocking. Change from < to <= solves this.
- Sysctl added to control the fragment interleave level which defaults
to 1.
- My rwnd control was not being used to control the rwnd properly (we
did not add and subtract to it :-() this is now fixed so we handle
small messages (1 byte etc) better to bring our rwnd down more
slowly.

Approved by: re@freebsd.org (Bruce Mah)


# 2dad8a55 15-Aug-2007 Randall Stewart <rrs@FreeBSD.org>

- Remove extra comment for 7.0 (no GIANT here).
- Remove unneeded WLOCK/UNLOCK of inp for getting TCB lock.
- Fix panic that may occur when freeing an assoc that has partial
delivery in progress (may dereference null socket pointer when
queuing partial delivery aborted notification)
- Some spacing and comment fixes.
- Fix address add handling to clear cached routes and source addresses
when peer acks the add in case the routing table changes.
Approved by: re@freebsd.org (Bruce Mah)


# 1b649582 24-Jul-2007 Randall Stewart <rrs@FreeBSD.org>

- take out a needless panic under invariants for sctp_output.c
- Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than
SCTP_SMALL_IOVEC_SIZE
- re-add back inpcb_bind local address check bypass capability
- Fix it so sctp_opt_info is independent of assoc_id position.
- Fix cookie life set to use MSEC_TO_TICKS() macro.
- asconf changes
o More comment changes/clarifications related to the old local address
"not" list which is now an explicit restricted list.

o Rename some functions for clarity:
- sctp_add/del_local_addr_assoc to xxx_local_addr_restricted()
- asconf related iterator functions to sctp_asconf_iterator_xxx()

o Fix bug when the same address is deleted and added (and removed from
the asconf queue) where the ifa is "freed" twice refcount wise,
possibly freeing it completely.

o Fix bug in output where the first ASCONF would not go out after the
last address is changed (e.g. only goes out when retransmitted).

o Fix bug where multiple ASCONFs can be bundled in the same packet
with the same serial numbers.

o Fix asconf stcb iterator to not send ASCONF until after all work
queue entries have been processed.

o Change behavior so that when the last address is deleted (auto asconf
on a bound all endpoint) no action is taken until an address is
added; at that time, an ASCONF add+delete is sent (if the assoc
is still up).

o Fix local address counting so that address scoping is taken into
account.

o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending
of ASCONF (after an RTO). The default now is to send
ASCONF immediately (except for the case of changing/deleting the
last usable address).
Approved by: re(ken smith)@freebsd.org


# 52be287e 21-Jul-2007 Randall Stewart <rrs@FreeBSD.org>

- remove duplicate code from sctp_asconf.c
- remove duplicate #include <sys/priv.h> that is not under
#ifdef FreeBSD version to allow compile on 6.1
- static analysis changes per the cisco SA tool including:
o some SA_IGNORE comments
o some checks for NULL before unlock.
o type corrections int -> size_t
- Fix it so sctp_alloc_asoc takes a thread/proc argument. Without this
we pass a NULL in to bind on implicit assoc setup and crash :-(
Approved by: re@freebsd.org(Ken Smith)


# 18e198d3 17-Jul-2007 Randall Stewart <rrs@FreeBSD.org>

- added pre-checks to the bindx call.
- use proper tick gathering macro instead of ticks directly.
- Placed reasonable boundaries on sets that a user can do
that are converted to ticks from ms.
- Fix CMT_PF to always check to be sure CMT is on.
- Fix ticks use of CMT_PF.
- put back code to allow asconfs to be queued while INITs are in flight
and before the assoc is established.
- During window probes, an ack'd packet might be left with the window
probe mark on it causing it to be retransmitted. Change so that
the flight decrease macro clears the window_probe mark.
- Additional logging flight size/reading and ASOC LOG. This
is only enabled if you manually insert things into opt_sctp.h
since its a set of debug code only.
- Found an interesting SMP race in the way data was appended which
could cause a reader to lose a part of a message, had to
reorder when we marked the message was complete to after
the data was appended.
- bug in ADD-IP for the subset bound socket case when the peer has only
one address
- fix ASCONF implicit success/error handling case
- proper support of jails in Freebsd 6>
- copy out the timeval for the 64 bit sparc world on cookie-echo
(alignment error crashes without this).
Approved by: re(Ken Smith)


# b54d3a6c 14-Jul-2007 Randall Stewart <rrs@FreeBSD.org>

- Modular congestion control, with RFC2581 being the default.
- CMT_PF states added (w/sysctl to turn the PF version on)
- sctp_input.c had a missing incr of cookie case when the
auth was bad. This meant a free was called without an
increment to refcnt, added increment like rest of code.
- There was a case, unlikely, when the scope of the destination
changed (this is a TSNH case). In that case, it would not free
the alloc'ed asoc (in sctp_input.c).
- When listed addresses found a colliding cookie/Init, then
the collided upon tcb was not unlocked in sctp_pcb.c
- Add error checking on arguments of sctp_sendx(3) to prevent it from
referencing a NULL pointer.
- Fix an error return of sctp_sendx(3), it was returning
ENOMEM not -1.
- Get assoc id was changed to use the sanctified socket api
method for getting a assoc id (PEER_ADDR_INFO instead of
PEER_ADDR_PARAMS).
- Fix it so a peeled off socket will get a proper error return
if it tries to send to a different address than it is connected to.
- Fix so that select_a_stream can avoid an endless loop that
could hang a caller.
- time_entered (state set time) was not being set in all cases
to the time we went established.
Approved by: re(ken smith)


# b2630c29 02-Jul-2007 George V. Neville-Neil <gnn@FreeBSD.org>

Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC
option is now deprecated, as well as the KAME IPsec code.
What was FAST_IPSEC is now IPSEC.

Approved by: re
Sponsored by: Secure Computing


# 5bead436 02-Jul-2007 Randall Stewart <rrs@FreeBSD.org>

- Consolidate the code that free's chunks to actually also
call the sctp_free_remote_address() function.
- Assure that when we allocate a chunk the whoTo is NULL,
also when we free it and place it into the cache we NULL
it (that way the consolidation code will always work).
- Fix a small race, when a empty data holder is left on the stream
out queue, and both sides do a shutdown, the empty data holder
would prevent us from sending a SHUTDOWN-ACK and at the same time we
never would cleanup the empty holder (since nothing was ever in queue).
We now add a utility function that a) cleans up empty holders and
b) properly determines if there are still pending data chunks on
the stream out wheel.
Approved by: re@freebsd.org (Ken Smith)


# 2cb64cb2 01-Jul-2007 George V. Neville-Neil <gnn@FreeBSD.org>

Commit IPv6 support for FAST_IPSEC to the tree.
This commit includes only the kernel files, the rest of the files
will follow in a second commit.

Reviewed by: bz
Approved by: re
Supported by: Secure Computing


# eacc51c5 18-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- Fixes cstatic issues found by cisco sa tool (missing frees and such
on error legs)
- align sctp_sockstore to 64 bit boundary ..


# 75298de2 17-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

Back out last change to inpcb_free. Turns out we need
to hold off freeing if there is data pending ... someone
might do send/close. Which means we want the data to
go and then close it after startup. Added comments to
the code as well to note that this is done for a reason.


# e42a0f5e 16-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- For sctp_input/sctp6_input add announcement when a packet arrives (debug)
- re-factor the packet drop in sctp_output a bit more, we don't need the
trim after all, but the size calc is now corrected.
- When a assoc is in the COOKIE-ECHO/COOKIE-WAIT state and the user
closes, it should not matter if data is queued, the assoc should be
purged.
- In error leg a missing free_chunk when iph comes in NULL (should not
happen but just in case).


# e1461651 15-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- Update the comment lines in sctp_input.c
- We need to init the INP_LOCK since otherwise for
non-SMP kernels you crash when you set the TOS.


# 22a67197 14-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- Add VRF id to sctp_ifa structure, needed mainly in panda but useful
during deletes of ifa's in diff VRF's when applicable.


# 80fefe0a 14-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- Fix so ifn's are properly deleted when the ref count goes to 0.
- Fix so VRF's will clean themselves up when no references are around.
- Allow sctp_ifa to be passed into inpcb_bind, addr_mgmt_ep_sa to bypass
normal validation checks.
- turn auto-asconf off for subset bound sockets
- Moves all logging to use KTR. This gets rid of most
of the logging #ifdef's with a few exceptions reducing
the number of config options for SCTP.


# 9a972525 12-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- Fixed cookie handling to calc an RTO when
it's an INIT collision case.
- Fixed RTO calc to maintain a separate variable to track
if an RTO calc has been done, this allows the RTO var to be
doubled during initial timeouts.
- Reduces the amount of stack used by process control.
- Use a constant for the peer chunk overhead.
- Name change to spell candidate correctly.


# 71498f30 12-Jun-2007 Bruce M Simpson <bms@FreeBSD.org>

Import rewrite of IPv4 socket multicast layer to support source-specific
and protocol-independent host mode multicast. The code is written to
accommodate IPv6, IGMPv3 and MLDv2 with only a little additional work.

This change only pertains to FreeBSD's use as a multicast end-station and
does not concern multicast routing; for an IGMPv3/MLDv2 router
implementation, consider the XORP project.

The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6,
which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html

Summary
* IPv4 multicast socket processing is now moved out of ip_output.c
into a new module, in_mcast.c.
* The in_mcast.c module implements the IPv4 legacy any-source API in
terms of the protocol-independent source-specific API.
* Source filters are lazy allocated as the common case does not use them.
They are part of per inpcb state and are covered by the inpcb lock.
* struct ip_mreqn is now supported to allow applications to specify
multicast joins by interface index in the legacy IPv4 any-source API.
* In UDP, an incoming multicast datagram only requires that the source
port matches the 4-tuple if the socket was already bound by source port.
An unbound socket SHOULD be able to receive multicasts sent from an
ephemeral source port.
* The UDP socket multicast filter mode defaults to exclusive, that is,
sources present in the per-socket list will be blocked from delivery.
* The RFC 3678 userland functions have been added to libc: setsourcefilter,
getsourcefilter, setipv4sourcefilter, getipv4sourcefilter.
* Definitions for IGMPv3 are merged but not yet used.
* struct sockaddr_storage is now referenced from <netinet/in.h>. It
is therefore defined there if not already declared in the same way
as for the C99 types.
* The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF
which are then interpreted as interface indexes) is now deprecated.
* A patch for the Rhyolite.com routed in the FreeBSD base system
is available in the -net archives. This only affects individuals
running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces.
* Make IPv6 detach path similar to IPv4's in code flow; functionally same.
* Bump __FreeBSD_version to 700048; see UPDATING.

This work was financially supported by another FreeBSD committer.

Obtained from: p4://bms_netdev
Submitted by: Wilbert de Graaf (original work)
Reviewed by: rwatson (locking), silence from fenner,
net@ (but with encouragement)


# 32f9753c 11-Jun-2007 Robert Watson <rwatson@FreeBSD.org>

Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths. Do, however, move those prototypes to priv.h.

Reviewed by: csjp
Obtained from: TrustedBSD Project


# 108df27c 08-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- RTO was not being initialized to 0, thus the rtt calculation
algorithm would not go through the proper initialization.
- The initialization was incorrect as well, causing problems in
sat networks with > 1sec RTT
- Get rid of magic numbers in RTT calculations.
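
A minimal userland sketch of the RTT/RTO bookkeeping the entry above refers to, not the code from this commit. It follows the RTO rules of RFC 4960 section 6.3.1 (alpha = 1/8, beta = 1/4, RTO = SRTT + 4 * RTTVAR) with named constants in place of magic numbers. The struct, the `measured` flag, and the function name are assumptions made for illustration, and the RTO.Min/RTO.Max clamping is left out.

```c
#include <stdint.h>

#define RTO_ALPHA_SHIFT	3	/* alpha = 1/8 (RFC 4960 value) */
#define RTO_BETA_SHIFT	2	/* beta = 1/4 (RFC 4960 value) */
#define RTO_K		4	/* RTO = SRTT + 4 * RTTVAR */

/* Hypothetical per-destination state; all times are in milliseconds. */
struct rto_state {
	uint32_t srtt;
	uint32_t rttvar;
	int measured;		/* 0 until the first RTT sample arrives */
};

/* Feed one RTT sample and return the resulting (unclamped) RTO. */
static uint32_t
rto_update(struct rto_state *s, uint32_t rtt_ms)
{
	if (!s->measured) {
		/* First measurement: SRTT = R, RTTVAR = R / 2. */
		s->srtt = rtt_ms;
		s->rttvar = rtt_ms / 2;
		s->measured = 1;
	} else {
		uint32_t diff = s->srtt > rtt_ms ?
		    s->srtt - rtt_ms : rtt_ms - s->srtt;

		/* RTTVAR is updated first, using the old SRTT. */
		s->rttvar = s->rttvar - (s->rttvar >> RTO_BETA_SHIFT) +
		    (diff >> RTO_BETA_SHIFT);
		s->srtt = s->srtt - (s->srtt >> RTO_ALPHA_SHIFT) +
		    (rtt_ms >> RTO_ALPHA_SHIFT);
	}
	return (s->srtt + RTO_K * s->rttvar);
}
```

Starting from a zeroed state matters here: the first sample takes the initialization branch only if the "measured" indicator really is 0, which is the kind of problem the entry describes.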


# f4c93d24 02-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- fix initial pcb vrf setting when the initial vrf is not the
default_vrf_id
- Missing lock/unlock of inp added as well in the v6 side.
- IFN hash table moves to sctppcbinfo since indexes are
unique across systems (including different VRFs) this makes it easier
to do ifn lookups.


# ad21a364 01-Jun-2007 Randall Stewart <rrs@FreeBSD.org>

- Take out the broken table-id concept. Panda Routers have a M-VRF
concept that is NOT well thought out for a multi-homed transport
protocol. So the useless table-id entries passed around need to
be removed.
- Add a event timer for the zero copy api.
- Fix a bug in sctp_timer.c when searching for an alternate
with the largest ssthresh (the compare was wrong).


# 0696e120 30-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Fix a memory overwrite when the mapping array
is expanded, size of expansion was not taken into consideration.
- Fix so vtag hash is 1 bigger so that it modulo's out
correctly, avoids a panic when restart with right modulo happens.
- do not dereference stcb when control->do_not_ref_stcb is set
- Fix up packet logging to not often use a lock and also to
add to options.
- Fix some logging option duplication in the sctputil.h


# 207304d4 29-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Fixes so we won't try to start a timer when we
hold a wq lock for the iterator. Panda uses a
silly recursive lock they hold through the timer.
- Add poor man's wireshark compile option.
- Allocate and start using SCTP_M_XXX for all SCTP_MALLOC() calls.
- sysctl now will get back the refcnt for viewing by onlookers.

Reviewed by: gnn


# d61a0ae0 28-May-2007 Randall Stewart <rrs@FreeBSD.org>

- fixed autoclose to not allow setting on 1-2-1 model.
- bounded cookie-life to 1 second minimum in socket option set.
- Delayed_ack_time becomes delayed_ack per new socket api document.
- Improve port number selection, we now use low/high bounds and
no chance of an endless loop. Only one call to random per bind
as well.
- fixes so set_peer_primary pre-screens addresses to be
valid to this host.
- maxseg did not allow setting on an assoc basis. We needed
to thus track and use an association value instead of a inp value.
- Fixed ep get of HB status to report back properly.
- use settings flag to tell if assoc level hb is on or off, not
the timer.. since the timer may still run if unconf address
are present.
- check for crazy ENABLE/DISABLE conditions.
- set and get of pmtud (fixed path mtu) not always taking into account ovh.
- Getting PMTU info on stcb only needs to return PMTUD_ENABLED if
any net is doing PMTU discovery.
- Panic or warning fixed to not do so when a valid ip frag is
taking place.
- sndrcvinfo appearing in both inp and stcb was full size, instead
of the non-pad version. This saves about 92 bytes from each struct
by carefully converting to use the smaller version.
- one-2-one model get(maxseg) would always get ep value, never the
tcb's value.
- The delayed ack time could be under a tick, this fixes so
it bounds it to at least 1 tick for platforms whose tick
is more than a ms.
- Fragment interleave level set to wrong default value.
- Fragment interleave could not set level 0.
- Deferred stream reset was broken due to a guard check and ntohl issue.
- Found two lock order reversals and fixed.
- Tighten up address checking, if the user gives an address the sa_len
had better be set properly.
- Get asoc by assoc-id would return a locked tcb when it was asked
not to if the tcb was in the restart hash.
- sysctl to dig down and get more association details

Reviewed by: gnn
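
The port selection item in the entry above (low/high bounds, one random call per bind, no endless loop) can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed code: `pick_port` and the `used` array are hypothetical, with the array standing in for whatever lookup tells the stack a port is already bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick a free port in [low, high] from a single random value.
 * used[i] is true when port (low + i) is already taken.
 * Returns 0 when every port in the range is in use.
 */
static uint16_t
pick_port(uint16_t low, uint16_t high, uint32_t rnd, const bool *used)
{
	uint32_t count, i, candidate;

	assert(low != 0 && low <= high);
	count = (uint32_t)high - low + 1;
	candidate = low + (rnd % count);	/* the only use of randomness */
	for (i = 0; i < count; i++) {		/* one pass at most */
		if (!used[candidate - low])
			return ((uint16_t)candidate);
		if (++candidate > high)
			candidate = low;	/* wrap to the low bound */
	}
	return (0);
}
```

Because the loop visits each port in the range at most once, it terminates even when the range is exhausted, and the caller can turn the 0 return into an error.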


# 3c503c28 16-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Fixed 1-2-1 model to not worry about associd in sockopts
- Fixed RTOinfo for bounding.
- Fixed connect() to return ECONNREFUSED when an ABORT is received.
- Added comments to direct Static Analysis not to look at some things
it does not understand (comments are /* sa_ignore XXXXX */)
- Bind when colliding was broken, missing not_found = 1 before
checking to see if the port was in use caused endless bind loop.
- Cookie life needs to be in milliseconds to conform to socket api.
- Cookie life is not supposed to change if it's 0. On the assoc
level set we changed it to 0, oops.
- Two more static analysis issues identified by the cisco
tool. Null checks needed.
- An issue for sendfile(). Need to validate the correct
input argument.
- When sending failed due to a no route to host, we leaked
the mbuf chain failing to call m_freem().
- Fix #ifdef issue for getting hash block len when HAVE_SHA2 is NOT defined
Reviewed by: gnn


# ad81507e 09-May-2007 Randall Stewart <rrs@FreeBSD.org>

Two major items here:
- All printf that was surrounded by #ifdef SCTP_DEBUG moves to
a macro that does all of this. This removes all printfs from
the code and makes the code more portable and easier to
read.
- Static Analysis (cisco) - found a few bugs, but mostly we
add checks for NULL pointers and such to make the tool
happy. We now pass the Cisco SA tools checks except for
where it does not understand tailq/lists. We still need
to look at the coverity tools output too (this is like
the cisco SA tool) and see if it wants us to fix any other
items. Hopefully this will be the last major churn in the
code other than bug fixes.


# b1006367 08-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Copyright change, cisco's silly tool wants it to say:
"Copyright (c) 2001-2007, by Cisco Systems,"
instead of
"Copyright (c) 2001-2007, Cisco Systems,"

- Also fix a few stragglers that were still in 2006.


# b0552ae2 08-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Get rid of the sctp_inpcb_free() "magic numbers", now they
are sensible defines that tell what you are directing
the function to do.


# 6e55db54 08-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Static analysis fixes for cisco's commit (this is equivalent
to the coverity tool.. may even be the same one.. not sure).
- A bug in the way sctp_abort() and friends were
setting the IP_CLOSE flag.. and NOT passing the
last argument as a (,1)... so that things would
get freed..


# 17205ecc 07-May-2007 Randall Stewart <rrs@FreeBSD.org>

- More macros for OS compatibility
- PR-SCTP would ignore FWD-TSN's above a rwnd's worth
of TSN's (1 byte msgs).. this left the peer hopelessly
out of sync.. or an attacker. So now we abort the assoc.
- New IFN hash, also rename hashes to match addr/ifn now
that the vrf has multiple.
- Do not enable SCTP_PCB_FLAGS_RECVDATAIOEVNT per default
as defined in the Socket API ID.
- Export MTU information via sysctl.
- Vrf's need table id's. This is default for
BSD, but may be other things later when BSD
fully supports VRFs.
- Additional stream reset bug (caught by cisco dev-test).
- Additional validations for the address in sending a message (socket api).
-------- and -----
- Fix association notifications not to give the active open
side false notifications.
- Fix so sendfile and SENDALL will work properly (missing
flag to say socket sender is done).
- Fix Bug that prevented COOKIES from being retransmitted.
- Break out connectx into helper sub-models so that iox routines can
reuse the helpers.
- When an address is added during system init (non-dynamic mode) make
sure that the "defer use" flag is not set.
** its compiling on XR now :-D **

Reviewed by: gnn


# 1bb552e8 04-May-2007 Randall Stewart <rrs@FreeBSD.org>

Fixes a missing unlock in the one-2-one hash table, if
it was full and a collision occurred, then we would leave
an inp locked. Also fixes a missing inp unlock if IPSEC was
on and it failed during the attach. Bug found by Weongyo Jeong.


# d06c82f1 01-May-2007 Randall Stewart <rrs@FreeBSD.org>

- Somehow the disable fragment option got lost. We could
set/clear it but would not do it. Now we will.
- Moved to latest socket api for extended sndrcv info struct.
- Moved to support all new levels of fragment interleave (0-2).
- Codenomicon security test updates - length checks and such.
- Bug in stream reset (2 actually).
- setpeerprimary could unlock a null pointer, fixed.
- Added a flag in the pcb so netstat can see if we are listening easier.

Obtained from: (some of the Listen changes from Weongyo Jeong)


# 9a6142d8 22-Apr-2007 Randall Stewart <rrs@FreeBSD.org>

- Somehow the disable fragment option got lost. We could
set/clear it but would not do it. Now we will.
- Moved to latest socket api for extended sndrcv info struct.
- Moved to support all new levels of fragment interleave.


# f1f73e57 19-Apr-2007 Randall Stewart <rrs@FreeBSD.org>

- More work on reducing send lock contention.
- Removed free-oqueue cache.
- Fix counter for sq entries
- Increased the amount of information retained
on ASOC_TSN logging on the association.
- Made it so with the ASOC_TSN logging on
sending or receiving an abort we dump the log.
- Went through and added invariant's around some
panic's that needed them.
- decrements went to atomic_subtract_int instead of add -1
- Removed residual count increment that threw off a
strm oq count.
- Track and complain if we don't have a LAST fragment and
clean up the sp structure.
- Track a new stat that counts number of abandoned msgs that
happen if you close without reading.
- Fix lookup of frag point to be aware of a 0 assoc-id.
Reviewed by: gnn
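
A rough userland illustration of the decrement item in the entry above. FreeBSD's atomic(9) provides atomic_subtract_int(); the sketch below uses C11 `atomic_fetch_sub` as a stand-in so it builds outside the kernel. The helper name and the counter are hypothetical.

```c
#include <stdatomic.h>

/*
 * Decrement a shared unsigned counter and return the new value.
 * An explicit subtract states the intent directly, rather than
 * adding -1 and relying on unsigned wraparound of the operand.
 */
static unsigned int
queue_count_dec(atomic_uint *cnt)
{
	/* atomic_fetch_sub returns the value held before the subtract. */
	return (atomic_fetch_sub(cnt, 1u) - 1u);
}
```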


# c105859e 14-Apr-2007 Randall Stewart <rrs@FreeBSD.org>

- fix source address selection when picking an acceptable address
- name change of prefered -> preferred
- CMT fast recover code added.
- Comment fixes in CMT.
- We were not giving a reason of cant_start_asoc per socket api
if we failed to get init/or/cookie to bring up an assoc. Change
so we don't just give a generic "comm lost" but look at actual
states of dying assoc.
- change "crc32" arguments to "crc32c" to silence strict/noisy
compiler warnings when crc32() is also declared
- A few minor tweaks to get the portable stuff truly portable
for sctp6_usrreq.c :-D
- one-2-one style vrf match problem.
- window recovery would leave chks marked for retran
during window probes on the sent queue. This would then
cause an out-of-order problem and assure that the flight
size "problem" would occur.
- Solves a flight size logging issue that caused rwnd
overruns, flight size off as well as false retransmissions.
- Macroize the up and down of flight size.
- Fix a ECNE bug in its counting.
- The strict_sacks options was causing aborts when window probing
was active, fix to make strict sacks a bit smarter about what
the next unsent TSN is.
- Fixes a one-2-one wakeup bug found by Martin Kulas.
- If-defed out form, Andre's copy routines pending his
commit of at least m_last().. need to adjust for 6.2 as
well.. since m_last won't exist.
Reviewed by: gnn


# bff64a4d 03-Apr-2007 Randall Stewart <rrs@FreeBSD.org>

- fixed several places where we did not release INP locks.
- fixed a refcount bug in the new ifa structures.
- use vrf's from default stcb or inp whenever possible.
- Address limits raised to account for a full IP fragmented
packet (1000 addresses).
- flight size correcting updated to include one message only
and to handle case where the peer does not cumack the
next segment aka lists 1/1 in sack blocks..
- Various bad init/init-ack handling could cause a panic
since we tried to unlock the destroyed mutex. Fixes
so we properly exit when we need to destroy an assoc.
(Found by Cisco DevTest team :D)
- name rename in src-addr-selection from pass to sifa.
- route structure typedef'd to allow different platforms
and updated into sctp_os_bsd file.
- Added a maximum number of retransmissions for a chunk.
Reviewed by: gnn


# 5e54f665 31-Mar-2007 Randall Stewart <rrs@FreeBSD.org>

- Found bug in min split point bundling which caused
incorrect, non-bundlable fragmentation.
- Added min residual to better control split points for
both how big a msg must be as well as how much needs
to be left over.
- With our new algo in place, we need to implicitly
set "end of msg" on the sp-> structure otherwise we
end up with "hung" associations.
- Room reserved up front in IP header by pushing IP
header to back of mbuf.
- Fix so FR's peg count of retransmissions needed.
- Fix so an unlucky chunk that never gets across
will kill the assoc via the kill timer and send an
abort too.
- Fix bug in sctp_input which can result in a crash.
- Do not strip off IP options anymore.
- Clean up sctp_calculate_rto().
- Get rid of unused sysctl.
- Fixed so we discard all M-Cast
- Fixed so port check done AFTER checksum
- Fixed bug in fragmentation code that prevented
us from fragmenting a small complete message when
we needed to.
- Window probes were not marked back to unsent and
flight adjusted when a sack came in with no
window change or accepting of the probe data.
We now fix this with having a mark on the net and
the chunk so we can clear it out when the sack arrives
forcing it to retran just like it was "new" this
improves the handling of window probes, which were
dropped by the receiver.
- Tighten AUTH protocol error checks during INIT/INIT-ACK exchange


# 62c1ff9c 20-Mar-2007 Randall Stewart <rrs@FreeBSD.org>

- window update sacks sent incorrectly after
shutdown which caused extra abort from peer.
- RTT time calculation was not being done in
express sack handling since it referred to an unused
variable (rto_pending). Removed variable.
- socket buffer high water access macro-ized.


# 6a27c376 19-Mar-2007 Randall Stewart <rrs@FreeBSD.org>

Adds a hash table to speed local address lookup
on a per VRF basis (BSD has only one VRF currently).
Hash table is sized to 16 but may need to be adjusted
for machines with large numbers of addresses.
Reviewed by: gnn


# 132dea7d 19-Mar-2007 Randall Stewart <rrs@FreeBSD.org>

- errno -> becomes error in sctp_output.c and sctputil.c
- SB_CLEAR macro defined and used for sb clearing.
- Fix for CMT express_sack_handling did not do proper
pseudo-cumack updates.
- Get rid of extraneous function that was never used ip_2_ip6_hdr()
- Fixed source address selection bug (initialization problem).
- Source address selection debug added.


# 42551e99 15-Mar-2007 Randall Stewart <rrs@FreeBSD.org>

- Sysctl's move to separate file
- moved away from ifn/ifa access to sctp_ifa/sctp_ifn
built and managed by the add-ip code.
- cleaned up add-ip code to use the iterator
- made iterator be a thread, which enables auto-asconf now.
- rewrote and cleaned up source address selection (also
made it use new structures).
- Fixed a couple of memory leaks.
- DACK now settable as to how many packets to delay as
well as time.
- connectx() to latest socket API, new associd arg.
- Fixed issue with revoking and losing potential to
send when we inflate the flight size. We now inflate
the cwnd too and deflate it later when the revoked
chunk is sent or acked.
- Got rid of some temp debug code
- src addr selection moved to a common file (sctp_output.c)
- Support for simple VRF's (we have support for multi-vrf
via compile switch that is scrubbed from BSD but we won't
need multi-vrf until we first get VRF :-D)
- Rest of mib work for address information now done
- Limit number of addresses in INIT/INIT-ACK to
a #def (30).

Reviewed by: gnn


# f42a358a 12-Feb-2007 Randall Stewart <rrs@FreeBSD.org>

- Copyright updates (aka 2007)
- ZONE get now also take a type cast so it does the
cast like mtod does.
- New macro SCTP_LIST_EMPTY, which in bsd is just
LIST_EMPTY
- Removal of const in some of the static hmac functions
(not needed)
- Store length changes to allow for new fields in auth
- Auth code updated to current draft (this should be the
RFC version we think).
- use uint8_t instead of u_char in LOOPBACK address comparison
- Some u_int32_t converted to uint32_t (in crc code)
- A bug was found in the mib counts for ordered/unordered
count, this was fixed (was referencing a freed mbuf).
- SCTP_ASOCLOG_OF_TSNS added (code will probably disappear
after my testing completes. It allows us to keep a
small log on each assoc of the last 40 TSN's in/out and
stream assignment. It is NOT in options and so is only
good for private builds).
- Some CMT changes in prep for Jana fixing his problem
with reneging when CMT is enabled (Concurrent Multipath
Transfer = CMT).
- Some missing mib stats added.
- Correction to number of open assoc's count in mib
- Correction to os_bsd.h to get right sha2 macros
- Add of special AUTH_04 flags so you can compile the code
with the old format (in case the peer does not yet support
the latest auth code).
- Nonce sum was incorrectly being set in when ecn_nonce was
NOT on.
- LOR in listen with implicit bind found and fixed.
- Moved away from using mbuf's for socket options to using
just data pointers. The mbufs were used to harmonize
NetBSD code since both Net and Open used this method. We
have decided to move away from that and more conform to
FreeBSD style (which makes more sense).
- Very very nasty bug found in some of my "debug" code. The
cookie_how collision case tracking had an endless loop in
it if you got a second retransmission of a cookie collision
case. This would lock up a CPU .. ugly..
- auth function goes to using size_t instead of int which
conforms to socketapi better
- Found the nasty bug that happens after 9 days of testing.. you
get the data chunk, deliver it and due to the reference to a ch->
that every now and then has been deleted (depending on the position
in the mbuf) you have an invalid ch->ch.flags.. and thus you don't
advance the stream sequence number.. so you block the stream
permanently. The fix is to make local variables of these guys
and set them up before you have any chance of trimming the
mbuf.
- style fix in sctp_util.h, not sure how this got bad maybe in
the last patch? (aka it may not be in the real source).
- Found interesting bug when using the extended snd/rcv info where
we would get an error on receiving with this. That's because
it was NOT padded to the same size as the snd_rcv info. We
increase (add the pad) so the two structs are the same size
in sctp_uio.h
- In sctp_usrreq.c one of the most common things we did for
socket options was to cast the pointer and validate the size.
This has been macro-ized to help make the code more readable.
- in sctputil.c two things, the socketapi class found a missing
flag type (the next msg is a notification) and a missing
scope recovery was also fixed.

Reviewed by: gnn
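
The "9 days of testing" bug in the entry above comes down to reading chunk header fields through a pointer after the buffer has been trimmed. A minimal userland sketch of the fix pattern (copy the fields into locals before any trimming) might look like this. The header layout, the flag value, and both helper names are assumptions for illustration; a plain byte buffer stands in for the mbuf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified chunk header. */
struct data_hdr {
	uint8_t type;
	uint8_t flags;
	uint16_t stream_seq;
};

#define FLAG_UNORDERED	0x04	/* assumed bit for unordered delivery */

/*
 * Stand-in for trimming: drop the header by sliding the payload to the
 * front, which overwrites the bytes a stale header pointer refers to.
 */
static size_t
trim_front(uint8_t *buf, size_t len, size_t drop)
{
	memmove(buf, buf + drop, len - drop);
	return (len - drop);
}

/* Deliver one chunk; return the next expected stream sequence number. */
static uint16_t
deliver_chunk(uint8_t *buf, size_t *len, uint16_t next_seq)
{
	struct data_hdr hdr;
	uint8_t flags;

	memcpy(&hdr, buf, sizeof(hdr));
	flags = hdr.flags;		/* snapshot before any trimming */
	*len = trim_front(buf, *len, sizeof(hdr));
	/* From here on only the local copy is consulted, never buf. */
	if ((flags & FLAG_UNORDERED) == 0)
		next_seq++;		/* ordered data advances the stream */
	return (next_seq);
}
```

If the flags were re-read from the buffer after the trim, they would reflect payload bytes rather than the header, so the sequence number might not advance and the stream would stall, which matches the symptom described.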


# 93164cf9 18-Jan-2007 Randall Stewart <rrs@FreeBSD.org>

- most all includes (#include <>) migrate to the sctp_os_bsd.h file
- Finally all splxx() are removed
- Count error fixed in mapping array which might
cause a wrong cumack generation.
- Invariants around panic for case D + printf when no invariants.
- one-to-one model race condition fixed by using
a pre-formed connection and then completing the
work so accept won't happen on a non-formed
association.
- Some additional paranoia checks in sctp_output.
- Locks that were missing in the accept code.

Approved by: gnn


# 44b7479b 15-Jan-2007 Randall Stewart <rrs@FreeBSD.org>

- Macroizes the V6ONLY flag check.
- Added a short time wait (not used yet) constant
- Corrected the type of the crc32c table (it was
unsigned long and really is a uint32_t)
- Got rid of the user of MHeaders until they
are truly needed by lower layers.
- Fixed an initialization problem in the readq structure
(ordering was off).
- Found yet another collision bug when the random number
generator returns two numbers on one side (during a collision)
that are the same. Also added some tracking of cookies
that will go away when we know that we have the last collision
bug gone.
- Fixed an init bug for book_size_scale, that was causing
Early FR code to run when it should not.
- Fixed a flight size tracking bug that was associated with
Early FR but due to above bug also affected all FR's
- Fixed it so Max Burst also will apply to Fast Retransmit.
- Fixed a bug in the temporary logging code that allowed a
static log array overflow
- hashinit_flags is now used.
- Two last mcopym's were converted to the macro sctp_m_copym that
has always been used by all other places
- macro sctp_m_copym was converted to upper case.
- We now validate sinfo_flags on input (we did not before).
- Fixed a bug that prevented a user from sending data and immediately
shutting down with one send operation.
- Moved to use hashdestroy instead of free() in our macros.
- Fixed an init problem in our timed_wait vtag where we
did not fully initialize our time-wait blocks.
- Timer stops were re-positioned.
- A pcb cleanup method was added, however this probably will
not be used in BSD.. unless we make module loadable protocols
- I think this fixes the mysterious timer bug.. it was a
lock ordering problem in the way we did timers. It
now conforms to the timeout(9) manual (except for the
_drain part, we had to do this a different way due
to locks).
- Fixed error return code so we get either ECONNREFUSED or ECONNRESET
depending on where one is in progression
- Purged an unused clone macro.
- Fixed a read error code issue where we were NOT getting the proper
error when the connection was reset.
Approved by: gnn


# 139bc87f 29-Dec-2006 Randall Stewart <rrs@FreeBSD.org>

a) macro-ization of all mbuf and random number
access plus timers. This makes the code
more portable and able to change out the
mbuf or timer system used more easily ;-)
b) removal of all use of pkt-hdr's until only
the places we need them (before ip_output routines).
c) remove a bunch of code not needed due to <b> aka
worrying about pkthdr's :-)
d) There was one last reorder problem, it looks like, where
if a restart occurs and we release and relock (at
the point where we setup our alias vtag) we would
end up possibly getting the wrong TSN in place. The
code that fixed the TSN's just needed to be shifted
around BEFORE the release of the lock.. also code that
set the state (since this also could contribute).
Approved by: gnn


# a5d547ad 14-Dec-2006 Randall Stewart <rrs@FreeBSD.org>

1) Fixes on a number of different collision case LOR's.
2) Fix all "magic numbers" to be constants.
3) A collision case that would generate two associations to
the same peer due to a missing lock is fixed.
4) Added tracking of where timers are stopped.
Approved by: gnn


# 03b0b021 07-Nov-2006 Randall Stewart <rrs@FreeBSD.org>

-Fixes first of all the getcred on IPv6 and V4. The
copy's were incorrect and so was the locking.
-A bug was also found that would create a race and
panic when an abort arrived on a socket being read
from.
-Also fix the reader to get MSG_TRUNC when a partial
delivery is aborted.
-Also addresses a couple of coverity caught error path
memory leaks and a couple of other valid complaints
Approved by: gnn


# b96fbb37 06-Nov-2006 Robert Watson <rwatson@FreeBSD.org>

Convert three new suser(9) calls introduced between when the priv(9)
patch was prepared and committed to priv(9) calls. Add XXX comments
as, in each case, the semantics appear to differ from the TCP/UDP
versions of the calls with respect to jail, and because cr_canseecred()
is not used to validate the query.

Obtained from: TrustedBSD Project


# f4ad963c 06-Nov-2006 Randall Stewart <rrs@FreeBSD.org>

This change tracks down the EEOR->NonEEOR mode failure
to wakeup on close of the sender. It basically moves
the return (when the asoc has a reader/writer) further
down and gets the wakeup and assoc appending (of the
PD-API event) moved up before the return. It also
moves the flag set right before the return so we can
assure only once adding the PD-API events.

Approved by: gnn


# 50cec919 05-Nov-2006 Randall Stewart <rrs@FreeBSD.org>

Tons of fixes to get all the 64bit issues removed.
This also moves two 16 bit int's to become 32 bit
values so we do not have to use atomic_add_16.
Most of the changes are %p, casts and other various
nasty's that were in the original code base. With this
commit my machine will now do a build universe.. however
I as yet have not tested on a 64bit machine .. it may not work :-(


# 50514179 03-Nov-2006 John Birrell <jb@FreeBSD.org>

Remove a bogus cast in an attempt to fix the tinderbox builds on
lots of arches.


# 562a89b5 03-Nov-2006 Randall Stewart <rrs@FreeBSD.org>

More 64 bit pointer fun.
%p changed in multiple prints
the mtod() was also fixed.


# f8829a4a 03-Nov-2006 Randall Stewart <rrs@FreeBSD.org>

Ok, here it is, we finally add SCTP to current. Note that this
work is not just mine, but it is also the works of Peter Lei
and Michael Tuexen. They both are my two key other developers
working on the project.. and they need attaboys too:
****
peterlei@cisco.com
tuexen@fh-muenster.de
****
I did do a make sysent which updated the
syscall's and sysproto.. I hope that is correct... without
it you don't build since we have new syscalls for SCTP :-0

So go out and look at the NOTES, add
option SCTP (make sure inet and inet6 are present too)
and play with SCTP.

I will see about committing some test tools I have after I
figure out where I should place them. I also have a
lib (libsctp.a) that adds some of the missing socketapi
functions that I need to put into lib's.. I will talk
to George about this :-)

There may still be some 64 bit issues in here, none of
us have a 64 bit processor to test with yet.. Michael
may have a MAC but that's another beast too..

If you have a mac and want to use SCTP contact Michael
he maintains a web site with a loadable module with
this code :-)

Reviewed by: gnn
Approved by: gnn