#
1.178 |
|
13-May-2024 |
jsg |
remove prototypes with no matching function ok mpi@
|
#
1.177 |
|
12-Apr-2024 |
bluhm |
Split single TCP inpcb table into IPv4 and IPv6 parts.
With two separate TCP hash tables, each one becomes smaller. When we remove the exclusive net lock from TCP, contention on internet PCB table mutex will be reduced. UDP has been split earlier into IPv4 and IPv6. Replace branch conditions based on INP_IPV6 with assertions.
OK mvs@
|
Revision tags: OPENBSD_7_5_BASE
|
#
1.176 |
|
13-Feb-2024 |
bluhm |
Merge struct route and struct route_in6.
Use a common struct route for both inet and inet6. Unfortunately struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has to be exposed from net/route.h. Struct route has to be bsd visible for userland as netstat kvm code inspects inp_route. Internet PCB and TCP SYN cache can use a plain struct route now. All specific sockaddr types for inet and inet6 are embedded there.
OK claudio@
|
#
1.175 |
|
27-Jan-2024 |
bluhm |
Declare address parameter in TCP SYN cache const.
tcp6_ctlinput() cast a constant sockaddr_in6 to non-const sockaddr. sa6_src may be &sa6_any which lives in read-only data section. Better pass down the const addresses to syn_cache_lookup(). They are needed for hash lookup and are not modified.
OK mvs@
|
#
1.174 |
|
11-Jan-2024 |
bluhm |
Fix white spaces in TCP.
|
#
1.173 |
|
29-Nov-2023 |
bluhm |
Run TCP syn cache timer without kernel lock.
As syn_cache_timer() uses syn cache mutex and exclusive net lock, it does not need kernel lock.
OK mvs@
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function in parts protected by mutex and sending with netlock. Split the flags field in dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
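The arithmetic in this entry can be checked with a short sketch. The helper names below (wrap_days_32, span_years_63, ts_val, ts_elapsed) are hypothetical and are not taken from the kernel source; they only illustrate the numbers quoted above and why truncating the 64 bit counter to its lower 32 bits still gives correct differences across a wrap.

```c
#include <stdint.h>

/* Days until a 32 bit millisecond counter wraps: 2^32 ms. */
static double
wrap_days_32(void)
{
	return 4294967296.0 / 1000.0 / 86400.0;
}

/* Years covered by 2^63 ms, assuming 365.25 days per year. */
static double
span_years_63(void)
{
	return 9223372036854775808.0 / 1000.0 / 86400.0 / 365.25;
}

/* Hypothetical helper: lower 32 bits of the 64 bit tick for the timestamp option. */
static uint32_t
ts_val(uint64_t now_ms)
{
	return (uint32_t)now_ms;
}

/* Elapsed ms between two 32 bit timestamps; unsigned subtraction handles wrap. */
static uint32_t
ts_elapsed(uint32_t newer, uint32_t older)
{
	return newer - older;
}
```

The first helper gives roughly 49.7 days and the second roughly 2.9*10^8 years, matching the figures in the commit message.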
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec need the software fallback in general. For performance comparison or to work around possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (e.g. Windows) or dropped by some firewalls/routers. In such cases the window size had been fixed at 16KB, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in the pru_abort() return value. We call it only from soabort() which is a dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove the existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest in the pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
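A minimal sketch of the issue behind this change, assuming a hypothetical base flag value and slot count (EX_FLAG_BASE and EX_SLOT_MAX are illustrative, not the real TF_TIMER or TCPT_DELACK values). Shifting a signed int constant so that a one bit reaches bit 31 is undefined behaviour in C, while the same shift on an unsigned constant is well defined.

```c
#include <stdint.h>

/* Hypothetical flag base; the U suffix makes the constant unsigned. */
#define EX_FLAG_BASE	0x10000000U
/* Hypothetical highest slot; base << 3 lands on bit 31. */
#define EX_SLOT_MAX	3

/*
 * Compute the per-slot flag bit.  With a signed base (0x10000000),
 * slot 3 would shift into the sign bit, which is undefined behaviour
 * and is what a sanitizer such as kubsan reports.
 */
static unsigned int
ex_flag(unsigned int slot)
{
	return EX_FLAG_BASE << slot;
}
```

Storing the result in an unsigned field, as the entry describes for u_int t_flags, keeps the type consistent from definition to use.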
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
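The role switch described in this entry can be sketched roughly as follows. All names here (ex_syn_set, ex_syn_cache, ex_insert, EX_USE_LIMIT) are hypothetical and the real kernel code differs; the sketch only models the two conditions named above, with the new seed passed in by the caller instead of coming from a random source.

```c
#include <stdint.h>

#define EX_USE_LIMIT	100000	/* uses of the active set before a switch */

struct ex_syn_set {
	int		count;		/* entries currently in this set */
	long		use_left;	/* insertions left before reseed */
	uint32_t	seed;		/* hash seed for this set */
};

struct ex_syn_cache {
	struct ex_syn_set	set[2];
	int			active;	/* index of the set taking new entries */
};

static void
ex_init(struct ex_syn_cache *c, uint32_t seed)
{
	c->set[0] = (struct ex_syn_set){ 0, EX_USE_LIMIT, seed };
	c->set[1] = (struct ex_syn_set){ 0, EX_USE_LIMIT, seed };
	c->active = 0;
}

/* Insert one entry; switch roles first if both conditions hold. */
static void
ex_insert(struct ex_syn_cache *c, uint32_t new_seed)
{
	struct ex_syn_set *act = &c->set[c->active];
	struct ex_syn_set *pas = &c->set[!c->active];

	if (pas->count == 0 && act->use_left <= 0) {
		/* Passive set is empty: reseed it and make it active. */
		pas->seed = new_seed;
		pas->use_left = EX_USE_LIMIT;
		c->active = !c->active;
		act = pas;
	}
	act->count++;
	act->use_left--;
}
```

Because new entries only ever go to the active set, an attacker who keeps the cache non-empty can no longer stop the passive set from draining, so a reseed eventually happens.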
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
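For readers unfamiliar with the idiom named in this entry, here is a small sketch. The macro EX_SWAP and the function ex_sort2 are hypothetical examples, not macros from the header. Wrapping a multi-statement macro in do { } while (0) makes it expand to exactly one statement, so it behaves correctly in an unbraced if/else; the CONSTCOND comment is a lint annotation that marks the constant condition as intentional.

```c
/* Hypothetical multi-statement macro using the idiom. */
#define EX_SWAP(a, b) do {						\
	int ex_tmp = (a);						\
	(a) = (b);							\
	(b) = ex_tmp;							\
} while (/* CONSTCOND */ 0)

/*
 * Order two values; returns 1 if a swap happened.  Without the
 * do/while wrapper, the semicolon after EX_SWAP() would end the
 * if statement early and the else would fail to compile.
 */
static int
ex_sort2(int *lo, int *hi)
{
	int swapped = 1;

	if (*lo > *hi)
		EX_SWAP(*lo, *hi);
	else
		swapped = 0;
	return swapped;
}
```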
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.177 |
|
12-Apr-2024 |
bluhm |
Split single TCP inpcb table into IPv4 and IPv6 parts.
With two separate TCP hash tables, each one becomes smaller. When we remove the exclusive net lock from TCP, contention on internet PCB table mutex will be reduced. UDP has been split earlier into IPv4 and IPv6. Replace branch conditions based on INP_IPV6 with assertions.
OK mvs@
|
Revision tags: OPENBSD_7_5_BASE
|
#
1.176 |
|
13-Feb-2024 |
bluhm |
Merge struct route and struct route_in6.
Use a common struct route for both inet and inet6. Unfortunately struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has to be exposed from net/route.h. Struct route has to be bsd visible for userland as netstat kvm code inspects inp_route. Internet PCB and TCP SYN cache can use a plain struct route now. All specific sockaddr types for inet and inet6 are embedded there.
OK claudio@
|
#
1.175 |
|
27-Jan-2024 |
bluhm |
Declare address parameter in TCP SYN cache const.
tcp6_ctlinput() cast a constant sockaddr_sin6 to non-const sockaddr. sa6_src may be &sa6_any which lives in read-only data section. Better pass down the const addresses to syn_cache_lookup(). They are needed for hash lookup and are not modified.
OK mvs@
|
#
1.174 |
|
11-Jan-2024 |
bluhm |
Fix white spaces in TCP.
|
#
1.173 |
|
29-Nov-2023 |
bluhm |
Run TCP syn cache timer without kernel lock.
As syn_cache_timer() uses syn cache mutex and exclusive net lock, it does not need kernel lock.
OK mvs@
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function in parts protected by mutex and sending with netlock. Split the flags field in dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
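The figures in this message can be checked with a few lines of C. The following is a minimal sketch, not the kernel code: the helper names wrap_days(), ts_val() and ts_elapsed() are hypothetical and only illustrate the arithmetic (wrap period of a millisecond counter, and taking the low 32 bits for the timestamp option).

```c
#include <assert.h>
#include <stdint.h>

#define MS_PER_DAY	(1000ULL * 60 * 60 * 24)

/* Whole days until a millisecond counter with `bits' bits wraps around. */
static uint64_t
wrap_days(unsigned int bits)
{
	assert(bits >= 1 && bits <= 63);
	return ((1ULL << bits) / MS_PER_DAY);
}

/* Hypothetical helper: the timestamp option carries only the low 32 bits. */
static uint32_t
ts_val(uint64_t now_ms)
{
	return ((uint32_t)now_ms);
}

/* Elapsed milliseconds between two 32-bit timestamps, modulo 2^32. */
static uint32_t
ts_elapsed(uint32_t newer, uint32_t older)
{
	return (newer - older);
}
```

Because the subtraction in ts_elapsed() is done on unsigned 32-bit values, the difference stays correct when the low 32 bits wrap between the two samples, as long as the real interval is shorter than 2^32 milliseconds.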
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
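The chopping step described here comes down to dividing a large payload into pieces of at most one maximum segment size. A minimal sketch of that arithmetic, assuming hypothetical helper names (tso_segments() and tso_last_len() are not functions from the kernel source):

```c
#include <assert.h>

/* Number of packets produced when `paylen' bytes are chopped to `mss'. */
static unsigned int
tso_segments(unsigned int paylen, unsigned int mss)
{
	assert(mss > 0);
	return (paylen / mss + (paylen % mss != 0));
}

/* Payload length of the final packet; 0 if there is no payload at all. */
static unsigned int
tso_last_len(unsigned int paylen, unsigned int mss)
{
	unsigned int rem;

	assert(mss > 0);
	rem = paylen % mss;
	if (rem != 0)
		return (rem);
	return (paylen != 0 ? mss : 0);
}
```

For example, a 4000 byte payload with an MSS of 1460 yields three packets, the last one carrying 1080 bytes.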
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (e.g. Windows) or dropped by some firewalls/routers; in such cases the window size had been fixed at 16KB, which limited throughput to a very low rate on high latency networks. Also change "tcp_now" from a 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others; it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
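The problem with signed flag constants can be illustrated in isolation. This is a minimal sketch under stated assumptions: EX_TF_TIMER and EX_NTIMERS are hypothetical stand-ins chosen so that the highest timer index lands on bit 31; they are not copied from tcp_var.h. With a signed constant, 0x04000000 << 5 would shift into the sign bit of int, which is undefined behavior in C; with the U suffix the result is well defined.

```c
#include <assert.h>

#define EX_TF_TIMER	0x04000000U	/* assumed base bit of the timer flags */
#define EX_NTIMERS	6		/* assumed number of TCP timers */

/* Flag bit for a timer index; unsigned, so bit 31 is reached safely. */
static unsigned int
ex_timer_flag(unsigned int timer)
{
	assert(timer < EX_NTIMERS);
	return (EX_TF_TIMER << timer);
}
```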
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
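The idea of a per-timer armed flag can be sketched without the kernel timeout API. This is an illustration only: struct ex_tcpcb, EX_TF_TMR() and the ex_timer_*() functions are hypothetical, and the real code uses timeout(9) and the netlock where the comments say so.

```c
#include <assert.h>

#define EX_NTIMERS	6
#define EX_TF_TMR(t)	(0x1U << (t))	/* hypothetical "armed" bit per timer */

struct ex_tcpcb {
	unsigned int	flags;			/* armed bits */
	int		fired[EX_NTIMERS];	/* how often each timer did work */
};

static void
ex_timer_arm(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	tp->flags |= EX_TF_TMR(t);
	/* the kernel would schedule the timeout here */
}

static void
ex_timer_cancel(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	tp->flags &= ~EX_TF_TMR(t);
	/* deleting the timeout may come too late if the handler already runs */
}

static void
ex_timer_handler(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	/* the kernel handler may have slept for the netlock before this point */
	if ((tp->flags & EX_TF_TMR(t)) == 0)
		return;			/* canceled in the meantime */
	tp->flags &= ~EX_TF_TMR(t);
	tp->fired[t]++;
}
```

A handler that starts after a cancel finds the bit cleared and returns without touching the connection, which is the property the commit message describes.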
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.176 |
|
13-Feb-2024 |
bluhm |
Merge struct route and struct route_in6.
Use a common struct route for both inet and inet6. Unfortunately struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has to be exposed from net/route.h. Struct route has to be bsd visible for userland as netstat kvm code inspects inp_route. Internet PCB and TCP SYN cache can use a plain struct route now. All specific sockaddr types for inet and inet6 are embedded there.
OK claudio@
|
#
1.175 |
|
27-Jan-2024 |
bluhm |
Declare address parameter in TCP SYN cache const.
tcp6_ctlinput() cast a constant sockaddr_sin6 to non-const sockaddr. sa6_src may be &sa6_any which lives in read-only data section. Better pass down the const addresses to syn_cache_lookup(). They are needed for hash lookup and are not modified.
OK mvs@
|
#
1.174 |
|
11-Jan-2024 |
bluhm |
Fix white spaces in TCP.
|
#
1.173 |
|
29-Nov-2023 |
bluhm |
Run TCP syn cache timer without kernel lock.
As syn_cache_timer() uses syn cache mutex and exclusive net lock, it does not need kernel lock.
OK mvs@
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function in parts protected by mutex and sending with netlock. Split the flags field in dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec need the software fallback in general. For performance comparison or to work around possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers; in such a case the window size had been fixed at 16KB, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest in pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9) leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect)().
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support the request, leave the handler NULL. Do the NULL check within the corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
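The NULL-handler convention described above can be sketched as follows. This is a minimal userland illustration, not the kernel code: the `ex_` types, fields and functions are hypothetical stand-ins for the real socket and `pr_usrreqs` structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's socket types. */
struct ex_socket;

struct ex_usrreqs {
	/* NULL means the protocol does not support bind. */
	int (*pru_bind)(struct ex_socket *so, const void *nam);
};

struct ex_socket {
	const struct ex_usrreqs *so_usrreqs;
	int so_bound;
};

/* Wrapper: check the handler once here so callers never see NULL. */
static int
ex_pru_bind(struct ex_socket *so, const void *nam)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return (EOPNOTSUPP);
	return ((*so->so_usrreqs->pru_bind)(so, nam));
}

/* Example handler for a protocol that does support bind. */
static int
ex_tcp_bind(struct ex_socket *so, const void *nam)
{
	(void)nam;
	so->so_bound = 1;
	return (0);
}

static const struct ex_usrreqs ex_tcp_usrreqs = { .pru_bind = ex_tcp_bind };
static const struct ex_usrreqs ex_nobind_usrreqs = { .pru_bind = NULL };
```

Keeping the check in the wrapper means a protocol without the request needs no dummy handler of its own.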
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
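A small sketch of why the unsigned suffix matters here. The values below are illustrative assumptions chosen so that the shift lands on bit 31; the authoritative definitions are in the TCP headers. With a plain `int` constant the same shift would overflow into the sign bit, which is undefined behaviour in C.

```c
#include <stdint.h>

/* Illustrative values; the real definitions live in the TCP headers. */
#define EX_TF_TIMER	0x04000000U	/* unsigned: safe to shift into bit 31 */
#define EX_TCPT_DELACK	5		/* assumed timer index */

/* Return the flag bit that marks timer `timer' as armed. */
static uint32_t
ex_timer_flag(unsigned int timer)
{
	return (EX_TF_TIMER << timer);
}
```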
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
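The per-timer flag can be pictured with the following single-threaded sketch. The `ex_` names are hypothetical and locking is omitted; the point is only that the callback rechecks the flag, so a timeout that was already dequeued before the cancel becomes a no-op.

```c
#include <stdbool.h>

struct ex_timer {
	bool armed;	/* set when armed, cleared when canceled */
	int fired;	/* how many times the timer body really ran */
};

static void
ex_timer_arm(struct ex_timer *t)
{
	t->armed = true;
}

static void
ex_timer_cancel(struct ex_timer *t)
{
	t->armed = false;
}

/*
 * Timeout callback. In the kernel this may run after sleeping for the
 * net lock, so a cancel can have happened in between; deleting the
 * timeout alone would not stop it.
 */
static void
ex_timer_callback(struct ex_timer *t)
{
	if (!t->armed)
		return;		/* canceled while we were waiting */
	t->armed = false;
	t->fired++;
}
```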
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
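The role switch between the two cache sets might look roughly like this. It is a sketch under stated assumptions: the structure layout, the `ex_` names and the caller-supplied seed are all hypothetical, and the real implementation in tcp_input.c differs in detail.

```c
#include <stdint.h>

#define EX_SYN_USE_LIMIT 100000	/* uses before a reseed becomes due */

struct ex_syn_set {
	int count;	/* entries currently stored in this set */
	int use;	/* remaining uses before a switch is wanted */
	uint32_t seed;	/* hash seed for this set */
};

struct ex_syn_cache {
	struct ex_syn_set set[2];
	int active;	/* index of the set that receives new entries */
};

/*
 * Pick the set for a new entry. If the active set is used up and the
 * passive set has drained, reseed the passive set and swap roles.
 * new_seed is supplied by the caller (random data in a real kernel).
 */
static struct ex_syn_set *
ex_syn_pick_set(struct ex_syn_cache *sc, uint32_t new_seed)
{
	struct ex_syn_set *act = &sc->set[sc->active];
	struct ex_syn_set *pas = &sc->set[!sc->active];

	if (act->use <= 0 && pas->count == 0) {
		pas->seed = new_seed;
		pas->use = EX_SYN_USE_LIMIT;
		sc->active = !sc->active;
		act = pas;
	}
	act->use--;
	act->count++;
	return (act);
}
```

Because new entries only ever go to the active set, an attacker who keeps old entries alive can delay the switch only until those entries time out of the passive set.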
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
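For readers unfamiliar with the idiom, here is a minimal sketch of why a multi-statement macro is wrapped this way. The macro and function names are hypothetical and not taken from the header.

```c
/* Hypothetical multi-statement macro wrapped so it acts as one statement. */
#define EX_SWAP_INT(a, b) do {						\
	int ex_tmp_ = (a);						\
	(a) = (b);							\
	(b) = ex_tmp_;							\
} while (/* CONSTCOND */ 0)

/* Order two ints; return 1 if they were swapped, 0 otherwise. */
static int
ex_sort2(int *x, int *y)
{
	int swapped = 1;

	/*
	 * With a bare { ... } body the trailing semicolon would end the
	 * if statement and orphan the else. The do/while form avoids that.
	 */
	if (*x > *y)
		EX_SWAP_INT(*x, *y);
	else
		swapped = 0;
	return (swapped);
}
```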
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
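As a worked example of the RFC 3390 rule, the initial window upper bound is min(4*MSS, max(2*MSS, 4380 bytes)). The helper below is a hypothetical illustration of that formula only and is not the kernel's implementation.

```c
#include <stdint.h>

/* RFC 3390 initial window in bytes: min(4*MSS, max(2*MSS, 4380)). */
static uint32_t
ex_rfc3390_iw(uint32_t mss)
{
	uint32_t lower = (2 * mss > 4380) ? 2 * mss : 4380;
	uint32_t upper = 4 * mss;

	return (upper < lower ? upper : lower);
}
```

For a typical Ethernet MSS of 1460 bytes this gives 4380 bytes, i.e. three full segments.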
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.175 |
|
27-Jan-2024 |
bluhm |
Declare address parameter in TCP SYN cache const.
tcp6_ctlinput() casted a constant sockaddr_sin6 to non-const sockaddr. sa6_src may be &sa6_any which lives in read-only data section. Better pass down the const addresses to syn_cache_lookup(). They are needed for hash lookup and are not modified.
OK mvs@
|
#
1.174 |
|
11-Jan-2024 |
bluhm |
Fix white spaces in TCP.
|
#
1.173 |
|
29-Nov-2023 |
bluhm |
Run TCP syn cache timer without kernel lock.
As syn_cache_timer() uses syn cache mutex and exclusive net lock, it does not need kernel lock.
OK mvs@
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function in parts protected by mutex and sending with netlock. Split the flags field in dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
|
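The wraparound figures quoted in revision 1.169 can be checked with a short calculation. This is a minimal sketch, not code from the tree: the `ex_` names are hypothetical, a year is assumed to have 365 days, and `ex_timestamp()` only illustrates taking the lower 32 bits for the timestamp option.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds in one day. */
#define EX_MS_PER_DAY	(1000.0 * 60 * 60 * 24)

/* Days until a millisecond counter of the given width wraps around. */
double
ex_wrap_days(unsigned int bits)
{
	double ticks = 1.0;
	unsigned int i;

	assert(bits <= 64);
	for (i = 0; i < bits; i++)
		ticks *= 2.0;		/* 2^bits, exact in a double */
	return ticks / EX_MS_PER_DAY;
}

/* Same interval in years, assuming 365-day years. */
double
ex_wrap_years(unsigned int bits)
{
	return ex_wrap_days(bits) / 365.0;
}

/* The 32-bit timestamp option carries the low half of the counter. */
uint32_t
ex_timestamp(uint64_t now)
{
	return (uint32_t)now;
}
```

Under these assumptions ex_wrap_days(32) comes out just under 50 days and ex_wrap_years(63) at roughly 2.9*10^8 years, in line with the commit message.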
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec need the software fallback in general. For performance comparison or to work around possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers; in such a case the window size had been fixed at 16KB, which keeps throughput very low on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in the pru_abort() return value. We call it only from soabort() which is a dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest in the pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9) leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support the request, leave the handler NULL. Do the NULL check within the corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
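The sign bit problem fixed in revision 1.137 can be illustrated with a small sketch. The values below are assumptions made for the example (first timer flag at bit 26, delayed ACK timer index 5), and the `EX_`/`ex_` names are hypothetical. With a signed int constant the same shift would produce a value that does not fit in int, which is undefined behavior in C; the U suffix keeps the arithmetic unsigned.

```c
#include <assert.h>

/*
 * Illustrative values, assumed for this sketch only: the first timer
 * flag occupies bit 26 and the delayed ACK timer has index 5.
 */
#define EX_TF_TIMER	0x04000000U	/* unsigned, so shifts stay defined */
#define EX_TCPT_DELACK	5

/* Flag bit for timer number `timer', computed in unsigned arithmetic. */
unsigned int
ex_timer_flag(unsigned int timer)
{
	assert(timer <= EX_TCPT_DELACK);
	return EX_TF_TIMER << timer;
}
```

On a platform with 32-bit int, ex_timer_flag(EX_TCPT_DELACK) lands exactly on the top bit, which is where a signed constant would have touched the sign bit.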
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
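The role switch described in revision 1.111 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the kernel code: the structures and `ex_` names are hypothetical, the use budget is counted downwards from the limit, and the new seed is passed in by the caller where the kernel would draw fresh random data.

```c
#include <assert.h>
#include <stdint.h>

#define EX_SYN_USE_LIMIT	100000	/* uses before a reseed is due */

/* Hypothetical, simplified stand-in for one SYN cache set. */
struct ex_syn_set {
	int		count;		/* entries currently cached */
	long		use_left;	/* insertions left before reseed */
	uint32_t	seed;		/* hash seed */
};

struct ex_syn_cache {
	struct ex_syn_set	set[2];
	int			active;	/* index of the set taking inserts */
};

/*
 * Called before an insertion.  Swap roles once the passive set has
 * idled out and the active set has used up its budget; the set that
 * becomes active gets a fresh seed.  Returns 1 if a switch happened.
 */
int
ex_syn_cache_prepare(struct ex_syn_cache *sc, uint32_t new_seed)
{
	struct ex_syn_set *act, *pas;

	assert(sc->active == 0 || sc->active == 1);
	act = &sc->set[sc->active];
	pas = &sc->set[!sc->active];

	if (pas->count != 0 || act->use_left > 0)
		return 0;
	pas->seed = new_seed;
	pas->use_left = EX_SYN_USE_LIMIT;
	sc->active = !sc->active;
	return 1;
}
```

Because the old active set keeps its seed while it drains, existing entries can still be looked up; an attacker who keeps it populated no longer blocks reseeding of the set that receives new entries.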
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
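The do { } while (0) idiom mentioned in revision 1.90 makes a multi-statement macro behave like a single statement, so it can be followed by a semicolon and used in an unbraced if/else. A minimal sketch with a hypothetical macro (EX_SWAP_INT and ex_order are made up for the example and are not from the header):

```c
#include <assert.h>

/* Hypothetical multi-statement macro written with the idiom. */
#define EX_SWAP_INT(a, b) do {						\
	int ex_tmp = (a);						\
	(a) = (b);							\
	(b) = ex_tmp;							\
} while (/* CONSTCOND */ 0)

/* Put *lo and *hi in ascending order. */
void
ex_order(int *lo, int *hi)
{
	if (*lo > *hi)
		EX_SWAP_INT(*lo, *hi);	/* expands to one statement */
	else
		assert(*lo <= *hi);
}
```

Had the macro body been wrapped in plain braces, the semicolon after the macro call would end the if statement and the following else would not compile.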
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
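A port bitmask of the kind this change describes can be sketched as follows. This is only an illustration, not the kernel's code: the names `ex_portmap`, `ex_port_set`, and `ex_port_isset`, and the decision to cover all 65536 ports, are assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define EX_NPORTS   65536	/* assumption: one bit for every port number */
#define EX_MAPBITS  32		/* bits per array element */

/* Hypothetical map: a set bit means "never allocate this port dynamically". */
struct ex_portmap {
	uint32_t bits[EX_NPORTS / EX_MAPBITS];
};

/* Start with no ports excluded. */
static void
ex_port_clear_all(struct ex_portmap *m)
{
	memset(m->bits, 0, sizeof(m->bits));
}

/* Mark one port as excluded from dynamic allocation. */
static void
ex_port_set(struct ex_portmap *m, uint16_t port)
{
	m->bits[port / EX_MAPBITS] |= (uint32_t)1 << (port % EX_MAPBITS);
}

/* Return 1 if the port is excluded, 0 otherwise. */
static int
ex_port_isset(const struct ex_portmap *m, uint16_t port)
{
	return (m->bits[port / EX_MAPBITS] >> (port % EX_MAPBITS)) & 1;
}
```

A bitmask keeps the lookup in the port allocation path to one array index and one shift, and it can be copied in and out through sysctl as a single block.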
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.174 |
|
11-Jan-2024 |
bluhm |
Fix white spaces in TCP.
|
#
1.173 |
|
29-Nov-2023 |
bluhm |
Run TCP syn cache timer without kernel lock.
As syn_cache_timer() uses syn cache mutex and exclusive net lock, it does not need kernel lock.
OK mvs@
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function in parts protected by mutex and sending with netlock. Split the flags field in dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
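The figures quoted in this entry can be checked with a little arithmetic. The sketch below is not taken from the kernel; the function names are hypothetical and only illustrate the wrap-around periods and the truncation to the 32-bit timestamp option.

```c
#include <stdint.h>

/* Days until a 32-bit millisecond counter wraps: 2^32 ms, roughly 49.7 days. */
static double
ex_wrap_days_32(void)
{
	return 4294967296.0 / 1000.0 / 86400.0;
}

/* Years covered by a 63-bit millisecond offset: 2^63 ms, roughly 2.9*10^8 years. */
static double
ex_wrap_years_63(void)
{
	return (double)(UINT64_C(1) << 63) / 1000.0 / (365.25 * 86400.0);
}

/* The TCP timestamp option is 32 bits wide, so only the low 32 bits are sent. */
static uint32_t
ex_ts_val(uint64_t now_ms)
{
	return (uint32_t)now_ms;
}

/* Unsigned subtraction gives the right elapsed time even across a 32-bit wrap. */
static uint32_t
ex_ts_elapsed(uint32_t now, uint32_t then)
{
	return now - then;
}
```

Because the subtraction is done modulo 2^32, comparing two truncated timestamps stays correct as long as the real interval between them is shorter than one wrap period.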
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
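The chopping step mentioned here comes down to simple length arithmetic. The sketch below shows only that arithmetic under assumed names (`ex_tso_nsegs`, `ex_tso_seglen`); the real code also has to copy headers, advance sequence numbers, and fix checksums, none of which is modelled.

```c
#include <stddef.h>

/* Number of segments needed to carry len payload bytes, at most mss bytes each. */
static size_t
ex_tso_nsegs(size_t len, size_t mss)
{
	if (mss == 0)
		return 0;
	return (len + mss - 1) / mss;
}

/* Payload length of segment i (0-based); 0 if i is past the last segment. */
static size_t
ex_tso_seglen(size_t len, size_t mss, size_t i)
{
	size_t off = i * mss;

	if (mss == 0 || off >= len)
		return 0;
	return (len - off < mss) ? len - off : mss;
}
```

For example, a 4000-byte payload with an MSS of 1460 becomes three segments of 1460, 1460, and 1080 bytes.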
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers; in such a case the window size had been fixed at 16KB, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
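The NULL-handler convention described in this entry can be sketched as below. The structures are hypothetical stand-ins (`ex_usrreqs`, `ex_socket`) with a simplified handler signature, not the kernel's definitions; only the pattern of checking in the wrapper and returning EOPNOTSUPP follows the commit message.

```c
#include <errno.h>
#include <stddef.h>

struct ex_socket;

/* Hypothetical per-protocol handler table; a NULL entry means "not supported". */
struct ex_usrreqs {
	int (*pru_bind)(struct ex_socket *, const void *);
};

struct ex_socket {
	const struct ex_usrreqs *so_usrreqs;
};

/* Wrapper: the NULL check lives here once, instead of in every caller. */
static int
ex_pru_bind(struct ex_socket *so, const void *addr)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_usrreqs->pru_bind)(so, addr);
}

/* Trivial handler used only to exercise the wrapper. */
static int
ex_bind_ok(struct ex_socket *so, const void *addr)
{
	(void)so;
	(void)addr;
	return 0;
}
```

With this shape, a protocol that cannot bind simply leaves the slot empty rather than supplying a stub that returns an error.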
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
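The sign-bit problem named here can be shown with a small example. The constant values below are assumptions chosen so that the shift reaches bit 31; they are not quoted from the header.

```c
/*
 * Assumed values for illustration: a base flag at bit 26 and a timer
 * index of 5, so the shifted flag lands on bit 31.
 */
#define EX_TF_TIMER      0x04000000U	/* unsigned: shifting into bit 31 is defined */
#define EX_TCPT_DELACK   5

/*
 * With a plain int constant (0x04000000 without the U suffix), shifting
 * left by 5 would overflow a 32-bit signed int, which is undefined
 * behaviour in C. The unsigned constant makes the result well defined.
 */
static unsigned int
ex_timer_flag(unsigned int timer)
{
	return EX_TF_TIMER << timer;
}
```

Since the flags are stored in an unsigned field, defining the constants as unsigned also avoids implicit signed-to-unsigned conversions when they are combined.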
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
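The armed/canceled flag idea can be sketched as follows. All names and the flag layout are hypothetical; the point is only that a late-running handler consults the flag instead of trusting that deleting the timeout was enough.

```c
#define EX_TF_TIMER  0x01U	/* assumed base bit for the per-timer flags */
#define EX_NTIMERS   4		/* assumed number of timers */

struct ex_tcpcb {
	unsigned int t_flags;		/* one "armed" bit per timer */
	int fired[EX_NTIMERS];		/* how often each timer did real work */
};

static void
ex_timer_arm(struct ex_tcpcb *tp, unsigned int idx)
{
	tp->t_flags |= EX_TF_TIMER << idx;
}

static void
ex_timer_cancel(struct ex_tcpcb *tp, unsigned int idx)
{
	tp->t_flags &= ~(EX_TF_TIMER << idx);
}

/*
 * Timeout handler body. It may run after the timer was canceled, for
 * example because it slept waiting for a lock, so it checks the flag first.
 * Returns 1 if the timer action ran, 0 if the timer had been canceled.
 */
static int
ex_timer_run(struct ex_tcpcb *tp, unsigned int idx)
{
	if ((tp->t_flags & (EX_TF_TIMER << idx)) == 0)
		return 0;
	ex_timer_cancel(tp, idx);
	tp->fired[idx]++;
	return 1;
}
```

In a real kernel the flag would be checked only after the handler holds the lock that protects it, which is what lets cancellation take effect reliably.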
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
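The active/passive scheme can be sketched roughly as below. This is a simplified model with hypothetical names; it tracks only counters and seeds, takes the fresh seed as a parameter instead of calling a random source, and counts uses downward from a limit.

```c
#include <stdint.h>

struct ex_syn_set {
	int      count;	/* entries currently stored in this set */
	long     use;	/* insertions left before a reseed is wanted */
	uint32_t seed;	/* hash seed used for this set */
};

struct ex_syn_cache {
	struct ex_syn_set set[2];
	int  active;	/* index of the set that receives new entries */
	long use_limit;	/* e.g. 100000 in the commit message */
};

static void
ex_syn_cache_init(struct ex_syn_cache *c, long use_limit, uint32_t seed)
{
	c->set[0] = (struct ex_syn_set){ 0, use_limit, seed };
	c->set[1] = (struct ex_syn_set){ 0, use_limit, seed };
	c->active = 0;
	c->use_limit = use_limit;
}

/* Account for one new entry; switch roles and reseed when allowed. */
static void
ex_syn_cache_insert(struct ex_syn_cache *c, uint32_t fresh_seed)
{
	struct ex_syn_set *act = &c->set[c->active];
	struct ex_syn_set *pas = &c->set[!c->active];

	/* Switch only if the active set is used up and the passive one drained. */
	if (act->use <= 0 && pas->count == 0) {
		c->active = !c->active;
		act = pas;
		act->seed = fresh_seed;
		act->use = c->use_limit;
	}
	act->count++;
	act->use--;
}

/* An entry left set idx (completed handshake or timeout). */
static void
ex_syn_cache_remove(struct ex_syn_cache *c, int idx)
{
	if (c->set[idx].count > 0)
		c->set[idx].count--;
}
```

Since the passive set never receives new entries, it drains through timeouts and completed handshakes however many SYNs an attacker sends, so the switch and reseed cannot be postponed indefinitely.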
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but document) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
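As an illustration of why the idiom matters, here is a minimal sketch. `SWAP_INT` and `max_first()` are hypothetical names made up for this example and are not macros from tcp_var.h. Wrapping a multi-statement macro in do { } while (0) makes it expand to exactly one statement, so it behaves correctly inside an unbraced if/else.

```c
/* Hypothetical multi-statement macro using the idiom. */
#define SWAP_INT(a, b) do {						\
	int swap_tmp_ = (a);						\
	(a) = (b);							\
	(b) = swap_tmp_;						\
} while (/* CONSTCOND */ 0)

/*
 * Put the larger value first.  Returns 1 if the values were swapped,
 * 0 otherwise.  Without the do/while wrapper the trailing semicolon
 * after the macro call would detach the else from the if and the
 * function would not compile.
 */
static int
max_first(int *x, int *y)
{
	if (*x < *y)
		SWAP_INT(*x, *y);
	else
		return 0;
	return 1;
}
```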
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
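For reference, RFC 3390 bounds the initial window at min(4*MSS, max(2*MSS, 4380 bytes)). The helper below is a minimal sketch of that formula only; `rfc3390_initial_window()` is a hypothetical name and is not claimed to match how the kernel computes it.

```c
#include <stdint.h>

/*
 * Upper bound on the initial congestion window in bytes, following
 * the RFC 3390 formula: min(4 * MSS, max(2 * MSS, 4380)).
 */
static uint32_t
rfc3390_initial_window(uint32_t mss)
{
	uint32_t lower = (2 * mss > 4380) ? 2 * mss : 4380;
	uint32_t upper = 4 * mss;

	return (upper < lower) ? upper : lower;
}
```

With a typical Ethernet MSS of 1460 bytes this gives 4380 bytes, which is three full segments.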
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.173 |
|
29-Nov-2023 |
bluhm |
Run TCP syn cache timer without kernel lock.
As syn_cache_timer() uses syn cache mutex and exclusive net lock, it does not need kernel lock.
OK mvs@
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function in parts protected by mutex and sending with netlock. Split the flags field in dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
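The numbers in this commit can be checked with a little integer arithmetic. The sketch below is illustrative only; `wrap_days_32()`, `wrap_years_63()`, `ts_val()` and `ts_elapsed()` are hypothetical helpers, not kernel functions. It also shows why truncating the 64 bit counter to 32 bits for the timestamp option is still safe: unsigned subtraction yields the right elapsed time across a wrap.

```c
#include <stdint.h>

#define MS_PER_DAY 86400000ULL

/* Whole days until a 32 bit millisecond counter wraps around. */
static uint64_t
wrap_days_32(void)
{
	return (1ULL << 32) / MS_PER_DAY;
}

/* Whole 365-day years covered by 2^63 milliseconds. */
static uint64_t
wrap_years_63(void)
{
	return (1ULL << 63) / MS_PER_DAY / 365;
}

/* Lower 32 bits of the 64 bit clock, as sent in the timestamp option. */
static uint32_t
ts_val(uint64_t now_ms)
{
	return (uint32_t)now_ms;
}

/* Elapsed milliseconds between two 32 bit timestamps, modulo 2^32. */
static uint32_t
ts_elapsed(uint32_t newer, uint32_t older)
{
	return newer - older;
}
```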
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
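The chopping step can be pictured with a small sketch. It only models the payload arithmetic, on the assumption that a large packet is cut into pieces of at most one maximum segment size; it does not touch mbufs, headers or checksums, and `tso_segments()` and `tso_segment_len()` are hypothetical names.

```c
#include <stddef.h>

/* Number of packets produced when payload bytes are cut at mss. */
static size_t
tso_segments(size_t payload, size_t mss)
{
	if (mss == 0)
		return 0;
	return (payload + mss - 1) / mss;	/* round up */
}

/* Payload length of the i-th generated packet (0-based), 0 if none. */
static size_t
tso_segment_len(size_t payload, size_t mss, size_t i)
{
	size_t offset = i * mss;
	size_t remaining;

	if (mss == 0 || offset >= payload)
		return 0;
	remaining = payload - offset;
	return (remaining < mss) ? remaining : mss;
}
```

For example, a 4000 byte payload with an MSS of 1460 yields two full packets and a final one of 1080 bytes.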
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers, in such a case the window size had been fixed at 16KB, this limits throughput at very low on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it isn't called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect)().
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
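The NULL-handler convention described in this message can be sketched in C as follows. This is a simplified illustration and not the kernel code: every `_sketch` type and function is a hypothetical stand-in for the protocol handler table and the pru_bind() wrapper.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real definitions. */
struct socket_sketch;

struct usrreqs_sketch {
	/* NULL means "this protocol does not support bind". */
	int (*pru_bind)(struct socket_sketch *, const void *);
};

struct socket_sketch {
	const struct usrreqs_sketch *so_usrreqs;
};

/* Example handler for a protocol that does support the request. */
static int
bind_ok_sketch(struct socket_sketch *so, const void *addr)
{
	(void)so;
	(void)addr;
	return 0;
}

/* Wrapper: do the NULL check once here instead of in every protocol. */
static int
pru_bind_sketch(struct socket_sketch *so, const void *addr)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_usrreqs->pru_bind)(so, addr);
}
```

Keeping the check in the wrapper means a protocol without bind support only has to leave the pointer unset; it needs no dummy function that returns an error.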
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection, but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done so that data which may allow an attack on a session is not shared with userland. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
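A minimal sketch of why the unsigned suffix matters. The values below are assumptions chosen for illustration (a timer flag base of 0x04000000 and a delayed ACK timer index of 5); the `_SKETCH` names are hypothetical. With a plain signed int constant, shifting that base left by 5 would reach bit 31, which is undefined behaviour in C; with an unsigned constant the result is well defined.

```c
/*
 * Assumed values for illustration: the timer flags occupy the top bits
 * of a 32-bit word and the delayed ACK timer has index 5.
 */
#define TF_TIMER_SKETCH		0x04000000U	/* flag of timer index 0 */
#define TCPT_DELACK_SKETCH	5

/* Flag bit for a given timer index; the arithmetic stays unsigned. */
static unsigned int
timer_flag_sketch(int timer)
{
	return TF_TIMER_SKETCH << timer;
}
```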
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
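The per-timer flag described above could look roughly like this. It is a single-threaded sketch under stated assumptions: the `_sketch` names are hypothetical, locking is only indicated in comments, and the real scheduling of timeouts is left out.

```c
#define TMR_ARMED_SKETCH(t)	(0x1U << (t))

struct tcpcb_sketch {
	unsigned int t_flags;	/* one "armed" bit per timer */
	int fired;		/* counts how often timer work really ran */
};

static void
timer_arm_sketch(struct tcpcb_sketch *tp, int timer)
{
	tp->t_flags |= TMR_ARMED_SKETCH(timer);
	/* a real implementation would also schedule the timeout here */
}

static void
timer_cancel_sketch(struct tcpcb_sketch *tp, int timer)
{
	tp->t_flags &= ~TMR_ARMED_SKETCH(timer);
	/* deleting the timeout alone cannot stop a handler already started */
}

/* Timeout handler: may proceed late, after sleeping to grab the lock. */
static void
timer_handler_sketch(struct tcpcb_sketch *tp, int timer)
{
	/* lock assumed held from here on */
	if ((tp->t_flags & TMR_ARMED_SKETCH(timer)) == 0)
		return;		/* canceled while we waited */
	tp->t_flags &= ~TMR_ARMED_SKETCH(timer);
	tp->fired++;
}
```

The handler trusts the flag rather than the fact that it was invoked, so a cancel that happens while the handler sleeps on the lock still takes effect.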
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
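The active/passive scheme can be sketched as below. This is an illustration under assumptions, not the code in tcp_input.c: all `_sketch` names are hypothetical, the use limit is 3 instead of 100000 to keep the example small, and the fresh random seed is passed in by the caller.

```c
#define SYN_USE_LIMIT_SKETCH	3	/* the message uses 100000 */

struct syn_set_sketch {
	int count;		/* entries currently stored */
	long use;		/* insertions left before a swap is wanted */
	unsigned int seed;	/* seed of the hash function */
};

struct syn_cache_sketch {
	struct syn_set_sketch set[2];
	int active;		/* index of the set receiving new entries */
};

static void
syn_cache_init_sketch(struct syn_cache_sketch *sc, unsigned int seed)
{
	sc->set[0] = (struct syn_set_sketch){ 0, SYN_USE_LIMIT_SKETCH, seed };
	sc->set[1] = (struct syn_set_sketch){ 0, 0, 0 };
	sc->active = 0;
}

static void
syn_cache_insert_sketch(struct syn_cache_sketch *sc, unsigned int fresh_random)
{
	struct syn_set_sketch *act = &sc->set[sc->active];
	struct syn_set_sketch *pas = &sc->set[!sc->active];

	/* Swap roles only when active is used up and passive has drained. */
	if (act->use <= 0 && pas->count == 0) {
		sc->active = !sc->active;
		act = pas;
		act->seed = fresh_random;	/* reseed the hash function */
		act->use = SYN_USE_LIMIT_SKETCH;
	}
	act->count++;
	act->use--;
}
```

Because new entries only ever land in the active set, the passive set can drain even while an attacker keeps sending SYN packets, which is what makes the reseed reachable.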
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
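A short sketch of what the idiom buys. The macro and function names here are hypothetical and only serve the illustration: wrapping a multi-statement macro in do { } while (0) makes it behave as a single statement, so it can be followed by a semicolon and still sit safely in an unbraced if/else.

```c
/* Hypothetical two-statement macro, wrapped so it acts as one statement. */
#define BUMP_BOTH_SKETCH(a, b) do {					\
	(a)++;								\
	(b)++;								\
} while (/* CONSTCOND */ 0)

static int
bump_demo_sketch(int cond)
{
	int a = 0, b = 0;

	if (cond)
		BUMP_BOTH_SKETCH(a, b);	/* the trailing ';' ends the do-while */
	else
		a = -1;
	return a + b;
}
```

With a bare `{ ... }` block instead, the semicolon after the macro call would terminate the if statement and the `else` would no longer compile.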
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.172 |
|
16-Nov-2023 |
bluhm |
Run TCP SYN cache timer logic without net lock.
Introduce global TCP SYN cache mutex. Divide timer function into parts protected by mutex and sending with netlock. Split the flags field into dynamic flags protected by mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
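The figures quoted above can be checked with a short calculation. The helper names below (`ex_wrap_days`, `ex_wrap_years`, `ex_ts_val`) are hypothetical and only illustrate the arithmetic and the truncation to the 32-bit timestamp option; a year is assumed to be 365.25 days.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MS_PER_DAY	(1000.0 * 60.0 * 60.0 * 24.0)
#define EX_MS_PER_YEAR	(EX_MS_PER_DAY * 365.25)	/* assumption: Julian year */

/* 2^bits as a double, computed without libm. */
static double
ex_pow2(unsigned int bits)
{
	double v = 1.0;

	assert(bits <= 64);
	while (bits-- > 0)
		v *= 2.0;
	return v;
}

/* Days until a millisecond counter of the given width wraps around. */
static double
ex_wrap_days(unsigned int bits)
{
	return ex_pow2(bits) / EX_MS_PER_DAY;
}

/* The same wrap-around time, expressed in years. */
static double
ex_wrap_years(unsigned int bits)
{
	return ex_pow2(bits) / EX_MS_PER_YEAR;
}

/* The TCP timestamp option carries only the lower 32 bits of the counter. */
static uint32_t
ex_ts_val(uint64_t now)
{
	return (uint32_t)now;
}
```

`ex_wrap_days(32)` comes out at roughly 49.7 days and `ex_wrap_years(63)` at roughly 2.9*10^8 years, matching the numbers in the commit message.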
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
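The chopping step can be pictured with a small calculation of how many segments a large packet yields and how long each one is. This is only a sketch of the arithmetic; `ex_tso_segments` and `ex_tso_seglen` are hypothetical helpers, and the real code operates on mbuf chains and also rewrites headers and checksums.

```c
#include <assert.h>
#include <stddef.h>

/* Number of segments needed to carry `payload` bytes at `mss` bytes each. */
static size_t
ex_tso_segments(size_t payload, size_t mss)
{
	assert(mss > 0);
	return payload / mss + (payload % mss != 0);
}

/* Payload length of segment number `idx` (0-based); the last may be short. */
static size_t
ex_tso_seglen(size_t payload, size_t mss, size_t idx)
{
	size_t off = idx * mss;

	assert(mss > 0 && off < payload);
	return (payload - off < mss) ? payload - off : mss;
}
```

For a 4000 byte payload and an MSS of 1460, this gives three segments of 1460, 1460 and 1080 bytes.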
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (e.g. Windows) or dropped by some firewalls/routers, the window size had been fixed at 16KB in such cases, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in the pru_abort() return value. We call it only from soabort(), which is a dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove the existing code for all others, as it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but it is actually required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
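The NULL-handler convention can be sketched as below. All names prefixed with `ex_` are hypothetical stand-ins for the real socket and pr_usrreqs structures; only the pattern (optional function pointer, wrapper returning EOPNOTSUPP) follows the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct ex_socket {
	int bound_port;
};

/* Hypothetical per-protocol handler table; unsupported requests stay NULL. */
struct ex_usrreqs {
	int (*pru_bind)(struct ex_socket *, int);
};

static int
ex_tcp_bind(struct ex_socket *so, int port)
{
	so->bound_port = port;
	return 0;
}

static const struct ex_usrreqs ex_tcp_usrreqs = { .pru_bind = ex_tcp_bind };
static const struct ex_usrreqs ex_nobind_usrreqs = { .pru_bind = NULL };

/* Wrapper: do the NULL check once here instead of in every protocol. */
static int
ex_pru_bind(const struct ex_usrreqs *ur, struct ex_socket *so, int port)
{
	assert(ur != NULL && so != NULL);
	if (ur->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*ur->pru_bind)(so, port);
}
```

Callers always go through `ex_pru_bind()`, so a protocol that cannot bind needs no stub function at all.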
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
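The sign-bit problem can be illustrated with made-up values. `EX_TF_TIMER` and `EX_TCPT_DELACK` below are assumptions chosen so that the shift lands on bit 31; they are not taken from the header. With a plain signed `int` constant, the same shift would produce a value that is not representable in `int`, which is undefined behavior in C.

```c
#include <assert.h>

#define EX_TF_TIMER	0x10000000U	/* hypothetical base flag for timer bits */
#define EX_TCPT_DELACK	3		/* hypothetical timer index */

/* Flag bit for a given timer; unsigned arithmetic keeps bit 31 well defined. */
static unsigned int
ex_timer_flag(unsigned int timer)
{
	assert(timer < 4);
	return EX_TF_TIMER << timer;
}
```

Because the constant carries the `U` suffix, `ex_timer_flag(EX_TCPT_DELACK)` evaluates to `0x80000000U` and can be stored in a `u_int` flags field without involving the sign bit.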
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
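The per-timer flag idea can be sketched as follows. This is a simplified model with hypothetical names (`ex_tcpcb`, `ex_timer_run` and so on); locking and the real timeout API are left out, and `ex_timer_run()` stands for the handler body executed once the lock has been obtained.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_NTIMERS 4

struct ex_tcpcb {
	unsigned int armed;		/* one bit per timer */
	int fired[EX_NTIMERS];		/* how often each handler really ran */
};

static void
ex_timer_arm(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	tp->armed |= 1U << t;
}

static void
ex_timer_cancel(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	tp->armed &= ~(1U << t);
}

/* Handler body; returns false if the timer was canceled in the meantime. */
static bool
ex_timer_run(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	if ((tp->armed & (1U << t)) == 0)
		return false;	/* canceled while we slept on the lock */
	tp->armed &= ~(1U << t);
	tp->fired[t]++;
	return true;
}
```

A timeout that was already dequeued and is waiting for the lock still reaches `ex_timer_run()`, but the cleared bit turns it into a no-op.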
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
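The role switch between the two caches can be modelled roughly like this. The structure layout, the names, and passing the fresh seed as an argument are all assumptions made for the sketch; removal of entries and the hash table itself are omitted.

```c
#include <assert.h>
#include <stdint.h>

struct ex_syn_set {
	int count;	/* entries currently stored in this set */
	long use;	/* insertions left before a switch is allowed */
	uint32_t seed;	/* hash seed in effect for this set */
};

struct ex_syn_cache {
	struct ex_syn_set set[2];
	int active;	/* index of the set that receives new entries */
	long use_limit;	/* e.g. 100000 in the description above */
};

/* Insert one entry; returns 1 if the sets switched roles first. */
static int
ex_syn_insert(struct ex_syn_cache *sc, uint32_t fresh_seed)
{
	struct ex_syn_set *act = &sc->set[sc->active];
	struct ex_syn_set *pas = &sc->set[!sc->active];
	int switched = 0;

	assert(sc->use_limit > 0);
	if (act->use <= 0 && pas->count == 0) {
		/* The empty passive set becomes active with a new seed. */
		sc->active = !sc->active;
		act = pas;
		act->seed = fresh_seed;
		act->use = sc->use_limit;
		switched = 1;
	}
	act->count++;
	act->use--;
	return switched;
}
```

Since the old active set stops receiving entries after the switch, an attacker can no longer keep it populated, and it eventually drains so that the next switch can happen. While the passive set is still draining, the use counter of the active set keeps counting down and may go negative.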
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); since the normal ones now take a proc argument, there is no need for these, as they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
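A per-second limit of this kind can be sketched as a counter that resets whenever the second changes. This is only an illustration: `ex_ratelimit` and `ex_rate_ok` are hypothetical names, the caller supplies the current time in seconds, and whether the kernel allows exactly N or fewer than N packets at the boundary is an assumption here.

```c
#include <assert.h>

/* Hypothetical state for one rate-limited event class. */
struct ex_ratelimit {
	long	last_sec;	/* second in which count was last reset */
	int	count;		/* events allowed so far in that second */
};

/*
 * Return 1 if another event may be sent during second now_sec,
 * 0 if the per-second budget maxpps is already used up.
 */
static int
ex_rate_ok(struct ex_ratelimit *rl, long now_sec, int maxpps)
{
	assert(maxpps >= 0);
	if (now_sec != rl->last_sec) {
		/* A new second has started: begin a fresh budget. */
		rl->last_sec = now_sec;
		rl->count = 0;
	}
	if (rl->count < maxpps) {
		rl->count++;
		return 1;
	}
	return 0;
}
```

A caller would check `ex_rate_ok()` before generating each RST and silently drop the reply when it returns 0.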
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlaid with a structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.171 |
|
04-Sep-2023 |
bluhm |
Fix netstat output of uses of current SYN cache left.
TCP syn cache variable scs_use is basically counting packet insertions into syn cache. Prefer type long to exclude overflow on fast machines. Due to counting downwards from a limit, it can become negative. Copy it out as tcps_sc_uses_left via sysctl, and print it as signed long long integer.
OK mvs@
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
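The figures quoted in this message can be checked with a few lines of C. This is a standalone sketch, not the kernel code: `ex_ms_to_days`, `ex_ts_val`, `ex_ts_elapsed` and `ex_check_wrap_figures` are hypothetical helpers, and a year is approximated as 365 days.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MS_PER_DAY	(1000ULL * 60 * 60 * 24)

/* Whole days covered by a millisecond counter that runs up to ms. */
static uint64_t
ex_ms_to_days(uint64_t ms)
{
	return ms / EX_MS_PER_DAY;
}

/* The timestamp option is 32 bits wide: keep the low half of the clock. */
static uint32_t
ex_ts_val(uint64_t now_ms)
{
	return (uint32_t)now_ms;
}

/* Elapsed milliseconds between two stamps; unsigned math survives a wrap. */
static uint32_t
ex_ts_elapsed(uint32_t newer, uint32_t older)
{
	return (uint32_t)(newer - older);
}

/* Self-check of the numbers in the log message. */
static void
ex_check_wrap_figures(void)
{
	/* A 32-bit millisecond counter wraps after 49 whole days. */
	assert(ex_ms_to_days(UINT32_MAX) == 49);
	/* 2^63 ms is about 292 million years, i.e. 2.9*10^8. */
	assert(ex_ms_to_days(1ULL << 63) / 365 / 1000000 == 292);
	/* A stamp taken just before the 32-bit wrap still compares sanely. */
	assert(ex_ts_elapsed(ex_ts_val(0x100000005ULL),
	    ex_ts_val(0xFFFFFFFBULL)) == 10);
}
```

The last assertion shows why truncating the 64-bit clock for the timestamp option is harmless as long as differences are taken in 32-bit unsigned arithmetic.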
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
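The software fallback described here boils down to cutting one large payload into pieces of at most one maximum segment size. A minimal sketch of that arithmetic follows; `ex_tso_nsegs` and `ex_tso_seglen` are hypothetical names, and header handling and checksums are left out entirely.

```c
#include <assert.h>

/* Number of packets needed to carry paylen bytes with at most mss each. */
static unsigned int
ex_tso_nsegs(unsigned int paylen, unsigned int mss)
{
	assert(mss > 0);
	return paylen / mss + (paylen % mss != 0);
}

/* Payload length of segment idx (0-based); only the last one may be short. */
static unsigned int
ex_tso_seglen(unsigned int paylen, unsigned int mss, unsigned int idx)
{
	assert(idx < ex_tso_nsegs(paylen, mss));
	if ((idx + 1) * mss <= paylen)
		return mss;
	return paylen - idx * mss;
}
```

For example, a 4000 byte payload with an MSS of 1460 yields two full segments and a final one of 1080 bytes.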
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers; in such cases the window size had been fixed at 16KB, which limits throughput to a very low rate on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
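Why a window stuck at 16KB hurts on high latency links follows from the rule that at most one window of data can be in flight per round trip. The sketch below works through that arithmetic; the function names are hypothetical and the numbers are illustrative, not measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on throughput (bytes/s) when one window is sent per RTT. */
static uint64_t
ex_window_limited_rate(uint32_t window_bytes, uint32_t rtt_ms)
{
	assert(rtt_ms > 0);
	return (uint64_t)window_bytes * 1000 / rtt_ms;
}

/* Buffer needed to sustain a target rate: bandwidth times delay. */
static uint64_t
ex_bdp_bytes(uint64_t rate_bytes_per_sec, uint32_t rtt_ms)
{
	return rate_bytes_per_sec * rtt_ms / 1000;
}
```

With a 16384 byte window and a 100 ms round trip the bound is 163840 bytes per second, while filling a 100 Mbit/s path (12500000 bytes/s) at the same delay would need a buffer of about 1.25 MB.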
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides much more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
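The sign-bit problem can be illustrated with a small sketch. The constants below are hypothetical stand-ins chosen so that the highest timer index lands on bit 31; with a plain `int` constant the same shift would overflow into the sign bit, which is undefined behaviour in C, whereas the `U` suffix keeps it well defined.

```c
#include <assert.h>

/* Hypothetical values: base flag bit and highest timer index. */
#define EX_TF_TIMER	0x04000000U	/* unsigned, so shifts are defined */
#define EX_TIMER_MAX	5

/* Flag bit that marks timer number `timer`. */
static unsigned int
ex_timer_flag(unsigned int timer)
{
	assert(timer <= EX_TIMER_MAX);
	return EX_TF_TIMER << timer;
}
```

Here `ex_timer_flag(5)` evaluates to `0x80000000U`, the topmost bit of a 32-bit `unsigned int`.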
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
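The idea of a per-timer armed flag can be sketched as follows. This is an assumption-laden illustration rather than the kernel implementation: `ex_tcpcb`, `EX_TMR_ARMED` and the three functions are hypothetical, and locking is omitted. The handler re-checks the flag after it has obtained the lock, so a timer canceled in the meantime does nothing.

```c
#include <assert.h>

#define EX_NTIMERS	6
#define EX_TMR_ARMED(i)	(1U << (i))

/* Hypothetical, stripped-down control block. */
struct ex_tcpcb {
	unsigned int	flags;	/* one armed bit per timer */
};

static void
ex_timer_arm(struct ex_tcpcb *tp, unsigned int i)
{
	assert(i < EX_NTIMERS);
	tp->flags |= EX_TMR_ARMED(i);
}

static void
ex_timer_cancel(struct ex_tcpcb *tp, unsigned int i)
{
	assert(i < EX_NTIMERS);
	tp->flags &= ~EX_TMR_ARMED(i);
}

/*
 * Body of the timeout handler, entered after the lock was taken.
 * Returns 1 if the timer work should run, 0 if it was canceled.
 */
static int
ex_timer_run(struct ex_tcpcb *tp, unsigned int i)
{
	assert(i < EX_NTIMERS);
	if ((tp->flags & EX_TMR_ARMED(i)) == 0)
		return 0;	/* canceled while waiting for the lock */
	tp->flags &= ~EX_TMR_ARMED(i);
	return 1;
}
```

Deleting the underlying timeout alone would not cover the window in which the handler is already sleeping on the lock; the flag check closes that window.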
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
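The role switch between the two caches can be sketched like this. All names are hypothetical, the use counter counts down from a configurable limit (100000 in the message above), and the new seed is passed in by the caller where a kernel would draw it from a random source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one syn cache set. */
struct ex_set {
	int		count;	/* entries currently stored */
	long		use;	/* insertions left before a switch is due */
	uint32_t	seed;	/* hash seed for this set */
};

struct ex_cache {
	struct ex_set	set[2];
	int		active;	/* index of the set taking new entries */
	long		limit;	/* insertions allowed per seed */
};

/* Switch roles if the active set is used up and the passive one is empty. */
static int
ex_maybe_switch(struct ex_cache *c, uint32_t new_seed)
{
	struct ex_set *act, *pas;

	assert(c->active == 0 || c->active == 1);
	act = &c->set[c->active];
	pas = &c->set[!c->active];
	if (act->use > 0 || pas->count != 0)
		return 0;
	pas->seed = new_seed;	/* reseed while the set holds no entries */
	pas->use = c->limit;
	c->active = !c->active;
	return 1;
}

/* Account for one new entry in the active set. */
static void
ex_insert(struct ex_cache *c, uint32_t new_seed)
{
	struct ex_set *act;

	ex_maybe_switch(c, new_seed);
	act = &c->set[c->active];
	act->count++;
	act->use--;	/* may go negative while the passive set drains */
}
```

Because a set is only reseeded when it is empty, existing entries never have to be rehashed, and an attacker who keeps one set busy can no longer block reseeding of the other.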
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing tables and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
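The idiom named in this commit wraps a multi-statement macro body in a loop that runs exactly once, so the macro expands to a single statement and composes safely with if/else. A minimal sketch follows; SWAP_INTS and sort_pair are hypothetical names chosen for illustration and are not taken from the header this log describes.

```c
/*
 * Hypothetical multi-statement macro, for illustration only.
 * The do { } while (0) wrapper turns the body into one statement,
 * so the ';' written at the call site terminates it cleanly.
 */
#define SWAP_INTS(a, b) do {		\
	int swap_tmp_ = (a);		\
	(a) = (b);			\
	(b) = swap_tmp_;		\
} while (/* CONSTCOND */ 0)

/*
 * Put *lo and *hi in ascending order.
 * Returns 1 if the values were swapped, 0 if they were already ordered.
 */
static int
sort_pair(int *lo, int *hi)
{
	/*
	 * With a bare { } block instead of do/while, the ';' after the
	 * macro call would end the if statement and orphan the else,
	 * which is a compile error.
	 */
	if (*lo > *hi)
		SWAP_INTS(*lo, *hi);
	else
		return 0;
	return 1;
}
```

The /* CONSTCOND */ comment is a lint annotation marking the constant condition as intentional.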
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.170 |
|
28-Aug-2023 |
bluhm |
Introduce reference counting for TCP syn cache entries.
The syn_cache_reaper() is a hack to serialize timeouts. Unfortunately it has a race and panics sometimes with pool_do_get: syncache free list modified. Add a reference counter for timeout and list of syn cache entries. Currently list refcount is not strictly necessary due to exclusive netlock, but will be needed when we continue unlocking.
Checking timeout_initialized() is not MP friendly, better do proper initialization during object allocation. Refcount in btrace helps to find leaks.
bug reported and fix tested by Peter J. Philipp OK claudio@
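The reference-counting approach described in this commit can be sketched in general terms as follows: each holder of an entry (for example a pending timeout and a list) owns one reference, and the entry is freed only when the last reference is dropped. This is a single-threaded illustration under stated assumptions; struct sc_entry and the sc_entry_* functions are hypothetical names, and the kernel itself would rely on its own refcount primitives and locking rather than a plain counter.

```c
#include <stdlib.h>

/* Hypothetical cache entry; not the kernel's struct syn_cache. */
struct sc_entry {
	unsigned int	refs;		/* number of current holders */
	int		payload;	/* stand-in for the real state */
};

/* Allocate an entry holding one reference for the caller. */
static struct sc_entry *
sc_entry_alloc(int payload)
{
	struct sc_entry *e = malloc(sizeof(*e));

	if (e == NULL)
		return NULL;
	e->refs = 1;
	e->payload = payload;
	return e;
}

/* Take an additional reference, e.g. when arming a timeout. */
static struct sc_entry *
sc_entry_ref(struct sc_entry *e)
{
	e->refs++;
	return e;
}

/*
 * Drop one reference.  Returns 1 if this was the last reference and
 * the entry was freed, 0 if other holders remain.
 */
static int
sc_entry_rele(struct sc_entry *e)
{
	if (--e->refs > 0)
		return 0;
	free(e);
	return 1;
}
```

Because initialization happens once at allocation time, no holder needs to test afterwards whether the object was set up, which matches the commit's remark about preferring proper initialization during object allocation.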
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
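The figures quoted in this commit can be checked with a little arithmetic. The sketch below computes how long a millisecond counter of a given width runs before wrapping, and shows how truncating a 64-bit counter to the 32-bit timestamp field still yields correct differences across a single wrap. All function names are hypothetical and chosen for illustration only.

```c
#include <stdint.h>

#define MS_PER_DAY	(1000.0 * 60 * 60 * 24)

/* Days until an unsigned millisecond counter of `bits` width wraps. */
static double
wrap_days(unsigned int bits)
{
	double span_ms = 1.0;
	unsigned int i;

	for (i = 0; i < bits; i++)
		span_ms *= 2.0;
	return span_ms / MS_PER_DAY;
}

/* The 32-bit timestamp field carries only the low half of the counter. */
static uint32_t
timestamp_value(uint64_t now_ms)
{
	return (uint32_t)now_ms;
}

/*
 * Elapsed milliseconds between two 32-bit timestamps.  Unsigned
 * subtraction is modulo 2^32, so the result stays correct as long as
 * less than one full wrap separates the two values.
 */
static uint32_t
timestamp_elapsed(uint32_t later, uint32_t earlier)
{
	return (uint32_t)(later - earlier);
}
```

Under these definitions wrap_days(32) comes out near 49.7 days and wrap_days(63) / 365.25 near 2.9*10^8 years, in line with the numbers in the commit message.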
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers, in such a case the window size had been fixed at 16KB, this limits throughput at very low on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
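The pattern this commit describes, a table of per-request handlers where an unsupported request is left NULL and the wrapper reports EOPNOTSUPP, can be sketched as below. This is a simplified userland illustration: struct x_usrreqs, x_bind() and the two example tables are hypothetical stand-ins, not the kernel's actual pr_usrreqs layout or handler signatures.

```c
#include <errno.h>
#include <stddef.h>

struct xsocket;		/* opaque in this sketch */

/* Hypothetical handler table; one function pointer per request. */
struct x_usrreqs {
	int	(*xru_bind)(struct xsocket *, const char *);
};

/*
 * Wrapper used by callers.  The NULL check lives here, so protocols
 * that do not support the request need no dummy handler.
 */
static int
x_bind(struct xsocket *so, const struct x_usrreqs *reqs, const char *addr)
{
	if (reqs->xru_bind == NULL)
		return EOPNOTSUPP;
	return (*reqs->xru_bind)(so, addr);
}

/* Example handler for a protocol that supports bind. */
static int
stream_bind(struct xsocket *so, const char *addr)
{
	(void)so;
	return addr == NULL ? EINVAL : 0;
}

/* One protocol implements the request, the other leaves it NULL. */
static const struct x_usrreqs stream_usrreqs = { .xru_bind = stream_bind };
static const struct x_usrreqs raw_usrreqs = { .xru_bind = NULL };
```

Keeping the check in the wrapper means adding a new request only touches the protocols that actually implement it.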
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
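The hazard behind this change can be illustrated with a small sketch. Left-shifting a signed int so that a 1 lands in the sign bit is undefined behavior in C, whereas the same shift on an unsigned constant is well defined. The constants below are illustrative assumptions (a base timer flag at bit 26 and a last timer index of 5), not values quoted from the header, and the names are hypothetical.

```c
/* Illustrative values, assumed for this sketch only. */
#define XF_TIMER_BASE	0x04000000U	/* flag bit of the first timer */
#define XT_LAST		5		/* index of the last timer */

/*
 * Flag bit for timer `idx`.  Because XF_TIMER_BASE carries the U
 * suffix, the shift is done in unsigned arithmetic and may legally
 * reach the top bit of a 32-bit word.
 */
static unsigned int
timer_flag(unsigned int idx)
{
	return XF_TIMER_BASE << idx;
}

/* Nonzero if `flag` occupies the top bit of a 32-bit word. */
static int
uses_top_bit(unsigned int flag)
{
	return (flag & 0x80000000U) != 0;
}
```

With these assumed values, timer_flag(XT_LAST) is 0x80000000; written without the U suffix, the same expression would shift into the sign bit of an int, which is the kind of issue an undefined-behavior sanitizer reports.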
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
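A rough sketch of the per-timer flag idea follows. All names (`ex_tcpcb`, `ex_timer_arm`, and so on) are hypothetical and the locking is omitted; the point is only that the handler re-checks an "armed" bit after it has possibly slept, instead of trusting that deleting the timeout stopped it.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_NTIMERS	4	/* hypothetical number of timers */

struct ex_tcpcb {
	unsigned int armed;	/* one bit per timer, set while armed */
};

static void
ex_timer_arm(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	tp->armed |= 1U << t;
}

static void
ex_timer_cancel(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	tp->armed &= ~(1U << t);
}

/*
 * Body of a timer handler, entered after the lock has been obtained.
 * Returns true if the timer was still armed and its work should run.
 */
static bool
ex_timer_fire(struct ex_tcpcb *tp, int t)
{
	assert(t >= 0 && t < EX_NTIMERS);
	if ((tp->armed & (1U << t)) == 0)
		return false;	/* canceled while the handler was waiting */
	tp->armed &= ~(1U << t);
	return true;
}
```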
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
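The role switch described above might look roughly like this. The structure layout and the function name `ex_syn_cache_switch` are assumptions made for illustration; the kernel's real data structures differ, and the new seed would come from a random source rather than a parameter.

```c
#include <assert.h>

struct ex_syn_set {
	int count;		/* entries currently in this set */
	int use;		/* insertions since the last reseed */
	unsigned int seed;	/* hash seed for this set */
};

struct ex_syn_cache {
	struct ex_syn_set set[2];
	int active;		/* index of the set receiving new entries */
};

/*
 * Switch roles if the passive set has drained and the active set has
 * been used at least use_limit times.  Returns 1 if a switch happened.
 */
static int
ex_syn_cache_switch(struct ex_syn_cache *sc, int use_limit,
    unsigned int new_seed)
{
	struct ex_syn_set *act = &sc->set[sc->active];
	struct ex_syn_set *pas = &sc->set[!sc->active];

	assert(use_limit > 0);
	if (pas->count != 0 || act->use < use_limit)
		return 0;
	/* The empty passive set becomes active with a fresh seed. */
	pas->seed = new_seed;
	pas->use = 0;
	sc->active = !sc->active;
	return 1;
}
```

Because only an empty set is ever reseeded, no existing entry has to be rehashed, and an attacker who keeps the old set non-empty no longer blocks the reseed of the other one.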
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
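For readers unfamiliar with the idiom, here is a small sketch of why multi-statement macros are wrapped this way. `EX_SWAP_INT` and `ex_order` are hypothetical examples, not macros from this file. Wrapped in `do { } while (0)`, the macro expands to a single statement, so it can be followed by a semicolon inside an unbraced `if`/`else` without breaking the `else`.

```c
#include <assert.h>

/* Hypothetical multi-statement macro using the do/while(0) wrapper. */
#define EX_SWAP_INT(a, b) do {						\
	int ex_tmp_ = (a);						\
	(a) = (b);							\
	(b) = ex_tmp_;							\
} while (/* CONSTCOND */ 0)

/* Put *lo and *hi in ascending order; returns 1 if they were swapped. */
static int
ex_order(int *lo, int *hi)
{
	assert(lo && hi);
	if (*lo > *hi)
		EX_SWAP_INT(*lo, *hi);	/* one statement, so else still binds */
	else
		return 0;
	return 1;
}
```

With a bare `{ ... }` block instead, the trailing semicolon would terminate the `if` and the `else` would fail to compile.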
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, its the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.169 |
|
06-Jul-2023 |
bluhm |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
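The figures in this commit message can be checked with a short sketch. The helper names (`ex_wrap_days`, `ex_ts_val`, `ex_ts_elapsed`) are hypothetical and not kernel functions; they only illustrate the arithmetic and the use of the low 32 bits for the timestamp option.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MS_PER_DAY	(1000ULL * 60 * 60 * 24)

/* Whole days until a millisecond counter of the given width wraps. */
static uint64_t
ex_wrap_days(unsigned int bits)
{
	assert(bits > 0 && bits < 64);
	return (1ULL << bits) / EX_MS_PER_DAY;
}

/* The 32 bit timestamp option carries only the low half of the counter. */
static uint32_t
ex_ts_val(uint64_t now_ms)
{
	return (uint32_t)now_ms;
}

/* Elapsed milliseconds between two 32 bit timestamps across a wrap. */
static uint32_t
ex_ts_elapsed(uint32_t now, uint32_t then)
{
	return now - then;	/* unsigned subtraction is modulo 2^32 */
}
```

A 32 bit counter wraps after 2^32 ms, which is about 49.7 days; a 63 bit range lasts roughly 2.9*10^8 years, matching the numbers quoted in the log.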
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
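As a rough illustration of the software fallback, the number and size of the packets produced when one large TCP payload is chopped to the maximum segment size can be computed as below. The function names are hypothetical, and the sketch ignores headers and the actual mbuf handling.

```c
#include <assert.h>

/* Number of MSS-sized segments needed to carry payload_len bytes. */
static unsigned int
ex_tso_segments(unsigned int payload_len, unsigned int mss)
{
	assert(mss > 0);
	return (payload_len + mss - 1) / mss;	/* round up */
}

/* Payload length of the final segment; 0 if there is no payload. */
static unsigned int
ex_tso_last_len(unsigned int payload_len, unsigned int mss)
{
	unsigned int rem;

	assert(mss > 0);
	if (payload_len == 0)
		return 0;
	rem = payload_len % mss;
	return rem != 0 ? rem : mss;
}
```

For instance, a 65000 byte payload with an MSS of 1460 yields 45 packets, the last one carrying 760 bytes.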
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (e.g. Windows) or dropped by some firewalls/routers; in such cases the window size had been fixed at 16KB, which limits throughput to a very low value on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but it is actually required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.168 |
|
02-Jul-2023 |
bluhm |
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers, in such a case the window size had been fixed at 16KB, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in the pru_abort() return value. We call it only from soabort() which is a dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest in the pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but it is actually required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.167 |
|
23-May-2023 |
jan |
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
|
#
1.166 |
|
18-May-2023 |
jan |
Use TSO offloading in ix(4).
With a lot of tweaks, improvements and testing from bluhm.
Thanks to Hrvoje Popovski from the University of Zagreb for his great testing effort to make this happen.
ok bluhm
|
#
1.165 |
|
15-May-2023 |
bluhm |
Implement the TCP/IP layer for hardware TCP segmentation offload. If the driver of a network interface claims to support TSO, do not chop the packet in software, but pass it down to the interface layer. Precalculate parts of the pseudo header checksum, but without the packet length. The length of all generated smaller packets is not known yet. Driver and hardware will use the mbuf packet header field ph_mss to calculate it and update checksum. Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware might support only one protocol family. The old flag IFXF_TSO is only relevant for large receive offload. It is misnamed, but keep that for now. Note that drivers do not set TSO capabilities yet. Also the ifconfig flags and pseudo interfaces capabilities will be done separately. So this commit should not change behavior. heavily based on the work from jan@; OK sashan@
|
#
1.164 |
|
10-May-2023 |
bluhm |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers, in such a case the window size had been fixed at 16KB, which limits throughput to a very low rate on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
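The sign-bit problem described in revision 1.137 can be sketched as follows. The names and constants below are illustrative assumptions, not copied from tcp_var.h: a base flag of 0x04000000 shifted left by a timer index of 5 lands on bit 31. With a signed int constant that shift touches the sign bit, which is undefined behaviour in C; with an unsigned constant it is well defined.

```c
#include <assert.h>

/* Hypothetical values chosen so that the shift reaches bit 31. */
#define EX_TF_TIMER	0x04000000U	/* unsigned base flag */
#define EX_TIMER_DELACK	5		/* highest timer index in this sketch */

/*
 * Return the flag bit for a timer index.  Because EX_TF_TIMER is
 * unsigned, shifting into bit 31 is well defined on a 32-bit u_int.
 */
static unsigned int
ex_timer_flag(int timer)
{
	assert(timer >= 0 && timer <= EX_TIMER_DELACK);
	return EX_TF_TIMER << timer;
}
```

The sketch assumes a 32-bit unsigned int, which holds on the platforms OpenBSD supports.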
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
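The armed-flag approach from revision 1.130 can be sketched in a simplified, single-threaded form. All names below are hypothetical and locking is omitted; the point is only that the handler checks a per-timer flag, so a timeout that fires after being cancelled does nothing.

```c
#include <assert.h>

#define EX_NTIMERS 6	/* illustrative number of timers */

struct ex_timers {
	unsigned int armed;		/* one bit per armed timer */
	int fired[EX_NTIMERS];		/* how often each handler did real work */
};

static void
ex_timer_arm(struct ex_timers *t, int n)
{
	assert(n >= 0 && n < EX_NTIMERS);
	t->armed |= 1U << n;
}

static void
ex_timer_cancel(struct ex_timers *t, int n)
{
	assert(n >= 0 && n < EX_NTIMERS);
	t->armed &= ~(1U << n);
}

/*
 * Timeout handler.  It may run late, after the timer was cancelled,
 * so it re-checks the flag before acting.  Returns 1 if work was done.
 */
static int
ex_timer_handler(struct ex_timers *t, int n)
{
	assert(n >= 0 && n < EX_NTIMERS);
	if ((t->armed & (1U << n)) == 0)
		return 0;		/* cancelled in the meantime */
	t->armed &= ~(1U << n);		/* one-shot: disarm before acting */
	t->fired[n]++;
	return 1;
}
```

In the kernel the flag check would happen after the handler has acquired the net lock, which is the point at which a concurrent cancel becomes visible.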
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
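The active/passive scheme from revision 1.111 can be outlined with a small sketch. The structure and function names are hypothetical, and the real kernel code differs in detail (for instance in when exactly the new seed is drawn); only the switching condition, passive set empty and active set used up, follows the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define EX_USE_LIMIT 100000	/* uses of the active set before a switch */

struct ex_syn_set {
	int count;		/* entries currently in this set */
	int use;		/* remaining uses before a switch is wanted */
	uint32_t seed;		/* hash seed for this set */
};

struct ex_syn_cache {
	struct ex_syn_set set[2];
	int active;		/* index of the set receiving new entries */
};

/* Record one insertion into the active set. */
static void
ex_syn_insert(struct ex_syn_cache *sc)
{
	struct ex_syn_set *act = &sc->set[sc->active];

	act->count++;
	if (act->use > 0)
		act->use--;
}

/*
 * Switch roles if the passive set has drained and the active set has
 * been used up.  The new active set gets a fresh seed supplied by the
 * caller.  Returns 1 if a switch happened.
 */
static int
ex_syn_maybe_switch(struct ex_syn_cache *sc, uint32_t new_seed)
{
	struct ex_syn_set *act, *pas;

	assert(sc->active == 0 || sc->active == 1);
	act = &sc->set[sc->active];
	pas = &sc->set[!sc->active];
	if (pas->count != 0 || act->use > 0)
		return 0;
	pas->seed = new_seed;
	pas->use = EX_USE_LIMIT;
	sc->active = !sc->active;
	return 1;
}
```

Because old entries stay in the set that has just become passive and simply time out there, an attacker who keeps the active set non-empty can no longer postpone reseeding indefinitely.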
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
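As a reminder of why the idiom matters, here is a small sketch with a hypothetical macro, `EX_SWAP_INT`, which is not taken from the kernel sources. Wrapping a multi-statement macro in `do { } while (0)` makes it expand to exactly one statement, so it stays correct inside an unbraced `if`/`else`.

```c
/* Hypothetical macro for illustration; it is not from the kernel sources. */
#define EX_SWAP_INT(a, b) do {			\
	int ex_tmp_ = (a);			\
	(a) = (b);				\
	(b) = ex_tmp_;				\
} while (/* CONSTCOND */ 0)

/* Order two ints; returns 1 if they were swapped, 0 otherwise. */
static int
ex_sort_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		EX_SWAP_INT(*lo, *hi);	/* one statement, so the else still pairs */
	else
		return 0;
	return 1;
}
```

With a bare `{ ... }` block instead, the semicolon after the macro call would terminate the `if` and the `else` would fail to compile.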
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.163 |
|
14-Mar-2023 |
yasuoka |
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit of tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
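The unit split described here (milliseconds internally, seconds at the sysctl boundary) might look like the sketch below. All names, the default value, and the range check are assumptions made for illustration; they are not taken from tcp_sysctl().

```c
#include <limits.h>

/* Hypothetical internal variable, stored in milliseconds. */
static int ex_keepidle_ms = 7200 * 1000;

/* Value as it would be reported to userland, in seconds. */
static int
ex_keepidle_get_secs(void)
{
	return ex_keepidle_ms / 1000;
}

/*
 * Accept a value in seconds and store it in milliseconds.
 * The bounds are an assumption; they only guard against overflow
 * and nonsensical values.  Returns 0 on success, -1 on a bad value.
 */
static int
ex_keepidle_set_secs(int secs)
{
	if (secs < 1 || secs > INT_MAX / 1000)
		return -1;
	ex_keepidle_ms = secs * 1000;
	return 0;
}
```

Doing the conversion in one place keeps every internal comparison against the millisecond clock free of scaling factors.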
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers; in such cases the window size had been fixed at 16KB, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
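The wrapper pattern described in this entry can be sketched as below. The `ex_` types and handlers are hypothetical stand-ins with simplified signatures, not the kernel's `struct socket` or `struct pr_usrreqs`; only the NULL check that yields EOPNOTSUPP follows the commit message.

```c
#include <errno.h>
#include <stddef.h>

struct ex_socket;

/* Simplified handler table; the real structure has many more members. */
struct ex_usrreqs {
	int	(*pru_bind)(struct ex_socket *, int);
};

struct ex_socket {
	const struct ex_usrreqs	*so_usrreqs;
	int			 so_port;	/* illustrative state only */
};

/* Wrapper: protocols without a bind handler report EOPNOTSUPP. */
static int
ex_pru_bind(struct ex_socket *so, int port)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_usrreqs->pru_bind)(so, port);
}

/* Example handler for a protocol that supports bind. */
static int
ex_tcp_bind(struct ex_socket *so, int port)
{
	so->so_port = port;
	return 0;
}

static const struct ex_usrreqs ex_tcp_usrreqs = { .pru_bind = ex_tcp_bind };
static const struct ex_usrreqs ex_nobind_usrreqs = { .pru_bind = NULL };
```

Centralizing the check in the wrapper means a protocol opts out simply by leaving the member NULL, without a stub function per protocol.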
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
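A short sketch of the hazard, using hypothetical constants (`EX_TF_TIMER`, `EX_TIMER_LAST`) chosen so that the highest timer index lands on bit 31; the kernel's actual values are not reproduced here. With an unsigned base constant the shift is well defined, whereas shifting a signed `int` into the sign bit is undefined behaviour in C.

```c
/* Hypothetical values: base flag at bit 26, six timers (0..5). */
#define EX_TF_TIMER	0x04000000U	/* unsigned, so shifts are well defined */
#define EX_TIMER_LAST	5

/* Flag bit for a given timer index; the result fits in an unsigned int. */
static unsigned int
ex_timer_flag(int timer)
{
	return EX_TF_TIMER << timer;
}
```

Here `ex_timer_flag(EX_TIMER_LAST)` is `0x80000000U`, the top bit of a 32-bit `unsigned int`; had the base constant been a plain `0x04000000`, the same shift would overflow a signed `int`.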
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok derradt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
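The per-timer flag idea can be sketched as follows. This is a simplified userland model with hypothetical names; it has no real timeouts or locks, and only shows the check a handler would make after it has (possibly) slept waiting for the lock.

```c
/* Illustrative control block; not the kernel's struct tcpcb. */
struct ex_tcpcb {
	unsigned int	t_flags;	/* one "armed" bit per timer */
	int		t_fired;	/* how often a handler did real work */
};

#define EX_TF_TIMER	0x1U		/* hypothetical base bit */

static void
ex_timer_arm(struct ex_tcpcb *tp, int timer)
{
	tp->t_flags |= EX_TF_TIMER << timer;
}

static void
ex_timer_disarm(struct ex_tcpcb *tp, int timer)
{
	tp->t_flags &= ~(EX_TF_TIMER << timer);
}

/*
 * Timer handler body.  It may run after the timer was canceled,
 * so it checks the flag first.  Returns 1 if it did work, 0 if the
 * timer had been disarmed in the meantime.
 */
static int
ex_timer_run(struct ex_tcpcb *tp, int timer)
{
	if ((tp->t_flags & (EX_TF_TIMER << timer)) == 0)
		return 0;
	ex_timer_disarm(tp, timer);
	tp->t_fired++;
	return 1;
}
```

The flag makes cancellation reliable even when the underlying timeout has already been dequeued and is merely waiting to run.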
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
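The role switch described above can be sketched as below. This is an illustrative model only: the `sk_` names, the struct layout, and passing the new seed in from the caller are assumptions for the example, and the real cache stores entries in hash buckets rather than a bare counter. Only the 100000 figure comes from the commit message.

```c
#include <stdint.h>

#define SK_USE_LIMIT	100000	/* figure quoted in the commit message */

struct sk_set {
	uint32_t seed;		/* seed of this set's hash function */
	int count;		/* entries currently stored */
	int use;		/* insertions left before a switch is wanted */
};

struct sk_cache {
	struct sk_set set[2];
	int active;		/* index of the set receiving new entries */
};

static void
sk_cache_init(struct sk_cache *c, uint32_t seed)
{
	c->set[0] = (struct sk_set){ .seed = seed, .count = 0,
	    .use = SK_USE_LIMIT };
	c->set[1] = (struct sk_set){ .seed = 0, .count = 0, .use = 0 };
	c->active = 0;
}

/* Account for one new entry; returns 1 if the sets switched roles. */
static int
sk_cache_insert(struct sk_cache *c, uint32_t new_seed)
{
	struct sk_set *act = &c->set[c->active];
	struct sk_set *pas = &c->set[!c->active];
	int swapped = 0;

	/* Switch only when the passive set has fully idled out. */
	if (act->use <= 0 && pas->count == 0) {
		pas->seed = new_seed;	/* reseed the empty set */
		pas->use = SK_USE_LIMIT;
		c->active = !c->active;
		act = pas;
		swapped = 1;
	}
	act->count++;
	act->use--;
	return swapped;
}

/* An entry in set idx completed its handshake or timed out. */
static void
sk_cache_remove(struct sk_cache *c, int idx)
{
	if (c->set[idx].count > 0)
		c->set[idx].count--;
}
```

Because new entries only ever go to the active set, an attacker who keeps sending SYNs cannot stop the passive set from draining, and so cannot postpone the reseed indefinitely.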
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
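For readers unfamiliar with the idiom named above, here is a small standalone illustration. `SK_SWAP` and `sk_order` are made-up names for this example and are not macros from the header. Wrapping a multi-statement macro in `do { } while (0)` makes it behave as a single statement, so it can sit in an unbraced `if` that is followed by `else`.

```c
/*
 * Multi-statement macro wrapped so that "SK_SWAP(a, b);" is exactly one
 * statement.  With a bare "{ ... }" body, the caller's trailing
 * semicolon would end the if statement and orphan the else below.
 */
#define SK_SWAP(a, b) do {						\
	int sk_tmp_ = (a);						\
	(a) = (b);							\
	(b) = sk_tmp_;							\
} while (/* CONSTCOND */ 0)

/* Put *lo and *hi in ascending order; returns 1 if they were swapped. */
static int
sk_order(int *lo, int *hi)
{
	if (*lo > *hi)
		SK_SWAP(*lo, *hi);
	else
		return 0;
	return 1;
}
```

The `/* CONSTCOND */` comment is a lint annotation marking the constant loop condition as intentional.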
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
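A per-second limit of this kind can be sketched with a simple window counter. This is only an illustration under assumptions: the `sk_` names are hypothetical, the clock is passed in by the caller, and this version permits up to `limit` packets per second, which may differ in detail from the kernel's behaviour.

```c
#include <stdbool.h>
#include <time.h>

struct sk_ratelimit {
	time_t window;		/* second the counter belongs to */
	int sent;		/* packets allowed so far in that second */
};

/*
 * Decide whether one more RST may be sent at time "now".
 * The counter restarts whenever the wall-clock second changes.
 */
static bool
sk_rst_allowed(struct sk_ratelimit *rl, time_t now, int limit)
{
	if (now != rl->window) {
		rl->window = now;
		rl->sent = 0;
	}
	if (rl->sent >= limit)
		return false;	/* over budget: drop instead of replying */
	rl->sent++;
	return true;
}
```

A caller would consult `sk_rst_allowed()` just before building the reset and silently skip the reply when it returns false.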
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
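A port bitmask of the kind mentioned above might look like the following sketch. The `sk_` names, the 1024-port range, and the 32-bit word size are assumptions chosen for the example, not the kernel's actual layout or sysctl format.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_NPORTS	1024	/* assumed range covered by the mask */

struct sk_portmask {
	uint32_t bits[SK_NPORTS / 32];	/* bit set = never pick dynamically */
};

static void
sk_port_mark(struct sk_portmask *m, unsigned int port)
{
	if (port < SK_NPORTS)
		m->bits[port / 32] |= UINT32_C(1) << (port % 32);
}

static void
sk_port_clear(struct sk_portmask *m, unsigned int port)
{
	if (port < SK_NPORTS)
		m->bits[port / 32] &= ~(UINT32_C(1) << (port % 32));
}

static bool
sk_port_is_bad(const struct sk_portmask *m, unsigned int port)
{
	return port < SK_NPORTS &&
	    ((m->bits[port / 32] >> (port % 32)) & 1) != 0;
}

/* First port in [lo, hi] that is not masked out, or -1 if none. */
static int
sk_port_pick(const struct sk_portmask *m, unsigned int lo, unsigned int hi)
{
	unsigned int p;

	for (p = lo; p <= hi && p < SK_NPORTS; p++)
		if (!sk_port_is_bad(m, p))
			return (int)p;
	return -1;
}
```

Compared with a fixed list of port numbers, a bitmask makes the membership test constant-time and lets an administrator add or remove individual ports at run time.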
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.162 |
|
13-Dec-2022 |
claudio |
In tcp_now() switch from getnsecuptime() to getnsecruntime()
The tcp timer is not supposed to run during suspend but getnsecuptime() does and because of this sessions with TCP_KEEPALIVE on reset after a few hours of sleep.
Problem noticed by mlarkin@, investigation by yasuoka@ additional testing jca@ OK yasuoka@ jca@ cheloha@
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (eg. Windows) or dropped by some firewalls/routers, in such a case the window size had been fixed at 16KB, which limits throughput to a very low level on high latency networks. Also replace "tcp_now" from 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest on pru_abort() return value. We call it only from soabort() which is dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove existing code for all others, it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect)().
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
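The NULL-handler convention described above can be sketched as follows. This is a minimal illustration, not the kernel code: `pr_usrreqs_sketch`, `pru_bind_sketch()` and `demo_bind()` are hypothetical names, and the handler signature is simplified.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct socket;				/* opaque for this sketch */

struct pr_usrreqs_sketch {
	/* Left NULL by protocols that do not support bind. */
	int	(*pru_bind)(struct socket *, const void *);
};

/* Wrapper: the NULL check lives here, so callers never test the pointer. */
static int
pru_bind_sketch(const struct pr_usrreqs_sketch *ur, struct socket *so,
    const void *nam)
{
	if (ur->pru_bind == NULL)
		return (EOPNOTSUPP);
	return ((*ur->pru_bind)(so, nam));
}

/* Example handler for a protocol that does support bind. */
static int
demo_bind(struct socket *so, const void *nam)
{
	(void)so;
	(void)nam;
	return (0);
}
```

A protocol that supports the request fills in the pointer (for example `{ demo_bind }`); one that does not simply leaves it NULL and gets EOPNOTSUPP from the wrapper.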
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
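A small sketch of why the unsigned suffix matters. The constant values below are illustrative assumptions, not copied from the header, and the example assumes a 32-bit int. With a plain `0x04000000` (type int), shifting left by 5 would move a bit into the sign position, which is undefined behaviour in C; with the `U` suffix the shift is well defined.

```c
/* Illustrative values, assumed for this sketch only. */
#define TF_TIMER_SKETCH		0x04000000U	/* first timer flag, unsigned */
#define TCPT_DELACK_SKETCH	5		/* index of the delayed ACK timer */

/* Compute the flag bit for a given timer index without touching a sign bit. */
static unsigned int
timer_flag_sketch(unsigned int timer)
{
	return (TF_TIMER_SKETCH << timer);
}
```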
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
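The per-timer flag idea can be sketched like this. All names are hypothetical and locking is omitted; the point is only that a handler which runs late checks whether its timer is still armed before doing any work.

```c
#define NTIMERS_SKETCH		3
#define TMR_FLAG_SKETCH(i)	(0x1U << (i))

/* Hypothetical control block: one "armed" bit per timer. */
struct tcb_sketch {
	unsigned int	t_flags;
	int		fired[NTIMERS_SKETCH];
};

static void
timer_arm_sketch(struct tcb_sketch *tp, int i)
{
	tp->t_flags |= TMR_FLAG_SKETCH(i);
}

static void
timer_cancel_sketch(struct tcb_sketch *tp, int i)
{
	/* Deleting the timeout alone is not enough; clear the flag too. */
	tp->t_flags &= ~TMR_FLAG_SKETCH(i);
}

static void
timer_handler_sketch(struct tcb_sketch *tp, int i)
{
	/* The handler may have slept for a lock and been canceled meanwhile. */
	if ((tp->t_flags & TMR_FLAG_SKETCH(i)) == 0)
		return;
	tp->t_flags &= ~TMR_FLAG_SKETCH(i);
	tp->fired[i]++;
}
```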
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random data. tedu@ agrees; OK mpi@
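The role switch between the two syn caches can be sketched as below. This is a simplified model under stated assumptions: the struct and function names are hypothetical, entries are only counted rather than stored, and the new seed is passed in by the caller where a kernel would draw random data.

```c
#include <stdint.h>

/* Hypothetical model of one syn cache set. */
struct syn_set_sketch {
	int		count;	/* entries currently stored */
	int		use;	/* insertions left before a reseed is wanted */
	uint32_t	seed;	/* hash seed */
};

struct syn_cache_sketch {
	struct syn_set_sketch	set[2];
	int			active;	/* index of the set taking new entries */
	int			limit;	/* use limit, e.g. 100000 */
};

static void
syn_cache_init_sketch(struct syn_cache_sketch *sc, int limit, uint32_t seed)
{
	sc->set[0] = (struct syn_set_sketch){ 0, limit, seed };
	sc->set[1] = (struct syn_set_sketch){ 0, limit, seed };
	sc->active = 0;
	sc->limit = limit;
}

/* Insert one entry; returns 1 if the sets switched roles first. */
static int
syn_cache_insert_sketch(struct syn_cache_sketch *sc, uint32_t new_seed)
{
	struct syn_set_sketch *act = &sc->set[sc->active];
	struct syn_set_sketch *pas = &sc->set[!sc->active];
	int switched = 0;

	/* Switch only when the active set is used up and the passive one idle. */
	if (act->use <= 0 && pas->count == 0) {
		sc->active = !sc->active;
		act = &sc->set[sc->active];
		act->seed = new_seed;
		act->use = sc->limit;
		switched = 1;
	}
	act->count++;
	act->use--;
	return (switched);
}

/* Remove one entry from the given set (completed or timed-out handshake). */
static void
syn_cache_remove_sketch(struct syn_cache_sketch *sc, int idx)
{
	if (sc->set[idx].count > 0)
		sc->set[idx].count--;
}
```

Because new entries only ever go into the active set, an attacker who keeps sending SYNs cannot keep the passive set from draining, so a reseed eventually happens.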
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; still missing are pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
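A short illustration of the idiom, using a hypothetical `SWAP_INT` macro that is not taken from the header. Wrapping a multi-statement macro in `do { } while (0)` makes it behave like a single statement, so it can be followed by a semicolon and used safely in an unbraced `if`/`else`.

```c
/* Hypothetical multi-statement macro written with the idiom. */
#define SWAP_INT(a, b) do {						\
	int tmp_ = (a);							\
	(a) = (b);							\
	(b) = tmp_;							\
} while (/* CONSTCOND */ 0)

/* Order *x and *y so that *x >= *y; returns 1 if a swap happened. */
static int
order_desc(int *x, int *y)
{
	if (*x < *y)
		SWAP_INT(*x, *y);	/* expands to exactly one statement */
	else
		return (0);
	return (1);
}
```

Had the macro been a bare `{ ... }` block, the semicolon after `SWAP_INT(*x, *y)` would end the `if` statement and the following `else` would not compile.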
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
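A per-second limiter of this kind might look like the sketch below. It is an assumption-laden illustration, not the kernel routine: the names are hypothetical, the current time is passed in as whole seconds, and the sketch permits at most `limit` events per second, which may differ from the exact boundary the kernel uses.

```c
/* Hypothetical limiter state: the second being counted and events seen. */
struct pps_sketch {
	long	last_sec;
	int	count;
};

/* Returns 1 if another event (e.g. an outbound RST) may be sent now. */
static int
pps_allow_sketch(struct pps_sketch *p, long now_sec, int limit)
{
	/* A new second starts a fresh budget. */
	if (now_sec != p->last_sec) {
		p->last_sec = now_sec;
		p->count = 0;
	}
	if (p->count >= limit)
		return (0);
	p->count++;
	return (1);
}
```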
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, its the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.161 |
|
07-Nov-2022 |
yasuoka |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. The timestamp option is disabled on some OSs (e.g. Windows) or dropped by some firewalls/routers; in such cases the window size had been fixed at 16KB, which limits throughput to a very low value on high latency networks. Also change "tcp_now" from a 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in the pru_abort() return value. We call it only from soabort(), which is a dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove the existing code for all others; it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest in the pru_rcvd() return value.
Drop the "pru_rcvd != NULL" check within the pru_rcvd() wrapper. We only call it if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return an EINVAL error for the PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but it is actually required only for tcp(4) and unix(4) sockets, so it should be optional. However, the others will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on a PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect)().
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
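The sign-bit hazard described above can be sketched in a few lines of C. The flag value and function name below are hypothetical and are not the definitions from tcp_var.h; the sketch only shows why an unsigned constant keeps the shift well defined when the result reaches bit 31, assuming a 32-bit unsigned int.

```c
/*
 * Hypothetical base value for a per-timer flag, for illustration only.
 * The U suffix makes the constant an unsigned int, so shifting a one
 * into bit 31 is well defined. With a plain (signed) int constant the
 * same shift would overflow into the sign bit, which is undefined
 * behaviour in C.
 */
#define EX_TF_TIMER_BASE	0x10000000U

/* Return the flag bit for timer number `timer' (0..3 in this sketch). */
static unsigned int
ex_timer_flag(unsigned int timer)
{
	return EX_TF_TIMER_BASE << timer;
}
```

With these example values, timer number 3 lands exactly on bit 31, which is the case an undefined-behaviour sanitizer would flag for a signed constant.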
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
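A minimal sketch of the role switch described in this entry, assuming a simplified model in which each set only tracks an entry count, a remaining-use budget and a seed. All names here (`ex_syn_set`, `ex_syn_cache`, `ex_insert`, `EX_USE_LIMIT`) are hypothetical and are not taken from tcp_input.c; the real code also hashes, times out and removes entries.

```c
#include <stdint.h>

#define EX_USE_LIMIT	100000	/* uses allowed before a switch is wanted */

struct ex_syn_set {
	int		count;	/* entries currently stored in this set */
	int		use;	/* remaining uses before a switch is wanted */
	uint32_t	seed;	/* seed of the hash function for this set */
};

struct ex_syn_cache {
	struct ex_syn_set	set[2];
	int			active;	/* index of the set receiving new entries */
};

/*
 * Account for one new entry. If the active set has used up its budget
 * and the passive set has drained completely, the two sets swap roles
 * and the newly active set gets a fresh seed. Returns 1 if a switch
 * happened, 0 otherwise.
 */
static int
ex_insert(struct ex_syn_cache *sc, uint32_t new_seed)
{
	struct ex_syn_set *act = &sc->set[sc->active];
	struct ex_syn_set *pas = &sc->set[!sc->active];
	int switched = 0;

	if (act->use <= 0 && pas->count == 0) {
		sc->active = !sc->active;
		act = &sc->set[sc->active];
		act->seed = new_seed;
		act->use = EX_USE_LIMIT;
		switched = 1;
	}
	act->count++;
	act->use--;
	return switched;
}
```

In this model the old active set keeps its entries after the switch and simply stops receiving new ones, so an attacker who keeps it non-empty can delay only the next switch, not the reseed that has already happened.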
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
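For readers unfamiliar with the idiom named in this entry, here is a small sketch. The macro and function names are hypothetical and are not macros from tcp_var.h; the point is only that wrapping a multi-statement macro in `do { } while (0)` makes it behave like a single statement, so it is safe in an unbraced `if`/`else`.

```c
/*
 * Hypothetical two-statement macro. Without the do/while wrapper, the
 * trailing semicolon at the call site would end the `if' body early
 * and leave the `else' below without a matching `if'.
 */
#define EX_RESET(x, y) do {						\
	(x) = 0;							\
	(y) = 0;							\
} while (/* CONSTCOND */ 0)

/* Use the macro as the sole statement of an unbraced if branch. */
static int
ex_reset_if(int cond, int x, int y)
{
	if (cond)
		EX_RESET(x, y);
	else
		x = -1;
	return x + y;
}
```

The `/* CONSTCOND */` comment is a lint annotation saying the constant condition is intentional.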
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
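A per-second cap of this kind can be sketched as a fixed one-second window with a counter. This is a simplified illustration with hypothetical names (`ex_pps`, `ex_pps_allow`), not the kernel's actual rate-check routine, and it takes the current time in whole seconds as a parameter so it stays self-contained.

```c
/* State for one rate limiter: the current one-second window and its count. */
struct ex_pps {
	long	window;	/* second in which `count' was accumulated */
	int	count;	/* packets allowed so far in that second */
};

/*
 * Return 1 if another packet may be sent during second `now_sec'
 * without exceeding `max_pps' packets in that second, 0 otherwise.
 */
static int
ex_pps_allow(struct ex_pps *st, long now_sec, int max_pps)
{
	if (st->window != now_sec) {
		/* A new second has started: open a fresh window. */
		st->window = now_sec;
		st->count = 0;
	}
	if (st->count >= max_pps)
		return 0;
	st->count++;
	return 1;
}
```

A caller would check `ex_pps_allow()` before emitting each RST and silently drop the reply when it returns 0.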
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.160 |
|
17-Oct-2022 |
mvs |
Change pru_abort() return type to the type of void and make pru_abort() optional.
We have no interest in the pru_abort() return value. We call it only from soabort(), which is a dummy pru_abort() wrapper and has no return value.
Only the connection oriented sockets need to implement the (*pru_abort)() handler. Such sockets are tcp(4) and unix(4) sockets, so remove the existing code for all others; it is never called.
ok guenther@
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest in the pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return an EINVAL error for the PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to the `so_q' or `so_q0' queues of a listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT is used to destroy them on listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on a PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9) leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect)().
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support the request, leave the handler NULL. Do the NULL check within the corresponding pru_() wrapper and return EOPNOTSUPP in such a case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection, but this also provides a lot more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to avoid sharing data back to userland that may allow an attack on a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
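The sign-bit concern above can be illustrated with a minimal sketch. The names and values here (EX_TF_TIMER, EX_TIMER_MAX, ex_timer_flag) are hypothetical and chosen only so that the largest shift reaches bit 31; they are not taken from the header itself. With an unsigned constant the shift is well defined, whereas shifting a signed int into the sign bit is undefined behaviour in C.

```c
#include <assert.h>

/* Hypothetical values for illustration only, not the kernel's definitions. */
#define EX_TF_TIMER	0x04000000U	/* unsigned base flag for timer bits */
#define EX_TIMER_MAX	5		/* highest timer index; reaches bit 31 */

/*
 * Return the flag bit for a timer index.  Because EX_TF_TIMER is
 * unsigned, shifting it by EX_TIMER_MAX yields 0x80000000U without
 * touching a sign bit.
 */
static unsigned int
ex_timer_flag(unsigned int timer)
{
	assert(timer <= EX_TIMER_MAX);
	return EX_TF_TIMER << timer;
}
```

Storing the result in an unsigned field, as the commit message describes for t_flags, keeps the whole expression unsigned from constant to storage.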
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
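A minimal sketch of why the do { } while (0) idiom matters for multi-statement macros. EX_SWAP and ex_order_desc are hypothetical names used only for this illustration; they are not macros from the header. Wrapped this way, the macro expands to a single statement, so it can sit in an unbraced if/else and still take a trailing semicolon.

```c
/*
 * Hypothetical multi-statement macro.  The do/while wrapper makes the
 * expansion one statement; with bare braces instead, the semicolon
 * after EX_SWAP(...) would end the if and orphan the else below.
 */
#define EX_SWAP(a, b) do {					\
	int ex_tmp = (a);					\
	(a) = (b);						\
	(b) = ex_tmp;						\
} while (/* CONSTCOND */ 0)

/* Put the larger value in *x; return 1 if a swap happened, else 0. */
static int
ex_order_desc(int *x, int *y)
{
	if (*x < *y)
		EX_SWAP(*x, *y);
	else
		return 0;
	return 1;
}
```

The CONSTCOND comment is a lint hint that the constant loop condition is intentional.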
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug: when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(). the fix is adapted from FreeBSD: the effective mss is recorded after option negotiation in the 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlaid with a structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.159 |
|
03-Oct-2022 |
bluhm |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change the pru_rcvd() return type to void. We have no interest in the pru_rcvd() return value.
Drop the "pru_rcvd != NULL" check within the pru_rcvd() wrapper. We only call it if the socket's protocol has the PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return an EINVAL error for the PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to the `so_q' or `so_q0' queues of a listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT is used to destroy them on listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on a PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9) leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
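The NULL-handler convention described in this entry can be sketched roughly as follows. This is a minimal, self-contained illustration and not the kernel source: the `ex_` types, the reduced handler table, and the `so_bound` field are hypothetical stand-ins for the real socket and pr_usrreqs structures.

```c
#include <errno.h>
#include <stddef.h>

struct ex_socket;

/* Hypothetical, reduced handler table; the real one has many more members. */
struct ex_usrreqs {
	int (*pru_bind)(struct ex_socket *, const void *);
};

/* Hypothetical, reduced socket: only what the sketch needs. */
struct ex_socket {
	const struct ex_usrreqs *so_usrreqs;
	int so_bound;
};

/* Example handler for a protocol that supports bind. */
static int
ex_proto_bind(struct ex_socket *so, const void *addr)
{
	(void)addr;
	so->so_bound = 1;
	return 0;
}

/* One protocol provides a handler, the other leaves it NULL. */
static const struct ex_usrreqs ex_with_bind = { .pru_bind = ex_proto_bind };
static const struct ex_usrreqs ex_without_bind = { .pru_bind = NULL };

/* Wrapper: a NULL handler means the protocol does not support the request. */
static int
ex_pru_bind(struct ex_socket *so, const void *addr)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_usrreqs->pru_bind)(so, addr);
}
```

Keeping the check in the wrapper means protocols without bind support need no dummy handler of their own.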
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to avoid sharing data back to userland that may allow attacking a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
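The sign-bit problem mentioned here can be illustrated with a small sketch. The constant values below are assumptions picked so that the shift lands on bit 31; they are not quoted from the header, and the `EX_`/`ex_` names are hypothetical.

```c
/*
 * Illustrative values only: a base flag and a timer index chosen so
 * that the shifted result occupies bit 31.
 */
#define EX_TF_TIMER	0x04000000U	/* unsigned base flag */
#define EX_TIMER_IDX	5		/* hypothetical timer index */

/*
 * With an unsigned base the shift is well defined and yields
 * 0x80000000.  With a plain int constant (0x04000000 << 5) the result
 * would not fit in a signed int, which is undefined behaviour in C and
 * is the kind of issue a sanitizer such as kubsan reports.
 */
static unsigned int
ex_timer_flag(unsigned int idx)
{
	return EX_TF_TIMER << idx;
}
```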
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if the network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
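The active/passive scheme described in this entry can be sketched as below. This is a simplified model under stated assumptions, not the code from tcp_input.c: the `ex_` structures and field names are hypothetical, the use counter is modelled as a countdown, and the new seed is passed in by the caller where a kernel would draw it from its random source.

```c
#include <stdint.h>

#define EX_SYN_USE_LIMIT 100000	/* uses before a swap is attempted */

/* Hypothetical, heavily reduced stand-in for one SYN cache set. */
struct ex_syn_set {
	int      count;	/* entries currently stored in this set */
	int      use;	/* insertions left before the set is retired */
	uint32_t seed;	/* seed of the hash function for this set */
};

struct ex_syn_cache {
	struct ex_syn_set set[2];
	int active;	/* index of the set that receives new entries */
};

/*
 * Account for one insertion.  If the active set has been used up and
 * the passive set has drained, swap roles and reseed the new active
 * set with new_seed.  Returns 1 if a swap happened, 0 otherwise.
 */
static int
ex_syn_insert(struct ex_syn_cache *sc, uint32_t new_seed)
{
	struct ex_syn_set *act = &sc->set[sc->active];
	struct ex_syn_set *pas = &sc->set[!sc->active];
	int swapped = 0;

	if (act->use <= 0 && pas->count == 0) {
		sc->active = !sc->active;
		act = pas;
		act->seed = new_seed;
		act->use = EX_SYN_USE_LIMIT;
		swapped = 1;
	}
	act->count++;
	act->use--;
	return swapped;
}
```

Because the retired set only drains and never receives new entries, an attacker who keeps sending unanswered SYNs can no longer keep it non-empty forever.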
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD: measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
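For readers unfamiliar with the idiom named in this entry, a minimal sketch follows. The macro and structure are hypothetical examples, not macros from the header itself.

```c
/* Hypothetical counter pair, for illustration only. */
struct ex_counters {
	unsigned int hits;
	unsigned int total;
};

/*
 * Multi-statement macro wrapped in do { } while (0) so that it expands
 * to exactly one statement and takes a trailing semicolon like a
 * function call.  The CONSTCOND comment tells lint that the constant
 * condition is intentional.
 */
#define EX_COUNT_HIT(c) do {						\
	(c)->hits++;							\
	(c)->total++;							\
} while (/* CONSTCOND */ 0)

/*
 * With a bare { } block instead of the wrapper, the semicolon after
 * the macro call would end the if statement and the else below would
 * be a syntax error.
 */
static void
ex_record(struct ex_counters *c, int hit)
{
	if (hit)
		EX_COUNT_HIT(c);
	else
		c->total++;
}
```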
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug: when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(). the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.158 |
|
13-Sep-2022 |
mvs |
Change pru_rcvd() return type to the type of void. We have no interest on pru_rcvd() return value.
Drop "pru_rcvd != NULL" check within pru_rcvd() wrapper. We only call it if the socket's protocol have PR_WANTRCVD flag set. Such sockets are route domain, tcp(4) and unix(4) sockets.
ok guenther@ bluhm@
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets returns EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of make pru_sockaddr handler optional and return EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok derradt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
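A minimal sketch of why the idiom matters. The macro and struct below are hypothetical and are not taken from the file this log describes; they only show that a multi-statement macro wrapped in do { } while (0) behaves as a single statement inside an unbraced if/else.

```c
/* Hypothetical counter type, for illustration only. */
struct counter {
	int value;
	int resets;
};

/*
 * Hypothetical two-statement macro.  The do { } while (0) wrapper
 * turns the body into exactly one statement, so the trailing
 * semicolon at the call site does not detach a following else.
 */
#define COUNTER_RESET(c, n) do {		\
	(c)->value = 0;				\
	(c)->resets += (n);			\
} while (/* CONSTCOND */ 0)

/* Reset the counter only when asked; returns 1 if it was reset. */
int
counter_maybe_reset(struct counter *c, int do_reset)
{
	if (do_reset)
		COUNTER_RESET(c, 1);	/* one statement, else still pairs */
	else
		return 0;
	return 1;
}
```

Without the wrapper, a plain { } block followed by the call-site semicolon would end the if statement early and make the else a syntax error.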
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(). Since the normal ones now take a proc argument, there's no need for these; they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and doing the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
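As a reminder of what RFC 3390 specifies, the upper bound on the initial window is min(4*MSS, max(2*MSS, 4380 bytes)). The helper below is a sketch of that formula only; the function name is hypothetical and it is not the code from this commit.

```c
/*
 * RFC 3390 upper bound for the initial congestion window, in bytes:
 *     IW = min(4 * MSS, max(2 * MSS, 4380))
 * Hypothetical helper name; illustrates the formula only.
 */
unsigned int
rfc3390_initial_window(unsigned int mss)
{
	unsigned int lower = 2 * mss;
	unsigned int upper = 4 * mss;
	unsigned int iw = (lower > 4380) ? lower : 4380;

	return (iw < upper) ? iw : upper;
}
```

For a typical Ethernet MSS of 1460 bytes this yields 4380 bytes, i.e. three full-sized segments.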
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.157 |
|
03-Sep-2022 |
mvs |
Move PRU_PEERADDR request to (*pru_peeraddr)().
Introduce in{,6}_peeraddr() and use them for inet and inet6 sockets, except tcp(4) case.
Also remove *_usrreq() handlers.
ok bluhm@
|
#
1.156 |
|
03-Sep-2022 |
bluhm |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
1.155 |
|
03-Sep-2022 |
mvs |
Move PRU_SOCKADDR request to (*pru_sockaddr)()
Introduce in{,6}_sockaddr() functions, and use them for all except tcp(4) inet sockets. For tcp(4) sockets use tcp_sockaddr() to keep debug ability.
The key management and route domain sockets return EINVAL error for PRU_SOCKADDR request, so keep this behaviour for a while instead of making the pru_sockaddr handler optional and returning EOPNOTSUPP.
ok bluhm@
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but it is actually required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
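The NULL-handler convention described above can be sketched as follows. The struct and function names are simplified, hypothetical stand-ins, not the kernel's actual definitions; only the pattern (NULL check in the wrapper, EOPNOTSUPP on a missing handler) comes from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a socket. */
struct socket_sketch {
	int bound_port;
};

/* Hypothetical per-protocol handler table; a handler may be NULL. */
struct usrreqs_sketch {
	int (*pru_bind)(struct socket_sketch *, int);
};

/* Wrapper: a protocol that leaves the handler NULL gets EOPNOTSUPP. */
int
pru_bind_sketch(const struct usrreqs_sketch *ur, struct socket_sketch *so,
    int port)
{
	if (ur->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*ur->pru_bind)(so, port);
}

/* Example handler for a protocol that does support bind. */
int
demo_bind(struct socket_sketch *so, int port)
{
	so->bound_port = port;
	return 0;
}
```

Keeping the check in one wrapper means protocols without bind support need no dummy handler of their own.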
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow attacking a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
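A sketch of the underlying issue, assuming a 32-bit int. The flag value and names here are hypothetical placeholders, not copied from the header: shifting a signed constant into bit 31 is undefined behaviour in C, while the same shift on an unsigned constant is well defined.

```c
/*
 * Hypothetical base flag; the U suffix makes the constant unsigned,
 * so shifting it into the top bit is well defined.
 */
#define XF_TIMER	0x04000000U

/* Return the flag bit for timer number idx (hypothetical layout). */
unsigned int
timer_flag(unsigned int idx)
{
	return XF_TIMER << idx;
}
```

With a plain `0x04000000` (type int), `timer_flag(5)` would shift into the sign bit, which is the kind of operation an undefined-behaviour sanitizer reports.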
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows simplifying code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
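The role switch described above can be sketched roughly as below. All names, the struct layout, and the way the use count is tracked are hypothetical simplifications for illustration; only the rule (switch when the passive set is empty and the active set has been used 100000 times, then reseed) comes from the commit message.

```c
#include <stdint.h>

#define SYN_USE_LIMIT	100000	/* uses before a switch is allowed */

/* Hypothetical, simplified syn cache set. */
struct syn_set_sketch {
	int	 count;		/* entries currently in this set */
	int	 use;		/* insertions since last reseed */
	uint32_t seed;		/* hash seed for this set */
};

/* Two sets; `active` indexes the one that receives new entries. */
struct syn_sets_sketch {
	struct syn_set_sketch set[2];
	int active;
};

/*
 * Switch roles if the passive set has drained and the active set
 * has been used often enough.  Returns 1 when a switch happened.
 */
int
syn_sets_maybe_switch(struct syn_sets_sketch *s, uint32_t new_seed)
{
	struct syn_set_sketch *act = &s->set[s->active];
	struct syn_set_sketch *pas = &s->set[!s->active];

	if (pas->count != 0 || act->use < SYN_USE_LIMIT)
		return 0;

	/* The empty passive set is reseeded and becomes active. */
	pas->seed = new_seed;
	pas->use = 0;
	s->active = !s->active;
	return 1;
}
```

Because the old active set stops receiving entries after the switch, it drains through normal timeouts, so an attacker can no longer keep a single seeded table alive indefinitely.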
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
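For readers unfamiliar with the idiom named above, a minimal sketch follows. The macro and function are hypothetical and are not taken from tcp_var.h; they only show why a multi-statement macro is wrapped in do { } while (0).

```c
/*
 * Hypothetical macro, not copied from tcp_var.h: clamp a value to an
 * upper bound and count how often clamping happened.
 */
#define CLAMP_COUNT(val, max, counter) do {				\
	if ((val) > (max)) {						\
		(val) = (max);						\
		(counter)++;						\
	}								\
} while (/* CONSTCOND */ 0)

int
clamp_or_zero(int val, int max, int enable, int *clamped)
{
	/*
	 * The macro expands to exactly one statement, so the trailing
	 * semicolon and the following else parse as intended. A bare
	 * { } block followed by ';' would detach the else and fail to
	 * compile.
	 */
	if (enable)
		CLAMP_COUNT(val, max, *clamped);
	else
		val = 0;
	return val;
}
```

The /* CONSTCOND */ comment is a lint hint that the constant loop condition is intentional.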
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.154 |
|
02-Sep-2022 |
mvs |
Move PRU_CONTROL request to (*pru_control)().
The 'proc *' arg is not used for PRU_CONTROL request, so remove it from pru_control() wrapper.
Split out {tcp,udp}6_usrreqs from {tcp,udp}_usrreqs and use them for inet6 case.
ok guenther@ bluhm@
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
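A minimal sketch of the wrapper pattern described in this entry. All struct and function names below are simplified, hypothetical stand-ins and do not match the kernel's real declarations: a protocol that does not support a request leaves the handler NULL, and the wrapper turns that into EOPNOTSUPP.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_socket;

struct sketch_usrreqs {
	int	(*pru_bind)(struct sketch_socket *, int);
};

struct sketch_socket {
	const struct sketch_usrreqs	*so_usrreqs;
	int				 so_port;
};

/* A protocol handler that supports bind. */
static int
sketch_tcp_bind(struct sketch_socket *so, int port)
{
	so->so_port = port;
	return 0;
}

static const struct sketch_usrreqs sketch_tcp_usrreqs = {
	.pru_bind = sketch_tcp_bind,
};

/* A protocol without bind support leaves the handler NULL. */
static const struct sketch_usrreqs sketch_nobind_usrreqs = {
	.pru_bind = NULL,
};

/* The wrapper performs the NULL check once, for every protocol. */
static int
sketch_pru_bind(struct sketch_socket *so, int port)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_usrreqs->pru_bind)(so, port);
}
```

Keeping the check in the wrapper means protocols need no dummy handlers that only return an error.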
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
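A small sketch of the problem this entry addresses, assuming a 32-bit unsigned int. The constant values are chosen for illustration and are not copied from tcp_var.h: shifting a signed constant into bit 31 is undefined behaviour in C, while the same shift on an unsigned constant is well defined.

```c
/* Hypothetical values for illustration; not copied from tcp_var.h. */
#define SK_TF_TIMER	0x04000000U	/* base bit for per-timer flags */
#define SK_TIMER_LAST	5		/* highest timer index in this sketch */

/*
 * Return the flag bit for a given timer index. With the U suffix the
 * shift stays in unsigned arithmetic, so index 5 yields 0x80000000
 * without touching a sign bit.
 */
unsigned int
sk_timer_flag(unsigned int timer)
{
	return SK_TF_TIMER << timer;
}
```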
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
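The per-timer flag described here can be sketched roughly as follows. The names and the flag value are hypothetical and the real kernel code differs; the point is only that the handler consults the flag itself instead of trusting that deleting the timeout prevented it from running.

```c
/* Hypothetical names and values; the real tcpcb and macros differ. */
#define TM_TF_TIMER	0x04000000U

struct tm_tcpcb {
	unsigned int	t_flags;	/* one "armed" bit per timer */
	int		t_fired;	/* how often a handler did real work */
};

void
tm_arm(struct tm_tcpcb *tp, unsigned int timer)
{
	tp->t_flags |= TM_TF_TIMER << timer;
}

void
tm_cancel(struct tm_tcpcb *tp, unsigned int timer)
{
	tp->t_flags &= ~(TM_TF_TIMER << timer);
}

/*
 * Timer handler. It may still be invoked after tm_cancel(), for
 * example because it slept waiting for a lock, so it checks the
 * flag first and returns 0 without acting if the timer was canceled.
 */
int
tm_run(struct tm_tcpcb *tp, unsigned int timer)
{
	if ((tp->t_flags & (TM_TF_TIMER << timer)) == 0)
		return 0;
	tm_cancel(tp, timer);
	tp->t_fired++;
	return 1;
}
```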
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
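The role switch described above can be sketched as follows. This is a simplified model with hypothetical names; it tracks only counters and a seed, not real cache entries, and the caller supplies the new random seed.

```c
#define SK_USE_LIMIT	100000	/* insertions before a switch is allowed */

/* Hypothetical model of one syn cache set. */
struct sk_set {
	int		count;	/* entries currently stored */
	int		use;	/* insertions left before a switch */
	unsigned int	seed;	/* hash seed for this set */
};

struct sk_cache {
	struct sk_set	set[2];
	int		active;		/* index of the set taking new entries */
	int		reseeds;	/* how often the roles switched */
};

void
sk_init(struct sk_cache *c, unsigned int seed)
{
	c->set[0].count = c->set[1].count = 0;
	c->set[0].use = SK_USE_LIMIT;
	c->set[1].use = 0;
	c->set[0].seed = seed;
	c->set[1].seed = 0;
	c->active = 0;
	c->reseeds = 0;
}

/* Account for one new entry; new_seed is fresh randomness from the caller. */
void
sk_insert(struct sk_cache *c, unsigned int new_seed)
{
	struct sk_set *act = &c->set[c->active];
	struct sk_set *pas = &c->set[!c->active];

	/* Switch only when the active set is used up AND the passive one is empty. */
	if (act->use <= 0 && pas->count == 0) {
		c->active = !c->active;
		act = &c->set[c->active];
		act->seed = new_seed;
		act->use = SK_USE_LIMIT;
		c->reseeds++;
	}
	act->count++;
	act->use--;
}
```

In this model an attacker who keeps the active set non-empty no longer blocks reseeding: once the set has been used SK_USE_LIMIT times, new entries move to the other set with a fresh seed as soon as that set has drained.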
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
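For comparison, the per-socket behaviour that net.inet.tcp.always_keepalive applies globally is normally requested with the standard SO_KEEPALIVE socket option. A minimal sketch using the POSIX socket API; the helper names `enable_keepalive()` and `keepalive_enabled()` are made up for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Turn on keepalive probes for one socket. Returns 0 on success, -1 on error. */
int enable_keepalive(int fd)
{
	int on = 1;

	return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Report whether SO_KEEPALIVE is set: 1 = yes, 0 = no, -1 = error. */
int keepalive_enabled(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
		return -1;
	return val != 0;
}
```

With the sysctl set, an application would not need to call anything like `enable_keepalive()` itself; the system treats every TCP socket as if the option were on.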
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to an alternate routing table and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
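The reason for the do { } while (0) idiom in revision 1.90 is that a multi-statement macro then expands to exactly one statement, so it behaves correctly in an unbraced if/else. A small sketch; `SWAP_OK` and `sort2()` are hypothetical and not taken from the header.

```c
/*
 * Hypothetical multi-statement macro. Wrapping the body in
 * do { } while (0) turns the expansion into a single statement
 * that is completed by the caller's semicolon.
 */
#define SWAP_OK(a, b) do {						\
	int t_ = (a);							\
	(a) = (b);							\
	(b) = t_;							\
} while (/* CONSTCOND */ 0)

/* Order two ints in place. Returns 1 if they were swapped, 0 otherwise. */
int sort2(int *lo, int *hi)
{
	int swapped = 1;

	/*
	 * With a bare { ... } macro body, the ';' after the macro call
	 * would end the if statement and the else below would not compile.
	 */
	if (*lo > *hi)
		SWAP_OK(*lo, *hi);
	else
		swapped = 0;
	return swapped;
}
```

The `/* CONSTCOND */` comment is a lint annotation marking the constant loop condition as intentional.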
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(). Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
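Checks such as "th_ack against snd_una/max" and helpers like SEQ_MIN/SEQ_MAX rely on sequence-number comparisons that survive 32-bit wraparound. The sketch below is modeled on the usual BSD-style macros; the exact definitions in the tree may differ, and `ack_in_window()` is a hypothetical helper.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Compare by the sign of the 32-bit difference, so wraparound is handled. */
#define SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)
#define SEQ_GT(a, b)	((int32_t)((a) - (b)) > 0)
#define SEQ_MIN(a, b)	(SEQ_LT(a, b) ? (a) : (b))
#define SEQ_MAX(a, b)	(SEQ_GT(a, b) ? (a) : (b))

/* Hypothetical check: is ack within [snd_una, snd_max] in sequence space? */
int ack_in_window(tcp_seq ack, tcp_seq snd_una, tcp_seq snd_max)
{
	return !SEQ_LT(ack, snd_una) && !SEQ_GT(ack, snd_max);
}
```

A plain `<` on the raw values would misorder numbers on either side of the wrap point, for example 0xfffffff0 and 0x10, which is why the comparison is done on the signed difference.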
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug: when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(). the fix is adapted from FreeBSD: the effective mss is recorded after option negotiation in the 3-way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlaid with a structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.153 |
|
31-Aug-2022 |
mvs |
Move PRU_SENDOOB request to (*pru_sendoob)().
PRU_SENDOOB request always consumes passed `top' and `control' mbufs. To avoid dummy m_freem(9) handlers for all protocols release passed mbufs in the pru_sendoob() EOPNOTSUPP error path.
Also fix `control' mbuf(9) leak in the tcp(4) PRU_SENDOOB error path.
ok bluhm@
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support the PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with a separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support the request, leave the handler NULL. Do the NULL check within the corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to avoid sharing data with userland that might allow an attack on a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
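The sign-bit problem fixed in revision 1.137 can be illustrated with a small sketch. The constants below are hypothetical stand-ins chosen so that the shift reaches bit 31; they are not the values from the header.

```c
/*
 * Hypothetical flag layout: one "timer armed" bit per timer, starting
 * at TF_TIMER_BASE. The U suffix keeps the arithmetic unsigned.
 */
#define TF_TIMER_BASE	0x04000000U
#define TIMER_DELACK	5	/* hypothetical index of the last timer */

/* Flag bit for a given timer index. */
unsigned int timer_flag(unsigned int timer)
{
	/*
	 * With a signed constant (0x04000000 without the U), shifting
	 * by 5 would move a 1 into the sign bit of a 32-bit int, which
	 * is undefined behaviour in C. The unsigned shift is well defined.
	 */
	return TF_TIMER_BASE << timer;
}
```

Since the flags are stored in an unsigned field anyway, defining every constant as unsigned keeps the type of each expression consistent with its storage.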
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
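The per-timer flag described in revision 1.130 can be sketched as follows. This is a user-space illustration under stated assumptions: `struct conn`, `timer_arm()`, `timer_cancel()`, and `timer_handler()` are hypothetical names, and locking is left out entirely.

```c
#define NTIMERS 4

/* Hypothetical connection state: one "armed" bit per timer. */
struct conn {
	unsigned int armed;		/* bit t set: timer t is armed */
	int fired[NTIMERS];		/* how often each timer did real work */
};

void timer_arm(struct conn *c, int t)
{
	c->armed |= 1U << t;
	/* a real implementation would also schedule the timeout here */
}

void timer_cancel(struct conn *c, int t)
{
	c->armed &= ~(1U << t);
	/* deleting the timeout alone may be too late if it already runs */
}

/*
 * Timeout handler. It may start after the timer was canceled, for
 * example because it slept waiting for a lock, so it checks the flag
 * first. Returns 1 if the timer action ran, 0 if it was canceled.
 */
int timer_handler(struct conn *c, int t)
{
	if (!(c->armed & (1U << t)))
		return 0;
	c->armed &= ~(1U << t);
	c->fired[t]++;
	return 1;
}
```

The flag is the source of truth for whether the timer should act; the scheduled timeout only decides when the handler gets a chance to look at it.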
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow tcp_syn_use_limit to be adjusted with sysctl net.inet.tcp.synuselimit. This is convenient for testing the feature and may be useful to defend against SYN flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random data. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
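As an illustration of the idiom named in this entry, here is a small stand-alone sketch. `SWAP_INT` and `sort2()` are hypothetical names invented for the example and do not come from the header.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Multi-statement macro wrapped in do { } while (0) so that it expands
 * to exactly one statement. The CONSTCOND comment tells lint that the
 * constant condition is intentional.
 */
#define SWAP_INT(a, b) do {		\
	int tmp_ = (a);			\
	(a) = (b);			\
	(b) = tmp_;			\
} while (/* CONSTCOND */ 0)

/* Put two values in ascending order; return 1 if they were swapped. */
int
sort2(int *lo, int *hi)
{
	assert(lo != NULL && hi != NULL);

	if (*lo > *hi)
		SWAP_INT(*lo, *hi);	/* the ';' terminates the do-while */
	else
		return 0;		/* with a bare { } body this else would not compile */
	return 1;
}
```

With a plain `{ ... }` macro body, the semicolon after the macro call would end the `if` statement and leave the `else` without a matching `if`.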
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
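A port bitmask of the kind this entry mentions could look roughly like the sketch below. The names (`port_map`, `port_mark()`, `port_unmark()`, `port_is_marked()`) and the 1024-port range are assumptions made for illustration; they are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_LIMIT	1024u	/* hypothetical: only ports below this are tracked */
#define WORD_BITS	32u

/* One bit per port; a set bit means "do not allocate dynamically". */
static uint32_t port_map[PORT_LIMIT / WORD_BITS];

void
port_mark(unsigned int port)
{
	assert(port < PORT_LIMIT);
	port_map[port / WORD_BITS] |= UINT32_C(1) << (port % WORD_BITS);
}

void
port_unmark(unsigned int port)
{
	assert(port < PORT_LIMIT);
	port_map[port / WORD_BITS] &= ~(UINT32_C(1) << (port % WORD_BITS));
}

/* Ports outside the tracked range are never considered marked. */
int
port_is_marked(unsigned int port)
{
	if (port >= PORT_LIMIT)
		return 0;
	return (port_map[port / WORD_BITS] >> (port % WORD_BITS)) & 1u;
}
```

A bitmap keeps the lookup in the port allocation path to one shift and one mask, and the whole table can be read or written as a single opaque blob.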
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.152 |
|
29-Aug-2022 |
mvs |
Move PRU_RCVOOB request to (*pru_rcvoob)().
ok bluhm@
|
#
1.151 |
|
28-Aug-2022 |
mvs |
Move PRU_SENSE request to (*pru_sense)().
ok bluhm@
|
#
1.150 |
|
28-Aug-2022 |
mvs |
Move PRU_ABORT request to (*pru_abort)().
We abort only the sockets which are linked to `so_q' or `so_q0' queues of listening socket. Such sockets have no corresponding file descriptor and are not accessed from userland, so PRU_ABORT used to destroy them on listening socket destruction.
Currently all our sockets support PRU_ABORT request, but actually it is required only for tcp(4) and unix(4) sockets, so it should be optional. However, they will be removed with separate diff, and this time PRU_ABORT requests were converted as is.
Also, the socket should be destroyed on PRU_ABORT request, but route and key management sockets leave it alive. This was also converted as is, because this wrong code is never called.
ok bluhm@
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had `control' mbuf(9) leak. It was fixed in new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
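The wrapper pattern described in this entry might be sketched as follows. The types and names here (`struct sock`, `struct usrreqs`, `do_bind()`, `stream_reqs`, `raw_reqs`) are simplified, hypothetical stand-ins and not the kernel's actual declarations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sock;

/* Hypothetical, cut-down table of per-request handlers. */
struct usrreqs {
	int (*bind)(struct sock *, const void *addr);
};

struct sock {
	const struct usrreqs *reqs;
};

/* Wrapper: a NULL handler means the protocol does not support bind. */
int
do_bind(struct sock *so, const void *addr)
{
	assert(so != NULL && so->reqs != NULL);

	if (so->reqs->bind == NULL)
		return EOPNOTSUPP;
	return so->reqs->bind(so, addr);
}

/* Example handler for a protocol that does support bind. */
static int
stream_bind(struct sock *so, const void *addr)
{
	(void)so;
	(void)addr;
	return 0;
}

const struct usrreqs stream_reqs = { .bind = stream_bind };
const struct usrreqs raw_reqs = { .bind = NULL };	/* unsupported */
```

Keeping the NULL check in the wrapper means protocols without a handler need no stub function, and callers get a uniform error code.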
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to avoid sharing data with userland that could be used to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
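The problem this entry fixes can be illustrated with a short sketch. The constants below are illustrative assumptions, chosen so that the last timer flag lands on bit 31, and `timer_flag()` is a hypothetical helper. The sketch also assumes `unsigned int` is at least 32 bits wide.

```c
#include <assert.h>

/*
 * With a signed constant (0x04000000), shifting left by 5 would produce
 * a value that does not fit in a 32-bit int, which is undefined
 * behaviour. The U suffix makes the whole expression unsigned.
 */
#define FLAG_TIMER_BASE	0x04000000U	/* flag bit of timer 0 (assumed) */
#define TIMER_LAST	5		/* index of the last timer (assumed) */

/* Return the flag bit that marks timer number `timer`. */
unsigned int
timer_flag(int timer)
{
	assert(timer >= 0 && timer <= TIMER_LAST);
	return FLAG_TIMER_BASE << timer;
}
```

Since the flags are stored in an unsigned field anyway, defining every constant as unsigned keeps all mask and shift arithmetic in one type.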
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
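The armed-flag approach described in this entry can be sketched in a single-threaded model. `struct conn`, `timer_arm()`, `timer_cancel()` and `timer_run()` are hypothetical names, and the actual timeout scheduling and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NTIMERS 4	/* hypothetical number of timers per connection */

struct conn {
	unsigned int armed;		/* bit i set: timer i is armed */
	int          fired[NTIMERS];	/* how often each timer really ran */
};

void
timer_arm(struct conn *c, int t)
{
	assert(t >= 0 && t < NTIMERS);
	c->armed |= 1U << t;		/* the real code also schedules a timeout */
}

void
timer_cancel(struct conn *c, int t)
{
	assert(t >= 0 && t < NTIMERS);
	c->armed &= ~(1U << t);		/* the real code also deletes the timeout */
}

/*
 * Body of the timeout handler, entered after the lock has been taken.
 * The handler may already have been dequeued and be waiting for the
 * lock when the timer is cancelled, so it must recheck the flag.
 */
bool
timer_run(struct conn *c, int t)
{
	assert(t >= 0 && t < NTIMERS);
	if ((c->armed & (1U << t)) == 0)
		return false;		/* cancelled while waiting for the lock */
	c->armed &= ~(1U << t);
	c->fired[t]++;
	return true;
}
```

The flag is only read and written with the lock held, so a cancel that happens between dequeueing the timeout and acquiring the lock is still observed by the handler.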
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random data. tedu@ agrees; OK mpi@
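The switch rule described in this commit can be sketched roughly as follows. This is a simplified illustration, not the code from tcp_input.c: the struct and function names (`sk_syn_set`, `sk_syn_cache`, `sk_insert`) are hypothetical, the hash table itself is omitted, and the new seed is passed in by the caller instead of coming from a kernel random source.

```c
/* Hypothetical sketch; names and layout are not from the OpenBSD source. */
#define SK_USE_LIMIT	100000	/* insertions before a switch is wanted */

struct sk_syn_set {
	int		count;	/* entries currently held by this set */
	int		use;	/* insertions left before switching */
	unsigned int	seed;	/* seed of the hash function */
};

struct sk_syn_cache {
	struct sk_syn_set	set[2];
	int			active;	/* index of the set taking new entries */
};

static void
sk_init(struct sk_syn_cache *c, unsigned int seed)
{
	c->set[0].count = c->set[1].count = 0;
	c->set[0].use = c->set[1].use = SK_USE_LIMIT;
	c->set[0].seed = seed;
	c->set[1].seed = 0;
	c->active = 0;
}

/*
 * Account for one new entry.  If the active set has used up its
 * budget and the passive set has drained completely, the sets switch
 * roles and the newly active set gets a fresh seed.  Returns the
 * index of the set that received the entry.
 */
static int
sk_insert(struct sk_syn_cache *c, unsigned int new_seed)
{
	struct sk_syn_set *act = &c->set[c->active];
	struct sk_syn_set *pas = &c->set[!c->active];

	if (act->use <= 0 && pas->count == 0) {
		c->active = !c->active;
		act = pas;
		act->use = SK_USE_LIMIT;
		act->seed = new_seed;	/* reseed only while empty */
	}
	act->count++;
	act->use--;
	return c->active;
}
```

The point of the two sets is that reseeding only ever happens on an empty set, so no existing entry has to be rehashed, while an attacker who keeps one set non-empty can no longer block reseeding forever.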
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to an alternate routing table and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; still missing are pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
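As a minimal sketch of the idiom named in this commit: wrapping a multi-statement macro in do { } while (0) makes it expand to exactly one statement, so it stays correct inside an unbraced if/else. The macro and function below (`SWAP_INT`, `order_pair`) are hypothetical examples, not macros taken from the header this log describes.

```c
/* Hypothetical example; not the actual macros touched by this commit. */

/*
 * Multi-statement macro wrapped in do { } while (0) so that
 * "if (x) SWAP_INT(a, b); else ..." parses as intended.
 */
#define SWAP_INT(a, b) do {						\
	int swap_tmp_ = (a);						\
	(a) = (b);							\
	(b) = swap_tmp_;						\
} while (/* CONSTCOND */ 0)

/* Order two ints so that *lo <= *hi; returns 1 if a swap happened. */
static int
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);	/* expands to a single statement */
	else
		return 0;
	return 1;
}
```

Without the wrapper, a bare `{ ... }` block followed by the caller's semicolon would end the if statement early and leave the else without a matching if.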
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug: when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(). the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlaid with a structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.149 |
|
27-Aug-2022 |
mvs |
Move PRU_SEND request to (*pru_send)().
The former PRU_SEND error path of gre_usrreq() had a `control' mbuf(9) leak. It was fixed in the new gre_send().
The former pfkeyv2_send() was renamed to pfkeyv2_dosend().
ok bluhm@
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support the request, leave the handler NULL. Do the NULL check within the corresponding pru_() wrapper and return EOPNOTSUPP in such a case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
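A rough sketch of the pattern this commit describes, under stated assumptions: the structures below are simplified stand-ins with hypothetical names (`sketch_usrreqs`, `sketch_socket`, `sketch_pru_bind`), not the kernel's real `struct pr_usrreqs` or `struct socket`, and the handler signature is invented for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_socket;

struct sketch_usrreqs {
	/* NULL means the protocol does not support bind. */
	int	(*pru_bind)(struct sketch_socket *, const void *);
};

struct sketch_socket {
	const struct sketch_usrreqs	*so_usrreqs;
	int				 so_bound;
};

/* Wrapper: check for a missing handler before dispatching. */
static int
sketch_pru_bind(struct sketch_socket *so, const void *addr)
{
	if (so->so_usrreqs->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*so->so_usrreqs->pru_bind)(so, addr);
}

/* Example handler for a protocol that does support bind. */
static int
sketch_tcp_bind(struct sketch_socket *so, const void *addr)
{
	(void)addr;		/* address handling omitted in this sketch */
	so->so_bound = 1;
	return 0;
}

static const struct sketch_usrreqs sketch_tcp_usrreqs = {
	.pru_bind = sketch_tcp_bind,
};

static const struct sketch_usrreqs sketch_nobind_usrreqs = {
	.pru_bind = NULL,
};
```

Keeping the NULL check in one wrapper means protocols without a bind operation need no stub function, and callers get a uniform EOPNOTSUPP.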
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides much more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow an attack on a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
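To illustrate why the unsigned suffix matters here, a small sketch with made-up values: `SK_TF_TIMER` and `SK_TCPT_DELACK` are hypothetical and chosen only so that the shift reaches bit 31, and the code assumes a 32-bit `unsigned int`. They are not the definitions from the header.

```c
/* Hypothetical values chosen for illustration only. */
#define SK_TF_TIMER	0x10000000U	/* base bit for per-timer flags */
#define SK_TCPT_DELACK	3		/* index of the last timer */

/* Return the flag bit that marks timer number "timer" as armed. */
static unsigned int
sk_timer_flag(int timer)
{
	/*
	 * With an unsigned constant the shift into bit 31 is well
	 * defined.  With a plain int constant (no U suffix) the same
	 * shift would overflow a signed int, which is undefined
	 * behaviour in C and is what a sanitizer reports.
	 */
	return SK_TF_TIMER << timer;
}
```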
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
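The active/passive scheme described in this commit can be sketched in a few lines of C. This is an illustration only, not the code in tcp_input.c: the names `syn_set`, `syn_cache_pair`, `pair_init`, `pair_insert` and `pair_remove` are hypothetical, the use limit is a parameter rather than the 100000 mentioned above, and the caller supplies the fresh seed instead of drawing it from a random source.

```c
#include <stdint.h>

/* Hypothetical, simplified model of one syn cache set. */
struct syn_set {
	uint32_t seed;		/* seed for the bucket hash */
	int count;		/* entries currently held */
	int uses_left;		/* insertions allowed before a switch is wanted */
};

/* Two sets: one active (gets new entries), one passive (idling out). */
struct syn_cache_pair {
	struct syn_set set[2];
	int active;		/* index of the active set, 0 or 1 */
	int use_limit;		/* assumed tunable, like synuselimit */
};

void
pair_init(struct syn_cache_pair *p, int use_limit, uint32_t seed)
{
	for (int i = 0; i < 2; i++) {
		p->set[i].seed = seed;
		p->set[i].count = 0;
		p->set[i].uses_left = use_limit;
	}
	p->active = 0;
	p->use_limit = use_limit;
}

/*
 * Insert one entry. If the active set has used up its budget and the
 * passive set has drained, reseed the passive set and switch roles.
 * Returns the index of the set that received the entry.
 */
int
pair_insert(struct syn_cache_pair *p, uint32_t fresh_seed)
{
	struct syn_set *act = &p->set[p->active];
	struct syn_set *pas = &p->set[!p->active];

	if (act->uses_left <= 0 && pas->count == 0) {
		pas->seed = fresh_seed;
		pas->uses_left = p->use_limit;
		p->active = !p->active;
		act = pas;
	}
	act->count++;
	act->uses_left--;
	return p->active;
}

/* Remove one entry from the given set (timeout or completed handshake). */
void
pair_remove(struct syn_cache_pair *p, int idx)
{
	if (p->set[idx].count > 0)
		p->set[idx].count--;
}
```

In this sketch an attacker who keeps the old set non-empty can only delay the switch until those entries time out, because new entries never land in the passive set.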
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
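For readers unfamiliar with the idiom named in this commit, here is a minimal sketch of why multi-statement macros are wrapped in `do { } while (0)`. The macro `SWAP_INT` and the function `order` are hypothetical and are not taken from tcp_var.h; the `/* CONSTCOND */` comment is the lint hint mentioned in the log message.

```c
/*
 * Hypothetical multi-statement macro. The do/while wrapper turns the body
 * into a single statement that still requires a trailing semicolon.
 */
#define SWAP_INT(a, b) do {						\
	int swap_tmp_ = (a);						\
	(a) = (b);							\
	(b) = swap_tmp_;						\
} while (/* CONSTCOND */ 0)

/*
 * Put *lo and *hi in ascending order. Returns 1 if they were swapped.
 * With a plain { } block instead of do/while, the semicolon after the
 * macro call would end the if statement and the else would not compile.
 */
int
order(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);
	else
		return 0;
	return 1;
}
```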
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.148 |
|
26-Aug-2022 |
mvs |
Move PRU_RCVD request to (*pru_rcvd)().
ok bluhm@
|
#
1.147 |
|
22-Aug-2022 |
mvs |
Move PRU_SHUTDOWN request to (*pru_shutdown)().
ok bluhm@
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
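The pattern described here, a per-request handler that may be NULL plus a wrapper that turns NULL into EOPNOTSUPP, can be sketched as follows. This is a simplified userland illustration under assumed names: `struct sock`, `struct usrreqs`, `sock_bind` and the two handler tables are hypothetical and do not reproduce the kernel's `struct pr_usrreqs` or its `pru_bind()` signature.

```c
#include <errno.h>
#include <stddef.h>

struct sock;

/* Hypothetical table of per-request handlers; NULL means "not supported". */
struct usrreqs {
	int (*bind)(struct sock *, int);
};

/* Hypothetical socket: points at its protocol's handler table. */
struct sock {
	const struct usrreqs *reqs;
	int port;
};

/* Handler for a protocol that supports bind. */
static int
stream_bind(struct sock *so, int port)
{
	so->port = port;
	return 0;
}

/* One protocol implements bind, the other leaves the slot NULL. */
const struct usrreqs stream_usrreqs = { .bind = stream_bind };
const struct usrreqs nobind_usrreqs = { .bind = NULL };

/*
 * Wrapper used by callers: the NULL check lives here, so individual
 * protocols do not need stub handlers that only return an error.
 */
int
sock_bind(struct sock *so, int port)
{
	if (so->reqs->bind == NULL)
		return EOPNOTSUPP;
	return so->reqs->bind(so, port);
}
```

Keeping the check in the wrapper means a handler table can be shared and left sparse, at the cost of one extra comparison per call.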
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time, the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow attacking a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
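The sign-bit problem mentioned in this commit can be illustrated with a short sketch. The constant and function below are hypothetical stand-ins; the base value 0x04000000 and the shift count 5 are assumptions chosen so that the result lands on bit 31, and are not quoted from tcp_var.h.

```c
/*
 * Assumed base flag for the first per-timer bit. The U suffix makes the
 * constant unsigned, so shifting it into bit 31 is well defined. With a
 * plain int constant the same shift would overflow a signed int, which is
 * undefined behaviour and is what an undefined-behaviour sanitizer reports.
 */
#define TF_TIMER_FIRST	0x04000000U

/* Return the flag bit for timer number `timer` (0 is the first timer). */
unsigned int
timer_flag(unsigned int timer)
{
	return TF_TIMER_FIRST << timer;
}
```

Storing the result in an unsigned field, as the commit notes for `u_int t_flags`, keeps the flag word and the constants in the same type.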
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows simplifying code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
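The role switch described in this entry can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from tcp_input.c: the names `syn_set`, `syn_cache`, `syn_insert` and `USE_LIMIT` are hypothetical, entry removal is left out, and the new seed is passed in by the caller so the behaviour is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USE_LIMIT 100000	/* "used 100000 times" in the log entry */

/* Hypothetical, simplified bookkeeping for one syn cache set. */
struct syn_set {
	int		count;	/* entries currently stored in this set */
	int		uses;	/* insertions since the last reseed */
	uint32_t	seed;	/* seed of the hash function */
};

struct syn_cache {
	struct syn_set	set[2];
	int		active;	/* index of the set that gets new entries */
};

/*
 * Account for one new entry.  If the passive set has drained and the
 * active set has been used USE_LIMIT times, reseed the passive set and
 * make it the active one first.  Returns 1 if the roles were switched.
 */
int
syn_insert(struct syn_cache *sc, uint32_t new_seed)
{
	struct syn_set *act, *pas;
	int switched = 0;

	assert(sc != NULL);
	act = &sc->set[sc->active];
	pas = &sc->set[!sc->active];

	if (pas->count == 0 && act->uses >= USE_LIMIT) {
		pas->seed = new_seed;
		pas->uses = 0;
		sc->active = !sc->active;
		act = pas;
		switched = 1;
	}
	act->count++;
	act->uses++;
	return switched;
}
```

Because new entries only ever land in the active set, an attacker who keeps the old set populated can delay the switch only until those entries time out; the sketch does not model that timeout.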
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
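For comparison, the per-socket behaviour that this sysctl makes global is what a program gets by setting SO_KEEPALIVE itself. The following is a minimal userland sketch, not part of the commit; `enable_keepalive` and `keepalive_enabled` are hypothetical helper names, while `setsockopt(2)`, `getsockopt(2)` and `SO_KEEPALIVE` are the standard socket API.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>

/*
 * Hypothetical helper: request keepalive probes on one TCP socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
enable_keepalive(int fd)
{
	int on = 1;

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Read the flag back: 1 if set, 0 if clear, -1 on error. */
int
keepalive_enabled(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	assert(fd >= 0);
	if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
		return -1;
	return val != 0;
}
```

With the sysctl set, the entry above says the system behaves as if every TCP socket had made this call; the probe interval still comes from net.inet.tcp.keepidle.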
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
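As an illustration of why the idiom matters, here is a small sketch. The macro `CLAMP_INT` and the function `clamp_if_enabled` are hypothetical and not taken from tcp_var.h; only the `do { } while (/* CONSTCOND */ 0)` wrapper itself comes from the entry above.

```c
#include <assert.h>

/*
 * Hypothetical multi-statement macro: clamp *p into [lo, hi].
 * The do/while wrapper turns the body into a single statement that
 * still needs its trailing semicolon.  The CONSTCOND comment tells
 * lint that the constant condition is intentional.
 */
#define CLAMP_INT(p, lo, hi) do {		\
	if (*(p) < (lo))			\
		*(p) = (lo);			\
	if (*(p) > (hi))			\
		*(p) = (hi);			\
} while (/* CONSTCOND */ 0)

int
clamp_if_enabled(int enabled, int v, int lo, int hi)
{
	assert(lo <= hi);
	/*
	 * With a bare { } body the ';' after the macro call would end
	 * the if statement and orphan the else; with no wrapper at all
	 * the else would bind to the macro's last inner if.
	 */
	if (enabled)
		CLAMP_INT(&v, lo, hi);
	else
		v = -1;
	return v;
}
```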
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
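The idea of a per-second cap can be sketched as a small counter that resets whenever the clock second changes. This is a simplified stand-in under stated assumptions, not the kernel's rate-check routine: `pps_limit` and `pps_allow` are hypothetical names, time is passed in as whole seconds, and the sketch allows at most `max_pps` events per second.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for a packets-per-second limiter. */
struct pps_limit {
	long	cur_sec;	/* second the counter belongs to */
	int	count;		/* events already allowed in cur_sec */
};

/*
 * Returns 1 if one more event (for example an outbound RST) may be
 * sent during second now_sec, 0 if the cap for that second is used up.
 */
int
pps_allow(struct pps_limit *pl, long now_sec, int max_pps)
{
	assert(pl != NULL);

	/* A new second starts a fresh budget. */
	if (pl->cur_sec != now_sec) {
		pl->cur_sec = now_sec;
		pl->count = 0;
	}
	if (pl->count >= max_pps)
		return 0;
	pl->count++;
	return 1;
}
```

A caller would drop the RST whenever `pps_allow()` returns 0 and send it otherwise.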
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
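The active/passive scheme described in revision 1.111 can be sketched as follows. This is a minimal illustration, not the kernel code: the names `syn_set`, `syn_sets`, `syn_sets_insert()` and `syn_sets_remove()` are hypothetical, and `rand()` is only a placeholder for a proper kernel random source. The limit of 100000 uses is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SYN_USE_LIMIT	100000	/* uses before a reseed is due (from the log) */

/* Hypothetical, simplified stand-in for one SYN cache set. */
struct syn_set {
	long		count;	/* entries currently stored */
	long		use;	/* insertions left before a reseed is due */
	uint32_t	seed;	/* per-set hash seed */
};

struct syn_sets {
	struct syn_set	set[2];
	int		active;	/* index of the set that takes new entries */
};

void
syn_sets_init(struct syn_sets *s)
{
	for (int i = 0; i < 2; i++) {
		s->set[i].count = 0;
		s->set[i].use = SYN_USE_LIMIT;
		s->set[i].seed = (uint32_t)rand();	/* placeholder RNG */
	}
	s->active = 0;
}

/* Insert one entry. Returns 1 if the two sets switched roles, else 0. */
int
syn_sets_insert(struct syn_sets *s)
{
	struct syn_set *act = &s->set[s->active];
	struct syn_set *pas = &s->set[!s->active];
	int swapped = 0;

	if (act->use <= 0 && pas->count == 0) {
		/* The passive set has idled out: reseed it and make it active. */
		pas->seed = (uint32_t)rand();
		pas->use = SYN_USE_LIMIT;
		s->active = !s->active;
		act = pas;
		swapped = 1;
	}
	act->count++;
	act->use--;
	return swapped;
}

/* Remove one entry from set idx (entry completed or timed out). */
void
syn_sets_remove(struct syn_sets *s, int idx)
{
	assert(idx == 0 || idx == 1);
	assert(s->set[idx].count > 0);
	s->set[idx].count--;
}
```

Because new entries only ever go into the active set, an attacker who keeps sending SYNs cannot keep the passive set from draining, so in this sketch a reseed eventually happens once the existing passive entries are removed.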
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
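The `do { } while (0)` idiom named in revision 1.90 makes a multi-statement macro behave like a single statement, so it stays correct inside an unbraced `if`/`else`. The macro and function below are hypothetical examples, not taken from `tcp_var.h`.

```c
/*
 * Multi-statement macro wrapped in do { } while (0). The CONSTCOND
 * comment tells lint-style checkers that the constant condition is
 * intentional.
 */
#define SWAP_INT(a, b) do {					\
	int swap_tmp_ = (a);					\
	(a) = (b);						\
	(b) = swap_tmp_;					\
} while (/* CONSTCOND */ 0)

/*
 * Put *lo and *hi in ascending order. Returns 1 if they were swapped.
 * Without the do/while wrapper, the "else" below would not compile,
 * because the macro would expand to several statements after the "if".
 */
int
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);
	else
		return 0;
	return 1;
}
```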
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
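Revision 1.71 mentions checking `th_ack` against `snd_una`/`snd_max` and using `SEQ_MIN`/`SEQ_MAX`. A minimal sketch of wraparound-safe sequence comparison follows. The macro bodies are a common formulation and are an assumption here, not copied from the OpenBSD headers. `ack_in_window()` is a hypothetical helper.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/*
 * Compare 32-bit sequence numbers modulo 2^32: the signed difference
 * tells which one is "ahead", even across the wrap from 0xffffffff to 0.
 */
#define SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)
#define SEQ_GT(a, b)	((int32_t)((a) - (b)) > 0)
#define SEQ_MIN(a, b)	(SEQ_LT(a, b) ? (a) : (b))
#define SEQ_MAX(a, b)	(SEQ_GT(a, b) ? (a) : (b))

/* Accept an ACK only if snd_una <= ack <= snd_max in sequence space. */
int
ack_in_window(tcp_seq ack, tcp_seq snd_una, tcp_seq snd_max)
{
	return !SEQ_LT(ack, snd_una) && !SEQ_GT(ack, snd_max);
}
```

A plain `<` on the raw integers would misjudge an ACK that arrives just after the sequence space wraps. The signed-difference form avoids that as long as the two values are less than 2^31 apart.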
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
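The per-second limit introduced with `net.inet.tcp.rstppslimit` in revision 1.33 can be pictured as a small counter that resets every second. The sketch below is an assumption about the general technique, not the kernel's implementation. `struct ratelimit` and `rst_rate_ok()` are hypothetical names, and whether the real limit is inclusive is not stated in the log.

```c
/* Hypothetical per-second packet counter. */
struct ratelimit {
	long	window_start;	/* second in which counting started */
	int	count;		/* packets allowed so far in this second */
};

/*
 * Returns 1 if another packet may be sent during second "now_sec",
 * 0 if the limit of "maxpps" packets per second is already used up.
 * A negative maxpps disables the limit in this sketch.
 */
int
rst_rate_ok(struct ratelimit *rl, long now_sec, int maxpps)
{
	if (now_sec != rl->window_start) {
		/* A new second has begun: start counting from zero. */
		rl->window_start = now_sec;
		rl->count = 0;
	}
	if (maxpps >= 0 && rl->count >= maxpps)
		return 0;
	rl->count++;
	return 1;
}
```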
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
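Revision 1.8 turns the list of ports that must not be allocated dynamically into a bitmask. A minimal sketch of such a mask, one bit per port, is shown below. `struct portmask`, `port_set()`, `port_clr()` and `port_isset()` are hypothetical names, and covering all 65536 ports is an assumption of this example.

```c
#include <stdint.h>

#define NPORTS	65536

/* One bit per port number; a set bit means "do not allocate dynamically". */
struct portmask {
	uint32_t bits[NPORTS / 32];
};

void
port_set(struct portmask *m, uint16_t port)
{
	m->bits[port / 32] |= 1U << (port % 32);
}

void
port_clr(struct portmask *m, uint16_t port)
{
	m->bits[port / 32] &= ~(1U << (port % 32));
}

int
port_isset(const struct portmask *m, uint16_t port)
{
	return (m->bits[port / 32] >> (port % 32)) & 1U;
}
```

The whole table takes 8 KB and a lookup is a single shift and mask, which is cheap enough to run on every dynamic port allocation.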
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.146 |
|
22-Aug-2022 |
mvs |
Move PRU_DISCONNECT request to (*pru_disconnect).
ok bluhm@
|
#
1.145 |
|
22-Aug-2022 |
mvs |
Move PRU_ACCEPT request to (*pru_accept)().
ok bluhm@
|
#
1.144 |
|
21-Aug-2022 |
mvs |
Move PRU_CONNECT request to (*pru_connect)() handler.
ok bluhm@
|
#
1.143 |
|
21-Aug-2022 |
mvs |
Move PRU_LISTEN request to (*pru_listen)() handler.
ok bluhm@
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
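The NULL-handler convention described in revision 1.142 can be sketched as below. The types are deliberately simplified and hypothetical: `struct sock`, `struct usrreqs` and the `const char *` address argument stand in for the kernel's own types. Only the pattern, a wrapper that returns `EOPNOTSUPP` when the handler pointer is NULL, comes from the commit message.

```c
#include <errno.h>
#include <stddef.h>

struct sock;	/* opaque in this sketch */

/* Simplified table of per-protocol user request handlers. */
struct usrreqs {
	int	(*pru_bind)(struct sock *, const char *);
};

/* Wrapper: protocols without a bind handler simply leave it NULL. */
static inline int
pru_bind(const struct usrreqs *ur, struct sock *so, const char *addr)
{
	if (ur->pru_bind == NULL)
		return EOPNOTSUPP;
	return (*ur->pru_bind)(so, addr);
}

/* Example handler for a protocol that supports bind. */
static int
demo_bind(struct sock *so, const char *addr)
{
	(void)so;
	(void)addr;
	return 0;
}

static const struct usrreqs demo_usrreqs = { .pru_bind = demo_bind };
static const struct usrreqs nobind_usrreqs = { .pru_bind = NULL };
```

Keeping the check in the wrapper means each protocol only fills in the requests it supports, and callers never need to test the pointer themselves.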
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
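The sign-bit problem from revision 1.137 can be illustrated with a small example. The constant values below are assumptions chosen to reproduce the situation in the log (a flag shifted by a timer index until it reaches bit 31) and are not quoted from the header. `timer_flag()` is a hypothetical helper.

```c
/*
 * With a signed constant, 0x04000000 << 5 would shift a one into the
 * sign bit of an int, which is undefined behaviour in C. Declaring the
 * flag unsigned (U suffix) makes the shift well defined.
 */
#define TF_TIMER	0x04000000U	/* assumed value of the first timer flag */
#define TCPT_DELACK	5		/* assumed index of the delayed ACK timer */

/* Flag bit for a given timer index, computed in unsigned arithmetic. */
unsigned int
timer_flag(int timer)
{
	return TF_TIMER << timer;
}
```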
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
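The fix in revision 1.130, one flag per timer so that a handler can tell whether it was canceled while waiting for the lock, can be sketched like this. All names (`struct conn`, `timer_arm()`, `timer_cancel()`, `timer_run()`) are hypothetical, and locking is only indicated by a comment.

```c
#include <assert.h>

#define T_REXMT		0
#define T_KEEP		1
#define T_NTIMERS	2

#define TMR_FLAG(t)	(1U << (t))

struct conn {
	unsigned int	armed;			/* one bit per timer */
	int		fired[T_NTIMERS];	/* times each handler did work */
};

void
timer_arm(struct conn *c, int t)
{
	assert(t >= 0 && t < T_NTIMERS);
	c->armed |= TMR_FLAG(t);
	/* a real implementation would also schedule the timeout here */
}

void
timer_cancel(struct conn *c, int t)
{
	assert(t >= 0 && t < T_NTIMERS);
	c->armed &= ~TMR_FLAG(t);
	/* deleting the timeout alone is not enough if it already started */
}

/*
 * Timeout handler. It may have slept waiting for the lock, so the
 * timer could have been canceled in the meantime. Returns 1 if the
 * timer was still armed and its work was done, 0 otherwise.
 */
int
timer_run(struct conn *c, int t)
{
	assert(t >= 0 && t < T_NTIMERS);
	/* (lock would be acquired here) */
	if (!(c->armed & TMR_FLAG(t)))
		return 0;
	c->armed &= ~TMR_FLAG(t);
	c->fired[t]++;
	return 1;
}
```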
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
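The active/passive scheme described in revision 1.111 can be sketched roughly as follows. This is only an illustration under stated assumptions: the type and function names (`syn_set`, `syn_sets`, `syn_sets_insert`, and so on) are hypothetical and are not the kernel's, the real code also hashes and stores entries, and the caller is assumed to supply fresh random seed material.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not the kernel's syn cache structures. */
struct syn_set {
	uint32_t seed;      /* seed mixed into the bucket hash */
	int      count;     /* entries currently stored in this set */
	long     uses_left; /* insertions left before a reseed is wanted */
};

struct syn_sets {
	struct syn_set set[2];
	int            active;    /* index of the set receiving new entries */
	long           use_limit; /* the commit message uses 100000 */
	unsigned long  reseeds;   /* how often the roles were switched */
};

static void
syn_sets_init(struct syn_sets *s, long use_limit, uint32_t seed)
{
	for (int i = 0; i < 2; i++) {
		s->set[i].seed = seed;
		s->set[i].count = 0;
		s->set[i].uses_left = use_limit;
	}
	s->active = 0;
	s->use_limit = use_limit;
	s->reseeds = 0;
}

/*
 * Account for one new entry and return the index of the set it goes to.
 * Roles switch only when the active set has used up its budget AND the
 * passive set has drained, so a trickle of unanswered SYNs into the
 * active set can no longer postpone the reseed forever.
 */
static int
syn_sets_insert(struct syn_sets *s, uint32_t fresh_seed)
{
	struct syn_set *act = &s->set[s->active];
	struct syn_set *pas = &s->set[!s->active];

	if (act->uses_left <= 0 && pas->count == 0) {
		s->active = !s->active;
		pas->seed = fresh_seed;       /* reseed the empty set */
		pas->uses_left = s->use_limit;
		s->reseeds++;
		act = pas;                    /* it is the active set now */
	}
	act->count++;
	act->uses_left--;
	return s->active;
}

/* Drop one entry from the given set (timeout or completed handshake). */
static void
syn_sets_remove(struct syn_sets *s, int idx)
{
	assert(s->set[idx].count > 0);
	s->set[idx].count--;
}
```

With a small `use_limit`, the fourth insertion lands in the second set with a new seed, while later insertions stay there until the first set has drained completely.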
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
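The `do { } while (0)` idiom mentioned in revision 1.90 matters when a multi-statement macro is used as the body of an `if` that has an `else`. The commit does not name the macros it touched, so the macro below (`CLAMP_TO_RANGE`) is a hypothetical stand-in used only to show the shape:

```c
#include <assert.h>

/*
 * Hypothetical macro for illustration. Wrapped in do { } while (0) it
 * expands to exactly one statement that still needs its trailing ';'.
 */
#define CLAMP_TO_RANGE(v, lo, hi) do {			\
	if ((v) < (lo))					\
		(v) = (lo);				\
	if ((v) > (hi))					\
		(v) = (hi);				\
} while (/* CONSTCOND */ 0)

static int
clamp_if_enabled(int enabled, int v)
{
	/*
	 * With a bare { ... } macro body, the ';' after the macro call
	 * would end the if statement and orphan the else below, which
	 * is a compile error. The do/while form avoids that.
	 */
	if (enabled)
		CLAMP_TO_RANGE(v, 0, 100);
	else
		v = -1;
	return v;
}
```

The `/* CONSTCOND */` comment is a lint annotation marking the constant loop condition as intentional.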
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.142 |
|
20-Aug-2022 |
mvs |
Move PRU_BIND request to (*pru_bind)() handler.
For the protocols which don't support request, leave handler NULL. Do the NULL check within corresponding pru_() wrapper and return EOPNOTSUPP in such case. This will be done for all upcoming user request handlers.
ok bluhm@ guenther@
|
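The NULL-handler convention described in revision 1.142 can be illustrated with a small userland sketch. All names here (`usrreqs_sketch`, `do_bind`, `example_bind`) are hypothetical and the signatures are simplified; the kernel's `pr_usrreqs` structure and its `pru_bind()` wrapper take different arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sock_sketch;	/* opaque placeholder for a socket */

/* Hypothetical per-protocol handler table with a single request. */
struct usrreqs_sketch {
	int (*bind)(struct sock_sketch *so, const char *addr);
};

/*
 * Wrapper in the spirit of the commit message: a protocol that does
 * not support the request leaves the handler NULL, and the wrapper
 * turns that into EOPNOTSUPP instead of calling through the pointer.
 */
static int
do_bind(const struct usrreqs_sketch *ur, struct sock_sketch *so,
    const char *addr)
{
	if (ur->bind == NULL)
		return EOPNOTSUPP;
	return ur->bind(so, addr);
}

/* Toy handler: rejects a missing address, accepts anything else. */
static int
example_bind(struct sock_sketch *so, const char *addr)
{
	(void)so;
	return addr == NULL ? EINVAL : 0;
}

static const struct usrreqs_sketch with_bind = { .bind = example_bind };
static const struct usrreqs_sketch without_bind = { .bind = NULL };
```

Keeping the check in one wrapper means individual protocols do not each need a stub that returns an error.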
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
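The sign-bit problem fixed in revision 1.137 can be shown with a minimal sketch. The constants below are hypothetical values chosen so that the shift reaches bit 31; they are not the definitions from tcp_var.h, and a 32-bit `int` is assumed.

```c
#include <assert.h>

/*
 * Hypothetical flag layout. The U suffix makes the constant unsigned,
 * so shifting it into bit 31 is well defined. Written as plain
 * 0x08000000 (a signed int), the same shift by 4 would overflow int,
 * which is undefined behaviour and what a sanitizer reports.
 */
#define XF_TIMER	0x08000000U	/* flag for timer number 0 */
#define XT_DELACK	4		/* highest timer number */

/* Return the flag bit that belongs to the given timer number. */
static unsigned int
timer_flag(unsigned int timer)
{
	assert(timer <= XT_DELACK);
	return XF_TIMER << timer;
}
```

Because the flags word itself is unsigned, setting and clearing the top bit with `|=` and `&= ~` then behaves as expected.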
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient for testing the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
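The role switch described in this revision can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in tcp_input.c: the names ex_set, ex_cache, ex_pick_set and EX_USE_LIMIT are hypothetical, and the new seed is passed in by the caller so the sketch stays deterministic, whereas a kernel would draw it from its random source.

```c
#include <assert.h>
#include <stdint.h>

#define EX_USE_LIMIT	100000	/* uses of the active set before a swap */

/* Hypothetical, simplified stand-in for one SYN cache set. */
struct ex_set {
	int	 count;		/* entries currently stored */
	int	 use;		/* remaining uses before a swap is allowed */
	uint32_t seed;		/* hash seed for this set */
};

struct ex_cache {
	struct ex_set	set[2];
	int		active;	/* index of the set receiving new entries */
};

/*
 * Called before inserting a new entry. If the active set is used up
 * and the passive set has drained, the passive set gets a fresh seed
 * and becomes active. Returns the set to insert into.
 */
static struct ex_set *
ex_pick_set(struct ex_cache *c, uint32_t new_seed)
{
	struct ex_set *act = &c->set[c->active];
	struct ex_set *pas = &c->set[!c->active];

	if (act->use <= 0 && pas->count == 0) {
		pas->seed = new_seed;	/* reseed only an empty set */
		pas->use = EX_USE_LIMIT;
		c->active = !c->active;
		act = pas;
	}
	act->use--;
	act->count++;
	return act;
}
```

Because only an empty set is ever reseeded, existing entries stay reachable under the seed they were hashed with, while an attacker who keeps the old set non-empty can no longer block reseeding of the set that takes new entries.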
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
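For readers unfamiliar with the idiom named in this revision, the following is a minimal sketch of why a multi-statement macro is wrapped in do { } while (0). The macro SWAP_INT and the helper order_pair are hypothetical illustrations and are not taken from tcp_var.h.

```c
#include <assert.h>

/*
 * Hypothetical macro, not from tcp_var.h. Wrapping the body in
 * do { } while (0) makes the expansion a single statement, so it
 * is safe after an unbraced if and before an else.
 */
#define SWAP_INT(a, b) do {					\
	int swap_tmp_ = (a);					\
	(a) = (b);						\
	(b) = swap_tmp_;					\
} while (/* CONSTCOND */ 0)

/* Put the smaller value in *lo and the larger one in *hi. */
static void
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);	/* expands to one statement */
	else
		assert(*lo <= *hi);
}
```

Had the macro been written as a bare { } block, the semicolon after SWAP_INT(...) would terminate the if statement and the following else would fail to compile.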
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.141 |
|
15-Aug-2022 |
mvs |
Introduce 'pr_usrreqs' structure and move existing user-protocol handlers into it. We want to split existing (*pr_usrreq)() to multiple short handlers for each PRU_ request as it was already done for PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)() split will be done with the following diffs.
Based on reverted diff from guenther@.
ok bluhm@
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot of more special info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
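The sign-bit problem mentioned in this revision can be illustrated with a small sketch. The constants below are hypothetical values chosen for the example; the real TF_ and TCPT_ definitions live in tcp_var.h and tcp_timer.h and may differ.

```c
#include <assert.h>

/*
 * Hypothetical values chosen for illustration; the real constants
 * live in tcp_var.h and tcp_timer.h and may differ.
 */
#define EX_TF_TIMER	0x04000000U	/* base bit for per-timer flags */
#define EX_TCPT_LAST	5		/* highest timer index in this sketch */

/* Return the flag bit for timer number idx (0 .. EX_TCPT_LAST). */
static unsigned int
timer_flag(int idx)
{
	/*
	 * With a signed base (0x04000000), idx == 5 would shift into
	 * bit 31, which is undefined behavior for int. The U suffix
	 * keeps the whole expression in unsigned arithmetic.
	 */
	return EX_TF_TIMER << idx;
}
```

Since the flags are stored in a u_int field, defining every constant with the U suffix also avoids implicit signed-to-unsigned conversions when flags are combined or masked.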
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have fewer different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
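The role switch described in this entry can be sketched as follows. This is a simplified model under stated assumptions, not the code from tcp_input.c: the names syn_set, syn_cache, syn_cache_insert() and syn_cache_remove() are hypothetical, only the counters are modelled, and rand() stands in for a kernel random source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One of the two SYN cache sets; only the bookkeeping is modelled. */
struct syn_set {
	int	 count;	/* entries currently stored in this set */
	int	 use;	/* insertions left before a reseed is due */
	uint32_t seed;	/* seed of this set's hash function */
};

struct syn_cache {
	struct syn_set	set[2];
	int		active;		/* index of the set taking new entries */
	int		use_limit;	/* 100000 in the commit message */
	unsigned int	reseeds;	/* number of role switches so far */
};

static void
syn_set_reseed(struct syn_cache *sc, struct syn_set *s)
{
	s->seed = (uint32_t)rand();	/* stand-in for a kernel random source */
	s->use = sc->use_limit;
}

void
syn_cache_init(struct syn_cache *sc, int use_limit)
{
	assert(use_limit > 0);
	sc->active = 0;
	sc->use_limit = use_limit;
	sc->reseeds = 0;
	sc->set[0].count = sc->set[1].count = 0;
	syn_set_reseed(sc, &sc->set[0]);
	syn_set_reseed(sc, &sc->set[1]);
}

/* Account for one new SYN; switch roles first if that is allowed. */
void
syn_cache_insert(struct syn_cache *sc)
{
	struct syn_set *act = &sc->set[sc->active];
	struct syn_set *pas = &sc->set[!sc->active];

	/* Switch only when the active set is used up and the passive one drained. */
	if (act->use <= 0 && pas->count == 0) {
		sc->active = !sc->active;
		act = pas;		/* the empty set becomes active */
		syn_set_reseed(sc, act);
		sc->reseeds++;
	}
	act->use--;
	act->count++;
}

/* An entry of set idx completed its handshake or timed out. */
void
syn_cache_remove(struct syn_cache *sc, int idx)
{
	assert(sc->set[idx].count > 0);
	sc->set[idx].count--;
}
```

In this model an attacker who keeps the active set non-empty no longer blocks reseeding, because the set that gets reseeded is the passive one, which receives no new entries and therefore drains on its own.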
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
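The two quantities named in this entry, the delay * bandwidth product and the 80% pool threshold, can be illustrated with a little arithmetic. This is only a sketch: bdp_bytes() and pool_pressure() are hypothetical helpers, and the kernel's actual measurement and sizing policy is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product in bytes: rate (bytes/s) times RTT (ms). */
uint64_t
bdp_bytes(uint64_t bytes_per_sec, uint32_t rtt_ms)
{
	return bytes_per_sec * rtt_ms / 1000;
}

/* Nonzero once at least 80% of the cluster pool is in use. */
int
pool_pressure(uint64_t used, uint64_t total)
{
	assert(total > 0 && used <= total);
	return used * 100 >= total * 80;
}
```

For example, a link delivering 4400000 bytes per second with a 50 ms round trip keeps 220000 bytes in flight, so a receive buffer smaller than that would cap the throughput.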
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
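As a reminder of why the idiom matters, here is a minimal sketch. SWAP_INT and order_pair() are hypothetical and do not come from the header this commit touched. Wrapped in do { } while (0), a multi-statement macro expands to exactly one statement, so it can sit in an unbraced if/else and still take a trailing semicolon.

```c
#include <assert.h>

/* Hypothetical multi-statement macro, wrapped so it acts as one statement. */
#define SWAP_INT(a, b) do {			\
	int swap_tmp_ = (a);			\
	(a) = (b);				\
	(b) = swap_tmp_;			\
} while (/* CONSTCOND */ 0)

/* Put the smaller value in *lo and the larger one in *hi. */
void
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);	/* one statement, so the else still binds */
	else
		assert(*lo <= *hi);
}
```

With a plain { } block instead, the semicolon after the macro call would end the if statement and the else would no longer compile.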
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
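The RFC 1948 scheme computes the initial sequence number as ISN = M + F(connection id, secret), where M is a steadily advancing clock. A minimal sketch, under loud assumptions: struct conn_id and isn_rfc1948() are hypothetical names, addresses are IPv4 only, and the FNV-1a mixing step is NOT cryptographic. It merely stands in for the keyed hash a real implementation needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Connection identifier; IPv4 addresses only, for brevity. */
struct conn_id {
	uint32_t laddr, faddr;	/* local and foreign address */
	uint16_t lport, fport;	/* local and foreign port */
};

/* FNV-1a step; NOT cryptographic, a stand-in for a keyed hash. */
static uint32_t
mix(uint32_t h, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len-- > 0) {
		h ^= *p++;
		h *= 16777619U;
	}
	return h;
}

/* ISN = M + F(connection id, secret), M being the clock value. */
uint32_t
isn_rfc1948(uint32_t clock, const struct conn_id *id, uint32_t secret)
{
	uint32_t h = 2166136261U;

	assert(id != NULL);
	h = mix(h, &secret, sizeof(secret));
	h = mix(h, &id->laddr, sizeof(id->laddr));
	h = mix(h, &id->faddr, sizeof(id->faddr));
	h = mix(h, &id->lport, sizeof(id->lport));
	h = mix(h, &id->fport, sizeof(id->fport));
	return clock + h;	/* wraps modulo 2^32 like sequence numbers */
}
```

Because the hash part is constant for a given connection id, successive ISNs for that id advance with the clock, which is the monotonic client behaviour the entry mentions. A different id gets an unrelated offset.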
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
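A port list kept as a bitmask needs only one bit per port. The sketch below shows the general technique; struct port_mask and its helpers are hypothetical names, and the size of the kernel's actual mask and the names of its macros are not taken from this log.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_COUNT	65536U	/* one bit for every possible port */

struct port_mask {
	uint32_t bits[PORT_COUNT / 32];
};

/* Mark a port as not available for dynamic allocation. */
void
port_mask_set(struct port_mask *m, unsigned int port)
{
	assert(port < PORT_COUNT);
	m->bits[port / 32] |= (uint32_t)1 << (port % 32);
}

/* Make a port available again. */
void
port_mask_clr(struct port_mask *m, unsigned int port)
{
	assert(port < PORT_COUNT);
	m->bits[port / 32] &= ~((uint32_t)1 << (port % 32));
}

/* Return 1 if the port is marked, 0 otherwise. */
int
port_mask_isset(const struct port_mask *m, unsigned int port)
{
	assert(port < PORT_COUNT);
	return (int)((m->bits[port / 32] >> (port % 32)) & 1);
}
```

A dynamic port allocator would then skip every candidate for which port_mask_isset() returns 1. The whole table for 65536 ports takes 8 KB.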
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.140 |
|
11-Aug-2022 |
claudio |
Add TCP_INFO support to getsockopt for tcp sessions.
TCP_INFO provides a lot of information about the TCP session of this socket. Many processes like to peek at the rtt of a connection but this also provides a lot more specialized info for use by e.g. tcpbench(1). While the basic minimal info is available all the time the more specific data is only populated for privileged processes. This is done to not share data back to userland that may allow to attack a session. TCP_INFO is available to pledge "inet" since pledged processes like chrome tend to use TCP_INFO when available. OK bluhm@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
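The sign bit problem can be shown with a small sketch. The two constants below are assumed values chosen for illustration, so check tcp_var.h for the real ones, and timer_flag() is a hypothetical helper. With a signed constant, 0x04000000 << 5 yields a value that does not fit in a 32-bit int, which is undefined behaviour in C. With the U suffix the same shift is well defined.

```c
#include <assert.h>

/* Assumed values for illustration; see tcp_var.h for the real ones. */
#define TCPT_DELACK	5		/* index of the delayed ACK timer */
#define TF_TIMER	0x04000000U	/* flag of timer 0, unsigned */

/* Return the t_flags bit that belongs to the given timer index. */
unsigned int
timer_flag(unsigned int timer)
{
	assert(timer <= TCPT_DELACK);
	/* 0x04000000U << 5 == 0x80000000U: fine for unsigned int. */
	return TF_TIMER << timer;
}
```

Since t_flags is already a u_int, making the constants unsigned keeps every flag expression in one type and leaves nothing for a sanitizer such as kubsan to report.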
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
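The fix described here, one flag per timer that the handler checks after it has obtained the lock, can be sketched as below. This is a single-threaded model with hypothetical names (struct conn, timer_arm(), timer_cancel(), timer_expire()). The real code runs under the netlock and uses the kernel timeout API.

```c
#include <assert.h>

#define NTIMERS	6	/* hypothetical number of per-connection timers */

struct conn {
	unsigned int armed;	/* one bit per timer that should still fire */
	int	     fired;	/* how many handlers really did their work */
};

void
timer_arm(struct conn *c, unsigned int t)
{
	assert(t < NTIMERS);
	c->armed |= 1U << t;
}

void
timer_cancel(struct conn *c, unsigned int t)
{
	assert(t < NTIMERS);
	c->armed &= ~(1U << t);
}

/*
 * Timeout handler body. A real handler may sleep for the lock first,
 * so the timer can have been canceled in the meantime.
 */
int
timer_expire(struct conn *c, unsigned int t)
{
	assert(t < NTIMERS);
	if ((c->armed & (1U << t)) == 0)
		return 0;	/* canceled while waiting: do nothing */
	c->armed &= ~(1U << t);
	c->fired++;
	return 1;
}
```

Deleting the pending timeout is still done, but it is no longer what correctness depends on: a handler that was already running when the cancel happened sees the cleared bit and returns without touching the connection.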
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows simplifying code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.139 |
|
25-Feb-2022 |
guenther |
Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com Revert the pr_usrreqs move: syzkaller found a NULL pointer deref and I won't be available to monitor for followup issues for a bit
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows simplifying code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
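Illustrative note (not part of the commit): the idiom makes a multi-statement macro expand to exactly one statement, so it stays safe inside an unbraced if/else. A minimal sketch, where `BUMP_BOTH` and `bump_demo()` are hypothetical names and not macros from the tree:

```c
/* Hypothetical two-statement macro wrapped in the do/while(0) idiom. */
#define BUMP_BOTH(a, b) do {		\
	(a)++;				\
	(b)++;				\
} while (/* CONSTCOND */ 0)

static int bump_demo(int cond)
{
	int x = 0, y = 0;

	if (cond)
		BUMP_BOTH(x, y);  /* one statement; the trailing ';' ends it */
	else
		x = -1;           /* the else still pairs with the if above */
	return x + y;
}
```

With a bare `{ ... }` body instead, the semicolon after the macro call would terminate the if statement and leave the else dangling as a syntax error.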
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.138 |
|
25-Feb-2022 |
guenther |
Move pr_attach and pr_detach to a new structure pr_usrreqs that can then be shared among protosw structures, following the same basic direction as NetBSD and FreeBSD for this.
Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the proper prototype to eliminate the previously necessary casts.
ok mvs@ bluhm@
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
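Illustrative note (not part of the commit): the sign-bit problem can be sketched as follows. The constants are assumptions chosen for illustration (a base flag of 0x04000000 and a timer index of 5) and should be checked against the header; `TF_TIMER_U`, `DELACK_INDEX` and `delack_timer_flag()` are hypothetical names.

```c
#include <stdint.h>

/* Assumed values, for illustration only; verify against tcp_var.h. */
#define TF_TIMER_U   0x04000000U  /* unsigned base flag for timer bits */
#define DELACK_INDEX 5            /* assumed index of the delayed-ACK timer */

/*
 * With an unsigned constant the shift is well defined and yields bit 31.
 * With a plain int constant (0x04000000 << 5) the result would not fit in
 * a signed int, which is undefined behaviour in C and is what a sanitizer
 * such as kubsan reports.
 */
static uint32_t delack_timer_flag(void)
{
	return TF_TIMER_U << DELACK_INDEX;
}
```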
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
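Illustrative note (not part of the commit): the per-timer flag can be sketched as below. This is a simplified user-space model under stated assumptions; `struct conn`, `timer_arm()`, `timer_cancel()` and `timer_run()` are hypothetical names, and locking is left out.

```c
#include <stdbool.h>

#define NTIMERS 4  /* arbitrary count for this sketch */

/* Hypothetical per-connection state: one "armed" bit per timer. */
struct conn {
	unsigned int armed;
	int fired[NTIMERS];
};

static void timer_arm(struct conn *c, int t)    { c->armed |= 1U << t; }
static void timer_cancel(struct conn *c, int t) { c->armed &= ~(1U << t); }

/*
 * Timer handler body. The handler may already be running (for example
 * sleeping on a lock) when the timer is canceled, so deleting the timeout
 * alone is not enough; the flag tells a late handler to do nothing.
 */
static bool timer_run(struct conn *c, int t)
{
	if (!(c->armed & (1U << t)))
		return false;       /* canceled while we were waiting */
	c->armed &= ~(1U << t);     /* one-shot: disarm before doing work */
	c->fired[t]++;
	return true;
}
```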
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.137 |
|
23-Jan-2022 |
bluhm |
Define all TCP TF_ flags as unsigned numbers. They are stored in u_int t_flags. Shifting TF_TIMER with TCPT_DELACK can touch the sign bit. found by kubsan; suggested by deraadt@; OK miod@
|
Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timer. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
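The idiom named in this entry can be illustrated with a minimal sketch. The macro COUNT_AND_MARK and the function demo_count are hypothetical names made up for this example and are not taken from the header. The point is only that a multi-statement macro wrapped in do { } while (0) expands to a single statement, so it stays correct inside an unbraced if/else.

```c
#include <assert.h>

/*
 * Hypothetical two-statement macro (not from the OpenBSD sources).
 * The do { } while (0) wrapper turns the body into exactly one
 * statement that is terminated by the caller's semicolon.
 */
#define COUNT_AND_MARK(cnt, flag) do {	\
	(cnt)++;				\
	(flag) = 1;				\
} while (/* CONSTCOND */ 0)

/* Invoke the macro in an unbraced if/else and return the counter. */
static int
demo_count(int enabled)
{
	int cnt = 0, flag = 0;

	if (enabled)
		COUNT_AND_MARK(cnt, flag);	/* one statement, so the else below still pairs with this if */
	else
		cnt = -1;

	/* The flag is set only when the macro body actually ran. */
	assert(flag == (enabled ? 1 : 0));
	return cnt;
}
```

Without the wrapper, the second statement of the macro would fall outside the if and the else would no longer compile. The CONSTCOND comment is a lint hint that the constant condition is intentional.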
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and doing the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.136 |
|
28-Jan-2021 |
visa |
Drop tcp_trace() from SMALL_KERNEL builds to make room on amd64 floppy
OK deraadt@
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok deraadt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows simplifying code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow adjusting tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent with the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
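The role switch described in this entry can be sketched roughly as follows. This is a simplified illustration and not the kernel code: the names syn_set, syn_sets, sets_init, syn_insert and syn_remove are assumptions made up for the example, the use limit is a parameter instead of the fixed 100000, and the caller supplies the fresh random seed.

```c
#include <assert.h>
#include <stdint.h>

/* One of the two SYN cache sets (hypothetical layout). */
struct syn_set {
	int      count;		/* entries currently stored */
	int      uses_left;	/* insertions left before a swap is wanted */
	uint32_t seed;		/* seed for the hash function */
};

struct syn_sets {
	struct syn_set set[2];
	int active;		/* index of the set that gets new entries */
	int use_limit;		/* 100000 in the commit message */
};

static void
sets_init(struct syn_sets *s, int use_limit, uint32_t seed)
{
	s->set[0] = (struct syn_set){ 0, use_limit, seed };
	s->set[1] = (struct syn_set){ 0, 0, 0 };
	s->active = 0;
	s->use_limit = use_limit;
}

/* Add one entry; returns the index of the set that received it. */
static int
syn_insert(struct syn_sets *s, uint32_t new_seed)
{
	struct syn_set *act = &s->set[s->active];
	struct syn_set *pas = &s->set[!s->active];

	/* Swap only when the active set is used up AND the passive one drained. */
	if (act->uses_left <= 0 && pas->count == 0) {
		pas->seed = new_seed;		/* reseed the hash function */
		pas->uses_left = s->use_limit;
		s->active = !s->active;
		act = pas;
	}
	act->count++;
	act->uses_left--;
	return s->active;
}

/* Drop one entry from the given set, e.g. on timeout or completion. */
static void
syn_remove(struct syn_sets *s, int idx)
{
	assert(idx == 0 || idx == 1);
	if (s->set[idx].count > 0)
		s->set[idx].count--;
}
```

In this sketch an attacker who keeps the active set non-empty can no longer block reseeding, because the swap depends only on the passive set draining, and entries in the passive set are never refreshed by new SYNs.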
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) interface in addition to enc0 is required per rdomain, e.g. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to an alternate routing table and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and doing the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
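As a rough illustration of what RFC 3390 specifies, the upper bound on the initial window is min(4*MSS, max(2*MSS, 4380 bytes)). The helper below is a minimal sketch of that formula only; the function name is hypothetical and it is not the code committed in this revision.

```c
/*
 * Upper bound on the initial congestion window in bytes, following the
 * formula in RFC 3390: min(4 * MSS, max(2 * MSS, 4380)).
 * Hypothetical helper for illustration, not the kernel implementation.
 */
static unsigned long
rfc3390_initial_window(unsigned long mss)
{
	unsigned long two_mss = 2 * mss;
	unsigned long four_mss = 4 * mss;
	unsigned long win = two_mss > 4380 ? two_mss : 4380;

	return four_mss < win ? four_mss : win;
}
```

For a typical Ethernet MSS of 1460 bytes this gives 4380 bytes, i.e. three full segments.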
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.135 |
|
18-Aug-2020 |
gnezdo |
Convert tcp_sysctl to sysctl_bounded_args
This introduces bounds checks for many net.inet.tcp sysctl variables. Folded some fitting cases into the framework: tcp_do_sack, tcp_do_ecn.
ok derradt@
|
Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
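The per-timer flag described above can be sketched as follows. This is a simplified, single-threaded illustration; `struct conn`, `timer_arm()`, `timer_cancel()` and `timer_run()` are hypothetical names, not the identifiers used in the kernel.

```c
#define TIMER_REXMT	0
#define TIMER_NTIMERS	4

struct conn {			/* hypothetical stand-in for a tcpcb */
	unsigned int armed;	/* one bit per timer */
	int fired[TIMER_NTIMERS];
};

static void
timer_arm(struct conn *c, int t)
{
	c->armed |= 1U << t;
}

static void
timer_cancel(struct conn *c, int t)
{
	c->armed &= ~(1U << t);
}

/*
 * Body of a timeout handler, entered after it may have slept waiting
 * for the lock.  Returns 1 if the timer action ran, 0 if the timer had
 * been canceled in the meantime.
 */
static int
timer_run(struct conn *c, int t)
{
	if ((c->armed & (1U << t)) == 0)
		return 0;	/* canceled while we were waiting */
	c->armed &= ~(1U << t);
	c->fired[t]++;
	return 1;
}
```

The point of the flag is that deleting the timeout alone cannot stop a handler that has already started and is sleeping on the lock; the handler has to re-check its own state once it holds the lock.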
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
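The active/passive rotation described above might look roughly like the sketch below. All names (`struct syn_set`, `syn_sets_maybe_swap()`, `SYN_USE_LIMIT`) are hypothetical, the structures are heavily simplified, and the new seed is passed in by the caller rather than drawn from a random source.

```c
#include <stdint.h>

#define SYN_USE_LIMIT	100000	/* uses before the active set may retire */

struct syn_set {		/* hypothetical, much simplified */
	int count;		/* entries currently stored in this set */
	int use;		/* insertions left before rotation */
	uint32_t seed;		/* hash seed used by this set */
};

struct syn_sets {
	struct syn_set set[2];
	int active;		/* index of the set receiving new entries */
};

/*
 * Swap roles once the active set has used up its budget and the
 * passive set has drained.  Returns 1 if a swap happened.
 */
static int
syn_sets_maybe_swap(struct syn_sets *s, uint32_t new_seed)
{
	struct syn_set *act = &s->set[s->active];
	struct syn_set *pas = &s->set[!s->active];

	if (act->use > 0 || pas->count > 0)
		return 0;
	pas->seed = new_seed;	/* reseed the empty set before reuse */
	pas->use = SYN_USE_LIMIT;
	s->active = !s->active;
	return 1;
}

/* Account for one new entry in whichever set is active. */
static void
syn_sets_insert(struct syn_sets *s, uint32_t new_seed)
{
	struct syn_set *act;

	syn_sets_maybe_swap(s, new_seed);
	act = &s->set[s->active];
	act->count++;
	if (act->use > 0)
		act->use--;
}
```

Because only an empty set is ever reseeded, existing entries never need rehashing, and an attacker who keeps the active set non-empty can no longer block the reseed.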
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
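For readers unfamiliar with the idiom, a multi-statement macro wrapped in do { } while (0) behaves like a single statement, so it stays correct inside an unbraced if/else. The macro and function below are hypothetical examples, not macros from this header.

```c
/* Hypothetical macro for illustration only. */
#define SWAP_INT(a, b) do {						\
	int tmp_ = (a);							\
	(a) = (b);							\
	(b) = tmp_;							\
} while (/* CONSTCOND */ 0)

/*
 * Order the pair so that *x >= *y.  Returns 1 if a swap was done.
 * Without the do/while wrapper, the ';' after the macro call would
 * end the if statement and the else would no longer compile.
 */
static int
max_first(int *x, int *y)
{
	if (*x < *y)
		SWAP_INT(*x, *y);
	else
		return 0;
	return 1;
}
```

The /* CONSTCOND */ comment is a lint annotation marking the constant condition as intentional.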
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); since the normal ones now take a proc argument, there's no need for these, as they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and doing the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it is no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.134 |
|
12-Jul-2019 |
bluhm |
Count the number of TCP SACK options that were dropped due to the sack hole list length or pool limit. OK claudio@
|
Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.133 |
|
11-Jun-2018 |
bluhm |
The output from tcp debug sockets was incomplete. After detach tp was NULL and nothing was traced. So save the old tcpcb and use that to retrieve some information. Note that otb may be freed and must not be dereferenced. Use a heuristic for cases where the address family is in the IP header but not provided in the PCB. OK visa@
|
#
1.132 |
|
08-May-2018 |
bluhm |
Historically there were slow and fast tcp timeouts. That is why the delack timer had a different implementation. Use the same mechanism for all TCP timers. OK mpi@ visa@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it is no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.131 |
|
07-Feb-2018 |
bluhm |
Historically TCP timeouts were implemented with pr_slowtimo and pr_fasttimo. That is the reason why we have two timeout mechanisms with complicated ticks calculation. Move the delay ACK timeout to milliseconds and remove some ticks and hz mess from the others. This makes it easier to see the actual values. OK florian@ dhill@ dlg@
|
#
1.130 |
|
06-Feb-2018 |
bluhm |
There was a race in the TCP timers. As they may sleep to grab the netlock, timers may still run after they have been disarmed. Deleting the timeout is not sufficient to cancel them, but the code from 4.4 BSD is assuming this. The solution is to add a flag for every timer to see whether it has been armed or canceled. Remove the TF_DEAD check as tcp_canceltimers() is called before the reaper timer is fired. Cancelation works reliably now. OK mpi@
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that require a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Test by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.129 |
|
23-Jan-2018 |
bluhm |
The TCP reaper timeout was still implemented as soft timeout. So it could run immediately and was not synchronized with the TCP timeouts, although that was the intention when it was introduced in revision 1.85. Convert the reaper to an ordinary TCP timeout so it is scheduled on the same timeout thread after all timeouts have finished. A net lock is not necessary as the process calling tcp_close() will not access the tcpcb after arming the reaper timeout. OK mikeb@
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipvshit, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declaration from sources files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows to write relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similar to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows to run isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pickup the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to lookup the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows to bind interfaces to alternate routing table and separate them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts timeout and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd: 1. set rcv_laststart/rcv_lastend after checking the tcp window 2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state) ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@ - limit pool to tcp_sackhole_limit entries (sysctl-able) - stop sack option processing on pool_get errors - use SEQ_MIN/SEQ_MAX ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers 2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2. 3. add missing shifts when t_srtt/t_rttvar are used. 4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits 5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE 6. make sure rttmin is not > TCPTV_REXMTMAX parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations. - Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead) - Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored. - Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input 1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390) 2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist() with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of - ECN support in TCP - tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly - fix a 4.4bsd-lite2 bug, when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(), the fix is adapted from FreeBSD, the effective mss is recorded after option negotiation in 3way handshake. [I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|
#
1.128 |
|
02-Nov-2017 |
florian |
Move PRU_DETACH out of pr_usrreq into per proto pr_detach functions to pave way for more fine grained locking.
Suggested by, comments & OK mpi
|
#
1.127 |
|
25-Oct-2017 |
job |
Remove the TCP_FACK option and associated #if{,n}def code.
TCP_FACK was disabled by provos@ in June 1999. TCP_FACK is an algorithm that decides that when something is lost, all not SACKed packets until the most forward SACK are lost. It may be a correct estimate, if network does not reorder packets.
OK visa@ mpi@ mikeb@
|
#
1.126 |
|
24-Oct-2017 |
mikeb |
Refactor handling of partial TCP acknowledgements
With input from Klemens Nanni, OK visa, mpi, bluhm
|
#
1.125 |
|
22-Oct-2017 |
mikeb |
Unconditionally enable TCP selective acknowledgements (SACK)
OK deraadt, mpi, visa, job
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.124 |
|
14-Apr-2017 |
bluhm |
Pass down the address family through the pr_input calls. This allows to simplify code used for both IPv4 and IPv6. OK mikeb@ deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.123 |
|
13-Mar-2017 |
claudio |
Move PRU_ATTACH out of the pr_usrreq functions into pr_attach. Attach is quite a different thing to the other PRU functions and this should make locking a bit simpler. This also removes the ugly hack on how proto was passed to the attach function. OK bluhm@ and mpi@ on a previous version
|
#
1.122 |
|
09-Feb-2017 |
jca |
percpu counters for TCP stats
ok mpi@ bluhm@
|
#
1.121 |
|
01-Feb-2017 |
dhill |
In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with the netlock held. This also changes the prototypes of the *ctloutput functions to take an mbuf instead of an mbuf pointer.
help, guidance from bluhm@ and mpi@ ok bluhm@
|
#
1.120 |
|
29-Jan-2017 |
bluhm |
Change the IPv4 pr_input function to the way IPv6 is implemented, to get rid of struct ip6protosw and some wrapper functions. It is more consistent to have less different structures. The divert_input functions cannot be called anyway, so remove them. OK visa@ mpi@
|
#
1.119 |
|
26-Jan-2017 |
bluhm |
Reduce the difference between struct protosw and ip6protosw. The IPv4 pr_ctlinput functions did return a void pointer that was always NULL and never used. Make all functions void like in the IPv6 case. OK mpi@
|
#
1.118 |
|
25-Jan-2017 |
bluhm |
Since raw_input() and route_input() are gone from pr_input, we can make the variable parameters of the protocol input functions fixed. Also add the proto to make it similar to IPv6. OK mpi@ guenther@ millert@
|
#
1.117 |
|
16-Nov-2016 |
mpi |
Kill recursive splsoftnet()s.
While here keep local definitions local.
ok bluhm@
|
#
1.116 |
|
04-Oct-2016 |
mpi |
Convert timeouts that need a process context to timeout_set_proc(9).
The current reason is that rtalloc_mpath(9) inside ip_output() might end up inserting a RTF_CLONED route and that requires a write lock.
ok kettenis@, bluhm@
|
Revision tags: OPENBSD_6_0_BASE
|
#
1.115 |
|
20-Jul-2016 |
bluhm |
To tune the TCP SYN cache we need more information. Print the relevant counters with netstat -s -p tcp. OK henning@
|
#
1.114 |
|
20-Jul-2016 |
bluhm |
Make the size for the syn cache hash array tunable. As we are swapping between two syn caches for random reseeding anyway, this feature can be added easily. When the cache is empty, there is an opportunity to change the hash size. This allows an admin under SYN flood attack to defend his machine. Suggested by claudio@; OK jung@ claudio@ jmc@
|
#
1.113 |
|
18-Jun-2016 |
vgross |
Add net.inet.{tcp,udp}.rootonly sysctl, to mark which ports cannot be bound to by non-root users.
Ok millert@ bluhm@
|
#
1.112 |
|
29-Mar-2016 |
bluhm |
Allow to adjust tcp_syn_use_limit with sysctl net.inet.tcp.synuselimit. This is convenient to test the feature and may be useful to defend against syn flooding in a denial of service condition. It is consistent to the existing syn cache sysctls. Move some declarations to tcp_var.h to access the syn cache sets from tcp_sysctl(). OK mpi@
|
#
1.111 |
|
27-Mar-2016 |
bluhm |
To prevent attacks on the hash buckets of the syn cache, our TCP stack reseeds the hash function every time the cache is empty. Unfortunately the attacker can prevent the reseeding by sending unanswered SYN packets periodically. Fix this by having an active syn cache that gets new entries and a passive one that is idling out. When the passive one is empty and the active one has been used 100000 times, they switch roles and the hash function is reseeded with new random. tedu@ agrees; OK mpi@
|
#
1.110 |
|
21-Mar-2016 |
bluhm |
Add a tcps_sc_seedrandom counter in TCP SYN cache and netstat -s. This shows how often the hash function is reseeded and the random bucket distribution changes. OK mpi@ claudio@
|
Revision tags: OPENBSD_5_9_BASE
|
#
1.109 |
|
27-Aug-2015 |
bluhm |
The syn cache is completely implemented in tcp_input.c. So all its global variables should also live there. OK markus@
|
#
1.108 |
|
24-Aug-2015 |
bluhm |
Rename the syn cache counter into tcp_syn_cache_count to have the same prefix for all variables. Convert the counter type to int, the limit is also int. Before searching the cache, check that it is not empty. Do not access the counter outside of the syn cache from tcp_ctlinput(), let the syn_cache_lookup() function handle it. OK dlg@
|
Revision tags: OPENBSD_5_7_BASE OPENBSD_5_8_BASE
|
#
1.107 |
|
08-Feb-2015 |
yasuoka |
Count dropped SYN packets on the tcpstat. They are dropped due to the listen queue (backlog) limit or the memory shortage in syn-cache.
ok henning reyk claudio
|
#
1.106 |
|
21-Jan-2015 |
deraadt |
To satisfy kernel grovellers and bad (but documented) sysctl practice, be pragmatic and #include <sys/timeout.h> for struct tcpcb (glorious namespace violation) ok kettenis millert sthen
|
Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
|
#
1.105 |
|
23-Jan-2014 |
henning |
since the cksum rewrite the counters for hardware checksummed packets are a lie, since the software engine emulates hardware offloading and that is later indistinguishable. so kill the hw cksummed counters. introduce software checksummed packet counters instead. tcp/udp handles ip & ipv6, ip cksum covered, 6 has no ip layer cksum. as before we still have a miscounting bug for inbound with pf on, to be fixed in the next step. found by, prodding & ok naddy
|
#
1.104 |
|
23-Oct-2013 |
deraadt |
remove historical #if 1
|
#
1.103 |
|
21-Oct-2013 |
phessler |
Sprinkle a lot more IPv6 routing domains support in the kernel.
Mostly mechanical, setting and passing the rdomain and rtable correctly. Not yet enabled.
Lots of help and hints from claudio and bluhm
OK claudio@, bluhm@
|
#
1.102 |
|
12-Aug-2013 |
bluhm |
Add the TCP socket option TCP_NOPUSH to delay sending the stream. This is useful to aggregate data in the kernel from multiple sources like writes and socket splicing. It avoids sending small packets. From FreeBSD via David Hill; OK mikeb@ henning@
|
Revision tags: OPENBSD_5_4_BASE
|
#
1.101 |
|
01-Jun-2013 |
bluhm |
Pass the routing domain to IPv6 pr_ctlinput() like in IPv4. OK claudio@
|
#
1.100 |
|
10-Apr-2013 |
mpi |
Remove various external variable declarations from source files and move them to the corresponding header with an appropriate comment if necessary.
ok guenther@
|
Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE OPENBSD_5_3_BASE
|
#
1.99 |
|
06-Jul-2011 |
sthen |
Add sysctl net.inet.tcp.always_keepalive, when this is set the system behaves as if SO_KEEPALIVE was set on all TCP sockets, forcing keepalives to be sent every net.inet.tcp.keepidle half-seconds.
In conjunction with a keepidle value greatly reduced from the default, this can be useful for keeping sessions open if you are stuck on a network with short NAT or firewall timeouts.
Feedback from various people, ok henning@ claudio@
|
Revision tags: OPENBSD_4_9_BASE
|
#
1.98 |
|
07-Jan-2011 |
bluhm |
Add socket option SO_SPLICE to splice together two TCP sockets. The data received on the source socket will automatically be sent on the drain socket. This allows writing relay daemons with zero data copy. ok markus@
|
#
1.97 |
|
21-Oct-2010 |
bluhm |
There is no TCP6 in our kernel, so remove the #ifndef TCP6. No binary change. ok claudio@ henning@
|
#
1.96 |
|
24-Sep-2010 |
claudio |
TCP send and recv buffer scaling. Send buffer is scaled by not accounting unacknowledged on the wire data against the buffer limit. Receive buffer scaling is done similarly to FreeBSD -- measure the delay * bandwidth product and base the buffer on that. The problem is that our RTT measurement is coarse so it overshoots on low delay links. This does not matter that much since the recvbuffer is almost always empty. Add a back pressure mechanism to control the amount of memory assigned to socketbuffers that kicks in when 80% of the cluster pool is used. Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.
Based on work by markus@ and djm@.
OK dlg@, henning@, put it in deraadt@
|
Revision tags: OPENBSD_4_8_BASE
|
#
1.95 |
|
09-Jul-2010 |
reyk |
Add support for using IPsec in multiple rdomains.
This allows running isakmpd/iked/ipsecctl in multiple rdomains independently (with "route exec"); the kernel will pick up the rdomain from the process context of the pfkey socket and load the flows and SAs into the matching rdomain encap routing table. The network stack also needs to pass the rdomain to the ipsec stack to look up the correct rdomain that belongs to an interface/mbuf/... You can now run individual IPsec configs per rdomain or create IPsec VPNs between multiple rdomains on the same machine ;). Note that a primary enc(4) interface in addition to enc0 is required per rdomain, eg. enc1 rdomain 1.
Tested by some people, mostly on existing "rdomain 0" setups. Was in snaps for some days and people didn't complain.
ok claudio@ naddy@
|
#
1.94 |
|
03-Jul-2010 |
guenther |
Fix the naming of interfaces and variables for rdomains and rtables and make it possible to bind sockets (including listening sockets!) to rtables and not just rdomains. This changes the name of the system calls, socket option, and ioctl. After building with this you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.
Since this removes the existing [gs]etrdomain() system calls, the libc major is bumped.
Written by claudio@, criticized^Wcritiqued by me
|
Revision tags: OPENBSD_4_7_BASE
|
#
1.93 |
|
13-Nov-2009 |
claudio |
Extend the protosw pr_ctlinput function to include the rdomain. This is needed so that the route and inp lookups done in TCP and UDP know where to look. Additionally in_pcbnotifyall() and tcp_respond() got a rdomain argument as well for similar reasons. With this tcp seems to be now fully rdomain safe and no longer leaks single packets into the main domain. Looks good markus@, henning@
|
#
1.92 |
|
10-Aug-2009 |
claudio |
sockets created via a listening socket lose the rdomain and fail to work therefore. Inherit the rdomain through the syncache. There are some interactions that need some more work (ctlinput) so this can be improved but is good enough for now. OK markus@
|
Revision tags: OPENBSD_4_6_BASE
|
#
1.91 |
|
05-Jun-2009 |
claudio |
Initial support for routing domains. This allows binding interfaces to alternate routing tables and separating them from other interfaces in distinct routing tables. The same network can now be used in any domain at the same time without causing conflicts. This diff is mostly mechanical and adds the necessary rdomain checks across net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing. input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@
|
Revision tags: OPENBSD_4_5_BASE
|
#
1.90 |
|
08-Nov-2008 |
dlg |
fix macros up so they use the do { } while (/* CONSTCOND */ 0) idiom
ok deraadt@ otto@
|
Revision tags: OPENBSD_4_4_BASE
|
#
1.89 |
|
24-May-2008 |
thib |
Remove {tcp/udp}6_usrreq(); Since the normal ones now take a proc argument, there's no need for these, since they are just wrappers.
OK claudio@
|
#
1.88 |
|
23-May-2008 |
thib |
Deal with the situation when TCP nfs mounts time out and processes get hung in nfs_reconnect() because they do not have the proper privileges to bind to a socket, by adding a struct proc * argument to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind) and do the sobind() with proc0 in nfs_connect.
OK markus@, blambert@. "go ahead" deraadt@.
Fixes an issue reported by bernd@ (Tested by bernd@). Fixes PR5135 too.
|
#
1.87 |
|
06-May-2008 |
markus |
remove tcp_drain code since it's no longer used; ok henning, feedback thib
|
Revision tags: OPENBSD_4_3_BASE
|
#
1.86 |
|
20-Feb-2008 |
markus |
remove old unused TCP isn code; ok henning, dhartmei, mcbride
|
#
1.85 |
|
20-Feb-2008 |
markus |
when creating a response, use the correct TCP header instead of relying on the mbuf chain layout; with claudio@ and krw@; ok henning@
|
#
1.84 |
|
13-Dec-2007 |
reyk |
implement sysctls to report IP, TCP, UDP, and ICMP statistics and change netstat to use them instead of accessing kvm for it. more protocols will be added later.
discussed with deraadt@ claudio@ gilles@ ok deraadt@
|
Revision tags: OPENBSD_4_2_BASE
|
#
1.83 |
|
25-Jun-2007 |
markus |
branches: 1.83.2; merge tcp_set_iss() and tcp_set_tsm(); ok mcbride, djm (on earlier version)
|
#
1.82 |
|
15-Jun-2007 |
markus |
Drop the current random timestamps and the current ISN generation code and replace both with a RFC1948 based method, so TCP clients now have monotonic ISN/timestamps. The server side uses completely random ISN/timestamps and does time-wait recycling (on port reuse). ok djm@, mcbride@; thanks to lots of testers
|
Revision tags: OPENBSD_4_1_BASE
|
#
1.81 |
|
01-Feb-2007 |
jmc |
branches: 1.81.2; correct rfc; from Kris Katterjohn
|
Revision tags: OPENBSD_3_9_BASE OPENBSD_4_0_BASE
|
#
1.80 |
|
11-Dec-2005 |
deraadt |
bitfields must be off an int or such type
|
#
1.79 |
|
20-Nov-2005 |
brad |
splimp -> splvm. mbuf allocation here.
ok henning@
|
#
1.78 |
|
15-Nov-2005 |
miod |
Only two `h' in threshold.
|
Revision tags: OPENBSD_3_8_BASE
|
#
1.77 |
|
02-Aug-2005 |
markus |
change the TCP reass queue from LIST to TAILQ; ok henning claudio fgsch krw
|
#
1.76 |
|
04-Jul-2005 |
markus |
remove TUBA, ok many
|
#
1.75 |
|
30-Jun-2005 |
markus |
implement PMTU checks from http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html i.e. don't act on ICMP-need-frag immediately if adhoc checks on the advertised mtu fail. the mtu update is delayed until a tcp retransmit happens. initial patch by Fernando Gont, tested by many.
|
#
1.74 |
|
24-May-2005 |
fgont |
Ignore ICMP Source Quench messages meant for TCP connections. (Details in http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html) ok markus frantzen
|
#
1.73 |
|
05-Apr-2005 |
markus |
add tcp sack stats, similar to freebsd; ok deraadt
|
Revision tags: OPENBSD_3_7_BASE
|
#
1.72 |
|
09-Mar-2005 |
markus |
from freebsd:
1. set rcv_laststart/rcv_lastend after checking the tcp window
2. pass rcv_laststart and rcv_lastend on the stack (shrink tcp state)
ok henning, djm
|
#
1.71 |
|
04-Mar-2005 |
markus |
- check th_ack against snd_una/max; from Raja Mukerji via hugh@
- limit pool to tcp_sackhole_limit entries (sysctl-able)
- stop sack option processing on pool_get errors
- use SEQ_MIN/SEQ_MAX
ok henning, hshoexer, deraadt
|
#
1.70 |
|
27-Feb-2005 |
markus |
1. tcp_xmit_timer(): remove extra rtt decrement (t_rtttime is 0-based while t_rtt was 1-based), update callers
2. define and use TCP_RTT_BASE_SHIFT instead of the hardcoded 2.
3. add missing shifts when t_srtt/t_rttvar are used.
4. update the comments: t_srtt uses 5 bits of fraction (not 3) and t_rttvar uses 4 bits
5. remove obsolete/unused macros TCP_RTT_SCALE and TCP_RTTVAR_SCALE
6. make sure rttmin is not > TCPTV_REXMTMAX
parts from netbsd, ok mcbride, henning
|
#
1.69 |
|
10-Jan-2005 |
mcbride |
Make sure bogus values don't make their way into tcp_xmit_timer() calculations.
- Ignore ts_ecr if it is 0, or the resulting rtt is out of range. (use tp->t_rtttime instead)
- Initialise tcp_now to 1, to avoid the 500ms window where a valid ts_ecr of 0 could be ignored.
- Convert out-of-range rtt values to valid ones in tcp_xmit_timer().
ok frantzen@ markus@
|
#
1.68 |
|
25-Nov-2004 |
markus |
fix for race between invocation for timer and network input
1) add a reaper for TCP and SYN cache states (cf. netbsd pr 20390)
2) additional check for TCP_TIMER_ISARMED(TCPT_REXMT) in tcp_timer_persist()
with mickey@; ok deraadt@
|
#
1.67 |
|
28-Oct-2004 |
mcbride |
Modulate tcp_now by a random amount on a per-connection basis.
ok markus@ frantzen@
|
#
1.66 |
|
16-Sep-2004 |
markus |
don't send partial segments if SS_ISSENDING is set, remember TF_LASTIDLE across invocations of tcp_output (from freebsd); ok mcbride
|
Revision tags: OPENBSD_3_6_BASE
|
#
1.65 |
|
15-Jul-2004 |
markus |
branches: 1.65.2; tcp_trace() expects short, not int; ok deraadt
|
Revision tags: SMP_SYNC_A SMP_SYNC_B
|
#
1.64 |
|
08-Jun-2004 |
markus |
factor out md5 code; ok+tests henning@, djm@, hshoexer@
|
#
1.63 |
|
25-Apr-2004 |
markus |
add TCPCTL_DROP; ok deraadt, cedric, grange, ...
|
#
1.62 |
|
20-Apr-2004 |
markus |
add tcps_rcvacktooold; ok deraadt
|
Revision tags: OPENBSD_3_5_BASE
|
#
1.61 |
|
02-Mar-2004 |
markus |
branches: 1.61.2; limit total number of queued out-of-order packets to NMBCLUSTERS/2; ok mcbride
|
#
1.60 |
|
27-Feb-2004 |
markus |
implement tcp_drain() similar to ip_drain(); ok mcbride@
|
#
1.59 |
|
27-Feb-2004 |
markus |
API change; counter for upcoming tcp_drain(); ok deraadt
|
#
1.58 |
|
15-Feb-2004 |
markus |
switch to sysctl_int_arr(); ok itojun, henning, miod, deraadt
|
#
1.57 |
|
31-Jan-2004 |
markus |
!sack_disable -> sack_enable; ok deraadt@
|
#
1.56 |
|
29-Jan-2004 |
markus |
support for RFC3390 (Increasing TCP's Initial Window); ok deraadt, itojun
|
#
1.55 |
|
14-Jan-2004 |
markus |
syncache+ipv6 support for TCP_SIGNATURE; with itojun; ok deraadt
|
#
1.54 |
|
13-Jan-2004 |
markus |
bring back the old TCP_SIGNATURE code from tcp_input.c rev 1.45 and make it compile (does not work yet); ok deraadt@
|
#
1.53 |
|
07-Jan-2004 |
markus |
syn_XXX_limit -> synXXXlimit for consistency; ok deraadt
|
#
1.52 |
|
06-Jan-2004 |
markus |
import netbsd's version of David Borman's syncache code http://www.kohala.com/start/borman.97jun06.txt; ok deraadt@, henning@
|
Revision tags: OPENBSD_3_4_BASE
|
#
1.51 |
|
09-Jun-2003 |
itojun |
branches: 1.51.2; backout following: >use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
PR 3283 fixed (confirmed)
|
#
1.50 |
|
02-Jun-2003 |
millert |
Remove the advertising clause in the UCB license which Berkeley rescinded 22 July 1999. Proofed by myself and Theo.
|
#
1.49 |
|
29-May-2003 |
itojun |
use m_pulldown not m_pullup2. fix some bugs in IPv6 tcp_trace().
|
#
1.48 |
|
26-May-2003 |
itojun |
fix tcpcb size to make trpt happy
|
#
1.47 |
|
23-May-2003 |
itojun |
don't #ifdef within struct tcpcb definition, as it is used in userland too. dhartmei ok
|
Revision tags: UBC_SYNC_A
|
#
1.46 |
|
12-May-2003 |
jason |
Nuke a whole bunch of commons; ok tedu (still more to come *sigh*)
|
Revision tags: OPENBSD_3_3_BASE
|
#
1.45 |
|
12-Feb-2003 |
jason |
branches: 1.45.2; Remove commons; inspired by netbsd.
|
Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
|
#
1.44 |
|
09-Jun-2002 |
itojun |
whitespace
|
#
1.43 |
|
16-May-2002 |
kjc |
bring in ECN support from KAME. it consists of
- ECN support in TCP
- tunnel-egress and fragment reassembly rules in layer-3 not to lose congestion info at tunnel-egress and fragment reassembly
to enable ECN in TCP, build a kernel with TCP_ECN, and then, turn it on by "sysctl -w net.inet.tcp.ecn=1".
ok deraadt@
|
Revision tags: OPENBSD_3_1_BASE
|
#
1.42 |
|
14-Mar-2002 |
millert |
First round of __P removal in sys
|
#
1.41 |
|
08-Mar-2002 |
provos |
use timeout(9) to schedule TCP timers. this avoids traversing all tcp connections during tcp_slowtimo. adapted from thorpej@netbsd.org
|
#
1.40 |
|
02-Mar-2002 |
provos |
disable immediate ack on TH_PUSH. make behaviour sysctl tuneable. from netbsd; also fix a bug where setting TF_ACKNOW didn't actually result in an ack.
|
#
1.39 |
|
01-Mar-2002 |
provos |
remove tcp_fasttimo and convert delayed acks to the timeout(9) API instead. adapted from netbsd. okay angelos@
|
#
1.38 |
|
15-Jan-2002 |
provos |
allocate sackholes with pool
|
Revision tags: OPENBSD_3_0_BASE UBC_BASE
|
#
1.37 |
|
23-Jun-2001 |
angelos |
branches: 1.37.4; Keep stats on TCP/UDP hardware checksumming.
|
#
1.36 |
|
09-Jun-2001 |
angelos |
Inclusion protection.
|
Revision tags: OPENBSD_2_9_BASE
|
#
1.35 |
|
13-Dec-2000 |
provos |
more random tcp sequence numbers. okay deraadt@, angelos@
|
#
1.34 |
|
11-Dec-2000 |
itojun |
nuke #ifdef TCP6 (no longer supported). validate ICMPv6 too big messages (pmtud) based on pcb. we accept certain amount of non-validated ones, as IPv6 mandates ICMPv6 (so even for traffic from unconnected pcb, we need pmtud). sync with kame
|
Revision tags: OPENBSD_2_8_BASE
|
#
1.33 |
|
14-Oct-2000 |
itojun |
implement net.inet.tcp.rstppslimit. rate-limits outbound TCP RST traffic to less than N per 1 second.
|
#
1.32 |
|
25-Sep-2000 |
provos |
on expiry of pmtu route, retry higher mtu. okay angelos@
|
#
1.31 |
|
20-Sep-2000 |
provos |
correctly calculate mss
|
#
1.30 |
|
18-Sep-2000 |
provos |
Path MTU discovery based on NetBSD but with the decision to use the DF flag delayed to ip_output(). That halves the code and reduces most of the route lookups. okay deraadt@
|
#
1.29 |
|
11-Jul-2000 |
provos |
compute correct window scale when recvpipe option is set in route; based on diff from "Pete Kazmier" <pete@kazmier.com>
|
#
1.28 |
|
26-Jun-2000 |
art |
Make the definition of tcpstat in tcp_var.h extern.
|
#
1.27 |
|
18-Jun-2000 |
beck |
support ipv6 for tcp_ident
|
Revision tags: OPENBSD_2_7_BASE SMP_BASE
|
#
1.26 |
|
21-Dec-1999 |
provos |
branches: 1.26.2; option TCP_NEWRENO goes away, it's the default case for TCP_SACK if SACK is disabled for the connection or via sysctl
|
Revision tags: kame_19991208
|
#
1.25 |
|
08-Dec-1999 |
itojun |
bring in KAME IPv6 code, dated 19991208. replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support. see sys/netinet6/{TODO,IMPLEMENTATION} for more details.
GENERIC configuration should work fine as before. GENERIC.v6 works fine as well, but you'll need KAME userland tools to play with IPv6 (will be brought in soon).
|
Revision tags: OPENBSD_2_6_BASE
|
#
1.24 |
|
06-Aug-1999 |
deraadt |
back out all recent changes, which continue to be a source for nasty bugs
|
#
1.23 |
|
22-Jul-1999 |
niklas |
Revert to 1.21
|
#
1.22 |
|
17-Jul-1999 |
provos |
revert tcp_input.c to before 07/01/1999 - this seems to solve the mysterious data corruptions and panics that people have experienced. by reverting we lose tcp signatures and ipv6 cleanups, the code looked correct to me.
|
#
1.21 |
|
06-Jul-1999 |
cmetz |
Added support for TCP MD5 option (RFC 2385).
|
#
1.20 |
|
02-Jul-1999 |
cmetz |
Fixed a #ifdef defined()... typo that turned into a compilation failure.
|
Revision tags: OPENBSD_2_5_BASE
|
#
1.19 |
|
27-Mar-1999 |
provos |
add SADB_X_BINDSA to pfkey allowing incoming SAs to refer to an outgoing SA to be used, use this SA in ip_output if available. allow mobile road warriors for bind SAs with wildcard dst and src addresses. check IPSEC AUTH and ESP level when receiving packets, drop them if protection is insufficient. add stats to show dropped packets because of insufficient IPSEC protection. -- phew. this was all done in canada. dugsong and linh provided the ride and company.
|
#
1.18 |
|
04-Feb-1999 |
deraadt |
indent
|
#
1.17 |
|
04-Feb-1999 |
deraadt |
use u_int32_t and u_int64_t for stats variables, instead of quad/long
|
#
1.16 |
|
11-Jan-1999 |
niklas |
Make TCP_SACK compile with new netinet
|
#
1.15 |
|
11-Jan-1999 |
deraadt |
netinet merge of NRL stuff. some indent and shrinkage needed; NRL/cmetz
|
#
1.14 |
|
18-Nov-1998 |
deraadt |
indent right
|
#
1.13 |
|
17-Nov-1998 |
provos |
NewReno, SACK and FACK support for TCP, adapted from code for BSDI by Hari Balakrishnan (hari@lcs.mit.edu), Tom Henderson (tomh@cs.berkeley.edu) and Venkat Padmanabhan (padmanab@cs.berkeley.edu) as part of the Daedalus research group at the University of California, (http://daedalus.cs.berkeley.edu). [I was able to do this on time spent at the Center for Information Technology Integration (citi.umich.edu)]
|
#
1.12 |
|
28-Oct-1998 |
provos |
- fix three bugs pointed out in Stevens, i.a. updating timestamps correctly
- fix a 4.4bsd-lite2 bug: when tcp options are present the maximum segment size is not updated correctly, so that fast recovery forces out a segment which is split in two segments by tcp_output(). the fix is adapted from FreeBSD: the effective mss is recorded after option negotiation in 3way handshake.
[I was able to fix this on time spent at Center for Information Technology Integration (citi.umich.edu)]
|
Revision tags: OPENBSD_2_4_BASE
|
#
1.11 |
|
10-Jun-1998 |
beck |
New TCPCTL_IDENT sysctl for identd without kmem insanity.
|
Revision tags: OPENBSD_2_3_BASE
|
#
1.10 |
|
18-Mar-1998 |
angelos |
Add FreeBSD patch (check for SYN packets arriving at a socket in LISTEN state with source address/port == destination address/port).
|
#
1.9 |
|
24-Jan-1998 |
mickey |
sysctl for def sizes for tcp/udp send/recv queues
|
Revision tags: OPENBSD_2_2_BASE
|
#
1.8 |
|
09-Aug-1997 |
millert |
The list of tcp/udp ports not to allocate dynamically is now a bitmask configurable via sysctl([38]). The default values have not changed. If one wants to change the list it should be done early on in /etc/rc.
|
#
1.7 |
|
15-Jun-1997 |
deraadt |
change byte counters to u_quad_t
|
#
1.6 |
|
06-Jun-1997 |
deraadt |
add net.inet.tcp.{keepidle,keepintvl,slowhz}; mouse@Rodents.Montreal.QC.CA
|
Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE
|
#
1.5 |
|
20-Sep-1996 |
deraadt |
`solve' the syn bomb problem as well as currently known; add sysctl's for SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT (net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start doing tail drop, but slightly prefer the same port.
|
#
1.4 |
|
12-Sep-1996 |
tholo |
TCP Persist handling; from 4.4BSD Lite2 (via NetBSD PR 2335)
|
#
1.3 |
|
03-Mar-1996 |
niklas |
From NetBSD: 960217 merge
|
#
1.2 |
|
14-Dec-1995 |
deraadt |
from netbsd: make netinet work on systems where pointers and longs are 64 bits (like the alpha). Biggest problem: IP headers were overlayed with structure which included pointers, and which therefore didn't overlay properly on 64-bit machines. Solution: instead of threading pointers through IP header overlays, add a "queue element" structure to do the threading, and point it at the ip headers.
|
#
1.1 |
|
18-Oct-1995 |
deraadt |
branches: 1.1.1; Initial revision
|