#
359652 |
|
06-Apr-2020 |
hselasky |
MFC r333806: Use NULL for SYSINIT's last arg, which is a pointer type
Sponsored by: The FreeBSD Foundation
|
#
350865 |
|
11-Aug-2019 |
gnn |
MFC: 350557
Properly validate arguments for route deletion
Reported by: Liang Zhuo brightiup.zhuo@gmail.com
|
#
331722 |
|
29-Mar-2018 |
eadler |
Revert r330897:
This was intended to be a non-functional change. It wasn't. The commit message was thus wrong. In addition it broke arm, and merged crypto related code.
Revert with prejudice.
This revert skips files touched in r316370 since that commit was since MFCed. This revert also skips files that require $FreeBSD$ property changes.
Thank you to those who helped me get out of this mess including but not limited to gonzo, kevans, rgrimes.
Requested by: gjb (re)
|
#
330897 |
|
14-Mar-2018 |
eadler |
Partial merge of the SPDX changes
These changes are incomplete but are making it difficult to determine what other changes can/should be merged.
No objections from: pfg
|
#
320134 |
|
20-Jun-2017 |
ae |
MFC r319895: Resurrect RTF_RNH_LOCKED flag and restore ability to call rtalloc1_fib() with acquired RIB lock.
This fixes a possible panic due to trying to acquire RIB rlock when it is already exclusive locked.
PR: 215963, 215122 Sponsored by: Yandex LLC Approved by: re (delphij)
|
#
310884 |
|
31-Dec-2016 |
loos |
MFC r309717:
Fix the typos and style(9) in comment.
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation
|
#
302054 |
|
21-Jun-2016 |
bz |
Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather large and convoluted as the ordering requirements cannot be separated.
Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC.
Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet.
For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along with (or before) the higher layer protocol when it is shut down.
Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers.
For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()).
Provide or enhance helper functions to do proper cleanup at a protocol rather than at an interface level.
Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747
|
#
301502 |
|
06-Jun-2016 |
bz |
Provide a public interface to rt_flushifroutes which takes the address family as an argument as well. This will be used to clean up individual protocols during VNET teardown.
Obtained from: projects/vnet Sponsored by: The FreeBSD Foundation
|
#
301217 |
|
02-Jun-2016 |
gnn |
This change re-adds L2 caching for TCP and UDP, as originally added in D4306 but removed due to other changes in the system. Restore the llentry pointer to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as appropriate.
Submitted by: Mike Karels Differential Revision: https://reviews.freebsd.org/D6262
|
#
297234 |
|
24-Mar-2016 |
bz |
Fix compile errors after r297225:
- properly V_irtualise variable access unbreaking VIMAGE kernels.
- remove the volatile from the function return type to make architectures using gcc happy [-Wreturn-type] "type qualifiers ignored on function return type". I am not entirely happy with this solution putting the u_int there but it will do for now.
|
#
297225 |
|
24-Mar-2016 |
gnn |
FreeBSD previously provided route caching for TCP (and UDP). Re-add route caching for TCP, with some improvements. In particular, invalidate the route cache if a new route is added, which might be a better match. The cache is automatically invalidated if the old route is deleted.
Submitted by: Mike Karels Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D4306
|
#
295529 |
|
11-Feb-2016 |
dteske |
Merge SVN r295220 (bz) from projects/vnet/
Fix a panic that occurs when a vnet interface is unavailable at the time the vnet jail referencing said interface is stopped.
Sponsored by: FIS Global, Inc.
|
#
294710 |
|
25-Jan-2016 |
melifaro |
Fix flowtable part missed in r294706.
|
#
294706 |
|
25-Jan-2016 |
melifaro |
MFP r287070,r287073: split radix implementation and route table structure.
There are a number of radix consumers in kernel land (pf, ipfw, nfs, route) with different requirements. In fact, the first 3 don't have _any_ requirements and the first 2 do not use radix locking. On the other hand, the routing structure does have these requirements (rnh_gen, multipath, custom to-be-added control plane functions, different locking). Additionally, radix should not know anything about its consumers' internals.
So, radix code now uses tiny 'struct radix_head' structure along with internal 'struct radix_mask_head' instead of 'struct radix_node_head'. Existing consumers still use the same 'struct radix_node_head' with slight modifications: they need to pass a pointer to the (embedded) 'struct radix_head' to all radix callbacks.
Routing code now uses new 'struct rib_head' with different locking macro: RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing information base).
New net/route_var.h header was added to hold routing subsystem internal data. 'struct rib_head' was placed there. 'struct rtentry' will also be moved there soon.
|
#
294020 |
|
14-Jan-2016 |
melifaro |
Fix panic in IP redirect. Panic was introduced in r293466.
Found by: Yamagi Burmeister <lists at yamagi.org>
|
#
293886 |
|
14-Jan-2016 |
melifaro |
Remove now-unused wrappers for various routing functions.
|
#
293829 |
|
13-Jan-2016 |
melifaro |
Remove RTF_RNH_LOCKED support from rtalloc1_fib().
Last caller using it was eliminated in r293471.
Sponsored by: Yandex LLC
|
#
293466 |
|
09-Jan-2016 |
melifaro |
(Temporarily) remove route_redirect_event eventhandler.
Such a handler should pass a different set of variables, instead of directly providing 2 locked route entries. Given that it hasn't been really used since at least 2012, remove the current code. Will re-add it after finishing most major routing-related changes.
Discussed with: np
|
#
293465 |
|
09-Jan-2016 |
melifaro |
Please Coverity by removing unnecessary check (rt_key() is always set).
Coverity CID: 1347797
|
#
293424 |
|
08-Jan-2016 |
melifaro |
Do more fine-grained locking in rtrequest1_fib().
Last consumer using RTF_RNH_LOCKED flag was eliminated in r291643. Restrict passing RTF_RNH_LOCKED to rtrequest1_fib() and do better locking for RTM_ADD / RTM_DELETE cases.
|
#
293159 |
|
04-Jan-2016 |
melifaro |
Add rib_lookup_info() to provide API for retrieving individual route entries data in unified format.
There are control plane functions that require information other than just next-hop data (e.g. individual rtentry fields like flags or prefix/mask). Given that the goal is to avoid rte reference/refcounting, re-use rt_addrinfo structure to store most rte fields. If caller wants to retrieve key/mask or gateway (which are sockaddrs and are allocated separately), it needs to provide sufficient-sized sockaddrs structures w/ their pointers saved in passed rt_addrinfo.
Convert:
* lltable new records checks (in_lltable_rtcheck(), nd6_is_new_addr_neighbor()).
* rtsock pre-add/change route check.
* IPv6 NS ND-proxy check (RADIX_MPATH code was eliminated because 1) we don't support RTF_ANNOUNCE ND-proxy for networks and there should not be multiple host routes for such hosts, 2) if we have multiple routes we should inspect them (which is not done), and 3) the entire idea of abusing KRT as storage for ND proxy seems odd. Userland programs should be used for that purpose).
|
#
292163 |
|
13-Dec-2015 |
melifaro |
Fix PINNED routes handling. Before r291643, adding new interface prefix had the following logic:
try_add: EEXIST && (PINNED) { try_del(w/o PINNED flag); if (OK) try_add(PINNED) }
In r291643, deletion was performed w/ PINNED flag held, which led to new interface prefixes (like ::1) overriding older ones. Fix this by requesting deletion w/o RTF_PINNED.
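The add/delete/re-add logic quoted above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: the demo_* names, the fixed-size table, the DEMO_PINNED value and the error codes are all hypothetical stand-ins, not the kernel's rtrequest1_fib() implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PINNED    0x1  /* stand-in for RTF_PINNED; value is illustrative */
#define DEMO_MAXROUTES 8    /* hypothetical table size */

struct demo_route {
    char prefix[48];
    int  flags;
    bool used;
};

struct demo_table {
    struct demo_route r[DEMO_MAXROUTES];
};

/* Exact-match lookup by prefix string. */
static struct demo_route *
demo_find(struct demo_table *t, const char *prefix)
{
    for (size_t i = 0; i < DEMO_MAXROUTES; i++)
        if (t->r[i].used && strcmp(t->r[i].prefix, prefix) == 0)
            return (&t->r[i]);
    return (NULL);
}

/* Add a route; fails with EEXIST if the prefix is already present. */
static int
demo_add(struct demo_table *t, const char *prefix, int flags)
{
    if (demo_find(t, prefix) != NULL)
        return (EEXIST);
    for (size_t i = 0; i < DEMO_MAXROUTES; i++) {
        if (!t->r[i].used) {
            strncpy(t->r[i].prefix, prefix, sizeof(t->r[i].prefix) - 1);
            t->r[i].prefix[sizeof(t->r[i].prefix) - 1] = '\0';
            t->r[i].flags = flags;
            t->r[i].used = true;
            return (0);
        }
    }
    return (ENOBUFS);
}

/*
 * Delete a route. A pinned route is only removed when the request itself
 * carries the pinned flag (the error code here is an illustrative choice).
 */
static int
demo_del(struct demo_table *t, const char *prefix, int req_flags)
{
    struct demo_route *rt = demo_find(t, prefix);

    if (rt == NULL)
        return (ESRCH);
    if ((rt->flags & DEMO_PINNED) != 0 && (req_flags & DEMO_PINNED) == 0)
        return (EADDRINUSE);
    rt->used = false;
    return (0);
}

/* Add an interface prefix: on EEXIST, delete WITHOUT the pinned flag. */
static int
demo_add_ifprefix(struct demo_table *t, const char *prefix)
{
    int error = demo_add(t, prefix, DEMO_PINNED);

    if (error == EEXIST) {
        /* An older pinned prefix refuses this delete and so survives. */
        error = demo_del(t, prefix, 0);
        if (error == 0)
            error = demo_add(t, prefix, DEMO_PINNED);
    }
    return (error);
}
```

In this model a second demo_add_ifprefix() for an already pinned prefix fails instead of replacing the older entry, while an ordinary (unpinned) route for the same prefix is replaced by the interface route.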
PR: kern/205285 Submitted by: Fabian Keil <fk at fabiankeil.de>
|
#
291643 |
|
02-Dec-2015 |
melifaro |
Move RTF_PINNED handling to generic route code. This eliminates last RTF_RNH_LOCKED rtrequest1_fib() user.
|
#
291565 |
|
01-Dec-2015 |
ngie |
Fix LINT-NOIP kernels after r291467
rn is only used if INET or INET6 are defined
Sponsored by: EMC / Isilon Storage Division
|
#
291467 |
|
30-Nov-2015 |
melifaro |
Move flowtable rte checks to separate function.
|
#
291466 |
|
30-Nov-2015 |
melifaro |
Add new rt_foreach_fib_walk_del() function for deleting route entries by filter function instead of poking into routing table details in each consumer. Remove now-unused rt_expunge() (eliminating last external RTF_RNH_LOCKED user). This simplifies future nexthops/multipath changes and rtrequest1_fib() locking refactoring.
Actual changes:
* Add "rt_chain" field to permit rte grouping while doing batched delete from routing table (thus growing rte 200->208 on amd64).
* Add "rti_filter" / "rti_filterdata" / "rti_spare" fields to rt_addrinfo to pass filter function to various routing subsystems in standard way.
* Convert all rt_expunge() consumers to new rt_addrinfo-based api and eliminate rt_expunge().
|
#
290828 |
|
14-Nov-2015 |
melifaro |
Pass provided af instead of AF_UNSPEC to setwa_f callback.
|
#
290154 |
|
29-Oct-2015 |
bdrewery |
Avoid passing an uninitialized 'i'. Currently nothing was depending on it anyhow.
Coverity CID: 1331562
|
#
289461 |
|
17-Oct-2015 |
melifaro |
Remove several compat functions from pre-fib era.
|
#
287798 |
|
14-Sep-2015 |
vangyzen |
Fix the handling of IPv6 On-Link Redirects.
On receipt of a redirect message, install an interface route for the redirected destination. On removal of the corresponding Neighbor Cache entry, remove the interface route.
This requires changes in rtredirect_fib() to cope with an AF_LINK address for the gateway and with the absence of RTF_GATEWAY.
This fixes the "Redirected On-Link" test cases in the Tahi IPv6 Ready Logo Phase 2 test suite.
Unrelated to the above, fix a recursion on the radix node head lock triggered by the Tahi Redirected to Alternate Router test cases.
When I first wrote this patch in October 2012, all Section 2 (Neighbor Discovery) test cases passed on 10-CURRENT, 9-STABLE, and 8-STABLE. cem@ recently rebased the 10.x patch onto head and reported that it passes Tahi. (Thanks!)
These other test cases also passed in 2012:
* the RTF_MODIFIED case, with IPv4 and IPv6 (using a RTF_HOST|RTF_GATEWAY route for the destination)
* the redirected-to-self case, with IPv4 and IPv6
* a valid IPv4 redirect
All testing in 2012 was done with WITNESS and INVARIANTS.
Tested by: EMC / Isilon Storage Division via Conrad Meyer (cem) in 2015, Mark Kelley <mark_kelley@dell.com> in 2012, TC Telkamp <terence_telkamp@dell.com> in 2012 PR: 152791 Reviewed by: melifaro (current rev), bz (earlier rev) Approved by: kib (mentor) MFC after: 1 month Relnotes: yes Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D3602
|
#
287476 |
|
05-Sep-2015 |
melifaro |
Constantify lookup key in ifa_ifwith* functions. Some places in our network stack already have const arguments (like if_output() routines and LLE functions).
Code using ifa_ifwith (and similar functions) along with LLE/_output functions is currently bound to use tricks like __DECONST(). Provide a cleaner way by making sockaddr lookup key really constant.
MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3464
|
#
286594 |
|
10-Aug-2015 |
melifaro |
Rename rt_foreach_fib() to rt_foreach_fib_walk().
Suggested by: julian
|
#
286458 |
|
08-Aug-2015 |
melifaro |
MFP r274295:
* Move interface route cleanup to route.c:rt_flushifroutes()
* Convert most of "for (fibnum = 0; fibnum < rt_numfibs; fibnum++)" users to use new rt_foreach_fib() instead of hand-rolling cycles.
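As a rough illustration of replacing hand-rolled per-fib loops with one callback-driven helper, a userspace sketch might look like the following. The demo_* names, the callback signature, DEMO_NUMFIBS and the stop-on-error behaviour are assumptions made for this example; the real rt_foreach_fib() signature is not given in this log and is not reproduced here.

```c
#include <stddef.h>

#define DEMO_NUMFIBS 4  /* hypothetical stand-in for rt_numfibs */

/* Hypothetical per-fib callback: return non-zero to stop the walk. */
typedef int demo_fib_cb(unsigned int fibnum, void *arg);

/* Visit every fib once, so callers no longer write the loop themselves. */
static int
demo_foreach_fib(demo_fib_cb *cb, void *arg)
{
    for (unsigned int fibnum = 0; fibnum < DEMO_NUMFIBS; fibnum++) {
        int error = cb(fibnum, arg);

        if (error != 0)
            return (error);
    }
    return (0);
}

/* Example callback: count how many fibs were visited. */
static int
demo_count_cb(unsigned int fibnum, void *arg)
{
    (void)fibnum;
    (*(unsigned int *)arg)++;
    return (0);
}
```

A caller would pass demo_count_cb together with a pointer to an unsigned counter; the same helper can then serve every consumer that previously open-coded the loop.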
|
#
286057 |
|
30-Jul-2015 |
loos |
Follow r256586 and rename the kernel version of the Free() macro to R_Free(). This matches the other macros and reduces the chances to clash with other headers.
This also fixes the build of radix.c outside of the kernel environment.
Reviewed by: glebius
|
#
281583 |
|
16-Apr-2015 |
araujo |
Remove duplicate header entry.
|
#
274611 |
|
16-Nov-2014 |
melifaro |
Finish r274175: do control plane MTU tracking.
Update route MTU in case of ifnet MTU change. Add new RTF_FIXEDMTU to track explicitly specified MTU.
Old behavior:
* ifconfig em0 mtu 1500->9000: routes traversing em0 do not change MTU. The user has to manually update all routes.
* ifconfig em0 mtu 9000->1500: routes traversing em0 do not change MTU. However, if ip[6]_output finds a route with rt_mtu > interface mtu, rt_mtu gets updated.
New behavior:
* ifconfig em0 mtu 1500->9000: all interface routes in all fibs get updated with the new MTU unless the RTF_FIXEDMTU flag is set on them.
* ifconfig em0 mtu 9000->1500: all routes in all fibs get updated with the new MTU unless the RTF_FIXEDMTU flag is set on them AND rt_mtu is less than ifp mtu.
route add ... -mtu XXX automatically sets the RTF_FIXEDMTU flag.
route change .. -mtu 0 automatically removes the RTF_FIXEDMTU flag.
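The new per-route rule described above can be sketched as a small userspace function. This is a simplified model, not the kernel code: struct demo_rt, demo_update_route_mtu() and the DEMO_FIXEDMTU value are hypothetical, and the sketch does not model the distinction between interface routes and other routes.

```c
#include <stdbool.h>

#define DEMO_FIXEDMTU 0x1  /* stand-in for RTF_FIXEDMTU; value is illustrative */

/* Hypothetical, minimal view of a route entry. */
struct demo_rt {
    unsigned long mtu;
    int           flags;
};

/*
 * Apply an interface MTU change to one route.
 * Returns true if the route's MTU was modified.
 */
static bool
demo_update_route_mtu(struct demo_rt *rt, unsigned long new_ifmtu)
{
    if (rt->mtu == new_ifmtu)
        return (false);
    /*
     * An explicitly configured MTU is kept as long as it is less than
     * the new interface MTU; otherwise it is clamped like any other.
     */
    if ((rt->flags & DEMO_FIXEDMTU) != 0 && rt->mtu < new_ifmtu)
        return (false);
    rt->mtu = new_ifmtu;
    return (true);
}
```

Under this model a route without the flag follows the interface both up and down, while a flagged route only changes when the interface MTU drops below its explicit value.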
PR: 194238 MFC after: 1 month CR: D1125
|
#
274589 |
|
16-Nov-2014 |
melifaro |
Revert r274585: rte lock is properly destroyed in uma dtor callback.
Pointed by: glebius
|
#
274585 |
|
16-Nov-2014 |
melifaro |
Make witness happy: destroy rte lock before free.
MFC after: 2 weeks
|
#
274187 |
|
06-Nov-2014 |
melifaro |
Fix build.
Pointy hat to: melifaro
|
#
274177 |
|
06-Nov-2014 |
melifaro |
Finish r274118: remove useless fields from struct domain.
Sponsored by: Yandex LLC
|
#
274175 |
|
06-Nov-2014 |
melifaro |
Make checks for rt_mtu generic:
Some virtual if drivers have (ab)used the ifa_rtrequest hook to enforce that route MTU is not bigger than interface MTU. While ifa_rtrequest hooking might be an option in some situations, it is not feasible to do MTU checks there: generic (or per-domain) routing code is perfectly capable of doing this.
We currently have 3 places where MTU is altered:
1) route addition. In this case domain overrides radix _addroute callback (in[6]_addroute) and all necessary checks/fixes are/can be done there.
2) route change (especially, GW change). In this case, there are no explicit per-domain calls, but one can override rte by setting ifa_rtrequest hook to domain handler (inet6 does this).
3) ifconfig ifaceX mtu YYYY. In this case, we have no callbacks, but ip[6]_output performs runtime checks and decreases rt_mtu if necessary.
Generally, the goals are to be able to handle all MTU changes in control plane, not in runtime part, and properly deal with increased interface MTU.
This commit changes the following:
* removes hooks setting MTU from drivers side
* adds proper per-domain MTU checks for case 1)
* adds generic MTU check for case 2)
  * The latter is done by using new dom_ifmtu callback since if_mtu denotes L3 interface MTU, e.g. maximum transmitted _packet_ size. However, IPv6 mtu might be different from if_mtu one (e.g. default 1280) for some cases, so we need an abstract way to know maximum MTU size for given interface and domain.
* moves rt_setmetrics() before MTU/ifa_rtrequest hooks since it copies user-supplied data which must be checked.
* removes RT_LOCK_ASSERT() from other ifa_rtrequest hooks to be able to use these functions on new non-inserted rte.
More changes will follow soon.
MFC after: 1 month Sponsored by: Yandex LLC
|
#
271916 |
|
21-Sep-2014 |
hrs |
Make net.add_addr_allfibs vnet-local.
|
#
271438 |
|
11-Sep-2014 |
asomers |
Revisions 264905 and 266860 added a "int fib" argument to ifa_ifwithnet and ifa_ifwithdstaddr. For the sake of backwards compatibility, the new arguments were added to new functions named ifa_ifwithnet_fib and ifa_ifwithdstaddr_fib, while the old functions became wrappers around the new ones that passed RT_ALL_FIBS for the fib argument. However, the backwards compatibility is not desired for FreeBSD 11, because there are numerous other incompatible changes to the ifnet(9) API. We therefore decided to remove it from head but leave it in place for stable/9 and stable/10. In addition, this commit adds the fib argument to ifa_ifwithbroadaddr for consistency's sake.
sys/sys/param.h Increment __FreeBSD_version
sys/net/if.c sys/net/if_var.h sys/net/route.c Add fibnum argument to ifa_ifwithbroadaddr, and remove the _fib versions of ifa_ifwithdstaddr, ifa_ifwithnet, and ifa_ifwithroute.
sys/net/route.c sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_options.c sys/netinet/ip_output.c sys/netinet6/nd6.c Fixup calls of modified functions.
share/man/man9/ifnet.9 Document changed API.
CR: https://reviews.freebsd.org/D458 MFC after: Never Sponsored by: Spectra Logic
|
#
267992 |
|
28-Jun-2014 |
hselasky |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
#
267985 |
|
27-Jun-2014 |
gjb |
Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output, such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1) or uname(1):
truss: can not get etype: Cannot allocate memory
|
#
267961 |
|
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types, both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function, allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating the children of a static/extern SYSCTL node. This operation should probably be factored out into a common macro, since some device drivers use it. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() are not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
#
266860 |
|
29-May-2014 |
asomers |
Fix unintended KBI change from r264905. Add _fib versions of ifa_ifwithnet() and ifa_ifwithdstaddr(). The legacy functions will call the _fib() versions with RT_ALL_FIBS, preserving legacy behavior.
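The compatibility pattern described above (a legacy entry point that forwards to the fib-aware one with a wildcard) can be sketched in userspace as follows. The demo_* names, the address table and the lookup itself are hypothetical; only the idea of a wildcard fib value of -1 is taken from this log, which elsewhere mentions RT_ALL_FIBS replacing a literal -1.

```c
#include <stddef.h>

#define DEMO_ALL_FIBS (-1)  /* wildcard fib; stand-in for RT_ALL_FIBS */

/* Hypothetical interface address: a network number plus the interface fib. */
struct demo_ifaddr {
    unsigned int net;
    int          fib;
};

static const struct demo_ifaddr demo_addrs[] = {
    { 0x0a000000u, 0 },  /* 10.0.0.0 on fib 0 */
    { 0x0a000000u, 1 },  /* same subnet on fib 1 */
    { 0xc0a80000u, 1 },  /* 192.168.0.0 on fib 1 */
};

/* Fib-aware lookup: match the network, then the fib unless it is the wildcard. */
static const struct demo_ifaddr *
demo_ifwithnet_fib(unsigned int net, int fibnum)
{
    for (size_t i = 0; i < sizeof(demo_addrs) / sizeof(demo_addrs[0]); i++) {
        if (demo_addrs[i].net != net)
            continue;
        if (fibnum == DEMO_ALL_FIBS || demo_addrs[i].fib == fibnum)
            return (&demo_addrs[i]);
    }
    return (NULL);
}

/* Legacy entry point: same signature as before, fib-agnostic behaviour. */
static const struct demo_ifaddr *
demo_ifwithnet(unsigned int net)
{
    return (demo_ifwithnet_fib(net, DEMO_ALL_FIBS));
}
```

Existing callers keep calling the legacy function unchanged, while callers that must honour a specific fib switch to the _fib variant.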
sys/net/if_var.h sys/net/if.c Add legacy-compatible functions as described above. Ensure legacy behavior when RT_ALL_FIBS is passed as fibnum.
sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/net/route.c sys/net/rtsock.c sys/netinet6/nd6.c Call with _fib() functions if we must use a specific fib, or the legacy functions otherwise.
tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c Improve the udp_dontroute test. The bug that this test exercises is that ifa_ifwithnet() will return the wrong address, if multiple interfaces have addresses on the same subnet but with different fibs. The previous version of the test only considered one possible failure mode: that ifa_ifwithnet_fib() might fail to find any suitable address at all. The new version also checks whether ifa_ifwithnet_fib() finds the correct address by checking where the ARP request goes.
Reported by: bz, hrs Reviewed by: hrs MFC after: 1 week X-MFC-with: 264905 Sponsored by: Spectra Logic
|
#
265280 |
|
03-May-2014 |
melifaro |
Remove additional fib checks from rtalloc1_fib. It looks like current consumers are either unaware of MRT (and use RT_DEFAULT_FIB implicitly) or know what they are doing. In the latter case they will either be hit by a KASSERT or ESRCH will be returned due to NULL rnh.
|
#
265279 |
|
03-May-2014 |
melifaro |
Pass radix head ptr along with rte to rtexpunge(). Rename rtexpunge to rt_expunge().
|
#
265103 |
|
29-Apr-2014 |
melifaro |
Move rt_setmetrics() from rtsock.c to route.c. All rtsock-initiated rte creation/modification are now performed in route.c holding radix tree write lock. This reduces the need for per-rte mutex.
Sponsored by: Yandex LLC MFC after: 1 month
|
#
265091 |
|
29-Apr-2014 |
melifaro |
Do not use senderr() in rtrequest1_fib_change().
Suggested by: glebius MFC after: 4 weeks
|
#
264989 |
|
26-Apr-2014 |
melifaro |
Remove useless `register' declarations.
MFC after: 1 month
|
#
264986 |
|
26-Apr-2014 |
melifaro |
Decouple RTM_CHANGE from RTM_GET handling in rtsock.c:route_output(). RTM_CHANGE is now handled inside route.c:rtrequest1_fib() as it should be. Note that the change handler is a separate function, rtrequest1_fib_change().
MFC after: 1 month
|
#
264973 |
|
26-Apr-2014 |
melifaro |
Unify sa_equal() macro usage.
MFC after: 2 weeks
|
#
264905 |
|
24-Apr-2014 |
asomers |
Fix subnet and default routes on different FIBs on the same subnet.
These two bugs are closely related. The root cause is that ifa_ifwithnet does not consider FIBs when searching for an interface address.
sys/net/if_var.h sys/net/if.c Add a fib argument to ifa_ifwithnet and ifa_ifwithdstaddr. Those functions will only return an address whose interface fib equals the argument.
sys/net/route.c Update calls to ifa_ifwithnet and ifa_ifwithdstaddr with fib arguments.
sys/netinet/in.c Update in_addprefix to consider the interface fib when adding prefixes. This will prevent it from not adding a subnet route when one already exists on a different fib.
sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/netinet6/nd6.c Add RT_DEFAULT_FIB arguments to ifa_ifwithdstaddr and ifa_ifwithnet. In some cases there wasn't a clear specific fib number to use. In others, I was unable to test those functions so I chose RT_DEFAULT_FIB to minimize divergence from current behavior. I will fix some of the latter changes along with PR kern/187553.
tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c tests/sys/netinet/Makefile Revert r263738. The udp_dontroute test was right all along. However, bugs kern/187550 and kern/187553 cancelled each other out when it came to this test. Because of kern/187553, ifa_ifwithnet searched the default fib instead of the requested one, but because of kern/187550, there was an applicable subnet route on the default fib. The new test added in r263738 doesn't work right, however. I can verify with dtrace that ifa_ifwithnet returned the wrong address before I applied this commit, but route(8) miraculously found the correct interface to use anyway. I don't know how.
Clear expected failure messages for kern/187550 and kern/187552.
PR: kern/187550 PR: kern/187552 Reviewed by: melifaro MFC after: 3 weeks Sponsored by: Spectra Logic
|
#
264887 |
|
24-Apr-2014 |
asomers |
Fix host and network routes for new interfaces when net.add_addr_allfibs=0
sys/net/route.c In rtinit1, use the interface fib instead of the process fib. The latter wasn't very useful because ifconfig(8) is usually invoked with the default process fib. Changing ifconfig(8) to use setfib(2) would be redundant, because it already sets the interface fib.
tests/sys/netinet/fibs_test.sh Clear the expected ATF failure
sys/net/if.c Pass the interface fib in calls to rtrequest1_fib and rtalloc1_fib
sys/netinet/in.c sys/net/if_var.h Add a fibnum argument to ifa_switch_loopback_route, a subroutine of in_scrubprefix. Pass it the interface fib.
PR: kern/187549 Reviewed by: melifaro MFC after: 3 weeks Sponsored by: Spectra Logic Corporation
|
#
264241 |
|
07-Apr-2014 |
tuexen |
Call sctp_addr_change() from rt_addrmsg() instead of rt_newaddrmsg_fib(), since rt_addrmsg() gets also called from other functions.
MFC after: 3 days
|
#
263203 |
|
15-Mar-2014 |
glebius |
Garbage collect long time obsoleted (or never used) stuff from routing API.
|
#
262806 |
|
05-Mar-2014 |
glebius |
The route code used to mtx_destroy() a locked mutex before rtentry free. Now, after r262763 it started to return locked mutexes to UMA. To fix that, conditionally unlock the mutex in the destructor.
Tested by: "Sergey V. Dyatko" <sergey.dyatko@gmail.com>
|
#
262763 |
|
04-Mar-2014 |
glebius |
- Remove rt_metrics_lite and simply put its members into rtentry.
- Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This removes another cache trashing ++ from packet forwarding path.
- Create zini/fini methods for the rtentry UMA zone. Initialize the mutex and counter via them.
- Fix reporting of rmx_pksent to routing socket.
- Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
The change is mostly targeted for stable/10 merge. For head, rt_pksent is expected to just disappear.
Discussed with: melifaro Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
262758 |
|
04-Mar-2014 |
gnn |
Revert previous commit (262727) and bounce patch back to the submitter.
Pointed out by: jhb
|
#
262727 |
|
04-Mar-2014 |
gnn |
Naming consistency fix. The routing code defines RADIX_NODE_HEAD_LOCK as grabbing the write lock, but RADIX_NODE_HEAD_LOCK_ASSERT as checking the read lock.
Submitted by: Vijay Singh <vijju.singh at gmail.com> MFC after: 1 month
|
#
261601 |
|
07-Feb-2014 |
glebius |
o Revamp API between flowtable and netinet, netinet6.
  - ip_output() and ip6_output() simply call flowtable_lookup(), passing mbuf and address family. That's the only code under #ifdef FLOWTABLE in the protocols code now.
o Revamp statistics gathering and export.
  - Remove hand made pcpu stats, and utilize counter(9).
  - Snapshot of statistics is available via 'netstat -rs'.
  - All sysctls are moved into net.flowtable namespace, since spreading them over net.inet isn't correct.
o Properly separate at compile time INET and INET6 parts.
o General cleanup.
  - Remove chain of multiple flowtables. We simply have one for IPv4 and one for IPv6.
  - Flowtables are allocated in flowtable.c, symbols are static.
  - With proper argument to SYSINIT() we no longer need flowtable_ready.
  - Hash salt doesn't need to be per-VNET.
  - Removed rudimentary debugging, which is quite useless in the dtrace era.
The runtime behavior of flowtable shouldn't be changed by this commit.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
260508 |
|
10-Jan-2014 |
melifaro |
Simplify inet alias handling code: if we're adding/removing alias which has the same prefix as some other alias on the same interface, use newly-added rt_addrmsg() instead of hand-rolled in_addralias_rtmsg().
This eliminates the following rtsock messages:
Pinned RTM_ADD for prefix (for alias addition). Pinned RTM_DELETE for prefix (for alias withdrawal).
Example (got 10.0.0.1/24 on vlan4, playing with 10.0.0.2/24):
before commit, addition:
got message of size 116 on Fri Jan 10 14:13:15 2014 RTM_NEWADDR: address being added to iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255
got message of size 192 on Fri Jan 10 14:13:15 2014 RTM_ADD: Add Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK> 10.0.0.0 10.0.0.2 (255) ffff ffff ff
after commit, addition:
got message of size 116 on Fri Jan 10 13:56:26 2014 RTM_NEWADDR: address being added to iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 14.0.0.2 14.0.0.255
before commit, withdrawal:
got message of size 192 on Fri Jan 10 13:58:59 2014 RTM_DELETE: Delete Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK> 10.0.0.0 10.0.0.2 (255) ffff ffff ff
got message of size 116 on Fri Jan 10 13:58:59 2014 RTM_DELADDR: address being removed from iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255
after commit, withdrawal:
got message of size 116 on Fri Jan 10 14:14:11 2014 RTM_DELADDR: address being removed from iface: len 116, metric 0, flags: sockaddrs: <NETMASK,IFP,IFA,BRD> 255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255
Sending both RTM_ADD/RTM_DELETE messages to rtsock is completely wrong (and requires some hacks to keep prefix in route table on RTM_DELETE).
I've tested this change with quagga (no change) and bird (*).
bird alias handling is already broken in *BSD sysdep code, so nothing changes here, too.
I'm going to MFC this change if there are no complaints about the behavior change.
While here, fix some style(9) bugs introduced by r260488 (pointed out by glebius and bde).
Sponsored by: Yandex LLC MFC after: 4 weeks
|
#
260488 |
|
09-Jan-2014 |
melifaro |
Split rt_newaddrmsg_fib() into two different functions. Adding/deleting interface addresses involves access to 3 different subsystems, in different parts of the code. Each call can fail, so reporting a successful operation via rtsock in the middle of the process is error-prone.
Further split routing notification API and actual rtsock calls via creating public-available rt_addrmsg() / rt_routemsg() functions with "private" rtsock_* backend.
MFC after: 2 weeks
|
#
260460 |
|
08-Jan-2014 |
melifaro |
Consistently use RT_ALL_FIBS everywhere instead of -1.
MFC after: 2 weeks
|
#
260379 |
|
06-Jan-2014 |
melifaro |
Partially fix IPv4 interface routes deletion in RADIX_MPATH.
Noticed by: Nikolay Denev <ndenev at gmail.com> MFC after: 1 month
|
#
260295 |
|
04-Jan-2014 |
melifaro |
Change semantics for rnh_lookup() function: now it performs exact match search, regardless of netmask existence. This simplifies most of rnh_lookup() consumers.
Fix panic triggered by deleting non-existent host route.
PR: kern/185092 Submitted by: Nikolay Denev <ndenev at gmail.com> MFC after: 1 month
|
#
258591 |
|
25-Nov-2013 |
rodrigc |
In vnet_route_uninit(), free some memory that is allocated in vnet_route_init().
To reproduce the problem:
(1) Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS, INVARIANTS.
(2) Run this command in a loop:
jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo
see: http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021291.html
This doesn't eliminate all the "Freed UMA keg was not empty" warning messages on the console, but it helps.
|
#
257176 |
|
26-Oct-2013 |
glebius |
The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare for this event by adding if_var.h to files that do need it. Also, add all includes that are currently pulled in only through implicit pollution via if_var.h.
Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
256624 |
|
16-Oct-2013 |
melifaro |
Fix long-standing issue with incorrect radix mask calculation.
Usual symptoms are messages like
rn_delete: inconsistent annotation
rn_addmask: mask impossibly already in tree
or inability to flush/delete particular prefix in ipfw table.
Changes:
* Assume 32 bytes as maximum radix key length
* Remove rn_init()
* Statically allocate rn_ones/rn_zeroes
* Make separate mask tree for each "normal" tree instead of system global one
* Remove "optimization" on masks reuse and key zeroing
* Change rn_addmask() arguments to accept tree pointer (no users in base)
PR: kern/182851, kern/169206, kern/135476, kern/134531 Found by: Slawa Olhovchenkov <slw@zxy.spb.ru> MFC after: 2 weeks Reviewed by: glebius Sponsored by: Yandex LLC
|
#
250764 |
|
18-May-2013 |
melifaro |
Fix rte leak introduced in r248070.
MFC after: 2 weeks
|
#
250700 |
|
16-May-2013 |
julian |
Finally change the mbuf to have its own fib field instead of stealing 4 flag bits. This was supposed to happen in 8.0, and again in 2012..
MFC after: never
|
#
248070 |
|
08-Mar-2013 |
melifaro |
Fix long-standing issue with interface routes being unprotected: Use RTM_PINNED flag to mark route as immutable. Forbid deleting immutable routes without special rtrequest1_fib() flag. Adding interface address with prefix already in route table is handled by atomically deleting old prefix and adding interface one.
Discussed with: andre, eri MFC after: 3 weeks
|
#
247842 |
|
05-Mar-2013 |
melifaro |
Write lock is not required for find&compare operation.
MFC after: 2 weeks
|
#
233113 |
|
18-Mar-2012 |
bz |
Hide kernel option ROUTETABLES evaluations in the implementation rather than the header file. With this also move RT_MAXFIBS and RT_NUMFIBS into the implementation to avoid further usage in other code. rt_numfibs is all that should be needed.
This allows users to change the number of FIBs from 1..RT_MAXFIBS(16) dynamically using the tunable without the need to change the kernel config for the maximum anymore. This means that the multi-FIB feature is now fully available with GENERIC kernels. The kernel option ROUTETABLES can still be used to set the default number of FIBs in absence of the tunable.
Ok.ed by: julian, hrs, melifaro MFC after: 2 weeks
|
#
231852 |
|
17-Feb-2012 |
bz |
Merge multi-FIB IPv6 support from projects/multi-fibv6/head/:
Extend the so far IPv4-only support for multiple routing tables (FIBs) introduced in r178888 to IPv6 providing feature parity.
This includes an extended rtalloc(9) KPI for IPv6, the necessary adjustments to the network stack, and user land support as in netstat.
Sponsored by: Cisco Systems, Inc. Reviewed by: melifaro (basically) MFC after: 10 days
|
#
230510 |
|
24-Jan-2012 |
bz |
Replace random ARIN direct assignment legacy IPs with proper RFC 5735 TEST-NET1 block for use in documentation and example code addresses.
MFC after: 3 days
|
#
228532 |
|
15-Dec-2011 |
glebius |
Simplify rtrequest(RTM_ADD): ifa can't be NULL after rt_getifa_fib().
|
#
226710 |
|
24-Oct-2011 |
qingli |
The host-id/interface-id can have a specific value and is properly masked out when adding a prefix route through the "route" command. However, when deleting the route, simply changing the command keyword from "add" to "delete" does not work. The failure is observed in both IPv4 and IPv6 route insertion. The patch makes the route command behavior consistent between the "add" and the "delete" operation.
MFC after: 1 week
|
#
225837 |
|
28-Sep-2011 |
bz |
Pass the fibnum where we need filtering of the message on the rtsock allowing routing daemons to filter routing updates on an rtsock per FIB.
Adjust raw_input() and split it into wrapper and a new function taking an optional callback argument even though we only have one consumer [1] to keep the hackish flags local to rtsock.c.
PR: kern/134931 Submitted by: multiple (see PR) Suggested by: rwatson [1] Reviewed by: rwatson MFC after: 3 days
|
#
225617 |
|
16-Sep-2011 |
kmacy |
In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.
Reviewed by: rwatson Approved by: re (bz)
|
#
224703 |
|
08-Aug-2011 |
kevlo |
In rtinit1(), before rtrequest1_fib() is called, info.rti_flags is initialized by flags (function argument) or-ed with ifa->ifa_flags. If both NICs have a loopback route to themselves, IFA_RTSELF is set on their ifa(s). As IFA_RTSELF is defined by RTF_HOST, rtrequest1_fib() is called with the RTF_HOST flag even if netmask is not NULL. Consequently, netmask is set to zero in rtrequest1_fib(), and the request to add a network route is silently changed into a request to add a host route.
Tested by: Andrew Boyer <aboyer at averesystems.com> Submitted by: Svatopluk Kraus <onwahe at gmail dot com> Approved by: re (hrs)
|
#
223359 |
|
21-Jun-2011 |
bz |
Garbage collect never used global, sysctl, externs.
MFC after: 1 week
|
#
223334 |
|
20-Jun-2011 |
bz |
Leave an extra comment about flowtable and IPv6 support rectifying a previous comment.
MFC after: 1 week
|
#
219786 |
|
19-Mar-2011 |
dchagin |
ouch, newrt is used on the return path, my fault. Partially revert the previous change.
MFC after: 1 Week.
|
#
219783 |
|
19-Mar-2011 |
dchagin |
Rearrange the rtalloc1_fib() code a bit. Initialize a variable only when it is really needed. To avoid code duplication, move the miss label up and jump to it.
MFC after: 1 Week
|
#
219776 |
|
19-Mar-2011 |
dchagin |
Remove a now unused variable.
MFC after: 1 Week
|
#
218909 |
|
21-Feb-2011 |
brucec |
Fix typos - remove duplicate "the".
PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days
|
#
217322 |
|
12-Jan-2011 |
mdf |
sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the net* piece.
|
#
215701 |
|
22-Nov-2010 |
dim |
After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless.
Changes reverted:
------------------------------------------------------------------------ r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines
Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined.
------------------------------------------------------------------------ r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines
Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.
------------------------------------------------------------------------ r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines
Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
|
#
215317 |
|
14-Nov-2010 |
dim |
Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.
|
#
208553 |
|
25-May-2010 |
qingli |
This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface.
MFC after: 3 days
|
#
207369 |
|
29-Apr-2010 |
bz |
MFP4: @176978-176982, 176984, 176990-176994, 177441
"Whitespace" churn after the VIMAGE/VNET whirls.
Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed.
Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOBALS (before r185088) and try to reduce the diff to stable/7 and earlier as much as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9.
This also removes some header file pollution for putatively static global variables.
Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed.
Reviewed by: jhb Discussed with: rwatson Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH MFC after: 6 days
|
#
204902 |
|
08-Mar-2010 |
qingli |
One of the advantages of enabling ECMP (a.k.a RADIX_MPATH) is to allow for connection load balancing across interfaces. Currently the address alias handling method is colliding with the ECMP code. For example, when two interfaces are configured on the same prefix, only one prefix route is installed. So connection load balancing among the available interfaces is not possible.
The other advantage of ECMP is for failover. The issue with the current code, is that the interface link-state is not reflected in the route entry. For example, if there are two interfaces on the same prefix, the cable on one interface is unplugged, new and existing connections should switch over to the other interface. This is not done today and packets go into a black hole.
Also, there is a small bug in the kernel where deleting ECMP routes in the userland will always return an error even though the command is successfully executed.
MFC after: 5 days
|
#
201282 |
|
30-Dec-2009 |
qingli |
The proxy ARP entries could not be added into the system over IFF_POINTOPOINT link types. The reason was that the routing entry returned from the kernel covering the remote end is of an interface type that does not support ARP. This patch fixes the problem by providing a hint to the kernel routing code, which indicates that the prefix route instead of the PPP host route should be returned to the caller. Since a host route to the local end point is also added into the routing table, and there could be multiple such instantiations because multiple PPP links can be created with the same local end IP address, this patch also fixes the loopback route installation failure observed prior to this patch. The reference count of the loopback route to the local end is incremented or decremented accordingly: the first instantiation creates the entry and the last removal deletes the route entry.
MFC after: 5 days
|
#
200537 |
|
14-Dec-2009 |
luigi |
Move the scan for max_keylen into route.c::route_init(), and make max_keylen an argument for rn_init(). This removes an unnecessary dependency on domain.h from radix.c
MFC after: 7 days
|
#
199365 |
|
17-Nov-2009 |
tuexen |
Fix a LOR showing up with sctp_bsd_addr(): Do not hold a rt lock when calling rt_newaddrmsg().
Reviewed by: qingli Approved by: rrs (mentor) MFC after: 1 month
|
#
197727 |
|
03-Oct-2009 |
bz |
Put #ifdef INET around parts of the FLOWTABLE code, to unbreak nooptions INET kernel builds.
MFC after: 3 days X-MFC: with r197687
|
#
197687 |
|
01-Oct-2009 |
qingli |
The flow-table associates TCP/UDP flows and IP destinations with specific routes. When the routing table changes, for example, when a new route with a more specific prefix is inserted into the routing table, the flow-table is not updated to reflect that change. As such existing connections cannot take advantage of the new path. In some cases the path is broken. This patch will update the affected flow-table entries when a more specific route is added. The route entry is properly marked when a route is deleted from the table. In this case, when the flow-table performs a search, the stale entry is updated automatically. Therefore this patch is not necessary for route deletion.
Submitted by: simon, phk Reviewed by: bz, kmacy MFC after: 3 days
|
#
196019 |
|
01-Aug-2009 |
rwatson |
Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes.
Reviewed by: bz Approved by: re (vimage blanket)
|
#
195837 |
|
23-Jul-2009 |
rwatson |
Introduce and use a sysinit-based initialization scheme for virtual network stacks, VNET_SYSINIT:
- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network stack is instantiated and destroyed. In the !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT. For the VIMAGE case, we instead use SYSINIT's to track their order and properties on registration, using them for each vnet when created/destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose previously, as well as its dependency scheme: we now just use the SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that they want init functions to be called for each virtual network stack rather than just once at boot, compiling down to DOMAIN_SET() in the non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead of modinfo or DOMAIN_SET() for init/uninit events. In some cases, convert modular components from using modevent to using sysinit (where appropriate). In some cases, do minor rejuggling of SYSINIT ordering to make room for or better manage events.
Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup) Discussed with: jhb, bz, julian, zec Reviewed by: bz Approved by: re (VIMAGE blanket)
|
#
195727 |
|
16-Jul-2009 |
rwatson |
Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references.
Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
|
#
195699 |
|
14-Jul-2009 |
rwatson |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
195624 |
|
11-Jul-2009 |
kmacy |
Re-factoring for adding weighted routes introduced a fairly irritating bug where the system will panic when RADIX_MPATH is enabled. This change fixes this.
Approved by: re@
|
#
194760 |
|
23-Jun-2009 |
rwatson |
Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references:
ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr
Remove unused macro which didn't have required referencing:
IFP_TO_IA6
This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references.
Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed.
Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
|
#
194640 |
|
22-Jun-2009 |
bz |
Move virtualization of routing related variables into their own Vimage module, which had been there already but now is stateful.
All variables are now file local; so this further limits the global spreading of routing related things throughout the kernel.
Add a missing function local variable in case of MPATHing.
Reviewed by: zec
|
#
194629 |
|
22-Jun-2009 |
bz |
Collect all VIMAGE_GLOBALS variables in one place.
No longer export rt_tables as all lookups go through rt_tables_get_rnh().
We cannot make rt_tables (and rtstat, rttrash[1]) static as netstat -r (-rs[1]) would stop working on a stripped VIMAGE_GLOBALS kernel.
Reviewed by: zec Presumably broken by: phk 13.5y ago in r12820 [1]
|
#
194622 |
|
22-Jun-2009 |
rwatson |
Add a new function, ifa_ifwithaddr_check(), which rather than returning a pointer to an ifaddr matching the passed socket address, returns a boolean indicating whether one was present. In the (near) future, ifa_ifwithaddr() will return a referenced ifaddr rather than a raw ifaddr pointer, and the new wrapper will allow callers that care only about the boolean condition to avoid having to free that reference.
MFC after: 3 weeks
|
#
194602 |
|
21-Jun-2009 |
rwatson |
Clean up common ifaddr management:
- Unify reference count and lock initialization in a single function, ifa_init().
- Move tear-down from a macro (IFAFREE) to a function ifa_free().
- Move reference count bump from a macro (IFAREF) to a function ifa_ref().
- Instead of using a u_int protected by a mutex, use refcount(9) for reference count management.
The ifa_mtx is now used for exactly one ioctl, and possibly should be removed.
MFC after: 3 weeks
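The init/ref/free life cycle described in this entry can be sketched as below. This is a hedged Python model only: the function names ifa_init(), ifa_ref() and ifa_free() come from the commit message, while the `IfAddr` class, its fields and the use of a lock as a stand-in for an atomic counter are assumptions made for illustration.

```python
import threading


class IfAddr:
    """Hypothetical stand-in for struct ifaddr (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.freed = False
        self._lock = threading.Lock()  # models an atomic counter


def ifa_init(ifa):
    """Set up the counter in one place; the creator holds the first reference."""
    ifa.refcount = 1


def ifa_ref(ifa):
    """Take an additional reference."""
    with ifa._lock:
        ifa.refcount += 1


def ifa_free(ifa):
    """Drop a reference; tear the object down when the last one goes away."""
    with ifa._lock:
        ifa.refcount -= 1
        last = ifa.refcount == 0
    if last:
        ifa.freed = True
```

A caller that obtains a reference with ifa_ref() is expected to pair it with exactly one ifa_free(); in this model the object is only marked freed once every holder has done so.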
|
#
193731 |
|
08-Jun-2009 |
zec |
Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework.
While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementally fixed over the next weeks in smaller commits.
Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time.
Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)
|
#
193232 |
|
01-Jun-2009 |
bz |
Convert the two dimensional array to be malloced and introduce an accessor function to get the correct rnh pointer back.
Update netstat to get the correct pointer using kvm_read() as well.
This not only fixes the ABI problem depending on the kernel option but also permits the tunable to overwrite the kernel option at boot time up to MAXFIBS, enlarging the number of FIBs without having to recompile. So people could just use GENERIC now.
Reviewed by: julian, rwatson, zec X-MFC: not possible
|
#
191734 |
|
02-May-2009 |
zec |
Unbreak options VIMAGE + nooptions INVARIANTS kernel builds.
Submitted by: julian Approved by: julian (mentor)
|
#
191548 |
|
26-Apr-2009 |
zec |
In preparation for turning on options VIMAGE in next commits, rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace.
Reviewed by: bz (an older version of the patch) Approved by: julian (mentor)
|
#
191080 |
|
14-Apr-2009 |
kmacy |
Extend route command:
- add show as alias for get
- add weights to allow mpath to do more than equal cost
- add sticky / nostick to disable / re-enable per-connection load balancing
This adds a field to rt_metrics_lite so network bits of world will need to be re-built.
Reviewed by: jeli & qingli
|
#
190909 |
|
11-Apr-2009 |
zec |
Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement.
With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which the compiler in most cases inlined into the pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order.
The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time.
The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework.
For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet.
A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided.
Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality.
The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option.
Reviewed by: bz Approved by: julian (mentor)
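The single-prerequisite ordering rule described in this entry can be sketched as follows. This is a minimal Python model under stated assumptions, not the vnet framework itself: `init_order()` and `UnresolvedDependency` are hypothetical names, modules are identified by plain strings rather than the statically assigned IDs, and the exception stands in for the panic on unresolved dependencies.

```python
class UnresolvedDependency(Exception):
    """Hypothetical stand-in for the framework's panic on unresolved dependencies."""


def init_order(modules):
    """Compute an initialization order.

    `modules` maps a module name to its single prerequisite (or None).
    A module is initialized only after its prerequisite has been.
    """
    done = []
    pending = dict(modules)
    while pending:
        ready = [m for m, dep in pending.items() if dep is None or dep in done]
        if not ready:
            # Nothing can make progress: some prerequisite was never registered.
            raise UnresolvedDependency(sorted(pending))
        for m in ready:
            done.append(m)
            del pending[m]
    return done
```

For example, `init_order({"INET6": "INET", "INET": "NET", "NET": None})` initializes NET first, then INET, then INET6, whatever the registration order was.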
|
#
190787 |
|
06-Apr-2009 |
zec |
First pass at separating per-vnet initializer functions from existing functions for initializing global state.
At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change.
Modify the existing initializer functions which are invoked via protosw, like ip_init() et al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true).
While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered.
Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net.
Approved by: julian (mentor)
|
#
186705 |
|
02-Jan-2009 |
qingli |
The log message should terminate with a newline instead of a tab character.
|
#
186167 |
|
16-Dec-2008 |
kmacy |
style and spelling fix
|
#
186119 |
|
15-Dec-2008 |
qingli |
The main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as many locking dependencies among these layers as possible to allow for some parallelism in the search operations
3. simplifying the logic in the routing code
The most notable end result is the obsolescence of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries.
Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing the last piece of the puzzle; Kip has also been conducting active functional testing
- Sam Leffler has helped me improve/refactor the code, and provided valuable reviews
- Julian Elischer set up the perforce tree for me and has helped me maintain that branch before the svn conversion
|
#
185849 |
|
10-Dec-2008 |
kmacy |
fix a reported panic when adding a route and one hit here when deleting a route
- pass RTF_RNH_LOCKED to rtalloc1_fib in 2 cases where the lock is held
- make sure the rnh lock is held across rt_setgate and rt_getifa_fib
|
#
185807 |
|
09-Dec-2008 |
bz |
Fix a bug introduced in r185747: rather than dereferencing an uninitialized *rt to something undefined, use the fibnum that came in as function argument.
Found with: Coverity Prevent(tm) CID: 4168
|
#
185774 |
|
08-Dec-2008 |
kmacy |
- avoid recursively locking the radix node head lock
- assert that it is held if RTF_RNH_LOCKED is not passed
|
#
185747 |
|
07-Dec-2008 |
kmacy |
- convert radix node head lock from mutex to rwlock
- make radix node head lock not recursive
- fix LOR in rtexpunge
- fix LOR in rtredirect
Reviewed by: sam
|
#
185571 |
|
02-Dec-2008 |
bz |
Rather than using hidden includes (with circular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files.
For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h.
Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation
|
#
185348 |
|
26-Nov-2008 |
zec |
Merge more of currently non-functional (i.e. resolving to whitespace) macros from p4/vimage branch.
Do a better job at enclosing all instantiations of globals scheduled for virtualization in #ifdef VIMAGE_GLOBALS blocks.
De-virtualize and mark as const saorder_state_alive and saorder_state_any arrays from ipsec code, given that they are never updated at runtime, so virtualizing them would be pointless.
Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
185088 |
|
19-Nov-2008 |
zec |
Change the initialization methodology for global variables scheduled for virtualization.
Instead of initializing the affected global variables at instantiation, assign initial values to them in initializer functions. As a rule, initialization at instantiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks.
Essentially, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methodology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to initialize multiple instances of such container structures.
Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
183550 |
|
02-Oct-2008 |
zec |
Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit
Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs.
Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().
Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.).
All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*).
(*) netipsec/keysock.c did not validate depending on compile time options.
Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
|
#
183200 |
|
20-Sep-2008 |
zec |
Move #defines for MRT-related constants from net/route.c to net/route.h, because the vnet code will need those constants as well.
Reviewed by: bz Approved by: julian (mentor) MFC after: never
|
#
183034 |
|
15-Sep-2008 |
julian |
Hey, committed the same typo twice! must be a record
|
#
183032 |
|
15-Sep-2008 |
julian |
Rewrite rt_check. Take into account that while the rtentry is unlocked, someone else might change it, so after we re-acquire the lock on it, we need to check it is still valid. People have been panicking in this function due to some edge cases which I have hopefully removed.
Reviewed by: keramida @ Obtained from: 1 week
|
#
183017 |
|
14-Sep-2008 |
julian |
come on Julian, make up if you're committing one change or the other. fix braino
|
#
183013 |
|
14-Sep-2008 |
julian |
Revert a part of the MRT commit that proved un-needed. rt_check() in its original form proved to be sufficient and rt_check_fib() can go away (as can its evil twin in_rt_check()).
I believe this does NOT address the crashes people have been seeing in rt_check.
MFC after: 1 week
|
#
182615 |
|
01-Sep-2008 |
brooks |
Wrap a line that became too long with the addition of V_.
(This file contains many more unwrapped or badly wrapped lines.)
|
#
181803 |
|
17-Aug-2008 |
bz |
Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course of the next few weeks.
Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only.
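The V_ convention described above can be sketched roughly as follows. This is a minimal illustration only, not the committed code: the variable rt_numfibs_demo and both helper functions are hypothetical names invented for the example.

```c
#include <assert.h>

/* Hypothetical global that is a candidate for virtualization. */
int rt_numfibs_demo = 1;

/*
 * For now the V_ name maps straight back to the global, so the
 * generated code is unchanged (a NOP change).  A later step could
 * redefine this macro to reach into a per-instance container
 * without touching any of the consumers below.
 */
#define V_rt_numfibs_demo rt_numfibs_demo

/* Consumers only ever spell the V_ name. */
int
demo_get_numfibs(void)
{
	return (V_rt_numfibs_demo);
}

void
demo_set_numfibs(int n)
{
	assert(n > 0);
	V_rt_numfibs_demo = n;
}
```

Because the macro expands to the plain global, object files built before and after such a change should be identical, which is what makes an MD5 comparison a workable check.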
We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
|
#
180840 |
|
26-Jul-2008 |
julian |
Add the ability to add new addresses for interfaces to just one FIB. (Other more specific related options will follow.) This allows one to set multiple p2p links to the same place and select which to use by having each in different FIBs.
|
#
178898 |
|
10-May-2008 |
julian |
Move a #define from a place it shouldn't have been to a place it should have been. Basically my testing didn't cover one case that this broke. Thanks tinderbox!
|
#
178897 |
|
10-May-2008 |
julian |
undef MAXFIBS before redefining it
|
#
178888 |
|
09-May-2008 |
julian |
Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x)
Currently the only protocol that can make use of the multiple tables is IPv4. Similar functionality exists in OpenBSD and Linux.
From my notes:
-----
One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address.
Constraints:
------------
I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need.
One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing".
One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch.
This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit.
Implementation method, Compatible version. (part 1)
-------------------------------
For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.
Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs.
To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family.
The basic change in the ABI compatible version of the change is to extend that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X], where for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before.
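The two dimensional lookup described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the commit: the demo_ names, the family count and the index used for IPv4 are all hypothetical, and only the "16 tables" figure is taken from the text.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXFIBS	16	/* table limit mentioned for the first commit */
#define DEMO_AF_MAX	8	/* hypothetical number of protocol families */
#define DEMO_AF_INET	2	/* hypothetical index standing in for IPv4 */

struct demo_rib_head {
	int id;
};

/* Rows are FIB numbers (Y), columns are protocol families (X). */
struct demo_rib_head *demo_rt_tables[DEMO_MAXFIBS][DEMO_AF_MAX];

/*
 * Only IPv4 honours the FIB number.  Every other family is forced
 * to row 0, which looks exactly like the old one-dimensional array.
 */
struct demo_rib_head *
demo_rt_table(unsigned int fib, int family)
{
	if (family < 0 || family >= DEMO_AF_MAX || fib >= DEMO_MAXFIBS)
		return (NULL);
	if (family != DEMO_AF_INET)
		fib = 0;
	return (demo_rt_tables[fib][family]);
}
```

A caller that knows nothing about FIBs would pass 0 and see the same table it always saw.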
The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row.
In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later.
One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically).
You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it.
This brings us as to how the correct FIB is selected for an outgoing IPV4 packet.
Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways.
Packets fall into one of a number of classes.
1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility called setfib that acts a bit like nice.
setfib -3 ping target.example.com # will use fib 3 for ping.
It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands.
2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.)
3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2).
4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib.
5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being responded to.
6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect with the process that set up the tunnel. Thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1.
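The precedence among the cases listed above might be modelled like this. It is only a sketch of the selection order (classifier first, then socket, then FIB 0); the structures, function names and the DEMO_FIB_UNSET sentinel are assumptions made for the example and do not correspond to kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FIB_UNSET	(-1)	/* classifier did not tag the packet */

struct demo_proc {
	int fib;		/* copied to the child on fork */
};

struct demo_socket {
	int fib;		/* taken from the process, changeable later */
};

/* Case 1: a new socket starts with the FIB of the creating process. */
void
demo_socket_from_proc(struct demo_socket *so, const struct demo_proc *p)
{
	so->fib = p->fib;
}

/* Case 4: an accepted socket keeps the FIB of its listen socket. */
void
demo_socket_from_listener(struct demo_socket *so,
    const struct demo_socket *listener)
{
	so->fib = listener->fib;
}

/*
 * Pick the FIB for one packet.  A classifier decision (case 3)
 * overrides the socket's FIB (case 1); packets with no socket,
 * such as forwarded ones (case 2), fall back to FIB 0.
 */
int
demo_packet_fib(const struct demo_socket *so, int classifier_fib)
{
	if (classifier_fib != DEMO_FIB_UNSET)
		return (classifier_fib);
	if (so != NULL)
		return (so->fib);
	return (0);
}
```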
Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented)
In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB.
In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process.
Early testing experience:
-------------------------
Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks.
For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done.
Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routed accordingly.
ipfw has grown 2 new keywords:
setfib N ip from any to any
count ip from any to any fib N
In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required.
SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something.
Where to next:
--------------------
After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code.
Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code.
My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it.
When the ABI can be changed it raises the possibility of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the destination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry.
Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already.
This work was sponsored by Ironport Systems/Cisco
Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco
|
#
178176 |
|
13-Apr-2008 |
bz |
Fix the build in case RADIX_MPATH is not defined.
|
#
178167 |
|
13-Apr-2008 |
qingli |
This patch provides the back end support for equal-cost multi-path (ECMP) for both IPv4 and IPv6. Previously, multipath route insertion is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of "add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat output:
default            10.2.5.1           UGS         0     3074   bge0 =>
default            10.2.5.2           UGS         0        0   bge0
When multipath routes exist, the "route delete" command requires a specific gateway to be specified or else an error message would be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single route for a particular destination.
I need to perform more testing on address aliases and multiple interfaces that have the same IP prefixes. This patch as it stands today is not yet ready for prime time. Therefore, the ECMP code fragments are fully guarded by the RADIX_MPATH macro. Include the "options RADIX_MPATH" in the kernel configuration to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
|
#
176244 |
|
13-Feb-2008 |
jhb |
Use RTFREE_LOCKED() instead of rtfree() when releasing a reference on the 'rt' route in rtredirect() as 'rt' is always locked.
MFC after: 1 week PR: kern/117913 Submitted by: Stefan Lambrev stefan.lambrev of moneybookers.com
|
#
174934 |
|
27-Dec-2007 |
mux |
Add a workaround for a deadlock between the rt_setgate() and rt_check() functions. It is easily triggered by running routed, and, I expect, by running any other daemon that uses routing sockets.
Reviewed by: net@ MFC after: 1 week
|
#
174703 |
|
17-Dec-2007 |
kmacy |
widen the routing event interface (arp update, redirect, and eventually pmtu change) into separate functions
revert previous commit's changes to arpresolve and add a new interface arpresolve2 which does arp resolution without an mbuf
|
#
174559 |
|
12-Dec-2007 |
kmacy |
add interface for allowing consumers to register for ARP updates, redirects, and path MTU changes
Reviewed by: silby
|
#
174374 |
|
06-Dec-2007 |
julian |
No need to assert that a == b when we just set a = b.
|
#
172885 |
|
22-Oct-2007 |
jhb |
Close a race when trying to look up a gateway route in rt_check(). Specifically, if two threads were doing concurrent lookups and the existing gateway was marked down, the first thread would drop a reference on the gateway route and then unlock the "root" route while it tried to allocate a new route. The second thread could then also drop a reference on the same gateway route resulting in a reference underflow. Fix this by clearing the gateway route pointer after dropping the reference count but before dropping the lock. Secondly, in this same case, the second thread would overwrite the gateway route pointer w/o free'ing a reference to the route installed by the first thread. In practice this would probably just fix a lost reference that would result in a route never being freed.
This fixes panics observed in rt_check() and rtexpunge().
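The ordering at the heart of this fix (drop the reference, clear the pointer, and only then release the lock) can be illustrated with a small single-threaded model. This is a sketch under assumptions: the structure, the function name and the `locked` flag standing in for the rtentry mutex are all hypothetical, and no real locking is performed.

```c
#include <assert.h>
#include <stddef.h>

struct demo_route {
	int refcnt;
	int up;				/* nonzero while the route is usable */
	int locked;			/* stand-in for the rtentry mutex */
	struct demo_route *gwroute;	/* cached gateway route, may be NULL */
};

/*
 * Drop the cached gateway route if it has gone down.  The caller
 * holds rt's lock.  The pointer is cleared while the lock is still
 * held, so a second thread that takes the lock afterwards finds
 * NULL and cannot drop the same reference a second time.
 */
void
demo_drop_stale_gateway(struct demo_route *rt)
{
	struct demo_route *gw;

	assert(rt->locked);
	gw = rt->gwroute;
	if (gw != NULL && !gw->up) {
		gw->refcnt--;
		rt->gwroute = NULL;	/* before the lock is released */
	}
}
```

Calling the function twice in a row models the second thread arriving late: the second call is a no-op instead of an underflow.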
MFC after: 1 week PR: kern/112490 Insight from: mehuljv at yahoo.com Reviewed by: ru (found the "not-setting it to NULL" part) Tested by: several
|
#
170557 |
|
11-Jun-2007 |
phk |
Add missing \n to printf
|
#
169872 |
|
22-May-2007 |
glebius |
Some minor cleanups:
- In rt_check() remove the senderr() macro and the "bad" label. They used to simplify code, but no longer do.
- Remove extra RT_LOCK_ASSERT() in rt_setgate(). The RT_REMREF macro does this.
- In rtfree() convert panics to KASSERTs.
- Make the routing API stricter: rtfree() should be called only in a case when we are completely sure we've got the last reference on the rtentry. In all other cases RTFREE_LOCKED() macro should be used. If the reference isn't the last one spit out a warning printf. Correct the only(?) case for this in rt_check().
- Fix typos in comments.
|
#
164549 |
|
23-Nov-2006 |
bde |
Initialize a local variable in 2 places just before it is used, not always at the start of rtalloc1(). This backs out part of revs 1.83 and 1.85.
Profiling on an i386 showed that for sending tiny packets using bge, -current takes 7 bzero()s where RELENG_4 takes only 1, and that bzero()ing is now the dominant overhead (10-12%, up from 1%, but profiling overestimated this a bit). This commit backs out 2 of the 6 extra bzero()s (1 in each of 2 calls per packet to rtalloc1()). They were the largest ones by byte count (48 bytes each) but perhaps not by time (small misaligned ones might take longer).
|
#
159305 |
|
05-Jun-2006 |
qingli |
Assuming the interface has an address of x.x.x.195, a mask of 255.255.255.0, and a default route with gateway x.x.x.1. Now if the address mask is changed to something more specific, e.g., 255.255.255.128, then after the mask change the default gateway is no longer reachable.
Since the default route is still present in the routing table, when the output code tries to resolve the address of the default gateway in function rt_check(), again, the default route will be returned by rtalloc1(). Because the lock is currently held on the rtentry structure, one more attempt to hold the lock will trigger a crash due to "lock recursed on non-recursive mutex ..."
This is a general problem. The fix checks for the above condition so that an existing route entry is not mistaken for a new cloned route. Appropriately, an ENETUNREACH error is returned back to the caller.
Approved by: andre
|
#
158661 |
|
16-May-2006 |
qingli |
The current routing code allows insertion of indirect routes that have gateways which are unreachable except through the default router. For example, assuming there is a default route configured, and inserting a route
"route add 64.102.54.0/24 60.80.1.1"
is currently allowed even when 60.80.1.1 is only reachable through the default route. However, an error is thrown when this route is utilized, say,
"ping 64.102.54.1" will return an error
This type of route insertion should be disallowed because:
1) Let's say that somehow our code allowed this packet to flow to the default router, and the default router knows the next hop is 60.80.1.1, then the question is why bother inserting this route in the 1st place, just simply use the default route.
2) Since we're not talking about source routing here, the default router could very well choose a different path than using 60.80.1.1 for the next hop, again it defeats the purpose of adding this route.
Reviewed by: ru, gnn, bz Approved by: andre
|
#
158294 |
|
04-May-2006 |
bz |
In rtrequest and rtinit check for sa_len != 0 for the given destination. These checks are needed so we do not install a route looking like this: (0) 192.0.2.200 UH tun0 =>
When removing this route the kernel will start to walk the address space which looks like a hang on 64bit platforms because it'll take ages while on 32bit you should see a panic when kernel debugging options are turned on.
The problem is in rtrequest1:
	if (netmask) {
		rt_maskedcopy(dst, ndst, netmask);
	} else
		bcopy(dst, ndst, dst->sa_len);
In both cases the len might be 0 if the application forgot to set it. If so ndst will be all-zero leading to above mentioned strange routes.
This is an application error but we must not fail/hang/panic because of this.
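The kind of guard described here can be sketched as below. This is an illustration only, assuming a simplified sockaddr layout; demo_sockaddr and demo_copy_dst() are hypothetical names and the check shown is not the exact code added to rtrequest and rtinit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a BSD-style struct sockaddr. */
struct demo_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char sa_data[14];
};

/*
 * Refuse a destination whose length field was never filled in.
 * Copying sa_len == 0 bytes would leave ndst all-zero, which is
 * how the strange "(0)" routes came to be installed.
 */
int
demo_copy_dst(struct demo_sockaddr *ndst, const struct demo_sockaddr *dst)
{
	if (dst == NULL || dst->sa_len == 0 || dst->sa_len > sizeof(*ndst))
		return (EINVAL);
	memset(ndst, 0, sizeof(*ndst));
	memcpy(ndst, dst, dst->sa_len);
	return (0);
}
```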
Looks ok: gnn No objections: net@ (silence) MFC after: 8 weeks
|
#
152315 |
|
11-Nov-2005 |
ru |
- Store pointer to the link-level address right in "struct ifnet" rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr.
- Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead.
|
#
150414 |
|
21-Sep-2005 |
glebius |
Several fixes to rt_setgate(), that fix problems with route changing:
- Rearrange code so that in a case of failure the affected route is not changed. Otherwise, a bogus rtentry will be left and later rt_check() can recurse on its lock. [1]
- Remove comment about protocol cloning.
- Fix two places where rtentry mutex was recursed on, because accessed via two different pointers, that were actually pointing to the same rtentry in some cases. [1]
- Return EADDRINUSE instead of bogus EDQUOT, in case when gateway uses the same route. [2]
Reported & tested by: ps, Andrej Zverev <az inec.ru> [1] PR: kern/64090 [2]
|
#
150351 |
|
19-Sep-2005 |
andre |
Use monotonic 'time_uptime' instead of 'time_second' as timebase for rt->rt_rmx.rmx_expire.
|
#
148954 |
|
11-Aug-2005 |
glebius |
o Make rt_check() function more strict:
  - rt0 passed to rt_check() must not be NULL, assert this.
  - rt returned by rt_check() must be valid locked rtentry, if no error occurred.
o Modify callers, so that they never pass NULL rt0 to rt_check().
Reviewed by: sam, ume (nd6.c)
|
#
148883 |
|
09-Aug-2005 |
glebius |
In preparation for fixing races in ARP (and probably in other L2/L3 mappings) make rt_check() return a locked rtentry.
|
#
147650 |
|
28-Jun-2005 |
qingli |
Require gateways for routes to be of the same address family as the route itself.
It fixes a bug where an IPv4 route for example has an IPv6 gateway specified:
route add 10.1.1.1 -inet6 fe80::1%fxp0
Destination        Gateway            Flags    Refs      Use  Netif Expire
10.1.1.1           fe80::1%fxp0       UGHS        0        0   fxp0
The fix rejects these illegal combinations:
route: writing to routing socket: Invalid argument
add host 10.1.1.1: gateway fe80::1%fxp0: Invalid argument
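A check of this kind could look roughly like the following. This is a deliberately simplified sketch: it compares the two families for plain equality, whereas the committed check may well allow exceptions that are not described in this log. The structure, constants and function name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sockaddr; only the family matters. */
struct demo_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
};

#define DEMO_AF_INET	2	/* illustrative family numbers */
#define DEMO_AF_INET6	28

/*
 * Reject a route whose gateway is in a different address family
 * than its destination, e.g. an IPv4 host route with an IPv6
 * gateway.  A missing gateway is left for other code to judge.
 */
int
demo_check_gateway(const struct demo_sockaddr *dst,
    const struct demo_sockaddr *gate)
{
	if (dst == NULL)
		return (EINVAL);
	if (gate != NULL && gate->sa_family != dst->sa_family)
		return (EINVAL);
	return (0);
}
```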
Reviewed by: KAME jinmei@isl.rdc.toshiba.co.jp Reviewed by: andre (mentor) Approved by: re MFC after: 5
|
#
139823 |
|
06-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
134122 |
|
21-Aug-2004 |
csjp |
When a prison is given the ability to create raw sockets (when the security.jail.allow_raw_sockets sysctl MIB is set to 1) where privileged access to jails is given out, it is possible for prison root to manipulate various network parameters which affect the host environment. This commit plugs a number of security holes associated with the use of raw sockets and prisons.
This commit makes the following changes:
- Add a comment to rtioctl warning developers that if they add any ioctl commands, they should use super-user checks where necessary, as it is possible for PRISON root to make it this far in execution.
- Add super-user checks for the execution of the SIOCGETVIFCNT and SIOCGETSGCNT IP multicast ioctl commands.
- Add a super-user check to rip_ctloutput(). If the calling cred is PRISON root, make sure the socket option name is IP_HDRINCL, otherwise deny the request.
Although this patch corrects a number of security problems associated with raw sockets and prisons, the warning in jail(8) should still apply, and by default we should keep the default value of security.jail.allow_raw_sockets MIB to 0 (or disabled) until we are certain that we have tracked down all the problems.
Looking forward, we will probably want to eliminate the references to curthread.
This may be a MFC candidate for RELENG_5.
Reviewed by: rwatson Approved by: bmilekic (mentor)
|
#
133513 |
|
11-Aug-2004 |
andre |
Convert the routing table to use an UMA zone for rtentries. The zone is called "rtentry".
This saves a considerable amount of kernel memory. R_Zmalloc previously used 256 byte blocks (plus kmalloc overhead) whereas UMA only needs 132 bytes.
Idea from: OpenBSD
|
#
132780 |
|
28-Jul-2004 |
kan |
Avoid casts as lvalues.
|
#
128626 |
|
24-Apr-2004 |
luigi |
fix one typo and remove one wrong line
|
#
128622 |
|
24-Apr-2004 |
luigi |
Correct and extend the description of the behaviour of rt_check().
|
#
128524 |
|
21-Apr-2004 |
luigi |
Clearly comment the assumptions that allow us to cast a 'struct radix_node *' to a 'struct rtentry *' in this code, and introduce a macro, RNTORT(), to do this type conversion.
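The assumption behind such a cast can be made explicit in a few lines. The sketch below is not the committed macro: the demo_ structures are cut-down, hypothetical versions, and the compile-time check is an addition for illustration. The point is that the cast is only valid when the radix nodes are the first member of the route entry.

```c
#include <assert.h>
#include <stddef.h>

struct demo_radix_node {
	int rn_key;
};

struct demo_rtentry {
	struct demo_radix_node rt_nodes[2];	/* must stay the first member */
	int rt_flags;
};

/* The cast is only sound because rt_nodes sits at offset 0. */
_Static_assert(offsetof(struct demo_rtentry, rt_nodes) == 0,
    "radix nodes must be the first member of the route entry");

/* Convert a radix node pointer back to its enclosing route entry. */
#define DEMO_RNTORT(p)	((struct demo_rtentry *)(p))
```

Given a node pointer returned by a tree lookup, DEMO_RNTORT(rn)->rt_flags then reads the flags of the entry that embeds it.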
|
#
128455 |
|
20-Apr-2004 |
luigi |
Fix the initial check for NULL arguments in rtfree (previously it checked for rt == NULL after dereferencing the pointer). We never check for those events elsewhere, so probably these checks might go away here as well.
Slightly simplify (and document) the logic for memory allocation in rt_setgate().
The rest is mostly style changes -- replace 0 with NULL where appropriate, remove the macro SA() that was only used once, remove some useless debugging code in rt_fixchange, explain some odd-looking casts.
|
#
128399 |
|
18-Apr-2004 |
luigi |
replace Bcopy with bcopy as in the rest of the file.
|
#
128357 |
|
17-Apr-2004 |
luigi |
make route_init() static
|
#
128311 |
|
16-Apr-2004 |
luigi |
Consistently use ifaddr_byindex() to access the link-level address of an interface. No functional change.
On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist() which, if confirmed, would deserve to be fixed and MFC'ed
|
#
128185 |
|
13-Apr-2004 |
luigi |
route.h: introduce a macro, SA_SIZE(struct sockaddr *) which returns the space occupied by a struct sockaddr when passed through a routing socket. Use it to replace the macro ROUNDUP(int), that does the same but is redefined by every file which uses it, courtesy of the School of Cut'n'Paste Programming(TM).
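The rounding that such a macro performs can be sketched as a function. This is an illustration written from the description above, not a copy of the header: demo_sockaddr and demo_sa_size() are hypothetical, and treating a NULL or zero-length address as occupying one long is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BSD-style struct sockaddr. */
struct demo_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char sa_data[14];
};

/*
 * Space taken by a sockaddr in a routing message: sa_len rounded
 * up to the next multiple of sizeof(long).  The bit trick works
 * because sizeof(long) is a power of two.
 */
size_t
demo_sa_size(const struct demo_sockaddr *sa)
{
	if (sa == NULL || sa->sa_len == 0)
		return (sizeof(long));
	return (1 + (((size_t)sa->sa_len - 1) | (sizeof(long) - 1)));
}
```

Having one shared definition avoids each file carrying its own slightly different ROUNDUP().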
(partial) userland changes to follow.
|
#
128167 |
|
12-Apr-2004 |
luigi |
in rtinit(), remove one useless variable, and move a few others within the block where they are used.
|
#
128019 |
|
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|
#
124237 |
|
07-Jan-2004 |
sam |
Remove extraneous unlock. This fixes a panic seen when manipulating static entries in the ARP table.
|
#
123262 |
|
07-Dec-2003 |
sam |
bandaid LOR in rt_setgate; a proper fix requires code refactoring
|
#
122986 |
|
25-Nov-2003 |
sam |
workaround LOR in rt_setgate
Reviewed by: andre Approved by: re (rwatson)
|
#
122921 |
|
20-Nov-2003 |
andre |
Remove RTF_PRCLONING from routing table and adjust users of it accordingly. The define is left intact for ABI compatibility with userland.
This is a pre-step for the introduction of tcp_hostcache. The network stack remains fully useable with this change.
Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)
|
#
122334 |
|
08-Nov-2003 |
sam |
replace explicit changes to rt_refcnt by RT_ADDREF and RT_REMREF macros that expand to include assertions when the system is built with INVARIANTS
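Reference-count macros of this style might look like the sketch below. It is an approximation for illustration: the DEMO_ names, the DEMO_INVARIANTS switch and the exact conditions asserted are assumptions, not the definitions from the commit.

```c
#include <assert.h>

struct demo_rtentry {
	int rt_refcnt;
};

/* Assertions are compiled in only for checked builds. */
#ifdef DEMO_INVARIANTS
#define DEMO_ASSERT(cond)	assert(cond)
#else
#define DEMO_ASSERT(cond)	((void)0)
#endif

/* Take a reference; the count must never have gone negative. */
#define DEMO_RT_ADDREF(rt) do {				\
	DEMO_ASSERT((rt)->rt_refcnt >= 0);		\
	(rt)->rt_refcnt++;				\
} while (0)

/* Drop a reference; dropping from zero is a bug. */
#define DEMO_RT_REMREF(rt) do {				\
	DEMO_ASSERT((rt)->rt_refcnt > 0);		\
	(rt)->rt_refcnt--;				\
} while (0)
```

Funnelling every count change through two macros gives a single place to catch underflows in a debug build at no cost to a production one.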
Supported by: FreeBSD Foundation
|
#
121770 |
|
30-Oct-2003 |
sam |
Overhaul routing table entry cleanup by introducing a new rtexpunge routine that takes a locked routing table reference and removes all references to the entry in the various data structures. This eliminates instances of recursive locking and also closes races where the lock on the entry had to be dropped prior to calling rtrequest(RTM_DELETE). This also cleans up confusion where the caller held a reference to an entry that might have been reclaimed (and in some cases used that reference).
Supported by: FreeBSD Foundation
|
#
121717 |
|
29-Oct-2003 |
sam |
avoid recursive lock panic by unlocking before calling rtrequest; this is consistent with other places but will be replaced shortly by a "proper fix"
Supported by: FreeBSD Foundation Pain felt by: Jiri Mikulas
|
#
121139 |
|
16-Oct-2003 |
sam |
Correct handling of cloning loop avoidance: rtalloc1 may return a null pointer in which case we should not do the unlock.
Supported by: FreeBSD Foundation
|
#
120993 |
|
11-Oct-2003 |
sam |
fix braino: null the pointer whose memory we just free'd, not some other pointers that are (potentially) used later
|
#
120888 |
|
07-Oct-2003 |
sam |
ensure local variable is initialized prior to use
|
#
120820 |
|
05-Oct-2003 |
sam |
fix typo that caused a panic when processing an ICMP redirect
Sponsored by: FreeBSD Foundation
|
#
120727 |
|
04-Oct-2003 |
sam |
Locking for updates to routing table entries. Each rtentry gets a mutex that covers updates to the contents. Note this is separate from holding a reference and/or locking the routing table itself.
Other/related changes:
o rtredirect loses the final parameter by which an rtentry reference may be returned; this was never used and added unwarranted complexity for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes, we assume the parent will remain as long as the clone; doing this avoids a circularity in locking during delete
o convert some timeouts to MPSAFE callouts
Notes:
1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level applications cannot/do not know about mutexes. Doing this requires that the mutex be the last element in the structure. A better solution is to introduce an externalized version of struct rtentry but this is a major task because of the intertwining of rtentry and other data structures that are visible to user applications.
2. There are known LORs that are expected to go away with forthcoming work to eliminate many held references. If not these will be resolved prior to release.
3. ATM changes are untested.
Sponsored by: FreeBSD Foundation Obtained from: BSD/OS (partly)
|
#
120701 |
|
03-Oct-2003 |
sam |
cleanups prior to adding locking (and in some cases to eliminate locking):
o move route_cb to be private to rtsock.c
o replace global static route_proto by locals
o eliminate global #define shorthands for info references
o remove some register decls
o ansi-fy function decls
o move items to be close in scope to their usage
o add rt_dispatch function for dispatching the actual message
o cleanup tangled logic for doing all-but-me msg send
Support by: FreeBSD Foundation
|
#
113428 |
|
13-Apr-2003 |
hsu |
No need to unlock if error detected before locking.
Submitted by: harti
|
#
111767 |
|
02-Mar-2003 |
mdodd |
Reduce code duplication. This adds the function rt_check() to route.c.
Approved by: sam (in principle)
|
#
111119 |
|
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
#
109623 |
|
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
108272 |
|
25-Dec-2002 |
ru |
I'm not sure what was the problem at the time of revision 1.37 when julian@ added it, but the commented out code had at least one bug -- not freeing the allocated mbuf.
Anyway, this comment no longer applies as of revision 1.67, so remove it.
|
#
108270 |
|
25-Dec-2002 |
ru |
Revision 1.67 changes correspond to CSRG revision 8.3.1.1 changes.
|
#
108269 |
|
25-Dec-2002 |
ru |
If the caller of rtrequest*(RTM_DELETE, ...) asked for a copy of the entry being removed (ret_nrt != NULL), increment the entry's rt_refcnt like we do it for RTM_ADD and RTM_RESOLVE, rather than messing around with 1->0 transitions for rtfree() all over.
|
#
108250 |
|
24-Dec-2002 |
hsu |
SMP locking for radix nodes.
|
#
108206 |
|
23-Dec-2002 |
ru |
rn_walktree*() compute the next leaf before applying a function to current leaves because function may vanish the current node.
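The "compute the next leaf first" pattern can be shown on a plain linked list. This is a sketch of the general technique only; the real walkers traverse a radix tree, and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_leaf {
	int key;
	struct demo_leaf *next;
};

typedef int demo_walk_fn(struct demo_leaf *leaf, void *arg);

/*
 * Apply fn to every leaf.  The successor is read before fn runs,
 * because fn is allowed to free the leaf it was handed.
 */
int
demo_walk(struct demo_leaf *head, demo_walk_fn *fn, void *arg)
{
	struct demo_leaf *leaf, *next;
	int error;

	for (leaf = head; leaf != NULL; leaf = next) {
		next = leaf->next;	/* saved before leaf may vanish */
		error = fn(leaf, arg);
		if (error != 0)
			return (error);
	}
	return (0);
}

/* Example callback: count the leaf, then free it. */
int
demo_free_leaf(struct demo_leaf *leaf, void *arg)
{
	(*(int *)arg)++;
	free(leaf);
	return (0);
}
```

The saved pointer protects only against the current leaf disappearing. If the callback can also remove the remembered successor, for instance through a nested walk, the outer loop resumes on freed memory.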
If parent RTA_GENMASK route has a clone (a "cloning clone"), an rn_walktree_from() starting from parent will cause another walk starting from clone. If a function is either rt_fixdelete() or rt_fixchange(), this recursive walk may vanish the leaf that is remembered by an outer walk (the "next leaf" above), panicking a system when it resumes with an outer walk.
The following script panicked my single-user mode booted system:
: sysctl net.inet.ip.forwarding=1
: ipfw add 1 allow ip from any to any
: ifconfig lo0 127.1
: route add -net 10 -genmask 255.255.255.0 127.1
: telnet 10.1 # rt_fixchange() panic
: telnet 10.2
: telnet 10.1
: route delete -net 10 # rt_fixdelete() panic
For the time being, avoid these races by disallowing recursive walks in rt_fixchange() and rt_fixdelete().
Also, make a slight optimization in the rtrequest(RTM_RESOLVE) case: there is no reason to call rt_fixchange() in this case.
PR: kern/37606 MFC after: 5 days
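As an illustration only (not code from this commit), the "compute the next leaf before applying the function" pattern can be sketched on a plain singly linked list. The names struct node, walk_fn, push(), is_even() and walk_list() are hypothetical and stand in for the radix-tree types and rn_walktree*(); the sketch assumes the callback's return value decides whether the walker frees the current node.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node; a stand-in for a radix-tree leaf. */
struct node {
	int key;
	struct node *next;
};

/* Hypothetical callback type: nonzero means "remove this node". */
typedef int (*walk_fn)(struct node *n, void *arg);

/* Prepend a node; aborts on allocation failure to keep the sketch short. */
static struct node *
push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));

	if (n == NULL)
		abort();
	n->key = key;
	n->next = head;
	return (n);
}

/* Example callback: ask for removal of nodes with even keys. */
static int
is_even(struct node *n, void *arg)
{
	(void)arg;
	return (n->key % 2 == 0);
}

/*
 * Walk the list.  The successor is fetched before the callback runs,
 * so freeing the current node cannot invalidate the cursor.
 * Returns the number of nodes removed.
 */
static int
walk_list(struct node **headp, walk_fn f, void *arg)
{
	struct node **linkp = headp;	/* link that points at cur */
	struct node *cur, *next;
	int removed = 0;

	for (cur = *headp; cur != NULL; cur = next) {
		next = cur->next;	/* remember the successor first */
		assert(next != cur);	/* guard against a self-loop */
		if (f(cur, arg)) {
			*linkp = next;	/* unlink, then free */
			free(cur);
			removed++;
		} else {
			linkp = &cur->next;
		}
	}
	return (removed);
}
```

The saved successor protects only against removal of the current node. If a nested walk frees the node held in next, the outer loop resumes on freed memory, which is the kind of race the message above describes and avoids by disallowing recursive walks.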
|
#
108033 |
|
18-Dec-2002 |
hsu |
Lock up ifaddr reference counts.
|
#
106968 |
|
15-Nov-2002 |
luigi |
Massive cleanup of the ip_mroute code.
No functional changes, but:
+ the mrouting module now should behave the same as the compiled-in version (it did not before, some of the rsvp code was not loaded properly);
+ netinet/ip_mroute.c is now truly optional;
+ removed some redundant/unused code;
+ changed many instances of '0' to NULL and INADDR_ANY as appropriate;
+ removed several static variables to make the code more SMP-friendly;
+ fixed some minor bugs in the mrouting code (mostly, incorrect return values from functions).
This commit is also a prerequisite to the addition of support for PIM, which I would like to put in before DP2 (it does not change any of the existing APIs, anyway).
Note, in the process we found out that some device drivers fail to properly handle changes in IFF_ALLMULTI, leading to interesting behaviour when a multicast router is started. This bug is not corrected by this commit, and will be fixed with a separate commit.
Detailed changes:
--------------------
netinet/ip_mroute.c   all the above.
conf/files            make ip_mroute.c optional
net/route.c           fix mrt_ioctl hook
netinet/ip_input.c    fix ip_mforward hook, move rsvp_input() here together with other rsvp code, and a couple of indentation fixes.
netinet/ip_output.c   fix ip_mforward and ip_mcast_src hooks
netinet/ip_var.h      rsvp function hooks
netinet/raw_ip.c      hooks for mrouting and rsvp functions, plus interface cleanup.
netinet/ip_mroute.h   remove an unused and optional field from a struct
Most of the code is from Pavlin Radoslavov and the XORP project
Reviewed by: sam MFC after: 1 week
|
#
97649 |
|
31-May-2002 |
silby |
Ensure that packet counts are always reset to 0 when a route is cloned. Previously, they took on the count of their parent route (which was sometimes nonzero.)
Submitted by: Andre Oppermann <oppermann@pipeline.ch> MFC after: 5 days
|
#
92725 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
87060 |
|
28-Nov-2001 |
brian |
Fix a typo in a comment
|
#
85074 |
|
17-Oct-2001 |
ru |
Pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2.
Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *'' as the argument. Pass rt_addrinfo all the way down to rtrequest1 and ifa->ifa_rtrequest. 3rd argument of ifa->ifa_rtrequest is now ``rt_addrinfo *'' instead of ``sockaddr *'' (almost no one is using it anyway).
Benefit: the following command now works. Previously we needed two route(8) invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0
Remove unsafe typecast in rtrequest(), from ``rtentry *'' to ``sockaddr *''. It was introduced by 4.3BSD-Reno and never corrected.
Obtained from: BSD/OS, NetBSD MFC after: 1 month PR: kern/28360
|
#
85052 |
|
17-Oct-2001 |
ru |
64-bit fixes from CSRG.
|
#
84971 |
|
15-Oct-2001 |
ru |
Don't even attempt to clone host routes.
MFC after: 1 week
|
#
80353 |
|
25-Jul-2001 |
fenner |
Don't bother passing p to rtioctl just so it can fail to pass it to mrt_ioctl
|
#
80350 |
|
25-Jul-2001 |
ume |
As the comment in sys/net/route.c notes, rt_fixchange() has a bad effect, which would cause unnecessary route deletion:
 * Unfortunately, this has the obnoxious
 * property of also triggering for insertion /above/ a pre-existing network
 * route and clones. Sigh. This may be fixed some day.
The effect has been even worse, because recent versions of route.c set the parent rtentry for cloned routes from an interface-direct route. For example, suppose that we have an interface "ne0" that has an IPv4 subnet "10.0.0.0/24". Then we may have a cloned route like 10.0.0.1 on the interface, whose parent route is 10.0.0.0/24 (to the interface ne0). Now, when we add the default route (i.e. 0.0.0.0/0), rt_fixchange() will remove the cloned route 10.0.0.1. The (bad) effect also prevents rt_setgate from configuring rt_gwroute, which would not be an intended behavior.
As suggested in the comments to rt_fixchange(), we need a stricter check in the function to prevent unintentional route deletion.
This fix also solves the "IPV6 panic?" problem in nd6_timer().
Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp> MFC after: 4 days
|
#
77689 |
|
04-Jun-2001 |
ru |
When looking for an interface appropriate for the (new or changing) route in ifa_ifwithroute(), as the last resort, look up the route to the gateway, not destination (to derive the interface from).
PR: kern/27852 Submitted by: Iasen Kostoff <tbyte@tbyte.org> MFC after: 2 weeks
|
#
74299 |
|
15-Mar-2001 |
ru |
net/route.c:
A route generated from an RTF_CLONING route had the RTF_WASCLONED flag set but did not have a reference to the parent route, as documented in the rtentry(9) manpage. This prevented such routes from being deleted when their parent route is deleted.
Now, for example, if you delete an IP address from a network interface, all ARP entries that were cloned from this interface route are flushed.
This also has an impact on netstat(1) output. Previously, dynamically created ARP cache entries (RTF_STATIC flag is unset) were displayed as part of the routing table display (-r). Now, they are only printed if the -a option is given.
netinet/in.c, netinet/in_rmx.c:
When address is removed from an interface, also delete all routes that point to this interface and address. Previously, for example, if you changed the address on an interface, outgoing IP datagrams might still use the old address. The only solution was to delete and re-add some routes. (The problem is easily observed with the route(8) command.)
Note, that if the socket was already bound to the local address before this address is removed, new datagrams generated from this socket will still be sent from the old address.
PR: kern/20785, kern/21914 Reviewed by: wollman (the idea)
|
#
59529 |
|
23-Apr-2000 |
wollman |
A couple months ago, Kirk and I were doing a walkthrough of the radix-tree search routine, and scratching our heads over why it was so obfuscated. This delta fixes a number of confusing style bugs and renames several structure members to have more meaningful names. There remain a number of odd control-flow structures. These changes do not affect the generated code.
|
#
56030 |
|
15-Jan-2000 |
shin |
Clear ro->ro_rt just after RTFREE(). Please let me make sure that no one touches the invalid ro_rt pointer after splx(s) and before the next ro_rt initialization. Though this usually seems to be called at splnet already, I still sometimes experience kernel crashes at rtfree() in my INET6-enabled environment where IPv6 connections are frequently used. (Of course, it might just be due to another bug.)
|
#
55009 |
|
22-Dec-1999 |
shin |
IPSEC support in the kernel. pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers.
Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
54369 |
|
09-Dec-1999 |
jdp |
Fix a route table leak in rtalloc() and rtalloc_ign(). It is possible for ro->ro_rt to be non-NULL even though the RTF_UP flag is cleared. (Example: a routing daemon or the "route" command deletes a cloned route in active use by a TCP connection.) In that case, the code was clobbering a reference to the routing table entry without decrementing the entry's reference count.
The splnet() call probably isn't needed, but I haven't been able to prove that yet. It isn't significant from a performance standpoint since it is executed very rarely.
Reviewed by: wollman and others in the freebsd-current mailing list
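A minimal sketch of the leak pattern described above, written as an illustration rather than taken from the commit. struct entry, struct cache, ENT_UP, entry_release() and cache_refresh() are hypothetical stand-ins for rtentry, struct route (ro_rt), RTF_UP, RTFREE() and rtalloc(); locking and splnet() are left out.

```c
#include <assert.h>
#include <stddef.h>

#define ENT_UP 0x1	/* hypothetical stand-in for RTF_UP */

/* Hypothetical reference-counted table entry. */
struct entry {
	int refcnt;
	int flags;
};

/* Hypothetical per-connection cache holding one counted reference. */
struct cache {
	struct entry *ent;
};

/* Drop one reference; the count must be positive beforehand. */
static void
entry_release(struct entry *e)
{
	assert(e->refcnt > 0);
	e->refcnt--;
}

/*
 * Refresh the cached entry.  If the cache still points at an entry
 * that is no longer up, release that reference before overwriting
 * the pointer; overwriting without releasing leaks the old entry.
 */
static void
cache_refresh(struct cache *c, struct entry *fresh)
{
	if (c->ent != NULL) {
		if (c->ent->flags & ENT_UP)
			return;		/* cached entry is still usable */
		entry_release(c->ent);	/* stale: drop our reference */
		c->ent = NULL;
	}
	fresh->refcnt++;		/* the cache now holds a reference */
	c->ent = fresh;
}
```

Calling cache_refresh() a second time with a live cached entry returns early and leaves the reference counts unchanged, so repeated lookups do not accumulate references.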
|
#
54350 |
|
09-Dec-1999 |
shin |
rtcalloc() is removed because it turned out not to be necessary for FreeBSD. (It was added as a part of KAME patch)
Specified by: jdp@polstra.com
|
#
53647 |
|
23-Nov-1999 |
brian |
Only emit the ``wrong ifa'' message if the matching interface is neither IFF_LOOPBACK nor IFF_POINTOPOINT. It's quite common (and probably more correct) to route local IP numbers via lo0 and it makes configuration easier to assign the hostname address to local POINTOPOINT links too.
This message usually remains hidden because the loopback interface gets the highest interface number at boot time, but when the ethernet interface is added later, the message can get pretty annoying.
Also, fix a typo.
Not objected to by: freebsd-net
|
#
53541 |
|
22-Nov-1999 |
shin |
KAME netinet6 basic part (no IPsec, no V6 Multicast Forwarding, no UDP/TCP for IPv6 yet)
With this patch, you can assign an IPv6 address automatically and can reply to IPv6 ping.
Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
46161 |
|
29-Apr-1999 |
luoqi |
Postpone route_init() until all domains are attached.
|
#
43305 |
|
27-Jan-1999 |
dillon |
Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile
|
#
35256 |
|
17-Apr-1998 |
des |
Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.
|
#
33181 |
|
09-Feb-1998 |
eivind |
Staticize.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
32350 |
|
08-Jan-1998 |
eivind |
Make INET a proper option.
This will not change any of the object files that LINT creates; there might be differences with INET disabled, but hardly anything compiled without INET before anyway. Now the 'obvious' things will give a proper error if compiled without inet - ipx_ip, ipfw, tcp_debug. The only thing that _should_ work (but can't be made to compile reasonably easily) is sppp :-(
This commit moves struct arpcom from <netinet/if_ether.h> to <net/if_arp.h>.
|
#
30813 |
|
28-Oct-1997 |
bde |
Removed unused #includes.
|
#
29506 |
|
16-Sep-1997 |
bde |
Fixed gratuitous ANSIisms.
|
#
29024 |
|
01-Sep-1997 |
bde |
Added used #include - don't depend on <sys/mbuf.h> including <sys/malloc.h> (unless we only use the bogusly shared M*WAIT flags).
|
#
24203 |
|
24-Mar-1997 |
bde |
Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include it when it is not used. In most cases, the reasons for including it went away when the special ioctl headers became self-sufficient.
|
#
23392 |
|
05-Mar-1997 |
julian |
add a bunch of comments to describe what's going on. This is some of the worst code I've had to wade through in ages and I don't want to have to start from scratch again next time.
(I have a 2.2 version of these comments, can I commit them?)
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22010 |
|
25-Jan-1997 |
julian |
fix misleading comment (my error.. I wrote the comment)
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
18206 |
|
10-Sep-1996 |
julian |
No code changes whatsoever, but added about 150 lines of comments. Sorry if this makes it harder to merge in lite2 stuff, but hey, at least I can figure out what is going on whenever I end up going through those files again.
Do we have a policy regarding commenting existing code?
|
#
17997 |
|
02-Sep-1996 |
fenner |
Bugfix and simplification for rev 1.34: make sure that the route is non-null before trying to delete it in rt_setgate(), which then allows removal of the special-case code from the RTM_ADD case. This should fix the panics that joerg and Phil Karn have been seeing.
|
#
17802 |
|
24-Aug-1996 |
peter |
route.c:RTM_ADD does not check for a netmask before doing a tree walk like it does elsewhere. This probably only happens when incorrect args are given to route(8), or when running with non-IPv4 stacks, but incorrect args to the route command are no excuse for panicking!
Submitted by: Michael Clay <mclay@weareb.org>, PR#1532
|
#
17052 |
|
09-Jul-1996 |
fenner |
Disallow host routes that point to themselves. These routes serve no purpose, other than to get in the way of the ARP table and cause "can't allocate llinfo" errors.
This change may cause gated or routed to start complaining when adding such routes. If so, these programs will need to be fixed to not try to add these routes.
Reviewed by: wollman
|
#
14904 |
|
29-Mar-1996 |
fenner |
Eliminate panic("rtfree") caused by double-freeing the route when rt == rt->rt_gwroute. rt == rt->rt_gwroute shouldn't happen in the first place, but that's another problem.
(try "route add -host <hostonmynet> <hostonmynet>; ping <hostonmynet>; route delete <hostonmynet>")
|
#
14546 |
|
11-Mar-1996 |
dg |
Move or add #include <queue.h> in preparation for upcoming struct socket changes.
|
#
14328 |
|
02-Mar-1996 |
peter |
Add more options into the conf/options and i386/conf/options.i386 files and the #include hooks so that 'make depend' is more useful. This covers most of the options I regularly use (but not all) and some other easy ones.
|
#
13616 |
|
24-Jan-1996 |
wollman |
Fix memory leak in case of adding a host route on top of another one.
Pointed-out-by: Bill Fenner <fenner@parc.xerox.com>
|
#
12820 |
|
14-Dec-1995 |
phk |
Another mega commit to staticize things.
|
#
12578 |
|
02-Dec-1995 |
bde |
Fixed call to mrt_ioctl(). mrt_ioctl() for some reason has different number of args when MROUTING is defined.
|
#
11921 |
|
29-Oct-1995 |
phk |
Second batch of cleanup changes. This time mostly making a lot of things static and some unused variables here and there.
|
#
11539 |
|
16-Oct-1995 |
wollman |
When adding a route fails because there is already a route with the same (mask,value) in the tree, don't immediately return EEXIST. Instead, check to see if the pre-existing route was generated by protocol-cloning. If so, then it is OK to simply blow away the old route and re-attempt the insertion. If not, then fall back to the same error code as before.
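The retry logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from route.c: a small fixed-size array replaces the radix tree, a single integer key replaces the (mask,value) pair, and struct slot, F_CLONED, find(), raw_insert() and insert_replacing_clones() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SLOTS    8
#define F_CLONED 0x1	/* hypothetical "generated by cloning" flag */

/* Hypothetical table slot; stands in for a routing-tree entry. */
struct slot {
	int used;
	unsigned key;
	int flags;
};

/* Return the slot holding key, or NULL if it is absent. */
static struct slot *
find(struct slot *t, unsigned key)
{
	for (size_t i = 0; i < SLOTS; i++)
		if (t[i].used && t[i].key == key)
			return (&t[i]);
	return (NULL);
}

/* Plain insert: fails with EEXIST on a duplicate key. */
static int
raw_insert(struct slot *t, unsigned key, int flags)
{
	if (find(t, key) != NULL)
		return (EEXIST);
	for (size_t i = 0; i < SLOTS; i++) {
		if (!t[i].used) {
			t[i].used = 1;
			t[i].key = key;
			t[i].flags = flags;
			return (0);
		}
	}
	return (ENOSPC);
}

/*
 * Insert, but if the only obstacle is an entry that was generated
 * by cloning, remove that entry and retry once.  Any other
 * duplicate still yields EEXIST, as before.
 */
static int
insert_replacing_clones(struct slot *t, unsigned key, int flags)
{
	int error = raw_insert(t, key, flags);

	if (error == EEXIST) {
		struct slot *old = find(t, key);

		if (old != NULL && (old->flags & F_CLONED)) {
			old->used = 0;			/* blow away the clone */
			assert(find(t, key) == NULL);
			error = raw_insert(t, key, flags);
		}
	}
	return (error);
}
```

Only a single retry is attempted, so a second collision with an entry that was not cloned falls through to the original error code.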
|
#
9759 |
|
29-Jul-1995 |
bde |
Eliminate sloppy common-style declarations. There should be none left for the LINT configuration.
|
#
9469 |
|
10-Jul-1995 |
wollman |
When adding a route, set rt_ifa and rt_ifp a little earlier so that the protocol-specific add routine can examine it if desired.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8070 |
|
25-Apr-1995 |
wollman |
Finally finish the cloning cleanup work by making sure that clones go away whenever a clone's parent is changed, or a route is added in a certain set of circumstances.
This also includes code to forbid setting a route's gateway to an address which can only be reached through that route, thus (hopefully) eliminating one class of cloning bottomless-recursion bugs.
|
#
7335 |
|
24-Mar-1995 |
wollman |
Don't delete clones if they are PINNED.
|
#
7279 |
|
23-Mar-1995 |
wollman |
radix.c: correct exit condition in rn_walktree_from()
route.c: be a little more careful when deleting children of dying routes
|
#
7224 |
|
21-Mar-1995 |
wollman |
Protocol-cloned routes should gain a reference to their parents to make sure that rt->rt_parent values can never be re-used harmfully.
|
#
7199 |
|
20-Mar-1995 |
dg |
Made minor readability tweak.
|
#
7197 |
|
20-Mar-1995 |
wollman |
Better fix for the deletion of parents of cloned routes problem, superseding the `nextchild' hack. This also provides a way forward to fix RTM_CHANGE and RTM_ADD as well.
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
5801 |
|
23-Jan-1995 |
dg |
Added back the missing last few bytes of the file.
|
#
5791 |
|
23-Jan-1995 |
wollman |
route.c: keep track of where cloned routes come from, and make sure to delete them when the ``parent'' goes away
route.h: add glue to track this to rtentry structure. WARNING WILL ROBINSON! This will be yet another incompatible change in your route-using binaries. I apologize, but this was the only way to do it. I took this opportunity to increase the size of the metrics to what I believe will be the final length for 2.1, so that when the T/TCP stuff is done, this won't happen again.
|
#
5104 |
|
13-Dec-1994 |
wollman |
Implemented rtalloc_ign().
|
#
5099 |
|
13-Dec-1994 |
wollman |
Add support for two separate cloning flags, one set by the lower layers, and one set by the protocol family. Also add another parameter to rtalloc1() to allow for any interface flags to be ignored; currently this is only useful for RTF_PRCLONING. Get rid of rt_prflags and re-unite with rt_flags. Add T/TCP ``route metrics''.
NB: YOU MUST RECOMPILE `route' AND OTHER RELATED PROGRAMS AS A RESULT OF THIS CHANGE.
This also adds a new interface parameter, `ifi_physical', which will eventually replace IFF_ALTPHYS as the mechanism for specifying the particular physical connection desired on a multiple-connection card.
NB: YOU MUST RECOMPILE `ifconfig' AND OTHER RELATED PROGRAMS AS A RESULT OF THIS CHANGE.
|
#
4104 |
|
02-Nov-1994 |
wollman |
Collapse two fields so that we have space for another 32 flags. NB: You will have to recompile programs which use the `rt_use' member in order to get the correct values. This should not cause incorrect operation, but the statistics may look a little confusing.
|
#
4073 |
|
02-Nov-1994 |
wollman |
Add code to be a bit smarter about IP routes, conditioned on the option IN_RMX. (Eventually this will be standard, but I just wrote the code today and don't want to break anyone.)
|
#
3514 |
|
11-Oct-1994 |
wollman |
Fix a bug which caused panics when attempting to change just the flags of a route. (This still doesn't work, but it doesn't panic now.) It looks like there may be a number of incipient bugs in this code.
Also, get ready for the time when all IP gateway routes are cloning, which is necessary to keep proper TCP statistics.
|
#
3311 |
|
02-Oct-1994 |
phk |
GCC cleanup.
|
#
2754 |
|
14-Sep-1994 |
wollman |
Shuffle some functions and variables around to make it possible for multicast routing to be implemented as an LKM. (There's still a bit of work to do in this area.)
|
#
2554 |
|
07-Sep-1994 |
wollman |
The mrt_ioctl goop properly depends on MROUTING, not MULTICAST. (Oof!)
|
#
2544 |
|
07-Sep-1994 |
se |
rtioctl(): changed parameter to mrt_ioctl from "cmd" to "req" to make it compile with MULTICAST defined.
Reviewed by: Stefan Esser
|
#
2531 |
|
06-Sep-1994 |
wollman |
Initial get-the-easy-case-working upgrade of the multicast code to something more recent than the ancient 1.2 release contained in 4.4. This code has the following advantages as compared to previous versions (culled from the README file for the SunOS release):
- True multicast delivery
- Configurable rate-limiting of forwarded multicast traffic on each physical interface or tunnel, using a token-bucket limiter.
- Simplistic classification of packets for prioritized dropping.
- Administrative scoping of multicast address ranges.
- Faster detection of hosts leaving groups.
- Support for multicast traceroute (code not yet available).
- Support for RSVP, the Resource Reservation Protocol.
What still needs to be done:
- The multicast forwarder needs testing.
- The multicast routing daemon needs to be ported.
- Network interface drivers need to have the `#ifdef MULTICAST' goop ripped out of them.
- The IGMP code should probably be bogon-tested.
Some notes about the porting process:
In some cases, the Berkeley people decided to incorporate functionality from later releases of the multicast code, but then had to do things differently. As a result, if you look at Deering's patches, and then look at our code, it is not always obvious whether the patch even applies. Let the reader beware.
I ran ip_mroute.c through several passes of `unifdef' to get rid of useless grot, and to permanently enable the RSVP support, which we will include as standard.
Ported by: Garrett Wollman Submitted by: Steve Deering and Ajit Thyagarajan (among others)
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|