#
ed34a6b6 |
|
17-Jan-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Add subinterface interrupt allocation function The ice(4) driver will add the ability to create extra interfaces that hang off of the base interface; to do that the driver requires a method for the subinterface to request hardware interrupt resources from the base interface. Signed-off-by: Eric Joyner <erj@FreeBSD.org> MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D39930
|
#
3c7da27a |
|
22-Mar-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Add sysctl to request extra MSIX vectors on driver load Intended to be used with upcoming feature to add sub-interfaces, since those new interfaces will be dynamically created and will need to have spare MSI-X interrupts already allocated for them on driver load. This sysctl is marked as a tunable since it will need to be set before the driver is loaded since MSI-X interrupt allocation and setup is done during the attach process. Signed-off-by: Eric Joyner <erj@FreeBSD.org> MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D41326
|
#
e4a0c92e |
|
16-Apr-2024 |
Stephen J. Kiernan <stevek@FreeBSD.org> |
iflib: Correct indentation according to style(9) The indentation style for the SYSCTL_* macros used was not matching KNF. Reported by: jhb Differential Revision: https://reviews.freebsd.org/D44811
|
#
303dea74 |
|
03-Apr-2024 |
Stephen J. Kiernan <stevek@FreeBSD.org> |
iflib: Fix compiler warnings Some of the QUAD sysctls are actually for unsigned quad values. Switch to using UQUAD instead, as that is meant for unsigned. Reviewed by: erj, jhb Obtained from: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D44620
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
d2dd3d5a |
|
04-Aug-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Remove redundant variable In iflib_init_locked(), sctx and scctx both point to the same value, which is the ifc_softc_ctx field in the iflib softc. Remove the declaration and assignment to sctx since scctx can be used instead, and the name of scctx follows the naming convention used for local variables that point to ifc_softc_ctx. In theory there should be no functional impact with this change. Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: kbowling@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D41325
|
#
7f527d48 |
|
04-Aug-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Fix white space and reduce some line lengths This helps align some of the code with the rest of the style used in iflib, but as marius@ points out, this is not style(9). Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: kbowling@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D41324
|
#
7ff9ae90 |
|
03-Aug-2023 |
Marius Strobl <marius@FreeBSD.org> |
iflib(9): Remove support for cloning pseudo interfaces This code was used by the first incarnation of wg(4) and is dead ever since f187d6dfbf633665ba6740fe22742aec60ce02a2 has removed the latter again. Moreover, this code matched iflib(4) like a square peg fits in a round hole, was incomplete and despite some hacks still tailored to VPC and wg(4) but not generic. In effect, this reverts the following: 09f6ff4f1a47c3009dc16fdc609a44f2341bc7ac (w/ its "ancillary changes") 9aeca21324f481f57f2ecb7009f461f4f51b62b3 1f93e931d9f0c688f43f98ef777e04636a325526 0f9544d03e89d180f94a7a84b110ec7d2b6c625a 0dd691b41276ce13d25ffb1443af27f85038aa3f Reviewed by: erj, kbowling Differential Revision: <https://reviews.freebsd.org/D41196>
|
#
04d4e345 |
|
27-Jul-2023 |
Przemyslaw Lewandowski <przemyslawx.lewandowski@intel.com> |
iflib: Fix panic during driver reload stress test During a driver reload stress test, after 50-300 reloads a panic occurs. After adding sleeps in between loading and unloading the driver, the issue does not occur. It's possible that loading/unloading too fast may cause the gt_taskqueue pointer to be freed earlier than expected; checking for a null pointer first fixes it. Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: erj@ Tested by: jeffrey.e.pieper@intel.com MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D39457
|
#
a52f23f4 |
|
19-Jul-2023 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Unlock ctx lock around call to ether_ifattach() A panic occurs when loading the driver with kldload. The problem has existed since netlink was enabled and is caused by double locking of the ctx lock. This fix allows ether_ifattach() to be called without the ctx lock held. Signed-off-by: Eric Joyner <erj@FreeBSD.org> PR: 271768 Reviewed by: erj@, jhb@ MFC after: 1 day Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D40557
|
#
a6b55ee6 |
|
17-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH Expect that drivers call into the network stack with the net epoch entered. This has already been the fact since early 2020. The net interrupts, that are marked with INTR_TYPE_NET, were entering epoch since 511d1afb6bf. For the taskqueues there is NET_TASK_INIT() and all drivers that were known back in 2020 were marked with it in 6c3e93cb5a4. However in e87c4940156 we took a conservative approach and preferred to opt-in rather than opt-out for the epoch. This change not only reverts e87c4940156 but adds a safety belt to avoid panicking with INVARIANTS if there is a missed driver. With INVARIANTS we will run in_epoch() check, print a warning and enter the net epoch. A driver that prints can be quickly fixed with the IFF_NEEDSEPOCH flag, but better be augmented to properly enter the epoch itself. Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the new IFF_NEEDSEPOCH. But the tcp_lro_flush_all() asserts the presence of network epoch. Indeed, all NIC drivers that support LRO already provide the epoch, either with help of INTR_TYPE_NET or just running NET_EPOCH_ENTER() in their code. Reviewed by: zlei, gallatin, erj Differential Revision: https://reviews.freebsd.org/D39510
|
#
25c92cd2 |
|
06-Mar-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
iflib: Further convert to use IfAPI accessors Summary: When iflib was first converted some IfAPI APIs were not yet present, so were tagged with "XXX" comments. Finish the conversion by using these new APIs. Reviewed by: gallatin, erj Sponsored by: Juniper Networks, Inc Differential Revision: https://reviews.freebsd.org/D38928
|
#
5f7bea29 |
|
28-Feb-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
iflib: fix regression with new pfil(9) KPI Do not pass the pointer to our valid mbuf to pfil(9). Pass an uninitialized one only. This was unsafe with the old KPI, too, but for some reason didn't fail. Fixes: caf32b260ad46b17a4c1a8ce6383e37ac489f023
|
#
caf32b26 |
|
14-Feb-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks() The 0b70e3e78b0 changed the original design of a single entry point into pfil(9) chains providing separate functions for the filtering points that always provide mbufs and know the direction of a flow. The motivation was to reduce branching. The logical continuation would be to do the same for the filtering points that always provide a memory pointer and retire the single entry point. o Hooks now provide two functions: one for mbufs and optional for memory pointers. o pfil_hook_args() has a new member and pfil_add_hook() has a requirement to zero out uninitialized data. Bump PFIL_VERSION. o As it was before, a hook function for a memory pointer may realloc into an mbuf. Such mbuf would be returned via a pointer that must be provided in argument. o The only hook that supports memory pointers is ipfw:default-link. It is rewritten to provide two functions. o All remaining uses of pfil_run_hooks() are converted to pfil_mem_in(). o Transparent union of pfil_packet_t and tricks to fix pointer alignment are retired. Internal pfil_realloc() reduces down to m_devget() and thus is retired, too. Reviewed by: mjg, ocochard Differential revision: https://reviews.freebsd.org/D37977
|
#
9147969b |
|
24-Jan-2023 |
Przemyslaw Lewandowski <przemyslawx.lewandowski@intel.com> |
iflib: Add null check to iflib_stop() Ever since gtaskqueue_drain() was added to iflib_stop(), a kernel panic occurs when the ice(4) driver is in recovery mode. Queues are not initialized in this mode, so gt_taskqueue is not initialized, and gtaskqueue_drain() will panic. Fix this by only doing a drain if an RX queue's gt_taskqueue is initialized. Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: erj@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D37892
|
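The guard described in 9147969b can be sketched in userspace C. This is only an illustration of the pattern (skip the drain when the task queue pointer was never set up); `toy_taskqueue`, `toy_rxq`, `toy_drain()` and `toy_stop()` are hypothetical stand-ins and not the iflib or gtaskqueue types and functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real iflib or gtaskqueue types. */
struct toy_taskqueue {
	int drained;			/* how many times the queue was drained */
};

struct toy_rxq {
	struct toy_taskqueue *tq;	/* NULL when queues were never set up */
};

/* Stand-in for a drain routine that dereferences its argument. */
static void
toy_drain(struct toy_taskqueue *tq)
{
	assert(tq != NULL);		/* an unguarded call would fault here */
	tq->drained++;
}

/* Drain every RX queue whose task queue exists; return how many were drained. */
static int
toy_stop(struct toy_rxq *rxqs, int nrxqs)
{
	int i, n = 0;

	for (i = 0; i < nrxqs; i++) {
		if (rxqs[i].tq == NULL)
			continue;	/* e.g. recovery mode: nothing to drain */
		toy_drain(rxqs[i].tq);
		n++;
	}
	return (n);
}
```

With one initialized queue and one uninitialized queue, `toy_stop()` drains only the first and never passes a NULL pointer to the drain routine.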
#
2c2b37ad |
|
13-Jan-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
ifnet/API: Move struct ifnet definition to a <net/if_private.h> Hide the ifnet structure definition, no user serviceable parts inside, it's a netstack implementation detail. Include it temporarily in <net/if_var.h> until all drivers are updated to use the accessors exclusively. Reviewed by: glebius Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D38046
|
#
402810d3 |
|
20-Oct-2021 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Convert iflib(4) and iflib-based drivers to the DrvAPI Summary: Convert iflib(4) and the following drivers: * axgbe * em * ice * ixl * vmxnet Sponsored by: Juniper Networks, Inc. Reviewed by: kbowling, #iflib Differential Revision: https://reviews.freebsd.org/D37768
|
#
9c950139 |
|
17-Oct-2022 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Introduce v2 of TX Queue Select Functionality For v2, iflib will parse packet headers before queueing a packet. This commit also adds a new field in the structure that holds parsed header information from packets; it stores the IP ToS/traffic class field found in the IPv4/IPv6 header. To help, it will only partially parse header packets before queueing them by using a new header parsing function that does less than the current parsing header function; for our purposes we only need up to the minimal IP header in order to get the IP ToS information and don't need to pull up more data. For now, v1 and v2 co-exist in this patch; v1 still offers a less-invasive method where none of the packet is parsed in iflib before queueing. This also bumps the sys/param.h version. Signed-off-by: Eric Joyner <erj@FreeBSD.org> Tested by: IntelNetworking MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D34742
|
#
0294e95d |
|
21-Jul-2022 |
Dimitry Andric <dim@FreeBSD.org> |
Fix unused variable warning in iflib.c With clang 15, the following -Werror warning is produced: sys/net/iflib.c:993:8: error: variable 'n' set but not used [-Werror,-Wunused-but-set-variable] u_int n; ^ The 'n' variable appears to have been a debugging aid that has never been used for anything, so remove it. MFC after: 3 days
|
#
d08cb453 |
|
08-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
iflib: Use empty inline functions for prefetch*() on non-x86. This avoids warnings about unused variables in expressions passed to prefetch*().
|
#
213e9139 |
|
29-Jul-2021 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Allow drivers to determine which queue to TX on Adds a new function pointer to struct if_txrx in order to allow drivers to set their own function that will determine which queue a packet should be sent on. Since this includes a kernel ABI change, bump the __FreeBSD_version as well. (The motivation behind this is to allow the driver to examine the UP in the VLAN tag and determine which queue to TX on based on that, in support of HW TX traffic shaping.) Signed-off-by: Eric Joyner <erj@FreeBSD.org> Reviewed by: kbowling@, stallamr@netapp.com Tested by: jeffrey.e.pieper@intel.com Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D31485
|
#
e0e12405 |
|
14-Jan-2022 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: fix LOR in iflib_netmap_register In iflib_device_register(), the CTX_LOCK is acquired first and then IFNET_WLOCK is acquired by ether_ifattach(). However, in netmap_hw_reg() we do the opposite: IFNET_RLOCK is acquired first, and then CTX_LOCK is acquired by iflib_netmap_register(). Fix this LOR issue by wrapping the CTX_LOCK/UNLOCK calls in iflib_device_register with an additional IFNET_WLOCK. This is safe since the IFNET_WLOCK is recursive. MFC after: 1 month
|
#
618d49f5 |
|
10-Jan-2022 |
Alexander Motin <mav@FreeBSD.org> |
Revert "iflib: Relax timer period from 0.5 to 0.5-0.75s." I've noticed a relation between iflib_timer() and ixl_admin_timer(). Both are scheduled at the same 2Hz rate, but the second reschedules the first each time, so if the first gets any slower, it won't be executed at all. Revert this until deeper investigation. This reverts commit 90bc1cf65778aafb1f226c8fe08218cfed5e40b2.
|
#
90bc1cf6 |
|
09-Jan-2022 |
Alexander Motin <mav@FreeBSD.org> |
iflib: Relax timer period from 0.5 to 0.5-0.75s. While there switch it from hardclock ticks to milliseconds. MFC after: 2 weeks
|
#
e2650af1 |
|
29-Dec-2021 |
Stefan Eßer <se@FreeBSD.org> |
Make CPU_SET macros compliant with other implementations The introduction of <sched.h> improved compatibility with some 3rd party software, but caused the configure scripts of some ports to assume that they were run in a GLIBC compatible environment. Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being added to ports, but there still were compatibility issues due to invalid assumptions made in autoconfigure scripts. The difference between the FreeBSD version of macros like CPU_AND, CPU_OR, etc. and the GLIBC versions was in the number of arguments: FreeBSD used a 2-address scheme (one source argument is also used as the destination of the operation), while GLIBC uses a 3-address scheme (2 source operands and a separately passed destination). The GLIBC scheme provides a super-set of the functionality of the FreeBSD macros, since it does not prevent passing the same variable as source and destination arguments. In code that wanted to preserve both source arguments, the FreeBSD macros required a temporary copy of one of the source arguments. This patch set makes it possible to unconditionally provide functions and macros expected by 3rd party software written for GLIBC based systems, but breaks builds of externally maintained sources that use any of the following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR. One contributed driver (contrib/ofed/libmlx5) has been patched to support both the old and the new CPU_OR signatures. If this commit is merged to -STABLE, the version test will have to be extended to cover more ranges. Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT no longer require that option. The FreeBSD version has been bumped to 1400046 to reflect this incompatible change. Reviewed by: kib MFC after: 2 weeks Relnotes: yes Differential Revision: https://reviews.freebsd.org/D33451
|
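The 2-address versus 3-address distinction from e2650af1 can be illustrated with a toy bitset. The sketch below deliberately avoids the real cpuset macros: `toy_set_t`, `TOY_OR2()` and `TOY_OR3()` are hypothetical names that only mimic the two calling conventions described in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy set type; a real CPU set is a multi-word bitset. */
typedef struct {
	uint64_t bits;
} toy_set_t;

/* 2-address form: d = d | s, so the destination is also a source. */
#define TOY_OR2(d, s)		((d)->bits |= (s)->bits)

/* 3-address form: d = s1 | s2, both sources are left intact. */
#define TOY_OR3(d, s1, s2)	((d)->bits = (s1)->bits | (s2)->bits)

/* Union of a and b using only the 2-address form: needs a temporary copy. */
static toy_set_t
toy_union2(const toy_set_t *a, const toy_set_t *b)
{
	toy_set_t tmp = *a;	/* copy so that *a is preserved */

	TOY_OR2(&tmp, b);
	return (tmp);
}

/* The same union with the 3-address form: no temporary is needed. */
static toy_set_t
toy_union3(const toy_set_t *a, const toy_set_t *b)
{
	toy_set_t out;

	TOY_OR3(&out, a, b);
	return (out);
}
```

Passing the same variable as destination and first source to `TOY_OR3()` reproduces the 2-address behaviour, which is why the commit message calls the 3-address scheme a super-set.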
#
4561c4f0 |
|
28-Dec-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
net: iflib: sync isc_capenable to if_capenable On SIOCSIFCAP, some bits in ifp->if_capenable may be toggled. When this happens, apply the same change to isc_capenable, which is the iflib private copy of if_capenable (for a subset of the IFCAP_* bits). In this way the iflib drivers can check the bits using isc_capenable rather than if_capenable. This is convenient because the latter access requires an additional indirection through the ifp, and it is also less likely to be in cache. PR: 260068 Reviewed by: kbowling, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D33156
|
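A minimal sketch of the synchronization described in 4561c4f0, assuming the private copy mirrors a fixed subset of capability bits: only the bits that were toggled, and that belong to the mirrored subset, are flipped in the private copy. The `TOY_CAP_*` values, `TOY_MIRRORED` and `toy_sync_private()` are hypothetical and do not correspond to real IFCAP_* constants or iflib functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define TOY_CAP_A	0x1u
#define TOY_CAP_B	0x2u
#define TOY_CAP_C	0x4u

/* Hypothetical subset of bits that the private copy mirrors. */
#define TOY_MIRRORED	(TOY_CAP_A | TOY_CAP_B)

/*
 * Given the private copy and the interface capabilities before and after
 * an ioctl, apply the same toggles to the private copy, restricted to
 * the mirrored subset.
 */
static uint32_t
toy_sync_private(uint32_t priv, uint32_t old_caps, uint32_t new_caps)
{
	uint32_t toggled = (old_caps ^ new_caps) & TOY_MIRRORED;

	return (priv ^ toggled);
}
```

Bits outside the mirrored subset (here `TOY_CAP_C`) never reach the private copy, which matches the commit's note that the copy covers only a subset of the capability bits.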
#
1bfdb812 |
|
19-Nov-2021 |
Andriy Gapon <avg@FreeBSD.org> |
iflib_stop: drain rx tasks to prevent any data races iflib_stop modifies iflib data structures that are used by _task_fn_rx, most prominently the free lists. So, iflib_stop has to ensure that the rx task threads are not active. This should help to fix a crash seen when iflib_if_ioctl (e.g., SIOCSIFCAP) is called while there is already traffic flowing. The crash has been seen on VMWare guests with vmxnet3 driver. My guess is that on physical hardware the couple of 1ms delays that iflib_stop has after disabling interrupts are enough for the queued work to be completed before any iflib state is touched. But on busy hypervisors the guests might not get enough CPU time to complete the work, thus there can be a race between the taskqueue threads and the work done to handle an ioctl, specifically in iflib_stop and iflib_init_locked. PR: 259458 Reviewed by: markj MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D32926
|
#
66fa12d8 |
|
18-Aug-2021 |
Stephan de Wit <stephan.dewt@yahoo.co.uk> |
iflib: emulate counters in netmap mode When iflib devices are in netmap mode the driver counters are no longer updated making it look from userspace tools that traffic has stopped. Reported by: Franco Fichtner <franco@opnsense.org> Reviewed by: vmaffione, iflib (erj, gallatin) Obtained from: OPNsense MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D31550
|
#
a5688853 |
|
06-Jul-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
iflib: use m_gethdr_raw Reviewed by: gallatin Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D31081
|
#
bad5f0b6 |
|
30-Jun-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
iflib: switch bare zone_mbuf use to m_free_raw Reviewed by: kbowling Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D30961
|
#
58632fa7 |
|
19-May-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: Add a new quirk ENETC NIC found in LS1028A has a bug where clearing TX pidx/cidx causes the ring to hang after being re-enabled. Add a new flag; if set, iflib will preserve the indices during restart. Submitted by: Kornel Duleba <mindal@semihalf.com> Reviewed by: gallatin, erj Obtained from: Semihalf Sponsored by: Alstom Group Differential Revision: https://reviews.freebsd.org/D30728
|
#
cd945dc0 |
|
27-Apr-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: Take iri_pad into account when processing small frames Drivers can specify padding of received frames with iri_pad field. This can be used to enforce ip alignment by hardware. Iflib ignored that padding when processing small frames, which rendered this feature inoperable. I found it while writing a driver for a NIC that can ip align received packets. Note that this doesn't change behavior of existing drivers as they all set iri_pad to 0. Submitted by: Kornel Duleba <mindal@semihalf.com> Reviewed by: gallatin Obtained from: Semihalf Sponsored by: Alstom Group Differential Revision: https://reviews.freebsd.org/D30009
|
#
ca7005f1 |
|
25-Apr-2021 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
iflib: Improve mapping of TX/RX queues to CPUs iflib now supports mapping each (TX,RX) queue pair to the same CPU (default), to separate CPUs, or to a pair of physical and logical CPUs that share the same L2 cache. The mapping mechanism supports unequal numbers of TX and RX queues, with the excess queues always being mapped to consecutive physical CPUs. When the platform cannot distinguish between physical and logical CPUs, all are treated as physical CPUs. See the comment on get_cpuid_for_queue() for the entire matrix. The following device-specific tunables influence the mapping process: dev.<device>.<unit>.iflib.core_offset (existing) dev.<device>.<unit>.iflib.separate_txrx (existing) dev.<device>.<unit>.iflib.use_logical_cores (new) The following new, read-only sysctls provide visibility of the mapping results: dev.<device>.<unit>.iflib.{t,r}xq<n>.cpu When an iflib driver allocates TX softirqs without providing reference RX IRQs, iflib now binds those TX softirqs to CPUs using the above mapping mechanism (that is, treats them as if they were TX IRQs). Previously, such bindings were left up to the grouptaskqueue code and thus fell outside of the iflib CPU mapping strategy. Reviewed by: kbowling Tested by: olivier, pkelsey MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D24094
|
#
3183d0b6 |
|
23-Apr-2021 |
Andrew Gallatin <gallatin@FreeBSD.org> |
iflib: initialize LRO unconditionally Changes to the LRO code have exposed a bug in iflib where devices which are not capable of doing LRO are still calling tcp_lro_flush_all(), even when they have not initialized the LRO context. This used to be mostly harmless, but the LRO code now sets the VNET based on the ifp in the lro context and will try to access it through a NULL ifp resulting in a panic at boot. To fix this, we unconditionally initialize LRO so that we have a valid LRO context when calling tcp_lro_flush_all(). One alternative is to check the device capabilities before calling tcp_lro_flush_all() or adding a new state flag in the ctx. However, it seems unwise to add an extra, mostly useless test for higher performance devices when we can just initialize LRO for all devices. Reviewed by: erj, hselasky, markj, olivier Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D29928
|
#
361e9501 |
|
05-Apr-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: add support for netmap offsets Follow-up change to a6d768d845c173823785c71bb18b40074e7a8998. This change adds iflib support for netmap offsets, enabling applications to use offsets on any driver backed by iflib.
|
#
21d0c012 |
|
29-Mar-2021 |
you@x <you@x> |
netmap: iflib: add nm_config callback This per-driver callback is invoked by netmap when it wants to align the number of TX/RX netmap rings and/or the number of TX/RX netmap slots to the actual state configured in the hardware. The alignment happens when netmap mode is switched on (with no active netmap file descriptors for that netmap port), or when collecting netmap port information. MFC after: 1 week
|
#
09c3f04f |
|
02-Mar-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: add support for admin completion queues For interfaces with admin completion queues, introduce a new devmethod IFDI_ADMIN_COMPLETION_HANDLE and a corresponding flag IFLIB_HAS_ADMINCQ. This provides an option for handling any admin cq logic, which cannot be run from an interrupt context. Said method is called from within iflib's admin task, making it safe to sleep. Reviewed by: mmacy Submitted by: Artur Rojek <ar@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D28708
|
#
ef567155 |
|
24-Feb-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
Fix powerpc build after 6dd69f0064f1 Commit 6dd69f0064f1 ("iflib: introduce isc_dma_width") failed to build on powerpc due to implicit type conversion error. Fix that. Submitted by: Artur Rojek <ar@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc.
|
#
6dd69f00 |
|
24-Feb-2021 |
Marcin Wojtas <mw@FreeBSD.org> |
iflib: introduce isc_dma_width Some DMA controllers are unable to address the full host memory space and are instead limited to a subset of address range (e.g. 48-bit). Allow the driver to specify the maximum allowed DMA addressing width (in bits) for the NIC hardware, by introducing a new field in if_softc_ctx. If said field is omitted (set to 0), the lowaddr of DMA window bounds defaults to BUS_SPACE_MAXADDR. Submitted by: Artur Rojek <ar@semihalf.com> Obtained from: Semihalf Sponsored by: Amazon, Inc. Differential Revision: https://reviews.freebsd.org/D28706
|
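The rule stated in 6dd69f00 (a width of 0 means "no restriction", otherwise the low address bound follows from the width in bits) can be sketched as below. This is an assumption-laden illustration: `TOY_MAXADDR` stands in for BUS_SPACE_MAXADDR and is assumed to be the full 64-bit range, and `toy_dma_lowaddr()` is a hypothetical helper, not the macro or function iflib actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for BUS_SPACE_MAXADDR; assumed here to cover 64 bits. */
#define TOY_MAXADDR	UINT64_MAX

/*
 * Map a DMA addressing width in bits to the highest usable address.
 * A width of 0 (field omitted) or a width covering the whole address
 * space yields the unrestricted default.
 */
static uint64_t
toy_dma_lowaddr(unsigned int width)
{
	if (width == 0 || width >= 64)
		return (TOY_MAXADDR);
	return ((UINT64_C(1) << width) - 1);
}
```

For a controller limited to 48-bit addressing, the sketch yields 0xFFFFFFFFFFFF as the bound; shifting by 64 is avoided because it would be undefined behaviour in C.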
#
b6999635 |
|
24-Feb-2021 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Avoid double counting in rxeof iflib_rxeof() was counting everything twice. This was introduced when pfil hooks were added to the iflib receive path. We want to count rx packets/bytes before the pfil hooks are executed, so remove the counter adjustments that are executed after. PR: 253583 Reviewed by: gallatin, erj MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28900
|
#
2ccf971a |
|
19-Feb-2021 |
John Baldwin <jhb@FreeBSD.org> |
iflib: Cast the result of iflib_netmap_txq_init() to void. This fixes a warning from GCC for kernels without netmap since the return value is never used. Reviewed by: vmaffione, erj Differential Revision: https://reviews.freebsd.org/D28598
|
#
922cf8ac |
|
14-Feb-2021 |
Allan Jude <allanjude@FreeBSD.org> |
Use iflib_if_init_locked() during media change instead of iflib_init_locked(). iflib_init_locked() assumes that iflib_stop() has been called, however, it is not called for media changes. iflib_if_init_locked() calls stop then init, so fixes the problem. PR: 253473 MFC after: 3 days Reviewed by: markj Sponsored by: Juniper Networks, Inc., Klara, Inc. Differential Revision: https://reviews.freebsd.org/D28667
|
#
38bfc6de |
|
01-Feb-2021 |
Sai Rajesh Tallamraju <stallamr@netapp.com> |
iflib: Free resources in a consistent order during detach Memory and PCI resources are freed in no particular order. This could cause use-after-frees when detaching following a failed attach. For instance, iflib_tx_structures_free() frees ctx->ifc_txqs[] but iflib_tqg_detach() attempts to access this array. Similarly, adapter queues get freed by IFDI_QUEUES_FREE() but IFDI_DETACH() attempts to access adapter queues to free PCI resources. MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27634
|
#
3f43ada9 |
|
28-Jan-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Catch up with 6edfd179c86: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG. Originally IFCAP_NOMAP meant that the mbuf has external storage pointer that points to unmapped address. Then, this was extended to array of such pointers. Then, such mbufs were augmented with header/trailer. Basically, extended mbufs are extended, and set of features is subject to change. The new name should be generic enough to avoid further renaming.
|
#
f80efe50 |
|
24-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: move per-packet operation out of fragments loop MFC after: 1 week
|
#
aceaccab |
|
24-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: add support for NS_MOREFRAG The NS_MOREFRAG flag can be set in a netmap slot to represent a multi-fragment packet. Only the last fragment of a packet does not have the flag set. On TX rings, the flag may be set by the userspace application. The kernel will look at the flag and use it to properly set up the NIC TX descriptors. On RX rings, the kernel may set the flag if the packet received was split across multiple netmap buffers. The userspace application should look at the flag to know when the packet is complete. Submitted by: rajesh1.kumar_amd.com Reviewed by: vmaffione MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27799
|
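The fragment convention from aceaccab (every fragment except the last carries the flag) can be sketched from the consumer's side. The slot layout and flag value below are hypothetical: `toy_slot`, `TOY_MOREFRAG` and `toy_packet_len()` are illustrative names, not the netmap structures or constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value and slot layout, for illustration only. */
#define TOY_MOREFRAG	0x1u

struct toy_slot {
	uint16_t len;		/* bytes in this buffer */
	uint16_t flags;		/* TOY_MOREFRAG set on all but the last fragment */
};

/*
 * Walk the slots starting at 'start' until a slot without TOY_MOREFRAG
 * is found. Returns the total packet length and stores the number of
 * slots used in *nslots. If the slots run out before the final
 * fragment, the packet is incomplete: return 0 and set *nslots to 0.
 */
static size_t
toy_packet_len(const struct toy_slot *slots, size_t n, size_t start,
    size_t *nslots)
{
	size_t i, total = 0;

	for (i = start; i < n; i++) {
		total += slots[i].len;
		if ((slots[i].flags & TOY_MOREFRAG) == 0) {
			*nslots = i - start + 1;
			return (total);
		}
	}
	*nslots = 0;
	return (0);
}
```

An application reading an RX ring would apply the same test per slot to decide when a packet is complete; on TX the roles are reversed and the application sets the flag itself.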
#
0c864213 |
|
21-Jan-2021 |
Andrew Gallatin <gallatin@FreeBSD.org> |
iflib: Fix a NULL pointer deref rxd_frag_to_sd() can be called with the pf_rv parameter set to NULL in the current code. This patch fixes the NULL pointer dereference in that case, thus avoiding a possible panic. Submitted by: rajesh1.kumar at amd.com Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D28115
|
#
55f0ad5f |
|
10-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: restore hwofs and support it in iflib Restore the hwofs functionality temporarily disabled by 7ba6ecf216fb15e8b147db2 to prevent issues with iflib. This patch brings the necessary changes to iflib to enable hwofs to allow interface restarts without disrupting netmap applications actively using its rings. After this change, it becomes possible for multiple non-cooperating netmap applications to use non-overlapping subsets of the available netmap rings without clashing with each other. PR: 252453 MFC after: 1 week
|
#
8aa8484c |
|
10-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: fix build failure in case DEV_NETMAP is not defined This addresses the build failure introduced by 3d65fd97e85ab807f3baa62. MFC with: 3d65fd97e85ab807f3baa62
|
#
4ba9ad0d |
|
10-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: add assert to prevent out-of-bounds array access The iflib_queues_alloc() allocates isc_nrxqs iflib_dma_info structs for each rxqset, and links each struct to a different free list. As a result, it must be isc_nrxqs >= isc_nfl (plus the completion queue, if present). Add an assertion to make this constraint explicit. MFC after: 2 weeks
|
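The constraint that 4ba9ad0d turns into an assertion can be written as a small predicate. This is a sketch of the inequality as the commit message states it (one DMA info struct per free list, plus one for the completion queue if present); `toy_rxq_layout_ok()` and its parameter names are hypothetical and only loosely mirror isc_nrxqs and isc_nfl.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when there are enough per-rxqset DMA info structs
 * (nrxqs) to back every free list (nfl), plus one for the completion
 * queue when the hardware has one.
 */
static bool
toy_rxq_layout_ok(int nrxqs, int nfl, bool has_cq)
{
	return (nrxqs >= nfl + (has_cq ? 1 : 0));
}
```

A layout with two DMA structs, two free lists and a completion queue fails the check, which is the kind of configuration that would otherwise index past the end of the array.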
#
3d65fd97 |
|
09-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: iflib: enable/disable krings on any interface reinit Since 1d238b07d5d4d9660ae0e0, krings are disabled before a reinit cycle triggered by iflib_netmap_register. However, this operation is actually necessary also for any interface reinit triggered by other causes (i.e., ifconfig commands). We achieve this goal by moving the krings enable/disable operation inside iflib_stop() and iflib_init_locked(). Once here, this change also removes some redundant operations from iflib_netmap_register(), that are already performed by iflib_stop(). PR: 252453 MFC after: 1 week
|
#
3189ba61 |
|
09-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: iflib: fix asserts in netmap_fl_refill() When netmap_fl_refill() is called at initialization time (e.g., during netmap_iflib_register()), nic_i must be 0, since the free list is reinitialized. At the end of the refill cycle, nic_i must still be zero, because exactly N descriptors (N is the ring size) are refilled. This patch therefore fixes the assertions to check on nic_i rather than on nm_i. The current netmap_reset() may in fact cause nm_i to be != 0 while the device is resetting: this may happen when multiple non-cooperating processes open different subsets of the available netmap rings. PR: 252518 MFC after: 1 week
|
#
1d238b07 |
|
09-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of the same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing a nmport_open()). With this patch, such a situation will cause the active process to get a POLLERR, so that it can have a chance to detect the situation. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week
|
#
81be6552 |
|
18-Dec-2020 |
Matt Macy <mmacy@FreeBSD.org> |
iflib: ensure that tx interrupts enabled and cleanups Doing a 'dd' over iscsi will reliably cause stalls. Tx cleaning _should_ reliably happen as data is sent. However, currently if the transmit queue fills it will wait until the iflib timer (hz/2) runs. This change causes the tx taskq thread to be run if there are completed descriptors. While here: - make timer interrupt delay a sysctl - simplify txd_db_check handling - comment on INTR types Background on the change: Initially doorbell updates were minimized by only writing to the register on every fourth packet. If txq_drain would return without writing to the doorbell it scheduled a callout on the next tick to do the doorbell write to ensure that the write otherwise happened "soon". At that time a sysctl was added for users to avoid the potential added latency by simply writing to the doorbell register on every packet. This worked perfectly well for e1000 and ixgbe ... and appeared to work well on ixl. However, as it turned out there was a race in this approach that would lock up the ixl MAC. It was possible for a lower producer index to be written after a higher one. On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial response was to add a lock around doorbell writes - fixing the problem but adding an unacceptable amount of lock contention. The next iteration was to use transmit interrupts to drive delayed doorbell writes. If there were no packets in the queue all doorbell writes would be immediate; as the queue started to fill up we could delay doorbell writes further and further. At the start of drain if we've cleaned any packets we know we've moved the state machine along and we write the doorbell (an obvious missing optimization was to skip that doorbell write if db_pending is zero). This change required that tx interrupts be scheduled periodically as opposed to just when the hardware txq was full. However, that just leads to our next problem.
Initially dedicated msix vectors were used for both tx and rx. However, it was often possible to use up all available vectors before we set up all the queues we wanted. By having rx and tx share a vector for a given queue we could halve the number of vectors used by a given configuration. The problem here is that with this change only e1000 passed the necessary value to have the fast interrupt drive tx when appropriate. Reported by: mav@ Tested by: mav@ Reviewed by: gallatin@ MFC after: 1 month Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D27683
|
#
c065d4e5 |
|
07-Dec-2020 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Avoid leaking the freelist bitmaps upon driver detach Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com> MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27342
|
#
10254019 |
|
07-Dec-2020 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Detach tasks upon device registration failure In some error paths we would fail to detach from the iflib taskqueue groups. Also move the detach code into its own subroutine instead of duplicating it. Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com> MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27342
|
#
54bf96fb |
|
11-Nov-2020 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Free full mbuf chains when draining transmit queues Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com> Reviewed by: gallatin, hselasky MFC after: 1 week Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27179
|
#
be7a6b3d |
|
28-Oct-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: fix typo bug introduced by r367093 Code was supposed to call callout_reset_sbt_on() rather than callout_reset_sbt(). This resulted in passing a "cpu" value to a "flag" argument. A recipe for subtle errors. PR: 248652 Reported by: sg@efficientip.com MFC with: r367093
|
#
17cec474 |
|
27-Oct-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: add per-tx-queue netmap timer The way netmap TX is handled in iflib when TX interrupts are not used (IFC_NETMAP_TX_IRQ not set) has some issues: - The netmap_tx_irq() function gets called by iflib_timer(), which gets scheduled with tick granularity (hz). This is not frequent enough for 10Gbps NICs and beyond (e.g., ixgbe or ixl). The end result is that the transmitting netmap application is not woken up fast enough to saturate the link with small packets. - The iflib_timer() function also calls isc_txd_credits_update() to ask for more TX completion updates. However, this violates the netmap requirement that only txsync can access the TX queue for datapath operations. Only netmap_tx_irq() may be called out of the txsync context. This change introduces per-tx-queue netmap timers, using microsecond granularity to ensure that netmap_tx_irq() can be called often enough to allow for maximum packet rate. The timer routine simply calls netmap_tx_irq() to wake up the netmap application. The latter will wake up and call txsync to collect TX completion updates. This change brings back line rate speed with small packets for ixgbe. For the time being, timer expiration is hardcoded to 90 microseconds, in order to avoid introducing a new sysctl. We may eventually implement an adaptive expiration period or use another deferred work mechanism in place of timers. Also, fix the timers usage to make sure that each queue is serviced by a different CPU. PR: 248652 Reported by: sg@efficientip.com MFC after: 2 weeks
|
#
662c1305 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: clean up empty lines in .c and .h files
|
#
35d8a463 |
|
01-Sep-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: leave only 1 receive descriptor unused The pidx argument of isc_rxd_flush() indicates which is the last valid receive descriptor to be used by the NIC. However, current code has multiple issues: - Intel drivers write pidx to their RDT register, which means that NICs will only use the descriptors up to pidx-1 (modulo ring size N), and won't actually use the one pointed by pidx. This does not break reception, but it is nonetheless confusing and suboptimal (the NIC will actually see only N-2 descriptors as available, rather than N-1). Other drivers (if_vmx, if_bnxt, if_mgb) adhere to this semantic. - The semantic used by Intel (RDT is one descriptor past the last valid one) is used by most (if not all) NICs, and it is also used on the TX side (also in iflib). Since iflib is not currently using this semantic for RX, it must decrement fl->ifl_pidx (modulo N) before calling isc_rxd_flush(), and then the per-driver callback implementation must increment the index again (to match the real semantic). This is confusing and suboptimal. - The iflib refill function is also called at initialization. However, in case the ring size is smaller than 128 (e.g. if_mgb), the refill function will actually prepare all the receive descriptors (N), without leaving one unused, as most NICs assume (e.g. to prevent RDT from overrunning RDH). I can speculate that the code looks like this right now because this issue showed up during testing (e.g. with if_mgb), and it was easy to work around by decrementing pidx before isc_rxd_flush(). The goal of this change is to simplify the code (removing a bunch of instructions from the RX fast path), and to make the semantic of isc_rxd_flush() consistent across drivers. To achieve this, we: - change the semantics of the pidx argument to the usual one (that is the index one past the last valid one), so that both iflib and drivers avoid the decrement/increment dance. - fix the initialization code to prepare at most N-1 descriptors.
Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26191
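The "one past the last valid descriptor" convention described above can be illustrated with a small ring model. This is a minimal sketch, not iflib code: `struct rx_ring`, `ring_published()`, `ring_refillable()`, and `ring_publish()` are hypothetical names chosen for illustration. It assumes that the NIC owns the descriptors in the range [head, tail) modulo N, so head == tail means "empty" and at most N-1 descriptors can ever be published.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring state; the field names are illustrative, not iflib's. */
struct rx_ring {
	uint32_t size;	/* N: number of descriptors in the ring */
	uint32_t head;	/* next descriptor the NIC will consume (RDH-like) */
	uint32_t tail;	/* one past the last valid descriptor (RDT-like) */
};

/* Descriptors currently owned by the NIC: the range [head, tail) modulo N. */
static uint32_t
ring_published(const struct rx_ring *r)
{
	assert(r->size > 1);
	return ((r->tail + r->size - r->head) % r->size);
}

/* How many more descriptors may be published while keeping one unused. */
static uint32_t
ring_refillable(const struct rx_ring *r)
{
	return (r->size - 1 - ring_published(r));
}

/*
 * Publish up to 'count' descriptors and return the new tail. The count is
 * clamped so that tail never catches up with head, which would make a full
 * ring indistinguishable from an empty one.
 */
static uint32_t
ring_publish(struct rx_ring *r, uint32_t count)
{
	uint32_t room = ring_refillable(r);

	if (count > room)
		count = room;
	r->tail = (r->tail + count) % r->size;
	return (r->tail);
}
```

With this convention the value handed to the hardware is the tail itself, so neither the caller nor the driver callback needs to decrement or re-increment the index.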
|
#
ae750d5c |
|
25-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: publish all the receive buffer At initialization time, the netmap RX refill function used to prepare the NIC RX ring with N-1 buffers rather than N (with N equal to the number of descriptors in the NIC RX ring). This is not how netmap is supposed to work, as it would keep kring->nr_hwcur not in sync with the NIC "next index to refill" (i.e., fl->ifl_pidx). Instead we prepare N buffers, although we still publish (with isc_rxd_flush()) only the first N-1 buffers, to prevent the NIC producer pointer from overrunning the NIC consumer pointer (for NICs where this is a real issue, e.g. Intel ones). MFC after: 2 weeks
|
#
de5b4610 |
|
24-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: fix isc_rxd_flush call in netmap_fl_refill() The semantic of the pidx argument of isc_rxd_flush() is the last valid index in the free list, rather than the next index to be published. However, netmap was still using the old convention. While there, also refactor netmap_fl_refill() to simplify it a little and add an assertion. MFC after: 2 weeks
|
#
6d84e76a |
|
12-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: improve rxsync to support IFLIB_HAS_RXCQ For drivers with IFLIB_HAS_RXCQ set, there is a separate completion queue. In this case, the netmap rxsync routine needs to update rxq->ifr_cq_cidx in the same way it is updated by iflib_rxeof(). This improves the situation for vmx(4) and bnxt(4) drivers, which use iflib and have the IFLIB_HAS_RXCQ bit set. PR: 248494 MFC after: 3 weeks
|
#
530960be |
|
12-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: refactor netmap_fl_refill and fix off-by-one issue First, fix the initialization of the fl->ifl_rxd_idxs array, which was affected by an off-by-one bug. Once there, refactor the function to use better names for local variables, optimize the variable assignments, and merge the bus_dmamap_sync() inner loop with the outer one. PR: 248494 MFC after: 3 weeks
|
#
c9d886cd |
|
06-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: drop redundant check The validity of head is already checked by nm_rxsync_prologue(). MFC after: 2 weeks
|
#
ee07345d |
|
06-Aug-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: don't increment ifl_cidx on the wrong free list Netmap only uses free list 0 to keep it consistent with its one-to-one mapping between each netmap ring and a device RX (or TX) queue. However, the current iflib_netmap_rxsync() routine was mistakenly updating the ifl_cidx field of both free lists. PR: 248494 MFC after: 2 weeks
|
#
0ae0e8d2 |
|
26-Jul-2020 |
Matt Macy <mmacy@FreeBSD.org> |
iflib: fix LOR with bpf detach Reported by: grehan@ Approved by: grehan@ MFC after: 1 week Sponsored by: Netgate Differential Revision: https://reviews.freebsd.org/D25530
|
#
ac11d857 |
|
20-Jul-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: initialize netmap with the correct number of descriptors In case the network device has a RX or TX control queue, the correct number of TX/RX descriptors is contained in the second entry of the isc_ntxd (or isc_nrxd) array, rather than in the first entry. This case is correctly handled by iflib_device_register() and iflib_pseudo_register(), but not by iflib_netmap_attach(). If the first entry is larger than the second, this can result in a panic. This change fixes the bug by introducing two helper functions that also lead to some code simplification. PR: 247647 MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D25541
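The helper approach described above might look roughly like the following. This is a sketch under stated assumptions: `struct sketch_ctx`, the `SKETCH_HAS_*` flags, and both helper names are hypothetical stand-ins, and the real iflib structures and flags may differ. It only shows the idea of selecting the second array entry when an extra queue occupies the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags: the device places an extra queue at array index 0. */
#define SKETCH_HAS_TXCQ	0x1
#define SKETCH_HAS_RXCQ	0x2

/* Hypothetical stand-in for the per-device descriptor-count arrays. */
struct sketch_ctx {
	uint32_t flags;
	uint16_t ntxd[4];	/* descriptor counts, one entry per TX queue type */
	uint16_t nrxd[4];	/* descriptor counts, one entry per RX queue type */
};

/* Number of descriptors in the TX ring that actually carries packets. */
static uint16_t
sketch_num_tx_descs(const struct sketch_ctx *c)
{
	assert(c != NULL);
	return (c->ntxd[(c->flags & SKETCH_HAS_TXCQ) ? 1 : 0]);
}

/* Number of descriptors in the RX ring that actually carries packets. */
static uint16_t
sketch_num_rx_descs(const struct sketch_ctx *c)
{
	assert(c != NULL);
	return (c->nrxd[(c->flags & SKETCH_HAS_RXCQ) ? 1 : 0]);
}
```

Routing every consumer through one pair of helpers avoids the situation the commit describes, where one attach path read the first entry while the others correctly read the second.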
|
#
b256d25c |
|
06-Jul-2020 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Fix some nits in the rx refill code. - Get rid of the ifl_vm_addrs array. It is not used by any existing consumer, so we are just dirtying a couple of cache lines for no reason. - Use uma_zalloc(fl->ifl_zone) instead of m_cljget(). Otherwise m_cljget() is doing unnecessary work to look up the correct zone, when iflib already knows what that zone is. - ifl_gen is only used when INVARIANTS is on, so make that more clear. - Fix some style nits and inconsistencies. Reviewed by: gallatin Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25490
|
#
a363e1d4 |
|
06-Jul-2020 |
Mark Johnston <markj@FreeBSD.org> |
iflib: Fix handling of mbuf cluster allocation failures. When refilling an rx freelist, make sure we only update the hardware producer index if at least one cluster was allocated. Otherwise the NIC is programmed to write a previously used cluster, typically resulting in a use-after-free when packet data is written by the hardware. Also make sure that we don't update the fragment index cursor if the last allocation attempt didn't succeed. For at least Intel drivers, iflib assumes that the consumer index and fragment index cursor stay in lockstep, but this assumption was violated in the face of cluster allocation failures. Reported and tested by: pho Reviewed by: gallatin, hselasky MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25489
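The rule described above (advance the cursor only on a successful allocation, and touch the hardware producer index only if something was allocated) can be sketched in plain C. Everything here is hypothetical and simplified: `struct freelist`, `fl_refill()`, and `budget_alloc()` are illustrative names, and the allocator is simulated with a counter rather than a real cluster zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free list model; not the iflib structure. */
struct freelist {
	uint32_t size;			/* ring size N */
	uint32_t pidx;			/* software producer cursor */
	uint32_t credits;		/* buffers currently posted to the NIC */
	uint32_t hw_pidx;		/* last value written to the hardware */
	uint32_t doorbell_writes;	/* number of hardware index updates */
};

typedef bool (*alloc_fn)(void *arg);

/* Simulated allocator: succeeds until the budget pointed to by arg is spent. */
static bool
budget_alloc(void *arg)
{
	int *budget = arg;

	if (*budget <= 0)
		return (false);
	(*budget)--;
	return (true);
}

/* Refill up to 'count' slots; returns how many buffers were actually posted. */
static uint32_t
fl_refill(struct freelist *fl, uint32_t count, alloc_fn alloc, void *arg)
{
	uint32_t n = 0;

	assert(fl != NULL && fl->size > 1);
	while (n < count && fl->credits + n < fl->size - 1) {
		if (!alloc(arg))
			break;		/* failed slot: leave the cursor untouched */
		fl->pidx = (fl->pidx + 1) % fl->size;
		n++;
	}
	fl->credits += n;
	if (n > 0) {
		/* Only now is it safe to tell the hardware about new buffers. */
		fl->hw_pidx = fl->pidx;
		fl->doorbell_writes++;
	}
	return (n);
}
```

If the very first allocation fails, the function returns 0 without updating `hw_pidx`, which is the case the commit identifies as leading to a use-after-free when the index was updated anyway.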
|
#
9503233f |
|
25-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: fix compilation issue introduced in r362621 The ifp local variable is useful even without netmap and altq, as it is used to check for IFF_DRV_RUNNING. MFC after: 2 weeks
|
#
d8b2d26b |
|
25-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: add support for partial ring openings Reviewed by: gallatin MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25254
|
#
88a68866 |
|
25-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: add per-tx-queue netmap support Reviewed by: gallatin MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25253
|
#
0ff21267 |
|
23-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: fix rsync index overrun In the current iflib_netmap_rxsync, there is nothing that prevents kring->nr_hwtail from overrunning kring->nr_hwcur during the descriptor import phase. This may cause errors in netmap applications, such as: em1 RX0: fail 'head < kring->nr_hwcur || head > kring->nr_hwtail' h 795 c 795 t 282 rh 795 rc 795 rt 282 hc 282 ht 282 Reviewed by: gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25252
|
#
9aeca213 |
|
21-Jun-2020 |
Matt Macy <mmacy@FreeBSD.org> |
iflib: fix cloneattach fail and generalize pseudo device handling - a cloneattach failure will not currently be handled correctly, jump to the right target - pseudo devices are all treated as if they're ethernet devices - this often doesn't make sense MFC after: 1 week Sponsored by: Netgate, Inc. Differential Revision: https://reviews.freebsd.org/D25083
|
#
0a182b4c |
|
14-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: enter/exit netmap mode after device stops Avoid possible race conditions by calling nm_set_native_flags() and nm_clear_native_flags() only after the device has been stopped. MFC after: 1 week
|
#
e136e9c8 |
|
09-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
iflib: netmap: honor netmap_irx_irq return values In the receive interrupt routine, always call netmap_rx_irq(). The latter function will return NM_IRQ_PASS if netmap is not active on that specific receive queue, so that the driver can go on with iflib_rxeof(). Note that netmap supports partial opening, where only a subset of the RX or TX rings can be open in netmap mode. Checking the IFCAP_NETMAP flag is not enough to make sure that the queue is indeed in netmap mode. Moreover, in case netmap_rx_irq() returns NM_IRQ_RESCHED, it means that netmap expects the driver to call netmap_rx_irq() again as soon as possible. Currently, this may happen when the device is attached to a VALE switch. Reviewed by: gallatin MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25167
|
#
1f93e931 |
|
31-May-2020 |
Matt Macy <mmacy@FreeBSD.org> |
Fix panics when using iflib pseudo device support Reviewed by: gallatin@, hselasky@ MFC after: 1 week Sponsored by: Netgate, Inc. Differential Revision: https://reviews.freebsd.org/D23710
|
#
814fa34d |
|
30-Apr-2020 |
Mark Johnston <markj@FreeBSD.org> |
Increase the iflib txq callout mutex name length to 32 bytes. With a length of 16, the name ("<if name>:TX(<qid>):callout") typically gets truncated. PR: 245712 Reported by: ghuckriede@blackberry.com MFC after: 1 week
|
#
45818bf1 |
|
27-Apr-2020 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Stop interface before (un)registering VLAN This patch is intended to solve a specific problem that iavf(4) encounters, but what it does can be extended to solve other issues. To summarize the iavf(4) issue, if the PF driver configures VLAN anti-spoof, then the VF driver needs to make sure no untagged traffic is sent if a VLAN is configured, and vice-versa. This can be an issue when a VLAN is being registered or unregistered, e.g. when a packet may be on the ring with a VLAN in it, but the VLANs are being unregistered. This can cause that tagged packet to go out and cause an MDD event. To fix this, include a new interface-dependent function that drivers can implement named IFDI_NEEDS_RESTART(). Right now, this function is called in iflib_vlan_unregister/register() to determine whether the interface needs to be stopped and started when a VLAN is registered or unregistered. The default return value of IFDI_NEEDS_RESTART() is true, so this fixes the MDD problem that iavf(4) encounters, since the interface rings are flushed during a stop/init. A future change to iavf(4) will implement that function just in case the default value changes, and to make it explicit that this interface reset is required when a VLAN is added or removed. Reviewed by: gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22086
|
#
59d50fe5 |
|
30-Mar-2020 |
Mark Johnston <markj@FreeBSD.org> |
Simplify taskqgroup inititialization. taskqgroup initialization was broken into two steps: 1. allocate the taskqgroup structure, at SI_SUB_TASKQ; 2. initialize taskqueues, start taskqueue threads, enqueue "binder" tasks to bind threads to specific CPUs, at SI_SUB_SMP. Step 2 tries to handle the case where tasks have already been attached to a queue, by migrating them to their intended queue. In particular, tasks can't be enqueued before step 2 has completed. This breaks NFS mountroot on systems using an iflib-based driver when EARLY_AP_STARTUP is not defined, since mountroot happens before SI_SUB_SMP in this case. Simplify initialization: do all initialization except for CPU binding at SI_SUB_TASKQ. This means that until CPU binding is completed, group tasks may be executed on a CPU other than that to which they were bound, but this should not be a problem for existing users of the taskqgroup KPIs. Reported by: sbruno Tested by: bdragon, sbruno MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24188
|
#
ed6611cc |
|
24-Mar-2020 |
Ed Maste <emaste@FreeBSD.org> |
iflib: simplify MPASS assertion Submitted by: andrew
|
#
68af0153 |
|
24-Mar-2020 |
Ed Maste <emaste@FreeBSD.org> |
iflib: split compound assertion ThunderX cluster systems are panicking on boot with a failed assertion MPASS(gtask != NULL && gtask->gt_taskqueue != NULL). Split the assertion so that it's clear which part is failing.
|
#
87699691 |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Remove extraneous code from iflib ifsd_cidx is never used, and the line removed from rxd_frag_to_sd() is just dead code. Reviewed by: erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23951
|
#
3caff188 |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Remove refill budget from iflib Reviewed by: gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23948
|
#
b3813609 |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Allow iflib drivers to specify the buffer size used for each receive queue Reviewed by: erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23947
|
#
e5030490 |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Remove freelist contiguous-indexes assertion from rxd_frag_to_sd() The vmx driver is an example of an iflib driver that might report packets using non-contiguous descriptors (with unused descriptors either between received packets or between the fragments of a received packet), so this assertion needs to be removed. For such drivers, the freelist producer and consumer indexes don't relate directly to driver ring slots (the driver deals directly with freelist buffer indexes supplied by iflib during refill, and reports them with each fragment during packet reception), but do continue to be used by iflib for accounting, such as determining the number of ring slots that are refillable. PR: 243126, 243392, 240628 Reported by: avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer Reviewed by: gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23946
|
#
4f2beb72 |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Fix iflib zero-length fragment handling The dmamap for zero-length fragments should not be unloaded, as doing so breaks the cluster-reuse logic in _iflib_fl_refill(). All zero-length fragments are now handled by the assemble_segments() path so that the cluster-reuse logic there does not have to be replicated in the small-single-fragment-packet path of iflib_rxd_pkt_get(). Packets consisting entirely of zero-length fragments (which result in a NULL mbuf pointer) are now properly tolerated. This allows drivers (such as the vmx driver) to pass such packets to iflib when a descriptor error occurs during packet reception, the advantage being that the refill of descriptors associated with the error packet is handled via the existing iflib machinery without having to duplicate parts of that machinery in the driver to handle that error case. Reviewed by: avg, erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23945
|
#
9e9b738a |
|
14-Mar-2020 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Fix iflib freelist state corruption This fixes a bug in iflib freelist management that breaks the required correspondence between freelist indexes and driver ring slots. PR: 243126, 243392, 240628 Reported by: avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer Reviewed by: avg, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23943
|
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT. Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
e87c4940 |
|
24-Feb-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Although most of the NIC drivers are epoch ready, due to peer pressure switch over to opt-in instead of opt-out for epoch. Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch when processing its packets. Now this will create recursive entrance in epoch in >90% network drivers, but will guarantee safeness of the transition. Mark several tested drivers as IFF_KNOWSEPOCH. Reviewed by: hselasky, jeff, bz, gallatin Differential Revision: https://reviews.freebsd.org/D23674
|
#
f98977b5 |
|
12-Feb-2020 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. This patch extends r357772. Tested by: yp@mm.st Sponsored by: Mellanox Technologies
|
#
fb1a29b4 |
|
12-Feb-2020 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Make sure the so-called end of receive interrupts don't starve in iflib. When the receive ring cannot be filled with mbufs, due to lack of memory, no more interrupts may be generated to fill the receive ring later on. Make sure to have a watchdog, to try refilling the receive ring from time to time, hopefully when more mbufs are available. Differential Revision: https://reviews.freebsd.org/D23315 MFC after: 1 week Reviewed by: gallatin@ Sponsored by: Mellanox Technologies
|
#
6c3e93cb |
|
11-Feb-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D23518
|
#
0b8df657 |
|
22-Jan-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Enter network epoch in iflib rxeof task. In upcoming changes ether_input() is going to be changed not to enter the network epoch. Entering it is going to be the responsibility of the network interrupt; in the case of iflib, that is its taskqueue.
|
#
f6afed72 |
|
02-Jan-2020 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Prevent watchdog from resetting idle queues While changing link state in iflib_link_state_change(), queues are marked as IFLIB_QUEUE_IDLE to disable the watchdog. Currently, the iflib_timer() watchdog does not check the previous queue status before marking a queue as IFLIB_QUEUE_HUNG. This patch adds a check of the queue status before marking it as hung. Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> PR: 239240 Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> Reported by: ultima@ Reviewed by: gallatin@, erj@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21712
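The status check described above can be pictured as a small state machine. This is a simplified sketch, not the iflib_timer() implementation: the enum, `struct sketch_txq`, and `watchdog_tick()` are hypothetical, and the stall condition is passed in as a plain boolean rather than derived from ring state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue states, loosely mirroring the IDLE/HUNG states named above. */
enum sketch_qstatus {
	SKETCH_QUEUE_IDLE,	/* watchdog disabled, e.g. during a link change */
	SKETCH_QUEUE_WORKING,
	SKETCH_QUEUE_HUNG
};

struct sketch_txq {
	enum sketch_qstatus status;
};

/*
 * One watchdog tick. Returns true when the caller should reset the
 * interface, i.e. the queue was already marked hung on an earlier tick.
 * An idle queue is never promoted to hung, even if it looks stalled.
 */
static bool
watchdog_tick(struct sketch_txq *q, bool ring_stalled)
{
	assert(q != NULL);
	if (q->status == SKETCH_QUEUE_HUNG)
		return (true);
	if (q->status != SKETCH_QUEUE_IDLE && ring_stalled)
		q->status = SKETCH_QUEUE_HUNG;
	return (false);
}
```

The key line is the `!= SKETCH_QUEUE_IDLE` test: without it, a queue parked as idle during a link state change would be marked hung and trigger a spurious reset.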
|
#
db8e8f1e |
|
04-Nov-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: properly release memory allocated for DMA DMA memory allocations using the bus_dma.h interface are not properly released in all cases for both Tx and Rx. This causes ~448 bytes of M_DEVBUF allocations to be leaked. First, the DMA maps for Rx are not properly destroyed. A slight attempt is made in iflib_fl_bufs_free to destroy the maps if we're detaching. However, this function may not be reliably called during detach. Indeed, there is a comment "asking" if this should be moved out. Fix this by moving the bus_dmamap_destroy call into iflib_rx_sds_free, where we already sync and unload the DMA. Second, the DMA tag associated with the ifr_ifdi descriptor DMA is not released properly anywhere. Add a call to iflib_dma_free in iflib_rx_structures_free. Third, use of NULL as a canary value on the map pointer returned by bus_dmamap_create is not valid. On some platforms, notably x86, this value may be NULL. In this case, we fail to properly release the related resources. Remove the NULL checks on map values in both iflib_fl_bufs_free and iflib_txsd_destroy. With all of these fixes applied, the leaks to M_DEVBUF are squelched, and iflib drivers now seem to properly cleanup when detaching. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22203
|
#
244e7cff |
|
30-Oct-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: cleanup memory leaks on driver detach From Jake: The iflib stack failed to release all of the memory allocated under M_IFLIB during device detach. Specifically, the ifmp_ring, the ift_ifdi Tx DMA info, and the ifr_ifdi Rx DMA info were not being released. Release this memory so that iflib won't leak memory when a device detaches. Since we're freeing the ift_ifdi pointer during iflib_txq_destroy we need to call this only after iflib_dma_free in iflib_tx_structures_free. Additionally, also ensure that we destroy the callout mutex associated with each Tx queue when we free it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22157
|
#
1558015e |
|
23-Oct-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: call ether_ifdetach and netmap_detach before stop From Jake: Calling ether_ifdetach after iflib_stop leads to a potential race where a stale ifp pointer can remain in the route entry list for IPv6 traffic. This will potentially cause a page fault or other system instability if the ifp pointer is accessed. Move both iflib_netmap_detach and ether_ifdetach to be called prior to iflib_stop. This avoids the race above, and helps ensure that other ifp references are removed before stopping the interface. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@, jhb@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22071
|
#
7790c8c1 |
|
17-Oct-2019 |
Conrad Meyer <cem@FreeBSD.org> |
Split out a more generic debugnet(4) from netdump(4) Debugnet is a simplistic and specialized panic- or debug-time reliable datagram transport. It can drive a single connection at a time and is currently unidirectional (debug/panic machine transmit to remote server only). It is mostly a verbatim code lift from netdump(4). Netdump(4) remains the only consumer (until the rest of this patch series lands). The INET-specific logic has been extracted somewhat more thoroughly than previously in netdump(4), into debugnet_inet.c. UDP-layer logic and up, as much as possible as is protocol-independent, remains in debugnet.c. The separation is not perfect and future improvement is welcome. Supporting INET6 is a long-term goal. Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to 'debugnet_' or 'dn_' -- sorry. I thought keeping the netdump name on the generic module would be more confusing than the refactoring. The only functional change here is the mbuf allocation / tracking. Instead of initiating solely on netdump-configured interface(s) at dumpon(8) configuration time, we watch for any debugnet-enabled NIC for link activation and query it for mbuf parameters at that time. If they exceed the existing high-water mark allocation, we re-allocate and track the new high-water mark. Otherwise, we leave the pre-panic mbuf allocation alone. In a future patch in this series, this will allow initiating netdump from panic ddb(4) without pre-panic configuration. No other functional change intended. Reviewed by: markj (earlier version) Some discussion with: emaste, jhb Objection from: marius Differential Revision: https://reviews.freebsd.org/D21421
|
#
41669133 |
|
30-Sep-2019 |
Mark Johnston <markj@FreeBSD.org> |
Add IFLIB_SINGLE_IRQ_RX_ONLY. As of r347221 the iflib legacy interrupt mode setup assumes that drivers perform both receive and transmit processing from the interrupt handler. This assumption is invalid in the vmxnet3 driver, so introduce the IFLIB_SINGLE_IRQ_RX_ONLY flag to make iflib avoid tx processing in the interrupt handler. PR: 239118 Reported and tested by: Juraj Lutter <otis@sk.freebsd.org> Obtained from: marius Reviewed by: gallatin MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21831
|
#
6554362c |
|
27-Sep-2019 |
Andrew Gallatin <gallatin@FreeBSD.org> |
kTLS support for TLS 1.3 TLS 1.3 requires a few changes because 1.3 pretends to be 1.2 with a record type of application data. The "real" record type is then included at the end of the user-supplied plaintext data. This required adding a field to the mbuf_ext_pgs struct to save the record type, and passing the real record type to the sw_encrypt() ktls backend functions. Reviewed by: jhb, hselasky Sponsored by: Netflix Differential Revision: D21801
|
#
53b5b9b0 |
|
24-Sep-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Remove redundant VLAN events deregistration From Piotr: r351152 introduced iflib_deregister() function calling EVENTHANDLER_DEREGISTER() to unregister VLAN events. This patch removes duplicate of EVENTHANDLER_DEREGISTER() calls placed in iflib_device_deregister() as this function is now calling iflib_deregister(). This is to avoid deregistering same event twice. This patch also adds check in iflib_vlan_register() to prevent registering VLAN while being in detach. Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>, erj <erj@FreeBSD.org> and Jacob Keller <jacob.e.keller@intel.com>. Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> Reviewed by: gallatin@, erj@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21711
|
#
56614414 |
|
16-Aug-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: add iflib_deregister to help cleanup on exit Commit message by Jake: The iflib_register function exists to allocate and setup some common structures used by both iflib_device_register and iflib_pseudo_register. There is no associated cleanup function used to undo the steps taken in this function. Both iflib_device_deregister and iflib_pseudo_deregister have some of the necessary steps scattered in their flow. However, most of the necessary cleanup is not done during the error path of iflib_device_register and iflib_pseudo_register. Some examples of missed cleanup include: - the ifp pointer is not free'd during error cleanup - the STATE and CTX locks are not destroyed during error cleanup - the vlan event handlers are not removed during error cleanup - media added to the ifmedia structure is not removed - the kobject reference is never deleted Additionally, when initializing the kobject, the class reference counter is increased even though kobj_init already increases it. This results in the class never being free'd again because the reference count would never hit zero even after all driver instances are unloaded. To aid in proper cleanup, implement an iflib_deregister function that goes through the reverse steps taken by iflib_register. Call this function during the error cleanup for iflib_device_register and iflib_pseudo_register. Additionally call the function in the iflib_device_deregister and iflib_pseudo_deregister functions near the end of their flow. This helps reduce code duplication and ensures that proper steps are taken to cleanup allocations and references in both the regular and error cleanup flows. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@, erj@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21005
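The idea of a single deregister routine that undoes the register steps in reverse, shared by the error path and the normal detach path, can be sketched as follows. This is an illustration only: `struct sketch_ctx`, `sketch_register()`, `sketch_deregister()`, and the `fail_step` parameter (used here to simulate a failure) are all hypothetical, and real resources are replaced by boolean flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context; each flag stands in for a real resource. */
struct sketch_ctx {
	bool ifp_allocated;
	bool locks_initialized;
	bool vlan_handlers_registered;
};

/* Undo the register steps in reverse order; safe on a partially set up ctx. */
static void
sketch_deregister(struct sketch_ctx *c)
{
	assert(c != NULL);
	if (c->vlan_handlers_registered)
		c->vlan_handlers_registered = false;
	if (c->locks_initialized)
		c->locks_initialized = false;
	if (c->ifp_allocated)
		c->ifp_allocated = false;
}

/*
 * Set up the context. 'fail_step' simulates a failure right after the given
 * step (0 means no failure). Every error exit goes through the same cleanup
 * routine that the normal detach path uses.
 */
static int
sketch_register(struct sketch_ctx *c, int fail_step)
{
	assert(c != NULL);
	c->ifp_allocated = true;
	if (fail_step == 1)
		goto fail;
	c->locks_initialized = true;
	if (fail_step == 2)
		goto fail;
	c->vlan_handlers_registered = true;
	if (fail_step == 3)
		goto fail;
	return (0);
fail:
	sketch_deregister(c);
	return (-1);
}
```

Because the cleanup routine checks each flag before undoing a step, it can be called from any failure point without tracking how far setup got, which is what removes the duplicated and incomplete cleanup code the commit message lists.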
|
#
197c6798 |
|
01-Aug-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Prevent kernel panic caused by loading driver with a specific interrupt configuration If a device has only 1 MSI-X interrupt available and does not support either MSI or legacy interrupts, iflib_device_register() will fail, leak memory and MSI resources, and the driver will not load. Worse, if another iflib-using driver tries to unload afterwards, a kernel panic will occur because the previous failed iflib driver load did not properly call "taskqgroup_detach()" during its cleanup. This patch is a band-aid for this situation -- don't try allocating MSI or legacy interrupts if a single MSI-X interrupt was allocated, but fail to load instead. As well, during the cleanup, properly call taskqgroup_detach() on the admin task to prevent panics when other iflib drivers unload. This whole interrupt allocation process actually needs re-doing to properly support devices with only a single MSI-X interrupt, devices that only support MSI-X, non-PCI devices, and multiple non-MSIX interrupts, as well. Signed-off-by: Eric Joyner <erj@freebsd.org> Reviewed by: marius@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D20747
|
#
6a3f243b |
|
01-Aug-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: remove kobject class reference increment Commit message from Jake: In iflib_register, the context is initialized as a kobject using the device driver's "driver" kobject class. As part of this, the function mistakenly increments the ref counter. The ref counter is incremented twice, once in the code directly, and once again by kobj_class_compile. However, there is no associated decrement in the detach path. Because of this, the ref counter will never go back down to zero, and thus the kobject method table will never be released. Remove this unnecessary reference count increment. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: jhb@, erj@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21125
|
#
7f3f6aad |
|
24-Jul-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: fix dangling device softc pointer Commit text by Jake: If a driver's IFDI_ATTACH_PRE function fails, the iflib_device_register function will free the ctx pointer. However, it does not reset the device softc pointer to NULL. This will result in memory corruption as a future access to the now invalid pointer will corrupt memory that is later allocated on top of the same memory location. The iflib_device_deregister function correctly resets the softc pointer by using device_set_softc(). This clears up the invalid dangling pointer and prevents memory corruption that could lead to a panic or undefined behavior if the device's driver failed to attach. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21003
|
#
c2c5d1e7 |
|
26-Jun-2019 |
Marius Strobl <marius@FreeBSD.org> |
o In iflib_txq_drain(): - Remove desc_used, which is only ever written to. - Remove a dead store to reclaimed. - Don't recycle avail. - Sort variables according to style(9). These changes will make a subsequent commit easier to read. o In iflib_tx_credits_update(), don't bother checking whether the ift_txd_credits_update method pointer is NULL; _iflib_pre_assert() asserts upfront that this method has been assigned and functions like iflib_{fast_intr_rxtx,netmap_timer_adjust,txq_can_drain}() and _task_fn_tx() were already unconditionally relying on the method being callable.
|
#
188adcb7 |
|
19-Jun-2019 |
Marko Zec <zec@FreeBSD.org> |
V_ip6_forwarding and V_ipforwarding have been defined in ip6_var.h / ip_var.h since at least 2008, so make use of those definitions here. MFC after: 3 days
|
#
6aee0bfa |
|
19-Jun-2019 |
Marko Zec <zec@FreeBSD.org> |
Evaluating htons() at compile time is more efficient than doing ntohs() at runtime. This change removes a dependency on a barrel shifter pass before branch resolution, while reducing the instruction stream size by 9 bytes on amd64. MFC after: 3 days
|
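The htons()/ntohs() change above can be illustrated with a minimal userland sketch. It assumes a hypothetical stand-in constant `ETHERTYPE_IP_HOST` and two illustrative helper names; it is not the iflib code itself. Both helpers give the same answer, but the second applies the byte swap to a constant, which a compiler can typically fold at build time.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for the IPv4 EtherType in host byte order. */
#define ETHERTYPE_IP_HOST 0x0800

/* Swaps the packet field at runtime, then compares in host order. */
static int
is_ipv4_runtime_swap(uint16_t wire_type)
{
	return (ntohs(wire_type) == ETHERTYPE_IP_HOST);
}

/* Compares in network order; the swap is applied to a constant. */
static int
is_ipv4_const_swap(uint16_t wire_type)
{
	return (wire_type == htons(ETHERTYPE_IP_HOST));
}
```

Whether the constant is actually folded depends on the compiler and on how htons() is defined on the platform, so treat the benefit as likely rather than guaranteed.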
#
d49e83ea |
|
15-Jun-2019 |
Marius Strobl <marius@FreeBSD.org> |
- Replace unused and only ever written to members of public iflib(9) structs with placeholders (in the latter case, IFLIB_MAX_TX_BYTES etc. are also only ever used for these write-only members if at all, so both these macros and members can just go). Using these spares may render it possible to merge certain iflib(9) fixes to stable/12. Otherwise, changes extending struct if_irq or struct if_shared_ctx in any way would break KBI as instances of these are allocated by the driver front-ends (by contrast, struct if_pkt_info as well as struct if_softc_ctx instances are provided by iflib(9) and, thus, may grow at least at the end without breaking KBI). - Make the pvi_name in struct pci_vendor_info const char * as device identifiers in hardware lookup tables aren't to be expected to ever change at runtime. - Similarly, make the pci_vendor_info_t of struct if_shared_ctx which is used to point to the struct pci_vendor_info arrays provided by the driver front-ends const. - Remove the ETH_ADDR_LEN macro from iflib.h; this was duplicating ETHER_ADDR_LEN of <net/ethernet.h> with iflib(9) actually only consuming the latter macro. - Make the name argument of iflib_io_tqg_attach(9) const, matching the taskqgroup_attach_cpu(9) this function wraps as well as e. g. iflib_config_gtask_init(9). - Remove the orphaned iflib_qset_lock_get() prototype. - Remove some extraneous empty lines.
|
#
668d6dbb |
|
29-May-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: provide probe wrapper for vendor drivers From Jake: Vendor drivers that exist out-of-tree generally should return BUS_PROBE_VENDOR from their device probe functions. This helps ensure that a vendor replacement driver will supersede the in-kernel driver for a given device. Currently, if a vendor wants to implement a driver based on iflib, it will always report BUS_PROBE_DEFAULT. Add a wrapper function, iflib_device_probe_vendor() which can be used in place of iflib_device_probe(). This function will just return BUS_PROBE_VENDOR whenever iflib_device_probe() would return BUS_PROBE_DEFAULT. While vendor drivers can already implement such a wrapper themselves, providing it in the iflib.h header makes it easier for the vendor driver to do the right thing. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@, marius@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D20221
|
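The wrapper described in the commit above amounts to remapping one return value. The sketch below is a userland model, not the iflib.h implementation: `SK_PROBE_VENDOR`, `SK_PROBE_DEFAULT`, `base_probe()` and `vendor_probe()` are hypothetical names, and the numeric values are assumptions chosen only so that the vendor priority ranks higher than the default.

```c
#include <assert.h>
#include <errno.h>

/* Assumed stand-ins for BUS_PROBE_VENDOR and BUS_PROBE_DEFAULT. */
#define SK_PROBE_VENDOR  (-10)
#define SK_PROBE_DEFAULT (-20)

/* Hypothetical model of a base probe: default priority on a match. */
static int
base_probe(int matches)
{
	return (matches ? SK_PROBE_DEFAULT : ENXIO);
}

/* Upgrade a default-priority match to vendor priority; pass errors on. */
static int
vendor_probe(int matches)
{
	int rc = base_probe(matches);

	return (rc == SK_PROBE_DEFAULT ? SK_PROBE_VENDOR : rc);
}
```

Only the successful default result is rewritten, so a probe failure still reaches the caller unchanged.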
#
afb77372 |
|
09-May-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: use default ntxd and nrxd when user value is not power of 2 From Jake: A user may set a sysctl to override the default number of Tx or Rx descriptors. However, certain calculations in the iflib core expect the number of descriptors to be a power of 2. Update _iflib_assert to verify that all of the shared context parameters for the number of descriptors are powers of 2. Modify iflib_reset_qvalues to check that the provided isc_nrxd value is a power of 2. If it's not, print a warning message and then use the default value. An alternative might be to try rounding the number down instead. However, this creates problems in case the rounded down value is below the minimum value that the driver would support. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: marius@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19880
|
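The power-of-2 fallback described in the commit above can be sketched as follows. This is a simplified model under stated assumptions: `is_pow2()` and `pick_ndesc()` are hypothetical helpers, not the functions used in iflib_reset_qvalues, and the warning message is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when n is a nonzero power of two. */
static bool
is_pow2(uint32_t n)
{
	return (n != 0 && (n & (n - 1)) == 0);
}

/*
 * Hypothetical stand-in for the fallback: keep the user-requested
 * descriptor count only if it is a power of two, else use the default.
 */
static uint32_t
pick_ndesc(uint32_t requested, uint32_t dflt)
{
	return (is_pow2(requested) ? requested : dflt);
}
```

Falling back to the default, rather than rounding down, avoids landing on a value below the minimum the driver supports, which is the concern the commit message raises.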
#
007b804f |
|
08-May-2019 |
Marius Strobl <marius@FreeBSD.org> |
Allow to build without INET and INET6 again after r347221. Submitted by: cam
|
#
3d10e9ed |
|
07-May-2019 |
Marius Strobl <marius@FreeBSD.org> |
o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and MSI. Unlike as with iflib_fast_intr_ctx(), the former will also enqueue _task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP TX throughput of EM-class devices on par with the MSI-X case and, thus, close to wirespeed/pre-iflib(4) times again. [1] Note that independently of the interrupt type, the UDP performance with these MACs still is abysmal and nowhere near to where it was before the conversion of em(4) to iflib(4). o In iflib_init_locked(), announce which free list failed to set up. o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt as the latter is valid with MSI-X only. o Instead of adding the missing - and apparently convoluted enough that a DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and make the checks fail gracefully rather than panic. This avoids invoking the checks at runtime over and over again in iflib_fast_intr_rxtx() and _task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes these functions more readable. o In iflib_rx_structures_setup(), only initialize LRO resources if device and driver have LRO capability in order to not waste memory. Also, free the LRO resources again if setting them up fails for one of the queues. However, don't bother invoking iflib_rx_sds_free() in that case because iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case of failure via iflib_rx_structures_free(), but there definitely is some asymmetry left to be fixed, though). o Similarly, free LRO resources again in iflib_rx_structures_free(). 
o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully instead of panicking (but only in case of INVARIANTS). This is a follow-up to r344132, as such driver bugs shouldn't be fatal. o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic() gracefully, too. o Bring yet more sanity to iflib_msix_init(): - If the device doesn't provide enough MSI-X vectors or not all vectors can be allocated so the expected number of queues in addition to admin interrupts can't be supported, try MSI next (and then INTx) as proper MSI-X vector distribution can't be assured in such cases. In essence, this change brings r254008 forward to iflib(4). Also, this is the fix alluded to in the commit message of r343934. - If the MSI-X allocation has failed, don't prematurely announce MSI is going to be used as the latter in fact may not be available either. - When falling back to MSI, only release the MSI-X table resource again if it was allocated in iflib_msix_init(), i. e. isn't supplied by the driver, in the first place. o In mp_ndesc_handler(), handle unknown type arguments gracefully, too. PR: 235031 (likely) [1] Reviewed by: shurd Differential Revision: https://reviews.freebsd.org/D20175
|
#
1722eeac |
|
06-May-2019 |
Marius Strobl <marius@FreeBSD.org> |
- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx. - Remove the only ever written to ift_db_mtx_name member of struct iflib_txq. - Remove the unused or only ever written to ifr_size, ifr_cq_pidx, ifr_cq_gen and ifr_lro_enabled members of struct iflib_rxq. - Consistently spell DMA, RX and TX uppercase in comments, messages etc. instead of mixing with some lowercase variants. - Consistently use if_t instead of a mix of if_t and struct ifnet pointers. - Bring the function comments of _iflib_fl_refill(), iflib_rx_sds_free() and iflib_fl_setup() in line with reality. - Judging problem reports, people are wondering what on earth messages like: "TX(0) desc avail = 1024, pidx = 0" are trying to indicate. Thus, extend this string to be more like that of non-iflib(4) Ethernet MAC drivers, notifying about a watchdog timeout due to which the interface will be reset. - Take advantage of the M_HAS_VLANTAG macro. - Use false/true rather than FALSE/TRUE for variables of type bool. - Use FALLTHROUGH as advocated by style(9).
|
#
e2621d96 |
|
03-May-2019 |
Matt Macy <mmacy@FreeBSD.org> |
Allow iflib drivers to pass a pointer to their own ifmedia structure. Tested by: emaste@ Differential Revision: https://reviews.freebsd.org/D19946
|
#
ce3da455 |
|
02-May-2019 |
Ed Maste <emaste@FreeBSD.org> |
iflib: remove assertion that isc_capabilities is nonzero It's atypical, but not invalid, for a driver to pass no capabilities. Submitted by: Gerald Aryeetey <aryeeteygerald_rogers.com> Reviewed by: shurd MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20142
|
#
f154ece0 |
|
25-Apr-2019 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: Better control over queue core assignment By default, cores are now assigned to queues in a sequential manner rather than all NICs starting at the first core. On a four-core system with two NICs each using two queue pairs, the nic:queue -> core mapping has changed from this: 0:0 -> 0, 0:1 -> 1; 1:0 -> 0, 1:1 -> 1. To this: 0:0 -> 0, 0:1 -> 1; 1:0 -> 2, 1:1 -> 3. Additionally, a device can now be configured to use separate cores for TX and RX queues. Two new tunables have been added, dev.X.Y.iflib.separate_txrx and dev.X.Y.iflib.core_offset. If core_offset is set, the NIC is not part of the auto-assigned sequence. Reviewed by: marius MFC after: 2 weeks Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D20029
|
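The sequential assignment in the commit above can be modelled with a running offset. The sketch below is a deliberately simplified model: `claim_cores()` and `queue_core()` are hypothetical names, and the real iflib bookkeeping (CPU sets, the separate_txrx and core_offset tunables) is not represented.

```c
#include <assert.h>

/*
 * Hypothetical model: hand a NIC its starting core and advance the
 * shared offset by the number of queues it uses, wrapping at ncpus.
 */
static int
claim_cores(int *next, int nqueues, int ncpus)
{
	int base = *next;

	*next = (base + nqueues) % ncpus;
	return (base);
}

/* Core used by queue q of a NIC whose first core is base. */
static int
queue_core(int base, int q, int ncpus)
{
	return ((base + q) % ncpus);
}
```

With four cores and two NICs of two queues each, the second NIC starts at core 2 in this model, matching the 1:0 -> 2, 1:1 -> 3 mapping quoted in the commit message. The older behaviour corresponds to every NIC using a base of 0.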
#
6d49b41e |
|
24-Apr-2019 |
Andrew Gallatin <gallatin@FreeBSD.org> |
iflib: Add pfil hooks As with mlx5en, the idea is to drop unwanted traffic as early in receive as possible, before mbufs are allocated and anything is passed up the stack. This can save considerable CPU time when a machine is under a flooding style DOS attack. The major change here is to remove the unneeded abstraction where callers of rxd_frag_to_sd() get back a pointer to the mbuf ring, and are responsible for NULL'ing that mbuf themselves. Now this happens directly in rxd_frag_to_sd(), and it returns an mbuf. This allows us to use the decision (and potentially mbuf) returned by the pfil hooks. The driver can now recycle mbufs to avoid re-allocation when packets are dropped. Reviewed by: marius (shurd and erj also provided feedback) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D19645
|
#
1fd8c72c |
|
17-Apr-2019 |
Kyle Evans <kevans@FreeBSD.org> |
iflib: Use new ether_gen_addr, restricting addresses to that subset Differential Revision: https://reviews.freebsd.org/D19587
|
#
225eae1b |
|
28-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: return ENETDOWN when the network device is down From Jake: iflib_if_transmit returns ENOBUFS when the device is down, or when the link isn't active. This was changed in r308792 from return (0), so that the function correctly reports an error that it was unable to transmit. However, using ENOBUFS can cause some network applications to produce the following or similar errors: "ping: sendto: No buffer space available" This is a bit confusing as the real cause of the issue is that the network device is down. Replace the ENOBUFS return with ENETDOWN to indicate more clearly that the reason for the failure to send is that the network device is offline. This will cause the error message to be reported as "ping: sendto: Network is down" Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@, sbruno@, bz@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19652
|
#
aac9c817 |
|
28-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: hold the CTX lock in iflib_pseudo_register From Jake: The iflib_device_register function takes the CTX lock before calling IFDI_ATTACH_PRE, and releases it upon finishing the registration. Mirror this process in iflib_pseudo_register, so that we always hold the CTX lock during the attach process when registering a pseudo interface or a regular interface. This was caught by code inspection while attempting to analyze where the CTX lock was held. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@, erj@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19604
|
#
10a1e981 |
|
19-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: mark isc_driver_version as constant From Jake: The iflib core never modifies the isc_driver_version string. Allow drivers to safely assign pointers to constant buffers by marking this parameter const. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@, jhb@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19577
|
#
1b9d9394 |
|
19-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: expose the Rx mbuf buffer size to drivers From Jake: iflib_fl_setup calculates a suitable buffer size for the Rx mbufs based on the isc_max_frame_size value that drivers setup. This calculation is repeated by drivers when programming their hardware with the size of each Rx buffer. This can lead to a mismatch where the iflib mbuf size is different from the expected size of the buffer as programmed by the hardware. This can lead to unexpected results. If iflib ever wants to support mbuf sizes larger than one page, every driver must be updated to account for the new possible buffer sizes. Fix this by calculating the mbuf size prior to calling IFDI_INIT, and adding the iflib_get_rx_mbuf_sz function which will expose this value to drivers, so that they do not repeat the same calculation. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@, erj@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19489
|
#
3e8d1bae |
|
19-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
iflib: prevent possible infinite loop in iflib_encap From Jake: iflib_encap calls bus_dmamap_load_mbuf_sg. Upon it returning EFBIG, an m_collapse and an m_defrag are attempted to shrink the mbuf cluster to fit within the DMA segment limitations. However, if we call m_defrag, and then bus_dmamap_load_mbuf_sg returns EFBIG on the now defragmented mbuf, we will continuously re-call bus_dmamap_load_mbuf_sg over and over. This happens because m_head isn't NULL, and remap is >1, so we don't try to m_collapse or m_defrag again. The only way we exit the loop is if m_head is NULL. However, m_head can't be modified by the call to bus_dmamap_load_mbuf_sg, because we don't pass it as a double pointer. I believe this will be an incredibly rare occurrence, because it is unlikely that bus_dmamap_load_mbuf_sg will actually fail on the second defragment with an EFBIG error. However, it still seems like a possibility that we should account for. Fix the exit check to ensure that if remap is >1, we will also exit, even if m_head is not NULL. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@, gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19468
|
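The loop-exit fix in the commit above can be modelled without any mbuf or bus_dma(9) machinery. In this sketch, `always_efbig()` and `encap_model()` are hypothetical stand-ins: the load callback plays the role of bus_dmamap_load_mbuf_sg, and the `remap` counter stands for the collapse and defragment attempts. It shows only the control flow, assuming one collapse followed by one defragment before giving up.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical loader that never fits, like a persistent EFBIG. */
static int
always_efbig(void)
{
	return (EFBIG);
}

/*
 * Model of the retry logic: the first EFBIG triggers a "collapse", the
 * second a "defrag", and a third gives up instead of looping forever.
 */
static int
encap_model(int (*load)(void), int *attempts)
{
	int remap = 0, err;

	for (;;) {
		(*attempts)++;
		err = load();
		if (err != EFBIG)
			return (err);
		if (remap > 1)
			return (EFBIG);	/* the added exit condition */
		remap++;
	}
}
```

Without the `remap > 1` check, a loader that keeps returning EFBIG would spin indefinitely, which is the case the commit message describes as rare but possible.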
#
bc408c7d |
|
05-Mar-2019 |
Eric Joyner <erj@FreeBSD.org> |
Remove references to CONTIGMALLOC_WORKS in iflib and em From Jake: "The iflib_fl_setup() function tries to pick various buffer sizes based on the max_frame_size value defined by the parent driver. However, this code was wrapped under CONTIGMALLOC_WORKS, which was never actually defined anywhere. This same code pattern was used in if_em.c, likely trying to match what iflib uses. Since CONTIGMALLOC_WORKS is not defined, remove this dead code from iflib_fl_setup and if_em.c Given that various iflib drivers appear to be using a similar calculation, it might be worth making this buffer size a value that the driver can peek at in the future." Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: shurd@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D19199
|
#
ca62461b |
|
15-Feb-2019 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: Improve return values of interrupt handlers. iflib was returning FILTER_HANDLED, in cases where FILTER_STRAY was more correct. This potentially caused issues with shared legacy interrupts. Driver filters returning FILTER_STRAY are now properly handled. Submitted by: Augustin Cavalier <waddlesplash@gmail.com> Reviewed by: marius, gallatin Obtained from: Haiku (a84bb9, 4947d1) MFC after: 1 week Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D19201
|
#
a6611c93 |
|
12-Feb-2019 |
Marius Strobl <marius@FreeBSD.org> |
Fix the build with ALTQ after r344060.
|
#
f855ec81 |
|
12-Feb-2019 |
Marius Strobl <marius@FreeBSD.org> |
Make taskqgroup_attach{,_cpu}(9) work across architectures So far, intr_{g,s}etaffinity(9) take a single int for identifying a device interrupt. This approach doesn't work on all architectures supported, as a single int isn't sufficient to globally specify a device interrupt. In particular, with multiple interrupt controllers in one system as found on e. g. arm and arm64 machines, an interrupt number as returned by rman_get_start(9) may be only unique relative to the bus and, thus, interrupt controller, a certain device hangs off from. In turn, this makes taskqgroup_attach{,_cpu}(9) and - internal to the gtaskqueue implementation - taskqgroup_attach_deferred{,_cpu}() not work across architectures. Yet in turn, iflib(4) as gtaskqueue consumer so far doesn't fit architectures where interrupt numbers aren't globally unique. However, at least for intr_setaffinity(..., CPU_WHICH_IRQ, ...) as employed by the gtaskqueue implementation to bind an interrupt to a particular CPU, using bus_bind_intr(9) instead is equivalent from a functional point of view, with bus_bind_intr(9) taking the device and interrupt resource arguments required for uniquely specifying a device interrupt. Thus, change the gtaskqueue implementation to employ bus_bind_intr(9) instead and intr_{g,s}etaffinity(9) to take the device and interrupt resource arguments required respectively. This change also moves struct grouptask from <sys/_task.h> to <sys/gtaskqueue.h> and wraps struct gtask along with the gtask_fn_t typedef into #ifdef _KERNEL as userland likes to include <sys/_task.h> or indirectly drags it in - for better or worse also with _KERNEL defined -, which with device_t and struct resource dependencies otherwise is no longer as easily possible now. The userland inclusion problem probably can be improved a bit by introducing a _WANT_TASK (as well as a _WANT_MOUNT) akin to the existing _WANT_PRISON etc., which is orthogonal to this change, though, and likely needs an exp-run. 
While at it: - Change the gt_cpu member in the grouptask structure to be of type int as used elsewhere for specifying CPUs (an int16_t may be too narrow sooner or later), - move the gtaskqueue_enqueue_fn typedef from <sys/gtaskqueue.h> to the gtaskqueue implementation as it's only used and needed there, - change the GTASK_INIT macro to use "gtask" rather than "task" as argument given that it actually operates on a struct gtask rather than a struct task, and - let subr_gtaskqueue.c consistently use __func__ to print function names. Reported by: mmel Reviewed by: mmel Differential Revision: https://reviews.freebsd.org/D19139
|
#
95dcf343 |
|
12-Feb-2019 |
Marius Strobl <marius@FreeBSD.org> |
Further correct and optimize the bus_dma(9) usage of iflib(4): o Correct the obvious bugs in the netmap(4) parts: - No longer check for the existence of DMA maps as bus_dma(9) is used unconditionally in iflib(4) since r341095. - Supply the correct DMA tag and map pairs to bus_dma(9) functions (see also the commit message of r343753). - In iflib_netmap_timer_adjust(), add synchronization of the TX descriptors before calling the ift_txd_credits_update method as the latter evaluates the TX descriptors possibly updated by the MAC. - In _task_fn_tx(), wrap the netmap(4)-specific bits in #ifdef DEV_NETMAP just as done in _task_fn_admin() and _task_fn_rx() respectively. o In iflib_fast_intr_rxtx(), synchronize the TX rather than the RX descriptors before calling the ift_txd_credits_update method (see also above). o There's no need to synchronize an RX buffer that is going to be recycled in iflib_rxd_pkt_get(), yet; it's sufficient to do that as late as passing RX buffers to the MAC via the ift_rxd_refill method. Hence, combine that synchronization with the synchronization of new buffers into a common spot in _iflib_fl_refill(). o There's no need to synchronize the RX descriptors of a free list in preparation of the MAC updating their statuses with every invocation of rxd_frag_to_sd(); it's enough to do this once before handing control over to the MAC, i. e. before calling ift_rxd_flush method in _iflib_fl_refill(), which already performs the necessary synchronization. o Given that the ift_rxd_available method evaluates the RX descriptors which possibly have been altered by the MAC, synchronize as appropriate beforehand. Most notably this is now done in iflib_rxd_avail(), which in turn means that we don't need to issue the same synchronization yet again before calling the ift_rxd_pkt_get method in iflib_rxeof(). o In iflib_txd_db_check(), synchronize the TX descriptors before handing them over to the MAC for transmission via the ift_txd_flush method. 
o In iflib_encap(), move the TX buffer synchronization after the invocation of the ift_txd_encap() method. If the MAC driver fails to encapsulate the packet and we retry with a defragmented mbuf chain or finally fail, the cycles for TX buffer synchronization have been wasted. Synchronizing afterwards matches what non-iflib(4) drivers typically do and is sufficient as the MAC will not actually start with the transmission before - in this case - the ift_txd_flush method is called. Moreover, for the latter reason the synchronization of the TX descriptors in iflib_encap() can go as it's enough to synchronize them before passing control over to the MAC by issuing the ift_txd_flush() method (see above). o In iflib_txq_can_drain(), only synchronize TX descriptors if the ift_txd_credits_update method accessing these is actually called. Differential Revision: https://reviews.freebsd.org/D19081
|
#
bfce461e |
|
04-Feb-2019 |
Marius Strobl <marius@FreeBSD.org> |
o As illustrated by e. g. figure 7-14 of the Intel 82599 10 GbE controller datasheet revision 3.3, in the context of Ethernet MACs the control data describing the packet buffers typically are named "descriptors". Each of these descriptors references one buffer, multiple of which a packet can be composed of. By contrast, in comments, messages and the names of structure members, iflib(4) refers to DMA resources employed for RX and TX buffers (rather than control data) as "desc(riptors)". This odd naming convention of iflib(4) made reviewing r343085 and identifying wrong and missing bus_dmamap_sync(9) calls in particular way harder than it already is. This convention may also explain why the netmap(4) part of iflib(4) pairs the DMA tags for control data with DMA maps of buffers and vice versa in calls to bus_dma(9) functions. Therefore, change iflib(4) to refer to buf(fers) when buffers and not the usual understanding of descriptors is meant. This change does not include corrections to the DMA resources used in the netmap(4) parts. However, it revises error messages to state which kind of allocation/creation failed. Specifically, the "Unable to allocate tx_buffer (map) memory" copy & pasted inappropriately on several occasions was replaced with proper messages. o Enhance some other error messages to indicate which half - RX or TX - they apply to instead of using identical text in both cases and generally canonicalize them. o Correct the descriptions of iflib_{r,t}xsd_alloc() to reflect reality; current code doesn't use {r,t}x_buffer structures. o In iflib_queues_alloc(): - Remove redundant BUS_DMA_NOWAIT of iflib_dma_alloc() calls, - change the M_WAITOK from malloc(9) calls into M_NOWAIT. The return values are already checked, deferred DMA allocations not being an option at this point, BUS_DMA_NOWAIT has to be used anyway and prior malloc(9) calls in this function also specify M_NOWAIT. Reviewed by: shurd Differential Revision: https://reviews.freebsd.org/D19067
|
#
b97de13a |
|
30-Jan-2019 |
Marius Strobl <marius@FreeBSD.org> |
- Stop iflib(4) from leaking MSI messages on detachment by calling bus_teardown_intr(9) before pci_release_msi(9). - Ensure that iflib(4) and associated drivers pass correct RIDs to bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9) on the corresponding resources instead of using the RIDs initially passed to bus_alloc_resource_any(9) as the latter function may change those RIDs. Solely em(4) for the ioport resource (but not others) and bnxt(4) were using the correct RIDs by caching the ones returned by bus_alloc_resource_any(9). - Change the logic of iflib_msix_init() around to only map the MSI-X BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns > 0. Otherwise the "Unable to map MSIX table " message triggers for devices that simply don't support MSI-X and the user may think that something is wrong while in fact everything works as expected. - Put some (mostly redundant) debug messages emitted by iflib(4) and em(4) during attachment under bootverbose. The non-verbose output of em(4) seen during attachment now is close to the one prior to the conversion to iflib(4). - Replace various variants of spelling "MSI-X" (several in messages) with "MSI-X" as used in the PCI specifications. - Remove some trailing whitespace from messages emitted by iflib(4) and change them to consistently start with uppercase. - Remove some obsolete comments about releasing interrupts from drivers and correct a few others. Reviewed by: erj, Jacob Keller, shurd Differential Revision: https://reviews.freebsd.org/D18980
|
#
3db348b5 |
|
26-Jan-2019 |
Marius Strobl <marius@FreeBSD.org> |
- In _iflib_fl_refill(), don't mark an RX buffer as available in the corresponding bitmap before adding an mbuf has actually succeeded. Previously, m_gethdr(M_NOWAIT, ...) failing caused a "hole" in the RX ring but not in its bitmap. One implication of such a hole was that in a subsequent call to _iflib_fl_refill() with the RX buffer accounting still indicating another reclaimable buffer, bit_ffc(3) nevertheless returned -1 in frag_idx which in turn caused havoc when used as an index. Thus, additionally assert that frag_idx is 0 or greater. Another possible consequence of a hole in the RX ring was a NULL-dereference when trying to use the unallocated mbuf, for example in iflib_rxd_pkt_get(). While at it, make the variable declarations in _iflib_fl_refill() conform to style(9) and remove redundant checks already performed by bit_ffc{,_at}(3). - In iflib_queues_alloc(), don't pass redundant M_ZERO to bit_alloc(3). Reported and tested by: pho
|
#
77102fd6 |
|
25-Jan-2019 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Fix an iflib driver unload panic introduced in r343085 The new loop to sync and unload descriptors was indexed by "i", rather than "j". The panic was caused by "i" being advanced rather than "j", and eventually becoming out of bounds. Reviewed by: kib MFC after: 3 days Sponsored by: Netflix
|
#
8f82136a |
|
21-Jan-2019 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Convert vmx(4) to being an iflib driver. Also, expose IFLIB_MAX_RX_SEGS to iflib drivers and add iflib_dma_alloc_align() to the iflib API. Performance is generally better with the tunable/sysctl dev.vmx.<index>.iflib.tx_abdicate=1. Reviewed by: shurd MFC after: 1 week Relnotes: yes Sponsored by: RG Nets Differential Revision: https://reviews.freebsd.org/D18761
|
#
7f3eb9da |
|
21-Jan-2019 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Fix various resource leaks that can occur in the error paths of iflib_device_register() and iflib_pseudo_register(). Reviewed by: shurd MFC after: 1 week Sponsored by: RG Nets Differential Revision: https://reviews.freebsd.org/D18760
|
#
8a04b53d |
|
15-Jan-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Improve iflib busdma(9) KPI use. - Specify BUS_DMA_NOWAIT for bus_dmamap_load() on rx refill, since callbacks are not supposed to be used. - Match tso/non-tso tags to corresponding tx map operations. Create separate tso maps for tx descriptors. In particular, do not use non-tso tag to load, unload, or destroy a map created with tso tag. - Add missed bus_dmamap_sync() calls. Submitted by: marius. Reported and tested by: pho Reviewed by: marius Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
cd28ea92 |
|
07-Jan-2019 |
Stephen Hurd <shurd@FreeBSD.org> |
Use iflib_if_init_locked() during resume instead of iflib_init_locked(). iflib_init_locked() assumes that iflib_stop() has been called, however, it is not called for suspend. iflib_if_init_locked() calls stop then init, so fixes the problem. This was causing errors after a resume from suspend. PR: 224059 Reported by: zeising MFC after: 1 week Sponsored by: Limelight Networks
|
#
85f3b801 |
|
02-Jan-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix typo, use boolean operator instead of bit-wise. Reviewed by: marius, shurd MFC after: 3 days Sponsored by: The FreeBSD Foundation
|
#
7124b5ba |
|
11-Dec-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix !tx_abdicate error from r336560 r336560 was supposed to restore pre-r323954 behaviour when tx_abdicate is not set (the default case). However, it appears that rather than the drainage check being made conditional on tx_abdicate being set, it was duplicated so it occurred twice if tx_abdicate was set and once if it was not. Now when !tx_abdicate, drainage is only checked if the doorbell isn't pending. Reported by: lev MFC after: 1 week Sponsored by: Limelight Networks
|
#
fbec776d |
|
27-Nov-2018 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Use busdma unconditionally in iflib - Remove the complex mechanism to choose between using busdma and raw pmap_kextract at runtime. The reduced complexity makes the code easier to read and maintain. - Fix a bug in the small packet receive path where clusters were repeatedly mapped but never unmapped. We now store the cluster's bus address and avoid re-mapping the cluster each time a small packet is received. This patch fixes bugs I've seen where ixl(4) will not even respond to ping without seeing DMAR faults. I see a small improvement (14%) on packet forwarding tests using a Haswell based Xeon E5-2697 v3. Olivier sees a small regression (-3% to -6%) with lower end hardware. Reviewed by: mmacy Not objected to by: sbruno MFC after: 8 weeks Sponsored by: Netflix, Inc Differential Revision: https://reviews.freebsd.org/D17901
|
#
0efb1a46 |
|
14-Nov-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Clear RX completion queue state variables in iflib_stop() iflib_stop() was not resetting the rxq completion queue state variables. This meant that for any driver that has receive completion queues, after a reinit, iflib would start asking what's available on the rx side starting at whatever the completion queue index was prior to the stop, instead of at 0. Submitted by: pkelsey Reported by: pkelsey MFC after: 3 days Sponsored by: Limelight Networks
|
#
8d4ceb9c |
|
14-Nov-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Prevent POLA violation with TSO/CSUM offload Ensure that any time CSUM_IP_TSO or CSUM_IP6_TSO is set that the corresponding CSUM_IP6?_TCP / CSUM_IP flags are also set. Rather than requiring drivers to bake-in an understanding that TSO implies checksum offloads, make it explicit. This change requires us to move the IFLIB_NEED_ZERO_CSUM implementation to ensure it's zeroed for TSO. Reported by: Jacob Keller <jacob.e.keller@intel.com> MFC after: 1 week Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D17801
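The "TSO implies checksum offload" rule above can be sketched as a small flag-normalization step. This is an illustration under stated assumptions: the `EX_CSUM_*` bit values are made up for the example (the kernel's real `CSUM_*` values are defined in its mbuf header), and `ex_normalize_csum_flags()` is a hypothetical helper, not an iflib function.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; not the kernel's CSUM_* definitions. */
#define EX_CSUM_IP	0x01u
#define EX_CSUM_IP_TCP	0x02u
#define EX_CSUM_IP_TSO	0x04u
#define EX_CSUM_IP6_TCP	0x08u
#define EX_CSUM_IP6_TSO	0x10u

/*
 * Make the implication explicit: whenever a TSO flag is set, also set the
 * checksum flags for the same address family, so that a driver does not
 * have to know that TSO implies checksum offload.
 */
uint32_t
ex_normalize_csum_flags(uint32_t flags)
{
	if (flags & EX_CSUM_IP_TSO)
		flags |= EX_CSUM_IP_TCP | EX_CSUM_IP;
	if (flags & EX_CSUM_IP6_TSO)
		flags |= EX_CSUM_IP6_TCP;

	/* Postcondition: a TSO flag never appears without its TCP flag. */
	assert((flags & EX_CSUM_IP_TSO) == 0 || (flags & EX_CSUM_IP_TCP) != 0);
	assert((flags & EX_CSUM_IP6_TSO) == 0 || (flags & EX_CSUM_IP6_TCP) != 0);
	return (flags);
}
```

Flags that do not involve TSO pass through unchanged, so the helper is safe to apply to every outgoing packet.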
|
#
4d261ce2 |
|
14-Nov-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix leaks caused by ifc_nhwtxqs never being initialized r333502 removed initialization of ifc_nhwtxqs, and it's not clear there's a need to copy it into the struct iflib_ctx at all. Use ctx->ifc_sctx->isc_ntxqs instead. Further, iflib_stop() did not clear the last ring in the case where isc_nfl != isc_nrxqs (such as when IFLIB_HAS_RXCQ is set). Use ctx->ifc_sctx->isc_nrxqs here instead of isc_nfl. Reported by: pkelsey Reviewed by: pkelsey MFC after: 3 days Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D17979
|
#
a42546df |
|
07-Nov-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix rxcsum issue introduced in r338838 r338838 attempted to fix issues with rxcsum and rxcsum6. However, the rxcsum bits were set as though if_setcapenablebit() was being called, not if_togglecapenable() which is in use. As a result, it was not possible to disable rxcsum when rxcsum6 was supported. PR: 233004 Reported by: lev Reviewed by: lev MFC after: 3 days Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D17881
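The difference between "set" and "toggle" semantics that this fix turns on can be shown with two bits. The sketch below assumes toggle means XOR with a mask of bits to flip; the `EX_CAP_*` values and both helpers are hypothetical and only model the arithmetic, not the ioctl handler.

```c
#include <stdint.h>

/* Illustrative capability bits; not the kernel's IFCAP_* values. */
#define EX_CAP_RXCSUM		0x1u
#define EX_CAP_RXCSUM_IPV6	0x2u

/* Toggle semantics: every bit set in `flip` is inverted in `enabled`. */
uint32_t
ex_toggle(uint32_t enabled, uint32_t flip)
{
	return (enabled ^ flip);
}

/*
 * Build the mask a toggle-style call needs: the bits that differ between
 * the current and the requested state, limited to supported capabilities.
 */
uint32_t
ex_flip_mask(uint32_t enabled, uint32_t requested, uint32_t supported)
{
	return ((enabled ^ requested) & supported);
}
```

With both bits enabled and a request to keep only `EX_CAP_RXCSUM_IPV6`, the flip mask is `EX_CAP_RXCSUM` and the toggle yields the requested state. Passing the desired state itself as the mask would flip the wrong bit, which is the kind of mismatch the commit describes.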
|
#
46fa0c25 |
|
23-Oct-2018 |
Eric Joyner <erj@FreeBSD.org> |
Revert r339634. That commit is causing kernel panics in em(4), so this will be reverted until those are fixed. Reported by: ae@, pho@, et al Sponsored by: Intel Corporation
|
#
940f62d6 |
|
22-Oct-2018 |
Eric Joyner <erj@FreeBSD.org> |
iflib: drain enqueued tasks before detaching from taskqgroup The taskqgroup_detach function does not check if task is already enqueued when detaching it. This may lead to kernel panic if enqueued task starts after context state lock is destroyed. Ensure that the already enqueued admin tasks are executed before detaching them. The issue was discovered during validation of D16429. Unloading of if_ixlv followed by immediate removal of VFs with iovctl -D may lead to panic on NODEBUG kernel. As well, check if iflib is in detach before enqueueing new admin or iov tasks, to prevent new tasks from executing while the taskqgroup tasks are being drained. Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com> Reviewed by: shurd@, erj@ Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D17404
|
#
77c1fcec |
|
12-Oct-2018 |
Eric Joyner <erj@FreeBSD.org> |
ixl/iavf(4): Change ixlv to iavf and update it to use iflib(9) Finishes the conversion of the 40Gb Intel Ethernet drivers to iflib(9) for FreeBSD 12.0, and fixes numerous bugs in both ixl(4) and iavf(4). This commit also re-adds the VF driver to GENERIC since it now compiles and functions. The VF driver name was changed from ixlv(4) to iavf(4) because the VF driver is now intended to be used with future products, not just with Fortville/Fort Park VFs. A man page update that documents these drivers is forthcoming in a separate commit. Reviewed by: sbruno@, kbowling@ Tested by: jeffrey.e.pieper@intel.com Approved by: re (gjb@) Relnotes: yes Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D16429
|
#
0c919c23 |
|
20-Sep-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix capabilities handling for iflib drivers Various capabilities were not being handled correctly in the SIOCSIFCAP handler. Specifically: IFCAP_RXCSUM and IFCAP_RXCSUM_IPV6 could be set even if not supported. It was impossible to disable IFCAP_RXCSUM and/or IFCAP_RXCSUM_IPV6 via ifconfig since it does ioctl() per command-line flag rather than combine them into a single call. IFCAP_VLAN_HWCSUM could not be modified via the ioctl(). Setting any combination of the three IFCAP_WOL flags would set only IFCAP_WOL_MCAST | IFCAP_WOL_MAGIC. For example, setting only IFCAP_WOL_UCAST would result in both IFCAP_WOL_MCAST and IFCAP_WOL_MAGIC being enabled, but IFCAP_WOL_UCAST would not be enabled. Because if_vlancap() was called before if_togglecapenable(), vlan flags were sometimes not applied correctly. Interfaces were being unnecessarily stopped and restarted for WoL. PR: 231151 Submitted by: Kaho Toshikazu <kaho@elam.kais.kyoto-u.ac.jp> Reported by: Shirkdog <mshirk@daemon-security.com> Reviewed by: gallatin Approved by: re (gjb) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D17158
|
#
64e6fc13 |
|
06-Sep-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Clean up iflib sysctls Remove sysctls: txq_drain_encapfail - now a duplicate of encap_txd_encap_fail; intr_link - was never incremented; intr_msix - was never incremented; rx_zero_len - was never incremented. The following were not incremented in all code-paths that apply: m_pullups, mbuf_defrag, rxd_flush, tx_encap, rx_intr_enables, tx_frees, encap_txd_encap_fail. Fixes: Replace the broken collapse_pkthdr() implementation with an MPASS(). fl_refills and fl_refills_large were not incremented when using netmap. Reviewed by: gallatin Approved by: re (marius) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16733
|
#
bc0e855b |
|
29-Aug-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix compile error due to missing parenthesis in r338372 Approved by: re (gjb)
|
#
a520f8b6 |
|
29-Aug-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix potential data corruption in iflib The MP ring may have txq pointers enqueued. Previously, these were passed to m_free() when IFC_QFLUSH was set. This patch checks for the value and doesn't call m_free(). Reviewed by: gallatin Approved by: re (gjb) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16882
|
#
8f410865 |
|
03-Aug-2018 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Mark the send queue ready so ALTQ is available.
|
#
b8ca4756 |
|
25-Jul-2018 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
ALTQ support for iflib. Reviewed by: jmallett, mmacy Differential Revision: https://reviews.freebsd.org/D16433
|
#
c9a49a4f |
|
24-Jul-2018 |
Marius Strobl <marius@FreeBSD.org> |
Since r336611, n is only used for INET in iflib_parse_header(). Reported by: rpokala
|
#
7474544b |
|
22-Jul-2018 |
Marius Strobl <marius@FreeBSD.org> |
Use the maximum of isc_tx_{nsegments,tso_segments_max} for MAX_TX_DESC. Since r336313, TSO support for LEM-class devices is removed again as it was before the conversion of {l,}em(4) to iflib(4) in r311849 and as a result, isc_tx_tso_segments_max is 0 for LEM-class devices now. Thus, inappropriate watermarks were used for this class. This is really only a band-aid, though, because so far iflib(9) doesn't fully take into account that DMA engines can support different maxima of segments for transfers of TSO and non-TSO packets. For example, the DESC_RECLAIMABLE macro is based on isc_tx_nsegments while MAX_TX_DESC used isc_tx_tso_segments_max only. For most in-tree consumers that doesn't make a difference as the maxima are the same for both kinds of transfers (that is, apart from the fact that TSO may require up to 2 sentinel descriptors but also not with every MAC supported). However, isc_tx_nsegments is 8 but isc_tx_tso_segments_max is 85 by default with ixl(4).
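The watermark choice described above amounts to taking the larger of the two segment maxima. A minimal sketch, assuming a cut-down stand-in for the softc context: `ex_softc_ctx` and `ex_max_tx_desc()` are hypothetical, only the two field names come from the commit message, and the real MAX_TX_DESC is a macro inside iflib.

```c
/* Reduced stand-in for the driver context; only the two fields used here. */
struct ex_softc_ctx {
	int	isc_tx_nsegments;		/* max segments, non-TSO */
	int	isc_tx_tso_segments_max;	/* max segments, TSO (0 if none) */
};

/* Use whichever maximum is larger so neither kind of packet is undercounted. */
int
ex_max_tx_desc(const struct ex_softc_ctx *scctx)
{
	if (scctx->isc_tx_nsegments > scctx->isc_tx_tso_segments_max)
		return (scctx->isc_tx_nsegments);
	return (scctx->isc_tx_tso_segments_max);
}
```

With the ixl(4) defaults quoted in the message (8 and 85) this yields 85. For a device without TSO support, where `isc_tx_tso_segments_max` is 0, it falls back to `isc_tx_nsegments` instead of producing a watermark of 0.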
|
#
8b8d9093 |
|
22-Jul-2018 |
Marius Strobl <marius@FreeBSD.org> |
- Given that the controlling expression of the receive loop in iflib_rxeof() tests for avail > 0, avail can never be 0 within that loop. Thus, move decrementing avail and budget_left into the loop and before the code which checks for additional descriptors having become available in case all the previous ones have been processed but there still is budget left so the latter code works as expected. [1] - In iflib_{busdma_load_mbuf_sg,parse_header}(), remove dead stores to m and n respectively. [2, 3] - In collapse_pkthdr(), ensure that m_next isn't NULL before dereferencing it. [4] - Remove a duplicate assignment of segs in iflib_encap(). Reported by: Coverity CID: 1356027 [1], 1356047 [2], 1368205 [3], 1356028 [4]
|
#
fe51d4cd |
|
20-Jul-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Add knob to control tx ring abdication. r323954 changed the mp ring behaviour when 64-bit atomics were available to abdicate the TX ring rather than having one become a consumer thereby running to completion on TX. The consumer of the mp ring was then triggered in the tx task rather than blocking the TX call. While this significantly lowered the number of RX drops in small-packet forwarding, it also negatively impacts TX performance. With this change, the default behaviour is reverted, causing one TX ring to become a consumer during the enqueue call. A new sysctl, dev.X.Y.iflib.tx_abdicate is added to control this behaviour. Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16302
|
#
dd7fbcf1 |
|
20-Jul-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Improve netmap TX handling when TX IRQs are not used/supported Use the timer to poll for TX completions when there are outstanding TX slots. Track when the last driver timer was called to prevent overcalling it. Also clean up some kring vs NIC ring usage. Reviewed by: marius, Johannes Lundberg <johalun0@gmail.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16300
|
#
7f87c040 |
|
15-Jul-2018 |
Marius Strobl <marius@FreeBSD.org> |
Assorted TSO fixes for em(4)/iflib(9) and dead code removal: - Ever since the workaround for the silicon bug of TSO4 causing MAC hangs was committed in r295133, CSUM_TSO always got disabled unconditionally by em(4) on the first invocation of em_init_locked(). However, even with that problem fixed, it turned out that for at least e. g. 82579 not all necessary TSO workarounds are in place, still causing MAC hangs even at Gigabit speed. Thus, for stable/11, TSO usage was deliberately disabled in r323292 (r323293 for stable/10) for the EM-class by default, allowing users to turn it on if it happens to work with their particular EM MAC in a Gigabit-only environment. In head, the TSO workaround for speeds other than Gigabit was lost with the conversion to iflib(9) in r311849 (possibly along with another one or two TSO workarounds). Yet at the same time, for EM-class MACs TSO4 got enabled by default again, causing device hangs. Therefore, change the default for this hardware class back to have TSO4 off, allowing users to turn it on manually if it happens to work in their environment as we do in stable/{10,11}. An alternative would be to add a whitelist of EM-class devices where TSO4 actually is reliable with the workarounds in place, but given that the advantage of TSO at Gigabit speed is rather limited - especially with the overhead of these workarounds -, that's really not worth it. [1] This change includes the addition of an isc_capabilities to struct if_softc_ctx so iflib(9) can also handle interface capabilities that shouldn't be enabled by default which is used to handle the default-off capabilities of e1000 as suggested by shurd@ and moving their handling from em_setup_interface() to em_if_attach_pre() accordingly. - Although 82543 support TSO4 in theory, the former lem(4) didn't have support for TSO4, presumably because TSO4 is even more broken in the LEM-class of MACs than the later EM ones. 
Still, TSO4 for LEM-class devices was enabled as part of the conversion to iflib(9) in r311849, causing device hangs. So revert back to the pre-r311849 behavior of not supporting TSO4 for LEM-class at all, which includes not creating a TSO DMA tag in iflib(9) for devices not having IFCAP_TSO4 set. [2] - In fact, the FreeBSD TCP stack can handle a TSO size of IP_MAXPACKET (65535) rather than FREEBSD_TSO_SIZE_MAX (65518). However, the TSO DMA must have a maxsize of the maximum TSO size plus the size of a VLAN header for software VLAN tagging. The iflib(9) converted em(4), thus, first correctly sets scctx->isc_tx_tso_size_max to EM_TSO_SIZE in em_if_attach_pre(), but later on overrides it with IP_MAXPACKET in em_setup_interface() (apparently, left-over from pre-iflib(9) times). So remove the latter and correct iflib(9) to correctly cap the maximum TSO size reported to the stack at IP_MAXPACKET. While at it, let iflib(9) use if_sethwtsomax*(). This change includes the addition of isc_tso_max{seg,}size DMA engine constraints for the TSO DMA tag to struct if_shared_ctx and letting iflib_txsd_alloc() automatically adjust the maxsize of that tag in case IFCAP_VLAN_MTU is supported as requested by shurd@. - Move the if_setifheaderlen(9) call for adjusting the maximum Ethernet header length from {ixgbe,ixl,ixlv,ixv,em}_setup_interface() to iflib(9) so adjustment is automatically done in case IFCAP_VLAN_MTU is supported. As a consequence, this adjustment now is also done in case of bnxt(4) which missed it previously. - Move the reduction of the maximum TSO segment count reported to the stack by the number of m_pullup(9) calls (which in the worst case, can add another mbuf and, thus, the requirement for another DMA segment each) in the transmit path for performance reasons from em_setup_interface() to iflib_txsd_alloc() as these pull-ups are now done in iflib_parse_header() rather than in the no longer existing em_xmit(). 
Moreover, this optimization applies to all drivers using iflib(9) and not just em(4); all in-tree iflib(9) consumers still have enough room to handle full size TSO packets. Also, reduce the adjustment to the maximum number of m_pullup(9)'s now performed in iflib_parse_header(). - Prior to the conversion of em(4)/igb(4)/lem(4) and ixl(4) to iflib(9) in r311849 and r335338 respectively, these drivers didn't enable IFCAP_VLAN_HWFILTER by default due to VLAN events not being passed through by lagg(4). With iflib(9), IFCAP_VLAN_HWFILTER was turned on by default but also lagg(4) was fixed in that regard in r203548. So just remove the now redundant and defunct IFCAP_VLAN_HWFILTER handling in {em,ixl,ixlv}_setup_interface(). - Nuke other redundant IFCAP_* setting in {em,ixl,ixlv}_setup_interface() which is (more completely) already done in {em,ixl,ixlv}_if_attach_pre() now. - Remove some redundant/dead setting of scctx->isc_tx_csum_flags in em_if_attach_pre(). - Remove some IFCAP_* duplicated either directly or indirectly (e. g. via IFCAP_HWCSUM) in {EM,IGB,IXL}_CAPS. - Don't bother to fiddle with IFCAP_HWSTATS in ixgbe(4)/ixgbev(4) as iflib(9) adds that capability unconditionally. - Remove some unused macros from em(4). - Bump __FreeBSD_version as some of the above changes require the modules of drivers using iflib(9) to be recompiled. Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1] Reviewed by: sbruno (earlier version), erj PR: 219428 (part of; comment #10) [1], 220997 (part of; comment #3) [2] Differential Revision: https://reviews.freebsd.org/D15720
|
#
dfae03b5 |
|
18-Jun-2018 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Style fixes MFC after: 1 week
|
#
e4defe55 |
|
17-Jun-2018 |
Marius Strobl <marius@FreeBSD.org> |
Assorted fixes to MSI-X/MSI/INTx setup in iflib(9): - In iflib_msix_init(), VMMs with broken MSI-X activation are trying to be worked around by manually enabling PCIM_MSIXCTRL_MSIX_ENABLE before calling pci_alloc_msix(9). Apart from constituting a layering violation, this has the problem of leaving PCIM_MSIXCTRL_MSIX_ENABLE enabled when falling back to MSI or INTx when e. g. MSI-X is blacklisted and initially also when disabled via hw.pci.enable_msix. The latter in turn was incorrectly worked around in r325166. Since r310806, pci(4) itself has code to deal with broken MSI-X handling of VMMs, so all of these workarounds in iflib(9) can go, fixing non-working interrupts when falling back to MSI/INTx. In any case, possibly further adjustments to broken MSI-X activation of VMMs like enabling r310806 by default in VM environments need to be placed into pci(4), not iflib(9). [1] - Also remove the pci_enable_busmaster(9) call from iflib_msix_init(), which is already more properly invoked from iflib_device_attach(). - When falling back to MSI/INTx, release the MSI-X BAR resource again. - When falling back to INTx, ensure scctx->isc_vectors is set to 1 and not to something higher from a device with more than one MSI message supported. - Make the nearby ring_state(s) stuff (static) const. Discussed with: jhb at BSDCan 2018 [1] Reviewed by: imp, jhb Differential Revision: https://reviews.freebsd.org/D15729
|
#
3ab4a960 |
|
08-Jun-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Remove tx task spinning added in r333686 This caused issues with PASTE. Just remove the reschedule since the DELAY() should be enough for use cases such as pkt-gen which were failing before the change. Reported by: Michio Honda Sponsored by: Limelight Networks
|
#
a06424dd |
|
07-Jun-2018 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Record TCP checksum info in iflib when TCP checksum is requested ixl(4) (when it switches over to using iflib) devices need the TCP header length in order to do TCP checksum offload. Reviewed by: gallatin@, shurd@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15558
|
#
3e0e6330 |
|
29-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: mark irq allocation name parameter as constant The *name parameter passed to iflib_irq_alloc_generic and iflib_softirq_alloc_generic is never modified. Many places in code pass string literals and thus should not be modified. Mark the *name parameter as a const char * instead, so that we enforce that the name is not modified before passing to bus_describe_intr() Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: kmacy Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15343
|
#
6c3c3194 |
|
29-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
iflib: hold context lock across detach for drivers that need it
|
#
1d7ef186 |
|
25-May-2018 |
Eric Joyner <erj@FreeBSD.org> |
iflib: Add new shared flag: IFLIB_ADMIN_ALWAYS_RUN ixl(4)'s nvmupdate utility expects the nvmupdate process to run while the interface is down; these nvm update commands use the admin queue, so the admin queue needs to be able to generate interrupts and be processed while the interface is down. So add a flag that ixl(4) sets that lets the entire admin task run even when the interface is marked down/IFF_DRV_RUNNING isn't set. With this change, nvmupdate should function like it did pre-iflib. Reviewed by: gallatin@, sbruno@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15575
|
#
f6cb0dea |
|
19-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
net: fix uninitialized variable warning
|
#
46d0f824 |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
net: fix set but not used
|
#
2aa6f526 |
|
16-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Fix !netmap build post r333686 Approved by: sbruno
|
#
5ee36c68 |
|
16-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Work around lack of TX IRQs in iflib for netmap When poll() is called via netmap, txsync is initially called, and if there are no available buffers to reclaim, it waits for the driver to notify of new buffers. Since the TX IRQ is generally not used in iflib drivers, this ends up causing a timeout. Work around this by having the reclaim DELAY(1) if it's initially unable to reclaim anything, then schedule the tx task, which will spin by continuously rescheduling itself until some buffers are reclaimed. In general, the delay is enough to allow some buffers to be reclaimed, so spinning is minimized. Reported by: Johannes Lundberg <johalun0@gmail.com> Reviewed by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15455
|
#
09f6ff4f |
|
11-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
iflib(9): Add support for cloning pseudo interfaces Part 3 of many ... The VPC framework relies heavily on cloning pseudo interfaces (vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc). This pulls in that piece. Some ancillary changes get pulled in as a side effect. Reviewed by: shurd@ Approved by: sbruno@ Sponsored by: Joyent, Inc. Differential Revision: https://reviews.freebsd.org/D15347
|
#
ac88e6da |
|
08-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: print message when iflib_tx_structures_setup fails Print a message when iflib_tx_structures_setup fails, like we do for iflib_rx_structures_setup. Now that we always print a message from within iflib_qset_structures_setup when it fails, stop printing one in iflib_device_register() at the call site. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: gallatin MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15300
|
#
6108c013 |
|
08-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: cleanup queues when iflib_device_register fails Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: gallatin MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15299
|
#
1f7ce05d |
|
07-May-2018 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Fix an off-by-one error when deciding to request a tx interrupt The canonical check for whether or not a ring is drainable is TXQ_AVAIL() > MAX_TX_DESC() + 2. Use this same construct here, in order to avoid a potential off-by-one error where we might otherwise fail to request an interrupt. Reviewed by: mmacy Sponsored by: Netflix
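The boundary behaviour of the check quoted above can be made concrete with a small predicate. This is only a sketch: `ex_txq_drainable()` mirrors the comparison stated in the commit message with plain integers, and `ex_txq_needs_intr()` is a hypothetical complement, written as an assumption about how a second check could stay consistent with the first rather than as a copy of the iflib logic.

```c
/* The comparison as stated: strictly more than max_tx_desc + 2 available. */
int
ex_txq_drainable(int avail, int max_tx_desc)
{
	return (avail > max_tx_desc + 2);
}

/*
 * Hypothetical second check written as the exact complement of the first,
 * so the two can never disagree when avail == max_tx_desc + 2.
 */
int
ex_txq_needs_intr(int avail, int max_tx_desc)
{
	return (!ex_txq_drainable(avail, max_tx_desc));
}
```

An off-by-one of the kind the commit mentions shows up exactly at `avail == max_tx_desc + 2`, where a check written with `>=` or without the `+ 2` would give a different answer from the canonical one.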
|
#
94618825 |
|
05-May-2018 |
Mark Johnston <markj@FreeBSD.org> |
Add netdump support to iflib. em(4) and igb(4) were tested by me, and ixgbe(4) and bnxt(4) were tested by sbruno. Reviewed by: mmacy, shurd MFC after: 1 month Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D15262
|
#
1ae4848c |
|
04-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
fix gcc8 warnings Approved by: sbruno
|
#
b89827a0 |
|
04-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: fix invalid free during queue allocation failure In r301567, code was added to cleanup to prevent memory leaks for the Tx and Rx ring structs. This code carefully tracked txq and rxq, and made sure to free them properly during cleanup. Because we assigned the txq and rxq pointers into the ctx->ifc_txqs and ctx->ifc_rxqs, we carefully reset these pointers to NULL, so that cleanup code would not accidentally free the memory twice. This was changed by r304021 ("Update iflib to support more NIC designs"), which removed this resetting of the pointers to NULL, because it re-used the txq and rxq pointers as an index into the queue set array. Unfortunately, the cleanup code was left alone. Thus, if we fail to allocate DMA or fail to configure the queues using the driver's ifdi methods, we will attempt to free txq and rxq. These variables would now incorrectly point to the wrong location, resulting in a page fault. There are a number of methods to correct this, but ultimately the root cause was that we reuse the txq and rxq pointers for two different purposes. Instead, when allocating, store the returned pointer directly into ctx->ifc_txqs and ctx->ifc_rxqs. Then, assign this to txq and rxq as index pointers before starting the loop to allocate each queue. Drop the cleanup code for txq and rxq, and only use ctx->ifc_txqs and ctx->ifc_rxqs. Thus, we no longer need to free txq or rxq under any error flow, and instead rely solely on the pointers stored in ctx->ifc_txqs and ctx->ifc_rxqs. This prevents the invalid free(), and ensures that we still properly cleanup after ourselves as before when failing to allocate. Submitted by: Jacob Keller Reviewed by: gallatin, sbruno Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D15285
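The ownership pattern described above (the context owns the array, the loop variable is only a cursor, and cleanup frees through the owner) can be sketched in userland C. The names `ex_ctx`, `ex_txq` and `ex_queues_alloc()` are hypothetical, `calloc()`/`free()` stand in for the kernel allocator, and the `fail_at` parameter exists only to simulate a per-queue setup failure.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct ex_txq {
	int	id;
};

struct ex_ctx {
	struct ex_txq	*ifc_txqs;	/* sole owner of the queue array */
};

/*
 * Allocate and set up n queues.  fail_at simulates a setup failure at that
 * index; pass -1 for no failure.  Returns 0 on success or ENOMEM.
 */
int
ex_queues_alloc(struct ex_ctx *ctx, int n, int fail_at)
{
	struct ex_txq *txq;
	int i;

	/* Store the allocation directly in the owner. */
	ctx->ifc_txqs = calloc((size_t)n, sizeof(*ctx->ifc_txqs));
	if (ctx->ifc_txqs == NULL)
		return (ENOMEM);

	/* txq is only a cursor into the array; it is never freed. */
	for (i = 0, txq = ctx->ifc_txqs; i < n; i++, txq++) {
		if (i == fail_at)
			goto fail;
		txq->id = i;
	}
	return (0);

fail:
	/* Free through the owner, not through the advanced cursor. */
	free(ctx->ifc_txqs);
	ctx->ifc_txqs = NULL;
	return (ENOMEM);
}
```

Freeing `txq` in the `fail` path would pass a pointer into the middle of the array to `free()`, which is the invalid free the commit removes.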
|
#
4d613f5d |
|
04-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: remove unused brscp pointer from iflib_queues_alloc This pointer was no longer written to as of r315217. Since nothing writes to the variable, remove it. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: gallatin, kmacy, sbruno Differential Revision: https://reviews.freebsd.org/D15284
|
#
aa8a24d3 |
|
03-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Allow iflib NIC drivers to sleep rather than busy wait Since the move to SMP, NIC driver locking has had to go through serious contortions using mtx around long running hardware operations. This moves iflib past that. Individual drivers may now sleep when appropriate. Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: shurd Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14983
|
#
f7594707 |
|
30-Apr-2018 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Fix iflib_encap() EFBIG handling bugs 1) Don't give up if m_collapse() fails. Rather than giving up, try m_defrag() immediately. 2) Fix a leak where, if the NIC driver rejected the defrag'ed chain as having too many segments, we would fail to free the chain. Reviewed by: Matthew Macy <mmacy@mattmacy.io> (this version of patch) Submitted by: Matthew Macy <mmacy@mattmacy.io> (early version of leak fix)
|
#
0b75ac77 |
|
18-Apr-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: Fix queue distribution when there are no threads Previously, if there are no threads, all queues which targeted cores that share an L2 cache were bound to a single core. The intent is to distribute them across these cores. Reported by: olivier Reviewed by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15120
|
#
7b610b60 |
|
12-Apr-2018 |
Sean Bruno <sbruno@FreeBSD.org> |
Restore r332389 after resolution of locking fixes. Add one extra lock initialization to iflib_register() that was missed in the git<->phab conversion. Split out flag manipulation from general context manipulation in iflib To avoid blocking on the context lock in the swi thread and risk potential deadlocks, this change protects lighter weight updates that only need to be consistent with each other with their own lock. Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: shurd Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14967
|
#
2ff91c17 |
|
12-Apr-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: align codebase to the current upstream (commit id 3fb001303718146) Changelist: - Turn tx_rings and rx_rings arrays into arrays of pointers to kring structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib, vtnet and ptnet drivers to cope with the change. - Generalize the nm_config() callback to accept a struct containing many parameters. - Introduce NKR_FAKERING to support buffer sharing (used for netmap pipes) - Improved API for external VALE modules. - Various bug fixes and improvements to the netmap memory allocator, including support for externally (userspace) allocated memory. - Refactoring of netmap pipes: now linked rings share the same netmap buffers, with a separate set of kring pointers (rhead, rcur, rtail). Buffer swapping does not need to happen anymore. - Large refactoring of the control API towards an extensible solution; the goal is to allow the addition of more commands and extension of existing ones (with new options) without the need of hacks or the risk of running out of configuration space. A new NIOCCTRL ioctl has been added to handle all the requests of the new control API, which cover all the functionalities so far supported. The netmap API bumps from 11 to 12 with this patch. Full backward compatibility is provided for the old control command (NIOCREGIF), by means of a new netmap_legacy module. Many parts of the old netmap.h header have now been moved to netmap_legacy.h (included by netmap.h). Approved by: hrs (mentor)
|
#
66def526 |
|
11-Apr-2018 |
Mateusz Guzik <mjg@FreeBSD.org> |
iflib: fix up a mismerge in r332419 Led to crashes on boot while in ifconfig. Submitted by: Matthew Macy <mmacy@mattmacy.io>
|
#
90d72813 |
|
11-Apr-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Properly initialize ifc_nhwtxqs. Also, since ifc_nhwrxqs is only used in one place, remove it from the struct. This was preventing iflib_dma_free() from being called via iflib_device_detach(). Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: shurd Sponsored by: Limelight Networks
|
#
7feb8819 |
|
11-Apr-2018 |
Sean Bruno <sbruno@FreeBSD.org> |
Revert r332389 as it is causing panics for various users and we need to add some more test cases.
|
#
5c1d8c4b |
|
10-Apr-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Split out flag manipulation from general context manipulation in iflib To avoid blocking on the context lock in the swi thread and risk potential deadlocks, this change protects lighter weight updates that only need to be consistent with each other with their own lock. Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: shurd Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14967
|
#
541d96aa |
|
30-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Use an accessor function to access ifr_data. This fixes 32-bit compat (no ioctl command definitions are required as struct ifreq is the same size). This is believed to be sufficient to fully support ifconfig on 32-bit systems. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Relnotes: yes Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14900
|
#
18628b74 |
|
25-Mar-2018 |
Mark Johnston <markj@FreeBSD.org> |
Clamp IFLIB_RX_COPY_THRESH to MHLEN in iflib_rxd_pkt_get(). If one has added fields to struct mbuf such that MHLEN is smaller than this threshold (128), iflib_rxd_pkt_get() may otherwise overrun the internal mbuf buffer while copying. Reviewed by: mmacy MFC after: 3 days Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14843
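The clamp described above can be sketched in userland C. This is only an illustration: `DEMO_RX_COPY_THRESH`, `demo_copy_limit()` and `demo_should_copy()` are hypothetical names, and the `mhlen` parameter stands in for the kernel's MHLEN; the actual change in iflib_rxd_pkt_get() may be structured differently.

```c
#include <stddef.h>

/* Hypothetical stand-in for IFLIB_RX_COPY_THRESH (128 per the commit message). */
#define DEMO_RX_COPY_THRESH 128

/*
 * Return the largest payload that may be copied into the mbuf's internal
 * buffer: the threshold, clamped so it never exceeds the space available
 * (mhlen plays the role of MHLEN here).
 */
size_t
demo_copy_limit(size_t mhlen)
{
	return (mhlen < DEMO_RX_COPY_THRESH ? mhlen : DEMO_RX_COPY_THRESH);
}

/* A packet takes the copy path only if it fits within the clamped limit. */
int
demo_should_copy(size_t pkt_len, size_t mhlen)
{
	return (pkt_len <= demo_copy_limit(mhlen));
}
```

With an internal buffer of 96 bytes, a 100-byte packet is no longer copied, which is the overrun the commit guards against.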
|
#
226fb85d |
|
02-Mar-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: stop timer callout when stopping iflib_timer has been seen running after the interface had been removed. This change prevents that. Submitted by: matt.macy@joyent.com
|
#
7cb7c6e3 |
|
20-Feb-2018 |
Navdeep Parhar <np@FreeBSD.org> |
Catch up with the removal of nktr_slot_flags from upstream netmap. No functional impact intended. Submitted by: Vincenzo Maffione <v.maffione@gmail.com>
|
#
a4e59607 |
|
20-Feb-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
IFLIB: do not remove dmamap on buffer unload Dmamap is created only on IFC attach. If we remove it on buffer release, we won't be able to do ifconfig down&up. Only destroy when in detach. Reported by: wma Reviewed by: wma Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14060
|
#
ac2fffa4 |
|
21-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197
|
#
44313341 |
|
15-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
net*: make some use of mallocarray(9). Focus on code where we are doing multiplications within malloc(9). None of these are likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the surrounding code could handle NULL values. X-Differential revision: https://reviews.freebsd.org/D13837
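For readers unfamiliar with the idea, an overflow-checked array allocation can be sketched in userland C as follows. `demo_alloc_array()` is a hypothetical name used for illustration; it is not the kernel's mallocarray(9) and does not take its malloc type or flags arguments.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Userland sketch of an overflow-checked array allocation.
 * Returns NULL if nmemb * size would wrap around SIZE_MAX,
 * otherwise behaves like malloc(nmemb * size).
 */
void *
demo_alloc_array(size_t nmemb, size_t size)
{
	/* Reject requests whose total size cannot be represented. */
	if (size != 0 && nmemb > SIZE_MAX / size)
		return (NULL);
	return (malloc(nmemb * size));
}
```

Callers have to handle a NULL return, which matches the commit's note about restricting the sweep to call sites that already cope with NULL.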
|
#
9c58cafa |
|
27-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Don't pass rids to taskqgroup_attach() As everywhere else, we want to pass rman_get_start(irq->ii_res). This caused set affinity errors when not using MSI-X vectors (legacy and MSI interrupts). Reported by: sbruno Sponsored by: Limelight Networks
|
#
ca03863c |
|
27-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Remove assertion that's not true for !EARLY_AP_STARTUP gtask->gt_taskqueue is NULL when EARLY_AP_STARTUP is not enabled. Remove assertion to allow this config to work. Reported by: oleg Sponsored by: Limelight Networks
|
#
de130954 |
|
27-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix indentation. Sponsored by: Limelight Networks
|
#
97755e83 |
|
21-Dec-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix build for kernels with SCHED_4BSD. Sponsored by: The FreeBSD Foundation
|
#
25ac1dd5 |
|
20-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Don't call tcp_lro_rx() unless hardware verified TCP/UDP csum It seems that tcp_lro_rx() doesn't verify TCP checksums, so if there are bad checksums in the packets caused by invalid data, the invalid data will pass through without errors. This was noticed with the igb driver and a specific internet host: fetch http://www.mpfr.org/mpfr-current/mpfr-3.1.6.tar.xz -o test.bin && sha256 test.bin Would result in a different value sometimes. This ends up making LRO require RXCSUM to be enabled, and RXCSUM to support TCP and UDP checksums. PR: 224346 Reported by: gjb Reviewed by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13561
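The gating rule described above, hand a packet to LRO only when the hardware both computed and validated the L4 checksum, might look roughly like this. The flag names and values below are hypothetical placeholders chosen for the sketch, not the kernel's CSUM_* definitions, and the real test in iflib may check additional conditions.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define DEMO_CSUM_L4_CALC	0x1u	/* hardware computed the L4 checksum */
#define DEMO_CSUM_L4_VALID	0x2u	/* the computed checksum was correct */

/*
 * Return nonzero only if both flags are set, i.e. the checksum was
 * calculated by hardware and found valid. Anything else must bypass
 * LRO so the stack can verify the checksum itself.
 */
int
demo_lro_allowed(uint32_t csum_flags)
{
	const uint32_t need = DEMO_CSUM_L4_CALC | DEMO_CSUM_L4_VALID;

	return ((csum_flags & need) == need);
}
```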
|
#
40cf51c4 |
|
19-Dec-2017 |
Li-Wen Hsu <lwhsu@FreeBSD.org> |
Add missing `;` Approved by: kevlo
|
#
b103855e |
|
19-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Support attaching tx queues to cpus This will attempt to use a different thread/core on the same L2 cache when possible, or use the same cpu as the rx thread when not. If SMP isn't enabled, don't go looking for cores to use. This is mostly useful when using shared TX/RX queues. Reviewed by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12446
|
#
96fc97c8 |
|
19-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Update Matthew Macy contact info Email address has changed, uses consistent name (Matthew, not Matt) Reported by: Matthew Macy <mmacy@mattmacy.io> Differential Revision: https://reviews.freebsd.org/D13537
|
#
06c47d48 |
|
11-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Increment encap_pad_mbuf_fail when m_dup() fails in padding Previously, the counter was only incremented when m_append() failed. Since the function can also fail on m_dup() now, increment the counter there as well. Sponsored by: Limelight Networks
|
#
04993890 |
|
08-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Free mbuf chain when m_dup fails Fix memory leak where mbuf chain wasn't free()d if iflib_ether_pad() has a failure in m_dup(). Reported by: "Ryan Stone" <rysto32@gmail.com> Sponsored by: Limelight Networks
|
#
a15fbbb8 |
|
08-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Handle read-only mbufs in iflib ether pad function If ethernet padding is enabled, and a read-only mbuf is passed, it would modify the mbuf using m_append(). Instead, call m_dup() and append to the new packet. Reported by: Pyun YongHyeon Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13414
|
#
d14c853b |
|
05-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
iflib: Support padding Ethernet frames to a minimum size Some bnxt devices do not correctly send frames smaller than 52 bytes (without CRC), so add a quirk that will pad frames to an arbitrary size before passing off to the encap routine. Reported by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13269
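As a rough illustration of the padding quirk, the sketch below zero-pads a flat byte buffer up to a minimum length. It is a simplification under stated assumptions: `demo_pad_frame()` is a hypothetical helper operating on a plain array, whereas the driver works on mbuf chains.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Pad the frame in buf (currently *len bytes, capacity cap) with zeros
 * until it is at least min_len bytes long.
 * Returns 0 on success, -1 if the buffer cannot hold min_len bytes.
 */
int
demo_pad_frame(unsigned char *buf, size_t *len, size_t cap, size_t min_len)
{
	assert(buf != NULL && len != NULL);

	if (*len >= min_len)
		return (0);		/* already long enough */
	if (cap < min_len)
		return (-1);		/* no room to pad */
	memset(buf + *len, 0, min_len - *len);
	*len = min_len;
	return (0);
}
```

For the bnxt case in the commit message, a caller would pass 52 as `min_len`.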
|
#
fe1bcada |
|
05-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Avoid calling CURVNET_[SET|RESTORE] for each packet The LRO possible test was calling CURVNET_SET once for IPv4 or IPv6 for each packet in a chain. Only call it once per chain instead. Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: cem, ae Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13368
|
#
a027c8e9 |
|
01-Dec-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Add support for SIOCGIFXMEDIA to iflib SIOCGIFXMEDIA is required for extended ethernet media types, but iflib did not support it. Reported by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13312
|
#
772593db |
|
29-Nov-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix comment introduced in r326369 The code uses the set of all CPUs, it doesn't zero out the set. Sponsored by: Limelight Networks
|
#
e516b535 |
|
29-Nov-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Ensure that ctx->ifc_cpus is always initialized If a device didn't support MSI-X, ctx->ifc_cpus would not be initialized, but the IRQ allocation routines still use the value. Move the initialization to common code. Sponsored by: Limelight Networks
|
#
7274b2f6 |
|
20-Nov-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix off-by-one error in bit_nclear() usage bit_nclear() takes the bit numbers for the start and end bits, not the start and a count. This was resulting in memory corruption past the end of the bitstr_t. Sponsored by: Limelight Networks
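The distinction between an inclusive end bit and a count can be shown with a small userland stand-in. `demo_nclear()` below is a hypothetical helper that mirrors the (start, stop) convention the commit describes for bit_nclear(); it is not the bitstring(3) macro itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Clear bits start..stop, both inclusive, in a byte-array bitmap.
 * This mirrors the (start, stop) calling convention: the last
 * argument is a bit number, not a count.
 */
void
demo_nclear(unsigned char *bits, size_t start, size_t stop)
{
	assert(start <= stop);
	for (size_t i = start; i <= stop; i++)
		bits[i / 8] &= (unsigned char)~(1u << (i % 8));
}

/*
 * Clear `count` bits beginning at `start`. The last bit to clear is
 * start + count - 1; passing `count` itself as the stop bit would
 * touch the wrong range and can run past the end of the bitmap.
 */
void
demo_clear_count(unsigned char *bits, size_t start, size_t count)
{
	if (count == 0)
		return;
	demo_nclear(bits, start, start + count - 1);
}
```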
|
#
d2735264 |
|
16-Nov-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix default numbers of iflib queue sets The intent appears to be having one RX/TX queue set per core, but since scctx->isc_n[tr]xqsets is set to max before calling iflib_msix_init(), both end up being set to total number of cores. Use ctx->ifc_sysctl_n[rt]xqs as the selected value and scctx->isc_n[rt]xqsets as the max. This should result in what appears to be the intended behaviour Reviewed by: sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D13096
|
#
abec4724 |
|
06-Nov-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Fix NOINET/NOINET6 build during compilation of iflib. Reported by: kib
|
#
35e4e998 |
|
06-Nov-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Only chain non-LRO mbufs when LRO is not possible Preserve packet order between tcp_lro_rx() and if_input() to avoid creating extra corner cases. If no packets can be LROed, combine them into one chain for submission via if_input(). If any packet can potentially be LROed however, retain old behaviour and call if_input() for each packet. This should keep the 12% improvement for small packet forwarding intact, but mostly avoids impacting the LRO case. Reviewed by: cem, sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12876
|
#
0b6c52b6 |
|
31-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Preserve TSO checksum flags r323941 incorrectly disabled TSO flags based on MTU. Reported by: Yuri Pankov <yuripv@gmx.com> Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12880
|
#
a1b799ca |
|
31-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix PR221990 - Assertion at iflib.c:1947 ifl_pidx and ifl_credits are going out of sync in _iflib_fl_refill() as they use different update logic. Use the same update logic for both, and add a final call to isc_rxd_refill() to handle early exits from the loop. PR: 221990 Reported by: pho Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12798
|
#
10e0d938 |
|
30-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix build with nodevice netmap iru_init() was declared and used outside the DEV_NETMAP conditional blocks, but was implemented inside one. Move the implementation out of the DEV_NETMAP block to allow building with netmap disabled. Reported by: Andrew Turner <andrew@fubar.geek.nz> Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12842
|
#
09b57b7f |
|
30-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
bnxt: HW_LRO Rx Pkt with > 32 fragments caused Crash (iflib) Broadcom NIC with HW_LRO setting max_agg_segs >= 6 can generate Rx pkt with 64 (2^6) fragments, modify IFLIB_MAX_RX_SEGS to 64 to avoid memory corruption / Crash. Submitted by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed by: shurd, sbruno Approved by: sbruno (mentor) Sponsored by: Broadcom Limited Differential Revision: https://reviews.freebsd.org/D12774
|
#
2d873474 |
|
30-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix PR222744 - netmap errors with iflib em driver Fix error when refilling netmap buffers that resulted in the first buffer of the successive passes through ifl_bus_addrs[] leaving the first value unset (tmp_pidx started at 1, not zero after the first time through the loop). Leave the one unused buffer required by some NICs visible in the netmap ring rather than hidden. There will always be a buffer in use by the kernel now when an iflib driver is used via netmap. Always get the netmap slot index via netmap_idx_n2k() to account for nkr_hwofs in a consistent way. Split shared functionality into new functions. iru_init(): shared by _iflib_fl_refill() and netmap_fl_refill() netmap_fl_refill(): shared by iflib_netmap_rxsync() and iflib_netmap_rxq_init() PR: 222744 Reported by: Shirkdog <mshirk@daemon-security.com> Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12769
|
#
0fdea539 |
|
30-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Avoid enabling MSI-X if MSI-X is disabled globally It was reported on the community call that with hw.pci.enable_msix=0, iflib would enable MSI-X on the device and attempt to use it, which caused issues. Test the sysctl explicitly and do not enable MSI-X if it's disabled globally. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12805
|
#
3429c02f |
|
23-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Some cache related optimizations 1. prefetch 128 bytes of mbufs. 2. Re-order filling the pkt_info so cache stalls happen at the end 3. Define empty prefetch2cachelines() macro when the function isn't present. Provides small performance improvements on some hardware Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12447
|
#
1c0054d2 |
|
05-Oct-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix "taskqgroup_attach: setaffinity failed: 3" with iflib drivers Improved logging added in r323879 exposed an error during attach. We need the irq, not the rid to work correctly. em uses shared irqs, so it will use the same irq for TX as RX. bnxt does not use shared irqs, or TX irqs at all, so there's no need to set the TX irq affinity. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12496
|
#
1225d9da |
|
23-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Have ifmp_ring_enqueue() abdicate instead of switch to a consumer Move TX out of the enqueue() path. As a result, we need to have ifmp_ring_check_drainage() pick up from the abdicate state. We also need to either enqueue the TX task, or check drainage after calling ifmp_ring_enqueue() to ensure it's sent. This change results in a 30% small packet forwarding improvement. Reviewed by: olivier, sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12439
|
#
f4d2154e |
|
22-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Make the rx budget a tunable This allows tuning the rx budget for special load profiles as well as more easily testing to determine sane defaults. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12445
|
#
20f63282 |
|
22-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Chain mbufs before passing to if_input() Build a list of mbufs to pass to if_input() after LRO. Results in 12% small packet forwarding rate improvement. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12444
|
#
c5cf2172 |
|
22-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Some small packet performance improvements If the packet is smaller than MTU, disable the TSO flags. Move TCP header parsing inside the IS_TSO?() test. Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need to be zeroed before TX. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12442
|
#
d0d0ad0a |
|
20-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Fix iflib netmap RX RXQ setup for netmap was broken because netmap_rxq_init was getting called before IFDI_INIT - thus we ended up with ring tail pointer being reset to zero. Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12140
|
#
ab2e3f79 |
|
15-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Revert r323516 (iflib rollup) This was really too big of a commit even if everything worked, but there are multiple new issues introduced in the one huge commit, so it's not worth keeping this until it's fixed. I'll work on splitting this up into logical chunks and introduce them one at a time over the next week or two. Approved by: sbruno (mentor) Sponsored by: Limelight Networks
|
#
d300df01 |
|
12-Sep-2017 |
Stephen Hurd <shurd@FreeBSD.org> |
Roll up iflib commits from github. This pulls in most of the work done by Matt Macy as well as other changes which he has accepted via pull request to his github repo at https://github.com/mattmacy/networking/ This should bring -CURRENT and the github repo into close enough sync to allow small feature branches rather than a large chain of interdependent patches being developed out of tree. The rest of the synchronization should be able to be completed on github by splitting the remaining changes that are not yet ready into short feature branches for later review as smaller commits. Here is a summary of changes included in this patch: 1) More checks when INVARIANTS are enabled for earlier problem detection 2) Group Task Queue cleanups - Fix use of duplicate shortdesc for gtaskqueue malloc type. Some interfaces such as memguard(9) use the short description to identify malloc types, so duplicates should be avoided. 3) Allow gtaskqueues to use ithreads in addition to taskqueues - In some cases, this can improve performance 4) Better logging when taskqgroup_attach*() fails to set interrupt affinity. 5) Do not start gtaskqueues until they're needed 6) Have mp_ring enqueue function enter the ABDICATED rather than BUSY state. This moves the TX to the gtaskq and allows processing to continue faster as well as make TX batching more likely. 7) Add an ift_txd_errata function to struct if_txrx. This allows drivers to inspect/modify mbufs before transmission. 8) Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need checksums zeroed for checksum offload to work. This avoids modifying packet data in the TX path when possible. 9) Use ithreads for iflib I/O instead of taskqueues 10) Clean up ioctl and support async ioctl functions 11) Prefetch two cachelines from each mbuf instead of one up to 128B. We often need to parse packet header info beyond 64B. 12) Fix potential memory corruption due to fence post error in bit_nclear() usage.
13) Improved hang detection and handling 14) If the packet is smaller than MTU, disable the TSO flags. This avoids extra packet parsing when not needed. 15) Move TCP header parsing inside the IS_TSO?() test. This avoids extra packet parsing when not needed. 16) Pass chains of mbufs that are not consumed by lro to if_input() rather than calling if_input() for each mbuf. 17) Re-arrange packet header loads to get as much work as possible done before a cache stall. 18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/ IFDI_DETACH(); 19) Attempt to distribute RX/TX tasks across cores more sensibly, especially when RX and TX share an interrupt. RX will attempt to take the first threads on a core, and TX will attempt to take successive threads. 20) Allow iflib_softirq_alloc_generic() to request affinity to the same cpus an interrupt has affinity with. This allows TX queues to ensure they are serviced by the socket the device is on. 21) Add new iflib sysctls to net.iflib: - timer_int - interval at which to run per-queue timers in ticks - force_busdma 22) Add new per-device iflib sysctls to dev.X.Y.iflib - rx_budget allows tuning the batch size on the RX path - watchdog_events Count of watchdog events seen since load 23) Fix error where netmap_rxq_init() could get called before IFDI_INIT() 24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY when waiting for firmware - After interrupts are enabled, convert all waits to sleeps - Eliminates e1000 software/firmware synchronization busy waits after startup 25) e1000: Remove special case for budget=1 in em_txrx.c - Premature optimization which may actually be incorrect with multi-segment packets 26) e1000: Split out TX interrupt rather than share an interrupt for RX and TX.
- Allows better performance by keeping RX and TX paths separate 27) e1000: Separate igb from em code where suitable Much easier to understand separate functions and "if (is_igb)" than previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))" #blamebruno Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12235
|
#
2cc3b2ee |
|
31-Aug-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Do not abuse flag that is clearly marked as unused. This creates conflicts with FreeBSD variations that may use it. The usage of the flag M_TOOBIG is limited to iflib queue, thus using one of M_PROTO flags is fine. There is no need to grab global flag. Silence from: kmacy, sbruno (2 weeks)
|
#
a9693502 |
|
30-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Revert r323008 and its conversion of e1000/iflib to using SX locks. This seems to be missing something on the 82574L causing NFS root mounts to hang. Reported by: kib
|
#
e17e5b41 |
|
29-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Continuation of lock cleanup in e1000. Post-cold sleep instead of DELAY when waiting for firmware. Convert softc mutex to an SX lock. Change all waits to sleeps once interrupts are enabled (and it is safe to sleep). Submitted by: Matt Macy <matt@mattmacy.io> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12101
|
#
21e10b16 |
|
23-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
iflib: call device's if_init function during vlan initialization. Submitted by: bhargava.marreddy@broadcom.com Reviewed by: shurd Sponsored by: Broadcom Differential Revision: https://reviews.freebsd.org/D12098
|
#
5c5ca36c |
|
09-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Don't leak mbufs if clusters exceeds the number of segments. This would leak mbufs over time causing crashes. PR: 221202 Submitted by: Matt Macy <matt@mattmacy.io> Reported by: gergely.czuczy@harmless.hu Sponsored by: Limelight Networks
|
#
18a660b3 |
|
09-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Export IFCAP_HWSTATS so that we don't experience double stats counting on iflib enabled devices. PR: 220198 Submitted by: Matt Macy <matt@mattmacy.io> Reported by: Ben Woods <woodsb02@freebsd.org> Sponsored by: Limelight Networks
|
#
9d35858f |
|
27-Jul-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Slight restructure of iflib_busdma_load_mbuf_sg() to fix accounting when m_collapse() fails. Submitted by: krzystof.galazka@intel.com Reviewed by: Jeb Cramer <cramerj@intel.com> Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D11476
|
#
9d0a88de |
|
20-Jul-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Fix printf format warning in iflib.c Clang 5.0.0 got better warnings about printf format strings using %zd, and this leads to the following -Werror warning on e.g. arm: sys/net/iflib.c:1517:8: error: format specifies type 'ssize_t' (aka 'int') but the argument has type 'bus_size_t' (aka 'unsigned long') [-Werror,-Wformat] sctx->isc_tx_maxsize, nsegments, sctx->isc_tx_maxsegsize); ^~~~~~~~~~~~~~~~~~~~ sys/net/iflib.c:1517:41: error: format specifies type 'ssize_t' (aka 'int') but the argument has type 'bus_size_t' (aka 'unsigned long') [-Werror,-Wformat] sctx->isc_tx_maxsize, nsegments, sctx->isc_tx_maxsegsize); ^~~~~~~~~~~~~~~~~~~~~~~ Fix this by casting bus_size_t arguments to uintmax_t, and using %ju instead. Reviewed by: emaste MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D11679
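The cast-and-%ju idiom described in this message can be sketched in portable C. `demo_bus_size_t` and `demo_format_size()` are hypothetical names standing in for bus_size_t and the real printf call site in iflib.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for bus_size_t, whose width varies by architecture. */
typedef unsigned long demo_bus_size_t;

/*
 * Format a size portably: cast to uintmax_t and print with %ju, so the
 * format specifier matches the argument type on every platform.
 * Returns the number of characters snprintf() would have written.
 */
int
demo_format_size(char *buf, size_t buflen, demo_bus_size_t maxsize)
{
	assert(buf != NULL);
	return (snprintf(buf, buflen, "%ju", (uintmax_t)maxsize));
}
```

The same pattern applies to any typedef'd integer whose underlying type differs between architectures.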
|
#
dcbc025f |
|
19-Jul-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Don't cache mbuf pointers if the number of descriptors is greater than the number of buffers. Submitted by: Matt Macy <mmacy@mattmacy.io> Sponsored by: Limelight Networks
|
#
25d52811 |
|
03-Jul-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
iflib - iflib_busdma_load_mbuf_sg used isc_tx_maxsize as max segment size. Submitted by: krzysztof.galazka@intel.com Differential Revision: https://reviews.freebsd.org/D11403
|
#
87890dba |
|
03-Jul-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
bnxt(4) Enable LRO support, redux iflib - reset fl->ifl_fragidx to 0 on iflib_fl_bufs_free(). This caused the panic in em/igb when adding it to a bridge device. iflib - Handle out of order packet delivery from hardware in support of LRO Out of order updates to rxd's were addressed in r315217, but the fix was incomplete. While refilling the buffers, iflib does not consider the out of order descriptors and refills sequentially: the "idx" variable in the _iflib_fl_refill routine is incremented sequentially. Refilling sequentially overwrites the SGEs that are *IN USE* by other connections. The fix is to maintain a bitmap of rx descriptors that distinguishes used descriptors from unused ones, and to refill only at the unused indices. This patch also fixes a few bugs in bnxt, related to the same feature. Submitted by: bhargava.marreddy@broadcom.com Reviewed by: venkatkumar.duvvuru@broadcom.com shurd Differential Revision: https://reviews.freebsd.org/D10681
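The bitmap approach described above, refilling only at indices not marked in use, can be illustrated with a simplified userland sketch. `demo_refill()` is hypothetical, and a single 32-bit word stands in for the bitmap (so the ring is assumed to have at most 32 slots); the real _iflib_fl_refill() works on hardware descriptor rings and differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Refill up to `want` free slots of a ring with `ring_size` entries.
 * Bit i set in *in_use means slot i still holds a buffer owned by an
 * in-flight packet and must not be overwritten.
 * Indices that were refilled are stored in out_idx; the count is returned.
 */
size_t
demo_refill(uint32_t *in_use, size_t ring_size, size_t want, size_t *out_idx)
{
	size_t n = 0;

	assert(ring_size <= 32);
	for (size_t i = 0; i < ring_size && n < want; i++) {
		if (*in_use & (UINT32_C(1) << i))
			continue;		/* slot busy: skip it */
		*in_use |= UINT32_C(1) << i;	/* mark as filled */
		out_idx[n++] = i;
	}
	return (n);
}
```

A purely sequential refill would have written to slots 0 and 2 in a ring where those are still busy; the bitmap scan skips them and picks the next free indices instead.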
|
#
fa5416a8 |
|
17-Jun-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Revert r319989 "bnxt(4) Enable LRO support" This generates startup LORs and panics when adding elements to bridge devices. I will document further in https://reviews.freebsd.org/D10681 PR: 220073 Submitted by: dchagin Reported by: db
|
#
51a621f7 |
|
15-Jun-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
bnxt(4) Enable LRO support iflib - Handle out of order packet delivery from hardware in support of LRO Out of order updates to rxd's were addressed in r315217, but the fix was incomplete. While refilling the buffers, iflib does not consider the out of order descriptors and refills sequentially: the "idx" variable in the _iflib_fl_refill routine is incremented sequentially. Refilling sequentially overwrites the SGEs that are *IN USE* by other connections. The fix is to maintain a bitmap of rx descriptors that distinguishes used descriptors from unused ones, and to refill only at the unused indices. This patch also fixes a few bugs in bnxt, related to the same feature. Submitted by: bhargava.marreddy@broadcom.com Reviewed by: shurd@ Differential Revision: https://reviews.freebsd.org/D10681
|
#
3600bd1e |
|
15-Jun-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Revert r319921 which seems to cause NFS booting assertion panics in various configurations. Reported by: pho@
|
#
f7587db0 |
|
13-Jun-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Add new sysctl to allow changing the timing of the txq timers. Add new sysctl to override use of busdma in the driver. Submitted by: Drew Gallatin <gallatin@netflix.com>
|
#
aa8fa07c |
|
13-Jun-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Plug mbuf leak in the busdma path of iflib. Submitted by: Michael Tuexen <tuexen@freebsd.org> Reported by: Drew Gallatin <gallatin@netflix.com>
|
#
1d898b91 |
|
09-May-2017 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Adjust a comment. MFC after: 3 days
|
#
60596476 |
|
06-Apr-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Move pause frame counter out of struct if_ctx and into struct if_softc_ctx_t so that we can use it in iflib to detect pause frames. The igb(4) driver definitely used to use this in its old timer function and I see no reason to restrict it to that driver only. Sponsored by: Limelight Networks
|
#
ea351d3f |
|
04-Apr-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Allow MSIX to be turned off by tuneable per interface, per driver. Sponsored by: Limelight Networks
|
#
2b2fc973 |
|
30-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Don't call init functions directly from the timer/watchdog function. Enqueue this in the admin task now that it can process it. Submitted by: Matt Macy <mmacy@nextbsd.org> Sponsored by: Limelight Networks
|
#
5c1ff255 |
|
30-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Assert IFF_DRV_OACTIVE in iflib_timer() when the "hung" case is detected so that iflib's admin task can still process the reset directive and restore functionality. Sponsored by: Limelight Networks
|
#
5e888388 |
|
14-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Change casting to a uintptr_t to be compatible with non-x86 architectures. Submitted by: Matt Macy <mmacy@nextbsd.org> Reported by: rpokala Sponsored by: Limelight Networks
|
#
0a1b74a3 |
|
14-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Fixup LINT by using uint64_t type as we do on all other calls to PNMB() Found with Jenkins. Reported by: lwshu Sponsored by: Limelight Networks
|
#
95246abb |
|
13-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
IFLIB updates - unconditionally enable BUS_DMA on non-x86 architectures - speed up rxd zeroing via customized function - support out of order updates to rxd's - add prefetching to hardware descriptor rings - only prefetch on 10G or faster hardware - add separate tx queue intr function - preliminary rework of NETMAP interfaces, WIP Submitted by: Matt Macy <mmacy@nextbsd.org> Sponsored by: Limelight Networks
|
#
d945ed64 |
|
01-Mar-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Make gtaskqueue compatible with drm-next such that they can be used with the linuxkpi tasklets. Submitted by: mmacy@nextbsd.org Reported by: hps
|
#
e099b90b |
|
21-Feb-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: Replace zero with NULL for pointers. Found with: devel/coccinelle MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D9694
|
#
67af525c |
|
04-Feb-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Delete duplicate break.
|
#
835809f9 |
|
28-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Fix i386 compile failure by moving needed closing parenthesis out of conditional block. Submitted by: hiren Reported by: cy
|
#
e035717e |
|
27-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
IFLIB updates: We found routing performance dropped significantly when configuring FreeBSD as a router; we are applying the following changes in order to resolve those issues and hopefully perform better. - don't prefetch the flags array, we usually don't need it - prefetch the next cache line of each of the software descriptor arrays as well as the first cache line of each of the next four packets' mbufs and clusters - reduce max copy size to 63 bytes - convert rx soft descriptors from array of structures to a structure of arrays - update copyrights Submitted by: Matt Macy <mmacy@nextbsd.org>
|
#
96eeabef |
|
27-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Replace customized busmaster code with standardized setup call. Reported by: jhb
|
#
69b7fc3e |
|
26-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Minor style annoyance. Submitted by: bde
|
#
f7ae9a84 |
|
25-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Add error checking that the pci_find_cap(, PCIY_MSIX,) call returns success and a good value. Only then try to use it and set the MSIX_ENABLE bit. With the current em(4) driver we have observed failures in this case in a specific environment when pci_find_cap() would not return the assumed value, which meant we ended up writing to PCI register 2 (PCI_DEVICE_ID) which is read-only. PR: 216456 Submitted by: bz
|
#
bd84f700 |
|
24-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
iflib: Add internal tracking of smp startup status to reliably figure out what methods are to be used to get gtaskqueue up and running. e1000: Calculating this pointer gives undefined behaviour when (last == -1) (it is before the buffer). The pointer is always followed. Panics occurred when it points to an unmapped page. Otherwise, the pointed-to garbage tends to not have the E1000_TXD_STAT_DD bit set in it, so in the broken case the loop was usually null and the function just returned, and this was accidentally correct. Submitted by: bde Reported by: Matt Macy <mmacy@nextbsd.org>
|
#
36fa5d5b |
|
24-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Revert 312696 due to build tests.
|
#
562a3182 |
|
24-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
iflib: Add internal tracking of smp startup status to reliably figure out what methods are to be used to get gtaskqueue up and running. e1000: Calculating this pointer gives undefined behaviour when (last == -1) (it is before the buffer). The pointer is always followed. Panics occurred when it points to an unmapped page. Otherwise, the pointed-to garbage tends to not have the E1000_TXD_STAT_DD bit set in it, so in the broken case the loop was usually null and the function just returned, and this was accidentally correct. Submitted by: bde Reviewed by: Matt Macy <mmacy@nextbsd.org>
|
#
4ecb427a |
|
14-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Fix hangs in a uniprocessor configuration (qemu, virtualbox, real hw). sys/net/iflib.c: Add ctx to filter_info and don't skip interrupt early on unless we're on an SMP system. sys/kern/subr_gtaskqueue.c: Skip smp check if we're running UP. Submitted by: Matt Macy <mmacy@nextbsd.org> Reported by: emaste, bde
|
#
5b51fcfc |
|
09-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Remove unused mtx_held() macro.
|
#
1248952a |
|
01-Jan-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
2017 IFLIB updates in preparation for commits to e1000 and ixgbe. - iflib - add checksum in place support (mmacy) - iflib - initialize IP for TSO (going to be needed for e1000) (mmacy) - iflib - move isc_txrx from shared context to softc context (mmacy) - iflib - Normalize checks in TXQ drainage. (shurd) - iflib - Fix queue capping checks (mmacy) - iflib - Fix invalid assert, em can need 2 sentinels (mmacy) - iflib - let the driver determine what capabilities are set and what tx csum flags are used (mmacy) - add INVARIANTS debugging hooks to gtaskqueue enqueue (mmacy) - update bnxt(4) to support the changes to iflib (shurd) Some other various, sundry updates. Slightly more verbose changelog: Submitted by: mmacy@nextbsd.org Reviewed by: shurd MFC after: Sponsored by: LimeLight Networks and Dell EMC Isilon
|
#
da69b8f9 |
|
17-Nov-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
iflib updates and fixes: - reset gen on down - initialize admin task statically - drain mp_ring on down - don't drop context lock on stop - reset error stats on down - fix typo in min_latency sysctl - return ENOBUFS from if_transmit if the driver isn't running or the link is down Submitted by: mmacy@nextbsd.org Reviewed by: shurd MFC after: 2 days Sponsored by: Isilon and Limelight Networks Differential Revision: https://reviews.freebsd.org/D8558
|
#
2fe66646 |
|
18-Oct-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
Set default capabilities at attach. ref: https://github.com/NextBSD/NextBSD/commit/6425f45e5fc89f64925995bbcfc09c7558d896ea Submitted by: mmacy@nextbsd.org
|
#
add6f7d0 |
|
18-Oct-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
When deciding whether or not to call tqg_attach_cpu(), reference rid directly. ref: https://github.com/NextBSD/NextBSD/commit/c9b47b468b8a3350811acfd9e167a8b91dc8f0c6 Submitted by: mmacy@nextbsd.org
|
#
8b2a1db9 |
|
18-Oct-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
Toggle v4/v6 rxcsum together. Only re-init if driver is running. ref: https://github.com/NextBSD/NextBSD/commit/106518e874ec9a61daf4c09894170d24e2f4d60d Submitted by: mmacy@nextbsd.org
|
#
aa3c5dd8 |
|
18-Oct-2016 |
Sean Bruno <sbruno@FreeBSD.org> |
Fix misuse of CPU_FFS when binding queues to cpus ref: https://github.com/NextBSD/NextBSD/commit/922d0bdf2277f30954f143107d2a3eddb02abd2d Submitted by: mmacy@nextbsd.org
|
#
23ac9029 |
|
12-Aug-2016 |
Stephen Hurd <shurd@FreeBSD.org> |
Update iflib to support more NIC designs - Move group task queue into kern/subr_gtaskqueue.c - Change intr_enable to return an int so it can be detected if it's not implemented - Allow different TX/RX queues per set to be different sizes - Don't split up TX mbufs before transmit - Allow a completion queue for TX as well as RX - Pass the RX budget to isc_rxd_available() to allow an earlier return and avoid multiple calls Submitted by: shurd Reviewed by: gallatin Approved by: scottl Differential Revision: https://reviews.freebsd.org/D7393
|
#
f454e7eb |
|
04-Aug-2016 |
John Baldwin <jhb@FreeBSD.org> |
Add __printflike() to bus_describe_intr() to enable -Wformat checks. Fix a few places that were passing a raw string as the format to use a "%s" format string instead. MFC after: 2 months
|
#
91d546a0 |
|
08-Jul-2016 |
Conrad Meyer <cem@FreeBSD.org> |
iflib: Fix typo in 'iflib_rx_miss_bufs' sysctl name It looks like these sysctls were copy-pasted from netmap. Most were changed from 'ixl_' prefix to 'iflib_', but this one was missed. Fix the "can't re-use a leaf (ixl_rx_miss_bufs)!" warning. Reported by: dim@ and others Sponsored by: EMC / Isilon Storage Division
|
#
96c85efb |
|
06-Jul-2016 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Replace a number of conflations of mp_ncpus and mp_maxid with either mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try, for example, to run tasks on CPUs that did not exist or to allocate too few buffers on systems with sparse CPU IDs in which there are holes in the range and mp_maxid > mp_ncpus. Such circumstances generally occur on systems with SMT, but on which SMT is disabled. This patch restores system operation at least on POWER8 systems configured in this way. There are a number of other places in the kernel with potential problems in these situations, but where sparse CPU IDs are not currently known to occur, mostly in the ARM machine-dependent code. These will be fixed in a follow-up commit after the stable/11 branch. PR: kern/210106 Reviewed by: jhb Approved by: re (glebius)
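The distinction between the CPU count and the highest CPU ID can be shown with a small userspace sketch. The topology, the variable names `ncpus` and `maxid`, and the helper functions are hypothetical illustrations; they only mirror the roles of mp_ncpus, mp_maxid and CPU_FOREACH() described above and are not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sparse topology: IDs 0, 2, 4, 6 exist (SMT disabled). */
static const bool cpu_present[] = {
	true, false, true, false, true, false, true
};
static const int ncpus = 4;	/* how many CPUs exist */
static const int maxid = 6;	/* highest valid CPU ID */

/*
 * Size per-CPU storage by the highest ID, not by the count.
 * Sizing by ncpus would make index 4 and index 6 out of bounds.
 */
static int *
alloc_percpu(void)
{
	return (calloc((size_t)maxid + 1, sizeof(int)));
}

/* Visit only CPUs that exist; returns how many were visited. */
static int
bump_present_cpus(int *percpu)
{
	int visited = 0;

	for (int cpu = 0; cpu <= maxid; cpu++) {
		if (!cpu_present[cpu])
			continue;	/* hole in the ID range */
		percpu[cpu]++;
		visited++;
	}
	return (visited);
}
```

A loop written as `for (cpu = 0; cpu < ncpus; cpu++)` would instead touch the nonexistent IDs 1 and 3 and never reach 4 or 6, which is the class of bug the message describes.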
|
#
0d0338af |
|
07-Jun-2016 |
Conrad Meyer <cem@FreeBSD.org> |
iflib: Improve cleanup on iflib_queues_alloc error path Fix some memory leaks. Some may remain. Reported by: Coverity Discussed with: mmacy CIDs: 1356036, 1356037, 1356038 Sponsored by: EMC / Isilon Storage Division
|
#
16fb86ab |
|
07-Jun-2016 |
Conrad Meyer <cem@FreeBSD.org> |
iflib: Fix potential leak in iflib_if_transmit Due to an accidental mismatch between allocation and release in the slow path of iflib_if_transmit, if a caller passed 9-16 mbufs to the routine, the mbuf array would be leaked. Fix the mismatch by removing the magic numbers in favor of nitems() on the stack array. According to mmacy, this leak is unlikely. Reported by: Coverity Discussed with: mmacy CID: 1356040 Sponsored by: EMC / Isilon Storage Division
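A minimal sketch of the pattern behind this fix, assuming the general shape described in the message (a small stack array with a heap fallback). The function `process_batch()` and the `heap_live` counter are hypothetical and exist only for illustration; the point is that allocation and release test the same `nitems()` expression rather than two separate magic numbers.

```c
#include <stddef.h>
#include <stdlib.h>

#ifndef nitems
#define nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

static int heap_live;	/* outstanding heap arrays, for illustration */

static int
process_batch(int count)
{
	void *stack_slots[8];
	void **slots = stack_slots;

	if (count < 0)
		return (-1);

	/* Fall back to the heap only when the stack array is too small. */
	if ((size_t)count > nitems(stack_slots)) {
		slots = calloc((size_t)count, sizeof(*slots));
		if (slots == NULL)
			return (-1);
		heap_live++;
	}

	for (int i = 0; i < count; i++)
		slots[i] = NULL;	/* placeholder for real work */

	/* Release under exactly the condition used to allocate. */
	if ((size_t)count > nitems(stack_slots)) {
		free(slots);
		heap_live--;
	}
	return (0);
}
```

If the release test used a different literal than the allocation test, every count between the two thresholds would allocate without freeing, matching the 9-16 window mentioned in the message.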
|
#
c7762913 |
|
18-May-2016 |
Scott Long <scottl@FreeBSD.org> |
Remove assertions that don't make sense for the data type.
|
#
aaeb188a |
|
18-May-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Make compile without INET or without IP support in the kernel by hiding variables and lro function calls behind appropriate #ifdefs. Also move the #includes for "opt_*" to the place where they should be.
|
#
4c7070db |
|
17-May-2016 |
Scott Long <scottl@FreeBSD.org> |
Import the 'iflib' API library for network drivers. From the author: "iflib is a library to eliminate the need for frequently duplicated device independent logic propagated (poorly) across many network drivers." Participation is purely optional. The IFLIB kernel config option is provided for drivers that want to transition between legacy and iflib modes of operation. ixl and ixgbe driver conversions will be committed shortly. We hope to see participation from the Broadcom and maybe Chelsio drivers in the near future. Submitted by: mmacy@nextbsd.org Reviewed by: gallatin Differential Revision: D5211
|