#
a01c7081 |
|
18-Apr-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
vtnet: use CURVNET_SET() instead of CURVNET_SET_QUIET() We don't expect the VNET context to be set for either virtqueue or taskqueue handlers. Suggested by: zec Fixes: 3f2b9607756d0f92ca29c844db0718b313a06634
|
#
3f2b9607 |
|
28-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
vtnet: set VNET context in RX handler The context is required for NIC-level pfil(9) filtering.
|
#
0ea4b408 |
|
04-Feb-2024 |
Warner Losh <imp@FreeBSD.org> |
vtnet: Avoid ifdefs based on __NO_STRICT_ALIGNMENT Some platforms require an adjustment of the ethernet headers. Rather than make this depend on __NO_STRICT_ALIGNMENT being defined, define VTNET_ETHER_ALIGN to be either 0 or ETHER_ALIGN (aka 2). Add a test to the if statements to only do them when != 0. This eliminates the #ifdef sprinkled in the code, still communicates the intent and gives the same compiled results. Sponsored by: Netflix Reviewed by: bz, bryanv Differential Revision: https://reviews.freebsd.org/D43654
|
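The pattern described in the commit above can be sketched in userland C. This is a minimal illustration, not the driver's code: `rx_payload_offset()` is a hypothetical helper, and the "header size is a multiple of 4" condition is taken from the 3be59adb message further down.

```c
#include <assert.h>
#include <stddef.h>

#define ETHER_ALIGN 2	/* same value as the kernel's ETHER_ALIGN */

/*
 * One compile-time constant replaces the scattered #ifdef blocks:
 * 0 where unaligned access is fine, ETHER_ALIGN where it is not.
 */
#ifdef __NO_STRICT_ALIGNMENT
#define VTNET_ETHER_ALIGN 0
#else
#define VTNET_ETHER_ALIGN ETHER_ALIGN
#endif

/*
 * Hypothetical helper: how far the payload is shifted inside an rx buffer.
 * The comparison against 0 is a constant expression, so the compiler can
 * drop the whole branch on platforms that do not need the adjustment.
 */
static size_t
rx_payload_offset(size_t hdr_size)
{
	size_t off = 0;

	/* Shift only when required and when the header is a multiple of 4. */
	if (VTNET_ETHER_ALIGN != 0 && hdr_size % 4 == 0)
		off += VTNET_ETHER_ALIGN;
	return (off);
}
```

The intent stays visible at each call site, while the preprocessor conditional lives in exactly one place.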
#
d9e0e426 |
|
04-Feb-2024 |
Warner Losh <imp@FreeBSD.org> |
vtnet: Account for the padding when selecting allocation size While we account for the padding in the length of the mbuf we use, we do not account for it when we 'guess' the size of the mbuf to allocate based on the MTU of the device. This leads to a situation where we might fail if the mtu is close to a bucket size (say 2018) such that the added padding would push us over the edge for a full-sized packet. mtu of 2018 is super rare (2016 and 2020 would both work), but fix it nonetheless. It's a shame we can't just set VTNET_RX_HEADER_PAD to 2 in this case. The 4 seems hard-coded somewhere I've not found documented (I think it's in the protocol given the comments about VIRTIO_F_ANY_LAYOUT). Sponsored by: Netflix Reviewed by: bz Differential Revision: https://reviews.freebsd.org/D43656
|
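The MTU 2018 edge case can be worked through with a small sketch. All constants and the function name `rx_cluster_size()` are assumptions chosen for illustration (an 18-byte VLAN Ethernet header, a 12-byte virtio-net header, and 2 KB / 4 KB / 9 KB buckets); the driver's actual sizing logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants; the real values come from kernel headers. */
#define CLUSTER_2K     2048	/* standard mbuf cluster */
#define CLUSTER_PAGE   4096	/* page-sized cluster */
#define CLUSTER_9K     9216	/* jumbo cluster */
#define VLAN_ETHER_HDR 18	/* Ethernet header plus 802.1Q tag */
#define VIRTIO_HDR     12	/* assumed virtio-net header size */

/* Pick the smallest cluster that holds a full-sized frame plus padding. */
static size_t
rx_cluster_size(size_t mtu, size_t align_pad)
{
	size_t framesz = VLAN_ETHER_HDR + mtu + VIRTIO_HDR + align_pad;

	if (framesz <= CLUSTER_2K)
		return (CLUSTER_2K);
	if (framesz <= CLUSTER_PAGE)
		return (CLUSTER_PAGE);
	return (CLUSTER_9K);
}
```

Under these assumptions an MTU of 2018 yields exactly 2048 bytes without padding, so ignoring a 2-byte pad selects a bucket that is 2 bytes too small. MTUs of 2016 and 2020 land in the same bucket with or without the pad, which matches the commit's remark that both would work.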
#
3be59adb |
|
28-Jan-2024 |
Warner Losh <imp@FreeBSD.org> |
vtnet: Adjust for ethernet alignment. If the header that we add to the packet's size is 0 % 4 and we're strictly aligning, then we need to adjust where we store the header so the packet that follows will have its struct ip header properly aligned. We do this on allocation (and when we check the length of the mbufs in the lro_nomrg case). We can't just adjust the clustersz in the softc, because it's also used to allocate the mbufs and it needs to be the proper size for that. Since we otherwise use the size of the mbuf (or sometimes the smaller size of the received packet) to compute how much we can buffer, this ensures no overflows. The 2 byte adjustment also does not affect how many packets we can receive in the lro_nomrg case. PR: 271288 Sponsored by: Netflix Reviewed by: bryanv Differential Revision: https://reviews.freebsd.org/D43224
|
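The alignment arithmetic behind this commit is easy to check by hand. The sketch below is a simplification that assumes the virtio header is followed directly by a 14-byte Ethernet header; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ETHER_HDR_LEN 14	/* destination MAC + source MAC + ethertype */
#define ETHER_ALIGN    2

/* Offset of the IP header from the start of the receive buffer. */
static size_t
ip_header_offset(size_t pad, size_t virtio_hdr_size)
{
	return (pad + virtio_hdr_size + ETHER_HDR_LEN);
}

/* Padding needed so the IP header lands on a 4-byte boundary. */
static size_t
rx_align_pad(size_t virtio_hdr_size)
{
	/* A header that is 0 mod 4 leaves the IP header at 2 mod 4. */
	return (virtio_hdr_size % 4 == 0 ? ETHER_ALIGN : 0);
}
```

With a 12-byte header the IP header would start at offset 26, which is 2 mod 4; shifting the buffer start by 2 bytes moves it to offset 28.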
#
23699ff2 |
|
28-Dec-2023 |
Warner Losh <imp@FreeBSD.org> |
Revert "vtnet: Adjust rx buffer so IP header 32-bit aligned" This reverts commit 9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4. This wasn't right to start with... Requested by: markj
|
#
8ee1cc4a |
|
28-Dec-2023 |
Warner Losh <imp@FreeBSD.org> |
Revert "vtnet: Better adjust for ethernet alignment." This reverts commit e9da71cd35d46ca13da4396d99e0af1703290e68. This was inadvertently pushed and turns out to be not quite right. Requested by: markj
|
#
e9da71cd |
|
21-Dec-2023 |
Warner Losh <imp@FreeBSD.org> |
vtnet: Better adjust for ethernet alignment. Move adjustment of the mbuf from where we allocate it to where we are about to queue it to the device. Do this only on those platforms that require it. This allows us to receive an entire jumbo frame on other platforms. It also doesn't make the adjustment on subsequent frames when we queue multiple mbufs for LRO operations. For the normal use case on armv7, there's no difference because we only ever allocate one mbuf. However, for the LRO cases it increases what's available in LRO. It also ensures that we get enough mbufs in those cases as well (though I have no ability to test this on a LRO scenario with armv7). This has the side effect of reverting 527b62e37e68. Fixes: 527b62e37e68 Sponsored by: Netflix
|
#
9e6d11ce |
|
20-Dec-2023 |
Warner Losh <imp@FreeBSD.org> |
vtnet: Adjust rx buffer so IP header 32-bit aligned Call m_adj(m, ETHER_ALIGN) to offset rx buffers when allocating them. This improves performance everywhere, and allows armv7 to work at all. PR: 271288 (PR had a different fix than I wound up with) MFC After: 3 days Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D43136
|
#
fdafd315 |
|
24-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/ Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/ Remove /\n+#if.*\n#endif.*\n+/ Remove /^#if.*\n#endif.*\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix
|
#
180c0240 |
|
18-Sep-2023 |
Mina Galić <freebsd@igalic.co> |
virtio: remove virtio_alloc_virtqueues' flags arg Summary: the flags argument is unused. Its initial design idea has been superseded by the addition of virtio_setup_intr and related APIs. Sponsored by: The FreeBSD Foundation Reviewers: bryanv Reviewed By: bryanv Subscribers: cognet, imp Differential Revision: https://reviews.freebsd.org/D41850
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
580cadd6 |
|
08-Aug-2023 |
Kristof Provost <kp@FreeBSD.org> |
vtnet: allow IFF_ALLMULTI to be set without VIRTIO_NET_F_CTRL_RX If the host doesn't announce VIRTIO_NET_F_CTRL_RX we cannot disable all multicast traffic. Previously we'd refuse to set the IFF_ALLMULTI flag, which is the exact opposite of what is actually happening. This broke things such as igmpproxy. See also: https://redmine.pfsense.org/issues/14301 Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D41356
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
a6b55ee6 |
|
17-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH Expect that drivers call into the network stack with the net epoch entered. This has already been the fact since early 2020. The net interrupts, that are marked with INTR_TYPE_NET, were entering epoch since 511d1afb6bf. For the taskqueues there is NET_TASK_INIT() and all drivers that were known back in 2020 we marked with it in 6c3e93cb5a4. However in e87c4940156 we took conservative approach and preferred to opt-in rather than opt-out for the epoch. This change not only reverts e87c4940156 but adds a safety belt to avoid panicking with INVARIANTS if there is a missed driver. With INVARIANTS we will run in_epoch() check, print a warning and enter the net epoch. A driver that prints can be quickly fixed with the IFF_NEEDSEPOCH flag, but better be augmented to properly enter the epoch itself. Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the new IFF_NEEDSEPOCH. But the tcp_lro_flush_all() asserts the presence of network epoch. Indeed, all NIC drivers that support LRO already provide the epoch, either with help of INTR_TYPE_NET or just running NET_EPOCH_ENTER() in their code. Reviewed by: zlei, gallatin, erj Differential Revision: https://reviews.freebsd.org/D39510
|
#
a2256150 |
|
14-Feb-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
net: use pfil_mbuf_{in,out} where we always have an mbuf This finalizes what has been started in 0b70e3e78b0. Reviewed by: kp, mjg Differential revision: https://reviews.freebsd.org/D37976
|
#
4ee96792 |
|
01-Mar-2022 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Mechanically convert if_vtnet(4) to IfAPI Reviewed By: bryanv Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D37799
|
#
5c4c96d3 |
|
06-May-2022 |
John Baldwin <jhb@FreeBSD.org> |
virtio: Remove unused devclass arguments to DRIVER_MODULE.
|
#
53236f90 |
|
18-Apr-2022 |
Michael Tuexen <tuexen@FreeBSD.org> |
if_vtnet: improve dumping a kernel Disable software LRO during kernel dumping, because having it enabled requires being in a network epoch, which might or might not be the case depending on the code path resulting in the panic. Reviewed by: markj MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D34787
|
#
127b40e7 |
|
13-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
vtnet: offset is only used for INET or INET6.
|
#
fc035df8 |
|
05-Feb-2022 |
Aleksandr Fedorov <afedorov@FreeBSD.org> |
if_vtnet(4): Restore the ability to set promisc mode. PR: 254343, 255054 Reviewed by: vmaffione (mentor), donner Approved by: vmaffione (mentor), donner MFC after: 2 weeks Sponsored by: vstack.com Differential Revision: https://reviews.freebsd.org/D30639
|
#
526ddf17 |
|
20-Jan-2022 |
Mark Johnston <markj@FreeBSD.org> |
vtnet: Mark MRG_RXBUF headers as initialized before loading fields MFC after: 1 week Sponsored by: The FreeBSD Foundation
|
#
3f6ab549 |
|
04-Jan-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
vtnet: don't leak pfil(9) data on detach PR: 260667 Submitted by: <ghuckriede blackberry.com>
|
#
0d3b2bd7 |
|
14-Dec-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vtnet: plug set-but-not-used vars Sponsored by: Rubicon Communications, LLC ("Netgate")
|
#
710c0556 |
|
11-Aug-2021 |
Mark Johnston <markj@FreeBSD.org> |
virtio: Add KMSAN hooks for network and block devices This ensures that host-written data is marked as initialized. Sponsored by: The FreeBSD Foundation
|
#
4044af03 |
|
19-Apr-2021 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Fix vtnet TCP lro panic Differential Revision: https://reviews.freebsd.org/D29900 Reviewed by: hps, kp
|
#
d4697a6b |
|
18-Mar-2021 |
Michael Tuexen <tuexen@FreeBSD.org> |
vtnet: fix TSO for TCP/IPv6 The decision whether a TCP packet is sent over IPv4 or IPv6 was based on ethertype, which works correctly. In D27926 the criterion was changed to checking if the CSUM_IP_TSO flag is set in the csum-flags and then considering it to be TCP/IPv4. However, the TCP stack sets the flag to CSUM_TSO for IPv4 and IPv6, where CSUM_TSO is defined as CSUM_IP_TSO|CSUM_IP6_TSO. Therefore TCP/IPv6 packets get misclassified as TCP/IPv4, which breaks TSO for TCP/IPv6. This patch bases the check again on the ethertype. This fix will be MFC instantly as discussed with re(gjb). MFC after: instantly PR: 254366 Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D29331
|
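The misclassification described above follows directly from how the flags combine. In the sketch below the numeric flag values are illustrative (only the OR relationship stated in the commit matters), and both predicate functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHERTYPE_IP   0x0800
#define ETHERTYPE_IPV6 0x86dd

/* Illustrative bit values; only the OR relationship matters here. */
#define CSUM_IP_TSO  0x00000010u
#define CSUM_IP6_TSO 0x00001000u
#define CSUM_TSO     (CSUM_IP_TSO | CSUM_IP6_TSO)

/* Broken check: CSUM_TSO carries the IPv4 bit even for IPv6 packets. */
static bool
is_ipv4_by_flags(uint32_t csum_flags)
{
	return ((csum_flags & CSUM_IP_TSO) != 0);
}

/* Restored check: the ethertype identifies the network protocol. */
static bool
is_ipv4_by_ethertype(uint16_t etype)
{
	return (etype == ETHERTYPE_IP);
}
```

A TCP/IPv6 segment tagged with `CSUM_TSO` passes the flag test as if it were IPv4, while the ethertype test still tells the two apart.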
#
c1b554c8 |
|
22-Feb-2021 |
Alex Richardson <arichardson@FreeBSD.org> |
if_vtnet: Fix pointer-sign and used parameter warnings Reviewed By: grehan Differential Revision: https://reviews.freebsd.org/D28726
|
#
633218ee |
|
20-Jan-2021 |
Jessica Clarke <jrtc27@FreeBSD.org> |
virtio: Reduce boilerplate for device driver module definitions Rather than have every device register itself for both virtio_pci and virtio_mmio, provide a VIRTIO_DRIVER_MODULE wrapper to declare both, merge VIRTIO_SIMPLE_PNPTABLE with VIRTIO_SIMPLE_PNPINFO and make the latter register for both buses. This also has the benefit of abstracting away the available transports and their names. Reviewed by: bryanv Differential Revision: https://reviews.freebsd.org/D28073
|
#
e6cc42f1 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
virtio: Handle possible failure of virtio_finalize_features() Try to standardize how drivers negotiate features and the function names Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27930
|
#
2bfab357 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Add counter for received host LRO Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27928
|
#
475a60ae |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Misc Tx path cleanup - Add and fix a few error path counters - Improve sysctl descriptions - Use flags consistently to determine IPv4 vs IPv6 Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27926
|
#
6b53aeed |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Set lro_nsegs for host LRO packets Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27933
|
#
33b5433f |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Remove unnecessary TUNABLE_INTs because of CTLFLAG_RDTUN Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27923
|
#
4f18e23f |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Schedule Rx task if pending items when enabling interrupt Prior to V1, the driver would enable interrupts and then notify the host that DRIVER_OK. Since for V1, DRIVER_OK needs to be set before notifying the virtqueues, there may be items in the queues waiting to be processed by the time interrupts are enabled. This fixes a bug where the Rx queue would appear stuck, only being usable after an interface down/up cycle. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27922
|
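The stuck-queue scenario can be modelled in a few lines. This is a sketch under assumptions: `struct rxq_model` and both functions are hypothetical stand-ins, and setting `task_scheduled` represents enqueueing the driver's deferred Rx task.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a receive queue. */
struct rxq_model {
	int  pending;		/* buffers the host has already filled */
	bool intr_enabled;
	bool task_scheduled;
};

/* Enable the interrupt; report whether work arrived before it was armed. */
static bool
rxq_enable_intr(struct rxq_model *q)
{
	q->intr_enabled = true;
	return (q->pending > 0);
}

/* Schedule the deferred handler if the queue is already non-empty. */
static void
rxq_start(struct rxq_model *q)
{
	/* No interrupt will fire for buffers that were filled earlier. */
	if (rxq_enable_intr(q))
		q->task_scheduled = true;
}
```

Without the check in `rxq_start()`, buffers filled between DRIVER_OK and interrupt enable would wait until some later event, which is the stall the commit describes.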
#
c3187190 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Disable F_MTU feature if MTU is invalid Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27931
|
#
bd8809df |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Limit allocations of unused virtqueues For multiqueue, we may use fewer than the provided maximum number of queues. Try to limit allocations of the unused queues: no interrupts, no indirect descriptors, and no taskqueues. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27921
|
#
b470419e |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Rework 4be723f63 max multiqueue pairs check Verify the max_virtqueue_pairs is within the range allowed by the spec. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27920
|
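A range check of this kind might look as follows. The limits 1 and 0x8000 are the minimum and maximum virtqueue-pair counts given by the VirtIO network device specification; the macro and function names are hypothetical, and what the driver does with an out-of-range value is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limits from the VirtIO network device specification. */
#define VQ_PAIRS_MIN 1
#define VQ_PAIRS_MAX 0x8000

/* True when the device-reported maximum is usable for multiqueue. */
static bool
max_vq_pairs_valid(uint16_t max_pairs)
{
	return (max_pairs >= VQ_PAIRS_MIN && max_pairs <= VQ_PAIRS_MAX);
}
```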
#
42343a63 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Add support for software LRO This is useful when running on hosts that support checksum offloading but not the GUEST_TSO (LRO) feature. Or potentially, some GRO-like support when doing forwarding. Only enable SW LRO when the host LRO is not available since enabling both tends to be harmful, and difficult to enable/disable selectively with only a single IFCAP_LRO flag. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27919
|
#
177761e4 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Set the interface max TSO values Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27917
|
#
e36a6b1b |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Add support for CTRL_GUEST_OFFLOADS feature This allows the Rx checksum and LRO to be modified without a full reinit of the device. Remove IFCAP_RXCSUM_IPV6 from the interface capabilities since in VirtIO Rx checksums are just enabled or disabled for all protocols. Properly update IFCAP_LRO if LRO becomes disabled when Rx checksums are disabled. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27916
|
#
dc9029d8 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Move ioctl handlers into separate functions Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27914 https://reviews.freebsd.org/D27915
|
#
44559b26 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Cleanup the reinit process In modern VirtIO, the virtqueues cannot be notified before setting DRIVER_OK status. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27932
|
#
32e0493c |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Cleanup the interface setup methods Defer the ether_ifattach until the interface capabilities are configured Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27913
|
#
2520cd38 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Only set IFCAP_JUMBO_MTU when jumbo MTU is supported Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27912
|
#
baa5234f |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Move the Tx interrupt threshold into the Txq structure Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27911
|
#
05041794 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Defer updating generated MAC address until attached This improves spec compliance because the driver is not supposed to notify the device prior to setting the DRIVER_OK status, which could happen with the VIRTIO_NET_F_CTRL_MAC_ADDR. The VIRTIO_NET_F_MAC feature should always be negotiated so would be a rare situation. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27910
|
#
25dbc30e |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Remove at attach PROMISC handling This may have been required in an early, early, early version of the specification but I cannot find any reference to it, and a promiscuous default seems very odd so remove this code. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27909
|
#
6a733393 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Support VIRTIO_NET_F_SPEED_DUPLEX This feature lets the guest driver know the speed and duplex of the "link". Instead of trying to support many media types based on the possible/likely speeds/duplexes, only use the speed to set the interface baudrate. Cleanup ifmedia code to match other drivers. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27908
|
#
aabdf5b6 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Support VIRTIO_NET_F_MTU This feature lets the guest driver know the maximum MTU size supported by the host device. If set, use this to limit the acceptable MTUs, and improve how the receive mbuf cluster size then is selected. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27907
|
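Limiting the acceptable MTUs could be sketched as below. The fallback ceiling `ASSUMED_MAX_MTU` and both function names are assumptions made for illustration; only the idea of using the host-provided value as the upper bound comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHERMIN        46	/* smallest Ethernet payload */
#define ASSUMED_MAX_MTU 65535	/* hypothetical ceiling without F_MTU */

/* Upper MTU bound: the host's value when negotiated, else the default. */
static uint32_t
max_mtu(bool f_mtu_negotiated, uint16_t host_mtu)
{
	return (f_mtu_negotiated ? host_mtu : ASSUMED_MAX_MTU);
}

/* A requested MTU is accepted only inside [ETHERMIN, limit]. */
static bool
mtu_acceptable(uint32_t mtu, uint32_t limit)
{
	return (mtu >= ETHERMIN && mtu <= limit);
}
```

With the feature negotiated and a host MTU of 1500, a request for 9000 is rejected; without the feature the same request falls back to the assumed ceiling and passes.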
#
fa7ca1e3 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Rx path cleanup - Fix the NEEDS_CSUM and DATA_VALID checksum flags. The NEEDS_CSUM checksum is incomplete (partial) so offer a fallback for the driver to calculate the checksum. Simplify DATA_VALID because we know the host has validated the checksum. - Default 4K mbuf clusters for mergeable buffers. May need to scale this down to 2K clusters in certain configurations such as many queue pairs, big queues (like 4096 in GCP), and low memory. - Use the MTU when calculating the receive mbuf cluster size when not doing TSO/LRO. This will need more adjustment once the MTU feature is supported. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27906
|
#
5e220811 |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
if_vtnet: Add initial modern (V1) support Very basic support to get packets flowing on modern QEMU but still several conformance issues remain that will be addressed in later commits. First of many passes at cleaning up various accumulated cruft Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27904
|
#
1cd1ed3f |
|
18-Jan-2021 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Revert: virtio: Support non-legacy network device and queue And subsequent fix 576b099a. By adding the mergeable header to the vtnet_rx_header structure, the size was increased by 2 bytes, breaking the alignment of this structure as described in the preceding comments. Furthermore, the mergeable header does not belong in the structure. With the mergeable feature, the header is placed in line with the data, so there is no need for a separate segment, and it is misleading to follow the mergeable header with any padding. The V1 header is effectively identical to the mergeable header, and the driver has long supported the mergeable feature. Revert this so the later changes that add V1 support can show how V1 is derived from the existing mergeable buffers support, and to facilitate a later MFC. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27855
|
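The 2-byte growth mentioned above can be checked against the header layouts. The structures below follow the field order of the VirtIO specification and are written out here for illustration rather than copied from the driver's headers.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy header, laid out as in the VirtIO specification. */
struct virtio_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Mergeable-buffers header: the legacy header plus a buffer count. */
struct virtio_net_hdr_mrg_rxbuf {
	struct virtio_net_hdr hdr;
	uint16_t num_buffers;
};

/* V1 header: the same fields, with num_buffers always present. */
struct virtio_net_hdr_v1 {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t num_buffers;
};

/* The extra field grows the header by exactly two bytes. */
_Static_assert(sizeof(struct virtio_net_hdr) == 10, "legacy header size");
_Static_assert(sizeof(struct virtio_net_hdr_mrg_rxbuf) == 12, "mrg header size");
_Static_assert(sizeof(struct virtio_net_hdr_v1) ==
    sizeof(struct virtio_net_hdr_mrg_rxbuf), "V1 matches mergeable layout");
```

Because the V1 and mergeable headers have the same size and field order, code written for mergeable buffers can serve as the basis for V1 support, which is the argument the revert makes.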
#
bb714db6 |
|
10-Jan-2021 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: vtnet: enable/disable krings on any interface reinit See 3d65fd97e85ab807f3b for a detailed explanation. PR: 252453 MFC after: 1 week
|
#
068dbf36 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
virtio: clean up empty lines in .c and .h files
|
#
ef6fdb33 |
|
15-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
if_vtnet: let vtnet_rx_vq_intr() and vtnet_rxq_tq_intr() share code Since the two functions are similar, introduce a common function (vtnet_rx_vq_process()) to share common code. This also improves locking, by ensuring vrxs_rescheduled is accessed under the RXQ lock, and taskqueue_enqueue() is not called under the lock (therefore avoiding a spurious duplicate lock warning). Reported by: jrtc27 MFC after: 2 weeks
|
#
576b099a |
|
14-Jun-2020 |
Jessica Clarke <jrtc27@FreeBSD.org> |
vtnet: Fix regression introduced in r361944 For legacy devices that don't support MrgRxBuf (such as bhyve pre-r358180), r361944 failed to update the receive handler to account for the additional padding introduced by the unused num_buffers field that is now always present in struct vtnet_rx_header. Thus, calculate the padding dynamically based on vtnet_hdr_size. PR: 247242 Reported by: thj Tested by: thj
|
#
16f224b5 |
|
14-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: vtnet: fix races in vtnet_netmap_reg() The nm_register callback needs to call nm_set_native_flags() or nm_clear_native_flags() once the device has been stopped. However, in the current implementation this is not true, as the device is stopped by vtnet_init_locked(). This causes race conditions where the driver crashes as soon as it dequeues netmap buffers assuming they are mbufs (or the other way around). To fix the issue, we extend vtnet_init_locked() with a second argument that, if not zero, will set/clear the netmap flags. This results in a huge simplification of the nm_register callback itself. Also, use netmap_reset() to check if a ring is going to be re-initialized in netmap mode. MFC after: 1 week
|
#
66823237 |
|
11-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: introduce netmap_kring_on() This function returns NULL if the ring identified by queue id and direction is in netmap mode. Otherwise return the corresponding kring. Use this function to replace vtnet_netmap_queue_on(). MFC after: 1 week
|
#
8c3988df |
|
08-Jun-2020 |
Jessica Clarke <jrtc27@FreeBSD.org> |
virtio: Support non-legacy network device and queue The non-legacy interface always defines num_buffers in the header, regardless of whether VIRTIO_NET_F_MRG_RXBUF, just leaving it unused. We also need to ensure our virtqueue doesn't filter out VIRTIO_F_VERSION_1 during negotiation, as it supports non-legacy transports just fine. This fixes network packet transmission on TinyEMU. Reviewed by: br, brooks (mentor), jhb (mentor) Approved by: br, brooks (mentor), jhb (mentor) Differential Revision: https://reviews.freebsd.org/D25132
|
#
f0d8d352 |
|
02-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: vtnet: call netmap_rx_irq() under VQ lock The netmap_rx_irq() function normally wakes up user-space threads waiting for more packets. In this case, it is not necessary to call it under the driver queue lock. However, if the interface is attached to a VALE switch, netmap_rx_irq() ends up calling rxsync on the interface (see netmap_bwrap_intr_notify()). Although concurrent rxsyncs are serialized through the kring lock (see nm_kr_tryget()), the lock acquire operation is not blocking. As a result, it may happen that netmap_rx_irq() is called on an RX ring while another instance is running, causing the second call to fail, and received packets stall in the receive VQ. We fix this issue by calling netmap_rx_irq() under the VQ lock. MFC after: 1 week
|
#
1b89d00b |
|
02-Jun-2020 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
netmap: vtnet: honor NM_IRQ_RESCHED The netmap_rx_irq() function may return NM_IRQ_RESCHED to inform the driver that more work is pending, and that netmap expects netmap_rx_irq() to be called again as soon as possible. This change implements this behaviour in the vtnet driver. MFC after: 1 week
|
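The rescheduling contract can be illustrated with a toy model. Everything here is hypothetical: the enum mirrors the idea of netmap's NM_IRQ_* results without using their real values, and `fake_netmap_rx_irq()` stands in for the real function.

```c
#include <assert.h>

/* Hypothetical result codes modelled on netmap's NM_IRQ_* results. */
enum irq_result { IRQ_PASS, IRQ_COMPLETED, IRQ_RESCHED };

/* Hypothetical queue: 'backlog' counts batches still to be synced. */
struct nm_rxq_model {
	int backlog;
	int resched_count;
};

/* Consume one batch and report whether more work is pending. */
static enum irq_result
fake_netmap_rx_irq(struct nm_rxq_model *q)
{
	if (q->backlog == 0)
		return (IRQ_COMPLETED);
	q->backlog--;
	return (q->backlog > 0 ? IRQ_RESCHED : IRQ_COMPLETED);
}

/* Handler: call again for as long as netmap reports pending work. */
static void
rx_intr(struct nm_rxq_model *q)
{
	while (fake_netmap_rx_irq(q) == IRQ_RESCHED)
		q->resched_count++;	/* a driver would re-enqueue its task */
}
```

A real driver would typically reschedule its taskqueue rather than loop in place; the loop here only keeps the sketch self-contained.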
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
e87c4940 |
|
24-Feb-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Although most of the NIC drivers are epoch ready, due to peer pressure switch over to opt-in instead of opt-out for epoch. Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch when processing its packets. Now this will create recursive entrance in epoch in >90% network drivers, but will guarantee safeness of the transition. Mark several tested drivers as IFF_KNOWSEPOCH. Reviewed by: hselasky, jeff, bz, gallatin Differential Revision: https://reviews.freebsd.org/D23674
|
#
6c3e93cb |
|
11-Feb-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D23518
|
#
629667a1 |
|
11-Jan-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Pacify gcc. Reported by: rlibby
|
#
ed6cbf48 |
|
10-Jan-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Add pfil(9) hook to vtnet(4). The patch could be simpler, using only the second chunk to vtnet_rxq_eof(), that passes full mbufs to pfil(9). Packet filter would m_free() them in case of returning PFIL_DROPPED. However, we pretend to be a hardware driver, so we first try to pass a memory buffer via PFIL_MEMPTR feature. This is mostly done for debugging purposes, so that one can experiment in bhyve with packet filters utilizing same features as a true driver.
|
#
29bfe210 |
|
08-Jan-2020 |
Kristof Provost <kp@FreeBSD.org> |
vtnet: Pre-allocate debugnet data immediately Don't wait until the vtnet_debugnet_init() call happens, because at that point we might already have allocated something from vtnet_tx_header_zone. Some systems showed this panic:
vtnet0: link state changed to UP
panic: keg vtnet_tx_hdr initialization after use.
cpuid = 5
time = 1578427700
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004db427f0
vpanic() at vpanic+0x17e/frame 0xfffffe004db42850
panic() at panic+0x43/frame 0xfffffe004db428b0
uma_zone_reserve() at uma_zone_reserve+0xf6/frame 0xfffffe004db428f0
vtnet_debugnet_init() at vtnet_debugnet_init+0x77/frame 0xfffffe004db42930
debugnet_any_ifnet_update() at debugnet_any_ifnet_update+0x42/frame 0xfffffe004db42980
do_link_state_change() at do_link_state_change+0x1b3/frame 0xfffffe004db429d0
taskqueue_run_locked() at taskqueue_run_locked+0x178/frame 0xfffffe004db42a30
taskqueue_run() at taskqueue_run+0x4d/frame 0xfffffe004db42a50
ithread_loop() at ithread_loop+0x1d6/frame 0xfffffe004db42ab0
fork_exit() at fork_exit+0x80/frame 0xfffffe004db42af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004db42af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 12 tid 100011 ]
Stopped at kdb_enter+0x37: movq $0,0x1084eb6(%rip)
db>
Reviewed by: cem, markj Differential Revision: https://reviews.freebsd.org/D23073
|
#
7dce5659 |
|
21-Oct-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Convert to if_foreach_llmaddr() KPI.
|
#
7790c8c1 |
|
17-Oct-2019 |
Conrad Meyer <cem@FreeBSD.org> |
Split out a more generic debugnet(4) from netdump(4) Debugnet is a simplistic and specialized panic- or debug-time reliable datagram transport. It can drive a single connection at a time and is currently unidirectional (debug/panic machine transmit to remote server only). It is mostly a verbatim code lift from netdump(4). Netdump(4) remains the only consumer (until the rest of this patch series lands). The INET-specific logic has been extracted somewhat more thoroughly than previously in netdump(4), into debugnet_inet.c. UDP-layer logic and up, as much as possible as is protocol-independent, remains in debugnet.c. The separation is not perfect and future improvement is welcome. Supporting INET6 is a long-term goal. Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to 'debugnet_' or 'dn_' -- sorry. I thought keeping the netdump name on the generic module would be more confusing than the refactoring. The only functional change here is the mbuf allocation / tracking. Instead of initiating solely on netdump-configured interface(s) at dumpon(8) configuration time, we watch for any debugnet-enabled NIC for link activation and query it for mbuf parameters at that time. If they exceed the existing high-water mark allocation, we re-allocate and track the new high-water mark. Otherwise, we leave the pre-panic mbuf allocation alone. In a future patch in this series, this will allow initiating netdump from panic ddb(4) without pre-panic configuration. No other functional change intended. Reviewed by: markj (earlier version) Some discussion with: emaste, jhb Objection from: marius Differential Revision: https://reviews.freebsd.org/D21421
|
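The high-water-mark tracking described in this commit might be sketched as follows. `struct mbuf_reserve` and `reserve_update()` are hypothetical names, and the sketch only records the new mark where the real code would also reallocate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of the current pre-panic mbuf reservation. */
struct mbuf_reserve {
	int nmbuf;	/* number of mbufs reserved */
	int clsize;	/* cluster size reserved */
};

/* Grow the reservation only when a NIC asks for more than is held. */
static bool
reserve_update(struct mbuf_reserve *r, int nmbuf, int clsize)
{
	if (nmbuf <= r->nmbuf && clsize <= r->clsize)
		return (false);		/* existing allocation is enough */
	if (nmbuf > r->nmbuf)
		r->nmbuf = nmbuf;
	if (clsize > r->clsize)
		r->clsize = clsize;
	return (true);			/* caller would reallocate here */
}
```

Each debugnet-enabled NIC that comes up reports its parameters, and the reservation only ever moves upward, so the largest requirement seen so far is always covered.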
#
0f6040f0 |
|
03-Jun-2019 |
Conrad Meyer <cem@FreeBSD.org> |
virtio(4): Add PNP match metadata for virtio devices Register MODULE_PNP_INFO for virtio devices using the newbus PNP information provided by the previous commit. Matching can be quite simple; existing probe routines only matched on bus (implicit) and device_type. The same matching criteria are retained exactly, but is now also available to devmatch(8). Reviewed by: bryanv, markj; imp (earlier version) Differential Revision: https://reviews.freebsd.org/D20407
|
#
132ea9f2 |
|
07-May-2019 |
Michael Tuexen <tuexen@FreeBSD.org> |
Remove non-functional SCTP checksum offload support for virtio. Checksum offloading for SCTP is not currently specified for virtio. If the hypervisor announces checksum offloading support, it means TCP and UDP checksum offload. If an SCTP packet is sent and the host announced checksum offload support, the hypervisor inserts the IP checksum (16-bit) at the correct offset, but this is not the right checksum, which is a CRC32c. This results in all outgoing packets having the wrong checksum and therefore breaking SCTP based communications. This patch removes SCTP checksum offloading support from the virtio network interface. Thanks to Felix Weinrank for making me aware of the issue. Reviewed by: bryanv@ MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D20147
|
#
93ef2969 |
|
29-Jan-2019 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
vtnet: fix typo in vtnet_free_taskqueues Because of a typo, the code was mistakenly resetting the vtnrx_vq pointer rather than vtntx_tq. Reviewed by: bryanv MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D19015
|
#
2e42b74a |
|
14-Nov-2018 |
Vincenzo Maffione <vmaffione@FreeBSD.org> |
vtnet: fix netmap support

netmap(4) support for vtnet(4) was incomplete and had multiple bugs. This commit fixes those bugs to bring netmap on vtnet into a functional state.

Changelist:
- handle errors returned by virtqueue_enqueue() properly (they were previously ignored)
- make sure netmap XOR rest of the kernel access each virtqueue.
- compute the number of netmap slots for TX and RX separately, according to whether indirect descriptors are used or not for a given virtqueue.
- make sure sglist are freed according to their type (mbufs or netmap buffers)
- add support for multiqueue and netmap host (aka sw) rings.
- intercept VQ interrupts directly instead of intercepting them in txq_eof and rxq_eof. This simplifies the code and makes it easier to make sure taskqueues are not running for a VQ while it is in netmap mode.
- implement vtnet_netmap_config() to cope with changes in the number of queues.

Reviewed by: bryanv Approved by: gnn (mentor) MFC after: 3 days Sponsored by: Sunny Valley Networks Differential Revision: https://reviews.freebsd.org/D17916
|
#
d7c5a620 |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1:

Before the patch:

InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
4.98    0.00   4.42   0.00  4235592  33       83.80  4720653   2149771  1235  247.32
4.73    0.00   4.20   0.00  4025260  33       82.99  4724900   2139833  1204  247.32
4.72    0.00   4.20   0.00  4035252  33       82.14  4719162   2132023  1264  247.32
4.71    0.00   4.21   0.00  4073206  33       83.68  4744973   2123317  1347  247.32
4.72    0.00   4.21   0.00  4061118  33       80.82  4713615   2188091  1490  247.32
4.72    0.00   4.21   0.00  4051675  33       85.29  4727399   2109011  1205  247.32
4.73    0.00   4.21   0.00  4039056  33       84.65  4724735   2102603  1053  247.32

After the patch:

InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
5.43    0.00   4.20   0.00  3313143  33       84.96  5434214   1900162  2656  245.51
5.43    0.00   4.20   0.00  3308527  33       85.24  5439695   1809382  2521  245.51
5.42    0.00   4.19   0.00  3316778  33       87.54  5416028   1805835  2256  245.51
5.42    0.00   4.19   0.00  3317673  33       90.44  5426044   1763056  2332  245.51
5.42    0.00   4.19   0.00  3314839  33       88.11  5435732   1792218  2499  245.52
5.44    0.00   4.19   0.00  3293228  33       91.84  5426301   1668597  2121  245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366
|
#
c857c7d5 |
|
05-May-2018 |
Mark Johnston <markj@FreeBSD.org> |
Add netdump support to vtnet(4). Tested with bhyve. Reviewed by: bryanv, julian MFC after: 1 month Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D15261
|
#
ac2fffa4 |
|
21-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197
|
#
26c1d774 |
|
13-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
dev: make some use of mallocarray(9). Focus on code where we are doing multiplications within malloc(9). None of these is likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the surrounding code could handle NULL values.
|
#
718cf2cc |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/dev: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
|
#
e414c660 |
|
13-Feb-2017 |
Philip Paeps <philip@FreeBSD.org> |
vtnet: don't update VLAN filter when parent is not running Submitted by: Gerrie Roos <groos -at- xiplink -dot- com> Reviewed by: gnn Sponsored by: XipLink, Inc. Differential Revision: https://reviews.freebsd.org/D9573
|
#
4be723f6 |
|
11-Aug-2016 |
Steven Hartland <smh@FreeBSD.org> |
Fix vtnet hang with max_virtqueue_pairs > VTNET_MAX_QUEUE_PAIRS Correctly limit npairs passed to vtnet_ctrl_mq_cmd. This ensures that VQ_ALLOC_INFO_INIT is called with the correct value, preventing the system from hanging when max_virtqueue_pairs > VTNET_MAX_QUEUE_PAIRS. Add new sysctl requested_vq_pairs which allows the user to configure the requested number of virtqueue pairs. The actual value will still take into account the system limits. Also add missing sysctls for the current tunables so their values can be seen. PR: 207446 Reported by: Andy Carrel MFC after: 3 days Relnotes: Yes Sponsored by: Multiplay
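
A rough sketch of the clamping idea follows, under stated assumptions: `EX_MAX_QUEUE_PAIRS` is a hypothetical stand-in for the driver's compile-time limit, `ex_pick_vq_pairs()` is not a driver function, and treating a request of 0 as "use the host maximum" is an assumption of this example. The real driver may apply further system limits not modelled here.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's compile-time limit. */
#define	EX_MAX_QUEUE_PAIRS	8

/*
 * Pick the number of virtqueue pairs to use.  'requested' is the user
 * tunable (0 is assumed to mean "no preference"); 'host_max' is what
 * the host advertises.
 */
static int
ex_pick_vq_pairs(int requested, int host_max)
{
	int npairs;

	assert(host_max >= 1);
	npairs = (requested > 0) ? requested : host_max;
	if (npairs > host_max)
		npairs = host_max;
	if (npairs > EX_MAX_QUEUE_PAIRS)
		npairs = EX_MAX_QUEUE_PAIRS;	/* never exceed what we allocate */
	return (npairs);
}
```

The point is that the same clamped value is used both to size the virtqueue allocation and in the multiqueue control command, so the two can never disagree.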
|
#
3fcb1aae |
|
14-May-2016 |
Kristof Provost <kp@FreeBSD.org> |
vtnet: fix panic on unload Since r276367 added the virtio_mmio support vtnet_modevent() gets called twice. This resulted in a memory leak during load and a panic on unload. Count the loads so we only initialise once (just like cxgbe(4)), and only clean up in the final unload. PR: 209428 Submitted by: novel@FreeBSD.org MFC after: 1 week
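
The load-counting pattern can be sketched in isolation like this. The names (`ex_modevent()`, `ex_loaded`, and the two call counters) are hypothetical, and the counters stand in for the real initialisation and cleanup work.

```c
#include <assert.h>

enum ex_event { EX_MOD_LOAD, EX_MOD_UNLOAD };

static int ex_loaded;		/* number of outstanding load events */
static int ex_init_calls;	/* stand-in for one-time initialisation */
static int ex_fini_calls;	/* stand-in for final cleanup */

/* Event handler that may be invoked once per bus attachment. */
static int
ex_modevent(enum ex_event what)
{
	switch (what) {
	case EX_MOD_LOAD:
		if (ex_loaded++ == 0)
			ex_init_calls++;	/* initialise only on first load */
		break;
	case EX_MOD_UNLOAD:
		assert(ex_loaded > 0);
		if (--ex_loaded == 0)
			ex_fini_calls++;	/* clean up only on final unload */
		break;
	}
	return (0);
}
```

With two load events followed by two unload events, initialisation and cleanup each run exactly once.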
|
#
804fc8c8 |
|
03-Sep-2015 |
Marcelo Araujo <araujo@FreeBSD.org> |
Lower the compiler warning: unused-but-set-variable. Approved by: bapt (mentor) Differential Revision: D3556
|
#
0fdeab7b |
|
10-Jul-2015 |
Luigi Rizzo <luigi@FreeBSD.org> |
add netmap dependency when compiled as a module
|
#
581e6970 |
|
13-Jun-2015 |
Kristof Provost <kp@FreeBSD.org> |
Fix panic when adding vtnet interfaces to a bridge vtnet interfaces are always in promiscuous mode (at least if the VIRTIO_NET_F_CTRL_RX feature is not negotiated with the host). if_promisc() on a vtnet interface returned ENOTSUP although it has IFF_PROMISC set. This confused the bridge code. Instead we now accept all enable/disable promiscuous commands (and always keep IFF_PROMISC set). There are also two issues with the if_bridge error handling. If if_promisc() fails it uses bridge_delete_member() to clean up. This tries to disable promiscuous mode on the interface. That runs into an assert, because promiscuous mode was never set in the first place. (That's the panic reported in PR 200210.) We can only unset promiscuous mode if the interface actually is promiscuous. This goes against the reference counting done by if_promisc(), but only the first/last if_promisc() calls can actually fail, so this is safe. A second issue is a double free of bif. It's already freed by bridge_delete_member(). PR: 200210 Differential Revision: https://reviews.freebsd.org/D2804 Reviewed by: philip (mentor)
|
#
cab10cc1 |
|
13-Jun-2015 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Fix typo when deregistering the VLAN unconfig event handler Submitted by: Masao Uebayashi <uebayasi@tombiinc.com> MFC after: 3 days
|
#
4dc78216 |
|
29-Apr-2015 |
John Baldwin <jhb@FreeBSD.org> |
Don't free mbufs when stopping an interface in netmap mode. Currently if you ifconfig down a vtnet interface while it is being used via netmap, the kernel panics due to trying to treat the cookie values in the virtio rings as mbufs to be freed. When netmap is enabled, these cookie values are pointers to something else. Note that other netmap-aware drivers don't seem to need this as they store the mbuf pointers in the software rings that mirror the hardware descriptor rings, and since netmap doesn't touch those, the software state always has NULL mbuf pointers causing the loops to free mbufs to not do anything. However, vtnet reuses the same state area for both netmap and non-netmap mode, so it needs to explicitly avoid looking at the rings and treating the cookie values as mbufs if netmap is enabled. Differential Revision: https://reviews.freebsd.org/D2348 Reviewed by: adrian, bryanv, luigi MFC after: 1 week Sponsored by: Norse Corp, Inc.
|
#
ab4c2818 |
|
31-Dec-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Add softc flag for when the indirect descriptor feature was negotiated MFC after: 2 weeks
|
#
5b32b2fa |
|
31-Dec-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Use the appropriate IPv4 or IPv6 TSO HW assist flag MFC after: 2 weeks
|
#
e51f2e72 |
|
29-Dec-2014 |
Andrew Turner <andrew@FreeBSD.org> |
Attach vtnet to virtio_mmio. Qemu provides this as an option with AArch64. Sponsored by: The FreeBSD Foundation
|
#
c2529042 |
|
01-Dec-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Start process of removing the use of the deprecated "M_FLOWID" flag from the FreeBSD network code. The flag is still kept around in the "sys/mbuf.h" header file, but no longer has any users. Instead the "m_pkthdr.rsstype" field in the mbuf structure is now used to decide the meaning of the "m_pkthdr.flowid" field. To modify the "m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX" macros as defined in the "sys/mbuf.h" header file. This patch introduces new behaviour in the transmit direction. Previously network drivers checked if "M_FLOWID" was set in "m_flags" before using the "m_pkthdr.flowid" field. This check has now been replaced by checking if "M_HASHTYPE_GET(m)" is different from "M_HASHTYPE_NONE". In the future more hashtypes will be added, for example hashtypes for hardware dedicated flows. "M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is valid and has no particular type. This change removes the need for an "if" statement in TCP transmit code checking for the presence of a valid flowid value. The "if" statement mentioned above is now a direct variable assignment which is then later checked by the respective network drivers like before. Additional notes: - The SCTP code changes will be committed as a separate patch. - Removal of the "M_FLOWID" flag will also be done separately. - The FreeBSD version has been bumped. MFC after: 1 month Sponsored by: Mellanox Technologies
|
#
9a4dabdc |
|
09-Nov-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Enable LRO by default when available on vtnet interfaces The prior change to not enable LRO by default has confused several people. The configurations where LRO is problematic are not the typical use case for VirtIO, and due to other issues, this often requires checksum offloading to be disabled anyways. PR: 185864 MFC after: 2 weeks
|
#
84047b19 |
|
18-Sep-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
- Provide if_get_counter() method for vtnet(4). - Do not accumulate statistics on every tick. - Accumulate statistics in vtnet_setup_stat_sysctl() and in vtnet_get_counter(). Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
1bffa951 |
|
30-Aug-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Use define from if_var.h to access a field inside struct if_data, that resides in struct ifnet. Sponsored by: Nginx, Inc.
|
#
4bf50f18 |
|
16-Aug-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
Update to the current version of netmap. Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h).

In detail:

1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.

2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial

3. (kernel) simplify the prototype for *txsync() and *rxsync() driver methods. All netmap drivers affected, changes mostly mechanical.

4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully.

5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured guest-host throughput up to 10-12 Mpps).

A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts. Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features. This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline. A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella. MFC after: 3 days.
|
#
32487a89 |
|
09-Jul-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Rework when the Tx queue completion interrupt is enabled The Tx interrupt is now kept disabled in the common case, only enabled when the number of free descriptors in the queue falls below a threshold. Transmitted frames are cleared from the VQ before subsequent transmit, or in the watchdog timer. This was a very big performance improvement for an experimental Netmap bhyve backend. MFC after: 1 month
|
#
bae486f5 |
|
15-Jun-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Force two byte alignment for all control message headers The header structure consists of two 1-byte elements, but it must always be describable by a single SG entry. Note for consistency, specify the alignment everywhere, even if the structure has the appropriate natural alignment since it contains a uint16_t. Obtained from: DragonFlyBSD MFC after: 1 week
|
#
fd5b3951 |
|
15-Jun-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Make the feature negotiation code easier to follow MFC after: 1 week
|
#
add526c6 |
|
15-Jun-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
- Remove two write-only local variables - Remove unused element in the vtnet_rxq structure MFC after: 1 week
|
#
c26e5fc2 |
|
04-Jun-2014 |
Luigi Rizzo <luigi@FreeBSD.org> |
make sure ifp->if_transmit returns 0 if a buffer is enqueued. A similar fix should be applied to vmxnet, ixgbe, igb, i40e. (some of them previously reported by Michael Tuexen) Drivers using if_transmit are correct, and so are most of the other drivers that reassign if_transmit. Among other things, this bug causes panics when using netmap emulation on top of generic drivers. Approved by: bryanv MFC after: 3 days
|
#
b245f96c |
|
12-Mar-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit interface, a crutch was provided in r241616. It didn't work well, and finally we decided that it is time to break ABI and simply make if_baudrate a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.
o Make all fields of struct if_data fixed machine independent size. The notion of data (packet counters, etc) are by no means MD. And it is a bug that on amd64 we've got 64-bit counters, while on i386 32-bit, which at modern speeds overflow within a second. This also removes quite a lot of COMPAT_FREEBSD32 code.
o Give 16 bits for the ifi_datalen field. This field was provided to make future changes to if_data less ABI breaking. Unfortunately the 8 bit size of it had effectively limited sizeof if_data to 256 bytes.
o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped. Discussed with: emax Sponsored by: Netflix Sponsored by: Nginx, Inc.
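
The arithmetic behind the width change is easy to check in a standalone sketch. `EX_10GBIT`, `ex_fits32()`, and `ex_baudrate32()` are illustrative names, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* 10 Gbit/s expressed in bits per second. */
#define	EX_10GBIT	UINT64_C(10000000000)

/* Does the rate fit in a 32-bit field? */
static int
ex_fits32(uint64_t bps)
{
	return (bps <= UINT32_MAX);
}

/* What a 32-bit field would actually hold: the value modulo 2^32. */
static uint32_t
ex_baudrate32(uint64_t bps)
{
	return ((uint32_t)bps);
}
```

A 32-bit field tops out at 4294967295, a little under 4.3 Gbit/s, so a 10 Gbit/s rate silently wraps to 1410065408 when truncated.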
|
#
54fb8142 |
|
01-Feb-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Use m_defrag() instead of m_collapse() to compact a long mbuf chain This should be an infrequent occurrence, so remove the per-queue counters in favor of just global counters in the softc.
|
#
443c3d0b |
|
01-Feb-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Do not place the sglist used for Rx/Tx on the stack The sglist segment array has grown to a bit over 512 bytes (on 64-bit system) which is more than ideally should be put on the stack. Instead allocate an appropriately sized sglist and hang it off each Rx/Tx queue structure. Bump the maximum number of Tx segments to 64 to make it unlikely we'll have to defragment an mbuf chain. Our previous count was rounded up to this value since it is the next power of two, so effective memory usage should not change. Also only allocate the maximum number of Tx segments if TSO was negotiated.
|
#
9ef6342f |
|
25-Jan-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Check for a full virtqueue in the multiqueue transmit path With most hosts, we'll negotiate indirect descriptors, so all we need is one available descriptor to transmit a frame.
|
#
dd6f83a0 |
|
25-Jan-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Avoid queue unlock followed by relock when the enable interrupt race is lost This already happens infrequently, and the hold time is still bounded since we defer to a taskqueue after a few tries.
|
#
bddddcd5 |
|
25-Jan-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Move duplicated transmit start code into a single function
|
#
5591e479 |
|
25-Jan-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Remove stray space
|
#
94716584 |
|
25-Jan-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Also include the mbuf's csum_flags in an assert message
|
#
1dbb21dc |
|
25-Jan-2014 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Read and write the MAC address in the config space byte by byte
|
#
c3322cb9 |
|
28-Oct-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Include necessary headers that now are available due to pollution via if_var.h. Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
76039bc8 |
|
26-Oct-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare for this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
d797300b |
|
05-Oct-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Do not hold the vtnet Rx queue lock when calling up into the stack This matches other similar drivers and avoids various LOR warnings. Approved by: re (marius)
|
#
6e03f319 |
|
02-Sep-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Complete any pending Tx frames before attempting the next transmit Also complete pending frames in the watchdog function when the EVENT_IDX feature was negotiated just in case the completion interrupt was postponed.
|
#
72d9611d |
|
01-Sep-2013 |
Eitan Adler <eadler@FreeBSD.org> |
Fix build with gcc Reported by: Michael Butler <imb@protected-networks.net> Reviewed by: jilles
|
#
8f3600b1 |
|
31-Aug-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Import multiqueue VirtIO net driver from my user/bryanv/vtnetmq branch This is a significant rewrite of much of the previous driver; lots of misc. cleanup was also performed, and support for a few other minor features was also added.
|
#
abd6790c |
|
04-Jul-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Merge virtio changes from projects/virtio Contains projects/virtio commits: r245738: virtio: Minor man page tweaks r246060: virtio: Cleanup feature description printing r246306: virtio: Remove old debugging flag r247238: virtio: Remove PRIx64 macros from format strings r247239: virtio: Constify some fields r247240: virtio: Minor code simplifications r249962: virtio: Update to my freebsd.org email address MFC after: 1 month
|
#
3dd8d840 |
|
04-Jul-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Merge vtnet changes from projects/virtio Minor changes to the network driver. A multiqueue driver that is a significant rewrite will be merged shortly. Contains projects/virtio commits: r246058: vtnet: Move an mbuf ASSERT to the calling function r246059: vtnet: Tweak ASSERT message MFC after: 1 month
|
#
6632efe4 |
|
04-Jul-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Convert VirtIO to use ithreads instead of taskqueues Contains projects/virtio commits: r245709: Each VirtIO device was scheduling its own taskqueue(9) to do the off-level interrupt handling. ithreads(9) is the more natural way to do this. The primary motivation for this work is to better support network multiqueue. r245710: virtio: Change virtqueue intr handlers to return void r245711: virtio_blk: Remove interrupt taskqueue r245721: vtnet: Remove interrupt taskqueue r245722: virtio_scsi: Remove interrupt taskqueue r245747: vtnet: Remove taskqueue fields missed in r245721 MFC after: 1 month
|
#
b059b01e |
|
14-Jun-2013 |
Bryan Venteicher <bryanv@FreeBSD.org> |
Merge r250802 from bryanv/vtnetmq - Fix setting of the Rx filters QEMU 1.4 made the descriptor requirement stricter - the size of buffer descriptor must exactly match the number of MAC addresses provided. PR: kern/178955 MFC after: 5 days
|
#
ac4b6bcd |
|
13-Dec-2012 |
Bryan Venteicher <bryanv@FreeBSD.org> |
virtio: Start taskqueues threads after attach cannot fail If virtio_setup_intr() failed during boot, we would hang in taskqueue_free() -> taskqueue_terminate() for all the taskq threads to terminate. This will never happen since the scheduler is not running by this point. Reported by: neel, grehan Approved by: grehan (mentor)
|
#
c6499ecc |
|
04-Dec-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Mechanically substitute flags from historic mbuf allocator with malloc(9) flags in sys/dev.
|
#
310dacd0 |
|
10-Jul-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Various VirtIO improvements

PCI:
- Properly handle interrupt fallback from MSIX to MSI to legacy. The host may not have sufficient resources to support MSIX, so we must be able to fall back to legacy interrupts.
- Add interface to get the (sub) vendor and device IDs.
- Rename flags to VTPCI_FLAG_* like other VirtIO drivers.

Block:
- No longer allocate vtblk_requests from separate UMA zone. malloc(9) from M_DEVBUF is sufficient. Assert segment counts at allocation.
- More verbose error and debug messages.

Network:
- Remove stray write once variable.

Virtqueue:
- Shuffle code around in preparation of converting the mb()s to the appropriate atomic(9) operations.
- Only walk the descriptor chain when freeing if INVARIANTS is defined since the result is only KASSERT()ed.

Submitted by: Bryan Venteicher (bryanv@daemoninthecloset.org)
|
#
b8a58707 |
|
13-Apr-2012 |
Peter Grehan <grehan@FreeBSD.org> |
Catch up with Bryan Venteicher's virtio git repo:

a8af6270bd96be6ccd86f70b60fa6512b710e4f0 virtio_blk: Include function name in panic string

cbdb03a694b76c5253d7ae3a59b9995b9afbb67a virtio_balloon: Do the notify outside of the lock
  By the time we return from virtqueue_notify(), the descriptor will be in the used ring so we shouldn't have to sleep.

10ba392e60692529a5cbc1e9987e4064e0128447 virtio: Use DEVMETHOD_END

80cbcc4d6552cac758be67f0c99c36f23ce62110 virtqueue: Add support for VIRTIO_F_RING_EVENT_IDX
  This can be used to reduce the number of guest/host and host/guest interrupts by delaying the interrupt until a certain index value is reached. Actual use by the network driver will come along later.

8fc465969acc0c58477153e4c3530390db436c02 virtqueue: Simplify virtqueue_nused()
  Since the values just wrap naturally at UINT16_MAX, we can just subtract the two values directly, rather than doing 2's complement math.

a8aa22f25959e2767d006cd621b69050e7ffb0ae virtio_blk: Remove debugging crud from 75dd732a
  There seems to be an issue with Qemu (or FreeBSD VirtIO) that sets the PCI register space for the device config to bogus values. This only seems to happen after unloading and reloading the module.

d404800661cb2a9769c033f8a50b2133934501aa virtio_blk: Use better variable name

75dd732a97743d96e7c63f7ced3c2169696dadd3 virtio_blk: Partially revert 92ba40e65
  Just use the virtqueue to determine if any requests are still inflight.

06661ed66b7a9efaea240f99f414c368f1bbcdc7 virtio_blk: error if allowed too few segments
  Should never happen unless the host provides us with a bogus seg_max value.

4b33e5085bc87a818433d7e664a0a2c8f56a1a89 virtio_blk: Sort function declarations

426b9f5cac892c9c64cc7631966461514f7e08c6 virtio_blk: Cleanup whitespace

617c23e12c61e3c2233d942db713c6b8ff0bd112 virtio_blk: Call disk_err() on error'd completed requests

081a5712d4b2e0abf273be4d26affcf3870263a9 virtio_blk: ASSERT the ready and inflight request queues are empty

a9be2631a4f770a84145c18ee03a3f103bed4ca8 virtio_blk: Simplify check for too many segments
  At the cost of a small style violation.

e00ec09da014f2e60cc75542d0ab78898672d521 virtio_blk: Add beginnings of suspend/resume
  Still not sure if we need to virtio_stop()/virtio_reinit() the device before/after a suspend. Don't start additional IO when marked as suspending.

47c71dc6ce8c238aa59ce8afd4bda5aa294bc884 virtio_blk: Panic when dealt an unhandled BIO cmd

1055544f90fb8c0cc6a2395f5b6104039606aafe virtio_blk: Add VQ enqueue/dequeue wrappers
  Wrapper functions manage adding to and removing from the in-flight list of requests. Normally biodone() any completed IO when draining the virtqueue.

92ba40e65b3bb5e4acb9300ece711f1ea8f3f7f4 virtio_blk: Add in-flight list of requests

74f6d260e075443544522c0833dc2712dd93f49b virtio_blk: Rename VTBLK_FLAG_DETACHING to VTBLK_FLAG_DETACH

7aa549050f6fc6551c09c6362ed6b2a0728956ef virtio_blk: Finish all BIOs through vtblk_finish_bio()
  Also properly set bio_resid in the case of errors. Most geom_disk providers seem to do the same.

9eef6d0e6f7e5dd362f71ba097f2e2e4c3744882 Added function to translate VirtIO status to error code

ef06adc337f31e1129d6d5f26de6d8d1be27bcd2 Reset dumping flag when given unexpected parameters

393b3e390c644193a2e392220dcc6a6c50b212d9 Added missing VTBLK_LOCK() in dump handler

Obtained from: Bryan Venteicher bryanv at daemoninthecloset dot org
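
The virtqueue_nused() simplification relies on unsigned 16-bit wraparound. A minimal sketch of the idea, using a hypothetical helper `ex_vq_nused()` that takes the two ring indices directly rather than a virtqueue pointer:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of used-ring entries not yet consumed.  Both indices are
 * free-running 16-bit counters, so the subtraction is done modulo 2^16
 * and stays correct after the device index has wrapped past UINT16_MAX.
 */
static uint16_t
ex_vq_nused(uint16_t used_idx, uint16_t last_used_idx)
{
	return ((uint16_t)(used_idx - last_used_idx));
}
```

For example, with the device index at 2 and the driver's last-seen index at 65534, the difference is 4, as it should be after a wrap.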
|
#
336f459c |
|
05-Dec-2011 |
Peter Grehan <grehan@FreeBSD.org> |
Catch up with Bryan Venteicher's virtio Hg repo:

c162516 Remove vtblk_sector_size
c162515 Wrap long license lines
c162514 Remove vtblk_unit
c162513 Wrap long lines in the license.
c162512 Remove verbose messages when link goes up/down. A similar message is printed elsewhere as a result of if_link_state_change().
c162511 Explicitly compare pointer to NULL
c162510 Allocate the mac filter table at attach time.
c162509 Add real BSD licenses to the header files copied from Linux. This chases upstream changes made in Linux a while ago.
c162508 Only notify if we actually dequeued something.
c162507 Change a couple of if () { KASSERT(...) } to just KASSERTs. In non-debug kernels, the if() { } probably gets optimized away, but I guess this is clearer.
c162506 Remove VIRTIO_BLK_F_TOPOLOGY fields in the config. TOPOLOGY has since been removed from the spec, and FreeBSD didn't really do anything with the fields anyways.
c162505 Move vtblk_enqueue_request() outside the locks when getting the ident.
c162504 Remove soon to be unneeded trylock during dump [1]. http://lists.freebsd.org/pipermail/freebsd-current/2011-November/029226.html
c162503 Remove empty line
c162502 Drop frame if cannot allocate a vtnet_tx_header. If we don't, we set OACTIVE, but if there are no other frames in flight, vtnet_txeof() will never be called to unset OACTIVE. The interface would have to be down/up'ed in order to become usable. We could be cuter here and only do this if the virtqueue is empty, but it's probably not worth the complication.
c162501 Start mbuf replacement loop at 1 for clarity

Obtained from: Bryan Venteicher bryanv at daemoninthecloset dot org
|
#
10b59a9b |
|
17-Nov-2011 |
Peter Grehan <grehan@FreeBSD.org> |
Import virtio base, PCI front-end, and net/block/balloon drivers. Tested on Qemu/KVM, VirtualBox, and BHyVe. Currently built as modules-only on i386/amd64. Man pages not yet hooked up, pending review. Submitted by: Bryan Venteicher bryanv at daemoninthecloset dot org Reviewed by: bz MFC after: 4 weeks or so
|