History log of /freebsd-10-stable/sys/dev/netmap/
Revision Date Author Comments
308136 31-Oct-2016 sbruno

MFC r308038:

The buffer address is always overwritten in the extended descriptor format,
we have to refresh it ... always. This fixes problems reported in NetMap
with em(4) devices after conversion to extended descriptor format in
svn r293331.

297478 01-Apr-2016 np

MFC r297298:
Plug leak in m_unshare.

m_unshare passes on the source mbuf's flags as-is to m_getcl and this
results in a leak if the flags include M_NOFREE. The fix is to clear
the bits not listed in M_COPYALL before calling m_getcl. M_RDONLY
should probably be filtered out too but that's outside the scope of this
fix.
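
A minimal userland sketch of the masking idea described above. The flag values and the X_ names are hypothetical stand-ins chosen for illustration (the real definitions live in sys/mbuf.h), and this is not the committed kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones are defined in sys/mbuf.h. */
#define X_M_PKTHDR  0x0002u
#define X_M_EOR     0x0004u
#define X_M_NOFREE  0x0040u
/* Stand-in for M_COPYALL: the only bits a copy may inherit. */
#define X_M_COPYALL (X_M_PKTHDR | X_M_EOR)

/* Return the subset of the source flags that is safe to pass to the allocator. */
static uint32_t
x_copy_flags(uint32_t src_flags)
{
	uint32_t f = src_flags & X_M_COPYALL;

	/* Same spirit as the new ctor assertions: NOFREE must never leak through. */
	assert((f & X_M_NOFREE) == 0);
	return (f);
}
```

With these assumed values, a source mbuf carrying X_M_PKTHDR | X_M_NOFREE yields a copy that carries only X_M_PKTHDR.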

Add assertions in the zone_mbuf and zone_pack ctors to catch similar
bugs.

Update netmap_get_mbuf to not pass M_NOFREE to m_getcl. It's not clear
what the original code was trying to do but it's likely incorrect.
Updated code is no different functionally but it avoids the newly added
assertions.

Sponsored by: Chelsio Communications

295008 28-Jan-2016 sbruno

Fixed up version of r294061 that was reverted due to breakage of features
(netmap) and architectures(i386). <I'll take the pointyhat on that one>

r283883
-- update to 3.1.0

r283893
-- update SRIOV API changes related to future possible MFC of SRIOV work

r285590
-- Fix ixgbe(4) SRIOV VF initialization bugs

r285591
-- Remove version check for FLOWID

r285592
-- Update netmap support for ixgbe SRIOV VFs.

r286238
-- Fixup MTU zeroing if INET/INET6 are undefined.

Submitted by: kevin bowling <kevin.bowling@kev009.com>
Differential Revision: https://reviews.freebsd.org/D4273

294958 27-Jan-2016 marius

Sync the e1000 drivers with what's in head as of r294327, modulo parts
that don't apply to stable/10 (driver API, if_inc_counter(), RSS changes
etc.) and modulo r287465 (which reportedly breaks igb(4)), i.e. assorted
fixes and improvements only:

o MFC r267385 (partial):
- Don't compare bus_dma map pointers for static DMA allocations against
NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be
called. Instead, check the associated bus and virtual addresses.
- Don't clear static DMA maps to NULL.
o MFC r284933:
Delete the reference to VLAN handling being disabled by default. This is
no longer the case. [1]
o MFC r285639:
Add an adapter CORE lock in the DDB hook em_dump_queue to avoid WITNESS
panic in em_init_locked() while debugging.
o MFC r285879:
- Remove unused txd_saved.
- Initialize txd_upper, txd_lower and txd_used at declaration.
o MFC r286162:
Free mbufs when busdma loading fails.
o MFC r286829:
Add capability to disable CRC stripping as it breaks IPMI/BMC capabilities
on certain adapters. [2]
o MFC r286831: [3]
- Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::
segs[EM_MAX_SCATTER] doesn't get overrun by things like NFS that can
and do shove more than 32 segs when being used with em(4) and TSO4.
- Update tso handling code in em_xmit() with update from jhb@
- Set if_hw_tsomax, if_hw_tsomaxsegcount and if_hw_tsomaxsegsize to
appropriate values.
- Define a TSO workaround "magic" number of 4 that is used to avoid an
alignment issue in hardware.
- Change a couple of integer values that were used as booleans to actual
bool types.
- Ensure that em_enable_intr() enables the appropriate mask of interrupts
and not just a hardcoded define of values.
o MFC r286832:
e1000/if_lem.c bump to 1.1.0
o MFC r286833:
Bump all copyright dates to 2015.
o MFC r287112:
Style/whitespace cleanup in shared/common code.
o MFC r293331:
- Switch em(4) to the extended RX descriptor format.
- Split rxbuffer and txbuffer apart to support the new RX descriptor
format structures. Move rxbuffer manipulation to em_setup_rxdesc() to
unify the new behavior changes.
- Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.
- Change em_receive_checksum() to process the new rxdescriptor format
status bit.
o MFC r293332:
Disable the reuse of checksum offload context descriptors in the case
of multiple queues in em(4). Document errata in the code.
o MFC r293854:
Given that em(4), lem(4) and igb(4) hardware doesn't require the
alignment guarantees provided by m_defrag(9), use m_collapse(9)
instead for performance reasons.
While at it, sanitize the statistics softc members, i.e. retire
unused ones and add SYSCTL nodes missing for actually used ones.

PR: 118693 [1], 161277 [2], 195078 [3], 199174 [3], 200221 [3]


/freebsd-10-stable/share/man/man4/em.4
/freebsd-10-stable/sys/dev/e1000/e1000_80003es2lan.c
/freebsd-10-stable/sys/dev/e1000/e1000_80003es2lan.h
/freebsd-10-stable/sys/dev/e1000/e1000_82540.c
/freebsd-10-stable/sys/dev/e1000/e1000_82541.c
/freebsd-10-stable/sys/dev/e1000/e1000_82541.h
/freebsd-10-stable/sys/dev/e1000/e1000_82542.c
/freebsd-10-stable/sys/dev/e1000/e1000_82543.c
/freebsd-10-stable/sys/dev/e1000/e1000_82543.h
/freebsd-10-stable/sys/dev/e1000/e1000_82571.c
/freebsd-10-stable/sys/dev/e1000/e1000_82571.h
/freebsd-10-stable/sys/dev/e1000/e1000_82575.c
/freebsd-10-stable/sys/dev/e1000/e1000_82575.h
/freebsd-10-stable/sys/dev/e1000/e1000_api.c
/freebsd-10-stable/sys/dev/e1000/e1000_api.h
/freebsd-10-stable/sys/dev/e1000/e1000_defines.h
/freebsd-10-stable/sys/dev/e1000/e1000_hw.h
/freebsd-10-stable/sys/dev/e1000/e1000_i210.c
/freebsd-10-stable/sys/dev/e1000/e1000_i210.h
/freebsd-10-stable/sys/dev/e1000/e1000_ich8lan.c
/freebsd-10-stable/sys/dev/e1000/e1000_ich8lan.h
/freebsd-10-stable/sys/dev/e1000/e1000_mac.c
/freebsd-10-stable/sys/dev/e1000/e1000_mac.h
/freebsd-10-stable/sys/dev/e1000/e1000_manage.c
/freebsd-10-stable/sys/dev/e1000/e1000_manage.h
/freebsd-10-stable/sys/dev/e1000/e1000_mbx.c
/freebsd-10-stable/sys/dev/e1000/e1000_mbx.h
/freebsd-10-stable/sys/dev/e1000/e1000_nvm.c
/freebsd-10-stable/sys/dev/e1000/e1000_nvm.h
/freebsd-10-stable/sys/dev/e1000/e1000_osdep.c
/freebsd-10-stable/sys/dev/e1000/e1000_osdep.h
/freebsd-10-stable/sys/dev/e1000/e1000_phy.c
/freebsd-10-stable/sys/dev/e1000/e1000_phy.h
/freebsd-10-stable/sys/dev/e1000/e1000_regs.h
/freebsd-10-stable/sys/dev/e1000/e1000_vf.c
/freebsd-10-stable/sys/dev/e1000/e1000_vf.h
/freebsd-10-stable/sys/dev/e1000/if_em.c
/freebsd-10-stable/sys/dev/e1000/if_em.h
/freebsd-10-stable/sys/dev/e1000/if_igb.c
/freebsd-10-stable/sys/dev/e1000/if_igb.h
/freebsd-10-stable/sys/dev/e1000/if_lem.c
/freebsd-10-stable/sys/dev/e1000/if_lem.h
/freebsd-10-stable/sys/dev/ixgb/if_ixgb.c
if_em_netmap.h
292211 14-Dec-2015 ngie

Unbreak the powerpc/powerpc64 tinderbox

PR: 198805
Submitted by: sbruno

MFC r280430:

r280430 (by bz):

Make ix_crcstrip a public symbol for the moment; it probably is not
the right solution but I will leave it to experts to untangle this
problem to properly stop the build failures.

At the moment only if_ix.c includes dev/netmap/ixgbe_netmap.h which is
good as ixgbe_netmap.h defines a couple of (file) static variables--thus
local to if_ix.c.
static int ix_crcstrip however now also got checked from ix_txrx.c
(as an extern) and should not be visible there. In fact we do see
powerpc and powerpc64 build failures because of this. It is unclear
to me why on other (clang built?) architectures this does not lead
to a reference of an undefined symbol and similar build breakage.

292096 11-Dec-2015 smh

MFC r279232: Add native netmap support to ixl

Sponsored by: Multiplay

284522 17-Jun-2015 sbruno

MFC r284179, r283959

Implement multiqueue (max 2 tx/rx queues) for the 82574L chipset.

Change default tuning parameters to handle this new configuration if
EM_MULTIQUEUE is set in the kernel configuration. Off by default.

See r283959 changelog for the scope of these changes.

Relnotes: Yes
Sponsored by: Limelight Networks

283343 24-May-2015 pkelsey

MFC r282978:

When a netmap process terminates without the full set of buffers it
was granted via rings and ni_bufs_list_head represented in those rings
and lists (e.g., via SIGKILL), those buffers are no longer available
for subsequent users for the lifetime of the system. To mitigate this
resource leak, reset the allocator state when the last ref to that
allocator is released.
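
The mitigation can be pictured with a toy reference-counted allocator. This is only a sketch under stated assumptions: struct x_allocator and the x_ functions are hypothetical and much simpler than the real netmap memory allocator.

```c
#include <assert.h>

/* Hypothetical allocator state: a buffer pool shared by several users. */
struct x_allocator {
	unsigned refs;   /* number of active users */
	unsigned total;  /* buffers owned by the allocator */
	unsigned nfree;  /* buffers currently available */
};

/* A new user attaches to the allocator. */
static void
x_get(struct x_allocator *a)
{
	a->refs++;
}

/* Hand out one buffer; returns 0 on success, -1 if the pool is empty. */
static int
x_alloc_buf(struct x_allocator *a)
{
	if (a->nfree == 0)
		return (-1);
	a->nfree--;
	return (0);
}

/*
 * A user goes away, possibly without returning its buffers (e.g. SIGKILL).
 * Leaked buffers are recovered only when the last reference is dropped.
 */
static void
x_put(struct x_allocator *a)
{
	assert(a->refs > 0);
	if (--a->refs == 0)
		a->nfree = a->total;
}
```

Note that with two users attached, a buffer leaked by the first stays lost until the second also detaches, which is the limitation described in the next paragraph.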

Note that this only recovers leaked resources for an allocator when
there are no longer any users of that allocator, so there remain
circumstances in which leaked allocator resources may not ever be
recovered - consider a set of multiple netmap processes that are all
using the same allocator (say, the global allocator) where members of
that set may be killed and restarted over time but at any given point
there is one member of that set running.

281955 24-Apr-2015 hiren

MFC r275358 r275483 r276982 - Removing M_FLOWID by hps@

r275358:
Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but no longer has any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.
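
A rough sketch of the transmit-side check described above, written as plain userland C. The struct, the constant values and pick_tx_queue() are hypothetical illustrations and do not reproduce the mbuf layout or any driver's actual queue selection:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for M_HASHTYPE_NONE / M_HASHTYPE_OPAQUE. */
#define X_HASHTYPE_NONE   0
#define X_HASHTYPE_OPAQUE 255

/* Trimmed-down stand-in for the relevant packet header fields. */
struct x_pkthdr {
	uint32_t flowid;
	uint8_t  rsstype;
};

/*
 * Old style: test a flag bit before trusting flowid.
 * New style (shown here): flowid is valid whenever the hash type is not NONE.
 */
static unsigned
pick_tx_queue(const struct x_pkthdr *h, unsigned nqueues, unsigned fallback)
{
	assert(nqueues > 0);
	if (h->rsstype != X_HASHTYPE_NONE)
		return (h->flowid % nqueues);
	return (fallback % nqueues);
}
```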

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

r275483:
Remove M_FLOWID from SCTP code.

r276982:
Remove no longer used "M_FLOWID" flag from mbuf.h and update the netisr
manpage.

Note: The FreeBSD version has been bumped.

Reviewed by: hps, tuexen
Sponsored by: Limelight Networks


/freebsd-10-stable/share/man/man9/netisr.9
/freebsd-10-stable/sys/dev/bxe/bxe.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_sge.c
/freebsd-10-stable/sys/dev/cxgbe/t4_main.c
/freebsd-10-stable/sys/dev/cxgbe/t4_sge.c
/freebsd-10-stable/sys/dev/e1000/if_igb.c
/freebsd-10-stable/sys/dev/ixgbe/ixgbe.c
/freebsd-10-stable/sys/dev/ixgbe/ixv.c
/freebsd-10-stable/sys/dev/ixl/ixl_txrx.c
/freebsd-10-stable/sys/dev/mxge/if_mxge.c
netmap_freebsd.c
/freebsd-10-stable/sys/dev/oce/oce_if.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_isr.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_os.c
/freebsd-10-stable/sys/dev/qlxge/qls_isr.c
/freebsd-10-stable/sys/dev/qlxge/qls_os.c
/freebsd-10-stable/sys/dev/sfxge/sfxge_rx.c
/freebsd-10-stable/sys/dev/sfxge/sfxge_tx.c
/freebsd-10-stable/sys/dev/virtio/network/if_vtnet.c
/freebsd-10-stable/sys/dev/vmware/vmxnet3/if_vmx.c
/freebsd-10-stable/sys/dev/vxge/vxge.c
/freebsd-10-stable/sys/net/flowtable.c
/freebsd-10-stable/sys/net/ieee8023ad_lacp.c
/freebsd-10-stable/sys/net/if_lagg.c
/freebsd-10-stable/sys/net/if_lagg.h
/freebsd-10-stable/sys/net/netisr.c
/freebsd-10-stable/sys/netinet/in_pcb.h
/freebsd-10-stable/sys/netinet/ip_output.c
/freebsd-10-stable/sys/netinet/sctp_indata.c
/freebsd-10-stable/sys/netinet/sctp_input.c
/freebsd-10-stable/sys/netinet/sctp_output.c
/freebsd-10-stable/sys/netinet/sctp_pcb.c
/freebsd-10-stable/sys/netinet/sctp_structs.h
/freebsd-10-stable/sys/netinet/sctputil.c
/freebsd-10-stable/sys/netinet/tcp_input.c
/freebsd-10-stable/sys/netinet/tcp_syncache.c
/freebsd-10-stable/sys/netinet6/sctp6_usrreq.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_rx.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_tx.c
/freebsd-10-stable/sys/sys/mbuf.h
/freebsd-10-stable/sys/sys/param.h
281706 18-Apr-2015 rpaulo

MFC r281406:
netmap: improve the netmap attach message on FreeBSD.

278779 14-Feb-2015 luigi

sync the code with the version in head, with the exception of
svn 275358 (M_FLOWID deprecation, only a couple of lines)
which cannot be merged.

if_lem_netmap.h, if_re_netmap.h:
- use the same (commented out) function to update the stat counters
as in HEAD. This is a no-op here

netmap.c
- merge 274459 (support for private knote lock)
and minor changes on nm_config and comments

netmap_freebsd.c
- merge 274459 (support for private knote lock)
- merge 274354 (initialize color if passed as argument)

netmap_generic.c
- fix a comment

netmap_kern.h
- revise the lock macros, using sx locks;
merge 274459 (private knote lock)

netmap_monitor.c
- use full memory barriers

netmap_pipe.c
- use full memory barriers, use length from the correct queue
(mostly cosmetic, since the queues typically have the same size)

272604 06-Oct-2014 luigi

MFC r272111
fix a panic when passing ifioctl from a netmap file descriptor to
the underlying device. This needs to be merged to 10.1

270298 21-Aug-2014 np

MFC r270253:
Change netmap's global lock to sx instead of a mutex.

270252 20-Aug-2014 luigi

MFC 270063: update of netmap code
(vtnet and cxgbe not merged yet because we need some other mfc first)

267334 10-Jun-2014 luigi

MFC 267284
Fixes from Franco Fichtner on transparent mode

* The way rings are updated changed with the last API bump.
Also sync ->head when moving slots in netmap_sw_to_nic().

* Remove a crashing selrecord() call.

* Unclog the logic surrounding netmap_rxsync_from_host().

* Add timestamping to RX host ring.

* Remove a couple of obsolete comments.

Submitted by: Franco Fichtner
MFC after: 3 days
Sponsored by: Packetwerk

267333 10-Jun-2014 luigi

MFC 267328:
change the netmap mbuf destructor so the same code works also on FreeBSD 9.
For head and 10 this change has no effect, but on stable/9 it would cause
panics when using emulated netmap on top of a standard device driver.

MFC after: 3 days

267282 09-Jun-2014 luigi

sync netmap code with the version in HEAD:
- fix handling of tx mbufs in emulated netmap mode;
- introduce mbq_lock() and mbq_unlock()
- rate limit some error messages
- many whitespace and comment fixes

262214 19-Feb-2014 luigi

allow building without INET

262152 18-Feb-2014 luigi

missing files from previous commit...

262151 18-Feb-2014 luigi

MFH: sync the netmap code with the one in HEAD
(enhanced VALE switch, netmap pipes, emulated netmap mode).
See details in the log for svn 261909.

256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


256200 09-Oct-2013 jfv

Update the Intel igb driver to version 2.4.0
- This version has support for the new Intel Avoton systems,
including 2.5Gb support, further it now has IPv6/TSO6 support as
well. Shared code has been updated where necessary as well. Thanks
to my new assistant Eric Joyner for doing the transmit path changes
to bring in the IPv6/TSO6 support. Thanks to Gleb for catching the
one bug and change needed in NETMAP.

Approved by: re


251425 05-Jun-2013 luigi

- fix a bug in the previous commit that was dropping the last packet
from each batch flowing on the VALE switch

- feature: add glue for 'indirect' buffers on the sender side:
if a slot has NS_INDIRECT set, the netmap buffer contains pointer(s)
to the actual userspace buffers, which are accessed with copyin().
The feature is not finalised yet, as it will likely need to deal
with some iovec variant for proper scatter/gather support.
This will save one copy for clients (e.g. qemu) that cannot
use the netmap buffer directly.

A curiosity: on amd64 copyin() appears to be 10-15% faster than pkt_copy()
or bcopy() at least for sizes of 256 and greater.


251139 30-May-2013 luigi

Bring in a number of new features, mostly implemented by Michio Honda:

- the VALE switch now supports up to 254 destinations per switch,
unicast or broadcast (multicast goes to all ports).

- we can attach hw interfaces and the host stack to a VALE switch,
which means we will be able to use it more or less as a native bridge
(minor tweaks still necessary).
A 'vale-ctl' program is supplied in tools/tools/netmap
to attach/detach ports to/from the switch, and list the current configuration.

- the lookup function in the VALE switch can be reassigned to
something else, similar to the pf hooks. This will enable
attaching the firewall, or other processing functions (e.g. in-kernel
openvswitch) directly on the netmap port.

The internal API used by device drivers does not change.

Userspace applications should be recompiled because we
bump NETMAP_API as we now use some fields in the struct nmreq
that were previously ignored -- otherwise, data structures
are the same.

Manpages will be committed separately.


250441 10-May-2013 luigi

another minor bugfix in the memory allocator, this time in the free routine.


250184 02-May-2013 luigi

remove trailing whitespace


250107 30-Apr-2013 luigi

Partial cleanup in preparation for upcoming changes:

- netmap_rx_irq()/netmap_tx_irq() can now be called by FreeBSD drivers
hiding the logic for handling NIC interrupts in netmap mode.
This also simplifies the case of NICs attached to VALE switches.
Individual drivers will be updated with separate commits.

- use the same refcount() API for FreeBSD and linux

- plus some comments, typos and formatting fixes

Portions contributed by Michio Honda


250054 29-Apr-2013 luigi

whitespace - document alternative locking under linux


250052 29-Apr-2013 luigi

whitespace changes:
remove $Id$ lines, and add blank lines around some #if / #elif /#endif


250049 29-Apr-2013 luigi

explicitly mark some variables as const


249659 19-Apr-2013 luigi

mostly whitespace changes:
- remove vestiges of the old memory allocator
- clean up some comments


249504 15-Apr-2013 luigi

fix a bug in the computation of the userspace offset for a given netmap buffer.

Submitted by: Hugh Nhan


248084 09-Mar-2013 attilio

Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical, but a few notes follow:
* The KPI changes as follows:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
To this end, zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.

This commit breaks the KPI heavily. Third-party ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho


245836 23-Jan-2013 luigi

Add support for transparent mode while in netmap.

By setting dev.netmap.fwd=1 (or enabling the feature with a per-ring flag),
packets are forwarded between the NIC and the host stack unless the
netmap client clears the NS_FORWARD flag on the individual descriptors.
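
As an illustration of the per-descriptor flag handling, here is a small sketch that clears the forward bit on the slots a client wants to keep for itself. The slot layout, the X_NS_FORWARD value and the helper name are assumptions made for this example; the authoritative definitions are in net/netmap.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; see net/netmap.h for the real one. */
#define X_NS_FORWARD 0x0004u

/* Hypothetical, trimmed-down stand-in for a netmap slot. */
struct x_slot {
	uint16_t len;
	uint16_t flags;
};

/*
 * Clear the forward flag on every slot the client consumed itself.
 * consumed[i] != 0 means "handled by the netmap process, do not forward".
 * Returns the number of slots still marked for forwarding.
 */
static size_t
x_unmark_consumed(struct x_slot *slots, const uint8_t *consumed, size_t n)
{
	size_t forwarded = 0;

	assert(slots != NULL && consumed != NULL);
	for (size_t i = 0; i < n; i++) {
		if (consumed[i])
			slots[i].flags &= (uint16_t)~X_NS_FORWARD;
		else if (slots[i].flags & X_NS_FORWARD)
			forwarded++;
	}
	return (forwarded);
}
```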

This feature greatly simplifies applications where some traffic
(think of ARP, control traffic, ssh sessions...) must be processed
by the host stack, whereas the bulk is handled by the netmap process
which simply (un)marks packets that should not be forwarded.
The default is chosen so that now a netmap receiver operates
in a mode very similar to bpf.

Of course there is no free lunch: traffic to/from the host stack
still operates at OS speed (or less, as there is one extra copy in
one direction).
HOWEVER, since traffic goes to the user process before being
reinjected, and reinjection occurs in a user context, you get some
form of livelock protection for free.


245835 23-Jan-2013 luigi

control some debugging messages with dev.netmap.verbose

add infrastructure to adapt to changes in number of queues
and buffers at runtime


245581 17-Jan-2013 luigi

remove the old memory allocator, not useful anymore


245579 17-Jan-2013 luigi

add some definition and driver changes in preparation for
two upcoming features:

semi-transparent mode:
when a device is opened in this mode, the
user program will be able to mark slots that must be forwarded
to the "other" side (i.e. from NIC to host stack, or vice versa),
and the forwarding will occur automatically at the next netmap syscall.
This saves the need to open another file descriptor and do
the forwarding manually.

direct-forwarding mode:
when operating with a VALE port, the user can specify in the slot
the actual destination port, overriding the forwarding decision
made by a lookup of the destination MAC. This can be useful to
implement packet dispatchers.

No API changes will be introduced.
No new functionality in this patch yet.


245570 17-Jan-2013 luigi

remove an incorrect comment and debugging code


244514 20-Dec-2012 luigi

rename the 'tag' and 'map' fields used in the rx ring to their
previous names, 'ptag' and 'pmap' -- p stands for packet.

This change reduces the difference between the code in stable/9
and head, and also helps using the same ixgbe_netmap.h on both branches.

Approved by: Jack Vogel


243714 30-Nov-2012 jfv

First of a series of 11 patches leading to new ixgbe version 2.5.0
This removes the header split and supporting code from the driver.


241750 19-Oct-2012 emaste

Use M_NOWAIT when calling malloc with a lock held.

The check for a NULL return was already in place so I assume this was just
an oversight.


241723 19-Oct-2012 glebius

Fix build.


241719 19-Oct-2012 luigi

This is an import of code, mostly from Giuseppe Lettieri,
that revises the netmap memory allocator so that the
various parameters (number and size of buffers, rings, descriptors)
can be modified at runtime through sysctl variables.
The changes become effective when no netmap clients are active.

The API is mostly unchanged, although the NIOCUNREGIF ioctl now
does not bring the interface back to normal mode; you
need to close the file descriptor for that.
This change was necessary to track who is using the mapped region,
and since it is a simplification of the API there was no
incentive in trying to preserve NIOCUNREGIF.
We will remove the ioctl from the kernel next time we need
a real API change (and version bump).

Among other things, buffer allocation when opening devices is
now much faster: it used to take O(N^2) time, now it is linear.

Submitted by: Giuseppe Lettieri


241643 17-Oct-2012 emaste

Avoid panic when a netmap instance cannot obtain memory.

A uint32_t is always >= 0.
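
The remark about uint32_t can be made concrete with a short sketch. The sentinel X_NO_BUF and both helpers are hypothetical and are not taken from the committed fix; they only show why a "negative" error return cannot be detected with a sign test once it is stored in an unsigned type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no buffer could be allocated". */
#define X_NO_BUF UINT32_MAX

/* A failed allocation that returns -1 through an unsigned type wraps around. */
static uint32_t
x_failed_index(void)
{
	return ((uint32_t)-1);
}

/* A sign test on idx would always succeed; compare against the sentinel instead. */
static int
x_alloc_ok(uint32_t idx)
{
	return (idx != X_NO_BUF);
}
```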

Sponsored by: ADARA Networks


239242 13-Aug-2012 emaste

Reword comment to try to improve clarity, and fix a typo.


239149 09-Aug-2012 emaste

Improve lock and unlock symmetry

- Move destruction of per-ring locks to netmap_dtor_locked to mirror the
initialization that happens in NIOCREGIF. Otherwise unloading a netmap-
capable interface that was never put into netmap mode would try to
mtx_destroy an uninitialized mutex, and panic.

- Destroy core_lock in netmap_detach, mirroring init in netmap_attach.

- Also comment out the knlist_destroy for now as there is currently no
knlist_init.

Sponsored by: ADARA Networks
Reviewed by: luigi@


239141 08-Aug-2012 emaste

Fix whitespace (missing newline)


239140 08-Aug-2012 emaste

Clarify comments about number of tx / rx rings


238985 02-Aug-2012 luigi

fix some signed/unsigned warnings in the netmap code.
Unfortunately the original drivers still have a lot of
sign conversion/comparison warnings.


238982 02-Aug-2012 luigi

Add a newline on an error message;
rename linux functions to avoid confusion;
fix error reporting on linux


238937 31-Jul-2012 luigi

remove a redundant MALLOC_DECLARE


238912 30-Jul-2012 luigi

- move the inclusion of netmap headers to the common part of the code;
- more portable annotations for unused arguments;


238837 27-Jul-2012 luigi

use __builtin_prefetch() for prefetch.
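
One way such a prefetch wrapper might look, as a sketch only. The macro name x_prefetch and the summing loop are made up for this example; the fallback branch expands to a no-op for compilers without the builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the compiler builtin where available, otherwise do nothing. */
#if defined(__GNUC__) || defined(__clang__)
#define x_prefetch(p) __builtin_prefetch(p)
#else
#define x_prefetch(p) ((void)(p))
#endif

/* Walk an array, hinting the next element into the cache before it is used. */
static uint64_t
x_sum_with_prefetch(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i + 1 < n)
			x_prefetch(&v[i + 1]);
		sum += v[i];
	}
	return (sum);
}
```

The prefetch is purely a hint, so the result is the same with or without it.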

merge in the remaining part of the linux-specific glue so i do not need
to maintain two different distributions.


238831 27-Jul-2012 luigi

remove unused definition, whitespace cleanup


238818 26-Jul-2012 luigi

define prefetch as a noop on !x86


238812 26-Jul-2012 luigi

Add support for VALE bridges to the netmap core, see

http://info.iet.unipi.it/~luigi/vale/

VALE lets you dynamically instantiate multiple software bridges
that talk the netmap API (and are *extremely* fast), so you can test
netmap applications without the need for high end hardware.

This is particularly useful as I am completing a netmap-aware
version of ipfw, and VALE provides an excellent testing platform.

I also have netmap backends for qemu mostly ready for commit
to the port, and this too will let you interconnect virtual machines
at high speed without fiddling with bridges, tap or other slow solutions.

The API for applications is unchanged, so you can use the code
in tools/tools/netmap (which i will update soon) on the VALE ports.

This commit also syncs the code with the one in my internal repository,
so you will see some conditional code for other platforms.
The code should run mostly unmodified on stable/9 so people interested
in trying it can just copy sys/dev/netmap/ and sys/net/netmap*.h
from HEAD

VALE is joint work with my colleague Giuseppe Lettieri, and
is partly supported by the EU Projects CHANGE and OPENLAB


235562 17-May-2012 luigi

this file is too old and not interesting anymore now that netmap
has been MFC'ed.


234986 03-May-2012 luigi

print 'netmap stack ring full' only in verbose mode.


234290 14-Apr-2012 luigi

i prefer this fix for the -Wformat warning (just one cast,
all the other variables are already correct for %x).
My previous attempt put the cast in the wrong place.


234283 14-Apr-2012 bz

Make compile on 64bit somehow for now after a first try at r234242 on
maybe 32bit?


234242 13-Apr-2012 luigi

fix build with -Wformat -Wmissing-prototypes


234229 13-Apr-2012 luigi

Properly disable crc stripping when operating in netmap mode.

Contrary to what i wrote in my previous commit, the 82599
does include the CRC in the length. The operating mode is
reset in ixgbe_init_locked() and so we need to hook into
the places where the two registers (HLREG0 and RDRXCTL) are
modified.


234228 13-Apr-2012 luigi

add the new memory allocator for netmap, which allocates memory
in small clusters instead of one big contiguous chunk.
This was already enabled in the previous commit.


234227 13-Apr-2012 luigi

A bit of cleanup in the names of fields of netmap-related structures.
Use the name 'ring' instead of 'queue' in all fields.
Bump NETMAP_API.


234225 13-Apr-2012 luigi

do not use a deprecated field in a structure.


234185 12-Apr-2012 luigi

Apparently the length field in advanced descriptors
does not include the CRC irrespective of the setting
of CRCSTRIP. The 82599 data sheets (sec. 7.1.6) say differently.
Very strange. Need to check what happens on legacy descriptors,
but for the time being this restores functionality.


234174 12-Apr-2012 luigi

Some code restructuring to bring the memory allocator out of netmap.c
and make it easier to replace it with a different implementation.
On passing, also fix indentation.

NOTE: I know that #include "foo.c" is ugly, but the alternative
(add another entry to sys/conf/files, add a separate header with
structs and prototypes, and expose functions that are meant to
be private) looks even worse to me.
We need a more modular way to specify dependencies and build options.


234169 12-Apr-2012 luigi

use correct selinfo pointer for the generic interrupt handler
(it is never used in current FreeBSD drivers).


234140 11-Apr-2012 luigi

A couple of changes related to ixgbe operation in netmap mode:

- add a sysctl, dev.netmap.ix_crcstrip, to control whether ixgbe should
strip the CRC on received frames. Defaults to 0, which keeps the CRC
and improves performance when receiving min-sized (64-byte) frames.
This matters because min-sized frames are one of the standard
benchmarks for switches and routers, some chipsets seem to issue
read-modify-write cycles for PCIe transactions that are not a
full cache line, and a min-sized frame triggers the bug, resulting
in reduced throughput -- 9.7 instead of 14.88 Mpps -- and heavy
bus load.

- for the time being, always look for incoming packets on a select/poll
even if there has not been an interrupt in the meantime. This is
only a temporary workaround for a probable race condition in keeping
track of rx interrupts.
Add a couple of diagnostic vars to help studying the problem.
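
For reference, the 14.88 Mpps figure quoted above is the theoretical packet rate for min-sized frames, assuming a 10 Gbit/s link: every frame also occupies 20 bytes of preamble, start-of-frame delimiter and inter-frame gap on the wire. A small sketch of the arithmetic (the function name is made up for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame wire overhead: 7B preamble + 1B start delimiter + 12B inter-frame gap. */
#define X_WIRE_OVERHEAD_BYTES 20u

/* Maximum frames per second on a link of link_bps bits/s for a given frame size. */
static uint64_t
x_max_pps(uint64_t link_bps, uint32_t frame_bytes)
{
	assert(frame_bytes > 0);
	return (link_bps / ((uint64_t)(frame_bytes + X_WIRE_OVERHEAD_BYTES) * 8));
}
```

For 64-byte frames at 10 Gbit/s this gives 10^10 / 672 = 14,880,952 packets per second, i.e. about 14.88 Mpps.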


232238 27-Feb-2012 luigi

A bunch of netmap fixes:

USERSPACE:
1. add support for devices with different number of rx and tx queues;

2. add better support for zero-copy operation, adding an extra field
to the netmap ring to indicate how many buffers we have already processed
but not yet released (with help from Eddie Kohler);

3. The two changes above unfortunately require an API change, so while
at it add a version field and some spares to the ioctl() argument
to help detect mismatches.

4. update the manual page for the two changes above;

5. update sample applications in tools/tools/netmap

KERNEL:

1. simplify the internal structures moving the global wait queues
to the 'struct netmap_adapter';

2. simplify the functions that map kring<->nic ring indexes

3. normalize device-specific code, helps maintenance;

4. start exploring the impact of micro-optimizations (prefetch etc.)
in the ixgbe driver.
Using 'legacy' descriptors on the tx ring and prefetching slots gives
about 20% speedup at 900 MHz. Another 7-10% would come from removing
the explicit calls to bus_dmamap* in the core (they are effectively
NOPs in this case, but it takes expensive loads of the per-buffer
dma maps to figure out that they are all NULL).

Rx performance not investigated.

I am postponing the MFC so i can import a few more improvements
before merging.


231881 17-Feb-2012 luigi

Various cleanups for readability (no functional changes)

- remove the KEVENT code, which was incomplete and not compiled anyways;
- change some while() loops into for()
- adjust indentation
- remove extra whitespace

MFC after: 1 week


231796 15-Feb-2012 luigi

(This commit only touches code within the DEV_NETMAP blocks)

Introduce some functions to map NIC ring indexes into netmap ring
indexes and vice versa. This way we can implement the bound
checks only in one place (and hopefully in a correct way).
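
A sketch of what such index-mapping helpers could look like, with the wrap-around check kept in a single function. The struct and the x_ names are hypothetical and simplified; the real helpers and ring structures in the netmap sources differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring description: size plus the NIC-to-netmap index offset. */
struct x_ring {
	uint32_t num_slots;  /* ring size */
	int32_t  hwofs;      /* netmap index = NIC index + hwofs (mod ring size) */
};

/* The one place where bounds are checked and indexes are wrapped. */
static uint32_t
x_wrap(const struct x_ring *r, int64_t i)
{
	assert(r->num_slots > 0);
	assert(i > -(int64_t)r->num_slots && i < 2 * (int64_t)r->num_slots);
	if (i < 0)
		i += r->num_slots;
	else if (i >= (int64_t)r->num_slots)
		i -= r->num_slots;
	return ((uint32_t)i);
}

/* NIC ring index -> netmap ring index. */
static uint32_t
x_idx_n2k(const struct x_ring *r, uint32_t nic_idx)
{
	return (x_wrap(r, (int64_t)nic_idx + r->hwofs));
}

/* netmap ring index -> NIC ring index. */
static uint32_t
x_idx_k2n(const struct x_ring *r, uint32_t k_idx)
{
	return (x_wrap(r, (int64_t)k_idx - r->hwofs));
}
```

The two functions are inverses of each other as long as the offset is smaller in magnitude than the ring size.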

On passing, make the code and comments more uniform across the
various drivers.


231778 15-Feb-2012 luigi

reduce the differences between these three files.
The three drivers (em, lem and igb) are extremely similar, too bad
that the structures use different names and we cannot share the code.


231594 13-Feb-2012 luigi

- use struct ifnet as explicit type of the argument to the
txsync() and rxsync() callbacks, removing some variables made
useless by this change;

- add generic lock and irq handling routines. These can be useful
in case there are no driver locks that we can reuse;

- add a few macros to reduce differences with the Linux version.


231198 08-Feb-2012 luigi

- change the buffer size from a constant to a
TUNABLE variable (hw.netmap.buf_size) so we can experiment
with values different from 2048 which may give better cache performance.

- rearrange the memory allocation code so it will be easier
to replace it with a different implementation. The current code
relies on a single large contiguous chunk of memory obtained through
contigmalloc.
The new implementation (not committed yet) uses multiple
smaller chunks which are easier to fit in a fragmented address
space.


230572 26-Jan-2012 luigi

ixgbe changes:
- remove experimental code for disabling CRC
- use the correct constant for conversion between interrupt rate
and EITR values (the previous values were off by a factor of 2)
- make dev.ix.N.queueM.interrupt_rate a RW sysctl variable.
Changing individual values affects the queue immediately,
and propagates to all interfaces at the next reinit.
- add dev.ix.N.queueM.irqs rdonly sysctl, to export the actual
interrupt counts

Netmap-related changes for ixgbe:
- use the "new" format for TX descriptors in netmap mode.
- pass interrupt mitigation delays to the user process doing poll()
on a netmap file descriptor.
On the RX side this means we will not check the ring more than once
per interrupt. This gives the process a chance to sleep and process
packets in larger batches, thus reducing CPU usage.
On the TX side we take this even further: completed transmissions are
reclaimed every half ring even if the NIC interrupts more often.
This saves even more CPU without any additional tx delays.
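
One possible reading of the half-ring reclaim policy on the TX side,
as a sketch only. The tx_state structure and both function names are
hypothetical; the actual driver logic is not shown in this log and may
differ (for example, it may ask the NIC to report completions only at
specific slots).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tx ring bookkeeping, for illustration only. */
struct tx_state {
	uint32_t num_slots;	/* ring size */
	uint32_t completed;	/* slots sent by the NIC, not yet reclaimed */
};

/* Return true when at least half of the ring is waiting to be reclaimed. */
static bool
tx_should_reclaim(const struct tx_state *tx)
{
	return (tx->completed >= tx->num_slots / 2);
}

/* Called on each tx interrupt: reclaim in one batch, or do nothing. */
static uint32_t
tx_interrupt(struct tx_state *tx)
{
	uint32_t reclaimed = 0;

	if (tx_should_reclaim(tx)) {
		reclaimed = tx->completed;
		tx->completed = 0;
	}
	return (reclaimed);
}
```

Interrupts that arrive before half the ring has completed return
without touching the ring, which is where the CPU saving would come from.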

Generic Netmap-related changes:
- align the netmap_kring to cache lines so that there is no false sharing
(possibly useful for multiqueue NICs and MSIX interrupts, which are
handled by different cores). It's a minor improvement but it does not
cost anything.
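
A small illustration of cache-line alignment in standard C11, assuming
64-byte cache lines. The demo_kring structure and its fields are
placeholders, not the real netmap_kring definition, and kernel code
would normally use its own alignment macros rather than <stdalign.h>.

```c
#include <stdalign.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64	/* assumption: 64-byte cache lines */

/* Placeholder per-ring state; aligning the first member aligns the struct. */
struct demo_kring {
	alignas(DEMO_CACHE_LINE) uint32_t hwcur;
	uint32_t hwavail;
};

/* Each array element starts on its own cache line, so two cores
 * updating neighbouring rings do not write to the same line. */
static struct demo_kring demo_krings[4];
```

The cost is some padding per ring, since sizeof(struct demo_kring) is
rounded up to a multiple of the alignment.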

Reviewed by: Jack Vogel
Approved by: Jack Vogel


230058 13-Jan-2012 luigi

indentation and whitespace fixes


230055 13-Jan-2012 luigi

fix indentation


230052 13-Jan-2012 luigi

Two performance-related fixes:
1. as reported by Alexander Fiveg, the allocator was reporting
half of the allocated memory. Fix this by exiting from the
loop earlier (not too critical because this code is going
away soon).

2. following a discussion on freebsd-current
http://lists.freebsd.org/pipermail/freebsd-current/2012-January/031144.html
it turns out that (re)loading the dmamap was expensive and not optimized.
This operation is in the critical path when doing zero-copy forwarding
between interfaces.
At least on netmap and i386/amd64, the bus_dmamap_load can be
completely bypassed if the map is NULL, so we do it.

The latter change gives an almost 3x improvement in forwarding
performance, from the previous 9.5Mpps at 2.9GHz to the current
line rate (14.2Mpps) at 1.733GHz. (this is for 64+4 byte packets,
in other configurations the PCIe bus is a bottleneck).
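
The NULL-map bypass in item 2 can be sketched as follows. This uses
hypothetical stand-in types and functions (demo_dmamap,
demo_dmamap_load, demo_reload_map) so that it compiles in userspace; it
does not call the real bus_dma(9) interface.

```c
#include <stddef.h>

/* Hypothetical stand-in for a DMA map; counts how often it is loaded. */
struct demo_dmamap {
	int loads;
};

/* Stand-in for the expensive map (re)load. */
static void
demo_dmamap_load(struct demo_dmamap *map, void *buf)
{
	(void)buf;
	map->loads++;
}

/* Reload the map for a new buffer only if there is a map at all. */
static int
demo_reload_map(struct demo_dmamap *map, void *buf)
{
	if (map == NULL)
		return (0);	/* nothing to load, skip the call entirely */
	demo_dmamap_load(map, buf);
	return (1);
}
```

In the zero-copy forwarding path this check runs once per swapped
buffer, so skipping the load when the map is NULL removes work from
every forwarded packet.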


229947 10-Jan-2012 luigi

other simplifications in the internal interfaces to the
memory allocator.


229939 10-Jan-2012 luigi

small code cleanup in preparation for future modifications in
the memory allocator used by netmap. No functional change,
two small bug fixes:
- in if_re.c add a missing bus_dmamap_sync()
- in netmap.c comment out a spurious free() in an error handling block


228881 25-Dec-2011 luigi

remove a variable definition which shadows the correct one.

Submitted by: Eitan Adler


228845 23-Dec-2011 luigi

1. don't use if_pspare directly, but through a macro WMA()

2. move a variable declaration to the beginning of a block


228844 23-Dec-2011 luigi

whitespace fixes (one missing newline, one extra tab)


228694 18-Dec-2011 marius

Fix compilation on sparc64 by actually supplying the bus_dma_tag_t member
of the rx_ring to bus_dmamap_sync(9). Given that netmap code tries to
obtain the bus addresses of netmap buffers via vtophys(9) instead of using
bus_dma(9) it currently has zero chance of actually working on sparc64
though (and for that matter f.e. also not with MACs limited to 32-bit DMA
on x86 machines with more than 4GB of RAM).


228280 05-Dec-2011 luigi

revise the implementation of the rings connected to the host stack


228276 05-Dec-2011 luigi

1. Fix the handling of link reset while in netmap mode.
A link reset now is completely transparent for the netmap client:
even if the NIC resets its own ring (e.g. restarting from 0),
the client will not see any change in the current rx/tx positions,
because the driver will keep track of the offset between the two.

2. make the device-specific code more uniform across different drivers
There were some inconsistencies in the implementation of the netmap
support routines, now drivers have been aligned to a common
code structure.

3. import netmap support for ixgbe. This is implemented as a very
small patch for ixgbe.c (233 lines, 11 chunks, mostly comments:
in total the patch has only 54 lines of new code), as most of
the code is in an external file sys/dev/netmap/ixgbe_netmap.h,
following some initial comments from Jack Vogel about making
changes less intrusive.
(Note, I have emailed Jack multiple times asking if he had
comments on this structure of the code; I got no reply so
I assume he is fine with it).

Support for other drivers (em, lem, re, igb) will come later.

"ixgbe" is now the reference driver for netmap support. Both the
external file (sys/dev/netmap/ixgbe_netmap.h) and the device-specific
patches (in sys/dev/ixgbe/ixgbe.c) are heavily commented and should
serve as a reference for other device drivers.

Tested on i386 and amd64 with the pkt-gen program in tools/tools/netmap,
the sender does 14.88 Mpps at 1050 MHz and 14.2 Mpps at 900 MHz
on an i7-860 with 4 cores and 82599 card. I have not yet tried more
aggressive optimizations such as adding 'prefetch' instructions
in the time-critical parts of the code.


227875 23-Nov-2011 luigi

fix formatting warning using casts. The numbers involved
are small and these are debug statements, so there is no reason to
obfuscate the format string with PRIsomeKINDofINTEGER


227614 17-Nov-2011 luigi

Bring in support for netmap, a framework for very efficient packet
I/O from userspace, capable of line rate at 10G, see

http://info.iet.unipi.it/~luigi/netmap/

At this time I am bringing in only the generic code (sys/dev/netmap/
plus two headers under sys/net/), and some sample applications in
tools/tools/netmap. There is also a manpage in share/man/man4 [1]

In order to make use of the framework you need to build a kernel
with "device netmap", and patch individual drivers with the code
that you can find in

sys/dev/netmap/head.diff

The file will go away as the relevant pieces are committed to
the various device drivers, which should happen in a few days
after talking to the driver maintainers.

Netmap support is available at the moment for Intel 10G and 1G
cards (ixgbe, em/lem/igb), and for the Realtek 1G card ("re").
I have partial patches for "bge" and am starting to work on "cxgbe".
Hopefully changes are trivial enough so interested third parties
can submit their patches. Interested people can contact me
for advice on how to add netmap support to specific devices.

CREDITS:
Netmap has been developed by Luigi Rizzo and other collaborators
at the Universita` di Pisa, and supported by EU project CHANGE
(http://www.change-project.eu/)
The code is distributed under a BSD Copyright.

[1] In my opinion it is a bad idea to have all manpages in one directory.
We should place kernel documentation in the same dir that contains
the code, which would make it much simpler to keep doc and code
in sync, reduce the clutter in share/man/ and incidentally is
the policy used for all of userspace code.
Makefiles and doc tools can be trivially adjusted to find the
manpages in the relevant subdirs.