History log of /freebsd-current/sys/sys/buf_ring.h
Revision Date Author Comments
# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 571a1a64 18-Apr-2021 Warner Losh <imp@FreeBSD.org>

Minor style tidy: if( -> if (

Fix a few 'if(' to be 'if (' in a few places, per style(9) and
overwhelming usage in the rest of the kernel / tree.

MFC After: 3 days
Sponsored by: Netflix


# f6e54eb3 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

sys: clean up empty lines in .c and .h files


# c591d46e 23-Apr-2019 Wojciech Macek <wma@FreeBSD.org>

This patch offers a workaround to buf_ring reordering
visible on armv7 and armv8. Similar issue to rS302292.

Obtained from: Semihalf
Authored by: Michal Krawczyk <mk@semihalf.com>
Approved by: wma
Differential Revision: https://reviews.freebsd.org/D19932


# 9a6981ef 15-May-2018 Andrew Gallatin <gallatin@FreeBSD.org>

Unhook DEBUG_BUFRING from INVARIANTS

Some of the DEBUG_BUFRING checks are racy, and can lead to
spurious assertions when run under high load. Unhook these
from INVARIANTS until the author can fix or remove them.

Reviewed by: mmacy
Sponsored by: Netflix


# c4e20cad 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.


# 1d64db52 15-Jan-2017 Conrad Meyer <cem@FreeBSD.org>

Fix a variety of cosmetic typos and misspellings

No functional change.

PR: 216096, 216097, 216098, 216101, 216102, 216106, 216109, 216110
Reported by: Bulat <bltsrc at mail.ru>
Sponsored by: Dell EMC Isilon


# 669f39b2 02-Dec-2016 Ryan Stone <rstone@FreeBSD.org>

Revert r309372

The bug intended to be fixed by r309372 was already addressed by r296178,
so revert my change.

Reported by: seph


# 86a6fcd4 01-Dec-2016 Ryan Stone <rstone@FreeBSD.org>

Fix a false positive in a buf_ring assert

buf_ring contains an assert that checks whether an item being
enqueued already exists on the ring. There is a subtle bug in
this assert. An item can be returned by a peek() function and
freed, and then the consumer thread can be preempted before
calling advance(). If this happens the item appears to still be
on the queue, but another thread may allocate the item from the
free pool and wind up trying to enqueue it again, causing the
assert to trigger incorrectly.

Fix this by skipping the head of the consumer's portion of the
ring, as this index is what will be returned by peek().

Sponsored by: Dell EMC Isilon
MFC After: 1 week
Differential Revision: https://reviews.freebsd.org/D8685
Reviewed by: hselasky


# cadab293 29-Jun-2016 Wojciech Macek <wma@FreeBSD.org>

ARM, ARM64: Workaround for buf_ring reordering

This patch offers a workaround to buf_ring reordering
visible on armv7 and armv8. This is supposed to be
removed once new buf_ring implementation is integrated
into the tree.

Obtained from: Semihalf
Reviewed by: alc,emaste
Differential Revision: https://reviews.freebsd.org/D6986
Approved by: re (gjb)


# 7f417bfa 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/sys: minor spelling fixes.

While the changes are minor, these headers are very visible.

MFC after: 2 weeks


# 1321c502 28-Feb-2016 Sepherosa Ziehau <sephe@FreeBSD.org>

buf_ring/drbr: Add buf_ring_peek_clear_sc and use it in drbr_peek

Unlike buf_ring_peek, it only supports single consumer mode, and it
clears the cons_head if DEBUG_BUFRING/INVARIANTS is defined.

The normal use case of drbr_peek for network drivers is:

    m = drbr_peek(br);
    err = hw_spec_encap(&m); /* could m_defrag/m_collapse */
    (*)
    if (err) {
        if (m == NULL)
            drbr_advance(br);
        else
            drbr_putback(br, m);
        /* break the loop */
    }
    drbr_advance(br);

The race is:
If hw_spec_encap() calls m_defrag or m_collapse on the mbuf, so that the
old mbuf is freed, or if, as in the Hyper-V network driver,
transmission-done processing does not even require the TX lock, then at
the point marked (*) another CPU can recycle the freed mbuf and pass it
to drbr_enqueue before the current CPU has had the chance to call
drbr_{advance,putback}. This triggers a panic in the drbr_enqueue
duplicated element check, if DEBUG_BUFRING/INVARIANTS is defined.

Use buf_ring_peek_clear_sc() in drbr_peek() to fix the above race.

This change is a NO-OP if neither DEBUG_BUFRING nor INVARIANTS is
defined.

MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5416
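
The false positive described in 1321c502 can be illustrated with a toy
single-threaded ring. This is only a sketch under stated assumptions: the
names dbg_ring, dbg_contains, dbg_peek and dbg_peek_clear are hypothetical,
and the duplicate check is written as a predicate rather than a kernel
assertion so that its result can be observed. A plain peek leaves the
pointer in the head slot, so a recycled buffer still looks like a
duplicate; a clearing peek removes it from view.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DBG_SIZE 8u                  /* must be a power of two */
#define DBG_MASK (DBG_SIZE - 1u)

/* Toy ring; field and function names are illustrative only. */
struct dbg_ring {
	uint32_t prod;               /* next slot to fill */
	uint32_t cons;               /* next slot to drain */
	void *slot[DBG_SIZE];
};

/* Debug-style duplicate check: is item stored in any live slot? */
static bool
dbg_contains(const struct dbg_ring *r, const void *item)
{
	for (uint32_t i = r->cons; i != r->prod; i = (i + 1u) & DBG_MASK) {
		if (r->slot[i] == item)
			return (true);
	}
	return (false);
}

/* Append an item; the caller must ensure the ring is not full. */
static void
dbg_enqueue(struct dbg_ring *r, void *item)
{
	assert(((r->prod + 1u) & DBG_MASK) != r->cons);
	r->slot[r->prod] = item;
	r->prod = (r->prod + 1u) & DBG_MASK;
}

/* Plain peek: the pointer stays visible in the head slot. */
static void *
dbg_peek(struct dbg_ring *r)
{
	return (r->cons == r->prod ? NULL : r->slot[r->cons]);
}

/* Clearing peek: the head slot is set to NULL before returning. */
static void *
dbg_peek_clear(struct dbg_ring *r)
{
	void *item;

	if (r->cons == r->prod)
		return (NULL);
	item = r->slot[r->cons];
	r->slot[r->cons] = NULL;
	return (item);
}
```

After dbg_peek() the duplicate check still finds the peeked pointer, which
is the window in which a recycled mbuf would trip the assertion; after
dbg_peek_clear() it does not.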


# 24684257 30-Oct-2014 Josh Paetzel <jpaetzel@FreeBSD.org>

Plug memory ordering holes in buf_ring_enqueue. For at least some
users this patch eliminates the races previously discussed on the
mailing list.

Submitted by: oleg
Reviewed by: kmacy
MFC after: 2 weeks
Tested by: kmacy,rpaulo


# 8a27a339 30-Mar-2014 Warner Losh <imp@FreeBSD.org>

Remove instances of variables that were set, but never used. gcc 4.9
warns about these by default.


# ded5ea6a 07-Feb-2013 Randall Stewart <rrs@FreeBSD.org>

This fixes an out-of-order problem with several
of the newer drivers. The basic problem was
that the driver was pulling the mbuf off the
drbr ring and then, when sending with xmit(), encountering
a full transmit ring. Thus the lower layer
xmit() function would return an error, and the
drivers would then append the data back on to the ring.
For TCP this is a horrible scenario sure to bring
on a fast-retransmit.

The fix is to use drbr_peek() to pull the data pointer
but not remove it from the ring. If it fails then
we either call the new drbr_putback or drbr_advance
method. Advance moves it forward (we do this sometimes
when the xmit() function frees the mbuf). When
we succeed we always call advance. The
putback will always copy the mbuf back to the top
of the ring. Note that the putback *cannot* be used
with a drbr_dequeue(), only with drbr_peek(). Most
of the time, in putback, we would not need to copy it
back since most likely the mbuf is still the same, but
sometimes xmit() functions will change the mbuf via
a pullup or other call. So the optimal case for
the single consumer is to always copy it back. If
we ever do a multiple_consumer (for lagg?) we
will need a test and atomic in the putback, possibly
a separate putback_mc() in the ring buf.

Reviewed by: jhb@freebsd.org, jlv@freebsd.org
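
The peek/advance/putback protocol described in ded5ea6a can be sketched
with a toy single-consumer ring. This is a minimal model, not the kernel
code: toy_ring, toy_xmit and toy_drain are hypothetical names, the ring is
single-threaded, and "transmit" simply fails once a made-up descriptor
budget runs out. The point it shows is that a failed send leaves the item
at the head of the ring, so ordering is preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

/* Toy single-threaded ring; names are illustrative, not the kernel's. */
struct toy_ring {
	uint32_t prod;               /* next slot to fill */
	uint32_t cons;               /* next slot to drain */
	void *slot[RING_SIZE];
};

/* Append an item; returns 0 on success, -1 if the ring is full. */
static int
toy_enqueue(struct toy_ring *r, void *item)
{
	uint32_t next = (r->prod + 1u) & RING_MASK;

	if (next == r->cons)
		return (-1);
	r->slot[r->prod] = item;
	r->prod = next;
	return (0);
}

/* Return the head item without removing it, or NULL if empty. */
static void *
toy_peek(struct toy_ring *r)
{
	if (r->cons == r->prod)
		return (NULL);
	return (r->slot[r->cons]);
}

/* Consume the head item; only valid after a successful peek. */
static void
toy_advance(struct toy_ring *r)
{
	assert(r->cons != r->prod);
	r->cons = (r->cons + 1u) & RING_MASK;
}

/* Rewrite the head slot; only valid after a peek, never after a dequeue. */
static void
toy_putback(struct toy_ring *r, void *item)
{
	assert(r->cons != r->prod);
	r->slot[r->cons] = item;
}

/* Hypothetical transmit step: fails when no descriptors remain. */
static int
toy_xmit(void **itemp, int *descs_left)
{
	(void)itemp;                 /* a real driver might replace *itemp */
	if (*descs_left == 0)
		return (-1);
	(*descs_left)--;
	return (0);
}

/* Drain loop in the peek/advance/putback style; returns items sent. */
static int
toy_drain(struct toy_ring *r, int *descs_left)
{
	int sent = 0;
	void *item;

	while ((item = toy_peek(r)) != NULL) {
		if (toy_xmit(&item, descs_left) != 0) {
			toy_putback(r, item);   /* item stays at the head */
			break;
		}
		toy_advance(r);
		sent++;
	}
	return (sent);
}
```

The putback always rewrites the head slot, mirroring the commit's choice
to copy the pointer back unconditionally because the transmit step may
have replaced the buffer.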


# 85e43e96 28-Dec-2012 Attilio Rao <attilio@FreeBSD.org>

Improve bufring impl:
- Remove unused br_prod_bufs member
- Fixup r241037: buf_ring pads br_prod_* and br_cons_* members at 128
bytes, assuming a fixed cache line size for all the architectures.
However, the above mentioned revision broke the padding.
Use explicit padding to the CACHE_LINE_SIZE on the members that
mark the initial new padded sections. Of course, the padding is not
important for performance reasons in the DEBUG_BUFRING case, leaving
br_cons members to share the cache line with br_lock.
- Fixup r244732: by removing incorrectly added membar in
buf_ring_dequeue_sc() where surrounding locking should be enough.
- Drastically reduce the number of membar used (practically reverting
r244732) by switching rmb() in buf_ring_dequeue_mc() and wmb() in
buf_ring_enqueue() to be complete barriers. This, along with
br_prod_bufs departure, should fix ordering issues as explained in
the provided comments.

This patch is not targeted for MFC.

Sponsored by: EMC / Isilon storage division
Reviewed by: glebius
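
The explicit padding described in 85e43e96 can be sketched in portable
C11. The struct below is an illustration only: padded_ring and its field
names are hypothetical, TOY_CACHE_LINE is an assumed 64 bytes standing in
for the kernel's CACHE_LINE_SIZE, and C11 _Alignas is used here in place
of whatever alignment attribute the header actually applies. It shows
producer and consumer indices starting on separate cache lines so that
the two sides do not invalidate each other's line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE 64            /* assumption, not CACHE_LINE_SIZE */

/* Hypothetical layout: each section starts on its own cache line. */
struct padded_ring {
	/* Producer-side fields share the first line. */
	volatile uint32_t prod_head;
	volatile uint32_t prod_tail;
	int prod_size;
	int prod_mask;
	/* Consumer-side fields start a new line. */
	_Alignas(TOY_CACHE_LINE) volatile uint32_t cons_head;
	volatile uint32_t cons_tail;
	int cons_size;
	int cons_mask;
	/* The slot array starts a new line as well. */
	_Alignas(TOY_CACHE_LINE) void *ring[8];
};

/* Compile-time checks that the padding is really in effect. */
static_assert(offsetof(struct padded_ring, cons_head) % TOY_CACHE_LINE == 0,
    "consumer section must start on a cache line boundary");
static_assert(offsetof(struct padded_ring, ring) % TOY_CACHE_LINE == 0,
    "slot array must start on a cache line boundary");

/* Return the index of the cache line that a struct offset falls in. */
static size_t
line_of(size_t offset)
{
	return (offset / TOY_CACHE_LINE);
}
```

Checking the layout with static assertions guards against the kind of
regression the commit describes, where a later change silently broke the
padding.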


# ad9505aa 26-Dec-2012 Attilio Rao <attilio@FreeBSD.org>

Remove an unused var.

Sponsored by: EMC / Isilon storage division
MFC after: 3 days


# 30c7dd14 26-Dec-2012 Attilio Rao <attilio@FreeBSD.org>

br_prod_tail and br_cons_tail members are used as barriers to
signal buf_ring ownership. However, instructions can be reordered
around writes to these members, leading to stale values for, e.g., br_prod_bufs.

Use correct memory barriers to ensure proper ordering of the
ownership tokens updates.

Sponsored by: EMC / Isilon storage division
MFC after: 2 weeks
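
The ordering requirement described in 30c7dd14 can be sketched with C11
atomics. This is a simplified illustration under loud assumptions: it is
a single-producer, single-consumer ring (buf_ring itself supports
multiple producers), spsc_ring and its functions are hypothetical names,
and C11 release/acquire operations stand in for the kernel's own barrier
primitives. The idea shown is that the slot write must become visible
before the tail update that hands ownership of the slot to the other
side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SP_SIZE 8u                   /* must be a power of two */
#define SP_MASK (SP_SIZE - 1u)

/* Hypothetical single-producer, single-consumer ring. */
struct spsc_ring {
	_Atomic uint32_t prod_tail;  /* published by the producer */
	_Atomic uint32_t cons_tail;  /* published by the consumer */
	void *slot[SP_SIZE];
};

/* Producer side; returns 0 on success, -1 if the ring is full. */
static int
spsc_enqueue(struct spsc_ring *r, void *item)
{
	uint32_t prod = atomic_load_explicit(&r->prod_tail,
	    memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&r->cons_tail,
	    memory_order_acquire);
	uint32_t next = (prod + 1u) & SP_MASK;

	if (next == cons)
		return (-1);
	r->slot[prod] = item;
	/* Release: the slot write is ordered before the new tail. */
	atomic_store_explicit(&r->prod_tail, next, memory_order_release);
	return (0);
}

/* Consumer side; returns the oldest item, or NULL if the ring is empty. */
static void *
spsc_dequeue(struct spsc_ring *r)
{
	uint32_t cons = atomic_load_explicit(&r->cons_tail,
	    memory_order_relaxed);
	/* Acquire: pairs with the producer's release store. */
	uint32_t prod = atomic_load_explicit(&r->prod_tail,
	    memory_order_acquire);
	void *item;

	if (cons == prod)
		return (NULL);
	item = r->slot[cons];
	/* Release: the slot read is ordered before the slot is handed back. */
	atomic_store_explicit(&r->cons_tail, (cons + 1u) & SP_MASK,
	    memory_order_release);
	return (item);
}
```

Without the release/acquire pairing, a weakly ordered CPU could make the
new tail visible before the slot contents, which is the class of stale
read the commit message refers to.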


# 063efed2 28-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

The drbr(9) API appeared to be so unclear that most drivers in
tree used it incorrectly, which led to inaccurate, overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by the driver. Neither does dequeuing.
Most drivers also called drbr_stats_update(), which did the
accounting again, leading to doubled if_obytes statistics. And
under heavy transmit load, when a packet could be enqueued and
dequeued several times, it could have been accounted several
times.

o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
ALTQ queueing or buf_ring(9) queueing.
- It doesn't touch the buf_ring stats any more.
- It doesn't touch ifnet stats anymore.
- drbr_stats_update() no longer exists.

o buf_ring(9) handles its stats itself:
- It handles br_drops itself.
- br_prod_bytes stats are dropped. Rationale: no one ever
reads them but update of a common counter on every packet
negatively affects performance due to excessive cache
invalidation.
- buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
we no longer account bytes.

o Drivers handle their stats themselves: if_obytes, if_omcasts.

o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer
use drbr_stats_update(), and update ifnet stats themselves.

o bxe(4) was the most correct driver, it didn't call
drbr_stats_update(), thus it was the only driver accurate under
moderate load. Now it also maintains stats itself.

o ixgbe(4) had already taken stats from hardware, so just
- drop software stats updating.
- take multicast packet count from hardware as well.

o mxge(4) just no longer needs NO_SLOW_STATS define.

o cxgb(4), cxgbe(4) need no change, since they obtain stats
from hardware.

Reviewed by: jfv, gnn


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 7c049a85 11-Jun-2010 Kenneth D. Merry <ken@FreeBSD.org>

MFC 199549, 199997, 204158, 207673, and 208901.

Bring in a number of netfront changes:

r199549 | jhb

Remove commented out reference to if_watchdog and an assignment of zero to
if_timer.

Reviewed by: scottl

r199997 | gibbs

Add media ioctl support and link notifications so that devd will attempt
to run dhclient on a netfront (xn) device that is setup for DHCP in
/etc/rc.conf.

PR: kern/136251 (fixed differently than the submitted patch)

r204158 | kmacy

- make printf conditional
- fix witness warnings by making configuration lock a mutex

r207673 | joel

Switch to our preferred 2-clause BSD license.

Approved by: kmacy

r208901 | ken

A number of netfront fixes and stability improvements:

- Re-enable TSO. This was broken previously due to CSUM_TSO clearing the
CSUM_TCP flag, so our checksum flags were incorrectly set going to the
netback driver. That was fixed in r206844 in tcp_output.c, so we can
turn TSO back on here.

- Fix the way transmit slots are calculated, so that we can't overfill
the ring.

- Avoid sending packets with more fragments/segments than netback can
handle. The Linux netback code can only handle packets of
MAX_SKB_FRAGS, which turns out to be 18 on machines with 4K pages. We
can easily generate packets with 32 or so fragments with TSO turned on.
Right now the solution is just to drop the packets (since netback
doesn't seem to handle it gracefully), but we should come up with a way
to allow a driver to tell the TCP stack the maximum number of fragments
it can handle in a single packet.

- Fix the way the consumer is tracked in the receive path. It could get
out of sync fairly easily.

- Use standard Xen ring macros to make it clearer how netfront is using
the rings.

- Get rid of Linux-ish negative errno return values.

- Added more documentation to the driver.

- Refactored code to make it easier to read.

- Some other minor fixes.

Reviewed by: gibbs
Sponsored by: Spectra Logic

Approved by: re (bz)


# 8e0ad55a 05-May-2010 Joel Dahl <joel@FreeBSD.org>

Switch to our preferred 2-clause BSD license.

Approved by: kmacy


# a913be09 09-Jun-2009 Kip Macy <kmacy@FreeBSD.org>

- add drbr routines for accessing #qentries and conditionally dequeueing
- track bytes enqueued in buf_ring


# 3982c699 07-May-2009 Kip Macy <kmacy@FreeBSD.org>

No man page currently exists so comment the two uncommented
non-trivial functions


# 1635d917 16-Dec-2008 Kip Macy <kmacy@FreeBSD.org>

merge in 2 buf_ring helper routines for enqueueing and freeing buf_rings


# 2760fcd0 01-Dec-2008 Kip Macy <kmacy@FreeBSD.org>

return ENOBUFS when ring is full


# 3b14b38b 22-Nov-2008 Kip Macy <kmacy@FreeBSD.org>

buf_ring_peek should return NULL if the ring is empty rather than
whatever happened to be at cons_tail last time it was in use


# db7f0b97 21-Nov-2008 Kip Macy <kmacy@FreeBSD.org>

- bump __FreeBSD_version to reflect added buf_ring, memory barriers,
and ifnet functions

- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own

- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers

- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues

This work was supported by Bitgravity Inc. and Chelsio Inc.