History log of /freebsd-10.1-release/sys/kern/uipc_mbuf2.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 243882 05-Dec-2012 glebius

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


# 209390 21-Jun-2010 ed

Use ISO C99 integer types in sys/kern where possible.

There are only about 100 occurences of the BSD-specific u_int*_t
datatypes in sys/kern. The ISO C99 integer types are used here more
often.


# 193511 05-Jun-2009 rwatson

Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with: pjd


# 172930 24-Oct-2007 rwatson

Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

mac_<object>_<method/action>
mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer


# 163606 22-Oct-2006 rwatson

Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from: TrustedBSD Project
Sponsored by: SPARTA


# 148095 17-Jul-2005 rwatson

Define four constants, MBUF_{,MEM,CLUSTER,PACKET,TAG}_MEM_NAME, which
are string names for their respective UMA zones and malloc types, and
are passed into uma_zcreate() and MALLOC_DEFINE(). Export them
outside of _KERNEL in mbuf.h so that netstat can reference them.

Change the names to improve consistency, with each zone/type
associated with the mbuf allocator being prefixed mbuf_.

MFC after: 1 week


# 141616 10-Feb-2005 phk

Make a bunch of malloc types static.

Found by: src/tools/tools/kernxref


# 139804 06-Jan-2005 imp

/* -> /*- for copyright notices, minor format tweaks as necessary


# 136386 11-Oct-2004 glebius

Rename _m_tag_free() to m_tag_free_default() and make it non-static.

Approved by: sam


# 136347 10-Oct-2004 glebius

Revert last commit since it breaks API.

Requested by: sam


# 136310 09-Oct-2004 glebius

Remove inlined m_tag_free(). Rename _m_tag_free() to m_tag_free()
and make it visible (same way as in OpenBSD). Describe usage in manpage.

This change is useful for creating custom free methods, which
call default free method at their end.

While here, make malloc declaration for mbuf tags more informative.

Approved by: julian (mentor), sam
MFC after: 1 month


# 132488 21-Jul-2004 alfred

Make sure we don't call mbuf allocation functions with mutexes held.

Discussed with: rwatson


# 129906 31-May-2004 bmilekic

Bring in mbuma to replace mballoc.

mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.

Extensions to UMA worth noting:
- Better layering between slab <-> zone caches; introduce
Keg structure which splits off slab cache away from the
zone structure and allows multiple zones to be stacked
on top of a single Keg (single type of slab cache);
perhaps we should look into defining a subset API on
top of the Keg for special use by malloc(9),
for example.
- UMA_ZONE_REFCNT zones can now be added, and reference
counters automagically allocated for them within the end
of the associated slab structures. uma_find_refcnt()
does a kextract to fetch the slab struct reference from
the underlying page, and lookup the corresponding refcnt.

mbuma things worth noting:
- integrates mbuf & cluster allocations with extended UMA
and provides caches for commonly-allocated items; defines
several zones (two primary, one secondary) and two kegs.
- change up certain code paths that always used to do:
m_get() + m_clget() to instead just use m_getcl() and
try to take advantage of the newly defined secondary
Packet zone.
- netstat(1) and systat(1) quickly hacked up to do basic
stat reporting but additional stats work needs to be
done once some other details within UMA have been taken
care of and it becomes clearer to how stats will work
within the modified framework.

From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used. The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.

Additional things worth noting/known issues (READ):
- One report of 'ips' (ServeRAID) driver acting really
slow in conjunction with mbuma. Need more data.
Latest report is that ips is equally sucking with
and without mbuma.
- Giant leak in NFS code sometimes occurs, can't
reproduce but currently analyzing; brueffer is
able to reproduce but THIS IS NOT an mbuma-specific
problem and currently occurs even WITHOUT mbuma.
- Issues in network locking: there is at least one
code path in the rip code where one or more locks
are acquired and we end up in m_prepend() with
M_WAITOK, which causes WITNESS to whine from within
UMA. Current temporary solution: force all UMA
allocations to be M_NOWAIT from within UMA for now
to avoid deadlocks unless WITNESS is defined and we
can determine with certainty that we're not holding
any locks when we're M_WAITOK.
- I've seen at least one weird socketbuffer empty-but-
mbuf-still-attached panic. I don't believe this
to be related to mbuma but please keep your eyes
open, turn on debugging, and capture crash dumps.

This change removes more code than it adds.

A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.

Testing and Debugging:
rwatson,
brueffer,
Ketrien I. Saihr-Kesenchedra,
...
Reviewed by: Lots of people (for different parts)


# 129062 09-May-2004 sam

set m_len to reflect mbuf contents on return from m_dup1; fixes an obscure
m_pullup case that contributed to breaking ipcomp in tunnel mode for kame

Submitted by: itojun
Obtained from: kame


# 127911 05-Apr-2004 imp

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# 124077 02-Jan-2004 sam

m_tag fixups in preparation for heavier use:

o promote several m_tag_* routines to inline
o add an m_tag_setup inline to set the fixed fields in a packet tag
o add an m_tag_free method pointer to each mtag to support, for example,
allocating tags from zones
o have m_tag_find check if the tag list is not empty before calling
m_tag_locate to search

Reviewed by: brooks, silence from others


# 121645 29-Oct-2003 sam

Introduce the notion of "persistent mbuf tags"; these are tags that stay
with an mbuf until it is reclaimed. This is in contrast to tags that
vanish when an mbuf chain passes through an interface. Persistent tags
are used, for example, by MAC labels.

Add an m_tag_delete_nonpersistent function to strip non-persistent tags
from mbufs and use it to strip such tags from packets as they pass through
the loopback interface and when turned around by icmp. This fixes problems
with "tag leakage".

Pointed out by: Jonathan Stone
Reviewed by: Robert Watson


# 116182 10-Jun-2003 obrien

Use __FBSDID().


# 113487 14-Apr-2003 rwatson

Move MAC label storage for mbufs into m_tags from the m_pkthdr structure,
returning some additional room in the first mbuf in a chain, and
avoiding feature-specific contents in the mbuf header. To do this:

- Modify mbuf_to_label() to extract the tag, returning NULL if not
found.

- Introduce mac_init_mbuf_tag() which does most of the work
mac_init_mbuf() used to do, except on an m_tag rather than an
mbuf.

- Scale back mac_init_mbuf() to perform m_tag allocation and invoke
mac_init_mbuf_tag().

- Replace mac_destroy_mbuf() with mac_destroy_mbuf_tag(), since
m_tag's are now GC'd deep in the m_tag/mbuf code rather than
at a higher level when mbufs are directly free()'d.

- Add mac_copy_mbuf_tag() to support m_copy_pkthdr() and related
notions.

- Generally change all references to mbuf labels so that they use
mbuf_to_label() rather than &mbuf->m_pkthdr.label. This
required no changes in the MAC policies (yay!).

- Tweak mbuf release routines to not call mac_destroy_mbuf(),
tag destruction takes care of it for us now.

- Remove MAC magic from m_copy_pkthdr() and m_move_pkthdr() --
the existing m_tag support does all this for us. Note that
we can no longer just zero the m_tag list on the target mbuf,
rather, we have to delete the chain because m_tag's will
already be hung off freshly allocated mbuf's.

- Tweak m_tag copying routines so that if we're copying a MAC
m_tag, we don't do a binary copy, rather, we initialize the
new storage and do a deep copy of the label.

- Remove use of MAC_FLAG_INITIALIZED in a few bizarre places
having to do with mbuf header copies previously.

- When an mbuf is copied in ip_input(), we no longer need to
explicitly copy the label because it will get handled by the
m_tag code now.

- No longer any weird handling of MAC labels in if_loop.c during
header copies.

- Add MPC_LOADTIME_FLAG_LABELMBUFS flag to Biba, MLS, mac_test.
In mac_test, handle the label==NULL case, since it can be
dynamically loaded.

In order to improve performance with this change, introduce the notion
of "lazy MAC label allocation" -- only allocate m_tag storage for MAC
labels if we're running with a policy that uses MAC labels on mbufs.
Policies declare this intent by setting the MPC_LOADTIME_FLAG_LABELMBUFS
flag in their load-time flags field during declaration. Note: this
opens up the possibility of post-boot policy modules getting back NULL
slot entries even though they have policy invariants of non-NULL slot
entries, as the policy might have been loaded after the mbuf was
allocated, leaving the mbuf without label storage. Policies that cannot
handle this case must be declared as NOTLATE, or must be modified.

- mac_labelmbufs holds the current cumulative status as to whether
any policies require mbuf labeling or not. This is updated whenever
the active policy set changes by the function mac_policy_updateflags().
The function iterates the list and checks whether any have the
flag set. Write access to this variable is protected by the policy
list; read access is currently not protected for performance reasons.
This might change if it causes problems.

- Add MAC_POLICY_LIST_ASSERT_EXCLUSIVE() to permit the flags update
function to assert appropriate locks.

- This makes allocation in mac_init_mbuf() conditional on the flag.

Reviewed by: sam
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 109619 21-Jan-2003 sam

preserve the order of tags copied by m_tag_copy_chain

Obtained from: OpenBSD


# 108466 30-Dec-2002 sam

Correct mbuf packet header propagation. Previously, packet headers
were sometimes propagated using M_COPY_PKTHDR which actually did
something between a "move" and a "copy" operation. This is replaced
by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it
from the source mbuf) and m_dup_pkthdr which copies the packet
header contents including any m_tag chain. This corrects numerous
problems whereby mbuf tags could be lost during packet manipulations.

These changes also introduce arguments to m_tag_copy and m_tag_copy_chain
to specify if the tag copy work should potentially block. This
introduces an incompatibility with openbsd which we may want to revisit.

Note that move/dup of packet headers does not handle target mbufs
that have a cluster bound to them. We may want to support this;
for now we watch for it with an assert.

Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG.

Supported by: Vernier Networks
Reviewed by: Robert Watson <rwatson@FreeBSD.org>


# 107283 26-Nov-2002 sam

correct function names in KASSERT's for 2 m_tag routines

Submitted by: rwatson
Approved by: re


# 105194 15-Oct-2002 sam

Replace aux mbufs with packet tags:

o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
inpcb parameter to ip_output and ip6_output to allow the IPsec code to
locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by: julian, luigi (silent), -arch, -net, darren
Approved by: julian, silence from everyone else
Obtained from: openbsd (mostly)
MFC after: 1 month


# 97177 23-May-2002 ume

In m_aux_delete, no need to chase beyond victim.

Submitted by: archie
Obtained from: KAME


# 95023 19-Apr-2002 suz

just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by: ume
MFC after: 1 week


# 92723 19-Mar-2002 alfred

Remove __P.


# 78109 11-Jun-2001 ume

This is force commit to mention about previous commit.

- add a pointer to struct mauxtag. two integer was too restrictive
- have m_aux_{add,find}2.
- make sure to return non-cluster on m_pulldown(). this is safer
(but of course less performant) when we have non-loopback L2 code
which throws the mbuf back to input path, like L2 bridging or some
multicast handling code.


# 78064 11-Jun-2001 ume

Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
- The definitions of SADB_* in sys/net/pfkeyv2.h are still different
from RFC2407/IANA assignment because of binary compatibility
issue. It should be fixed under 5-CURRENT.
- ip6po_m member of struct ip6_pktopts is no longer used. But, it
is still there because of binary compatibility issue. It should
be removed under 5-CURRENT.

Reviewed by: itojun
Obtained from: KAME
MFC after: 3 weeks


# 76166 01-May-2001 markm

Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by: bde (with reservations)


# 72356 11-Feb-2001 bmilekic

Long awaited style fixup in mbuf code. Get rid of K&R style prototyping
and function argument declarations. Make sure that functions that are
supposed to return a pointer return NULL in case of failure. Don't cast
NULL. Finally, get rid of annoying `register' uses.


# 68618 11-Nov-2000 bmilekic

* Have m_pulldown() use the new M_WRITABLE() macro in order to determine
whether the given ext_buf is shared.

* Have the sf_bufs be setup with the mbuf subsystem using MEXTADD() with the
two new arguments.

Note: m_pulldown() is somewhat crotchy; the added comment explains the
situation.

Reviewed by: jlemon


# 67882 29-Oct-2000 phk

Remove unneeded #include <sys/proc.h> lines.


# 64837 19-Aug-2000 dwmalone

Replace the mbuf external reference counting code with something
that should be better.

The old code counted references to mbuf clusters by using the offset
of the cluster from the start of memory allocated for mbufs and
clusters as an index into an array of chars, which did the reference
counting. If the external storage was not a cluster then reference
counting had to be done by the code using that external storage.

NetBSD's system of linked lists of mbufs was cosidered, but Alfred
felt it would have locking issues when the kernel was made more
SMP friendly.

The system implimented uses a pool of unions to track external
storage. The union contains an int for counting the references and
a pointer for forming a free list. The reference counts are
incremented and decremented atomically and so should be SMP friendly.
This system can track reference counts for any sort of external
storage.

Access to the reference counting stuff is now through macros defined
in mbuf.h, so it should be easier to make changes to the system in
the future.

The possibility of storing the reference count in one of the
referencing mbufs was considered, but was rejected 'cos it would
often leave extra mbufs allocated. Storing the reference count in
the cluster was also considered, but because the external storage
may not be a cluster this isn't an option.

The size of the pool of reference counters is available in the
stats provided by "netstat -m".

PR: 19866
Submitted by: Bosko Milekic <bmilekic@dsuper.net>
Reviewed by: alfred (glanced at by others on -net)


# 63024 12-Jul-2000 itojun

remove m_pulldown statistics, which is highly experimental and does not
belong to *bsd-merged tree


# 62587 04-Jul-2000 itojun

sync with kame tree as of july00. tons of bug fixes/improvements.

API changes:
- additional IPv6 ioctls
- IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8).
(also syntax change)