History log of /freebsd-10-stable/sys/netinet/ip_output.c
Revision Date Author Comments
# 284496 17-Jun-2015 hselasky

MFC r280991:
Extend fixes made in r278103 and r38754 by copying the complete packet
header and not only partial flags and fields. Firewalls can attach
classification tags to the outgoing mbufs which should be copied to
all the new fragments. Else only the first fragment will be let
through by the firewall. This can easily be tested by sending a large
ping packet through a firewall. It was also discovered that VLAN
related flags and fields should be copied for packets traversing
through VLANs. This is all handled by "m_dup_pkthdr()".

Regarding the MAC policy check in ip_fragment(), the tag provided by
the originating mbuf is copied instead of using the default one
provided by m_gethdr().

Tested by: Karim Fodil-Lemelin <fodillemlinkarim at gmail.com>
Sponsored by: Mellanox Technologies
PR: 7802


# 281955 24-Apr-2015 hiren

MFC r275358 r275483 r276982 - Removing M_FLOWID by hps@

r275358:
Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

r275483:
Remove M_FLOWID from SCTP code.

r276982:
Remove no longer used "M_FLOWID" flag from mbuf.h and update the netisr
manpage.

Note: The FreeBSD version has been bumped.

Reviewed by: hps, tuexen
Sponsored by: Limelight Networks


# 280552 25-Mar-2015 hselasky

MFC r279281:
Fix a special case in ip_fragment() to produce a more sensible chain
of packets. When the data payload length excluding any headers, of an
outgoing IPv4 packet exceeds PAGE_SIZE bytes, a special case in
ip_fragment() can kick in to optimise the outgoing payload(s). The
code which was added in r98849 as part of zero copy socket support
assumes that the beginning of any MTU sized payload is aligned to
where a MBUF's "m_data" pointer points. This is not always the case
and can sometimes cause large IPv4 packets, as part of ping replies,
to be split more than needed.

Instead of iterating the MBUFs to figure out how much data is in the
current chain, use the value already in the "m_pkthdr.len" field of
the first MBUF in the chain.

Reviewed by: ken @
Differential Revision: https://reviews.freebsd.org/D1893
Sponsored by: Mellanox Technologies


# 278514 10-Feb-2015 hselasky

Append to the MFC of r278103 that we also pass along the M_FLOWID flag.

Sponsored by: Mellanox Technologies


# 278511 10-Feb-2015 hselasky

MFC r278103:
The flowid and hashtype should be copied from the originating packet
when fragmenting IP packets to preserve the order of the packets in a
stream. Else the resulting fragments can be sent out of order when the
hardware supports multiple transmit rings.

Sponsored by: Mellanox Technologies


# 268956 21-Jul-2014 np

MFC r268450 (by glebius). The leak affects stable/10 too.

In several cases in ip_output() we obtain reference on ifa. Do not
leak it.


# 267736 22-Jun-2014 tuexen

MFC r265691:

For some UDP packets (for example with 200 byte payload) and IP options,
the IP header and the UDP header are not in the same mbuf.
Add code to in_delayed_cksum() to deal with this case.

MFC r265713:

Use KASSERTs as suggested by glebius@


# 263478 21-Mar-2014 glebius

Merge r262763, r262767, r262771, r262806 from head:
- Remove rt_metrics_lite and simply put its members into rtentry.
- Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This
removes another cache trashing ++ from packet forwarding path.
- Create zini/fini methods for the rtentry UMA zone. Via initialize
mutex and counter in them.
- Fix reporting of rmx_pksent to routing socket.
- Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.


# 263334 19-Mar-2014 glebius

Merge r262747: remove extraneous ifa_ref()/ifa_free().


# 262743 04-Mar-2014 glebius

Merge r261582, r261601, r261610, r261613, r261627, r261640, r261641, r261823,
r261825, r261859, r261875, r261883, r261911, r262027, r262028, r262029,
r262030, r262162 from head.

Large flowtable revamp. See commit messages for merged revisions for
details.

Sponsored by: Netflix


# 261545 06-Feb-2014 ae

MFC r260702 (by melifaro):
Fix ipfw fwd for IPv4 traffic broken by r249894.

Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.


# 260319 05-Jan-2014 glebius

Merge r260188 from head:
Fix regression from r249894. Now we pass "gw" as argument to if_output
method, thus for multicast case we need it to point at "dst".

PR: 185395


# 284496 17-Jun-2015 hselasky

MFC r280991:
Extend fixes made in r278103 and r38754 by copying the complete packet
header and not only partial flags and fields. Firewalls can attach
classification tags to the outgoing mbufs which should be copied to
all the new fragments. Else only the first fragment will be let
through by the firewall. This can easily be tested by sending a large
ping packet through a firewall. It was also discovered that VLAN
related flags and fields should be copied for packets traversing
through VLANs. This is all handled by "m_dup_pkthdr()".

Regarding the MAC policy check in ip_fragment(), the tag provided by
the originating mbuf is copied instead of using the default one
provided by m_gethdr().

Tested by: Karim Fodil-Lemelin <fodillemlinkarim at gmail.com>
Sponsored by: Mellanox Technologies
PR: 7802


# 281955 24-Apr-2015 hiren

MFC r275358 r275483 r276982 - Removing M_FLOWID by hps@

r275358:
Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

r275483:
Remove M_FLOWID from SCTP code.

r276982:
Remove no longer used "M_FLOWID" flag from mbuf.h and update the netisr
manpage.

Note: The FreeBSD version has been bumped.

Reviewed by: hps, tuexen
Sponsored by: Limelight Networks


# 280552 25-Mar-2015 hselasky

MFC r279281:
Fix a special case in ip_fragment() to produce a more sensible chain
of packets. When the data payload length excluding any headers, of an
outgoing IPv4 packet exceeds PAGE_SIZE bytes, a special case in
ip_fragment() can kick in to optimise the outgoing payload(s). The
code which was added in r98849 as part of zero copy socket support
assumes that the beginning of any MTU sized payload is aligned to
where a MBUF's "m_data" pointer points. This is not always the case
and can sometimes cause large IPv4 packets, as part of ping replies,
to be split more than needed.

Instead of iterating the MBUFs to figure out how much data is in the
current chain, use the value already in the "m_pkthdr.len" field of
the first MBUF in the chain.

Reviewed by: ken @
Differential Revision: https://reviews.freebsd.org/D1893
Sponsored by: Mellanox Technologies


# 278514 10-Feb-2015 hselasky

Append to the MFC of r278103 that we also pass along the M_FLOWID flag.

Sponsored by: Mellanox Technologies


# 278511 10-Feb-2015 hselasky

MFC r278103:
The flowid and hashtype should be copied from the originating packet
when fragmenting IP packets to preserve the order of the packets in a
stream. Else the resulting fragments can be sent out of order when the
hardware supports multiple transmit rings.

Sponsored by: Mellanox Technologies


# 268956 21-Jul-2014 np

MFC r268450 (by glebius). The leak affects stable/10 too.

In several cases in ip_output() we obtain reference on ifa. Do not
leak it.


# 267736 22-Jun-2014 tuexen

MFC r265691:

For some UDP packets (for example with 200 byte payload) and IP options,
the IP header and the UDP header are not in the same mbuf.
Add code to in_delayed_cksum() to deal with this case.

MFC r265713:

Use KASSERTs as suggested by glebius@


# 263478 21-Mar-2014 glebius

Merge r262763, r262767, r262771, r262806 from head:
- Remove rt_metrics_lite and simply put its members into rtentry.
- Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This
removes another cache trashing ++ from packet forwarding path.
- Create zini/fini methods for the rtentry UMA zone. Via initialize
mutex and counter in them.
- Fix reporting of rmx_pksent to routing socket.
- Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.


# 263334 19-Mar-2014 glebius

Merge r262747: remove extraneous ifa_ref()/ifa_free().


# 262743 04-Mar-2014 glebius

Merge r261582, r261601, r261610, r261613, r261627, r261640, r261641, r261823,
r261825, r261859, r261875, r261883, r261911, r262027, r262028, r262029,
r262030, r262162 from head.

Large flowtable revamp. See commit messages for merged revisions for
details.

Sponsored by: Netflix


# 261545 06-Feb-2014 ae

MFC r260702 (by melifaro):
Fix ipfw fwd for IPv4 traffic broken by r249894.

Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.


# 260319 05-Jan-2014 glebius

Merge r260188 from head:
Fix regression from r249894. Now we pass "gw" as argument to if_output
method, thus for multicast case we need it to point at "dst".

PR: 185395