History log of /freebsd-10-stable/sys/dev/cxgbe/
Revision Date Author Comments
353418 10-Oct-2019 np

MFC r319872, r321063, r321582, r322034, r322425, r322962, r322985,
r325596, r326026, r328420, r331472, r333276, r333650, r333652, r334406,
r334409-r334410, r334489, r336042, r340651, r342603, and r345083.

This updates the cxgbe firmwares in stable/10 and also pulls in support
for some newer boards and flash parts.

r319872:
cxgbe(4): Do not request an FEC setting that the port does not support.

r321063:
cxgbe(4): Various link/media related improvements.

r321582:
cxgbe(4): Some updates to the common code.

r322034:
cxgbe(4): Always use the first and not the last virtual interface
associated with a port in begin_synchronized_op.

r322425:
cxgbe(4): Save the last reported link parameters and compare them with
the current state to determine whether to generate a link-state change
notification. This fixes a bug introduced in r321063 that caused the
driver to sometimes skip these notifications.
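
The save-and-compare idea in r322425 can be modeled with a small sketch. This is an illustration only, not the driver's code: LinkParams, PortModel, and handle_link_update are hypothetical names, and the three fields are an assumed subset of what a port reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkParams:
    """Hypothetical subset of the parameters a port reports."""
    link_up: bool
    speed_mbps: int
    fec: str


class PortModel:
    """Toy model of a port that remembers what it last reported."""

    def __init__(self):
        self.last_reported = None   # nothing reported yet
        self.notifications = []     # stands in for link-state change events

    def handle_link_update(self, current):
        # Notify only when the current state differs from the saved one.
        if current == self.last_reported:
            return False
        self.last_reported = current
        self.notifications.append(current)
        return True
```

Comparing against the saved parameters, rather than against a single up/down flag, means a change in speed or FEC alone is also treated as a reportable change in this model.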

r322962:
cxgbe(4): Remove write only variable from t4_port_init.

r322985:
cxgbe(4): Maintain one ifmedia per physical port instead of one per
Virtual Interface (VI). All autonomous VIs that share a port share the
same media.

r325596:
cxgbe(4): Do not request settings not supported by the port.

r326026:
cxgbe(4): Add a custom board to the device id list.

r328420:
cxgbe(4): Do not display harmless warning in non-debug builds.

r331472:
cxgbe(4): Always initialize requested_speed to a valid value.

This fixes an avoidable EINVAL when the user tries to disable AN after
the port is initialized but l1cfg doesn't have a valid speed to use.

r333276:
cxgbe(4): Update all firmwares to 1.19.1.0.

r333650:
cxgbe(4): Claim some more T5 and T6 boards.

r333652:
cxgbe(4): Add support for two more flash parts.

r334406:
cxgbe(4): Consider all supported speeds when building the ifmedia list
for a port. Fix other related issues while here:
- Require port lock for access to link_config.
- Allow 100Mbps operation by tracking the speed in Mbps. Yes, really.
- New port flag to indicate that the media list is immutable. It will
be used in future refinements.

This also fixes a bug where the driver reports incorrect media with
recent firmwares.

r334409:
cxgbe(4): Implement ifm_change callback.

r334410:
cxgbe(4): Use ifm for ifmedia just like the rest of the kernel.

No functional change.

r334489:
cxgbe(4): Include full duplex mediaopt in media that can be reported as
active. Always report full duplex in active media.

r336042:
cxgbe(4): Assume that any unknown flash on the card is 4MB and has 64KB
sectors, instead of refusing to attach to the card.

r340651:
cxgbe(4): Update T4/5/6 firmwares to 1.22.0.3.

r342603:
cxgbe(4): Attach to two T540 variants.

r345083:
cxgbe(4): Update T4/5/6 firmwares to 1.23.0.0.

342583 29-Dec-2018 jhb

MFC 340304: Use tcp_state_change() in the cxgbe(4) TOE module.

r254889 added tcp_state_change() as a centralized place to log state
changes in TCP connections for DTrace. r294869 and r296881 took
advantage of this central location to manage per-state counters.
However, TOE sockets were still performing some (but not all) state
change updates via direct assignments to t_state. This resulted in
state counters underflowing when TOE was in use. Fix by using
tcp_state_change() when changing a TOE connection's state.
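
A toy model of the counter problem described above, assuming one counter per TCP state. The names Conn, state_change, mixed_updates, and consistent_updates are hypothetical and only mimic the roles of t_state and tcp_state_change(). In Python the count goes negative; with an unsigned counter the same mistake would show up as a wrap to a very large value.

```python
from collections import Counter

CLOSED, SYN_SENT, ESTABLISHED = "CLOSED", "SYN_SENT", "ESTABLISHED"


class Conn:
    def __init__(self, counters):
        self.counters = counters
        self.t_state = CLOSED
        counters[CLOSED] += 1


def state_change(conn, new_state):
    """Central helper: move the connection and keep the counters in step."""
    conn.counters[conn.t_state] -= 1
    conn.counters[new_state] += 1
    conn.t_state = new_state


def mixed_updates(counters):
    """Some transitions bypass the helper, as described for TOE sockets."""
    c = Conn(counters)
    c.t_state = SYN_SENT            # direct assignment: counters not updated
    state_change(c, ESTABLISHED)    # decrements SYN_SENT, never incremented


def consistent_updates(counters):
    """Every transition goes through the helper."""
    c = Conn(counters)
    state_change(c, SYN_SENT)
    state_change(c, ESTABLISHED)
```

Running mixed_updates on an empty Counter leaves SYN_SENT at -1 and CLOSED stuck at 1, while consistent_updates leaves only ESTABLISHED at 1.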

331719 29-Mar-2018 np

MFC r323006 and r324386.

This brings the cxgbe(4) firmware up to 1.16.63.0.

Sponsored by: Chelsio Communications

331647 27-Mar-2018 jhb

MFC 318387: Add support for child devices that aren't ports.

Invoke any identify routines of child drivers during attach before attaching
children, and delete any remaining devices after deleting ports.

Sponsored by: Chelsio Communications

330303 03-Mar-2018 jhb

MFC 328608: Export tcp_always_keepalive for use by the Chelsio TOM module.

This used to work by accident with ld.bfd even though always_keepalive
was marked as static. LLD honors static more correctly, so export this
variable properly (including moving it into the tcp_* namespace).

Relative to HEAD the MFC includes two additional changes:
- The t3_tom module used for cxgb(4) is also patched.
- A strong reference from the new name (tcp_always_keepalive) to the old
name (always_keepalive) has been added to preserve the KBI for existing
modules.

Suggested by: kib (strong reference)
Sponsored by: Chelsio Communications

325611 09-Nov-2017 hselasky

MFC r324792:
The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device is present. This resolves
interoperability issues between iWarp and RoCE in ibcore.

Reviewed by: np @
Differential Revision: https://reviews.freebsd.org/D12563
Sponsored by: Mellanox Technologies

324685 17-Oct-2017 hselasky

MFC r289568, r300676, r300677, r300719, r300720 and r300721:
Implement LinuxKPI module parameters as SYSCTLs.

The bool module parameter is no longer supported, because there is no
equivalent in FreeBSD 10-stable. These are converted into "int" type.

There are two macros available which control the behaviour of the
LinuxKPI module parameters:

- LINUXKPI_PARAM_PARENT allows the consumer to set the SYSCTL parent
where the module parameters will be created.

- LINUXKPI_PARAM_PREFIX defines a parameter name prefix, which is
added to all created module parameters.

The LinuxKPI module parameters also have a permissions value.
If any write bits are set we are allowed to modify the module
parameter at runtime. Reflect this when creating the static SYSCTL
nodes.

The module_param_call() function is no longer supported.

Sponsored by: Mellanox Technologies

319272 31-May-2017 np

MFC r318774:

cxgbe/iw_cxgbe: sodisconnect failures are harmless and should not be
treated as fatal errors.

Sponsored by: Chelsio Communications

319270 31-May-2017 np

MFC r318762:

cxgbe(4): Update the T4, T5, and T6 firmwares to 1.16.45.0.

The latest firmware has a number of link related fixes, support for a
new custom card, and the fix for a bug that affected rate limiting on
FreeBSD.

Relnotes: Yes
Sponsored by: Chelsio Communications

318855 25-May-2017 np

MFC r318014, r318091, r318125, and r318263.

r318014:
cxgbe(4): Fixes related to the knob that controls link autonegotiation.

- Do not leak the adapter lock in sysctl_autoneg.
- Accept only 0 or 1 as valid settings for autonegotiation.
- A fixed speed must be requested by the driver when autonegotiation is
disabled otherwise the firmware will reject the l1cfg command. Use
the top speed supported by the port for now.
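
The validation rules listed for r318014 can be sketched as follows. This is a simplified model, not the sysctl handler itself: set_autoneg and its return tuple are hypothetical, and locking is left out entirely.

```python
import errno


def set_autoneg(supported_speeds_mbps, value):
    """Return (error, autoneg, requested_speed_mbps) for a requested setting."""
    if value not in (0, 1):
        return errno.EINVAL, None, None     # only 0 or 1 are accepted
    if value == 1:
        return 0, True, None                # speed is left to negotiation
    # Autonegotiation off: a fixed speed has to accompany the request,
    # so pick the top speed the port supports.
    return 0, False, max(supported_speeds_mbps)
```

For a port that supports 1000 and 10000 Mbps, set_autoneg([1000, 10000], 0) selects 10000 as the fixed speed in this model.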

r318091:
cxgbe(4): Do not assume that if_qflush is always followed by interface-down.

r318125:
Adjust whitespace and fix a comment. No functional change.

r318263:
cxgbe(4): netmap-only interrupts for a VI do not have an associated rxq
or ofld_rxq and should be ignored by vi_intr_iq.

Sponsored by: Chelsio Communications

318851 25-May-2017 np

MFC r317702, r317847, r318307

r317702:
cxgbe(4): Support routines for Tx traffic scheduling.

- Create a new file, t4_sched.c, and move all of the code related to
traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl)
parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default
rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.

r317847:
cxgbe(4): The Tx scheduler initialization either works or doesn't. It
doesn't need a refresh in either case.

r318307:
cxgbe(4): Avoid an out of bounds access when an attempt to unbind a tx
queue from a traffic class fails.

Sponsored by: Chelsio Communications

318844 25-May-2017 np

MFC r317820 and r317837.

r317820:
cxgbe(4): Update the list of PCIe devices claimed by the driver. At
this point any board with a T6 should just work.

r317837:
cxgbe(4): Update the VF device ids too. This should have been part
of r317820.

Sponsored by: Chelsio Communications

318840 25-May-2017 np

MFC r316971:

cxgbe: Add a tunable to configure the SGE time scaler, which is
available starting with T6. The values in the timer holdoff registers
are multiplied by the scaling factor before use.

dev.<nexus>.<n>.holdoff_timers shows the final values of the
timers in microseconds.

Sponsored by: Chelsio Communications

318838 24-May-2017 np

MFC r316506:

cxgbe(4): Program the global RSS key once instead of once per ifnet.

Sponsored by: Chelsio Communications

318836 24-May-2017 np

MFC r316172:

cxgbe: Don't call t4_edc_err_read for errors not related to the EDCs.

Sponsored by: Chelsio Communications

318826 24-May-2017 np

MFC r309725:

cxgbe(4): netmap does not set IFCAP_NETMAP in an ifnet's if_capabilities
any more (since r307394). Do it in the driver instead.

318809 24-May-2017 np

MFC r313318:

cxgbe(4): Allow tunables that control the number of queues to be set to
'-n' to tell the driver to create _up to_ 'n' queues if enough cores are
available. For example, setting hw.cxgbe.nrxq10g="-32" will result in
16 queues if the system has 16 cores, 32 if it has 32.

There is no change in the default number of queues of any type.
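
The '-n' semantics can be written down as a one-line rule. The sketch below is only a model of the description above: resolve_nqueues is a hypothetical name, and passing positive values through unchanged is an assumption of the sketch.

```python
def resolve_nqueues(tunable, ncores):
    """Interpret a queue-count tunable; negative means 'up to that many'."""
    if tunable < 0:
        # "-n": create up to n queues, limited by the cores available.
        return min(-tunable, ncores)
    # Assumption: a positive value is used as given.
    return tunable
```

With a tunable of -32 this gives 16 on a 16-core system and 32 on a 32-core system, matching the example in the commit message.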

Sponsored by: Chelsio Communications

318804 24-May-2017 np

MFC r313346:

cxgbe/t4_tom: Fix CLIP entry refcounting on the passive side. Every
IPv6 connection being handled by the TOE should have a reference on its
CLIP entry.

Sponsored by: Chelsio Communications

318799 24-May-2017 np

MFC r311880, r314167, r316118, r316571, r316573, r316580, r316936-r316937,
r316940, and r317410.

r311880:
The iw_cxgb and iw_cxgbe drivers should not use a FreeBSD device_t where
a linuxkpi style device is expected. If OFED/linuxkpi actually starts
using this field then we'll have to figure out whether to create fake
devices for these drivers or have linuxkpi deal with NULL device.

This mismatch was first reported as part of D6585.

r314167:
cxgbe/iw_cxgbe: Minor changes for T6.

r316118:
cxgbe/iw_cxgbe: T6 has no limit on the amount of memory that can be
registered in one ib_reg_phys_mr.

r316571:
cxgbe/iw_cxgbe: Remove bad cast that resulted in incorrect length for
memory regions larger than 4GB.

r316573:
cxgbe/iw_cxgbe: Replace a magic constant with something more readable
(and accurate).

T4 and later have an extra bit for page shift so the maximum page size
is 8TB (shift of 12 + 31) instead of 128MB (12 + 15). This saves space
in the chip's PBL (physical buffer list) when registering very large
memory regions.
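
The arithmetic behind the 8TB and 128MB figures can be checked with a short sketch. The 4-bit and 5-bit field widths are inferred from the maximum shifts of 15 and 31 quoted above; max_page_size and BASE_PAGE_SHIFT are hypothetical names.

```python
BASE_PAGE_SHIFT = 12  # 4KB base page, the "12" in the text above


def max_page_size(shift_field_bits):
    """Largest page size when the extra shift field has the given width."""
    max_extra_shift = (1 << shift_field_bits) - 1
    return 1 << (BASE_PAGE_SHIFT + max_extra_shift)
```

A 4-bit field tops out at a shift of 12 + 15 (2^27 bytes, 128MB) and a 5-bit field at 12 + 31 (2^43 bytes, 8TB).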

r316580:
cxgbe/iw_cxgbe: Remove another bad cast. This should have been
included in r316571.

r316936:
cxgbe/iw_cxgbe: hw supports 64K (not 32K) Protection Domains.

r316937:
cxgbe/iw_cxgbe: Report accurate page_size_cap in ib_query_device.

r316940:
cxgbe/iw_cxgbe: Report the actual values of various parameters as
configured by the firmware.

r317410:
cxgbe/iw_cxgbe: Pull in some updates to c4iw_wait_for_reply from the
iw_cxgb4 Linux driver.

318797 24-May-2017 np

MFC r316774:

cxgbe: Query some more RDMA related parameters from the firmware.

Sponsored by: Chelsio Communications

318775 24-May-2017 np

MFC r311846:
cxgbe(4): Refresh t4_msg.h, mainly for definitions related to the crypto
engine.

316123 29-Mar-2017 np

MFC r315201, r315920, r315921, r315922, r316008, and r316062.

r315201:
cxgbe(4): Fix an always-true assertion (reported by PVS-Studio).

sys/dev/cxgbe/t4_main.c: PVS-Studio: Expression is Always True (CWE-571) (3)

r315920:
cxgbe/iw_cxgbe: c4iw_connect should always return a negative errno on failure.

r315921:

cxgbe/iw_cxgbe: alloc_ep expects a gfp_t, and it's always ok to sleep during
alloc_ep.

r315922:
cxgbe/iw_cxgbe: allocations that use GFP_KERNEL (which is M_WAITOK on
FreeBSD) cannot fail.

r316008:
cxgbe/iw_cxgbe: Remove unused code.

r316062:
cxgbe/iw_cxgbe: Defer the handling of error CQEs and RDMA_TERMINATE to
the thread that deals with socket state changes. This eliminates
various bad races with the ithread.

315868 23-Mar-2017 np

MFC r314814 and r315325.

r314814:
cxgbe/iw_cxgbe: Abort connection if there is an error during c4iw_modify_qp.

r315325:
cxgbe/iw_cxgbe: Use the socket and not the toepcb to reach for the
inpcb. t4_tom detaches the inpcb from the toepcb as soon as the
hardware is done with the connection (in final_cpl_received) but the
socket is around as long as the cm_id and the rest of iWARP state is.

This fixes an intermittent NULL dereference during abort.

314776 06-Mar-2017 np

MFC r314509 and r314578.

r314509:
cxgbe/iw_cxgbe: Do not check the size of the memory region being
registered. T4/5/6 have no internal limit on this size. This is
probably a copy paste from the T3 iw_cxgb driver.

r314578:
cxgbe/iw_cxgbe: Implement sq/rq drain operation.

ULPs can set a qp's state to ERROR and then post a work request on the
sq and/or rq. When the reply for that work request comes back it is
guaranteed that all previous work requests posted on that queue have
been drained.

Sponsored by: Chelsio Communications

314606 03-Mar-2017 np

MFC r314400:

cxgbe/iw_cxgbe: fix various double-close panics with iWARP sockets.

Sockets representing the TCP endpoints for iWARP connections are
allocated by the ibcore module. Before this revision they were closed
either by the ibcore module or the iw_cxgbe hardware driver depending on
the state transitions during connection teardown. This is error prone
and there were cases where both iw_cxgbe and ibcore closed the socket
leading to double-free panics. The fix is to let ibcore close the
sockets it creates and never do it in the driver.

- Use sodisconnect instead of soclose (preceded by solinger = 0) in the
driver to tear down an RDMA connection abruptly. This does what's
intended without releasing the socket's fd reference.

- Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This
works for all kinds of sockets: clients that initiate connections,
listeners, and sockets accepted off of listeners.

Sponsored by: Chelsio Communications

313178 03-Feb-2017 jhb

MFC 312906:
Unregister CPL handlers for TOE-related messages when unloading TOM.

Sponsored by: Chelsio Communications

312525 20-Jan-2017 np

MFC r312368:
cxgbe/tom: Fix a case where do_pass_accept_req wasn't properly restoring
the VNET.

312337 17-Jan-2017 np

Fix mismerge in r312117. This is a direct commit to stable/10.

312188 14-Jan-2017 np

MFC r311848:
cxgbe(4): Attach to the 2x25 debug card. This is for internal use only.

312186 14-Jan-2017 np

MFC r311831 and r311832.

r311831:
cxgbe(4): The wraparound logic in start_wrq_wr() should not get involved
in work requests that end at the end of the descriptor ring, even though
the pidx wraps around to 0.
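
One way to picture the boundary case in r311831 is the sketch below. It is a guess at the shape of the condition based only on the description above, not the logic of start_wrq_wr(); place_wr and its parameters are hypothetical.

```python
def place_wr(ring_size, pidx, ndesc):
    """Return (new_pidx, needs_wrap_handling) for a work request.

    The request occupies ndesc descriptors starting at pidx.
    """
    end = pidx + ndesc
    # Wrap handling is needed only if data spills past the last slot.
    # A request that ends exactly at the end of the ring does not spill,
    # even though the producer index returns to 0.
    needs_wrap = end > ring_size
    return end % ring_size, needs_wrap
```

In a 64-entry ring, a 4-descriptor request at pidx 60 yields a new pidx of 0 with no wrap handling, while a 6-descriptor request at the same position does need it.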

r311832:
cxgbe(4): Enable automatic cidx flush for all control queues.

312117 14-Jan-2017 np

MFC r311569, r311657, and r311949.

r311569:
Fix comment in t4_tom. No functional change.

r311657:
cxgbe/t4_tom: Fix tid accounting. An offloaded IPv6 connection uses 2
tids, not 1, in the hardware.

r311949:
cxgbe/tom: Add VIMAGE support to the TOE driver.

Active Open:
- Save the socket's vnet at the time of the active open (t4_connect) and
switch to it when processing the reply (do_act_open_rpl or
do_act_establish).

Passive Open:
- Save the listening socket's vnet in the driver's listen_ctx and switch
to it when processing incoming SYNs for the socket.
- Reject SYNs that arrive on an ifnet that's not in the same vnet as the
listening socket.

CLIP (Compressed Local IPv6) table:
- Add only those IPv6 addresses to the CLIP that are in a vnet
associated with one of the card's ifnets.

Misc:
- Set vnet from the toepcb when processing TCP state transitions.
- The kernel sets the vnet when calling the driver's output routine
so t4_push_frames runs in proper vnet context already. One exception
is when incoming credits trigger tx within the driver's ithread. Set
the vnet explicitly in do_fw4_ack for that case.

Sponsored by: Chelsio Communications

311507 06-Jan-2017 np

MFC r310151 and r311173.

r310151:
cxgbe(4): Changes to the default T6 firmware configuration file.

- Disable features that are not supported or not used on FreeBSD.
- Increase the RSS table slice per interface.
- Increase the share of the TCAM reserved for filtering.

r311173:
cxgbe(4): Update T4, T5 and T6 firmwares to 1.16.26.0.

Sponsored by: Chelsio Communications

311261 04-Jan-2017 np

MFC r309666, r310033, r310049, r310100, r310152, and r310807.

r309666:
cxgbe(4): unsigned short isn't large enough to store link speed (which
is in Mbps) for 100Gbps links.
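
The overflow is easy to verify, assuming a 16-bit unsigned short. The helper names below are hypothetical and the sketch only demonstrates the arithmetic.

```python
USHRT_MAX = 0xFFFF  # largest value of a 16-bit unsigned short (assumed width)


def fits_in_ushort(speed_mbps):
    """True if the speed can be stored without loss in 16 unsigned bits."""
    return 0 <= speed_mbps <= USHRT_MAX


def truncated_to_ushort(speed_mbps):
    """Value that a 16-bit unsigned field would hold after truncation."""
    return speed_mbps & USHRT_MAX
```

A 40Gbps link (40000 Mbps) fits, but 100Gbps (100000 Mbps) exceeds 65535 and would be truncated to 34464.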

r310033:
cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8.

r310049:
cxgbe(4): Fix the tid range shown for T6 cards in misc.tids.

r310100:
cxgbe(4): Deal with compressed error vectors.

r310152:
cxgbe(4): Fix typo in an unused macro.

r310807:
cxgbe(4): Updates to link configuration.

- Update struct link_settings and associated shared code.

- Add tunables to control FEC and autonegotiation. All ports inherit
these values as their initial settings.
hw.cxgbe.fec
hw.cxgbe.autoneg

- Add per-port sysctls to control FEC and autonegotiation. These can be
modified at any time.
dev.<port>.<n>.fec
dev.<port>.<n>.autoneg

309724 09-Dec-2016 jhb

MFC 309613: cxgbe(4): Update firmwares from version 1.16.12.0 to 1.16.22.0.

Sponsored by: Chelsio Communications

309580 06-Dec-2016 jhb

MFC 308066: cxgbe(4): Accurate statistics for all chip settings.

There are 4 independent knobs in T5+ chips to include or exclude PAUSE
frames from the "total frames" and "multicast frames" counters in either
direction. This change lets the driver deal with any combination of
these settings.

309579 05-Dec-2016 jhb

MFC 307876:
cxgbe(4): Fix bug in the calculation of the number of physically
contiguous regions in an mbuf chain.

If the payload of an mbuf ends at a page boundary count_mbuf_nsegs would
incorrectly consider the next mbuf's payload physically contiguous based
solely on a KVA comparison.
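
The difference between comparing virtual and physical addresses can be shown with a toy model. Everything here is hypothetical: buffers are (kva, length) pairs, each buffer is assumed to lie within a single page, and page_map stands in for a virtual-to-physical translation. It does not reproduce count_mbuf_nsegs.

```python
PAGE_SIZE = 4096


def vtophys(page_map, kva):
    """Translate a virtual address using a {virtual page: physical page} map."""
    return page_map[kva // PAGE_SIZE] * PAGE_SIZE + kva % PAGE_SIZE


def count_nsegs_by_kva(bufs):
    """Flawed variant: merges buffers whenever their virtual addresses abut."""
    nsegs, prev_end = 0, None
    for kva, length in bufs:
        if kva != prev_end:
            nsegs += 1
        prev_end = kva + length
    return nsegs


def count_nsegs_by_pa(bufs, page_map):
    """Merges buffers only when their physical addresses abut."""
    nsegs, prev_end_pa = 0, None
    for kva, length in bufs:
        if vtophys(page_map, kva) != prev_end_pa:
            nsegs += 1
        # Physical address one past the last byte of this buffer.
        prev_end_pa = vtophys(page_map, kva + length - 1) + 1
    return nsegs
```

If one buffer ends exactly at a page boundary and the next starts on the following virtual page, the KVA-only count is 1 even when the two pages are physically far apart; the physical comparison reports 2 in that case.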

309578 05-Dec-2016 jhb

MFC 307759: cxgbe(4): Dump any mailbox command that times out.

309575 05-Dec-2016 jhb

MFC 307233:
cxgbe(4): Allow the interface MTU to be set as high as the actual
hardware limit.

309569 05-Dec-2016 jhb

MFC 306821,306823: Permit updating firmware config file in flash.

306821:
cxgbe(4): Add an ioctl to copy a firmware config file to the card's flash.

306823:
cxgbetool: Add a loadcfg subcommand to allow a user to upload a firmware
configuration file to the card.

309564 05-Dec-2016 jhb

MFC 306277:
cxgbe(4): Make the location/length of all descriptor rings available in
the sysctl MIB.

309560 05-Dec-2016 jhb

MFC 305695,305696,305699,305702,305703,305713,305715,305827,305852,305906,
305908,306062,306063,306137,306138,306206,306216,306273,306295,306301,
306465,309302:
Add support for adapters using the Terminator T6 ASIC.

305695:
cxgbe(4): Set up fl_starve_threshold2 accurately for T6.

305696:
cxgbe(4): Use correct macro for header length with T6 ASICs. This
affects the transmit of the VF driver only.

305699:
cxgbe(4): Update the pad_boundary calculation for T6, which has a
different range of boundaries.

305702:
cxgbe(4): Use smaller min/max bursts for fl descriptors with a T6.

305703:
cxgbe(4): Deal with the slightly different SGE_STAT_CFG in T6.

305713:
cxgbe(4): Add support for additional port types and link speeds.

305715:
cxgbe(4): Catch up with the rename of tlscaps -> cryptocaps. TLS is one
of the capabilities of the crypto engine in T6.

305827:
cxgbe(4): Use the interface's viid to calculate the PF/VF/VFValid fields
to use in tx work requests.

305852:
cxgbe(4): Attach to cards with the Terminator 6 ASIC. T6 cards will
come up as 't6nex' nexus devices with 'cc' ports hanging off them.

The T6 firmware and configuration files will be added as soon as they
are released. For now the driver will try to work with whatever
firmware and configuration is on the card's flash.

305906:
cxgbe/t4_tom: The SMAC entry for a VI is at a different location in the T6.

305908:
cxgbe/t4_tom: Update the active/passive open code to support T6. Data
path works as-is.

306062:
cxgbe(4): Show wcwr_stats for T6 cards.

306063:
cxgbe(4): Setup congestion response for T6 rx queues.

306137:
cxgbetool: Add T6 support to the SGE context decoder.

306138:
Fix typo.

306206:
cxgbe(4): Catch up with the different layout of WHOAMI in T6.

Note that the code moved below t4_prep_adapter() as part of this change
because now it needs a working chip_id().

306216:
cxgbe(4): Fix the output of the "tids" sysctl on T6.

306273:
cxgbe(4): Fix netmap with T6, which doesn't encapsulate SGE_EGR_UPDATE
message inside a FW_MSG. The base NIC already deals with updates in
either form.

306295:
cxgbe(4): Support SIOCGIFXMEDIA so that ifconfig displays correct media
for 25Gbps and 100Gbps ports. This should have been part of r305713,
which is when the driver first started reporting extended media types.

306301:
cxgbe(4): Use the port's top speed to figure out whether it is "high
speed" or not (for the purpose of calculating the number of queues etc.)
This does the right thing for 25Gbps and 100Gbps ports.

306465:
cxgbe(4): Claim the T6 -DBG card.

309302:
cxgbe(4): Include firmware for T6 cards in the driver. Update all
firmwares to 1.16.12.0.

Sponsored by: Chelsio Communications

309559 05-Dec-2016 jhb

MFC 305667:
cxgbe(4): Avoid a NULL dereference in the clearstats ioctl handler.
Port softc's are not initialized when the adapter is in recovery mode.

309558 05-Dec-2016 jhb

MFC 305652: cxgbe(4): Do not prescreen frames before attempting LRO.

309557 05-Dec-2016 jhb

MFC 305433:
cxgbe/t4_tom: toepcb should be all-zero on allocation because the code
that cleans up on failure assumes that non-NULL values indicate
initialized items.

309556 05-Dec-2016 jhb

MFC 303688,303750,305166,305167: Centralize and rework page pod handling.

Note that the TOE DDP code in 10 is different from 11 and later and
had to be updated directly.

303688:
cxgbe/t4_tom: Read the chip's DDP page sizes and save them in a
per-adapter data structure. This replaces a global array with hardcoded
page sizes.

303750:
cxgbe/t4_tom: The page pod arena allocates from pod address space and
not index space. The minimum valid allocation out of this arena is the
size of a single page pod.

305166:
cxgbe/t4_tom: Add general purpose routines to deal with page pod regions
and allocations within them. Switch to these routines to manage the TOE
DDP region.

305167:
cxgbe/t4_tom: Two new routines to allocate and write page pods for a
buffer in the kernel's address space.

Sponsored by: Chelsio Communications

309529 04-Dec-2016 kib

Add sys/systm.h to have critical_enter() defined, required by
machine/counter.h on i386.

This is a direct commit to stable/10.

Sponsored by: The FreeBSD Foundation

309459 03-Dec-2016 jhb

MFC 303348:
cxgbe(4): Initialize the adapter queues (fwq and mgmtq) instead of
returning EAGAIN if they aren't available when the user tries to program
a filter. Do this after validating the filter so that the driver
doesn't bring up the queues if it doesn't have to.

309458 03-Dec-2016 jhb

MFC 302440,304873,305704,305985,306787,307531: Fixes for sysctls.

302440:
cxgbe(4): Add sysctl to display the RSS indirection table size for an
interface.

dev.cxl.<n>.rss_size
dev.vcxl.<n>.rss_size

304873:
cxgbe(4): Provide more details about the card in the sysctl MIB.

dev.t5nex.0.%desc: Chelsio T580-CR
dev.t5nex.0.hw_revision: 1
dev.t5nex.0.sn: PT13140042
dev.t5nex.0.pn: 110117150A0
dev.t5nex.0.ec: 0000000000000000
dev.t5nex.0.na: 0007432AF490
dev.t5nex.0.vpd_version: 3
dev.t5nex.0.scfg_version: 53255
dev.t5nex.0.bs_version: 1.1.0.0
dev.t5nex.0.er_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9
dev.t5nex.0.firmware_version: 1.16.2.0

305704:
cxgbe(4): Rename the debug_flags driver tunable/sysctl to dflags.
Tunables that end with _flags are special.

305985:
cxgbe(4): Fixes to wrq stats.

- Increment tx_wrs_copied in the correct place.
- Add tx_wrs_sspace to the sysctl MIB.

306787:
cxgbe(4): Fix whitespace in the pm_stats display.

307531:
cxgbe(4): Adjust whitespace to line up the column titles in cim_qcfg
with the values displayed.

Sponsored by: Chelsio Communications

309450 03-Dec-2016 jhb

MFC 304854: cxgbe/iw_cxgbe: Various fixes to the iWARP driver.

- Return appropriate error code instead of ENOMEM when sosend() fails in
send_mpa_req.
- Fix for problematic race during destroy_qp.
- Use an abortive close instead of a normal close when send_mpa_reject() fails.
- Remove the unnecessary doorbell flowcontrol logic.

Sponsored by: Chelsio Communications

309449 02-Dec-2016 jhb

MFC 303859,305851: Fix a typo and some whitespace nits.

309448 02-Dec-2016 jhb

MFC 303454: Mark spg_len and fl_pktshift static.

These variables are no longer exported to t4_netmap.c after r296478.

309447 02-Dec-2016 jhb

MFC 303522,303647,303860,303880,304168-304170,304479,304482,304485,305548,
305549:
Chelsio T4/T5 VF driver.

303522:
Various fixes to the t4/5nex character device.

- Remove null open/close methods.
- Don't set d_flags to 0 explicitly.
- Remove t5_cdevsw as the .d_name member isn't really used and doesn't
warrant a separate cdevsw just for the name.
- Use ENOTTY as the error value for an unknown ioctl request.
- Use make_dev_s() to close race with setting si_drv1.

303647:
Store the offset of the KDOORBELL and GTS registers in the softc.

VF devices use a different register layout than PF devices. Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

303860:
Reserve an adapter flag IS_VF to mark VF devices vs PF devices.

303880:
Track the base absolute ID of ingress and egress queues.

Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers. For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero. VF devices have a non-zero base
absolute ID and require this change. While here, export the absolute ID
of egress queues via a sysctl.
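
The mapping described in 303880 amounts to subtracting a base. The sketch below is a hypothetical model (logical_qid is not a driver function) and adds a range check for illustration.

```python
def logical_qid(abs_id, abs_base, nqueues):
    """Map an absolute queue ID to an index into the driver's own queues."""
    idx = abs_id - abs_base
    if not 0 <= idx < nqueues:
        raise ValueError("absolute queue ID outside this device's range")
    return idx
```

With a base of zero the mapping is the identity, which is why the change is described as a no-op for the regular cxgbe/cxl drivers; with a non-zero base such as 128, absolute ID 133 maps to logical ID 5.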

304168:
Make SGE parameter handling more VF-friendly.

Add fields to hold the SGE control register and free list buffer sizes to
the sge_params structure. Populate these new fields in
t4_init_sge_params() for PF devices and change t4_read_chip_settings() to
pull these values out of the params structure instead of reading
registers directly. This will permit t4_read_chip_settings() to be reused
for VF devices which cannot read SGE registers directly.

While here, move the call to t4_init_sge_params() to
get_params__post_init(). The VF driver will populate the SGE parameters
structure via a different method before calling t4_read_chip_settings().

304169:
Update mailbox writes to work with VF devices.

- Use alternate register locations for the data and control registers for
VFs.
- Do a dummy read to force the writes to the mailbox data registers to
post before the write to the control register on VFs.
- Do not check the PCI-e firmware register for errors on VFs.

304170:
Add support for register dumps on VF devices.

- Add handling of VF register sets to t4_get_regs_len() and t4_get_regs().
- While here, use t4_get_regs_len() in the ioctl handler for regdump
instead of inlining it.

304479:
Add structures for VF-specific adapter parameters.

While here, mark which parameters are PF-specific and which are
VF-specific.

304482:
Adjust t4_port_init() to work with VF devices.

Specifically, the FW_PORT_CMD may or may not work for a VF (the PF
driver can choose whether or not to permit access to this command),
so don't attempt to fetch port information on a VF if permission is
denied by the PF.

304485:
Reorder sysctls so that nodes shared with the VF driver are added first.

This permits a single early return for VF devices in the routines that
add sysctl nodes.

305548:
Don't break out of the m_advance() loop if len drops to zero.

If a packet contains the Ethernet header (14 bytes) in the first mbuf
and the payload (IP + UDP + data) in the second mbuf, then the attempt
to fetch the l3hdr will return a NULL pointer. The first loop iteration
will drop len to zero and exit the loop without setting 'p'. However,
the desired data is at the start of the second mbuf, so the correct
behavior is to loop around and let the conditional set 'p' to m_data of
the next mbuf (and leave offset as 0).
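
The loop behavior described in 305548 can be modeled with a list of byte strings standing in for an mbuf chain. This is a sketch built from the description above, not the driver's m_advance(); the advance function and its stop_when_len_zero switch are hypothetical.

```python
def advance(chain, idx, offset, length, stop_when_len_zero):
    """Advance `length` bytes through a list of buffers.

    Returns (idx, offset) of the byte reached, or None if the loop
    exited without setting a position.
    """
    pos = None
    while True:
        if stop_when_len_zero and length <= 0:
            break                   # old behavior: exit without a position
        if offset + length < len(chain[idx]):
            offset += length
            pos = (idx, offset)
            break
        # Consume the rest of this buffer and move to the next one.
        # Running off the end of the chain raises IndexError here.
        length -= len(chain[idx]) - offset
        idx += 1
        offset = 0
    return pos
```

For a chain whose first buffer holds exactly the 14-byte Ethernet header, advancing 14 bytes returns None with the early exit and (1, 0), the start of the second buffer, without it.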

305549:
Chelsio T4/T5 VF driver.

The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T5 adapters. The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.

Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device. It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.

t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.

t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.

VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages). This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request. In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets). Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.

Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.

Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics. In addition, TOE is not supported on VF devices, only for
the PF interfaces.

Sponsored by: Chelsio Communications

309445 02-Dec-2016 jhb

Fix build without INVARIANTS.

This is a direct commit to stable/10.

309444 02-Dec-2016 jhb

MFC 303204: Install a handler for firmware work request error messages.

If a driver sends a malformed or disallowed work request, the firmware
responds with a work request error. Previously the driver treated this
as an unexpected message and panicked. Now it decodes the error message
to aid in debugging.

309442 02-Dec-2016 jhb

MFC 302339:
cxgbe(4): Changes to the CPL-handler registration mechanism and code
related to "shared" CPLs.

a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single
function. Allow callers to direct the response to any iq. Tidy up
set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of
magic constants.

b) Remove all CPL handler tables from struct adapter. This reduces its
size by around 2KB. All handlers are now registered at MOD_LOAD instead
of attach or some kind of initialization/activation. The registration
functions do not need an adapter parameter any more.

c) Add per-iq handlers to deal with CPLs whose destination cannot be
determined solely from the opcode. There are 2 such CPLs in use right
now: SET_TCB_RPL and L2T_WRITE_RPL. The base driver continues to send
filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq.
t4_tom (including the DDP code) now uses the port's ctrlq to send
L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq.
fwq and ofld_rxq have different handlers that know what kind of tid to
expect in the reply. Update t4_write_l2e and callers to support any
wrq/iq combination.
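
A minimal sketch of the dispatch idea in (c), assuming a global table
indexed by opcode plus optional per-queue handlers for the two "shared"
CPLs. All names and opcode values here are hypothetical and are not
taken from the driver:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; the values are illustrative, not the hardware's. */
enum { OP_RX_DATA, OP_SET_TCB_RPL, OP_L2T_WRITE_RPL, OP_MAX };

struct iq;
typedef int (*cpl_handler_t)(struct iq *, int);

/* An ingress queue carries its own handlers for the "shared" CPLs. */
struct iq {
	cpl_handler_t set_tcb_rpl;
	cpl_handler_t l2t_write_rpl;
};

/* Global table indexed by opcode, filled once (e.g. at module load). */
static cpl_handler_t cpl_handlers[OP_MAX];

void
register_cpl_handler(int op, cpl_handler_t h)
{
	assert(op >= 0 && op < OP_MAX);
	cpl_handlers[op] = h;
}

int
dispatch_cpl(struct iq *iq, int op)
{
	cpl_handler_t h = NULL;

	assert(op >= 0 && op < OP_MAX);
	/* Shared CPLs: the queue, not the opcode, decides who handles it. */
	if (op == OP_SET_TCB_RPL)
		h = iq->set_tcb_rpl;
	else if (op == OP_L2T_WRITE_RPL)
		h = iq->l2t_write_rpl;
	/* Everything else falls back to the global per-opcode handler. */
	if (h == NULL)
		h = cpl_handlers[op];
	return (h != NULL ? h(iq, op) : -1);
}

/* Example handlers; the return value tags which one ran. */
int fwq_set_tcb_rpl(struct iq *iq, int op) { (void)iq; (void)op; return (1); }
int ofld_set_tcb_rpl(struct iq *iq, int op) { (void)iq; (void)op; return (2); }
int rx_data(struct iq *iq, int op) { (void)iq; (void)op; return (3); }
```

With this shape the same SET_TCB_RPL opcode reaches a different handler
depending on whether it arrived on a firmware queue or an offload rx
queue, which is the property the commit message describes.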

Sponsored by: Chelsio Communications

309440 02-Dec-2016 jhb

MFC 292736:
cxgbe(4): Updates to the base NIC driver and t4_tom to support the iSCSI
offload driver. These changes come from projects/cxl_iscsi.

Note that these changes make use of the mbufq API from 11.0, but that
API is not present in 10.x in the same form. Borrow an implementation
from the CAM CTL ha code that uses m_nextpkt to implement mbufq for use
in 10.
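
For illustration only, a packet queue chained through a per-packet
"next packet" pointer could look roughly like this. struct pkt stands in
for struct mbuf and nextpkt for m_nextpkt; the function names are
hypothetical and this is not the borrowed CTL code:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct mbuf; only the link field matters here. */
struct pkt {
	struct pkt *nextpkt;	/* plays the role of m_nextpkt */
	int id;
};

struct pktq {
	struct pkt *head;
	struct pkt *tail;
	int len;
	int maxlen;
};

void
pktq_init(struct pktq *q, int maxlen)
{
	q->head = q->tail = NULL;
	q->len = 0;
	q->maxlen = maxlen;
}

/* Append at the tail; returns 0 on success, -1 if the queue is full. */
int
pktq_enqueue(struct pktq *q, struct pkt *p)
{
	if (q->len >= q->maxlen)
		return (-1);
	p->nextpkt = NULL;
	if (q->tail != NULL)
		q->tail->nextpkt = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
	return (0);
}

/* Remove from the head; returns NULL if the queue is empty. */
struct pkt *
pktq_dequeue(struct pktq *q)
{
	struct pkt *p = q->head;

	if (p == NULL)
		return (NULL);
	q->head = p->nextpkt;
	if (q->head == NULL)
		q->tail = NULL;
	p->nextpkt = NULL;
	q->len--;
	assert(q->len >= 0);
	return (p);
}
```

Because the link lives inside each packet, the queue itself needs no
allocation, which is what makes this approach easy to carry to 10.x.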

309379 02-Dec-2016 jhb

MFC 297797:
cxgbe(4): Provide an explicit value for nqpcq in the firmware
configuration file.

Sponsored by: Chelsio Communications

309378 01-Dec-2016 jhb

MFC 273806,289103,289201,289338,289578,293185,294474,294610,297124,297368,
297406,300875,300888,301158,301896,301897,304838:

Pull in most of the Chelsio and iWARP related changes from stable/11 into
stable/10. A few changes from 278886 (OFED 1.2) were also included though
the full merge is not:
- The find_gid_port() function in infiniband/core/cma.c.
- Addition of the 'ord' and 'ird' fields to 'struct iw_cm_event'.

273806:
Userspace library for Chelsio's Terminator 5 based iWARP RNICs (pretty
much every T5 card that does _not_ have "-SO" in its name is RDMA
capable).

This plugs into the OFED verbs framework and allows userspace RDMA
applications to work over T5 RNICs. Tested with rping.

289103:
iw_cxgbe: fix for page fault in cm_close_handler().

This is roughly the iw_cxgbe equivalent of
https://github.com/torvalds/linux/commit/be13b2dff8c4e41846477b22cc5c164ea5a6ac2e
-----------------
RDMA/cxgb4: Connect_request_upcall fixes

When processing an MPA Start Request, if the listening endpoint is
DEAD, then abort the connection.

If the IWCM returns an error, then we must abort the connection and
release resources. Also abort_connection() should not post a CLOSE
event, so clean that up too.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
-----------------

289201:
iw_cxgbe: MPA v2 is always available.

289338:
iw_cxgbe: use correct RFC number.

289578:
Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.

293185:
iw_cxgbe: Shut down the socket but do not close the fd in case of error.
The fd is closed later in this case. This fixes a "SS_NOFDREF on enter"
panic.

294474:
iw_cxgbe: fix a couple of problems in the RDMA_TERMINATE handler.

a) Look for the CPL in the payload buffer instead of the descriptor.
b) Retrieve the socket associated with the tid with the inpcb lock held.

294610:
Fix for iWARP servers that listen on INADDR_ANY.

The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to
represent an iWARP endpoint when the connection is over TCP. For
servers the current approach is to invoke create_listen callback for
each iWARP RNIC registered with the CM. This doesn't work too well for
INADDR_ANY because a listen on any TCP socket already notifies all
hardware TOEs/RNICs of the new listener. This patch fixes the server
side of things for FreeBSD. We've tried to keep all these modifications
in the iWARP/TCP specific parts of the OFED infrastructure as much as
possible.

297124:
iw_cxgbe/libcxgb4: Pull in many applicable fixes from the upstream Linux
iWARP driver and userspace library to the FreeBSD iw_cxgbe and libcxgb4.

This commit includes internal changesets 6785 8111 8149 8478 8617 8648
8650 9110 9143 9440 9511 9894 10164 10261 10450 10980 10981 10982 11730
11792 12218 12220 12222 12223 12225 12226 12227 12228 12229 12654.

297368:
cxgbe/iw_cxgbe: Fix for stray "start_ep_timer timer already started!"
messages.

297406:
Remove unnecessary dequeue_mutex (added in r294610) from the iWARP
connection manager. Examining so_comp without synchronization with
iw_so_event_handler is a harmless race.

300875:
iw_cxgbe: Use vmem(9) to manage PBL and RQT allocations.

300888:
iw_cxgbe: Plug a lock leak in process_mpa_request().

If the parent is DEAD or connect_request_upcall() fails, the parent
mutex is left locked. This leads to a hang when process_mpa_request()
is called again for another child of the listening endpoint.
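
One common way to avoid this class of leak is to route every exit
through a single unlock. The sketch below uses a toy lock flag instead
of a kernel mutex, and the names (struct ep, handle_request) are
hypothetical rather than the driver's:

```c
#include <assert.h>

/* Toy stand-ins; a real endpoint would embed a kernel mutex. */
struct ep {
	int locked;
	int dead;
};

static void
ep_lock(struct ep *e)
{
	assert(!e->locked);	/* a leaked lock trips this on the next call */
	e->locked = 1;
}

static void
ep_unlock(struct ep *e)
{
	assert(e->locked);
	e->locked = 0;
}

/* upcall_rc simulates the result of the connect-request upcall. */
int
handle_request(struct ep *parent, int upcall_rc)
{
	int rc = 0;

	ep_lock(parent);
	if (parent->dead) {
		rc = -1;
		goto done;	/* error path still reaches the unlock */
	}
	if (upcall_rc != 0) {
		rc = upcall_rc;
		goto done;
	}
	/* ... normal processing with the parent locked ... */
done:
	ep_unlock(parent);
	return (rc);
}
```

If either error path returned directly instead of jumping to done, the
second call for the same parent would find the lock still held, which
in the real driver shows up as a hang rather than an assertion.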

301158:
iw_cxgbe: Fix panic that occurs when c4iw_ev_handler tries to acquire
comp_handler_lock but c4iw_destroy_cq has already freed the CQ memory
(which is where the lock resides).

301896:
Fix bug in iwcm that caused a panic in iw_cm_wq when krping is run
repeatedly in a tight loop.

301897:
iw_cxgbe: Make sure that send_abort results in a TCP RST and not a FIN.
Release the hold on ep->com immediately after sending the RST. This
fixes a bug that sometimes leaves userspace iWARP tools hung when the
user presses ^C.

304838:
Do not free an uninitialized pointer on soaccept failure in the iWARP
connection manager.

Submitted by: Krishnamraju Eraparaju @ Chelsio (original patch)
Sponsored by: Chelsio Communications

309108 24-Nov-2016 jch

MFC r286227, r286443:

r286227:

Decompose TCP INP_INFO lock to increase short-lived TCP connections scalability:

- The existing TCP INP_INFO lock continues to protect the global inpcb list
stability during full list traversal (e.g. tcp_pcblist()).

- A new INP_LIST lock protects inpcb list actual modifications (inp allocation
and free) and inpcb global counters.

This allows INP_INFO_RLOCK to be used in critical paths (e.g. tcp_input()),
with INP_INFO_WLOCK needed only in occasional operations that walk all connections.

PR: 183659
Differential Revision: https://reviews.freebsd.org/D2599
Reviewed by: jhb, adrian
Tested by: adrian, nitroboost-gmail.com
Sponsored by: Verisign, Inc.

r286443:

Fix a kernel assertion issue introduced with r286227:
Avoid too strict INP_INFO_RLOCK_ASSERT checks due to
tcp_notify() being called from in6_pcbnotify().

Reported by: Larry Rosenman <ler@lerctr.org>
Submitted by: markj, jch

308322 04-Nov-2016 jhb

MFC 302313:
cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries
used by switching filters that rewrite L2 information do not have any
associated ifnet.

308321 04-Nov-2016 jhb

MFC 301516,301520,301531,301535,301540,301542,301628: Traffic scheduling
updates.

301516:
cxgbetool: Allow max-rate > 10Gbps for rate-limited traffic.

301520:
cxgbe(4): Create a reusable struct type for scheduling class parameters.

301531:
cxgbe(4): Break up set_sched_class. Validate the channel number and
min/max rates against their actual limits (which are chip and port
specific) instead of hardcoded constants.

301535:
cxgbe(4): Track the state of the hardware traffic schedulers in the
driver. This works as long as everyone uses set_sched_class_params
to program them.

301540:
cxgbe(4): Provide information about traffic classes in the sysctl mib.

301542:
cxgbe(4): A couple of fixes to set_sched_queue.

- Validate the scheduling class against the actual limit (which is chip
specific) instead of a magic number.

- Return an error if an attempt is made to manipulate the tx queues of a
VI that hasn't been initialized.

301628:
cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.

Sponsored by: Chelsio Communications

308320 04-Nov-2016 jhb

MFC 297883:
cxgbe(4): Always dispatch all work requests that have been written to the
descriptor ring before leaving drain_wrq_wr_list.

308319 04-Nov-2016 jhb

MFC 297875: cxgbe(4): Always read the entire mailbox into the reply buffer.

The size of the reply can be different from the size of the command in
case a debug firmware asserts. fw_asrt() needs the entire reply in
order to decode the location of the assert.
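
A rough sketch of the difference, assuming a 64-byte mailbox and
hypothetical function names (the driver reads the mailbox through
register accesses, not memcpy):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MBOX_LEN 64	/* assumed size of one mailbox, in bytes */

/* Copies only as many bytes as the command had; a longer reply is cut off. */
void
read_reply_cmd_sized(const uint8_t *mbox, size_t cmd_len, uint8_t *reply)
{
	assert(cmd_len <= MBOX_LEN);
	memset(reply, 0, MBOX_LEN);
	memcpy(reply, mbox, cmd_len);
}

/* Copies the whole mailbox regardless of the command's size. */
void
read_reply_full(const uint8_t *mbox, uint8_t *reply)
{
	memcpy(reply, mbox, MBOX_LEN);
}
```

With the command-sized copy, anything the firmware wrote past the
command's length (such as assert details) never reaches the caller.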

308318 04-Nov-2016 jhb

MFC 297776,297777,297779: Add DDB commands to cxgbe(4).

297776:
Add a function to lookup a device_t object by name.

This just walks the global list of devices looking for one with the
requested name. The one use case outside of devctl2's implementation
is for DDB commands that wish to lookup devices by name.

297777:
Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB.

This allows the contents of a TCB to be extracted from a T4/T5 card in
DDB after a panic.

297779:
Add a 'show t4 devlog <nexus>' DDB command.

This command displays the adapter's firmware device log similar to the
dev.<nexus>.misc.devlog sysctl.

Sponsored by: Chelsio Communications

308316 04-Nov-2016 jhb

MFC 297194:
cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything
to the descriptor ring no matter what path the frame takes within the
driver's tx.

308315 04-Nov-2016 jhb

MFC 296975: cxgbe(4): Tidy up PAUSE frame accounting.

Figure out if the chip is counting PAUSE frames in the "normal" stats
and take them out if it is. This fixes a bug in the tx stats because
the default hardware behavior is different for Tx and Rx but the driver
was treating both the same way. The result was that OPACKETS, OBYTES,
and OMCASTS were under-reported (if tx_pause > 0) before this change.

Note that the mac_stats sysctl still gives you the raw value of these
statistics straight from the device registers.
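
As an illustration of the accounting, the adjustment can be sketched as
below. The struct and function names are hypothetical, and the 64-byte
PAUSE frame length and the treatment of PAUSE frames as multicast are
assumptions stated here rather than details from the commit:

```c
#include <assert.h>
#include <stdint.h>

#define PAUSE_FRAME_LEN 64	/* assumption: PAUSE frames are minimum-size */

struct mac_stats {
	uint64_t tx_frames;
	uint64_t tx_octets;
	uint64_t tx_mcast_frames;
	uint64_t tx_pause;
};

struct if_counters {
	uint64_t opackets;
	uint64_t obytes;
	uint64_t omcasts;
};

/*
 * Remove PAUSE frames from the tx counters only when the hardware
 * actually includes them in its "normal" statistics.
 */
void
tx_counters(const struct mac_stats *s, int hw_counts_pause,
    struct if_counters *c)
{
	uint64_t pause = hw_counts_pause ? s->tx_pause : 0;

	assert(pause <= s->tx_frames);
	c->opackets = s->tx_frames - pause;
	c->obytes = s->tx_octets - pause * PAUSE_FRAME_LEN;
	c->omcasts = s->tx_mcast_frames - pause;
}
```

Subtracting unconditionally, when the hardware had not counted the
PAUSE frames in the first place, is what produces the under-reporting
described above.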

Sponsored by: Chelsio Communications

308313 04-Nov-2016 jhb

MFC 296950,296951: Configuration updates.

296950:
cxgbe(4): Update some register settings in the default configuration
files to match the "uwire" configuration.

296951:
cxgbe(4): Enable additional capabilities in the default configuration
files. All features with FreeBSD drivers of some kind are now in the
default configuration.

Sponsored by: Chelsio Communications

308311 04-Nov-2016 jhb

MFC 296018,296640,296641,296689,296735,296949: Fixes for sysctl handlers.

296018:
cxgbe(4): Add a sysctl to retrieve the maximum speed/bandwidth supported by a
port.

dev.cxgbe.<n>.max_speed
dev.cxl.<n>.max_speed

296640:
cxgbe(4): Add a sysctl for the event capture mask of the TP block's
logic analyzer.

dev.t5nex.<n>.misc.tp_la_mask
dev.t4nex.<n>.misc.tp_la_mask

296641:
cxgbe(4): Add sysctls to display the TP microcode version and the
expansion rom version (if there's one).

trantor:~# sysctl dev.t4nex dev.t5nex | grep _version
dev.t4nex.0.firmware_version: 1.15.28.0
dev.t4nex.0.tp_version: 0.1.9.4
dev.t5nex.0.firmware_version: 1.15.28.0
dev.t5nex.0.exprom_version: 1.0.0.68
dev.t5nex.0.tp_version: 0.1.4.9

296689:
cxgbe(4): sysctls to display the TOE's TCP timers.

cask:~# sysctl -d dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: FINWAIT2 timer (us)
dev.t5nex.0.toe.initial_srtt: Initial SRTT (us)
dev.t5nex.0.toe.keepalive_intvl: Keepidle interval (us)
dev.t5nex.0.toe.keepalive_idle: Keepidle idle timer (us)
dev.t5nex.0.toe.persist_max: Persist timer max (us)
dev.t5nex.0.toe.persist_min: Persist timer min (us)
dev.t5nex.0.toe.rexmt_max: Retransmit max (us)
dev.t5nex.0.toe.rexmt_min: Retransmit min (us)
dev.t5nex.0.toe.dack_timer: DACK timer (us)
dev.t5nex.0.toe.dack_tick: DACK tick (us)
dev.t5nex.0.toe.timestamp_tick: TCP timestamp tick (us)
dev.t5nex.0.toe.timer_tick: TP timer tick (us)
...

cask:~# sysctl dev.t5nex.0.toe
dev.t5nex.0.toe.finwait2_timer: 9765440
dev.t5nex.0.toe.initial_srtt: 244128
dev.t5nex.0.toe.keepalive_intvl: 73240800
dev.t5nex.0.toe.keepalive_idle: 7031116800
dev.t5nex.0.toe.persist_max: 9765440
dev.t5nex.0.toe.persist_min: 976544
dev.t5nex.0.toe.rexmt_max: 9765440
dev.t5nex.0.toe.rexmt_min: 244128
dev.t5nex.0.toe.dack_timer: 19520
dev.t5nex.0.toe.dack_tick: 32.768
dev.t5nex.0.toe.timestamp_tick: 1048.576
dev.t5nex.0.toe.timer_tick: 32.768
...

296735:
Fix the following gcc warnings on sparc64, when TCP_OFFLOAD is not
defined:

sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used
sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used
sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used

This just adds a bunch of #ifdef TCP_OFFLOAD in the right places.

296949:
cxgbe(4): Remove a couple of pointless assignments in sysctl_meminfo.
Do not display range if start = stop (this is a workaround for some
unused regions).

Sponsored by: Chelsio Communications

308305 04-Nov-2016 jhb

MFC 296552,296596,296603,296624,296627: Fixes related to memory windows.

296552:
cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access
to indirect registers only.

296596:
cxgbe(4): Allow the addr/len pair that is being validated in
validate_mem_range to span multiple memory types. Update
validate_mt_off_len to use validate_mem_range.

296603:
cxgbe(4): Add general purpose routines that offer safe access to the
chip's memory windows. Convert existing users of these windows to the
new routines.

296624:
cxgbe(4): Fix bug in r296603. The memory window needs to be
repositioned if the start address isn't in the window already. One
of the bounds checks used the end address instead.
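
A sketch of the intended check, under the assumption that a window is
described by a base and a power-of-two aperture. The function names and
the align-down policy are illustrative only; the chip's real alignment
rules differ by generation:

```c
#include <assert.h>
#include <stdint.h>

/* True if 'addr' falls inside [win_start, win_start + win_size). */
int
window_covers(uint32_t win_start, uint32_t win_size, uint32_t addr)
{
	return (addr >= win_start && addr - win_start < win_size);
}

/*
 * Returns the base the window should have in order to reach 'addr'.
 * Both bounds are tested against the START address of the access.
 */
uint32_t
position_window(uint32_t win_start, uint32_t win_size, uint32_t addr)
{
	/* The align-down below assumes a power-of-two aperture. */
	assert(win_size != 0 && (win_size & (win_size - 1)) == 0);
	if (window_covers(win_start, win_size, addr))
		return (win_start);
	return (addr & ~(win_size - 1));
}
```

Testing the end address of the access against one of the bounds can
leave the window where it is even though the first byte to be read lies
outside it.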

296627:
cxgbe(4): Improvements to the code that deals with the firmware's log.

- Query the location of the log very early during attach. Refresh the
location later after establishing contact with the firmware.
- Save the log's location as a flat address in devlog_params.
- Use a memory window instead of backdoor access to the EDC/MC to read
the log.

Sponsored by: Chelsio Communications

308304 04-Nov-2016 jhb

MFC 295778,296249,296333,296383,296471,296478,296481,296485,296488-296491,
296493-296496,296544,296710-296711,297863,299685: Catch up to changes to
the internal shared code.

Note that this merge includes two different firmware updates, but the
effective change is to update to the last version (1.15.37.0). As such,
I've trimmed the log message of the first update (1.15.28.0).

In addition, the M_WAIT macro added in t4_regs.h had to be renamed to
CXGBE_M_WAIT to avoid a collision on 10.x that is not present on 11.

295778:
cxgbe: catch up with the latest hardware-related definitions.

296249:
cxgbe(4): Update T5 and T4 firmwares to 1.15.28.0.

296333:
cxgbe(4): First of many changes to reduce diffs with internal shared
code:

- Rename some CamelCase variables.
- s/t4_link_start/t4_link_l1cfg/g
- Pull in t4_get_port_type_description.
- Move t4_wait_op_done to t4_hw.c.
- Flip the order of the RDMA stats.
- Remove unused function t4_iq_start_stop.
- Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c

296383:
cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.

- Add a chip_params structure to keep track of hardware constants for
all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
work with T6. Most of the changes are in the decoders for the CIM
logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.

296471:
cxgbe(4): Updated register dumps.

- Get the list of registers to read during a regdump from the shared
code instead of the OS specific code. This follows a similar move
internally. The shared code includes the list for T6.

- Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register
dumps (and catch up with some updates to T4 and T5 register decode).

296478:
cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code. Use these per-adapter values instead of globals.

296481:
cxgbe(4): Overhaul the shared code that deals with the chip's TP block,
which is responsible for filtering and RSS.

Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here. This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.

296485:
cxgbe(4): Update the interrupt handlers for hardware errors.

296488:
cxgbe(4): Updates to mailbox routines in the shared code.

296489:
cxgbe(4): Updates to the shared routines that deal with the serial EEPROM,
flash, and VPD.

296490:
cxgbe(4): Remove __devinit and SPEED_<foo> as part of catch up with
internal shared code.

296491:
cxgbe(4): Updates to shared routines that get/set various parameters via
the firmware.

296493:
cxgbe(4): Use t4_link_down_rc_str in shared code to decode the reason
the link is down, instead of doing it in OS specific code.

296494:
cxgbe(4): Many new functions in the shared code, unused at this time.

296495:
cxgbe(4): Fix t4_tp_get_rdma_stats.

296496:
cxgbe(4): Minor updates to the shared routines that deal with firmware images.

296544:
cxgbe(4): Reshuffle and rototill t4_hw.c, solely to reduce diffs with
the internal shared code.

296710:
cxgbe(4): Catch up with the latest list of card capabilities as reported
by the firmware.

296711:
cxgbe(4): Fix typo in previous commit.

297863:
Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'.

This fixes a conflict with the M_B macro in powerpc's
<machine/db_machdep.h> exposed by the recent addition of DDB commands
to the cxgbe driver.

299685:
cxgbe(4): Update T5 and T4 firmwares to 1.15.37.0.

These firmwares were obtained from the "Chelsio T5/T4 Unified Wire
v2.12.0.3 for Linux" release. Changes since 1.14.4.0 (which is the
firmware in -STABLE branches) are in the "Release Notes" accompanying
the Unified Wire release and are copy-pasted here as well.

22.1. T5 Firmware
+++++++++++++++++++++++++++++++++

Version : 1.15.37.0
Date : 04/27/2016
================================================================================

FIXES
-----

BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
queue was ignored.
- Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
- Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
- Fixed 40G link failures with some switches when auto-negotiation enabled.
- Fixed to improve on link bring-up time.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where bogus d3hot bits were set causing traffic stall.
- Fixed an issue where sometimes adapter was not seen after reboot.
- Fixed an issue where iWARP was crashing in conjunction with traffic management.
- Fixed an issue where link failed to come up after removing twinax cable and
inserting optical module.

ETH
- Fixed a link flap issue on T580-CR.

OFLD
- Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.

FOiSCSI
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
use given interface.
- Fixed an issue where fw was sending ENETUNREACH event for normal tcp
disconnection.

DCBX
- Fixed an issue where iscsi tlv is sent incorrectly to host. (DCBX CEE)
- Fixed an issue where apply bit set for APP id was affecting the ETS and PFC
settings.(DCBX IEEE)
- Fixed an issue where app priority values are not handled correctly in fw.
(DCBX IEEE)
- Fixed an issue where enable/disable dcbx can cause crash. (DCBX CEE,DCBX IEEE)

FOFCoE
- Removed BB6 support.

ENHANCEMENTS
------------

BASE:
- Added new interface to program DCA settings in SGE contexts; allow 32-byte
IQE size
- Added PTP interface fw_ptp_ts to support PTP Frequency and Offset adjustment.
- Added MPS raw interface.

ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.

OFLD:
- WR opcode is returned to host in cqe error response.

22.2. T4 Firmware
+++++++++++++++++

Version : 1.15.37.0
Date : 04/27/2016
================================================================================

FIXES
-----

BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
was ignored.
- Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where iWARP was crashing in conjunction with traffic management.

FOiSCSI:
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
use given interface.

DCBX
- Fixed an issue where iscsi tlv is sent incorrectly to host.(DCBX CEE)
- Fixed an issue where enable/disable dcbx can cause crash in firmware.(DCBX CEE)

FOiSCSI
- Fixes an issue where fw was sending ENETUNREACH event for normal tcp
disconnection.

FOFCoE
- Removed BB6 support.

ENHANCEMENTS
------------

BASE:
- Added MPS raw interface.

ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================

Sponsored by: Chelsio Communications

308284 04-Nov-2016 jhb

MFC 295573: Remove duplicate definition (CPL_TRACE_PKT_T5).

Sponsored by: Chelsio Communications

308283 04-Nov-2016 jhb

MFC 301932: Use sbused() instead of sbspace() to avoid signed issues.

Inserting a full mbuf with an external cluster into the socket buffer
resulted in sbspace() returning -MLEN. However, since sb_hiwat is
unsigned, the -MLEN value was converted to unsigned in comparisons. As a
result, the socket buffer was never autosized. Note that sb_lowat is signed
to permit direct comparisons with sbspace(), but sb_hiwat is unsigned.
Follow suit with what tcp_output() does and compare the value of sbused()
with sb_hiwat instead.

Note: Since stable/10 does not include sbused(), this uses sb->sb_cc
instead.
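
The conversion problem can be reproduced in isolation. In this sketch
the types (int and unsigned int), the function names, and the 1/8
threshold are all stand-ins chosen for illustration; the kernel's
actual types and thresholds may differ:

```c
#include <assert.h>

/*
 * Buggy form. The cast spells out the implicit conversion C performs
 * when an int is compared with an unsigned int: a negative 'space'
 * becomes a very large unsigned value and never looks "low".
 */
int
needs_autosize_buggy(int space, unsigned int hiwat)
{
	assert(hiwat > 0);
	return ((unsigned int)space < hiwat / 8);
}

/* Fixed form: compare two unsigned quantities, bytes used vs. limit. */
int
needs_autosize_fixed(unsigned int used, unsigned int hiwat)
{
	assert(hiwat > 0);
	return (used >= hiwat - hiwat / 8);
}
```

A buffer that is over its high-water mark has negative free space, so
the buggy test reports "plenty of room" in exactly the case where
growing the buffer matters most.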

Sponsored by: Chelsio Communications

308282 04-Nov-2016 jhb

MFC 290175,290633,299206,300895,301898: Various TOE fixes.

290175:
cxgbe/tom: decide whether to shove segments or not only if there is
payload to transmit.

290633:
cxgbe/t4_tom: add a knob to the default configuration file to tune
the TOE for LAN operation. It is possible to set this to other values
(cluster for networks with little loss and really tight RTTs, and wan
for relatively large RTTs and/or lossy networks) depending on the
environment in which the TOE is being used.

None of this affects plain NIC operation in any way.

299206:
Set the correct vnet in TOE event handlers.

300895:
cxgbe/t4_tom: Exempt RDMA connections from a TCP sanity test for now, to
avoid panicking debug kernels.

t4_tom does not keep track of a connection once it switches to ULP mode
iWARP. If the connection falls out of ULP mode the driver/hardware seq#
etc. are out of sync. A better fix would be to figure out what the
current seq# are, update the driver's state, and perform all sanity
checks as usual.

301898:
cxgbe/t4_tom: Fix inverted assertion in r300895. It is RDMA
connections and not others that are allowed to fail the receive window
check.

308281 04-Nov-2016 jhb

MFC 277763,280146,287631: Various fixes to DDP.

277763:
Lock the socket buffer before jumping to the 'out' label if sblock()
fails in t4_soreceive_ddp().

280146:
Move special DDP handling for closing a connection into a new
handle_ddp_close() function in t4_ddp.c as the logic is similar
to handle_ddp_data(). This allows all knowledge of the special
DDP mbufs to be private to t4_ddp.c as well.

287631:
Add a comment to clarify how to determine the amount of received DDP
data.

Sponsored by: Chelsio Communications

308154 31-Oct-2016 jhb

MFC 291665,291685,291856,297467,302110,302263: Add support for VIs.

291665:
Add support for configuring additional virtual interfaces (VIs) on a port.

Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port. This change allows additional non-netmap
interfaces to be configured on each port. Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.

Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).

T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).

One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are now accounted to the netmap interface.

The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.

The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats. There is currently no way to
clear the stats of an individual VI.

291685:
Fix build for !TCP_OFFLOAD case.

291856:
Fix RSS build.

297467:
Remove #ifdef's from various structures used in the cxgbe/cxl driver.

This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.

302110:
cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.

The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel. The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.). It did not have normal
tx/rx but had a specialized netmap-only data path. r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.

This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support. The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable. This means "device netmap" will not
result in the automatic creation of any virtual interfaces.

The following tunables can be used to override the default number of
queues allocated for each 'v' interface. nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 will disable native netmap
support.

# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi

# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi

# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi

hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.

--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only. No
need to bring them up. The vcxl interfaces are completely independent
and everything should just work.
---------------------

302263:
cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it.
The interface's queues are functional after VI_INIT_DONE (which is short
of interface-up) and that's all that's needed for t4_tom to communicate
with the chip.

Relnotes: yes
Sponsored by: Chelsio Communications

308153 31-Oct-2016 jhb

MFC 289401: cxgbe(4): support for the kernel RSS option.

You need PCBGROUP and RSS in the kernel config to use this.

Note: Since RSS is not present in 10.x this is mostly a no-op and is
stubbed out by removing the #include of opt_rss.h. This is merged
primarily to reduce conflicts in future merges, however it does add a
couple of diagnostic messages related to RSS buckets vs RX queue
counts.

Discussed with: np
Sponsored by: Chelsio Communications

308138 31-Oct-2016 jhb

MFC 282039: Don't use ifm_data. It was used only for self checking debug.

Sponsored by: Chelsio Communications

308071 29-Oct-2016 jhb

MFC 272079,272080: cxgbe/tom: Update for syncache_add locking changes.

272079:
cxgbe/tom: Catch up with r271119, syncache_add doesn't need tcbinfo lock.

272080:
Update comment (missed this bit in r272079).

298653 26-Apr-2016 pfg

MFC r298482:
Cleanup redundant parentheses from existing howmany()/roundup() macro uses.

Requested by: dchagin

297059 20-Mar-2016 np

MFC r277759 (by jhb@)

Fix a couple of panics when detaching from a cxgbe/cxl interface that was
never brought up:
- Allow NULL to be passed to sglist_free().
- Don't try to stop an interface that was never fully initialized.

PR: 208136

291083 19-Nov-2015 jhb

MFC 290416:
Chelsio T5 chips do not properly echo the No Snoop and Relaxed Ordering
attributes when replying to a TLP from a Root Port. As a workaround,
disable No Snoop and Relaxed Ordering in the Root Port of each T5 adapter
during attach so that CPU-initiated requests do not contain these flags.

Note that this affects CPU-initiated requests to all devices under this
root port.
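
At the register level the workaround amounts to clearing two enable
bits in the Root Port's PCI Express Device Control register. The sketch
below operates on a plain 16-bit value; the macro and function names
are hypothetical, while the bit positions (bit 4 for Relaxed Ordering,
bit 11 for No Snoop) follow the PCI Express specification:

```c
#include <assert.h>
#include <stdint.h>

/* Enable bits in the PCI Express Device Control register. */
#define DEVCTL_RELAXED_ORD_ENABLE	0x0010
#define DEVCTL_NOSNOOP_ENABLE		0x0800

/* Returns 'devctl' with Relaxed Ordering and No Snoop disabled. */
uint16_t
disable_ro_ns(uint16_t devctl)
{
	const uint16_t mask = DEVCTL_RELAXED_ORD_ENABLE | DEVCTL_NOSNOOP_ENABLE;
	uint16_t v = devctl & (uint16_t)~mask;

	assert((v & mask) == 0);
	return (v);
}
```

In a driver the value would be read from and written back to the Root
Port's configuration space; that part is omitted here.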

Sponsored by: Chelsio

287149 26-Aug-2015 np

MFC r286926:

cxgbe(4): Save the flags for the last adapter-wide synchronized
operation that was initiated successfully. (The caller and thread are
already recorded).

286898 18-Aug-2015 np

MFC r271490:
cxgbe(4): add support for the SIOCGI2C ioctl.

286897 18-Aug-2015 np

MFC r285648:

cxgbe(4): Ask the firmware for the start of the RSS slice for a port and
save it for later. This enables direct manipulation of the indirection
tables (although the stock driver doesn't do that right now).

286895 18-Aug-2015 np

MFC r285527 and r286338. This takes the firmware from 1.11.27.0 to 1.14.4.0.

r286338:
cxgbe(4): Update T5 and T4 firmwares bundled with the driver to 1.14.4.0. The
changes in the firmwares since 1.11.27.0 are listed here (straight copy-paste
from the "Release Notes.txt" accompanying the Chelsio Unified Wire 2.11.1.0
release on the website).

22.1. T5 Firmware
+++++++++++++++++++++++++++++++++

Version : 1.14.4.0
Date : 08/05/2015
================================================================================

FIXES
-----

BASE:
- Fixes a potential data path hang by properly programming PMTX congestion
threshold settings.
- Fixes a potential initialization error when accessing a configuration file
stored on the flash.
- Fixes a regression where SGE resources can be mis-sized if iWARP is disabled.

ETH:
- Fixes a timing issue that would prevent CR4 links from coming up with some
switches.

FOFCoE:
- Defers fcoe linkdown mailbox command handling till LOGO is sent.
- Updates vlan prio for all outstanding IOs during dcbx update.

ENHANCEMENTS
------------

BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.

ETH:
- Enhances segmentation offload to include VxLAN and Geneve.
- Adds PTP support.
- Adds new interface to allow the driver to query the VI rss table base
addresses.
- Allows the driver to program the SGE ingress context CongDrop field.

OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd
and rcv scale factors.

iSCSI:
- Adds support for iscsi segmentation offload (ISO).
- Adds support for iscsi t10-dif offload.

FOiSCSI:
- Sets FORCE_BIT for cut through processing for FOiSCSI.

FOFCoE:
- Adds support for FCoE BB6.
- Improves WRITE performance.

================================================================================
================================================================================

Version : 1.13.32.0
Date : 03/25/2015
================================================================================

FIXES
-----

BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of
negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain
adapter configurations)
- Fixes config file based PL_TIMEOUT register programming

ETH:
- Fixes a potential EO UDP SEG header corruption
- Fixes an issue where 1000Base-X was not enabled correctly when using QSA
modules

OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1

FOFCoE:
- Fixes fcoe xchg leaks in linkdown/peer down path
- Fixes cleanup in FCoE linkdown and fixed buf timer flowid abuse
- Fixes fw crash by clearing fcf flowc during bye

FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.

ENHANCEMENTS
------------

BASE:
- Adds support for VFs on PFs 4 to 7
- Adds support for QPs/CQs on any physical and virtual function

ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE
- Adds support for CR4 links (BEAN/AEC on 40G TwinAx cables)

OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software

FOiSCSI:
- Adds IPv6 support for foiscsi. Keeps backward compatibility with
old foiscsi drivers which don't support ipv6.

FOFCoE:
- Adds fcoe debug support in flowc dump

================================================================================
================================================================================

Version : 1.12.25.0
Date : 10/22/2014
================================================================================

FIXES
-----

BASE:
- Improves precision of the Weighted Round Robin Traffic Management Algorithm
- Fixes an issue where the link would intermittently fail to come up
- Fixes an issue where adapters with an external PHY couldn't run at 100Mbps
- Fixes an issue where active optical cables were not recognized
- Fixes link advertising issues on T520-BT (speed and pause frames) that would
cause the link to negotiate unexpected settings
- Forces link restart when auto-negotiation is disabled
- Fixes an issue where pause frames wouldn't be fully disabled even if requested

ETH:
- Fixes NVGRE Segmentation Offload network header generation.

DCBX:
- Fixes an issue where some settings were not being sent to the switch
correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by
FW
- Fixes a firmware crash on DCBX APP information request before link up

FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT

ENHANCEMENTS
------------

BASE:
- Adds link partner settings reporting when available
- Adds QSA support (in conjunction with QSA VPD)
- Adds T520-BT LED support
- Reports NOTSUPPORTED for modules with an unhandled identifier

DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs

FOiSCSI:
- Adds support for multiple iSCSI DDP clients
- Sends DHCP renew request when lease expires

================================================================================

22.2. T4 Firmware
+++++++++++++++++

Version : 1.14.4.0
Date : 08/05/2015
================================================================================

FIXES
-----

BASE:
- Fixes a potential initialization error when accessing a configuration file
stored on the flash.
- Initialize PCIE_DBG_INDIR_REQ.Enable to 0, as hardware failed to do so and
register dumps could result in errors.

ETH:
- Fixes an issue that sometimes prevented the link from coming up in CR adapters.

ENHANCEMENTS
------------

BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.

ETH:
- Adds new interface to allow the driver to query the VI rss table base
addresses.

OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd
and rcv scale factors.

================================================================================
================================================================================

Version : 1.13.32.0
Date : 03/25/2015
================================================================================

FIXES
-----

BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of
negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain
adapter configurations)
- Fixes config file based PL_TIMEOUT register programming

ETH:
- Fixes a potential EO UDP SEG header corruption

OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1

FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.

ENHANCEMENTS
------------

ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE

OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software

================================================================================
================================================================================

Version : 1.12.25.0
Date : 10/22/2014
================================================================================

FIXES
-----

BASE:
- Improves precision of the Weighted Round Robin Traffic Management Algorithm
- Forces link restart when auto-negotiation is disabled
- Fixes an issue where pause frames wouldn't be fully disabled even if requested

DCBX:
- Fixes an issue where some settings were not being sent to the switch
correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by
FW
- Fixes a firmware crash on DCBX APP information request before link up

FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT

ENHANCEMENTS
------------

BASE:
- Adds link partner settings reporting when available
- Firmware now reports NOTSUPPORTED for modules with an unhandled identifier

DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs

FOiSCSI:
- Adds support for multiple iSCSI DDP clients
- Sends DHCP renew request when lease expires

================================================================================

Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications

286274 04-Aug-2015 np

MFC r284988, r285220, and r285221.

r284988:
cxgbe(4): request an automatic tx update when a netmap tx queue idles.
The NIC tx queues already do this.

r285220:
cxgbe(4): Do not override the global defaults for congestion drops.
The hw.cxgbe.cong_drop knob is not affected by this change because the
driver sets up congestion drop on a per-queue basis.

r285221:
cxgbe(4): Add a new knob that controls the congestion response of netmap
rx queues. The default is to drop rather than backpressure.

This decouples the congestion settings of NIC and netmap rx queues.

286273 04-Aug-2015 np

MFC r284718:

cxgbe: get_fl_payload returns a header mbuf when successful.

286271 04-Aug-2015 np

MFC r284445 and r286107.

r284445:
cxgbe(4): Add the ability to dump mailbox commands and replies. It is
enabled/disabled via bit 0 of adapter->debug_flags (which is available
at dev.t5nex.<n>.debug_flags).

r286107:
cxgbe(4): initialize debug_flags from the kernel environment.

284098 06-Jun-2015 np

MFC r259150 (by adrian@) and r283864.

r259150:
Print out the full PCIe link negotiation during dmesg.

I found this useful when checking whether a NIC is in a PCIE 3.0 8x slot
or not.

r283864:
cxgbe: no need to display the per-lane GT/s rating of the pcie link.

284093 06-Jun-2015 np

MFC r283858 and r284007.

r283858:
cxgbe: set minimum burst size when fetching freelist buffers to 128B.

r284007:
cxgbe: set the minimum burst size when fetching fl buffers to 128B for
netmap rx queues too. This should have gone in as part of r283858.

284092 06-Jun-2015 np

MFC r280878:

cxgbe/tom: return rx credits promptly if the socket buffer's low water
mark cannot be reached because the window advertised to the peer isn't
wide enough. While here, tweak the normal credit return too.

284089 06-Jun-2015 np

MFC r278239 and r278374.

r278239:
cxgbe(4): reserve id for iSCSI upper layer driver.

r278374:
cxgbe(4): tidy up some of the interaction between the Upper Layer
Drivers (ULDs) and the base if_cxgbe driver.

Track the per-adapter activation of ULDs in a new "active_ulds" field.
This was done pretty arbitrarily before this change -- via TOM_INIT_DONE
in adapter->flags for TOM, and the (1 << MAX_NPORTS) bit in
adapter->offload_map for iWARP.

iWARP and hw-accelerated iSCSI rely on the TOE (supported by the TOM
ULD). The rules are:
a) If the iWARP and/or iSCSI ULDs are available when TOE is enabled then
iWARP and/or iSCSI are enabled too.
b) When the iWARP and iSCSI modules are loaded they go looking for
adapters with TOE enabled and enable themselves on that adapter.
c) You cannot deactivate or unload the TOM module from underneath iWARP
or iSCSI. Any such attempt will fail with EBUSY.
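
The rules above can be sketched in user-space C as follows. This is only
an illustration of the bookkeeping, not the driver's code: the struct,
the ULD ids, and the function names are hypothetical, and the ENXIO
return for rule (b) is an arbitrary choice made for this sketch. Only
the "active_ulds" field name and the EBUSY result come from the commit
message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical ULD ids; the real driver defines its own. */
enum { ULD_TOM = 0, ULD_IWARP = 1, ULD_ISCSI = 2, ULD_MAX = ULD_ISCSI };

struct adapter_sketch {
	unsigned int active_ulds;	/* one bit per activated ULD */
};

static int
uld_is_active(const struct adapter_sketch *sc, int id)
{
	assert(id >= 0 && id <= ULD_MAX);
	return ((sc->active_ulds & (1u << id)) != 0);
}

/* Rule (b): iWARP/iSCSI enable themselves only where TOE is enabled. */
static int
uld_activate(struct adapter_sketch *sc, int id)
{
	assert(id >= 0 && id <= ULD_MAX);
	if (id != ULD_TOM && !uld_is_active(sc, ULD_TOM))
		return (ENXIO);		/* arbitrary errno for this sketch */
	sc->active_ulds |= 1u << id;
	return (0);
}

/* Rule (c): TOM cannot go away while iWARP or iSCSI still depend on it. */
static int
uld_deactivate(struct adapter_sketch *sc, int id)
{
	assert(id >= 0 && id <= ULD_MAX);
	if (id == ULD_TOM && (sc->active_ulds & ~(1u << ULD_TOM)) != 0)
		return (EBUSY);
	sc->active_ulds &= ~(1u << id);
	return (0);
}
```

Keeping all activation state in one bitmap makes the dependency check in
rule (c) a single mask test.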

284052 06-Jun-2015 np

MFC r276480, r276485, r276498, r277225, r277226, r277227, r277230,
r277637, and r283149 (by emaste@).

r276485 is the real change here, the rest deal with the fallout of
mp_ring's reliance on 64b atomics.

Use the incorrectly spelled 'eigth' from struct pkthdr in this branch
instead of MFC'ing r261733, which would have renamed the field of a
public structure in a -STABLE branch.
---

r276480:
Temporarily unplug cxgbe(4) from !amd64 builds.

r276485:
cxgbe(4): major tx rework.

a) Front load as much work as possible in if_transmit, before any driver
lock or software queue has to get involved.

b) Replace buf_ring with a brand new mp_ring (multiproducer ring). This
is specifically for the tx multiqueue model where one of the if_transmit
producer threads becomes the consumer and other producers carry on as
usual. mp_ring is implemented as standalone code and it should be
possible to use it in any driver with tx multiqueue. It also has:
- the ability to enqueue/dequeue multiple items. This might become
significant if packet batching is ever implemented.
- an abdication mechanism to allow a thread to give up writing tx
descriptors and have another if_transmit thread take over. A thread
that's writing tx descriptors can end up doing so for an unbounded
time period if a) there are other if_transmit threads continuously
feeding the software queue, and b) the chip keeps up with whatever the
thread is throwing at it.
- accurate statistics about interesting events even when the stats come
at the expense of additional branches/conditional code.

The NIC txq lock is uncontested on the fast path at this point. I've
left it there for synchronization with the control events (interface
up/down, modload/unload).

c) Add support for "type 1" coalescing work request in the normal NIC tx
path. This work request is optimized for frames with a single item in
the DMA gather list. These are very common when forwarding packets.
Note that netmap tx in cxgbe already uses these "type 1" work requests.

d) Do not request automatic cidx updates every 32 descriptors. Instead,
request updates via bits in individual work requests (still every 32
descriptors approximately). Also, request an automatic final update
when the queue idles after activity. This means NIC tx reclaim is still
performed lazily but it will catch up quickly as soon as the queue
idles. This seems to be the best middle ground and I'll probably do
something similar for netmap tx as well.

e) Implement a faster tx path for WRQs (used by TOE tx and control
queues, _not_ by the normal NIC tx). Allow work requests to be written
directly to the hardware descriptor ring if room is available. I will
convert t4_tom and iw_cxgbe modules to this faster style gradually.
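
As an illustration of item (d), the sketch below shows one simple way to
decide which work requests carry an "update cidx" request so that
updates arrive roughly every 32 descriptors. The struct, the function
name, and the counter-based policy are assumptions made for this
example; the commit message only states the approximate interval.

```c
#include <assert.h>
#include <stdbool.h>

#define UPDATE_INTERVAL	32	/* approximate interval, from item (d) */

struct txq_sketch {
	unsigned int since_update;	/* descriptors since the last request */
};

/*
 * Account for a work request occupying ndesc descriptors and report
 * whether it should carry the (hypothetical) "request a cidx update" bit.
 */
static bool
wr_needs_update(struct txq_sketch *txq, unsigned int ndesc)
{
	assert(ndesc > 0);
	txq->since_update += ndesc;
	if (txq->since_update >= UPDATE_INTERVAL) {
		txq->since_update = 0;
		return (true);
	}
	return (false);
}
```

Because a work request can span several descriptors, the request lands
on the first work request that crosses the threshold, which is why the
interval is only approximate.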

r276498:
cxgbe(4): remove buf_ring specific restriction on the txq size.

r277225:
Make cxgbe(4) buildable with the gcc in base.

r277226:
Allow cxgbe(4) to be built on i386. Driver attach will succeed only on
a subset of i386 systems.

r277227:
Plug cxgbe(4) back into !powerpc && !arm builds, instead of building it
on amd64 only.

r277230:
Build cxgbe(4) on powerpc64 too.

r277637:
Make sure the compiler flag to get cxgbe(4) to compile with gcc is used
only when gcc is being used. This is what r277225 should have been.

283856 01-Jun-2015 np

MFC r273480, r273750, r273753, r273797, and r274461.

r273480:
cxgbe/iw_cxgbe: wake up waiters after flushing the qp.

r273750:
Some cxgbe/iw_cxgbe fixes:
- Free rt in c4iw_connect only if it is allocated.
- Call soclose instead of so_shutdown if there is an abort from the peer.
- Close socket and return failure if TOE is not enabled.

r273753:
iwcm_event status needs to be populated for close_complete_upcall

r273797:
Always request a completion for every work request for iWARP. The
initial MPA exchange must be tracked this way so that t4_tom's state for
the tid is all clean at the time the tid transitions to RDMA mode. Once
it does, t4_tom is out of the way and iw_cxgbe uses the qp endpoints
directly.

r274461:
iw_cxgbe: don't forget to close the socket in c4iw_connect if soconnect
fails.

283854 31-May-2015 np

MFC r272719:

cxgbe/tom: don't leak resources tied to an active open request that
cannot be sent to the chip because a prerequisite L2 resolution
failed.

282486 05-May-2015 np

Backport some parts of r272200.
- a lock to protect indirect register access
- put code that deals with stats in a separate cxgbe_refresh_stats.

This is a direct commit to stable/10.

282367 03-May-2015 np

MFC r272183:

Make sure the adapter's management queue and the event queue are
available before any upper layer driver (TOE, iWARP, or iSCSI)
registers with the base cxgbe(4) driver.

282365 03-May-2015 np

MFC r272051:

cxgbe(4): Verify that the addresses in if_multiaddrs really are multicast
addresses. (The chip doesn't really care, it's just that it needs to be
told explicitly if unicast DMACs are checked for "hits" in the hash that
is used after the TCAM entries are all used up).
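
For reference, an Ethernet address is multicast when the least
significant bit of its first octet is set. A minimal sketch of such a
check follows; the function name is made up for this example, and
FreeBSD's ETHER_IS_MULTICAST macro performs the same bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the 6-byte Ethernet address is a group (multicast or
 * broadcast) address, i.e. the I/G bit of the first octet is set.
 */
static bool
mac_is_multicast(const uint8_t *mac)
{
	assert(mac != NULL);
	return ((mac[0] & 0x01) != 0);
}
```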

281955 24-Apr-2015 hiren

MFC r275358 r275483 r276982 - Removing M_FLOWID by hps@

r275358:
Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but no longer has any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.
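
The transmit-side check described above can be sketched in user-space C
like this. The struct and the HASHTYPE_* names are stand-ins for the
kernel's "m_pkthdr" fields and "M_HASHTYPE_XXX" macros, the numeric
values are illustrative only, and the modulo queue selection is an
assumption made for this example rather than any particular driver's
policy.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the definitions in sys/mbuf.h; values are illustrative. */
#define HASHTYPE_NONE	0
#define HASHTYPE_OPAQUE	255

struct pkthdr_sketch {
	uint32_t flowid;	/* meaningful only if rsstype != NONE */
	uint8_t rsstype;
};

#define HASHTYPE_GET(h)	((h)->rsstype)

/*
 * Pick a tx queue.  The old test of a flag in m_flags is replaced by a
 * comparison of the hash type against "none".
 */
static unsigned int
select_txq(const struct pkthdr_sketch *h, unsigned int ntxq)
{
	assert(ntxq > 0);
	if (HASHTYPE_GET(h) != HASHTYPE_NONE)
		return (h->flowid % ntxq);
	return (0);	/* no valid flowid: fall back to queue 0 */
}
```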

r275483:
Remove M_FLOWID from SCTP code.

r276982:
Remove no longer used "M_FLOWID" flag from mbuf.h and update the netisr
manpage.

Note: The FreeBSD version has been bumped.

Reviewed by: hps, tuexen
Sponsored by: Limelight Networks


/freebsd-10-stable/share/man/man9/netisr.9
/freebsd-10-stable/sys/dev/bxe/bxe.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_sge.c
t4_main.c
t4_sge.c
/freebsd-10-stable/sys/dev/e1000/if_igb.c
/freebsd-10-stable/sys/dev/ixgbe/ixgbe.c
/freebsd-10-stable/sys/dev/ixgbe/ixv.c
/freebsd-10-stable/sys/dev/ixl/ixl_txrx.c
/freebsd-10-stable/sys/dev/mxge/if_mxge.c
/freebsd-10-stable/sys/dev/netmap/netmap_freebsd.c
/freebsd-10-stable/sys/dev/oce/oce_if.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_isr.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_os.c
/freebsd-10-stable/sys/dev/qlxge/qls_isr.c
/freebsd-10-stable/sys/dev/qlxge/qls_os.c
/freebsd-10-stable/sys/dev/sfxge/sfxge_rx.c
/freebsd-10-stable/sys/dev/sfxge/sfxge_tx.c
/freebsd-10-stable/sys/dev/virtio/network/if_vtnet.c
/freebsd-10-stable/sys/dev/vmware/vmxnet3/if_vmx.c
/freebsd-10-stable/sys/dev/vxge/vxge.c
/freebsd-10-stable/sys/net/flowtable.c
/freebsd-10-stable/sys/net/ieee8023ad_lacp.c
/freebsd-10-stable/sys/net/if_lagg.c
/freebsd-10-stable/sys/net/if_lagg.h
/freebsd-10-stable/sys/net/netisr.c
/freebsd-10-stable/sys/netinet/in_pcb.h
/freebsd-10-stable/sys/netinet/ip_output.c
/freebsd-10-stable/sys/netinet/sctp_indata.c
/freebsd-10-stable/sys/netinet/sctp_input.c
/freebsd-10-stable/sys/netinet/sctp_output.c
/freebsd-10-stable/sys/netinet/sctp_pcb.c
/freebsd-10-stable/sys/netinet/sctp_structs.h
/freebsd-10-stable/sys/netinet/sctputil.c
/freebsd-10-stable/sys/netinet/tcp_input.c
/freebsd-10-stable/sys/netinet/tcp_syncache.c
/freebsd-10-stable/sys/netinet6/sctp6_usrreq.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_rx.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_tx.c
/freebsd-10-stable/sys/sys/mbuf.h
/freebsd-10-stable/sys/sys/param.h
281315 09-Apr-2015 jhb

MFC 279892:
Resize receive socket buffers that support autosizing when receiving
TCP data via direct data placement.

281264 08-Apr-2015 np

MFC r279092:

cxgbe(4): there is no need to force an "unimplemented" panic needlessly.
The calls to free_nm_txq and free_nm_rxq are made just a few lines prior
to the panic.

281263 08-Apr-2015 np

MFC r278303:

cxgbe(4): Add a minimal if_cxl module that pulls in the real driver as
a dependency. This ensures "ifconfig cxl<n> ..." does the right thing
even when it's run with no driver loaded.

if_cxl.ko is the tiniest module in /boot/kernel.

281259 08-Apr-2015 np

MFC r278485:

cxgbe(4): allow the SET_FILTER_MODE ioctl to change the mode when it's
safe to do so.

281253 08-Apr-2015 np

MFC r279243-r279246, r279251, r279691, r279700, and r279701.

r279243:
cxgbe(4): request an automatic tx update when a netmap txq idles.

r279244:
cxgbe(4): wait for the hardware to catch up before destroying a netmap txq.

r279245:
cxgbe(4): do not set the netmap rxq interrupts on a hair-trigger.

r279246:
cxgbe(4): set up congestion management for netmap rx queues.

The hw.cxgbe.cong_drop knob controls the response of the chip when
netmap queues are congested.

r279251:
cxgbe(4): allow tx hardware checksumming on the netmap interface.

It is disabled by default but users can set IFCAP_TXCSUM on the
netmap ifnet (ifconfig ncxl0 txcsum) to override netmap and force
the hardware to calculate and insert proper IP and L4 checksums in
outbound frames.

r279691:
cxgbe(4): provide the correct size of freelists associated with netmap
rx queues to the chip. This will fix many problems with native netmap
rx on ncxl/ncxgbe interfaces.

r279700:
cxgbe(4): knobs to experiment with the interrupt coalescing timer for
netmap rx queues, and the "batchiness" of rx updates sent to the chip.

These knobs will probably become per-rxq in the near future and will be
documented only after their final form is decided.

r279701:
cxgbe(4): experimental rx packet sink for netmap queues. This is not
intended for general use.

281252 08-Apr-2015 np

MFC r280403:

cxgbe(4): Do not call sbuf_trim on an sbuf with a drain function.

281251 08-Apr-2015 np

MFC r279969:

cxgbe(4): fix if_media handling for T520-BT cards. 1Gbps and 100Mbps
are valid for this card.

281250 08-Apr-2015 np

MFC r278372:

cxgbe(4): adapter_full_init is always a synchronized operation.

281249 08-Apr-2015 np

MFC r278371:

cxgbe(4): a change to the synchronization rules within the driver.
This is purely cosmetic because the new rules are already followed.

281248 08-Apr-2015 np

MFC r278342:

cxgbe(4): fix a test made while enabling TOE.

281247 08-Apr-2015 np

MFC r277102, r277135.

r277102:
cxgbe/iw_cxgbe: allow any size during the initial MPA exchange.

r277135:
cxgbe/iw_cxgbe: fix whitespace nit in r277102.

281245 08-Apr-2015 np

MFC r276729, r276775.

r276729:
cxgbe/tom: use vmem(9) as the DDP page pod allocator.

r276775:
cxgbe/tom: allocate page pod addresses instead of ppod#.

281244 08-Apr-2015 np

MFC r276597:

cxgbe/tom: do not engage the TOE's payload chopper for payload < 2 MSS
or for 10Gbps ports.

281241 08-Apr-2015 np

MFC r276728:

cxgbe(4): fix the description of a strange bunch of counters.

281214 07-Apr-2015 np

MFC r276574:

cxgbe/tom: fix the MSS calculation for IPv6 connections handled by the TOE.

281213 07-Apr-2015 np

MFC r276570:

cxgbe/tom: log some more details in send_flowc_wr.

281212 07-Apr-2015 np

MFC r275539, r275554.

r275539:
cxgbe(4): Allow for different pad and pack boundaries for different
adapters. Set the pack boundary for T5 cards to be the same as the
PCIe max payload size. The chip likes it this way.

In this revision the driver allocates rx buffers that align on both
boundaries. This is not a strict requirement and a followup commit
will switch the driver to a more relaxed allocation strategy.

r275554:
cxgbe(4): allow the driver to use rx buffers that do not end on a pack
boundary.
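
One way to read "align on both boundaries" is that the usable length of
an rx buffer is a multiple of both the pad and the pack boundary. The
sketch below illustrates that arithmetic only; it assumes both
boundaries are powers of two, and every name in it is hypothetical
rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool
is_pow2(size_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Largest length <= bufsize that ends on both boundaries.  With
 * power-of-two boundaries the larger one is a multiple of the smaller,
 * so rounding down to the larger boundary satisfies both.
 */
static size_t
usable_len(size_t bufsize, size_t pad, size_t pack)
{
	size_t b = pad > pack ? pad : pack;

	assert(is_pow2(pad) && is_pow2(pack));
	return (bufsize & ~(b - 1));
}

/* True if a buffer of this length ends on both boundaries. */
static bool
ends_on_both(size_t len, size_t pad, size_t pack)
{
	assert(is_pow2(pad) && is_pow2(pack));
	return (len % pad == 0 && len % pack == 0);
}
```

Under this reading, the relaxed strategy of r275554 corresponds to
accepting lengths for which the pack-boundary half of the check fails.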

281211 07-Apr-2015 np

MFC r275733:

Move KTR_CXGBE from t4_tom.h to adapter.h so that the base if_cxgbe
code can use it too.

281207 07-Apr-2015 np

MFC r274456:

Fix some bad interaction between cxgbe(4) and lacp lagg(4) that could
leave a port permanently disabled when a copper cable is unplugged and
then plugged right back in.

lacp_linkstate goes looking for the current ifmedia on a link state
change and it could get stale information from cxgbe(4) on a module
unplug followed by replug. The fix is to process module events before
link-state events within the driver, and to always rebuild the ifmedia
list on a module change event (instead of rebuilding it lazily).

Thanks to asomers@ for the problem report and detailed analysis to go
with it.

278319 06-Feb-2015 jhb

MFC 275808:
Check for SS_NBIO in so->so_state instead of sb->sb_flags in
soreceive_stream().

278286 05-Feb-2015 jhb

MFC 274402:
Add device ID for the T502-BT (dual-port 1G) adapter.

275092 26-Nov-2014 np

MFC r274724:
cxgbe(4): figure out the max payload size and save it for later.

274612 17-Nov-2014 np

MFC r274351:

cxgbe(4): adjust PMRX and PMTX parameters.

274449 12-Nov-2014 np

MFC r271328:

Whitespace nit.

274446 12-Nov-2014 np

MFC r273615:

cxgbe(4): bump up PF4's share of some global resources.

This increases the size of the per-port RSS slice and also allows the
driver to use a larger number of tx and rx queues.

274440 12-Nov-2014 np

MFC r272190:

cxgbe(4): explicitly set various if_hw_tso* values.

273736 27-Oct-2014 hselasky

MFC r263710, r273377, r273378, r273423 and r273455:

- De-vnet hash sizes and hash masks.
- Fix multiple issues related to arguments passed to SYSCTL macros.

Sponsored by: Mellanox Technologies


/freebsd-10-stable/sys/amd64/amd64/fpu.c
/freebsd-10-stable/sys/arm/arm/busdma_machdep-v6.c
/freebsd-10-stable/sys/arm/arm/busdma_machdep.c
/freebsd-10-stable/sys/cam/scsi/scsi_sa.c
/freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
/freebsd-10-stable/sys/cddl/dev/dtrace/dtrace_sysctl.c
/freebsd-10-stable/sys/compat/ndis/kern_ndis.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_asus.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_asus_wmi.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_hp.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_ibm.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_rapidstart.c
/freebsd-10-stable/sys/dev/acpi_support/acpi_sony.c
/freebsd-10-stable/sys/dev/bxe/bxe.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_sge.c
t4_main.c
/freebsd-10-stable/sys/dev/e1000/if_em.c
/freebsd-10-stable/sys/dev/e1000/if_igb.c
/freebsd-10-stable/sys/dev/e1000/if_lem.c
/freebsd-10-stable/sys/dev/hatm/if_hatm.c
/freebsd-10-stable/sys/dev/ixgbe/ixgbe.c
/freebsd-10-stable/sys/dev/ixgbe/ixv.c
/freebsd-10-stable/sys/dev/ixl/if_ixl.c
/freebsd-10-stable/sys/dev/mpr/mpr.c
/freebsd-10-stable/sys/dev/mps/mps.c
/freebsd-10-stable/sys/dev/mrsas/mrsas.c
/freebsd-10-stable/sys/dev/mrsas/mrsas.h
/freebsd-10-stable/sys/dev/mxge/if_mxge.c
/freebsd-10-stable/sys/dev/oce/oce_sysctl.c
/freebsd-10-stable/sys/dev/qlxgb/qla_os.c
/freebsd-10-stable/sys/dev/qlxgbe/ql_os.c
/freebsd-10-stable/sys/dev/rt/if_rt.c
/freebsd-10-stable/sys/dev/sound/pci/hda/hdaa.c
/freebsd-10-stable/sys/dev/vxge/vxge.c
/freebsd-10-stable/sys/dev/xen/netfront/netfront.c
/freebsd-10-stable/sys/fs/devfs/devfs_devs.c
/freebsd-10-stable/sys/fs/fuse/fuse_main.c
/freebsd-10-stable/sys/fs/fuse/fuse_vfsops.c
/freebsd-10-stable/sys/fs/nfsserver/nfs_nfsdkrpc.c
/freebsd-10-stable/sys/geom/geom_kern.c
/freebsd-10-stable/sys/kern/kern_cpuset.c
/freebsd-10-stable/sys/kern/kern_descrip.c
/freebsd-10-stable/sys/kern/kern_mib.c
/freebsd-10-stable/sys/kern/kern_synch.c
/freebsd-10-stable/sys/kern/subr_devstat.c
/freebsd-10-stable/sys/kern/subr_kdb.c
/freebsd-10-stable/sys/kern/subr_uio.c
/freebsd-10-stable/sys/kern/vfs_cache.c
/freebsd-10-stable/sys/mips/mips/busdma_machdep.c
/freebsd-10-stable/sys/net/if_lagg.c
/freebsd-10-stable/sys/net/pfvar.h
/freebsd-10-stable/sys/net80211/ieee80211_ht.c
/freebsd-10-stable/sys/net80211/ieee80211_hwmp.c
/freebsd-10-stable/sys/net80211/ieee80211_mesh.c
/freebsd-10-stable/sys/net80211/ieee80211_superg.c
/freebsd-10-stable/sys/netgraph/bluetooth/common/ng_bluetooth.c
/freebsd-10-stable/sys/netgraph/ng_base.c
/freebsd-10-stable/sys/netgraph/ng_socket.c
/freebsd-10-stable/sys/netinet/cc/cc_chd.c
/freebsd-10-stable/sys/netinet/tcp_reass.c
/freebsd-10-stable/sys/netipsec/ipsec.h
/freebsd-10-stable/sys/netipx/ipx_proto.c
/freebsd-10-stable/sys/netpfil/pf/if_pfsync.c
/freebsd-10-stable/sys/netpfil/pf/pf.c
/freebsd-10-stable/sys/netpfil/pf/pf_ioctl.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/mlx4_en.h
/freebsd-10-stable/sys/powerpc/powermac/fcu.c
/freebsd-10-stable/sys/powerpc/powermac/smu.c
/freebsd-10-stable/sys/powerpc/powerpc/busdma_machdep.c
/freebsd-10-stable/sys/powerpc/powerpc/cpu.c
/freebsd-10-stable/sys/sys/sysctl.h
/freebsd-10-stable/sys/vm/memguard.c
/freebsd-10-stable/sys/vm/vm_kern.c
/freebsd-10-stable/sys/x86/x86/busdma_bounce.c
273246 18-Oct-2014 hselasky

MFC r273135:
Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
cancelling and starting a timeout.
- Implement more Linux scatterlist
functions.

Sponsored by: Mellanox Technologies


/freebsd-10-stable/sys/contrib/rdma/krping/krping.c
iw_cxgbe/iw_cxgbe.h
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/addr.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/agent.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/ucm.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/uverbs_main.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/ah.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_config_reg.h
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_memfree.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_uar.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/Makefile
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_ethtool.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_frag.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_netdev.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_params.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_selftest.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_tx.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/pd.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/xrcd.c
/freebsd-10-stable/sys/ofed/include/asm/page.h
/freebsd-10-stable/sys/ofed/include/linux/completion.h
/freebsd-10-stable/sys/ofed/include/linux/etherdevice.h
/freebsd-10-stable/sys/ofed/include/linux/ethtool.h
/freebsd-10-stable/sys/ofed/include/linux/inet.h
/freebsd-10-stable/sys/ofed/include/linux/mlx4/device.h
/freebsd-10-stable/sys/ofed/include/linux/mlx4/driver.h
/freebsd-10-stable/sys/ofed/include/linux/mount.h
/freebsd-10-stable/sys/ofed/include/linux/netdevice.h
/freebsd-10-stable/sys/ofed/include/linux/scatterlist.h
/freebsd-10-stable/sys/ofed/include/linux/vmalloc.h
/freebsd-10-stable/sys/ofed/include/rdma/ib_addr.h
/freebsd-10-stable/sys/ofed/include/rdma/ib_smi.h
/freebsd-10-stable/sys/ofed/include/rdma/ib_user_cm.h
271961 22-Sep-2014 np

MFC r271450:
cxgbe(4): knobs to enable/disable PAUSE frame based flow control.

Approved by: re (glebius)

271127 04-Sep-2014 hselasky

MFC r270710 and r270821:
- Update the OFED Linux Emulation layer as a preparation for a
hardware driver update from Mellanox Technologies.
- Remove empty files from the OFED Linux Emulation layer.
- Fix compile warnings related to printf() and the "%lld" and "%llx"
format specifiers.
- Add some missing 2-clause BSD copyrights.
- Add "Mellanox Technologies, Ltd." to list of copyright holders.
- Add some new compatibility files.
- Fix order of uninit in the mlx4ib module to avoid crash at unload
using the new module_exit_order() function.

Sponsored by: Mellanox Technologies


/freebsd-10-stable/sys/contrib/rdma/krping/krping.c
/freebsd-10-stable/sys/dev/cxgb/cxgb_osdep.h
iw_cxgbe/cm.c
iw_cxgbe/qp.c
/freebsd-10-stable/sys/modules/mlx4/Makefile
/freebsd-10-stable/sys/modules/mlx4ib/Makefile
/freebsd-10-stable/sys/modules/mlxen/Makefile
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/addr.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/cm.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/device.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/iwcm.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/sa_query.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/sysfs.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/ucm.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/user_mad.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/uverbs_cmd.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/core/uverbs_main.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/alias_GUID.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/cm.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/mad.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/main.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/mlx4_ib.h
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/mr.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/qp.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mlx4/sysfs.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_allocator.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_main.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_provider.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/hw/mthca/mthca_reset.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/ulp/ipoib/ipoib_main.c
/freebsd-10-stable/sys/ofed/drivers/infiniband/ulp/sdp/sdp.h
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/alloc.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/cmd.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/cq.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_netdev.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/en_rx.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/eq.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/fw.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/main.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/mcg.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/mr.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/pd.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/qp.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/reset.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/resource_tracker.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/sense.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/srq.c
/freebsd-10-stable/sys/ofed/drivers/net/mlx4/xrcd.c
/freebsd-10-stable/sys/ofed/include/asm/atomic-long.h
/freebsd-10-stable/sys/ofed/include/asm/atomic.h
/freebsd-10-stable/sys/ofed/include/asm/byteorder.h
/freebsd-10-stable/sys/ofed/include/asm/current.h
/freebsd-10-stable/sys/ofed/include/asm/fcntl.h
/freebsd-10-stable/sys/ofed/include/asm/io.h
/freebsd-10-stable/sys/ofed/include/asm/page.h
/freebsd-10-stable/sys/ofed/include/asm/pgtable.h
/freebsd-10-stable/sys/ofed/include/asm/semaphore.h
/freebsd-10-stable/sys/ofed/include/asm/system.h
/freebsd-10-stable/sys/ofed/include/asm/types.h
/freebsd-10-stable/sys/ofed/include/asm/uaccess.h
/freebsd-10-stable/sys/ofed/include/linux/atomic.h
/freebsd-10-stable/sys/ofed/include/linux/bitmap.h
/freebsd-10-stable/sys/ofed/include/linux/bitops.h
/freebsd-10-stable/sys/ofed/include/linux/cache.h
/freebsd-10-stable/sys/ofed/include/linux/cdev.h
/freebsd-10-stable/sys/ofed/include/linux/clocksource.h
/freebsd-10-stable/sys/ofed/include/linux/compat.h
/freebsd-10-stable/sys/ofed/include/linux/compiler.h
/freebsd-10-stable/sys/ofed/include/linux/completion.h
/freebsd-10-stable/sys/ofed/include/linux/ctype.h
/freebsd-10-stable/sys/ofed/include/linux/delay.h
/freebsd-10-stable/sys/ofed/include/linux/device.h
/freebsd-10-stable/sys/ofed/include/linux/dma-attrs.h
/freebsd-10-stable/sys/ofed/include/linux/dma-mapping.h
/freebsd-10-stable/sys/ofed/include/linux/dmapool.h
/freebsd-10-stable/sys/ofed/include/linux/err.h
/freebsd-10-stable/sys/ofed/include/linux/errno.h
/freebsd-10-stable/sys/ofed/include/linux/etherdevice.h
/freebsd-10-stable/sys/ofed/include/linux/ethtool.h
/freebsd-10-stable/sys/ofed/include/linux/file.h
/freebsd-10-stable/sys/ofed/include/linux/fs.h
/freebsd-10-stable/sys/ofed/include/linux/gfp.h
/freebsd-10-stable/sys/ofed/include/linux/hardirq.h
/freebsd-10-stable/sys/ofed/include/linux/idr.h
/freebsd-10-stable/sys/ofed/include/linux/if_arp.h
/freebsd-10-stable/sys/ofed/include/linux/if_ether.h
/freebsd-10-stable/sys/ofed/include/linux/if_vlan.h
/freebsd-10-stable/sys/ofed/include/linux/in.h
/freebsd-10-stable/sys/ofed/include/linux/in6.h
/freebsd-10-stable/sys/ofed/include/linux/inet.h
/freebsd-10-stable/sys/ofed/include/linux/inetdevice.h
/freebsd-10-stable/sys/ofed/include/linux/init.h
/freebsd-10-stable/sys/ofed/include/linux/interrupt.h
/freebsd-10-stable/sys/ofed/include/linux/io-mapping.h
/freebsd-10-stable/sys/ofed/include/linux/io.h
/freebsd-10-stable/sys/ofed/include/linux/ioctl.h
/freebsd-10-stable/sys/ofed/include/linux/jiffies.h
/freebsd-10-stable/sys/ofed/include/linux/kdev_t.h
/freebsd-10-stable/sys/ofed/include/linux/kernel.h
/freebsd-10-stable/sys/ofed/include/linux/kmod.h
/freebsd-10-stable/sys/ofed/include/linux/kobject.h
/freebsd-10-stable/sys/ofed/include/linux/kref.h
/freebsd-10-stable/sys/ofed/include/linux/kthread.h
/freebsd-10-stable/sys/ofed/include/linux/ktime.h
/freebsd-10-stable/sys/ofed/include/linux/linux_compat.c
/freebsd-10-stable/sys/ofed/include/linux/linux_idr.c
/freebsd-10-stable/sys/ofed/include/linux/linux_radix.c
/freebsd-10-stable/sys/ofed/include/linux/list.h
/freebsd-10-stable/sys/ofed/include/linux/lockdep.h
/freebsd-10-stable/sys/ofed/include/linux/log2.h
/freebsd-10-stable/sys/ofed/include/linux/math64.h
/freebsd-10-stable/sys/ofed/include/linux/miscdevice.h
/freebsd-10-stable/sys/ofed/include/linux/mm.h
/freebsd-10-stable/sys/ofed/include/linux/module.h
/freebsd-10-stable/sys/ofed/include/linux/moduleparam.h
/freebsd-10-stable/sys/ofed/include/linux/mount.h
/freebsd-10-stable/sys/ofed/include/linux/mutex.h
/freebsd-10-stable/sys/ofed/include/linux/net.h
/freebsd-10-stable/sys/ofed/include/linux/netdevice.h
/freebsd-10-stable/sys/ofed/include/linux/notifier.h
/freebsd-10-stable/sys/ofed/include/linux/page.h
/freebsd-10-stable/sys/ofed/include/linux/pci.h
/freebsd-10-stable/sys/ofed/include/linux/poll.h
/freebsd-10-stable/sys/ofed/include/linux/radix-tree.h
/freebsd-10-stable/sys/ofed/include/linux/random.h
/freebsd-10-stable/sys/ofed/include/linux/rbtree.h
/freebsd-10-stable/sys/ofed/include/linux/rtnetlink.h
/freebsd-10-stable/sys/ofed/include/linux/rwlock.h
/freebsd-10-stable/sys/ofed/include/linux/rwsem.h
/freebsd-10-stable/sys/ofed/include/linux/scatterlist.h
/freebsd-10-stable/sys/ofed/include/linux/sched.h
/freebsd-10-stable/sys/ofed/include/linux/semaphore.h
/freebsd-10-stable/sys/ofed/include/linux/slab.h
/freebsd-10-stable/sys/ofed/include/linux/socket.h
/freebsd-10-stable/sys/ofed/include/linux/spinlock.h
/freebsd-10-stable/sys/ofed/include/linux/stddef.h
/freebsd-10-stable/sys/ofed/include/linux/string.h
/freebsd-10-stable/sys/ofed/include/linux/sysfs.h
/freebsd-10-stable/sys/ofed/include/linux/timer.h
/freebsd-10-stable/sys/ofed/include/linux/types.h
/freebsd-10-stable/sys/ofed/include/linux/uaccess.h
/freebsd-10-stable/sys/ofed/include/linux/vmalloc.h
/freebsd-10-stable/sys/ofed/include/linux/wait.h
/freebsd-10-stable/sys/ofed/include/linux/workqueue.h
/freebsd-10-stable/sys/ofed/include/net/addrconf.h
/freebsd-10-stable/sys/ofed/include/net/arp.h
/freebsd-10-stable/sys/ofed/include/net/if_inet6.h
/freebsd-10-stable/sys/ofed/include/net/ip.h
/freebsd-10-stable/sys/ofed/include/net/ip6_route.h
/freebsd-10-stable/sys/ofed/include/net/ipv6.h
/freebsd-10-stable/sys/ofed/include/net/neighbour.h
/freebsd-10-stable/sys/ofed/include/net/netevent.h
/freebsd-10-stable/sys/ofed/include/net/tcp.h
/freebsd-10-stable/sys/ofed/include/rdma/ib_umem.h
/freebsd-10-stable/sys/ofed/include/rdma/ib_verbs.h
270297 21-Aug-2014 np

MFC r266571, r266757, r268536, r269076, r269364, r269366, r269411,
r269413, r269428, r269440, r269537, r269644, r269731, and the cxgbe
portion of r270063.

r266571:
cxgbe(4): Remove stray if_up from the code that creates the tracing ifnet.

r266757:
cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards.
Netmap gets its own hardware-assisted virtual interface and won't take
over or disrupt the "normal" interface in any way. You can use both
simultaneously.

For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface
(note the 'n' prefix) in the hardware to accompany each cxl<N>
interface. These two ifnets per port share the same wire but really
are separate interfaces in the hardware and software. Each gets its own
L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You
should run netmap on the 'n' interfaces only; that's what they are for.

With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port
of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now.
Single port receive is at 33Mpps but this is very much a work in
progress. I expect it to be closer to 40Mpps once done. In any case
the current effort can already saturate multiple 10G ports of a T5 card
at the smallest legal packet size. T4 gear is totally untested.

trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef
881.952141 main [1621] interface is ncxl0
881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0
881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0
881.962540 main [1804] mapped 334980KB at 0x801dff000
Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef)
881.962562 main [1882] Sending 512 packets every 0.000000000 s
881.962563 main [1884] Wait 2 secs for phy reset
884.088516 main [1886] Ready...
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1
884.088607 sender_body [996] start
884.093246 sender_body [1064] drop copy
885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec)
886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec)
887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec)
888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec)
889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec)
890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec)
891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec)
892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec)
893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec)
894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec)
895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec)
896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec)
...
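
As a rough cross-check of the figures above, the sketch below computes the
theoretical packet rate of an Ethernet link at the minimum frame size and
recomputes one pps sample from the pkt-gen output. It assumes standard
Ethernet framing (64-byte minimum frame, 8 bytes of preamble and start-of-frame
delimiter, 12-byte inter-frame gap). The function names are illustrative and
are not part of pkt-gen or the driver.

```python
# Assumed standard Ethernet framing overheads, in bytes.
MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def line_rate_pps(link_bps):
    """Theoretical packets per second at the minimum frame size."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_on_wire


def measured_pps(pkts, usec):
    """Packets per second from a packet count and an interval in microseconds."""
    return pkts * 1_000_000 // usec


if __name__ == "__main__":
    print(round(line_rate_pps(10_000_000_000)))   # one 10G port
    print(round(line_rate_pps(40_000_000_000)))   # one 40G port
    print(measured_pps(45289533, 1001840))        # first sample in the log above
```

By this estimate a single 10G port tops out at about 14.88 Mpps, so the
~45 Mpps tx figure is roughly three 10G ports' worth of minimum-size frames.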

r268536:
cxgbe(4): Add an iSCSI softc to the adapter structure.

r269076:
Some hooks in cxgbe(4) for the offloaded iSCSI driver.

r269364:
Improve compliance with style.Makefile(5).

r269366:
List one file per line in the Makefiles. This makes it easier to read
diffs when a file is added or removed.

r269411:
cxgbe(4): minor optimizations in ingress queue processing.

Reorganize struct sge_iq. Make the iq entry size a compile time
constant. While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly.

r269413:
cxgbe(4): Fix an off-by-one error when looking for the BAR2 doorbell
address of an egress queue.

r269428:
cxgbe(4): some optimizations in freelist handling.

r269440:
cxgbe(4): Remove an unused version of t4_enable_vi.

r269537:
cxgbe(4): Do not run any sleepable code in the SIOCSIFFLAGS handler when
IFF_PROMISC or IFF_ALLMULTI is being flipped. bpf(4) holds its global
mutex around ifpromisc in at least the bpf_dtor path.

r269644:
cxgbe(4): Let caller specify whether it's ok to sleep in
t4_sched_config and t4_sched_params.

r269731:
cxgbe(4): Do not poke T4-only registers on a T5 (and vice versa).

Relnotes: Yes (native netmap support for Chelsio T4/T5 cards)

270051 16-Aug-2014 bz

MFC r266596:

Move the tcp_fields_to_host() and tcp_fields_to_net() (inline)
functions to the tcp_var.h header file in order to avoid further
duplication with upcoming commits.

Reviewed by: np

269356 31-Jul-2014 np

MFC r268971 and r269032.

r268971:
Simplify r267600, there's no need to distinguish between allocated and
inlined mbufs.

r269032:
cxgbe(4): Keep track of the clusters that have to be freed by the
custom free routine (rxb_free) in the driver. Fail MOD_UNLOAD with
EBUSY if any such cluster has been handed up to the kernel but hasn't
been freed yet. This prevents a panic later when the cluster finally
needs to be freed but rxb_free is gone from the kernel.
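
The bookkeeping described in r269032 can be modeled roughly as follows. This
is a simplified sketch, not the driver's code: the class and method names
other than rxb_free are hypothetical, and a plain counter stands in for
whatever tracking the driver actually does.

```python
import errno


class RxClusterTracker:
    """Counts rx clusters handed up to the kernel but not yet freed."""

    def __init__(self):
        self.outstanding = 0

    def hand_up(self):
        # A cluster leaves the driver attached to an mbuf.
        self.outstanding += 1

    def rxb_free(self):
        # The custom free routine runs when the kernel is done with it.
        assert self.outstanding > 0
        self.outstanding -= 1

    def mod_unload(self):
        # Refuse to unload while any cluster could still call rxb_free.
        return errno.EBUSY if self.outstanding else 0
```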

269082 25-Jul-2014 np

MFC r268640 and r268989.

r268640:
Allow multi-byte reads in the private CHELSIO_T4_GET_I2C ioctl. The
firmware allows up to 48B to be read this way but the driver limits
itself to 8B at a time to remain compatible with old cxgbetool
binaries.
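
A caller that wants more than 8B would presumably issue several requests.
The sketch below shows that chunking under the 8B limit stated above;
read_chunk is a hypothetical callback standing in for one
CHELSIO_T4_GET_I2C request, and its signature is an assumption.

```python
DRIVER_MAX_CHUNK = 8  # bytes per request, per the commit message


def read_i2c(read_chunk, offset, length, max_chunk=DRIVER_MAX_CHUNK):
    """Read `length` bytes starting at `offset`, at most max_chunk at a time.

    read_chunk(offset, nbytes) is a hypothetical stand-in for a single
    ioctl request and must return exactly nbytes bytes.
    """
    data = bytearray()
    while len(data) < length:
        n = min(max_chunk, length - len(data))
        chunk = read_chunk(offset + len(data), n)
        if len(chunk) != n:
            raise IOError("short I2C read")
        data += chunk
    return bytes(data)
```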

r268989:
Add missing newline to an error message.

268823 18-Jul-2014 np

MFC r268706:
cxgbe(4): Display CF facility correctly in the device log.

267849 25-Jun-2014 np

MFC r267757:

cxgbe(4): Update the bundled T4 and T5 firmwares to version 1.11.27.0.

Obtained from: Chelsio

267764 23-Jun-2014 np

MFC r267689:

Consider the total number of descriptors available (and not just those
that are ready to be reclaimed) when deciding whether to resume tx after
a stall.

267694 21-Jun-2014 np

MFC r267600:

cxgbe(4): Fix bug in the fast rx buffer recycle path. In some cases rx
buffers were getting recycled when they should have been left alone.

267244 08-Jun-2014 np

MFC r267082:
cxgbe(4): Properly account for the freelist buffers used when returning
early from service_iq due to a budget restriction. This fixes a potential
rx hang when using INTx.

266965 02-Jun-2014 np

MFC r266908:

cxgbe(4): Fix a NULL dereference when the very first call to
get_scatter_segment() in get_fl_payload() fails. While here,
fix the code to adjust fl_bufs_used when a failure occurs for
any other scatter segment.

265426 06-May-2014 np

MFC r259382:

Read card capabilities after firmware initialization, instead of setting
them up as part of firmware initialization (which the driver gets to do
only if it's the master driver).

Read the range of tids available for the ETHOFLD functionality if it's
enabled.

New is_ftid() and is_etid() functions to test whether a tid falls within
the range of filter tids or ETHOFLD tids respectively.
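
Range tests of this kind usually reduce to a base and a count. A minimal
sketch, assuming each range is described by a base tid and a number of tids
(the parameter names are hypothetical, and the real functions take driver
structures rather than plain integers):

```python
def in_tid_range(tid, base, count):
    """True if tid lies in [base, base + count). An empty range matches nothing."""
    return count > 0 and base <= tid < base + count


def is_ftid(tid, ftid_base, nftids):
    # Filter tids.
    return in_tid_range(tid, ftid_base, nftids)


def is_etid(tid, etid_base, netids):
    # ETHOFLD tids; netids would be 0 when ETHOFLD is not enabled.
    return in_tid_range(tid, etid_base, netids)
```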

265425 06-May-2014 np

MFC r263317, r263412, and r263451.

r263317:
cxgbe(4): significant rx rework.

- More flexible cluster size selection, including the ability to fall
back to a safe cluster size (PAGE_SIZE from zone_jumbop by default) in
case an allocation of a larger size fails.
- A single get_fl_payload() function that assembles the payload into an
mbuf chain for any kind of freelist. This replaces two variants: one
for freelists with buffer packing enabled and another for those without.
- Buffer packing with any sized cluster. It was limited to 4K clusters
only before this change.
- Enable buffer packing for TOE rx queues as well.
- Statistics and tunables to go with all these changes. The driver's
man page will be updated separately.
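
The fallback in the first item above can be sketched as follows. This is an
illustration only: try_alloc is a hypothetical allocator callback that
returns None on failure, and 4096 is assumed for PAGE_SIZE.

```python
SAFE_CLUSTER_SIZE = 4096  # assumption: PAGE_SIZE on the target platform


def alloc_rx_cluster(preferred_size, try_alloc, safe_size=SAFE_CLUSTER_SIZE):
    """Try the preferred cluster size, then fall back to the safe size."""
    buf = try_alloc(preferred_size)
    if buf is None and preferred_size > safe_size:
        # The larger allocation failed; retry with the safe size.
        buf = try_alloc(safe_size)
    return buf
```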

r263412:
cxgbe(4): if_iqdrops statistic should include tunnel congestion drops.

r263451:
cxgbe(4): man page updates.

265421 06-May-2014 np

MFC r260210 (by adrian@):
Add an option to enable or disable the small RX packet copying that
is done to improve performance of small frames.

When doing RX packing, the RX copying isn't necessarily required.

265410 06-May-2014 np

MFC r261533, r261536, r261537, and r263457.

r261533:
cxgbe(4): Use the port's tx channel to identify it to t4_clr_port_stats.

r261536:
cxgbe(4): The T5 allows for a different freelist starvation threshold
for queues with buffer packing. Use the correct value to calculate a
freelist's low water mark.

r261537:
cxgbe(4): Use the rx channel map (instead of the tx channel map) as the
congestion channel map.

r263457:
cxgbe(4): Recognize the "spider" configuration where a T5 card's 40G
QSFP port is presented as 4 distinct 10G SFP+ ports to the driver.

264736 21-Apr-2014 emax

MFC r264621

Use the correct (integer) type for the temperature sysctl.

Reviewed by: np, scottl
Obtained from: Netflix

264493 15-Apr-2014 scottl

MFC r261558

Add a new sysctl, dev.cxgbe.N.rsrv_noflow, and a companion tunable,
hw.cxgbe.rsrv_noflow. When set, queue 0 of the port is reserved for
TX packets without a flowid. The hash value of packets with a flowid
is bumped up by 1. The intent is to provide a private queue for
link-level packets like LACP that is unlikely to overflow or suffer
deep queue latency.
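
One way to read the queue selection described above, as a sketch. The
function and its parameters are hypothetical, and the modulo mapping is an
assumption about how a flowid is spread over the remaining queues.

```python
def select_txq(flowid, ntxq, rsrv_noflow):
    """Pick a tx queue index for a packet.

    flowid is None for packets without a flowid (e.g. link-level traffic).
    """
    if flowid is None:
        return 0
    if rsrv_noflow and ntxq > 1:
        # Queue 0 is reserved; spread flows over queues 1..ntxq-1.
        return 1 + flowid % (ntxq - 1)
    return flowid % ntxq
```

With the knob set and 8 queues, flowid-less packets go to queue 0 and all
other traffic lands on queues 1 through 7.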

262132 17-Feb-2014 dim

MFC r261907:

In cxgbe, conditionalize the t4_pgprot_wc() function, since it is only
used when DOT5 is defined.

Reviewed by: np

259804 24-Dec-2013 np

MFC r259527:

Do not create a hardware IPv6 server if the listen address is not
in6addr_any and is not in the CLIP table either. This fixes a reported
TOE+IPv6 NULL-dereference panic in do_pass_open_rpl().

While here, stop creating hardware servers for any loopback address.
It's just a waste of server tids.

259241 12-Dec-2013 np

MFC r259145:
Unstaticize t4_list and t4_uld_list. This works around a clang
annoyance[1] and allows kgdb to find these symbols.

[1] http://lists.freebsd.org/pipermail/freebsd-hackers/2012-November/041166.html

259142 09-Dec-2013 np

MFC r257654, r257772, r258441, r258689, r258698, r258879, r259048, and
r259103.

r257654:
cxgbe(4): Exclude MPS_RPLC_MAP_CTL (0x11114) from the register dump. Turns
out it's a write-only register with strange side effects on read.

r257772:
cxgbe(4): Tidy up the display for payload memory statistics (pm_stats).

r258441:
cxgbe(4): update the internal list of device features.

r258689:
Disable an assertion that relies on some code[1] that isn't in HEAD yet.

r258698:
cxgbetool: "modinfo" command to display SFP+ module information.

r258879:
cxgbe(4): T4_SET_SCHED_CLASS and T4_SET_SCHED_QUEUE ioctls to program
scheduling classes in the chip and to bind tx queue(s) to a scheduling
class respectively. These can be used for various kinds of tx traffic
throttling (to force selected tx queues to drain at a fixed Kbps rate,
or a % of the port's total bandwidth, or at a fixed pps rate, etc.).

r259048:
Two new cxgbetool subcommands to set up scheduler classes and to bind
them to NIC queues.

r259103:
cxgbe(4): save a copy of the RSS map for each port for the driver's use.

256819 21-Oct-2013 np

MFC r256694, r256713, r256714.

r256694:
iw_cxgbe: iWARP driver for Chelsio T4/T5 chips. This is a straight port
of the iw_cxgb4 found in OFED distributions.

r256713:
iw_cxgbe should have a dependency on t4nex.

r256714:
Fix typo in previous commit.

Approved by: re (hrs)

256794 20-Oct-2013 np

MFC r256477:

cxgbe(4): Store the log2 of the # of doorbells per BAR2 page for both
ingress and egress queues, and for both T4 and T5. These values are
used by the T4/T5 iWARP driver.

Approved by: re (glebius)

256791 20-Oct-2013 np

MFC r256459.

cxgbe(4): Update T4 and T5 firmwares to 1.9.12.0

Approved by: re (glebius)

256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


256218 09-Oct-2013 glebius

Some high-performance NICs count statistics in hardware, and some ifnets
do so via counter(9). Provide a flag that skips the cache-line-thrashing
'+=' operation in ether_input().

Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Reviewed by: melifaro, adrian
Approved by: re (marius)


256131 07-Oct-2013 dim

Fix kernel build on amd64 after r256118, since the machine/md_var.h
header is not implicitly included there. So include it explicitly.

Approved by: re (delphij)
Pointy hat to: dim
MFC after: 3 days
X-MFC-With: r256118


256118 07-Oct-2013 dim

Remove redundant declaration of cpu_clflush_line_size in
sys/dev/cxgbe/t4_sge.c, to silence a gcc warning.

Approved by: re (gjb)
MFC after: 3 days


255411 09-Sep-2013 np

Rework the tx credit mechanism between the cxgbe/tom driver
and the card. This helps smooth out some burstiness in the
exchange.

Approved by: re (glebius)


255410 09-Sep-2013 np

Fix a miscalculation that caused cxgbe/tom to auto-increment
a TOE socket's tx buffer size too aggressively.

Approved by: re (delphij)


255198 03-Sep-2013 np

For TOE connections, the window scale factor in CPL_PASS_ACCEPT_REQ is
set to 15 to indicate that the peer did not send a window scale option
with its SYN. Do not send a window scale option in the SYN|ACK reply
in that case.


255052 30-Aug-2013 np

Fix the sysctl that displays whether buffer packing is enabled
or not.


255050 30-Aug-2013 np

Implement support for rx buffer packing. Enable it by default for T5
cards.

This is a T4 and T5 chip feature which lets the chip deliver multiple
Ethernet frames in a single buffer. This is more efficient within the
chip, in the driver, and reduces wastage of space in rx buffers.

- Always allocate rx buffers from the jumbop zone, no matter what the
MTU is. Do not use the normal cluster refcounting mechanism.
- Reserve space for an mbuf and a refcount in the cluster itself and let
the chip DMA multiple frames in the rest.
- Use the embedded mbuf for the first frame and allocate mbufs on the
fly for any additional frames delivered in the cluster. Each of these
mbufs has a reference on the underlying cluster.


255015 29-Aug-2013 np

Merge r254386 from user/np/cxl_tuning. Add an INET|INET6 check missing
in said revision.

r254386:
Flush inactive LRO entries periodically.


255011 28-Aug-2013 np

Whitespace nit.


255006 28-Aug-2013 np

Change t4_list_lock and t4_uld_list_lock from mutexes to sx'es.

- tom_uninit had to be reworked not to hold the adapter lock (a mutex)
around t4_deactivate_uld, which acquires the uld_list_lock.
- the ifc_match for the interface cloner that creates the tracer ifnet
had to be reworked as the kernel calls ifc_match with the global
if_cloners_mtx held.


255005 28-Aug-2013 np

Add hooks in base cxgbe(4) for the iWARP upper-layer driver. Update a
couple of assertions in the TOE driver as well.


254933 26-Aug-2013 np

Use correct mailbox and PCIe PF number when querying RDMA parameters.


254727 23-Aug-2013 np

There is no need to hold the freelist lock around alloc/free of
software descriptors. This also silences WITNESS warnings when
the software descriptors are allocated with M_WAITOK.

MFC after: 1 week


254577 20-Aug-2013 np

Display P/N information in the description.

Submitted by: gnn
MFC after: 3 days


253890 02-Aug-2013 np

Display temperature sensor data. Shows -1 if the sensor is not
available on the card.

# sysctl dev.t4nex.0.temperature
# sysctl dev.t5nex.0.temperature


253889 02-Aug-2013 np

Fix previous commit (r253873). "cong" has one bit per channel but the
congestion channel map has 1 nibble per channel. So bits wxyz need to
be blown up into 000w000x000y000z.
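
The bit-to-nibble expansion can be written as a short loop. A sketch, with a
hypothetical function name and a default of 4 channels assumed:

```python
def expand_cong_bits(cong, nchan=4):
    """Spread one bit per channel into one nibble per channel.

    Bits wxyz become 000w000x000y000z.
    """
    cmap = 0
    for i in range(nchan):
        if cong & (1 << i):
            cmap |= 1 << (4 * i)
    return cmap


if __name__ == "__main__":
    print(hex(expand_cong_bits(0b1011)))  # 0x1011
```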


253873 01-Aug-2013 np

Set up congestion manager context properly for T5 based cards.

MFC after: 3 days (will check with re@)


253829 31-Jul-2013 np

Display SGE tunables in the sysctl tree.

dev.t5nex.0.fl_pktshift: payload DMA offset in rx buffer (bytes)
dev.t5nex.0.fl_pad: payload pad boundary (bytes)
dev.t5nex.0.spg_len: status page size (bytes)
dev.t5nex.0.cong_drop: congestion drop setting

Discussed with: scottl


253701 27-Jul-2013 np

Display a string instead of a numeric code in the linkdnrc sysctl.

Submitted by: gnn@


253699 27-Jul-2013 np

Expand the list of devices claimed by cxgbe(4).


253691 26-Jul-2013 np

Add support for packet-sniffing tracers to cxgbe(4). This works with
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.

Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.

There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.

/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire

If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.

/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy


253688 26-Jul-2013 np

Reserve room for ioctls that aren't in this copy of the driver yet.


253407 17-Jul-2013 np

Specify a timeout for the PL block.

MFC after: 3 days


253217 11-Jul-2013 np

Attach to the 4x10G T540-CR card.


252747 05-Jul-2013 np

- Show the reason why link is down if this information is available.
- Display the temperature and PHY firmware version of the BT PHY.

MFC after: 1 day


252728 04-Jul-2013 np

- Make note of interface MTU change if the rx queues exist, and not just
when the interface is up.
- Add a tunable to control the TOE's rx coalesce feature (enabled by
default as it always has been). Consider the interface MTU or the
coalesce size when deciding which cluster zone to use to fill the
offload rx queue's free list. The tunable is:
dev.{t4nex,t5nex}.<N>.toe.rx_coalesce

MFC after: 1 day


252724 04-Jul-2013 np

On-the-fly changes to the interrupt coalescing timer should apply to the
TOE rx queues too.

MFC after: 1 day


252716 04-Jul-2013 np

Pay attention to TCP_NODELAY when it's set/unset after the connection
is established.

MFC after: 1 day


252715 04-Jul-2013 np

Ring the egress queue's doorbell as soon as there are 8 or more
descriptors ready to be processed.
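
A toy model of this threshold, for illustration. The class is hypothetical
and ignores everything about a real egress queue except the count of
descriptors written since the last doorbell.

```python
DOORBELL_THRESHOLD = 8  # descriptors, per the commit message


class EgressQueue:
    def __init__(self):
        self.pending = 0      # descriptors written but not yet announced
        self.doorbells = []   # descriptor counts announced per doorbell

    def ring_doorbell(self):
        if self.pending:
            self.doorbells.append(self.pending)
            self.pending = 0

    def write_descriptors(self, ndesc):
        self.pending += ndesc
        if self.pending >= DOORBELL_THRESHOLD:
            # Do not wait for the end of the burst once 8 or more are ready.
            self.ring_doorbell()
```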

MFC after: 1 day


252711 04-Jul-2013 np

The T5 allows the driver to specify the ISS. Do so; use the ISS picked
by the kernel.

MFC after: 1 day


252705 04-Jul-2013 np

- Read all TP parameters in one place.
- Read the filter mode, calculate various shifts, and use them
properly during active open (in select_ntuple).

MFC after: 1 day


252661 04-Jul-2013 np

- Include the T5 firmware with the driver.
- Update the T4 firmware to the latest.
- Minor reorganization and updates to the version macros, etc.

Obtained from: Chelsio
MFC after: 1 day


252469 01-Jul-2013 np

Add a sysctl to get the number of filters available.

sysctl dev.t4nex.<N>.nfilters
sysctl dev.t5nex.<N>.nfilters

MFC after: 3 days


252312 27-Jun-2013 np

Update T5 register ranges. This is so that regdump skips over registers
with read side-effects.

MFC after: 3 days


251638 11-Jun-2013 np

cxgbe/tom: Allow caller to select the queue (control or data) used to
send the CPL_SET_TCB_FIELD request in t4_set_tcb_field().

MFC after: 1 week


251518 08-Jun-2013 np

cxgbe/tom: Fix bad signed/unsigned mixup in the stid allocator. This
fixes a panic when allocating a mixture of IPv6 and IPv4 stids.

MFC after: 1 week


251434 05-Jun-2013 np

cxgbe(4): Never install a firmware if hw.cxgbe.fw_install is 0.

MFC after: 1 week


251358 04-Jun-2013 np

cxgbe(4): Provide accurate hit count for filters on T5 cards. The
location within the TCB and the size have both changed.

MFC after: 1 week


251213 01-Jun-2013 np

cxgbe(4): Some more debug sysctls. These work on both T4 and T5 based
cards.

dev.t5nex.0.misc.cim_ma_la: CIM MA logic analyzer
dev.t5nex.0.misc.cim_pif_la: CIM PIF logic analyzer
dev.t5nex.0.misc.mps_tcam: MPS TCAM entries
dev.t5nex.0.misc.tp_la: TP logic analyzer
dev.t5nex.0.misc.ulprx_la: ULPRX logic analyzer

Obtained from: Chelsio
MFC after: 1 week


250697 16-May-2013 kib

Add dependencies on the firmware, which allows the loading of the cxgb
and cxgbe modules.

Reviewed and approved by: np
MFC after: 1 week


250614 13-May-2013 np

Deal correctly with 40G ports that don't have any transceiver plugged
in. Do not claim that they have unknown transceivers.

MFC after: 3 days


250221 03-May-2013 np

cxgbe: Switch to a better way to install firmware.

MFC after: 1 week


250218 03-May-2013 np

cxgbe/tom: Do not use M_PROTO1 to mark rx zero-copy mbufs as special.
All the M_PROTOn flags are clobbered when an mbuf is appended to the
socket buffer.

MFC after: 1 week


250117 30-Apr-2013 np

Fix DDP breakage introduced in r248925. Bitwise OR has higher
precedence than ternary conditional.
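
In C, `a | b ? c : d` groups as `(a | b) ? c : d`. The sketch below models
both groupings in Python to show how the results diverge. The operand values
are made up, and this is not the expression from the driver.

```python
def c_as_parsed(a, b, c, d):
    # How C groups `a | b ? c : d`: the OR becomes part of the condition.
    return c if (a | b) else d


def c_as_intended(a, b, c, d):
    # What `a | (b ? c : d)` computes: the OR applies to the chosen value.
    return a | (c if b else d)


if __name__ == "__main__":
    a, b, c, d = 0x10, 0, 0x1, 0x2
    print(hex(c_as_parsed(a, b, c, d)))    # 0x1
    print(hex(c_as_intended(a, b, c, d)))  # 0x12
```

Explicit parentheses around the conditional avoid the problem.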

MFC after: 1 week


250093 30-Apr-2013 np

Attach to the T580 (2 x 40G) card.

MFC after: 1 week.


250092 30-Apr-2013 np

- Provide accurate ifmedia information so that 40G ports/transceivers are
displayed properly in ifconfig, etc.

- Use the same number of tx and rx queues for a 40G port as for a 10G port.

MFC after: 1 week


250090 30-Apr-2013 np

cxgbe(4): Some updates to shared code.

Obtained from: Chelsio
MFC after: 1 week


249629 18-Apr-2013 np

cxgbe(4): Refuse to install T5 firmwares on a T4 card (and vice versa).

MFC after: 1 week


249627 18-Apr-2013 np

cxgbe/tom: Update the CLIP table on the chip when there are changes
to the list of IPv6 addresses on the system. The table is used for
TOE+IPv6 only.


249393 11-Apr-2013 np

Add pciids of the T5 based cards. The ones that I haven't tested with
cxgbe(4) are disabled for now. This will change.

MFC after: 2 weeks


249392 11-Apr-2013 np

Cosmetic change (s/wrwc/wcwr/;s/WRWC/WCWR/).

MFC after: 3 days.


249391 11-Apr-2013 np

Auto-reduce the holdoff timers that are greater than the maximum value
allowed by the hardware.
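
The reduction amounts to clamping each configured value to the hardware
limit. A sketch, assuming the timers are plain microsecond values; the
function name and the limit used in the example are hypothetical.

```python
def clamp_holdoff_timers(timers_us, hw_max_us):
    """Reduce any holdoff timer above the hardware maximum to that maximum."""
    return [min(t, hw_max_us) for t in timers_us]


if __name__ == "__main__":
    # 200 exceeds the assumed limit of 100 and is reduced to it.
    print(clamp_holdoff_timers([1, 5, 10, 50, 100, 200], 100))
```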

MFC after: 3 days


249385 11-Apr-2013 np

cxgbe/tom: Slight simplification of code that calculates options2.

MFC after: 3 days


249383 11-Apr-2013 np

Get rid of a couple of stray \n's.

MFC after: 3 days.


249382 11-Apr-2013 np

There is no need for elaborate queries and error checking when trying to
set FW4MSG_ENCAP.

MFC after: 3 days


249376 11-Apr-2013 np

- Explain clearly why a different firmware is being installed (if/when
it is being installed). Improve other error messages while here.

- Select special FPGA specific configuration profile when appropriate.

MFC after: 3 days


249370 11-Apr-2013 np

cxgbe(4): Ensure that the MOD_LOAD handler runs before either t4nex or
t5nex attach to their devices.

MFC after: 3 days


248925 30-Mar-2013 np

cxgbe(4): Add support for Chelsio's Terminator 5 (aka T5) ASIC. This
includes support for the NIC and TOE features of the 40G, 10G, and
1G/100M cards based on the T5.

The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4)
has been updated instead of writing a brand new driver. T5 cards will
show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.

Sponsored by: Chelsio


247355 26-Feb-2013 np

cxgbe(4): Report unusual out of band errors from the firmware.

Obtained from: Chelsio
MFC after: 5 days


247347 26-Feb-2013 np

cxgbe(4): Consider all the API versions of the interfaces exported by
the firmware (instead of just the main firmware version) when evaluating
firmware compatibility. Document the new "hw.cxgbe.fw_install" knob
being introduced here.

This should fix kern/173584 too. Setting hw.cxgbe.fw_install=2 will
mostly do what was requested in the PR but it's a bit more intelligent
in that it won't reinstall the same firmware repeatedly if the knob is
left set.

PR: kern/173584
MFC after: 5 days


247291 26-Feb-2013 np

cxgbe(4): Ask the card's firmware to pad up tiny CPLs by encapsulating
them in a firmware message if it is able to do so. This works out
better for one of the FIFOs in the chip.

MFC after: 5 days


247289 26-Feb-2013 np

cxgbe(4): Update firmware to 1.8.4.0.

MFC after: 5 days


247122 21-Feb-2013 np

cxgbe(4): Add sysctls to extract debug information from the chip:

dev.t4nex.X.misc.cim_la logic analyzer dump
dev.t4nex.X.misc.cim_qcfg queue configuration
dev.t4nex.X.misc.cim_ibq_xxx inbound queues
dev.t4nex.X.misc.cim_obq_xxx outbound queues

Obtained from: Chelsio
MFC after: 1 week


247062 20-Feb-2013 np

cxgbe(4): Assume that CSUM_TSO in the transmit path implies CSUM_IP and
CSUM_TCP too. They are all set explicitly by the kernel usually.

While here, fix an unrelated bug where hardware L4 checksum calculation
was accidentally disabled for some IPv6 packets.

Reported by: alfred@
MFC after: 3 days


246575 09-Feb-2013 np

Do not hold locks around hardware context reads.

MFC after: 3 days


246385 06-Feb-2013 np

Busy-wait when cold.

Reported by: gnn, jhb
MFC after: 3 days


246093 29-Jan-2013 np

Provide a statistic to track the number of drops in each of the port's
txq's buf_ring. The aggregate for all the queues of a port is already
provided in ifnet->if_snd.ifq_drops.

MFC after: 3 days.


245937 26-Jan-2013 np

Install an extra hold on the newly allocated synq entry so that it
cannot be freed while do_pass_accept_req is running. This closes a race
where do_pass_establish on another CPU (the driver chose a different
queue for the new tid) expands the synq entry into a full PCB and then
releases the only hold on it, all while do_pass_accept_req is still
running.

MFC after: 3 days


245936 26-Jan-2013 np

Force the 404-BT card (4 x 1G) to use the "uwire" configuration file.

MFC after: 3 days


245935 26-Jan-2013 np

Add a couple of missing error codes. Treat CPL_ERR_KEEPALV_NEG_ADVICE as
negative advice and not a fatal error.

MFC after: 3 days


245933 26-Jan-2013 np

cxgbe/tom: List IFCAP_TOE6 as supported now that all the required pieces
are in place. You still have to enable it explicitly, after loading the
t4_tom KLD.


245567 17-Jan-2013 np

cxgbe: Make the for_each macros safer to use by turning them
into a single statement each.

Submitted by: Christoph Mallon <christoph dot mallon at gmx dot de>
MFC after: 1 week
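
As an illustration of why a single-statement iteration macro is safer, here
is a minimal sketch. The names (struct port, struct queue, FOR_EACH_QUEUE,
sum_queue_ids) are hypothetical stand-ins and not the driver's actual
definitions.

```c
/* Hypothetical stand-ins for the driver's port and queue structures. */
struct queue { int id; };
struct port { struct queue *queues; int nqueues; };

/*
 * Fragile form (described only, not defined here): a macro that expands to
 * an assignment statement followed by a separate for statement. Under an
 * unbraced `if`, only the assignment would be conditional and the loop
 * would always run.
 */

/* Single-statement form: the initialization lives inside the for header. */
#define FOR_EACH_QUEUE(p, i, q) \
	for ((q) = &(p)->queues[0], (i) = 0; (i) < (p)->nqueues; ++(i), ++(q))

/* Sums the queue ids only when `enabled` is nonzero; note the unbraced if. */
static int
sum_queue_ids(struct port *p, int enabled)
{
	struct queue *q;
	int i, sum = 0;

	if (enabled)
		FOR_EACH_QUEUE(p, i, q)
			sum += q->id;
	return (sum);
}
```

Because the macro is one statement, the unbraced `if` above guards the whole
loop, which is what a caller would expect.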


245518 17-Jan-2013 np

cxgbe: Do a more thorough job in the CLEAR_STATS ioctl.

MFC after: 3 days


245517 16-Jan-2013 np

cxgbe: Fix the for_each_foo macros -- the last argument should not share
its name with any member of struct sge.

MFC after: 3 days


245468 15-Jan-2013 np

cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (passive open).

MFC after: 1 week


245467 15-Jan-2013 np

cxgbe/tom: Add support for fully offloaded TCP/IPv6 connections (active open).

MFC after: 1 week


245448 15-Jan-2013 np

cxgbe/tom: Basic CLIP table management.

This is the Compressed Local IPv6 table on the chip. To save space, the
chip uses an index into this table instead of a full IPv6 address in
some of its hardware data structures.

For now the driver fills this table with all the local IPv6 addresses
that it sees at the time the table is initialized. I'll improve this
later so that the table is updated whenever new IPv6 addresses are
configured or existing ones deleted.

MFC after: 1 week
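
The idea of replacing a 16-byte IPv6 address with a small table index can be
sketched as follows. This is an illustrative software model only: the table
size, structure layout, and the function name clip_get_index are assumptions,
not the driver's or the chip's actual interface.

```c
#include <stdint.h>
#include <string.h>

#define CLIP_ENTRIES 8	/* hypothetical table size */

struct clip_table {
	uint8_t addr[CLIP_ENTRIES][16];	/* full IPv6 addresses */
	uint8_t in_use[CLIP_ENTRIES];
};

/*
 * Returns the index that holds `addr`, adding the address to the first free
 * slot if it is not present yet. Returns -1 when the table is full.
 */
static int
clip_get_index(struct clip_table *ct, const uint8_t addr[16])
{
	int i, free_slot = -1;

	for (i = 0; i < CLIP_ENTRIES; i++) {
		if (ct->in_use[i]) {
			if (memcmp(ct->addr[i], addr, 16) == 0)
				return (i);
		} else if (free_slot < 0)
			free_slot = i;
	}
	if (free_slot >= 0) {
		memcpy(ct->addr[free_slot], addr, 16);
		ct->in_use[free_slot] = 1;
	}
	return (free_slot);
}
```

Other data structures can then store the small index instead of the full
address, which is the space saving the commit message describes.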


245441 15-Jan-2013 np

cxgbe/tom: Miscellaneous updates for TOE+IPv6 support (more to follow).

- Teach find_best_mtu_idx() to deal with IPv6 endpoints.

- Install correct protosw in offloaded TCP/IPv6 sockets when DDP is
enabled.

- Move set_tcp_ddp_ulp_mode to t4_tom.c so that t4_tom.h can be included
without having to drag in t4_msg.h too. This was bothering the iWARP
driver for some reason.

MFC after: 1 week


245434 14-Jan-2013 np

cxgbe(4): Updates to the hardware L2 table management code.

- Add full support for IPv6 addresses.

- Read the size of the L2 table during attach. Do not assume that PCIe
physical function 4 of the card has all of the table to itself.

- Use FNV instead of Jenkins to hash L3 addresses and drop the private
copy of jhash.h from the driver.

MFC after: 1 week
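
For readers unfamiliar with FNV, here is a minimal sketch of hashing an L3
address into a bucket. It uses the widely published 32-bit FNV-1a variant
with the standard offset basis and prime; the variant and seed used by the
kernel's own FNV helper may differ, and l3_bucket is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV32_OFFSET_BASIS 2166136261u	/* standard 32-bit FNV offset basis */
#define FNV32_PRIME        16777619u	/* standard 32-bit FNV prime */

/* 32-bit FNV-1a: xor in each byte, then multiply by the prime. */
static uint32_t
fnv1a_32(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t h = FNV32_OFFSET_BASIS;

	while (len-- > 0) {
		h ^= *p++;
		h *= FNV32_PRIME;
	}
	return (h);
}

/*
 * Maps a 4-byte IPv4 or 16-byte IPv6 address to a bucket.
 * Assumes nbuckets is a power of 2 so that masking is valid.
 */
static uint32_t
l3_bucket(const void *addr, size_t addrlen, uint32_t nbuckets)
{
	return (fnv1a_32(addr, addrlen) & (nbuckets - 1));
}
```

The same function covers both address families because it hashes a byte
buffer of any length.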


245276 11-Jan-2013 np

Overhaul the stid allocator so that it can be used for IPv6 servers
too. The entry for an IPv6 server in the TCAM takes up the equivalent
of two ordinary stids and must be properly aligned too.

MFC after: 1 week
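
A rough sketch of an allocator in which an IPv6 server consumes two adjacent,
aligned slots is shown below. The byte-per-slot bookkeeping, the function
name alloc_stid, and the choice of even-index alignment are assumptions made
for illustration; the driver's real allocator may work differently.

```c
/*
 * Allocates one slot for IPv4 or two adjacent slots for IPv6 from `used`
 * (one byte per stid, nonzero means taken). IPv6 allocations start on an
 * even index. Returns the first index allocated, or -1 if nothing fits.
 */
static int
alloc_stid(unsigned char *used, int nstids, int is_ipv6)
{
	int n = is_ipv6 ? 2 : 1;	/* slots needed, also the alignment */
	int i, j;

	for (i = 0; i + n <= nstids; i += n) {
		/* Count how many of the n slots starting at i are free. */
		for (j = 0; j < n && !used[i + j]; j++)
			continue;
		if (j == n) {
			for (j = 0; j < n; j++)
				used[i + j] = 1;
			return (i);
		}
	}
	return (-1);
}
```

With slot 0 taken by an IPv4 server, an IPv6 request skips the misaligned
free slot 1 and lands on slots 2 and 3.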


245274 11-Jan-2013 np

cxgbe(4): Add functions to help synchronize "slow" operations (those not
on the fast data path) and use them instead of frobbing the adapter lock
and busy flag directly.

Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete). This guarantees
that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a
power of 2.

MFC after: 1 week
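
The power-of-2 restriction on the tx queue size can be checked with a single
bit trick. The sketch below is illustrative; validate_txq_size is a
hypothetical name and returning EINVAL is an assumption about how such a
check would report failure.

```c
#include <errno.h>

/* A power of 2 has exactly one bit set, so n & (n - 1) clears it to 0. */
static int
is_power_of_2(unsigned int n)
{
	return (n != 0 && (n & (n - 1)) == 0);
}

/* Returns 0 if the requested size is acceptable, EINVAL otherwise. */
static int
validate_txq_size(unsigned int qsize)
{
	return (is_power_of_2(qsize) ? 0 : EINVAL);
}
```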


245243 09-Jan-2013 np

cxgbe(4): updates to the configuration file that controls how hardware
resources are partitioned.

- Reduce the number of virtual interfaces reserved for PF4. This leaves
spare room in the source MAC table and allows the driver to set up
filters that rewrite the source MAC address.

- Reduce the number of filters and use the freed up space for the CLIP
(Compressed Local IPv6 addresses) table. This is a prerequisite for
IPv6 TOE support which will follow separately in a series of commits.

MFC after: 1 week


244580 22-Dec-2012 np

cxgbe(4): Add support for the T440-LP-CR card. This is the 4x10G low
profile card with a QSFP+ transceiver.

MFC after: 3 days


244551 21-Dec-2012 np

cxgbe(4): must hold a write-lock on the table while allocating an L2
entry for switching.

MFC after: 3 days


243857 04-Dec-2012 glebius

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.


243681 29-Nov-2012 np

cxgbe/tom: Handle the case where the chip falls out of DDP mode by
itself. The hole in the receive sequence space corresponds to the
number of bytes placed directly up to that point.

MFC after: 1 week


243680 29-Nov-2012 np

cxgbe/tom: Add a flag to indicate that the L2 table entry for an
embryonic connection has been set up, and never attempt to abort a tid
before this is done. This fixes a bad race where a listening socket is
closed while the driver is in the middle of step (b) below. The symptoms
were "ARP miss" errors from the driver followed by tid leaks.

A hardware-offloaded passive open works this way:

a) A SYN "hits" the TCAM entry for a server tid and the chip delivers it
to the queue associated with the server tid (say, queue A). It waits
for a response from the driver telling it what to do.

b) The driver decides it is ok to proceed. It adds the new tid to the
list of embryonic connections associated with the server tid and then
hands off the SYN to the kernel's syncache to make sure that the kernel
okays it too. If it does then the driver provides an L2 table entry,
queue id (say, queue B), etc. and instructs the chip to send the SYN/ACK
response.

c) The chip delivers a status to queue B depending on how the third step
of the 3-way handshake goes. The driver removes the tid from its list
of embryonic connections and either expands the syncache entry or
destroys the tid. In any case all subsequent messages for the new tid
will be delivered to queue B, not queue A. Anything running in queue B
knows that the L2 entry has long been set up and the new flag is of no
interest from here on. If the listener is closed it will deal with
so_comp as normal.

MFC after: 1 week


243110 16-Nov-2012 np

cxgbe/tom: Plug mbuf leak.

MFC after: 3 days


242671 06-Nov-2012 np

Make sure the inp hasn't been dropped before trying to access its socket
and tcpcb.

MFC after: 3 days


242666 06-Nov-2012 np

Remove the tid from the software table (and bump down the in-use
counter) when the syncache doesn't want the driver to reply to an
incoming SYN. This fixes a harmless bug where tids_in_use would
go out of sync with the hardware counter.

MFC after: 3 days


241733 19-Oct-2012 ed

Prefer __containerof() over __member2struct().

The former works better with qualifiers, but also properly type checks
the input pointer.
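
To show what a type-checked "member to enclosing struct" conversion looks
like, here is a portable approximation. CONTAINER_OF, struct item, and
item_from_link are hypothetical names; the kernel's __containerof() is
defined differently, and this sketch only mimics its type check with a
conditional expression.

```c
#include <stddef.h>

/*
 * The `1 ? (ptr) : &((type *)0)->member` expression always yields ptr, but
 * the compiler must diagnose it if ptr is not a pointer to the member's
 * type. The second operand is never evaluated.
 */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(1 ? (ptr) : &((type *)0)->member) - \
	    offsetof(type, member)))

struct list_link { struct list_link *next; };
struct item { int value; struct list_link link; };

/* Recovers the enclosing item from a pointer to its embedded link. */
static struct item *
item_from_link(struct list_link *l)
{
	return (CONTAINER_OF(l, struct item, link));
}
```

Passing, say, an `int *` as the first argument would draw a compiler
diagnostic, which is the kind of input checking the commit message refers to.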


241642 17-Oct-2012 np

Always provide sndbuf and MSS values in a flowc command, even when the
driver is going to abort the connection right after the flowc.

MFC after: 3 days


241626 17-Oct-2012 np

Whitespace cleanup.

MFC after: 3 days


241494 12-Oct-2012 np

Temporary fix for kern/172364.

PR: kern/172364
MFC after: 3 days


241493 12-Oct-2012 np

Use global knob in the TP_PARA_REG3 register to disable congestion
drops if the user has chosen this behaviour.

MFC after: 3 days


241409 10-Oct-2012 np

Add a driver ioctl to clear a port's MAC statistics.

Submitted by: gnn@
MFC after: 3 days


241399 10-Oct-2012 np

Add a driver ioctl to read a byte from any device on a port's i2c bus.
This lets userspace read arbitrary information from the SFP+ modules
etc. on this bus.

Reading multiple bytes in the same transaction isn't possible right now.
I'll update the driver once the chip's firmware supports this.

MFC after: 3 days


241398 10-Oct-2012 np

There is no need to report the same error twice.

MFC after: 3 days


241397 10-Oct-2012 np

Remove unused item. cxgbe's rx queue's lock was removed a long time ago.

MFC after: 3 days


241394 10-Oct-2012 kevlo

Revert previous commit...

Pointyhat to: kevlo (myself)


241370 09-Oct-2012 kevlo

Prefer NULL over 0 for pointers


240693 19-Sep-2012 gavin

Switch some PCI register reads from using magic numbers to using the names
defined in pcireg.h

MFC after: 1 week


240680 18-Sep-2012 gavin

Align the PCI Express #defines with the style used for the PCI-X
#defines. This also has the advantage that it makes the names more
compact, and also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with: jhb
MFC after: 1 week


240453 13-Sep-2012 np

Install interrupt handlers early, during attach, for the reason
explained in r239913 by jhb.

MFC after: 1 week


240452 13-Sep-2012 np

Use native FreeBSD facilities everywhere except the shared code in common/

MFC after: 1 week


240443 13-Sep-2012 np

Update interface to firmware 1.6.2 and include the firmware in the driver.

Obtained from: Chelsio
MFC after: 1 week


239544 21-Aug-2012 np

Deal with the case where a syncache entry added by the TOE driver is
evicted from the syncache but a later syncache_expand succeeds because
of syncookies. The TOE driver has to resort to more direct means to
install its hooks in the socket in this case.


239528 21-Aug-2012 np

Avoid a NULL pointer dereference.


239527 21-Aug-2012 np

Cannot hold a mutex around vm_fault_quick_hold_pages, so don't. Tweak
some comments while here.


239514 21-Aug-2012 np

Minor cleanup: use bitwise ops instead of pointless wrappers around
setbit/clrbit.


239511 21-Aug-2012 np

Correctly handle the case where an inp has already been dropped by the time
the TOE driver reports that an active open failed. toe_connect_failed is
supposed to handle this but it should be provided the inpcb instead of the
tcpcb which may no longer be around.


239344 17-Aug-2012 np

Support for TCP DDP (Direct Data Placement) in the T4 TOE module.

Basically, this is automatic rx zero copy when feasible. TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.

- Works with sockets that are being handled by the TCP offload engine
of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
"ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.


239341 16-Aug-2012 np

Initialize various DDP parameters in the main cxgbe(4) driver:

- Set up multiple DDP page sizes. When the driver attempts DDP it will
try to combine physically contiguous pages into regions of these sizes.

- Set the indicate size such that the payload carried in the indicate can
be copied in the header mbuf (and the 16K rx buffer can be recycled).

- Set DDP threshold to the max payload that the chip will coalesce and
deliver to the driver (this is ~16K by default, which is also why the
offload rx queue is backed by 16K buffers). If the chip is able to
coalesce up to the max it's allowed to, it's a good sign that the peer
is transmitting in bulk without any TCP PSH.

MFC after: 2 weeks


239339 16-Aug-2012 np

Make room for DDP page pods in the default configuration profile. While
here, bump up the L2 table's size to 4K entries.

MFC after: 2 weeks


239338 16-Aug-2012 np

Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB. Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or
something else and route it appropriately.

MFC after: 2 weeks


239336 16-Aug-2012 np

Allow for a different handler for each type of firmware message.

MFC after: 2 weeks


239266 15-Aug-2012 np

The size of the buffers in an Ethernet freelist has to be higher than the
interface's MTU. Initialize such freelists with correct values.

This wasn't a problem for common MTUs (1500 and 9000) as the buffers (2048
and 9216 in size) happened to have enough spare room. I ran into it when
playing around with unusual MTUs.

MFC after: 2 weeks
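
The sizing rule can be sketched as "pick the smallest buffer that holds the
MTU plus L2 overhead". The list of candidate sizes and the 18-byte overhead
(14-byte Ethernet header plus a 4-byte VLAN tag) are assumptions for
illustration, as is the name fl_buf_size_for_mtu; only the 2048 and 9216
sizes come from the commit message.

```c
#include <stddef.h>

/* Candidate buffer sizes in ascending order (assumed list). */
static const int fl_buf_sizes[] = { 2048, 4096, 9216, 16384 };

#define L2_OVERHEAD 18	/* assumed: Ethernet header (14) + VLAN tag (4) */

/* Returns the smallest candidate that fits a full frame, or -1 if none. */
static int
fl_buf_size_for_mtu(int mtu)
{
	int need = mtu + L2_OVERHEAD;
	size_t i;

	for (i = 0; i < sizeof(fl_buf_sizes) / sizeof(fl_buf_sizes[0]); i++) {
		if (fl_buf_sizes[i] >= need)
			return (fl_buf_sizes[i]);
	}
	return (-1);
}
```

Under these assumptions MTUs of 1500 and 9000 still map to 2048 and 9216,
while an unusual MTU such as 2040 no longer fits in 2048 and moves up to the
next size.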


239259 14-Aug-2012 np

if_iqdrops should include frames truncated within the chip.

MFC after: 2 weeks


239258 14-Aug-2012 np

Convert some fixed parameters to tunables (with reasonable default
values).

- cong_drop specifies what to do on congestion: nothing, backpressure,
or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary up to which to pad Ethernet payload.
- spg_len controls the length of the status page.

MFC after: 2 weeks


239102 06-Aug-2012 dim

In sys/dev/cxgbe/firmware/t4fw_interface.h, change the enum
'fw_hdr_intfver' into an anonymous enum, which avoids a clang 3.2
warning about all the enum values being the same value.

Reviewed by: np
MFC after: 1 week


238313 09-Jul-2012 np

Fix a bug in code that calculates the number of the first interrupt
vector for a port. This affected the gigabit ports of T422 cards (the
ones with 2x10G ports and 2x1G ports).

MFC after: will check with re@


238054 03-Jul-2012 np

Fix inverted test that resulted in incorrect multicast hw programming.


238028 02-Jul-2012 np

Instruct the firmware not to provision resources for TCP offload if the
kernel is being built without TCP_OFFLOAD. But never override
toecaps_allowed if it has been set manually.


237831 30-Jun-2012 np

- Assign (don't OR) the CSUM_XXX bits to csum_flags in the rx checksum code.
- Fix TSO/TSO4 mixup.
- Add IFCAP_LINKSTATE to the available/enabled capabilities.


237819 29-Jun-2012 np

cxgbe(4): support for IPv6 TSO and LRO.

Submitted by: bz (this is a modified version of that patch)


237799 29-Jun-2012 np

cxgbe(4): support for IPv6 hardware checksumming (rx and tx).


237587 26-Jun-2012 np

Allow cxgbe(4) running within a VM to attach to its devices that have been
exported via PCI passthrough.

- Do not check for a specific physical function (PF) before claiming a device.
Different PFs have different device-ids so this check is redundant anyway.

- Obtain the PF# from the WHOAMI register instead of pci_get_function().

- Set up the memory windows using the real BAR0 address, not what the VM says it
is.

Obtained from: Chelsio Communications


237512 23-Jun-2012 np

Better way to determine the status page length and rx pad boundary.


237463 22-Jun-2012 np

Do not allocate extra vectors when adapter is not TOE
capable (or toecaps have been disallowed by the user).

+ one very minor unrelated cleanup in t4_sge.c


237439 22-Jun-2012 np

Do not read registers with read side effects while performing a register
dump for cxgbetool.


237436 22-Jun-2012 np

cxgbe(4): update to firmware interface 1.5.2.0; updates to shared code.


237263 19-Jun-2012 np

- Updated TOE support in the kernel.

- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP is in the
works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by: bz, gnn
Sponsored by: Chelsio Communications
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)


235944 24-May-2012 bz

MFp4 bz_ipv6_fast:

Significantly update tcp_lro for mostly two things:
1) introduce basic support for IPv6 without extension headers.
2) try hard to also get the incremental checksum updates right,
especially also in the IPv4 case for the IP and TCP header.

Move variables around for better locality, factor things out into
functions, allow checksum updates to be compiled out, ...

Leave a few comments on further things to look at in the future,
though that is not the full list.

Update drivers with appropriate #includes as needed for IPv6 data
type in LRO.

Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems

Reviewed by: gnn (as part of the whole)
MFC After: 3 days


234833 30-Apr-2012 np

Change the default to not use packet counters to generate rx interrupts.
Rely solely on the timer based mechanism.

Update man page to reflect this change.

MFC after: 1 week


234831 30-Apr-2012 np

Make sure that the firmware version is available in
dev.t4nex.X.firmware_version even if the driver fails to attach
properly. At least it'll be easy to tell what we're dealing with.

MFC after: 1 week


231592 13-Feb-2012 np

Use the non-sleeping variant of t4_wr_mbox in code that can be called
with locks held.

MFC after: 1 day


231172 08-Feb-2012 np

Program the MAC exact match table in batches of 7 addresses at
a time when possible. This is more efficient than one at a time.

Submitted by: gnn
MFC after: 3 days
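
Batching can be sketched as a loop that hands the programming routine up to
7 addresses per request. Everything here except the batch size of 7 is
hypothetical: struct mac_addr, the callback type, and program_mac_list are
illustrative names, and count_batch merely stands in for the real request.

```c
#include <stddef.h>
#include <stdint.h>

#define MAC_BATCH_MAX 7	/* batch size named in the commit message */

struct mac_addr { uint8_t octet[6]; };

/* Hypothetical callback that programs `count` addresses in one request. */
typedef int (*mac_batch_fn)(const struct mac_addr *addrs, size_t count,
    void *arg);

/* Example callback: only counts the addresses it is handed. */
static int
count_batch(const struct mac_addr *addrs, size_t count, void *arg)
{
	(void)addrs;
	*(size_t *)arg += count;
	return (0);
}

/* Returns the number of requests issued, or -1 if the callback failed. */
static int
program_mac_list(const struct mac_addr *addrs, size_t n, mac_batch_fn fn,
    void *arg)
{
	size_t done = 0;
	int requests = 0;

	while (done < n) {
		size_t count = n - done;

		if (count > MAC_BATCH_MAX)
			count = MAC_BATCH_MAX;
		if (fn(addrs + done, count, arg) != 0)
			return (-1);
		done += count;
		requests++;
	}
	return (requests);
}
```

For 16 addresses this issues three requests (7, 7, and 2 addresses) instead
of sixteen.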


231120 07-Feb-2012 np

Acquire the adapter lock before updating fields of the filter structure.

Submitted by: gnn (different version)
MFC after: 3 days


231116 07-Feb-2012 np

Remove if_start from cxgb and cxgbe.

Submitted by: jhb
MFC after: 3 days


231115 07-Feb-2012 np

cxgbe: reduce diffs with other branches.
Will help future MFCs from HEAD.

MFC after: 3 days


228561 16-Dec-2011 np

Many updates to cxgbe(4)

- Device configuration via plain text config file. Also able to operate
when not attached to the chip as the master driver.

- Generic "work request" queue that serves as the base for both ctrl and
ofld tx queues.

- Generic interrupt handler routine that can process any event on any
kind of ingress queue (via a dispatch table).

- A couple of new driver ioctls. cxgbetool can now install a firmware
to the card ("loadfw" command) and can read the card's memory
("memdump" and "tcb" commands).

- Lots of assorted information within dev.t4nex.X.misc.* This is
primarily for debugging and won't show up in sysctl -a.

- Code to manage the L2 tables on the chip.

- Updates to cxgbe(4) man page to go with the tunables that have changed.

- Updates to the shared code in common/

- Updates to the driver-firmware interface (now at fw 1.4.16.0)

MFC after: 1 month


228491 14-Dec-2011 np

Do not clobber the ingress queue's congestion setting.

MFC after: 1 month


228443 12-Dec-2011 mdf

Do not define bool/true/false if the symbols already exist.

MFC after: 2 weeks
Sponsored by: Isilon Systems, LLC


227843 22-Nov-2011 marius

- There's no need to overwrite the default device method with the default
one. Interestingly, these have actually been the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.


227309 07-Nov-2011 ed

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


222973 11-Jun-2011 np

- driver ioctl to get SGE context for any given queue.
- sysctls to display the context id, cidx, and pidx of all kinds of queues.

MFC after: 3 days


222703 04-Jun-2011 np

Cause backpressure (instead of dropping frames) on congestion.

MFC after: 3 days


222701 04-Jun-2011 np

Allow lazy fill up of freelists.

MFC after: 3 days


222552 01-Jun-2011 np

Provide hit-count with rest of the information about a filter.

MFC after: 1 week


222551 31-May-2011 np

Firmware device log.

# sysctl dev.t4nex.0.devlog

MFC after: mdf's sysctl+sbuf changes are MFC'd


222513 30-May-2011 np

Update to firmware interface 1.3.10

MFC after: 1 week


222510 30-May-2011 np

- Specialized ingress queues that take interrupts for other ingress
queues. Try to have a set of these per port when possible, fall back
to sharing a common pool between all ports otherwise.

- One control queue per port (used to be one per hardware channel).

- t4_eth_rx now handles Ethernet rx only.

- sysctls to display pidx/cidx for some queues.

MFC after: 1 week


222509 30-May-2011 np

L2 table code. This is enough to get the T4's switch + L2 rewrite
filters working. (All other filters - switch without L2 info rewrite,
steer, and drop - were already fully-functional).

Some contrived examples of "switch" filters with L2 rewriting:

# cxgbetool t4nex0 iport 0 dport 80 action switch vlan +9 eport 3
Intercept all packets received on physical port 0 with TCP port 80 as
destination, insert a vlan tag with VID 9, and send them out of port 3.

# cxgbetool t4nex0 sip 192.168.1.1/32 ivlan 5 action switch \
vlan =9 smac aa:bb:cc:dd:ee:ff eport 0
Intercept all packets (received on any port) with source IP address
192.168.1.1 and VLAN id 5, rewrite the VLAN id to 9, rewrite source mac
to aa:bb:cc:dd:ee:ff, and send it out of port 0.

MFC after: 1 week


222102 19-May-2011 np

Simplify t4_os_find_pci_capability.

MFC after: 3 days


222085 18-May-2011 np

- Enable per-channel congestion notification.
- Enable PCIe relaxed ordering for all egress queues and rx data buffers.

MFC after: 3 days


222003 17-May-2011 np

Add missing header. The test for VLAN_CAPABILITIES later in the file
doesn't make sense without it.

MFC after: 3 days


221911 14-May-2011 np

sysctl that displays the absolute queue id of an rxq.


221516 05-May-2011 np

Bump up the number of egress queues that the driver is allowed to use.

MFC after: 3 days


221477 05-May-2011 np

T4 packet timestamps.

Reference code that shows how to get a packet's timestamp out of
cxgbe(4). Disabled by default because we don't have a standard way
today to pass this information up the stack.

The timestamp is 60 bits wide and each increment represents 1 tick of
the T4's core clock. As an example, the timestamp granularity is ~4.4ns
for this card:

# sysctl dev.t4nex.0.core_clock
dev.t4nex.0.core_clock: 228125

MFC after: 1 week
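
The ~4.4ns figure follows from the core clock if the sysctl value is read as
kHz (an assumption inferred from the numbers above): 228125 kHz is
228.125 MHz, and 1 / 228.125 MHz is about 4.38ns. A small sketch of the
conversion, with hypothetical function names:

```c
#include <stdint.h>

#define TS_MASK ((UINT64_C(1) << 60) - 1)	/* timestamp is 60 bits wide */

/* Duration of one core clock tick in nanoseconds (clock given in kHz). */
static double
tick_ns(uint32_t core_clock_khz)
{
	return (1e6 / core_clock_khz);
}

/*
 * Converts a raw timestamp to whole nanoseconds. At N kHz there are N ticks
 * per millisecond, so the value is split into whole milliseconds plus a
 * remainder to keep the intermediate products within 64 bits.
 */
static uint64_t
ticks_to_ns(uint64_t ticks, uint32_t core_clock_khz)
{
	uint64_t ms, rem;

	ticks &= TS_MASK;
	ms = ticks / core_clock_khz;
	rem = ticks % core_clock_khz;
	return (ms * 1000000u + rem * 1000000u / core_clock_khz);
}
```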


221474 05-May-2011 np

T4 packet filtering/steering.

- Enable 5-tuple and every-packet lookup.

- Set up the default filter mode to allow filtering/steering based on IP
protocol, ingress port, inner VLAN ID, IP frag, FCoE, and MPS match
type; all combined together. You can also filter based on MAC index,
Ethernet type, IP TOS/IPv6 Traffic Class, and outer VLAN ID but you'll
have to modify the default filter mode and exclude some of the
match-fields in it.

IPv4 and IPv6 SIP/DIP/SPORT/DPORT are always available in all filter
rules.

- Add driver ioctls to get/set the global filter mode.

- Add driver ioctls to program and delete hardware filters. A couple of
the "switch" actions that rewrite Ethernet and VLAN information and
switch the packet out of another port may not work as the L2 code is not
yet in place. Everything else, including all "drop" and "pass" rules
with RSS or absolute qid, should work.

Obtained from: Chelsio Communications


221464 04-May-2011 np

Always re-arm an iq's interrupt before leaving the handler.

MFC after: 1 week


220905 20-Apr-2011 np

Ring the freelist doorbell from within refill_fl. While here, fix a bug
that could have allowed the hardware pidx to reach the cidx even though
the freelist isn't empty. (Haven't actually seen this but it was there
waiting to happen..)

MFC after: 1 week


220897 20-Apr-2011 np

Use the correct free routine when destroying a control queue.

X-MFC after: r220873


220874 19-Apr-2011 np

Use Toeplitz hash for RSS.

MFC after: 3 days


220873 19-Apr-2011 np

- Move all Ethernet specific items from sge_eq to sge_txq. sge_eq is
now a suitable base for all kinds of egress queues.

- Add control queues (sge_ctrlq) and allocate one of these per hardware
channel. They can be used to program filters and steer traffic (and
more).

MFC after: 1 week


220649 15-Apr-2011 np

Fix a couple of bad races that can occur when a cxgbe interface is taken
down. The ingress queue lock was unused and has been removed as part of
these changes.

- An in-flight egress update from the SGE must be handled before the
queue that requested it is destroyed. Wait for the update to arrive.

- Interrupt handlers must stop processing rx events for a queue before
the queue is destroyed. Events that have not yet been processed
should be ignored once the queue disappears.

MFC after: 1 week


220643 14-Apr-2011 np

There is no need to request a tx credit flush if such a request is already
pending.

MFC after: 3 days


220410 07-Apr-2011 np

Modify read/write ioctls to work with 64 bit registers too.

MFC after: 3 days


220232 01-Apr-2011 np

Update header and related code for firmware 1.3.8

MFC after: 3 days


219944 24-Mar-2011 np

Do not over-allocate MSI interrupts for the case where each ingress
queue has its own interrupt. If the exact number that we need is not a
power of 2 and we're using MSI, then switch to interrupt multiplexing.

While here, replace the magic numbers with something more readable.

MFC after: 3 days


219883 22-Mar-2011 np

Fix an error while constructing the table that maps context id -> egress
queue.

MFC after: 1 day


219436 09-Mar-2011 np

Display holdoff timers and packet counts as a list of numbers.

MFC after: 1 week


219392 08-Mar-2011 np

cxgbe shouldn't directly know of the UMA zones where network buffers
come from.

MFC after: 1 week


219299 05-Mar-2011 np

Be sure to stay within the bounds of the mod_str array when displaying
the transceiver type.


219293 05-Mar-2011 np

There is no need to hold an ingress queue's lock while processing its
descriptors.

MFC after: 1 week


219292 05-Mar-2011 np

Calculate how many descriptors can be reclaimed before calling
reclaim_tx_descs


219290 05-Mar-2011 np

Tweaks for rx:

- everything related to LRO should be in #ifdef INET blocks
- reorder sge_iq's fields so that the most frequently used are all together
- pull all rx code into t4_intr_data directly
- let go of the ingress queue lock when passing up data
- refill the freelist only if it is short of at least 32 buffers


219289 05-Mar-2011 np

Store the ifnet rather than the port_info in each txq and rxq struct.

MFC after: 1 week


219288 05-Mar-2011 np

A txpkts work request should have a valid FID.

MFC after: 1 week


219287 05-Mar-2011 np

Upgrade the firmware on the card automatically if a better version is
available. Downgrade only for a major version mismatch.

MFC after: 1 week
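
The policy can be sketched as a version comparison. This sketch reads
"better" as "newer", which is an assumption, and the structure and function
names (fw_version, should_install_bundled_fw) are hypothetical rather than
taken from the driver.

```c
#include <stdbool.h>

struct fw_version { unsigned int major, minor, micro, build; };

/* Returns <0, 0, or >0 as a is older than, equal to, or newer than b. */
static int
fw_version_cmp(const struct fw_version *a, const struct fw_version *b)
{
	if (a->major != b->major)
		return (a->major < b->major ? -1 : 1);
	if (a->minor != b->minor)
		return (a->minor < b->minor ? -1 : 1);
	if (a->micro != b->micro)
		return (a->micro < b->micro ? -1 : 1);
	if (a->build != b->build)
		return (a->build < b->build ? -1 : 1);
	return (0);
}

/* Decides whether the firmware bundled with the driver should be flashed. */
static bool
should_install_bundled_fw(const struct fw_version *card,
    const struct fw_version *bundled)
{
	int c = fw_version_cmp(bundled, card);

	if (c > 0)
		return (true);		/* upgrade to the newer firmware */
	if (c < 0)			/* downgrade only on major mismatch */
		return (bundled->major != card->major);
	return (false);			/* same version, nothing to do */
}
```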


219286 05-Mar-2011 np

Resume tx immediately in response to an SGE egress update from the hardware.

MFC after: 1 week


219285 05-Mar-2011 np

Fix incorrect assertion.

MFC after: 3 days


218792 18-Feb-2011 np

cxgbe(4) - NIC driver for Chelsio T4 (Terminator 4) based 10Gb/1Gb adapters.

MFC after: 3 weeks