History log of /freebsd-10.1-release/sys/dev/cxgbe/t4_main.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 271961 22-Sep-2014 np

MFC r271450:
cxgbe(4): knobs to enable/disable PAUSE frame based flow control.

Approved by: re (glebius)


# 270297 21-Aug-2014 np

MFC r266571, r266757, r268536, r269076, r269364, r269366, r269411,
r269413, r269428, r269440, r269537, r269644, r269731, and the cxgbe
portion of r270063.

r266571:
cxgbe(4): Remove stray if_up from the code that creates the tracing ifnet.

r266757:
cxgbe(4): netmap support for Terminator 5 (T5) based 10G/40G cards.
Netmap gets its own hardware-assisted virtual interface and won't take
over or disrupt the "normal" interface in any way. You can use both
simultaneously.

For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface
(note the 'n' prefix) in the hardware to accompany each cxl<N>
interface. These two ifnet's per port share the same wire but really
are separate interfaces in the hardware and software. Each gets its own
L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You
should run netmap on the 'n' interfaces only, that's what they are for.

With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port
of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now.
Single port receive is at 33Mpps but this is very much a work in
progress. I expect it to be closer to 40Mpps once done. In any case
the current effort can already saturate multiple 10G ports of a T5 card
at the smallest legal packet size. T4 gear is totally untested.

trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef
881.952141 main [1621] interface is ncxl0
881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0
881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0
881.962540 main [1804] mapped 334980KB at 0x801dff000
Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef)
881.962562 main [1882] Sending 512 packets every 0.000000000 s
881.962563 main [1884] Wait 2 secs for phy reset
884.088516 main [1886] Ready...
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1
884.088607 sender_body [996] start
884.093246 sender_body [1064] drop copy
885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec)
886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec)
887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec)
888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec)
889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec)
890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec)
891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec)
892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec)
893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec)
894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec)
895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec)
896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec)
...

r268536:
cxgbe(4): Add an iSCSI softc to the adapter structure.

r269076:
Some hooks in cxgbe(4) for the offloaded iSCSI driver.

r269364:
Improve compliance with style.Makefile(5).

r269366:
List one file per line in the Makefiles. This makes it easier to read
diffs when a file is added or removed.

r269411:
cxgbe(4): minor optimizations in ingress queue processing.

Reorganize struct sge_iq. Make the iq entry size a compile time
constant. While here, eliminate RX_FL_ESIZE and use EQ_ESIZE directly.

r269413:
cxgbe(4): Fix an off by one error when looking for the BAR2 doorbell
address of an egress queue.

r269428:
cxgbe(4): some optimizations in freelist handling.

r269440:
cxgbe(4): Remove an unused version of t4_enable_vi.

r269537:
cxgbe(4): Do not run any sleepable code in the SIOCSIFFLAGS handler when
IFF_PROMISC or IFF_ALLMULTI is being flipped. bpf(4) holds its global
mutex around ifpromisc in at least the bpf_dtor path.

r269644:
cxgbe(4): Let caller specify whether it's ok to sleep in
t4_sched_config and t4_sched_params.

r269731:
cxgbe(4): Do not poke T4-only registers on a T5 (and vice versa).

Relnotes: Yes (native netmap support for Chelsio T4/T5 cards)


# 269356 31-Jul-2014 np

MFC r268971 and r269032.

r268971:
Simplify r267600, there's no need to distinguish between allocated and
inlined mbufs.

r269032:
cxgbe(4): Keep track of the clusters that have to be freed by the
custom free routine (rxb_free) in the driver. Fail MOD_UNLOAD with
EBUSY if any such cluster has been handed up to the kernel but hasn't
been freed yet. This prevents a panic later when the cluster finally
needs to be freed but rxb_free is gone from the kernel.


# 269082 24-Jul-2014 np

MFC r268640 and r268989.

r268640:
Allow multi-byte reads in the private CHELSIO_T4_GET_I2C ioctl. The
firmware allows up to 48B to be read this way but the driver limits
itself to 8B at a time to remain compatible with old cxgbetool
binaries.

r268989:
Add missing newline to an error message.


# 268823 17-Jul-2014 np

MFC r268706:
cxgbe(4): Display CF facility correctly in the device log.


# 265426 06-May-2014 np

MFC r259382:

Read card capabilities after firmware initialization, instead of setting
them up as part of firmware initialization (which the driver gets to do
only if it's the master driver).

Read the range of tids available for the ETHOFLD functionality if it's
enabled.

New is_ftid() and is_etid() functions to test whether a tid falls within
the range of filter tids or ETHOFLD tids respectively.


# 265425 06-May-2014 np

MFC r263317, r263412, and r263451.

r263317:
cxgbe(4): significant rx rework.

- More flexible cluster size selection, including the ability to fall
back to a safe cluster size (PAGE_SIZE from zone_jumbop by default) in
case an allocation of a larger size fails.
- A single get_fl_payload() function that assembles the payload into an
mbuf chain for any kind of freelist. This replaces two variants: one
for freelists with buffer packing enabled and another for those without.
- Buffer packing with any sized cluster. It was limited to 4K clusters
only before this change.
- Enable buffer packing for TOE rx queues as well.
- Statistics and tunables to go with all these changes. The driver's
man page will be updated separately.

r263412:
cxgbe(4): if_iqdrops statistic should include tunnel congestion drops.

r263451:
cxgbe(4): man page updates.


# 265421 06-May-2014 np

MFC r260210 (by adrian@):
Add an option to enable or disable the small RX packet copying that
is done to improve performance of small frames.

When doing RX packing, the RX copying isn't necessarily required.


# 265410 06-May-2014 np

MFC r261533, r261536, r261537, and r263457.

r261533:
cxgbe(4): Use the port's tx channel to identify it to t4_clr_port_stats.

r261536:
cxgbe(4): The T5 allows for a different freelist starvation threshold
for queues with buffer packing. Use the correct value to calculate a
freelist's low water mark.

r261537:
cxgbe(4): Use the rx channel map (instead of the tx channel map) as the
congestion channel map.

r263457:
cxgbe(4): Recognize the "spider" configuration where a T5 card's 40G
QSFP port is presented as 4 distinct 10G SFP+ ports to the driver.


# 264736 21-Apr-2014 emax

MFC r264621

use correct (integer) type for the temperature sysctl

Reviewed by: np, scottl
Obtained from: Netflix


# 264493 15-Apr-2014 scottl

MFC r261558

Add a new sysctl, dev.cxgbe.N.rsrv_noflow, and a companion tunable,
hw.cxgbe.rsrv_noflow. When set, queue 0 of the port is reserved for
TX packets without a flowid. The hash value of packets with a flowid
is bumped up by 1. The intent is to provide a private queue for
link-level packets like LACP that is unlikely to overflow or suffer
deep queue latency.


# 259241 11-Dec-2013 np

MFC r259145:
Unstaticize t4_list and t4_uld_list. This works around a clang
annoyance[1] and allows kgdb to find these symbols.

[1] http://lists.freebsd.org/pipermail/freebsd-hackers/2012-November/041166.html


# 259142 09-Dec-2013 np

MFC r257654, r257772, r258441, r258689, r258698, r258879, r259048, and
r259103.

r257654:
cxgbe(4): Exclude MPS_RPLC_MAP_CTL (0x11114) from the register dump. Turns
out it's a write-only register with strange side effects on read.

r257772:
cxgbe(4): Tidy up the display for payload memory statistics (pm_stats).

r258441:
cxgbe(4): update the internal list of device features.

r258689:
Disable an assertion that relies on some code[1] that isn't in HEAD yet.

r258698:
cxgbetool: "modinfo" command to display SFP+ module information.

r258879:
cxgbe(4): T4_SET_SCHED_CLASS and T4_SET_SCHED_QUEUE ioctls to program
scheduling classes in the chip and to bind tx queue(s) to a scheduling
class respectively. These can be used for various kinds of tx traffic
throttling (to force selected tx queues to drain at a fixed Kbps rate,
or a % of the port's total bandwidth, or at a fixed pps rate, etc.).

r259048:
Two new cxgbetool subcommands to set up scheduler classes and to bind
them to NIC queues.

r259103:
cxgbe(4): save a copy of the RSS map for each port for the driver's use.


# 256791 20-Oct-2013 np

MFC r256459.

cxgbe(4): Update T4 and T5 firmwares to 1.9.12.0

Approved by: re (glebius)


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 256218 09-Oct-2013 glebius

There are some high performance NICs that count statistics in hardware,
and there are ifnets, that do that via counter(9). Provide a flag that
would skip cache line trashing '+=' operation in ether_input().

Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Reviewed by: melifaro, adrian
Approved by: re (marius)


# 255015 29-Aug-2013 np

Merge r254386 from user/np/cxl_tuning. Add an INET|INET6 check missing
in said revision.

r254386:
Flush inactive LRO entries periodically.


# 255006 28-Aug-2013 np

Change t4_list_lock and t4_uld_list_lock from mutexes to sx'es.

- tom_uninit had to be reworked not to hold the adapter lock (a mutex)
around t4_deactivate_uld, which acquires the uld_list_lock.
- the ifc_match for the interface cloner that creates the tracer ifnet
had to be reworked as the kernel calls ifc_match with the global
if_cloners_mtx held.


# 255005 28-Aug-2013 np

Add hooks in base cxgbe(4) for the iWARP upper-layer driver. Update a
couple of assertions in the TOE driver as well.


# 254933 26-Aug-2013 np

Use correct mailbox and PCIe PF number when querying RDMA parameters.


# 254577 20-Aug-2013 np

Display P/N information in the description.

Submitted by: gnn
MFC after: 3 days


# 253890 02-Aug-2013 np

Display temperature sensor data. Shows -1 if sensor not
available on the card.

# sysctl dev.t4nex.0.temperature
# sysctl dev.t5nex.0.temperature


# 253829 31-Jul-2013 np

Display SGE tunables in the sysctl tree.

dev.t5nex.0.fl_pktshift: payload DMA offset in rx buffer (bytes)
dev.t5nex.0.fl_pad: payload pad boundary (bytes)
dev.t5nex.0.spg_len: status page size (bytes)
dev.t5nex.0.cong_drop: congestion drop setting

Discussed with: scottl


# 253701 27-Jul-2013 np

Display a string instead of a numeric code in the linkdnrc sysctl.

Submitted by: gnn@


# 253699 26-Jul-2013 np

Expand the list of devices claimed by cxgbe(4).


# 253691 26-Jul-2013 np

Add support for packet-sniffing tracers to cxgbe(4). This works with
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.

Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.

There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.

/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire

If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.

/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy


# 253217 11-Jul-2013 np

Attach to the 4x10G T540-CR card.


# 252747 05-Jul-2013 np

- Show the reason why link is down if this information is available.
- Display the temperature and PHY firmware version of the BT PHY.

MFC after: 1 day


# 252728 04-Jul-2013 np

- Make note of interface MTU change if the rx queues exist, and not just
when the interface is up.
- Add a tunable to control the TOE's rx coalesce feature (enabled by
default as it always has been). Consider the interface MTU or the
coalesce size when deciding which cluster zone to use to fill the
offload rx queue's free list. The tunable is:
dev.{t4nex,t5nex}.<N>.toe.rx_coalesce

MFC after: 1 day


# 252724 04-Jul-2013 np

On-the-fly changes to the interrupt coalescing timer should apply to the
TOE rx queues too.

MFC after: 1 day


# 252705 04-Jul-2013 np

- Read all TP parameters in one place.
- Read the filter mode, calculate various shifts, and use them
properly during active open (in select_ntuple).

MFC after: 1 day


# 252661 03-Jul-2013 np

- Include the T5 firmware with the driver.
- Update the T4 firmware to the latest.
- Minor reorganization and updates to the version macros, etc.

Obtained from: Chelsio
MFC after: 1 day


# 252469 01-Jul-2013 np

Add a sysctl to get the number of filters available.

sysctl dev.t4nex.<N>.nfilters
sysctl dev.t5nex.<N>.nfilters

MFC after: 3 days


# 252312 27-Jun-2013 np

Update T5 register ranges. This is so that regdump skips over registers
with read side-effects.

MFC after: 3 days


# 251434 05-Jun-2013 np

cxgbe(4): Never install a firmware if hw.cxgbe.fw_install is 0.

MFC after: 1 week


# 251358 04-Jun-2013 np

cxgbe(4): Provide accurate hit count for filters on T5 cards. The
location within the TCB and the size have both changed.

MFC after: 1 week


# 251213 01-Jun-2013 np

cxgbe(4): Some more debug sysctls. These work on both T4 and T5 based
cards.

dev.t5nex.0.misc.cim_ma_la: CIM MA logic analyzer
dev.t5nex.0.misc.cim_pif_la: CIM PIF logic analyzer
dev.t5nex.0.misc.mps_tcam: MPS TCAM entries
dev.t5nex.0.misc.tp_la: TP logic analyzer
dev.t5nex.0.misc.ulprx_la: ULPRX logic analyzer

Obtained from: Chelsio
MFC after: 1 week


# 250697 16-May-2013 kib

Add dependencies on the firmware, which allows the loading of the cxgb
and cxgbe modules.

Reviewed and approved by: np
MFC after: 1 week


# 250614 13-May-2013 np

Deal correctly with 40G ports that don't have any transceiver plugged
in. Do not claim that they have unknown tranceivers.

MFC after: 3 days


# 250221 03-May-2013 np

cxgbe: Switch to a better way to install firmware.

MFC after: 1 week


# 250093 30-Apr-2013 np

Attach to the T580 (2 x 40G) card.

MFC after: 1 week.


# 250092 30-Apr-2013 np

- Provide accurate ifmedia information so that 40G ports/transceivers are
displayed properly in ifconfig, etc.

- Use the same number of tx and rx queues for a 40G port as for a 10G port.

MFC after: 1 week


# 249393 11-Apr-2013 np

Add pciids of the T5 based cards. The ones that I haven't tested with
cxgbe(4) are disabled for now. This will change.

MFC after: 2 weeks


# 249392 11-Apr-2013 np

Cosmetic change (s/wrwc/wcwr/;s/WRWC/WCWR/).

MFC after: 3 days.


# 249383 11-Apr-2013 np

Get rid of a couple of stray \n's.

MFC after: 3 days.


# 249382 11-Apr-2013 np

There is no need for elaborate queries and error checking when trying to
set FW4MSG_ENCAP.

MFC after: 3 days


# 249376 11-Apr-2013 np

- Explain clearly why a different firmware is being installed (if/when
it is being installed). Improve other error messages while here.

- Select special FPGA specific configuration profile when appropriate.

MFC after: 3 days


# 249370 11-Apr-2013 np

cxgbe(4): Ensure that the MOD_LOAD handler runs before either t4nex or
t5nex attach to their devices.

MFC after: 3 days


# 248925 30-Mar-2013 np

cxgbe(4): Add support for Chelsio's Terminator 5 (aka T5) ASIC. This
includes support for the NIC and TOE features of the 40G, 10G, and
1G/100M cards based on the T5.

The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4)
has been updated instead of writing a brand new driver. T5 cards will
show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.

Sponsored by: Chelsio


# 247347 26-Feb-2013 np

cxgbe(4): Consider all the API versions of the interfaces exported by
the firmware (instead of just the main firmware version) when evaluating
firmware compatibility. Document the new "hw.cxgbe.fw_install" knob
being introduced here.

This should fix kern/173584 too. Setting hw.cxgbe.fw_install=2 will
mostly do what was requested in the PR but it's a bit more intelligent
in that it won't reinstall the same firmware repeatedly if the knob is
left set.

PR: kern/173584
MFC after: 5 days


# 247291 25-Feb-2013 np

cxgbe(4): Ask the card's firmware to pad up tiny CPLs by encapsulating
them in a firmware message if it is able to do so. This works out
better for one of the FIFOs in the chip.

MFC after: 5 days


# 247122 21-Feb-2013 np

cxgbe(4): Add sysctls to extract debug information from the chip:

dev.t4nex.X.misc.cim_la logic analyzer dump
dev.t4nex.X.misc.cim_qcfg queue configuration
dev.t4nex.X.misc.cim_ibq_xxx inbound queues
dev.t4nex.X.misc.cim_obq_xxx outbound queues

Obtained from: Chelsio
MFC after: 1 week


# 246575 08-Feb-2013 np

Do not hold locks around hardware context reads.

MFC after: 3 days


# 246093 29-Jan-2013 np

Provide a statistic to track the number of drops in each of the port's
txq's buf_ring. The aggregate for all the queues of a port is already
provided in ifnet->if_snd.ifq_drops.

MFC after: 3 days.


# 245936 26-Jan-2013 np

Force the 404-BT card (4 x 1G) to use the "uwire" configuration file.

MFC after: 3 days


# 245933 25-Jan-2013 np

cxgbe/tom: List IFCAP_TOE6 as supported now that all the required pieces
are in place. You still have to enable it explicitly, after loading the
t4_tom KLD.


# 245518 16-Jan-2013 np

cxgbe: Do a more thorough job in the CLEAR_STATS ioctl.

MFC after: 3 days


# 245434 14-Jan-2013 np

cxgbe(4): Updates to the hardware L2 table management code.

- Add full support for IPv6 addresses.

- Read the size of the L2 table during attach. Do not assume that PCIe
physical function 4 of the card has all of the table to itself.

- Use FNV instead of Jenkins to hash L3 addresses and drop the private
copy of jhash.h from the driver.

MFC after: 1 week


# 245274 10-Jan-2013 np

cxgbe(4): Add functions to help synchronize "slow" operations (those not
on the fast data path) and use them instead of frobbing the adapter lock
and busy flag directly.

Other changes made while reworking all slow operations:
- Wait for the reply to a filter request (add/delete). This guarantees
that the operation is complete by the time the ioctl returns.
- Tidy up the tid_info structure.
- Do not allow the tx queue size to be set to something that's not a
power of 2.

MFC after: 1 week


# 244580 22-Dec-2012 np

cxgbe(4): Add support for the T440-LP-CR card. This is the 4x10G low
profile card with a QSFP+ transceiver.

MFC after: 3 days


# 241733 19-Oct-2012 ed

Prefer __containerof() over __member2struct().

The former works better with qualifiers, but also properly type checks
the input pointer.


# 241494 12-Oct-2012 np

Temporary fix for kern/172364.

PR: kern/172364
MFC after: 3 days


# 241493 12-Oct-2012 np

Use global knob in the TP_PARA_REG3 register to disable congestion
drops if the user has chosen this behaviour.

MFC after: 3 days


# 241409 10-Oct-2012 np

Add a driver ioctl to clear a port's MAC statistics.

Submitted by: gnn@
MFC after: 3 days


# 241399 10-Oct-2012 np

Add a driver ioctl to read a byte from any device on a port's i2c bus.
This lets userspace read arbitrary information from the SFP+ modules
etc. on this bus.

Reading multiple bytes in the same transaction isn't possible right now.
I'll update the driver once the chip's firmware supports this.

MFC after: 3 days


# 240680 18-Sep-2012 gavin

Align the PCI Express #defines with the style used for the PCI-X
#defines. This also has the advantage that it makes the names more
compact, iand also allows us to correct the non-uniform naming of
the PCIM_LINK_* defines, making them all consistent amongst themselves.

This is a mostly mechanical rename:
s/PCIR_EXPRESS_/PCIER_/g
s/PCIM_EXP_/PCIEM_/g
s/PCIM_LINK_/PCIEM_LINK_/g

When this is MFC'd, #defines will be added for the old names to assist
out-of-tree drivers.

Discussed with: jhb
MFC after: 1 week


# 240453 13-Sep-2012 np

Install interrupt handlers early, during attach, for the reason
explained in r239913 by jhb.

MFC after: 1 week


# 240452 13-Sep-2012 np

Use native FreeBSD facilities everywhere except the shared code in common/

MFC after: 1 week


# 239341 16-Aug-2012 np

Initialize various DDP parameters in the main cxgbe(4) driver:

- Setup multiple DDP page sizes. When the driver attempts DDP it will
try to combine physically contiguous pages into regions of these sizes.

- Set the indicate size such that the payload carried in the indicate can
be copied in the header mbuf (and the 16K rx buffer can be recycled).

- Set DDP threshold to the max payload that the chip will coalesce and
deliver to the driver (this is ~16K by default, which is also why the
offload rx queue is backed by 16K buffers). If the chip is able to
coalesce up to the max it's allowed to, it's a good sign that the peer
is transmitting in bulk without any TCP PSH.

MFC after: 2 weeks


# 239338 16-Aug-2012 np

Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB. Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or
something else and route it appropriately.

MFC after: 2 weeks


# 239336 16-Aug-2012 np

Allow for a different handler for each type of firmware message.

MFC after: 2 weeks


# 239259 14-Aug-2012 np

if_iqdrops should include frames truncated within the chip.

MFC after: 2 weeks


# 239258 14-Aug-2012 np

Convert some fixed parameters to tunables (with reasonable default
values).

- cong_drop specifies what to do on congestion: nothing, backpressure,
or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary upto which to pad Ethernet payload.
- spg_len controls the length of the status page.

MFC after: 2 weeks


# 238054 03-Jul-2012 np

Fix inverted test that resulted in incorrect multicast hw programming.


# 238028 02-Jul-2012 np

Instruct the firmware not to provision resources for TCP offload if the
kernel is being built without TCP_OFFLOAD. But never override
toecaps_allowed if it has been set manually.


# 237831 30-Jun-2012 np

- Assign (don't OR) the CSUM_XXX bits to csum_flags in the rx checksum code.
- Fix TSO/TSO4 mixup.
- Add IFCAP_LINKSTATE to the available/enabled capabilities.


# 237819 29-Jun-2012 np

cxgbe(4): support for IPv6 TSO and LRO.

Submitted by: bz (this is a modified version of that patch)


# 237799 29-Jun-2012 np

cxgbe(4): support for IPv6 hardware checksumming (rx and tx).


# 237587 25-Jun-2012 np

Allow cxgbe(4) running within a VM to attach to its devices that have been
exported via PCI passthrough.

- Do not check for a specific physical function (PF) before claiming a device.
Different PFs have different device-ids so this check is redundant anyway.

- Obtain the PF# from the WHOAMI register instead of pci_get_function().

- Setup the memory windows using the real BAR0 address, not what the VM says it
is.

Obtained from: Chelsio Communications


# 237463 22-Jun-2012 np

Do not allocate extra vectors when adapter is not TOE
capable (or toecaps have been disallowed by the user).

+ one very minor unrelated cleanup in t4_sge.c


# 237439 22-Jun-2012 np

Do not read registers with read side effects while performing a register
dump for cxgbetool.


# 237263 19-Jun-2012 np

- Updated TOE support in the kernel.

- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)


# 234833 30-Apr-2012 np

Change the default to not use packet counters to generate rx interrupts.
Rely solely on the timer based mechanism.

Update man page to reflect this change.

MFC after: 1 week


# 234831 30-Apr-2012 np

Make sure that the firmware version is available in
dev.t4nex.X.firmware_version even if the driver fails to attach
properly. At least it'll be easy to tell what we're dealing with.

MFC after: 1 week


# 231172 07-Feb-2012 np

Program the MAC exact match table in batches of 7 addresses at
a time when possible. This is more efficient than one at a time.

Submitted by: gnn
MFC after: 3 days


# 231120 07-Feb-2012 np

Acquire the adapter lock before updating fields of the filter structure.

Submitted by: gnn (different version)
MFC after: 3 days


# 231116 07-Feb-2012 np

Remove if_start from cxgb and cxgbe.

Submitted by: jhb
MFC after: 3 days


# 231115 07-Feb-2012 np

cxgbe: reduce diffs with other branches.
Will help future MFCs from HEAD.

MFC after: 3 days


# 228561 16-Dec-2011 np

Many updates to cxgbe(4)

- Device configuration via plain text config file. Also able to operate
when not attached to the chip as the master driver.

- Generic "work request" queue that serves as the base for both ctrl and
ofld tx queues.

- Generic interrupt handler routine that can process any event on any
kind of ingress queue (via a dispatch table).

- A couple of new driver ioctls. cxgbetool can now install a firmware
to the card ("loadfw" command) and can read the card's memory
("memdump" and "tcb" commands).

- Lots of assorted information within dev.t4nex.X.misc.* This is
primarily for debugging and won't show up in sysctl -a.

- Code to manage the L2 tables on the chip.

- Updates to cxgbe(4) man page to go with the tunables that have changed.

- Updates to the shared code in common/

- Updates to the driver-firmware interface (now at fw 1.4.16.0)

MFC after: 1 month


# 227843 22-Nov-2011 marius

- There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.


# 227309 07-Nov-2011 ed

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# 222973 11-Jun-2011 np

- driver ioctl to get SGE context for any given queue.
- sysctls to display the context id, cidx, and pidx of all kinds of queues.

MFC after: 3 days


# 222703 04-Jun-2011 np

Cause backpressure (instead of dropping frames) on congestion.

MFC after: 3 days


# 222552 31-May-2011 np

Provide hit-count with rest of the information about a filter.

MFC after: 1 week


# 222551 31-May-2011 np

Firmware device log.

# sysctl dev.t4nex.0.devlog

MFC after: mdf's sysctl+sbuf changes are MFC'd


# 222510 30-May-2011 np

- Specialized ingress queues that take interrupts for other ingress
queues. Try to have a set of these per port when possible, fall back
to sharing a common pool between all ports otherwise.

- One control queue per port (used to be one per hardware channel).

- t4_eth_rx now handles Ethernet rx only.

- sysctls to display pidx/cidx for some queues.

MFC after: 1 week


# 222509 30-May-2011 np

L2 table code. This is enough to get the T4's switch + L2 rewrite
filters working. (All other filters - switch without L2 info rewrite,
steer, and drop - were already fully-functional).

Some contrived examples of "switch" filters with L2 rewriting:

# cxgbetool t4nex0 iport 0 dport 80 action switch vlan +9 eport 3
Intercept all packets received on physical port 0 with TCP port 80 as
destination, insert a vlan tag with VID 9, and send them out of port 3.

# cxgbetool t4nex0 sip 192.168.1.1/32 ivlan 5 action switch \
vlan =9 smac aa:bb:cc:dd:ee:ff eport 0
Intercept all packets (received on any port) with source IP address
192.168.1.1 and VLAN id 5, rewrite the VLAN id to 9, rewrite source mac
to aa:bb:cc:dd:ee:ff, and send it out of port 0.

MFC after: 1 week


# 222102 19-May-2011 np

Simplify t4_os_find_pci_capability.

MFC after: 3 days


# 222085 18-May-2011 np

- Enable per-channel congestion notification.
- Enable PCIe relaxed ordering for all egress queues and rx data buffers.

MFC after: 3 days


# 222003 16-May-2011 np

Add missing header. The test for VLAN_CAPABILITIES later in the file
doesn't make sense without it.

MFC after: 3 days


# 221516 05-May-2011 np

Bump up the number of egress queues that the driver is allowed to use.

MFC after: 3 days


# 221474 05-May-2011 np

T4 packet filtering/steering.

- Enable 5-tuple and every-packet lookup.

- Setup the default filter mode to allow filtering/steering based on IP
protocol, ingress port, inner VLAN ID, IP frag, FCoE, and MPS match
type; all combined together. You can also filter based on MAC index,
Ethernet type, IP TOS/IPv6 Traffic Class, and outer VLAN ID but you'll
have to modify the default filter mode and exclude some of the
match-fields in it.

IPv4 and IPv6 SIP/DIP/SPORT/DPORT are always available in all filter
rules.

- Add driver ioctls to get/set the global filter mode.

- Add driver ioctls to program and delete hardware filters. A couple of
the "switch" actions that rewrite Ethernet and VLAN information and
switch the packet out of another port may not work as the L2 code is not
yet in place. Everything else, including all "drop" and "pass" rules
with RSS or absolute qid, should work.

Obtained from: Chelsio Communications


# 220874 19-Apr-2011 np

Use Toeplitz hash for RSS.

MFC after: 3 days


# 220873 19-Apr-2011 np

- Move all Ethernet specific items from sge_eq to sge_txq. sge_eq is
now a suitable base for all kinds of egress queues.

- Add control queues (sge_ctrlq) and allocate one of these per hardware
channel. They can be used to program filters and steer traffic (and
more).

MFC after: 1 week


# 220649 15-Apr-2011 np

Fix a couple of bad races that can occur when a cxgbe interface is taken
down. The ingress queue lock was unused and has been removed as part of
these changes.

- An in-flight egress update from the SGE must be handled before the
queue that requested it is destroyed. Wait for the update to arrive.

- Interrupt handlers must stop processing rx events for a queue before
the queue is destroyed. Events that have not yet been processed
should be ignored once the queue disappears.

MFC after: 1 week


# 220643 14-Apr-2011 np

There is no need to request a tx credit flush if such a request is already
pending.

MFC after: 3 days


# 220410 07-Apr-2011 np

Modify read/write ioctls to work with 64 bit registers too.

MFC after: 3 days


# 220232 31-Mar-2011 np

Update header and related code for firmware 1.3.8

MFC after: 3 days


# 219944 23-Mar-2011 np

Do not over-allocate MSI interrupts for the case where each ingress
queue has its own interrupt. If the exact number that we need is not a
power of 2 and we're using MSI, then switch to interrupt multiplexing.

While here, replace the magic numbers with something more readable.

MFC after: 3 days


# 219436 09-Mar-2011 np

Display holdoff timers and packet counts as a list of numbers.

MFC after: 1 week


# 219392 08-Mar-2011 np

cxgbe shouldn't directly know of the UMA zones where network buffers
come from.

MFC after: 1 week


# 219299 05-Mar-2011 np

Be sure to stay within the bounds of the mod_str array when displaying
the transceiver type.


# 219289 05-Mar-2011 np

Store the ifnet rather than the port_info in each txq and rxq struct.

MFC after: 1 week


# 219287 05-Mar-2011 np

Upgrade the firmware on the card automatically if a better version is
available. Downgrade only for a major version mismatch.

MFC after: 1 week


# 219286 05-Mar-2011 np

Resume tx immediately in response to an SGE egress update from the hardware.

MFC after: 1 week


# 218792 18-Feb-2011 np

cxgbe(4) - NIC driver for Chelsio T4 (Terminator 4) based 10Gb/1Gb adapters.

MFC after: 3 weeks