History log of /freebsd-current/sys/dev/xen/netfront/netfront.c
Revision Date Author Comments
# 318bbb6d 03-Nov-2023 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: attempt to make cleanup idempotent

Current cleanup code assumes that all the fields are allocated and/or setup by
the time cleanup is called, but this is not always true: a failure in mid-setup
of the device will cause the functions to be called with possibly uninitialized
fields.

Fix the functions to cope with such state, while also attempting to make the
cleanup idempotent.

Finally fix an error path during setup that would not mark the device as
closed, and hence prevents the kernel from finishing booting.

Fixes: 96375eac945c ("xen-netfront: add multiqueue support")
Sponsored by: Citrix Systems R&D
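
The idempotent-cleanup idea described in this commit can be sketched as follows. This is a minimal illustration, not the driver's code: struct demo_queue, its fields, and demo_queue_cleanup() are hypothetical names chosen for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a per-queue structure; not the driver's real layout. */
struct demo_queue {
	void *ring;      /* NULL until the ring has been allocated */
	int   lock_init; /* non-zero once the lock has been set up */
};

/* Safe to call on a partially initialized queue, and safe to call twice. */
static void
demo_queue_cleanup(struct demo_queue *q)
{
	if (q == NULL)
		return;
	if (q->ring != NULL) {
		free(q->ring);
		q->ring = NULL;   /* a second call sees NULL and skips the free */
	}
	if (q->lock_init) {
		/* a real driver would destroy its lock here */
		q->lock_init = 0;
	}
}
```

Each resource is released only if it was actually set up, and its field is reset afterwards, so a failure in mid-setup and a repeated call both leave the structure in the same clean state.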


# da4b0d6e 12-Aug-2023 Doug Rabson <dfr@FreeBSD.org>

netfront: fix the support for disabling LRO at boot time

The driver has a tunable hw.xn.enable_lro which is intended to control
whether LRO is enabled. This is currently non-functional - even if it's
set to zero, the driver still requests LRO support from the backend.
This change fixes the feature so that if enable_lro is set to zero, LRO
no longer appears in the interface capabilities and LRO is not requested
from the backend.

PR: 273046
MFC after: 1 week
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D41439
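
A rough sketch of the behaviour this commit describes, where a tunable decides whether LRO appears in the capability mask. The DEMO_CAP_* bits and demo_capabilities() are hypothetical; the real driver works with the kernel's interface capability constants.

```c
/* Hypothetical capability bits; assumptions made for this illustration only. */
#define DEMO_CAP_RXCSUM 0x1u
#define DEMO_CAP_TSO4   0x2u
#define DEMO_CAP_LRO    0x4u

/* Build the capability mask, advertising LRO only when the tunable is non-zero. */
static unsigned int
demo_capabilities(int enable_lro)
{
	unsigned int caps = DEMO_CAP_RXCSUM | DEMO_CAP_TSO4;

	if (enable_lro)
		caps |= DEMO_CAP_LRO;
	return (caps);
}
```

With enable_lro set to zero the LRO bit is never set, so any later code that requests features based on the mask would not request LRO either.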


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 02f3b17f 01-Mar-2022 Justin Hibbits <jhibbits@FreeBSD.org>

Mechanically convert Xen netfront/netback(4) to IfAPI

Reviewed by: zlei
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37800


# dabb3db7 03-Nov-2022 Roger Pau Monné <royger@FreeBSD.org>

xen/netfront: deal with mbuf data crossing a page boundary

There's been a report recently of mbufs with data that crosses a page
boundary. It seems those mbufs are generated by the iSCSI target
system:

https://lists.xenproject.org/archives/html/xen-devel/2021-12/msg01581.html

In order to handle those mbufs correctly on netfront use the bus_dma
interface and explicitly request that segments must not cross a page
boundary. No other requirements are necessary, so it's expected that
bus_dma won't need to bounce the data and hence it shouldn't
introduce too big a performance penalty.

Using bus_dma requires some changes to netfront, mainly in order to
accommodate for the fact that now ring slots no longer have a 1:1
match with mbufs, as a single mbuf can use two ring slots if the data
buffer crosses a page boundary. Store the first packet of the mbuf
chain in every ring slot that's used, and use a mbuf tag in order to
store the bus_dma related structures and a refcount to keep track of
the pending slots before the mbuf chain can be freed.

Reported by: G.R.
Tested by: G.R.
MFC: 1 week
Differential revision: https://reviews.freebsd.org/D33876
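
To illustrate why a single buffer may need more than one ring slot, the following sketch counts the page-bounded segments a buffer spans. The 4096-byte page size and the name demo_slots_needed() are assumptions for the example; the driver itself relies on bus_dma to do the splitting.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size for the illustration */

/* Number of page-bounded segments (ring slots) needed for a buffer. */
static unsigned int
demo_slots_needed(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return (0);
	first = addr / DEMO_PAGE_SIZE;             /* page holding the first byte */
	last = (addr + len - 1) / DEMO_PAGE_SIZE;  /* page holding the last byte */
	return ((unsigned int)(last - first + 1));
}
```

For example, a 512-byte buffer starting at address 0x1F00 ends at 0x20FF, so it touches two pages and would occupy two slots, while the same buffer at 0x1000 needs only one.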


# f929eb1e 06-May-2022 John Baldwin <jhb@FreeBSD.org>

xen: Remove unused devclass arguments to DRIVER_MODULE.


# ad7dd514 12-Oct-2021 Elliott Mitchell <ehem+freebsd@m5p.com>

xen: switch to use headers in contrib

These headers originate with the Xen project and shouldn't be mixed with
the main portion of the FreeBSD kernel. Notably they shouldn't be the
target of clean-up commits.

Switch to use the headers in sys/contrib/xen.

Reviewed by: royger


# e7236a7d 15-Dec-2021 Mateusz Guzik <mjg@FreeBSD.org>

xen: plug some of set-but-not-used vars

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 5f700083 16-Oct-2015 Julien Grall <julien@xen.org>

xen/netfront: introduce xen_pv_nics_disabled()

ARM guest is considered as HVM but it only supports PV nics (no
emulation available).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29405


# 6c7cae4a 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

dev/xen: clean up empty lines in .c and .h files


# 903eaa68 23-Aug-2018 Kristof Provost <kp@FreeBSD.org>

xen/netfront: Ensure curvnet is set

netfront_backend_changed() is called from the xenwatch_thread(), which means
that the curvnet is not set. We have to set it before we can call things like
arp_ifinit().

PR: 230845


# d7c5a620 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

With the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32
4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32
4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32
4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32
4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32
4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32
4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32

After the patch

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51
5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51
5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51
5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51
5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52
5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366


# 718cf2cc 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.


# 98018db4 30-Jun-2017 Ryan Libby <rlibby@FreeBSD.org>

netfront.c: avoid gcc variably-modified warning

gcc produces a "variably modified X at file scope" warning for
structures that use these size definitions. I think the definitions are
actually fine but can be rephrased with the __CONST_RING_SIZE macro more
cleanly anyway.

Reviewed by: markj, royger
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential revision: https://reviews.freebsd.org/D11417


# c74415ed 02-Jun-2017 Colin Percival <cperciva@FreeBSD.org>

Skip setting the MTU in the netfront driver (xn# devices) if the new MTU
is the same as the old MTU. In particular, on Amazon EC2 "T2" instances
without this change, the network interface is reinitialized every 30
minutes due to the MTU being (re)set when a new DHCP lease is obtained,
causing packets to be dropped, along with annoying syslog messages about
the link state changing.

As a side note, the behaviour this commit fixes was responsible for
exposing the locking problems fixed via r318523 and r318631.

Maintainers of other network interface drivers may wish to consider making
the corresponding change; the handling of SIOCSIFMTU does not seem to
exhibit a great deal of consistency between drivers.

MFC after: 1 week
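
The early-return pattern this commit describes might look roughly like the sketch below. struct demo_if and demo_set_mtu() are hypothetical, and the counter merely stands in for the real interface reinitialization.

```c
/* Hypothetical interface state; not the kernel's ifnet structure. */
struct demo_if {
	int mtu;
	int reinit_count; /* counts how often the interface was reinitialized */
};

static void
demo_set_mtu(struct demo_if *ifp, int new_mtu)
{
	if (ifp->mtu == new_mtu)
		return;          /* nothing changed, avoid a pointless reinit */
	ifp->mtu = new_mtu;
	ifp->reinit_count++; /* stands in for the real reinitialization */
}
```

Setting the same MTU repeatedly, as a DHCP lease renewal would, leaves reinit_count untouched, so the link is not bounced.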


# 477a40c7 22-May-2017 Roger Pau Monné <royger@FreeBSD.org>

xen/netfront: don't drop the RX lock in xn_rxeof

Since netfront uses different locks for the RX and TX paths there's no need to
drop the RX lock before calling if_input.

Suggested by: jhb
Tested by: cperciva
Sponsored by: Citrix Systems R&D
MFC with: r318523


# bf319173 19-May-2017 Roger Pau Monné <royger@FreeBSD.org>

xen/netfront: don't drop the ring RX lock with inconsistent ring state

Make sure the RX ring lock is only released when the state of the ring is
consistent, or else concurrent calls to xn_rxeof might get an inconsistent ring
state and thus some packets might be processed twice.

Note that this is not very common, and could only happen when an interrupt is
delivered while in xn_ifinit.

Reported by: cperciva
Tested by: cperciva
MFC after: 1 week
Sponsored by: Citrix Systems R&D


# a81683c3 07-Mar-2017 Roger Pau Monné <royger@FreeBSD.org>

xen/netfront: fix inbound packet flags for checksum offload

Currently netfront is setting the flags of inbound packets with the checksum
not present (offloaded) to (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID |
CSUM_PSEUDO_HDR). According to the mbuf(9) man page this is not the correct
combination of flags, it should instead be (CSUM_DATA_VALID |
CSUM_PSEUDO_HDR).

Reviewed by: Wei Liu <wei.liu2@citrix.com>
MFC after: 2 weeks
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D9831
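
As a small illustration of the corrected flag combination, the sketch below returns only the data-valid and pseudo-header bits for an offloaded packet. The DEMO_CSUM_* values are hypothetical placeholders; the real CSUM_* flags and their values are defined by the kernel.

```c
/* Hypothetical bit values; assumptions made for this illustration only. */
#define DEMO_CSUM_IP_CHECKED 0x1u
#define DEMO_CSUM_IP_VALID   0x2u
#define DEMO_CSUM_DATA_VALID 0x4u
#define DEMO_CSUM_PSEUDO_HDR 0x8u

/* Flags for an inbound packet whose checksum was left to the backend. */
static unsigned int
demo_rx_csum_flags(int csum_offloaded)
{
	if (!csum_offloaded)
		return (0);
	/* the IP header bits are deliberately not set here */
	return (DEMO_CSUM_DATA_VALID | DEMO_CSUM_PSEUDO_HDR);
}
```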


# 8dee0e9b 07-Mar-2017 Roger Pau Monné <royger@FreeBSD.org>

xen: add support for canceled suspend

When running on Xen, it's possible that a suspend request to the hypervisor
fails (return from HYPERVISOR_suspend different than 0). This means that the
suspend hasn't succeeded, and the resume procedure needs to properly handle this
case.

First of all, when such situation happens there's no need to reset the vector
callback, hypercall page, shared info, event channels or grant table, because
their state is preserved. Also, the PV drivers don't need to be reset to the
initial state, since the connection with the backend has not been interrupted.

Submitted by: Liuyingdong <liuyingdong@huawei.com>
Reviewed by: royger
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D9635


# 36ea5721 03-Jan-2017 Olivier Houchard <cognet@FreeBSD.org>

In the netfront_rxq struct, we should use NET_RX_RING_SIZE, not
NET_TX_RING_SIZE.

Reviewed by: royger


# b2fd6999 31-Oct-2016 Roger Pau Monné <royger@FreeBSD.org>

xen/netfront: fix statistics

Fix the statistics used by netfront.

Reported by: Trond.Endrestol@ximalas.info
Submitted by: ae
Reviewed by: royger, Wei Liu <wei.liu2@citrix.com>
MFC after: 4 weeks
PR: 213439


# 3c9d5940 05-Aug-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: improve the logic when handling nic features from ioctl

Simplify the logic involved in changing the nic features on the fly, and
only reset the frontend when really needed (when changing RX features). Also
don't return from the ioctl until the interface has been properly
reconfigured.

While there, make sure XN_CSUM_FEATURES is used consistently.

Reported by: julian
MFC after: 5 days
X-MFC-with: r303488
Sponsored by: Citrix Systems R&D


# 339690b5 29-Jul-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: fix trying to send packets with disconnected netfront

In certain circumstances xn_txq_mq_start might be called with num_queues ==
0 during the resume phase after a migration, which can trigger a KASSERT.
Fix this by making sure the carrier is on before trying to transmit, or else
return that the queues are full.

Just as a note, I haven't been able to reproduce this crash on my test
systems, but I still think it's possible and worth fixing.

Reported by: Karl Pielorz <kpielorz_lst@tdx.co.uk>
Sponsored by: Citrix Systems R&D
MFC after: 5 days
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Differential revision: https://reviews.freebsd.org/D7349
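
The guard described in this commit can be sketched as follows. struct demo_softc, demo_select_txq(), and the choice of ENOBUFS as the "queues are full" result are assumptions for the example, not taken from the driver.

```c
#include <errno.h>

/* Hypothetical per-device state. */
struct demo_softc {
	int          carrier_on; /* set only once the frontend is fully connected */
	unsigned int num_queues; /* 0 while disconnected */
};

/* Returns the selected queue index, or -ENOBUFS when the device cannot transmit. */
static int
demo_select_txq(const struct demo_softc *sc, unsigned int flowid)
{
	if (!sc->carrier_on || sc->num_queues == 0)
		return (-ENOBUFS);   /* report "full" instead of dividing by zero */
	return ((int)(flowid % sc->num_queues));
}
```

Checking the carrier before the modulo keeps the queue selection from ever running with num_queues equal to zero.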


# 65671253 06-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: fix initialization

A couple of mostly cosmetic fixes for the final initialization of netfront:

- Switch to "connected" state before starting to kick the rings.
- Correctly use "rxq" in the initialization loop (previously rxq was not
updated in the loop, and netfront would kick np->rxq[N] several times).
- Declare and define xn_connect as static, it's not used outside of this
file.

Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6657
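
The loop bug mentioned in the second item, where the queue pointer was not updated per iteration, can be illustrated with a hedged sketch. struct demo_rxq and demo_kick_all() are hypothetical, and incrementing a counter stands in for kicking a ring.

```c
/* Hypothetical receive queue; the counter stands in for kicking the ring. */
struct demo_rxq {
	int kicks; /* how many times this queue has been kicked */
};

static void
demo_kick_all(struct demo_rxq *queues, int num_queues)
{
	int i;

	for (i = 0; i < num_queues; i++) {
		/* re-derived on every iteration, so each queue is kicked once */
		struct demo_rxq *rxq = &queues[i];

		rxq->kicks++;
	}
}
```

If rxq were assigned once outside the loop, the same queue would be kicked num_queues times and the others not at all.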


# bf7b50db 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: use callout_reset_curcpu instead of callout_reset

This should help distribute the load of the callbacks.

Suggested by: hps
Sponsored by: Citrix Systems R&D


# c2d12e5e 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: perform an interface reset when changing options

The PV backend will only pick the new options when the interface is detached
and reattached again, so perform a full reset when changing options. This is
very fast, and should not be noticeable by the user.

Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6658


# d039b070 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: release grant references used for the shared rings

Just calling gnttab_end_foreign_access_ref doesn't free the references,
instead call gnttab_end_foreign_access with a NULL page argument in order to
have the grant references freed. The code that maps the ring
(xenbus_map_ring) already uses gnttab_grant_foreign_access which takes care
of allocating a grant reference.

Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6608


# c21b47d8 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: fix two hotplug related issues

This patch fixes two issues seen on hot-unplug. The first one is a panic
caused by calling ether_ifdetach after freeing the internal netfront queue
structures. ether_ifdetach will call xn_qflush, and this needs to be done
before freeing the queues. This prevents the following panic:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 04
instruction pointer = 0x20:0xffffffff80b1687f
stack pointer = 0x28:0xfffffe009239e770
frame pointer = 0x28:0xfffffe009239e780
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (thread taskq)
[ thread pid 0 tid 100015 ]
Stopped at strlen+0x1f: movq (%rcx),%rax
db> bt
Tracing pid 0 tid 100015 td 0xfffff800038a6000
strlen() at strlen+0x1f/frame 0xfffffe009239e780
kvprintf() at kvprintf+0xfa0/frame 0xfffffe009239e890
vsnprintf() at vsnprintf+0x31/frame 0xfffffe009239e8b0
kassert_panic() at kassert_panic+0x5a/frame 0xfffffe009239e920
__mtx_lock_flags() at __mtx_lock_flags+0x164/frame 0xfffffe009239e970
xn_qflush() at xn_qflush+0x59/frame 0xfffffe009239e9b0
if_detach() at if_detach+0x17e/frame 0xfffffe009239ea10
netif_free() at netif_free+0x97/frame 0xfffffe009239ea30
netfront_detach() at netfront_detach+0x11/frame 0xfffffe009239ea40
[...]

Another panic can be triggered by hot-plugging a NIC:

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80902203
stack pointer = 0x28:0xfffffe00508d3660
frame pointer = 0x28:0xfffffe00508d36a0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2960 (ifconfig)
[ thread pid 2960 tid 100088 ]
Stopped at xn_txq_mq_start+0x33: divl %esi,%eax
db> bt
Tracing pid 2960 tid 100088 td 0xfffff8000850aa00
xn_txq_mq_start() at xn_txq_mq_start+0x33/frame 0xfffffe00508d36a0
ether_output() at ether_output+0x570/frame 0xfffffe00508d3720
arprequest() at arprequest+0x433/frame 0xfffffe00508d3820
arp_ifinit() at arp_ifinit+0x49/frame 0xfffffe00508d3850
xn_ioctl() at xn_ioctl+0x1a2/frame 0xfffffe00508d3890
in_control() at in_control+0x882/frame 0xfffffe00508d3910
ifioctl() at ifioctl+0xda1/frame 0xfffffe00508d39a0
kern_ioctl() at kern_ioctl+0x246/frame 0xfffffe00508d3a00
sys_ioctl() at sys_ioctl+0x171/frame 0xfffffe00508d3ae0
amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe00508d3bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00508d3bf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8011e185a, rsp =
0x7fffffffe478, rbp = 0x7fffffffe4c0 ---

This is caused by marking the driver as active before it's fully
initialized, and thus calling xn_txq_mq_start with num_queues set to 0.

Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6646


# da695b05 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: switch to using an interrupt handler

In order to use custom taskqueues we would have to mask the interrupt, which
is basically what is already done for an interrupt handler, or else we risk
losing interrupts. This switches netfront to the same interrupt handling
that was done before multiqueue support was added.

Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D


# 2568ee67 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: always keep the Rx ring full of requests

This is based on Linux commit 1f3c2eba1e2d866ef99bb9b10ade4096e3d7607c from
David Vrabel:

A full Rx ring only requires 1 MiB of memory. This is not enough memory
that it is useful to dynamically scale the number of Rx requests in the ring
based on traffic rates, because:

a) Even the full 1 MiB is a tiny fraction of a typical modern Linux
VM (for example, the AWS micro instance still has 1 GiB of memory).

b) Netfront would have used up to 1 MiB already even with moderate
data rates (there was no adjustment of target based on memory
pressure).

c) Small VMs are going to typically have one VCPU and hence only one
queue.

Keeping the ring full of Rx requests handles bursty traffic better than
trying to converge on an optimal number of requests to keep filled.

Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D


# d9a66b6de 02-Jun-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: fix receiving TSO packets

Currently FreeBSD is not properly fetching the TSO information from the Xen
PV ring, and thus the received packets didn't have all the necessary
information, like the segment size or even the TSO flag set.

Sponsored by: Citrix Systems R&D


# 107cfbb7 12-May-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: fix feature detection

Current netfront code relies on xs_scanf returning a value < 0 on error,
which is not right, xs_scanf returns a positive value on error.

MFC after: 3 days
Tested by: Stephen Jones <StephenJo@LivingComputerMuseum.org>
Sponsored by: Citrix Systems R&D
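
The error-convention mismatch described here can be sketched with a stand-in reader. demo_read_feature() is hypothetical and only mimics a "zero on success, positive errno on failure" convention; it is not the xs_scanf interface.

```c
#include <errno.h>

/* Hypothetical reader: 0 on success, positive errno on failure. */
static int
demo_read_feature(int present, int stored, int *value)
{
	if (!present)
		return (ENOENT);
	*value = stored;
	return (0);
}

static int
demo_feature_enabled(int present, int stored)
{
	int value, err;

	err = demo_read_feature(present, stored, &value);
	if (err != 0)   /* "err < 0" would never fire with positive error codes */
		value = 0;  /* treat a missing key as "feature not supported" */
	return (value != 0);
}
```

With a check of err < 0, a failed read would fall through and use an uninitialized value, which is the kind of misdetection the commit fixes.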


# 6dd38b87 01-Apr-2016 Sepherosa Ziehau <sephe@FreeBSD.org>

tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication

And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by: gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5725


# cbc4d2db 01-Mar-2016 John Baldwin <jhb@FreeBSD.org>

Remove taskqueue_enqueue_fast().

taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167. It has been a compat shim ever
since. It's time for the compat shim to go.

Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: sephe
Differential Revision: https://reviews.freebsd.org/D5131


# 8f28a42e 11-Feb-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: remove useless NULL check in netif_free

xn_ifp is allocated in create_netdev with if_alloc(IFT_ETHER).
According to the current arrangement it can't be NULL.

Coverity ID: 1349805
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5252


# d4dae2b1 11-Feb-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: rearrange error paths in setup_txqs

Coverity spotted double free errors in error path. Fix that by
removing the extraneous calls.

Coverity ID: 1349798
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5251


# 78034994 11-Feb-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: remove pointless assignment in xn_ioctl

The variable error is assigned to 0 before entering the switch.
Assigning 0 to error before the break pointlessly overwrites the real error
value that should be returned.

Coverity ID: 1304974
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5250
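
A minimal sketch of the pattern behind this fix: error is initialized once before the switch and never reset inside a case. The command names and demo_do_work() are hypothetical.

```c
#include <errno.h>

enum demo_cmd { DEMO_CMD_OK, DEMO_CMD_FAIL, DEMO_CMD_UNKNOWN };

/* Hypothetical helper that either succeeds or reports EIO. */
static int
demo_do_work(int should_fail)
{
	return (should_fail ? EIO : 0);
}

static int
demo_ioctl(enum demo_cmd cmd)
{
	int error = 0; /* initialized once, before the switch */

	switch (cmd) {
	case DEMO_CMD_OK:
		error = demo_do_work(0);
		break;
	case DEMO_CMD_FAIL:
		error = demo_do_work(1);
		/* no "error = 0;" here, that would hide the failure */
		break;
	default:
		error = EINVAL;
		break;
	}
	return (error);
}
```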


# 96375eac 20-Jan-2016 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: add multiqueue support

Add support for multiple TX and RX queue pairs. The default number of queues
is set to 4, but can be easily changed from the sysctl node hw.xn.num_queues.

Also heavily refactor netfront driver: break out a bunch of helper
functions and different structures. Use threads to handle TX and RX.
Remove some dead code and fix quite a few bugs as I go along.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D4193


# d5b4f139 05-Nov-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: remove unused header files

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential Revision: https://reviews.freebsd.org/D4079


# ce8df48b 30-Oct-2015 Simon J. Gerraty <sjg@FreeBSD.org>

Do not FALLTHROUGH for SIOC{ADD,DEL}MULTI

ifmedia_ioctl() returns EINVAL

Differential Revision: 3897
Submitted by: aronen@juniper.net
Reviewed by: marcel


# 3778878d 21-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

netfront: fix LINT-NOIP

r289587 broke LINT-NOIP kernels because the lro and queued local variables
are defined but not used. Add preprocessor guards around them.

Reported by: emaste
Sponsored by: Citrix Systems R&D


# 2f9ec994 21-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen: Code cleanup and small bug fixes

xen/hypervisor.h:
- Remove unused helpers: MULTI_update_va_mapping, is_initial_xendomain,
is_running_on_xen
- Remove unused define CONFIG_X86_PAE
- Remove unused variable xen_start_info: note that it's used in pcifront
which is not built at all
- Remove forward declaration of HYPERVISOR_crash

xen/xen-os.h:
- Remove unused define CONFIG_X86_PAE
- Drop unused helpers: test_and_clear_bit, clear_bit,
force_evtchn_callback
- Implement a generic version (based on ofed/include/linux/bitops.h) of
set_bit and test_bit and prefix them by xen_ to avoid any use by other
code than Xen. Note that it would be worth investigating a generic
implementation in FreeBSD.
- Replace barrier() by __compiler_membar()
- Replace cpu_relax() by cpu_spinwait(): it's exactly the same as rep;nop
= pause

xen/xen_intr.h:
- Move the prototype of xen_intr_handle_upcall into it: used by all
platforms

x86/xen/xen_intr.c:
- Use BITSET* for the enabledbits: avoids using custom helpers
- test_bit/set_bit has been renamed to xen_test_bit/xen_set_bit
- Don't export the variable xen_intr_pcpu

dev/xen/blkback/blkback.c:
- Fix the string format when XBB_DEBUG is enabled: host_addr is typed
uint64_t

dev/xen/balloon/balloon.c:
- Remove set but not used variable
- Use the correct type for frame_list: xen_pfn_t represents the frame
number on any architecture

dev/xen/control/control.c:
- Return BUS_PROBE_WILDCARD in xs_probe: Returning 0 in a probe callback
means the driver can handle this device. If by any chance xenstore is the
first driver, every new device whose driver is unset will use
xenstore.

dev/xen/grant-table/grant_table.c:
- Remove unused cmpxchg
- Drop unused include opt_pmap.h: Doesn't exist on ARM64 and it doesn't
contain anything required for the code on x86

dev/xen/netfront/netfront.c:
- Use the correct type for rx_pfn_array: xen_pfn_t represents the frame
number on any architecture

dev/xen/netback/netback.c:
- Use the correct type for gmfn: xen_pfn_t represents the frame number on
any architecture

dev/xen/xenstore/xenstore.c:
- Return BUS_PROBE_WILDCARD in xctrl_probe: Returning 0 in a probe callback
means the driver can handle this device. If by any chance xenstore is the
first driver, every new device whose driver is unset will use xenstore.

Note that with these changes, x86/include/xen/xen-os.h no longer contains
arch-specific code. However, a new series will add some helpers that differ
between x86 and ARM64, so I've kept the headers for now.

Submitted by: Julien Grall <julien.grall@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3921
Sponsored by: Citrix Systems R&D


# 4955cbf3 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: use "netfront" in lock description

Missed from r289585.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3937
Sponsored by: Citrix Systems R&D


# 1a2928b7 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: fix netfront create_dev error path

The failure path for allocating rx grant refs should not try to free tx
grant refs because tx grant refs were allocated after that. Also fix the
error path for xen_net_read_mac.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3891
Sponsored by: Citrix Systems R&D


# b31a0d73 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: no need to set if_output

This is redundant because ether_ifattach will set that field.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3918
Sponsored by: Citrix Systems R&D


# 08c9c2e0 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: remove a bunch of FreeBSD version checks

We're way beyond FreeBSD 7 at this point.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3892
Sponsored by: Citrix Systems R&D


# 177e3f13 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: remove XN_LOCK_{INIT,DESTROY}

Multiqueue feature will make the number of queues dynamic, so XN_LOCK_INIT
won't be that useful. Remove the macro and call mtx_init directly.

XN_LOCK_DESTROY is just dead code.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3890
Sponsored by: Citrix Systems R&D


# 9a7f9fea 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: clean up netfront stats structure

Rename it with netfront_ prefix and purge a bunch of unused fields.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3889
Sponsored by: Citrix Systems R&D


# d0f3a8b9 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: purge page flipping support

Currently neither Linux nor FreeBSD netback supports page flipping. NetBSD
still supports that. It is unclear how many people actually use page
flipping, but page flipping is supposed to be slower than copying nowadays.
It will also shatter frontend / backend address space.

Overall this feature is more of a burden than a benefit.

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3888
Sponsored by: Citrix Systems R&D


# 17374b6c 19-Oct-2015 Roger Pau Monné <royger@FreeBSD.org>

xen-netfront: delete all trailing white spaces

Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3886
Sponsored by: Citrix Systems R&D


# f0c2f5e2 25-Aug-2015 Marcelo Araujo <araujo@FreeBSD.org>

Code cleanup unused-but-set-variable spotted by gcc.

Reviewed by: royger
Approved by: bapt (mentor)
Differential Revision: D3476


# f8f1bb83 21-Aug-2015 Roger Pau Monné <royger@FreeBSD.org>

xen: allow disabling PV disks and nics

Introduce two new loader tunables that can be used to disable PV disks and
PV nics at boot time. They default to 0 and should be set to 1 (or any
number different than 0) in order to disable the PV devices:

hw.xen.disable_pv_disks=1
hw.xen.disable_pv_nics=1

In /boot/loader.conf will disable both PV disks and nics.

Sponsored by: Citrix Systems R&D
Tested by: Karl Pielorz <kpielorz_lst@tdx.co.uk>
MFC after: 1 week


# 3ebe4c01 14-Aug-2015 John Baldwin <jhb@FreeBSD.org>

Remove another remnant of PV domU support and assume that we always run
with an automatically translated physmap under XEN.

Reviewed by: royger (earlier version)
Differential Revision: https://reviews.freebsd.org/D3325


# 3c790178 06-Aug-2015 John Baldwin <jhb@FreeBSD.org>

Remove some more vestiges of the Xen PV domu support. Specifically,
use vtophys() directly instead of vtomach() and retire the no-longer-used
headers <machine/xenfunc.h> and <machine/xenvar.h>.

Reported by: bde (stale bits in <machine/xenfunc.h>)
Reviewed by: royger (earlier version)
Differential Revision: https://reviews.freebsd.org/D3266


# 6a8e9695 02-Jul-2015 Roger Pau Monné <royger@FreeBSD.org>

netfront: preserve configuration across migrations

Try to preserve the xn configuration when migrating. This is not always
possible since the backend might not have the same set of options
available, in which case we will try to preserve as many as possible.

MFC after: 2 weeks
PR: 183139
Reported by: mcdouga9@egr.msu.edu
Sponsored by: Citrix Systems R&D


# fd90e2ed 22-May-2015 Jung-uk Kim <jkim@FreeBSD.org>

CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks


# dbf82bde 14-May-2015 Roger Pau Monné <royger@FreeBSD.org>

netfront: wait for backend to connect before sending ARP

Netfront has to wait for the backend to switch to state XenbusStateConnected
before sending the ARP request, or else the backend might not be connected
and thus the packet will be lost.

Sponsored by: Citrix Systems R&D
MFC after: 1 week


# ed95805e 30-Apr-2015 John Baldwin <jhb@FreeBSD.org>

Remove support for Xen PV domU kernels. Support for HVM domU kernels
remains. Xen is planning to phase out support for PV upstream since it
is harder to maintain and has more overhead. Modern x86 CPUs include
virtualization extensions that support HVM guests instead of PV guests.
In addition, the PV code was i386 only and not as well maintained recently
as the HVM code.
- Remove the i386-only NATIVE option that was used to disable certain
components for PV kernels. These components are now standard as they
are on amd64.
- Remove !XENHVM bits from PV drivers.
- Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3,
etc.)
- Remove duplicate copy of <xen/features.h>.
- Remove unused, i386-only xenstored.h.

Differential Revision: https://reviews.freebsd.org/D2362
Reviewed by: royger
Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0)
Relnotes: yes


# d8edb414 20-Apr-2015 Marcelo Araujo <araujo@FreeBSD.org>

Remove unused variable.

Differential Revision: D2333
Reviewed by: royger


# c2d9c6f0 27-Feb-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Use m_getjcl() instead of old mbuf(9) KPIs.

Tested by: royger


# 49e6be9c 23-Feb-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Previous versions of mbufq were fine when initialized by M_ZERO, while
the new one requires explicit initialization.

Reported by: royger


# c578b6ac 18-Feb-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Provide a set of inline functions to manage simple mbuf(9) queues, based
on queue(3)'s STAILQ. Utilize them in cxgb(4) and Xen, deleting home
grown implementations.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 25baf019 12-Jan-2015 Xin LI <delphij@FreeBSD.org>

Use the common codepath to handle SIOCGIFADDR.

Before this change, the code handled SIOCGIFADDR the same way as
SIOCSIFADDR, which involves a full arp_ifinit(), et al. That work
is unnecessary in the SIOCGIFADDR case.

Differential Revision: https://reviews.freebsd.org/D1508
Reviewed by: glebius
MFC after: 2 weeks


# 2a8c860f 05-Jan-2015 Robert Watson <rwatson@FreeBSD.org>

In order to reduce use of M_EXT outside of the mbuf allocator and
socket-buffer implementations, introduce a return value for MCLGET()
(and m_cljget() that underlies it) to allow the caller to avoid testing
M_EXT itself. Update all callers to use the return value.

With this change, very few network device drivers remain aware of
M_EXT; the primary exceptions lie in mbuf-chain pretty printers for
debugging, and in a few cases, custom mbuf and cluster allocation
implementations.

NB: This is a difficult-to-test change as it touches many drivers for
which I don't have physical devices. Instead we've gone for intensive
review, but further post-commit review would definitely be appreciated
to spot errors where changes could not easily be made mechanically,
but were largely mechanical in nature.

Differential Revision: https://reviews.freebsd.org/D1440
Reviewed by: adrian, bz, gnn
Sponsored by: EMC / Isilon Storage Division


# f0188618 21-Oct-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix multiple incorrect SYSCTL arguments in the kernel:

- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, since
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.
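The "logical OR where binary OR was expected" item above can be
illustrated with a minimal sketch. The flag names and values below are
illustrative assumptions, not taken from the SYSCTL headers; the point is
only that "||" collapses any non-zero combination to 1, while "|" keeps
the individual bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag bits; treat the names and values as assumptions. */
#define DEMO_FLAG_RD		0x80000000u
#define DEMO_FLAG_MPSAFE	0x00040000u

/* Compile-time check in the spirit of CTASSERT: the bits must not overlap. */
static_assert((DEMO_FLAG_RD & DEMO_FLAG_MPSAFE) == 0, "flag bits overlap");

/* Buggy form: "||" yields a truth value (0 or 1), so the flag bits are lost. */
static uint32_t
demo_combine_logical(uint32_t a, uint32_t b)
{
	return (a || b);
}

/* Intended form: "|" keeps every bit that is set in either operand. */
static uint32_t
demo_combine_bitwise(uint32_t a, uint32_t b)
{
	return (a | b);
}
```

A compile-time assertion on the access argument catches this class of
mistake early, because the value 1 is not a valid flag combination.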

MFC after: 3 days
Sponsored by: Mellanox Technologies


# 9fd573c3 22-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve transmit sending offload, TSO, algorithm in general.

The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware are limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Otherwise some kinds of hardware might have to drop completely
valid mbuf chains because they cannot be loaded into the given
hardware's DMA engine. The new way of doing TSO limitation has been
made backwards compatible, following input from other FreeBSD
developers, and will use defaults for values not set.
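As a rough sketch of why a byte limit alone is not enough, the check
below tests a chain against all three limits named above: total bytes,
fragment count, and fragment size. The structures and names are
hypothetical userland stand-ins, not the kernel's mbuf or ifnet types,
and the sketch only validates a chain; it does not show how the stack
sizes a TSO burst.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one mbuf in a chain; only the length matters. */
struct demo_seg {
	size_t len;
	struct demo_seg *next;
};

/* Hypothetical per-interface TSO limits. */
struct demo_tso_limits {
	size_t max_bytes;	/* total payload the hardware accepts */
	size_t max_seg_count;	/* number of fragments (DMA segments) */
	size_t max_seg_size;	/* size of any single fragment */
};

/* Return true only if the chain satisfies all three limits. */
static bool
demo_tso_chain_fits(const struct demo_seg *chain,
    const struct demo_tso_limits *lim)
{
	size_t total = 0, count = 0;

	assert(lim != NULL);
	for (const struct demo_seg *s = chain; s != NULL; s = s->next) {
		if (s->len > lim->max_seg_size)
			return (false);
		total += s->len;
		count++;
	}
	return (total <= lim->max_bytes && count <= lim->max_seg_count);
}
```

A chain of three 1000-byte fragments passes a 65535-byte limit easily,
yet is rejected by hardware that accepts at most two fragments.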

Reviewed by: adrian, rmacklem
Sponsored by: Mellanox Technologies
MFC after: 1 week


# c8dfaf38 18-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically convert to if_inc_counter().


# 72f31000 13-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Revert r271504. A new patch to solve this issue will be made.

Suggested by: adrian @


# eb93b77a 13-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve transmit sending offload, TSO, algorithm in general.

The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware are limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Otherwise some kinds of hardware might have to drop completely
valid mbuf chains because they cannot be loaded into the given
hardware's DMA engine. The new way of doing TSO limitation has been
made backwards compatible, following input from other FreeBSD
developers, and will use defaults for values not set.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# c6b15dd5 10-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Fix typo in r257515.

Submitted by: az


# f909bbb4 01-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Somehow fix LINT-NOIP.


# 6483a98b 28-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Remove dead function show_device(). It isn't buildable if DEBUG is
defined, due to unknown field "xn_ifno". The field has not existed
at any point in the history of this file.


# 77374386 28-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Include if_var.h.


# 76acc41f 29-Aug-2013 Justin T. Gibbs <gibbs@FreeBSD.org>

Implement vector callback for PVHVM and unify event channel implementations

Re-structure Xen HVM support so that:
- Xen is detected and hypercalls can be performed very
early in system startup.
- Xen interrupt services are implemented using FreeBSD's native
interrupt delivery infrastructure.
- the Xen interrupt service implementation is shared between PV
and HVM guests.
- Xen interrupt handlers can optionally use a filter handler
in order to avoid the overhead of dispatch to an interrupt
thread.
- interrupt load can be distributed among all available CPUs.
- the overhead of accessing the emulated local and I/O apics
on HVM is removed for event channel port events.
- a similar optimization can eventually, and fairly easily,
be used to optimize MSI.

Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure,
and misc Xen cleanups:

Sponsored by: Spectra Logic Corporation

Unification of PV & HVM interrupt infrastructure, bug fixes,
and misc Xen cleanups:

Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D

sys/x86/x86/local_apic.c:
sys/amd64/include/apicvar.h:
sys/i386/include/apicvar.h:
sys/amd64/amd64/apic_vector.S:
sys/i386/i386/apic_vector.s:
sys/amd64/amd64/machdep.c:
sys/i386/i386/machdep.c:
sys/i386/xen/exception.s:
sys/x86/include/segments.h:
Reserve IDT vector 0x93 for the Xen event channel upcall
interrupt handler. On Hypervisors that support the direct
vector callback feature, we can request that this vector be
called directly by an injected HVM interrupt event, instead
of a simulated PCI interrupt on the Xen platform PCI device.
This avoids all of the overhead of dealing with the emulated
I/O APIC and local APIC. It also means that the Hypervisor
can inject these events on any CPU, allowing upcalls for
different ports to be handled in parallel.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
Map Xen per-vcpu area during AP startup.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
Increase the FreeBSD IRQ vector table to include space
for event channel interrupt sources.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
Remove Xen HVM per-cpu variable data. These fields are now
allocated via the dynamic per-cpu scheme. See xen_intr.c
for details.

sys/amd64/include/xen/hypercall.h:
sys/dev/xen/blkback/blkback.c:
sys/i386/include/xen/xenvar.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/xen/gnttab.c:
Prefer FreeBSD primitives to Linux ones in Xen support code.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/console/xencons_ring.c:
sys/dev/xen/control/control.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/xenpci/xenpci.c:
sys/i386/i386/machdep.c:
sys/i386/include/pmap.h:
sys/i386/include/xen/xenfunc.h:
sys/i386/isa/npx.c:
sys/i386/xen/clock.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/xen_rtc.c:
sys/xen/evtchn/evtchn_dev.c:
sys/xen/features.c:
sys/xen/gnttab.c:
sys/xen/gnttab.h:
sys/xen/hvm.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstore.c:
sys/xen/xenstore/xenstore_dev.c:
sys/xen/xenstore/xenstorevar.h:
Pull common Xen OS support functions/settings into xen/xen-os.h.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
Remove constants, macros, and functions unused in FreeBSD's Xen
support.

sys/xen/xen-os.h:
sys/i386/xen/xen_machdep.c:
sys/x86/xen/hvm.c:
Introduce new functions xen_domain(), xen_pv_domain(), and
xen_hvm_domain(). These are used in favor of #ifdefs so that
FreeBSD can dynamically detect and adapt to the presence of
a hypervisor. The goal is to have an HVM optimized GENERIC,
but more is necessary before this is possible.
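The run-time detection helpers described above could look roughly like
the following sketch. All names carry a demo_ prefix because the enum,
the variable, and the helper bodies are assumptions for illustration,
not copies of xen/xen-os.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed shape of the run-time domain type. */
enum demo_xen_domain_type {
	DEMO_XEN_NATIVE,	/* not running under a Xen hypervisor */
	DEMO_XEN_PV_DOMAIN,	/* paravirtualized guest */
	DEMO_XEN_HVM_DOMAIN	/* hardware-virtualized guest */
};

/* Zero-initialized state must mean "native" before detection has run. */
static_assert(DEMO_XEN_NATIVE == 0, "native must be the zero value");

/* In a kernel this would be set once by early hypervisor detection. */
static enum demo_xen_domain_type demo_xen_domain_type;

static inline bool
demo_xen_domain(void)
{
	return (demo_xen_domain_type != DEMO_XEN_NATIVE);
}

static inline bool
demo_xen_pv_domain(void)
{
	return (demo_xen_domain_type == DEMO_XEN_PV_DOMAIN);
}

static inline bool
demo_xen_hvm_domain(void)
{
	return (demo_xen_domain_type == DEMO_XEN_HVM_DOMAIN);
}
```

Code that previously sat inside an #ifdef can then test
demo_xen_hvm_domain() at run time, which is what allows one kernel image
to run both natively and as a guest.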

sys/amd64/amd64/machdep.c:
sys/dev/xen/xenpci/xenpcivar.h:
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/sys/kernel.h:
Refactor magic ioport, Hypercall table and Hypervisor shared
information page setup, and move it to a dedicated HVM support
module.

HVM mode initialization is now triggered during the
SI_SUB_HYPERVISOR phase of system startup. This currently
occurs just after the kernel VM is fully setup which is
just enough infrastructure to allow the hypercall table
and shared info page to be properly mapped.

sys/xen/hvm.h:
sys/x86/xen/hvm.c:
Add definitions and a method for configuring Hypervisor event
delivery via a direct vector callback.

sys/amd64/include/xen/xen-os.h:
sys/x86/xen/hvm.c:

sys/conf/files:
sys/conf/files.amd64:
sys/conf/files.i386:
Adjust kernel build to reflect the refactoring of early
Xen startup code and Xen interrupt services.

sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
sys/dev/xen/control/control.c:
sys/dev/xen/evtchn/evtchn_dev.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/xen/xenstore/xenstore.c:
sys/xen/evtchn/evtchn_dev.c:
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c:
Adjust drivers to use new xen_intr_*() API.

sys/dev/xen/blkback/blkback.c:
Since blkback defers all event handling to a taskqueue,
convert this task queue to a "fast" taskqueue, and schedule
it via an interrupt filter. This avoids an unnecessary
ithread context switch.

sys/xen/xenstore/xenstore.c:
The xenstore driver is MPSAFE. Indicate as much when
registering its interrupt handler.

sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusvar.h:
Remove unused event channel APIs.

sys/xen/evtchn.h:
Remove all kernel Xen interrupt service API definitions
from this file. It is now only used for structure and
ioctl definitions related to the event channel userland
device driver.

Update the definitions in this file to match those from
NetBSD. Implementing this interface will be necessary for
Dom0 support.

sys/xen/evtchn/evtchnvar.h:
Add a header file for implementation internal APIs related
to managing event channels event delivery. This is used
to allow, for example, the event channel userland device
driver to access low-level routines that typical kernel
consumers of event channel services should never access.

sys/xen/interface/event_channel.h:
sys/xen/xen_intr.h:
Standardize on the evtchn_port_t type for referring to
an event channel port id. In order to prevent low-level
event channel APIs from leaking to kernel consumers who
should not have access to this data, the type is defined
twice: Once in the Xen provided event_channel.h, and again
in xen/xen_intr.h. The double declaration is protected by
__XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared
twice within a given compilation unit.
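The guarded double declaration can be sketched in a single file by
simulating the two headers back to back. The uint32_t underlying type
and the helper function are assumptions made for the example; the guard
macro name is the one given above.

```c
#include <assert.h>
#include <stdint.h>

/* --- simulated contents of the Xen-provided event_channel.h --- */
#ifndef __XEN_EVTCHN_PORT_DEFINED__
typedef uint32_t evtchn_port_t;
#define __XEN_EVTCHN_PORT_DEFINED__ 1
#endif

/* --- simulated contents of xen/xen_intr.h --- */
#ifndef __XEN_EVTCHN_PORT_DEFINED__
typedef uint32_t evtchn_port_t;		/* skipped: guard already set */
#define __XEN_EVTCHN_PORT_DEFINED__ 1
#endif

/* Both headers must agree on the representation. */
static_assert(sizeof(evtchn_port_t) == sizeof(uint32_t),
    "evtchn_port_t size mismatch");

/* Hypothetical consumer that only needs the type, not the low-level API. */
static evtchn_port_t
demo_next_port(evtchn_port_t port)
{
	return (port + 1);
}
```

Whichever header is included first provides the typedef, so a kernel
consumer can include xen/xen_intr.h alone without pulling in the
low-level event channel definitions.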

sys/xen/xen_intr.h:
sys/xen/evtchn/evtchn.c:
sys/x86/xen/xen_intr.c:
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/xenpci/xenpcivar.h:
New implementation of Xen interrupt services. This is
similar in many respects to the i386 PV implementation with
the exception that events bound to event channel ports
(i.e. not IPI, virtual IRQ, or physical IRQ) are further
optimized to avoid mask/unmask operations that aren't
necessary for these edge triggered events.

Stubs exist for supporting physical IRQ binding, but will
need additional work before this implementation can be
fully shared between PV and HVM.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
sys/i386/xen/mp_machdep.c:
sys/x86/xen/hvm.c:
Add support for placing vcpu_info into an arbitrary memory
page instead of using HYPERVISOR_shared_info->vcpu_info.
This allows the creation of domains with more than 32 vcpus.

sys/i386/i386/machdep.c:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/exception.s:
Add support for new event channel implementation.


# d5aeb779 13-Jun-2013 Justin T. Gibbs <gibbs@FreeBSD.org>

sys/dev/xen/netfront/netfront.c:
In netif_free(), call ifmedia_removeall() after ether_ifdetach()
so that bpf listeners are detached, any link state processing
is completed, and there is no chance for external reference to media
information.

Suggested by: yongari
MFC after: 1 week


# 33e0730e 03-Jun-2013 Andre Oppermann <andre@FreeBSD.org>

Specify a maximum TSO length limiting the segment chain to what the
Xen host side can handle after defragmentation.

This prevents the driver from throwing away too long TSO chains and
improves the performance on Amazon AWS instances with 10GigE virtual
interfaces to the normally expected throughput.

Submitted by: cperciva (earlier version)
Reviewed by: cperciva
Tested by: cperciva
MFC after: 1 week


# e3242f9d 30-May-2013 Justin T. Gibbs <gibbs@FreeBSD.org>

Make netif_free() safe to call on a partially initialized softc.

Sponsored by: Spectra Logic Corporation
MFC after: 1 week


# 818fe953 22-May-2013 Justin T. Gibbs <gibbs@FreeBSD.org>

Correct panic on detach of Xen PV network interfaces.

dev/xen/netfront:
In netif_free(), properly stop the interface and drain any pending
timers prior to disconnecting from the backend device.

Remove all media and detach our interface object from the system
prior to deleting it.

PR: kern/176471
Submitted by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed by: gibbs
MFC after: 1 week


# 6f9767ac 03-Jan-2013 Marius Strobl <marius@FreeBSD.org>

- Replace partially incorrect function names in panic(9) strings with
__func__ and add some missing ones.
- Remove a stale comment.
- Remove unused NUM_ELEMENTS macro.
- Remove extra empty lines.
- Use DEVMETHOD_END.
- Use NULL rather than 0 for pointers.

MFC after: 3 days


# c6499ecc 04-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.


# 5bbe0c53 07-Jan-2012 Kevin Lo <kevlo@FreeBSD.org>

ether_ifattach() sets if_mtu to ETHERMTU, so don't bother setting it again

Reviewed by: yongari


# 578e4bf7 20-Sep-2011 Justin T. Gibbs <gibbs@FreeBSD.org>

Update netfront so that it queries and honors published
back-end features.

sys/dev/xen/netfront/netfront.c:
o Add xn_query_features() which reads the XenStore and
records the TSO, LRO, and chained ring-request support
of the backend.
o Rename xn_configure_lro() to xn_configure_features() and
use this routine to manage the setup of TSO, LRO, and
checksum offload.
o In create_netdev(), initialize if_capabilities and
if_hwassist to the capabilities found on all backends.
Delegate configuration of if_capenable and the TSO flag
in if_hwassist to xn_configure_features().

Reported by: Hugo Silva (fix inspired by patch provided)
Approved by: re
MFC after: 1 week


# ffa06904 20-Sep-2011 Justin T. Gibbs <gibbs@FreeBSD.org>

Modify the netfront driver so it can successfully attach to
PV devices with the ioemu attribute set.

sys/dev/xen/netfront/netfront.c:
o If a mac address for the interface cannot be found
in the front-side XenStore tree, look for an entry
in the back-side tree. With ioemu devices, the
emulator does not populate the front side tree and
neither does Xend.
o Return an error rather than panic when an attach
attempt fails.
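The lookup order described above (front-side tree first, then the
back-side tree, then an error instead of a panic) might be sketched as
follows. The in-memory store, the paths, and the function names are all
hypothetical; the real driver reads these nodes through the XenStore
API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_store_entry {
	const char *dir;
	const char *node;
	const char *value;
};

/* Fake store: as with ioemu devices, only the back-side tree has "mac". */
static const struct demo_store_entry demo_store[] = {
	{ "backend/vif/1/0", "mac", "00:16:3e:00:00:01" },
	{ "device/vif/0", "backend", "backend/vif/1/0" },
};

/* Return the value of dir/node, or NULL if the node does not exist. */
static const char *
demo_store_read(const char *dir, const char *node)
{
	for (size_t i = 0; i < sizeof(demo_store) / sizeof(demo_store[0]); i++) {
		if (strcmp(demo_store[i].dir, dir) == 0 &&
		    strcmp(demo_store[i].node, node) == 0)
			return (demo_store[i].value);
	}
	return (NULL);
}

/* Try the front-side tree, fall back to the back-side tree, else fail. */
static int
demo_read_mac(const char *front, const char *back, const char **mac)
{
	assert(mac != NULL);
	*mac = demo_store_read(front, "mac");
	if (*mac == NULL)
		*mac = demo_store_read(back, "mac");
	return (*mac != NULL ? 0 : ENOENT);
}
```

Returning ENOENT lets the attach routine fail cleanly for that one
device rather than taking the whole system down.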

Reported by: Janne Snabb (fix inspired by patch provided)
PR: kern/154302
Approved by: re


# cf9c09e1 20-Sep-2011 Justin T. Gibbs <gibbs@FreeBSD.org>

Correct suspend/resume support in the Netfront driver.

Sponsored by: BQ Internet

sys/dev/xen/netfront/netfront.c:
o Implement netfront_suspend(), a specialized suspend
handler for the netfront driver. This routine simply
disables the carrier so the driver is idle during
system suspend processing.
o Fix a leak when re-initializing LRO during a link reset.
o In netif_release_tx_bufs(), when cleaning up the grant
references for our TX ring, use gnttab_end_foreign_access_ref
instead of attempting to grant the page again.
o In netif_release_tx_bufs(), we do not track mbufs associated
with mbuf chains, but instead just free each mbuf directly.
Use m_free(), not m_freem(), to avoid double frees of mbufs.
o Refactor some code to enhance clarity.
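The m_free() versus m_freem() item can be illustrated with a userland
mock. It assumes, as the item implies, that every mbuf of a chain sits in
its own TX slot. The demo_ types and functions below are stand-ins that
count frees instead of releasing memory, so the double free is visible
without being dangerous.

```c
#include <assert.h>
#include <stddef.h>

/* Mock mbuf: counts how often it was freed instead of releasing memory. */
struct demo_mbuf {
	struct demo_mbuf *m_next;
	int times_freed;
};

/* Like m_free(): release one mbuf and return its successor. */
static struct demo_mbuf *
demo_m_free(struct demo_mbuf *m)
{
	struct demo_mbuf *next = m->m_next;

	m->times_freed++;
	return (next);
}

/* Like m_freem(): release the whole chain starting at m. */
static void
demo_m_freem(struct demo_mbuf *m)
{
	while (m != NULL)
		m = demo_m_free(m);
}

/* Release every tracked slot, either per mbuf or per chain. */
static void
demo_release_slots(struct demo_mbuf **slots, size_t n, int whole_chain)
{
	for (size_t i = 0; i < n; i++) {
		if (slots[i] == NULL)
			continue;
		if (whole_chain)
			demo_m_freem(slots[i]);
		else
			(void)demo_m_free(slots[i]);
		slots[i] = NULL;
	}
}
```

With a three-mbuf chain whose members each occupy a slot, the per-mbuf
variant frees each mbuf once, while the chain variant frees the second
mbuf twice and the third three times.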

Approved by: re
MFC after: 1 week


# 283d6f72 10-Jun-2011 Justin T. Gibbs <gibbs@FreeBSD.org>

Monitor and emit events for XenStore changes to XenBus trees
of the devices we manage. These changes can be due to writes
we make ourselves or due to changes made by the control domain.
The goal of these changes is to ensure that all state transitions
can be detected regardless of their source and to allow common
device policies (e.g. "onlined" backend devices) to be centralized
in the XenBus bus code.

sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
Add a new method for XenBus drivers "localend_changed".
This method is invoked whenever a write is detected to
a device's XenBus tree. The default implementation of
this method is a no-op.

sys/xen/xenbus/xenbus_if.m:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
Change the signature of the "otherend_changed" method.
This notification cannot fail, so it should return void.

sys/xen/xenbus/xenbusb_back.c:
Add "online" device handling to the XenBus Back Bus
support code. An online backend device remains active
after a front-end detaches as a reconnect is expected
to occur in the near future.

sys/xen/interface/io/xenbus.h:
Add comment block further explaining the meaning and
driver responsibilities associated with the XenBus
Closed state.

sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
sys/xen/xenbus/xenbusb_back.c:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_if.m:
o Register a XenStore watch against the local XenBus tree
for all devices.
o Cache the string length of the path to our local tree.
o Allow the xenbus front and back drivers to hook/filter both
local and otherend watch processing.
o Update the device ivar version of "state" when we detect
a XenStore update of that node.

sys/dev/xen/control/control.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstorevar.h:
Allow clients of the XenStore watch mechanism to attach
a single uintptr_t worth of client data to the watch.
This removes the need to carefully place client watch
data within enclosing objects so that a cast or offsetof
calculation can be used to convert from watch to enclosing
object.
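A minimal sketch of the client-data approach, assuming a watch structure
with a callback and one uintptr_t field. The struct layout, field names,
and callback signature here are illustrative and are not copied from
xenstorevar.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watch object carrying one word of client data. */
struct demo_watch {
	const char *node;
	void (*callback)(struct demo_watch *);
	uintptr_t callback_data;
};

/* Hypothetical driver state; note that it does not embed the watch. */
struct demo_softc {
	int events_seen;
};

/* Recover the driver state directly from the client data word. */
static void
demo_on_change(struct demo_watch *w)
{
	struct demo_softc *sc = (struct demo_softc *)w->callback_data;

	assert(sc != NULL);
	sc->events_seen++;
}

static void
demo_watch_init(struct demo_watch *w, const char *node, struct demo_softc *sc)
{
	w->node = node;
	w->callback = demo_on_change;
	w->callback_data = (uintptr_t)sc;
}
```

Because the callback no longer depends on where the watch lives relative
to the softc, the watch can be allocated separately or shared without
any offsetof arithmetic.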

Sponsored by: Spectra Logic Corporation
MFC after: 1 week


# a0ae8f04 27-Apr-2011 Bjoern A. Zeeb <bz@FreeBSD.org>

Make various (pseudo) interfaces compile without INET in the kernel
adding appropriate #ifdefs. For module builds the framework needs
adjustments for at least carp.

Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days


# 8577146e 28-Jan-2011 Justin T. Gibbs <gibbs@FreeBSD.org>

Fix bug in the netfront driver that caused excessive packet drops during
receive processing.

Remove unnecessary restrictions on the mbuf chain length built during an
LRO receive. This restriction was copied from the Linux netfront driver
where the LRO implementation cannot handle more than 18 discontinuities.
The FreeBSD implementation has no such restriction.

MFC after: 1 week


# 2913e88c 04-Jan-2011 Robert Watson <rwatson@FreeBSD.org>

Make "options XENHVM" compile for i386, not just amd64 -- a largely
mechanical change. This opens the door for using PV device drivers
under Xen HVM on i386, as well as more general harmonisation of i386
and amd64 Xen support in FreeBSD.

Reviewed by: cperciva
MFC after: 3 weeks


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# ff662b5c 19-Oct-2010 Justin T. Gibbs <gibbs@FreeBSD.org>

Improve the Xen para-virtualized device infrastructure of FreeBSD:

o Add support for backend devices (e.g. blkback)
o Implement extensions to the Xen para-virtualized block API to allow
for larger and more outstanding I/Os.
o Import a completely rewritten block back driver with support for fronting
I/O to both raw devices and files.
o General cleanup and documentation of the XenBus and XenStore support code.
o Robustness and performance updates for the block front driver.
o Fixes to the netfront driver.

Sponsored by: Spectra Logic Corporation

sys/xen/xenbus/init.txt:
Deleted: This file explains the Linux method for XenBus device
enumeration and thus does not apply to FreeBSD's NewBus approach.

sys/xen/xenbus/xenbus_probe_backend.c:
Deleted: Linux version of backend XenBus service routines. It
was never ported to FreeBSD. See xenbusb.c, xenbusb_if.m,
xenbusb_front.c xenbusb_back.c for details of FreeBSD's XenBus
support.

sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus_xs.c:
sys/xen/xenbus/xenbus_comms.c:
sys/xen/xenbus/xenbus_comms.h:
sys/xen/xenstore/xenstorevar.h:
sys/xen/xenstore/xenstore.c:
Split XenStore into its own tree. XenBus is a software layer built
on top of XenStore. The old arrangement and the naming of some
structures and functions blurred these lines making it difficult to
discern what services are provided by which layer and at what times
these services are available (e.g. during system startup and shutdown).

sys/xen/xenbus/xenbus_client.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_probe.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
Split up XenBus code into methods available for use by client
drivers (xenbus.c) and code used by the XenBus "bus code" to
enumerate, attach, detach, and service bus drivers.

sys/xen/reboot.c:
sys/dev/xen/control/control.c:
Add a XenBus front driver for handling shutdown, reboot, suspend, and
resume events published in the XenStore. Move all PV suspend/reboot
support from reboot.c into this driver.

sys/xen/blkif.h:
New file from Xen vendor with macros and structures used by
a block back driver to service requests from a VM running a
different ABI (e.g. amd64 back with i386 front).

sys/conf/files:
Adjust kernel build spec for new XenBus/XenStore layout and added
Xen functionality.

sys/dev/xen/balloon/balloon.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/xen/xenbus/...
sys/xen/xenstore/...
o Rename XenStore APIs and structures from xenbus_* to xs_*.
o Adjust to use of M_XENBUS and M_XENSTORE malloc types for allocation
of objects returned by these APIs.
o Adjust for changes in the bus interface for Xen drivers.

sys/xen/xenbus/...
sys/xen/xenstore/...
Add Doxygen comments for these interfaces and the code that
implements them.

sys/dev/xen/blkback/blkback.c:
o Rewrite the Block Back driver to attach properly via newbus,
operate correctly in both PV and HVM mode regardless of domain
(e.g. can be in a DOM other than 0), and to deal with the latest
metadata available in XenStore for block devices.

o Allow users to specify a file as a backend to blkback, in addition
to character devices. Use the namei lookup of the backend path
to automatically configure, based on file type, the appropriate
backend method.

The current implementation is limited to a single outstanding I/O
at a time to file backed storage.

sys/dev/xen/blkback/blkback.c:
sys/xen/interface/io/blkif.h:
sys/xen/blkif.h:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
Extend the Xen blkif API: Negotiable request size and number of
requests.

This change extends the information recorded in the XenStore
allowing block front/back devices to negotiate for optimal I/O
parameters. This has been achieved without sacrificing backward
compatibility with drivers that are unaware of these protocol
enhancements. The extensions center around the connection protocol
which now includes these additions:

o The back-end device publishes its maximum supported values for
request I/O size, the number of page segments that can be
associated with a request, the maximum number of requests that
can be concurrently active, and the maximum number of pages that
can be in the shared request ring. These values are published
before the back-end enters the XenbusStateInitWait state.

o The front-end waits for the back-end to enter either the InitWait
or Initialize state. At this point, the front end limits its
own capabilities to the lesser of the values it finds published
by the backend, its own maximums, or, should any back-end data
be missing in the store, the values supported by the original
protocol. It then initializes its internal data structures
including allocation of the shared ring, publishes its maximum
capabilities to the XenStore and transitions to the Initialized
state.

o The back-end waits for the front-end to enter the Initialized
state. At this point, the back end limits its own capabilities
to the lesser of the values it finds published by the frontend,
its own maximums, or, should any front-end data be missing in
the store, the values supported by the original protocol. It
then initializes its internal data structures, attaches to the
shared ring and transitions to the Connected state.

o The front-end waits for the back-end to enter the Connected
state, transitions itself to the connected state, and can
commence I/O.

Although an updated front-end driver must be aware of the back-end's
InitWait state, the back-end has been coded such that it can
tolerate a front-end that skips this step and transitions directly
to the Initialized state without waiting for the back-end.

sys/xen/interface/io/blkif.h:
o Increase BLKIF_MAX_SEGMENTS_PER_REQUEST to 255. This is
the maximum number possible without changing the blkif
request header structure (nr_segs is a uint8_t).

o Add two new constants:
BLKIF_MAX_SEGMENTS_PER_HEADER_BLOCK, and
BLKIF_MAX_SEGMENTS_PER_SEGMENT_BLOCK. These respectively
indicate the number of segments that can fit in the first
ring-buffer entry of a request, and for each subsequent
(sg element only) ring-buffer entry associated with the
"header" ring-buffer entry of the request.

o Add the blkif_request_segment_t typedef for segment
elements.

o Add the BLKRING_GET_SG_REQUEST() macro which wraps the
RING_GET_REQUEST() macro and returns a properly cast
pointer to an array of blkif_request_segment_ts.

o Add the BLKIF_SEGS_TO_BLOCKS() macro which calculates the
number of ring entries that will be consumed by a blkif
request with the given number of segments.
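The segment-to-ring-entry calculation described for
BLKIF_SEGS_TO_BLOCKS() can be sketched as a function. The per-block
capacities used here (11 segments in the header block, 14 in each
segment-only block) are assumed values chosen for the example, and the
function is a stand-in rather than the macro's actual definition.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed capacities; the real constants live in the blkif header. */
#define DEMO_SEGS_PER_HEADER_BLOCK	11u
#define DEMO_SEGS_PER_SEGMENT_BLOCK	14u
#define DEMO_MAX_SEGS_PER_REQUEST	255u

/* nr_segs is a uint8_t in the request header, hence the 255 ceiling. */
static_assert(DEMO_MAX_SEGS_PER_REQUEST <= UINT8_MAX,
    "segment count must fit in nr_segs");

/* Number of ring entries consumed by a request with nsegs segments. */
static unsigned
demo_segs_to_blocks(unsigned nsegs)
{
	assert(nsegs <= DEMO_MAX_SEGS_PER_REQUEST);
	if (nsegs <= DEMO_SEGS_PER_HEADER_BLOCK)
		return (1);
	/* One header block plus enough segment blocks, rounded up. */
	return (1 + (nsegs - DEMO_SEGS_PER_HEADER_BLOCK +
	    DEMO_SEGS_PER_SEGMENT_BLOCK - 1) / DEMO_SEGS_PER_SEGMENT_BLOCK);
}
```

Under these assumed capacities a maximal 255-segment request occupies 19
ring entries, which is why the ring itself has to grow to multiple pages
to keep a useful number of requests in flight.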

sys/xen/blkif.h:
o Update for changes in interface/io/blkif.h macros.

o Update the BLKIF_MAX_RING_REQUESTS() macro to take the
ring size as an argument to allow this calculation on
multi-page rings.

o Add a companion macro to BLKIF_MAX_RING_REQUESTS(),
BLKIF_RING_PAGES(). This macro determines the number of
ring pages required in order to support a ring with the
supplied number of request blocks.

sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
o Negotiate with the other-end with the following limits:
Request Size: MAXPHYS
Max Segments: (MAXPHYS/PAGE_SIZE) + 1
Max Requests: 256
Max Ring Pages: Sufficient to support Max Requests with
Max Segments.

o Dynamically allocate request pools and segments-per-request.

o Update ring allocation/attachment code to support a
multi-page shared ring.

o Update routines that access the shared ring to handle
multi-block requests.

sys/dev/xen/blkfront/blkfront.c:
o Track blkfront allocations in a blkfront driver specific
malloc pool.

o Strip out XenStore transaction retry logic in the
connection code. Transactions only need to be used when
the update to multiple XenStore nodes must be atomic.
That is not the case here.

o Fully disable blkif_resume() until it can be fixed
properly (it didn't work before this change).

o Destroy bus-dma objects during device instance tear-down.

o Properly handle backend devices with power-of-2 sector
sizes larger than 512b.

sys/dev/xen/blkback/blkback.c:
Advertise support for and implement the BLKIF_OP_WRITE_BARRIER
and BLKIF_OP_FLUSH_DISKCACHE blkif opcodes using BIO_FLUSH and
the BIO_ORDERED attribute of bios.

sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
Fix various bugs in blkfront.

o gnttab_alloc_grant_references() returns 0 for success and
non-zero for failure. The check for < 0 is a leftover
Linuxism.

o When we negotiate with blkback and have to reduce some of our
capabilities, print out the original and reduced capability before
changing the local capability. So the user now gets the correct
information.

o Fix blkif_restart_queue_callback() formatting. Make sure we hold
the mutex in that function before calling xb_startio().

o Fix a couple of KASSERT()s.

o Fix a check in the xb_remove_* macro to be a little more specific.

sys/xen/gnttab.h:
sys/xen/gnttab.c:
Define GNTTAB_LIST_END publicly as GRANT_REF_INVALID.

sys/dev/xen/netfront/netfront.c:
Use GRANT_REF_INVALID instead of driver private definitions of the
same constant.

sys/xen/gnttab.h:
sys/xen/gnttab.c:
Add the gnttab_end_foreign_access_references() API.

This API allows a client to batch the release of an array of grant
references, instead of coding a private for loop. The implementation
takes advantage of this batching to reduce lock overhead to one
acquisition and release per-batch instead of per-freed grant reference.

While here, reduce the duration the gnttab_list_lock is held during
gnttab_free_grant_references() operations. The search to find the
tail of the incoming free list does not rely on global state and so
can be performed without holding the lock.
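The lock-overhead argument for the batched API can be made concrete with
a small model that counts lock acquisitions. The table structure and
demo_ functions are hypothetical and do not reproduce the grant table
implementation; they only contrast a caller-side loop with one batched
call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the grant table that counts lock traffic. */
struct demo_gnttab {
	unsigned lock_acquisitions;
	unsigned refs_freed;
};

static void
demo_lock(struct demo_gnttab *g)
{
	g->lock_acquisitions++;
}

static void
demo_unlock(struct demo_gnttab *g)
{
	(void)g;
}

/* Release a single reference: one lock round trip per call. */
static void
demo_end_access_ref(struct demo_gnttab *g, uint32_t ref)
{
	(void)ref;
	demo_lock(g);
	g->refs_freed++;
	demo_unlock(g);
}

/* Release an array of references under a single lock acquisition. */
static void
demo_end_access_refs(struct demo_gnttab *g, const uint32_t *refs, size_t n)
{
	assert(refs != NULL || n == 0);
	demo_lock(g);
	for (size_t i = 0; i < n; i++) {
		(void)refs[i];
		g->refs_freed++;
	}
	demo_unlock(g);
}
```

Releasing 16 references through the loop takes the lock 16 times; the
batched call takes it once for the same result.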

sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/evtchn/evtchn.c:
sys/xen/xen_intr.h:
o Implement the bind_interdomain_evtchn_to_irqhandler API for HVM mode.
This allows an HVM domain to serve back end devices to other domains.
This API is already implemented for PV mode.

o Synchronize the API between HVM and PV.

sys/dev/xen/xenpci/xenpci.c:
o Scan the full region of CPUID space in which the Xen VMM interface
may be implemented. On systems using SuSE as a Dom0 where the
Viridian API is also exported, the VMM interface is above the region
we used to search.

o Pass through bus_alloc_resource() calls so that XenBus drivers
attaching on an HVM system can allocate unused physical address
space from the nexus. The block back driver makes use of this
facility.

sys/i386/xen/xen_machdep.c:
Use the correct type for accessing the statically mapped xenstore
metadata.

sys/xen/interface/hvm/params.h:
sys/xen/xenstore/xenstore.c:
Move hvm_get_parameter() to the correct global header file instead
of as a private method to the XenStore.

sys/xen/interface/io/protocols.h:
Sync with vendor.

sys/xen/interface/io/ring.h:
Add macro for calculating the number of ring pages needed for an N
deep ring.

To avoid duplication within the macros, create and use the new
__RING_HEADER_SIZE() macro. This macro calculates the size of the
ring bookkeeping struct (producer/consumer indexes, etc.) that
resides at the head of the ring.

Add the __RING_PAGES() macro which calculates the number of shared
ring pages required to support a ring with the given number of
requests.

These APIs are used to support the multi-page ring version of the
Xen block API.
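
Illustrative note: the page-count arithmetic described above might look
roughly like the sketch below. The SKETCH_* macros and struct layouts
are hypothetical stand-ins, not the definitions in ring.h, and the 4096
byte page size and 112 byte entry size are arbitrary assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical ring entry; the size is arbitrary. */
struct sketch_entry {
    uint8_t bytes[112];
};

/* Hypothetical shared ring: bookkeeping header followed by the entries. */
struct sketch_sring {
    uint32_t req_prod, req_event;
    uint32_t rsp_prod, rsp_event;
    uint8_t  pad[48];
    struct sketch_entry ring[1];   /* really variable length */
};

/* Size of the bookkeeping that precedes the first ring entry. */
#define SKETCH_RING_HEADER_SIZE(type) offsetof(type, ring)

/* Pages needed for nr_entries: header plus entries, rounded up. */
#define SKETCH_RING_PAGES(type, entry_type, nr_entries)               \
    ((SKETCH_RING_HEADER_SIZE(type) +                                 \
      sizeof(entry_type) * (nr_entries) +                             \
      SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE)
```

With these assumed sizes the header is 64 bytes, so 36 entries fit in
one page exactly and a 37th entry pushes the ring onto a second page.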

sys/xen/interface/io/xenbus.h:
Add comments.

sys/xen/xenbus/...
o Refactor the FreeBSD XenBus support code to allow for both front and
backend device attachments.

o Make use of new config_intr_hook capabilities to allow front and back
devices to be probed/attached in parallel.

o Fix bugs in probe/attach state machine that could cause the system to
hang when confronted with a failure either in the local domain or in
a remote domain to which one of our driver instances is attaching.

o Publish all required state to the XenStore on device detach and
failure. The majority of the missing functionality was for serving
as a back end since the typical "hot-plug" scripts in Dom0 don't
handle the case of cleaning up for a "service domain" that is not
Dom0 itself.

o Add dynamic sysctl nodes exposing the generic ivars of
XenBus devices.

o Add doxygen style comments to the majority of the code.

o Clean up types, formatting, etc.

sys/xen/xenbus/xenbusb.c:
Common code used by both front and back XenBus busses.

sys/xen/xenbus/xenbusb_if.m:
Method definitions for a XenBus bus.

sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_back.c:
XenBus bus specialization for front and back devices.

MFC after: 1 month


# 7c049a85 11-Jun-2010 Kenneth D. Merry <ken@FreeBSD.org>

MFC 199549, 199997, 204158, 207673, and 208901.

Bring in a number of netfront changes:

r199549 | jhb

Remove commented out reference to if_watchdog and an assignment of zero to
if_timer.

Reviewed by: scottl

r199997 | gibbs

Add media ioctl support and link notifications so that devd will attempt
to run dhclient on a netfront (xn) device that is set up for DHCP in
/etc/rc.conf.

PR: kern/136251 (fixed differently than the submitted patch)

r204158 | kmacy

- make printf conditional
- fix witness warnings by making configuration lock a mutex

r207673 | joel

Switch to our preferred 2-clause BSD license.

Approved by: kmacy

r208901 | ken

A number of netfront fixes and stability improvements:

- Re-enable TSO. This was broken previously due to CSUM_TSO clearing the
CSUM_TCP flag, so our checksum flags were incorrectly set going to the
netback driver. That was fixed in r206844 in tcp_output.c, so we can
turn TSO back on here.

- Fix the way transmit slots are calculated, so that we can't overfill
the ring.

- Avoid sending packets with more fragments/segments than netback can
handle. The Linux netback code can only handle packets of
MAX_SKB_FRAGS, which turns out to be 18 on machines with 4K pages. We
can easily generate packets with 32 or so fragments with TSO turned on.
Right now the solution is just to drop the packets (since netback
doesn't seem to handle it gracefully), but we should come up with a way
to allow a driver to tell the TCP stack the maximum number of fragments
it can handle in a single packet.

- Fix the way the consumer is tracked in the receive path. It could get
out of sync fairly easily.

- Use standard Xen ring macros to make it clearer how netfront is using
the rings.

- Get rid of Linux-ish negative errno return values.

- Added more documentation to the driver.

- Refactored code to make it easier to read.

- Some other minor fixes.

Reviewed by: gibbs
Sponsored by: Spectra Logic

Approved by: re (bz)


# 931eeffa 07-Jun-2010 Kenneth D. Merry <ken@FreeBSD.org>

A number of netfront fixes and stability improvements:

- Re-enable TSO. This was broken previously due to CSUM_TSO clearing the
CSUM_TCP flag, so our checksum flags were incorrectly set going to the
netback driver. That was fixed in r206844 in tcp_output.c, so we can
turn TSO back on here.

- Fix the way transmit slots are calculated, so that we can't overfill
the ring.

- Avoid sending packets with more fragments/segments than netback can
handle. The Linux netback code can only handle packets of
MAX_SKB_FRAGS, which turns out to be 18 on machines with 4K pages. We
can easily generate packets with 32 or so fragments with TSO turned on.
Right now the solution is just to drop the packets (since netback
doesn't seem to handle it gracefully), but we should come up with a way
to allow a driver to tell the TCP stack the maximum number of fragments
it can handle in a single packet.

- Fix the way the consumer is tracked in the receive path. It could get
out of sync fairly easily.

- Use standard Xen ring macros to make it clearer how netfront is using
the rings.

- Get rid of Linux-ish negative errno return values.

- Added more documentation to the driver.

- Refactored code to make it easier to read.

- Some other minor fixes.

Reviewed by: gibbs
Sponsored by: Spectra Logic
MFC after: 7 days
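
Illustrative note: the fragment limit and the transmit slot calculation
described in this commit can be sketched as below. This is a minimal
model under stated assumptions, not the driver code. struct sketch_buf
is a stand-in for an mbuf chain, every sketch_* name is hypothetical,
and the producer/consumer indexes are assumed to be free-running
unsigned counters. The value 18 is the limit quoted above for machines
with 4K pages.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_FRAGS 18   /* limit quoted in the log for 4K pages */

/* Minimal stand-in for a packet buffer chain; not the kernel's mbuf. */
struct sketch_buf {
    struct sketch_buf *next;
    size_t len;
};

/* Count the buffers (fragments) that make up one packet. */
static size_t
sketch_count_frags(const struct sketch_buf *b)
{
    size_t n = 0;

    for (; b != NULL; b = b->next)
        n++;
    return (n);
}

/* Free slots on a ring with free-running producer/consumer indexes. */
static unsigned int
sketch_tx_free_slots(unsigned int ring_size, unsigned int prod,
    unsigned int cons)
{
    /* Unsigned subtraction stays correct when the indexes wrap. */
    assert(prod - cons <= ring_size);
    return (ring_size - (prod - cons));
}

/* Returns 1 if the packet may be queued, 0 if it must not be. */
static int
sketch_can_queue(const struct sketch_buf *pkt, unsigned int ring_size,
    unsigned int prod, unsigned int cons)
{
    size_t nfrags = sketch_count_frags(pkt);

    if (nfrags > SKETCH_MAX_FRAGS)
        return (0);   /* more fragments than the backend accepts */
    if (nfrags > sketch_tx_free_slots(ring_size, prod, cons))
        return (0);   /* queueing it would overfill the ring */
    return (1);
}
```

In this model a 19-fragment packet is always refused, while an
18-fragment packet is accepted only when at least 18 slots are free.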


# 8e0ad55a 05-May-2010 Joel Dahl <joel@FreeBSD.org>

Switch to our preferred 2-clause BSD license.

Approved by: kmacy


# 227ca257 20-Feb-2010 Kip Macy <kmacy@FreeBSD.org>

- make printf conditional
- fix witness warnings by making configuration lock a mutex


# 0e509842 01-Dec-2009 Justin T. Gibbs <gibbs@FreeBSD.org>

Add media ioctl support and link notifications so that devd will attempt
to run dhclient on a netfront (xn) device that is set up for DHCP in
/etc/rc.conf.

PR: kern/136251 (fixed differently than the submitted patch)


# dbd69bc5 19-Nov-2009 John Baldwin <jhb@FreeBSD.org>

Remove commented out reference to if_watchdog and an assignment of zero to
if_timer.

Reviewed by: scottl


# cfed3783 13-Jun-2009 Kip Macy <kmacy@FreeBSD.org>

update backend_changed to reflect .m prototype


# 8cb07992 06-Jun-2009 Adrian Chadd <adrian@FreeBSD.org>

Fix compilation when compiled w/out WITNESS.

Submitted by: Edwin Shao <poleris@gmail.com>


# 3552092b 27-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Delete useless #ifdef; make it more obvious if setting TSO fails.


# d76e4550 27-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Clear IFF_DRV_OACTIVE if at least one TX xen/mbuf ring slot has been freed.


# 7c66482c 27-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Enforce that there are actually enough xenbus TX ring descriptors available
before attempting to queue the packet.


# 3fb28bbb 26-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Comment tidyup; comment where the next explicit check should
appear.


# a4ec37f5 26-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Ensure that there are enough TX mbuf ring slots available before beginning
to dequeue a packet.

The tx path was trying to ensure that enough Xenbus TX ring slots existed but
it didn't check to see whether the mbuf TX ring slots were also available.
They get freed in xn_txeof() which occurs after transmission, rather than earlier
on in the process. (The same happens under Linux too.)

Due to whatever reason (CPU use, scheduling, memory constraints, whatever) the
mbuf TX ring may not have enough slots free and would allocate slot 0. This is
used as the freelist head pointer to represent "free" mbuf TX ring slots; setting
this to an actual mbuf value rather than an id crashes the code.

This commit introduces some basic code to track the TX mbuf ring use and then
(hopefully!) ensures that enough slots are free in said TX mbuf ring before it
enters the actual work loop.

A few notes:

* Similar logic needs to be introduced to check there are enough actual slots
available in the xenbuf TX ring. There's some logic which is invoked earlier
but it doesn't hard-check against the number of available ring slots.
It's trivial to do; I'll do it in a subsequent commit.

* As I've now commented in the source, it is likely possible to deadlock the
driver under certain conditions where the rings aren't receiving any changes
(which I should enumerate) and thus Xen doesn't send any further software
interrupts. I need to make sure that the timer(s) are running right and
the queues are periodically kicked.

PR: 134926
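
Illustrative note: the slot accounting described in this commit could
be sketched along the following lines. The names (sketch_txq,
sketch_reserve, sketch_release) and the ring size of 256 are
assumptions made for the example and are not taken from the driver.

```c
#include <assert.h>

#define SKETCH_TX_SLOTS 256   /* assumed ring size; slot 0 is reserved */

struct sketch_txq {
    unsigned int used;   /* packet slots currently holding a buffer */
};

static unsigned int
sketch_slots_free(const struct sketch_txq *q)
{
    /* Slot 0 is the freelist head and never carries a packet. */
    return ((SKETCH_TX_SLOTS - 1) - q->used);
}

/* Returns 1 and reserves the slots, or returns 0 and changes nothing. */
static int
sketch_reserve(struct sketch_txq *q, unsigned int nfrags)
{
    if (nfrags > sketch_slots_free(q))
        return (0);   /* leave the packet on the send queue for now */
    q->used += nfrags;
    return (1);
}

/* Called from the completion path once the backend is done with them. */
static void
sketch_release(struct sketch_txq *q, unsigned int nfrags)
{
    assert(nfrags <= q->used);
    q->used -= nfrags;
}
```

The point of checking before dequeuing is that a refused packet stays
queued and can be retried after the completion path releases slots.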


# 2d8fae98 26-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Do the invariant check before the mbuf is dereferenced.


# c099cafa 26-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Flesh out some inline documentation which hopefully reflects the intended
reality of these functions.


# 0e6993e4 26-May-2009 Adrian Chadd <adrian@FreeBSD.org>

Add in some INVARIANT checks in the TX mbuf descriptor "freelist" management code.

Slot 0 must always remain "free" and be a pointer to the first free entry in the
mbuf descriptor list. It is thus an error to have code allocate or push slot 0
back into the list.
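
Illustrative note: a freelist with slot 0 as the permanent head, plus
the kind of invariant checks described above, can be sketched as
follows. The union, the sketch_* names and the list size of 8 are
hypothetical; the driver's own types and helpers differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSLOTS 8   /* slot 0 is the head, slots 1..7 hold packets */

/* A slot holds either a packet pointer (in use) or the next free id. */
union sketch_slot {
    void     *pkt;
    uintptr_t next_free;
};

static void
sketch_freelist_init(union sketch_slot *list)
{
    uintptr_t i;

    /* 0 -> 1 -> 2 -> ... -> last -> 0; a next id of 0 means "empty". */
    for (i = 0; i < SKETCH_NSLOTS; i++)
        list[i].next_free = (i + 1) % SKETCH_NSLOTS;
}

static uintptr_t
sketch_get_id(union sketch_slot *list)
{
    uintptr_t id = list[0].next_free;

    assert(id != 0);             /* never hand out the head slot */
    assert(id < SKETCH_NSLOTS);
    list[0].next_free = list[id].next_free;
    return (id);
}

static void
sketch_put_id(union sketch_slot *list, uintptr_t id)
{
    assert(id != 0);             /* never push the head slot back */
    assert(id < SKETCH_NSLOTS);
    list[id].next_free = list[0].next_free;
    list[0].next_free = id;
}
```

Without the first assertion, an exhausted list would return id 0 and
the caller would store a packet pointer over the head of the freelist.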


# 3a539122 17-May-2009 Adrian Chadd <adrian@FreeBSD.org>

The merge in r189699 reverted part of the work done in a previous commit
(r188036.)

Re-revert that change so that Xen networking functions again.


# 12678024 11-Mar-2009 Doug Rabson <dfr@FreeBSD.org>

Merge in support for Xen HVM on amd64 architecture.


# 532700bd 05-Feb-2009 Kip Macy <kmacy@FreeBSD.org>

fix non-witness compile


# a392a271 02-Feb-2009 Kip Macy <kmacy@FreeBSD.org>

break out of loop if we run out of mbufs


# 3a6d1fcf 28-Dec-2008 Kip Macy <kmacy@FreeBSD.org>

merge 186535, 186537, and 186538 from releng_7_xen

Log:
- merge in latest xenbus from dfr's xenhvm
- fix race condition in xs_read_reply by converting tsleep to mtx_sleep

Log:
unmask evtchn in bind_{virq, ipi}_to_irq

Log:
- remove code for handling case of not being able to sleep
- eliminate tsleep - make sleeps atomic


# 23dc5621 04-Dec-2008 Kip Macy <kmacy@FreeBSD.org>

Integrate 185578 from dfr
Use newbus to manage devices


# 49906218 29-Nov-2008 Doug Rabson <dfr@FreeBSD.org>

Don't call ether_ioctl() with locks held. Loop in xn_rxeof() until the backend
stops adding stuff to the ring otherwise we miss RX interrupts which kills
performance.
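
Illustrative note: the "loop until the backend stops adding stuff"
pattern can be sketched as below. This is a single-threaded model, not
the xn_rxeof() code. The struct, the sketch_* names and the backend
hook are hypothetical, and a real implementation would also need a
memory barrier between publishing rsp_event and the final re-check.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_rx_ring {
    unsigned int rsp_prod;    /* advanced by the backend */
    unsigned int rsp_cons;    /* private to the frontend */
    unsigned int rsp_event;   /* backend notifies when rsp_prod reaches this */
};

/* Hook standing in for a backend that runs concurrently. */
typedef void (*sketch_backend_fn)(struct sketch_rx_ring *);

/* Example backend: adds two responses the first time it is called. */
static unsigned int sketch_late_bursts = 1;

static void
sketch_late_backend(struct sketch_rx_ring *r)
{
    if (sketch_late_bursts > 0) {
        sketch_late_bursts--;
        r->rsp_prod += 2;
    }
}

/* Consume responses until none arrive during the final re-check. */
static unsigned int
sketch_rx_drain(struct sketch_rx_ring *r, sketch_backend_fn backend)
{
    unsigned int handled = 0;
    int more;

    do {
        while (r->rsp_cons != r->rsp_prod) {
            r->rsp_cons++;        /* "process" one response */
            handled++;
        }
        /* Ask to be notified about the next response... */
        r->rsp_event = r->rsp_cons + 1;
        if (backend != NULL)
            backend(r);           /* backend may add work right here */
        /* ...then look again, since that work may not be signalled. */
        more = (r->rsp_cons != r->rsp_prod);
    } while (more);
    return (handled);
}
```

In this model the two late responses are picked up by the second pass
of the loop instead of waiting for a notification that may never come.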


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 920ba15b 25-Sep-2008 Kip Macy <kmacy@FreeBSD.org>

Update xen/interface includes to the latest in mercurial

MFC after: 1 month


# 646787d9 25-Sep-2008 Kip Macy <kmacy@FreeBSD.org>

reflect header change in netfront

MFC after: 1 month


# 83b92f6e 20-Aug-2008 Kip Macy <kmacy@FreeBSD.org>

For reasons that I have not delved into, Xen 3.2 netback now does header splitting,
so packets > 128 bytes are split into multiple buffers. This fixes netfront
to handle multiple buffers per rx packet.

MFC after: 1 month


# 6ae0e31b 20-Aug-2008 Kip Macy <kmacy@FreeBSD.org>

change netfront to match xen31_6
fix console locking


# 980c7178 20-Aug-2008 Kip Macy <kmacy@FreeBSD.org>

include vmparam.h for KERNBASE and fix typo


# 7a5048f1 20-Aug-2008 Kip Macy <kmacy@FreeBSD.org>

register netfront before xenbus does its probing

MFC after: 1 month


# 89e0f4d2 12-Aug-2008 Kip Macy <kmacy@FreeBSD.org>

Import Xen paravirtual drivers.

MFC after: 2 weeks