History log of /freebsd-current/sys/cam/ctl/ctl_tpc.c
Revision Date Author Comments
# 2ffd30f7 06-Nov-2023 Warner Losh <imp@FreeBSD.org>

cam: Remove left-over sys/cdefs.h in sys/cam

These weren't removed when $FreeBSD$ was removed. They aren't needed and
now are a style(9) nonconformity.

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
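A minimal sketch of what this pattern removes, assuming Python's `re` module in `MULTILINE` mode behaves like whatever tool ran the original substitution (the commit does not name it). `strip_fbsdid` and the sample text are hypothetical, for illustration only:

```python
import re

# The pattern from the commit message, compiled so that ^ matches at the
# start of every line (assumption: this mirrors the original tool's mode).
FBSDID_LINE = re.compile(r'^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n', re.MULTILINE)

def strip_fbsdid(source):
    """Return source with one-line __FBSDID("$FreeBSD$") statements removed."""
    return FBSDID_LINE.sub("", source)

# Hypothetical file fragment, not taken from ctl_tpc.c.
sample = (
    '#include <sys/cdefs.h>\n'
    '__FBSDID("$FreeBSD$");\n'
    '\n'
    '#include <sys/param.h>\n'
)
cleaned = strip_fbsdid(sample)
print(cleaned)
```

Because `\s*` also matches newlines, the blank line after the macro is removed together with it, while the `sys/cdefs.h` include is left in place.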


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# f4d499fd 07-Jan-2022 Alexander Motin <mav@FreeBSD.org>

CTL: Relax callouts precisions.

MFC after: 2 weeks


# b06771aa 29-Dec-2021 Alexander Motin <mav@FreeBSD.org>

CTL: Allow I/Os up to 8MB, depending on maxphys value.

For years the CTL block backend limited I/O size to 1MB, splitting larger
requests into sequentially processed chunks. That is sufficient for
most use cases, since typical initiators rarely use bigger I/Os.

One known exception is VMware VAAI offload, which by default sends up
to 8 4MB EXTENDED COPY requests at the same time. CTL internally
converted those into 32 1MB READ/WRITE requests, which could overwhelm
the block backend, since it has a finite number of processing threads,
and make more important interactive I/Os wait in its queue. Previously
this was partially covered by CTL core serializing sequential reads to
help the ZFS speculative prefetcher, but that serialization was
significantly relaxed after recent ZFS improvements.

With the new settings the block backend receives 8 4MB requests, which
should be easier for both CTL itself and the underlying storage.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
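The request counts in the message above follow from simple chunking arithmetic. The sketch below is illustrative only: `split_request` and `backend_requests` are hypothetical helpers, not CTL functions, and the 8MB limit assumes maxphys is large enough to allow it.

```python
MIB = 1024 * 1024

def split_request(length, max_io):
    """Split one request of `length` bytes into (offset, size) chunks
    of at most `max_io` bytes each."""
    chunks = []
    offset = 0
    while offset < length:
        size = min(max_io, length - offset)
        chunks.append((offset, size))
        offset += size
    return chunks

def backend_requests(num_copies, copy_size, max_io):
    """Count the backend requests produced by `num_copies` concurrent copies."""
    return sum(len(split_request(copy_size, max_io)) for _ in range(num_copies))

# 8 concurrent 4MB EXTENDED COPY requests, old 1MB limit vs. new 8MB limit.
old = backend_requests(8, 4 * MIB, 1 * MIB)
new = backend_requests(8, 4 * MIB, 8 * MIB)
print(old, new)
```

With the 1MB limit each 4MB copy becomes four chunks, giving 32 backend requests; with the 8MB limit each copy fits in a single request, giving 8.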


# a9bd2281 25-Feb-2021 Alexander Motin <mav@FreeBSD.org>

Remove pointless lun->be_lun checks.

There is no such thing as LUN without backend, at least for years.

MFC after: 1 week


# 812c9f48 18-Feb-2021 Alexander Motin <mav@FreeBSD.org>

Save context switch per I/O for iSCSI and IOCTL frontends.

Introduce new CTL core KPI ctl_run(), preprocessing I/Os in the caller
context instead of scheduling another thread just for that. This call
may sleep, that is not acceptable for some frontends like the original
CAM/FC one, but iSCSI already has separate sleepable per-connection RX
threads, and another thread scheduling is mostly just a waste of time.
The IOCTL frontend actually waits for the I/O completion in the caller
thread, so the use of another thread for this makes even less sense.

With this change I can measure ~5% IOPS improvement on 4KB iSCSI I/Os
to ZFS.

MFC after: 1 month


# 27dcd3d9 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

cam: clean up empty lines in .c and .h files


# 8951f055 09-May-2018 Marcelo Araujo <araujo@FreeBSD.org>

Rework CTL frontend & backend options to use nv(3), allow creating multiple
ioctl frontend ports.

This revision introduces two changes to CTL:
- Changes the way options are passed to CTL_LUN_REQ and CTL_PORT_REQ ioctls.
  Removes ctl_be_arg structure and associated logic and replaces it with
  nv(3)-based logic for passing in and out arguments.
- Allows creating multiple ioctl frontend ports using either ctladm(8) or
  ctld(8).
  New frontend ports are represented by /dev/cam/ctl<pp>.<vp> nodes, e.g. /dev/cam/ctl5.3.
  Those device nodes respond only to CTL_IO ioctl.

New command-line options for ctladm:
# creates new ioctl frontend port using a free pp and vp=0
ctladm port -c
# creates new ioctl frontend port with pp=10 and vp=0
ctladm port -c -O pp=10
# creates new ioctl frontend port with pp=11 and vp=12
ctladm port -c -O pp=11 -O vp=12
# removes port with number 4 (it's a "targ_port" number, not pp number)
ctladm port -r -p 4

New syntax for ctl.conf:
target ... {
port ioctl/<pp>
...
}

target ... {
port ioctl/<pp>/<vp>
...
}

Note: Most of this work was made by jceel@, thank you.

Submitted by: jceel
Reworked by: myself
Reviewed by: mav (earlier versions and recently during the rework)
Obtained from: FreeNAS and TrueOS
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D9299


# f24882ec 16-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

SPDX: finish tagging sys/cam.


# 9ff948d0 27-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Polish handling of different reset flavours.

The biggest change is that ctl_remove_initiator() now generates I_T NEXUS
LOSS event, cleaning part of LUs state related to the initiator.

MFC after: 2 weeks


# 46511441 17-Feb-2017 Alexander Motin <mav@FreeBSD.org>

Change XCOPY memory allocations.

Before this change XCOPY code could allocate memory in chunks up to 16-32MB
(VMware does XCOPY in 4MB chunks by default), which could be difficult for
the VM subsystem due to KVA fragmentation and sometimes created huge
allocation delays, blocking any I/O for the respective LU for that time.

This change limits allocations to TPC_MAX_IO_SIZE, which is 1MB now.
1MB is not small either, but ZFS can also do that for large blocks, so
it should be less dramatic. As a drawback this increases CPU overhead, but
it still looks acceptable compared to the time consumed by ZFS read/write.

MFC after: 1 week


# 640603fb 17-Jan-2017 Alexander Motin <mav@FreeBSD.org>

Remove writing 'residual' field of struct ctl_scsiio.

This field has no practical use and is never read. Initiators already
receive the respective residual size from frontends. The removed field had
different semantics, which look useless, and was never passed through
by any frontend.

While there, fix kern_data_resid field support in case of HA, missed in
r312291.

MFC after: 13 days


# eb6ac6f9 16-Jan-2017 Alexander Motin <mav@FreeBSD.org>

Make CTL frontends report kern_data_resid for under-/overruns.

It seems like kern_data_resid was never really implemented. This change
finally does it. Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.

CAM target frontend still does not properly handle overruns due to CAM API
limitations. We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).

MFC after: 2 weeks


# 9cbbfd2f 29-Dec-2016 Alexander Motin <mav@FreeBSD.org>

Improve use of I/O's private area.

- Since I/Os are allocated from per-port pools, make allocations store
pointer to CTL softc there, and use it where needed instead of global.
- Create a bunch of helper macros to access LUN, port and CTL softc.

MFC after: 2 weeks


# 41243159 25-Dec-2016 Alexander Motin <mav@FreeBSD.org>

Remove CTL_MAX_LUNS from places where it is not required.

MFC after: 2 weeks


# a3dd8378 25-Dec-2016 Alexander Motin <mav@FreeBSD.org>

Improve third-party copy error reporting.

For EXTENDED COPY:
- improve parameters checking to report some errors before copy start;
- forward sense data from copy target as descriptor in case of error;
- report which CSCD reported error in sense key specific information.
For WRITE USING TOKEN:
- pass through real sense data from copy target instead of reporting
our copy error, since for the initiator it's a "simple" write, not a copy.

MFC after: 2 weeks


# 32920cbf 19-Dec-2016 Alexander Motin <mav@FreeBSD.org>

When reporting "Logical block address out of range" error, report the LBA
in sense data INFORMATION field.

MFC after: 2 weeks


# 8fadf660 10-May-2016 Alexander Motin <mav@FreeBSD.org>

Fix previous commit to report proper error code.

MFC after: 2 weeks


# 38618bf4 10-May-2016 Alexander Motin <mav@FreeBSD.org>

Validate XCOPY range offsets and lengths.

MFC after: 2 weeks


# e13f4248 10-May-2016 Alexander Motin <mav@FreeBSD.org>

More XCOPY parameters validation.

MFC after: 2 weeks


# 3eb7651a 10-May-2016 Alexander Motin <mav@FreeBSD.org>

Improve validation of some POPULATE TOKEN parameters.

MFC after: 2 weeks


# 0952a19f 01-Oct-2015 Alexander Motin <mav@FreeBSD.org>

More aggressively fill WUT read pipeline.

On some tests I've measured 5% copy speedup from this.


# 6ac1446d 01-Oct-2015 Alexander Motin <mav@FreeBSD.org>

Make zero WUT use WRITE SAME with recently allowed NDOB flag.


# 862aedb0 28-Sep-2015 Alexander Motin <mav@FreeBSD.org>

Fix arguments order.


# 042e9bdc 17-Sep-2015 Alexander Motin <mav@FreeBSD.org>

Report number of failed XCOPY segment.


# a65a997f 12-Sep-2015 Alexander Motin <mav@FreeBSD.org>

Improve XCOPY error reporting.


# 238b6b7c 12-Sep-2015 Alexander Motin <mav@FreeBSD.org>

Report that we have no limit on POPULATE TOKEN segment size.


# 7ac58230 09-Sep-2015 Alexander Motin <mav@FreeBSD.org>

Reimplement CTL High Availability.

CTL HA functionality was originally implemented by Copan many years ago,
but large part of the sources was never published. This change includes
clean room implementation of the missing code and fixes for many bugs.

This code supports dual-node HA with ALUA in four modes:
- Active/Unavailable without interlink between nodes;
- Active/Standby with second node handling only basic LUN discovery and
reservation, synchronizing with the first node through the interlink;
- Active/Active with both nodes processing commands and accessing the
backing storage, synchronizing with the first node through the interlink;
- Active/Active with second node working as proxy, transferring all
commands to the first node for execution through the interlink.

Unlike original Copan's implementation, depending on specific hardware,
this code uses simple custom TCP-based protocol for interlink. It has
no authentication, so it should never be enabled on public interfaces.

The code may still need some polishing, but generally it is functional.

Relnotes: yes
Sponsored by: iXsystems, Inc.


# 2f444d15 15-Aug-2015 Alexander Motin <mav@FreeBSD.org>

Drop "internal" CTL frontend.

Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had a consumer for that functionality,
while its implementation polluted many unrelated places.


# 73942c5c 05-Aug-2015 Alexander Motin <mav@FreeBSD.org>

Issue all reads of single XCOPY segment simultaneously.

During vMotion and Clone, VMware by default runs multiple sequential 4MB
XCOPY requests at the same time. If CTL issues reads sequentially in 1MB
chunks for each XCOPY command, reads from different commands are not
detected as sequential by the serseq option code and are allowed to execute
simultaneously. Such a read pattern confused the ZFS prefetcher, causing
suboptimal disk access. Issuing all reads at the same time makes the serseq
code work properly, serializing reads both within each XCOPY command and
between them.

My tests with a ZFS pool of 14 disks in RAID10 show prefetcher efficiency
improved from 37% to 99.7%, copying speed improved by 10-60%, average
read latency halved on the HDD layer and reduced five times on the zvol
layer.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 2d8b2876 20-Jun-2015 Alexander Motin <mav@FreeBSD.org>

Introduce separate lock for tokens to reduce ctl_lock scope.


# fee04ef7 12-Feb-2015 Alexander Motin <mav@FreeBSD.org>

Make XCOPY and WUT commands respect physical block size/offset.

This change improves performance of misaligned XCOPY and WUT commands
by 2-3 times by avoiding unneeded read-modify-write cycles inside ZFS.

MFC after: 1 week
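One way to picture the idea, as a rough sketch rather than the code in ctl_tpc.c: when a copy is split into chunks, chunk boundaries can be pulled back to physical block boundaries so that only the first and last chunks may touch a partial physical block. The function `aligned_chunks` and its parameters (`pb_blocks` for physical block size in logical blocks, `pb_offset` for the LBA of an aligned physical block) are assumptions made for illustration.

```python
def aligned_chunks(lba, count, max_blocks, pb_blocks, pb_offset=0):
    """Split [lba, lba + count) into (start, length) chunks of at most
    max_blocks logical blocks, ending each non-final chunk on a physical
    block boundary when that still makes progress."""
    chunks = []
    end = lba + count
    while lba < end:
        stop = min(lba + max_blocks, end)
        if stop < end:
            # Round the chunk end down to the nearest physical boundary.
            aligned = stop - ((stop - pb_offset) % pb_blocks)
            if aligned > lba:
                stop = aligned
        chunks.append((lba, stop - lba))
        lba = stop
    return chunks

# Misaligned start (LBA 3), 8 logical blocks per physical block,
# at most 16 logical blocks per chunk.
result = aligned_chunks(3, 29, 16, 8)
print(result)
```

A naive split of the same range would cut at LBA 19, in the middle of a physical block, so both neighbouring writes would need a read-modify-write; the aligned split cuts at LBA 16 instead.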


# 117f1bc1 24-Jan-2015 Alexander Motin <mav@FreeBSD.org>

Fix wrong LUN reference in XCOPY block-to-block operation.

This could cause data corruption due to accessing wrong LUN in case of
retries on write errors. Failed writes were retried to read LUN.

MFC after: 3 days


# 9602f436 19-Dec-2014 Alexander Motin <mav@FreeBSD.org>

Reduce number of places where global control_softc is used.

At some point we may want to have several CTL instances, and that is not
really impossible.

MFC after: 2 weeks


# 2a72b593 03-Dec-2014 Alexander Motin <mav@FreeBSD.org>

Plug memory leaks on UNMAP and XCOPY with invalid parameters.

MFC after: 1 week


# f7241cce 25-Nov-2014 Alexander Motin <mav@FreeBSD.org>

Coalesce last data move and command status for read commands.

Make CTL core and block backend set success status before initiating last
data move for read commands. Make CAM target and iSCSI frontends detect
such condition and send command status together with data. New I/O flag
allows to skip duplicate status sending on later fe_done() call.

For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS. For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.

MFC after: 1 month
Sponsored by: iXsystems, Inc.


# 1251a76b 24-Nov-2014 Alexander Motin <mav@FreeBSD.org>

Replace home-grown CTL IO allocator with UMA.

Old allocator created significant lock congestion protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is lack of reliable preallocation, that could guarantee
successful allocation in non-sleepable environments. But careful code
review showed that only the CAM target frontend really has that requirement.
Fix that by making that frontend preallocate and statically bind CTL I/O for
every ATIO/INOT it preallocates anyway. That allows to avoid allocations
in hot I/O path. Other frontends either may sleep in allocation context
or can properly handle allocation errors.

On 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS! Yay! :)

MFC after: 1 month
Sponsored by: iXsystems, Inc.


# 0b060244 01-Oct-2014 Alexander Motin <mav@FreeBSD.org>

Fix a couple of issues with ROD tokens content.

MFC after: 3 days


# 4ab4d687 17-Sep-2014 Alexander Motin <mav@FreeBSD.org>

Fix tpc_create_token() introduced in r269497 to encode CREATOR LOGICAL UNIT
DESCRIPTOR field as Identification Descriptor CSCD descriptor, not just as
Identification Descriptor.

MFC after: 3 days


# 2ac1d5af 19-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Fix lock recursion on LUN shutdown, introduced on r269497.

MFC after: 3 days


# e3e592bb 05-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Reimplement WRITE USING TOKEN with Block Zero token using WRITE SAME.

On my ZVOL of SSDs that increases speed of zero writing in that way from
1 to 2.5GB/s by reducing CPU overhead.

MFC after: 2 weeks


# 25eee848 03-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Add support for Windows dialect of EXTENDED COPY command, aka Microsoft ODX.

This avoids extra network traffic when copying files on NTFS iSCSI
disks within one storage host by drag'n'dropping them in Windows Explorer
of Windows 8/2012. It should also accelerate Hyper-V VM operations, etc.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# c5c60595 02-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Rework r269444 to work also for lists without IDs.

MFC after: 3 days


# 475267ef 02-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Plug EXTENDED COPY request data memory leak.

MFC after: 3 days


# a7c09f5c 02-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Fix some bugs in RECEIVE COPY STATUS data.

MFC after: 3 days


# 43d2d719 02-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Add missing comparisons to make list IDs in EXTENDED COPY per-initiator,
as they should be. Wrap it into a function to not duplicate the code.

MFC after: 3 days
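The point of this fix can be sketched with a lookup keyed on both the initiator and the list ID, wrapped in a single helper so every caller compares the same fields. This is an illustration only: `CopyListTable`, `find` and `add` are hypothetical names, not the CTL data structures.

```python
class CopyListTable:
    """Tracks in-progress copy lists, keyed per initiator (hypothetical model)."""

    def __init__(self):
        self._lists = {}

    def find(self, initiator, list_id):
        """Single lookup helper: a list matches only if both the initiator
        and the list ID match."""
        return self._lists.get((initiator, list_id))

    def add(self, initiator, list_id, state):
        """Register a new list; the same ID may be reused by another initiator."""
        if self.find(initiator, list_id) is not None:
            raise ValueError("list ID already in use by this initiator")
        self._lists[(initiator, list_id)] = state

table = CopyListTable()
table.add("initiator-A", 1, "copy A")
table.add("initiator-B", 1, "copy B")  # same list ID, different initiator
print(table.find("initiator-A", 1), table.find("initiator-B", 1))
```

Without the initiator in the comparison, the second `add` would collide with the first even though the two lists belong to different initiators.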


# 8cbf9eae 17-Jul-2014 Alexander Motin <mav@FreeBSD.org>

Increase maximal number of SCSI ports in CTL from 32 to 128.

After I gave each iSCSI target its own port, the old limit appeared to be
not so big. This change almost proportionally increases per-LUN memory
use, but it is still three times better than it was before r268807.

MFC after: 2 weeks


# 984a2ea9 16-Jul-2014 Alexander Motin <mav@FreeBSD.org>

Add support for VMWare dialect of EXTENDED COPY command, aka VAAI Clone.

This allows cloning VMs and moving them between LUNs inside one storage
host without generating extra network traffic to the initiator and back,
and without being limited by network bandwidth.

LUNs participating in a copy operation should have UNIQUE NAA or EUI IDs set.
For LUNs without these IDs VMWare will use traditional copy operations.

Beware: the above LUN IDs explicitly set to values non-unique from the VM
cluster point of view may cause data corruption if wrong LUN is addressed!

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.