History log of /freebsd-current/sys/geom/raid/md_promise.c
Revision Date Author Comments
# 81092e92 12-Feb-2024 Eugene Grosbein <eugen@FreeBSD.org>

graid: unbreak Promise RAID1 with 4+ providers

Fix a problem in graid implementation of Promise RAID1 created with 4+ disks.
Such an array generally works fine until reboot only due to a bug
in metadata writing code. Before the fix, next taste erronously created
RAID1E (kind of RAID10) instead of RAID1, hence graid used wrong offsets
for I/O operations.

The bug did not affect Promise RAID1 arrays with 2 or 3 disks only.

Reviewed by: mav
MFC after: 3 days


# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 9e9ba9c7 31-Aug-2021 Mark Johnston <markj@FreeBSD.org>

graid: Avoid tasting devices with small sector sizes

The RAID metadata parsers effectively assume a sector size of 512 bytes
or larger, but md(4) devices can be created with a sector size that's
any power of 2. Add some seatbelts to graid tasting routines to ensure
that the requested sector(s) are large enough for the device to
plausibly contain RAID metadata.

Reported by: syzbot+f43583c9bf8357c8b56f@syzkaller.appspotmail.com
Reported by: syzbot+537dd9f22b91b698e161@syzkaller.appspotmail.com
Reported by: syzbot+51509dd48871c57c6e47@syzkaller.appspotmail.com
Reported by: syzbot+c882a31037ea2a54ff63@syzkaller.appspotmail.com
MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# cd853791 27-Nov-2020 Konstantin Belousov <kib@FreeBSD.org>

Make MAXPHYS tunable. Bump MAXPHYS to 1M.

Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225


# d40bc607 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

geom: clean up empty lines in .c and .h files


# 8510f61a 08-Jul-2020 Xin LI <delphij@FreeBSD.org>

sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/".

Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25565


# ac03832e 07-Aug-2019 Conrad Meyer <cem@FreeBSD.org>

GEOM: Reduce unnecessary log interleaving with sbufs

Similar to what was done for device_printfs in r347229.

Convert g_print_bio() to a thin shim around g_format_bio(), which acts on an
sbuf; documented in g_bio.9.

Reviewed by: markj
Discussed with: rlibby
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D21165


# 151ba793 24-Dec-2017 Alexander Kabaev <kan@FreeBSD.org>

Do pass removing some write-only variables from the kernel.

This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385


# 3728855a 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/geom: adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# b28ea2c2 11-Jan-2017 Conrad Meyer <cem@FreeBSD.org>

g_raid: Prevent tasters from attempting excessively large reads

Some g_raid tasters attempt metadata reads in multiples of the provider
sectorsize. Reads larger than MAXPHYS are invalid, so detect and abort
in such situations.

Spiritually similar to r217305 / PR 147851.

PR: 214721
Sponsored by: Dell EMC Isilon


# 28323add 08-Nov-2016 Bryan Drewery <bdrewery@FreeBSD.org>

Fix improper use of "its".

Sponsored by: Dell EMC Isilon


# b99bce73 27-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

geom: unsign some types to match their definitions and avoid overflows.

In struct:gctl_req, nargs is unsigned.

In mirror:
g_mirror_syncreqs is unsigned.

In raid:
in struct:g_raid_volume, v_disks_count is unsigned.

In virstor:
in struct:g_virstor_softc, n_components is unsigned.

MFC after: 2 weeks


# 74b8d63d 10-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

Cleanup unnecessary semicolons from the kernel.

Found with devel/coccinelle.


# 0b1b7c2c 25-Feb-2015 Alexander Motin <mav@FreeBSD.org>

Replace constant with proper sizeof().

Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after: 2 weeks


# dea1e226 28-Apr-2014 Alexander Motin <mav@FreeBSD.org>

Reduce number of opens by REOM RAID during provider taste.

Instead opening/closing provider by each of metadata classes, do it only
once in core code. Since for SCSI disks open/close means sending some
SCSI commands to the device, this change reduces taste time.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 40ea77a0 22-Oct-2013 Alexander Motin <mav@FreeBSD.org>

Merge GEOM direct dispatch changes from the projects/camlock branch.

When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.

The defined now safety requirements are:
- caller should not hold any locks and should be reenterable;
- callee should not depend on GEOM dual-threaded concurency semantics;
- on the way down, if request is unmapped while callee doesn't support it,
the context should be sleepable;
- kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
- G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
- G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
- G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
- G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe. If any of requirements are not met, request is queued to
g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.

This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by: iXsystems, Inc.
MFC after: 2 months


# c3ec009a 16-Jan-2013 Alexander Motin <mav@FreeBSD.org>

- Fix rebuild position broken at r245522.
- Identify one more metadata field.


# 821a0f63 16-Jan-2013 Alexander Motin <mav@FreeBSD.org>

For Promise/AMD metadata add support for disks with capacity above 2TiB
and for volumes with sector size above 512 bytes.


# 609a7474 29-Oct-2012 Alexander Motin <mav@FreeBSD.org>

Add basic BIO_DELETE support to GEOM RAID class for all RAID levels.

If at least one subdisk in the volume supports it, BIO_DELETE requests
will be propagated down. Unfortunatelly, for RAID levels with redundancy
unmapped blocks will be mapped back during first rebuild/resync process.

Sponsored by: iXsystems, Inc.
MFC after: 1 month


# c6f0cd57 10-Oct-2012 Alexander Motin <mav@FreeBSD.org>

NULL-ify last previously used pointer instead of last possible pointer.
This should be only a cosmetic change.

Found by: Clang Static Analyzer


# 6871a543 07-Oct-2012 Alexander Motin <mav@FreeBSD.org>

Make graid command line a bit more friendly by allowing volume name or
provider name to be specified instead of geom name (first argument in all
subcommands except label). In most cases there is only one array used
any way, so it is not really useful to make user type ugly geom names like
Intel-f0bdf223 or SiI-732c2b9448cf. Though they can be used in some cases.

Sponsored by: iXsystems, Inc.
MFC after: 1 month


# c89d2fbe 13-Sep-2012 Alexander Motin <mav@FreeBSD.org>

Add global and per-module sysctls/tunables to enable/disable metadata taste.
That should help to handle some cases when disk has some RAID metadata that
should be ignored, especially during boot.

MFC after: 3 days


# eb3b1cd0 05-May-2012 Alexander Motin <mav@FreeBSD.org>

Plug small memory leaks.


# 7b2a8d78 27-Apr-2012 Alexander Motin <mav@FreeBSD.org>

Fix RAID5 level names changed at r234603.


# e26083ca 23-Apr-2012 Alexander Motin <mav@FreeBSD.org>

Add sos@ copyrights to RAID metadata modules, respecting his efforts in
decoding metadata formats in ataraid(4) code.


# fc1de960 18-Apr-2012 Alexander Motin <mav@FreeBSD.org>

Add to GEOM RAID class module for reading non-degraded RAID5 volumes and
some environment to differentiate 4 possible RAID5 on-disk layouts.

Tested with Intel and AMD RAID BIOSes.

MFC after: 2 weeks


# 733a1f3f 26-Oct-2011 Alexander Motin <mav@FreeBSD.org>

Clarify disks/volumes above 2TiB support in geom_raid:
- add support for volumes above 2TiB with Promise metadata format;
- enforse and document other limitations:
- Intel and Promise metadata formats do not support disks above 2TiB;
- NVIDIA metadata format does not support volumes above 2TiB.

Sponsored by: iXsystems, Inc.
MFC after: 2 weeks


# 14e2cd0a 31-Mar-2011 Alexander Motin <mav@FreeBSD.org>

Bunch of small bugfixes and cleanups.

Found with: Clang Static Analyzer


# 63607675 31-Mar-2011 Alexander Motin <mav@FreeBSD.org>

Bunch of small bugfixes and cleanups.

Found with: Coverity Prevent(tm)
CID: 9656, 9658, 9693, 9705, 9706, 9707, 9808, 9809, 9810,
9711, 9712, 9713, 9714


# 89b17223 24-Mar-2011 Alexander Motin <mav@FreeBSD.org>

MFgraid/head:
Add new RAID GEOM class, that is going to replace ataraid(4) in supporting
various BIOS-based software RAIDs. Unlike ataraid(4) this implementation
does not depend on legacy ata(4) subsystem and can be used with any disk
drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4)
with `options ATA_CAM`). To make code more readable and extensible, this
implementation follows modular design, including core part and two sets
of modules, implementing support for different metadata formats and RAID
levels.

Support for such popular metadata formats is now implemented:
Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage.

Such RAID levels are now supported:
RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT.

For any all of these RAID levels and metadata formats this class supports
full cycle of volume operations: reading, writing, creation, deletion,
disk removal and insertion, rebuilding, dirty shutdown detection
and resynchronization, bad sector recovery, faulty disks tracking,
hot-spare disks. For Intel and Promise formats there is support multiple
volumes per disk set.

Look graid(8) manual page for additional details.

Co-authored by: imp
Sponsored by: Cisco Systems, Inc. and iXsystems, Inc.