History log of /freebsd-9.3-release/sys/geom/vinum/
Revision Date Author Comments
267654 20-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


248085 09-Mar-2013 marius

MFC: r227309 (partial)

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


225736 23-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


223921 11-Jul-2011 ae

Include sys/sbuf.h directly.

Reviewed by: pjd


222283 25-May-2011 ae

Prevent non-aligned reading from provider while tasting. Reject
providers with unsupported sectorsize.

Reported by: Joerg Wunsch
MFC after: 1 week
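
A minimal sketch of the kind of check this change describes, assuming a
hypothetical header offset (DEMO_HDR_OFFSET) and header length (DEMO_HDR_LEN);
the names, values, and acceptance rule are illustrative and not taken from the
gvinum source. The idea is that a provider is only tasted if the header offset
falls on a sector boundary, and that reads are rounded up to whole sectors:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical on-disk layout, for illustration only. */
#define DEMO_HDR_OFFSET	4096
#define DEMO_HDR_LEN	512

/*
 * Accept a provider only if the header offset is a multiple of its
 * sector size, so that the header read starts on a sector boundary.
 */
static bool
demo_sectorsize_ok(uint32_t sectorsize)
{
	if (sectorsize == 0)
		return (false);
	return (DEMO_HDR_OFFSET % sectorsize == 0);
}

/* Round a read length up to a whole number of sectors. */
static uint64_t
demo_round_to_sector(uint64_t len, uint32_t sectorsize)
{
	assert(sectorsize != 0);
	return ((len + sectorsize - 1) / sectorsize * sectorsize);
}
```

With these assumed values, a provider with 8192-byte sectors would be rejected,
while a header read on a 2048-byte-sector provider would cover one full sector.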


213318 01-Oct-2010 lulf

- Check flag with the bitwise operator, not the logical operator.

Submitted by: arundel
MFC after: 1 week


207878 10-May-2010 jh

- Don't return EAGAIN from gv_unload(). It was used to work around the
deadlock fixed in r207671.
- Wait for worker process to exit at class unload. The worker process
was not guaranteed to exit before the linker unloaded the module.
- Use 0 as the worker process exit status instead of ENXIO and style
the NOTREACHED comment.

Reviewed by: lulf
X-MFC after: r207671


207789 08-May-2010 lulf

- Remove obsolete flags.

MFC after: 1 week


204886 08-Mar-2010 lulf

- Set missing flag when initiating a plex rebuild with the rebuildparity
command.
- Check if plex is already syncing or rebuilding before initiating a parity
rebuild or check.


202974 25-Jan-2010 trasz

Remove some pointless variable assignments.

Found with: clang


197767 05-Oct-2009 lulf

- Improve error message consistency and wording.


195752 18-Jul-2009 lulf

- Fix the issue with read access count modification on RAID-5 plexes properly.
If the access counts were not increased and decreased in equal numbers by
gvinum consumers, the read access count would be inconsistent with the write
access count. Instead, modify the read access count with the write access
count directly to prevent any inconsistencies.

Approved by: re (kib)


193066 29-May-2009 jamie

Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex. Jails may
have their own host information, or they may inherit it from the
parent/system. The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL. The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by: bz (mentor)


191856 06-May-2009 lulf

- Split up the BIO queue into a queue for new and one for completed requests.
This is necessary for two reasons:
1) In order to avoid collisions with the use of a BIO's flags set by a consumer
or a provider
2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags were
available, so the consumer flags of a BIO had to be misused in order to
support enough flags. The new queue makes it possible to recycle the
GV_BIO_DONE flag into GV_BIO_GROW.
As a consequence, gvinum will now work with any other GEOM class under it or
on top of it.

- Use bio_pflags for storing internal flags on downgoing BIOs, as the requests
appear to come from a consumer of a gvinum volume. Use bio_cflags only for
cloned BIOs.
- Move gv_post_bio to be used internally for maintenance requests.
- Remove some cases where flags were set without need.

PR: kern/133604


191855 06-May-2009 lulf

- Fix a case where a RAID5 volume would think that it is supposed to grow a new
subdisk after a parity rebuild.


191854 06-May-2009 lulf

- Check if any plexes are doing internal maintenance before removing them.


191853 06-May-2009 lulf

- Add forgotten KASSERT.


191852 06-May-2009 lulf

- Fix a bug where the bio_data field of the wrong BIO is freed if an error
occurs when doing a RAID5 request.


191850 06-May-2009 lulf

- GV_BIO_RETRY is not used, and it is actually impossible with more than 8
values for bio_cflags/bio_pflags.


191849 06-May-2009 lulf

- Split the queue mutex into one for the event queue and one for the BIO queue,
as they do not really relate and to prepare for an additional queue to be
covered by the BIO queue mutex.
- Implement wrappers for fetching the next element from the event queue as well
as for putting a new element into the BIO queue.


191787 04-May-2009 lulf

- Make the gvinum softc invisible to userland, as it is not needed.


191248 18-Apr-2009 lulf

- Remove assertion of topology lock remaining from 7.x gvinum. It is not needed,
as the renaming only changes internal gvinum names and will not alter the geom
topology.
- The topology lock was not held when calling g_wither_geom after renaming.


190881 10-Apr-2009 lulf

- Move out allocation part of different gvinum objects into its own routine and
make use of it in the gvinum userland code.


190513 28-Mar-2009 lulf

- Add files that should have been added in r190507.


190507 28-Mar-2009 lulf

Import the gvinum work that has been done during and after Summer of Code 2007.
The work has been under testing and fixing since then, and it is mature enough
to be put into HEAD for further testing.

A lot has changed in this time, and here are the most important changes:
- Gvinum now uses one single workerthread instead of one thread for each
volume and each plex. The reason for this is that the previous scheme was
very complex, and was the cause of many of the bugs discovered in gvinum.
Instead, gvinum now uses one worker thread with an event queue, quite
similar to what is used in gmirror.
- The rebuild/grow/initialize/parity check routines no longer run in
separate threads, but are run as regular I/O requests with special flags.
This made it easier to support mounted growing and parity rebuild.
- Support for growing striped and raid5-plexes, meaning that one can extend the
volumes for these plex types in addition to the concat type. Also works while
the volume is mounted.
- Implementation of many of the missing commands from the old vinum:
attach/detach, start (was partially implemented), stop (was partially
implemented), concat, mirror, stripe, raid5 (shortcuts for creating volumes
with one plex of these organizations).
- The parity check and rebuild no longer go between userland/kernel, meaning
that the gvinum command will not stay and wait forever for the rebuild to
finish. You can instead watch the status with the list command.
- Many problems with gvinum have been reported since 5.x, and some have been hard
to fix due to the complicated architecture. Hopefully, it should be more
stable and better handle edge cases that previously made gvinum crash.
- Failed drives no longer disappear entirely, but now leave behind a dummy
drive that makes sure the original state is not forgotten in case the system
is rebooted between drive failures/swaps.
- Update manpage to reflect new commands and extend it with some examples.

Sponsored by: Google Summer of Code 2007
Mentored by: le
Tested by: Rick C. Petty <rick-freebsd2008 -at- kiwi-computer.com>
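
The single-worker design from the first item above can be pictured with a
small event queue. This is only a sketch under stated assumptions: the
demo_* types and functions are hypothetical, the queue is drained in a plain
function rather than a kernel thread, and the locking a real worker would need
is omitted.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical event types, for illustration only. */
enum demo_event_type { DEMO_EV_CREATE, DEMO_EV_REMOVE };

struct demo_event {
	enum demo_event_type	type;
	int			arg;
	TAILQ_ENTRY(demo_event)	link;
};

TAILQ_HEAD(demo_equeue, demo_event);

/* Append an event to the tail of the queue; returns 0 or -1. */
static int
demo_post_event(struct demo_equeue *q, enum demo_event_type type, int arg)
{
	struct demo_event *ev;

	ev = malloc(sizeof(*ev));
	if (ev == NULL)
		return (-1);
	ev->type = type;
	ev->arg = arg;
	TAILQ_INSERT_TAIL(q, ev, link);
	return (0);
}

/*
 * Body of the worker loop: take events from the head in FIFO order,
 * record each argument in log[] (up to cap entries) and free the event.
 * Returns the number of events handled.
 */
static int
demo_drain_events(struct demo_equeue *q, int *log, int cap)
{
	struct demo_event *ev;
	int n = 0;

	while ((ev = TAILQ_FIRST(q)) != NULL) {
		TAILQ_REMOVE(q, ev, link);
		if (n < cap)
			log[n] = ev->arg;
		n++;
		free(ev);
	}
	return (n);
}
```

Because every request funnels through one queue, events are handled strictly
in the order they were posted, which is the property that makes a single
worker easier to reason about than one thread per volume and plex.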


186517 27-Dec-2008 lulf

- Fix an issue with access permissions to underlying disks used by a gvinum
plex. If the plex is a raid5 plex, and is being written to, parity data might
have to be read from the underlying disks, requiring them to be opened for
reading as well as writing.

MFC after: 1 week


185309 25-Nov-2008 lulf

- Fix a potential NULL pointer reference. Note that this cannot happen in
practice, but it is a good programming practice nonetheless and it allows the
kernel to not depend on userland correctness.

Found with: Coverity Prevent(tm)
CID: 655-659, 664-667


184292 26-Oct-2008 lulf

- Import macros used in gmirror for printing gvinum debug messages and making
the output more standardized.
- Add a sysctl to set the verbosity of the debug messages.
- While there, fixup typos and wording in the messages.


183546 02-Oct-2008 lulf

- Use the new gv_write_header function to write out the header when removing a
drive to make sure that the header is in the correct format.


183545 02-Oct-2008 lulf

- Remove unneeded macro since the config_length field in the header was changed
to 64 bit in the new format.


183514 01-Oct-2008 lulf

- Make gvinum header on-disk structure consistent on all platforms by storing
the gvinum header in fields of fixed size and in a big endian byte order
rather than the size and byte order of the actual platform.

Note that the change is backwards compatible with the old gvinum configuration
format, but will save the configuration in the new format when the 'saveconfig'
command is executed.

Submitted by: Rick C. Petty <rick-freebsd -at- kiwi-computer.com>
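
As an illustration of storing a field in a fixed size and big-endian byte
order, the sketch below encodes and decodes a 64-bit value byte by byte, so the
result is the same on any host. The helper names are hypothetical stand-ins
(FreeBSD's <sys/endian.h> offers be64enc() and be64dec() for this purpose), and
nothing here is taken from the actual gvinum header code.

```c
#include <stdint.h>

/* Store v as 8 bytes, most significant byte first. */
static void
demo_be64_encode(uint8_t *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Read 8 big-endian bytes back into a host-order value. */
static uint64_t
demo_be64_decode(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return (v);
}
```

Writing each field this way, instead of copying a native struct to disk, is
what keeps the on-disk layout independent of the platform's word size and
endianness.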


181803 17-Aug-2008 bz

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


180612 19-Jul-2008 lulf

- When renaming a drive, also set the drive name in the gvinum header.

PR: kern/125632
Approved by: pjd (mentor)
MFC after: 3 days


180451 11-Jul-2008 lulf

- Fix a logic error when updating plex configuration.

Approved by: pjd (mentor)


180291 05-Jul-2008 rwatson

Introduce a new lock, hostname_mtx, and use it to synchronize access
to global hostname and domainname variables. Where necessary, copy
to or from a stack-local buffer before performing copyin() or
copyout(). A few uses, such as in cd9660 and daemon_saver, remain
under-synchronized and will require further updates.

Correct a bug in which a failed copyin() of domainname would leave
domainname potentially corrupted.

MFC after: 3 weeks


179206 22-May-2008 lulf

- Recognize the 'volume' parameter when creating a plex.

PR: kern/75632
Approved by: pjd (mentor)
MFC after: 1 day


177345 18-Mar-2008 lulf

- Fix a memory leak when re-discovering a gvinum configuration.

Approved by: pjd (mentor)
MFC after: 1 week


172836 20-Oct-2007 julian

Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
This makes way for us to add REAL kthread_create() and friends
that actually make threads. It turns out that most of these
calls actually end up being moved back to the thread version
when it's added, but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.


168670 12-Apr-2007 le

-) Correct sdcount for a plex when removing or adding subdisks.
-) Set correct sizes for plexes and volumes after a subdisk has been removed.

Submitted by: Ulf Lilleengen <lulf_AT_freebsd.org>


168669 12-Apr-2007 le

Avoid infinite loop if the device string given for a drive
only consists of "/".

Submitted by: Ulf Lilleengen <lulf_AT_freebsd.org>


161425 17-Aug-2006 imp

while (0); -> while (0) in multi-line macros
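
The reason for dropping the trailing semicolon can be seen with a hypothetical
macro (not one from the tree): the caller supplies the semicolon, so the macro
behaves like a single statement even in an unbraced if/else.

```c
/* Hypothetical multi-line macro, for illustration only. */
#define DEMO_SWAP_INT(a, b) do {	\
	int tmp_ = (a);			\
	(a) = (b);			\
	(b) = tmp_;			\
} while (0)

/*
 * Order *x and *y so that *x >= *y; returns 1 if a swap happened.
 * Had the macro ended in "while (0);", the expansion would contain two
 * semicolons before the else and this function would not compile.
 */
static int
demo_order_desc(int *x, int *y)
{
	if (*x < *y)
		DEMO_SWAP_INT(*x, *y);
	else
		return (0);
	return (1);
}
```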


157292 30-Mar-2006 le

Protect from creating striped and RAID5 plexes with unequally sized
subdisks.


157053 23-Mar-2006 le

Fix whitespace.


157052 23-Mar-2006 le

Implement the 'resetconfig' command.

PR: kern/94835
Submitted by: Ulf Lilleengen <lulf@stud.ntnu.no>


155462 08-Feb-2006 le

Catch the case when a subdisk has no provider or no consumer
attached to it.


154075 06-Jan-2006 le

Get rid of the gv_bioq hack in most parts of the I/O path and
use the standard bioq structures.


152971 30-Nov-2005 sobomax

Don't pass error value pointer to g_read_data(9) at all if we don't
have any use of it.

Suggested by: pjd


152967 30-Nov-2005 sobomax

Check for g_read_data(9) errors properly:

o The only indication of error condition is NULL value returned by
the function;

o value pointed to by error argument is undefined in the case when
operation completes successfully.

Discussed with: phk
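
The rule stated above can be sketched with a userland stand-in. The function
demo_read_data() below is a hypothetical mock with a simplified signature, not
g_read_data(9) itself; it only mimics the contract that failure is signalled by
a NULL return and that the error value is left untouched on success.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Mock reader: returns a zeroed buffer on success or NULL on failure.
 * *error is written only on failure, and error may be NULL.
 */
static void *
demo_read_data(int fail, size_t len, int *error)
{
	void *buf;

	if (fail) {
		if (error != NULL)
			*error = 5;	/* arbitrary errno-like value */
		return (NULL);
	}
	buf = malloc(len);
	if (buf != NULL)
		memset(buf, 0, len);
	return (buf);
}

/* Caller: decide success purely from the returned pointer. */
static int
demo_read_header(int fail)
{
	void *buf;

	/* No error pointer is passed, since the value would not be used. */
	buf = demo_read_data(fail, 512, NULL);
	if (buf == NULL)
		return (-1);
	free(buf);
	return (0);
}
```

Testing the error variable instead of the pointer would read an undefined
value on the success path, which is the mistake this change removes.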


152773 24-Nov-2005 le

Since we want a vinum geom created anytime the module loads, move
the geom creation to a separate init function and ignore the tasting.

The config is now parsed only in the vinumdrive geom, which hopefully
fixes the problem, that the drive class tasted before the vinum class
had a chance, for good.

Also restore the behaviour that the module can be loaded at boot time
and on a running system.


152634 20-Nov-2005 le

Whitespace.


152633 20-Nov-2005 le

Always declare variables at the start of the function.
Don't allocate potentially large variables on the stack.
Check strsep() return values when the string comes from userland.
Shorten variable names for lucidity's sake.

most of the stuff:
Pointed out by: njl@


152632 20-Nov-2005 le

Fix whitespace issue.

Pointed out by: joel@


152615 19-Nov-2005 le

Finally bring in what was produced during Google SoC 2005:

Add functions to rename objects and to move a subdisk from one drive
to another.

Obtained from: Chris Jones <chris.jones@ualberta.ca>
Sponsored by: Google Summer of Code 2005
MFC in: 1 week


149895 08-Sep-2005 le

Set the G_PF_WITHER flag on the subdisk provider that is about to
be destroyed. That way the GEOM system handles all deallocations
and we don't have to do it ourselves.


149555 28-Aug-2005 le

Prevent sync operations from being started when they are already
in progress, and be a bit more user friendly in terms of error
messages returned from the kernel.


149501 26-Aug-2005 le

Shuffle around the order in which the components are compiled.

This way, the VINUMDRIVE class is loaded before the VINUM class,
but since geom does the tasting for newly arrived classes
last-in-first-out, the VINUM class tastes first.

This removes the need to call gv_parse_config() in the drive
taste path.


149379 22-Aug-2005 le

Correct the check if a plex is accessible in case it is not up.
This makes degraded RAID5 plexes actually work.


149140 16-Aug-2005 le

Make it possible to remove stale, left-over subdisks.


149094 15-Aug-2005 le

Fix a stupid logic bug introduced in geom_vinum_drive.c rev 1.18:

When a drive is newly created, its state is initially set to 'down',
so it won't allow saving the config to it (thus it will never know of
itself being created). Work around this by adding a new flag, that's
also checked when saving the config to a drive.


148048 15-Jul-2005 le

*) Implement round-robin reads for multiplex volumes.

*) Plug a possible memory leak. [1]

[1] obtained from: pjd@.
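
One way to picture round-robin reads across the plexes of a volume is shown
below. This is a sketch under assumptions: the demo_volume structure, its
fields, and the rule of skipping plexes that are not up are all hypothetical
and are not taken from the gvinum source.

```c
#include <assert.h>

#define DEMO_MAX_PLEXES	8

/* Hypothetical volume state, for illustration only. */
struct demo_volume {
	int	nplexes;			/* number of plexes in use */
	int	last_read;			/* plex used by the previous read */
	int	plex_up[DEMO_MAX_PLEXES];	/* nonzero if the plex is usable */
};

/*
 * Pick the next usable plex after the one used last time, wrapping
 * around. Returns the plex index, or -1 if no plex is usable.
 */
static int
demo_pick_read_plex(struct demo_volume *v)
{
	assert(v->nplexes <= DEMO_MAX_PLEXES);
	for (int i = 1; i <= v->nplexes; i++) {
		int cand = (v->last_read + i) % v->nplexes;

		if (v->plex_up[cand]) {
			v->last_read = cand;
			return (cand);
		}
	}
	return (-1);
}
```

Starting the search one past the previous choice is what spreads consecutive
reads over all usable plexes instead of always hitting the first one.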


146325 17-May-2005 le

When a drive dies, don't call g_wither_geom() directly, but instead
post an event to the geom event queue that will take care of it,
letting outstanding bios finish, and closing the consumers.

Plus some cosmetic clean ups.


145619 28-Apr-2005 le

Only allow RAID5 plexes to be parity checked.

PR: kern/80427
Submitted by: Stijn Hoop <stijn@win.tue.nl>


143259 07-Mar-2005 le

Remove test for zero sectorsize when tasting. This check doesn't
seem to be necessary anymore, and it prevents tasting a valid drive
when booting with geom_vinum already loaded, since SCSI disks set their
sectorsize not until first opening them.


143130 04-Mar-2005 le

Don't allow synchronizing a plex that is already synchronizing.

Reset the 'syncing' flag in case of errors, too.

Some cosmetics.


142301 23-Feb-2005 le

Correctly calculate what to do and how to retry a request to a plex when
the previous one failed and there are more than one plex in the volume.

This could have led to a flood of error messages on the console and
probably a deadlock in certain situations.


142020 17-Feb-2005 le

In case of drive errors, don't close the associated consumer and
detach it, but instead let the geom wither away.

Bump copyright year.


140591 21-Jan-2005 le

Only report state changes of subdisks and plexes when there's
really a state change.

Reword the info a bit.


140590 21-Jan-2005 le

Don't initialize error with ENXIO as we might end up here when
the plex has no more consumers (e.g. orphaning).


140476 19-Jan-2005 le

Rename synchronization and initialization threads and prefix them
with 'gv_' for consistency.


140475 19-Jan-2005 le

Although an object may already be known in the configuration, its
worker thread may have been destroyed (e.g. during orphaning).

Make sure that objects get back their worker threads when they get a
new geom.


140474 19-Jan-2005 le

Reset object flags after killing off an object's worker thread.


139778 06-Jan-2005 imp

/* -> /*- for copyright notices, minor format tweaks as necessary


138112 26-Nov-2004 le

Implement 'setstate' to allow setting the state of drives and subdisks
for debugging and emergency purposes.


138110 26-Nov-2004 le

Implement checkparity/rebuildparity.


137730 15-Nov-2004 le

Move RAID5 offset calculation into a separate function to avoid
code duplication.


137727 15-Nov-2004 le

Share gv_roughlength() between kernel and userland, as we will need it
there later.


136983 26-Oct-2004 le

Give each plex a separate queue where held back bios are put on.
This lowers the CPU usage of the worker thread and prevents a
possible live lock on non-SMP machines.

MFC candidate.


136065 02-Oct-2004 le

Don't allow to create a drive that already exists.


136064 02-Oct-2004 le

Correctly skip the '/dev/' part when creating new drives and prefix
a drive's provider with '/dev/' when printing the config.

Reported by: will@
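
Skipping the '/dev/' part of a device name can be sketched as follows. The
helper name is hypothetical and the code is a userland illustration, not the
routine changed by this commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer past a leading "/dev/" if present; otherwise return
 * the name unchanged. The result points into the caller's string.
 */
static const char *
demo_skip_dev_prefix(const char *name)
{
	static const char prefix[] = "/dev/";
	size_t plen = sizeof(prefix) - 1;

	if (strncmp(name, prefix, plen) == 0)
		return (name + plen);
	return (name);
}
```

Comparing against the full prefix, rather than skipping a fixed number of
characters, leaves names that were given without '/dev/' intact.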


135966 30-Sep-2004 le

Make it possible to rebuild degraded RAID5 plexes. Note that it is
currently not possible to do this while the volume is mounted.

MFC in: 1 week


135434 18-Sep-2004 le

Single concat or striped plexes don't need any special initialization
if their subdisks are all available, so let them be brought up.


135426 18-Sep-2004 le

Re-vamp how I/O is handled in volumes and plexes.

Analogous to the drive level, give each volume and plex a worker thread
that picks up and processes incoming and completed BIOs.

This should fix the data corruption issues that have come up a few
weeks ago and improve performance, especially of RAID5 plexes.

The volume level needs a little work, though.


135173 13-Sep-2004 le

Give the DRIVE geom a worker thread that picks up incoming bios,
sends them down, and takes care of the finished bios. This makes it
easier to handle I/O errors at drive level.


135164 13-Sep-2004 le

Rename gv_kill_thread() to gv_kill_plex_thread(), since there are more
threads to come.


135162 13-Sep-2004 le

Save the config back to disk when a drive goes down.


135161 13-Sep-2004 le

Read a whole sector instead of GV_HDR_LEN, since a sector might be
bigger (i.e. on CD-ROMs).


134407 27-Aug-2004 le

Move config_new_drive() to the correct place and rename it to
gv_config_new_drive().


134356 26-Aug-2004 le

When attaching a consumer from a volume to a plex, check if the
volume already has a plex attached and adjust the access counts
of the new consumer accordingly.


134221 23-Aug-2004 le

Compare the addresses of two RAID5 work packets directly instead
of the addresses of their related bios when locking one out, since
they could share a bio and this could lead to parity corruption.


134176 22-Aug-2004 le

Implement the possibility to remove drives.


134155 22-Aug-2004 le

Add forgotten format specifier in a KASSERT and shut up the compiler.

Submitted by: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>


134014 19-Aug-2004 le

A volume can be up if it has a degraded RAID5 plex.


133984 18-Aug-2004 le

Pretty print some informational messages.


133983 18-Aug-2004 le

Fix a stupid bug in the drive taste function: when checking if a
drive is known to the configuration check also if it already has a geom.
Without this check several needless geoms are created and valid
configuration data was overwritten.

This change obsoletes the need for a separate geom to taste an
offered provider and the consumer doesn't need to be opened with the
exclusive bit set.


133717 14-Aug-2004 le

Make informational output look less like an accident.


133450 10-Aug-2004 le

If we kill the worklist thread of a RAID5 plex we can destroy
the worklist mutex at the same time, so move the mtx_destroy() call
to gv_kill_thread().


133449 10-Aug-2004 le

Lock the topology before calling gv_parse_config, not afterwards.


133318 08-Aug-2004 phk

Tag all geom classes in the tree with a version number.


132940 31-Jul-2004 le

Propagate size changes upwards.


132906 30-Jul-2004 le

Set the access counts of a subdisk correctly when attaching it
to a plex that already has subdisks.


132833 29-Jul-2004 le

Shut up the compiler and temporarily '#if 0' gv_destroy_geom(),
until we need it again.


132654 26-Jul-2004 le

Save the vinum config back to disk after syncing two plexes.


132642 25-Jul-2004 le

There's a chance that the VINUMDRIVE class tastes before the
VINUM class, so let the VINUMDRIVE class parse the on-disk
configuration, too.


132617 24-Jul-2004 le

Use a temporary geom when tasting vinumdrives and lock the 'real'
vinumdrive geom with an exclusive bit. This should fix the problem
when underlying partitions overlap (i.e. the 'a' partition is at
the same offset as the 'c' partition).

Ideas borrowed from pjd@, quite a bit of testing by
Matthias Schuendehuette <msch@snafu.de>.


132607 24-Jul-2004 le

Disable kldunloading of geom_vinum temporarily until I figured out
how to do it correctly.


131625 05-Jul-2004 pjd

g_clone_bio() can fail, be ready for this.

Approved by: le


131107 25-Jun-2004 le

Mark a plex as 'newborn' when it is created. This is used to indicate
that new RAID5 plexes need to be initialized first.


131015 24-Jun-2004 csjp

Currently, if the drives specified for volume creation are
not active GEOM providers, it will result in a kernel panic.

If the GEOM provider or disk goes away before the volume
configuration data gets written to the disk, it will result
in another kernel panic.

o Make sure that the drives specified for volume creation
are active GEOM providers.

o When writing out volume configuration data to associated drives,
make sure that the GEOM provider is active, otherwise continue
to the next drive in the volume.

Approved by: le, bmilekic (mentor)


131000 23-Jun-2004 le

Add a function to clean up RAID5 packets and use it when I/O has
finished or when building the complete packet fails.


130997 23-Jun-2004 le

Remove two debugging printfs that are currently more disturbing
than helpful.


130990 23-Jun-2004 le

Accept "sd len 0" and auto-size the subdisk correctly.

Spotted by: csjp


130930 22-Jun-2004 le

No need to free the softc, because it wasn't allocated.


130925 22-Jun-2004 le

Don't sleep in the g_down path. More error checks to come.


130697 18-Jun-2004 le

Clean up allocated resources when destroying the main vinum geom.


130597 16-Jun-2004 le

Handle dead disks in a somewhat sane way.


130542 15-Jun-2004 le

Fix several bugs related to subdisk drive_offset calculation.


130478 14-Jun-2004 le

Don't free a VINUMDRIVE softc when it's orphaned or spoiled. All
allocated resources should be ultimately freed in gv_destroy_geom()
(when unloading the module and not earlier), but I need to look at this
more closely.


130477 14-Jun-2004 le

Correctly calculate subdisk offset in RAID5 plexes.


130389 12-Jun-2004 le

Add a first version of a GEOMified vinum.