History log of /freebsd-10.1-release/sys/geom/geom_dev.c
Revision Date Author Comments
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 260385 06-Jan-2014 scottl

MFC Alexander Motin's GEOM direct dispatch work:

r256603:
Introduce new function devstat_end_transaction_bio_bt(), adding new argument
to specify present time. Use this function to move binuptime() out of lock,
substantially reducing lock congestion when slow timecounter is used.

r256606:
Move g_io_deliver() out of the lock, as required for direct dispatch.
Move g_destroy_bio() out too to reduce lock scope even more.

r256607:
Fix passing uninitialized bio_resid argument to g_trace().

r256610:
Add unmapped I/O support to GEOM RAID.

r256830:
Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodone() when unmapping
temporary mapped buffer. That fixes double unmap if biodone() called twice
for the same BIO (but with different done methods).

r256880:
Merge GEOM direct dispatch changes from the projects/camlock branch.

When safety requirements are met, I/O requests no longer have to be passed
to the GEOM g_up/g_down threads and can be executed directly in the caller
context. That avoids CPU bottlenecks in the g_up/g_down threads and saves
several context switches per I/O.

r259247:
Fix bug introduced at r256607. We have to recalculate bp_resid here since
sizes of original and completed requests may differ due to end of media.

Testing of the stable/10 merge was done by Netflix, but all of the credit
goes to Alexander and iX Systems.

Submitted by: mav
Sponsored by: iX Systems


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 254389 15-Aug-2013 ken

Change the way that unmapped I/O capability is advertised.

The previous method was to set the D_UNMAPPED_IO flag in the cdevsw
for the driver. The problem with this is that in many cases (e.g.
sa(4)) there may be some instances of the driver that can handle
unmapped I/O and some that can't. The isp(4) driver can handle
unmapped I/O, but the esp(4) driver currently cannot. The cdevsw
is shared among all driver instances.

So instead of setting a flag on the cdevsw, set a flag on the cdev.
This allows drivers to indicate support for unmapped I/O on a
per-instance basis.
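
The difference between a per-driver and a per-instance capability flag can be sketched in plain user-space C. This is only an illustration: the `toy_` structures and the flag value below are hypothetical stand-ins, not the kernel's real cdevsw and cdev layouts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the switch table shared by all instances. */
struct toy_cdevsw {
	const char *d_name;
};

#define TOY_SI_UNMAPPED	0x0001	/* illustrative value, not the kernel's */

/* Hypothetical stand-in for one device instance. */
struct toy_cdev {
	const struct toy_cdevsw *si_devsw;	/* shared by every instance */
	unsigned int si_flags;			/* private to this instance */
};

/* Decide per device, not per driver, whether unmapped I/O may be passed. */
static bool
toy_can_unmapped(const struct toy_cdev *dev)
{
	assert(dev != NULL);
	return ((dev->si_flags & TOY_SI_UNMAPPED) != 0);
}
```

Two instances that point at the same switch table can then give different answers, which a flag stored in the shared table cannot express.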

sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it
with an SI_UNMAPPED cdev flag.

kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine
whether or not a particular driver can handle
unmapped I/O.

geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs.
Since GEOM will create a temporary mapping when
needed, setting SI_UNMAPPED unconditionally will
work.

Remove the D_UNMAPPED_IO flag.

nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here
if NVME_UNMAPPED_BIO_SUPPORT is enabled.

vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a
cdev instead of the D_UNMAPPED_IO flag on the cdevsw.

sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from
setting the D_UNMAPPED_IO flag in the cdevsw to setting
SI_UNMAPPED in the cdev.

Reviewed by: kib, jimharris
MFC after: 1 week
Sponsored by: Spectra Logic


# 249930 26-Apr-2013 smh

Added a sysctl (kern.geom.dev.delete_max_sectors) to control the maximum
size of a delete request sent to the providing device performed by g_dev_ioctl.

This allows the kernel and apps via ioctl e.g. newfs -E to request large LBA
deletes, which significantly improves performance.

Previously this was hard coded to 65536 sectors, the new default is 262144
which doubles the throughput of deletes on commonly available SSD's.

In tests on an Intel 520 120GB FW: 400i disk it improved the delete throughput
from 1.6GB/s to over 2.6GB/s on a full disk delete such as that done via
newfs -E

For some SSD's where delete time is pretty much constant, no matter what
the request, setting this to 0 will provide significantly better throughput
e.g. Samsung 840 240GB FW DXT07B0Q @ 262144 = 79G/s, @ 0 = 2259G/s
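
The effect of the limit on the number of requests can be sketched as follows. This is a user-space illustration under stated assumptions, not the code in g_dev_ioctl: the function name is hypothetical, and treating a limit of 0 as "no cap" is inferred from the description above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many delete requests are issued for `length` bytes when each
 * request is capped at `max_sectors` sectors. A cap of 0 is assumed to
 * mean that the whole range goes down as a single request.
 */
static uint64_t
toy_delete_chunks(uint64_t length, uint32_t sectorsize, uint64_t max_sectors)
{
	uint64_t chunk, count = 0;

	assert(sectorsize != 0 && length % sectorsize == 0);
	while (length > 0) {
		chunk = length;
		if (max_sectors != 0 && chunk > max_sectors * sectorsize)
			chunk = max_sectors * sectorsize;
		length -= chunk;
		count++;
	}
	return (count);
}
```

For a 1 GiB range with 512-byte sectors, the old limit of 65536 sectors gives 32 requests, the new default of 262144 gives 8, and a limit of 0 gives one.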

Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks


# 249193 06-Apr-2013 trasz

Make it possible to submit FLUSH bios through geom_dev strategy. This
is required for CTL to work with device-backed LUNs.

Reviewed by: mav


# 248712 25-Mar-2013 kan

Do not pass unmapped buffers to drivers that cannot handle them

In physio, check if device can handle unmapped IO and pass an
appropriately mapped buffer to the driver strategy routine. The
only driver in the tree that can handle unmapped buffers is one
exposed by GEOM, so mark it as such with the new flag in the
driver cdevsw structure.

This fixes insta-panics on hosts, running dconschat, as /dev/fwmem
is an example of the driver that makes use of physio routine, but
bypasses the g_down thread, where the buffer gets mapped normally.

Discussed with: kib (earlier version)


# 248679 24-Mar-2013 mav

Fix long known deadlock between geom dev destruction and d_close() call.
Use destroy_dev_sched_cb() to not wait for device destruction while holding
GEOM topology lock (that actually caused deadlock). Use request counting
protected by mutex to properly wait for outstanding requests completion in
cases of device closing and geom destruction. Unlike r227009, this code
does not block taskqueue thread for indefinite time, waiting for completion.


# 243333 20-Nov-2012 jh

- Don't pass geom and provider names as format strings.
- Add __printflike() attributes.
- Remove an extra argument for the g_new_geomf() call in swapongeom_ev().

Reviewed by: pjd


# 242439 01-Nov-2012 alfred

Provide a device name in the sysctl tree for programs to query the
state of crashdump target devices.

This will be used to add a "-l" (ell) flag to dumpon(8) to list the
currently configured dumpdev.

Reviewed by: phk


# 239790 28-Aug-2012 ed

Remove unneeded G_PF_CANDELETE flag.

This flag is only used by GEOM so it can be propagated to the character
device's SI_CANDELETE. Unfortunately, SI_CANDELETE seems to do nothing.


# 238886 29-Jul-2012 mav

Implement media change notification for DA and CD removable media devices.
It includes three parts:
1) Modifications to CAM to detect media changes and report them to
disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes
Asynchronous Notification mechanism to receive events from hardware.
Active polling with TEST UNIT READY commands with 3 seconds period is used
for incapable hardware. After that both CD and DA drivers work the same way,
detecting two conditions: "NOT READY: Medium not present" after medium was
detected previously, and "UNIT ATTENTION: Not ready to ready change, medium
may have changed". First one reported to disk(9) as media removal, second
as media insert/change. To reliably receive second event new
AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by
generic error handling code in cam_periph_error().
2) Modifications to GEOM core to handle media remove and change events.
Media removal handled by spoiling all consumers attached to the provider.
Media change event also schedules provider retaste after spoiling to probe
new media. New flag G_CF_ORPHAN was added to consumers to reflect that
consumer is in process of destruction. It allows retaste to create new
geom instance of the same class, while previous one is still dying.
3) Modifications to some GEOM classes: DEV -- to report media change
events to devd; VFS -- to handle spoiling same as orphan to prevent
accessing replaced media. PART class already handles spoiling alike to
orphan.

Reviewed by: silence on geom@ and scsi@
Tested by: avg
Sponsored by: iXsystems, Inc. / PC-BSD
MFC after: 2 months


# 238171 06-Jul-2012 trasz

Fix typo in the comment.


# 227510 14-Nov-2011 mav

Temporarily revert r227009 to fix freeze on UP systems without PREEMPTION.

Before r215687, if some withered geom or provider could not be destroyed,
g_event thread went to sleep for 0.1s before retrying. After that change
it is just restarting immediately. r227009 made orphaned (withered) provider
to not detach immediately, but only after context switch. That made loop
inside g_event thread infinite on UP systems without PREEMPTION.

To address original problem with possible dead lock addressed by r227009
we have to fix r215687 change first, that needs some time to think and test.


# 227009 01-Nov-2011 mav

Make orphan() method in geom_dev asynchronous using destroy_dev_sched_cb()
instead of destroy_dev(). It moves device destruction waiting out of the
topology lock and so fixes dead lock between orphanization and closing.
Real provider and geom destruction called from swi context after device
destroyed as callback of the destroy_dev_sched_cb().


# 223089 14-Jun-2011 gibbs

Plumb device physical path reporting from CAM devices, through GEOM and
DEVFS, and make it accessible via the diskinfo utility.

Extend GEOM's generic attribute query mechanism into generic disk consumers.
sys/geom/geom_disk.c:
sys/geom/geom_disk.h:
sys/cam/scsi/scsi_da.c:
sys/cam/ata/ata_da.c:
- Allow disk providers to implement a new method which can override
the default BIO_GETATTR response, d_getattr(struct bio *). This
function returns -1 if not handled, otherwise it returns 0 or an
errno to be passed to g_io_deliver().

sys/cam/scsi/scsi_da.c:
sys/cam/ata/ata_da.c:
- Don't copy the serial number to dp->d_ident anymore, as the CAM XPT
is now responsible for returning this information via
d_getattr()->(a)dagetattr()->xpt_getattr().

sys/geom/geom_dev.c:
- Implement a new ioctl, DIOCGPHYSPATH, which returns the GEOM
attribute "GEOM::physpath", if possible. If the attribute request
returns a zero-length string, ENOENT is returned.

usr.sbin/diskinfo/diskinfo.c:
- If the DIOCGPHYSPATH ioctl is successful, report physical path
data when diskinfo is executed with the '-v' option.

Submitted by: will
Reviewed by: gibbs
Sponsored by: Spectra Logic Corporation

Add generic attribute change notification support to GEOM.

sys/sys/geom/geom.h:
Add a new attrchanged method field to both g_class
and g_geom.

sys/sys/geom/geom.h:
sys/geom/geom_event.c:
- Provide the g_attr_changed() function that providers
can use to advertise attribute changes.
- Perform delivery of attribute change notifications
from a thread context via the standard GEOM event
mechanism.

sys/geom/geom_subr.c:
Inherit the attrchanged method from class to geom (class instance).

sys/geom/geom_disk.c:
Provide disk_attr_changed() to provide g_attr_changed() access
to consumers of the disk API.

sys/cam/scsi/scsi_pass.c:
sys/cam/scsi/scsi_da.c:
sys/geom/geom_dev.c:
sys/geom/geom_disk.c:
Use attribute changed events to track updates to physical path
information.

sys/cam/scsi/scsi_da.c:
Add AC_ADVINFO_CHANGED to the registered asynchronous CAM
events for this driver. When this event occurs, and
the updated buffer type references our physical path
attribute, emit a GEOM attribute changed event via the
disk_attr_changed() API.

sys/cam/scsi/scsi_pass.c:
Add AC_ADVINFO_CHANGED to the registered asynchronous CAM
events for this driver. When this event occurs, update
the physical path devfs alias for this pass instance.

Submitted by: gibbs
Sponsored by: Spectra Logic Corporation


# 221400 03-May-2011 mav

Use make_dev_alias_p() added in r221397 to create alias dev entry.
It removes panic in case if alias name is already busy for some reason.


# 221071 26-Apr-2011 mav

- Add shim to simplify migration to the CAM-based ATA. For each new adaX
device in /dev/ create symbolic link with adY name, trying to mimic old ATA
numbering. Imitation is not complete, but should be enough in most cases to
mount file systems without touching /etc/fstab.
- To know what behavior to mimic, restore ATA_STATIC_ID option in cases
where it was present before.
- Add some more details to UPDATING.


# 219950 24-Mar-2011 mav

MFgraid/head r217827:
Change BIO_GETATTR("GEOM::kerneldump") API to make set_dumper() called by
consumer (geom_dev) instead of provider (geom_disk). This allows any geom
to insert its code into the dump call chain, implementing more sophisticated
functionality than just disk partitioning.


# 214063 19-Oct-2010 jh

Use make_dev_p(9) with the MAKEDEV_CHECKNAME flag instead of make_dev(9)
and print a diagnostic if the call fails.

This avoids a panic when a device with an invalid name is attempted to
be registered. For example the label class gets device names from
untrusted input.

Reviewed by: freebsd-geom


# 209062 11-Jun-2010 avg

fix a few cases where a string is passed via format argument instead of
via %s

Most of the cases looked harmless, but this is done for the sake of
correctness. In one case it even allowed to drop an intermediate buffer.
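
The pattern being fixed can be sketched in user-space C. The helper name is hypothetical, and the example uses snprintf() rather than any of the kernel functions touched by the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy an externally supplied name into buf as data, never as a format. */
static int
toy_format_name(char *buf, size_t size, const char *name)
{
	assert(buf != NULL && name != NULL && size > 0);
	/*
	 * The risky form would be snprintf(buf, size, name): any '%' in
	 * the name would then be interpreted as a conversion specifier.
	 */
	return (snprintf(buf, size, "%s", name));
}
```

With the "%s" form, a name that happens to contain a percent sign is copied through unchanged.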

Found by: clang
MFC after: 2 weeks


# 201139 28-Dec-2009 mav

Add BIO_DELETE support to ada(4):
- For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by
ACS-2 specification working draft.
- For CompactFlash use CFA ERASE command, same as ad(4) does.

With this patch, `newfs -E /dev/ada1` was able to restore write speed of
my heavily worn OCZ Vertex SSD (firmware 1.4) up to the initial level
for most of its capacity. Previous 1.3 firmware, even reporting the
TRIM capability bit set, was not working, reporting ABORT error for every
DSM command.

I have no idea whether it is normal, but for some reason it takes 200ms
to handle any TRIM command on this drive, which was making delete extremely
slow. But the TRIM command is able to accept a long list of LBAs, and the
length of that list does not seem to affect its execution time. The
implemented request clustering algorithm allowed me to raise the delete rate
up to reasonable numbers when many parallel DELETE requests are running.


# 200934 24-Dec-2009 mav

Add two disk ioctls, giving user-level tools information about disk/array
stripe (optimal access block) size and offset.


# 196964 08-Sep-2009 mav

Do not check proper request alignment here in geom_dev in production.
It will be checked any way later by g_io_check() in g_io_schedule_down().
It is only needed here to not trigger panic from additional check, when
INVARIANTS enabled. So cover it with #ifdef INVARIANTS. It saves two
64bit divisions per request.


# 195436 08-Jul-2009 marcel

Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c
is invalid because the ioctl happens without prior open. The ioctl
got introduced to provide backward compatibility for extended
partitions, but it ended up not being used because it didn't work
as expected. Since there are no consumers of the ioctl and the
implementation is broken, the best fix is to remove the code
entirely.

Spotted by: phk
Approved by: re (kensmith)


# 188839 20-Feb-2009 marcel

Provide compatibility symlink for logical partitions:
1. Extend geom_dev by having it create the symlink (i.e. call
make_dev_alias) based on the DIOCGPROVIDERALIAS ioctl.
In this way the functionality is generic and thus usable
by any geom/provider.
2. Have g_part handle said ioctl through the devalias method,
so that it's under control of the scheme itself. By design
the alias will not be created for newly added partitions.


# 187672 24-Jan-2009 ed

Remove unused unrhdr from GEOM character device module.

Now that make_dev() doesn't require unit numbers to be unique, there is
no need to use an unrhdr here to generate the numbers. Remove the entire
init-routine, because it is optional.


# 183381 26-Sep-2008 ed

Remove unit2minor() use from kernel code.

When I changed kern_conf.c three months ago I made device unit numbers
equal to (unneeded) device minor numbers. We used to require
bitshifting, because there were eight bits in the middle that were
reserved for a device major number. Not very long after I turned
dev2unit(), minor(), unit2minor() and minor2unit() into macros.
The unit2minor() and minor2unit() macros were no-ops.

We'd better not remove these four macros from the kernel, because there
is a lot of (external) code that may still depend on them. For now it's
harmless to remove all invocations of unit2minor() and minor2unit().

Reviewed by: kib


# 182843 07-Sep-2008 lulf

- Add a new ioctl for getting the provider name of a geom provider.
- Add a routine for looking up a device and checking if it is a valid geom
provider given a partial or full path to its device node.

Reviewed by: phk
Approved by: pjd (mentor)


# 179413 29-May-2008 ed

Remove the distinction between device minor and unit numbers.

Even though we got rid of device major numbers some time ago, device
drivers still need to provide unique device minor numbers to make_dev().
These numbers are only used inside the kernel. They are not related to
device major and minor numbers which are visible in devfs. These are
actually based on the inode number of the device.

It would eventually be nice to remove minor numbers entirely, but we
don't want to be too aggressive here.

Because the 8-15 bits of the device number field (si_drv0) are still
reserved for the major number, there is no 1:1 mapping of the device
minor and unit numbers. Because this is now unused, remove the
restrictions on these numbers.

The MAXMAJOR definition was actually used for two purposes. It was used
to convert both the userspace and kernelspace device numbers to their
major/minor pair, which is why it is now named UMINORMASK.

minor2unit() and unit2minor() have now become useless. Both minor() and
dev2unit() now serve the same purpose. We should eventually remove some
of them, at least turning them into macros. If devfs would become
completely minor number unaware, we could consider using si_drv0 directly,
just like si_drv1 and si_drv2.

Approved by: philip (mentor)


# 174674 16-Dec-2007 phk

Chop DIOCGDELETE from userland up in 1024 sector chunks to give geom_disk
or any other bio chopping geom a reasonable size of work.

Check for delivered signals between chunks, because the request size
and service time is unbounded.


# 174669 16-Dec-2007 phk

Don't limit BIO_DELETE requests to MAXPHYS, they perform no data
transfers, so they are not subject to the VM system limitation.


# 169284 05-May-2007 pjd

Implement three new ioctls that can be used with GEOM provider:

DIOCGFLUSH - Flush write cache (sends BIO_FLUSH).

DIOCGDELETE - Delete data (mark as unused) (sends BIO_DELETE).

DIOCGIDENT - Get provider's unique and fixed identifier (asks for
GEOM::ident attribute).

First two are self-explanatory, but the last one might not be. Here are
properties of provider's ident:

- ident value is preserved between reboots,
- provider can be detached/attached and ident is preserved,
- provider's name can change - ident can't,
- ident value should not be based on on-disk metadata; in other words
copying whole data from one disk to another should not yield the same
ident for the other disk,
- there could be more than one provider with the same ident, but only if
they point at exactly the same physical storage, this is the case for
multipathing for example,
- GEOM classes that consume single providers and provide single providers,
like geli, gbde, should just attach class name to the ident of the
underlying provider,
- ident is an ASCII string (is printable),
- ident is optional and applications can't rely on its presence.

The main purpose for this is that an application can remember a provider's ident
and once it tries to open provider by its name again, it may compare idents
to be sure this is the right provider. If it is not (idents don't match),
then it can open provider by its ident.
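
The lookup strategy described in the last paragraph can be sketched as a small user-space function. The provider table and the `toy_` names are hypothetical; a real application would obtain each ident through the DIOCGIDENT ioctl instead of a static table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider record: a device name and its ident string. */
struct toy_provider {
	const char *name;
	const char *ident;
};

/*
 * Try the remembered name first. If that name now belongs to a provider
 * with a different ident, fall back to searching by ident alone.
 */
static const struct toy_provider *
toy_find_provider(const struct toy_provider *tab, size_t n,
    const char *name, const char *ident)
{
	size_t i;

	assert(tab != NULL && name != NULL && ident != NULL);
	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0 &&
		    strcmp(tab[i].ident, ident) == 0)
			return (&tab[i]);
	for (i = 0; i < n; i++)
		if (strcmp(tab[i].ident, ident) == 0)
			return (&tab[i]);
	return (NULL);
}
```

If two disks swap names across a reboot, the first loop finds no match and the second loop still locates the right disk by its ident.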

OK'ed by: phk


# 167913 26-Mar-2007 kris

make_dev(9) can be (and is) called without Giant, so there is no need to
drop the topology lock and acquire Giant around this call.

Reviewed by: phk


# 167086 27-Feb-2007 jhb

Use pause() rather than tsleep() on stack variables and function pointers.


# 166934 23-Feb-2007 jhb

Use tsleep() rather than msleep() with a NULL mtx parameter.


# 159756 18-Jun-2006 simon

In g_dev_strategy(), when failing an IO request with EINVAL due to
offset or request size which is not a multiple of the sector size, make
sure that the bio is set to indicate that no data has actually been
transferred.

The result of this is that the file offset is no longer incremented for
these requests. The fact that the file offset was incremented broke
fdisk(8)'s probing of sector size for non-512 byte sector sizes.

Reviewed by: phk, cperciva
Submitted by: mdodd
MFC after: 2 weeks


# 143790 18-Mar-2005 phk

Avoid null pointer dereference.


# 143238 07-Mar-2005 phk

Add placeholder mutex argument to new_unrhdr().


# 138732 12-Dec-2004 phk

Pass the file->flags down to geom ioctl handlers.

Reject certain ioctls if write permission is not indicated.

Bump geom API version.

Reported by: Ruben de Groot <mail25@bzerk.org>


# 137048 29-Oct-2004 phk

Don't set si_bsize_phys, nobody cares.


# 137029 29-Oct-2004 phk

Give dev_strategy() an explicit cdev argument in preparation for removing
buf->b_dev.

Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.

Assert copyright on vfs_bio.c and update copyright message to canonical
text. There is no legal difference between John Dyson's two-clause
abbreviated BSD license and the canonical text.


# 136946 25-Oct-2004 phk

Use unit number allocation functions for GEOM minor numbers.


# 136940 25-Oct-2004 phk

Retire si_stripesize and si_stripeoffset they will not be needed in cdev
in the future.


# 136839 23-Oct-2004 phk

Don't call g_waitidle(), it happens automagically now.


# 135865 27-Sep-2004 pjd

Deny invalid I/O requests which come from userland here, because later
we'll get a panic.
MT5 candidate.

Reviewed by: phk


# 135716 24-Sep-2004 phk

Assert topology is held in g_dev_getprovider().

Don't call devsw(). It is not necessary, and we do not need to hold dev_lock
to compare the devsw pointer to our own since we do not dereference it.


# 133318 08-Aug-2004 phk

Tag all geom classes in the tree with a version number.


# 133314 08-Aug-2004 phk

Use default method initialization on geoms.


# 130712 19-Jun-2004 phk

Duplicate the securelevel check from spec_vnops.c here.


# 130651 17-Jun-2004 phk

Reduce the thaumaturgical level of root filesystem mounts: Instead of using
an otherwise redundant clone routine in geom_disk.c, mount a temporary
DEVFS and do a proper lookup.

Submitted by: thomas


# 130640 17-Jun-2004 phk

Second half of the dev_t cleanup.

The big lines are:
NODEV -> NULL
NOUDEV -> NODEV
udev_t -> dev_t
udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.


# 130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 126080 21-Feb-2004 phk

Device megapatch 4/6:

Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.


# 125755 12-Feb-2004 phk

Remove the absolute count g_access_abs() function since experience has
shown that it is not useful.

Rename the relative count g_access_rel() function to g_access(), only
the name has changed.

Change all g_access_rel() calls in our CVS tree to call g_access() instead.

Add an #ifndef BURN_BRIDGES #define of g_access_rel() for source
code compatibility.


# 124880 23-Jan-2004 phk

Add missing newline in printf.

Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>


# 121253 19-Oct-2003 phk

Remove KASSERT check for negative bio_offsets, add "normal" EIO
error return for same.


# 121030 12-Oct-2003 phk

Assume that bp->bio_offset is correctly initialized.

This fixes non-power-of-2 blocksize GEOM I/O.


# 119749 04-Sep-2003 phk

Make sure to return ENOIOCTL if the ioctl is not handled.


# 119660 01-Sep-2003 phk

Simplify the ioctl handling in GEOM.

This replaces the current ioctl processing with a direct call path
from geom_dev() where the ioctl arrives (from SPECFS) to any directly
connected GEOM class.

The inverse of the above is no longer supported. This is the
situation where you have one or more intervening GEOM classes, for
instance a BSDlabel on top of a MBR or PC98. If you want to issue
MBR or PC98 specific ioctls, you will need to issue them on a MBR
or PC98 providers.

This paves the way for inviting CD's, FD's and other special cases
inside GEOM.


# 119593 30-Aug-2003 phk

Add the new g_dev_getprovider() function, the swap_pager needs it now.

Spotted by: mr


# 118869 13-Aug-2003 phk

Replace a panic with a .1Hz retry loop.
Not a perfect solution, but far cheaper than one.


# 118355 02-Aug-2003 phk

Kick Giant compatibility one layer up.


# 116196 11-Jun-2003 obrien

Use __FBSDID().

Approved by: phk


# 115960 07-Jun-2003 phk

Improve the root-dev prompt facility for printing devices which could
possibly be a root filesystem.


# 115959 07-Jun-2003 phk

Wait for everything to settle before we try to print the list of
geom devices.


# 115515 31-May-2003 phk

Remove unused variables.

Found by: FlexeLint


# 115468 31-May-2003 phk

Remove the G_CLASS_INITIALIZER, we do not need it anymore.


# 114864 09-May-2003 phk

When a GEOM (/dev-)device is closed and we find that I/O requests are
still outstanding, give them a chance to complete.

If after 10 seconds we still find outstanding I/O requests, complete
the close with a console warning that the system is likely to panic
later on.

This is a workaround for umount -f not quite doing the right thing.

Approved by: re/scottl


# 114511 02-May-2003 phk

Back out all the stuff that didn't belong in the last commit.


# 114508 02-May-2003 phk

Use g_slice_spoiled() rather than g_std_spoiled().

Remember to free the buffer we got from g_read_data().


# 114216 29-Apr-2003 kan

Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on: standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>


# 112978 02-Apr-2003 phk

Properly handle races between open/close and orphan.

KASSERT the race between close and strategy, it is an error in the upper
echelons if this happens.

Add XXX: comment explaining why the ioctl/orphan race is not closed.


# 112552 24-Mar-2003 phk

Preemptively change initializations of struct g_class to use C99
sparse struct initializations before we extend the struct with
new OAM related member functions.


# 112367 18-Mar-2003 phk

Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a nested include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.


# 112030 09-Mar-2003 phk

Remove unneeded #include of geom_stats.h


# 112024 09-Mar-2003 phk

When a DEV class consumer is orphan'ed we need to wait for all the
outstanding requests to return before we unravel the mesh.

It is very important that the stuff below us plays nice and doesn't
overlook a couple of outstanding bio's, because until they remember
the geom event thread is blocked. At an expense in code here this
could be made more robust, but I actually _want_ a robust failure
in this case so any offending drivers can be fixed.


# 111815 03-Mar-2003 phk

Gigacommit to improve device-driver source compatibility between
branches:

Initialize struct cdevsw using C99 sparse initialization and remove
all initializations to default values.

This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.

Approved by: re(scottl)


# 111733 02-Mar-2003 phk

NO_GEOM cleanup:

Remove cdevsw->d_psize() implementation, we don't need it any more.


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 110728 11-Feb-2003 phk

Advertise MAXPHYS upwards, we will split as necessary before we get to the
bottom of things.


# 110710 11-Feb-2003 phk

Better names for struct disk elements: d_maxsize, d_stripeoffset
and d_stripesize.

Introduce si_stripesize and si_stripeoffset in struct cdev so we
can make them visible to clustering code.

Add stripesize and stripeoffset to providers.

DTRT with stripesize and stripeoffset in various places in GEOM.


# 110700 11-Feb-2003 phk

Use the SI_CANDELETE flag on the dev_t rather than the D_CANFREE flag
on the cdevsw to determine ability to handle the BIO_DELETE request.


# 110541 08-Feb-2003 phk

Move the g_stat struct to its own .h file, we will export it to other code.

Instead of embedding a struct g_stat in consumers and providers, merely
include a pointer.

Remove a couple of <sys/time.h> includes now unneeded.

Add a special allocator for struct g_stat. This allocator will allocate
entire pages and hand out g_stat functions from there. The "id" field
indicates free/used status.

Add "/dev/geom.stats" device driver which exports the pages from the
allocator to userland with mmap(2) in read-only mode.

This mmap(2) interface should be considered a non-public interface and
the functions in libgeom (not yet committed) should be used to access
the statistics data.


# 110540 08-Feb-2003 phk

Move #defines of major/minor to internal header file so other bits can
share and coordinate with geom_dev.


# 110523 07-Feb-2003 phk

Commit the correct copy of the g_stat structure.

Add debug.sizeof.g_stat sysctl.

Set the id field of the g_stat when we create consumers and providers.

Remove biocount from consumer, we will use the counters in the g_stat
structure instead. Replace one field which will need to be atomically
manipulated with two fields which will not (stat.nop and stat.nend).

Add companion field to bio_children: bio_inbed for the exact
same reason.

Don't output the biocount in the confdot output.

Fix KASSERT in g_io_request().

Add sysctl kern.geom.collectstats defaulting to off.

Collect the following raw statistics conditioned on this sysctl:

for each consumer and provider {
total number of operations started.
total number of operations completed.
time last operation completed.
sum of idle-time.
for each of BIO_READ, BIO_WRITE and BIO_DELETE {
number of operations completed.
number of bytes completed.
number of ENOMEM errors.
number of other errors.
sum of transaction time.
}
}

API for getting hold of these statistics data not included yet.


# 110517 07-Feb-2003 phk

Rename bio_linkage to the more obvious bio_parent.
Add bio_t0 timestamp, and include <sys/time.h> where needed


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 109253 14-Jan-2003 phk

Now that we have non-geom_disk based drivers, we need to cover for those,
in case they return EOPNOTSUPP on an ioctl.

Found by: jhb


# 109176 13-Jan-2003 phk

Always issue ioctls as BIO_GETATTR requests. The direction of data copies on
ioctls is no reliable indication of the ioctl's "set" or "get" nature or if
such simplistic categories can even be applied.

MFC candidate: boot0cfg issue.


# 109170 13-Jan-2003 phk

Remove g_silence(). It does not do anything anymore.


# 108552 02-Jan-2003 phk

Update si_bsize_phys on open.

MFC candidate.


# 108294 26-Dec-2002 phk

Add an XXX comment to explain the predicament.


# 107834 13-Dec-2002 phk

Add a couple of KASSERTS, just in case.


# 106300 01-Nov-2002 phk

Add KASSERT for bio_cmd validity here as well. Various hacks still
bypass specfs.


# 105947 25-Oct-2002 phk

Add a g_dev_print() function which prints all the /dev entries GEOM
knows about.


# 105941 25-Oct-2002 phk

Lose the g_dev_clone() noise.


# 105551 20-Oct-2002 phk

Now that the sectorsize and mediasize are properties of the provider,
don't take the detour over the I/O path to discover them using getattr(),
we can just pick them out directly.

Do note, though, that for now they are only valid after the first open
of the underlying disk device, due to compatibility with the old disk_create()
API. This will change in the future so they will always be valid.

Sponsored by: DARPA & NAI Labs.


# 105540 20-Oct-2002 phk

Use %jd instead of %lld now that we have it.


# 105452 19-Oct-2002 tmm

The argument to the DIOCGMEDIASIZE ioctl() is an off_t, not a u_int.

Reviewed by: phk


# 105180 15-Oct-2002 njl

Return an error if the drive reports heads/sectors that do not make sense.
This fixes a divide by zero in fdisk(8)

Reviewed by: phk
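
The kind of check described in this entry can be sketched as below. The helper name sketch_cylinders and its parameters are hypothetical, and the choice of EINVAL is an assumption; the point is only that a zero head or sector count is rejected before it can reach a division.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive a cylinder count from a total sector count,
 * refusing geometry that would make the divisor zero.
 */
static int
sketch_cylinders(uint64_t total_sectors, unsigned heads, unsigned sectors,
    uint64_t *cylinders)
{
	if (heads == 0 || sectors == 0)
		return (EINVAL);	/* nonsensical geometry */
	*cylinders = total_sectors / ((uint64_t)heads * sectors);
	return (0);
}
```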


# 104602 07-Oct-2002 phk

Copyin and copyout are only possible from a process-native thread,
and therefore we need a way for ioctl handlers to run in that thread
in GEOM. Rather than invent a complicated registration system to
recognize which ioctl handler to use for a given ioctl, we still
schedule all ioctls down the tree as bio transactions but add a
special return code that means "call me directly" and have the
geom_dev layer do that.

Use this for all ioctls that make it as far as a diskdriver to
avoid any backwards compatibility problems.

Requested by: scottl
Sponsored by: DARPA & NAI Labs
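
The "call me directly" idea in this entry can be illustrated with a small userland sketch. All names here (SKETCH_EDIRECT, sketch_ioctl_req, sketch_driver_start, sketch_dev_ioctl) are hypothetical and do not reproduce the kernel's identifiers; the direct function call to sketch_driver_start() stands in for sending a bio transaction down the tree and waiting for it to complete.

```c
#include <stddef.h>

/* Hypothetical special return code meaning "call me directly". */
#define SKETCH_EDIRECT	(-4)

struct sketch_ioctl_req {
	unsigned long cmd;
	void *data;
	/* Filled in by the lower layer when it wants a direct call. */
	int (*direct)(unsigned long cmd, void *data);
};

/* The driver's real ioctl handler; needs the caller's thread context. */
static int
sketch_driver_ioctl(unsigned long cmd, void *data)
{
	*(int *)data = (int)cmd + 1;	/* placeholder for the real work */
	return (0);
}

/*
 * Lower layer: reached via the I/O path, where copyin/copyout would not
 * be possible, so it only records the handler and asks to be called back.
 */
static int
sketch_driver_start(struct sketch_ioctl_req *req)
{
	req->direct = sketch_driver_ioctl;
	return (SKETCH_EDIRECT);
}

/* Device layer: runs in the calling process's own thread. */
static int
sketch_dev_ioctl(unsigned long cmd, void *data)
{
	struct sketch_ioctl_req req = { cmd, data, NULL };
	int error;

	error = sketch_driver_start(&req);
	if (error == SKETCH_EDIRECT && req.direct != NULL)
		error = req.direct(cmd, data);
	return (error);
}
```

The design avoids any registry of ioctl handlers: whichever layer ends up owning the request identifies itself by returning the special code along with its handler.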


# 104452 04-Oct-2002 phk

Properly isolate the locking domains of sysctl from the topology lock
for the sysctls which report the configuration.

Sponsored by: DARPA & NAI Labs.


# 104357 02-Oct-2002 phk

Put some failing ioctl related printfs under a suitable debug flag.

Sponsored by: DARPA & NAI Labs.


# 104316 01-Oct-2002 phk

Use the canonical root:operator 0640 for GEOM disk devices.

Spotted by: brooks
Sponsored by: DARPA & NAI Labs.


# 104087 28-Sep-2002 phk

Style, whitespace and lint fixes.

Sponsored by: DARPA & NAI Labs.


# 104060 27-Sep-2002 phk

Various no-ops:

Add a __unused.

Make the 2-byte decoder functions return 16 bits for the benefit
of picky lints.

No need to grab giant around a tsleep() when we have a timeout.

Sponsored by: DARPA & NAI Labs.


# 103670 20-Sep-2002 phk

Retire now unused DIOCGDVIRGIN kludge.

Sponsored by: DARPA & NAI Labs.


# 103004 06-Sep-2002 phk

Don't respect the O_EXCL flag, we don't get it back on close so we cannot
correctly track it.

Spotted by: peter
Sponsored by: DARPA & NAI Labs.


# 98066 09-Jun-2002 phk

Improve the naming somewhat.

Submitted by: iedowse


# 97075 21-May-2002 phk

Remove the "-class" suffix from classes, they will not be ambiguous.

Sponsored by: DARPA & NAI Labs.


# 96987 20-May-2002 phk

Don't grab Giant around malloc(9) and free(9).
Don't grab Giant around wakeup(9).
Don't print verbose messages about each device found in geom_dev.
Various cleanups.

Sponsored by: DARPA & NAI Labs.


# 95323 23-Apr-2002 phk

Implement the GEOMGETCONF ioctl which returns vital stats for the
current device in XML in an sbuf.

Sponsored by: DARPA & NAI Labs


# 95038 19-Apr-2002 phk

Make kernel dumps work with GEOM.

Notice that if the device on which the dump is set is destroyed for
any reason, the dump setting is lost. This in particular will
happen in the case of spoilage. For instance if you set dump on
ad0s1b and open ad0 for writing, ad0s* will be spoilt and the dump
setting lost. See geom(4) for more about spoiling.

Sponsored by: DARPA & NAI Labs.


# 94287 09-Apr-2002 phk

Implement DIOCGFRONTSTUFF ioctl which reports how many bytes from the start
of the device magic stuff might occupy.

Sponsored by: DARPA & NAI Labs.


# 94285 09-Apr-2002 phk

Various stylistic nit picking.

Sponsored by: DARPA & NAI Labs.


# 94175 08-Apr-2002 phk

In reverence of the 3rd X11 development rule:

3. The only thing worse than generalizing from one example
   is generalizing from no examples at all.

Remove the fwcylinders attribute before anybody gets the idea that we
alone have squared the circle.

Sponsored by: DARPA & NAI Labs.


# 93776 04-Apr-2002 phk

Move access and orphan member functions from class to geom.

Sponsored by: DARPA & NAI Labs


# 93326 28-Mar-2002 phk

In the absence of any smarter way to do this, cast various printf
arguments to silence printf format warnings.


# 93250 26-Mar-2002 phk

Eliminate some thread pointers which do not make sense anymore.

Split private parts of geom.h into geom_int.h. The latter should
never be included in class implementations.


# 93248 26-Mar-2002 phk

Cave in to tradition and rename "methods" to "classes".


# 92698 19-Mar-2002 phk

Add five GEOM oriented ioctls to get basic information about a geom device.


# 92479 17-Mar-2002 phk

Change the giant-dropping method a fair bit to keep WITNESS more
happy.


# 92408 16-Mar-2002 phk

Hmm, talk about optimizer-fodder. Make the DIOCGDVIRGIN hack work again.


# 92403 16-Mar-2002 phk

Add a generic and general ioctl pass-through mechanism.

It should now be possible to issue ioctls to SCSI CD drives.


# 92108 11-Mar-2002 phk

First commit of the GEOM subsystem to make it easier for people to
test and play with this.

This is not yet production quality and should be run only on dedicated
test boxes.

For people who want to develop transformations for GEOM there exists a
set of shims to run geom in userland (ask phk@freebsd.org).

Reports of all kinds to: phk@freebsd.org
Please include in report:
	dmesg
	sysctl debug.geomdot
	sysctl debug.geomconf

Known significant limitations:
	no kernel dump facility.
	ioctls severely restricted.

Sponsored by: DARPA, NAI Labs