History log of /freebsd-current/sys/kern/kern_physio.c
Revision Date Author Comments
# 473c90ac 10-May-2024 John Baldwin <jhb@FreeBSD.org>

uio: Use switch statements when handling UIO_READ vs UIO_WRITE

This is mostly to reduce the diff with CheriBSD which adds additional
constants to enum uio_rw, but also matches the normal style used for
uio_segflg.

Reviewed by: kib, emaste
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D45142


# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 7cd4984e 16-Sep-2022 Warner Losh <imp@FreeBSD.org>

SPDX: Not BSD-4-Clause

This is not BSD-4-Clause. It's closer to a modified BSD-2-Clause with 2
added clauses (and the first one has added clauses). Remove
SPDX-License-Idnetifier since this license doesn't match anything in
SPDX.


# 571a1a64 18-Apr-2021 Warner Losh <imp@FreeBSD.org>

Minor style tidy: if( -> if (

Fix a few 'if(' to be 'if (' in a few places, per style(9) and
overwhelming usage in the rest of the kernel / tree.

MFC After: 3 days
Sponsored by: Netflix


# 83f6b501 28-Nov-2020 Alexander Motin <mav@FreeBSD.org>

Remove alignment requirements for KVA buffer mapping.

After r368124 pbuf_zone has extra page to handle this particular case.


# cd853791 27-Nov-2020 Konstantin Belousov <kib@FreeBSD.org>

Make MAXPHYS tunable. Bump MAXPHYS to 1M.

Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225


# 16e4a0c8 15-Oct-2020 Brooks Davis <brooks@FreeBSD.org>

physio: Don't store user addresses in bio_data

Only assign the address from the iovec to bio_data if it is a kernel
address. This was the single place where bio_data stored (however
briefly) a userspace pointer.

Reviewed by: imp, markj
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26783


# 756a5412 14-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Allocate pager bufs from UMA instead of 80-ish mutex protected linked list.

o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many
pbufs are we going to have set.
In various subsystems that are going to utilize pbufs create private zones
via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
and sets a limit on created zone. After startup preallocate pbufs according
to requirements of all pbuf zones.

Subsystems that used to have a private limit with old allocator now have
private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
swap, vnode pager.

The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
aio(4). They should have their private limits, but changing that is out of
scope of this commit.

o Fetch tunable value of kern.nswbuf from init_param2() and while here move
NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
this option.
Default values aren't touched by this commit, but they probably should be
reviewed wrt to modern hardware.

This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.

Together with: gallatin
Tested by: pho


# cd6ba3f0 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

physio: avoid uninitialized variables


# 8a36da99 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/kern: adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# ae34b6ff 06-Apr-2016 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add four new RCTL resources - readbps, readiops, writebps and writeiops,
for limiting disk (actually filesystem) IO.

Note that in some cases these limits are not quite precise. It's ok,
as long as it's within some reasonable bounds.

Testing - and review of the code, in particular the VFS and VM parts - is
very welcome.

MFC after: 1 month
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5080


# c55f5707 17-Feb-2016 Warner Losh <imp@FreeBSD.org>

Create an API to reset a struct bio (g_reset_bio). This is mandatory
for all struct bio you get back from g_{new,alloc}_bio. Temporary
bios that you create on the stack or elsewhere should use this before
first use of the bio, and between uses of the bio. At the moment, it
is nothing more than a wrapper around bzero, but that may change in
the future. The wrapper also removes one place where we encode the
size of struct bio in the KBI.


# cb3450e2 29-Oct-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Add missing NULL check in physio().

When destroying a character device the si_devsw field is set to NULL
before all references are gone, to indicate the character device is
going away. This can cause a NULL-dereference fault inside physio().

The callers of physio() should own a thread reference on the cdev and
if si_devsw is seen as non-NULL, it is usable during the execution of
the function. Else an ENXIO error code is returned.

Reviewed by: kib
MFC after: 2 weeks


# 869fd29a 21-Apr-2015 Alexander Motin <mav@FreeBSD.org>

Rewrite physio() to not allocate pbufs for unmapped I/O.

pbufs is a limited resource, and their allocator is not SMP-scalable.
So instead of always allocating pbuf to immediately convert it to bio,
allocate bio just here. If buffer needs kernel mapping, then pbuf is
still allocated, but used only as a source of KVA and storage for a list
of held pages.

On 40-core system doing many 512-byte reads from user level to array of
raw SSDs this change removes huge lock congestion inside pbuf allocator.
It improves peak performance from ~300K to ~1.2M IOPS. On my previous
24-core system this problem also existed, but was less serious.

Reviewed by: kib
MFC after: 2 weeks


# 880e57b6 29-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Fix some issues in change 254760 pointed out by Bruce Evans:

- Remove excessive parenthesis
- Use KNF continuation indentation
- Cut down on excessive continuation lines
- More consistent style in messages
- Use uprintf() instead of printf()

Submitted by: bde


# aaea33e5 24-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Fix a printf format warning on 32-bit mips and powerpc.

Reported by: bde, gjb
Pointy hat to: ken


# 93729c17 23-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Add support to physio(9) for devices that don't want I/O split and
configure sa(4) to request no I/O splitting by default.

For tape devices, the user needs to be able to clearly understand
what blocksize is actually being used when writing to a tape
device. The previous behavior of physio(9) was that it would split
up any I/O that was too large for the device, or too large to fit
into MAXPHYS. This means that if, for instance, the user wrote a
1MB block to a tape device, and MAXPHYS was 128KB, the 1MB write
would be split into 8 128K chunks. This would be done without
informing the user.

This has suboptimal effects, especially when trying to communicate
status to the user. In the event of an error writing to a tape
(e.g. physical end of tape) in the middle of a 1MB block that has
been split into 8 pieces, the user could have the first two 128K
pieces written successfully, the third returned with an error, and
the last 5 returned with 0 bytes written. If the user is using
a standard write(2) system call, all he will see is the ENOSPC
error. He won't have a clue how much actually got written. (With
a writev(2) system call, he should be able to determine how much
got written in addition to the error.)

The solution is to prevent physio(9) from splitting the I/O. The
new cdev flag, SI_NOSPLIT, tells physio that the driver does not
want I/O to be split beforehand.

Although the sa(4) driver now enables SI_NOSPLIT by default,
that can be disabled by two loader tunables for now. It will not
be configurable starting in FreeBSD 11.0. kern.cam.sa.allow_io_split
allows the user to configure I/O splitting for all sa(4) driver
instances. kern.cam.sa.%d.allow_io_split allows the user to
configure I/O splitting for a specific sa(4) instance.

There are also now three sa(4) driver sysctl variables that let the
users see some sa(4) driver values. kern.cam.sa.%d.allow_io_split
shows whether I/O splitting is turned on. kern.cam.sa.%d.maxio shows
the maximum I/O size allowed by kernel configuration parameters
(e.g. MAXPHYS, DFLTPHYS) and the capabilities of the controller.
kern.cam.sa.%d.cpi_maxio shows the maximum I/O size supported by
the controller.

Note that a better long term solution would be to implement support
for chaining buffers, so that that MAXPHYS is no longer a limiting
factor for I/O size to tape and disk devices. At that point, the
controller and the tape drive would become the limiting factors.

sys/conf.h: Add a new cdev flag, SI_NOSPLIT, that allows a
driver to tell physio not to split up I/O.

sys/param.h: Bump __FreeBSD_version to 1000049 for the addition
of the SI_NOSPLIT cdev flag.

kern_physio.c: If the SI_NOSPLIT flag is set on the cdev, return
any I/O that is larger than si_iosize_max or
MAXPHYS, has more than one segment, or would have
to be split because of misalignment with EFBIG.
(File too large).

In the event of an error, print a console message to
give the user a clue about what happened.

scsi_sa.c: Set the SI_NOSPLIT cdev flag on the devices created
for the sa(4) driver by default.

Add tunables to control whether we allow I/O splitting
in physio(9).

Explain in the comments that allowing I/O splitting
will be deprecated for the sa(4) driver in FreeBSD
11.0.

Add sysctl variables to display the maximum I/O
size we can do (which could be further limited by
read block limits) and the maximum I/O size that
the controller can do.

Limit our maximum I/O size (recorded in the cdev's
si_iosize_max) by MAXPHYS. This isn't strictly
necessary, because physio(9) will limit it to
MAXPHYS, but it will provide some clarity for the
application.

Record the controller's maximum I/O size reported
in the Path Inquiry CCB.

sa.4: Document the block size behavior, and explain that
the option of allowing physio(9) to split the I/O
will disappear in FreeBSD 11.0.

Sponsored by: Spectra Logic


# ce625ec7 15-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Change the way that unmapped I/O capability is advertised.

The previous method was to set the D_UNMAPPED_IO flag in the cdevsw
for the driver. The problem with this is that in many cases (e.g.
sa(4)) there may be some instances of the driver that can handle
unmapped I/O and some that can't. The isp(4) driver can handle
unmapped I/O, but the esp(4) driver currently cannot. The cdevsw
is shared among all driver instances.

So instead of setting a flag on the cdevsw, set a flag on the cdev.
This allows drivers to indicate support for unmapped I/O on a
per-instance basis.

sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it
with an SI_UNMAPPED cdev flag.

kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine
whether or not a particular driver can handle
unmapped I/O.

geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs.
Since GEOM will create a temporary mapping when
needed, setting SI_UNMAPPED unconditionally will
work.

Remove the D_UNMAPPED_IO flag.

nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here
if NVME_UNMAPPED_BIO_SUPPORT is enabled.

vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a
cdev instead of the D_UNMAPPED_IO flag on the cdevsw.

sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from
setting the D_UNMAPPED_IO flag in the cdevsw to setting
SI_UNMAPPED in the cdev.

Reviewed by: kib, jimharris
MFC after: 1 week
Sponsored by: Spectra Logic


# d1e99f43 27-Mar-2013 Konstantin Belousov <kib@FreeBSD.org>

Add dev_strategy_csw() function, which is similar to dev_strategy()
but assumes that a thread reference was already obtained on the passed
device. Use the function from physio(), to avoid two extra dev_mtx
lock and unlock. Note that physio() is always used as the cdevsw
method, or is called from a cdevsw method, and the caller already owns
the reference.

dev_strategy() is left to keep KPI intact, but now it is implemented
as a wrapper around dev_strategy_csw().

Do some style cleanup in physio().

Requested and reviewed by: kan (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 31932fae 25-Mar-2013 Alexander Kabaev <kan@FreeBSD.org>

Do not pass unmapped buffers to drivers that cannot handle them

In physio, check if device can handle unmapped IO and pass an
appropriately mapped buffer to the driver strategy routine. The
only driver in the tree that can handle unmapped buffers is one
exposed by GEOM, so mark it as such with the new flag in the
driver cdevsw structure.

This fixes insta-panics on hosts, running dconschat, as /dev/fwmem
is an example of the driver that makes use of physio routine, but
bypasses the g_down thread, where the buffer gets mapped normally.

Discussed with: kib (earlier version)


# e81ff91e 19-Mar-2013 Konstantin Belousov <kib@FreeBSD.org>

Do not remap usermode pages into KVA for physio.

Sponsored by: The FreeBSD Foundation
Tested by: pho


# eea7f71c 25-Nov-2010 Konstantin Belousov <kib@FreeBSD.org>

Account i/o done on cdevs.

Reported and tested by: Adam Vande More <amvandemore gmail com>
MFC after: 1 week


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 9454b2d8 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for copyright notices, minor format tweaks as necessary


# c5690651 04-Nov-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove buf->b_dev field.


# 6afb3b1c 29-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Give dev_strategy() an explict cdev argument in preparation for removing
buf->b-dev.

Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.

Assert copyright on vfs_bio.c and update copyright message to canonical
text. There is no legal difference between John Dysons two-clause
abbreviated BSD license and the canonical text.


# 1a52a73d 23-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Eliminate DEV_STRATEGY() macro: call dev_strategy() directly.

Make dev_strategy() handle errors and departing devices properly.


# fad44dee 10-Aug-2004 Alan Cox <alc@FreeBSD.org>

Eliminate the acquisition and release of Giant within physio(). Remove
the spl calls.

Reviewed by: phk@
Discussed with: scottl@


# 89c9c53d 16-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 00cbe31b 15-Nov-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Send B_PHYS out to pasture, it no longer serves any function.


# 01758670 18-Oct-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Initialize b_iooffset before calling strategy


# f7e56e48 02-Aug-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Grab Giant in physio() since non-giant drivers are starting to appear.


# 677b542e 10-Jun-2003 David E. O'Brien <obrien@FreeBSD.org>

Use __FBSDID().


# ef38cda1 05-Apr-2003 Alan Cox <alc@FreeBSD.org>

Don't reinitialize fields that are already initialized by getpbuf().


# cdb06eda 05-Apr-2003 Alan Cox <alc@FreeBSD.org>

Sufficient access checks are performed by vmapbuf() that calling
useracc() is pointless. Remove the call to useracc() from physio().

Reviewed by: tegge


# 749ffa4e 13-Mar-2003 Jeff Roberson <jeff@FreeBSD.org>

- Add a lock for protecting against msleep(bp, ...) wakeup(bp) races.
- Create a new function bdone() which sets B_DONE and calls wakup(bp). This
is suitable for use as b_iodone for buf consumers who are not going
through the buf cache.
- Create a new function bwait() which waits for the buf to be done at a set
priority and with a specific wmesg.
- Replace several cases where the above functionality was implemented
without locking with the new functions.


# 2d5c7e45 20-Jan-2003 Matthew Dillon <dillon@FreeBSD.org>

Close the remaining user address mapping races for physical
I/O, CAM, and AIO. Still TODO: streamline useracc() checks.

Reviewed by: alc, tegge
MFC after: 7 days


# e2a3ea1c 02-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused second argument from DEV_STRATEGY().


# 2b7f24d2 11-Oct-2002 Mike Barcroft <mike@FreeBSD.org>

Change iov_base's type from `char *' to the standard `void *'. All
uses of iov_base which assume its type is `char *' (in order to do
pointer arithmetic) have been updated to cast iov_base to `char *'.


# 7f05b035 28-Jun-2002 Alfred Perlstein <alfred@FreeBSD.org>

More caddr_t removal, make fo_ioctl take a void * instead of a caddr_t.


# e96d018d 18-May-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Use btodb() macro.

Sponsored by: DARPA & NAI Labs.


# 9626b608 05-May-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


# c244d2de 02-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Move B_ERROR flag to b_ioflags and call it BIO_ERROR.

(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.


# b99c307a 20-Mar-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Rename the existing BUF_STRATEGY() to DEV_STRATEGY()

substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)

substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.


# 21144e3b 20-Mar-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd. The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue. It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users: Greg has not had time to test this yet, be careful.


# 02c58685 30-Oct-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Change useracc() and kernacc() to use VM_PROT_{READ|WRITE|EXECUTE} for the
"rw" argument, rather than hijacking B_{READ|WRITE}.

Fix two bugs (physio & cam) resulting by the confusion caused by this.

Submitted by: Tor.Egge@fast.no
Reviewed by: alc, ken (partly)


# 7179e74f 09-Oct-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Give physio a makeover.

- Let physio take read/write compatible args and have it use uio->uio_rw
to determine the direction.

- physread/physwrite are now #defines for physio

- Remove the inversly named minphys(), dev->si_iosize_max takes over.

- Physio() always uses pbufs.

- Fix the check for non page-aligned transfers, now only unaligned
transfers larger than (MAXPHYS - PAGE_SIZE) get fragmented (only
interesting for tapes using max blocksize).

- General wash-and-clean of code.

Constructive input from: bde


# 0b5c7391 08-Oct-1999 Brian Feldman <green@FreeBSD.org>

Add a newline to "WARNING: %s maxphys = 0 ??" so it doesn't trip up
syslogd. Note of course it's simply much more polite and correct, too :)


# dc722a14 02-Oct-1999 Søren Schmidt <sos@FreeBSD.org>

In some drivers we use two devices to be able to boot.
So if si_iosize_max is allready set, dont mess with it..

Also just log the problem with maxphys not being set once.

designed by: phk
tested by: sos


# 45604de3 02-Oct-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Fix a problem relating to si_iosize_max which broke scsi devices.


# c428d4c0 22-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Kill the cdevsw->d_maxio field.

d_maxio is replaced by the dev->si_iosize_max field which the driver
should be set in all calls to cdevsw->d_open if it has a better
idea than the system wide default.

The field is a generic dev_t field (ie: not disk specific) so that
tapes and other devices can use physio as well.


# f5756ee9 12-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove replace phygetvpbuf() with direct call to getpbuf();


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 60767bf4 21-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Use more compiler friendly test for overflow.

Submitted by: bde


# 3b782ee9 21-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Detect if the the offset used to read from a raw device loose bits
when converted to block number.


# 49ff4deb 14-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Spring cleaning around strategy and disklabels/slices:

Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
please see comment in sys/conf.h about the flag argument.

Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.

Remove bogus and unused strategy1 routines.

Remove open/close arguments from dssize(). Pick them up from dev_t.

Remove unused and unfinished setgeom support from diskslice/label/bad144 code.


# 67812eac 25-Jun-1999 Kirk McKusick <mckusick@FreeBSD.org>

Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.


# 4be2eb8c 08-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

I got tired of seeing all the cdevsw[major(foo)] all over the place.

Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.


# c48d1775 07-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce two functions: physread() and physwrite() and use these directly
in *devsw[] rather than the 46 local copies of the same functions.

(grog will do the same for vinum when he has time)


# b0eeea20 06-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

remove b_proc from struct buf, it's (now) unused.

Reviewed by: dillon, bde


# 57dc5948 05-Apr-1999 Peter Wemm <peter@FreeBSD.org>

Use the reference counted PHOLD()/PRELE() rather than P_PHYSIO.


# 1c7c3c6a 21-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
fixes, several VM optimizations, and some additional revamping of the
VM code. The specific bug fixes will be documented with additional
forced commits. This commit is somewhat rough in regards to code
cleanup issues.

Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>


# f5ef029e 25-Oct-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Nitpicking and dusting performed on a train. Removes trivial warnings
about unused variables, labels and other lint.


# e620a1cb 19-Aug-1998 Søren Schmidt <sos@FreeBSD.org>

Make struct buf->b_offset reflect the real byte offset which got
in via the uio struct. This enables device drivers to use != DEV_BSIZE
blocking on devices with wierd sector/block sizes (ie CDROM's).


# f7ea2f55 04-Jul-1998 Julian Elischer <julian@FreeBSD.org>

There is no such thing any more as "struct bdevsw".

There is only cdevsw (which should be renamed in a later edit to deventry
or something). cdevsw contains the union of what were in both bdevsw an
cdevsw entries. The bdevsw[] table stiff exists and is a second pointer
to the cdevsw entry of the device. it's major is in d_bmaj rather than
d_maj. some cleanup still to happen (e.g. dsopen now gets two pointers
to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw).

rawread()/rawwrite() went away as part of this though it's not strictly
the same patch, just that it involves all the same lines in the drivers.

cdroms no longer have write() entries (they did have rawwrite (?)).
tapes no longer have support for bdev operations.

Reviewed by: Eivind Eklund and Mike Smith
Changes suggested by eivind.


# aec0bcdf 03-Apr-1998 John Dyson <dyson@FreeBSD.org>

Perhaps fix a problem that some drivers have that they don't properly
initialize the b_kvasize element. This might fix some of the split
I/O requests that some people have.


# 08637435 28-Mar-1998 Bruce Evans <bde@FreeBSD.org>

Moved some #includes from <sys/param.h> nearer to where they are actually
used.


# 52c64c95 19-Mar-1998 John Dyson <dyson@FreeBSD.org>

In kern_physio.c fix tsleep priority messup.

In vfs_bio.c, remove b_generation count usage,
remove redundant reassignbuf,
remove redundant spl(s),
manage page PG_ZERO flags more correctly,
utilize in invalid value for b_offset until it
is properly initialized. Add asserts
for #ifdef DIAGNOSTIC, when b_offset is
improperly used.
when a process is not performing I/O, and just waiting
on a buffer generally, make the sleep priority
low.
only check page validity in getblk for B_VMIO buffers.

In vfs_cluster, add b_offset asserts, correct pointer calculation
for clustered reads. Improve readability of certain parts of
the code. Remove redundant spl(s).

In vfs_subr, correct usage of vfs_bio_awrite (From Andrew Gallatin
<gallatin@cs.duke.edu>). More vtruncbuf problems fixed.


# 50ce7ff4 23-Jan-1998 John Dyson <dyson@FreeBSD.org>

Add better support for larger I/O clusters, including larger physical
I/O. The support is not mature yet, and some of the underlying implementation
needs help. However, support does exist for IDE devices now.


# e4ba6a82 02-Sep-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes.


# eb776aea 25-Aug-1997 Bruce Evans <bde@FreeBSD.org>

Fixed some gratuitous ANSIisms.


# c0ecffb9 09-Aug-1997 John Dyson <dyson@FreeBSD.org>

Modify the scheduling policy to take into account disk I/O waits
as chargeable CPU usage. This should mitigate the problem of processes
doing disk I/O hogging the CPU. Various users have reported the
problem, and test code shows that the problem should now be gone.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 4b83b27f 25-Jun-1996 John Dyson <dyson@FreeBSD.org>

Fix a problem that caused system crashes after physio. This problem
was due to non-aligned 64K transfers taking 17 pages. We currently
do not support >16 page transfers. The transfer is unfortunately truncated,
but since buffers are usually malloced, this is a problem only once in
a while. Savecore is a culprit, but tar/cpio usually aren't. This
is NOT the final fix (which is likely a bouncing scheme), but will at
least keep the system from crashing.


# 6ba9ebce 13-Dec-1995 Julian Elischer <julian@FreeBSD.org>

devsw tables are now arrays of POINTERS to struct [cb]devsw
seems to work hre just fine though I can't check every file
that changed due to limmited h/w, however I've checked enught to be petty
happy withe hte code..

WARNING... struct lkm[mumble] has changed
so it might be an idea to recompile any lkm related programs


# efeaf95a 06-Dec-1995 David Greenman <dg@FreeBSD.org>

Untangled the vm.h include file spaghetti.


# 98d93822 02-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Completed function declarations and/or added prototypes.


# 6884d2aa 27-Nov-1995 Peter Wemm <peter@FreeBSD.org>

Implement read/write to kernel space - I use this for a self-loading/
self-decompressing ram disk that I'm fiddling with..

(Note, this depends on the various syscalls having correctly set uio_segflag
before calling physio - I've checked and they look correct.)


# 60039670 08-Sep-1995 Bruce Evans <bde@FreeBSD.org>

Fix benign type mismatches in devsw functions. 82 out of 299 devsw
functions were wrong.


# 9b2e5354 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


# b5e8ce9f 16-Mar-1995 Bruce Evans <bde@FreeBSD.org>

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 0d94caff 09-Jan-1995 David Greenman <dg@FreeBSD.org>

These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.

Submitted by: John Dyson and David Greenman


# bb56ec4a 25-Sep-1994 Poul-Henning Kamp <phk@FreeBSD.org>

While in the real world, I had a bad case of being swapped out for a lot of
cycles. While waiting there I added a lot of the extra ()'s I have, (I have
never used LISP to any extent). So I compiled the kernel with -Wall and
shut up a lot of "suggest you add ()'s", removed a bunch of unused var's
and added a couple of declarations here and there. Having a lap-top is
highly recommended. My kernel still runs, yell at me if you kernel breaks.


# f23b4c91 18-Aug-1994 Garrett Wollman <wollman@FreeBSD.org>

Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.


# 7d7bb69d 08-Aug-1994 David Greenman <dg@FreeBSD.org>

Detect the "EOF" condition. Specifically, end of partition.

Submitted by: Bruce Evans


# a481f200 07-Aug-1994 David Greenman <dg@FreeBSD.org>

Provide support for upcoming merged VM/buffer cache, and fixed a few bugs
that haven't appeared to manifest themselves (yet).

Submitted by: John Dyson


# 16f62314 06-Aug-1994 David Greenman <dg@FreeBSD.org>

Incorporated post 1.1.5 work from John Dyson. This includes performance
improvements via the new routines pmap_qenter/pmap_qremove and pmap_kenter/
pmap_kremove. These routine allow fast mapping of pages for those
architectures that have "normal" MMUs. Also included is a fix to the
pageout daemon to properly check a queue end condition.

Submitted by: John Dyson


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# 26f9a767 25-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources