History log of /freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
Revision Date Author Comments
# 330987 15-Mar-2018 avg

MFC r322245,r329717: MFV r322242: 8373 TXG_WAIT in ZIL commit path

MFC r322245: MFV r322242: 8373 TXG_WAIT in ZIL commit path
MFC r329717: MFV r329715: 8997 ztest assertion failure in zil_lwb_write_issue


# 330065 27-Feb-2018 avg

MFC r329016: remove a duplicate assignment


# 330063 27-Feb-2018 avg

MFC r328881: zfs: move a utility function, ioflags, closer to its consumers


# 326428 01-Dec-2017 avg

MFC r326070: zfs_write: fix problem with writes appearing to succeed when over quota

The problem happens when the writes have offsets and sizes aligned with
a filesystem's recordsize (maximum block size). In this scenario
dmu_tx_assign() would fail because of being over the quota, but the uio
would already be modified in the code path where we copy data from the
uio into a borrowed ARC buffer. That gives the appearance of a partial
write, so zfs_write() would return success and the uio would be modified
consistently with writing a single block.

That bug can result in data loss because the writes over the quota
would appear to succeed while the actual data is being discarded.

This commit fixes the bug by ensuring that the uio is not changed until
after all error checks are done. To achieve that the code now uses
uiocopy() + uioskip() as in the original illumos design. We can do that
now that uiocopy() has been updated in r326067 to use
vn_io_fault_uiomove().


# 324204 02-Oct-2017 avg

MFC r323918: MFV r323917: 8648 Fix range locking in ZIL commit codepath

This fixes a problem introduced in r320496, MFC of r308782.


# 324159 01-Oct-2017 avg

MFC r323522: slightly simplify zfs_vptocnp


# 319416 01-Jun-2017 avg

MFC r319096: zfs_lookup: fix bogus arguments to lookup of "snapshot" directory


# 316848 14-Apr-2017 avg

MFC r315853: zfs_putpages: use TXG_WAIT


# 315844 23-Mar-2017 avg

MFC r314048,r314194: reimplement zfsctl (.zfs) support


# 314711 05-Mar-2017 mm

MFC r314572:

Fix null pointer dereference in zfs_freebsd_setacl().

Prevents unprivileged users from panicking the kernel by calling
__acl_delete_*() on files or directories inside a ZFS mount.


# 314029 21-Feb-2017 avg

MFC r313686: check remaining space in zfs implementations of vptocnp

PR: 216939


# 310067 14-Dec-2016 avg

MFC r308887,309090: fix unsafe modification of zfs_vnodeops when
DIAGNOSTIC is enabled


# 307996 27-Oct-2016 avg

MFC r306801: implement zfs_vptocnp() using z_parent property


# 307672 20-Oct-2016 kib

MFC r307218:
Fix a race in vm_page_busy_sleep(9).


# 307143 12-Oct-2016 avg

MFC r306665: zfs: fix a wrong assertion for extended attributes

PR: 213112


# 306819 07-Oct-2016 avg

MFC r306292: fix vnode lock assertion for extended attributes directory


# 304671 23-Aug-2016 avg

MFC r303763,303791,303869: zfs: honour and make use of vfs vnode locking protocol

PR: 209158


# 304121 15-Aug-2016 avg

MFC r302839: 6940 Cannot unlink directories when over quota


# 302732 13-Jul-2016 avg

MFC r299906,301870: add zfs_vptocnp with special handling for snapshots
under .zfs

Note that the change is adjusted for the lack of LK_VNHELD in this
branch.


# 302721 13-Jul-2016 avg

MFC r298105: zfs: enable vn_io_fault support


# 301695 08-Jun-2016 ngie

MFC r300870,r300884:

r300870:

Unbreak the zfs(4) build

vm/vm_pageout.h grew a dependency on the bool typedef in r300865

arc.c didn't include sys/types.h, which included the definition for the typedef

Other items (ofed, drm2) might need to be chased for this commit.

Pointyhat to: alc

r300884:

Fix up r300870

The sys/types.h fix I proposed was only tested with zfs(4), not with
libzpool, which is where the build failure actually existed

Remove vm/vm_pageout.h from arc.c and zfs_vnops.c because they're both
unneeded

In collaboration with: kib


# 297112 20-Mar-2016 mav

MFC r296519: MFV r296518: 5027 zfs large block support (add copyright)

Author: Matthew Ahrens <matt@mahrens.org>

illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9


# 297096 20-Mar-2016 mav

MFC r294803: MFV r294802: 6334 Cannot unlink files when over quota

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Simon Klinkert <simon.klinkert@gmail.com>

illumos/illumos-gate@6575bca01367958c7237253d88e5fa9ef0b1650a


# 297084 20-Mar-2016 mav

MFV r258597 (by pjd):
When the append-only, immutable or read-only flag is set, don't allow
hard link creation. This matches UFS behaviour.

Reported by: Oleg Ginzburg <olevole@olevole.ru>


# 297083 20-Mar-2016 mav

MFC r262990: MFV r262983:

4638 Panic in ZFS via rfs3_setattr()/rfs3_write(): dirtying snapshot!

illumos/illumos-gate@2144b121c08e0eb676cc6ca4662ebbc9f9c22fe3


# 297077 20-Mar-2016 mav

MFC r277300 (by smh): Mechanically convert cddl sun #ifdef's to illumos

Since the upstream for cddl code is now illumos not sun, mechanically
convert all sun #ifdef's to illumos #ifdef's which have been used in all
newer code for some time.

Also do a manual pass to correct the use of #ifdef comments as per style(9)
as well as a few uses of #if defined(__FreeBSD__) vs #ifndef illumos.


# 290765 13-Nov-2015 mav

MFC r289562: 6328 Fix cstyle errors in zfs codebase

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Jorgen Lundman <lundman@lundman.net>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>

illumos/illumos-gate@9a686fbc186e8e2a64e9a5094d44c7d6fa0ea167


# 288590 03-Oct-2015 mav

MFC r287103 (by avg): 5692 expose the number of hole blocks in a file

FreeBSD porting notes:
- only kernel-side changes are merged
- the new ioctl is not actually implemented yet
- thus, the goal is to synchronize DMU code

illumos/illumos-gate@2bcf0248e992f292c7b814458bcdce2f004925d6

https://www.illumos.org/issues/5692
We would like to expose the number of hole (sparse) blocks in a file.
This can be useful, for example, if you want to fill in the holes with
some data; knowing the number of holes in advance allows you to report
progress on hole filling. We could use SEEK_HOLE to do that but it would
be O(n) where n is the number of holes present in the file.

Author: Max Grossman <max.grossman@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Boris Protopopov <bprotopopov@hotmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>


# 288571 03-Oct-2015 mav

MFC r286705: 5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>

While running 'zfs recv' we noticed that every 128th 8K block required a
read. We were seeing that restore_write() was calling dmu_tx_hold_write()
and the indirect block was not cached. We should prefetch upcoming indirect
blocks to avoid having to go to disk and blocking the restore_write().

Allow an incremental send stream to be received as a clone, even if the
stream does not mark it as a clone.


# 276899 09-Jan-2015 delphij

MFC r264392 (davide):

Fix a panic in zfs_rename().
This is due to a wrong dereference of a vnode when it's not locked and
can be (potentially) recycled. 'sdvp' cannot be locked on zfs_rename()
entry point because the VFS can't be sure that this scenario is
LOR-free (it might violate the parent->child lock acquisition rule).
Dereference 'tdvp' instead, which is already locked on entry, and access
'sdvp' fields only when it's safe, i.e. under ZFS_ENTER scope.

While at it, remove the usage of VOP_REALVP, since it is a NOP
on FreeBSD.


# 276648 03-Jan-2015 kib

MFC r276007:
Handle MAKEENTRY cnp flag in the VOP_CREATE().


# 276500 01-Jan-2015 kib

MFC r275897:
Set NOCACHE flag for CREATE namei() calls, do not specially handle
MAKEENTRY in VOP_LOOKUP().


# 276081 22-Dec-2014 delphij

MFC r274337,r274673,274681,r275515:

ZFS large block support. The default recordsize remains at 128KB.

A new tunable/sysctl variable, vfs.zfs.max_recordsize is added to
allow adjusting the permitted maximum record size, or
zfs_max_recordsize, with a default of 1MB. ZFS will not allow
setting recordsize greater than zfs_max_recordsize as a safety
belt, because larger recordsize means greater read and write
latency and more memory usage.

Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's okay to enable the feature on
the pool).

A limited safety belt is provided for a mounted root filesystem, but use
caution when using a larger value.

Illumos issue:
5027 zfs large block support


# 275901 18-Dec-2014 avg

MFC r275401: zfs_putpages: actually update mtime and ctime


# 273509 22-Oct-2014 delphij

MFC r272809: MFV r272803:

Illumos issue:
5175 implement dmu_read_uio_dbuf() to improve cached read performance


# 272676 07-Oct-2014 araujo

Make external NFS clients know when files have their attributes changed and
avoid caching the file's state indefinitely. The va_filerev is what is sent
to the client as the "change" attribute. The client periodically fetches
the attributes, and without this change the attribute remains some garbage
value.

Phabric: D905
Reported by: Kevin Buhr <buhr@asaurus.net>
Reviewed by: rmacklem, delphij
Approved by: delphij
Obtained from: r272467
Sponsored by: QNAP Systems Inc.


# 272134 25-Sep-2014 delphij

MFC r271536: MFV r271518:

Correctly report hole at end of file.

When asked to find a hole, the DMU sees that there are no holes in the
object, and returns ESRCH. The ZPL interprets this as "no holes before
the end of the file", and therefore inserts the "virtual hole" at the
end of the file. Because DMU and ZPL have different ideas of where the
end of an object/file is, we will end up returning the end of file,
which is generally larger, instead of returning the end of object.

The fix is to handle the "virtual hole" in the DMU. If no hole is found,
the DMU will return a hole at the end of the file, rather than an error.

Illumos issue:
5139 SEEK_HOLE failed to report a hole at end of file

Approved by: re (gjb)


# 269061 24-Jul-2014 mav

MFC r268420:
Remove IO_SYNC flag when writing extended file attributes on ZFS.

While it is possible to create and write a file, modify its permissions, etc.
without ever doing a sync, it looks odd that one is required for setting
extended file attributes on ZFS. UFS does not do a sync there either.

Samba uses those extended attributes to store some of its data, and doing so
synchronously reduces file creation performance many times over on systems
without a SLOG device.


# 269002 22-Jul-2014 delphij

MFC r268464: MFV r268452:

Explicitly mark file removal transactions as "presumed to result
in a net free of space" so they will not fail with ENOSPC.

Illumos issue: 4950 files sometimes can't be removed from a full
filesystem


# 262112 17-Feb-2014 avg

MFC r260704,260717: zfs: getnewvnode_reserve must be called outside of a
zfs transaction


# 262096 17-Feb-2014 avg

MFC r260706: zfs_deleteextattr: name buffer from namei is needed by zfs_remove


# 260786 16-Jan-2014 avg

MFC r258744-258746: zfs: add zfs_freebsd_putpages


# 260776 16-Jan-2014 avg

MFC r258720: MFV r258665: 4347 ZPL can use dmu_tx_assign(TXG_WAIT)


# 260773 16-Jan-2014 avg

MFC r258739: zfs mappedread_sf: assert that a page is never partially valid


# 260763 16-Jan-2014 avg

MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler
performance work

Sponsored by: HybridCluster [merge]


# 258563 25-Nov-2013 avg

MFC r258353: zfs page_busy: fix the boundaries of the cleared range

This is a fix for a regression introduced in r246293.

vm_page_clear_dirty expects the range to have DEV_BSIZE aligned boundaries,
otherwise it extends them. Thus it can happen that the whole page is
marked clean while actually having some small dirty region(s).
This commit makes the range properly aligned and ensures that only
the clean data is marked as such.

It would be interesting to evaluate how much benefit clearing with DEV_BSIZE
granularity produces. Perhaps instead we should clear the whole page
when it is completely overwritten and not bother clearing any bits
if only a portion of a page is written.

Reviewed by: kib
Approved by: re (gjb)

