History log of /freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
Revision	Date	Author	Comments
# 330987	15-Mar-2018	avg	MFC r322245,r329717: MFV r322242: 8373 TXG_WAIT in ZIL commit path MFC r322245: MFV r322242: 8373 TXG_WAIT in ZIL commit path MFC r329717: MFV r329715: 8997 ztest assertion failure in zil_lwb_write_issue
# 330065	27-Feb-2018	avg	MFC r329016: remove a duplicate assignment
# 330063	27-Feb-2018	avg	MFC r328881: zfs: move a utility function, ioflags, closer to its consumers
# 326428	01-Dec-2017	avg	MFC r326070: zfs_write: fix problem with writes appearing to succeed when over quota The problem happens when the writes have offsets and sizes aligned with a filesystem's recordsize (maximum block size). In this scenario dmu_tx_assign() would fail because of being over the quota, but the uio would already be modified in the code path where we copy data from the uio into a borrowed ARC buffer. That makes an appearance of a partial write, so zfs_write() would return success and the uio would be modified consistently with writing a single block. That bug can result in a data loss because the writes over the quota would appear to succeed while the actual data is being discarded. This commit fixes the bug by ensuring that the uio is not changed until after all error checks are done. To achieve that the code now uses uiocopy() + uioskip() as in the original illumos design. We can do that now that uiocopy() has been updated in r326067 to use vn_io_fault_uiomove().
# 324204	02-Oct-2017	avg	MFC r323918: MFV r323917: 8648 Fix range locking in ZIL commit codepath This fixes a problem introduced in r320496, MFC of r308782.
# 324159	01-Oct-2017	avg	MFC r323522: slightly simplify zfs_vptocnp
# 319416	01-Jun-2017	avg	MFC r319096: zfs_lookup: fix bogus arguments to lookup of "snapshot" directory
# 316848	14-Apr-2017	avg	MFC r315853: zfs_putpages: use TXG_WAIT
# 315844	23-Mar-2017	avg	MFC r314048,r314194: reimplement zfsctl (.zfs) support
# 314711	05-Mar-2017	mm	MFC r314572: Fix null pointer dereference in zfs_freebsd_setacl(). Prevents unprivileged users from panicking the kernel by calling __acl_delete_*() on files or directories inside a ZFS mount.
# 314029	21-Feb-2017	avg	MFC r313686: check remaining space in zfs implementations of vptocnp PR: 216939
# 310067	14-Dec-2016	avg	MFC r308887,309090: fix unsafe modification of zfs_vnodeops when DIAGNOSTIC is enabled
# 307996	27-Oct-2016	avg	MFC r306801: implement zfs_vptocnp() using z_parent property
# 307672	20-Oct-2016	kib	MFC r307218: Fix a race in vm_page_busy_sleep(9).
# 307143	12-Oct-2016	avg	MFC r306665: zfs: fix a wrong assertion for extended attributes PR: 213112
# 306819	07-Oct-2016	avg	MFC r306292: fix vnode lock assertion for extended attributes directory
# 304671	23-Aug-2016	avg	MFC r303763,303791,303869: zfs: honour and make use of vfs vnode locking protocol PR: 209158
# 304121	15-Aug-2016	avg	MFC r302839: 6940 Cannot unlink directories when over quota
# 302732	13-Jul-2016	avg	MFC r299906,301870: add zfs_vptocnp with special handling for snapshots under .zfs Note that the change is adjusted for the lack of LK_VNHELD in this branch.
# 302721	13-Jul-2016	avg	MFC r298105: zfs: enable vn_io_fault support
# 301695	08-Jun-2016	ngie	MFC r300870,r300884: r300870: Unbreak the zfs(4) build vm/vm_pageout.h grew a dependency on the bool typedef in r300865 arc.c didn't include sys/types.h, which included the definition for the typedef Other items (ofed, drm2) might need to be chased for this commit. Pointyhat to: alc r300884: Fix up r300870 The sys/types.h fix I proposed was only tested with zfs(4), not with libzpool, which is where the build failure actually existed Remove vm/vm_pageout.h from arc.c and zfs_vnops.c because they're both unneeded In collaboration with: kib
# 297112	20-Mar-2016	mav	MFC r296519: MFV r296518: 5027 zfs large block support (add copyright) Author: Matthew Ahrens <matt@mahrens.org> illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9
# 297096	20-Mar-2016	mav	MFC r294803: MFV r294802: 6334 Cannot unlink files when over quota Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Simon Klinkert <simon.klinkert@gmail.com> illumos/illumos-gate@6575bca01367958c7237253d88e5fa9ef0b1650a
# 297084	20-Mar-2016	mav	MFV r258597 (by pjd): When append-only, immutable or read-only flag is set don't allow for hard links creation. This matches UFS behaviour. Reported by: Oleg Ginzburg <olevole@olevole.ru>
# 297083	20-Mar-2016	mav	MFC r262990: MFV r262983: 4638 Panic in ZFS via rfs3_setattr()/rfs3_write(): dirtying snapshot! illumos/illumos-gate@2144b121c08e0eb676cc6ca4662ebbc9f9c22fe3
# 297077	20-Mar-2016	mav	MFC r277300 (by smh): Mechanically convert cddl sun #ifdef's to illumos Since the upstream for cddl code is now illumos not sun, mechanically convert all sun #ifdef's to illumos #ifdef's which have been used in all newer code for some time. Also do a manual pass to correct the use of #ifdef comments as per style(9) as well as a few uses of #if defined(__FreeBSD__) vs #ifndef illumos.
# 290765	13-Nov-2015	mav	MFC r289562: 6328 Fix cstyle errors in zfs codebase Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed by: Jorgen Lundman <lundman@lundman.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Paul Dagnelie <pcd@delphix.com> illumos/illumos-gate@9a686fbc186e8e2a64e9a5094d44c7d6fa0ea167
# 288590	03-Oct-2015	mav	MFC r287103 (by avg): 5692 expose the number of hole blocks in a file FreeBSD porting notes: - only kernel-side changes are merged - the new ioctl is not actually implemented yet - thus, the goal is to synchronize DMU code illumos/illumos-gate@2bcf0248e992f292c7b814458bcdce2f004925d6 https://www.illumos.org/issues/5692 We would like to expose the number of hole (sparse) blocks in a file. This can be useful, for example, if you want to fill in the holes with some data; knowing the number of holes in advance allows you to report progress on hole filling. We could use SEEK_HOLE to do that but it would be O(n) where n is the number of holes present in the file. Author: Max Grossman <max.grossman@delphix.com> Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Boris Protopopov <bprotopopov@hotmail.com> Approved by: Richard Lowe <richlowe@richlowe.net>
# 288571	03-Oct-2015	mav	MFC r286705: 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin= Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com> While running 'zfs recv' we noticed that every 128th 8K block required a read. We were seeing that restore_write() was calling dmu_tx_hold_write() and the indirect block was not cached. We should prefetch upcoming indirect blocks to avoid having to go to disk and blocking the restore_write(). Allow an incremental send stream to be received as a clone, even if the stream does not mark it as a clone.
# 276899	09-Jan-2015	delphij	MFC r264392 (davide): Fix a panic in zfs_rename(). this is due to a wrong dereference of a vnode when it's not locked and can be (potentially) recycled. 'sdvp' cannot be locked on zfs_rename() entry point because the VFS can't be sure that this scenario is LOR-free (it might violate the parent->child lock acquisition rule). Dereference 'tdvp' instead, which is already locked on entry, and access 'sdvp' fields only when it's safe, i.e. under ZFS_ENTER scope. While at it, remove the usage of VOP_REALVP, as long as this is a NOP on FreeBSD.
# 276648	03-Jan-2015	kib	MFC r276007: Handle MAKEENTRY cnp flag in the VOP_CREATE().
# 276500	01-Jan-2015	kib	MFC r275897: Set NOCACHE flag for CREATE namei() calls, do not specially handle MAKEENTRY in VOP_LOOKUP().
# 276081	22-Dec-2014	delphij	MFC r274337,r274673,274681,r275515: ZFS large block support. The default recordsize remains at 128KB. A new tunable/sysctl variable, vfs.zfs.max_recordsize is added to allow adjusting the permitted maximum record size, or zfs_max_recordsize, with a default of 1MB. ZFS will not allow setting recordsize greater than zfs_max_recordsize as a safety belt, because larger recordsize means greater read and write latency and more memory usage. Please note that booting from datasets that have recordsize greater than 128KB is not supported (but it's Okay to enable the feature on the pool). Limited safety belt is provided for mounted root filesystem but use caution when using a larger value. Illumos issue: 5027 zfs large block support
# 275901	18-Dec-2014	avg	MFC r275401: zfs_putpages: actually update mtime and ctime
# 273509	22-Oct-2014	delphij	MFC r272809: MFV r272803: Illumos issue: 5175 implement dmu_read_uio_dbuf() to improve cached read performance
# 272676	07-Oct-2014	araujo	Make external NFS clients know when files have their attributes changed and avoid caching the file's state indefinitely. The va_filerev is what is sent to the client as the "change" attribute; the client periodically fetches the attributes, and without this option the attribute remains as some garbage value. Phabric: D905 Reported by: Kevin Buhr <buhr@asaurus.net> Reviewed by: rmacklem, delphij Approved by: delphij Obtained from: r272467 Sponsored by: QNAP Systems Inc.
# 272134	25-Sep-2014	delphij	MFC r271536: MFV r271518: Correctly report hole at end of file. When asked to find a hole, the DMU sees that there are no holes in the object, and returns ESRCH. The ZPL interprets this as "no holes before the end of the file", and therefore inserts the "virtual hole" at the end of the file. Because DMU and ZPL have different ideas of where the end of an object/file is, we will end up returning the end of file, which is generally larger, instead of returning the end of object. The fix is to handle the "virtual hole" in the DMU. If no hole is found, the DMU will return a hole at the end of the file, rather than an error. Illumos issue: 5139 SEEK_HOLE failed to report a hole at end of file Approved by: re (gjb)
# 269061	24-Jul-2014	mav	MFC r268420: Remove IO_SYNC flag when writing extended file attributes on ZFS. While it is possible to create and write a file, modify its permissions, etc. without ever doing sync, it looks odd that it is required for setting extended file attributes on ZFS. UFS does not do sync there either. Samba uses those extended attributes to store some of its data, and doing it synchronously reduces file creation performance many times over for systems without a SLOG device.
# 269002	22-Jul-2014	delphij	MFC r268464: MFV r268452: Explicitly mark file removal transactions as "presumed to result in a net free of space" so they will not fail with ENOSPC. Illumos issue: 4950 files sometimes can't be removed from a full filesystem
# 262112	17-Feb-2014	avg	MFC r260704,260717: zfs: getnewvnode_reserve must be called outside of a zfs transaction
# 262096	17-Feb-2014	avg	MFC r260706: zfs_deleteextattr: name buffer from namei is needed by zfs_remove
# 260786	16-Jan-2014	avg	MFC r258744-258746: zfs: add zfs_freebsd_putpages
# 260776	16-Jan-2014	avg	MFC r258720: MFV r258665: 4347 ZPL can use dmu_tx_assign(TXG_WAIT)
# 260773	16-Jan-2014	avg	MFC r258739: zfs mappedread_sf: assert that a page is never partially valid
# 260763	16-Jan-2014	avg	MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler performance work Sponsored by: HybridCluster [merge]
# 258563	25-Nov-2013	avg	MFC r258353: zfs page_busy: fix the boundaries of the cleared range This is a fix for a regression introduced in r246293. vm_page_clear_dirty expects the range to have DEV_BSIZE aligned boundaries, otherwise it extends them. Thus it can happen that the whole page is marked clean while actually having some small dirty region(s). This commit makes the range properly aligned and ensures that only the clean data is marked as such. It would be interesting to evaluate how much benefit clearing with DEV_BSIZE granularity produces. Perhaps instead we should clear the whole page when it is completely overwritten and not bother clearing any bits if only a portion of a page is written. Reviewed by: kib Approved by: re (gjb)