#
324204 |
|
02-Oct-2017 |
avg |
MFC r323918: MFV r323917: 8648 Fix range locking in ZIL commit codepath
This fixes a problem introduced in r320496, MFC of r308782.
|
#
320496 |
|
30-Jun-2017 |
avg |
MFC r308782: After some ZIL changes 6 years ago, zil_slog_limit got partially broken because zl_itx_list_sz was not updated when async itxs were upgraded to sync. Because of other changes made around that time, zl_itx_list_sz is not actually required to implement the functionality, so this patch removes some unneeded broken code and variables.
The original idea of zil_slog_limit was to reduce the chance of SLOG abuse by a single heavy logger, which increased latency for other (more latency-critical) loggers, by pushing the heavy log out into the main pool instead of the SLOG. Besides a huge latency increase for heavy writers, this implementation caused a double write of all data, since the log records were explicitly prepared for the SLOG. Since we now have an I/O scheduler, I've found it can be much more efficient to reduce the priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE to ZIO_PRIORITY_ASYNC_WRITE, while still leaving them on the SLOG.
The existing ZIL implementation had a problem with space efficiency when it had to write large chunks of data into log blocks of limited size. In some cases efficiency dropped to almost as low as 50%. For a ZIL stored on spinning rust, that also cut log write speed in half, since the head had to uselessly fly over allocated but unwritten areas. This change improves the situation by offloading the problematic operations from z*_log_write() to zil_lwb_commit(), which knows the real state of log block allocation and can split large requests into pieces much more efficiently. As a side effect, it also removes one of the two data copy operations done by the ZIL code in the WR_COPIED case.
While there, untangle and unify the code of the z*_log_write() functions. Also, zfs_log_write(), like zvol_log_write(), can now handle writes crossing a block boundary, which may also improve efficiency if ZPL is made to do that.
|
#
308596 |
|
12-Nov-2016 |
mav |
MFC r308173: Fix ZIL records ordering when ZVOL opened both with and without FSYNC.
Before this, earlier writes to a ZVOL opened without FSYNC could reach the ZIL after later writes to the same ZVOL opened with FSYNC. Fix this by replicating the functionality of ZPL (zv_sync_cnt is equivalent to z_sync_cnt), marking all log records sync if anybody has opened the ZVOL with FSYNC.
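The counting scheme described above can be sketched in user-space C. This is only an illustration under stated assumptions: the `sketch_*` names and the struct layout are hypothetical and are not taken from zvol.c; only the idea of a per-volume count of FSYNC opens (zv_sync_cnt) comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-volume state; not the real zvol_state_t. */
struct sketch_zvol {
    unsigned sync_cnt;  /* number of current opens that requested FSYNC */
};

/* Count an open that asked for synchronous semantics. */
static void
sketch_open(struct sketch_zvol *zv, bool fsync)
{
    if (fsync)
        zv->sync_cnt++;
}

/* Undo the accounting done by the matching open. */
static void
sketch_close(struct sketch_zvol *zv, bool fsync)
{
    if (fsync)
        zv->sync_cnt--;
}

/*
 * Decide whether a log record must be marked sync: either the caller
 * asked for it, or any opener of the volume requested FSYNC.
 */
static bool
sketch_log_is_sync(const struct sketch_zvol *zv, bool caller_sync)
{
    return (caller_sync || zv->sync_cnt != 0);
}
```

With this rule, a write issued through a non-FSYNC open is still logged as sync while any FSYNC open exists, so it cannot be reordered behind a later sync write.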
|
#
308594 |
|
12-Nov-2016 |
mav |
MFC r308169: Pass to zvol_log_truncate() same sync values as to zvol_log_write().
Unnecessarily marking TX_TRUNCATE records as sync could put them into the ZIL before previous writes if those writes were async.
|
#
308448 |
|
08-Nov-2016 |
mav |
MFC r307857: Fix panic after ZVOL renamed to name invalid for DEVFS.
|
#
308057 |
|
28-Oct-2016 |
mav |
MFC r294329 (by asomers): Disallow zvol-backed ZFS pools
Using zvols as backing devices for ZFS pools is fraught with panics and deadlocks. For example, attempting to online a missing device in the presence of a zvol can cause a panic when vdev_geom tastes the zvol. Better to completely disable vdev_geom from ever opening a zvol. The solution relies on setting a thread-local variable during vdev_geom_open, and returning EOPNOTSUPP during zvol_open if that thread-local variable is set.
Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent was to prevent a recursive mutex acquisition panic. However, the new check for the thread-local variable also fixes that problem.
Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this function was set to panic. But it can occur that a device disappears during tasting, and it causes no problems to ignore this departure.
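The thread-local guard described in this entry can be sketched in portable C11. This is a minimal illustration, not the committed code: the `sketch_*` functions and the `in_vdev_open` flag are hypothetical, and the kernel uses its own thread-local facility rather than `_Thread_local`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-thread flag marking "this thread is inside a vdev open". */
static _Thread_local bool in_vdev_open = false;

/* Stand-in for zvol_open(): refuse the open when reached from a vdev open. */
static int
sketch_zvol_open(void)
{
    if (in_vdev_open)
        return (EOPNOTSUPP);
    return (0);
}

/*
 * Stand-in for vdev_geom_open(): set the flag around the attempt to open
 * the backing provider, which in this sketch happens to be a zvol.
 */
static int
sketch_vdev_geom_open(void)
{
    int error;

    in_vdev_open = true;
    error = sketch_zvol_open();
    in_vdev_open = false;
    return (error);
}
```

Because the flag is per-thread, an ordinary consumer opening the zvol from another thread is unaffected while a vdev open is in progress.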
|
#
297549 |
|
04-Apr-2016 |
mav |
MFC r297421: Plug open count leak on zvol rename.
|
#
297548 |
|
04-Apr-2016 |
mav |
MFC r297420: Switch from using make_dev_p() to make_dev_s() to close races.
|
#
297547 |
|
04-Apr-2016 |
mav |
MFC r297337: Pass through error code from make_dev_p().
ENAMETOOLONG is much more informative in logs than ENXIO.
|
#
297546 |
|
04-Apr-2016 |
mav |
MFC r297232: Unify ignoring EEXIST from zvol_create_minor().
This fixes creation of zvol devices for snapshots during zfs receive, which previously failed with a "ZFS WARNING: Unable to create ZVOL" message. This solution is not perfect, but IMHO it is better than it was before.
|
#
297112 |
|
20-Mar-2016 |
mav |
MFC r296519: MFV r296518: 5027 zfs large block support (add copyright)
Author: Matthew Ahrens <matt@mahrens.org>
illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9
|
#
290746 |
|
13-Nov-2015 |
mav |
MFC r289190: 6250 zvol_dump_init() can hold txg open
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>
illumos/illumos-gate@b10bba72460aeaa53119c76ff5e647fd5585bece
|
#
288571 |
|
03-Oct-2015 |
mav |
MFC r286705:
5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>
While running 'zfs recv' we noticed that every 128th 8K block required a read. We were seeing that restore_write() was calling dmu_tx_hold_write() and the indirect block was not cached. We should prefetch upcoming indirect blocks to avoid having to go to disk and blocking the restore_write().
Allow an incremental send stream to be received as a clone, even if the stream does not mark it as a clone.
|
#
288520 |
|
02-Oct-2015 |
mav |
MFC r279996 (by smh): Allow zvol_geom_worker to process BIO_DELETE's
If zvol_geom_start is called with a BIO_DELETE from a thread which can sleep, it queues the bio for later processing by the zvol_geom_worker. The zvol_geom_worker didn't have a delete case, so it would simply lose the bio, preventing the original caller from ever completing. In addition, any other unknown types would suffer the same fate.
Allow zvol_geom_worker to process BIO_DELETE's via zvol_strategy and return unsupported for all unknown bio types.
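The shape of the fix can be sketched as a dispatch routine in which every request is completed one way or another. The types and names below (`sketch_bio`, `SK_*`, `sketch_strategy`) are hypothetical stand-ins for the kernel's struct bio, BIO_* commands and zvol_strategy; the real worker's set of cases may differ.

```c
#include <errno.h>

/* Hypothetical command codes; the real ones are the kernel's BIO_* values. */
enum sketch_cmd { SK_READ, SK_WRITE, SK_DELETE, SK_OTHER };

/* Hypothetical request; "done" records that the request was completed. */
struct sketch_bio {
    enum sketch_cmd cmd;
    int error;
    int done;
};

/* Stand-in for zvol_strategy(): perform the I/O and complete the request. */
static void
sketch_strategy(struct sketch_bio *bp)
{
    bp->error = 0;
    bp->done = 1;
}

/* Worker-side handling of one queued request. */
static void
sketch_worker_handle(struct sketch_bio *bp)
{
    switch (bp->cmd) {
    case SK_READ:
    case SK_WRITE:
    case SK_DELETE:         /* the case that was missing before the fix */
        sketch_strategy(bp);
        break;
    default:                /* unknown types are failed, never dropped */
        bp->error = EOPNOTSUPP;
        bp->done = 1;
        break;
    }
}
```

The point of the default branch is that the original caller is always notified, even for request types the worker does not understand.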
|
#
280753 |
|
27-Mar-2015 |
mav |
MFC r279927: Make DIOCGATTR in device mode handle "GEOM::candelete".
|
#
277699 |
|
25-Jan-2015 |
mav |
MFC r276913: Use new optimized dmu_read_uio_dbuf() for ZVOLs in device mode.
This slightly reduces overhead by avoiding dnode_hold()/dnode_rele() calls.
|
#
277483 |
|
21-Jan-2015 |
smh |
MFC r276063: Standardise on illumos for #ifdef's in zvol.c
MFC r276066: Refactor zvol locking to minimise diff with upstream
MFC r276069: Fix panic when resizing ZFS zvol's
Sponsored by: Multiplay
|
#
277482 |
|
21-Jan-2015 |
smh |
MFC r272509 (by delphij): Diff reduction with upstream
Sponsored by: Multiplay
|
#
276081 |
|
22-Dec-2014 |
delphij |
MFC r274337, r274673, r274681, r275515:
ZFS large block support. The default recordsize remains at 128KB.
A new tunable/sysctl variable, vfs.zfs.max_recordsize is added to allow adjusting the permitted maximum record size, or zfs_max_recordsize, with a default of 1MB. ZFS will not allow setting recordsize greater than zfs_max_recordsize as a safety belt, because larger recordsize means greater read and write latency and more memory usage.
Please note that booting from datasets that have recordsize greater than 128KB is not supported (but it's Okay to enable the feature on the pool).
Limited safety belt is provided for mounted root filesystem but use caution when using a larger value.
Illumos issue: 5027 zfs large block support
|
#
275892 |
|
18-Dec-2014 |
mav |
MFC r275474: Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files. GEOM has no such API, so for LUNs backed by raw devices all LBAs will be reported as mapped/unknown.
Sponsored by: iXsystems, Inc.
|
#
274732 |
|
20-Nov-2014 |
mav |
MFC r274154, r274163: Add to CTL support for logical block provisioning threshold notifications.
For ZVOL-backed LUNs, this makes it possible to inform initiators when the storage's used or available space rises above or falls below the configured thresholds.
Sponsored by: iXsystems, Inc.
|
#
273345 |
|
20-Oct-2014 |
delphij |
MFC r272510: MFV r272498:
Add a new sysctl, vfs.zfs.vol.unmap_enabled, which allows the system administrator to toggle whether ZFS should ignore UNMAP requests.
Illumos issue: 5149 zvols need a way to ignore DKIOCFREE
|
#
272883 |
|
09-Oct-2014 |
smh |
MFC r272474: Fix various issues with zvols
Sponsored by: Multiplay
|
#
272615 |
|
06-Oct-2014 |
mav |
MFC r271308: Make ZVOL writes in device mode support IO_SYNC flag.
|
#
269006 |
|
22-Jul-2014 |
delphij |
MFC r268473: MFV r268455:
Use reserved space for ZFS administrative commands.
|
#
269002 |
|
22-Jul-2014 |
delphij |
MFC r268464: MFV r268452:
Explicitly mark file removal transactions as "presumed to result in a net free of space" so they will not fail with ENOSPC.
Illumos issue: 4950 files sometimes can't be removed from a full filesystem
|
#
268657 |
|
15-Jul-2014 |
delphij |
MFC r268123: MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t
|
#
268649 |
|
15-Jul-2014 |
delphij |
MFC r268075: MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks
|
#
268274 |
|
04-Jul-2014 |
mav |
MFC r268178: Fix bug in sync control in new "dev" mode of ZVOL (r265678).
Don't check ZVOL_WCE flag, used in Solaris to control device "write cache". It is not applicable on FreeBSD and by default set to "disable".
|
#
265678 |
|
08-May-2014 |
mav |
MFC r264145: Add property and sysctl to control how ZVOLs are exposed to OS.
New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL between three modes: geom -- existing fully functional behavior (default); dev -- exposing volumes only as raw disk device file in devfs; none -- not exposing volumes outside ZFS.
The "dev" mode is less functional (can't be partitioned, mounted, etc), but it is faster, and in some scenarios with untrusted consumers safer. It can be useful for NAS, VM block storages, etc. The "none" mode may be convenient for backup servers, etc. that don't need direct data access.
Due to the way ZVOL is integrated with the main ZFS code, the property and sysctl are checked only during pool import and volume creation.
|
#
265677 |
|
08-May-2014 |
mav |
MFC r264086: 3580 Want zvols to return volblocksize when queried for physical block size
illumos/illumos-gate@a0b60564dfc644f4bfaef1ce26d343b44cf68bc5
It is irrelevant for FreeBSD, just reducing diff.
|
#
264733 |
|
21-Apr-2014 |
mav |
MFC r264193: In addition to r264077, tell GEOM that we do support BIO_DELETE now.
|
#
264732 |
|
21-Apr-2014 |
mav |
MFC r264077: Add BIO_DELETE support to ZVOL.
It is an adapted merge from the vendor branch of:
701 UNMAP support for COMSTAR (in part related to ZFS)
2130 zvol DKIOCFREE uses nested DMU transactions
|
#
263987 |
|
01-Apr-2014 |
mav |
MFC r263118: Report ZVOL block size as GEOM stripesize.
|
#
263397 |
|
19-Mar-2014 |
delphij |
MFC r260150: MFV r259170:
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47bfcd2ad9bf501faec8e75c08095e4f
NOTE: Make sure the boot code is updated if a zpool upgrade is done on boot zpool.
|
#
263390 |
|
19-Mar-2014 |
delphij |
MFC r259813 + r259813: MFV r258374:
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool features
illumos/illumos-gate@2acef22db7808606888f8f92715629ff3ba555b9
|
#
260385 |
|
06-Jan-2014 |
scottl |
MFC Alexander Motin's GEOM direct dispatch work:
r256603: Introduce new function devstat_end_transaction_bio_bt(), adding new argument to specify present time. Use this function to move binuptime() out of lock, substantially reducing lock congestion when slow timecounter is used.
r256606: Move g_io_deliver() out of the lock, as required for direct dispatch. Move g_destroy_bio() out too to reduce lock scope even more.
r256607: Fix passing uninitialized bio_resid argument to g_trace().
r256610: Add unmapped I/O support to GEOM RAID.
r256830: Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodone() when unmapping a temporarily mapped buffer. That fixes a double unmap if biodone() is called twice for the same BIO (but with different done methods).
r256880: Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, I/O requests are no longer passed to the GEOM g_up/g_down threads but are executed directly in the caller's context. That avoids CPU bottlenecks in the g_up/g_down threads and saves several context switches per I/O.
r259247: Fix bug introduced at r256607. We have to recalculate bp_resid here since sizes of original and completed requests may differ due to end of media.
Testing of the stable/10 merge was done by Netflix, but all of the credit goes to Alexander and iX Systems.
Submitted by: mav
Sponsored by: iX Systems
|