#
359554 |
|
02-Apr-2020 |
mav |
MFC r359112: MFOpenZFS: make zil max block size tunable
We've observed that on some highly fragmented pools, most metaslab allocations are small (~2-8KB), but there are some large, 128K allocations. The large allocations are for ZIL blocks. If there is a lot of fragmentation, the large allocations can be hard to satisfy.
The most common impact of this is that we need to check (and thus load) lots of metaslabs from the ZIL allocation code path, causing sync writes to wait for metaslabs to load, which can take a second or more. In the worst case, we may not be able to satisfy the allocation, in which case the ZIL will resort to txg_wait_synced() to ensure the change is on disk.
To provide a workaround for this, this change adds a tunable that can reduce the size of ZIL blocks.
External-issue: DLPX-61719 Reviewed-by: George Wilson <george.wilson@delphix.com> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #8865 openzfs/zfs@b8738257c2607c73c731ce8e0fd73282b266d6ef
|
#
339128 |
|
03-Oct-2018 |
mav |
MFC r337181: 9539 Make zvol operations use _by_dnode routines
Continues what was started in 7801 add more by-dnode routines by fully converting zvols to avoid unnecessary dnode_hold() calls. This saves a small amount of CPU time and slightly improves latencies of operations on zvols.
illumos/illumos-gate@8dfe5547fbf0979fc1065a8b6fddc1e940a7cf4f
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Rick McNeal <rick.mcneal@nexenta.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Richard Yao <richard.yao@prophetstor.com>
|
#
331399 |
|
22-Mar-2018 |
mav |
MFC r329694: MFV r324198: 8081 Compiler warnings in zdb
illumos/illumos-gate@3f7978d02b206a6ebc5652c91aa9f42da6fbe00c https://github.com/illumos/illumos-gate/commit/3f7978d02b206a6ebc5652c91aa9f42da6fbe00c
https://www.illumos.org/issues/8081 zdb(8) is full of minor problems that generate compiler warnings. On FreeBSD, which uses -WError, the only way to build it is to disable all compiler warnings. This makes it much harder to detect newly introduced bugs. We should cleanup all the warnings.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Alan Somers <asomers@gmail.com>
|
#
325132 |
|
30-Oct-2017 |
avg |
MFC r324011, r324016: MFV r323535: 8585 improve batching done in zil_commit()
FreeBSD notes: - this MFV reverts FreeBSD commit r314549 to make the merge easier - at present our emulation of cv_timedwait_hires is rather poor, so I elected to use cv_timedwait_sbt directly Please see the differential revision for details. Unfortunately, I did not get any positive reviews, so there could be bugs in the FreeBSD-specific piece of the merge. Hence, the long MFC timeout.
illumos/illumos-gate@1271e4b10dfaaed576c08a812f466f6e81370e5e https://github.com/illumos/illumos-gate/commit/1271e4b10dfaaed576c08a812f466f6e81370e5e
https://www.illumos.org/issues/8585 The current implementation of zil_commit() can introduce significant latency, beyond what is inherent due to the latency of the underlying storage. The additional latency comes from two main problems: 1. When there's outstanding ZIL blocks being written (i.e. there's already a "writer thread" in progress), then any new calls to zil_commit() will block waiting for the currently oustanding ZIL blocks to complete. The blocks written for each "writer thread" is coined a "batch", and there can only ever be a single "batch" being written at a time. When a batch is being written, any new ZIL transactions will have to wait for the next batch to be written, which won't occur until the current batch finishes. As a result, the underlying storage may not be used as efficiently as possible. While "new" threads enter zil_commit() and are blocked waiting for the next batch, it's possible that the underlying storage isn't fully utilized by the current batch of ZIL blocks. In that case, it'd be better to allow these new threads to generate (and issue) a new ZIL block, such that it could be serviced by the underlying storage concurrently with the other ZIL blocks that are being serviced. 2. Any call to zil_commit() must wait for all ZIL blocks in its "batch" to complete, prior to zil_commit() returning. The size of any given batch is proportional to the number of ZIL transaction in the queue at the time that the batch starts processing the queue; which doesn't occur until the previous batch completes. Thus, if there's a lot of transactions in the queue, the batch could be composed of many ZIL blocks, and each call to zil_commit() will have to wait for all of these writes to complete (even if the thread calling zil_commit() only cared about one of the transactions in the batch).
Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Prakash Surya <prakash.surya@delphix.com>
|
#
324203 |
|
02-Oct-2017 |
avg |
MFC r323918: MFV r323917: 8648 Fix range locking in ZIL commit codepath
This fixes a problem introduced in r315441, MFC of r308782.
|
#
323748 |
|
19-Sep-2017 |
avg |
MFC r322226: MFV r322223: 8378 crash due to bp in-memory modification of nopwrite block
illumos/illumos-gate@b7edcb940884114e61382937505433c4c38c0278 https://github.com/illumos/illumos-gate/commit/b7edcb940884114e61382937505433c4c38c0278
https://www.illumos.org/issues/8378 The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which we then nopwrite against. zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr to zgd_bp, dbuf_write_ready() could change db_blkptr, and dbuf_write_done() could remove the dirty record. dmu_sync() then sees the stale BP and that the dbuf it not dirty, so it is eligible for nop-writing. The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the db_mtx. We could still see a stale db_blkptr, but if it is stale then the dirty record will still exist and thus we won't attempt to nopwrite.
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>
|
#
321528 |
|
26-Jul-2017 |
mav |
MFC r314280: MFV 314276
7570 tunable to allow zvol SCSI unmap to return on commit of txn to ZIL
illumos/illumos-gate@1c9272b861cd640a8342f4407da026ed98615517 https://github.com/illumos/illumos-gate/commit/1c9272b861cd640a8342f4407da026ed98615517
https://www.illumos.org/issues/7570
Based on the discovery that every unmap waits for the commit of the txn to the ZIL, introducing a very high latency to unmap commands, this behavior was made into a tunable zvol_unmap_sync_enabled and set to false. The net impact of this change is that by default SCSI unmap commands will result in space being freed within the zvol (today they are ignored and returned with good status). However, unlike the code today, instead of 18+ms per unmap, they take about 30us.
With the testing done on NTFS against a Win2k12 target, the new behavior should work seamlessly. Files on the zvol that have already been set with the zfree application will continue to write 0's when deleted, and any new files created since zvol creation will send unmap commands when deleted. This behavior exists today, but with this change the unmap commands will be processed and result in reclaim of space.
Author: Stephen Blinick <stephen.blinick@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Approved by: Robert Mustacchi <rm@joyent.com>
|
#
315446 |
|
17-Mar-2017 |
mav |
MFC r309856: Postpone ZVOL media/block size caching till first open.
At least on FreeBSD there are no legal way to access media or get its size without opening device/provider first. Postponing this caching allows to skip several disk seeks per ZVOL/snapshot during import.
For HDD pool with 1 ZVOL in dev mode with 1000 snapshots this reduces pool import time from 40 to 10 seconds.
|
#
315444 |
|
17-Mar-2017 |
mav |
MFC r308099: Add sysctls for zfs_immediate_write_sz and zvol_immediate_write_sz.
|
#
315441 |
|
17-Mar-2017 |
mav |
MFC r308782: After some ZIL changes 6 years ago zil_slog_limit got partially broken due to zl_itx_list_sz not updated when async itx'es upgraded to sync. Actually because of other changes about that time zl_itx_list_sz is not really required to implement the functionality, so this patch removes some unneeded broken code and variables.
Original idea of zil_slog_limit was to reduce chance of SLOG abuse by single heavy logger, that increased latency for other (more latency critical) loggers, by pushing heavy log out into the main pool instead of SLOG. Beside huge latency increase for heavy writers, this implementation caused double write of all data, since the log records were explicitly prepared for SLOG. Since we now have I/O scheduler, I've found it can be much more efficient to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
Existing ZIL implementation had problem with space efficiency when it has to write large chunks of data into log blocks of limited size. In some cases efficiency stopped to almost as low as 50%. In case of ZIL stored on spinning rust, that also reduced log write speed in half, since head had to uselessly fly over allocated but not written areas. This change improves the situation by offloading problematic operations from z*_log_write() to zil_lwb_commit(), which knows real situation of log blocks allocation and can split large requests into pieces much more efficiently. Also as side effect it removes one of two data copy operations done by ZIL code WR_COPIED case.
While there, untangle and unify code of z*_log_write() functions. Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing block boundary, that may also improve efficiency if ZPL is made to do that.
Sponsored by: iXsystems, Inc.
|
#
314899 |
|
08-Mar-2017 |
ae |
MFC r314497: Do not invoke the resize event when previous provider's size was zero. This is similar to r303637 fix for geom_disk.
|
#
308595 |
|
12-Nov-2016 |
mav |
MFC r308173: Fix ZIL records ordering when ZVOL opened both with and without FSYNC.
Before this an earlier writes to a ZVOL opened without FSYNC could get to ZIL after later writes to the same ZVOL opened with FSYNC. Fix this by replicating functionality of ZPL (zv_sync_cnt equivalent to z_sync_cnt), marking all log records sync if anybody opened the ZVOL with FSYNC.
|
#
308593 |
|
12-Nov-2016 |
mav |
MFC r308169: Pass to zvol_log_truncate() same sync values as to zvol_log_write().
Surplus marking of TX_TRUNCATE records as sync could result in putting them into ZIL before previous writes if ones were async.
|
#
308447 |
|
08-Nov-2016 |
mav |
MFC r307857: Fix panic after ZVOL renamed to name invalid for DEVFS.
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
297421 |
|
30-Mar-2016 |
mav |
Plug open count leak on zvol rename.
MFC after: 2 weeks
|
#
297420 |
|
30-Mar-2016 |
mav |
Switch from using make_dev_p() to make_dev_s() to close races.
|
#
297337 |
|
28-Mar-2016 |
mav |
Pass through error code from make_dev_p().
ENAMETOOLONG is much more informative in logs then ENXIO.
MFC after: 1 week
|
#
297232 |
|
24-Mar-2016 |
mav |
Unify ignoring EEXIST from zvol_create_minor().
This fixes creation of zvol devices for snapshots during zfs receive, that previously failed with "ZFS WARNING: Unable to create ZVOL" message. This solution is not perfect, but IMHO better then it was before.
MFC after: 2 weeks
|
#
296519 |
|
08-Mar-2016 |
mav |
MFV r296518: 5027 zfs large block support (add copyright)
Author: Matthew Ahrens <matt@mahrens.org>
illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9
|
#
295045 |
|
29-Jan-2016 |
asomers |
Add a sysctl to allow ZFS pools backed by zvols
Change 294329 removed the ability to build ZFS pools that are backed by zvols, because having that ability (even if it's not used) leads to deadlocks. By popular demand, I'm adding an off-by-default sysctl to reenable that ability.
Reviewed by: lidl, delphij MFC after: Never Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D4998
|
#
294329 |
|
19-Jan-2016 |
asomers |
Disallow zvol-backed ZFS pools
Using zvols as backing devices for ZFS pools is fraught with panics and deadlocks. For example, attempting to online a missing device in the presence of a zvol can cause a panic when vdev_geom tastes the zvol. Better to completely disable vdev_geom from ever opening a zvol. The solution relies on setting a thread-local variable during vdev_geom_open, and returning EOPNOTSUPP during zvol_open if that thread-local variable is set.
Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent was to prevent a recursive mutex acquisition panic. However, the new check for the thread-local variable also fixes that problem.
Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this function was set to panic. But it can occur that a device disappears during tasting, and it causes no problems to ignore this departure.
Reviewed by: delphij MFC after: 1 week Relnotes: yes Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D4986
|
#
289190 |
|
12-Oct-2015 |
mav |
MFV r289185: 6250 zvol_dump_init() can hold txg open
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Albert Lee <trisk@omniti.com> Reviewed by: Xin Li <delphij@freebsd.org> Approved by: Garrett D'Amore <garrett@damore.org> Author: George Wilson <george.wilson@delphix.com>
illumos/illumos-gate@b10bba72460aeaa53119c76ff5e647fd5585bece
|
#
286705 |
|
12-Aug-2015 |
mav |
MFV r286704: 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>
While running 'zfs recv' we noticed that every 128th 8K block required a read. We were seeing that restore_write() was calling dmu_tx_hold_write() and the indirect block was not cached. We should prefetch upcoming indirect blocks to avoid having to go to disk and blocking the restore_write().
Allow an incremental send stream to be received as a clone, even if the stream does not mark it as a clone.
|
#
279996 |
|
14-Mar-2015 |
smh |
Allow zvol_geom_worker to process BIO_DELETE's
If zvol_geom_start is called with a BIO_DELETE from a thread which can sleep it queues it for later processing by the zvol_geom_worker. The zvol_geom_worker didn't have a delete case so would simply loose the bio hence preventing the original caller from every completing. In addition an other unknown types would suffer the same fate.
Allow zvol_geom_worker to process BIO_DELETE's via zvol_strategy and return unsupported for all unknown bio types.
MFC after: 2 weeks Sponsored by: Multiplay
|
#
279927 |
|
12-Mar-2015 |
mav |
Make DIOCGATTR in device mode handle "GEOM::candelete".
MFC after: 3 days
|
#
276913 |
|
10-Jan-2015 |
mav |
Use new optimized dmu_read_uio_dbuf() for ZVOLs in device mode.
This slightly reduces overhead by avoiding dnode_hold()/dnode_rele() calls.
MFC after: 2 weeks
|
#
276069 |
|
22-Dec-2014 |
smh |
Fix panic when resizing ZFS zvol's
Resizing a ZFS ZVOL with debug enabled would result in a panic due to recursion on dp_config_rwlock.
The upstream change "3464 zfs synctask code needs restructuring" changed zvol_set_volsize to avoid the recursion on dp_config_rwlock, but this was missed when originally merged in by r248571 due to significant differences in our codebases in this area.
These changes also relied on bring in changes from upstream: 3557 dumpvp_size is not updated correctly when a dump zvol's size is changed, which where also not present.
In order to help prevent future issues in this area a direct comparison and diff minimisation from current upstream version (b515258) of zvol.c.
Differential Revision: https://reviews.freebsd.org/D1302 MFC after: 1 month X-MFC-With: r276063 & r276066 Sponsored by: Multiplay
|
#
276066 |
|
22-Dec-2014 |
smh |
Refactor zvol locking to minimise diff with upstream
Use #define zfsdev_state_lock spa_namespace_lock instead of replacing all zfsdev_state_lock with spa_namespace_lock to minimise changes from upstream.
Differential Revision: D1302 MFC after: 1 month X-MFC-With r276063 Sponsored by: Multiplay
|
#
276063 |
|
22-Dec-2014 |
smh |
Standardise on illumos for #ifdef's in zvol.c
Also correct as per style(9) on the use of #ifdef comments.
This is a no-op change as pre-cursor to a full cleanup and merge with upstream zvol changes.
Sponsored by: Multiplay
|
#
275474 |
|
04-Dec-2014 |
mav |
Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files. GEOM has no such API, so for LUNs backed by raw devices all LBAs will be reported as mapped/unknown.
MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
274337 |
|
10-Nov-2014 |
delphij |
MFV r274273:
ZFS large block support.
Please note that booting from datasets that have recordsize greater than 128KB is not supported (but it's Okay to enable the feature on the pool). This *may* remain unchanged because of memory constraint.
Limited safety belt is provided for mounted root filesystem but use caution is advised.
Illumos issue: 5027 zfs large block support
MFC after: 1 month
|
#
274154 |
|
05-Nov-2014 |
mav |
Add to CTL support for logical block provisioning threshold notifications.
For ZVOL-backed LUNs this allows to inform initiators if storage's used or available spaces get above/below the configured thresholds.
MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
272510 |
|
04-Oct-2014 |
delphij |
Add a new sysctl, vfs.zfs.vol.unmap_enabled, which allows the system administrator to toggle whether ZFS should ignore UNMAP requests.
Illumos issue: 5149 zvols need a way to ignore DKIOCFREE
MFC after: 2 weeks
|
#
272509 |
|
04-Oct-2014 |
delphij |
Diff reduction with upstream. The code change is not really applicable to FreeBSD.
Illumos issue: 5148 zvol's DKIOCFREE holds zfsdev_state_lock too long
MFC after: 1 month
|
#
272474 |
|
03-Oct-2014 |
smh |
Fix various issues with zvols
When performing snapshot renames we could deadlock due to the locking in zvol_rename_minors. In order to avoid this use the same workaround as zvol_open in zvol_rename_minors.
Add missing zvol_rename_minors to dsl_dataset_promote_sync.
Protect against invalid index into zv_name in zvol_remove_minors.
Replace zvol_remove_minor calls with zvol_remove_minors to ensure any potential children are also renamed.
Don't fail zvol_create_minors if zvol_create_minor returns EEXIST.
Restore the valid pool check in zfs_ioc_destroy_snaps to ensure we don't call zvol_remove_minors when zfs_unmount_snap fails.
PR: 193803 MFC after: 1 week Sponsored by: Multiplay
|
#
271308 |
|
09-Sep-2014 |
mav |
Make ZVOL writes in device mode support IO_SYNC flag.
MFC after: 1 month
|
#
268473 |
|
09-Jul-2014 |
delphij |
MFV r268455:
Use reserved space for ZFS administrative commands.
We reserve 1/2^spa_slop_shift = 1/32 or 3.125% of pool space (or 32MB at least) for system use. Most ZPL operations, e.g. write(2), creat(2), will fail with ENOSPC if we fall below this.
Certain operations, e.g. file removal and most administrative actions, still permitted until half of the slop space is used. This would allow users to use these operations to free up space in the pool when pool is close to full but half of slop space is still free.
A very restricted set of operations that frees up space or change quota are always permitted, regardless of the amount of free space.
MFC after: 2 weeks
|
#
268464 |
|
09-Jul-2014 |
delphij |
MFV r268452:
Explicitly mark file removal transactions as "presumed to result in a net free of space" so they will not fail with ENOSPC.
Illumos issue: 4950 files sometimes can't be removed from a full filesystem MFC after: 2 weeks
|
#
268178 |
|
02-Jul-2014 |
mav |
Fix bug in sync control in new "dev" mode of ZVOL (r265678).
Don't check ZVOL_WCE flag, used in Solaris to control device "write cache". It is not applicable on FreeBSD and by default set to "disable".
MFC after: 3 days
|
#
268123 |
|
01-Jul-2014 |
delphij |
MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t
illumos/illumos-gate@7802d7bf98dec568dadf72286893b1fe5abd8602
MFC after: 2 weeks
|
#
268075 |
|
01-Jul-2014 |
delphij |
MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression") 4913 zfs release should not be subject to space checks
MFC after: 2 weeks
|
#
267992 |
|
28-Jun-2014 |
hselasky |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
#
267985 |
|
27-Jun-2014 |
gjb |
Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output, such as:
1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory
|
#
267961 |
|
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
#
264193 |
|
06-Apr-2014 |
mav |
In addition to r264077, tell GEOM that we do support BIO_DELETE now.
|
#
264145 |
|
05-Apr-2014 |
mav |
Add property and sysctl to control how ZVOLs are exposed to OS.
New ZFS property volmode and sysctl vfs.zfs.vol.mode allow switching ZVOL between three modes: geom -- existing fully functional behavior (default); dev -- exposing volumes only as raw disk device file in devfs; none -- not exposing volumes outside ZFS.
The "dev" mode is less functional (can't be partitioned, mounted, etc), but it is faster, and in some scenarios with untrusted consumers safer. It can be useful for NAS, VM block storages, etc. The "none" mode may be convenient for backup servers, etc. that don't need direct data access.
Due to the way ZVOL is integrated with main ZFS code, those property and sysctl are checked only during pool import and volume creation.
MFC after: 1 month Sponsored by: iXsystems, Inc.
|
#
264086 |
|
03-Apr-2014 |
mav |
MFV r258922: 3580 Want zvols to return volblocksize when queried for physical block size
illumos/illumos-gate@a0b60564dfc644f4bfaef1ce26d343b44cf68bc5
It is irrelevant for FreeBSD, just reducing diff.
|
#
264077 |
|
03-Apr-2014 |
mav |
Add BIO_DELETE support to ZVOL.
It is an adapted merge from the vendor branch of: 701 UNMAP support for COMSTAR (in part related to ZFS) 2130 zvol DKIOCFREE uses nested DMU transactions
|
#
263118 |
|
13-Mar-2014 |
mav |
Report ZVOL block size as GEOM stripesize.
MFC after: 2 weeks
|
#
260150 |
|
31-Dec-2013 |
delphij |
MFV r259170:
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47bfcd2ad9bf501faec8e75c08095e4f
NOTE: Make sure the boot code is updated if a zpool upgrade is done on boot zpool.
MFC after: 2 weeks
|
#
259813 |
|
24-Dec-2013 |
delphij |
MFV r258374:
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool features
illumos/illumos-gate@2acef22db7808606888f8f92715629ff3ba555b9
MFC after: 2 weeks
|
#
256880 |
|
22-Oct-2013 |
mav |
Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests to GEOM g_up/g_down thread, executing them directly in the caller context. That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid several context switches per I/O.
The defined now safety requirements are: - caller should not hold any locks and should be reenterable; - callee should not depend on GEOM dual-threaded concurency semantics; - on the way down, if request is unmapped while callee doesn't support it, the context should be sleepable; - kernel thread stack usage should be below 50%.
To keep compatibility with GEOM classes not meeting above requirements new provider and consumer flags added: - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request); - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done); - G_PF_DIRECT_SEND -- provider code meets caller requirements (done); - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request). Capable GEOM class can set them, allowing direct dispatch in cases where it is safe. If any of requirements are not met, request is queued to g_up or g_down thread same as before.
Such GEOM classes were reviewed and updated to support direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL, MAP, FLASHMAP, etc).
To declare direct completion capability disk(9) KPI got new flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk drivers got it set now thanks to earlier CAM locking work.
This change more then twice increases peak block storage performance on systems with manu CPUs, together with earlier CAM locking changes reaching more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256 user-level threads).
Sponsored by: iXsystems, Inc. MFC after: 2 months
|
#
255750 |
|
20-Sep-2013 |
delphij |
MFV r254750:
Add support of Illumos dumps on zvol over RAID-Z.
Note that this only adds the features. FreeBSD would still need more work to support dumping on zvols.
Illumos ZFS issues: 2932 support crash dumps to raidz, etc. pools
MFC after: 1 month Approved by: re (ZFS blanket)
|
#
249195 |
|
06-Apr-2013 |
mm |
MFV r248217: Merge change from vendor to reduce diff only. ZFS dtrace probes are not supported on FreeBSD yet.
Illumos ZFS issues: 3598 want to dtrace when errors are generated in zfs
MFC after: 3 weeks
|
#
248976 |
|
01-Apr-2013 |
mm |
Call dmu_snapshot_list_next() in zvol.c with dsl_pool_config lock held
Submitted by: Andriy Gapon <avg@FreeBSD.org> MFC after: 17 days
|
#
248571 |
|
21-Mar-2013 |
mm |
Merge libzfs_core branch: includes MFV 238590, 238592, 247580
MFV 238590, 238592: In the first zfs ioctl restructuring phase, the libzfs_core library was introduced. It is a new thin library that wraps around kernel ioctl's. The idea is to provide a forward-compatible way of dealing with new features. Arguments are passed in nvlists and not random zfs_cmd fields, new-style ioctls are logged to pool history using a new method of history logging.
http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/
MFV 247580 [1]: To address issues of several deadlocks and race conditions the locking code around dsl_dataset was rewritten and the interface to synctasks was changed.
User-Visible Changes: "zfs snapshot" can create more arbitrary snapshots at once (atomically) "zfs destroy" destroys multiple snapshots at once "zfs recv" has improved performance
Backward Compatibility: I have extended the compatibility layer to support full backward compatibility by remapping or rewriting the responsible ioctl arguments. Old utilities are fully supported by the new kernel module.
Forward Compatibility: New utilities work with old kernels with the following restrictions: - creating, destroying, holding and releasing of multiple snapshots at once is not supported, this includes recursive (-r) commands
Illumos ZFS issues: 2882 implement libzfs_core 2900 "zfs snapshot" should be able to create multiple, arbitrary snapshots at once 3464 zfs synctask code needs restructuring
References: https://www.illumos.org/issues/2882 https://www.illumos.org/issues/2900 https://www.illumos.org/issues/3464 [1]
MFC after: 1 month Sponsored by: Hybrid Logic Inc. [1]
|
#
246666 |
|
11-Feb-2013 |
mm |
MFV r246392: Import vendor ZFS bugfix fixing a possible deadlock in arc_read().
Illumos ZFS issues: 3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt)
References: https://www.illumos.org/issues/3498
MFC after: 2 weeks
|
#
243524 |
|
25-Nov-2012 |
mm |
MFV r243013 and r243267:
Import the zio nop-write improvement from Illumos. To reduce I/O, nop-write omits overwriting data if the checksum (cryptographically secure) of new data matches the checksum of existing data. It also saves space if snapshots are in use.
It currently works only on datasets with enabled compression, disabled deduplication and sha256 checksums.
IllumOS 13887:196932ec9e6a and 13888:7204b3392a58 3236 zio nop-write
References: https://www.illumos.org/issues/3236
MFC after: 2 weeks
|
#
241297 |
|
06-Oct-2012 |
avg |
zvol: set mediasize in geom provider right upon its creation
... instead of deferring the action until first open. Unlike upstream this has no benefit on FreeBSD. We know that as soon as the provider is created it is going to be tasted and thus opened. Initial mediasize of zero causes tasting failure and subsequent retasting because of the size change.
MFC after: 14 days
|
#
240831 |
|
22-Sep-2012 |
avg |
zfs: allow a zvol to be used as a pool vdev, again
Do this by checking if spa_namespace_lock is already held and not taking it again in that case. Add a comment explaining why that is done and why it is safe.
Reviewed by: pjd MFC after: 24 days
|
#
239774 |
|
28-Aug-2012 |
mm |
Merge recent vendor changes: 3100 zvol rename fails with EBUSY when dirty 3104 eliminate empty bpobjs 3120 zinject hangs in zfsdev_ioctl() due to uninitialized zc
References: https://www.illumos.org/issues/3100 https://www.illumos.org/issues/3104 https://www.illumos.org/issues/3120
Obtained from: illumos (vendor/illumos, vendor/illumos-sys) MFC after: 2 weeks
|
#
238656 |
|
20-Jul-2012 |
trasz |
Make ZVOL resizing ('zfs set volsize') properly resize the GEOM provider.
Sponsored by: FreeBSD Foundation
|
#
227111 |
|
05-Nov-2011 |
pjd |
Correct typo in comment.
Reported by: Fabian Keil <fk@fabiankeil.de> MFC after: 3 days
|
#
227110 |
|
05-Nov-2011 |
pjd |
In zvol_open() if the spa_namespace_lock is already held, it means that ZFS is trying to open and taste ZVOL as its VDEV. This is not supported, so return an error instead of panicing on spa_namespace_lock recursion.
Reported by: Robert Millan <rmh@debian.org> PR: kern/162008 MFC after: 3 days
|
#
226724 |
|
25-Oct-2011 |
mm |
Update copyright information in several ZFS files, as the clause 3.3 of the CDDL licence explicitly requires every Contributor to add a copyright notice.
This also reflects the copyright notices for the changes recently added by Illumos.
MFC after: 3 days
|
#
224855 |
|
13-Aug-2011 |
mm |
zfs_ioctl.c: improve code readability in zfs_ioc_dataset_list_next()
zvol.c: fix calling of dmu_objset_prefetch() in zvol_create_minors() by passing full instead of relative dataset name and prefetching all visible datasets to be processed later instead of just the pool name
Reviewed by: pjd Approved by: re (kib) MFC after: 1 week > Reviewed by: If someone else reviewed your modification. > Approved by: If you needed approval for this commit. > Obtained from: If the change is from a third party. > MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email. > Security: Vulnerability reference (one per line) or description. > Empty fields above will be automatically removed.
M opensolaris/uts/common/fs/zfs/zfs_ioctl.c M opensolaris/uts/common/fs/zfs/zvol.c
|
#
224791 |
|
12-Aug-2011 |
pjd |
Eliminate the zfsdev_state_lock entirely and replace it with the spa_namespace_lock. This fixes LOR between the spa_namespace_lock and spa_config lock. LOR can cause deadlock on vdevs removal/insertion.
Reported by: gibbs, delphij Tested by: delphij Approved by: re (kib) MFC after: 1 week
|
#
219317 |
|
05-Mar-2011 |
pjd |
Make renaming of a ZVOL, ZVOL's parent directory and ZVOL snapshot work.
Reported by: avg MFC after: 1 month
|
#
219316 |
|
05-Mar-2011 |
pjd |
Simplify zvol_remove_minors() a bit.
MFC after: 1 month
|
#
219089 |
|
27-Feb-2011 |
pjd |
Finally... Import the latest open-source ZFS version - (SPA) 28.
Few new things available from now on:
- Data deduplication. - Triple parity RAIDZ (RAIDZ3). - zfs diff. - zpool split. - Snapshot holds. - zpool import -F. Allows to rewind corrupted pool to earlier transaction group. - Possibility to import pool in read-only mode.
MFC after: 1 month
|
#
209962 |
|
12-Jul-2010 |
mm |
Merge ZFS version 15 and almost all OpenSolaris bugfixes referenced in Solaris 10 updates 141445-09 and 142901-14.
Detailed information: (OpenSolaris revisions and Bug IDs, Solaris 10 patch numbers)
7844:effed23820ae 6755435 zfs_open() and zfs_close() needs to use ZFS_ENTER/ZFS_VERIFY_ZP (141445-01)
7897:e520d8258820 6748436 inconsistent zpool.cache in boot_archive could panic a zfs root filesystem upon boot-up (141445-01)
7965:b795da521357 6740164 zpool attach can create an illegal root pool (141909-02)
8084:b811cc60d650 6769612 zpool_import() will continue to write to cachefile even if altroot is set (N/A)
8121:7fd09d4ebd9c 6757430 want an option for zdb to disable space map loading and leak tracking (141445-01)
8129:e4f45a0bfbb0 6542860 ASSERT: reason != VDEV_LABEL_REMOVE||vdev_inuse(vd, crtxg, reason, 0) (141445-01)
8188:fd00c0a81e80 6761100 want zdb option to select older uberblocks (141445-01)
8190:6eeea43ced42 6774886 zfs_setattr() won't allow ndmp to restore SUNWattr_rw (141445-01)
8225:59a9961c2aeb 6737463 panic while trying to write out config file if root pool import fails (141445-01)
8227:f7d7be9b1f56 6765294 Refactor replay (141445-01)
8228:51e9ca9ee3a5 6572357 libzfs should do more to avoid mnttab lookups (141909-01) 6572376 zfs_iter_filesystems and zfs_iter_snapshots get objset stats twice (141909-01)
8241:5a60f16123ba 6328632 zpool offline is a bit too conservative (141445-01) 6739487 ASSERT: txg <= spa_final_txg due to scrub/export race (141445-01) 6767129 ASSERT: cvd->vdev_isspare, in spa_vdev_detach() (141445-01) 6747698 checksum failures after offline -t / export / import / scrub (141445-01) 6745863 ZFS writes to disk after it has been offlined (141445-01) 6722540 50% slowdown on scrub/resilver with certain vdev configurations (141445-01) 6759999 resilver logic rewrites ditto blocks on both source and destination (141445-01) 6758107 I/O should never suspend during spa_load() (141445-01) 6776548 codereview(1) runs off the page when faced with multi-line comments (N/A) 6761406 AMD errata 91 workaround doesn't work on 64-bit systems (141445-01)
8242:e46e4b2f0a03 6770866 GRUB/ZFS should require physical path or devid, but not both (141445-01)
8269:03a7e9050cfd 6674216 "zfs share" doesn't work, but "zfs set sharenfs=on" does (141445-01) 6621164 $SRC/cmd/zfs/zfs_main.c seems to have a syntax error in the translation note (141445-01) 6635482 i18n problems in libzfs_dataset.c and zfs_main.c (141445-01) 6595194 "zfs get" VALUE column is as wide as NAME (141445-01) 6722991 vdev_disk.c: error checking for ddi_pathname_to_dev_t() must test for NODEV (141445-01) 6396518 ASSERT strings shouldn't be pre-processed (141445-01)
8274:846b39508aff 6713916 scrub/resilver needlessly decompress data (141445-01)
8343:655db2375fed 6739553 libzfs_status msgid table is out of sync (141445-01) 6784104 libzfs unfairly rejects numerical values greater than 2^63 (141445-01) 6784108 zfs_realloc() should not free original memory on failure (141445-01)
8525:e0e0e525d0f8 6788830 set large value to reservation cause core dump (141445-01) 6791064 want sysevents for ZFS scrub (141445-01) 6791066 need to be able to set cachefile on faulted pools (141445-01) 6791071 zpool_do_import() should not enable datasets on faulted pools (141445-01) 6792134 getting multiple properties on a faulted pool leads to confusion (141445-01)
8547:bcc7b46e5ff7 6792884 Vista clients cannot access .zfs (141445-01)
8632:36ef517870a3 6798384 It can take a village to raise a zio (141445-01)
8636:7e4ce9158df3 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage() (141909-01) 6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface (141909-01) 6702206 ZFS read/writer lock contention throttles sendfile() benchmark (141445-01) 6780491 Zone on a ZFS filesystem has poor fork/exec performance (141445-01) 6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp))); (141445-01)
8692:692d4668b40d 6801507 ZFS read aggregation should not mind the gap (141445-01)
8697:e62d2612c14d 6633095 creating a filesystem with many properties set is slow (141445-01)
8768:dfecfdbb27ed 6775697 oracle crashes when overwriting after hitting quota on zfs (141909-01)
8811:f8deccf701cf 6790687 libzfs mnttab caching ignores external changes (141445-01) 6791101 memory leak from libzfs_mnttab_init (141445-01)
8845:91af0d9c0790 6800942 smb_session_create() incorrectly stores IP addresses (N/A) 6582163 Access Control List (ACL) for shares (141445-01) 6804954 smb_search - shortname field should be space padded following the NULL terminator (N/A) 6800184 Panic at smb_oplock_conflict+0x35() (N/A)
8876:59d2e67b4b65 6803822 Reboot after replacement of system disk in a ZFS mirror drops to grub> prompt (141445-01)
8924:5af812f84759 6789318 coredump when issue zdb -uuuu poolname/ (141445-01) 6790345 zdb -dddd -e poolname coredump (141445-01) 6797109 zdb: 'zdb -dddddd pool_name/fs_name inode' coredump if the file with inode was deleted (141445-01) 6797118 zdb: 'zdb -dddddd poolname inum' coredump if I miss the fs name (141445-01) 6803343 shareiscsi=on failed, iscsitgtd failed request to share (141445-01)
9030:243fd360d81f 6815893 hang mounting a dataset after booting into a new boot environment (141445-01)
9056:826e1858a846 6809691 'zpool create -f' no longer overwrites ufs infomation (141445-01)
9179:d8fbd96b79b3 6790064 zfs needs to determine uid and gid earlier in create process (141445-01)
9214:8d350e5d04aa 6604992 forced unmount + being in .zfs/snapshot/<snap1> = not happy (141909-01) 6810367 assertion failed: dvp->v_flag & VROOT, file: ../../common/fs/gfs.c, line: 426 (141909-01)
9229:e3f8b41e5db4 6807765 ztest_dsl_dataset_promote_busy needs to clean up after ENOSPC (141445-01)
9230:e4561e3eb1ef 6821169 offlining a device results in checksum errors (141445-01) 6821170 ZFS should not increment error stats for unavailable devices (141445-01) 6824006 need to increase issue and interrupt taskqs threads in zfs (141445-01)
9234:bffdc4fc05c4 6792139 recovering from a suspended pool needs some work (141445-01) 6794830 reboot command hangs on a failed zfs pool (141445-01)
9246:67c03c93c071 6824062 System panicked in zfs_mount due to NULL pointer dereference when running btts and svvs tests (141909-01)
9276:a8a7fc849933 6816124 System crash running zpool destroy on broken zpool (141445-03)
9355:09928982c591 6818183 zfs snapshot -r is slow due to set_snap_props() doing txg_wait_synced() for each new snapshot (141445-03)
9391:413d0661ef33 6710376 log device can show incorrect status when other parts of pool are degraded (141445-03)
9396:f41cf682d0d3 (part already merged) 6501037 want user/group quotas on ZFS (141445-03) 6827260 assertion failed in arc_read(): hdr == pbuf->b_hdr (141445-03) 6815592 panic: No such hold X on refcount Y from zfs_znode_move (141445-03) 6759986 zfs list shows temporary %clone when doing online zfs recv (141445-03)
9404:319573cd93f8 6774713 zfs ignores canmount=noauto when sharenfs property != off (141445-03)
9412:4aefd8704ce0 6717022 ZFS DMU needs zero-copy support (141445-03)
9425:e7ffacaec3a8 6799895 spa_add_spares() needs to be protected by config lock (141445-03) 6826466 want to post sysevents on hot spare activation (141445-03) 6826468 spa 'allowfaulted' needs some work (141445-03) 6826469 kernel support for storing vdev FRU information (141445-03) 6826470 skip posting checksum errors from DTL regions of leaf vdevs (141445-03) 6826471 I/O errors after device remove probe can confuse FMA (141445-03) 6826472 spares should enjoy some of the benefits of cache devices (141445-03)
9443:2a96d8478e95 6833711 gang leaders shouldn't have to be logical (141445-03)
9463:d0bd231c7518 6764124 want zdb to be able to checksum metadata blocks only (141445-03)
9465:8372081b8019 6830237 zfs panic in zfs_groupmember() (141445-03)
9466:1fdfd1fed9c4 6833162 phantom log device in zpool status (141445-03)
9469:4f68f041ddcd 6824968 add ZFS userquota support to rquotad (141445-03)
9470:6d827468d7b5 6834217 godfather I/O should reexecute (141445-03)
9480:fcff33da767f 6596237 Stop looking and start ganging (141909-02)
9493:9933d599bc93 6623978 lwb->lwb_buf != NULL, file ../../../uts/common/fs/zfs/zil.c, line 787, function zil_lwb_commit (141445-06)
9512:64cafcbcc337 6801810 Commit of aligned streaming rewrites to ZIL device causes unwanted disk reads (N/A)
9515:d3b739d9d043 6586537 async zio taskqs can block out userland commands (142901-09)
9554:787363635b6a 6836768 zfs_userspace() callback has no way to indicate failure (N/A)
9574:1eb6a6ab2c57 6838062 zfs panics when an error is encountered in space_map_load() (141909-02)
9583:b0696cd037cc 6794136 Panic BAD TRAP: type=e when importing degraded zraid pool. (141909-03)
9630:e25a03f552e0 6776104 "zfs import" deadlock between spa_unload() and spa_async_thread() (141445-06)
9653:a70048a304d1 6664765 Unable to remove files when using fat-zap and quota exceeded on ZFS filesystem (141445-06)
9688:127be1845343 6841321 zfs userspace / zfs get userused@ doesn't work on mounted snapshot (N/A) 6843069 zfs get userused@S-1-... doesn't work (N/A)
9873:8ddc892eca6e 6847229 assertion failed: refcount_count(&tx->tx_space_written) + delta <= tx->tx_space_towrite in dmu_tx.c (141445-06)
9904:d260bd3fd47c 6838344 kernel heap corruption detected on zil while stress testing (141445-06)
9951:a4895b3dd543 6844900 zfs_ioc_userspace_upgrade leaks (N/A)
10040:38b25aeeaf7a 6857012 zfs panics on zpool import (141445-06)
10000:241a51d8720c 6848242 zdb -e no longer works as expected (N/A)
10100:4a6965f6bef8 6856634 snv_117 not booting: zfs_parse_bootfs: error2 (141445-07)
10160:a45b03783d44 6861983 zfs should use new name <-> SID interfaces (N/A) 6862984 userquota commands can hang (141445-06)
10299:80845694147f 6696858 zfs receive of incremental replication stream can dereference NULL pointer and crash (N/A)
10302:a9e3d1987706 6696858 zfs receive of incremental replication stream can dereference NULL pointer and crash (fix lint) (N/A)
10575:2a8816c5173b (partial merge) 6882227 spa_async_remove() shouldn't do a full clear (142901-14)
10800:469478b180d9 6880764 fsync on zfs is broken if writes are greater than 32kb on a hard crash and no log attached (142901-09) 6793430 zdb -ivvvv assertion failure: bp->blk_cksum.zc_word[2] == dmu_objset_id(zilog->zl_os) (N/A)
10801:e0bf032e8673 (partial merge) 6822816 assertion failed: zap_remove_int(ds_next_clones_obj) returns ENOENT (142901-09)
10810:b6b161a6ae4a 6892298 buf->b_hdr->b_state != arc_anon, file: ../../common/fs/zfs/arc.c, line: 2849 (142901-09)
10890:499786962772 6807339 spurious checksum errors when replacing a vdev (142901-13)
11249:6c30f7dfc97b 6906110 bad trap panic in zil_replay_log_record (142901-13) 6906946 zfs replay isn't handling uid/gid correctly (142901-13)
11454:6e69bacc1a5a 6898245 suspended zpool should not cause rest of the zfs/zpool commands to hang (142901-10)
11546:42ea6be8961b (partial merge) 6833999 3-way deadlock in dsl_dataset_hold_ref() and dsl_sync_task_group_sync() (142901-09)
Discussed with: pjd Approved by: delphij (mentor) Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 2 months
|
#
208047 |
|
13-May-2010 |
mm |
Import OpenSolaris revision 7837:001de5627df3 It includes the following changes: - parallel reads in traversal code (Bug ID 6333409) - faster traversal for zfs send (Bug ID 6418042) - traversal code cleanup (Bug ID 6725675) - fix for two scrub related bugs (Bug ID 6729696, 6730101) - fix assertion in dbuf_verify (Bug ID 6752226) - fix panic during zfs send with i/o errors (Bug ID 6577985) - replace P2CROSS with P2BOUNDARY (Bug ID 6725680)
List of OpenSolaris Bug IDs: 6333409, 6418042, 6757112, 6725668, 6725675, 6725680, 6725698, 6729696, 6730101, 6752226, 6577985, 6755042
Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 1 week
|
#
200126 |
|
05-Dec-2009 |
pjd |
Fix deadlock when ZVOLs are present and we are replacing dead component or calling scrub when pool is in a degraded state. It will try to taste ZVOLs, which will lead to deadlock, as ZVOL will try to acquire the same locks as replace/scrub is holding already.
We can't simply skip provider based on their GEOM class, because ZVOL can have providers build on top of it and we need to skip those as well.
We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give us positive answer and we have to skip those providers.
This way we remove possibility to create ZFS pools on top of ZVOLs, but it is not very useful anyway.
I believe deadlock is still possible in some very complex situations like when we have MD provider on top of UFS file on top of ZVOL. When we try to replace dead component in the pool mentioned ZVOL is based on, there might be a deadlock when ZFS will try to taste MD provider. There is no easy way to detect that, but it isn't very common.
MFC after: 1 week
|
#
196927 |
|
07-Sep-2009 |
pjd |
Changing provider size is not really supported by GEOM, but doing so when provider is closed should be ok.
When administrator requests to change ZVOL size do it immediately if ZVOL is closed or do it on last ZVOL close.
PR: kern/136942 Requested by: Bernard Buri <bsd@ask-us.at> MFC after: 1 week
|
#
196458 |
|
23-Aug-2009 |
pjd |
- Hide ZFS kernel threads under zfskern process. - Use better (shorter) threads names: 'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00' 'vdev:worker da0' -> 'vdev da0'
|
#
196457 |
|
23-Aug-2009 |
pjd |
Set priority of vdev_geom threads and zvol threads to PRIBIO.
|
#
185029 |
|
17-Nov-2008 |
pjd |
Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.
This bring huge amount of changes, I'll enumerate only user-visible changes:
- Delegated Administration
Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc.
- L2ARC
Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content.
- slog
Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2).
- vfs.zfs.super_owner
Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
Not all the flags are supported. This still needs work.
- ZFSBoot
Support to boot off of ZFS pool. Not finished, AFAIK.
Submitted by: dfr
- Snapshot properties
- New failure modes
Before if write requested failed, system paniced. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots.
- Sparse volumes
ZVOLs that don't reserve space in the pool.
- External attributes
Compatible with extattr(2).
- NFSv4-ACLs
Not sure about the status, might not be complete yet.
Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
|
#
177698 |
|
28-Mar-2008 |
jb |
Forced commit to note that these files were repo copied.
|
#
173250 |
|
01-Nov-2007 |
pjd |
Call zil_commit() (if ZIL is not disabled) after every non-read request (BIO_WRITE and BIO_FLUSH) as it is done is Solaris. The difference is that Solaris calls it only for sync requests, but we can't say in GEOM is the request is sync or async, so we do it for every request.
MFC after: 1 week
|
#
172836 |
|
20-Oct-2007 |
julian |
Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. Thos makes way for us to add REAL kthread_create() and friends that actually make theads. it turns out that most of these calls actually end up being moved back to the thread version when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.
|
#
168962 |
|
22-Apr-2007 |
pjd |
MFp4: Reduce diff against vendor code: - Move FreeBSD-specific code to zfs_freebsd_*() functions in zfs_vnops.c and keep original functions as similar to vendor's code as possible. - Add various includes back, now that we have them.
|
#
168404 |
|
05-Apr-2007 |
pjd |
Please welcome ZFS - The last word in file systems.
ZFS file system was ported from OpenSolaris operating system. The code in under CDDL license.
I'd like to thank all SUN developers that created this great piece of software.
Supported by: Wheel LTD (http://www.wheel.pl/) Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/) Supported by: Sentex (http://www.sentex.net/)
|