#
333196 |
|
03-May-2018 |
avg |
MFC r332426: allow ZFS pool to have temporary name for duration of current import
The change adds a -t <name> option to zpool create, and a -t option to the form of zpool import that takes an old name and a new name. This allows a pool to be imported (or created) under a name that differs from its real, permanent name without affecting that name. This is useful when working with VM images or images of other physical systems that happen to contain a ZFS pool with the same name as a pool on the host system.
Sponsored by: Panzura (porting)
|
#
325540 |
|
08-Nov-2017 |
avg |
MFC r324757: remove spa_sync_on assert from spa_async_thread_vd
|
#
323747 |
|
19-Sep-2017 |
avg |
MFC r321471: spa_import_rootpool should be able to handle an imported root pool
That is required to support reboot -r with a new root filesystem being on an already imported pool.
PR: 210721
|
#
314857 |
|
07-Mar-2017 |
avg |
MFC r314058: zfs: lower priority of zio_write_issue threads by four
Obtained from: Panzura Sponsored by: Panzura
|
#
314668 |
|
04-Mar-2017 |
avg |
MFC r314273: zfs: call spa_deadman on a taskqueue thread
|
#
314356 |
|
27-Feb-2017 |
avg |
MFC r314059: zfs: move zio_taskq_basedc under SYSDC
|
#
310516 |
|
24-Dec-2016 |
avg |
MFC r309250: MFV r309249: 3821 Race in rollback, zil close, and zil flush
|
#
307279 |
|
14-Oct-2016 |
mav |
MFC r305331: MFV r304155: 7090 zfs should improve allocation order and throttle allocations
illumos/illumos-gate@0f7643c7376dd69a08acbfc9d1d7d548b10c846a https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b10c846a
https://www.illumos.org/issues/7090
When write I/Os are issued, they are issued in block order but the ZIO pipeline will drive them asynchronously through the allocation stage which can result in blocks being allocated out-of-order. It would be nice to preserve as much of the logical order as possible.
In addition, the allocations are equally scattered across all top-level VDEVs but not all top-level VDEVs are created equally. The pipeline should be able to detect devices that are more capable of handling allocations and should allocate more blocks to those devices. This allows for dynamic allocation distribution when devices are imbalanced as fuller devices will tend to be slower than empty devices.
The change includes a new pool-wide allocation queue which would throttle and order allocations in the ZIO pipeline. The queue would be ordered by issued time and offset and would provide an initial amount of allocation of work to each top-level vdev. The allocation logic utilizes a reservation system to reserve allocations that will be performed by the allocator. Once an allocation is successfully completed it's scheduled on a given top-level vdev. Each top-level vdev maintains a maximum number of allocations that it can handle (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels * mg_alloc_queue_depth) are distributed across the top-level vdevs metaslab groups and round robin across all eligible metaslab groups to distribute the work. As top-levels complete their work, they receive additional work from the pool-wide allocation queue until the allocation queue is emptied.
Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: George Wilson <george.wilson@delphix.com>
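The reservation step described in this commit message can be pictured with a small user-space sketch. It is only an illustration under stated assumptions, not the ZFS code: `alloc_group_t`, `reserve_slot` and `release_slot` are hypothetical names, and only `mg_alloc_queue_depth` is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a top-level vdev's metaslab group. */
typedef struct alloc_group {
	int mg_alloc_queue_depth;	/* max allocations the group accepts */
	int reserved;			/* allocations currently reserved */
} alloc_group_t;

/*
 * Try to reserve one allocation slot, walking the groups round robin
 * starting at *rotor. Returns the index of the chosen group, or -1 if
 * every group is at its queue depth and the caller has to wait.
 */
static int
reserve_slot(alloc_group_t *groups, size_t ngroups, size_t *rotor)
{
	assert(groups != NULL && rotor != NULL && ngroups > 0);

	for (size_t i = 0; i < ngroups; i++) {
		size_t idx = (*rotor + i) % ngroups;

		if (groups[idx].reserved < groups[idx].mg_alloc_queue_depth) {
			groups[idx].reserved++;
			/* The next caller starts with the following group. */
			*rotor = (idx + 1) % ngroups;
			return ((int)idx);
		}
	}
	return (-1);
}

/* Give a slot back once the allocation on that group has completed. */
static void
release_slot(alloc_group_t *groups, size_t idx)
{
	assert(groups[idx].reserved > 0);
	groups[idx].reserved--;
}
```

A group that finishes work calls `release_slot()`, which is what lets it receive more work from the queue on the next `reserve_slot()` call.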
|
#
307127 |
|
11-Oct-2016 |
mav |
MFC r305224: MFV r304158: 7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often
illumos/illumos-gate@b72b6bb10ad55121a1b352c6f68ebdc8e20c9086 https://github.com/illumos/illumos-gate/commit/b72b6bb10ad55121a1b352c6f68ebdc8e20c9086
https://www.illumos.org/issues/7136 6922 added ESC_ZFS_VDEV_REMOVE_AUX and ESC_ZFS_VDEV_REMOVE_DEV sysevents whenever an aux device gets removed from a pool. However, those sysevents will be created without the vdev_guid and vdev_path fields. It would be better to always populate those fields.
https://www.illumos.org/issues/7115 The addition of spa_event_notify in vdev removal code (see #6922) causes events to be generated even if the spare failed to be removed with EBUSY.
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Alan Somers <asomers@gmail.com>
|
#
307122 |
|
11-Oct-2016 |
mav |
MFC r305209: MFV r302660: 6314 buffer overflow in dsl_dataset_name
illumos/illumos-gate@9adfa60d484ce2435f5af77cc99dcd4e692b6660 https://github.com/illumos/illumos-gate/commit/9adfa60d484ce2435f5af77cc99dcd4e692b6660
https://www.illumos.org/issues/6314 Callers of dsl_dataset_name pass a buffer of size ZFS_MAXNAMELEN, but dsl_dataset_name copies the dataset's name PLUS the snapshot name to it, resulting in a max of 2 * ZFS_MAXNAMELEN + '@'.
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>
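The overflow arises because "dataset@snapshot" can be longer than either component. A minimal sketch of a bounded join that reports truncation instead of overrunning the buffer is shown below. It is not the upstream fix: `compose_snapshot_name` and `SKETCH_MAXNAMELEN` (with its value of 256) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define	SKETCH_MAXNAMELEN	256	/* assumed value, illustration only */

/*
 * Write "<dataset>@<snapshot>" into buf. Returns 0 on success, or
 * ENAMETOOLONG if the joined name plus its terminating NUL does not
 * fit in buflen bytes. snprintf() never writes past buf + buflen.
 */
static int
compose_snapshot_name(char *buf, size_t buflen, const char *dataset,
    const char *snapshot)
{
	int n;

	assert(buf != NULL && buflen > 0);

	n = snprintf(buf, buflen, "%s@%s", dataset, snapshot);
	if (n < 0 || (size_t)n >= buflen)
		return (ENAMETOOLONG);
	return (0);
}
```

The key point is that the caller's buffer size is passed in and checked, rather than assuming that one name-length constant covers both components.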
|
#
307055 |
|
11-Oct-2016 |
mav |
MFC r305198: MFV r302647: 6922 Emit ESC_ZFS_VDEV_REMOVE_AUX after removing an aux device
illumos/illumos-gate@63364b0ee2604783e7a55f8425888867768eafa4 https://github.com/illumos/illumos-gate/commit/63364b0ee2604783e7a55f8425888867768eafa4
https://www.illumos.org/issues/6922 ZFS does not do a config_sync after removing an aux (spare, log, or cache) device. AFAICT this isn't being done because it is slow and was deemed unnecessary. However, it should be such a rare operation that speed doesn't matter, and not doing it results in two problems: 1) It is theoretically possible to remove an aux device from one pool and attach it to another, then lose power. When power is restored, both pools would think that they own the aux device. 2) Removal of the aux device doesn't send any useful sysevents to userland.
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Alan Somers <asomers@gmail.com>
|
#
307052 |
|
11-Oct-2016 |
mav |
MFC r305193: MFV r302642: 6876 Stack corruption after importing a pool with a too-long name
illumos/illumos-gate@c971037baa5d64dfecf6d87ed602fc3116ebec41 https://github.com/illumos/illumos-gate/commit/c971037baa5d64dfecf6d87ed602fc3116ebec41
https://www.illumos.org/issues/6876 Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for trouble. We should check every dataset on import, using a 1024 byte buffer and checking each time to see if the dataset's new name is longer than 256 bytes.
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Paul Dagnelie <pcd@delphix.com>
|
#
298469 |
|
22-Apr-2016 |
avg |
MFC r297709: zio write issue threads should have lower (numerically greater) priority
|
#
297115 |
|
20-Mar-2016 |
mav |
MFC r296528: MFV r296527: 6659 nvlist_free(NULL) is a no-op
Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Marcel Telka <marcel@telka.sk> Approved by: Robert Mustacchi <rm@joyent.com> Author: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
illumos/illumos-gate@aab83bb83be7342f6cfccaed8d5fe0b2f404855d
|
#
297112 |
|
20-Mar-2016 |
mav |
MFC r296519: MFV r296518: 5027 zfs large block support (add copyright)
Author: Matthew Ahrens <matt@mahrens.org>
illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9
|
#
297100 |
|
20-Mar-2016 |
mav |
MFC r294811: MFV r294810: 6414 vdev_config_sync could be simpler
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Will Andrews <will@firepipe.net>
illumos/illumos-gate@eb5bb58421f46cee79155a55688e6c675e7dd361
|
#
297077 |
|
20-Mar-2016 |
mav |
MFC r277300 (by smh): Mechanically convert cddl sun #ifdef's to illumos
Since the upstream for cddl code is now illumos not sun, mechanically convert all sun #ifdef's to illumos #ifdef's which have been used in all newer code for some time.
Also do a manual pass to correct the use of #ifdef comments as per style(9), as well as a few uses of #if defined(__FreeBSD__) vs #ifndef illumos.
|
#
297076 |
|
20-Mar-2016 |
mav |
MFC r271785: Reorder sysctls for spa.c global tunables; add sysctl for ccw_retry_interval.
|
#
297067 |
|
20-Mar-2016 |
mav |
MFC r264670: MFV r264667:
4752 fan out read zio taskqs
illumos/illumos-gate@1b497ab83e8f1c58bba5da59c649207a442a4720
|
#
294843 |
|
26-Jan-2016 |
asomers |
MFC r292066, r292069, r293708, r294027, and r294358, mostly to vdev_geom.c
r292066 | asomers | 2015-12-10 14:46:21 -0700 (Thu, 10 Dec 2015) | 25 lines
During vdev_geom_open, require that the vdev guids match the device's label except during split, add, or create operations. This fixes a bug where the wrong disk could be returned, and higher layers of ZFS would immediately eject it again.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:
o When opening by GUID, require both the pool and vdev GUIDs to match. While it is highly unlikely for two vdevs to have the same vdev GUIDs, the ZFS storage pool allocator only guarantees they are unique within a pool.
o Modify the open behavior to:
  - If we are opening a vdev that hasn't previously been opened, open by path without checking GUIDs.
  - Otherwise, open by path and verify GUIDs.
  - If that fails, search all geom providers for a device with matching GUIDs.
  - If that fails, return ENOENT.
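The open order listed above can be sketched as a small decision function. This is a simplified model rather than the vdev_geom.c code: `provider_t`, `vdev_want_t` and `open_vdev` are hypothetical, and the sketch assumes that the GUID search is only attempted for a vdev that has been opened before.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a disk and the label found on it. */
typedef struct provider {
	const char *path;
	uint64_t pool_guid;
	uint64_t vdev_guid;
} provider_t;

/* What the pool configuration expects for the vdev being opened. */
typedef struct vdev_want {
	const char *path;
	uint64_t pool_guid;
	uint64_t vdev_guid;
	bool previously_opened;
} vdev_want_t;

static bool
guids_match(const provider_t *p, const vdev_want_t *w)
{
	return (p->pool_guid == w->pool_guid && p->vdev_guid == w->vdev_guid);
}

static const provider_t *
find_by_path(const provider_t *p, size_t n, const char *path)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(p[i].path, path) == 0)
			return (&p[i]);
	return (NULL);
}

static const provider_t *
find_by_guids(const provider_t *p, size_t n, const vdev_want_t *w)
{
	for (size_t i = 0; i < n; i++)
		if (guids_match(&p[i], w))
			return (&p[i]);
	return (NULL);
}

/* Returns 0 and sets *out on success, ENOENT if nothing suitable exists. */
static int
open_vdev(const provider_t *p, size_t n, const vdev_want_t *w,
    const provider_t **out)
{
	const provider_t *cand;

	assert(w != NULL && out != NULL);

	cand = find_by_path(p, n, w->path);
	if (w->previously_opened) {
		/* The path is only trusted if both GUIDs match the label. */
		if (cand != NULL && !guids_match(cand, w))
			cand = NULL;
		/* Otherwise search every provider for matching GUIDs. */
		if (cand == NULL)
			cand = find_by_guids(p, n, w);
	}
	if (cand == NULL)
		return (ENOENT);
	*out = cand;
	return (0);
}
```

With this order, a disk that has moved to a different device node is still found by its GUIDs, while a different disk that now sits at the old path is rejected.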
r292069 | asomers | 2015-12-10 17:04:13 -0700 (Thu, 10 Dec 2015) | 6 lines
Change an important error message from ZFS_LOG to printf
r293708 | asomers | 2016-01-11 15:15:46 -0700 (Mon, 11 Jan 2016) | 16 lines
Fix importing l2arc device by guid
After r292066, vdev_geom verifies both the vdev and pool guids of device labels during open. However, spare and l2arc devices don't have pool guids, so opening them by guid will fail (opening by path, when the pathname is known, still succeeds). This change allows a vdev to be opened by guid if the label contains no pool_guid, which is the case for inactive spares and l2arc devices.
r294027 | asomers | 2016-01-14 11:19:05 -0700 (Thu, 14 Jan 2016) | 14 lines
Fix race condition involving ZFS remove events
When a ZFS drive disappears, ZFS sends a resource.fs.zfs.removed event to userland. A userland program like zfsd(8) can use that event, for example to activate a hotspare. The current code contains a race condition: vdev_geom sends the sysevent _before_ spa.c updates the vdev's status, causing userland processes to see pool state that does not reflect the device removal. This change moves the sysevent to spa.c, closing the race.
r294358 | asomers | 2016-01-19 16:16:24 -0700 (Tue, 19 Jan 2016) | 10 lines
Quell harmless CID about unchecked return value in nvlist_get_guids.
The return value doesn't need to be checked, because nvlist_get_guid's callers check the returned values of the guids.
|
#
294334 |
|
19-Jan-2016 |
dim |
MFC r294102:
MFV r294101: 6527 Possible access beyond end of string in zpool comment
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Gordon Ross <gwr@nexenta.com>
illumos/illumos-gate@2bd7a8d078223b122d65fea49bb8641f858b1409
This fixes erroneous double increments of the 'check' variable in a loop in spa_prop_validate(). I ran into this in the clang380-import branch, where clang 3.8.0 warns about it. (It is already fixed there.)
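To make the bug class concrete, here is a hedged sketch of a validation loop in which the loop header is the only place that advances the cursor. The function name and the 0x7f cutoff are assumptions of this sketch, not a copy of spa_prop_validate().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Reject a comment that contains a byte at or above 0x7f. A second
 * "check++" inside the loop body would skip every other byte and, for
 * strings of odd length, step over the terminating NUL.
 */
static int
validate_comment(const char *strval)
{
	const char *check;

	assert(strval != NULL);

	for (check = strval; *check != '\0'; check++) {
		if ((unsigned char)*check >= 0x7f)
			return (EINVAL);
	}
	return (0);
}
```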
|
#
290757 |
|
13-Nov-2015 |
mav |
MFC r289422: 4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Garrett D'Amore <garrett@damore.org> Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@45818ee124adeaaf947698996b4f4c722afc6d1f
This is only a partial merge of the respective ZFS infrastructure changes. At the moment the FreeBSD kernel does not have those crypto algorithms, so the parts of the code that enable them are commented out. When they are implemented, it will be trivial to plug them in.
|
#
290745 |
|
13-Nov-2015 |
mav |
MFC r287745: 5997 FRU field not set during pool creation and never updated
ZFS already supports storing the vdev FRU in a vdev property. There is code in libzfs to work with this property, and there is code in the zfs-retire FMA module that looks for that information. But there is no code actually setting or updating the FRU.
To address this, ZFS is changed to send a handful of new events whenever a vdev is added, attached, cleared, or onlined, as well as when a pool is created or imported.
Note that syseventd is not currently available on FreeBSD, so some work is still needed (e.g. in zfsd) to support the new ZFS events and actually use this capability. This changeset is mostly a diff reduction from upstream.
illumos/illumos-gate@1437283407f89cab03860accf49408f94559bc34
Illumos issues:
5997 FRU field not set during pool creation and never updated https://www.illumos.org/issues/5997
|
#
288597 |
|
03-Oct-2015 |
mav |
MFC r287744 (by delphij): Reduce diff against upstream.
|
#
288571 |
|
03-Oct-2015 |
mav |
MFC r286705: 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>
While running 'zfs recv' we noticed that every 128th 8K block required a read. We were seeing that restore_write() was calling dmu_tx_hold_write() and the indirect block was not cached. We should prefetch upcoming indirect blocks to avoid having to go to disk and blocking the restore_write().
Allow an incremental send stream to be received as a clone, even if the stream does not mark it as a clone.
|
#
288569 |
|
03-Oct-2015 |
mav |
MFC r286686: 5269 zpool import slow
illumos/illumos-gate@12380e1e701fda28c9e9f32d01cafb54af279eb5
https://www.illumos.org/issues/5269 When importing a pool (at boot or with zpool import) with many filesystems, the process can take minutes. It doesn't matter whether the pool has been exported cleanly or uncleanly. The problem is that each dataset has its own log chain. On import, all datasets have to be checked to see if there are logs to replay. The idea is to speed up this process by parallelizing it.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Arne Jansen <jansen@webgods.de>
|
#
288558 |
|
03-Oct-2015 |
mav |
MFC r286600: 5808 spa_check_logs is not necessary on readonly pools
Reviewed by: George Wilson <george@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Simon Klinkert <simon.klinkert@gmail.com> Reviewed by: Will Andrews <will@freebsd.org> Approved by: Gordon Ross <gwr@nexenta.com> Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@23367a2f2caec1ccb4d918bdd0f2fc2c9cadcd06
|
#
288549 |
|
03-Oct-2015 |
mav |
MFC r286575: 5056 ZFS deadlock on db_mtx and dn_holds
Reviewed by: Will Andrews <willa@spectralogic.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Justin Gibbs <justing@spectralogic.com>
illumos/illumos-gate@bc9014e6a81272073b9854d9f65dd59e18d18c35
|
#
287667 |
|
11-Sep-2015 |
avg |
MFC r287100: spa_import_rootpool: prevent lock and resource leak
PR: 198563
|
#
285001 |
|
01-Jul-2015 |
avg |
MFC r284304: MFV r284030: 5818 zfs {ref}compressratio is incorrect with 4k sector size
Note: no MFC to stable/9 because r268075 (vendor r267565) has not been MFC-ed.
|
#
277585 |
|
23-Jan-2015 |
delphij |
MFC r275782: MFV r275551:
Remove "dbuf phys" db->db_data pointer aliases.
Use function accessors that cast db->db_data to the appropriate "phys" type, removing the need for clients of the dmu buf user API to keep properly typed pointer aliases to db->db_data in order to conveniently access their data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c: In zap_leaf() and zap_leaf_byteswap, now that the pointer alias field l_phys has been removed, use the db_data field in an on stack dmu_buf_t to point to the leaf's phys data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c: Remove the db_user_data_ptr_ptr field from dbuf and all logic to maintain it.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c: Modify the DMU buf user API to remove the ability to specify a db_data aliasing pointer (db_user_data_ptr_ptr).
cddl/contrib/opensolaris/cmd/zdb/zdb.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h: Create and use the new "phys data" accessor functions dsl_dir_phys(), dsl_dataset_phys(), zap_m_phys(), zap_f_phys(), and zap_leaf_phys().
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h: Remove now unused "phys pointer" aliases to db->db_data from clients of the DMU buf user API.
Illumos issue: 5314 Remove "dbuf phys" db->db_data pointer aliases in ZFS
|
#
277584 |
|
23-Jan-2015 |
delphij |
MFC r275781: MFV r275550:
In addition to r273158, make the code in spa_sync() that checks if the current TXG is a no-op TXG less fragile.
Illumos issue: 5347 idle pool may run itself out of space
|
#
276081 |
|
22-Dec-2014 |
delphij |
MFC r274337,r274673,274681,r275515:
ZFS large block support. The default recordsize remains at 128KB.
A new tunable/sysctl variable, vfs.zfs.max_recordsize is added to allow adjusting the permitted maximum record size, or zfs_max_recordsize, with a default of 1MB. ZFS will not allow setting recordsize greater than zfs_max_recordsize as a safety belt, because larger recordsize means greater read and write latency and more memory usage.
Please note that booting from datasets that have a recordsize greater than 128KB is not supported (but it is okay to enable the feature on the pool).
A limited safety belt is provided for the mounted root filesystem, but use caution when using a larger value.
Illumos issue: 5027 zfs large block support
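The safety belt amounts to an upper-bound check when recordsize is set. A minimal sketch follows, assuming a 512-byte minimum, a power-of-two requirement and these particular errno values, none of which are stated in the commit message. `check_recordsize` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Default from the commit message: 1MB. Adjustable at run time in the kernel. */
static uint64_t zfs_max_recordsize = 1024 * 1024;

/*
 * Validate a requested recordsize. The minimum and power-of-two checks
 * are assumptions of this sketch; the upper bound is the safety belt.
 */
static int
check_recordsize(uint64_t size)
{
	assert(zfs_max_recordsize >= 512);

	if (size < 512 || (size & (size - 1)) != 0)
		return (EINVAL);
	if (size > zfs_max_recordsize)
		return (ERANGE);
	return (0);
}
```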
|
#
273348 |
|
20-Oct-2014 |
delphij |
MFC r272598: MFV r272585:
Split the godfather zio into one zio per CPU to reduce lock contention.
Illumos issue: 5176 lock contention on godfather zio
|
#
271001 |
|
03-Sep-2014 |
delphij |
MFC r270247: MFV r270195:
Illumos issue: 5045 use atomic_{inc,dec}_* instead of atomic_add_*
|
#
269773 |
|
10-Aug-2014 |
delphij |
MFC r269118: MFV r269010:
Import Illumos changes to address the following Illumos issues:
4976 zfs should only avoid writing to a failing non-redundant top-level vdev
4978 ztest fails in get_metaslab_refcount()
4979 extend free space histogram to device and pool
4980 metaslabs should have a fragmentation metric
4981 remove fragmented ops vector from block allocator
4982 space_map object should proactively upgrade when feature is enabled
4984 device selection should use fragmentation metric
|
#
269219 |
|
29-Jul-2014 |
delphij |
MFC r268720: MFV r268714:
Improve extreme rewind import.
When doing an "extreme rewind" import ("zpool import -XF"), we attempt to verify all data in the pool, essentially scrubbing the entire pool. The problem is that spa_load_verify_cb() issues an unbounded number of concurrent scrub i/os. This can lead to all of memory being used for these zio's, wedging the system. Like normal scrub, we need to put a cap on the number of outstanding i/os, and have the traverse thread block when we reach this cap.
For this purpose the cap can be very large (10,000) to optimize the elevator algorithm. Three kernel tunables have been added:
vfs.zfs.spa_load_verify_maxinflight vfs.zfs.spa_load_verify_metadata vfs.zfs.spa_load_verify_data
The latter two tunables control whether metadata and/or user data are verified when doing an extreme rewind.
Make 'zpool import -T' imply scrub.
Make zpool import -T <txg> accept hexadecimal values for the txg when prefixed with 0x.
Skip txg's for which there is no uberblock when doing extreme rewind.
Skip reading all user data twice by skipping prefetches when doing extreme rewinds as we do not access via the ARC.
Illumos issues:
4970 need controls on i/o issued by zpool import -XF
4971 zpool import -T should accept hex values
4972 zpool import -T implies extreme rewind, and thus a scrub
4973 spa_load_retry retries the same txg
4974 spa_load_verify() reads all data twice
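Accepting a 0x-prefixed txg is the kind of thing the C library already supports through base 0 parsing. The helper below is a hedged sketch of that approach. `parse_txg` is a hypothetical name, and whether zpool parses the argument exactly this way is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a txg argument. Base 0 makes strtoull() accept a "0x" prefix
 * for hexadecimal (and, as a side effect, a leading "0" for octal).
 * Returns 0 and stores the value in *txg, or EINVAL on malformed input.
 */
static int
parse_txg(const char *arg, uint64_t *txg)
{
	char *end;
	unsigned long long v;

	assert(arg != NULL && txg != NULL);

	errno = 0;
	v = strtoull(arg, &end, 0);
	if (errno != 0 || end == arg || *end != '\0')
		return (EINVAL);
	*txg = (uint64_t)v;
	return (0);
}
```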
|
#
269006 |
|
22-Jul-2014 |
delphij |
MFC r268473: MFV r268455:
Use reserved space for ZFS administrative commands.
|
#
268658 |
|
15-Jul-2014 |
delphij |
MFC r268126: MFV r268121:
4924 LZ4 Compression for metadata
|
#
268657 |
|
15-Jul-2014 |
delphij |
MFC r268123: MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t
|
#
268650 |
|
15-Jul-2014 |
delphij |
MFC r268079: MFV r267566:
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
|
#
268649 |
|
15-Jul-2014 |
delphij |
MFC r268075: MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression") 4913 zfs release should not be subject to space checks
|
#
267571 |
|
17-Jun-2014 |
mav |
MFC r267029, r267038: Replace gethrtime() with cpu_ticks() as the source of randomness for taskqueue selection. In our port gethrtime() is updated only at the HZ rate, which makes it unusable for this purpose and completely negates the benefit of multiple taskqueues.
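Why a coarse clock defeats the fan-out can be shown with a toy model. It assumes the queue is picked as a counter value modulo the number of queues. Both function names are hypothetical, and `coarse_clock()` merely models a timer that advances once per tick.

```c
#include <assert.h>
#include <stdint.h>

/* Map a counter value onto one of ntaskqs queues. */
static unsigned
select_taskq(uint64_t counter, unsigned ntaskqs)
{
	assert(ntaskqs > 0);
	return ((unsigned)(counter % ntaskqs));
}

/*
 * Model a clock that only advances once per tick: every reading taken
 * within the same tick returns the same value.
 */
static uint64_t
coarse_clock(uint64_t fine_counter, uint64_t counts_per_tick)
{
	assert(counts_per_tick > 0);
	return (fine_counter / counts_per_tick * counts_per_tick);
}
```

In this model, dispatches that happen within one tick all land on the same queue when the coarse clock is used, while the fine-grained counter spreads them across the queues.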
|
#
263397 |
|
19-Mar-2014 |
delphij |
MFC r260150: MFV r259170:
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47bfcd2ad9bf501faec8e75c08095e4f
NOTE: Make sure the boot code is updated if a zpool upgrade is done on boot zpool.
|
#
263390 |
|
19-Mar-2014 |
delphij |
MFC r259813 + r259813: MFV r258374:
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool features
illumos/illumos-gate@2acef22db7808606888f8f92715629ff3ba555b9
|
#
263269 |
|
17-Mar-2014 |
delphij |
MFC r262676:
All callers of the static function load_nvlist() in spa.c handle the error case, so there is no reason to assert that we won't hit an error. Instead, just return that error to the caller and have the upper layer handle it.
Obtained from: FreeNAS Reported by: rodrigc Reviewed by: Matthew Ahrens
|
#
262093 |
|
17-Feb-2014 |
avg |
MFC r258717: MFV r258371,r258372: 4101 metaslab_debug should allow for fine-grained control
|
#
260763 |
|
16-Jan-2014 |
avg |
MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler performance work
Sponsored by: HybridCluster [merge]
|
#
260750 |
|
16-Jan-2014 |
avg |
MFC r258631: MFV r247578
3581 spa_zio_taskq[ZIO_TYPE_FREE][ZIO_TASKQ_ISSUE]->tq_lock is piping hot
|
#
260742 |
|
16-Jan-2014 |
avg |
MFC r258630: 734 taskq_dispatch_prealloc() desired
|
#
260617 |
|
13-Jan-2014 |
delphij |
MFC r259811:
MFV r258373:
4168 ztest assertion failure in dbuf_undirty
4169 verbatim import causes zdb to segfault
4170 zhack leaves pool in ACTIVE state
illumos/illumos-gate@7fdd916c474ea52896c671bbe7b56ba34a1ca132
|
#
288597 |
|
03-Oct-2015 |
mav |
MFC r287744 (by delphij): Reduce diff against upstream.
|
#
288571 |
|
03-Oct-2015 |
mav |
MFC r286705: 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>
While running 'zfs recv' we noticed that every 128th 8K block required a read. We were seeing that restore_write() was calling dmu_tx_hold_write() and the indirect block was not cached. We should prefetch upcoming indirect blocks to avoid having to go to disk and blocking the restore_write().
Allow an incremental send stream to be received as a clone, even if the stream does not mark it as a clone.
|
#
288569 |
|
03-Oct-2015 |
mav |
MFC r286686: 5269 zpool import slow
illumos/illumos-gate@12380e1e701fda28c9e9f32d01cafb54af279eb5
https://www.illumos.org/issues/5269 When importing a pool (at boot or with zpool import) with many filesystem, the process can take minutes. It doesn't matter whether the pool has been exported cleanly or uncleanly. The problem is that each dataset has its own log chain. On import, all datasets have to be checked if there are logs to replay. The idea is to speed up this process by paralellizing it.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Arne Jansen <jansen@webgods.de>
|
#
288558 |
|
03-Oct-2015 |
mav |
MFC r286600: 5808 spa_check_logs is not necessary on readonly pools
Reviewed by: George Wilson <george@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Simon Klinkert <simon.klinkert@gmail.com> Reviewed by: Will Andrews <will@freebsd.org> Approved by: Gordon Ross <gwr@nexenta.com> Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@23367a2f2caec1ccb4d918bdd0f2fc2c9cadcd06
|
#
288549 |
|
03-Oct-2015 |
mav |
MFC r286575: 5056 ZFS deadlock on db_mtx and dn_holds
Reviewed by: Will Andrews <willa@spectralogic.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Justin Gibbs <justing@spectralogic.com>
illumos/illumos-gate@bc9014e6a81272073b9854d9f65dd59e18d18c35
|
#
287667 |
|
11-Sep-2015 |
avg |
MFC r287100: spa_import_rootpool: prevent lock and resource leak
PR: 198563
|
#
285001 |
|
01-Jul-2015 |
avg |
MFC r284304: MFV r284030: 5818 zfs {ref}compressratio is incorrect with 4k sector size
Note: no MFC to stable/9 because r268075 (vendor r267565) has not been MFC-ed.
|
#
277585 |
|
23-Jan-2015 |
delphij |
MFC r275782: MFV r275551:
Remove "dbuf phys" db->db_data pointer aliases.
Use function accessors that cast db->db_data to the appropriate "phys" type, removing the need for clients of the dmu buf user API to keep properly typed pointer aliases to db->db_data in order to conveniently access their data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c: In zap_leaf() and zap_leaf_byteswap, now that the pointer alias field l_phys has been removed, use the db_data field in an on stack dmu_buf_t to point to the leaf's phys data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c: Remove the db_user_data_ptr_ptr field from dbuf and all logic to maintain it.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c: Modify the DMU buf user API to remove the ability to specify a db_data aliasing pointer (db_user_data_ptr_ptr).
cddl/contrib/opensolaris/cmd/zdb/zdb.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h: Create and use the new "phys data" accessor functions dsl_dir_phys(), dsl_dataset_phys(), zap_m_phys(), zap_f_phys(), and zap_leaf_phys().
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dataset.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_impl.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zap_leaf.h: Remove now unused "phys pointer" aliases to db->db_data from clients of the DMU buf user API.
Illumos issue: 5314 Remove "dbuf phys" db->db_data pointer aliases in ZFS
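The difference between a cached "phys pointer" alias and an accessor function can be modeled in a few lines. This is a minimal Python sketch, not the kernel code: the classes `Dbuf` and `DslDir` are hypothetical stand-ins, and the Python function `dsl_dir_phys()` only borrows the name of the accessor mentioned above to show the idea of always reading through the buffer.

```python
class Dbuf:
    """Hypothetical stand-in for a DMU buffer holding on-disk ("phys") data."""

    def __init__(self, data):
        self.db_data = data


class DslDir:
    """Hypothetical stand-in for an in-core object that owns a dbuf."""

    def __init__(self, dbuf):
        self.dd_dbuf = dbuf


def dsl_dir_phys(dd):
    # Accessor: always dereference through the dbuf, so the caller sees
    # whatever db_data currently refers to instead of a saved alias.
    return dd.dd_dbuf.db_data
```

A saved alias keeps pointing at the old data if the buffer's `db_data` is replaced, while the accessor follows the change.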
|
#
277584 |
|
23-Jan-2015 |
delphij |
MFC r275781: MFV r275550:
In addition to r273158, make the code in spa_sync() that checks if the current TXG is a no-op TXG less fragile.
Illumos issue: 5347 idle pool may run itself out of space
|
#
276081 |
|
22-Dec-2014 |
delphij |
MFC r274337,r274673,274681,r275515:
ZFS large block support. The default recordsize remains at 128KB.
A new tunable/sysctl variable, vfs.zfs.max_recordsize, is added to allow adjusting the permitted maximum record size (zfs_max_recordsize), with a default of 1MB. As a safety belt, ZFS will not allow setting recordsize greater than zfs_max_recordsize, because a larger recordsize means greater read and write latency and more memory usage.
Please note that booting from datasets that have a recordsize greater than 128KB is not supported (but it is okay to enable the feature on the pool).
A limited safety belt is provided for a mounted root filesystem, but use caution when using a larger value.
Illumos issue: 5027 zfs large block support
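The recordsize limit described above can be sketched as a simple range check. This is an illustrative Python model under stated assumptions, not the kernel implementation: the function name `validate_recordsize` and the constant names are hypothetical, and the power-of-two rule with a 512-byte minimum is an assumption about how record sizes are normally constrained. Only the 128KB default and the 1MB default limit come from the commit message.

```python
DEFAULT_RECORDSIZE = 128 * 1024        # default recordsize per the commit message
DEFAULT_MAX_RECORDSIZE = 1024 * 1024   # default of vfs.zfs.max_recordsize per the commit message
MIN_BLOCKSIZE = 512                    # assumption: smallest permitted block size


def validate_recordsize(value, max_recordsize=DEFAULT_MAX_RECORDSIZE):
    """Return True if value is acceptable as a recordsize under the given limit."""
    # Assumption: record sizes must be powers of two.
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return is_power_of_two and MIN_BLOCKSIZE <= value <= max_recordsize
```

Raising the limit in this model corresponds to raising the tunable: a 2MB recordsize is rejected with the default limit and accepted once `max_recordsize` is set higher.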
|
#
273348 |
|
20-Oct-2014 |
delphij |
MFC r272598: MFV r272585:
Split the godfather zio into one zio per CPU to reduce lock contention.
Illumos issue: 5176 lock contention on godfather zio
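The general technique, replacing one heavily shared parent with one shard per CPU so that each shard has its own lock, can be sketched as follows. This is a hedged Python illustration only: `ShardedParent`, `add_child` and `total_children` are hypothetical names, and the real change operates on zio structures in the kernel.

```python
import threading


class ShardedParent:
    """Hypothetical parent object split into one shard per CPU."""

    def __init__(self, ncpus):
        # One lock and one child counter per shard instead of a single global pair.
        self.locks = [threading.Lock() for _ in range(ncpus)]
        self.children = [0] * ncpus

    def add_child(self, cpu_id):
        # A child attaches to the shard of the CPU it was created on, so
        # threads running on different CPUs take different locks.
        shard = cpu_id % len(self.locks)
        with self.locks[shard]:
            self.children[shard] += 1
        return shard

    def total_children(self):
        # Anything that needs the global view has to visit every shard.
        total = 0
        for lock, count in zip(self.locks, self.children):
            with lock:
                total += count
        return total
```

The trade-off is that operations needing the whole picture, such as waiting for all children, must now iterate over every shard.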
|
#
271001 |
|
03-Sep-2014 |
delphij |
MFC r270247: MFV r270195:
Illumos issue: 5045 use atomic_{inc,dec}_* instead of atomic_add_*
|
#
269773 |
|
10-Aug-2014 |
delphij |
MFC r269118: MFV r269010:
Import Illumos changes to address the following Illumos issues:
4976 zfs should only avoid writing to a failing non-redundant top-level vdev
4978 ztest fails in get_metaslab_refcount()
4979 extend free space histogram to device and pool
4980 metaslabs should have a fragmentation metric
4981 remove fragmented ops vector from block allocator
4982 space_map object should proactively upgrade when feature is enabled
4984 device selection should use fragmentation metric
|
#
269219 |
|
29-Jul-2014 |
delphij |
MFC r268720: MFV r268714:
Improve extreme rewind import.
When doing an "extreme rewind" import ("zpool import -XF"), we attempt to verify all data in the pool, essentially scrubbing the entire pool. The problem is that spa_load_verify_cb() issues an unbounded number of concurrent scrub i/os. This can lead to all of memory being used for these zio's, wedging the system. Like normal scrub, we need to put a cap on the number of outstanding i/os, and have the traverse thread block when we reach this cap.
For this purpose the cap can be very large (10,000) to optimize the elevator algorithm. Three kernel tunables have been added:
vfs.zfs.spa_load_verify_maxinflight
vfs.zfs.spa_load_verify_metadata
vfs.zfs.spa_load_verify_data
The latter two tunables control whether metadata and/or user data are verified when doing an extreme rewind.
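The cap on outstanding i/os described above, where the traversing thread blocks once the limit is reached, can be sketched with a counting semaphore. This is a minimal Python model, not the spa_load_verify code: `InflightThrottle`, `verify_blocks` and their parameters are hypothetical names chosen for the illustration, and threads stand in for asynchronous i/o completions.

```python
import threading


class InflightThrottle:
    """Hypothetical cap on the number of outstanding verification i/os."""

    def __init__(self, max_inflight):
        self._slots = threading.Semaphore(max_inflight)
        self._lock = threading.Lock()
        self.inflight = 0
        self.peak = 0

    def issue(self):
        # Blocks the caller (the traversing thread) once the cap is reached.
        self._slots.acquire()
        with self._lock:
            self.inflight += 1
            self.peak = max(self.peak, self.inflight)

    def done(self):
        # Called on i/o completion; frees a slot for the traversing thread.
        with self._lock:
            self.inflight -= 1
        self._slots.release()


def verify_blocks(nblocks, max_inflight):
    """Issue nblocks simulated i/os and return the peak number in flight."""
    throttle = InflightThrottle(max_inflight)
    workers = []
    for _ in range(nblocks):
        throttle.issue()
        # Each thread stands in for one asynchronous i/o completing.
        worker = threading.Thread(target=throttle.done)
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()
    return throttle.peak
```

Whatever the number of blocks, the peak never exceeds `max_inflight`, which is what keeps memory use bounded.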
Make 'zpool import -T' imply scrub.
Make zpool import -T <txg> accept hexadecimal values for the txg when prefixed with 0x.
Skip txg's for which there is no uberblock when doing extreme rewind.
Skip reading all user data twice by skipping prefetches when doing extreme rewinds as we do not access via the ARC.
Illumos issues:
4970 need controls on i/o issued by zpool import -XF
4971 zpool import -T should accept hex values
4972 zpool import -T implies extreme rewind, and thus a scrub
4973 spa_load_retry retries the same txg
4974 spa_load_verify() reads all data twice
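Accepting a txg in either decimal or 0x-prefixed hexadecimal form, as described for zpool import -T, can be sketched like this. The function name `parse_txg` is hypothetical and the sketch is not the zpool argument parser. It only illustrates the stated behavior.

```python
def parse_txg(text):
    """Parse a txg given as decimal, or as hexadecimal when prefixed with 0x."""
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text, 16)
    # Anything without the 0x prefix is treated as plain decimal in this sketch.
    return int(text, 10)
```

Malformed input such as a bare `0x` raises `ValueError`, which a caller would report as a usage error.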
|
#
269006 |
|
22-Jul-2014 |
delphij |
MFC r268473: MFV r268455:
Use reserved space for ZFS administrative commands.
|
#
268658 |
|
15-Jul-2014 |
delphij |
MFC r268126: MFV r268121:
4924 LZ4 Compression for metadata
|
#
268657 |
|
15-Jul-2014 |
delphij |
MFC r268123: MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t
|
#
268650 |
|
15-Jul-2014 |
delphij |
MFC r268079: MFV r267566:
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
|
#
268649 |
|
15-Jul-2014 |
delphij |
MFC r268075: MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks
|
#
267571 |
|
17-Jun-2014 |
mav |
MFC r267029, r267038: Replace gethrtime() with cpu_ticks() as the source of randomness for taskqueue selection. In our port gethrtime() is updated only at the HZ rate, which makes it unusable for this purpose and completely negates the benefit of multiple taskqueues.
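Why a coarse clock defeats this kind of selection can be shown with a small simulation. This is a hedged Python sketch under simple assumptions: the queue is picked as the clock value modulo the number of queues, and `pick_queue`, `simulate` and `dispatches_per_update` are hypothetical names that model neither gethrtime() nor cpu_ticks() exactly.

```python
def pick_queue(timestamp, nqueues):
    """Choose a taskqueue index from a timestamp (assumed selection rule)."""
    return timestamp % nqueues


def simulate(nqueues, ndispatches, dispatches_per_update):
    """Count how many distinct queues get used.

    dispatches_per_update models clock granularity: the visible clock value
    only advances once per that many dispatches.
    """
    chosen = set()
    for i in range(ndispatches):
        clock = i // dispatches_per_update
        chosen.add(pick_queue(clock, nqueues))
    return len(chosen)
```

With a clock that advances on every dispatch, all queues are used. With a clock that stays constant across a burst of dispatches, every task in the burst lands on the same queue.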
|
#
263397 |
|
19-Mar-2014 |
delphij |
MFC r260150: MFV r259170:
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47bfcd2ad9bf501faec8e75c08095e4f
NOTE: Make sure the boot code is updated if a zpool upgrade is done on boot zpool.
|
#
263390 |
|
19-Mar-2014 |
delphij |
MFC r259813 + r259813: MFV r258374:
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool features
illumos/illumos-gate@2acef22db7808606888f8f92715629ff3ba555b9
|
#
263269 |
|
17-Mar-2014 |
delphij |
MFC r262676:
All callers of the static function load_nvlist() in spa.c handle the error case, so there is no reason to assert that we won't hit an error. Instead, just return the error to the caller and have the upper layer handle it.
Obtained from: FreeNAS Reported by: rodrigc Reviewed by: Matthew Ahrens
|
#
262093 |
|
17-Feb-2014 |
avg |
MFC r258717: MFV r258371,r258372: 4101 metaslab_debug should allow for fine-grained control
|
#
260763 |
|
16-Jan-2014 |
avg |
MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler performance work
Sponsored by: HybridCluster [merge]
|
#
260750 |
|
16-Jan-2014 |
avg |
MFC r258631: MFV r247578
3581 spa_zio_taskq[ZIO_TYPE_FREE][ZIO_TASKQ_ISSUE]->tq_lock is piping hot
|
#
260742 |
|
16-Jan-2014 |
avg |
MFC r258630: 734 taskq_dispatch_prealloc() desired
|
#
260617 |
|
13-Jan-2014 |
delphij |
MFC r259811:
MFV r258373:
4168 ztest assertion failure in dbuf_undirty
4169 verbatim import causes zdb to segfault
4170 zhack leaves pool in ACTIVE state
illumos/illumos-gate@7fdd916c474ea52896c671bbe7b56ba34a1ca132
|