#
307279 |
|
14-Oct-2016 |
mav |
MFC r305331: MFV r304155: 7090 zfs should improve allocation order and throttle allocations
illumos/illumos-gate@0f7643c7376dd69a08acbfc9d1d7d548b10c846a
https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b10c846a
https://www.illumos.org/issues/7090
When write I/Os are issued, they are issued in block order, but the ZIO pipeline will drive them asynchronously through the allocation stage, which can result in blocks being allocated out of order. It would be nice to preserve as much of the logical order as possible.
In addition, the allocations are equally scattered across all top-level VDEVs, but not all top-level VDEVs are created equally. The pipeline should be able to detect devices that are more capable of handling allocations and should allocate more blocks to those devices. This allows for dynamic allocation distribution when devices are imbalanced, as fuller devices will tend to be slower than empty devices.
The change includes a new pool-wide allocation queue which would throttle and order allocations in the ZIO pipeline. The queue would be ordered by issued time and offset and would provide an initial amount of allocation work to each top-level vdev. The allocation logic utilizes a reservation system to reserve allocations that will be performed by the allocator. Once an allocation is successfully completed, it is scheduled on a given top-level vdev. Each top-level vdev maintains a maximum number of allocations that it can handle (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels * mg_alloc_queue_depth) are distributed across the top-level vdevs' metaslab groups, round-robin across all eligible metaslab groups, to distribute the work. As top-levels complete their work, they receive additional work from the pool-wide allocation queue until the allocation queue is emptied.
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
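The queueing behaviour described in this message can be modelled with a small sketch. This is a toy model, not the ZFS implementation: the class `AllocationThrottle`, its method names, and the single shared `queue_depth` (standing in for mg_alloc_queue_depth) are all hypothetical, and it leaves out the reservation system and any weighting by device capability. It only shows a pool-wide queue ordered by (issue time, offset) that hands work round-robin to top-level vdevs until each one reaches its depth limit.

```python
import heapq
from collections import deque


class AllocationThrottle:
    """Toy model: a pool-wide allocation queue feeding top-level vdevs."""

    def __init__(self, vdev_names, queue_depth):
        # queue_depth stands in for mg_alloc_queue_depth (assumed equal for all vdevs).
        self.queue_depth = queue_depth
        self.inflight = {name: 0 for name in vdev_names}
        self.rotor = deque(vdev_names)  # round-robin order over top-level vdevs
        self.pending = []               # min-heap of (issue_time, offset)
        self.log = []                   # (offset, vdev) pairs in dispatch order

    def submit(self, issue_time, offset):
        """Queue one allocation and dispatch whatever currently fits."""
        heapq.heappush(self.pending, (issue_time, offset))
        self._dispatch()

    def complete(self, vdev):
        """A vdev finished one allocation, so it can accept more work."""
        self.inflight[vdev] -= 1
        self._dispatch()

    def _next_vdev(self):
        # Visit each vdev at most once, starting where the rotor left off.
        for _ in range(len(self.rotor)):
            name = self.rotor[0]
            self.rotor.rotate(-1)
            if self.inflight[name] < self.queue_depth:
                return name
        return None

    def _dispatch(self):
        while self.pending:
            vdev = self._next_vdev()
            if vdev is None:
                return  # every vdev is at its depth limit: throttle
            _, offset = heapq.heappop(self.pending)
            self.inflight[vdev] += 1
            self.log.append((offset, vdev))


if __name__ == "__main__":
    throttle = AllocationThrottle(["vdev0", "vdev1"], queue_depth=1)
    for issue_time, offset in enumerate([0, 8, 16, 24]):
        throttle.submit(issue_time, offset)
    throttle.complete("vdev1")
    print(throttle.log, throttle.pending)
```

In this model, allocations that cannot be placed stay in the heap in issue order, so they are handed out in their logical order as vdevs drain.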
|
#
307268 |
|
14-Oct-2016 |
mav |
MFC r305324: MFV r303077: 7072 zfs fails to expand if lun added when os is in shutdown state
illumos/illumos-gate@c39a2aae1e2c439d156021edfc20910dad7f9891
https://github.com/illumos/illumos-gate/commit/c39a2aae1e2c439d156021edfc20910dad7f9891
https://www.illumos.org/issues/7072
upstream:
38733 zfs fails to expand if lun added when os is in shutdown state
DLPX-36910 spares and caches should not display expandable space
DLPX-39262 vdev_disk_open spam zfs_dbgmsg buffer
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
|
#
297112 |
|
20-Mar-2016 |
mav |
MFC r296519: MFV r296518: 5027 zfs large block support (add copyright)
Author: Matthew Ahrens <matt@mahrens.org>
illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9
|
#
290753 |
|
13-Nov-2015 |
mav |
MFC r289307: 6295 metaslab_condense's dbgmsg should include vdev id
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@freebsd.org>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Joe Stein <joe.stein@delphix.com>
illumos/illumos-gate@daec38ecb4fb5e73e4ca9e99be84f6b8c50c02fa
|
#
277553 |
|
22-Jan-2015 |
delphij |
MFC r275594: MFV r275540:
When importing a pool, don't assume that the pool configuration passed to vdev_load is always valid. A stale configuration may come with extra vdevs, in which case metaslab_init() would fail because a lower layer returns an error.
Change the code so that metaslab_init() handles errors from the lower layer and returns them to the upper layer, which then handles them.
Illumos issue: 5213 panic in metaslab_init due to space_map_open returning ENXIO
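The error-propagation pattern this change describes can be sketched as follows. The Python functions below are hypothetical stand-ins that borrow the names from the message; their signatures and the `known_objects` set are invented for illustration and do not mirror the real C interfaces. The sketch only shows each layer returning the lower layer's errno value instead of assuming success.

```python
import errno


def space_map_open(obj_id, known_objects):
    """Stand-in for the lowest layer: 0 on success, an errno value on failure."""
    return 0 if obj_id in known_objects else errno.ENXIO


def metaslab_init(obj_id, known_objects):
    """Return the lower layer's error instead of assuming success."""
    error = space_map_open(obj_id, known_objects)
    if error != 0:
        return error
    return 0


def vdev_load(metaslab_objects, known_objects):
    """Upper layer: stop at the first failure and report it to the caller."""
    for obj_id in metaslab_objects:
        error = metaslab_init(obj_id, known_objects)
        if error != 0:
            return error
    return 0


if __name__ == "__main__":
    # Object 3 plays the role of an extra vdev from a stale configuration.
    print(vdev_load([1, 2, 3], known_objects={1, 2}))
```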
|
#
276081 |
|
22-Dec-2014 |
delphij |
MFC r274337,r274673,r274681,r275515:
ZFS large block support. The default recordsize remains at 128KB.
A new tunable/sysctl variable, vfs.zfs.max_recordsize (zfs_max_recordsize), is added to allow adjusting the permitted maximum record size; its default is 1MB. As a safety belt, ZFS will not allow setting recordsize greater than zfs_max_recordsize, because a larger recordsize means greater read and write latency and more memory usage.
Please note that booting from datasets that have recordsize greater than 128KB is not supported (but it is okay to enable the feature on the pool).
Only a limited safety belt is provided for a mounted root filesystem, so use caution when using a larger value.
Illumos issue: 5027 zfs large block support
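The safety belt described above amounts to an upper-bound check on the requested recordsize. The sketch below is illustrative only: the function `recordsize_allowed` is hypothetical, and the power-of-two requirement and the 512-byte lower bound are assumptions not stated in this message. The 128KB default and the 1MB default maximum come from the message itself.

```python
MIN_RECORDSIZE = 512               # assumed lower bound, not stated in the message
DEFAULT_RECORDSIZE = 128 * 1024    # default recordsize remains 128KB
ZFS_MAX_RECORDSIZE = 1024 * 1024   # default of vfs.zfs.max_recordsize (1MB)


def recordsize_allowed(size, max_recordsize=ZFS_MAX_RECORDSIZE):
    """Return True if `size` passes the (assumed) recordsize checks."""
    # Assumption: record sizes are powers of two.
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and MIN_RECORDSIZE <= size <= max_recordsize


if __name__ == "__main__":
    for size in (DEFAULT_RECORDSIZE, 1024 * 1024, 2 * 1024 * 1024):
        print(size, recordsize_allowed(size))
```

Raising the limit in this model corresponds to passing a larger `max_recordsize`, which is the role the tunable plays.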
|
#
273341 |
|
20-Oct-2014 |
delphij |
MFC r272504: MFV r272494:
Make space_map_truncate() always do space_map_reallocate(). Without this, setting space_map_max_blksz would cause a panic on an existing pool, as dmu_objset_set_blocksize would fail if the object has multiple blocks.
Illumos issues:
5164 space_map_max_blksz causes panic, does not work
5165 zdb fails assertion when run on pool with recently-enabled spacemap_histogram feature
|
#
269774 |
|
10-Aug-2014 |
delphij |
MFC r269138:
Add two sysctls for newly added tunables.
|
#
269773 |
|
10-Aug-2014 |
delphij |
MFC r269118: MFV r269010:
Import Illumos changes to address the following Illumos issues:
4976 zfs should only avoid writing to a failing non-redundant top-level vdev
4978 ztest fails in get_metaslab_refcount()
4979 extend free space histogram to device and pool
4980 metaslabs should have a fragmentation metric
4981 remove fragmented ops vector from block allocator
4982 space_map object should proactively upgrade when feature is enabled
4984 device selection should use fragmentation metric
|
#
269416 |
|
02-Aug-2014 |
delphij |
MFC r268855: MFV r268848:
Instead of asserting that all zios are properly aligned, only assert on the logical ones.
Cap uberblocks at 8k; otherwise, with ashift=17, there would be only one uberblock.
This fixes a problem where zdb would trip an assert on pools with ashift >= 0xe (16k).
While there, also change the code so that it only attempts to condense the space map if the uncondensed size consumes more than zfs_metaslab_condense_block_threshold blocks.
Illumos issue: 4958 zdb trips assert on pools with ashift >= 0xe
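The arithmetic behind the 8k cap can be worked through in a short sketch. The constant and function names are chosen for illustration. The 128K ring size is inferred from the statement that ashift=17 would leave only one uberblock, and the 1K minimum uberblock size is an assumption. Only the 8k cap comes directly from the message.

```python
UBERBLOCK_SHIFT = 10                 # assumed 1K minimum uberblock size
MAX_UBERBLOCK_SHIFT = 13             # the 8k cap from the message
UBERBLOCK_RING_BYTES = 128 * 1024    # inferred: ashift=17 leaves room for exactly one


def uberblock_shift(ashift, capped=True):
    """log2 of the on-disk uberblock size for a vdev with the given ashift."""
    shift = max(ashift, UBERBLOCK_SHIFT)
    return min(shift, MAX_UBERBLOCK_SHIFT) if capped else shift


def uberblock_count(ashift, capped=True):
    """Number of uberblock slots that fit in the ring."""
    return UBERBLOCK_RING_BYTES >> uberblock_shift(ashift, capped)


if __name__ == "__main__":
    for ashift in (9, 12, 14, 17):
        print(ashift, uberblock_count(ashift, capped=False), uberblock_count(ashift))
```

With the cap in place, uberblock I/O on a pool with ashift above 13 is smaller than the device's allocation unit, which is consistent with the alignment assert being restricted to logical zios.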
|
#
268656 |
|
15-Jul-2014 |
delphij |
MFC r268086: MFV r267570:
4756 metaslab_group_preload() could deadlock
|
#
265746 |
|
09-May-2014 |
delphij |
MFC r265458: Import George Wilson's change for Illumos #4730:
4730 metaslab group taskq should be destroyed in metaslab_group_destroy()
Reviewed by: Alex Reece <alex.reece@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Original author: George Wilson
|
#
265741 |
|
09-May-2014 |
delphij |
MFC r264671 (MFV r264668):
4754 io issued to near-full luns even after setting noalloc threshold
4755 mg_alloc_failures is no longer needed
illumos/illumos@b6240e830b871f59c22a3918aebb3b36c872edba
|
#
265740 |
|
09-May-2014 |
delphij |
MFC r264669: MFV r264666:
4374 dn_free_ranges should use range_tree_t
illumos/illumos-gate@bf16b11e8deb633dd6c4296d46e92399d1582df4
|
#
262093 |
|
17-Feb-2014 |
avg |
MFC r258717: MFV r258371,r258372: 4101 metaslab_debug should allow for fine-grained control
|
#
260768 |
|
16-Jan-2014 |
avg |
MFC r258633: MFV r255256: 3954 metaslabs continue to load even after hitting zfs_mg_alloc_failure limit
|