History log of /freebsd-10-stable/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_zfetch.c
Revision Date Author Comments
# 299433 11-May-2016 mav

MFC r297832: MFV r297831: 6322 ZFS indirect block predictive prefetch

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Author: Alexander Motin <mav@FreeBSD.org>

Improve speculative prefetch of indirect blocks.

Scalability of many operations on wide ZFS pool can be limited by
requirement to prefetch indirect blocks first. Recently added
asynchronous indirect block read partially helped, but did not
solve the problem completely. This patch extends existing prefetcher
functionality to explicitly work with indirect blocks.

Before this change prefetcher issued reads for up to 8MB of data in
advance. With this change it also issues indirect block reads
for up to 64MB of data in advance, so that when it will be time to
actually read those data, it can be done immediately. Alike effect
can be achieved by just increasing maximal data prefetch distance,
but at higher memory cost.

Also this change introduces indirect block prefetch for rewrite
operations, that was never done before. Previously ARC miss for
Indirect blocks regularly blocked rewrites, converting perfectly
aligned asynchronous operations into synchronous read-write pairs,
significantly reducing maximal rewrite speed.

While being there this issue was also fixed:
- prefetch was done always, even if caching for the dataset was
completely disabled.

Testing on FreeBSD with zvol on top of 6x striped 2x mirrored pool
of 12 assorted HDDs shown me such performance numbers:
------- BEFORE --------
Write 491363677 bytes/sec
Read 312430631 bytes/sec
Rewrite 97680464 bytes/sec
-------- AFTER --------
Write 493524146 bytes/sec
Read 438598079 bytes/sec
Rewrite 277506044 bytes/sec

Closes #65
Closes #80

openzfs/openzfs@792fd28ac04f78cc5e43ead2d72a96f244ea84e8


# 290748 13-Nov-2015 mav

MFC r289192: 6281 prefetching should apply to 1MB reads

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Alexander Motin <mav@freebsd.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: George Wilson <george.wilson@delphix.com>

illumos/illumos-gate@632802744ef6d17e06d6980a95f631615c3b060f


# 288594 03-Oct-2015 mav

MFC r287702: 5987 zfs prefetch code needs work

Rewrite the ZFS prefetch code to detect only forward, sequential
streams.

The following kstats have been added:

kstat.zfs.misc.arcstats.sync_wait_for_async

How many sync reads have waited for async read
to complete. (less is better)

kstat.zfs.misc.arcstats.demand_hit_predictive_prefetch

How many demand read didn't have to wait for I/O
because of predictive prefetch. (more is better)

zfetch kstats have been similified to hits, misses, and max_streams,
with max_streams representing times when we were not able to create
new stream because we already have the maximum number of sequences
for a file.

The sysctl variable/loader tunable vfs.zfs.zfetch.block_cap have been
replaced by vfs.zfs.zfetch.max_distance, which controls maximum bytes
to prefetch per stream.

illumos/illumos-gate@cf6106c8a0d6598b045811f9650d66e07eb332af

Illumos ZFS issues:

5987 zfs prefetch code needs work
https://www.illumos.org/issues/5987


# 288571 03-Oct-2015 mav

MFC r286705: 5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>

While running 'zfs recv' we noticed that every 128th 8K block required a
read. We were seeing that restore_write() was calling dmu_tx_hold_write()
and the indirect block was not cached. We should prefetch upcoming indirect
blocks to avoid having to go to disk and blocking the restore_write().

Allow an incremental send stream to be received as a clone, even if the
stream does not mark it as a clone.


# 288516 02-Oct-2015 mav

MFC r269117: Make sysctls under vfs.zfs.zfetch writeable.

I don't see any reason for them to be read-only, while tuning them without
reboot is much more convenient for experiments.


# 262167 18-Feb-2014 mav

MFC r260236:
In dmu_zfetch_stream_reclaim() replace division with multiplication and
move it out of the loop and lock.


# 260763 16-Jan-2014 avg

MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler
performance work

Sponsored by: HybridCluster [merge]


# 288594 03-Oct-2015 mav

MFC r287702: 5987 zfs prefetch code needs work

Rewrite the ZFS prefetch code to detect only forward, sequential
streams.

The following kstats have been added:

kstat.zfs.misc.arcstats.sync_wait_for_async

How many sync reads have waited for async read
to complete. (less is better)

kstat.zfs.misc.arcstats.demand_hit_predictive_prefetch

How many demand read didn't have to wait for I/O
because of predictive prefetch. (more is better)

zfetch kstats have been similified to hits, misses, and max_streams,
with max_streams representing times when we were not able to create
new stream because we already have the maximum number of sequences
for a file.

The sysctl variable/loader tunable vfs.zfs.zfetch.block_cap have been
replaced by vfs.zfs.zfetch.max_distance, which controls maximum bytes
to prefetch per stream.

illumos/illumos-gate@cf6106c8a0d6598b045811f9650d66e07eb332af

Illumos ZFS issues:

5987 zfs prefetch code needs work
https://www.illumos.org/issues/5987


# 288571 03-Oct-2015 mav

MFC r286705: 5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>

While running 'zfs recv' we noticed that every 128th 8K block required a
read. We were seeing that restore_write() was calling dmu_tx_hold_write()
and the indirect block was not cached. We should prefetch upcoming indirect
blocks to avoid having to go to disk and blocking the restore_write().

Allow an incremental send stream to be received as a clone, even if the
stream does not mark it as a clone.


# 288516 02-Oct-2015 mav

MFC r269117: Make sysctls under vfs.zfs.zfetch writeable.

I don't see any reason for them to be read-only, while tuning them without
reboot is much more convenient for experiments.


# 262167 18-Feb-2014 mav

MFC r260236:
In dmu_zfetch_stream_reclaim() replace division with multiplication and
move it out of the loop and lock.


# 260763 16-Jan-2014 avg

MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler
performance work

Sponsored by: HybridCluster [merge]