	--- GEOM BASED DISK SCHEDULERS FOR FREEBSD ---

This code contains a framework for GEOM-based disk schedulers and a
couple of sample scheduling algorithms that use the framework and
implement two forms of "anticipatory scheduling" (see below for more
details).

As a quick example of what this code can give you, try running "dd",
"tar", or some other program with a highly SEQUENTIAL access pattern,
together with "cvs", "cvsup", "svn" or other programs with highly
RANDOM access patterns (this is not a made-up example: it is pretty
common for developers to have one or more apps doing random accesses,
and others doing sequential accesses, e.g., loading large binaries
from disk, checking the integrity of tarballs, watching media streams
and so on).

These are the results we get on a local machine (AMD BE2400 dual
core CPU, SATA 250GB disk):

    /mnt is a partition mounted on /dev/ad0s1f

    cvs:	cvs -d /mnt/home/ncvs-local update -Pd /mnt/ports
    dd-read:	dd bs=128k of=/dev/null if=/dev/ad0 (or ad0.sched.)
    dd-write:	dd bs=128k if=/dev/zero of=/mnt/largefile

			NO SCHEDULER		RR SCHEDULER
			dd	cvs		dd	cvs

    dd-read only	72 MB/s	---		72 MB/s	---
    dd-write only	55 MB/s	---		55 MB/s	---
    dd-read+cvs		 6 MB/s	ok		30 MB/s	ok
    dd-write+cvs	55 MB/s	slooow		14 MB/s	ok

As you can see, when cvs runs concurrently with dd, performance
drops dramatically and, depending on the read or write mode, one
of the two is severely penalized.  In this example the RR scheduler
makes the dd reader go much faster when competing with cvs, and
lets cvs make progress when competing with a writer.

To try it out:

1. USERS OF FREEBSD 7, PLEASE READ THE FOLLOWING CAREFULLY:

    On loading, this module patches one kernel function (g_io_request())
    so that I/O requests ("bio's") carry a classification tag, useful
    for scheduling purposes.

    ON FREEBSD 7, the tag is stored in an existing (though rarely used)
    field of "struct bio", a solution which makes this module
    incompatible with other modules that use the same field, such as
    ZFS and gjournal.  Additionally, g_io_request() is patched in-memory
    to add a call to the function that initializes this field (i386/amd64
    only; for other architectures you need to manually patch
    sys/geom/geom_io.c).  See details in the file g_sched.c.

    On FreeBSD 8.0 and above, this trick is not necessary, as struct
    bio contains dedicated fields for the classifier, and hooks to
    register request classifiers.

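    As an illustration of that interface, here is a minimal,
    hypothetical sketch of a classifier hook on FreeBSD 8 (names such
    as my_classify are made up; tagging each bio with the issuing
    thread mirrors what g_sched.c does):

	#include <sys/param.h>
	#include <sys/bio.h>
	#include <sys/proc.h>
	#include <geom/geom.h>

	/* Tag each bio with the thread that issued it. */
	static int
	my_classify(void *arg, struct bio *bp)
	{

		if (bp->bio_classifier1 != NULL)
			return (1);	/* someone already classified it */
		bp->bio_classifier1 = curthread;
		return (1);
	}

	static struct g_classifier_hook my_hook = {
		.func = my_classify,
	};

	/* on module load:   g_register_classifier(&my_hook);   */
	/* on module unload: g_unregister_classifier(&my_hook); */
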
    If you don't like the above, don't run this code.

2. PLEASE MAKE SURE THAT THE DISK YOU WILL BE USING FOR TESTS
   DOES NOT CONTAIN PRECIOUS DATA.

    This is experimental code, so we make no guarantees, though we
    routinely use it on our own desktops and laptops.

3. EXTRACT AND BUILD THE PROGRAMS

    A 'make install' in the directory should work (with root privs),
    or you can even try the prebuilt binary modules.
    If you want to build the modules yourself, look at the Makefile.

4. LOAD THE MODULE, CREATE A GEOM NODE, RUN TESTS

    The scheduler module must be loaded first:

      # kldload gsched_rr

    (substitute gsched_rr with gsched_as to test AS).  Then, supposing
    that you are using /dev/ad0 for testing, a scheduler can be
    attached to it with:

      # geom sched insert ad0

    The scheduler is inserted transparently into the geom chain, so
    mounted partitions and filesystems will keep working, but requests
    will now go through the scheduler.

    To change the scheduler on the fly, you can reconfigure the geom:

      # geom sched configure -a as ad0.sched.

    assuming that gsched_as was loaded previously.
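
    To check that the scheduler node is in place, the generic geom(8)
    verbs should work for this class too, e.g.:

      # geom sched list
      # geom sched status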

5. SCHEDULER REMOVAL

    In principle it is possible to remove the scheduler module
    even on an active chain by doing

	# geom sched destroy ad0.sched.

    However, there is a race in the geom subsystem which makes the
    removal unsafe while there are active requests on a chain.  So,
    to reduce the risk of data loss, make sure you don't remove a
    scheduler from a chain with ongoing transactions.
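
    For instance, with the /mnt setup used in the example above, the
    safer sequence is to quiesce the filesystem first:

      # umount /mnt
      # geom sched destroy ad0.sched.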

--- NOTES ON THE SCHEDULERS ---

The important contribution of this code is the framework for
experimenting with different scheduling algorithms.  "Anticipatory
scheduling" is a very powerful technique based on the following
reasoning:

    Disk throughput is much better when serving sequential requests.
    If we have a mix of sequential and random requests, and we see a
    non-sequential request, do not serve it immediately; instead, wait
    a little bit (2..5 ms) to see if another request is coming that
    the disk can serve more efficiently.
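
As a concrete illustration, here is a small sketch of the decision at
the heart of anticipation.  This is not code from gs_rr.c: the types
and names are made up, and a real scheduler would arm a callout for
the 2..5 ms wait:

	#include <sys/types.h>

	/* Hypothetical per-client state kept by the scheduler. */
	struct as_client {
		int	seq_count;	/* consecutive sequential requests */
		off_t	next_offset;	/* expected start of the client's
					 * next sequential request */
	};

	/*
	 * Called when the disk goes idle after serving 'cur'.
	 * Return nonzero to keep the disk idle for a few ms, hoping
	 * for another sequential request from the same client;
	 * return zero to dispatch the queue head immediately.
	 */
	static int
	as_should_wait(struct as_client *cur, off_t head_offset)
	{

		if (cur == NULL || cur->seq_count < 2)
			return (0);	/* client not proven sequential */
		if (head_offset == cur->next_offset)
			return (0);	/* sequential request already queued */
		return (1);	/* a short wait may avoid a long seek */
	}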

There are many details that must be added to make sure that the
mechanism is effective with different workloads and systems, to
gain a few extra percent in performance, to improve fairness,
isolation among processes, etc.  A discussion of the vast literature
on the subject is beyond the scope of this short note.

--------------------------------------------------------------------------

TRANSPARENT INSERT/DELETE

geom_sched is an ordinary geom module; however, it is convenient
to plug it transparently into the geom graph, so that one can
enable or disable scheduling on a mounted filesystem, and the
names in /etc/fstab do not depend on the presence of the scheduler.

To understand how this works in practice, remember that in GEOM
we have "provider" and "geom" objects.
Say that we want to hook a scheduler on provider "ad0",
accessible through pointer 'pp'.  Originally, pp is attached to
geom "ad0" (same name, different object), accessible through
pointer old_gp:

  BEFORE	---> [ pp    --> old_gp ...]

A normal "geom sched create ad0" call would create a new geom node
on top of provider ad0/pp, and export a newly created provider
("ad0.sched.", accessible through pointer newpp).

  AFTER create  ---> [ newpp --> gp --> cp ] ---> [ pp    --> old_gp ... ]

On top of newpp, a whole tree is created automatically, so we can
e.g. mount partitions on /dev/ad0.sched.s1d, and those requests
will go through the scheduler, whereas any partition mounted on
the pre-existing device entries will not go through the scheduler.

With the transparent insert mechanism, the original provider "ad0"/pp
is instead hooked to the newly created geom, as follows:

  AFTER insert  ---> [ pp    --> gp --> cp ] ---> [ newpp --> old_gp ... ]

so anything that was previously using provider pp will now have
its requests routed through the scheduler node.

A removal ("geom sched destroy ad0.sched.") will restore the original
configuration.
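
To make the diagrams concrete, the following is a hypothetical sketch
of the provider swap an insert could perform; the real logic lives in
g_sched.c and is considerably more careful, and everything here must
run under the GEOM topology lock:

	#include <sys/param.h>
	#include <sys/queue.h>
	#include <geom/geom.h>

	/*
	 * Exchange the geoms behind pp and newpp: existing consumers
	 * of pp now reach the scheduler geom, while newpp takes pp's
	 * old place under the original geom.
	 */
	static void
	sched_swap_providers(struct g_provider *pp, struct g_provider *newpp)
	{
		struct g_geom *gp = newpp->geom;	/* scheduler geom */
		struct g_geom *old_gp = pp->geom;	/* original geom */

		g_topology_assert();

		LIST_REMOVE(pp, provider);
		LIST_REMOVE(newpp, provider);

		pp->geom = gp;
		LIST_INSERT_HEAD(&gp->provider, pp, provider);

		newpp->geom = old_gp;
		LIST_INSERT_HEAD(&old_gp->provider, newpp, provider);
	}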

# $FreeBSD$