<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
<!ENTITY procfsexample SYSTEM "procfs_example.xml">
]>
6
7<book id="LKProcfsGuide">
8  <bookinfo>
9    <title>Linux Kernel Procfs Guide</title>
10
11    <authorgroup>
12      <author>
13	<firstname>Erik</firstname>
14	<othername>(J.A.K.)</othername>
15	<surname>Mouw</surname>
16	<affiliation>
17	  <orgname>Delft University of Technology</orgname>
18	  <orgdiv>Faculty of Information Technology and Systems</orgdiv>
19	  <address>
20            <email>J.A.K.Mouw@its.tudelft.nl</email>
21            <pob>PO BOX 5031</pob>
22            <postcode>2600 GA</postcode>
23            <city>Delft</city>
24            <country>The Netherlands</country>
25          </address>
26	</affiliation>
27      </author>
28    </authorgroup>
29
30    <revhistory>
31      <revision>
32	<revnumber>1.0&nbsp;</revnumber>
33	<date>May 30, 2001</date>
34	<revremark>Initial revision posted to linux-kernel</revremark>
35      </revision>
36      <revision>
37	<revnumber>1.1&nbsp;</revnumber>
38	<date>June 3, 2001</date>
39	<revremark>Revised after comments from linux-kernel</revremark>
40      </revision>
41    </revhistory>
42
43    <copyright>
44      <year>2001</year>
45      <holder>Erik Mouw</holder>
46    </copyright>
47
48
49    <legalnotice>
50      <para>
51        This documentation is free software; you can redistribute it
52        and/or modify it under the terms of the GNU General Public
53        License as published by the Free Software Foundation; either
54        version 2 of the License, or (at your option) any later
55        version.
56      </para>
57      
58      <para>
59        This documentation is distributed in the hope that it will be
60        useful, but WITHOUT ANY WARRANTY; without even the implied
61        warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
62        PURPOSE.  See the GNU General Public License for more details.
63      </para>
64      
65      <para>
66        You should have received a copy of the GNU General Public
67        License along with this program; if not, write to the Free
68        Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
69        MA 02111-1307 USA
70      </para>
71      
72      <para>
73        For more details see the file COPYING in the source
74        distribution of Linux.
75      </para>
76    </legalnotice>
77  </bookinfo>
78
79
80
81
82  <toc>
83  </toc>
84
85
86
87
88  <preface>
89    <title>Preface</title>
90
91    <para>
92      This guide describes the use of the procfs file system from
93      within the Linux kernel. The idea to write this guide came up on
94      the #kernelnewbies IRC channel (see <ulink
95      url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>),
96      when Jeff Garzik explained the use of procfs and forwarded me a
97      message Alexander Viro wrote to the linux-kernel mailing list. I
98      agreed to write it up nicely, so here it is.
99    </para>
100
101    <para>
102      I'd like to thank Jeff Garzik
103      <email>jgarzik@pobox.com</email> and Alexander Viro
104      <email>viro@parcelfarce.linux.theplanet.co.uk</email> for their input,
105      Tim Waugh <email>twaugh@redhat.com</email> for his <ulink
106      url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>,
107      and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for
108      proofreading.
109    </para>
110
111    <para>
112      This documentation was written while working on the LART
113      computing board (<ulink
114      url="http://www.lart.tudelft.nl/">http://www.lart.tudelft.nl/</ulink>),
115      which is sponsored by the Mobile Multi-media Communications
116      (<ulink
117      url="http://www.mmc.tudelft.nl/">http://www.mmc.tudelft.nl/</ulink>)
118      and Ubiquitous Communications (<ulink
119      url="http://www.ubicom.tudelft.nl/">http://www.ubicom.tudelft.nl/</ulink>)
120      projects.
121    </para>
122
123    <para>
124      Erik
125    </para>
126  </preface>
127
128
129
130
131  <chapter id="intro">
132    <title>Introduction</title>
133
    <para>
      The <filename class="directory">/proc</filename> file system
      (procfs) is a special file system in the Linux kernel. It is a
      virtual file system: it is not associated with a block device
      but exists only in memory. The files in procfs give userland
      programs access to certain information from the kernel (like
      process information in <filename
      class="directory">/proc/[0-9]+/</filename>), and some exist
      for debugging purposes (like
      <filename>/proc/ksyms</filename>).
    </para>
144
    <para>
      This guide describes the use of the procfs file system from
      within the Linux kernel. It starts by introducing all relevant
      functions to manage the files within the file system. After
      that it shows how to communicate with userland, followed by
      some tips and tricks. It closes with a complete example.
    </para>
153
154    <para>
155      Note that the files in <filename
156      class="directory">/proc/sys</filename> are sysctl files: they
157      don't belong to procfs and are governed by a completely
158      different API described in the Kernel API book.
159    </para>
160  </chapter>
161
162
163
164
165  <chapter id="managing">
166    <title>Managing procfs entries</title>
167    
168    <para>
169      This chapter describes the functions that various kernel
170      components use to populate the procfs with files, symlinks,
171      device nodes, and directories.
172    </para>
173
174    <para>
175      A minor note before we start: if you want to use any of the
176      procfs functions, be sure to include the correct header file! 
177      This should be one of the first lines in your code:
178    </para>
179
    <programlisting>
#include &lt;linux/proc_fs.h&gt;
    </programlisting>
183
184
185
186
187    <sect1 id="regularfile">
188      <title>Creating a regular file</title>
189      
190      <funcsynopsis>
191	<funcprototype>
192	  <funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef>
193	  <paramdef>const char* <parameter>name</parameter></paramdef>
194	  <paramdef>mode_t <parameter>mode</parameter></paramdef>
195	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
196	</funcprototype>
197      </funcsynopsis>
198
      <para>
        This function creates a regular file with the name
        <parameter>name</parameter> and file mode
        <parameter>mode</parameter> in the directory
        <parameter>parent</parameter>. To create a file in the root of
        the procfs, pass <constant>NULL</constant> as the
        <parameter>parent</parameter> parameter. When successful, the
        function will return a pointer to the freshly created
        <structname>struct proc_dir_entry</structname>; otherwise it
        will return <constant>NULL</constant>. <xref
        linkend="userland"/> describes how to do something useful with
        regular files.
      </para>
212
      <para>
        Paths that span multiple directories are explicitly
        supported. For example,
        <function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>)
        will create the <filename class="directory">via0</filename>
        directory if necessary, with standard
        <constant>0755</constant> permissions.
      </para>
221
    <para>
      If you only want to be able to read the file, the function
      <function>create_proc_read_entry</function> described in <xref
      linkend="convenience"/> may be used to create and initialise
      the procfs entry in a single call.
    </para>
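
      <para>
        As a minimal sketch of the call described above, the
        following fragment creates a file in the root of the procfs
        and checks the result. The names
        <function>foo_init</function>, <varname>foo_file</varname>
        and the file name <filename>foo</filename> are hypothetical
        and only serve as illustration.
      </para>

      <programlisting>
#include &lt;linux/errno.h&gt;
#include &lt;linux/proc_fs.h&gt;

/* hypothetical entry, kept so it can be used later on */
static struct proc_dir_entry *foo_file;

static int foo_init(void)
{
        /* mode 0644, parent NULL: create /proc/foo */
        foo_file = create_proc_entry("foo", 0644, NULL);
        if (foo_file == NULL)
                return -ENOMEM;

        return 0;
}
      </programlisting>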
228    </sect1>
229
230
231
232
233    <sect1>
234      <title>Creating a symlink</title>
235
236      <funcsynopsis>
237	<funcprototype>
238	  <funcdef>struct proc_dir_entry*
239	  <function>proc_symlink</function></funcdef> <paramdef>const
240	  char* <parameter>name</parameter></paramdef>
241	  <paramdef>struct proc_dir_entry*
242	  <parameter>parent</parameter></paramdef> <paramdef>const
243	  char* <parameter>dest</parameter></paramdef>
244	</funcprototype>
245      </funcsynopsis>
246      
      <para>
        This creates a symlink in the procfs directory
        <parameter>parent</parameter> that points from
        <parameter>name</parameter> to
        <parameter>dest</parameter>. In userland terms this
        corresponds to <literal>ln -s</literal>
        <parameter>dest</parameter> <parameter>name</parameter>.
      </para>
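
      <para>
        A hedged sketch: assuming a file <filename>foo</filename>
        already exists in the root of the procfs, the fragment below
        adds a symlink <filename>foo_too</filename> next to it. Both
        file names and the function name
        <function>foo_link_init</function> are hypothetical.
      </para>

      <programlisting>
#include &lt;linux/errno.h&gt;
#include &lt;linux/proc_fs.h&gt;

static struct proc_dir_entry *foo_link;

static int foo_link_init(void)
{
        /* same effect as "ln -s foo foo_too" inside /proc */
        foo_link = proc_symlink("foo_too", NULL, "foo");
        if (foo_link == NULL)
                return -ENOMEM;

        return 0;
}
      </programlisting>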
255    </sect1>
256
257    <sect1>
258      <title>Creating a directory</title>
259      
260      <funcsynopsis>
261	<funcprototype>
262	  <funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef>
263	  <paramdef>const char* <parameter>name</parameter></paramdef>
264	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
265	</funcprototype>
266      </funcsynopsis>
267
268      <para>
269        Create a directory <parameter>name</parameter> in the procfs
270        directory <parameter>parent</parameter>.
271      </para>
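
      <para>
        The following sketch shows one way to combine
        <function>proc_mkdir</function> with
        <function>create_proc_entry</function>: it creates a
        directory and then a file inside it, undoing the first step
        if the second one fails. The names
        <filename class="directory">example</filename> and
        <filename>info</filename> are hypothetical.
      </para>

      <programlisting>
#include &lt;linux/errno.h&gt;
#include &lt;linux/proc_fs.h&gt;

static struct proc_dir_entry *example_dir;
static struct proc_dir_entry *info_file;

static int example_dir_init(void)
{
        /* create /proc/example */
        example_dir = proc_mkdir("example", NULL);
        if (example_dir == NULL)
                return -ENOMEM;

        /* create /proc/example/info, readable by everyone */
        info_file = create_proc_entry("info", 0444,
                                      example_dir);
        if (info_file == NULL) {
                remove_proc_entry("example", NULL);
                return -ENOMEM;
        }

        return 0;
}
      </programlisting>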
272    </sect1>
273
274
275
276
277    <sect1>
278      <title>Removing an entry</title>
279      
280      <funcsynopsis>
281	<funcprototype>
282	  <funcdef>void <function>remove_proc_entry</function></funcdef>
283	  <paramdef>const char* <parameter>name</parameter></paramdef>
284	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
285	</funcprototype>
286      </funcsynopsis>
287
288      <para>
289        Removes the entry <parameter>name</parameter> in the directory
290        <parameter>parent</parameter> from the procfs. Entries are
291        removed by their <emphasis>name</emphasis>, not by the
292        <structname>struct proc_dir_entry</structname> returned by the
293        various create functions. Note that this function doesn't
294        recursively remove entries.
295      </para>
296
297      <para>
298        Be sure to free the <structfield>data</structfield> entry from
299        the <structname>struct proc_dir_entry</structname> before
300        <function>remove_proc_entry</function> is called (that is: if
301        there was some <structfield>data</structfield> allocated, of
302        course). See <xref linkend="usingdata"/> for more information
303        on using the <structfield>data</structfield> entry.
304      </para>
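
      <para>
        A cleanup routine following the rules above might look like
        the sketch below. It assumes a hypothetical directory
        <filename class="directory">example</filename> containing a
        file <filename>info</filename> whose
        <structfield>data</structfield> field was either left
        <constant>NULL</constant> or allocated with
        <function>kmalloc</function>.
      </para>

      <programlisting>
#include &lt;linux/proc_fs.h&gt;
#include &lt;linux/slab.h&gt;

/* assumed to be set up when the entries were created */
static struct proc_dir_entry *example_dir;
static struct proc_dir_entry *info_file;

static void example_dir_cleanup(void)
{
        /* free private data first; kfree(NULL) is a no-op */
        kfree(info_file->data);

        /* removal is not recursive, so remove the file
         * before the directory that contains it */
        remove_proc_entry("info", example_dir);
        remove_proc_entry("example", NULL);
}
      </programlisting>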
305    </sect1>
306  </chapter>
307
308
309
310
311  <chapter id="userland">
312    <title>Communicating with userland</title>
313    
314    <para>
315       Instead of reading (or writing) information directly from
316       kernel memory, procfs works with <emphasis>call back
317       functions</emphasis> for files: functions that are called when
318       a specific file is being read or written. Such functions have
319       to be initialised after the procfs file is created by setting
320       the <structfield>read_proc</structfield> and/or
321       <structfield>write_proc</structfield> fields in the
322       <structname>struct proc_dir_entry*</structname> that the
323       function <function>create_proc_entry</function> returned:
324    </para>
325
    <programlisting>
struct proc_dir_entry* entry;

/* "foo" is a placeholder name */
entry = create_proc_entry("foo", 0644, NULL);
if (entry != NULL) {
        entry->read_proc = read_proc_foo;
        entry->write_proc = write_proc_foo;
}
    </programlisting>
332
    <para>
      If you only want to use the
      <structfield>read_proc</structfield>, the function
      <function>create_proc_read_entry</function> described in <xref
      linkend="convenience"/> may be used to create and initialise the
      procfs entry in a single call.
    </para>
340
341
342
343    <sect1>
344      <title>Reading data</title>
345
346      <para>
347        The read function is a call back function that allows userland
348        processes to read data from the kernel. The read function
349        should have the following format:
350      </para>
351
352      <funcsynopsis>
353	<funcprototype>
354	  <funcdef>int <function>read_func</function></funcdef>
355	  <paramdef>char* <parameter>page</parameter></paramdef>
356	  <paramdef>char** <parameter>start</parameter></paramdef>
357	  <paramdef>off_t <parameter>off</parameter></paramdef>
358	  <paramdef>int <parameter>count</parameter></paramdef>
359	  <paramdef>int* <parameter>eof</parameter></paramdef>
360	  <paramdef>void* <parameter>data</parameter></paramdef>
361	</funcprototype>
362      </funcsynopsis>
363
      <para>
        The read function should write its information into the
        <parameter>page</parameter>. For proper use, the function
        should start writing at an offset of
        <parameter>off</parameter> in <parameter>page</parameter> and
        write at most <parameter>count</parameter> bytes. Because
        most read functions are quite simple and only return a small
        amount of information, these two parameters are usually
        ignored. Ignoring them breaks pagers like
        <literal>more</literal> and <literal>less</literal>, but
        <literal>cat</literal> still works.
      </para>
376
377      <para>
378        If the <parameter>off</parameter> and
379        <parameter>count</parameter> parameters are properly used,
380        <parameter>eof</parameter> should be used to signal that the
381        end of the file has been reached by writing
382        <literal>1</literal> to the memory location
383        <parameter>eof</parameter> points to.
384      </para>
385
      <para>
        The parameter <parameter>start</parameter> doesn't seem to be
        used anywhere in the kernel. The <parameter>data</parameter>
        parameter can be used to create a single call back function
        for several files; see <xref linkend="usingdata"/>.
      </para>
392
393      <para>
394        The <function>read_func</function> function must return the
395        number of bytes written into the <parameter>page</parameter>.
396      </para>
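
      <para>
        As an illustration of the simple case described above, the
        sketch below returns a small amount of text and ignores
        <parameter>off</parameter> and
        <parameter>count</parameter>. The function name
        <function>read_proc_foo</function> and the variable
        <varname>foo_value</varname> are hypothetical.
      </para>

      <programlisting>
#include &lt;linux/kernel.h&gt;
#include &lt;linux/proc_fs.h&gt;

/* hypothetical value exported to userland */
static int foo_value = 42;

static int read_proc_foo(char *page, char **start, off_t off,
                         int count, int *eof, void *data)
{
        int len;

        /* the output is small: off and count are ignored */
        len = sprintf(page, "value = %d\n", foo_value);

        /* number of bytes written into page */
        return len;
}
      </programlisting>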
397
398      <para>
399        <xref linkend="example"/> shows how to use a read call back
400        function.
401      </para>
402    </sect1>
403
404
405
406
407    <sect1>
408      <title>Writing data</title>
409
410      <para>
411        The write call back function allows a userland process to write
412        data to the kernel, so it has some kind of control over the
413        kernel. The write function should have the following format:
414      </para>
415
416      <funcsynopsis>
417	<funcprototype>
418	  <funcdef>int <function>write_func</function></funcdef>
419	  <paramdef>struct file* <parameter>file</parameter></paramdef>
420	  <paramdef>const char* <parameter>buffer</parameter></paramdef>
421	  <paramdef>unsigned long <parameter>count</parameter></paramdef>
422	  <paramdef>void* <parameter>data</parameter></paramdef>
423	</funcprototype>
424      </funcsynopsis>
425
      <para>
        The write function should read at most
        <parameter>count</parameter> bytes from the
        <parameter>buffer</parameter>. Note that the
        <parameter>buffer</parameter> doesn't live in the kernel's
        memory space, so it should first be copied to kernel space
        with <function>copy_from_user</function>. The
        <parameter>file</parameter> parameter is usually
        ignored. <xref linkend="usingdata"/> shows how to use the
        <parameter>data</parameter> parameter.
      </para>
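
      <para>
        A minimal sketch of such a write function is given below. It
        copies at most <constant>FOO_LEN</constant> bytes from
        userland into a kernel buffer and terminates the result. The
        names <function>write_proc_foo</function>,
        <varname>foo_value</varname> and
        <constant>FOO_LEN</constant> are hypothetical, and the header
        that declares <function>copy_from_user</function> is assumed
        to be <filename>asm/uaccess.h</filename>.
      </para>

      <programlisting>
#include &lt;linux/errno.h&gt;
#include &lt;linux/proc_fs.h&gt;
#include &lt;asm/uaccess.h&gt;

#define FOO_LEN 8

/* kernel-side copy of what userland wrote */
static char foo_value[FOO_LEN + 1];

static int write_proc_foo(struct file *file,
                          const char *buffer,
                          unsigned long count, void *data)
{
        unsigned long len = count;

        /* never copy more than the kernel buffer can hold */
        if (len > FOO_LEN)
                len = FOO_LEN;

        /* buffer points to userland memory */
        if (copy_from_user(foo_value, buffer, len))
                return -EFAULT;

        foo_value[len] = '\0';

        /* number of bytes actually consumed */
        return len;
}
      </programlisting>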
436
437      <para>
438        Again, <xref linkend="example"/> shows how to use this call back
439        function.
440      </para>
441    </sect1>
442
443
444
445
446    <sect1 id="usingdata">
447      <title>A single call back for many files</title>
448
449      <para>
450         When a large number of almost identical files is used, it's
451         quite inconvenient to use a separate call back function for
452         each file. A better approach is to have a single call back
453         function that distinguishes between the files by using the
454         <structfield>data</structfield> field in <structname>struct
455         proc_dir_entry</structname>. First of all, the
456         <structfield>data</structfield> field has to be initialised:
457      </para>
458
      <programlisting>
struct proc_dir_entry* entry;
struct my_file_data *file_data;

/* entry is the pointer returned by create_proc_entry() */
file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL);
if (file_data != NULL)
        entry->data = file_data;
      </programlisting>
466     
467      <para>
468          The <structfield>data</structfield> field is a <type>void
469          *</type>, so it can be initialised with anything.
470      </para>
471
472      <para>
473        Now that the <structfield>data</structfield> field is set, the
474        <function>read_proc</function> and
475        <function>write_proc</function> can use it to distinguish
476        between files because they get it passed into their
477        <parameter>data</parameter> parameter:
478      </para>
479
      <programlisting>
/* file_data is the pointer that was stored in entry->data */
int foo_read_func(char *page, char **start, off_t off,
                  int count, int *eof, void *data)
{
        int len = 0;    /* bytes written into page */

        if(data == file_data) {
                /* special case for this file */
        } else {
                /* normal processing */
        }

        return len;
}
      </programlisting>
495
      <para>
        Be sure to free the <structfield>data</structfield> field
        when removing the procfs entry.
      </para>
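
      <para>
        Instead of comparing pointers, the call back can also treat
        <parameter>data</parameter> as a pointer to per-file
        information. The sketch below registers one read function
        for two files. The layout of <structname>struct
        my_file_data</structname> and all other names are
        hypothetical; the data is statically allocated here, so in
        this particular case there is nothing to free.
      </para>

      <programlisting>
#include &lt;linux/errno.h&gt;
#include &lt;linux/kernel.h&gt;
#include &lt;linux/proc_fs.h&gt;

/* hypothetical per-file data */
struct my_file_data {
        const char *name;
        int value;
};

static struct my_file_data foo_data = { "foo", 1 };
static struct my_file_data bar_data = { "bar", 2 };

static int read_proc_shared(char *page, char **start,
                            off_t off, int count,
                            int *eof, void *data)
{
        /* data is the pointer stored in entry->data */
        struct my_file_data *fd = data;

        return sprintf(page, "%s = %d\n",
                       fd->name, fd->value);
}

static int shared_init(void)
{
        struct proc_dir_entry *entry;

        entry = create_proc_entry("foo", 0444, NULL);
        if (entry == NULL)
                return -ENOMEM;
        entry->read_proc = read_proc_shared;
        entry->data = &amp;foo_data;

        entry = create_proc_entry("bar", 0444, NULL);
        if (entry == NULL) {
                remove_proc_entry("foo", NULL);
                return -ENOMEM;
        }
        entry->read_proc = read_proc_shared;
        entry->data = &amp;bar_data;

        return 0;
}
      </programlisting>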
500    </sect1>
501  </chapter>
502
503
504
505
506  <chapter id="tips">
507    <title>Tips and tricks</title>
508
509
510
511
512    <sect1 id="convenience">
513      <title>Convenience functions</title>
514
515      <funcsynopsis>
516	<funcprototype>
517	  <funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef>
518	  <paramdef>const char* <parameter>name</parameter></paramdef>
519	  <paramdef>mode_t <parameter>mode</parameter></paramdef>
520	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
521	  <paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef>
522	  <paramdef>void* <parameter>data</parameter></paramdef>
523	</funcprototype>
524      </funcsynopsis>
525      
      <para>
        This function creates a regular file in exactly the same way
        as <function>create_proc_entry</function> from <xref
        linkend="regularfile"/> does, but also lets you set the read
        function <parameter>read_proc</parameter> in the same call.
        It can set the <parameter>data</parameter> field as well, as
        explained in <xref linkend="usingdata"/>.
      </para>
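
      <para>
        A short sketch of the convenience function in use: it
        creates a read-only file and installs its read function in
        one step. The names <function>read_proc_foo</function>,
        <function>foo_init</function> and the file name
        <filename>foo</filename> are hypothetical.
      </para>

      <programlisting>
#include &lt;linux/errno.h&gt;
#include &lt;linux/kernel.h&gt;
#include &lt;linux/proc_fs.h&gt;

static int read_proc_foo(char *page, char **start, off_t off,
                         int count, int *eof, void *data)
{
        return sprintf(page, "hello from procfs\n");
}

static int foo_init(void)
{
        struct proc_dir_entry *entry;

        /* create /proc/foo and set read_proc in one call;
         * no private data is needed, so data is NULL */
        entry = create_proc_read_entry("foo", 0444, NULL,
                                       read_proc_foo, NULL);
        if (entry == NULL)
                return -ENOMEM;

        return 0;
}
      </programlisting>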
534    </sect1>
535
536
537
538    <sect1>
539      <title>Modules</title>
540
541      <para>
542        If procfs is being used from within a module, be sure to set
543        the <structfield>owner</structfield> field in the
544        <structname>struct proc_dir_entry</structname> to
545        <constant>THIS_MODULE</constant>.
546      </para>
547
      <programlisting>
struct proc_dir_entry* entry;

/* entry is the pointer returned by create_proc_entry() */
entry->owner = THIS_MODULE;
      </programlisting>
553    </sect1>
554
555
556
557
558    <sect1>
559      <title>Mode and ownership</title>
560
561      <para>
562        Sometimes it is useful to change the mode and/or ownership of
563        a procfs entry. Here is an example that shows how to achieve
564        that:
565      </para>
566
      <programlisting>
struct proc_dir_entry* entry;

entry->mode = S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
entry->uid = 0;
entry->gid = 100;
      </programlisting>
574
575    </sect1>
576  </chapter>
577
578
579
580
581  <chapter id="example">
582    <title>Example</title>
583
584    <!-- be careful with the example code: it shouldn't be wider than
585    approx. 60 columns, or otherwise it won't fit properly on a page
586    -->
587
588&procfsexample;
589
590  </chapter>
591</book>
592