XZ Utils
========

    0. Overview
    1. Documentation
       1.1. Overall documentation
       1.2. Documentation for command-line tools
       1.3. Documentation for liblzma
    2. Version numbering
    3. Reporting bugs
    4. Translating the xz tool
    5. Other implementations of the .xz format
    6. Contact information


0. Overview
-----------

    XZ Utils provide a general-purpose data-compression library plus
    command-line tools. The native file format is the .xz format, but
    the legacy .lzma format is also supported. The .xz format supports
    multiple compression algorithms, which are called "filters" in the
    context of XZ Utils. The primary filter is currently LZMA2. With
    typical files, XZ Utils create about 30 % smaller files than gzip.

    To make it easier to add support for the .xz format to existing
    applications and scripts, the API of liblzma is somewhat similar to
    the API of the popular zlib library. For the same reason, the
    command-line tool xz has a command-line syntax similar to that of
    gzip.
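
    The similarity to zlib can be illustrated with a minimal sketch.
    It uses Python's standard zlib and lzma modules (the latter is a
    binding to liblzma) instead of the C API, so it only shows the
    general shape of the streaming interface. The helper name
    compress_in_chunks and the sample data are made up for this
    example.

```python
import lzma
import zlib


def compress_in_chunks(compressor, chunks):
    """Feed chunks to a streaming compressor, then flush it."""
    out = [compressor.compress(chunk) for chunk in chunks]
    out.append(compressor.flush())
    return b"".join(out)


# Made-up sample input, split into two chunks.
chunks = [b"hello, " * 1000, b"world\n" * 1000]

# Both compressor objects offer the same compress()/flush() pair.
deflated = compress_in_chunks(zlib.compressobj(), chunks)
xz_data = compress_in_chunks(
    lzma.LZMACompressor(format=lzma.FORMAT_XZ), chunks
)

# Round-trip check for both formats.
assert zlib.decompress(deflated) == b"".join(chunks)
assert lzma.decompress(xz_data) == b"".join(chunks)
```

    The same helper works with either compressor object because both
    follow the "feed data, then flush" pattern.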

    When aiming for the highest compression ratio, the LZMA2 encoder uses
    a lot of CPU time and may use, depending on the settings, even
    hundreds of megabytes of RAM. However, in fast modes, the LZMA2 encoder
    competes with bzip2 in compression speed, RAM usage, and compression
    ratio.
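
    The trade-off between speed and ratio can be explored with a
    rough sketch like the one below. It uses Python's lzma module
    (a liblzma binding) and made-up sample data, so the numbers it
    prints say nothing about real files. Presets 0 and 6 are used
    here; preset 9 is left out on purpose because its encoder needs
    much more memory.

```python
import lzma
import time

# Made-up, fairly repetitive sample input (an assumption of this sketch).
data = b"".join(b"record %d: some repetitive text\n" % i for i in range(50000))


def measure(preset):
    """Return (compressed size in bytes, seconds spent) for one preset."""
    start = time.perf_counter()
    blob = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    elapsed = time.perf_counter() - start
    assert lzma.decompress(blob) == data  # round-trip check
    return len(blob), elapsed


for label, preset in [("fast (0)", 0), ("default (6)", lzma.PRESET_DEFAULT)]:
    size, seconds = measure(preset)
    print(f"{label:>12}: {size:>8} bytes in {seconds:.3f} s")
```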

    LZMA2 is reasonably fast to decompress. It is a little slower than
    gzip, but a lot faster than bzip2. Being fast to decompress means
    that the .xz format is especially nice when the same file will be
    decompressed many times (usually on different computers), which
    is the case e.g. when distributing software packages. In such
    situations, it's not too bad if the compression takes some time,
    since that needs to be done only once to benefit many people.

    With some file types, combining (or "chaining") LZMA2 with an
    additional filter can improve the compression ratio. A filter chain may
    contain up to four filters, although usually only one or two are used.
    For example, putting a BCJ (Branch/Call/Jump) filter before LZMA2
    in the filter chain can improve the compression ratio of executable
    files.
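
    A filter chain of this kind can be sketched with Python's lzma
    module, which is a binding to liblzma. The chain below (x86 BCJ
    followed by LZMA2 at the default preset) and the helper names
    are assumptions made for this example. Whether the BCJ filter
    helps depends entirely on the input.

```python
import lzma

# Assumed chain: x86 BCJ filter first, LZMA2 last.
bcj_chain = [
    {"id": lzma.FILTER_X86},
    {"id": lzma.FILTER_LZMA2, "preset": lzma.PRESET_DEFAULT},
]
lzma2_only = [{"id": lzma.FILTER_LZMA2, "preset": lzma.PRESET_DEFAULT}]


def xz_with_filters(data, filters):
    """Compress data into an .xz stream using an explicit filter chain."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def compare_chains(data):
    """Return (size with LZMA2 only, size with BCJ + LZMA2)."""
    plain = xz_with_filters(data, lzma2_only)
    chained = xz_with_filters(data, bcj_chain)
    # The decoder reads the filter chain from the .xz headers, so no
    # filter list is needed for decompression.
    assert lzma.decompress(plain) == data
    assert lzma.decompress(chained) == data
    return len(plain), len(chained)
```

    To try it, read an x86 executable into a bytes object and pass
    it to compare_chains().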

    Since the .xz format allows adding new filter IDs, it is possible that
    some day there will be a filter that is, for example, much faster to
    compress than LZMA2 (but probably with worse compression ratio).
    Similarly, it is possible that some day there will be a filter that
    compresses better than LZMA2.

    XZ Utils don't support multithreaded compression or decompression
    yet. Multithreading has been planned, though, and it was taken into
    account when designing the .xz file format.


1. Documentation
----------------

1.1. Overall documentation

    README              This file

    INSTALL.generic     Generic install instructions for those not familiar
                        with packages using GNU Autotools
    INSTALL             Installation instructions specific to XZ Utils
    PACKAGERS           Information to packagers of XZ Utils

    COPYING             XZ Utils copyright and license information
    COPYING.GPLv2       GNU General Public License version 2
    COPYING.GPLv3       GNU General Public License version 3
    COPYING.LGPLv2.1    GNU Lesser General Public License version 2.1

    AUTHORS             The main authors of XZ Utils
    THANKS              Incomplete list of people who have helped make
                        this software
    NEWS                User-visible changes between XZ Utils releases
    ChangeLog           Detailed list of changes (commit log)
    TODO                Known bugs and some sort of to-do list

    Note that only some of the above files are included in binary
    packages.


1.2. Documentation for command-line tools

    The command-line tools are documented as man pages. In source code
    releases (and possibly also in some binary packages), the man pages
    are also provided in plain text (ASCII only) and PDF formats in the
    directory "doc/man" to make the man pages more accessible to those
    whose operating system doesn't provide an easy way to view man pages.


1.3. Documentation for liblzma

    The liblzma API headers include short docs about each function
    and data type as Doxygen tags. These docs should be quite OK as
    a quick reference.

    I have planned to write a bunch of very well documented example
    programs, which (due to comments) should work as a tutorial to
    various features of liblzma. No such example programs have been
    written yet.

    For now, if you have never used liblzma, libbzip2, or zlib, I
    recommend learning the *basics* of the zlib API. Once you know that,
    it should be easier to learn liblzma.

        http://zlib.net/manual.html
        http://zlib.net/zlib_how.html


2. Version numbering
--------------------

    The version number format of XZ Utils is X.Y.ZS:

      - X is the major version. When this is incremented, the library
        API and ABI break.

      - Y is the minor version. It is incremented when new features
        are added without breaking the existing API or ABI. An even Y
        indicates a stable release and an odd Y indicates unstable
        (alpha or beta version).

      - Z is the revision. This has a different meaning for stable and
        unstable releases:

          * Stable: Z is incremented when bugs get fixed without adding
            any new features. This is intended to be convenient for
            downstream distributors that want bug fixes but don't want
            any new features to minimize the risk of introducing new bugs.

          * Unstable: Z is just a counter. API or ABI of features added
            in earlier unstable releases having the same X.Y may break.

      - S indicates stability of the release. It is missing from the
        stable releases, where Y is an even number. When Y is odd, S
        is either "alpha" or "beta" to make it very clear that such
        versions are not stable releases. The same X.Y.Z combination is
        not used for more than one stability level, i.e. after X.Y.Zalpha,
        the next version can be X.Y.(Z+1)beta but not X.Y.Zbeta.
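
    The scheme above can be expressed as a small parser. This is
    only a sketch of the rules as described in this section; the
    names XzVersion and parse_version are invented for the example
    and are not part of XZ Utils.

```python
import re
from typing import NamedTuple, Optional

# X.Y.Z followed by an optional "alpha" or "beta" suffix (the S part).
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(alpha|beta)?")


class XzVersion(NamedTuple):
    major: int                # X
    minor: int                # Y
    revision: int             # Z
    stability: Optional[str]  # S: None, "alpha", or "beta"

    @property
    def is_stable(self) -> bool:
        """Stable releases have an even Y and no suffix."""
        return self.minor % 2 == 0 and self.stability is None


def parse_version(text: str) -> XzVersion:
    """Parse an X.Y.ZS string and check it against the scheme."""
    match = VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not in X.Y.ZS format: {text!r}")
    major, minor, revision = (int(part) for part in match.group(1, 2, 3))
    suffix = match.group(4)
    # An even Y must have no suffix; an odd Y must have alpha or beta.
    if (minor % 2 == 0) != (suffix is None):
        raise ValueError(f"suffix does not match the parity of Y: {text!r}")
    return XzVersion(major, minor, revision, suffix)
```

    For example, parse_version("5.0.3").is_stable is True, while
    parse_version("4.999.9beta").is_stable is False.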


3. Reporting bugs
-----------------

    Naturally it is easiest for me if you already know what causes the
    unexpected behavior. Even better if you have a patch to propose.
    However, quite often the reason for unexpected behavior is unknown,
    so here are a few things to do before sending a bug report:

      1. Try to create a small example that reproduces the issue.

      2. Compile XZ Utils with debugging code using configure switches
         --enable-debug and, if possible, --disable-shared. If you are
         using GCC, use CFLAGS='-O0 -ggdb3'. Don't strip the resulting
         binaries.

      3. Turn on core dumps. The exact command depends on your shell;
         for example in GNU bash it is done with "ulimit -c unlimited",
         and in tcsh with "limit coredumpsize unlimited".

      4. Try to reproduce the suspected bug. If you get an "assertion
         failed" message, be sure to include the complete message in
         your bug report. If the application leaves a coredump, get a
         backtrace using gdb:
           $ gdb /path/to/app-binary   # Load the app to the debugger.
           (gdb) core core   # Open the coredump.
           (gdb) bt   # Print the backtrace. Copy & paste to bug report.
           (gdb) quit   # Quit gdb.

    Report your bug via email or IRC (see Contact information below).
    Don't send core dump files or any executables. If you have a small
    example file(s) (total size less than 256 KiB), please include
    it/them as an attachment. If you have bigger test files, put them
    online somewhere and include a URL to the file(s) in the bug report.

    Always include the exact version number of XZ Utils in the bug report.
    If you are using a snapshot from the git repository, use "git describe"
    to get the exact snapshot version. If you are using XZ Utils shipped
    in an operating system distribution, mention the distribution name,
    distribution version, and exact xz package version; if you cannot
    repeat the bug with the code compiled from unpatched source code,
    you probably need to report a bug to your distribution's bug tracking
    system.


4. Translating the xz tool
--------------------------

    The messages from the xz tool have been translated into a few
    languages. Before starting to translate into a new language, ask
    the author whether someone else has already started working on it.

    Test your translation. Testing includes comparing the translated
    output to the original English version by running the same commands
    in both your target locale and with LC_ALL=C. Ask someone to
    proof-read and test the translation.

    Testing can be done e.g. by installing xz into a temporary directory:

        ./configure --disable-shared --prefix=/tmp/xz-test
        # <Edit the .po file in the po directory.>
        make -C po update-po
        make install
        bash debug/translations.bash | less
        bash debug/translations.bash | less -S  # For --list outputs

    Repeat the above as needed (no need to re-run configure though).

    Note especially the following:

      - The output of --help and --long-help must look nice on
        an 80-column terminal. It's OK to add extra lines if needed.

      - In contrast, don't add extra lines to error messages and such.
        They are often preceded by e.g. a filename on the same line,
        so you have no way to predict where to put a \n. Let the terminal
        do the wrapping even if it looks ugly. Adding new lines will be
        even uglier in the generic case even if it looks nice in a few
        limited examples.

      - Be careful with column alignment in tables and table-like output
        (--list, --list --verbose --verbose, --info-memory, --help, and
        --long-help):

          * All descriptions of options in --help should start in the
            same column (but it doesn't need to be the same column as
            in the English messages; just be consistent if you change it).
            Check that both --help and --long-help look OK, since they
            share several strings.

          * --list --verbose and --info-memory print lines that have
            the format "Description:   %s". If you need a longer
            description, you can put extra space between the colon
            and %s. Then you may need to add extra space to other
            strings too so that the result as a whole looks good (all
            values start at the same column).

          * The columns of the actual tables in --list --verbose --verbose
            should be aligned properly. Abbreviate if necessary. It might
            be good to keep at least 2 or 3 spaces between column headings
            and avoid spaces in the headings so that the columns stand out
            better, but this is a matter of opinion. Do what you think
            looks best.

      - Be careful to put a period at the end of a sentence when the
        original version has it, and don't put it when the original
        doesn't have it. Similarly, be careful with \n characters
        at the beginning and end of the strings.

      - Read the TRANSLATORS comments that have been extracted from the
        source code and included in xz.pot. If they suggest testing the
        translation with some type of command, do it. If testing needs
        input files, use e.g. tests/files/good-*.xz.

      - When updating the translation, read the fuzzy (modified) strings
        carefully, and don't mark them as updated before you actually
        have updated them. Reading through the unchanged messages can be
        good too; sometimes you may find a better wording for them.

      - If you find language problems in the original English strings,
        feel free to suggest improvements. Ask if something is unclear.

      - The translated messages should be understandable (sometimes this
        may be a problem with the original English messages too). Don't
        make a direct word-by-word translation from English especially if
        the result doesn't sound good in your language.
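
    The 80-column requirement for --help and --long-help can be
    checked mechanically. The sketch below is not part of XZ Utils;
    the function names and the width rules are assumptions made for
    this example (East Asian wide characters count as two columns,
    combining marks as zero, and tabs are not expanded). It doesn't
    replace looking at the output in a real terminal.

```python
import unicodedata


def display_width(line: str) -> int:
    """Approximate the number of terminal columns a line occupies."""
    width = 0
    for ch in line:
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        # Wide and fullwidth characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def overlong_lines(text: str, limit: int = 80):
    """Return (line number, width) pairs for lines wider than limit."""
    return [
        (number, display_width(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if display_width(line) > limit
    ]
```

    The text argument could be, for example, the output of
    "xz --long-help" captured while running in the target locale.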

    In short, take your time and pay attention to the details. Making
    a good translation is not a quick and trivial thing to do. The
    translated xz should look as polished as the English version.


5. Other implementations of the .xz format
------------------------------------------

    7-Zip and the p7zip port of 7-Zip support the .xz format starting
    from the version 9.00alpha.

        http://7-zip.org/
        http://p7zip.sourceforge.net/

    XZ Embedded is a limited implementation written for use in the Linux
    kernel, but it is also suitable for other embedded use.

        http://tukaani.org/xz/embedded.html


6. Contact information
----------------------

    If you have questions, bug reports, patches etc. related to XZ Utils,
    contact Lasse Collin <lasse.collin@tukaani.org> (in Finnish or English).
    I'm sometimes slow at replying. If you haven't got a reply within two
    weeks, assume that your email has got lost and resend it or use IRC.

    You can also find me on #tukaani on Freenode; my nick is Larhzu.
    The channel tends to be pretty quiet, so just ask your question and
    someone may wake up.