History log of /freebsd-current/usr.bin/mkuzip/mkuzip.c
Revision Date Author Comments
# 5e3934b1 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

usr.bin: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 1d386b48 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
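
To see what the pattern above matches, it can be tried with Python's `re` module. This is only a sketch: the commit does not say which tool applied the pattern, so multi-line mode (`^` anchoring at each line start), the helper name `strip_fbsdid`, and the sample input are all assumptions.

```python
import re

# Pattern copied from the commit message; re.MULTILINE is an assumption.
FBSDID_LINE = re.compile(r'^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n', re.MULTILINE)


def strip_fbsdid(source):
    """Return source with any one-line __FBSDID("$FreeBSD$"); removed."""
    return FBSDID_LINE.sub("", source)


# Hypothetical input, not taken from mkuzip.c.
sample = '#include <sys/types.h>\n__FBSDID("$FreeBSD$");\n\nint main(void);\n'
print(strip_fbsdid(sample), end="")
```

Because the pattern ends in `\s*\n`, the blank line that follows the macro is removed together with it.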


# 1a7ac2bd 07-Jul-2023 Alfonso Gregory <gfunni234@gmail.com>

Mark usage function as __dead2 in programs where it does not return

In most cases, usage does not return, so mark them as __dead2. For the
cases where they do return, they have not been marked __dead2.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/735


# 1291d48f 20-Jun-2023 John Baldwin <jhb@FreeBSD.org>

mkuzip: Remove set but unused variable.

Reported by: GCC


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# eefd8f96 13-Aug-2019 Conrad Meyer <cem@FreeBSD.org>

geom_uzip(4), mkuzip(8): Add Zstd image mode

The Zstd format bumps the CLOOP major number to 4 to avoid incompatibility
with older systems. Support in geom_uzip(4) is conditional on the ZSTDIO
kernel option, which is enabled in amd64 GENERIC, but not all in-tree
configurations.

mkuzip(8) was modified slightly to always initialize the nblocks + 1'th
offset in the CLOOP file format. Previously, it was only initialized in the
case where the final compressed block happened to be unaligned w.r.t.
DEV_BSIZE. The "Fake" last+1 block change in r298619 means that the final
compressed block's 'blen' was never correct unless the compressed uzip image
happened to be BSIZE-aligned. This happened in about 1 out of every 512
cases. The zlib and lzma decompressors are probably tolerant of extra trash
following the frame they were told to decode, but Zstd complains that the
input size is incorrect.
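
The effect on the final block's length can be sketched as follows. This is a minimal illustration, not the mkuzip code: it assumes that each block's compressed length is derived from consecutive offsets, and that the old last+1 entry pointed at the DEV_BSIZE-padded end of the image. The offsets are made up.

```python
DEV_BSIZE = 512  # FreeBSD's basic device block size, in bytes


def round_up(n, align=DEV_BSIZE):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def block_lengths(offsets):
    """Derive each block's compressed length from nblocks + 1 offsets."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


# Hypothetical image: three compressed blocks, the last ending at byte 2739.
exact = [1024, 1500, 2100, 2739]
# Assumed old behavior: the last+1 entry points at the padded end of the image.
padded = exact[:-1] + [round_up(exact[-1])]

print(block_lengths(exact))   # [476, 600, 639]
print(block_lengths(padded))  # [476, 600, 972]
```

The two results agree only when the true end offset is already a multiple of 512, which is where the "1 out of every 512" figure comes from.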

Correspondingly, geom_uzip(4) was modified slightly to avoid trashing the
nblocks + 1'th offset when it is known to be initialized to a good value.
This corrects the calculated final real cluster compressed length to match
that printed by mkuzip(8).

mkuzip(8) was refactored somewhat to reduce code duplication and increase
ease of adding other compression formats.

* Input block size validation was pulled out of individual compression
init routines into main().

* Init routines now validate a user-provided compression level or select
an algorithm-specific default, if none was provided.

* A new interface for calculating the maximal compressed size of an
incompressible input block was added for each driver. The generic code
uses it to validate against MAXPHYS as well as to allocate compression
result buffers in the generic code.

* Algorithm selection is now driven by a table lookup, to increase ease of
adding other formats in the future.
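
The table-driven selection described in the last bullet can be sketched like this. The table layout and the names `COMPRESSORS` and `compress_block` are hypothetical, and Python's `zlib` and `lzma` modules stand in for the C libraries (note that `lzma.compress` emits an .xz container, which is not necessarily what mkuzip writes).

```python
import lzma
import zlib


def _zlib_compress(data, level):
    return zlib.compress(data, level)


def _lzma_compress(data, level):
    return lzma.compress(data, preset=level)


# Hypothetical dispatch table: supporting a new format means adding one entry.
COMPRESSORS = {
    "zlib": _zlib_compress,
    "lzma": _lzma_compress,
}


def compress_block(algorithm, data, level):
    """Look the algorithm up in the table and compress one block with it."""
    try:
        compress = COMPRESSORS[algorithm]
    except KeyError:
        raise ValueError(f"unknown algorithm: {algorithm!r}") from None
    return compress(data, level)
```

The generic code never names a specific compressor, so the per-format logic stays in one place.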

mkuzip(8) gained the ability to explicitly specify a compression level with
'-C'. The prior defaults -- 9 for zlib and 6 for lzma -- are maintained.
The new zstd default is 9, to match zlib.

Rather than select lzma or zlib with '-L' or its absence, respectively, a
new argument '-A <algorithm>' is provided to select 'zlib', 'lzma', or
'zstd'. '-L' is considered deprecated, but will probably never be removed.
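
How '-A', '-L' and '-C' interact can be modeled with a small option parser. This is a sketch under stated assumptions, not mkuzip's option handling: the function name `parse_options` is hypothetical, and letting '-A' win when both '-A' and '-L' are given is a guess, since the commit message does not say.

```python
import argparse

# Default levels as stated in the commit message.
DEFAULT_LEVELS = {"zlib": 9, "lzma": 6, "zstd": 9}


def parse_options(argv):
    """Return (algorithm, level) for a list of command-line arguments."""
    parser = argparse.ArgumentParser(prog="mkuzip-sketch")
    parser.add_argument("-A", dest="algorithm", choices=sorted(DEFAULT_LEVELS))
    parser.add_argument("-L", dest="legacy_lzma", action="store_true")
    parser.add_argument("-C", dest="level", type=int)
    args = parser.parse_args(argv)

    # Assumption: an explicit -A takes precedence over the deprecated -L.
    algorithm = args.algorithm or ("lzma" if args.legacy_lzma else "zlib")
    level = DEFAULT_LEVELS[algorithm] if args.level is None else args.level
    return algorithm, level


print(parse_options([]))                         # ('zlib', 9)
print(parse_options(["-L"]))                     # ('lzma', 6)
print(parse_options(["-A", "zstd", "-C", "3"]))  # ('zstd', 3)
```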

All of the new features were documented in mkuzip.8; the page was also
cleaned up slightly.

Relnotes: yes


# 1de7b4b8 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

various: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

No functional change intended.


# bc3b2c55 16-Jun-2017 Maxim Sobolev <sobomax@FreeBSD.org>

o Move logic that determines size of the input image into its own
file. That logic has grown quite significantly now;

o Add special handling for snapshot images. Those have some extra
headers at the end of the image, and we don't really need them in the
output image.

MFC after: 6 weeks


# 0ce59aa8 10-May-2017 Alan Somers <asomers@FreeBSD.org>

Don't depend on assert(3) getting evaluated

Reported by: imp
MFC after: 3 weeks
X-MFC-With: 318141, 318143
Sponsored by: Spectra Logic Corp


# 5cbe126a 10-May-2017 Alan Somers <asomers@FreeBSD.org>

strcpy => strlcpy

Reported by: Coverity
CID: 1352771
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp


# d654df13 25-Apr-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Try to make gcc builds happy again by removing a redundant declaration.


# 4fc55e3e 23-Apr-2016 Maxim Sobolev <sobomax@FreeBSD.org>

Improve performance in a few key areas:

o Split the compression across several worker threads. By default, "several"
matches the number of CPUs, capped at 24 for sanity when running on very big
hardware. Provide an option to set that number manually;

o Fix a bug inherited from mkulzma (R.I.P.) which degraded the already slow
LZMA compression even further by calling the function that releases the
compression state after processing each block.

It is neither documented as required nor actually required by the LZMA
library. This caused a spree of system calls to release memory and then map
it again for every block. LZMA compression is more than 2x faster after this
change alone;

o Record the time it takes to do compression and report the throughput
achieved;

o Add a simple first-level 256-entry hash table for the de-dup code, so that
it does not become a bottleneck on big files.
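
The worker-count rule from the first item can be sketched as follows. This is an illustration only: the names `default_workers` and `compress_blocks` are hypothetical, and a Python thread pool stands in for whatever threading the C code uses.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 24  # cap stated in the commit message


def default_workers(ncpus=None):
    """Number of CPUs, capped at MAX_WORKERS and never below one."""
    if ncpus is None:
        ncpus = os.cpu_count() or 1
    return max(1, min(ncpus, MAX_WORKERS))


def _compress(block):
    return zlib.compress(block, 9)


def compress_blocks(blocks, workers=None):
    """Compress blocks in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers or default_workers()) as pool:
        return list(pool.map(_compress, blocks))
```

`pool.map` preserves input order, which matters here because the output image must list blocks in the same order as the input.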


# 6e5a582d 13-Mar-2016 Maxim Sobolev <sobomax@FreeBSD.org>

In de-duplication mode, when a matching md5 checksum is found, also read
the block back and compare the actual content. In the unlikely event of a
collision, just output the original block instead of a back reference.
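
The check can be sketched as below. This is a minimal model, not the mkuzip code: blocks are held in memory rather than read back from a file, the output is a list of tagged tuples, and the function name `dedup_blocks` is hypothetical. The bucket list borrows the 256-entry first-level table idea from the 4fc55e3e entry above; indexing it by the first digest byte is an assumption.

```python
import hashlib


def dedup_blocks(blocks):
    """Return ("data", block) or ("ref", earlier_index) for each block."""
    # Assumed layout: 256 buckets, selected by the first byte of the digest.
    buckets = [[] for _ in range(256)]
    out = []
    for idx, block in enumerate(blocks):
        digest = hashlib.md5(block).digest()
        bucket = buckets[digest[0]]
        ref = None
        for prev_digest, prev_idx in bucket:
            # A digest match alone is not trusted: compare the content too.
            if prev_digest == digest and blocks[prev_idx] == block:
                ref = prev_idx
                break
        if ref is None:
            bucket.append((digest, idx))
            out.append(("data", block))
        else:
            out.append(("ref", ref))
    return out


print(dedup_blocks([b"aaaa", b"bbbb", b"aaaa"]))
# [('data', b'aaaa'), ('data', b'bbbb'), ('ref', 0)]
```

If two different blocks ever shared a digest, the content comparison would fail and the second block would be emitted as data, which is the fallback the commit describes.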


# 62ee4b69 10-Mar-2016 Maxim Sobolev <sobomax@FreeBSD.org>

When -S is specified dump summary to stdout, not stderr, so it's
easier to capture and process it with external tools via pipe.


# d83e0778 10-Mar-2016 Maxim Sobolev <sobomax@FreeBSD.org>

Add -S option to print out summary after compression has been
completed.

MFC after: 2 weeks


# 8f8cb840 23-Feb-2016 Maxim Sobolev <sobomax@FreeBSD.org>

Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8)
and geom_uncompress(4):

1. mkuzip(8):

- Proper support for eliminating all-zero blocks when compressing an
image. This feature is already supported by the geom_uzip(4) module
and the CLOOP format in general, so it's just a matter of making
mkuzip(8) match. It should be noted, however, that while this feature
sounds great, it results in only a very slight improvement in the
overall compression ratio, since compressing a default 16k all-zero
block produces only a 39-byte compressed output block, which is a
99.8% compression ratio. With the typical average compression ratio of
amd64 binaries and data being around 60-70%, the difference between
99.8% and 100.0% is not that great, and it is further diluted by the
ratio of zero blocks in the uncompressed image to the overall number
of blocks being less than 0.5 (typically). However, this may be
important from a performance standpoint, so that the kernel is not
spinning its wheels decompressing those empty blocks every time this
zero region is read. It could also be important when you create a huge
image mostly filled with zero blocks for testing purposes.

- New feature that de-duplicates the output image. It turns out that
if you twist the CLOOP format a bit you can do that as well. And unlike
zero-block elimination, this gives a noticeable improvement in the
overall compression ratio, reducing the output image by something like
3-4% on my test UFS2 3GB image consisting of a full FreeBSD base system
plus some of the packages (openjdk, apache etc), about 2.3GB worth of
file data (800+MB compressed). The only caveat is that images created
with this feature "on" would not work on older versions of the FreeBSD
kernel, hence it's turned off by default.

- provide options to control both features and document them in the
manual page.

- merge in all relevant LZMA compression support from mkulzma(8),
add a new option to select between the two.

- switch the license from ad-hoc beerware to standard 2-clause BSD.
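
The 99.8% figure quoted for all-zero blocks in the first item above follows from simple arithmetic, sketched here together with a zero-block test. The helper names are hypothetical, and Python's `zlib` stands in for the C library, so its output size for a zero block need not be exactly the 39 bytes reported in the commit.

```python
import zlib

CLUSTER_SIZE = 16 * 1024  # the default 16k block mentioned in the commit


def is_all_zero(block):
    """True if every byte in the block is zero."""
    return block.count(0) == len(block)


def compression_ratio(original_size, compressed_size):
    """Fraction of the original size that compression saves."""
    return 1.0 - compressed_size / original_size


# Figure from the commit message: 16384 bytes compress to 39 bytes.
print(f"{compression_ratio(CLUSTER_SIZE, 39):.1%}")  # 99.8%

zero_block = bytes(CLUSTER_SIZE)
print(is_all_zero(zero_block), len(zlib.compress(zero_block, 9)))
```

Skipping such a block entirely raises the saving from 99.8% to 100%, so the gain is in avoided decompression work rather than in image size.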

2. geom_uzip(4):

- implement support for de-duplicated images;

- optimize some code paths to handle "all-zero" blocks without reading
any compressed data;

- beef up the manual page to explain that geom_uzip(4) is not limited
only to md(4) images. The compressed data can be written to a block
device and accessed directly via the magic of GEOM(4) and devfs(4),
including mounting the root fs from a compressed drive.

- convert the debug log code from being compiled in conditionally to
being present all the time, and provide two sysctls to turn it on or
off. Due to the intended use of the module, it can be used in
environments where there may not be the luxury of installing a new
kernel with debug code enabled. Having those options handy allows
debugging issues with much less trouble, needing only access to a
serial console or network shell access to a box/appliance. The
additional CPU cost is just a few int comparisons and branches, which
is minuscule compared to data decompression, the main feature of the
module.

- hopefully improve the robustness and resiliency of geom_uzip(4) by
performing some data validation / range checking on the TOC entries
and refusing to attach to an image if those checks fail.

- merge in all relevant LZMA decompression support from
geom_uncompress(4), enabled automatically when the appropriate format
is indicated in the header.

- move decompression work into its own worker thread so that it does
not clog g_up. This allows multiple instances to work in parallel,
utilizing SMP cores.

- document new knobs in the manual page.

Reviewed by: adrian
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D5333


# 22a2d42f 13-May-2011 Ruslan Ermilov <ru@FreeBSD.org>

Fixed an embedded shell script.

Reviewed by: sobomax


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# fe0506d7 09-Mar-2010 Marcel Moolenaar <marcel@FreeBSD.org>

Create the altix project branch. The altix project will add support
for the SGI Altix 350 to FreeBSD/ia64. The hardware used for porting
is a two-module system, consisting of a base compute module and a
CPU expansion module. SGI's NUMAFlex architecture can be an excellent
platform to test CPU affinity and NUMA-aware features in FreeBSD.


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 27d0a1a4 06-Mar-2007 Max Khon <fjoe@FreeBSD.org>

Support character device as input file.

PR: 103500


# d72d8f53 30-Jan-2006 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Tell the user exactly where the problem was.


# 5cf3bf70 11-May-2005 Max Khon <fjoe@FreeBSD.org>

- check for geom_uzip module presence using kldstat -m.
kldstat -m finds geom_uzip module even if it is compiled in statically.
- create output file with x bit set.
- build mkuzip on all architectures (verified with "make universe").
- fix typo in info message.


# ed9302fd 02-May-2005 Maxim Sobolev <sobomax@FreeBSD.org>

Make WARNS=6 clean, which should make it compile on amd64.

Submitted by: Matteo Riondato <rionda@gufi.org>


# 0b99ac63 10-Sep-2004 Maxim Sobolev <sobomax@FreeBSD.org>

o Print more info in verbose mode;

o use the zlib(3) function which computes the maximum length of the output
buffer instead of rolling our own version;

o allow the size of the input file to not be a multiple of the cluster size
by applying zero padding.
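
The last two items can be sketched together. Python's `zlib` module does not expose `compressBound()`, so `zlib_bound` below reproduces the formula used in zlib's C source as an assumption. `pad_to_cluster` is a hypothetical helper, and padding only the tail of the data is a guess at how the zero padding is applied.

```python
import os
import zlib

CLUSTER_SIZE = 16 * 1024


def zlib_bound(source_len):
    """Worst-case compressed size, after zlib's compressBound() formula."""
    return (source_len + (source_len >> 12) + (source_len >> 14)
            + (source_len >> 25) + 13)


def pad_to_cluster(data, cluster_size=CLUSTER_SIZE):
    """Append zero bytes so len(data) becomes a multiple of cluster_size."""
    return data + b"\x00" * (-len(data) % cluster_size)


# Random data is essentially incompressible, so it approaches the bound.
worst = zlib.compress(os.urandom(CLUSTER_SIZE), 9)
print(len(worst), zlib_bound(CLUSTER_SIZE))
print(len(pad_to_cluster(b"x" * 20000)))  # 32768
```

Sizing the output buffer from the bound means a block that does not compress at all still fits without reallocation.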


# 7f4caa8c 10-Sep-2004 Maxim Sobolev <sobomax@FreeBSD.org>

Add mkuzip(8), a non-GPL utility to compress filesystem images for use with
the geom_uzip module. This is based on a utility I wrote some 3 years ago for
a hack for md(4), which functionally was close to what geom_uzip does today.

Since I don't have time to test that it compiles/works on other arches,
stick it to i386 only. Will do it later.

Unlike the original cloop util, this one embeds FreeBSD-compatible shell code
into the generated image, not Linux code. Unfortunately, the severe space
restriction imposed by the CLOOP format doesn't allow putting in conditional
code which would work on both Linux and FreeBSD. In fact it was quite a
challenge to fit the necessary FreeBSD code into 127 bytes. ;-)