History log of /freebsd-current/usr.bin/sort/bwstring.c
Revision Date Author Comments
# bd234c0d 07-Dec-2023 Warner Losh <imp@FreeBSD.org>

sort: Only build FreeBSD-specific ALTMON_x stuff when ALTMON_1 is defined

On macOS, we bootstrap sort. Since ALTMON_* are not defined there, the
build blows up. Since we don't need this feature for the FreeBSD build
process, and since we won't use it unless we actually install the NL
files that have this data in them, just #ifdef it out for now. In the
extremely unlikely event that the FreeBSD bootstrap/build process grows
this dependency, we can evaluate the best solution then (which most
likely is going to be to not depend on the locale's month names).
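
A minimal sketch of this kind of guard, not the committed code: the
helper names abbreviated_month() and standalone_month() are made up for
illustration. ABMON_* is POSIX, while ALTMON_* is only used where
<langinfo.h> defines it.

```c
#include <langinfo.h>
#include <stddef.h>

/* POSIX abbreviated month names; available on every supported host. */
static const nl_item abmon_items[12] = {
	ABMON_1, ABMON_2, ABMON_3, ABMON_4, ABMON_5, ABMON_6,
	ABMON_7, ABMON_8, ABMON_9, ABMON_10, ABMON_11, ABMON_12
};

#ifdef ALTMON_1
/* Standalone month names; only compiled where the platform has them. */
static const nl_item altmon_items[12] = {
	ALTMON_1, ALTMON_2, ALTMON_3, ALTMON_4, ALTMON_5, ALTMON_6,
	ALTMON_7, ALTMON_8, ALTMON_9, ALTMON_10, ALTMON_11, ALTMON_12
};
#endif

/* Abbreviated name of month m (0..11) in the current locale. */
static const char *
abbreviated_month(int m)
{
	return (nl_langinfo(abmon_items[m]));
}

/* Standalone name of month m (0..11), or NULL where ALTMON_* is missing. */
static const char *
standalone_month(int m)
{
#ifdef ALTMON_1
	return (nl_langinfo(altmon_items[m]));
#else
	(void)m;
	return (NULL);
#endif
}
```

A caller that gets NULL from standalone_month() would simply skip that
format, which matches the intent of building without the feature.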

Fixes: 3d44dce90a69 (MacOS builds and github CI)
Sponsored by: Netflix
Reviewed by: jrtc27, jlduran@gmail.com, markj
Differential Revision: https://reviews.freebsd.org/D42868


# 3d44dce9 30-Nov-2023 Christos Margiolis <christos@FreeBSD.org>

sort: test against all month formats in month-sort

The CLDR specification [1] defines three possible month formats:

- Abbreviation (e.g., Jan, Ιαν)
- Full (e.g., January, Ιανουαρίου)
- Standalone (e.g., January, Ιανουάριος)

Many languages use different case endings depending on whether the month
is referenced as a standalone word (nominative case), or in date context
(genitive, partitive, etc.). sort(1)'s -M option currently sorts months
by testing input against only the abbreviation format, which is
essentially a substring of the full format. While this works fine for
languages like English, where there are no cases, for languages where
there is a different case ending between the abbreviation/full and
standalone formats, it is not sufficient.

For example, in Greek, "May" can take the following forms:

Abbreviation: Μαΐ (genitive case)
Full: Μαΐου (genitive case)
Standalone: Μάιος (nominative case)

If we use the standalone format in Greek, sort(1) will not be able to
match "Μαΐ" to "Μάιος" and the sort will fail.

This change makes sort(1) test against all three formats. It also works
when the input contains mixed formats.
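
The idea can be sketched as follows. This is an illustration under
simplifying assumptions, not the code in bwstring.c: struct month_table
and month_lookup() are hypothetical names, the comparison is a plain
byte-wise prefix match, and the sample table fills in only May, using
the three Greek forms quoted above.

```c
#include <stddef.h>
#include <string.h>

enum { FMT_ABBREV, FMT_FULL, FMT_STANDALONE, NFORMATS };

/* Hypothetical table of month names: names[format][month 0..11]. */
struct month_table {
	const char *names[NFORMATS][12];
};

/* Sample data: only May (index 4) is populated; other slots stay NULL. */
static const struct month_table greek_may = {
	.names[FMT_ABBREV][4] = "Μαΐ",
	.names[FMT_FULL][4] = "Μαΐου",
	.names[FMT_STANDALONE][4] = "Μάιος",
};

/*
 * Return 1..12 if s starts with any known form of a month name,
 * or 0 if no format matches.
 */
static int
month_lookup(const struct month_table *t, const char *s)
{
	for (int f = 0; f < NFORMATS; f++) {
		for (int m = 0; m < 12; m++) {
			const char *n = t->names[f][m];

			if (n != NULL && strncmp(s, n, strlen(n)) == 0)
				return (m + 1);
		}
	}
	return (0);
}
```

With only the abbreviation column, month_lookup(&greek_may, "Μάιος")
would return 0; with all three columns it returns 5, as it does for
"Μαΐ" and "Μαΐου".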

[1] https://cldr.unicode.org/translation/date-time/date-time-patterns

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42847


# 1d386b48 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# e8815fb3 13-Oct-2022 Baptiste Daroussin <bapt@FreeBSD.org>

sort: remove unused function


# b58094c0 12-Oct-2022 Baptiste Daroussin <bapt@FreeBSD.org>

sort: replace home made line reader by getdelim(3)

The previous code had a bug when reading lines with an unexpected
encoding: it returned without capturing the full line.
This resulted in sort complaining with "sort: Illegal byte sequence".

Using getdelim(3) instead of the home made code fixes the situation.
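
A minimal sketch of a getdelim(3)-based reader, assuming a hypothetical
helper count_records(); it is not the function used by sort. Because
getdelim(3) works on bytes and grows the buffer itself, each record is
read in full regardless of its encoding.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Count the records in fp, where each record ends with delim
 * ('\n' for lines, '\0' for NUL-terminated input).
 * Returns the record count, or -1 on a read error.
 */
static long
count_records(FILE *fp, int delim)
{
	char *buf = NULL;	/* allocated and resized by getdelim() */
	size_t cap = 0;
	ssize_t len;
	long n = 0;

	while ((len = getdelim(&buf, &cap, delim, fp)) != -1) {
		/* Strip the trailing delimiter, if the record has one. */
		if (len > 0 &&
		    (unsigned char)buf[len - 1] == (unsigned char)delim)
			buf[--len] = '\0';
		n++;
	}
	free(buf);
	return (ferror(fp) ? -1 : n);
}
```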

PR: 241679
Reported by: Ronald F. Guilmette <rfg-freebsd@tristatelogic.com>
MFC After: 1 week
Reviewed by: markj, imp
Differential Revision: https://reviews.freebsd.org/D36948


# e9bfb50d 29-Oct-2021 Mark Johnston <markj@FreeBSD.org>

sort: Fix random sort

bwsrawdata() is supposed to return the string buffer.

PR: 259451
Reported by: sigsys@gmail.com
Fixes: d053fb22f6d3 ("usr.bin/sort: Avoid UBSan errors")
MFC after: 3 days
Sponsored by: The FreeBSD Foundation


# d053fb22 05-Jul-2021 Alex Richardson <arichardson@FreeBSD.org>

usr.bin/sort: Avoid UBSan errors

UBSan complains about out-of-bounds accesses for zero-length arrays. To
avoid this we can use flexible array members. However, the C standard does
not allow for structures that only contain flexible array members, so we
move the length parameters into that structure too.
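
A minimal sketch of the pattern, assuming a hypothetical byte-string
type struct bstr rather than the actual struct bwstring layout: the
length sits next to a flexible array member, and the raw-data accessor
returns the character buffer, not the enclosing structure.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type: length and data share one allocation. */
struct bstr {
	size_t	len;
	char	str[];	/* flexible array member; cannot be the only member */
};

/* Allocate a bstr holding a NUL-terminated copy of len bytes from src. */
static struct bstr *
bstr_new(const char *src, size_t len)
{
	struct bstr *b = malloc(sizeof(*b) + len + 1);

	if (b == NULL)
		return (NULL);
	b->len = len;
	memcpy(b->str, src, len);
	b->str[len] = '\0';
	return (b);
}

/* Return the string buffer itself. */
static char *
bstr_rawdata(struct bstr *b)
{
	return (b->str);
}
```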

Split out from D28233.

Reviewed By: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31009


# 71ec05a2 13-May-2021 Cyril Zhang <cyril@freebsdfoundation.org>

sort: Cache value of MB_CUR_MAX

Every usage of MB_CUR_MAX results in a call to __mb_cur_max. This is
inefficient and redundant. Caching the value of MB_CUR_MAX in a global
variable removes these calls and speeds up the runtime of sort. For
numeric sorting, runtime is almost halved in some tests.
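
A minimal sketch of the caching idea; mb_cur_max_cached,
init_locale_cache() and is_multibyte_locale() are hypothetical names,
not the ones used in sort. The cached value has to be refreshed whenever
the locale changes.

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical global: MB_CUR_MAX as of the last init_locale_cache(). */
static size_t mb_cur_max_cached = 1;

/* Set the locale once at startup and remember MB_CUR_MAX. */
static void
init_locale_cache(const char *locale)
{
	setlocale(LC_ALL, locale);
	mb_cur_max_cached = MB_CUR_MAX;
}

/* Hot-path check that reads the global instead of evaluating MB_CUR_MAX. */
static int
is_multibyte_locale(void)
{
	return (mb_cur_max_cached > 1);
}
```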

PR: 255551
PR: 255840
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30170


# 266b51ac 10-Sep-2020 Alex Richardson <arichardson@FreeBSD.org>

Fix -Wpointer-sign warnings in bwstring.c


# 1de7b4b8 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

various: general adoption of SPDX licensing ID tags.

Mainly focus on files that use the BSD 2-Clause license; however, the
tool I was using misidentified many licenses, so this was mostly a
manual - error prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well-known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

No functional change intended.


# 759a9a9d 17-Feb-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sort(1): Remove unneeded initializations.

Found by: Clang static analyzer


# c514c3ed 27-Nov-2016 Xin LI <delphij@FreeBSD.org>

Eliminate variables that are computed and assigned but never
used.

MFC after: 2 weeks


# 80c7cc1c 15-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

Cleanup unnecessary semicolons from utilities we all love.


# b904ab99 06-Apr-2015 Pedro F. Giffuni <pfg@FreeBSD.org>

sort(1): Cleanups and a small memory leak.

Remove useless check for leading blanks in the month name. The
code didn't adjust len after stripping blanks so even if a month
*did* start with a blank we'd end up copying garbage at the end.
Also convert a malloc + memcpy to strdup and fix a memory leak in
the wide char version if mbstowcs() fails.
Originally from Andre Smagin.
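
A minimal sketch of the two cleanups, using hypothetical helpers
copy_name() and mb_to_wide_dup() rather than the code in bwstring.c:
strdup(3) replaces a malloc + memcpy pair, and the wide-character buffer
is freed when mbstowcs() fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Copy a month name; replaces malloc(len + 1) followed by memcpy(). */
static char *
copy_name(const char *s)
{
	return (strdup(s));
}

/* Convert s to a newly allocated wide string; NULL on any failure. */
static wchar_t *
mb_to_wide_dup(const char *s)
{
	size_t n = strlen(s) + 1;	/* upper bound on wide chars, with NUL */
	wchar_t *w = malloc(n * sizeof(*w));

	if (w == NULL)
		return (NULL);
	if (mbstowcs(w, s, n) == (size_t)-1) {
		free(w);	/* do not leak the buffer on a failed conversion */
		return (NULL);
	}
	return (w);
}
```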

Obtained from: OpenBSD (CVS rev. 1.2, 1.3)
MFC after: 1 week


# e5f71a07 05-Apr-2015 Pedro F. Giffuni <pfg@FreeBSD.org>

Revert (partial) r281123, r281125:
sort: style nits / cleanups.

Our style guide, style(9), specifies that in the absence of local
variables an empty line must be inserted.

Pointed out by: eadler


# db8026c7 05-Apr-2015 Pedro F. Giffuni <pfg@FreeBSD.org>

sort: style nits / cleanups.

Obtained from: OpenBSD


# bd0f80c6 05-Apr-2015 Pedro F. Giffuni <pfg@FreeBSD.org>

sort: Fix a comment.

Obtained from: OpenBSD


# c859c6dd 02-Jun-2013 Gabor Kovesdan <gabor@FreeBSD.org>

- Update Oleg Moskalenko's email address

Requested by: Oleg Moskalenko <mom040267@gmail.com>


# e8da8c74 01-Nov-2012 Gabor Kovesdan <gabor@FreeBSD.org>

- Portability changes for ARM
- Allow larger sort memory on 64-bit platforms

Submitted by: Oleg Moskalenko <oleg.moskalenko@citrix.com>


# 5ca724dc 25-May-2012 Gabor Kovesdan <gabor@FreeBSD.org>

- Only use multi-threading for large files
- Do not use mmap() by default; it can be enabled by --mmap
- Add some minor optimizations for -u
- Update manual page according to the changes

Submitted by: Oleg Moskalenko <oleg.moskalenko@citrix.com>


# ce1e997f 14-May-2012 Gabor Kovesdan <gabor@FreeBSD.org>

- Eliminate initializations of global variables. Compilers are not
required to optimize these so it may result in larger binary size.

Pointed out by: kib


# c66bbc91 10-May-2012 Gabor Kovesdan <gabor@FreeBSD.org>

Add a BSD-licensed sort rewrite that was started by me and later completed
with the major functionality and optimizations by Oleg Moskalenko.
It is compatible with the latest version of POSIX and the current GNU sort
version that we have in base. Besides this, it implements all the
functionality introduced in later versions of GNU sort. For now, it will
be installed as "bsdsort", keeping GNU sort as the default sort
implementation.