History log of /freebsd-11-stable/usr.bin/grep/file.c
Revision Date Author Comments
# 354628 11-Nov-2019 kevans

MFC bsdgrep(1) fixes: r320414, r328559, r332805-r332806, r332809, r332832,
r332850-r332852, r332856, r332858, r332876, r333351, r334803,
r334806-r334809, r334821, r334837, r334889, r335188, r351769, r352691

r320414:
Expect :mmap_eof_not_eol to fail

It relies on a jemalloc feature (opt.redzone) no longer available after
r319971.

r328559:
Remove t_grep:mmap_eof_not_eol test

The test was marked as an expected failure in r320414 after r319971's import
of a newer jemalloc removed an essential feature (opt.redzone) for
reproducing the behavior it was testing. Since then, no way has been found
or demonstrated to reliably test the behavior, so remove the test.

r332805:
bsdgrep: Split match processing out of procfile

procfile is getting kind of hairy, and it's not going to get better as we
correct some more bits that assume we process one line at a time.

r332806:
bsdgrep: Clean up procmatches a little bit

r332809:
bsdgrep: Add some TODOs for future work on operating on chunks

r332832:
bsdgrep: Break procmatches down a little bit more

Split the matching and non-matching cases out into their own functions to
reduce future complexity. As the name implies, procmatches will eventually
process more than one match itself in the future.

r332850:
bsdgrep: Some light cleanup

There's no point checking for a bunch of file modes if we're not a
practicing believer of DIR_SKIP or DEV_SKIP.

This also reduces some style violations that were particularly ugly looking
when browsing through.

r332851:
bsdgrep: More trivial cleanup/style cleanup

We can avoid branching for these easily reduced patterns

r332852:
bsdgrep: if chain => switch

This makes some of this a little easier to follow (in my opinion).

r332856:
bsdgrep: Fix --include/--exclude ordering issues

Prior to r332851:
* --exclude always wins out over --include
* --exclude-dir always wins out over --include-dir

r332851 broke that behavior, resulting in:
* First of --exclude, --include wins
* First of --exclude-dir, --include-dir wins

As it turns out, both behaviors are wrong by modern grep standards- the
latest rule wins. e.g.:

`grep --exclude foo --include foo 'thing' foo`
foo is included

`grep --include foo --exclude foo 'thing' foo`
foo is excluded

As tested with GNU grep 3.1.

This commit makes bsdgrep follow this behavior.
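
The "latest rule wins" behavior described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the names rule, rule_mode and file_selected are hypothetical, and the default used when no rule matches (excluded if any --include was given, included otherwise) is an assumption.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rule record for illustration; not bsdgrep's own struct. */
enum rule_mode { RULE_EXCLUDE, RULE_INCLUDE };

struct rule {
	const char	*pat;	/* glob given to --include or --exclude */
	enum rule_mode	 mode;
};

/*
 * Decide whether fname should be searched.  Rules are stored in
 * command-line order and the last one that matches decides, so the
 * loop must not stop at the first match.
 */
static bool
file_selected(const char *fname, const struct rule *rules, size_t nrules)
{
	bool selected = true;
	size_t i;

	assert(fname != NULL);

	/* Assumption: any --include excludes unmatched files by default. */
	for (i = 0; i < nrules; i++)
		if (rules[i].mode == RULE_INCLUDE)
			selected = false;

	for (i = 0; i < nrules; i++)
		if (fnmatch(rules[i].pat, fname, 0) == 0)
			selected = (rules[i].mode == RULE_INCLUDE);

	return (selected);
}
```

With the rules { exclude foo, include foo } the file foo is selected; with the order reversed it is not, which mirrors the two command lines quoted above.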

r332858:
bsdgrep: Use grep_strdup instead of grep_malloc+strcpy

r332876:
bsdgrep: Fix build failure WITHOUT_LZMA (incorrect bracket placement)

r333351:
bsdgrep: Allow "-" to be passed to -f to mean "standard input"

A version of this patch was originally sent to me by se@, matching behavior
from newer versions of GNU grep.

While there have been some differences of opinion on whether stdin should be
closed or not after depleting it in process of -f, I've opted to leave stdin
open and just let the later matching stuff fail and result in a no-match.
I'm not married to the current behavior- it was generally chosen since we
are adopting this in particular from GNU grep, and I would like to stay
consistent without a strong argument to the contrary. The current behavior
isn't technically wrong, it's just fairly unfriendly to the developer-user
of grep that may not realize their usage is trivially invalid.
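
A minimal sketch of the "-" handling described above, assuming hypothetical helpers open_pattern_source and close_pattern_source (these names are not taken from bsdgrep). It only shows the idea of mapping "-" to stdin and leaving stdin open afterwards.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open the argument of -f, where "-" means stdin. */
static FILE *
open_pattern_source(const char *name)
{
	assert(name != NULL);
	if (strcmp(name, "-") == 0)
		return (stdin);
	return (fopen(name, "r"));
}

/* Close a pattern source, but leave stdin open as described above. */
static void
close_pattern_source(FILE *fp)
{
	if (fp != NULL && fp != stdin)
		(void)fclose(fp);
}
```

Because stdin stays open but is already drained, a later attempt to search standard input simply reads nothing, which is the no-match outcome the message describes.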

r334803:
netbsd-tests: grep(1): Add test for -c flag

Someone might be inclined to accidentally break this. Someone might have
written said test because they broke it locally.

r334806:
bsdgrep(1): Do some less dirty things with return types

Neither procfile nor grep_tree return anything meaningful to their callers.
None of the callers actually care about how many lines were matched in all
of the files they processed; it's all about "did anything match?"

This is generally just a light refactoring to remind me of what actually
matters as I'm rewriting these bits to care less about 'stuff'.

r334807:
bsdgrep(1): whoops, garbage collect the now write-only variable

r334808:
bsdgrep(1): Don't initialize fts_flags twice

Admittedly, this is a clang-scan complaint... but it wasn't wrong. fts_flags
is initialized by all cases in the switch(), which should be fairly obvious.
Annotate this anyways.

r334809:
netbsd-tests: bsdgrep(1): Add a test for -m, too

r334821:
bsdgrep(1): Slooowly peel away the chunky onion

(or peel off the band-aid, whatever floats your boat)

This addresses two separate issues:

1.) Nothing within bsdgrep actually knew whether it cared about line numbers
or not.

2.) The file layer knew nothing about the context in which it was being
called.

#1 is only important when we're *not* processing line-by-line. #2 is
debatably a good idea; the parsing context is only handy because that's
where we store current offset information and, as of this commit, whether or
not it needs to be line-aware.

r334837:
bsdgrep(1): Evict character sequence that moved in

r334889:
bsdgrep(1): Some more int -> bool conversions and name changes

Again motivated by upcoming work to rewrite a bunch of this- single-letter
variable names and slightly misleading variable names ("lastmatches" to
indicate that the last matched) are not helpful.

r335188:
bsdgrep(1): Remove redundant initialization; unconditionally assigned later

r351769:
bsdgrep(1): add some basic tests for some GNU Extension support

These will be expanded later as I come up with good test cases; for now,
these seem to be enough to trigger bugs in base gnugrep and expose missing
features in bsdgrep.

r352691:
bsdgrep(1): various fixes of empty pattern/exit code/-c behavior

When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:

- The -v flag semantics were not quite right; lines matched should have been
counted differently based on whether the -v flag was set or not. procline
now definitively returns whether it's matched or not, and interpreting
that result has been kicked up a level.
- Empty patterns with the -x flag were broken similarly to empty patterns
with the -w flag. The former is a whole-line match and should be more
strict, only matching blank lines. No -x and no -w will match the
empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grep will
exit(0) if any file that didn't match was output, so our interpretation
was simply backwards. The new interpretation makes sense to me.

Tests updated and added to try and catch some of this.

This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.
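
The -v and -L points in r352691 can be illustrated with a small sketch. The function names and parameters below are hypothetical and only restate the semantics given in the commit message; they are not the bsdgrep implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* A line is selected when the match result and -v disagree. */
static bool
line_selected(bool matched, bool vflag)
{
	return (matched != vflag);
}

/*
 * Exit status sketch.  With -L, success means at least one file was
 * listed, that is, at least one file had no selected lines.  Without
 * -L, success means at least one line was selected.
 */
static int
exit_status(bool Lflag, int files_listed, int lines_selected)
{
	assert(files_listed >= 0 && lines_selected >= 0);
	if (Lflag)
		return (files_listed > 0 ? 0 : 1);
	return (lines_selected > 0 ? 0 : 1);
}
```

Keeping the raw match result separate from its interpretation is what "kicked up a level" refers to: the caller applies -v, so the per-line matcher never needs to know about it.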


# 330449 05-Mar-2018 eadler

MFC r326276:

various: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.

No functional change intended.


# 322608 17-Aug-2017 kevans

bsdgrep: fix segfault with --mmap and add relevant test

MFC r318565: bsdgrep: fix segfault with --mmap

r313948 partially fixed --mmap behavior but was incomplete. This commit
generally reverts it and does it the more correct way- by just consuming
the rest of the buffer and moving on.

MFC r318908: bsdgrep: add --mmap tests

Basic sanity tests as well as coverage for the bug fixed in r318565.

PR: 219402
Approved by: emaste (mentor, blanket MFC)


# 322560 16-Aug-2017 kevans

bsdgrep: add -z/--null-data support and update NLS catalogs accordingly

MFC r317049: bsdgrep: add -z/--null-data support

-z treats input and output data as sequences of lines terminated by a
zero byte instead of a newline. This brings it more in line with GNU grep
and brings us closer to passing the current tests with BSD grep.

MFC r317679: bsdgrep: correct nls usage data after r317049

r317049 added -z/--null-data to BSD grep but missed the update to nls
catalogs.

Approved by: emaste (mentor, blanket MFC)
Relnotes: yes
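
To illustrate the -z record model described in r317049, here is a minimal sketch that counts records in a buffer using either terminator. The helper count_records and its zflag parameter are hypothetical names for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the records in buf.  zflag selects the record terminator:
 * a zero byte for -z, a newline otherwise.  A final record without a
 * terminator still counts as a record.
 */
static size_t
count_records(const char *buf, size_t len, bool zflag)
{
	const char eol = zflag ? '\0' : '\n';
	size_t n = 0;

	assert(buf != NULL || len == 0);
	while (len > 0) {
		const char *end = memchr(buf, eol, len);
		size_t used = (end != NULL) ? (size_t)(end - buf) + 1 : len;

		n++;
		buf += used;
		len -= used;
	}
	return (n);
}
```

The same seven bytes "a\nb\0c\nd" hold three records in the default mode and two with -z, since newlines become ordinary data once the terminator is a zero byte.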


# 322520 14-Aug-2017 kevans

MFC r313948: bsdgrep: fix EOF handling with --mmap

Rework part of the loop in grep_fgetln to return the rest of the line
and ensure that we still advance the buffer by the length of the rest
of the line.

PR: 165471
Approved by: emaste (mentor)
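
The EOF handling described in r313948 can be sketched with a cursor over an in-memory buffer. This is an illustration under assumptions: struct cursor and cursor_getln are hypothetical and do not reproduce grep_fgetln itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over an in-memory (for example mmap'd) buffer. */
struct cursor {
	const char	*pos;		/* next unread byte */
	size_t		 remaining;	/* bytes left starting at pos */
};

/*
 * Return the next line and store its length, newline included, in
 * *lenp.  If the data ends without a newline, return the rest of the
 * buffer as the final line.  Either way the cursor advances by the
 * returned length, so the following call returns NULL at end of data.
 */
static const char *
cursor_getln(struct cursor *c, size_t *lenp)
{
	const char *line, *nl;

	assert(c != NULL && lenp != NULL);
	if (c->remaining == 0)
		return (NULL);

	line = c->pos;
	nl = memchr(line, '\n', c->remaining);
	*lenp = (nl != NULL) ? (size_t)(nl - line) + 1 : c->remaining;

	c->pos += *lenp;
	c->remaining -= *lenp;
	return (line);
}
```

The important detail is that the cursor is advanced even for the unterminated last line; without that, the caller would see the same tail again or read past the end of the buffer.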


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 277463 20-Jan-2015 delphij

Fix xz handling for files larger than 32K.

Submitted by: Stefan Ehmann <shoesoft gmx net>
PR: bin/186861
MFC after: 2 weeks


# 245171 08-Jan-2013 obrien

Following r226271, allow disabling lzma support with "WITHOUT_LZMA_SUPPORT".
Correct r226271 which should have used WITHOUT_BZIP2_SUPPORT per r166255.

Obtained from: Juniper Networks


# 226271 11-Oct-2011 gabor

- Use getprogname() instead of __progname
- Allow disabling bzip2 support with WITHOUT_BZIP2
- Fix handling patterns that start with a dot
- Remove superfluous semicolon

Approved by: delphij (mentor)


# 226035 05-Oct-2011 gabor

Update BSD grep to the latest development version. It has some code
backported that was written for the TRE integration project in Google
Summer of Code 2011. This is a temporary solution until the whole
regex library is replaced, so that BSD grep development can continue
and the backported code gets some review and testing. This change only
improves scalability slightly, there is no big performance boost yet
but several minor bugs have been found and fixed.

Approved by: delphij (mentor)
Sponsored by: Google Summer of Code 2011
MFC after: 1 week


# 220422 07-Apr-2011 gabor

- Adjust a comment to actual behaviour
- Makefile nit
- Add more CVS/SVN keywords to make it easier to track changes from NetBSD
in case they add further improvements

Approved by: delphij (mentor)
Obtained from: The NetBSD Project


# 211496 19-Aug-2010 des

UTFize my name.


# 211463 18-Aug-2010 gabor

- Refactor file reading code to use pure syscalls and an internal buffer
instead of stdio. This gives BSD grep a very big performance boost,
its speed is now almost comparable to GNU grep.

Submitted by: Dimitry Andric <dimitry@andric.com>
Approved by: delphij (mentor)
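
The idea of reading through raw syscalls and an internal buffer instead of stdio can be sketched as below. This is a simplified illustration, not the code from r211463: struct reader, reader_init, reader_getc and the RBUFSIZ value are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define	RBUFSIZ	(32 * 1024)	/* arbitrary size chosen for this sketch */

/* Hypothetical reader state; not the layout used in file.c. */
struct reader {
	int		 fd;
	unsigned char	 buf[RBUFSIZ];
	size_t		 pos;	/* next unread byte in buf */
	size_t		 fill;	/* number of valid bytes in buf */
};

static void
reader_init(struct reader *r, int fd)
{
	assert(r != NULL);
	r->fd = fd;
	r->pos = r->fill = 0;
}

/* Return the next byte, or -1 on end of file or read error. */
static int
reader_getc(struct reader *r)
{
	if (r->pos == r->fill) {
		/* Buffer drained: refill it with a single read(2) call. */
		ssize_t n = read(r->fd, r->buf, sizeof(r->buf));

		if (n <= 0)
			return (-1);
		r->pos = 0;
		r->fill = (size_t)n;
	}
	return (r->buf[r->pos++]);
}
```

Each read(2) call fetches a large block, so the per-byte cost is an index increment rather than a stdio call with its own locking and bookkeeping.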


# 211364 15-Aug-2010 gabor

- Revert strlcpy() changes to memcpy() because it's more efficient; the
former may be safer in general, but in this case it doesn't add extra
safety [1]
- Fix -w option [2]
- Fix handling of GREP_OPTIONS [3]
- Fix --line-buffered
- Make stdin input imply --line-buffered so that tail -f can be piped
to grep [4]
- Imply -h if single file is grepped, this is the GNU behaviour
- Reduce locking overhead to gain some more performance [5]
- Inline some functions to help the compiler better optimize the code
- Use shortcut for empty files [6]

PR: bin/149425 [6]
Prodded by: jilles [1]
Reported by: Alex Kozlov <spam@rm-rf.kiev.ua> [2] [3],
swell.k@gmail.com [2],
poyopoyo@puripuri.plala.or.jp [4]
Submitted by: scf [5],
Shuichi KITAGUCHI <ki@hh.iij4u.or.jp> [6]
Approved by: delphij (mentor)


# 210389 22-Jul-2010 gabor

Add BSD grep to the base system and make it our default grep.

Deliverables: Small and clean code (1.4 KSLOC vs GNU's 8.5 KSLOC),
lower memory usage than GNU grep, GNU compatibility,
BSD license.

TODO: Performance is somewhat behind GNU grep but it is only
significant for bigger searches. The reason is complex, the
most important factor is that GNU grep uses lots of
optimizations to improve the speed of the regex library.
First, we need a modern regex library (practically by adopting
TRE), add support for GNU-style non-standard regexes and then
reevaluate the performance issues and look for bottlenecks. In
the meantime, for those, who need better performance, it is
possible to build GNU grep by setting WITH_GNU_GREP.

Approved by: delphij (mentor)
Obtained from: OpenBSD (http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/grep/),
freegrep (http://github.com/howardjp/freegrep)
Sponsored by: Google SoC 2008
Portbuild tests run by: kris, pav, erwin
Acknowledgements to: fjoe (as SoC 2008 mentor),
everyone who helped in reviewing and testing