#
354628 |
|
11-Nov-2019 |
kevans |
MFC bsdgrep(1) fixes: r320414, r328559, r332805-r332806, r332809, r332832, r332850-r332852, r332856, r332858, r332876, r333351, r334803, r334806-r334809, r334821, r334837, r334889, r335188, r351769, r352691
r320414: Expect :mmap_eof_not_eol to fail
It relies on a jemalloc feature (opt.redzone) no longer available after r319971.
r328559: Remove t_grep:mmap_eof_not_eol test
The test was marked as an expected failure in r320414 after r319971's import of a newer jemalloc removed an essential feature (opt.redzone) for reproducing the behavior it was testing. Since then, no way has been found or demonstrated to reliably test the behavior, so remove the test.
r332805: bsdgrep: Split match processing out of procfile
procfile is getting kind of hairy, and it's not going to get better as we correct some more bits that assume we process one line at a time.
r332806: bsdgrep: Clean up procmatches a little bit
r332809: bsdgrep: Add some TODOs for future work on operating on chunks
r332832: bsdgrep: Break procmatches down a little bit more
Split the matching and non-matching cases out into their own functions to reduce future complexity. As the name implies, procmatches will eventually process more than one match itself in the future.
r332850: bsdgrep: Some light cleanup
There's no point checking for a bunch of file modes if we're not a practicing believer of DIR_SKIP or DEV_SKIP.
This also reduces some style violations that were particularly ugly looking when browsing through.
r332851: bsdgrep: More trivial cleanup/style cleanup
We can avoid branching for these easily reduced patterns
r332852: bsdgrep: if chain => switch
This makes some of this a little easier to follow (in my opinion).
r332856: bsdgrep: Fix --include/--exclude ordering issues
Prior to r332851:
* --exclude always wins out over --include
* --exclude-dir always wins out over --include-dir
r332851 broke that behavior, resulting in:
* First of --exclude, --include wins
* First of --exclude-dir, --include-dir wins
As it turns out, both behaviors are wrong by modern grep standards: the latest rule wins. For example:
`grep --exclude foo --include foo 'thing' foo` foo is included
`grep --include foo --exclude foo 'thing' foo` foo is excluded
As tested with GNU grep 3.1.
This commit makes bsdgrep follow this behavior.
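The "latest rule wins" behavior can be sketched roughly as below. This is only an illustration, not the code from this commit: `struct file_rule`, `enum rule_mode`, and `file_selected()` are hypothetical names, and the fallback for a file that matches no rule (searched only if no --include was given) is an assumption of the sketch.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rule record for this sketch; not bsdgrep's own structure. */
enum rule_mode { RULE_INCLUDE, RULE_EXCLUDE };

struct file_rule {
	const char	*pattern;	/* glob from --include/--exclude */
	enum rule_mode	 mode;
};

/*
 * Decide whether fname should be searched.  Rules are stored in command-line
 * order and all of them are consulted, so the last rule that matches wins.
 * If nothing matches, the file is searched only when no --include was given
 * (an assumption of this sketch).
 */
bool
file_selected(const char *fname, const struct file_rule *rules, size_t nrules)
{
	bool have_include = false, decided = false, selected = false;
	size_t i;

	assert(fname != NULL);
	for (i = 0; i < nrules; i++) {
		if (rules[i].mode == RULE_INCLUDE)
			have_include = true;
		if (fnmatch(rules[i].pattern, fname, 0) == 0) {
			/* No early exit: a later rule may override this one. */
			decided = true;
			selected = (rules[i].mode == RULE_INCLUDE);
		}
	}
	return (decided ? selected : !have_include);
}
```

With the rules `{exclude foo, include foo}` the file foo is selected; with the order reversed it is skipped, mirroring the two command lines above.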
r332858: bsdgrep: Use grep_strdup instead of grep_malloc+strcpy
r332876: bsdgrep: Fix build failure WITHOUT_LZMA (incorrect bracket placement)
r333351: bsdgrep: Allow "-" to be passed to -f to mean "standard input"
A version of this patch was originally sent to me by se@, matching behavior from newer versions of GNU grep.
While there have been some differences of opinion on whether stdin should be closed or not after depleting it in the process of handling -f, I've opted to leave stdin open and just let the later matching stuff fail and result in a no-match. I'm not married to the current behavior; it was generally chosen since we are adopting this in particular from GNU grep, and I would like to stay consistent without a strong argument to the contrary. The current behavior isn't technically wrong, it's just fairly unfriendly to the developer-user of grep that may not realize their usage is trivially invalid.
r334803: netbsd-tests: grep(1): Add test for -c flag
Someone might be inclined to accidentally break this. Someone might have written said test because they broke it locally.
r334806: bsdgrep(1): Do some less dirty things with return types
Neither procfile nor grep_tree return anything meaningful to their callers. None of the callers actually care about how many lines were matched in all of the files they processed; it's all about "did anything match?"
This is generally just a light refactoring to remind me of what actually matters as I'm rewriting these bits to care less about 'stuff'.
r334807: bsdgrep(1): whoops, garbage collect the now write-only variable
r334808: bsdgrep(1): Don't initialize fts_flags twice
Admittedly, this is a clang-scan complaint... but it wasn't wrong. fts_flags is initialized by all cases in the switch(), which should be fairly obvious. Annotate this anyways.
r334809: netbsd-tests: bsdgrep(1): Add a test for -m, too
r334821: bsdgrep(1): Slooowly peel away the chunky onion
(or peel off the band-aid, whatever floats your boat)
This addresses two separate issues:
1.) Nothing within bsdgrep actually knew whether it cared about line numbers or not.
2.) The file layer knew nothing about the context in which it was being called.
#1 is only important when we're *not* processing line-by-line. #2 is debatably a good idea; the parsing context is only handy because that's where we store current offset information and, as of this commit, whether or not it needs to be line-aware.
r334837: bsdgrep(1): Evict character sequence that moved in
r334889: bsdgrep(1): Some more int -> bool conversions and name changes
Again motivated by upcoming work to rewrite a bunch of this- single-letter variable names and slightly misleading variable names ("lastmatches" to indicate that the last matched) are not helpful.
r335188: bsdgrep(1): Remove redundant initialization; unconditionally assigned later
r351769: bsdgrep(1): add some basic tests for some GNU Extension support
These will be expanded later as I come up with good test cases; for now, these seem to be enough to trigger bugs in base gnugrep and expose missing features in bsdgrep.
r352691: bsdgrep(1): various fixes of empty pattern/exit code/-c behavior
When an empty pattern is encountered in the pattern list, I had previously broken bsdgrep to count that as a "match all" and ignore any other patterns in the list. This commit rectifies that mistake, among others:
- The -v flag semantics were not quite right; lines matched should have been counted differently based on whether the -v flag was set or not. procline now definitively returns whether it's matched or not, and interpreting that result has been kicked up a level.
- Empty patterns with the -x flag were broken similarly to empty patterns with the -w flag. The former is a whole-line match and should be more strict, only matching blank lines. No -x and no -w will match the empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grep will exit(0) if any file that didn't match was output, so our interpretation was simply backwards. The new interpretation makes sense to me.
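The three rules in this list can be restated as small predicates. This is a hedged sketch only: the function names are hypothetical, the real logic lives in procline and its callers, and the -w case is left out because the message does not spell out its exact rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* A line is selected when the raw match result disagrees with -v. */
bool
line_selected(bool matched, bool vflag)
{
	return (matched != vflag);
}

/*
 * Raw match result for an empty pattern on a line of length len
 * (terminator excluded).  With -x only blank lines match; with neither
 * -x nor -w the empty string matches at the start of every line.
 */
bool
empty_pattern_matches(size_t len, bool xflag)
{
	return (xflag ? len == 0 : true);
}

/* Exit status for -L: 0 if at least one non-matching file was listed. */
int
exit_status_L(size_t files_listed)
{
	return (files_listed > 0 ? 0 : 1);
}
```

Keeping the -v inversion outside the matcher is what lets the matcher return a plain "did it match" answer.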
Tests updated and added to try and catch some of this.
This misbehavior was found by autoconf while fixing ports found in PR 229925 expecting either a more sane or a more GNU-like sed.
|
#
330449 |
|
05-Mar-2018 |
eadler |
MFC r326276:
various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
No functional change intended.
|
#
323443 |
|
11-Sep-2017 |
kevans |
bsdgrep: add a primitive literal matcher to unbreak fgrep in some scenarios
MFC r322825: bsdgrep: add some additional tests for fgrep
Previously added tests only check that fgrep is somewhat sane and works. Add some more tests that check that the implementation is basically functional and not producing incorrect results with various flags.
MFC r322826: bsdgrep: add a primitive literal matcher
fgrep/grep -F will error out at runtime if compiled with a regex(3) that does not define REG_NOSPEC or REG_LITERAL. glibc is one such regex(3) implementation, and as it turns out they don't support literal matching at all.
Provide a primitive literal matcher for use with glibc and other implementations that don't support literal matching so that we don't completely lose fgrep/grep -F if compiled against libgnuregex on stable/10, stable/11, or other systems that we don't necessarily support.
This is a wholly unoptimized implementation with no plans to optimize it as of now. This is due to both its use-case being primarily on unsupported systems in the near-distant future and that it's reinventing the wheel that we already have available as a feature of regex(3).
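For illustration, a primitive literal matcher in the spirit described here might look like the following. It is a sketch under stated assumptions, not the code from r322826: `struct lit_match` and `literal_search()` are hypothetical names, and the search is a deliberately naive byte-by-byte comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match record, loosely modeled on regmatch_t's start/end. */
struct lit_match {
	size_t	start;	/* offset of the first matched byte */
	size_t	end;	/* offset one past the last matched byte */
};

/*
 * Search line[from..len) for the literal bytes pat[0..patlen).  No
 * character has special meaning, so "a.c" only matches "a.c".  Returns
 * true and fills in *m on the first (leftmost) match.
 */
bool
literal_search(const char *line, size_t len, const char *pat, size_t patlen,
    size_t from, struct lit_match *m)
{
	size_t i;

	assert(line != NULL && pat != NULL && m != NULL);
	for (i = from; i + patlen <= len; i++) {
		if (memcmp(line + i, pat, patlen) == 0) {
			m->start = i;
			m->end = i + patlen;
			return (true);
		}
	}
	return (false);
}
```

Passing the previous match's end as `from` lets a caller walk all matches on a line, which is roughly what -o and --color need.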
PR: 222201
Approved by: emaste (mentor, blanket MFC)
|
#
322625 |
|
17-Aug-2017 |
kevans |
bsdgrep: bump version number to 2.6.0 and update copyright information
MFC r319132: bsdgrep: bump version number and add Kyle Evans copyright
The following changes have been made over the last couple of months:
Features:
- With bsdgrep -r, the working directory is implied if no directory is specified
- bsdgrep will now behave as bsdgrep -r does when it's named rgrep
- bsdgrep now understands -z/--null-data to use \0 as EOL
- GNU regex compatibility is now indicated with a "GNU compatible" in the version string
Fixes:
- --mmap no longer hangs when coming across an EOF without an accompanying EOL
- -o/--color matching generally improved, now produces earliest / longest matches
- Context output now more closely aligns with GNU grep
- Zero-length matches no longer exhibit broken behavior
- Every output line now honors -b/-H/-n flags
Tests have been added for previous regressions as well as other previously untested behaviors.
Various other fixes have been committed, and refactoring for further / later improvements has taken place.
(The original submission changed the version string to 2.5.2, but I decided to use 2.6.0 to reflect the addition of new features.)
MFC r320754: Update copyright e-mail address to @FreeBSD.org address
Approved by: emaste (mentor, blanket MFC)
|
#
322622 |
|
17-Aug-2017 |
kevans |
MFC r318914: bsdgrep: correct assumptions to prepare for chunking
Correct a couple of minor BSD grep assumptions that are valid for line processing but not future chunk-based processing.
Approved by: emaste (mentor, blanket MFC)
|
#
322610 |
|
17-Aug-2017 |
kevans |
MFC r318574: bsdgrep: Correct per-line line metadata printing
Metadata printing with -b, -H, or -n flags suffered from a few flaws:
1) -b/offset printing was broken when used in conjunction with -o
2) With -o, bsdgrep did not print metadata for every match/line, just the first match of a line
3) There were no tests for this
Address these issues by outputting this data per-match if the -o flag is specified, and prior to outputting any matches if -o but not --color, since --color alone will not generate a new line of output for every iteration over the matches.
To correct -b output, fudge the line offset as we're printing matches.
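The -b adjustment amounts to adding each match's start position within the line to the line's offset in the file. A minimal sketch, assuming a literal needle and a hypothetical helper named `match_offsets()` (the real code works on regex matches and prints as it goes):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Record the file offset of every occurrence of needle in line, where
 * line_off is the file offset of the line's first byte.  At most max
 * offsets are stored; the return value is the number stored.
 */
size_t
match_offsets(long long line_off, const char *line, const char *needle,
    long long *out, size_t max)
{
	size_t n = 0, nlen;
	const char *p, *hit;

	assert(line != NULL && needle != NULL && out != NULL);
	nlen = strlen(needle);
	if (nlen == 0)
		return (0);	/* zero-length matches are out of scope here */
	for (p = line; n < max && (hit = strstr(p, needle)) != NULL;
	    p = hit + nlen)
		out[n++] = line_off + (long long)(hit - line);
	return (n);
}
```

For a line "foo bar foo" that starts at byte 100 of the file, the two matches of "foo" would be reported at offsets 100 and 108 rather than both at 100.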
While here, make sure we're using grep_printline in -A context. Context printing should *never* look at the parsing context, just the line.
The tests included do not pass with gnugrep in base due to it exhibiting similar quirky behavior that bsdgrep previously exhibited.
Approved by: emaste (mentor, blanket MFC)
|
#
322607 |
|
17-Aug-2017 |
kevans |
bsdgrep: Don't allow negative context flags, add more tests
MFC r318302: bsdgrep: don't allow negative -A / -B / -C
Previously, when given a negative -A/-B/-C argument, bsdgrep would overflow the respective context flag(s) and exhibit surprising behavior.
Fix this by removing unsignedness of Aflag/Bflag and erroring out if we're given a value < 0. Also adjust the type used to track 'tail' context in procfile() so that it accurately reflects the Aflag value rather than overflowing and losing trailing context.
This also fixes an inconsistency previously existing between -n and -C "n" behavior. They are now both limited to LLONG_MAX, to be consistent.
Add some test cases to make sure grep errors out properly for both negative context values as well as non-numeric context values rather than giving bogus matches.
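The validation described above can be sketched with strtoll(3). The helper name `parse_context()` is hypothetical and the actual error reporting in bsdgrep differs; the sketch only shows a signed parse that rejects negative, non-numeric, and out-of-range values instead of letting them wrap.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a -A/-B/-C argument.  Returns true and stores the value in *out
 * on success; returns false for empty, non-numeric, negative, or
 * out-of-range input.  The upper bound is whatever fits in long long,
 * which matches the LLONG_MAX limit mentioned above.
 */
bool
parse_context(const char *arg, long long *out)
{
	char *end;
	long long val;

	assert(arg != NULL && out != NULL);
	errno = 0;
	val = strtoll(arg, &end, 10);
	if (end == arg || *end != '\0')
		return (false);		/* no digits, or trailing junk */
	if (errno == ERANGE)
		return (false);		/* does not fit in long long */
	if (val < 0)
		return (false);		/* negative context makes no sense */
	*out = val;
	return (true);
}
```

Parsing into a signed type first is the point: with an unsigned flag, "-1" would silently become a huge context count.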
MFC r318317: bsdgrep: add more tests for different binary flags
The existing 'binary' test in netbsd-tests/ does a basic check of the default treatment for binary behavior, but not much more than that. Given some opportunity for breakage recently that did not trigger any failures, add some tests to cover the three different binary file behaviors (-a, -I, -U) and their --binary-files= equivalent values.
Approved by: emaste (mentor, blanket MFC)
|
#
322587 |
|
16-Aug-2017 |
kevans |
bsdgrep: fix -w flag matching with an empty pattern
MFC r317703: bsdgrep: fix -w flag matching with an empty pattern
-w flag matching with an empty pattern was generally 'broken', allowing matches to occur on any line whether or not it actually matches -w criteria.
This fix required a good amount of refactoring to address. procline() is altered to *only* process the line and return whether it was a match or not, necessary to be able to short-circuit the whole function in case of this matchall flag. -m flag handling is moved out as well because it suffers from the same fate as context handling if we bypass any actual pattern matching.
The matching context (matches, mostly) didn't previously exist outside of procline(), so we go ahead and create a context object for file processing bits to pass around. grep_printline() was created due to this, for the scenarios where the matches don't actually matter and we just want to print a line or two, a la flushing the context queue and no -o or --color specified.
Damage from this broken behavior would have been mitigated by the fact that it is unlikely users would invoke grep -w with an empty pattern.
This was identified while checking PR 105221 for problems this may cause in BSD grep, but PR 105221 is *not* a report of this behavior.
MFC r317741: bsdgrep: correct uninitialized variable introduced in r317703
MFC r317842: bsdgrep: don't output matches with -c, -l, -L
Refactoring done in r317703 broke -c, -l, and -L flags implying suppression of match printing. Fortunately this is just a matter of not doing any printing of the resulting matches and context printing was not broken in this refactoring.
Add some regression tests since this area may still see further refactoring; include different context flags as well, even though they were not broken in this case.
PR: 219077
Approved by: emaste (mentor, blanket MFC)
|
#
322582 |
|
16-Aug-2017 |
kevans |
MFC r317254: bsdgrep: add BSD_GREP_FASTMATCH knob for built-in fastmatch
Bugs have been found in the fastmatch implementation as used in bsdgrep. Some have been fixed (r316495) while fixes for others are in review (D10098).
In comparison with the fastmatch implementation, Kyle Evans found that:
- regex(3)'s performance with literal expressions offers a speed improvement over fastmatch
- regex(3)'s performance, both with simple BREs and EREs, seems to be comparable
The regex implementation was imported in r226035, and the commit message reports:
This is a temporary solution until the whole regex library is replaced so that BSD grep development can continue and the backported code gets some review and testing. This change only improves scalability slightly, there is no big performance boost yet but several minor bugs have been found and fixed.
Introduce a WITH_/WITHOUT_BSD_GREP_FASTMATCH knob to support testing of both approaches.
Regenerate src.conf(5) as per the original commit
PR: 175314, 194823
Approved by: emaste (mentor, blanket MFC)
|
#
322560 |
|
16-Aug-2017 |
kevans |
bsdgrep: add -z/--null-data support and update NLS catalogs accordingly
MFC r317049: bsdgrep: add -z/--null-data support
-z treats input and output data as sequences of lines terminated by a zero byte instead of a newline. This brings it more in line with GNU grep and brings us closer to passing the current tests with BSD grep.
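The effect of -z on record splitting can be illustrated with a terminator-agnostic scanner. This is a sketch, not bsdgrep's file layer: `next_record()` is a hypothetical helper, and the only assumption is that the terminator byte is '\n' by default and '\0' under -z.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the record that starts at buf[off].  Stores its length (terminator
 * excluded) in *reclen and returns the offset at which the next record
 * starts.  A final record without a terminator runs to the end of the
 * buffer.  eol is '\n' normally and '\0' with -z/--null-data.
 */
size_t
next_record(const char *buf, size_t len, size_t off, int eol, size_t *reclen)
{
	const char *end;

	assert(buf != NULL && reclen != NULL && off <= len);
	end = memchr(buf + off, eol, len - off);
	if (end == NULL) {
		*reclen = len - off;
		return (len);
	}
	*reclen = (size_t)(end - (buf + off));
	return (off + *reclen + 1);
}
```

With '\0' as the terminator, a newline inside a record is just another data byte, which is why a pattern can span what would otherwise be several lines.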
MFC r317679: bsdgrep: correct nls usage data after r317049
r317049 added -z/--null-data to BSD grep but missed the update to nls catalogs.
Approved by: emaste (mentor, blanket MFC)
Relnotes: yes
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
|
#
264744 |
|
21-Apr-2014 |
pfg |
Various style(9) fixes and typos in grep, sort and patch.
MFC after: 3 days
|
#
244493 |
|
20-Dec-2012 |
eadler |
Make bsdgrep behave as gnugrep and as documented: -m should only stop reading the specific file, not any file.
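The fix boils down to keeping the -m counter per file rather than per run. A minimal sketch, assuming hypothetical names (`count_in_file()`, `mflag`, `mlimit`) and a plain substring test in place of real pattern matching:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the lines of one file that contain needle.  The counter is local,
 * so it starts at zero for every file; reaching the -m limit only stops
 * reading this file and does not affect files processed afterwards.
 */
size_t
count_in_file(const char *const *lines, size_t nlines, const char *needle,
    bool mflag, size_t mlimit)
{
	size_t count = 0, i;

	assert(needle != NULL);
	for (i = 0; i < nlines; i++) {
		if (mflag && count >= mlimit)
			break;		/* stop reading this file only */
		if (strstr(lines[i], needle) != NULL)
			count++;
	}
	return (count);
}
```

Calling it once per file with a limit of 2 yields up to two matches from each file, instead of two matches in total.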
Tested by: frogs (irc)
Reviewed by: gabor
Approved by: cperciva (implicit)
MFC after: 1 week
|
#
228319 |
|
07-Dec-2011 |
gabor |
- Match GNU behavior of exit code
- Rename variable that has a different meaning now
PR: bin/162930
Submitted by: Jan Beich <jbeich@tormail.net>
MFC after: 1 week
|
#
226035 |
|
05-Oct-2011 |
gabor |
Update BSD grep to the latest development version. It has some code backported that was written for the TRE integration project in Google Summer of Code 2011. This is a temporary solution until the whole regex library is replaced so that BSD grep development can continue and the backported code gets some review and testing. This change only improves scalability slightly, there is no big performance boost yet but several minor bugs have been found and fixed.
Approved by: delphij (mentor)
Sponsored by: Google Summer of Code 2011
MFC after: 1 week
|
#
220422 |
|
07-Apr-2011 |
gabor |
- Adjust a comment to actual behaviour
- Makefile nit
- Add more CVS/SVN keywords to make it easier to track changes from NetBSD in case they add further improvements
Approved by: delphij (mentor)
Obtained from: The NetBSD Project
|
#
220421 |
|
07-Apr-2011 |
gabor |
- Simplify the fixed string pattern preprocessing code
- Improve readability
Approved by: delphij (mentor)
Obtained from: The NetBSD Project
|
#
211496 |
|
19-Aug-2010 |
des |
UTFize my name.
|
#
211463 |
|
18-Aug-2010 |
gabor |
- Refactor file reading code to use pure syscalls and an internal buffer instead of stdio. This gives BSD grep a very big performance boost, its speed is now almost comparable to GNU grep.
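The general technique of reading with read(2) into an internal buffer and scanning it with memchr(3) can be sketched as below. This is not the code from this commit: `count_lines_fd()` and the buffer size are assumptions made for illustration, and the sketch only counts lines instead of handing them to a matcher.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define	SKETCH_BUFSZ	32768	/* arbitrary buffer size for this sketch */

/*
 * Count the lines readable from fd using read(2) and a local buffer
 * instead of stdio.  A final line without a newline still counts.
 * Returns -1 on a read error.
 */
long long
count_lines_fd(int fd)
{
	char buf[SKETCH_BUFSZ];
	const char *p, *end, *nl;
	long long lines = 0;
	ssize_t n;
	char last = '\n';

	assert(fd >= 0);
	while ((n = read(fd, buf, sizeof(buf))) != 0) {
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		end = buf + n;
		/* Scan the whole chunk for newlines in one pass. */
		for (p = buf; (nl = memchr(p, '\n', (size_t)(end - p))) != NULL;
		    p = nl + 1)
			lines++;
		last = buf[n - 1];
	}
	if (last != '\n')
		lines++;	/* unterminated final line */
	return (lines);
}
```

Scanning a large chunk per system call, rather than fetching one line at a time through stdio, is the kind of change that tends to account for the speedup described here.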
Submitted by: Dimitry Andric <dimitry@andric.com>
Approved by: delphij (mentor)
|
#
211364 |
|
15-Aug-2010 |
gabor |
- Revert strlcpy() changes to memcpy() because it's more efficient; the former may be safer in general, but in this case it doesn't add extra safety [1]
- Fix -w option [2]
- Fix handling of GREP_OPTIONS [3]
- Fix --line-buffered
- Make stdin input imply --line-buffered so that tail -f can be piped to grep [4]
- Imply -h if single file is grepped, this is the GNU behaviour
- Reduce locking overhead to gain some more performance [5]
- Inline some functions to help the compiler better optimize the code
- Use shortcut for empty files [6]
PR: bin/149425 [6]
Prodded by: jilles [1]
Reported by: Alex Kozlov <spam@rm-rf.kiev.ua> [2] [3], swell.k@gmail.com [2], poyopoyo@puripuri.plala.or.jp [4]
Submitted by: scf [5], Shuichi KITAGUCHI <ki@hh.iij4u.or.jp> [6]
Approved by: delphij (mentor)
|
#
210578 |
|
28-Jul-2010 |
gabor |
- Use the traditional behaviour for filename and directory name inclusion and exclusion patterns [1]
- Some improvements on the existing code, like replacing memcpy with strlcpy/strcpy
Approved by: delphij (mentor)
Pointed out by: bf [1], des [1]
|
#
210461 |
|
25-Jul-2010 |
gabor |
- Fix --color behaviour to only output color sequences if stdout is a tty or if forced mode is specified [1]
- While here, add some alternative names for the options and make them case-insensitive
- Fix -q and -l behaviour [2]
- Some small changes to make the code easier to review
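The --color rule can be written as a small decision function. A sketch only: `enum color_mode` and both function names are hypothetical, and the parsing of the option's spellings is left out.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical tri-state for the --color option. */
enum color_mode { COLOR_NEVER, COLOR_AUTO, COLOR_ALWAYS };

/*
 * Emit color sequences only when forced, or when in automatic mode and
 * the output is a terminal.
 */
bool
color_enabled(enum color_mode mode, bool out_is_tty)
{
	switch (mode) {
	case COLOR_ALWAYS:
		return (true);
	case COLOR_AUTO:
		return (out_is_tty);
	case COLOR_NEVER:
	default:
		return (false);
	}
}

/* Convenience wrapper that checks the real standard output. */
bool
stdout_color_enabled(enum color_mode mode)
{
	return (color_enabled(mode, isatty(STDOUT_FILENO) != 0));
}
```

Separating the decision from the isatty(3) call keeps the rule easy to test, since a test harness rarely has a terminal on stdout.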
Submitted by: swell.k@gmail.com [1], dougb [2]
Approved by: delphij (mentor)
|
#
210389 |
|
22-Jul-2010 |
gabor |
Add BSD grep to the base system and make it our default grep.
Deliverables: Small and clean code (1.4 KSLOC vs GNU's 8.5 KSLOC), lower memory usage than GNU grep, GNU compatibility, BSD license.
TODO: Performance is somewhat behind GNU grep but it is only significant for bigger searches. The reason is complex; the most important factor is that GNU grep uses lots of optimizations to improve the speed of the regex library. First, we need a modern regex library (practically by adopting TRE), add support for GNU-style non-standard regexes and then reevaluate the performance issues and look for bottlenecks. In the meantime, for those who need better performance, it is possible to build GNU grep by setting WITH_GNU_GREP.
Approved by: delphij (mentor)
Obtained from: OpenBSD (http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/grep/), freegrep (http://github.com/howardjp/freegrep)
Sponsored by: Google SoC 2008
Portbuild tests run by: kris, pav, erwin
Acknowledgements to: fjoe (as SoC 2008 mentor), everyone who helped in reviewing and testing
|