| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the past, `a.*?c|a.*?b` was factored to `a.*?[bc]`. Thus, given
"abc" as its input string, the automaton would consume "ab" and
then stop (when unanchored) whereas it should consume all of "abc"
as per leftmost semantics.
Fixes #13812.
Change-Id: I67ac0a353d7793b3d0c9c4aaf22d157621dfe784
Reviewed-on: https://go-review.googlesource.com/18357
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's not a group: must handle the inside as a sequence of literal chars,
not a single literal string.
That is, \Qab\E+ is the same as ab+, not (ab)+.
Fixes #11187.
Change-Id: I5406d05ccf7efff3a7f15395bdb0cfb2bd23a8ed
Reviewed-on: https://go-review.googlesource.com/17233
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The one-pass transformation is structured as a search over the input
machine for conditions that violate the one-pass requisites. At each
iteration, we should fully explore all non-input paths that proceed from
the current instruction; this is implemented via recursive check calls.
But when we reach instructions that demand input (InstRune*), these
should be put onto the search queue.
Instead of searching this way, the routine previously (effectively)
proceeded through the machine one instruction at a time until finding an
Inst{Match,Fail,Rune*}, calling check on each instruction. This caused
bug #11905, where the transformation stopped before rewriting all
InstAlts as InstAltMatches.
Further, the check function unnecessarily recurred on InstRune*
instructions. (I believe this helps to mask the above bug.)
This change also deletes some unused functions and duplicate test cases.
Fixes #11905.
Change-Id: I5b0b26efea3d3bd01c7479a518b5ed1b886701cd
Reviewed-on: https://go-review.googlesource.com/17195
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The prefix computation for onepass was incorrectly returning
complete=true when it encountered a beginning-of-text empty width match
(^) in the middle of an expression.
Fix by returning complete only when the prefix is followed by $ and then
an accepting state.
Fixes #11175.
Change-Id: Ie9c4cf5f76c1d2c904a6fb2f016cedb265c19fde
Reviewed-on: https://go-review.googlesource.com/16200
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This helps users who wish to use separate Regexps in each goroutine to
avoid lock contention. Previously they had to parse the expression
multiple times to achieve this.
I used variants of the included benchmark to evaluate this change. I
used the arguments -benchtime 20s -cpu 1,2,4,8,16 on a machine with 16
hardware cores.
Comparing a single shared Regexp vs. copied Regexps, we can see that
lock contention causes huge slowdowns at higher levels of parallelism.
The copied version shows the expected linear speedup.
name old time/op new time/op delta
MatchParallel 366ns ± 0% 370ns ± 0% +1.09% (p=0.000 n=10+8)
MatchParallel-2 324ns ±28% 184ns ± 1% -43.37% (p=0.000 n=10+10)
MatchParallel-4 352ns ± 5% 93ns ± 1% -73.70% (p=0.000 n=9+10)
MatchParallel-8 480ns ± 3% 46ns ± 0% -90.33% (p=0.000 n=9+8)
MatchParallel-16 510ns ± 8% 24ns ± 6% -95.36% (p=0.000 n=10+8)
I also compared a modified version of Regexp that has no mutex and a
single machine (the "RegexpForSingleGoroutine" rsc mentioned in
https://github.com/golang/go/issues/8232#issuecomment-66096128).
In this next test, I compared using N copied Regexps vs. N separate
RegexpForSingleGoroutines. This shows that, even for this relatively
simple regex, avoiding the lock entirely would only buy about 10-12%
further improvement.
name old time/op new time/op delta
MatchParallel 370ns ± 0% 322ns ± 0% -12.97% (p=0.000 n=8+8)
MatchParallel-2 184ns ± 1% 162ns ± 1% -11.60% (p=0.000 n=10+10)
MatchParallel-4 92.7ns ± 1% 81.1ns ± 2% -12.43% (p=0.000 n=10+10)
MatchParallel-8 46.4ns ± 0% 41.8ns ±10% -9.78% (p=0.000 n=8+10)
MatchParallel-16 23.7ns ± 6% 20.6ns ± 1% -13.14% (p=0.000 n=8+8)
Updates #8232.
Change-Id: I15201a080c363d1b44104eafed46d8df5e311902
Reviewed-on: https://go-review.googlesource.com/16110
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
| |
Fixes #12980.
Change-Id: I936db2f57f7c4dc80bb8ec32715c4c6b7bf0d708
Reviewed-on: https://go-review.googlesource.com/16112
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Regular expressions involving a (x){0} term are
simplified by removing this term from the
expression, just before the expression is compiled.
The number of subexpressions is evaluated before
the simplification. The number of capture instructions
in the compiled expressions is not necessarily in line
with the number of subexpressions.
When the ReplaceAll(String) methods are used, a number
of capture slots (nmatch) is evaluated as 2*(s+1)
(s being the number of subexpressions).
In some case, it can be higher than the number of capture
instructions evaluated at compile time, resulting in a
panic when the internal slices of regexp.machine
are resized to this value.
Fixed by capping the number of capture slots to the number
of capture instructions.
I must say I do not really see the benefits of setting
nmatch lower than re.prog.NumCap using this 2*(s+1) formula,
so perhaps this can be further simplified.
Fixes #11178
Fixes #11176
Change-Id: I21415e8ef2dd5f2721218e9a679f7f6bfb76ae9b
Reviewed-on: https://go-review.googlesource.com/14013
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing comment for regex.Split contains a plain text example,
while many of the other regex functions have runnable examples. This
change provides a runnable example for Split.
Change-Id: I5373f57f532fe843d7d0adcf4b513061ec797047
Reviewed-on: https://go-review.googlesource.com/14737
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
|
|
|
|
|
|
| |
s/Backtrace/Backtrack/
Change-Id: I062aab18f23f2bc2110cf7210c2e7264747e02cf
Reviewed-on: https://go-review.googlesource.com/11091
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
|
|
|
|
|
|
| |
In 1.6, go doc is more likely to be available.
Change-Id: I970ad1d3317b35273f5c8d830f75713d3570c473
Reviewed-on: https://go-review.googlesource.com/10518
Reviewed-by: Andrew Gerrand <adg@golang.org>
|
|
|
|
|
|
|
|
|
|
| |
Replaced code.google.com/p/re2/ with github.com/google/re2/ and
updated the file names (re2-exhaustive.txt.bz2 not re2.txt.gz)
as well as the re2 make command (make log).
Change-Id: I15937b0b8a898d78d45366857ed86421c8d69960
Reviewed-on: https://go-review.googlesource.com/9372
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
|
|
|
|
|
|
|
| |
Fixes #10319
Change-Id: I96015b0e1dff30a72de11fea3837638b5c672891
Reviewed-on: https://go-review.googlesource.com/8501
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This update makes maxBacktrackLen return 0 if
len(prog.Inst) > maxBacktrackProg. This prevents an attempt to
backtrack against a nil bitstate.
Fixes #10319
Change-Id: Icdbeb2392782ccf66f9d0a70ea57af22fb93f01b
Reviewed-on: https://go-review.googlesource.com/8473
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
|
|
|
|
|
|
| |
Fixes #10224
Change-Id: I21037379b4667575e51ab0b6b683138c505c3f68
Reviewed-on: https://go-review.googlesource.com/7960
Reviewed-by: David Crawshaw <crawshaw@golang.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a port of RE2's bitstate backtracker, which triggers under
the same conditions that the RE2 backtracker triggers. However I wasn't
sure how to port over some of the optimizations in the RE2 backtracker,
and there is a ~2% penalty on benchmarks that don't trigger the backtracker.
benchmark old ns/op new ns/op delta
BenchmarkLiteral 312 189 -39.42%
BenchmarkNotLiteral 4435 3001 -32.33%
BenchmarkMatchClass 5758 4378 -23.97%
BenchmarkMatchClass_InRange 5385 4084 -24.16%
BenchmarkReplaceAll 5291 3505 -33.76%
BenchmarkAnchoredLiteralShortNonMatch 190 200 +5.26%
BenchmarkAnchoredLiteralLongNonMatch 189 194 +2.65%
BenchmarkAnchoredShortMatch 479 304 -36.53%
BenchmarkAnchoredLongMatch 478 499 +4.39%
BenchmarkOnePassShortA 791 798 +0.88%
BenchmarkNotOnePassShortA 3202 1571 -50.94%
BenchmarkOnePassShortB 614 633 +3.09%
BenchmarkNotOnePassShortB 2685 881 -67.19%
BenchmarkOnePassLongPrefix 152 154 +1.32%
BenchmarkOnePassLongNotPrefix 505 533 +5.54%
BenchmarkMatchEasy0_32 139 171 +23.02%
BenchmarkMatchEasy0_1K 653 1797 +175.19%
BenchmarkMatchEasy0_32K 12032 13346 +10.92%
BenchmarkMatchEasy0_1M 462882 461272 -0.35%
BenchmarkMatchEasy0_32M 15015339 15365238 +2.33%
BenchmarkMatchEasy1_32 122 168 +37.70%
BenchmarkMatchEasy1_1K 3339 2612 -21.77%
BenchmarkMatchEasy1_32K 72330 71721 -0.84%
BenchmarkMatchEasy1_1M 2545410 2652284 +4.20%
BenchmarkMatchEasy1_32M 80072063 82609750 +3.17%
BenchmarkMatchMedium_32 2359 1980 -16.07%
BenchmarkMatchMedium_1K 75939 58593 -22.84%
BenchmarkMatchMedium_32K 2450907 2501106 +2.05%
BenchmarkMatchMedium_1M 78707697 80174418 +1.86%
BenchmarkMatchMedium_32M 2535146010 2570896441 +1.41%
BenchmarkMatchHard_32 4297 2960 -31.11%
BenchmarkMatchHard_1K 133592 88997 -33.38%
BenchmarkMatchHard_32K 4240445 4336907 +2.27%
BenchmarkMatchHard_1M 136187006 139350238 +2.32%
BenchmarkMatchHard_32M 4350855890 4478537306 +2.93%
benchmark old MB/s new MB/s speedup
BenchmarkMatchEasy0_32 228.74 186.11 0.81x
BenchmarkMatchEasy0_1K 1565.91 569.64 0.36x
BenchmarkMatchEasy0_32K 2723.31 2455.10 0.90x
BenchmarkMatchEasy0_1M 2265.32 2273.22 1.00x
BenchmarkMatchEasy0_32M 2234.68 2183.79 0.98x
BenchmarkMatchEasy1_32 261.08 190.22 0.73x
BenchmarkMatchEasy1_1K 306.59 391.91 1.28x
BenchmarkMatchEasy1_32K 453.03 456.88 1.01x
BenchmarkMatchEasy1_1M 411.95 395.35 0.96x
BenchmarkMatchEasy1_32M 419.05 406.18 0.97x
BenchmarkMatchMedium_32 13.56 16.16 1.19x
BenchmarkMatchMedium_1K 13.48 17.48 1.30x
BenchmarkMatchMedium_32K 13.37 13.10 0.98x
BenchmarkMatchMedium_1M 13.32 13.08 0.98x
BenchmarkMatchMedium_32M 13.24 13.05 0.99x
BenchmarkMatchHard_32 7.45 10.81 1.45x
BenchmarkMatchHard_1K 7.67 11.51 1.50x
BenchmarkMatchHard_32K 7.73 7.56 0.98x
BenchmarkMatchHard_1M 7.70 7.52 0.98x
BenchmarkMatchHard_32M 7.71 7.49 0.97x
Fixes #4154
Change-Id: Iff7fb9507f0872b320d08afc08679751ed1b28bc
Reviewed-on: https://go-review.googlesource.com/2153
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
|
|
|
|
| |
Comment changes only.
Change-Id: I56848814564c4aa0988b451df18bebdfc88d6d94
Reviewed-on: https://go-review.googlesource.com/7721
Reviewed-by: Rob Pike <r@golang.org>
|
|
|
|
|
|
| |
Change-Id: I06035d949272157bbb7255563b37ac93cbf07f15
Reviewed-on: https://go-review.googlesource.com/3272
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
|
|
|
|
| |
Change-Id: Idff57050a34d09e7fa9b77e9b53d61bb5ea2a71c
Reviewed-on: https://go-review.googlesource.com/2095
Reviewed-by: Minux Ma <minux@golang.org>
|
|
|
|
|
|
|
| |
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/171560043
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The stack blowout can no longer happen,
but we can still test that too-complex regexps
are rejected.
Replacement for CL 162770043.
LGTM=iant, r
R=r, iant
CC=bradfitz, golang-codereviews
https://golang.org/cl/162860043
|
|
|
|
|
|
|
|
|
|
| |
This is already tested by TestRE2Exhaustive, but the build has
not broken because that test is not run when using -test.short.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/155580043
|
|
|
|
|
|
|
|
|
| |
Fixes #8959.
LGTM=adg
R=golang-codereviews, adg
CC=golang-codereviews
https://golang.org/cl/161790043
|
|
|
|
|
|
|
|
|
|
|
| |
Generated using re2/doc/mksyntaxgo.
Fixes #8505.
LGTM=iant
R=r, iant
CC=golang-codereviews
https://golang.org/cl/155890043
|
|
|
|
|
|
|
|
|
| |
Fixes #7609.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/150270043
|
|
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
|