<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/go-git.git/src/strings, branch master</title>
<subtitle>github.com: golang/go
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/'/>
<entry>
<title>strings: correct NewReader documentation</title>
<updated>2023-05-12T17:40:51+00:00</updated>
<author>
<name>Jabar Asadi</name>
<email>jasadi@d2iq.com</email>
</author>
<published>2023-05-10T21:44:14+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=9eceffdf12dc4497ee162c005d5e14bb509797b9'/>
<id>9eceffdf12dc4497ee162c005d5e14bb509797b9</id>
<content type='text'>
The documentation for `NewReader` says that the underlying string is read-only, but the following example shows that this is not the case.

rd := strings.NewReader("this is a text")

rd.Reset("new text") &lt;--- underlying string gets updated here

Change-Id: I95c7099c2e63670c84307d4317b702bf13a4025a
GitHub-Last-Rev: a16a60b0f1e25d19e05e664c5b41ca57c4fcd9b2
GitHub-Pull-Request: golang/go#60074
Reviewed-on: https://go-review.googlesource.com/c/go/+/493817
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The documentation for `NewReader` says that the underlying string is read-only, but the following example shows that this is not the case.

rd := strings.NewReader("this is a text")

rd.Reset("new text") &lt;--- underlying string gets updated here

Change-Id: I95c7099c2e63670c84307d4317b702bf13a4025a
GitHub-Last-Rev: a16a60b0f1e25d19e05e664c5b41ca57c4fcd9b2
GitHub-Pull-Request: golang/go#60074
Reviewed-on: https://go-review.googlesource.com/c/go/+/493817
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bytes, strings: avoid unnecessary zero initialization</title>
<updated>2023-02-27T19:11:00+00:00</updated>
<author>
<name>Joe Tsai</name>
<email>joetsai@digital-static.net</email>
</author>
<published>2022-12-08T11:51:04+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=132fae93b789ce512068ff4300c665b40635b74e'/>
<id>132fae93b789ce512068ff4300c665b40635b74e</id>
<content type='text'>
Add bytealg.MakeNoZero that specially allocates a []byte
without zeroing it. It assumes the caller will populate every byte.
From within the bytes and strings packages, we can use
bytealg.MakeNoZero in a way where our logic ensures that
the entire slice is overwritten such that uninitialized bytes
are never leaked to the end user.

We use bytealg.MakeNoZero from within the following functions:

* bytes.Join
* bytes.Repeat
* bytes.ToUpper
* bytes.ToLower
* strings.Builder.Grow

The optimization in strings.Builder transitively benefits the following:

* strings.Join
* strings.Map
* strings.Repeat
* strings.ToUpper
* strings.ToLower
* strings.ToValidUTF8
* strings.Replace
* any user logic that depends on strings.Builder
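
bytealg.MakeNoZero lives in an internal package and cannot be called from user code, so the following is only a sketch of the kind of user logic that may benefit through strings.Builder.Grow. The helper name joinWords is hypothetical. It reserves the full output size once and then writes every reserved byte, which is the usage pattern described above.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords is a hypothetical helper that concatenates words with sep,
// reserving the full output size up front with Builder.Grow.
func joinWords(words []string, sep string) string {
	if len(words) == 0 {
		return ""
	}
	n := len(sep) * (len(words) - 1)
	for _, w := range words {
		n += len(w)
	}
	var b strings.Builder
	b.Grow(n) // reserve capacity once; the writes below fill all n bytes
	for i, w := range words {
		if i != 0 {
			b.WriteString(sep)
		}
		b.WriteString(w)
	}
	return b.String()
}

func main() {
	fmt.Println(joinWords([]string{"a", "b", "c"}, ", "))
}
```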

This optimization is especially notable on large buffers that
do not fit in the CPU cache, where the costs of
runtime.memclr and runtime.memmove are non-trivial since they are
both limited by the relatively slow speed of physical RAM.

Performance:

	RepeatLarge/256/1             66.0ns ± 3%    64.5ns ± 1%      ~     (p=0.095 n=5+5)
	RepeatLarge/256/16            55.4ns ± 5%    53.1ns ± 3%    -4.17%  (p=0.016 n=5+5)
	RepeatLarge/512/1             95.5ns ± 7%    87.1ns ± 2%    -8.78%  (p=0.008 n=5+5)
	RepeatLarge/512/16            84.4ns ± 9%    76.2ns ± 5%    -9.73%  (p=0.016 n=5+5)
	RepeatLarge/1024/1             161ns ± 4%     144ns ± 7%   -10.45%  (p=0.016 n=5+5)
	RepeatLarge/1024/16            148ns ± 3%     141ns ± 5%      ~     (p=0.095 n=5+5)
	RepeatLarge/2048/1             296ns ± 7%     288ns ± 5%      ~     (p=0.841 n=5+5)
	RepeatLarge/2048/16            298ns ± 8%     281ns ± 5%      ~     (p=0.151 n=5+5)
	RepeatLarge/4096/1             593ns ± 8%     539ns ± 8%    -8.99%  (p=0.032 n=5+5)
	RepeatLarge/4096/16            568ns ±12%     526ns ± 7%      ~     (p=0.056 n=5+5)
	RepeatLarge/8192/1            1.15µs ± 8%    1.08µs ±12%      ~     (p=0.095 n=5+5)
	RepeatLarge/8192/16           1.12µs ± 4%    1.07µs ± 7%      ~     (p=0.310 n=5+5)
	RepeatLarge/8192/4097         1.77ns ± 1%    1.76ns ± 2%      ~     (p=0.310 n=5+5)
	RepeatLarge/16384/1           2.06µs ± 7%    1.94µs ± 5%      ~     (p=0.222 n=5+5)
	RepeatLarge/16384/16          2.02µs ± 4%    1.92µs ± 6%      ~     (p=0.095 n=5+5)
	RepeatLarge/16384/4097        1.50µs ±15%    1.44µs ±11%      ~     (p=0.802 n=5+5)
	RepeatLarge/32768/1           3.90µs ± 8%    3.65µs ±11%      ~     (p=0.151 n=5+5)
	RepeatLarge/32768/16          3.92µs ±14%    3.68µs ±12%      ~     (p=0.222 n=5+5)
	RepeatLarge/32768/4097        3.71µs ± 5%    3.43µs ± 4%    -7.54%  (p=0.032 n=5+5)
	RepeatLarge/65536/1           7.47µs ± 8%    6.88µs ± 9%      ~     (p=0.056 n=5+5)
	RepeatLarge/65536/16          7.29µs ± 4%    6.74µs ± 6%    -7.60%  (p=0.016 n=5+5)
	RepeatLarge/65536/4097        7.90µs ±11%    6.34µs ± 5%   -19.81%  (p=0.008 n=5+5)
	RepeatLarge/131072/1          17.0µs ±18%    14.1µs ± 6%   -17.32%  (p=0.008 n=5+5)
	RepeatLarge/131072/16         15.2µs ± 2%    16.2µs ±17%      ~     (p=0.151 n=5+5)
	RepeatLarge/131072/4097       15.7µs ± 6%    14.8µs ±11%      ~     (p=0.095 n=5+5)
	RepeatLarge/262144/1          30.4µs ± 5%    31.4µs ±13%      ~     (p=0.548 n=5+5)
	RepeatLarge/262144/16         30.1µs ± 4%    30.7µs ±11%      ~     (p=1.000 n=5+5)
	RepeatLarge/262144/4097       31.2µs ± 7%    32.7µs ±13%      ~     (p=0.310 n=5+5)
	RepeatLarge/524288/1          67.5µs ± 9%    63.7µs ± 3%      ~     (p=0.095 n=5+5)
	RepeatLarge/524288/16         67.2µs ± 5%    62.9µs ± 6%      ~     (p=0.151 n=5+5)
	RepeatLarge/524288/4097       65.5µs ± 4%    65.2µs ±18%      ~     (p=0.548 n=5+5)
	RepeatLarge/1048576/1          141µs ± 6%     137µs ±14%      ~     (p=0.421 n=5+5)
	RepeatLarge/1048576/16         140µs ± 2%     134µs ±11%      ~     (p=0.222 n=5+5)
	RepeatLarge/1048576/4097       141µs ± 3%     134µs ±10%      ~     (p=0.151 n=5+5)
	RepeatLarge/2097152/1          258µs ± 2%     271µs ±10%      ~     (p=0.222 n=5+5)
	RepeatLarge/2097152/16         263µs ± 6%     273µs ± 9%      ~     (p=0.151 n=5+5)
	RepeatLarge/2097152/4097       270µs ± 2%     277µs ± 6%      ~     (p=0.690 n=5+5)
	RepeatLarge/4194304/1          684µs ± 3%     467µs ± 6%   -31.69%  (p=0.008 n=5+5)
	RepeatLarge/4194304/16         682µs ± 1%     471µs ± 7%   -30.91%  (p=0.008 n=5+5)
	RepeatLarge/4194304/4097       685µs ± 2%     465µs ±20%   -32.12%  (p=0.008 n=5+5)
	RepeatLarge/8388608/1         1.50ms ± 1%    1.16ms ± 8%   -22.63%  (p=0.008 n=5+5)
	RepeatLarge/8388608/16        1.50ms ± 2%    1.22ms ±17%   -18.49%  (p=0.008 n=5+5)
	RepeatLarge/8388608/4097      1.51ms ± 7%    1.33ms ±11%   -11.56%  (p=0.008 n=5+5)
	RepeatLarge/16777216/1        3.48ms ± 4%    2.66ms ±13%   -23.76%  (p=0.008 n=5+5)
	RepeatLarge/16777216/16       3.37ms ± 3%    2.57ms ±13%   -23.72%  (p=0.008 n=5+5)
	RepeatLarge/16777216/4097     3.38ms ± 9%    2.50ms ±11%   -26.16%  (p=0.008 n=5+5)
	RepeatLarge/33554432/1        7.74ms ± 1%    4.70ms ±19%   -39.31%  (p=0.016 n=4+5)
	RepeatLarge/33554432/16       7.90ms ± 4%    4.78ms ± 9%   -39.50%  (p=0.008 n=5+5)
	RepeatLarge/33554432/4097     7.80ms ± 2%    4.86ms ±11%   -37.60%  (p=0.008 n=5+5)
	RepeatLarge/67108864/1        16.4ms ± 3%     9.7ms ±15%   -41.29%  (p=0.008 n=5+5)
	RepeatLarge/67108864/16       16.5ms ± 1%     9.9ms ±15%   -39.83%  (p=0.008 n=5+5)
	RepeatLarge/67108864/4097     16.5ms ± 1%    11.0ms ±18%   -32.95%  (p=0.008 n=5+5)
	RepeatLarge/134217728/1       35.2ms ±12%    19.2ms ±10%   -45.58%  (p=0.008 n=5+5)
	RepeatLarge/134217728/16      34.6ms ± 6%    19.3ms ± 7%   -44.07%  (p=0.008 n=5+5)
	RepeatLarge/134217728/4097    33.2ms ± 2%    19.3ms ±14%   -41.79%  (p=0.008 n=5+5)
	RepeatLarge/268435456/1       70.9ms ± 2%    36.2ms ± 5%   -48.87%  (p=0.008 n=5+5)
	RepeatLarge/268435456/16      77.4ms ± 7%    36.1ms ± 8%   -53.33%  (p=0.008 n=5+5)
	RepeatLarge/268435456/4097    75.8ms ± 4%    37.0ms ± 4%   -51.15%  (p=0.008 n=5+5)
	RepeatLarge/536870912/1        163ms ±14%      77ms ± 9%   -52.94%  (p=0.008 n=5+5)
	RepeatLarge/536870912/16       156ms ± 4%      76ms ± 6%   -51.42%  (p=0.008 n=5+5)
	RepeatLarge/536870912/4097     151ms ± 2%      76ms ± 6%   -49.64%  (p=0.008 n=5+5)
	RepeatLarge/1073741824/1       293ms ± 5%     149ms ± 8%   -49.18%  (p=0.008 n=5+5)
	RepeatLarge/1073741824/16      308ms ± 9%     150ms ± 8%   -51.19%  (p=0.008 n=5+5)
	RepeatLarge/1073741824/4097    299ms ± 5%     151ms ± 6%   -49.51%  (p=0.008 n=5+5)

Updates #57153

Change-Id: I024553b7e676d6da6408278109ac1fa8def0a802
Reviewed-on: https://go-review.googlesource.com/c/go/+/456336
Reviewed-by: Dmitri Shuralyov &lt;dmitshur@google.com&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Joseph Tsai &lt;joetsai@digital-static.net&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Reviewed-by: Daniel Martí &lt;mvdan@mvdan.cc&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add bytealg.MakeNoZero that specially allocates a []byte
without zeroing it. It assumes the caller will populate every byte.
From within the bytes and strings packages, we can use
bytealg.MakeNoZero in a way where our logic ensures that
the entire slice is overwritten such that uninitialized bytes
are never leaked to the end user.

We use bytealg.MakeNoZero from within the following functions:

* bytes.Join
* bytes.Repeat
* bytes.ToUpper
* bytes.ToLower
* strings.Builder.Grow

The optimization in strings.Builder transitively benefits the following:

* strings.Join
* strings.Map
* strings.Repeat
* strings.ToUpper
* strings.ToLower
* strings.ToValidUTF8
* strings.Replace
* any user logic that depends on strings.Builder

This optimization is especially notable on large buffers that
do not fit in the CPU cache, where the costs of
runtime.memclr and runtime.memmove are non-trivial since they are
both limited by the relatively slow speed of physical RAM.

Performance:

	RepeatLarge/256/1             66.0ns ± 3%    64.5ns ± 1%      ~     (p=0.095 n=5+5)
	RepeatLarge/256/16            55.4ns ± 5%    53.1ns ± 3%    -4.17%  (p=0.016 n=5+5)
	RepeatLarge/512/1             95.5ns ± 7%    87.1ns ± 2%    -8.78%  (p=0.008 n=5+5)
	RepeatLarge/512/16            84.4ns ± 9%    76.2ns ± 5%    -9.73%  (p=0.016 n=5+5)
	RepeatLarge/1024/1             161ns ± 4%     144ns ± 7%   -10.45%  (p=0.016 n=5+5)
	RepeatLarge/1024/16            148ns ± 3%     141ns ± 5%      ~     (p=0.095 n=5+5)
	RepeatLarge/2048/1             296ns ± 7%     288ns ± 5%      ~     (p=0.841 n=5+5)
	RepeatLarge/2048/16            298ns ± 8%     281ns ± 5%      ~     (p=0.151 n=5+5)
	RepeatLarge/4096/1             593ns ± 8%     539ns ± 8%    -8.99%  (p=0.032 n=5+5)
	RepeatLarge/4096/16            568ns ±12%     526ns ± 7%      ~     (p=0.056 n=5+5)
	RepeatLarge/8192/1            1.15µs ± 8%    1.08µs ±12%      ~     (p=0.095 n=5+5)
	RepeatLarge/8192/16           1.12µs ± 4%    1.07µs ± 7%      ~     (p=0.310 n=5+5)
	RepeatLarge/8192/4097         1.77ns ± 1%    1.76ns ± 2%      ~     (p=0.310 n=5+5)
	RepeatLarge/16384/1           2.06µs ± 7%    1.94µs ± 5%      ~     (p=0.222 n=5+5)
	RepeatLarge/16384/16          2.02µs ± 4%    1.92µs ± 6%      ~     (p=0.095 n=5+5)
	RepeatLarge/16384/4097        1.50µs ±15%    1.44µs ±11%      ~     (p=0.802 n=5+5)
	RepeatLarge/32768/1           3.90µs ± 8%    3.65µs ±11%      ~     (p=0.151 n=5+5)
	RepeatLarge/32768/16          3.92µs ±14%    3.68µs ±12%      ~     (p=0.222 n=5+5)
	RepeatLarge/32768/4097        3.71µs ± 5%    3.43µs ± 4%    -7.54%  (p=0.032 n=5+5)
	RepeatLarge/65536/1           7.47µs ± 8%    6.88µs ± 9%      ~     (p=0.056 n=5+5)
	RepeatLarge/65536/16          7.29µs ± 4%    6.74µs ± 6%    -7.60%  (p=0.016 n=5+5)
	RepeatLarge/65536/4097        7.90µs ±11%    6.34µs ± 5%   -19.81%  (p=0.008 n=5+5)
	RepeatLarge/131072/1          17.0µs ±18%    14.1µs ± 6%   -17.32%  (p=0.008 n=5+5)
	RepeatLarge/131072/16         15.2µs ± 2%    16.2µs ±17%      ~     (p=0.151 n=5+5)
	RepeatLarge/131072/4097       15.7µs ± 6%    14.8µs ±11%      ~     (p=0.095 n=5+5)
	RepeatLarge/262144/1          30.4µs ± 5%    31.4µs ±13%      ~     (p=0.548 n=5+5)
	RepeatLarge/262144/16         30.1µs ± 4%    30.7µs ±11%      ~     (p=1.000 n=5+5)
	RepeatLarge/262144/4097       31.2µs ± 7%    32.7µs ±13%      ~     (p=0.310 n=5+5)
	RepeatLarge/524288/1          67.5µs ± 9%    63.7µs ± 3%      ~     (p=0.095 n=5+5)
	RepeatLarge/524288/16         67.2µs ± 5%    62.9µs ± 6%      ~     (p=0.151 n=5+5)
	RepeatLarge/524288/4097       65.5µs ± 4%    65.2µs ±18%      ~     (p=0.548 n=5+5)
	RepeatLarge/1048576/1          141µs ± 6%     137µs ±14%      ~     (p=0.421 n=5+5)
	RepeatLarge/1048576/16         140µs ± 2%     134µs ±11%      ~     (p=0.222 n=5+5)
	RepeatLarge/1048576/4097       141µs ± 3%     134µs ±10%      ~     (p=0.151 n=5+5)
	RepeatLarge/2097152/1          258µs ± 2%     271µs ±10%      ~     (p=0.222 n=5+5)
	RepeatLarge/2097152/16         263µs ± 6%     273µs ± 9%      ~     (p=0.151 n=5+5)
	RepeatLarge/2097152/4097       270µs ± 2%     277µs ± 6%      ~     (p=0.690 n=5+5)
	RepeatLarge/4194304/1          684µs ± 3%     467µs ± 6%   -31.69%  (p=0.008 n=5+5)
	RepeatLarge/4194304/16         682µs ± 1%     471µs ± 7%   -30.91%  (p=0.008 n=5+5)
	RepeatLarge/4194304/4097       685µs ± 2%     465µs ±20%   -32.12%  (p=0.008 n=5+5)
	RepeatLarge/8388608/1         1.50ms ± 1%    1.16ms ± 8%   -22.63%  (p=0.008 n=5+5)
	RepeatLarge/8388608/16        1.50ms ± 2%    1.22ms ±17%   -18.49%  (p=0.008 n=5+5)
	RepeatLarge/8388608/4097      1.51ms ± 7%    1.33ms ±11%   -11.56%  (p=0.008 n=5+5)
	RepeatLarge/16777216/1        3.48ms ± 4%    2.66ms ±13%   -23.76%  (p=0.008 n=5+5)
	RepeatLarge/16777216/16       3.37ms ± 3%    2.57ms ±13%   -23.72%  (p=0.008 n=5+5)
	RepeatLarge/16777216/4097     3.38ms ± 9%    2.50ms ±11%   -26.16%  (p=0.008 n=5+5)
	RepeatLarge/33554432/1        7.74ms ± 1%    4.70ms ±19%   -39.31%  (p=0.016 n=4+5)
	RepeatLarge/33554432/16       7.90ms ± 4%    4.78ms ± 9%   -39.50%  (p=0.008 n=5+5)
	RepeatLarge/33554432/4097     7.80ms ± 2%    4.86ms ±11%   -37.60%  (p=0.008 n=5+5)
	RepeatLarge/67108864/1        16.4ms ± 3%     9.7ms ±15%   -41.29%  (p=0.008 n=5+5)
	RepeatLarge/67108864/16       16.5ms ± 1%     9.9ms ±15%   -39.83%  (p=0.008 n=5+5)
	RepeatLarge/67108864/4097     16.5ms ± 1%    11.0ms ±18%   -32.95%  (p=0.008 n=5+5)
	RepeatLarge/134217728/1       35.2ms ±12%    19.2ms ±10%   -45.58%  (p=0.008 n=5+5)
	RepeatLarge/134217728/16      34.6ms ± 6%    19.3ms ± 7%   -44.07%  (p=0.008 n=5+5)
	RepeatLarge/134217728/4097    33.2ms ± 2%    19.3ms ±14%   -41.79%  (p=0.008 n=5+5)
	RepeatLarge/268435456/1       70.9ms ± 2%    36.2ms ± 5%   -48.87%  (p=0.008 n=5+5)
	RepeatLarge/268435456/16      77.4ms ± 7%    36.1ms ± 8%   -53.33%  (p=0.008 n=5+5)
	RepeatLarge/268435456/4097    75.8ms ± 4%    37.0ms ± 4%   -51.15%  (p=0.008 n=5+5)
	RepeatLarge/536870912/1        163ms ±14%      77ms ± 9%   -52.94%  (p=0.008 n=5+5)
	RepeatLarge/536870912/16       156ms ± 4%      76ms ± 6%   -51.42%  (p=0.008 n=5+5)
	RepeatLarge/536870912/4097     151ms ± 2%      76ms ± 6%   -49.64%  (p=0.008 n=5+5)
	RepeatLarge/1073741824/1       293ms ± 5%     149ms ± 8%   -49.18%  (p=0.008 n=5+5)
	RepeatLarge/1073741824/16      308ms ± 9%     150ms ± 8%   -51.19%  (p=0.008 n=5+5)
	RepeatLarge/1073741824/4097    299ms ± 5%     151ms ± 6%   -49.51%  (p=0.008 n=5+5)

Updates #57153

Change-Id: I024553b7e676d6da6408278109ac1fa8def0a802
Reviewed-on: https://go-review.googlesource.com/c/go/+/456336
Reviewed-by: Dmitri Shuralyov &lt;dmitshur@google.com&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Joseph Tsai &lt;joetsai@digital-static.net&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Reviewed-by: Daniel Martí &lt;mvdan@mvdan.cc&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bytes, strings: add ContainsFunc</title>
<updated>2023-01-24T22:06:45+00:00</updated>
<author>
<name>hopehook</name>
<email>hopehook@qq.com</email>
</author>
<published>2023-01-03T08:23:16+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=0b3f58c48e3298e49e27f80dc748f0652339d63e'/>
<id>0b3f58c48e3298e49e27f80dc748f0652339d63e</id>
<content type='text'>
Fixes #54386.
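
A minimal usage sketch of the strings variant, assuming a Go release that includes the function (Go 1.21 or later). The helper name hasDigit is hypothetical; strings.ContainsFunc reports whether any code point in the string satisfies the predicate.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hasDigit is a hypothetical helper: it reports whether s contains at
// least one Unicode digit.
func hasDigit(s string) bool {
	return strings.ContainsFunc(s, unicode.IsDigit)
}

func main() {
	fmt.Println(hasDigit("go1.21"))
	fmt.Println(hasDigit("gopher"))
}
```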

Change-Id: I78747da337ed6129e4f7426dd0483a644bed82e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/460216
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Reviewed-by: Matthew Dempsky &lt;mdempsky@google.com&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: hopehook &lt;hopehook@golangcn.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes #54386.

Change-Id: I78747da337ed6129e4f7426dd0483a644bed82e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/460216
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Reviewed-by: Matthew Dempsky &lt;mdempsky@google.com&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: hopehook &lt;hopehook@golangcn.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bytes,strings: add some examples</title>
<updated>2023-01-20T23:21:39+00:00</updated>
<author>
<name>fangguizhen</name>
<email>1297394526@qq.com</email>
</author>
<published>2023-01-20T09:43:40+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=85b49d7f21dfbee9946bece01a168de239094716'/>
<id>85b49d7f21dfbee9946bece01a168de239094716</id>
<content type='text'>
Change-Id: Ic93ad59119f3549c0f13c4f366f71e9d01b88c47
GitHub-Last-Rev: afb518047288976f440d3fe0d65923c1905a9b26
GitHub-Pull-Request: golang/go#57907
Reviewed-on: https://go-review.googlesource.com/c/go/+/462283
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ic93ad59119f3549c0f13c4f366f71e9d01b88c47
GitHub-Last-Rev: afb518047288976f440d3fe0d65923c1905a9b26
GitHub-Pull-Request: golang/go#57907
Reviewed-on: https://go-review.googlesource.com/c/go/+/462283
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bytes, strings: rename field in CutSuffix tests</title>
<updated>2023-01-20T01:25:45+00:00</updated>
<author>
<name>fangguizhen</name>
<email>1297394526@qq.com</email>
</author>
<published>2023-01-19T03:12:12+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=e590afcf2c2d046b1a4b6a11986a8e38a2b93ed7'/>
<id>e590afcf2c2d046b1a4b6a11986a8e38a2b93ed7</id>
<content type='text'>
Change-Id: I63181f6540fc1bfcfc988a16bf9fafbd3575cfdf
GitHub-Last-Rev: d90528730a92a087866c1bfc227a0a0bf1cdffbe
GitHub-Pull-Request: golang/go#57909
Reviewed-on: https://go-review.googlesource.com/c/go/+/462284
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I63181f6540fc1bfcfc988a16bf9fafbd3575cfdf
GitHub-Last-Rev: d90528730a92a087866c1bfc227a0a0bf1cdffbe
GitHub-Pull-Request: golang/go#57909
Reviewed-on: https://go-review.googlesource.com/c/go/+/462284
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>strings: remove redundant symbols</title>
<updated>2023-01-17T17:24:17+00:00</updated>
<author>
<name>fangguizhen</name>
<email>1297394526@qq.com</email>
</author>
<published>2023-01-17T16:37:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=8e199294361de59b637f25a7d5eebdee1b131415'/>
<id>8e199294361de59b637f25a7d5eebdee1b131415</id>
<content type='text'>
Change-Id: Ie3fe0274288d6cb6303acdcec1340c480e5c0b20
GitHub-Last-Rev: ce9d44619e970b1319fbccf3aace1ddf719bcec1
GitHub-Pull-Request: golang/go#57848
Reviewed-on: https://go-review.googlesource.com/c/go/+/462277
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Auto-Submit: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ie3fe0274288d6cb6303acdcec1340c480e5c0b20
GitHub-Last-Rev: ce9d44619e970b1319fbccf3aace1ddf719bcec1
GitHub-Pull-Request: golang/go#57848
Reviewed-on: https://go-review.googlesource.com/c/go/+/462277
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Auto-Submit: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>internal/bytealg: fix bug in index function for ppc64le/power9</title>
<updated>2022-10-31T12:52:07+00:00</updated>
<author>
<name>Archana R</name>
<email>aravind5@in.ibm.com</email>
</author>
<published>2022-10-28T07:42:39+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=6774ddfec758ecf2cc64d58392c438dd64660a00'/>
<id>6774ddfec758ecf2cc64d58392c438dd64660a00</id>
<content type='text'>
The index function was not handling certain corner cases where there
were two more bytes to be examined in the tail end of the string to
complete the comparison. Fix code to ensure that when the string has
to be shifted two more times the correct bytes are examined.
Also hoist vsplat to V10 so that all paths use the correct value.
Correct some comments that had incorrect register names.
Add the strings that were failing to the strings tests for verification.
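
One way to check this kind of tail-end bug is to compare strings.Index against a simple reference search. The sketch below is illustrative only: naiveIndex and agrees are hypothetical helpers, and the inputs are made up rather than taken from the strings added to the tests.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveIndex is a reference implementation: it returns the first byte
// offset at which sep occurs in s, or -1 if it does not occur.
func naiveIndex(s, sep string) int {
	for i := 0; len(s) >= i+len(sep); i++ {
		if s[i:i+len(sep)] == sep {
			return i
		}
	}
	return -1
}

// agrees reports whether strings.Index matches the reference for one case.
func agrees(s, sep string) bool {
	return strings.Index(s, sep) == naiveIndex(s, sep)
}

func main() {
	// Illustrative inputs only; not the strings from the actual fix.
	cases := [][2]string{
		{"abcdefghijklmnopqrstuvwxyz", "xyz"},
		{"aaaaaaaaaaaaaaaaab", "ab"},
		{"short", "longer than s"},
	}
	for _, c := range cases {
		fmt.Println(c[0], c[1], agrees(c[0], c[1]))
	}
}
```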

Fixes #56457

Change-Id: Idba7cbc802e3d73c8f4fe89309871cc8447792f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/446135
Reviewed-by: Bryan Mills &lt;bcmills@google.com&gt;
Reviewed-by: Heschi Kreinick &lt;heschi@google.com&gt;
Reviewed-by: Lynn Boger &lt;laboger@linux.vnet.ibm.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Run-TryBot: Archana Ravindar &lt;ravindararchana@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The index function was not handling certain corner cases where there
were two more bytes to be examined in the tail end of the string to
complete the comparison. Fix code to ensure that when the string has
to be shifted two more times the correct bytes are examined.
Also hoist vsplat to V10 so that all paths use the correct value.
Correct some comments that had incorrect register names.
Add the strings that were failing to the strings tests for verification.

Fixes #56457

Change-Id: Idba7cbc802e3d73c8f4fe89309871cc8447792f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/446135
Reviewed-by: Bryan Mills &lt;bcmills@google.com&gt;
Reviewed-by: Heschi Kreinick &lt;heschi@google.com&gt;
Reviewed-by: Lynn Boger &lt;laboger@linux.vnet.ibm.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Run-TryBot: Archana Ravindar &lt;ravindararchana@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bytes,strings: optimize Repeat</title>
<updated>2022-09-27T16:55:15+00:00</updated>
<author>
<name>Carlo Alberto Ferraris</name>
<email>cafxx@strayorange.com</email>
</author>
<published>2022-07-22T05:18:33+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=dcb90152a444be97fcc45bb12d176641c1b0d90e'/>
<id>dcb90152a444be97fcc45bb12d176641c1b0d90e</id>
<content type='text'>
When generating long strings or slices with Repeat we
currently reuse intermediate states as a way to quickly
build exponentially longer results.

This works well as long as the intermediate states fit into
the processor D-cache. If they don't we start thrashing the
D-cache by reading in the whole intermediate state over and
over on each iteration.

Instead, once we reach a large enough intermediate state (that
allows the memcpy operation to perform at peak) we cap the
size of the chunk of the state that is used as the source for subsequent
appends. This ensures that this smaller source chunk is always
present in the D-cache, and the append operation does not need
to read the state contents from memory.

Currently the cap is set to 8KB, a number derived via
experimentation to yield the highest performance across
a large range of result sizes. Slightly higher caps also
produced similar results: 8KB was chosen as the smallest one
in this performance plateau with the intention to minimize
D-cache pollution.
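
The approach described above can be sketched roughly as follows. This is a simplified illustration, not the code from this change: the names repeatCapped and chunkLimit are hypothetical, count is assumed to be non-negative, and overflow of len(b) * count is not checked.

```go
package main

import "fmt"

// chunkLimit is the assumed cap (8KB) on the source chunk used for copies.
const chunkLimit = 8 * 1024

// repeatCapped returns count copies of b. It doubles the filled prefix
// until the prefix exceeds the cap, then keeps copying from a capped
// chunk at the start of the output so the source stays small.
// It assumes count is non-negative and that len(b) * count does not overflow.
func repeatCapped(b []byte, count int) []byte {
	n := len(b) * count
	if n == 0 {
		return []byte{}
	}
	chunkMax := n
	if n > chunkLimit {
		// Round the cap down to a whole number of copies of b.
		chunkMax = chunkLimit / len(b) * len(b)
		if chunkMax == 0 {
			chunkMax = len(b)
		}
	}
	out := make([]byte, n)
	filled := copy(out, b)
	for n > filled {
		chunk := filled
		if chunk > chunkMax {
			chunk = chunkMax
		}
		// copy stops at the end of out, so the last step may be partial.
		filled += copy(out[filled:], out[:chunk])
	}
	return out
}

func main() {
	fmt.Println(string(repeatCapped([]byte("ab"), 3)))
}
```

Because the filled prefix and the capped chunk are always whole multiples of len(b), each copy starts on a pattern boundary and the output stays correctly aligned.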

For result sizes larger than the fastest cache levels we get
significantly higher performance compared to the current
implementation:

strings:
name                            old speed      new speed      delta
RepeatLarge/256/1-16            1.73GB/s ± 1%  1.73GB/s ± 0%      ~     (p=0.556 n=5+4)
RepeatLarge/256/16-16           2.02GB/s ± 0%  1.95GB/s ± 8%      ~     (p=0.222 n=5+5)
RepeatLarge/512/1-16            2.30GB/s ±13%  2.47GB/s ± 1%      ~     (p=0.548 n=5+5)
RepeatLarge/512/16-16           2.38GB/s ±16%  2.77GB/s ± 1%   +16.27%  (p=0.032 n=5+5)
RepeatLarge/1024/1-16           3.17GB/s ± 1%  3.18GB/s ± 0%      ~     (p=0.730 n=4+5)
RepeatLarge/1024/16-16          3.39GB/s ± 2%  3.38GB/s ± 1%      ~     (p=0.548 n=5+5)
RepeatLarge/2048/1-16           3.32GB/s ± 2%  3.32GB/s ± 2%      ~     (p=1.000 n=5+5)
RepeatLarge/2048/16-16          3.41GB/s ± 4%  3.46GB/s ± 2%      ~     (p=0.310 n=5+5)
RepeatLarge/4096/1-16           3.60GB/s ± 4%  3.67GB/s ± 3%      ~     (p=0.690 n=5+5)
RepeatLarge/4096/16-16          3.74GB/s ± 3%  3.71GB/s ± 5%      ~     (p=0.690 n=5+5)
RepeatLarge/8192/1-16           3.94GB/s ± 4%  4.01GB/s ± 1%      ~     (p=0.222 n=5+5)
RepeatLarge/8192/16-16          3.94GB/s ± 6%  4.05GB/s ± 1%      ~     (p=0.222 n=5+5)
RepeatLarge/8192/4097-16        4.25GB/s ± 6%  4.32GB/s ± 3%      ~     (p=0.690 n=5+5)
RepeatLarge/16384/1-16          4.96GB/s ± 1%  5.02GB/s ± 2%      ~     (p=0.421 n=5+5)
RepeatLarge/16384/16-16         4.99GB/s ± 2%  5.07GB/s ± 1%      ~     (p=0.421 n=5+5)
RepeatLarge/16384/4097-16       5.15GB/s ± 3%  5.17GB/s ± 1%      ~     (p=1.000 n=5+5)
RepeatLarge/32768/1-16          5.44GB/s ± 2%  5.42GB/s ± 1%      ~     (p=0.841 n=5+5)
RepeatLarge/32768/16-16         5.46GB/s ± 4%  5.44GB/s ± 1%      ~     (p=0.905 n=5+4)
RepeatLarge/32768/4097-16       4.84GB/s ± 2%  4.59GB/s ±12%    -5.05%  (p=0.032 n=5+5)
RepeatLarge/65536/1-16          5.85GB/s ± 0%  5.84GB/s ± 1%      ~     (p=0.690 n=5+5)
RepeatLarge/65536/16-16         5.81GB/s ± 2%  5.84GB/s ± 2%      ~     (p=0.421 n=5+5)
RepeatLarge/65536/4097-16       5.38GB/s ± 6%  5.45GB/s ± 1%      ~     (p=1.000 n=5+5)
RepeatLarge/131072/1-16         6.20GB/s ± 1%  6.31GB/s ± 1%    +1.80%  (p=0.008 n=5+5)
RepeatLarge/131072/16-16        6.12GB/s ± 3%  6.25GB/s ± 3%      ~     (p=0.095 n=5+5)
RepeatLarge/131072/4097-16      5.95GB/s ± 1%  5.85GB/s ±10%      ~     (p=1.000 n=5+5)
RepeatLarge/262144/1-16         6.33GB/s ± 1%  6.56GB/s ± 0%    +3.62%  (p=0.016 n=5+4)
RepeatLarge/262144/16-16        6.42GB/s ± 0%  6.65GB/s ± 1%    +3.58%  (p=0.016 n=4+5)
RepeatLarge/262144/4097-16      6.31GB/s ± 1%  6.44GB/s ± 1%    +1.94%  (p=0.008 n=5+5)
RepeatLarge/524288/1-16         6.23GB/s ± 1%  6.92GB/s ± 3%   +11.02%  (p=0.008 n=5+5)
RepeatLarge/524288/16-16        6.24GB/s ± 1%  6.97GB/s ± 2%   +11.77%  (p=0.016 n=4+5)
RepeatLarge/524288/4097-16      6.14GB/s ± 2%  6.73GB/s ± 3%    +9.50%  (p=0.008 n=5+5)
RepeatLarge/1048576/1-16        5.23GB/s ± 1%  6.53GB/s ± 6%   +24.85%  (p=0.008 n=5+5)
RepeatLarge/1048576/16-16       5.21GB/s ± 1%  6.56GB/s ± 4%   +25.93%  (p=0.008 n=5+5)
RepeatLarge/1048576/4097-16     5.22GB/s ± 1%  6.26GB/s ± 2%   +20.09%  (p=0.008 n=5+5)
RepeatLarge/2097152/1-16        3.95GB/s ± 1%  5.96GB/s ± 1%   +51.01%  (p=0.008 n=5+5)
RepeatLarge/2097152/16-16       3.94GB/s ± 1%  5.98GB/s ± 2%   +51.99%  (p=0.008 n=5+5)
RepeatLarge/2097152/4097-16     4.94GB/s ± 1%  5.71GB/s ± 2%   +15.63%  (p=0.008 n=5+5)
RepeatLarge/4194304/1-16        3.10GB/s ± 1%  5.89GB/s ± 1%   +89.90%  (p=0.008 n=5+5)
RepeatLarge/4194304/16-16       3.09GB/s ± 1%  5.86GB/s ± 1%   +89.89%  (p=0.008 n=5+5)
RepeatLarge/4194304/4097-16     3.13GB/s ± 1%  5.89GB/s ± 1%   +88.36%  (p=0.008 n=5+5)
RepeatLarge/8388608/1-16        3.06GB/s ± 1%  6.31GB/s ±16%  +105.84%  (p=0.008 n=5+5)
RepeatLarge/8388608/16-16       3.08GB/s ± 1%  6.62GB/s ± 1%  +114.66%  (p=0.008 n=5+5)
RepeatLarge/8388608/4097-16     3.13GB/s ± 2%  6.87GB/s ± 1%  +119.62%  (p=0.008 n=5+5)
RepeatLarge/16777216/1-16       3.21GB/s ± 3%  5.88GB/s ± 1%   +83.27%  (p=0.008 n=5+5)
RepeatLarge/16777216/16-16      3.23GB/s ± 2%  5.84GB/s ± 2%   +80.49%  (p=0.008 n=5+5)
RepeatLarge/16777216/4097-16    3.30GB/s ± 6%  5.88GB/s ± 2%   +78.18%  (p=0.008 n=5+5)
RepeatLarge/33554432/1-16       3.71GB/s ± 3%  5.91GB/s ± 2%   +59.17%  (p=0.008 n=5+5)
RepeatLarge/33554432/16-16      3.67GB/s ± 3%  5.91GB/s ± 2%   +61.13%  (p=0.008 n=5+5)
RepeatLarge/33554432/4097-16    3.71GB/s ± 1%  5.77GB/s ± 6%   +55.51%  (p=0.008 n=5+5)
RepeatLarge/67108864/1-16       4.61GB/s ±11%  6.00GB/s ± 5%   +30.15%  (p=0.008 n=5+5)
RepeatLarge/67108864/16-16      4.62GB/s ± 7%  6.11GB/s ± 2%   +32.35%  (p=0.008 n=5+5)
RepeatLarge/67108864/4097-16    4.71GB/s ± 2%  6.24GB/s ± 2%   +32.60%  (p=0.008 n=5+5)
RepeatLarge/134217728/1-16      4.53GB/s ± 8%  6.28GB/s ±11%   +38.57%  (p=0.008 n=5+5)
RepeatLarge/134217728/16-16     4.78GB/s ± 3%  6.36GB/s ± 3%   +33.16%  (p=0.008 n=5+5)
RepeatLarge/134217728/4097-16   4.73GB/s ± 6%  6.46GB/s ± 3%   +36.63%  (p=0.008 n=5+5)
RepeatLarge/268435456/1-16      4.09GB/s ±25%  6.37GB/s ±19%   +56.00%  (p=0.008 n=5+5)
RepeatLarge/268435456/16-16     4.50GB/s ± 4%  6.86GB/s ± 0%   +52.49%  (p=0.016 n=5+4)
RepeatLarge/268435456/4097-16   4.73GB/s ± 5%  6.90GB/s ± 0%   +45.94%  (p=0.008 n=5+5)
RepeatLarge/536870912/1-16      4.38GB/s ±36%  6.52GB/s ± 8%   +48.68%  (p=0.008 n=5+5)
RepeatLarge/536870912/16-16     4.69GB/s ±12%  6.90GB/s ± 1%   +46.97%  (p=0.008 n=5+5)
RepeatLarge/536870912/4097-16   4.87GB/s ± 8%  6.98GB/s ± 0%   +43.36%  (p=0.008 n=5+5)
RepeatLarge/1073741824/1-16     3.87GB/s ±28%  6.96GB/s ± 1%   +79.94%  (p=0.016 n=5+4)
RepeatLarge/1073741824/16-16    4.79GB/s ± 9%  6.93GB/s ± 0%   +44.79%  (p=0.008 n=5+5)
RepeatLarge/1073741824/4097-16  4.65GB/s ± 8%  7.02GB/s ± 1%   +51.02%  (p=0.008 n=5+5)

bytes:
name                            old speed      new speed      delta
RepeatLarge/256/1-16            1.93GB/s ± 1%  1.84GB/s ± 1%    -4.81%  (p=0.000 n=10+10)
RepeatLarge/256/16-16           2.25GB/s ± 2%  2.15GB/s ± 1%    -4.45%  (p=0.000 n=9+8)
RepeatLarge/512/1-16            2.71GB/s ± 1%  2.62GB/s ± 1%    -3.27%  (p=0.000 n=10+9)
RepeatLarge/512/16-16           2.96GB/s ± 4%  2.91GB/s ± 1%      ~     (p=0.243 n=9+10)
RepeatLarge/1024/1-16           3.35GB/s ± 1%  3.27GB/s ± 1%    -2.61%  (p=0.000 n=9+10)
RepeatLarge/1024/16-16          3.56GB/s ± 2%  3.52GB/s ± 1%    -1.10%  (p=0.010 n=10+9)
RepeatLarge/2048/1-16           3.52GB/s ± 1%  3.45GB/s ± 1%    -1.92%  (p=0.000 n=10+10)
RepeatLarge/2048/16-16          3.61GB/s ± 1%  3.58GB/s ± 0%    -0.82%  (p=0.008 n=9+8)
RepeatLarge/4096/1-16           3.85GB/s ± 2%  3.80GB/s ± 2%      ~     (p=0.165 n=10+10)
RepeatLarge/4096/16-16          3.88GB/s ± 3%  3.84GB/s ± 4%      ~     (p=0.393 n=10+10)
RepeatLarge/8192/1-16           4.12GB/s ± 2%  4.04GB/s ± 1%    -1.96%  (p=0.000 n=10+10)
RepeatLarge/8192/16-16          4.11GB/s ± 2%  4.09GB/s ± 1%      ~     (p=0.278 n=9+10)
RepeatLarge/8192/4097-16        4.38GB/s ± 1%  4.39GB/s ± 4%      ~     (p=0.720 n=9+10)
RepeatLarge/16384/1-16          5.06GB/s ± 2%  4.95GB/s ± 3%    -2.29%  (p=0.001 n=10+9)
RepeatLarge/16384/16-16         5.11GB/s ± 3%  5.06GB/s ± 3%      ~     (p=0.315 n=10+9)
RepeatLarge/16384/4097-16       5.22GB/s ± 3%  5.26GB/s ± 3%      ~     (p=0.211 n=9+10)
RepeatLarge/32768/1-16          5.54GB/s ± 2%  5.50GB/s ± 3%      ~     (p=0.353 n=10+10)
RepeatLarge/32768/16-16         5.55GB/s ± 1%  5.60GB/s ± 1%    +0.91%  (p=0.035 n=10+9)
RepeatLarge/32768/4097-16       4.88GB/s ± 2%  4.85GB/s ± 2%      ~     (p=0.447 n=10+9)
RepeatLarge/65536/1-16          5.86GB/s ± 1%  5.93GB/s ± 2%    +1.18%  (p=0.043 n=8+10)
RepeatLarge/65536/16-16         5.83GB/s ± 2%  5.98GB/s ± 1%    +2.67%  (p=0.000 n=10+10)
RepeatLarge/65536/4097-16       5.57GB/s ± 0%  5.56GB/s ± 3%      ~     (p=0.696 n=8+10)
RepeatLarge/131072/1-16         6.23GB/s ± 1%  6.38GB/s ± 2%    +2.51%  (p=0.000 n=9+10)
RepeatLarge/131072/16-16        6.21GB/s ± 2%  6.37GB/s ± 1%    +2.72%  (p=0.000 n=9+10)
RepeatLarge/131072/4097-16      6.04GB/s ± 1%  6.09GB/s ± 3%      ~     (p=0.356 n=9+10)
RepeatLarge/262144/1-16         6.47GB/s ± 1%  6.63GB/s ± 2%    +2.57%  (p=0.003 n=10+10)
RepeatLarge/262144/16-16        6.45GB/s ± 2%  6.69GB/s ± 2%    +3.65%  (p=0.000 n=10+10)
RepeatLarge/262144/4097-16      6.35GB/s ± 1%  6.51GB/s ± 2%    +2.48%  (p=0.000 n=9+10)
RepeatLarge/524288/1-16         6.21GB/s ± 2%  6.95GB/s ± 1%   +11.95%  (p=0.000 n=10+10)
RepeatLarge/524288/16-16        6.24GB/s ± 2%  6.93GB/s ± 2%   +11.11%  (p=0.000 n=10+10)
RepeatLarge/524288/4097-16      6.18GB/s ± 2%  6.82GB/s ± 1%   +10.39%  (p=0.000 n=9+10)
RepeatLarge/1048576/1-16        5.34GB/s ± 2%  6.41GB/s ± 2%   +20.05%  (p=0.000 n=10+10)
RepeatLarge/1048576/16-16       5.33GB/s ± 1%  6.45GB/s ± 2%   +20.84%  (p=0.000 n=10+9)
RepeatLarge/1048576/4097-16     5.28GB/s ± 1%  6.17GB/s ± 2%   +16.75%  (p=0.000 n=10+10)
RepeatLarge/2097152/1-16        4.04GB/s ± 1%  6.21GB/s ± 1%   +53.89%  (p=0.000 n=9+8)
RepeatLarge/2097152/16-16       4.02GB/s ± 1%  6.20GB/s ± 2%   +54.37%  (p=0.000 n=10+9)
RepeatLarge/2097152/4097-16     4.94GB/s ± 1%  6.04GB/s ± 1%   +22.36%  (p=0.000 n=10+10)
RepeatLarge/4194304/1-16        3.10GB/s ± 1%  5.74GB/s ± 0%   +85.04%  (p=0.000 n=10+9)
RepeatLarge/4194304/16-16       3.10GB/s ± 2%  5.72GB/s ± 1%   +84.26%  (p=0.000 n=9+10)
RepeatLarge/4194304/4097-16     3.03GB/s ± 4%  5.61GB/s ± 1%   +85.06%  (p=0.000 n=10+9)
RepeatLarge/8388608/1-16        3.08GB/s ± 2%  6.25GB/s ± 1%  +103.09%  (p=0.000 n=9+9)
RepeatLarge/8388608/16-16       3.07GB/s ± 2%  6.26GB/s ± 3%  +104.07%  (p=0.000 n=10+9)
RepeatLarge/8388608/4097-16     3.08GB/s ± 2%  6.23GB/s ± 2%  +102.09%  (p=0.000 n=9+10)
RepeatLarge/16777216/1-16       3.25GB/s ± 2%  5.78GB/s ± 3%   +78.03%  (p=0.000 n=9+9)
RepeatLarge/16777216/16-16      3.25GB/s ± 1%  5.75GB/s ± 1%   +77.21%  (p=0.000 n=9+10)
RepeatLarge/16777216/4097-16    3.29GB/s ± 3%  5.72GB/s ± 2%   +73.74%  (p=0.000 n=10+10)
RepeatLarge/33554432/1-16       3.68GB/s ± 2%  5.90GB/s ± 1%   +60.20%  (p=0.000 n=10+10)
RepeatLarge/33554432/16-16      3.69GB/s ± 3%  5.88GB/s ± 1%   +59.54%  (p=0.000 n=10+9)
RepeatLarge/33554432/4097-16    3.74GB/s ± 1%  5.94GB/s ± 2%   +58.68%  (p=0.000 n=7+10)
RepeatLarge/67108864/1-16       4.62GB/s ±12%  6.11GB/s ± 3%   +32.23%  (p=0.000 n=10+9)
RepeatLarge/67108864/16-16      4.77GB/s ± 2%  6.09GB/s ± 2%   +27.88%  (p=0.000 n=9+9)
RepeatLarge/67108864/4097-16    4.78GB/s ± 1%  6.19GB/s ± 1%   +29.51%  (p=0.000 n=9+10)
RepeatLarge/134217728/1-16      4.60GB/s ±16%  6.52GB/s ± 9%   +41.67%  (p=0.000 n=10+10)
RepeatLarge/134217728/16-16     4.80GB/s ± 4%  6.81GB/s ± 2%   +41.82%  (p=0.000 n=10+9)
RepeatLarge/134217728/4097-16   4.79GB/s ± 4%  6.81GB/s ± 2%   +42.31%  (p=0.000 n=9+10)
RepeatLarge/268435456/1-16      4.43GB/s ±25%  6.27GB/s ±14%   +41.52%  (p=0.000 n=10+10)
RepeatLarge/268435456/16-16     4.75GB/s ± 4%  6.68GB/s ± 4%   +40.50%  (p=0.000 n=9+10)
RepeatLarge/268435456/4097-16   4.75GB/s ± 3%  6.58GB/s ± 4%   +38.68%  (p=0.000 n=9+10)
RepeatLarge/536870912/1-16      4.96GB/s ± 9%  6.39GB/s ±16%   +28.90%  (p=0.000 n=8+10)
RepeatLarge/536870912/16-16     4.66GB/s ± 6%  6.57GB/s ± 7%   +40.82%  (p=0.000 n=10+9)
RepeatLarge/536870912/4097-16   4.68GB/s ±11%  6.88GB/s ± 3%   +47.01%  (p=0.000 n=10+9)
RepeatLarge/1073741824/1-16     4.39GB/s ±23%  6.57GB/s ± 5%   +49.75%  (p=0.000 n=10+8)
RepeatLarge/1073741824/16-16    4.73GB/s ±13%  6.89GB/s ± 1%   +45.68%  (p=0.000 n=9+8)
RepeatLarge/1073741824/4097-16  4.97GB/s ±15%  6.73GB/s ± 9%   +35.45%  (p=0.000 n=10+10)

The results above come from an Intel i9-9980HK (256KB L2) with
TurboBoost disabled.

Change-Id: I79dd57da0429aee9020ffd7bc458a034b999b740
Reviewed-on: https://go-review.googlesource.com/c/go/+/419054
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Dmitri Shuralyov &lt;dmitshur@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When generating long strings or slices with Repeat we
currently reuse intermediate states as a way to quickly
build exponentially longer results.

This works well as long as the intermediate states fit into
the processor D-cache. If they don't, we start thrashing the
D-cache by reading in the whole intermediate state over and
over on each iteration.

Instead, once we reach a large enough intermediate state (one that
allows the memcpy operation to perform at peak) we cap the size
of the chunk of the state that is used as the source for subsequent
appends. This ensures that this smaller source chunk is always
present in the D-cache, and the append operation does not need
to read the state contents from memory.

Currently the cap is set to 8KB, a number derived via
experimentation to yield the highest performance across
a large range of result sizes. Slightly higher caps also
produced similar results: 8KB was chosen as the smallest one
in this performance plateau with the intention to minimize
D-cache pollution.

For result sizes larger than the fastest cache levels we get
significantly higher performance compared to the current
implementation:
strings:
name                            old speed      new speed      delta
RepeatLarge/256/1-16            1.73GB/s ± 1%  1.73GB/s ± 0%      ~     (p=0.556 n=5+4)
RepeatLarge/256/16-16           2.02GB/s ± 0%  1.95GB/s ± 8%      ~     (p=0.222 n=5+5)
RepeatLarge/512/1-16            2.30GB/s ±13%  2.47GB/s ± 1%      ~     (p=0.548 n=5+5)
RepeatLarge/512/16-16           2.38GB/s ±16%  2.77GB/s ± 1%   +16.27%  (p=0.032 n=5+5)
RepeatLarge/1024/1-16           3.17GB/s ± 1%  3.18GB/s ± 0%      ~     (p=0.730 n=4+5)
RepeatLarge/1024/16-16          3.39GB/s ± 2%  3.38GB/s ± 1%      ~     (p=0.548 n=5+5)
RepeatLarge/2048/1-16           3.32GB/s ± 2%  3.32GB/s ± 2%      ~     (p=1.000 n=5+5)
RepeatLarge/2048/16-16          3.41GB/s ± 4%  3.46GB/s ± 2%      ~     (p=0.310 n=5+5)
RepeatLarge/4096/1-16           3.60GB/s ± 4%  3.67GB/s ± 3%      ~     (p=0.690 n=5+5)
RepeatLarge/4096/16-16          3.74GB/s ± 3%  3.71GB/s ± 5%      ~     (p=0.690 n=5+5)
RepeatLarge/8192/1-16           3.94GB/s ± 4%  4.01GB/s ± 1%      ~     (p=0.222 n=5+5)
RepeatLarge/8192/16-16          3.94GB/s ± 6%  4.05GB/s ± 1%      ~     (p=0.222 n=5+5)
RepeatLarge/8192/4097-16        4.25GB/s ± 6%  4.32GB/s ± 3%      ~     (p=0.690 n=5+5)
RepeatLarge/16384/1-16          4.96GB/s ± 1%  5.02GB/s ± 2%      ~     (p=0.421 n=5+5)
RepeatLarge/16384/16-16         4.99GB/s ± 2%  5.07GB/s ± 1%      ~     (p=0.421 n=5+5)
RepeatLarge/16384/4097-16       5.15GB/s ± 3%  5.17GB/s ± 1%      ~     (p=1.000 n=5+5)
RepeatLarge/32768/1-16          5.44GB/s ± 2%  5.42GB/s ± 1%      ~     (p=0.841 n=5+5)
RepeatLarge/32768/16-16         5.46GB/s ± 4%  5.44GB/s ± 1%      ~     (p=0.905 n=5+4)
RepeatLarge/32768/4097-16       4.84GB/s ± 2%  4.59GB/s ±12%    -5.05%  (p=0.032 n=5+5)
RepeatLarge/65536/1-16          5.85GB/s ± 0%  5.84GB/s ± 1%      ~     (p=0.690 n=5+5)
RepeatLarge/65536/16-16         5.81GB/s ± 2%  5.84GB/s ± 2%      ~     (p=0.421 n=5+5)
RepeatLarge/65536/4097-16       5.38GB/s ± 6%  5.45GB/s ± 1%      ~     (p=1.000 n=5+5)
RepeatLarge/131072/1-16         6.20GB/s ± 1%  6.31GB/s ± 1%    +1.80%  (p=0.008 n=5+5)
RepeatLarge/131072/16-16        6.12GB/s ± 3%  6.25GB/s ± 3%      ~     (p=0.095 n=5+5)
RepeatLarge/131072/4097-16      5.95GB/s ± 1%  5.85GB/s ±10%      ~     (p=1.000 n=5+5)
RepeatLarge/262144/1-16         6.33GB/s ± 1%  6.56GB/s ± 0%    +3.62%  (p=0.016 n=5+4)
RepeatLarge/262144/16-16        6.42GB/s ± 0%  6.65GB/s ± 1%    +3.58%  (p=0.016 n=4+5)
RepeatLarge/262144/4097-16      6.31GB/s ± 1%  6.44GB/s ± 1%    +1.94%  (p=0.008 n=5+5)
RepeatLarge/524288/1-16         6.23GB/s ± 1%  6.92GB/s ± 3%   +11.02%  (p=0.008 n=5+5)
RepeatLarge/524288/16-16        6.24GB/s ± 1%  6.97GB/s ± 2%   +11.77%  (p=0.016 n=4+5)
RepeatLarge/524288/4097-16      6.14GB/s ± 2%  6.73GB/s ± 3%    +9.50%  (p=0.008 n=5+5)
RepeatLarge/1048576/1-16        5.23GB/s ± 1%  6.53GB/s ± 6%   +24.85%  (p=0.008 n=5+5)
RepeatLarge/1048576/16-16       5.21GB/s ± 1%  6.56GB/s ± 4%   +25.93%  (p=0.008 n=5+5)
RepeatLarge/1048576/4097-16     5.22GB/s ± 1%  6.26GB/s ± 2%   +20.09%  (p=0.008 n=5+5)
RepeatLarge/2097152/1-16        3.95GB/s ± 1%  5.96GB/s ± 1%   +51.01%  (p=0.008 n=5+5)
RepeatLarge/2097152/16-16       3.94GB/s ± 1%  5.98GB/s ± 2%   +51.99%  (p=0.008 n=5+5)
RepeatLarge/2097152/4097-16     4.94GB/s ± 1%  5.71GB/s ± 2%   +15.63%  (p=0.008 n=5+5)
RepeatLarge/4194304/1-16        3.10GB/s ± 1%  5.89GB/s ± 1%   +89.90%  (p=0.008 n=5+5)
RepeatLarge/4194304/16-16       3.09GB/s ± 1%  5.86GB/s ± 1%   +89.89%  (p=0.008 n=5+5)
RepeatLarge/4194304/4097-16     3.13GB/s ± 1%  5.89GB/s ± 1%   +88.36%  (p=0.008 n=5+5)
RepeatLarge/8388608/1-16        3.06GB/s ± 1%  6.31GB/s ±16%  +105.84%  (p=0.008 n=5+5)
RepeatLarge/8388608/16-16       3.08GB/s ± 1%  6.62GB/s ± 1%  +114.66%  (p=0.008 n=5+5)
RepeatLarge/8388608/4097-16     3.13GB/s ± 2%  6.87GB/s ± 1%  +119.62%  (p=0.008 n=5+5)
RepeatLarge/16777216/1-16       3.21GB/s ± 3%  5.88GB/s ± 1%   +83.27%  (p=0.008 n=5+5)
RepeatLarge/16777216/16-16      3.23GB/s ± 2%  5.84GB/s ± 2%   +80.49%  (p=0.008 n=5+5)
RepeatLarge/16777216/4097-16    3.30GB/s ± 6%  5.88GB/s ± 2%   +78.18%  (p=0.008 n=5+5)
RepeatLarge/33554432/1-16       3.71GB/s ± 3%  5.91GB/s ± 2%   +59.17%  (p=0.008 n=5+5)
RepeatLarge/33554432/16-16      3.67GB/s ± 3%  5.91GB/s ± 2%   +61.13%  (p=0.008 n=5+5)
RepeatLarge/33554432/4097-16    3.71GB/s ± 1%  5.77GB/s ± 6%   +55.51%  (p=0.008 n=5+5)
RepeatLarge/67108864/1-16       4.61GB/s ±11%  6.00GB/s ± 5%   +30.15%  (p=0.008 n=5+5)
RepeatLarge/67108864/16-16      4.62GB/s ± 7%  6.11GB/s ± 2%   +32.35%  (p=0.008 n=5+5)
RepeatLarge/67108864/4097-16    4.71GB/s ± 2%  6.24GB/s ± 2%   +32.60%  (p=0.008 n=5+5)
RepeatLarge/134217728/1-16      4.53GB/s ± 8%  6.28GB/s ±11%   +38.57%  (p=0.008 n=5+5)
RepeatLarge/134217728/16-16     4.78GB/s ± 3%  6.36GB/s ± 3%   +33.16%  (p=0.008 n=5+5)
RepeatLarge/134217728/4097-16   4.73GB/s ± 6%  6.46GB/s ± 3%   +36.63%  (p=0.008 n=5+5)
RepeatLarge/268435456/1-16      4.09GB/s ±25%  6.37GB/s ±19%   +56.00%  (p=0.008 n=5+5)
RepeatLarge/268435456/16-16     4.50GB/s ± 4%  6.86GB/s ± 0%   +52.49%  (p=0.016 n=5+4)
RepeatLarge/268435456/4097-16   4.73GB/s ± 5%  6.90GB/s ± 0%   +45.94%  (p=0.008 n=5+5)
RepeatLarge/536870912/1-16      4.38GB/s ±36%  6.52GB/s ± 8%   +48.68%  (p=0.008 n=5+5)
RepeatLarge/536870912/16-16     4.69GB/s ±12%  6.90GB/s ± 1%   +46.97%  (p=0.008 n=5+5)
RepeatLarge/536870912/4097-16   4.87GB/s ± 8%  6.98GB/s ± 0%   +43.36%  (p=0.008 n=5+5)
RepeatLarge/1073741824/1-16     3.87GB/s ±28%  6.96GB/s ± 1%   +79.94%  (p=0.016 n=5+4)
RepeatLarge/1073741824/16-16    4.79GB/s ± 9%  6.93GB/s ± 0%   +44.79%  (p=0.008 n=5+5)
RepeatLarge/1073741824/4097-16  4.65GB/s ± 8%  7.02GB/s ± 1%   +51.02%  (p=0.008 n=5+5)

bytes:
name                            old speed      new speed      delta
RepeatLarge/256/1-16            1.93GB/s ± 1%  1.84GB/s ± 1%    -4.81%  (p=0.000 n=10+10)
RepeatLarge/256/16-16           2.25GB/s ± 2%  2.15GB/s ± 1%    -4.45%  (p=0.000 n=9+8)
RepeatLarge/512/1-16            2.71GB/s ± 1%  2.62GB/s ± 1%    -3.27%  (p=0.000 n=10+9)
RepeatLarge/512/16-16           2.96GB/s ± 4%  2.91GB/s ± 1%      ~     (p=0.243 n=9+10)
RepeatLarge/1024/1-16           3.35GB/s ± 1%  3.27GB/s ± 1%    -2.61%  (p=0.000 n=9+10)
RepeatLarge/1024/16-16          3.56GB/s ± 2%  3.52GB/s ± 1%    -1.10%  (p=0.010 n=10+9)
RepeatLarge/2048/1-16           3.52GB/s ± 1%  3.45GB/s ± 1%    -1.92%  (p=0.000 n=10+10)
RepeatLarge/2048/16-16          3.61GB/s ± 1%  3.58GB/s ± 0%    -0.82%  (p=0.008 n=9+8)
RepeatLarge/4096/1-16           3.85GB/s ± 2%  3.80GB/s ± 2%      ~     (p=0.165 n=10+10)
RepeatLarge/4096/16-16          3.88GB/s ± 3%  3.84GB/s ± 4%      ~     (p=0.393 n=10+10)
RepeatLarge/8192/1-16           4.12GB/s ± 2%  4.04GB/s ± 1%    -1.96%  (p=0.000 n=10+10)
RepeatLarge/8192/16-16          4.11GB/s ± 2%  4.09GB/s ± 1%      ~     (p=0.278 n=9+10)
RepeatLarge/8192/4097-16        4.38GB/s ± 1%  4.39GB/s ± 4%      ~     (p=0.720 n=9+10)
RepeatLarge/16384/1-16          5.06GB/s ± 2%  4.95GB/s ± 3%    -2.29%  (p=0.001 n=10+9)
RepeatLarge/16384/16-16         5.11GB/s ± 3%  5.06GB/s ± 3%      ~     (p=0.315 n=10+9)
RepeatLarge/16384/4097-16       5.22GB/s ± 3%  5.26GB/s ± 3%      ~     (p=0.211 n=9+10)
RepeatLarge/32768/1-16          5.54GB/s ± 2%  5.50GB/s ± 3%      ~     (p=0.353 n=10+10)
RepeatLarge/32768/16-16         5.55GB/s ± 1%  5.60GB/s ± 1%    +0.91%  (p=0.035 n=10+9)
RepeatLarge/32768/4097-16       4.88GB/s ± 2%  4.85GB/s ± 2%      ~     (p=0.447 n=10+9)
RepeatLarge/65536/1-16          5.86GB/s ± 1%  5.93GB/s ± 2%    +1.18%  (p=0.043 n=8+10)
RepeatLarge/65536/16-16         5.83GB/s ± 2%  5.98GB/s ± 1%    +2.67%  (p=0.000 n=10+10)
RepeatLarge/65536/4097-16       5.57GB/s ± 0%  5.56GB/s ± 3%      ~     (p=0.696 n=8+10)
RepeatLarge/131072/1-16         6.23GB/s ± 1%  6.38GB/s ± 2%    +2.51%  (p=0.000 n=9+10)
RepeatLarge/131072/16-16        6.21GB/s ± 2%  6.37GB/s ± 1%    +2.72%  (p=0.000 n=9+10)
RepeatLarge/131072/4097-16      6.04GB/s ± 1%  6.09GB/s ± 3%      ~     (p=0.356 n=9+10)
RepeatLarge/262144/1-16         6.47GB/s ± 1%  6.63GB/s ± 2%    +2.57%  (p=0.003 n=10+10)
RepeatLarge/262144/16-16        6.45GB/s ± 2%  6.69GB/s ± 2%    +3.65%  (p=0.000 n=10+10)
RepeatLarge/262144/4097-16      6.35GB/s ± 1%  6.51GB/s ± 2%    +2.48%  (p=0.000 n=9+10)
RepeatLarge/524288/1-16         6.21GB/s ± 2%  6.95GB/s ± 1%   +11.95%  (p=0.000 n=10+10)
RepeatLarge/524288/16-16        6.24GB/s ± 2%  6.93GB/s ± 2%   +11.11%  (p=0.000 n=10+10)
RepeatLarge/524288/4097-16      6.18GB/s ± 2%  6.82GB/s ± 1%   +10.39%  (p=0.000 n=9+10)
RepeatLarge/1048576/1-16        5.34GB/s ± 2%  6.41GB/s ± 2%   +20.05%  (p=0.000 n=10+10)
RepeatLarge/1048576/16-16       5.33GB/s ± 1%  6.45GB/s ± 2%   +20.84%  (p=0.000 n=10+9)
RepeatLarge/1048576/4097-16     5.28GB/s ± 1%  6.17GB/s ± 2%   +16.75%  (p=0.000 n=10+10)
RepeatLarge/2097152/1-16        4.04GB/s ± 1%  6.21GB/s ± 1%   +53.89%  (p=0.000 n=9+8)
RepeatLarge/2097152/16-16       4.02GB/s ± 1%  6.20GB/s ± 2%   +54.37%  (p=0.000 n=10+9)
RepeatLarge/2097152/4097-16     4.94GB/s ± 1%  6.04GB/s ± 1%   +22.36%  (p=0.000 n=10+10)
RepeatLarge/4194304/1-16        3.10GB/s ± 1%  5.74GB/s ± 0%   +85.04%  (p=0.000 n=10+9)
RepeatLarge/4194304/16-16       3.10GB/s ± 2%  5.72GB/s ± 1%   +84.26%  (p=0.000 n=9+10)
RepeatLarge/4194304/4097-16     3.03GB/s ± 4%  5.61GB/s ± 1%   +85.06%  (p=0.000 n=10+9)
RepeatLarge/8388608/1-16        3.08GB/s ± 2%  6.25GB/s ± 1%  +103.09%  (p=0.000 n=9+9)
RepeatLarge/8388608/16-16       3.07GB/s ± 2%  6.26GB/s ± 3%  +104.07%  (p=0.000 n=10+9)
RepeatLarge/8388608/4097-16     3.08GB/s ± 2%  6.23GB/s ± 2%  +102.09%  (p=0.000 n=9+10)
RepeatLarge/16777216/1-16       3.25GB/s ± 2%  5.78GB/s ± 3%   +78.03%  (p=0.000 n=9+9)
RepeatLarge/16777216/16-16      3.25GB/s ± 1%  5.75GB/s ± 1%   +77.21%  (p=0.000 n=9+10)
RepeatLarge/16777216/4097-16    3.29GB/s ± 3%  5.72GB/s ± 2%   +73.74%  (p=0.000 n=10+10)
RepeatLarge/33554432/1-16       3.68GB/s ± 2%  5.90GB/s ± 1%   +60.20%  (p=0.000 n=10+10)
RepeatLarge/33554432/16-16      3.69GB/s ± 3%  5.88GB/s ± 1%   +59.54%  (p=0.000 n=10+9)
RepeatLarge/33554432/4097-16    3.74GB/s ± 1%  5.94GB/s ± 2%   +58.68%  (p=0.000 n=7+10)
RepeatLarge/67108864/1-16       4.62GB/s ±12%  6.11GB/s ± 3%   +32.23%  (p=0.000 n=10+9)
RepeatLarge/67108864/16-16      4.77GB/s ± 2%  6.09GB/s ± 2%   +27.88%  (p=0.000 n=9+9)
RepeatLarge/67108864/4097-16    4.78GB/s ± 1%  6.19GB/s ± 1%   +29.51%  (p=0.000 n=9+10)
RepeatLarge/134217728/1-16      4.60GB/s ±16%  6.52GB/s ± 9%   +41.67%  (p=0.000 n=10+10)
RepeatLarge/134217728/16-16     4.80GB/s ± 4%  6.81GB/s ± 2%   +41.82%  (p=0.000 n=10+9)
RepeatLarge/134217728/4097-16   4.79GB/s ± 4%  6.81GB/s ± 2%   +42.31%  (p=0.000 n=9+10)
RepeatLarge/268435456/1-16      4.43GB/s ±25%  6.27GB/s ±14%   +41.52%  (p=0.000 n=10+10)
RepeatLarge/268435456/16-16     4.75GB/s ± 4%  6.68GB/s ± 4%   +40.50%  (p=0.000 n=9+10)
RepeatLarge/268435456/4097-16   4.75GB/s ± 3%  6.58GB/s ± 4%   +38.68%  (p=0.000 n=9+10)
RepeatLarge/536870912/1-16      4.96GB/s ± 9%  6.39GB/s ±16%   +28.90%  (p=0.000 n=8+10)
RepeatLarge/536870912/16-16     4.66GB/s ± 6%  6.57GB/s ± 7%   +40.82%  (p=0.000 n=10+9)
RepeatLarge/536870912/4097-16   4.68GB/s ±11%  6.88GB/s ± 3%   +47.01%  (p=0.000 n=10+9)
RepeatLarge/1073741824/1-16     4.39GB/s ±23%  6.57GB/s ± 5%   +49.75%  (p=0.000 n=10+8)
RepeatLarge/1073741824/16-16    4.73GB/s ±13%  6.89GB/s ± 1%   +45.68%  (p=0.000 n=9+8)
RepeatLarge/1073741824/4097-16  4.97GB/s ±15%  6.73GB/s ± 9%   +35.45%  (p=0.000 n=10+10)

The results above come from an Intel i9-9980HK (256KB L2) with
TurboBoost disabled.

Change-Id: I79dd57da0429aee9020ffd7bc458a034b999b740
Reviewed-on: https://go-review.googlesource.com/c/go/+/419054
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Dmitri Shuralyov &lt;dmitshur@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bytes, strings: add ASCII fast path to EqualFold</title>
<updated>2022-09-21T14:00:37+00:00</updated>
<author>
<name>Charlie Vieth</name>
<email>charlie.vieth@gmail.com</email>
</author>
<published>2022-08-24T18:23:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=c70fd4b30aba5db2df7b5f6b0833c62b909f50eb'/>
<id>c70fd4b30aba5db2df7b5f6b0833c62b909f50eb</id>
<content type='text'>
This commit adds an ASCII fast path to bytes/strings EqualFold that
roughly doubles performance when all characters are ASCII.

It also changes strings.EqualFold to use `for range` for the first
string since this is ~10% faster than using utf8.DecodeRuneInString for
both (see #31666).

Performance (similar results on arm64 and amd64):

name                        old time/op  new time/op  delta
EqualFold/Tests-10           238ns ± 0%   172ns ± 1%  -27.91%  (p=0.000 n=10+10)
EqualFold/ASCII-10          20.5ns ± 0%   9.7ns ± 0%  -52.73%  (p=0.000 n=10+10)
EqualFold/UnicodePrefix-10  86.5ns ± 0%  77.6ns ± 0%  -10.37%  (p=0.000 n=10+10)
EqualFold/UnicodeSuffix-10  86.8ns ± 2%  71.3ns ± 0%  -17.88%  (p=0.000 n=10+8)

Change-Id: I058f3f97a08dc04d65af895674d85420f920abe1
Reviewed-on: https://go-review.googlesource.com/c/go/+/425459
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit adds an ASCII fast path to bytes/strings EqualFold that
roughly doubles performance when all characters are ASCII.

It also changes strings.EqualFold to use `for range` for the first
string since this is ~10% faster than using utf8.DecodeRuneInString for
both (see #31666).

Performance (similar results on arm64 and amd64):

name                        old time/op  new time/op  delta
EqualFold/Tests-10           238ns ± 0%   172ns ± 1%  -27.91%  (p=0.000 n=10+10)
EqualFold/ASCII-10          20.5ns ± 0%   9.7ns ± 0%  -52.73%  (p=0.000 n=10+10)
EqualFold/UnicodePrefix-10  86.5ns ± 0%  77.6ns ± 0%  -10.37%  (p=0.000 n=10+10)
EqualFold/UnicodeSuffix-10  86.8ns ± 2%  71.3ns ± 0%  -17.88%  (p=0.000 n=10+8)

Change-Id: I058f3f97a08dc04d65af895674d85420f920abe1
Reviewed-on: https://go-review.googlesource.com/c/go/+/425459
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>strings: reuse the input string for Repeat count of 1</title>
<updated>2022-09-14T14:56:07+00:00</updated>
<author>
<name>Anuraag Agrawal</name>
<email>anuraaga@gmail.com</email>
</author>
<published>2022-06-10T05:41:02+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/go-git.git/commit/?id=9503bcae2b20d290332d00d78672881b7fcfedf0'/>
<id>9503bcae2b20d290332d00d78672881b7fcfedf0</id>
<content type='text'>
The existing implementation allocates a new string even when the
count is 1, where we know the output is the same as the input.
While we wouldn't expect a count of 1 for hardcoded values of the
parameter, it is expected when the parameter is computed based on
a different value (e.g., the length of an input slice).

name            old time/op  new time/op  delta
Repeat/5x0-10   2.03ns ± 0%  2.02ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/5x1-10   13.7ns ± 0%   2.0ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/5x2-10   18.2ns ± 0%  18.1ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/5x6-10   27.0ns ± 0%  27.0ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x0-10  2.02ns ± 0%  2.02ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x1-10  16.1ns ± 0%   2.0ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x2-10  20.8ns ± 0%  20.9ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x6-10  29.2ns ± 0%  29.4ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I48e08e08f8f6d6914d62b3d6a61d563d637bec59
GitHub-Last-Rev: 068f58e08b8f5c4105e7a210f242ca1ff3a61177
GitHub-Pull-Request: golang/go#53321
Reviewed-on: https://go-review.googlesource.com/c/go/+/411477
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The existing implementation allocates a new string even when the
count is 1, where we know the output is the same as the input.
While we wouldn't expect a count of 1 for hardcoded values of the
parameter, it is expected when the parameter is computed based on
a different value (e.g., the length of an input slice).

name            old time/op  new time/op  delta
Repeat/5x0-10   2.03ns ± 0%  2.02ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/5x1-10   13.7ns ± 0%   2.0ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/5x2-10   18.2ns ± 0%  18.1ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/5x6-10   27.0ns ± 0%  27.0ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x0-10  2.02ns ± 0%  2.02ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x1-10  16.1ns ± 0%   2.0ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x2-10  20.8ns ± 0%  20.9ns ± 0%   ~     (p=1.000 n=1+1)
Repeat/10x6-10  29.2ns ± 0%  29.4ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I48e08e08f8f6d6914d62b3d6a61d563d637bec59
GitHub-Last-Rev: 068f58e08b8f5c4105e7a210f242ca1ff3a61177
GitHub-Pull-Request: golang/go#53321
Reviewed-on: https://go-review.googlesource.com/c/go/+/411477
Reviewed-by: Ian Lance Taylor &lt;iant@google.com&gt;
Run-TryBot: Ian Lance Taylor &lt;iant@google.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
Auto-Submit: Ian Lance Taylor &lt;iant@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
