Add latest changes from gitlab-org/gitlab@13-2-stable-ee

author: GitLab Bot <gitlab-bot@gitlab.com> 2020-07-20 12:26:25 +0000
committer: GitLab Bot <gitlab-bot@gitlab.com> 2020-07-20 12:26:25 +0000
commit: a09983ae35713f5a2bbb100981116d31ce99826e (patch)
tree: 2ee2af7bd104d57086db360a7e6d8c9d5d43667a /doc/development/elasticsearch.md
parent: 18c5ab32b738c0b6ecb4d0df3994000482f34bd8 (diff)
download: gitlab-ce-a09983ae35713f5a2bbb100981116d31ce99826e.tar.gz
1 files changed, 15 insertions, 7 deletions
diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md
index 9f54386f1af..90debab3b5c 100644
--- a/doc/development/elasticsearch.md
+++ b/doc/development/elasticsearch.md
@@ -60,7 +60,7 @@ The `whitespace` tokenizer was selected in order to have more control over how t
 
 Please see the `code` filter for an explanation on how tokens are split.
 
-NOTE: **Known Issues**:
+NOTE: **Note:**
 Currently the [Elasticsearch code_analyzer doesn't account for all code cases](../integration/elasticsearch.md#known-issues).
 
 #### `code_search_analyzer`
@@ -111,11 +111,8 @@ Patterns:
 - `'"((?:\\"|[^"]|\\")*)"'`: captures terms inside quotes, removing the quotes
 - `"'((?:\\'|[^']|\\')*)'"`: same as above, for single-quotes
 - `'\.([^.]+)(?=\.|\s|\Z)'`: separate terms with periods in-between
-- `'\/?([^\/]+)(?=\/|\b)'`: separate path terms `like/this/one`
-
-#### `edgeNGram_filter`
-
-Uses an [Edge NGram token filter](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-edgengram-tokenfilter.html) to allow inputs with only parts of a token to find the token. For example it would turn `glasses` into permutations starting with `gl` and ending with `glasses`, which would allow a search for "`glass`" to find the original token `glasses`
+- `'([\p{L}_.-]+)'`: some common chars in file names to keep the whole filename intact (eg. `my_file-ñame.txt`)
+- `'([\p{L}\d_]+)'`: letters, numbers and underscores are the most common tokens in programming. Always capture them greedily regardless of context.
 
 ## Gotchas
 
@@ -160,7 +157,8 @@ The global configurations per version are now in the `Elastic::(Version)::Config
 
 ### Creating new version of schema
 
-NOTE: **Note:** this is not applicable yet as multiple indices functionality is not fully implemented.
+NOTE: **Note:**
+This is not applicable yet as multiple indices functionality is not fully implemented.
 
 Folders like `ee/lib/elastic/v12p1` contain snapshots of search logic from different versions. To keep a continuous Git history, the latest version lives under `ee/lib/elastic/latest`, but its classes are aliased under an actual version (e.g. `ee/lib/elastic/v12p3`). When referencing these classes, never use the `Latest` namespace directly, but use the actual version (e.g. `V12p3`).
 
@@ -222,6 +220,16 @@ be used both locally in development and on any deployed GitLab instance to
 diagnose poor search performance. This will show the exact queries being made,
 which is useful to diagnose why a search might be slow.
 
+### Correlation ID and X-Opaque-Id
+
+Our [correlation
+ID](./distributed_tracing.md#developer-guidelines-for-working-with-correlation-ids)
+is forwarded by all requests from Rails to Elasticsearch as the
+[`X-Opaque-Id`](https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html#_identifying_running_tasks)
+header which allows us to track any
+[tasks](https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html)
+in the cluster back to the request in GitLab.
+
 ## Troubleshooting
 
 ### Getting `flood stage disk watermark [95%] exceeded`