| author | Rémy Coutable <remy@rymai.me> | 2016-08-04 10:07:56 +0000 |
|---|---|---|
| committer | Rémy Coutable <remy@rymai.me> | 2016-08-04 10:07:56 +0000 |
| commit | 5ffb07189cb24b0fff4e24fcfcd0c3be26e2a073 (patch) | |
| tree | 2a6c56adc8c7b854bf61bea9196925476ad5dc0d /lib | |
| parent | 7b94f23b3dfbf589c51f908e1c38505e113db54a (diff) | |
| parent | dd35c3ddf6dce7a69cc116fe6165dad68b8e9251 (diff) | |
| download | gitlab-ce-5ffb07189cb24b0fff4e24fcfcd0c3be26e2a073.tar.gz | |
Merge branch 'autolink-filter-text-parse' into 'master'
Improve AutolinkFilter#text_parse performance
## What does this MR do?
This MR improves the performance of `AutolinkFilter#text_parse` by using XPath queries for filtering out most text nodes.
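To make the shape of that query easier to see, here is a minimal sketch that assembles the same predicate in steps, using only plain Ruby string handling. The local names `ignore_parents`, `ancestor_tests`, and `text_query` are illustrative stand-ins for the `IGNORE_PARENTS` and `TEXT_QUERY` constants in the diff; no XML library is loaded, so this only prints the query and does not run it.

```ruby
require 'set'

# Element names whose text should never be autolinked (values taken from the diff).
ignore_parents = %w(a code kbd pre script style).to_set

# One "ancestor::<name>" test per ignored element, joined with "or".
ancestor_tests = ignore_parents.map { |name| "ancestor::#{name}" }.join(' or ')

# Same conditions as TEXT_QUERY in the diff, laid out one per line.
text_query = <<~XPATH.strip
  descendant-or-self::text()[
    not(#{ancestor_tests})
    and contains(., '://')
    and not(starts-with(., 'http'))
    and not(starts-with(., 'ftp'))
  ]
XPATH

puts text_query
```

Because the ancestor tests are generated from the set, adding an element name to the set extends the query without any further change.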
## Are there points in the code the reviewer needs to double check?
Mostly the styling of things.
## Why was this MR needed?
Parsing text nodes is slow, mainly because most of the filtering happens in Ruby.
## What are the relevant issue numbers?
https://gitlab.com/gitlab-org/gitlab-ce/issues/18593
## Does this MR meet the acceptance criteria?
- [x] [CHANGELOG](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CHANGELOG) entry added
- [x] ~~[Documentation created/updated](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/development/doc_styleguide.md)~~
- [x] ~~API support added~~
- Tests
- [x] ~~Added for this feature/bug~~
- [ ] All builds are passing
- [x] Conforms to the [style guides](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CONTRIBUTING.md#style-guides)
- [x] Branch has no merge conflicts with `master` (if you do - rebase it please)
- [x] [Squashed related commits together](https://git-scm.com/book/en/Git-Tools-Rewriting-History#Squashing-Commits)
See merge request !5629
Diffstat (limited to 'lib')
| -rw-r--r-- | lib/banzai/filter/autolink_filter.rb | 15 |
1 files changed, 9 insertions, 6 deletions
diff --git a/lib/banzai/filter/autolink_filter.rb b/lib/banzai/filter/autolink_filter.rb
index 9ed45707515..799b83b1069 100644
--- a/lib/banzai/filter/autolink_filter.rb
+++ b/lib/banzai/filter/autolink_filter.rb
@@ -31,6 +31,14 @@ module Banzai
       # Text matching LINK_PATTERN inside these elements will not be linked
       IGNORE_PARENTS = %w(a code kbd pre script style).to_set
 
+      # The XPath query to use for finding text nodes to parse.
+      TEXT_QUERY = %Q(descendant-or-self::text()[
+        not(#{IGNORE_PARENTS.map { |p| "ancestor::#{p}" }.join(' or ')})
+        and contains(., '://')
+        and not(starts-with(., 'http'))
+        and not(starts-with(., 'ftp'))
+      ])
+
       def call
         return doc if context[:autolink] == false
 
@@ -66,16 +74,11 @@ module Banzai
       # Autolinks any text matching LINK_PATTERN that Rinku didn't already
       # replace
       def text_parse
-        search_text_nodes(doc).each do |node|
+        doc.xpath(TEXT_QUERY).each do |node|
           content = node.to_html
 
-          next if has_ancestor?(node, IGNORE_PARENTS)
           next unless content.match(LINK_PATTERN)
 
-          # If Rinku didn't link this, there's probably a good reason, so we'll
-          # skip it too
-          next if content.start_with?(*%w(http https ftp))
-
           html = autolink_filter(content)
 
           next if html == content
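As a rough illustration of what the XPath predicate in this diff selects, the sketch below restates its conditions as a plain Ruby method. The helper `candidate_text?` and its arguments are hypothetical and not part of the codebase; it takes a text node's content and the names of its ancestor elements rather than a real document node, so it models the predicate's logic only, not how any XML library evaluates it.

```ruby
require 'set'

ignore_parents = %w(a code kbd pre script style).to_set

# Hypothetical helper (not in the codebase): returns true when a text node
# with this content and these ancestor element names would pass the predicate.
def candidate_text?(content, ancestor_names, ignored)
  # not(ancestor::a or ancestor::code or ...)
  return false if ancestor_names.any? { |name| ignored.include?(name) }
  # contains(., '://')
  return false unless content.include?('://')
  # not(starts-with(., 'http')) and not(starts-with(., 'ftp'));
  # the 'http' prefix also covers 'https'.
  return false if content.start_with?('http', 'ftp')

  true
end

samples = [
  ['see irc://irc.example.com/chan', %w(body p)],
  ['https://example.com',            %w(body p)],
  ['irc://irc.example.com/chan',     %w(body pre)],
  ['no link here',                   %w(body p)]
]

samples.each do |content, ancestors|
  puts "#{content.inspect} under <#{ancestors.last}>: #{candidate_text?(content, ancestors, ignore_parents)}"
end
```

Since any string beginning with `https` also begins with `http`, the two `starts-with` tests cover the three prefixes that the removed `start_with?(*%w(http https ftp))` call listed.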
