delta/gitlab/gitlab-ce.git/lib/banzai, branch update-github-doc-page

Fix a memory leak caused by Banzai::Filter::SanitizationFilter

2016-08-14T20:28:18+00:00

In Banzai::Filter::SanitizationFilter#customize_whitelist, we append
three lambdas that hold a reference to the SanitizationFilter instance,
which in turn (potentially) has a reference to the following chain:

context hash -> Project instance -> Repository instance -> lookup hash
-> various Rugged instances -> various mmap-ed git pack files.

None of the above is garbage collected because the array we append
the lambdas to is the constant
HTML::Pipeline::SanitizationFilter::WHITELIST.
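
The retention described above can be illustrated with a minimal sketch.
It does not use the real filter classes: SHARED_WHITELIST and
LeakySketchFilter are hypothetical stand-ins, and the only point shown
is that a lambda created inside an instance method keeps that instance
reachable from the constant it is appended to.

```ruby
# Hypothetical stand-in for a process-wide constant such as a sanitizer
# whitelist. This is an assumption for illustration, not the real constant.
SHARED_WHITELIST = { transformers: [] }

# Hypothetical filter that mimics the leaking pattern.
class LeakySketchFilter
  def initialize(context)
    @context = context
  end

  def customize_whitelist
    # A lambda created inside an instance method closes over `self`, so the
    # constant now references this instance and everything it references.
    SHARED_WHITELIST[:transformers] << lambda { |env| env }
    SHARED_WHITELIST
  end
end

filter = LeakySketchFilter.new({ project: 'large object graph' })
filter.customize_whitelist

# The closure's binding still points at the filter instance, which is why
# the garbage collector can never reclaim it while the constant is alive.
retained = SHARED_WHITELIST[:transformers].last.binding.receiver
puts retained.equal?(filter)
```

Because the array lives in a constant, every call adds another lambda
and pins another instance, so the retained memory grows with the number
of filter instances created.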

Merge branch 'relative-link-filter-ref' into 'master'

2016-08-09T21:45:59+00:00


Do not look up commit again when it is passed to RelativeLinkFilter

## What does this MR do?

Use `context[:commit]` in RelativeLinkFilter instead of looking up commit using `context[:ref]`.

## Why was this MR needed?

Even though the commit object was already passed in the context, the filter performed unnecessary I/O to look it up again.
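
A minimal sketch of the idea, not the actual filter code: `FakeRepository` and `commit_for` are hypothetical names, and the fallback to a lookup by `context[:ref]` is an assumption made for illustration.

```ruby
# Hypothetical repository stand-in that counts how often a commit is looked up.
class FakeRepository
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # Simulates the I/O-bound lookup of a commit by ref.
  def commit(ref)
    @lookups += 1
    "commit-for-#{ref}"
  end
end

# Prefer the commit already present in the context; only fall back to a
# lookup by ref when no commit was passed (the fallback is an assumption).
def commit_for(context, repository)
  context[:commit] || repository.commit(context[:ref])
end

repo = FakeRepository.new
commit_for({ commit: 'abc123', ref: 'master' }, repo)
puts repo.lookups
```

With a commit in the context, the counter stays at zero because the repository is never consulted.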

## What are the relevant issue numbers?

Fixes #20026

See merge request !5455

Enable Style/EmptyLinesAroundClassBody cop

2016-08-06T01:52:24+00:00

Enable Style/EmptyLinesAroundModuleBody cop

2016-08-06T01:44:39+00:00

Ignore URLs starting with // (!5677)

2016-08-04T23:30:59+00:00

Merge branch 'syntax-highlight-filter-performance' into 'master'

2016-08-04T10:13:39+00:00


Improve performance of SyntaxHighlightFilter

## What does this MR do?

This MR improves the performance of `Banzai::Filter::SyntaxHighlightFilter`. See e9bacc6575d0002c6cab620075dea3dc7f93f100 for more information.

## Are there points in the code the reviewer needs to double check?

Styling mostly.

## Why was this MR needed?

Syntax highlighting is rather slow.

## What are the relevant issue numbers?

#18592 

## Does this MR meet the acceptance criteria?

- [x] [CHANGELOG](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CHANGELOG) entry added
- [x] ~~[Documentation created/updated](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/development/doc_styleguide.md)~~
- [x] ~~API support added~~
- Tests
  - [x] ~~Added for this feature/bug~~
  - [ ] All builds are passing
- [x] Conform by the [style guides](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CONTRIBUTING.md#style-guides)
- [ ] Branch has no merge conflicts with `master` (if you do - rebase it please)
- [x] [Squashed related commits together](https://git-scm.com/book/en/Git-Tools-Rewriting-History#Squashing-Commits)

See merge request !5643

Improve performance of SyntaxHighlightFilter

2016-08-03T14:58:20+00:00

By using Rouge::Lexer.find instead of find_fancy() and memoizing the
HTML formatter we can speed up the highlighting process by between 1.7
and 1.8 times (at least when measured using synthetic benchmarks). To
measure this I used the following benchmark:

    require 'benchmark/ips'

    input = ''

    Dir['./app/controllers/**/*.rb'].each do |controller|
      input << <<-EOF
      #{File.read(controller).strip}

      EOF
    end

    document = Nokogiri::HTML.fragment(input)
    filter = Banzai::Filter::SyntaxHighlightFilter.new(document)

    puts "Input size: #{(input.bytesize.to_f / 1024).round(2)} KB"

    Benchmark.ips do |bench|
      bench.report 'call' do
        filter.call
      end
    end

This benchmark produces 250 KB of input. Before these changes the timing
output would be as follows:

    Calculating -------------------------------------
                    call     1.000  i/100ms
    -------------------------------------------------
                    call     22.439  (±35.7%) i/s -     93.000

After these changes the output instead is as follows:

    Calculating -------------------------------------
                    call     1.000  i/100ms
    -------------------------------------------------
                    call     41.283  (±38.8%) i/s -    148.000

Note that due to the fairly high standard deviation and this being a
synthetic benchmark it's entirely possible the real-world improvements
are smaller.
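
The memoization half of this change can be sketched without Rouge.
ExpensiveFormatter and HighlightSketch below are hypothetical stand-ins,
not the real formatter or filter; the sketch only shows the `||=`
pattern that builds the formatter once per filter instance instead of
once per highlighted block.

```ruby
# Hypothetical formatter that counts how many times it is constructed.
class ExpensiveFormatter
  @instances = 0

  class << self
    attr_accessor :instances
  end

  def initialize
    self.class.instances += 1
  end

  def render(text)
    "<pre>#{text}</pre>"
  end
end

# Hypothetical filter that memoizes its formatter.
class HighlightSketch
  def highlight(text)
    formatter.render(text)
  end

  private

  # Built on first use, then reused for every later call on this instance.
  def formatter
    @formatter ||= ExpensiveFormatter.new
  end
end

sketch = HighlightSketch.new
3.times { |i| sketch.highlight("block #{i}") }
puts ExpensiveFormatter.instances
```

The saving scales with the number of code blocks in a document, since
each block would otherwise pay the construction cost again.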

Improve AutolinkFilter#text_parse performance

2016-08-03T09:38:46+00:00

By using clever XPath queries we can quite significantly improve the
performance of this method. The actual improvement depends a bit on the
amount of links used but in my tests the new implementation is usually
around 8 times faster than the old one. This was measured using the
following benchmark:

    require 'benchmark/ips'

    text = '

    ' + Note.select("string_agg(note, '') AS note").limit(50).take[:note] + '

    '

    document = Nokogiri::HTML.fragment(text)
    filter = Banzai::Filter::AutolinkFilter.new(document, autolink: true)

    puts "Input size: #{(text.bytesize.to_f / 1024 / 1024).round(2)} MB"

    filter.rinku_parse

    Benchmark.ips(time: 15) do |bench|
      bench.report 'text_parse' do
        filter.text_parse
      end

      bench.report 'text_parse_fast' do
        filter.text_parse_fast
      end

      bench.compare!
    end

Here the "text_parse_fast" method is the new implementation and
"text_parse" the old one. The input size was around 180 MB. Running this
benchmark outputs the following:

    Input size: 181.16 MB

    Calculating -------------------------------------
              text_parse     1.000  i/100ms
         text_parse_fast     9.000  i/100ms
    -------------------------------------------------
              text_parse     13.021  (±15.4%) i/s -    188.000
         text_parse_fast    112.741  (± 3.5%) i/s -      1.692k

    Comparison:
         text_parse_fast:      112.7 i/s
              text_parse:       13.0 i/s - 8.66x slower

Again the production timings may (and most likely will) vary depending
on the input being processed.

Do not look up commit again when it is passed to RelativeLinkFilter (!5455)

2016-08-02T19:52:52+00:00

Add support for relative links starting with ./ or / to RelativeLinkFilter (!5586)

2016-08-02T01:52:24+00:00