path: root/chromium/docs/speed
author    Allan Sandfeld Jensen <allan.jensen@qt.io>  2018-08-28 15:28:34 +0200
committer Allan Sandfeld Jensen <allan.jensen@qt.io>  2018-08-28 13:54:51 +0000
commit    2a19c63448c84c1805fb1a585c3651318bb86ca7 (patch)
tree      eb17888e8531aa6ee5e85721bd553b832a7e5156 /chromium/docs/speed
parent    b014812705fc80bff0a5c120dfcef88f349816dc (diff)
BASELINE: Update Chromium to 69.0.3497.70
Change-Id: I2b7b56e4e7a8b26656930def0d4575dc32b900a0
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Diffstat (limited to 'chromium/docs/speed')
-rw-r--r--  chromium/docs/speed/addressing_performance_regressions.md             |   4
-rw-r--r--  chromium/docs/speed/apk_size_regressions.md                           |  26
-rw-r--r--  chromium/docs/speed/benchmark/benchmark_ownership.md                  |  11
-rw-r--r--  chromium/docs/speed/benchmark/harnesses/blink_perf.md                 |   4
-rw-r--r--  chromium/docs/speed/benchmark/harnesses/loading.md                    | 101
-rw-r--r--  chromium/docs/speed/benchmark/harnesses/power_perf.md                 |  91
-rw-r--r--  chromium/docs/speed/benchmark/harnesses/rendering.md                  |  94
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/how_to_access_test_logs.md  |  24
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/images/flakiness_dashboard_new_recipe.png | bin 0 -> 144128 bytes
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_benchmark_logs_link.png | bin 0 -> 46620 bytes
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_choose_builder.png | bin 0 -> 64114 bytes
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_identify_failed_tests.png | bin 0 -> 74325 bytes
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_story_log.png | bin 0 -> 58220 bytes
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/main.md                     |   2
-rw-r--r--  chromium/docs/speed/bot_health_sheriffing/what_test_is_failing.md     |   5
15 files changed, 288 insertions, 74 deletions
diff --git a/chromium/docs/speed/addressing_performance_regressions.md b/chromium/docs/speed/addressing_performance_regressions.md
index 4512a8910c6..4b5247f3cb3 100644
--- a/chromium/docs/speed/addressing_performance_regressions.md
+++ b/chromium/docs/speed/addressing_performance_regressions.md
@@ -124,6 +124,10 @@ to learn how to use traces to debug performance issues.
* [Memory](https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md)
* [Android binary size](apk_size_regressions.md)
+### How do I profile?
+
+See the [documentation on CPU Profiling Chrome](https://chromium.googlesource.com/chromium/src/+/master/docs/profiling.md).
+
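+As a quick, minimal sketch (assuming a Linux build and the standard `perf`
+tool; the linked document covers the supported platforms and workflows in
+detail):
+
+```
+perf record -g -p <renderer_pid>   # sample call stacks until Ctrl-C
+perf report                        # inspect the hottest stacks
+```
+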
## If you don't believe your CL could be the cause
> Please remember that our performance tests exist to catch unexpected
diff --git a/chromium/docs/speed/apk_size_regressions.md b/chromium/docs/speed/apk_size_regressions.md
index 3a97c4fed27..af1f8c35620 100644
--- a/chromium/docs/speed/apk_size_regressions.md
+++ b/chromium/docs/speed/apk_size_regressions.md
@@ -29,8 +29,13 @@
tools/binary_size/diagnose_bloat.py AFTER_GIT_REV --reference-rev BEFORE_GIT_REV --subrepo v8 --all
-You can usually find the before and after revs in the roll commit message
+ * You can usually find the before and after revs in the roll commit message
([example](https://chromium.googlesource.com/chromium/src/+/10c40fd863f4ae106650bba93b845f25c9b733b1))
+ * Note that you may need to click through to the linked list of changes to
+   find the actual first commit hash and use that one instead, since some
+   rollers (including v8) use extra tagging commits that are not in master.
+   In the linked example, `BEFORE_GIT_REV` would actually be `876f37c` and
+   not `c1dec05f`.
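+ * For the linked example, the full command would therefore be (with
+   `AFTER_GIT_REV` left as a placeholder):
+
+    tools/binary_size/diagnose_bloat.py AFTER_GIT_REV --reference-rev 876f37c --subrepo v8 --all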
### Monochrome.apk Alerts
@@ -153,6 +158,11 @@ to show a diff of ELF symbols.
* Use [//tools/binary_size/diagnose_bloat.py](https://chromium.googlesource.com/chromium/src/+/master/tools/binary_size/README.md)
to show a diff of Java symbols.
* Ensure any new Java deps are as specific as possible.
+ * If the change doesn't look suspect, check to see if the regression still
+ exists when internal proguard is used (see
+ [downstream graphs](https://chromeperf.appspot.com/report?sid=83bf643964a326648325f7eb6767d8adb85d67db8306dd94aa7476ed70d7dace)
+ or use `diagnose_bloat.py -v --enable-chrome-android-internal REV`
+ to build locally)
### Growth is from "other lib size" or "Unknown files size"
@@ -168,13 +178,21 @@ to show a diff of ELF symbols.
## Step 1: Check work queue daily
- * Bugs requiring sheriffs to take a look at are labeled `Performance-Sheriff` and `Performance-Size`.
+ * Bugs requiring a sheriff's attention are labeled `Performance-Sheriff` and `Performance-Size`; see [the list here](https://bugs.chromium.org/p/chromium/issues/list?q=label:Performance-Sheriff%20label:Performance-Size&sort=-modified).
* After resolving the bug by finding an owner or debugging or commenting, remove the `Performance-Sheriff` label.
## Step 2: Check alerts regularly
- * **IMPORTANT**: Check the [perf bot page](https://ci.chromium.org/buildbot/chromium.perf/Android%20Builder%20Perf/)
- several times a day to make sure it isn't broken (and ping/file a bug if it is).
+ * **IMPORTANT: Check the [perf bot page](https://ci.chromium.org/buildbot/chromium.perf/Android%20Builder%20Perf/)
+ several times a day to make sure it isn't broken (and ping/file a bug if it is).**
+ * At the very least you need to check this once in the morning and once in
+ the afternoon.
+ * If you don't and the builder is broken, either you or the next sheriff will
+ have to manually build and diff the broken range (via `diagnose_bloat.py`,
+ as sketched below) to see if we missed any regressions.
+ * This is necessary even if the next passing build doesn't create an alert
+ because the range could contain a large regression with multiple offsetting
+ decreases.
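+ * A sketch of such a manual diff (revision placeholders are hypothetical;
+   the first argument is the end of the broken range, `--reference-rev` the
+   last known-good revision):
+
+    tools/binary_size/diagnose_bloat.py BROKEN_END_REV --reference-rev LAST_GOOD_REV --all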
* Check [alert page](https://chromeperf.appspot.com/alerts?sheriff=Binary%20Size%20Sheriff) regularly for new alerts.
* Join [binary-size-alerts@chromium.org](https://groups.google.com/a/chromium.org/forum/#!forum/binary-size-alerts). Eventually it will be all set up.
* Deal with alerts as outlined above.
diff --git a/chromium/docs/speed/benchmark/benchmark_ownership.md b/chromium/docs/speed/benchmark/benchmark_ownership.md
index f94ebfe65de..b6a1892bcca 100644
--- a/chromium/docs/speed/benchmark/benchmark_ownership.md
+++ b/chromium/docs/speed/benchmark/benchmark_ownership.md
@@ -14,17 +14,20 @@ There can be multiple owners of a benchmark, for example if there are multiple t
### Telemetry Benchmarks
1. Open [`src/tools/perf/benchmarks/benchmark_name.py`](https://cs.chromium.org/chromium/src/tools/perf/benchmarks/), where `benchmark_name` is the part of the benchmark before the “.”, like `smoothness` in `smoothness.top_25_smooth`.
1. Find the class for the benchmark. It has a `Name` method that should match the full name of the benchmark.
-1. Add a `benchmark.Owner` decorator above the class.
+1. Add a `benchmark.Info` decorator above the class.
Example:
```
- @benchmark.Owner(
+ @benchmark.Info(
emails=['owner1@chromium.org', 'owner2@samsung.com'],
- component=’GoatTeleporter>Performance’)
+ component='GoatTeleporter>Performance',
+ documentation_url='http://link.to/your_benchmark_documentation')
```
- In this example, there are two owners for the benchmark, specified by email, and a bug component (we are working on getting the bug component automatically added to all perf regressions in Q2 2018).
+ In this example, there are two owners for the benchmark, specified by email; a bug component,
+ which will be automatically added to the bug by the perf dashboard; and a link
+ to documentation (which will be added to regression bugs in Q3 2018).
1. Run `tools/perf/generate_perf_data` to update `tools/perf/benchmarks.csv`.
1. Upload the benchmark python file and `benchmarks.csv` to a CL for review. Please add any previous owners to the review.
diff --git a/chromium/docs/speed/benchmark/harnesses/blink_perf.md b/chromium/docs/speed/benchmark/harnesses/blink_perf.md
index 1f6e781b250..3eff372e8b3 100644
--- a/chromium/docs/speed/benchmark/harnesses/blink_perf.md
+++ b/chromium/docs/speed/benchmark/harnesses/blink_perf.md
@@ -160,8 +160,8 @@ viewer won't be supported.
Assuming your current directory is `chromium/src/`, you can run tests with:
-`./tools/perf/run_benchmark blink_perf [--test-path=<path to your tests>]`
+`./tools/perf/run_benchmark run blink_perf [--test-path=<path to your tests>]`
For information about all supported options, run:
-`./tools/perf/run_benchmark blink_perf --help`
+`./tools/perf/run_benchmark run blink_perf --help`
diff --git a/chromium/docs/speed/benchmark/harnesses/loading.md b/chromium/docs/speed/benchmark/harnesses/loading.md
new file mode 100644
index 00000000000..2c24b04e698
--- /dev/null
+++ b/chromium/docs/speed/benchmark/harnesses/loading.md
@@ -0,0 +1,101 @@
+# Loading benchmarks
+
+[TOC]
+
+## Overview
+
+The Telemetry loading benchmarks measure Chrome's loading performance under
+different network and caching conditions.
+
+There are currently three loading benchmarks:
+
+- **`loading.desktop`**: A desktop-only benchmark in which each test case
+  measures the performance of loading a real-world website (e.g. Facebook,
+  CNN, Alibaba).
+- **`loading.mobile`**: A mobile-only benchmark that parallels
+  `loading.desktop`.
+- **`loading.cluster_telemetry`**: A Cluster Telemetry benchmark that uses a
+corpus of the top 10,000 URLs from Alexa. Unlike the other two loading
+benchmarks, which run continuously on the perf waterfall, this benchmark is
+triggered on demand only.
+
+## Running the tests remotely
+
+If you're just trying to gauge whether your change has caused a loading
+regression, you can either run `loading.desktop` and `loading.mobile` through
+a [perf try job](https://chromium.googlesource.com/chromium/src/+/master/docs/speed/perf_trybots.md), or run `loading.cluster_telemetry` through the
+[Cluster Telemetry service](https://ct.skia.org/) (Cluster Telemetry is
+available to Googlers only).
+
+## Running the tests locally
+
+For more in-depth analysis and shorter cycle times, it can be helpful to run the tests locally.
+
+First, [prepare your test device for
+Telemetry](https://chromium.googlesource.com/chromium/src/+/master/docs/speed/benchmark/telemetry_device_setup.md).
+
+Once you've done this, you can start the Telemetry benchmark with:
+
+```
+./tools/perf/run_benchmark <benchmark_name> --browser=<browser>
+```
+
+where `benchmark_name` can be `loading.desktop` or `loading.mobile`.
+
+## Understanding the loading test cases
+
+The loading test cases are divided into groups based on their network traffic
+settings and cache conditions.
+
+All available traffic settings can be found in [traffic_setting.py](https://chromium.googlesource.com/catapult/+/master/telemetry/telemetry/page/traffic_setting.py)
+
+All available caching conditions can be found in [cache_temperature.py](https://chromium.googlesource.com/catapult/+/master/telemetry/telemetry/page/cache_temperature.py)
+
+Test cases of `loading.desktop` and `loading.mobile` are named with their
+corresponding settings. For example, the `DevOpera_cold_3g` test case loads
+`https://dev.opera.com/` with a cold cache and the 3G network setting.
+
+In addition, the pages are tagged with labels describing their content,
+e.g. 'global', 'pwa', etc.
+
+To run only the pages with a given tag, add the `--story-tag-filter=<tag name>`
+flag to the run_benchmark command, as in the example below.
+
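+For example, to run only the pages tagged 'pwa' (one of the tags mentioned
+above; the browser flag is illustrative):
+
+```
+./tools/perf/run_benchmark loading.desktop --browser=release --story-tag-filter=pwa
+```
+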
+## Understanding the loading metrics
+The benchmarks output several different loading metrics. The key ones are:
+ * [Time To First Contentful Paint](https://docs.google.com/document/d/1kKGZO3qlBBVOSZTf-T8BOMETzk3bY15SC-jsMJWv4IE/edit#heading=h.27igk2kctj7o)
+ * [Time To First Meaningful Paint](https://docs.google.com/document/d/1BR94tJdZLsin5poeet0XoTW60M0SjvOJQttKT-JK8HI/edit)
+ * [Time to First CPU
+ Idle](https://docs.google.com/document/d/12UHgAW2r7nWo3R6FBerpYuz9EVOdG1OpPm8YmY4yD0c/edit#)
+
+Besides those key metrics, there are also breakdown metrics that are meant to
+make debugging regressions simpler. These metrics are updated often; for the
+most up-to-date information, you can email progressive-web-metrics@chromium.org
+or chrome-speed-metrics@google.com (Googlers only).
+
+## Adding new loading test cases
+New test cases can be added by modifying
+[loading_desktop.py](https://chromium.googlesource.com/chromium/src/+/master/tools/perf/page_sets/loading_desktop.py)
+or [loading_mobile.py](https://chromium.googlesource.com/chromium/src/+/master/tools/perf/page_sets/loading_mobile.py) page sets.
+
+For example, to add a new test case that loads
+`https://en.wikipedia.org/wiki/Cats_and_the_Internet` on 2G and 3G networks
+with a warm cache, in the `news` group of the `loading.desktop` benchmark, you would write:
+
+```
+self.AddStories(
+ tags=['news'],
+ urls=[('https://en.wikipedia.org/wiki/Cats_and_the_Internet', 'wiki_cats')],
+ cache_temperatures=[cache_temperature_module.WARM],
+ traffic_settings=[traffic_setting_module.REGULAR_2G, traffic_setting_module.REGULAR_3G])
+```
+
+After adding the new page, record it and upload the page archive to cloud
+storage with:
+
+```
+$ ./tools/perf/record_wpr loading_desktop --browser=system \
+ --story-filter=wiki_cats --upload
+```
+
+If the extra story was added to `loading.mobile`, replace `loading_desktop` in
+the command above with `loading_mobile`.
diff --git a/chromium/docs/speed/benchmark/harnesses/power_perf.md b/chromium/docs/speed/benchmark/harnesses/power_perf.md
index c7c4a172d98..04372512840 100644
--- a/chromium/docs/speed/benchmark/harnesses/power_perf.md
+++ b/chromium/docs/speed/benchmark/harnesses/power_perf.md
@@ -4,80 +4,45 @@
## Overview
-The Telemetry power benchmarks use BattOr, a small external power monitor, to collect power measurements while Chrome performs various tasks (a.k.a. user stories).
+The Telemetry power benchmarks measure power indirectly by measuring the CPU time used by Chrome while it performs various tasks (a.k.a. user stories).
-There are currently seven benchmarks that collect power data, grouped together by the type of task during which the power data is collected:
+## List of power metrics
-- **`system_health.common_desktop`**: A desktop-only benchmark in which each page focuses on a single, common way in which users use Chrome (e.g. browsing Facebook photos, shopping on Amazon, searching Google)
-- **`system_health.common_mobile`**: A mobile-only benchmark that parallels `system_health.common_desktop`
-- **`battor.trivial_pages`**: A Mac-only benchmark in which each page focuses on a single, extremely simple behavior (e.g. a blinking cursor, a CSS blur animation)
-- **`battor.steady_state`**: A Mac-only benchmark in which each page focuses on a website that Chrome has exhibited pathological idle behavior in the past
-- **`media.tough_video_cases_tbmv2`**: A desktop-only benchmark in which each page tests a particular media-related scenario (e.g. playing a 1080p, H264 video with sound)
-- **`media.android.tough_video_cases_tbmv2`**: A mobile-only benchmark that parallels `media.tough_video_cases_tbmv2`
-- **`power.idle_platform`**: A benchmark that sits idle without starting Chrome for various lengths of time. Used as a debugging benchmark to monitor machine background noise.
-
-Note that these benchmarks are in the process of being consolidated and that there will likely be fewer, larger power benchmarks in the near future.
-
-The legacy power benchmarks consist of:
-
-- **`power.typical_10_mobile`**, which visits ten popular sites and uses Android-specific APIs to measure approximately how much power was consumed. This can't be deleted because it's still used by the Android System Health Council to assess whether Chrome Android is fit for release on hardware configurations for which BattOrs are not yet available.
-
-## Running the tests remotely
-
-If you're just trying to gauge whether your change has caused a power regression, you can do so by [running a benchmark remotely via a perf try job](https://chromium.googlesource.com/chromium/src/+/master/docs/speed/perf_trybots.md).
-
-When you do this, be sure to use a configuration that's equipped with BattOrs:
-
-- `android_nexus5X`
-- `android-webview-arm64-aosp`
-- `mac-retina`
-- `mac-10-11`
-- `winx64-high-dpi`
-
-If you're unsure of which benchmark to choose, `system_health.common_[desktop/mobile]` is a safe, broad choice.
+### `cpu_time_percentage_avg`
+This metric measures the average number of cores that Chrome used over the duration of the trace.
-## Running the tests locally
-
-For more in-depth analysis and shorter cycle times, it can be helpful to run the tests locally. Because the power benchmarks rely on having a BattOr, you'll need to get one before you can do so. If you're a Googler, you can ask around (many Chrome offices already have a BattOr in them) or request one at [go/battor-request-form](http://go/battor-request-form). If you're external to Google, you can contact the BattOr's manufacturer at <sales@mellowlabs.com>.
-
-Once you have a BattOr, follow the instructions in the [BattOr laptop setup guide](https://docs.google.com/document/d/1UsHc990NRO2MEm5A3b9oRk9o7j7KZ1qftOrJyV1Tr2c/edit) to hook it up to your laptop. If you're using a phone with a BattOr, you'll need to run one USB to micro-USB cable from the host computer triggering the Telemetry tests to the BattOr and another from the host computer to the phone.
-
-Once you've done this, you can start the Telemetry benchmark with:
+This metric is enabled by adding `'cpuTimeMetric'` to the list of TBM2 metrics in the benchmark's Python class:
+```python
+options.SetTimelineBasedMetrics(['cpuTimeMetric', 'memoryMetric'])
```
-./tools/perf/run_benchmark <benchmark_name> --browser=<browser>
-```
-
-where `benchmark_name` is one of the above benchmark names.
-## Understanding power metrics
+Additionally, the `toplevel` trace category must be enabled for this metric to function correctly because it ensures that a trace span is active whenever Chrome is doing work:
-To understand our power metrics, it's important to understand the distinction between power and energy. *Energy* is what makes computers run and is measured in Joules, whereas *power* is the rate at which that energy is used and is measured in Joules per second.
+```python
+category_filter = chrome_trace_category_filter.ChromeTraceCategoryFilter(filter_string='toplevel')
+```
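+
+Putting the two together, a minimal sketch of how a benchmark class might wire
+this up (the class name is hypothetical; real benchmarks live under
+`tools/perf/benchmarks/`):
+
+```python
+from core import perf_benchmark
+from telemetry.timeline import chrome_trace_category_filter
+from telemetry.web_perf import timeline_based_measurement
+
+
+class ExamplePowerBenchmark(perf_benchmark.PerfBenchmark):
+  """Hypothetical benchmark that reports Chrome's average CPU usage."""
+
+  def CreateCoreTimelineBasedMeasurementOptions(self):
+    # Enable the 'toplevel' category so a trace span is active whenever
+    # Chrome is doing work, then turn on the CPU time metric.
+    category_filter = chrome_trace_category_filter.ChromeTraceCategoryFilter(
+        filter_string='toplevel')
+    options = timeline_based_measurement.Options(category_filter)
+    options.SetTimelineBasedMetrics(['cpuTimeMetric'])
+    return options
+```
+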
-Some of our power metrics measure energy, whereas others measure power. Specifically:
+## List of power benchmarks
-- We measure *energy* when the user cares about whether the task is completed (e.g. "energy required to load a page", "energy required to responsd to a mouse click").
-- We measure *power* when the user cares about the power required to continue performing an action (e.g. "power while scrolling", "power while playing a video animation").
+The primary power benchmarks are:
-The full list of our metrics is as follows:
+- **`system_health.common_desktop`**: A desktop-only benchmark in which each page focuses on a single, common way in which users use Chrome (e.g. browsing Facebook photos, shopping on Amazon, searching Google)
+- **`system_health.common_mobile`**: A mobile-only benchmark that parallels `system_health.common_desktop`
+- **`power.desktop`**: A desktop-only benchmark made up of two types of pages:
+ - Pages focusing on a single, extremely simple behavior (e.g. a blinking cursor, a CSS blur animation)
+ - Pages on which Chrome has exhibited pathological idle behavior in the past
+- **`power.typical_10_mobile`**: A mobile-only benchmark which visits ten popular sites and uses Android-specific APIs to measure approximately how much power is consumed. This benchmark is necessary to provide data to the Android System Health Council to assess whether Chrome Android is fit for release
+- **`media.desktop`**: A desktop-only benchmark in which each page tests a particular media-related scenario (e.g. playing a 1080p, H264 video with sound)
+- **`media.mobile`**: A mobile-only benchmark that parallels `media.desktop`
-### Energy metrics
-- **`load:energy_sum`**: Total energy used in between page navigations and first meaningful paint on all navigations in the story.
-- **`scroll_response:energy_sum`**: Total energy used to respond to all scroll requests in the story.
-- **`tap_response:energy_sum`**: Total energy used to respond to all taps in the story.
-- **`keyboard_response:energy_sum`**: Total energy used to respond to all key entries in the story.
+[This spreadsheet](https://docs.google.com/spreadsheets/d/1xaAo0_SU3iDfGdqDJZX_jRV0QtkufwHUKH3kQKF3YQs/edit#gid=0) lists the owner for each benchmark.
-### Power metrics
-- **`story:power_avg`**: Average power over the entire story.
-- **`css_animation:power_avg`**: Average power over all CSS animations in the story.
-- **`scroll_animation:power_avg`**: Average power over all scroll animations in the story.
-- **`touch_animation:power_avg`**: Average power over all touch animations (e.g. finger drags) in the story.
-- **`video_animation:power_avg`**: Average power over all videos played in the story.
-- **`webgl_animation:power_avg`**: Average power over all WebGL animations in the story.
-- **`idle:power_avg`**: Average power over all idle periods in the story.
+## Adding new power test cases
+To add a new test case to a power benchmark, contact the owner of whichever benchmark above sounds like the best fit.
-### Other metrics
-- **`cpu_time_percentage_avg`**: Average CPU load over the entire story.
+## Running the benchmarks locally
+See [this page](https://github.com/catapult-project/catapult/blob/master/telemetry/docs/run_benchmarks_locally.md) for instructions on how to run the benchmarks locally.
-## Adding new power test cases
-We're not currently accepting new power stories until we've consolidated the existing ones.
+## Seeing power benchmark results
+Enter the platform, benchmark, and metric you care about on [this page](https://chromeperf.appspot.com/report) to see how the power metrics have moved over time.
diff --git a/chromium/docs/speed/benchmark/harnesses/rendering.md b/chromium/docs/speed/benchmark/harnesses/rendering.md
new file mode 100644
index 00000000000..a2317ae57e4
--- /dev/null
+++ b/chromium/docs/speed/benchmark/harnesses/rendering.md
@@ -0,0 +1,94 @@
+# Rendering Benchmarks
+
+This document provides an overview of the benchmarks used to monitor Chrome’s graphics performance. It includes information on what benchmarks are available, how to run them, how to interpret their results, and how to add more tests to the benchmarks.
+
+[TOC]
+
+## Glossary
+
+- **Page** (or story): A recording of a website, which is associated with a set of actions (ex. scrolling)
+- **Page Set** (or story set): A collection of different pages, organized by some shared characteristic (ex. top real world mobile sites)
+- **Metric**: A process that describes how to collect meaningful data from a Chrome trace and calculate results (ex. frame time)
+- **Benchmark**: A combination of a page set and multiple metrics
+- **Telemetry**: The [framework](https://github.com/catapult-project/catapult/blob/master/telemetry/README.md) used for Chrome performance testing, which allows benchmarks to be run and metrics to be collected
+
+## Overview
+
+The Telemetry rendering benchmarks measure Chrome’s rendering performance in different scenarios.
+
+There are currently two rendering benchmarks:
+
+- `rendering.desktop`: A desktop-only benchmark that measures performance on both real world websites and special cases (ex. pages that are difficult to zoom)
+- `rendering.mobile`: A mobile-only equivalent of `rendering.desktop`
+
+Note: Some pages are used for `rendering.desktop` but not `rendering.mobile`, and vice versa. This is because some pages are only meant to measure behavior on one platform, for instance dragging on desktop. This is indicated with the `SUPPORTED_PLATFORMS` attribute in the page class.
+
+These benchmarks are run on the [Chromium Perf Waterfall](https://ci.chromium.org/p/chromium/g/chromium.perf/console), with results reported on the [Chrome Performance Dashboard](https://chromeperf.appspot.com/report).
+
+## What are the rendering metrics
+
+Some rendering metrics are [written in Python](https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/web_perf/metrics/smoothness.py) and others are written [in JavaScript](https://github.com/catapult-project/catapult/blob/master/tracing/tracing/metrics/rendering_metric.html). The list of all metrics and their meanings should be documented in the files they are defined in. We are in the process of porting all metrics to JavaScript, which means [rendering_metric.html](https://github.com/catapult-project/catapult/blob/master/tracing/tracing/metrics/rendering_metric.html) will eventually contain all of them.
+
+Important rendering metrics include:
+- `mean_frame_time`: the amount of time it takes for a frame to be rendered
+- `mean_input_event_latency`: the time from when an input event is created to when the resulting frame is swap-buffered (from [here](https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/web_perf/metrics/smoothness.py))
+
+## How to run rendering benchmarks on local devices
+
+First, set up your device by following the instructions [here](https://chromium.googlesource.com/chromium/src/+/master/docs/speed/benchmark/telemetry_device_setup.md). You can then run telemetry benchmarks locally using:
+
+`./tools/perf/run_benchmark <benchmark_name> --browser=<browser>`
+
+For `benchmark_name`, use either `rendering.desktop` or `rendering.mobile`.
+
+As the pages in the rendering page sets were merged from a variety of previous page sets, they have corresponding tags. To run the benchmark only for pages of a certain tag, add this flag:
+
+`--story-tag-filter=<tag name>`
+
+For example, if the old benchmark was `smoothness.tough_scrolling_cases`, you would now use `--story-tag-filter=tough_scrolling` for the rendering benchmarks. A list of all rendering tags can be found in [story_tags.py](https://cs.chromium.org/chromium/src/tools/perf/page_sets/rendering/story_tags.py?dr&g=0). You can also find out which tags a page uses by looking at the `TAGS` attribute of its class. Additionally, these same tags can be used to filter the metric results in the generated results.html file.
+
+Other useful options for the command are:
+
+- `--pageset-repeat [n]`: override the default number of repetitions
+- `--reset-results`: clear results from any previous benchmark runs in the results.html file.
+- `--results-label [label]`: give meaningful names to your benchmark runs, to make it easier to compare them
+
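+For example, combining these options (flag values are illustrative):
+
+`./tools/perf/run_benchmark rendering.desktop --browser=release --story-tag-filter=tough_scrolling --pageset-repeat=3 --results-label=with_patch`
+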
+## How to run rendering benchmarks on try bots
+
+For more consistent results and to identify whether your change has resulted in a rendering regression, you can run the rendering benchmarks using a [perf try job](https://chromium.googlesource.com/chromium/src/+/master/docs/speed/perf_trybots.md). In order to do this, you need to first upload a CL, which allows results to be generated with and without your patch.
+
+## How to handle regressions
+
+If your changes have resulted in a regression in a metric that is monitored by [perf alerts](https://chromeperf.appspot.com/alerts?sortby=end_revision&sortdirection=down), you will be assigned to a bug. This will contain information about the specific metric and how much it regressed, as well as a Pinpoint link that will help you investigate further. For instance, you will be able to obtain traces from the try bot runs. This [link](https://chromium.googlesource.com/chromium/src/+/master/docs/speed/addressing_performance_regressions.md) contains detailed steps on how to deal with regressions. Rendering metrics use trace events logged under the `benchmark` and `toplevel` trace categories.
+
+If you already have a trace and want to debug the metric computation part, you can just run the metric:
+`tracing/bin/run_metric <path-to-trace-file> renderingMetric`
+
+## How to add more pages
+
+New rendering pages should be added to the [./tools/perf/page_sets/rendering](https://cs.chromium.org/chromium/src/tools/perf/page_sets/rendering/?dr&g=0) folder.
+
+Pages inherit from the [RenderingStory](https://cs.chromium.org/chromium/src/tools/perf/page_sets/rendering/rendering_story.py?dr&g=0) class. If adding a group of new pages, create an abstract class with the following attributes:
+
+- `ABSTRACT_STORY = True`
+- `TAGS`: a list of tags, which can be added to [story_tags.py](https://cs.chromium.org/chromium/src/tools/perf/page_sets/rendering/story_tags.py?dr&g=0) if necessary
+- `SUPPORTED_PLATFORMS` (optional): if the page should only be mobile or desktop
+
+Child classes should specify these attributes:
+- `BASE_NAME`: name of the page
+ - Use the “new_page_name” format
+ - If the page is a real-world website and should be periodically refreshed, add “_year” to the end of the page name and update the value when a new recording is uploaded
+ - Ex. google_web_search_2018
+- `URL`: url of the page
+
+All pages in the rendering benchmark need to use [RenderingSharedState](https://cs.chromium.org/chromium/src/tools/perf/page_sets/rendering/rendering_shared_state.py?dr&g=0) as the shared_page_state_class, since this has to be consistent across pages in a page set. Individual pages can also specify `extra_browser_args`, in order to set specific flags.
+
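+For instance, a minimal sketch of a new page class (the class name, `BASE_NAME`,
+URL, and tag constant are hypothetical):
+
+```python
+from page_sets.rendering import rendering_story
+from page_sets.rendering import story_tags
+
+
+class ExampleScrollingPage(rendering_story.RenderingStory):
+  """Hypothetical real-world scrolling page."""
+  BASE_NAME = 'example_scrolling_page_2018'  # '_year' suffix: refreshed yearly
+  URL = 'https://example.com/'
+  TAGS = [story_tags.TOUGH_SCROLLING]  # assumed to be defined in story_tags.py
+```
+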
+After adding the page, record it and upload it to cloud storage using:
+
+`./tools/perf/record_wpr rendering_desktop --browser=system --story-tag-filter=<tag name> --upload`
+
+This will modify the [data/rendering_desktop.json](https://cs.chromium.org/chromium/src/tools/perf/page_sets/data/rendering_desktop.json?type=cs&q=rendering_deskt&g=0&l=1) or [data/rendering_mobile.json](https://cs.chromium.org/chromium/src/tools/perf/page_sets/data/rendering_mobile.json?type=cs&g=0) files and generate .sha1 files, which should be included in the CL.
+
+### Merging existing pages
+
+If more pages need to be merged into the rendering page sets, please see [this guide](https://docs.google.com/document/d/19vUZCnJ0_5pfcwotl0ABTFGFIBc_CckNIyfE7Cs7I3o/edit#bookmark=id.w3jf2ip73aat) on how to do so.
diff --git a/chromium/docs/speed/bot_health_sheriffing/how_to_access_test_logs.md b/chromium/docs/speed/bot_health_sheriffing/how_to_access_test_logs.md
index 5cb817c89f9..89ee025c3ba 100644
--- a/chromium/docs/speed/bot_health_sheriffing/how_to_access_test_logs.md
+++ b/chromium/docs/speed/bot_health_sheriffing/how_to_access_test_logs.md
@@ -34,6 +34,30 @@ After doing this, search for your benchmark's name (in this case, "v8.browsing_d
![Sheriff-o-matic choose shard #0 failed link from test steps](images/som_test_steps_shard_0.png)
+### Accessing the log for the new perf recipe
+
+Currently, linux-perf and mac-10_12_laptop_low_end-perf are running the new perf recipe, and their logs are accessed slightly differently.
+
+#### Failing Story Logs
+Sheriff-o-matic now links to failing story logs when present. Click on the logs
+link to download the failing story log.
+![Sheriff-o-matic failing story log link](images/som_new_recipe_story_log.png)
+
+
+#### Failing Benchmark Logs
+First, navigate to the failing build through the Sheriff-o-matic entry by clicking on the builder this step failed on.
+
+![Sheriff-o-matic click on builder](images/som_new_recipe_choose_builder.png)
+
+This new screen lists the most recent builds for this builder. To find the build you are interested in, drill into each build, starting with the most recent, until you locate the specific failing test. Ctrl-F for “performance_test_suite” or scroll down to the test entry. The list of failed tests appears right on the performance_test_suite step:
+
+![Sheriff-o-matic identify the list of failing tests](images/som_new_recipe_identify_failed_tests.png)
+
+Once you have identified the build that has your failing test, click on the “Benchmark logs” link and Ctrl-F for your failing benchmark.  This link provides a logdog stream with all of the logs for that particular benchmark.
+
+![Sheriff-o-matic find benchmark logs](images/som_new_recipe_benchmark_logs_link.png)
+
+
## Navigating log files
### Identifying why a story failed
diff --git a/chromium/docs/speed/bot_health_sheriffing/images/flakiness_dashboard_new_recipe.png b/chromium/docs/speed/bot_health_sheriffing/images/flakiness_dashboard_new_recipe.png
new file mode 100644
index 00000000000..6e2985e1e8f
--- /dev/null
+++ b/chromium/docs/speed/bot_health_sheriffing/images/flakiness_dashboard_new_recipe.png
Binary files differ
diff --git a/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_benchmark_logs_link.png b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_benchmark_logs_link.png
new file mode 100644
index 00000000000..a7b0997ab46
--- /dev/null
+++ b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_benchmark_logs_link.png
Binary files differ
diff --git a/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_choose_builder.png b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_choose_builder.png
new file mode 100644
index 00000000000..f8417c9fdd5
--- /dev/null
+++ b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_choose_builder.png
Binary files differ
diff --git a/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_identify_failed_tests.png b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_identify_failed_tests.png
new file mode 100644
index 00000000000..580e13f77fe
--- /dev/null
+++ b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_identify_failed_tests.png
Binary files differ
diff --git a/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_story_log.png b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_story_log.png
new file mode 100644
index 00000000000..575a426743e
--- /dev/null
+++ b/chromium/docs/speed/bot_health_sheriffing/images/som_new_recipe_story_log.png
Binary files differ
diff --git a/chromium/docs/speed/bot_health_sheriffing/main.md b/chromium/docs/speed/bot_health_sheriffing/main.md
index 624e151d881..f327ea85e2a 100644
--- a/chromium/docs/speed/bot_health_sheriffing/main.md
+++ b/chromium/docs/speed/bot_health_sheriffing/main.md
@@ -32,7 +32,7 @@ The sheriff should *not* feel like responsible for investigating hard problems.
Incoming failures are shown in [Sheriff-o-matic](https://sheriff-o-matic.appspot.com/chromium.perf), which acts as a task management system for bot health sheriffs. Failures are divided into three groups on the dashboard:
-* **Infra failures** show general infrastructure problems that are affecting benchmarks.
+* **Infra failures** show general infrastructure problems that are affecting benchmarks. Besides what surfaces in Sheriff-o-matic, we also need to check for down bots in the lame duck pool. Please file a ticket for any bots you see in [this list](https://chrome-swarming.appspot.com/botlist?c=id&c=os&c=task&c=status&c=os&c=task&c=status&c=pool&f=status%3Adead&f=pool%3Achrome.tests.perf&l=100&q=pool%3Achrome.tests.perf&s=id%3Aasc) or [this list for webview](https://chrome-swarming.appspot.com/botlist?c=id&c=os&c=task&c=status&c=os&c=task&c=status&c=pool&f=status%3Adead&f=pool%3Achrome.tests.perf-webview&l=100&q=pool%3Achrome.tests.perf&s=id%3Aasc), as they will not show up in Sheriff-o-matic.
* **Consistent failures** show benchmarks that have been failing for a while.
diff --git a/chromium/docs/speed/bot_health_sheriffing/what_test_is_failing.md b/chromium/docs/speed/bot_health_sheriffing/what_test_is_failing.md
index 9eb5b192722..91b15bece4a 100644
--- a/chromium/docs/speed/bot_health_sheriffing/what_test_is_failing.md
+++ b/chromium/docs/speed/bot_health_sheriffing/what_test_is_failing.md
@@ -6,6 +6,11 @@ The easiest way to identify these is to use the [Flakiness dashboard](https://te
![The flakiness dashboard](images/flakiness_dashboard.png)
+If the bot is running the new performance_test_suite, then all stories will be
+listed under the test type 'performance_test_suite' and the associated builder.
+
+![The flakiness dashboard new recipe](images/flakiness_dashboard_new_recipe.png)
+
Each row represents a particular story and each column represents a recent run, listed with the most recent run on the left. If the cell is green, then the story passed; if it's red, then it failed. Only stories that have failed at least once will be listed. You can click on a particular cell to see more information like revision ranges (useful for launching bisects) and logs.
With this view, you can easily see how often a given story is failing. Usually, any story that appears to be failing in over 20% of recent runs should be disabled.