Diffstat (limited to 'chromium/mojo/docs')
-rw-r--r-- | chromium/mojo/docs/mojolpm.md | 471 |
1 files changed, 471 insertions, 0 deletions
diff --git a/chromium/mojo/docs/mojolpm.md b/chromium/mojo/docs/mojolpm.md
new file mode 100644
index 00000000000..a3b1c8e254b
--- /dev/null
+++ b/chromium/mojo/docs/mojolpm.md
@@ -0,0 +1,471 @@
+# Getting started with MojoLPM
+
+*** note
+**Note:** Using MojoLPM to fuzz your Mojo interfaces is intended to be simple,
+but there are edge-cases that may require a very detailed understanding of the
+Mojo implementation to fix. If you run into problems that you can't understand
+readily, send an email to [markbrand@google.com] and cc `fuzzing@chromium.org`
+and we'll try and help.
+
+**Prerequisites:** Knowledge of [libfuzzer] and basic understanding
+of [Protocol Buffers] and [libprotobuf-mutator]. Basic understanding of
+[testing in Chromium].
+***
+
+This document will walk you through:
+* An overview of MojoLPM and what it's used for.
+* Adding a fuzzer to an existing Mojo interface using MojoLPM.
+
+[TOC]
+
+## Overview of MojoLPM
+
+MojoLPM is a toolchain for automatically generating structure-aware fuzzers for
+Mojo interfaces using libprotobuf-mutator as the fuzzing engine.
+
+This tool works by using the existing "grammar" for the interface provided by
+the .mojom files, and translating that into a Protocol Buffer format that can be
+fuzzed by libprotobuf-mutator. These protocol buffers are then interpreted by
+a generated runtime as a sequence of mojo method calls on the targeted
+interface.
+
+The intention is that using these should be as simple as plugging the generated
+code in to the existing unittests for those interfaces - so if you've already
+implemented the necessary mocks to unittest your code, the majority of the work
+needed to get quite effective fuzzing of your interfaces is already complete!
+
+## Choose the Mojo interface(s) to fuzz
+
+If you're a developer looking to add fuzzing support for an interface that
+you're developing, then this should be very easy for you!
+
+If not, then a good starting point is to search for [interfaces] in codesearch.
+The most interesting interfaces from a security perspective are those which are
+implemented in the browser process and exposed to the renderer process, but
+there isn't a very simple way to enumerate these, so you may need to look
+through some of the source code to find an interesting one.
+
+For the rest of this guide, we'll write a new fuzzer for
+`blink.mojom.CodeCacheHost`, which is defined in
+`third_party/blink/public/mojom/loader/code_cache.mojom`.
+
+We then need to find the relevant GN build target for this mojo interface so
+that we know how to refer to it later - in this case that is
+`//third_party/blink/public/mojom:mojom_platform`.
+
+## Find the implementations of the interfaces
+
+If you are developing these interfaces, then you already know where to find the
+implementations.
+
+Otherwise a good starting point is to search for references to
+"public blink::mojom::CodeCacheHost". Usually there is only a single
+implementation of a given Mojo interface (there are a few exceptions where the
+interface abstracts platform specific details, but this is less common). This
+leads us to `content/browser/renderer_host/code_cache_host_impl.h` and
+`CodeCacheHostImpl`.
+
+## Find the unittest for the implementation
+
+Unfortunately, it doesn't look like `CodeCacheHostImpl` has a unittest, so we'll
+have to go through the process of understanding how to create a valid instance
+ourselves in order to fuzz this interface.
+
+Since this interface runs in the Browser process, and is part of `/content`,
+we're going to create our new fuzzer in `/content/test/fuzzer`.
+
+## Add our testcase proto
+
+First we'll add a proto source file, `code_cache_host_mojolpm_fuzzer.proto`,
+which is going to define the structure of our testcases. This is basically
+boilerplate, but it allows creating fuzzers which interact with multiple Mojo
+interfaces to uncover more complex issues. For our case, this will be a simple
+file:
+
+```
+syntax = "proto2";
+
+package content.fuzzing.code_cache_host.proto;
+
+import "third_party/blink/public/mojom/loader/code_cache.mojom.mojolpm.proto";
+
+message NewCodeCacheHost {
+  required uint32 id = 1;
+}
+
+message RunUntilIdle {
+  enum ThreadId {
+    IO = 0;
+    UI = 1;
+  }
+
+  required ThreadId id = 1;
+}
+
+message Action {
+  oneof action {
+    NewCodeCacheHost new_code_cache_host = 1;
+    RunUntilIdle run_until_idle = 2;
+    mojolpm.blink.mojom.CodeCacheHost.RemoteMethodCall code_cache_host_call = 3;
+  }
+}
+
+message Sequence {
+  repeated uint32 action_indexes = 1 [packed=true];
+}
+
+message Testcase {
+  repeated Action actions = 1;
+  repeated Sequence sequences = 2;
+  repeated uint32 sequence_indexes = 3 [packed=true];
+}
+```
+
+This specifies all of the actions that the fuzzer will be able to take - it
+will be able to create a new `CodeCacheHost` instance, perform sequences of
+interface calls on those instances, and wait for various threads to be idle.
+
+In order to build this proto file, we'll need to copy it into the out/ directory
+so that it can reference the proto files generated by MojoLPM - this will be
+handled for us by the `mojolpm_fuzzer_test` build rule.
+
+## Add our fuzzer source
+
+Now we're ready to create the fuzzer c++ source file,
+`code_cache_host_mojolpm_fuzzer.cc` and the fuzzer build target. This
+target is going to depend on both our proto file, and on the c++ source file.
+Most of the necessary dependencies will be handled for us, but we do still need
+to add some directly.
+
+Note especially the dependency on `mojom_platform_mojolpm` in blink, this is an
+autogenerated target where the target containing the generated fuzzer protocol
+buffer descriptions will be the name of the mojom target with `_mojolpm`
+appended.
+
+```
+mojolpm_fuzzer_test("code_cache_host_mojolpm_fuzzer") {
+  sources = [
+    "code_cache_host_mojolpm_fuzzer.cc"
+  ]
+
+  proto_source = "code_cache_host_mojolpm_fuzzer.proto"
+
+  deps = [
+    "//base/test:test_support",
+    "//content/browser:for_content_tests",
+    "//content/public/browser:browser_sources",
+    "//content/test:test_support",
+    "//services/network:test_support",
+    "//storage/browser:test_support",
+  ]
+
+  proto_deps = [
+    "//third_party/blink/public/mojom:mojom_platform_mojolpm",
+  ]
+}
+```
+
+Now, the minimal source code to load our testcases:
+
+```c++
+// Copyright 2020 The Chromium Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file.
+
+#include <stdint.h>
+#include <utility>
+
+#include "code_cache_host_mojolpm_fuzzer.pb.h"
+#include "mojo/core/embedder/embedder.h"
+#include "third_party/blink/public/mojom/loader/code_cache.mojom-mojolpm.h"
+#include "third_party/libprotobuf-mutator/src/src/libfuzzer/libfuzzer_macro.h"
+
+DEFINE_BINARY_PROTO_FUZZER(
+    const content::fuzzing::code_cache_host::proto::Testcase& testcase) {
+}
+```
+
+You should now be able to build and run this fuzzer (it, of course, won't do
+very much) to check that everything is lined up right so far.
+
+## Handle global process setup
+
+Now we need to add some basic setup code so that our process has something that
+mostly resembles a normal Browser process; if you look in the file this is
+`CodeCacheHostFuzzerEnvironment`, which adds a global environment instance that
+will handle setting up this basic environment, which will be reused for all of
+our testcases, since starting threads is expensive and slow.
+
+## Handle per-testcase setup
+
+We next need to handle the necessary setup to instantiate `CodeCacheHostImpl`,
+so that we can actually run the testcases. At this point, we realise that it's
+likely that we want to be able to have multiple `CodeCacheHostImpl`'s with
+different render_process_ids and different backing origins, so we need to modify
+our proto file to reflect this:
+
+```
+message NewCodeCacheHost {
+  enum OriginId {
+    ORIGIN_A = 0;
+    ORIGIN_B = 1;
+    ORIGIN_OPAQUE = 2;
+    ORIGIN_EMPTY = 3;
+  }
+
+  required uint32 id = 1;
+  required uint32 render_process_id = 2;
+  required OriginId origin_id = 3;
+}
+```
+
+Note that we're using an enum to represent the origin, rather than a string;
+it's unlikely that the true value of the origin is going to be important, so
+we've instead chosen a few select values based on the cases mentioned in the
+source.
+
+The first thing that we need to do is set up the basic Browser process
+environment; this is what `ContentFuzzerEnvironment` is doing - this has a basic
+setup suitable for fuzzing interfaces in `/content`. A few things to be careful
+of are that we need to make sure that `mojo::core::Init()` is called (only once)
+and we probably want as much freedom as possible in terms of scheduling, so we
+want to use slightly different threading options than the average unittest. This
+is a singleton type that will live for the entire duration of the fuzzer process
+so we don't want to be holding any testcase-specific data here.
+
+The next thing that we need to do is to figure out the basic setup needed to
+instantiate the interface we're interested in. Looking at the constructor for
+`CodeCacheHostImpl` we need three things: a valid `render_process_id`, an
+instance of `CacheStorageContextImpl` and an instance of
+`GeneratedCodeCacheContext`. `CodeCacheHostFuzzerContext` is our container for
+these per-testcase instances, and will handle creating and binding the instances
+of the Mojo interfaces that we're going to fuzz. The most important thing to be
+careful of here is that everything happens on the correct thread/sequence. Many
+Browser-process objects have specific expectations, and will end up with very
+different behaviour if they are created or used from the wrong context.
+
+## Integrate with the generated MojoLPM fuzzer code
+
+Finally, we need to do a little bit more plumbing, to rig up this infrastructure
+that we've built together with the autogenerated code that MojoLPM gives us to
+interpret and run our testcases. This is the `CodeCacheHostTestcase`, and the
+part where the magic happens is here:
+
+```c++
+void CodeCacheHostTestcase::NextAction() {
+  if (next_idx_ < testcase_.sequence_indexes_size()) {
+    auto sequence_idx = testcase_.sequence_indexes(next_idx_++);
+    const auto& sequence =
+        testcase_.sequences(sequence_idx % testcase_.sequences_size());
+    for (auto action_idx : sequence.action_indexes()) {
+      if (!testcase_.actions_size() || ++action_count_ > MAX_ACTION_COUNT) {
+        return;
+      }
+      const auto& action =
+          testcase_.actions(action_idx % testcase_.actions_size());
+      switch (action.action_case()) {
+        case content::fuzzing::code_cache_host::proto::Action::kNewCodeCacheHost: {
+          cch_context_.AddCodeCacheHost(
+              action.new_code_cache_host().id(),
+              action.new_code_cache_host().render_process_id(),
+              action.new_code_cache_host().origin_id());
+        } break;
+
+        case content::fuzzing::code_cache_host::proto::Action::kRunUntilIdle: {
+          if (action.run_until_idle().id()) {
+            content::RunUIThreadUntilIdle();
+          } else {
+            content::RunIOThreadUntilIdle();
+          }
+        } break;
+
+        case content::fuzzing::code_cache_host::proto::Action::kCodeCacheHostCall: {
+          mojolpm::HandleRemoteMethodCall(action.code_cache_host_call());
+        } break;
+
+        case content::fuzzing::code_cache_host::proto::Action::ACTION_NOT_SET:
+          break;
+      }
+    }
+  }
+}
+```
+
+The key line here in integration with MojoLPM is the last case,
+`kCodeCacheHostCall`, where we're asking MojoLPM to treat this incoming proto
+entry as a call to a method on the `CodeCacheHost` interface.
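The way `NextAction` turns `sequence_indexes` and `action_indexes` into concrete actions can be illustrated with a small standalone sketch. This is not the MojoLPM code: `FakeTestcase`, `ResolveActions` and `kMaxActionCount` are hypothetical names, the proto messages are replaced with plain vectors so that it builds without protobuf, and it resolves every sequence in one pass, whereas the fuzzer above handles one sequence per `NextAction` call. The value of the action limit is an assumption as well.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the generated `Testcase` proto: an action is
// reduced to a plain int so that the sketch builds without protobuf.
struct FakeTestcase {
  std::vector<int> actions;
  std::vector<std::vector<uint32_t>> sequences;  // action_indexes per sequence
  std::vector<uint32_t> sequence_indexes;
};

// Assumed cap on actions per testcase; the real MAX_ACTION_COUNT may differ.
constexpr std::size_t kMaxActionCount = 512;

// Returns the actions in the order they would be run. Indexes wrap with
// modulo, so any mutated index still selects an existing entry.
std::vector<int> ResolveActions(const FakeTestcase& testcase) {
  std::vector<int> resolved;
  // Guard against modulo by zero when either list is empty.
  if (testcase.actions.empty() || testcase.sequences.empty()) {
    return resolved;
  }
  for (uint32_t sequence_idx : testcase.sequence_indexes) {
    const std::vector<uint32_t>& sequence =
        testcase.sequences[sequence_idx % testcase.sequences.size()];
    for (uint32_t action_idx : sequence) {
      if (resolved.size() >= kMaxActionCount) {
        return resolved;
      }
      resolved.push_back(
          testcase.actions[action_idx % testcase.actions.size()]);
    }
  }
  return resolved;
}
```

With two sequences, `sequence_indexes` of 0, 3, 1 select sequences 0, 1, 1, because 3 wraps around to 1. Wrapping rather than rejecting out-of-range indexes means that a mutated index still refers to a valid entry.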
+
+There's just a little bit more boilerplate in the bottom of the file to tidy up
+concurrency loose ends, making sure that the fuzzer components are all running
+on the correct threads; those are more-or-less common to any fuzzer using
+MojoLPM.
+
+## Test it!
+
+Make a corpus directory and fire up your shiny new fuzzer!
+
+```
+ ~/chromium/src% out/Default/code_cache_host_mojolpm_fuzzer /dev/shm/corpus
+INFO: Seed: 3273881842
+INFO: Loaded 1 modules (1121912 inline 8-bit counters): 1121912 [0x559151a1aea8, 0x559151b2cd20),
+INFO: Loaded 1 PC tables (1121912 PCs): 1121912 [0x559151b2cd20,0x559152c4b4a0),
+INFO: 146 files found in /dev/shm/corpus
+INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
+INFO: seed corpus: files: 146 min: 2b max: 268b total: 8548b rss: 88Mb
+#147 INITED cov: 4633 ft: 10500 corp: 138/8041b exec/s: 0 rss: 91Mb
+#152 NEW cov: 4633 ft: 10501 corp: 139/8139b lim: 4096 exec/s: 0 rss: 91Mb L: 98/268 MS: 8 Custom-ChangeByte-Custom-EraseBytes-Custom-ShuffleBytes-Custom-Custom-
+#154 NEW cov: 4634 ft: 10510 corp: 140/8262b lim: 4096 exec/s: 0 rss: 91Mb L: 123/268 MS: 3 CustomCrossOver-ChangeBit-Custom-
+#157 NEW cov: 4634 ft: 10512 corp: 141/8384b lim: 4096 exec/s: 0 rss: 91Mb L: 122/268 MS: 3 CustomCrossOver-Custom-CustomCrossOver-
+#158 NEW cov: 4634 ft: 10514 corp: 142/8498b lim: 4096 exec/s: 0 rss: 91Mb L: 114/268 MS: 1 CustomCrossOver-
+#159 NEW cov: 4634 ft: 10517 corp: 143/8601b lim: 4096 exec/s: 0 rss: 91Mb L: 103/268 MS: 1 Custom-
+#160 NEW cov: 4634 ft: 10526 corp: 144/8633b lim: 4096 exec/s: 0 rss: 91Mb L: 32/268 MS: 1 Custom-
+#164 NEW cov: 4634 ft: 10528 corp: 145/8851b lim: 4096 exec/s: 0 rss: 91Mb L: 218/268 MS: 4 CustomCrossOver-Custom-CustomCrossOver-Custom-
+```
+
+## Wait for it...
+
+Let the fuzzer run for a while, and keep checking in periodically in case it's
+fallen over. It's likely you'll have made a few mistakes somewhere along the way
+but hopefully soon you'll have the fuzzer running 'clean' for a few hours.
+
+If your coverage isn't going up at all, then you've probably made a mistake and
+it likely isn't managing to actually interact with the interface you're trying
+to fuzz - try using the code coverage output from the next step to debug what's
+going wrong.
+
+## (Optional) Run coverage
+
+In many cases it's useful to check the code coverage to see if we can benefit
+from adding some manual testcases to get deeper coverage. For this example I
+used the following command:
+
+```
+python tools/code_coverage/coverage.py code_cache_host_mojolpm_fuzzer -b out/Coverage -o ManualReport -c "out/Coverage/code_cache_host_mojolpm_fuzzer -ignore_timeouts=1 -timeout=4 -runs=0 /dev/shm/corpus" -f content
+```
+
+With the CodeCacheHost, looking at the coverage after a few hours we could see
+that there's definitely some room for improvement:
+
+```c++
+/* 55 */ base::Optional<GURL> GetSecondaryKeyForCodeCache(const GURL& resource_url,
+/* 56 53.6k */ int render_process_id) {
+/* 57 53.6k */ if (!resource_url.is_valid() || !resource_url.SchemeIsHTTPOrHTTPS())
+/* 58 53.6k */ return base::nullopt;
+/* 59 0 */
+/* 60 0 */ GURL origin_lock =
+/* 61 0 */ ChildProcessSecurityPolicyImpl::GetInstance()->GetOriginLock(
+/* 62 0 */ render_process_id);
+```
+
+## (Optional) Improve corpus manually
+
+It's fairly easy to improve the corpus manually, since our corpus files are just
+protobuf files that describe the sequence of interface calls to make.
+
+There are a couple of approaches that we can take here - we'll try building a
+small manual seed corpus that we'll use to kick-start our fuzzer. Since it's
+easier to edit text protos, MojoLPM can automatically convert our seed corpus
+from text protos to binary protos during the build, making this slightly less
+painful for us, and letting us store our corpus in-tree in a readable format.
+
+So, we'll create a new folder to hold this seed corpus, and craft our first
+file:
+
+```
+actions {
+  new_code_cache_host {
+    id: 1
+    render_process_id: 0
+    origin_id: ORIGIN_A
+  }
+}
+actions {
+  code_cache_host_call {
+    remote {
+      id: 1
+    }
+    m_did_generate_cacheable_metadata {
+      m_cache_type: CodeCacheType_kJavascript
+      m_url {
+        new {
+          id: 1
+          m_url: "http://aaa.com/test"
+        }
+      }
+      m_data {
+        new {
+          id: 1
+          m_bytes {
+          }
+        }
+      }
+      m_expected_response_time {
+      }
+    }
+  }
+}
+sequences {
+  action_indexes: 0
+  action_indexes: 1
+}
+sequence_indexes: 0
+```
+
+We can then add some new entries to our build target to have the corpus
+converted to binary proto directly during build.
+
+```
+  testcase_proto_kind = "content.fuzzing.code_cache_host.proto.Testcase"
+
+  seed_corpus_sources = [
+    "code_cache_host_mojolpm_fuzzer_corpus/did_generate_cacheable_metadata.textproto",
+  ]
+```
+
+If we now run a new coverage report using this single file seed corpus:
+(note that the binary corpus files will be output in your output directory, in
+this case code_cache_host_mojolpm_fuzzer_seed_corpus.zip):
+
+```
+autoninja -C out/Coverage chrome
+rm -rf /tmp/corpus; mkdir /tmp/corpus; unzip out/Coverage/code_cache_host_mojolpm_fuzzer_seed_corpus.zip -d /tmp/corpus
+python tools/code_coverage/coverage.py code_cache_host_mojolpm_fuzzer -b out/Coverage -o ManualReport -c "out/Coverage/code_cache_host_mojolpm_fuzzer -ignore_timeouts=1 -timeout=4 -runs=0 /tmp/corpus" -f content
+```
+
+We can see that we're now getting some more coverage:
+
+```c++
+/* 118 */ void CodeCacheHostImpl::DidGenerateCacheableMetadata(
+/* 119 */ blink::mojom::CodeCacheType cache_type,
+/* 120 */ const GURL& url,
+/* 121 */ base::Time expected_response_time,
+/* 122 2 */ mojo_base::BigBuffer data) {
+/* 123 2 */ if (!url.SchemeIsHTTPOrHTTPS()) {
+/* 124 0 */ mojo::ReportBadMessage("Invalid URL scheme for code cache.");
+/* 125 0 */ return;
+/* 126 0 */ }
+/* 127 2 */
+/* 128 2 */ DCHECK_CURRENTLY_ON(BrowserThread::UI);
+/* 129 2 */
+/* 130 2 */ GeneratedCodeCache* code_cache = GetCodeCache(cache_type);
+/* 131 2 */ if (!code_cache)
+/* 132 0 */ return;
+/* 133 2 */
+/* 134 2 */ base::Optional<GURL> origin_lock =
+/* 135 2 */ GetSecondaryKeyForCodeCache(url, render_process_id_);
+/* 136 2 */ if (!origin_lock)
+/* 137 0 */ return;
+/* 138 2 */
+/* 139 2 */ code_cache->WriteEntry(url, *origin_lock, expected_response_time,
+/* 140 2 */ std::move(data));
+/* 141 2 */ }
+```
+
+Much better!
+
+[markbrand@google.com]: mailto:markbrand@google.com?subject=[MojoLPM%20Help]:%20&cc=fuzzing@chromium.org
+[libfuzzer]: https://source.chromium.org/chromium/chromium/src/+/master:testing/libfuzzer/getting_started.md
+[Protocol Buffers]: https://developers.google.com/protocol-buffers/docs/cpptutorial
+[libprotobuf-mutator]: https://source.chromium.org/chromium/chromium/src/+/master:testing/libfuzzer/libprotobuf-mutator.md
+[testing in Chromium]: https://source.chromium.org/chromium/chromium/src/+/master:docs/testing/testing_in_chromium.md
+[interfaces]: https://source.chromium.org/search?q=interface%5Cs%2B%5Cw%2B%5Cs%2B%7B%20f:%5C.mojom$%20-f:test