1 files changed, 471 insertions, 0 deletions
diff --git a/chromium/mojo/docs/mojolpm.md b/chromium/mojo/docs/mojolpm.md
new file mode 100644
index 00000000000..a3b1c8e254b
--- /dev/null
+++ b/chromium/mojo/docs/mojolpm.md
@@ -0,0 +1,471 @@
+# Getting started with MojoLPM
+
+*** note
+**Note:** Using MojoLPM to fuzz your Mojo interfaces is intended to be simple,
+but there are edge cases that may require a detailed understanding of the
+Mojo implementation to fix. If you run into problems that you can't readily
+understand, send an email to [markbrand@google.com] and cc `fuzzing@chromium.org`
+and we'll try to help.
+
+**Prerequisites:** Knowledge of [libfuzzer] and basic understanding
+of [Protocol Buffers] and [libprotobuf-mutator]. Basic understanding of
+[testing in Chromium].
+***
+
+This document will walk you through:
+* An overview of MojoLPM and what it's used for.
+* Adding a fuzzer to an existing Mojo interface using MojoLPM.
+
+[TOC]
+
+## Overview of MojoLPM
+
+MojoLPM is a toolchain for automatically generating structure-aware fuzzers for
+Mojo interfaces using libprotobuf-mutator as the fuzzing engine.
+
+This tool works by using the existing "grammar" for the interface provided by
+the .mojom files, and translating that into a Protocol Buffer format that can be
+fuzzed by libprotobuf-mutator. These protocol buffers are then interpreted by
+a generated runtime as a sequence of mojo method calls on the targeted
+interface.
+
+The intention is that using the generated code should be as simple as plugging
+it into the existing unittests for those interfaces. If you've already
+implemented the necessary mocks to unittest your code, the majority of the work
+needed to get quite effective fuzzing of your interfaces is already complete!
+
+## Choose the Mojo interface(s) to fuzz
+
+If you're a developer looking to add fuzzing support for an interface that
+you're developing, then this should be very easy for you!
+
+If not, then a good starting point is to search for [interfaces] in codesearch.
+The most interesting interfaces from a security perspective are those which are
+implemented in the browser process and exposed to the renderer process, but
+there isn't a very simple way to enumerate these, so you may need to look
+through some of the source code to find an interesting one.
+
+For the rest of this guide, we'll write a new fuzzer for
+`blink.mojom.CodeCacheHost`, which is defined in
+`third_party/blink/public/mojom/loader/code_cache.mojom`.
+
+We then need to find the relevant GN build target for this mojo interface so
+that we know how to refer to it later - in this case that is
+`//third_party/blink/public/mojom:mojom_platform`.
+
+## Find the implementations of the interfaces
+
+If you are developing these interfaces, then you already know where to find the
+implementations.
+
+Otherwise a good starting point is to search for references to
+"public blink::mojom::CodeCacheHost". Usually there is only a single
+implementation of a given Mojo interface (there are a few exceptions where the
+interface abstracts platform specific details, but this is less common). This
+leads us to `content/browser/renderer_host/code_cache_host_impl.h` and
+`CodeCacheHostImpl`.
+
+## Find the unittest for the implementation
+
+Unfortunately, it doesn't look like `CodeCacheHostImpl` has a unittest, so we'll
+have to go through the process of understanding how to create a valid instance
+ourselves in order to fuzz this interface.
+
+Since this interface runs in the Browser process, and is part of `/content`,
+we're going to create our new fuzzer in `/content/test/fuzzer`.
+
+## Add our testcase proto
+
+First we'll add a proto source file, `code_cache_host_mojolpm_fuzzer.proto`,
+which is going to define the structure of our testcases. This is basically
+boilerplate, but it allows creating fuzzers which interact with multiple Mojo
+interfaces to uncover more complex issues. For our case, this will be a simple
+file:
+
+```
+syntax = "proto2";
+
+package content.fuzzing.code_cache_host.proto;
+
+import "third_party/blink/public/mojom/loader/code_cache.mojom.mojolpm.proto";
+
+message NewCodeCacheHost {
+  required uint32 id = 1;
+}
+
+message RunUntilIdle {
+  enum ThreadId {
+    IO = 0;
+    UI = 1;
+  }
+
+  required ThreadId id = 1;
+}
+
+message Action {
+  oneof action {
+    NewCodeCacheHost new_code_cache_host = 1;
+    RunUntilIdle run_until_idle = 2;
+    mojolpm.blink.mojom.CodeCacheHost.RemoteMethodCall code_cache_host_call = 3;
+  }
+}
+
+message Sequence {
+  repeated uint32 action_indexes = 1 [packed=true];
+}
+
+message Testcase {
+  repeated Action actions = 1;
+  repeated Sequence sequences = 2;
+  repeated uint32 sequence_indexes = 3 [packed=true];
+}
+```
+
+This specifies all of the actions that the fuzzer will be able to take - it
+will be able to create a new `CodeCacheHost` instance, perform sequences of
+interface calls on those instances, and wait for various threads to be idle.
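+
+The indirection between `sequence_indexes`, `sequences` and `actions` can be
+easier to follow in code. The sketch below is only an illustration: it uses
+plain structs in place of the generated protobuf classes, `ResolveActionOrder`
+is a hypothetical helper, and the modulo wrapping follows the `NextAction()`
+code shown later in this guide.
+
+```c++
+#include <cassert>
+#include <cstddef>
+#include <cstdint>
+#include <vector>
+
+// Stand-ins for the generated protobuf messages (assumption: the real
+// fuzzer uses the classes generated from the .proto file instead).
+struct Sequence {
+  std::vector<uint32_t> action_indexes;
+};
+
+struct Testcase {
+  size_t action_count = 0;  // Number of entries in `actions`.
+  std::vector<Sequence> sequences;
+  std::vector<uint32_t> sequence_indexes;
+};
+
+// Hypothetical helper: returns the positions in `actions` that would be
+// executed, in order. Out-of-range indexes wrap around with modulo.
+std::vector<size_t> ResolveActionOrder(const Testcase& testcase) {
+  std::vector<size_t> order;
+  if (testcase.sequences.empty() || testcase.action_count == 0) {
+    return order;  // Nothing to run; also avoids a modulo by zero.
+  }
+  for (uint32_t sequence_idx : testcase.sequence_indexes) {
+    const Sequence& sequence =
+        testcase.sequences[sequence_idx % testcase.sequences.size()];
+    for (uint32_t action_idx : sequence.action_indexes) {
+      order.push_back(action_idx % testcase.action_count);
+    }
+  }
+  return order;
+}
+```
+
+Because every index is wrapped rather than rejected, any mutation of the index
+lists still refers to a valid action, so the mutator can freely reorder and
+repeat actions without producing unusable testcases.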
+
+In order to build this proto file, we'll need to copy it into the out/ directory
+so that it can reference the proto files generated by MojoLPM - this will be
+handled for us by the `mojolpm_fuzzer_test` build rule.
+
+## Add our fuzzer source
+
+Now we're ready to create the fuzzer C++ source file,
+`code_cache_host_mojolpm_fuzzer.cc`, and the fuzzer build target. This
+target is going to depend on both our proto file and the C++ source file.
+Most of the necessary dependencies will be handled for us, but we do still need
+to add some directly.
+
+Note especially the dependency on `mojom_platform_mojolpm` in blink. This is an
+autogenerated target containing the generated fuzzer protocol buffer
+descriptions; its name is the name of the mojom target with `_mojolpm`
+appended.
+
+```
+mojolpm_fuzzer_test("code_cache_host_mojolpm_fuzzer") {
+  sources = [
+    "code_cache_host_mojolpm_fuzzer.cc"
+  ]
+
+  proto_source = "code_cache_host_mojolpm_fuzzer.proto"
+
+  deps = [
+    "//base/test:test_support",
+    "//content/browser:for_content_tests",
+    "//content/public/browser:browser_sources",
+    "//content/test:test_support",
+    "//services/network:test_support",
+    "//storage/browser:test_support",
+  ]
+
+  proto_deps = [
+    "//third_party/blink/public/mojom:mojom_platform_mojolpm",
+  ]
+}
+```
+
+Now, the minimal source code to load our testcases:
+
+```c++
+// Copyright 2020 The Chromium Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file.
+
+#include <stdint.h>
+#include <utility>
+
+#include "code_cache_host_mojolpm_fuzzer.pb.h"
+#include "mojo/core/embedder/embedder.h"
+#include "third_party/blink/public/mojom/loader/code_cache.mojom-mojolpm.h"
+#include "third_party/libprotobuf-mutator/src/src/libfuzzer/libfuzzer_macro.h"
+
+DEFINE_BINARY_PROTO_FUZZER(
+    const content::fuzzing::code_cache_host::proto::Testcase& testcase) {
+}
+```
+
+You should now be able to build and run this fuzzer (it, of course, won't do
+very much) to check that everything is lined up right so far.
+
+## Handle global process setup
+
+Now we need to add some basic setup code so that our process has something that
+mostly resembles a normal Browser process. In the fuzzer source file this is
+`CodeCacheHostFuzzerEnvironment`, a global environment instance that sets up
+this basic environment once and is reused for all of our testcases, since
+starting threads is expensive and slow.
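+
+One common way to get a single process-wide instance that outlives every
+testcase is a function-local static. The sketch below is a generic
+illustration with hypothetical names (`FuzzerEnvironment`, `GetEnvironment`,
+`RunOneTestcase`); the real environment sets up Chromium threads and Mojo
+rather than a counter, and Chromium code often wraps such a static in
+`base::NoDestructor`.
+
+```c++
+#include <cassert>
+
+// Hypothetical stand-in for the fuzzer's global environment. The real class
+// would start browser threads and initialise Mojo in its constructor.
+class FuzzerEnvironment {
+ public:
+  FuzzerEnvironment() { ++construction_count_; }
+
+  FuzzerEnvironment(const FuzzerEnvironment&) = delete;
+  FuzzerEnvironment& operator=(const FuzzerEnvironment&) = delete;
+
+  static int construction_count() { return construction_count_; }
+
+ private:
+  // Counts constructions so the "only once" property can be checked.
+  static int construction_count_;
+};
+
+int FuzzerEnvironment::construction_count_ = 0;
+
+// Constructs the environment on first use and reuses it afterwards.
+// Function-local statics are initialised exactly once.
+FuzzerEnvironment& GetEnvironment() {
+  static FuzzerEnvironment environment;
+  return environment;
+}
+
+// Hypothetical per-testcase entry point: it only touches the shared
+// environment and keeps all testcase-specific state local.
+void RunOneTestcase() {
+  GetEnvironment();
+}
+```
+
+Calling `RunOneTestcase()` repeatedly constructs the environment only on the
+first call, which is the property we want when the constructor is expensive.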
+
+## Handle per-testcase setup
+
+We next need to handle the necessary setup to instantiate `CodeCacheHostImpl`,
+so that we can actually run the testcases. At this point, we realise that we
+probably want to be able to have multiple `CodeCacheHostImpl` instances with
+different `render_process_id`s and different backing origins, so we need to
+modify our proto file to reflect this:
+
+```
+message NewCodeCacheHost {
+  enum OriginId {
+    ORIGIN_A = 0;
+    ORIGIN_B = 1;
+    ORIGIN_OPAQUE = 2;
+    ORIGIN_EMPTY = 3;
+  }
+
+  required uint32 id = 1;
+  required uint32 render_process_id = 2;
+  required OriginId origin_id = 3;
+}
+```
+
+Note that we're using an enum to represent the origin, rather than a string;
+it's unlikely that the true value of the origin is going to be important, so
+we've instead chosen a few select values based on the cases mentioned in the
+source.
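+
+Inside the fuzzer, each enum value then has to be turned into a concrete
+origin. A minimal sketch of that mapping follows. The `OriginId` enum mirrors
+the proto, but `OriginStringForId` and the specific URLs are assumptions made
+for illustration, and real code would produce `url::Origin` values rather than
+strings.
+
+```c++
+#include <cassert>
+#include <string>
+
+// Mirrors the OriginId enum from the proto above.
+enum class OriginId {
+  kOriginA = 0,
+  kOriginB = 1,
+  kOriginOpaque = 2,
+  kOriginEmpty = 3,
+};
+
+// Hypothetical helper mapping each enum value to a serialized origin.
+// The URLs are placeholders; opaque origins serialize as "null".
+std::string OriginStringForId(OriginId id) {
+  switch (id) {
+    case OriginId::kOriginA:
+      return "http://aaa.com";
+    case OriginId::kOriginB:
+      return "http://bbb.com";
+    case OriginId::kOriginOpaque:
+      return "null";
+    case OriginId::kOriginEmpty:
+      return "";
+  }
+  return "";  // Unknown values fall back to the empty origin.
+}
+```
+
+Keeping the set of origins this small means the mutator spends its effort on
+the interface calls themselves rather than on generating URL strings.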
+
+The first thing that we need to do is set up the basic Browser process
+environment. This is what `ContentFuzzerEnvironment` does: it provides a basic
+setup suitable for fuzzing interfaces in `/content`. We need to make sure that
+`mojo::core::Init()` is called exactly once, and we probably want as much
+freedom as possible in terms of scheduling, so we use slightly different
+threading options than the average unittest. This is a singleton type that
+lives for the entire duration of the fuzzer process, so we don't want to hold
+any testcase-specific data here.
+
+The next thing that we need to do is to figure out the basic setup needed to
+instantiate the interface we're interested in. Looking at the constructor for
+`CodeCacheHostImpl`, we need three things: a valid `render_process_id`, an
+instance of `CacheStorageContextImpl` and an instance of
+`GeneratedCodeCacheContext`. `CodeCacheHostFuzzerContext` is our container for
+these per-testcase instances, and will handle creating and binding the instances
+of the Mojo interfaces that we're going to fuzz. The most important thing to be
+careful of here is that everything happens on the correct thread/sequence. Many
+Browser-process objects have specific expectations, and will end up with very
+different behaviour if they are created or used from the wrong context.
+
+## Integrate with the generated MojoLPM fuzzer code
+
+Finally, we need to do a little bit more plumbing, to rig up this infrastructure
+that we've built together with the autogenerated code that MojoLPM gives us to
+interpret and run our testcases. This is the `CodeCacheHostTestcase`, and the
+part where the magic happens is here:
+
+```c++
+void CodeCacheHostTestcase::NextAction() {
+  if (next_idx_ < testcase_.sequence_indexes_size()) {
+    auto sequence_idx = testcase_.sequence_indexes(next_idx_++);
+    const auto& sequence =
+      testcase_.sequences(sequence_idx % testcase_.sequences_size());
+    for (auto action_idx : sequence.action_indexes()) {
+      if (!testcase_.actions_size() || ++action_count_ > MAX_ACTION_COUNT) {
+        return;
+      }
+      const auto& action =
+        testcase_.actions(action_idx % testcase_.actions_size());
+      switch (action.action_case()) {
+        case content::fuzzing::code_cache_host::proto::Action::kNewCodeCacheHost: {
+          cch_context_.AddCodeCacheHost(
+            action.new_code_cache_host().id(),
+            action.new_code_cache_host().render_process_id(),
+            action.new_code_cache_host().origin_id());
+        } break;
+
+        case content::fuzzing::code_cache_host::proto::Action::kRunUntilIdle: {
+          if (action.run_until_idle().id()) {
+            content::RunUIThreadUntilIdle();
+          } else {
+            content::RunIOThreadUntilIdle();
+          }
+        } break;
+
+        case content::fuzzing::code_cache_host::proto::Action::kCodeCacheHostCall: {
+          mojolpm::HandleRemoteMethodCall(action.code_cache_host_call());
+        } break;
+
+        case content::fuzzing::code_cache_host::proto::Action::ACTION_NOT_SET:
+          break;
+      }
+    }
+  }
+}
+```
+
+The key part of the integration with MojoLPM is the `kCodeCacheHostCall` case,
+where we're asking MojoLPM to treat this incoming proto entry as a call to a
+method on the `CodeCacheHost` interface.
+
+There's just a little bit more boilerplate in the bottom of the file to tidy up
+concurrency loose ends, making sure that the fuzzer components are all running
+on the correct threads; those are more-or-less common to any fuzzer using
+MojoLPM.
+
+## Test it!
+
+Make a corpus directory and fire up your shiny new fuzzer!
+
+```
+ ~/chromium/src% out/Default/code_cache_host_mojolpm_fuzzer /dev/shm/corpus
+INFO: Seed: 3273881842
+INFO: Loaded 1 modules   (1121912 inline 8-bit counters): 1121912 [0x559151a1aea8, 0x559151b2cd20),
+INFO: Loaded 1 PC tables (1121912 PCs): 1121912 [0x559151b2cd20,0x559152c4b4a0),
+INFO:      146 files found in /dev/shm/corpus
+INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
+INFO: seed corpus: files: 146 min: 2b max: 268b total: 8548b rss: 88Mb
+#147  INITED cov: 4633 ft: 10500 corp: 138/8041b exec/s: 0 rss: 91Mb
+#152  NEW    cov: 4633 ft: 10501 corp: 139/8139b lim: 4096 exec/s: 0 rss: 91Mb L: 98/268 MS: 8 Custom-ChangeByte-Custom-EraseBytes-Custom-ShuffleBytes-Custom-Custom-
+#154  NEW    cov: 4634 ft: 10510 corp: 140/8262b lim: 4096 exec/s: 0 rss: 91Mb L: 123/268 MS: 3 CustomCrossOver-ChangeBit-Custom-
+#157  NEW    cov: 4634 ft: 10512 corp: 141/8384b lim: 4096 exec/s: 0 rss: 91Mb L: 122/268 MS: 3 CustomCrossOver-Custom-CustomCrossOver-
+#158  NEW    cov: 4634 ft: 10514 corp: 142/8498b lim: 4096 exec/s: 0 rss: 91Mb L: 114/268 MS: 1 CustomCrossOver-
+#159  NEW    cov: 4634 ft: 10517 corp: 143/8601b lim: 4096 exec/s: 0 rss: 91Mb L: 103/268 MS: 1 Custom-
+#160  NEW    cov: 4634 ft: 10526 corp: 144/8633b lim: 4096 exec/s: 0 rss: 91Mb L: 32/268 MS: 1 Custom-
+#164  NEW    cov: 4634 ft: 10528 corp: 145/8851b lim: 4096 exec/s: 0 rss: 91Mb L: 218/268 MS: 4 CustomCrossOver-Custom-CustomCrossOver-Custom-
+```
+
+## Wait for it...
+
+Let the fuzzer run for a while, and check in periodically in case it has
+fallen over. It's likely you'll have made a few mistakes somewhere along the way
+but hopefully soon you'll have the fuzzer running 'clean' for a few hours.
+
+If your coverage isn't going up at all, then you've probably made a mistake and
+it likely isn't managing to actually interact with the interface you're trying
+to fuzz - try using the code coverage output from the next step to debug what's
+going wrong.
+
+## (Optional) Run coverage
+
+In many cases it's useful to check the code coverage to see if we can benefit
+from adding some manual testcases to get deeper coverage. For this example I
+used the following command:
+
+```
+python tools/code_coverage/coverage.py code_cache_host_mojolpm_fuzzer -b out/Coverage -o ManualReport -c "out/Coverage/code_cache_host_mojolpm_fuzzer -ignore_timeouts=1 -timeout=4 -runs=0 /dev/shm/corpus" -f content
+```
+
+With the CodeCacheHost, looking at the coverage after a few hours we could see
+that there's definitely some room for improvement:
+
+```c++
+/* 55       */ base::Optional<GURL> GetSecondaryKeyForCodeCache(const GURL& resource_url,
+/* 56 53.6k */ int render_process_id) {
+/* 57 53.6k */    if (!resource_url.is_valid() || !resource_url.SchemeIsHTTPOrHTTPS())
+/* 58 53.6k */      return base::nullopt;
+/* 59 0     */
+/* 60 0     */    GURL origin_lock =
+/* 61 0     */        ChildProcessSecurityPolicyImpl::GetInstance()->GetOriginLock(
+/* 62 0     */            render_process_id);
+```
+
+## (Optional) Improve corpus manually
+
+It's fairly easy to improve the corpus manually, since our corpus files are just
+protobuf files that describe the sequence of interface calls to make.
+
+There are a couple of approaches that we can take here - we'll try building a
+small manual seed corpus that we'll use to kick-start our fuzzer. Since it's
+easier to edit text protos, MojoLPM can automatically convert our seed corpus
+from text protos to binary protos during the build, making this slightly less
+painful for us, and letting us store our corpus in-tree in a readable format.
+
+So, we'll create a new folder to hold this seed corpus, and craft our first
+file:
+
+```
+actions {
+  new_code_cache_host {
+    id: 1
+    render_process_id: 0
+    origin_id: ORIGIN_A
+  }
+}
+actions {
+  code_cache_host_call {
+    remote {
+      id: 1
+    }
+    m_did_generate_cacheable_metadata {
+      m_cache_type: CodeCacheType_kJavascript
+      m_url {
+        new {
+          id: 1
+          m_url: "http://aaa.com/test"
+        }
+      }
+      m_data {
+        new {
+          id: 1
+          m_bytes {
+          }
+        }
+      }
+      m_expected_response_time {
+      }
+    }
+  }
+}
+sequences {
+  action_indexes: 0
+  action_indexes: 1
+}
+sequence_indexes: 0
+```
+
+We can then add some new entries to our build target to have the corpus
+converted to binary proto directly during build.
+
+```
+  testcase_proto_kind = "content.fuzzing.code_cache_host.proto.Testcase"
+
+  seed_corpus_sources = [
+    "code_cache_host_mojolpm_fuzzer_corpus/did_generate_cacheable_metadata.textproto",
+  ]
+```
+
+We can now run a new coverage report using this single-file seed corpus. Note
+that the binary corpus files will be written to your output directory, in this
+case as `code_cache_host_mojolpm_fuzzer_seed_corpus.zip`:
+
+```
+autoninja -C out/Coverage code_cache_host_mojolpm_fuzzer
+rm -rf /tmp/corpus; mkdir /tmp/corpus; unzip out/Coverage/code_cache_host_mojolpm_fuzzer_seed_corpus.zip -d /tmp/corpus
+python tools/code_coverage/coverage.py code_cache_host_mojolpm_fuzzer -b out/Coverage -o ManualReport -c "out/Coverage/code_cache_host_mojolpm_fuzzer -ignore_timeouts=1 -timeout=4 -runs=0 /tmp/corpus" -f content
+```
+
+We can see that we're now getting some more coverage:
+
+```c++
+/* 118   */ void CodeCacheHostImpl::DidGenerateCacheableMetadata(
+/* 119   */     blink::mojom::CodeCacheType cache_type,
+/* 120   */     const GURL& url,
+/* 121   */     base::Time expected_response_time,
+/* 122 2 */       mojo_base::BigBuffer data) {
+/* 123 2 */     if (!url.SchemeIsHTTPOrHTTPS()) {
+/* 124 0 */       mojo::ReportBadMessage("Invalid URL scheme for code cache.");
+/* 125 0 */       return;
+/* 126 0 */     }
+/* 127 2 */
+/* 128 2 */     DCHECK_CURRENTLY_ON(BrowserThread::UI);
+/* 129 2 */
+/* 130 2 */     GeneratedCodeCache* code_cache = GetCodeCache(cache_type);
+/* 131 2 */     if (!code_cache)
+/* 132 0 */       return;
+/* 133 2 */
+/* 134 2 */     base::Optional<GURL> origin_lock =
+/* 135 2 */         GetSecondaryKeyForCodeCache(url, render_process_id_);
+/* 136 2 */     if (!origin_lock)
+/* 137 0 */       return;
+/* 138 2 */
+/* 139 2 */     code_cache->WriteEntry(url, *origin_lock, expected_response_time,
+/* 140 2 */                            std::move(data));
+/* 141 2 */ }
+```
+
+Much better!
+
+[markbrand@google.com]: mailto:markbrand@google.com?subject=[MojoLPM%20Help]:%20&cc=fuzzing@chromium.org
+[libfuzzer]: https://source.chromium.org/chromium/chromium/src/+/master:testing/libfuzzer/getting_started.md
+[Protocol Buffers]: https://developers.google.com/protocol-buffers/docs/cpptutorial
+[libprotobuf-mutator]: https://source.chromium.org/chromium/chromium/src/+/master:testing/libfuzzer/libprotobuf-mutator.md
+[testing in Chromium]: https://source.chromium.org/chromium/chromium/src/+/master:docs/testing/testing_in_chromium.md
+[interfaces]: https://source.chromium.org/search?q=interface%5Cs%2B%5Cw%2B%5Cs%2B%7B%20f:%5C.mojom$%20-f:test
+