mojo/docs/mojolpm.md

   1 # Getting started with MojoLPM
   2
   3 ***
   4 **Note:** Using MojoLPM to fuzz your Mojo interfaces is intended to be simple,
   5 but there are edge-cases that may require a very detailed understanding of the
   6 Mojo implementation to fix. If you run into problems that you can't understand
   7 readily, send an email to [markbrand@google.com] and cc `fuzzing@chromium.org`
   8 and we'll try and help.
   9
  10 **Prerequisites:** Knowledge of [libfuzzer] and basic understanding
  11 of [Protocol Buffers] and [libprotobuf-mutator]. Basic understanding of
  12 [testing in Chromium].
  13 ***
  14
  15 This document will walk you through:
  16 * An overview of MojoLPM and what it's used for.
  17 * Adding a fuzzer to an existing Mojo interface using MojoLPM.
  18
  19 [TOC]
  20
  21 ## Overview of MojoLPM
  22
  23 MojoLPM is a toolchain for automatically generating structure-aware fuzzers for
  24 Mojo interfaces using libprotobuf-mutator as the fuzzing engine.
  25
  26 This tool works by using the existing "grammar" for the interface provided by
  27 the .mojom files, and translating that into a Protocol Buffer format that can be
  28 fuzzed by libprotobuf-mutator. These protocol buffers are then interpreted by
  29 a generated runtime as a sequence of mojo method calls on the targeted
  30 interface.
  31
The intention is that using these should be as simple as plugging the generated
code into the existing unittests for those interfaces. If you've already
implemented the necessary mocks to unittest your code, most of the work needed
to fuzz your interfaces effectively is already complete!
  36
  37 ## Choose the Mojo interface(s) to fuzz
  38
  39 If you're a developer looking to add fuzzing support for an interface that
  40 you're developing, then this should be very easy for you!
  41
  42 If not, then a good starting point is to search for [interfaces] in codesearch.
  43 The most interesting interfaces from a security perspective are those which are
  44 implemented in the browser process and exposed to the renderer process, but
  45 there isn't a very simple way to enumerate these, so you may need to look
  46 through some of the source code to find an interesting one.
  47
  48 A few of the places which bind many of these cross-privilege interfaces are
  49 `content/browser/browser_interface_binders.cc` and
  50 `content/browser/render_process_host_impl.cc`, specifically `RenderProcessHostImpl::RegisterMojoInterfaces`.
  51
  52 For the rest of this guide, we'll write a new fuzzer for
  53 `blink.mojom.CodeCacheHost`, which is defined in
  54 `third_party/blink/public/mojom/loader/code_cache.mojom`.
  55
  56 We then need to find the relevant GN build target for this mojo interface so
  57 that we know how to refer to it later - in this case that is
  58 `//third_party/blink/public/mojom:mojom_platform`.
  59
  60 ## Find the implementations of the interfaces
  61
  62 If you are developing these interfaces, then you already know where to find the
  63 implementations.
  64
  65 Otherwise a good starting point is to search for references to
  66 "public blink::mojom::CodeCacheHost". Usually there is only a single
  67 implementation of a given Mojo interface (there are a few exceptions where the
  68 interface abstracts platform specific details, but this is less common). This
  69 leads us to `content/browser/renderer_host/code_cache_host_impl.h` and
  70 `CodeCacheHostImpl`.
  71
  72 ## Find the unittest for the implementation
  73
Specifically, we're looking for a browser-process side unittest (so not in
`//third_party/blink`). We want the unittest for the browser side implementation
of that Mojo interface. In many cases, if such a test exists, it will be
directly next to the implementation source, i.e. in this case we would be most
likely to find it in
`content/browser/renderer_host/code_cache_host_impl_unittest.cc`.
  79
  80 Unfortunately, it doesn't look like `CodeCacheHostImpl` has a unittest, so we'll
  81 have to go through the process of understanding how to create a valid instance
  82 ourselves in order to fuzz this interface.
  83
Since this implementation runs in the Browser process, and is part of
`//content`, we're going to create our new fuzzer in `//content/test/fuzzer`.
  86
  87 ## Add our testcase proto
  88
First we'll add a proto source file, `code_cache_host_mojolpm_fuzzer.proto`,
which is going to define the structure of our testcases. This is basically
boilerplate, but it allows creating fuzzers which interact with multiple Mojo
interfaces to uncover more complex issues.

Note that the structure used here is shared between all MojoLPM fuzzers. While
it is possible to come up with your own testcase format, it is preferred that
you use this same structure and simply add the appropriate Actions for your
fuzzer. This allows more code reuse between fuzzers, and also allows corpus
merging between related fuzzers.

For our case, this will be a simple file:
 100
 101 ```
 102 // Copyright 2020 The Chromium Authors
 103 // Use of this source code is governed by a BSD-style license that can be
 104 // found in the LICENSE file.
 105
 106 // Message format for the MojoLPM fuzzer for the CodeCacheHost interface.
 107
 108 syntax = "proto2";
 109
 110 package content.fuzzing.code_cache_host.proto;
 111
 112 import "third_party/blink/public/mojom/loader/code_cache.mojom.mojolpm.proto";
 113
 114 // Bind a new CodeCacheHost remote
 115 message NewCodeCacheHostAction {
 116   required uint32 id = 1;
 117 }
 118
 119 // Run the specific sequence for (an indeterminate) period. This is not
 120 // intended to create a specific ordering, but to allow the fuzzer to delay a
 121 // later task until previous tasks have completed.
 122 message RunThreadAction {
 123   enum ThreadId {
 124     IO = 0;
 125     UI = 1;
 126   }
 127
 128   required ThreadId id = 1;
 129 }
 130
 131 // Actions that can be performed by the fuzzer.
 132 message Action {
 133   oneof action {
 134     NewCodeCacheHostAction new_code_cache_host = 1;
 135     RunThreadAction run_thread = 2;
 136     mojolpm.blink.mojom.CodeCacheHost.RemoteAction
 137         code_cache_host_remote_action = 3;
 138   }
 139 }
 140
 141 // Sequence provides a level of indirection which allows Testcase to compactly
 142 // express repeated sequences of actions.
 143 message Sequence {
 144   repeated uint32 action_indexes = 1 [packed = true];
 145 }
 146
 147 // Testcase is the top-level message type interpreted by the fuzzer.
 148 message Testcase {
 149   repeated Action actions = 1;
 150   repeated Sequence sequences = 2;
 151   repeated uint32 sequence_indexes = 3 [packed = true];
 152 }
 153 ```
 154
 155 This specifies all of the actions that the fuzzer will be able to take - it
 156 will be able to create a new `CodeCacheHost` instance, perform sequences of
 157 interface calls on those instances, and wait for various threads to be idle.
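
To make the `Sequence` indirection concrete, here is a minimal standalone sketch
of how a testcase might be flattened into an ordered list of actions. The plain
structs are hypothetical stand-ins for the generated proto classes, and wrapping
out-of-range indexes with a modulo is an assumption about how the interpreter
tolerates mutated inputs, not a statement of the exact MojoLPM behaviour.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-struct mirror of the Sequence message.
struct Sequence {
  std::vector<uint32_t> action_indexes;
};

// Hypothetical plain-struct mirror of the Testcase message. Only the number of
// actions matters for ordering, so the actions themselves are omitted.
struct Testcase {
  std::size_t action_count = 0;
  std::vector<Sequence> sequences;
  std::vector<uint32_t> sequence_indexes;
};

// Returns the indexes into `actions`, in the order they would be run.
std::vector<std::size_t> FlattenActionOrder(const Testcase& testcase) {
  std::vector<std::size_t> order;
  if (testcase.action_count == 0 || testcase.sequences.empty()) {
    return order;
  }
  for (uint32_t sequence_index : testcase.sequence_indexes) {
    // Assumption: out-of-range indexes wrap around instead of being rejected.
    const Sequence& sequence =
        testcase.sequences[sequence_index % testcase.sequences.size()];
    for (uint32_t action_index : sequence.action_indexes) {
      order.push_back(action_index % testcase.action_count);
    }
  }
  return order;
}
```

A testcase with two actions, one sequence listing `0, 1`, and
`sequence_indexes` of `0, 0` would run the two actions in order, twice. This is
what lets the mutator express repeated call patterns compactly.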
 158
 159 In order to build this proto file, we'll need to copy it into the out/ directory
 160 so that it can reference the proto files generated by MojoLPM - this will be
 161 handled for us by the `mojolpm_fuzzer_test` build rule.
 162
 163 ## Add our fuzzer source
 164
Now we're ready to create the fuzzer C++ source file,
`code_cache_host_mojolpm_fuzzer.cc`, and the fuzzer build target. This
target is going to depend on both our proto file and the C++ source file.
Most of the necessary dependencies will be handled for us, but we do still need
to add some directly.
 170
Note especially the dependency on `mojom_platform_mojolpm` in blink. This is an
autogenerated target containing the generated fuzzer protocol buffer
descriptions; its name is the name of the mojom target with `_mojolpm`
appended. You'll need to make sure that your `mojolpm_fuzzer_test` target has
the correct dependencies here for all of the needed .mojolpm.proto imports.
 176
(A good way to find these dependencies is to search in codesearch for
[`"code_cache.mojom f:.gn$"`](https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/public/mojom/BUILD.gn?q=code_cache.mojom%20f:gn&ss=chromium) to find the target that builds that mojom file.)
 179
 180 In addition, in `content/test/fuzzer/mojolpm_fuzzer_support.h` there is some
 181 common code used to share the basics of a browser-process like environment
 182 between fuzzers. New fuzzers in other areas of the codebase may need to build
 183 something similar.
 184
 185 ```
 186 mojolpm_fuzzer_test("code_cache_host_mojolpm_fuzzer") {
 187   sources = [
 188     "code_cache_host_mojolpm_fuzzer.cc"
 189   ]
 190
 191   proto_source = "code_cache_host_mojolpm_fuzzer.proto"
 192
 193   deps = [
 194     ":mojolpm_fuzzer_support",
 195     "//content/browser:for_content_tests",
 196     "//content/public/browser:browser_sources",
 197   ]
 198
 199   proto_deps = [
 200     "//third_party/blink/public/mojom:mojom_platform_mojolpm",
 201   ]
 202 }
 203 ```
 204
 205 Now, the minimal source code to load our testcases:
 206
 207 ```c++
 208 // Copyright 2020 The Chromium Authors
 209 // Use of this source code is governed by a BSD-style license that can be
 210 // found in the LICENSE file.
 211
 212 #include <stdint.h>
 213 #include <utility>
 214
 215 #include "code_cache_host_mojolpm_fuzzer.pb.h"
 216 #include "content/test/fuzzer/mojolpm_fuzzer_support.h"
 217 #include "third_party/blink/public/mojom/loader/code_cache.mojom-mojolpm.h"
 218 #include "third_party/libprotobuf-mutator/src/src/libfuzzer/libfuzzer_macro.h"
 219
 220 DEFINE_BINARY_PROTO_FUZZER(
 221     const content::fuzzing::code_cache_host::proto::Testcase& testcase) {
 222 }
 223 ```
 224
 225 You should now be able to build and run this fuzzer (it, of course, won't do
 226 very much) to check that everything is lined up right so far. Recommended GN
 227 arguments:
 228
```
# DCHECKs are really useful when getting your fuzzer up and running correctly,
# but will often get in the way when running actual fuzzing, so we will disable
# this later.
dcheck_always_on = true

# Without this flag, our fuzzer target won't exist.
enable_mojom_fuzzer = true

# ASAN is super useful for fuzzing, but in this case we just want it to help us
# debug the inevitable lifetime issues while we get everything set up correctly!
is_asan = true
is_component_build = true
is_debug = false
optimize_for_fuzzing = true
use_goma = true
use_libfuzzer = true
```
 247
 248
 249 ## Handle global process setup
 250
Now we need to add some basic setup code so that our process has something that
mostly resembles a normal Browser process. In
`content/test/fuzzer/mojolpm_fuzzer_support.h` this is
`FuzzerEnvironmentWithTaskEnvironment`, which provides a global environment
instance that sets up this basic environment once and reuses it for all of our
testcases, since starting threads is expensive and slow. This code should also
be responsible for setting up any "stateless" (or more-or-less stateless) code
that is required for your interface to run. Examples are initializing Mojo Core
and loading ICU datafiles.
 259
 260 A key difference between our needs here and those of a normal unittest is that
 261 we very likely do not want to be running in a special single-threaded mode. We
 262 want to be able to trigger issues related to threading, sequencing and ordering,
 263 and making sure that the UI, IO and threadpool threads behave as close to a
 264 normal browser process as possible is desirable.
 265
 266 It's likely better to be conservative here - while it might appear that an
 267 interface to be tested has no interaction with the UI thread, and so we could
 268 save some resources by only having a real IO thread, it's often very difficult
 269 to establish this with certainty.
 270
In practice, the most efficient way forward will be to copy the existing
`Environment` setup from another MojoLPM fuzzer and adapt it to the context in
which the interface to be fuzzed will actually run. Most fuzzers in content
will be fine using either the existing `FuzzerEnvironment` or
`FuzzerEnvironmentWithTaskEnvironment`, depending on whether there's some
per-testcase state that causes issues with reusing the task environment. There
are existing examples of both in `//content/test/fuzzer`.
 278
 279
 280 ## Handle per-testcase setup
 281
We next need to handle the necessary setup to instantiate `CodeCacheHostImpl`,
so that we can actually run the testcases. At this point, we realise that it's
likely that we want to be able to have multiple `CodeCacheHostImpl` instances
with different `render_process_id` values and different backing origins, so we
need to modify the `NewCodeCacheHostAction` message in our proto file to
reflect this:
 287
```
message NewCodeCacheHostAction {
  enum OriginId {
    ORIGIN_A = 0;
    ORIGIN_B = 1;
    ORIGIN_OPAQUE = 2;
    ORIGIN_EMPTY = 3;
  }

  required uint32 id = 1;
  required uint32 render_process_id = 2;
  required OriginId origin_id = 3;
}
```
 302
 303 Note that we're using an enum to represent the origin, rather than a string;
 304 it's unlikely that the true value of the origin is going to be important, so
 305 we've instead chosen a few select values based on the cases mentioned in the
 306 source.
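
As a rough illustration of why a small enum is enough, the sketch below maps
each `OriginId` to one fixed value that the fuzzer could build an origin from.
The C++ enum, the function name and the concrete URLs are all hypothetical and
chosen only for this example; a real fuzzer would construct `url::Origin`
objects once and reuse them.

```c++
#include <string>

// Hypothetical mirror of the OriginId enum from the testcase proto.
enum class OriginId { kOriginA, kOriginB, kOriginOpaque, kOriginEmpty };

// Returns an illustrative URL from which an origin could be created. An empty
// string stands for "no origin at all".
std::string OriginUrlFor(OriginId id) {
  switch (id) {
    case OriginId::kOriginA:
      return "http://aaa.com/";
    case OriginId::kOriginB:
      return "http://bbb.com/";
    case OriginId::kOriginOpaque:
      // A data: URL has an opaque origin.
      return "data:text/html,opaque";
    case OriginId::kOriginEmpty:
      return "";
  }
  return "";
}
```

Because the mutator only has to pick one of four enum values, it can reach the
same-origin, cross-origin, opaque and empty cases without having to discover
valid URL strings by chance.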
 307
The next thing that we need to do is to figure out the basic setup needed to
instantiate the interface we're interested in. Looking at the constructor for
`CodeCacheHostImpl`, we need three things: a valid `render_process_id`, an
instance of `CacheStorageContextImpl` and an instance of
`GeneratedCodeCacheContext`. `CodeCacheHostFuzzerContext` is our container for
these per-testcase instances, and will handle creating and binding the instances
of the Mojo interfaces that we're going to fuzz.
 315
 316 The most important thing to be careful of here is that everything happens on
 317 the correct thread/sequence. Many Browser-process objects have specific
 318 expectations, and will end up with very different behaviour if they are created
 319 or used from the wrong context.
 320
 321 See [here](https://bugs.chromium.org/p/chromium/issues/detail?id=1275431) for
 322 an example of a false-positive crash caused by a change in sequencing behaviour
 323 that was not immediately mirrored by the fuzzer.
 324
 325 If your test case requires the existence of a `RenderFrameHost` and similar
 326 structures, see `content/test/fuzzer/presentation_service_mojolpm_fuzzer.cc`
 327 for a fuzzer which already sets them up (in particular, using
 328 `RenderViewHostTestHarnessAdapter`).
 329
**Test code doesn't always behave the same way, so try to check the threading
and sequencing behaviour in the real Browser.**
 335
 336 **The second most important thing to be aware of is to make sure that the fuzzer
 337 has the same control over lifetimes of objects that a renderer process would
 338 normally have - the best way to check this is to make sure that you've found and
 339 understood the browser process code that would usually bind that interface.**
 340
 341 ## Integrate with the generated MojoLPM fuzzer code
 342
 343 Finally, we need to do a little bit more plumbing, to rig up this infrastructure
 344 that we've built together with the autogenerated code that MojoLPM gives us to
 345 interpret and run our testcases.
 346
 347 We need to implement the `CodeCacheHostTestcase`, and by inheriting from
 348 `mojolpm::Testcase` we'll automatically get handling of the testcase format; we
 349 just need to implement code to run at the start and end of each testcase, and
 350 to run each individual action.
 351
All three of these functions will be called on the Fuzzer thread. Each must
ensure that, once its work has completed, its `done_closure` or `run_closure`
argument is invoked on the Fuzzer thread.
 355
```c++
void CodeCacheHostTestcase::SetUp(base::OnceClosure done_closure) {
  DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);

  // Per-testcase setup is elided in this snippet. Whatever work is done here,
  // completion must be signalled on the fuzzer thread.
  GetFuzzerTaskRunner()->PostTask(FROM_HERE, std::move(done_closure));
}

void CodeCacheHostTestcase::TearDown(base::OnceClosure done_closure) {
  DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);

  // Per-testcase teardown is elided in this snippet; see the note in SetUp.
  GetFuzzerTaskRunner()->PostTask(FROM_HERE, std::move(done_closure));
}

void CodeCacheHostTestcase::RunAction(const ProtoAction& action,
                                      base::OnceClosure run_closure) {
  DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);

  const auto ThreadId_UI =
      content::fuzzing::code_cache_host::proto::RunThreadAction_ThreadId_UI;
  const auto ThreadId_IO =
      content::fuzzing::code_cache_host::proto::RunThreadAction_ThreadId_IO;

  switch (action.action_case()) {
    case ProtoAction::kNewCodeCacheHost:
      // AddCodeCacheHost takes ownership of run_closure and invokes it once
      // the new instance has been bound.
      AddCodeCacheHost(action.new_code_cache_host().id(),
                       action.new_code_cache_host().render_process_id(),
                       action.new_code_cache_host().origin_id(),
                       std::move(run_closure));
      return;

    case ProtoAction::kRunThread:
      // These actions ensure that any tasks currently queued on the named
      // thread have a chance to run before the fuzzer continues.
      //
      // We don't provide any particular guarantees here; this does not mean
      // that the named thread is idle, nor does it prevent any other threads
      // from running (or the consequences of any resulting callbacks, for
      // example).
      if (action.run_thread().id() == ThreadId_UI) {
        content::GetUIThreadTaskRunner({})->PostTaskAndReply(
            FROM_HERE, base::DoNothing(), std::move(run_closure));
      } else if (action.run_thread().id() == ThreadId_IO) {
        content::GetIOThreadTaskRunner({})->PostTaskAndReply(
            FROM_HERE, base::DoNothing(), std::move(run_closure));
      }
      return;

    case ProtoAction::kCodeCacheHostRemoteAction:
      mojolpm::HandleRemoteAction(action.code_cache_host_remote_action());
      break;

    case ProtoAction::ACTION_NOT_SET:
      break;
  }

  // Cases that did not hand run_closure to someone else continue here.
  GetFuzzerTaskRunner()->PostTask(FROM_HERE, std::move(run_closure));
}
```
 410
The key line here in integration with MojoLPM is the last non-empty case,
`kCodeCacheHostRemoteAction`, where we're asking MojoLPM to treat this incoming
proto entry as a call to a method on the `CodeCacheHost` interface.
 414
 415 There's just a little bit more boilerplate in the bottom of the file to tidy up
 416 concurrency loose ends, making sure that the fuzzer components are all running
 417 on the correct threads; those are more-or-less common to any fuzzer using
 418 MojoLPM.
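
The `RunThreadAction` handling relies on task queues being first-in, first-out:
a no-op posted behind the currently queued tasks only completes after they have
run. The standalone sketch below illustrates that ordering with a hypothetical
single-threaded `FakeTaskRunner`. It is not Chromium's task runner; in
particular, Chromium runs the reply on the sequence that posted the task,
whereas here there is only one queue.

```c++
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded stand-in for one browser thread's task queue.
class FakeTaskRunner {
 public:
  void PostTask(std::function<void()> task) {
    tasks_.push_back(std::move(task));
  }

  // Runs `task` and then `reply`, mirroring the shape of PostTaskAndReply.
  void PostTaskAndReply(std::function<void()> task,
                        std::function<void()> reply) {
    PostTask([task = std::move(task), reply = std::move(reply)]() {
      task();
      reply();
    });
  }

  // Runs queued tasks in FIFO order until the queue is empty.
  void RunUntilIdle() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Records the order in which the tasks and the fuzzer's closure run.
std::vector<std::string> DemoRunThreadOrdering() {
  std::vector<std::string> log;
  FakeTaskRunner ui;
  ui.PostTask([&log] { log.push_back("earlier UI task"); });
  // The RunThread action: a no-op task whose reply resumes the fuzzer.
  ui.PostTaskAndReply([] {}, [&log] { log.push_back("run_closure"); });
  ui.PostTask([&log] { log.push_back("later UI task"); });
  ui.RunUntilIdle();
  return log;
}
```

The closure runs after the task that was already queued, but before the task
posted afterwards, which matches the comment in `RunAction`: the action lets
earlier work complete without promising that the thread is idle.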
 419
 420
 421 ## Resulting structure
 422
Overall, the structure of your fuzzer is likely to approximately reflect that
of `content/test/fuzzer/presentation_service_mojolpm_fuzzer.cc`,
shown here:
 426
 427 ![alt text](mojolpm-fuzzer-structure.png "Architecture diagram showing
 428 the rough structure of the presentation service fuzzer")
 429
(drawing source [here](https://goto.google.com/mojolpm-fuzzer-structure))
 432
 433
 434
 435 ## Test it!
 436
 437 Make a corpus directory and fire up your shiny new fuzzer!
 438
 439 ```
 440  ~/chromium/src% out/Default/code_cache_host_mojolpm_fuzzer /dev/shm/corpus
 441 INFO: Seed: 3273881842
 442 INFO: Loaded 1 modules   (1121912 inline 8-bit counters): 1121912 [0x559151a1aea8, 0x559151b2cd20),
 443 INFO: Loaded 1 PC tables (1121912 PCs): 1121912 [0x559151b2cd20,0x559152c4b4a0),
 444 INFO:      146 files found in /dev/shm/corpus
 445 INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
 446 INFO: seed corpus: files: 146 min: 2b max: 268b total: 8548b rss: 88Mb
 447 #147  INITED cov: 4633 ft: 10500 corp: 138/8041b exec/s: 0 rss: 91Mb
 448 #152  NEW    cov: 4633 ft: 10501 corp: 139/8139b lim: 4096 exec/s: 0 rss: 91Mb L: 98/268 MS: 8 Custom-ChangeByte-Custom-EraseBytes-Custom-ShuffleBytes-Custom-Custom-
 449 #154  NEW    cov: 4634 ft: 10510 corp: 140/8262b lim: 4096 exec/s: 0 rss: 91Mb L: 123/268 MS: 3 CustomCrossOver-ChangeBit-Custom-
 450 #157  NEW    cov: 4634 ft: 10512 corp: 141/8384b lim: 4096 exec/s: 0 rss: 91Mb L: 122/268 MS: 3 CustomCrossOver-Custom-CustomCrossOver-
 451 #158  NEW    cov: 4634 ft: 10514 corp: 142/8498b lim: 4096 exec/s: 0 rss: 91Mb L: 114/268 MS: 1 CustomCrossOver-
 452 #159  NEW    cov: 4634 ft: 10517 corp: 143/8601b lim: 4096 exec/s: 0 rss: 91Mb L: 103/268 MS: 1 Custom-
 453 #160  NEW    cov: 4634 ft: 10526 corp: 144/8633b lim: 4096 exec/s: 0 rss: 91Mb L: 32/268 MS: 1 Custom-
 454 #164  NEW    cov: 4634 ft: 10528 corp: 145/8851b lim: 4096 exec/s: 0 rss: 91Mb L: 218/268 MS: 4 CustomCrossOver-Custom-CustomCrossOver-Custom-
 455 ```
 456
 457 ## Wait for it...
 458
Let the fuzzer run for a while, and keep checking in periodically in case it's
fallen over. It's likely you'll have made a few mistakes somewhere along the
way, but hopefully soon you'll have the fuzzer running 'clean' for a few hours.

If you run into DCHECK failures in deserialization, see the
[triage notes](#triage-notes) section below.
 465
 466 ## Expand it to include all relevant interfaces
 467
 468 `CodeCacheHost` is a very simple interface, and it doesn't have any dependencies
 469 on other interfaces. In reality, most Mojo interfaces are much more complex, and
 470 fuzzing their implementations thoroughly will require more work. We'll take a
 471 quick look at a more complex interface, the `BlobRegistry`. If we look at
 472 `blob_registry.mojom`:
 473
 474 ```
 475 // This interface is the primary access point from renderer to the browser's
 476 // blob system. This interface provides methods to register new blobs and get
 477 // references to existing blobs.
 478 interface BlobRegistry {
 479   // Registers a new blob with the blob registry.
 480   // TODO(mek): Make this method non-sync and get rid of the UUID parameter once
 481   // enough of the rest of the system doesn't rely on the UUID anymore.
 482   [Sync] Register(pending_receiver<blink.mojom.Blob> blob, string uuid,
 483                   string content_type, string content_disposition,
 484                   array<DataElement> elements) => ();
 485
 486   // Creates a new blob out of a data pipe.
 487   // |length_hint| is only used as a hint, to decide if the blob should be
 488   // stored in memory or on disk. Registration will still succeed even if less
 489   // or more bytes are read from the pipe. The resulting SerializedBlob can be
 490   // inspected to see how many bytes actually did end up being read from
 491   // the pipe. Pass 0 if nothing is known about the expected size.
 492   // If something goes wrong (for example the blob system doesn't have enough
 493   // available space to store all the data from the stream) null will be
 494   // returned.
 495   RegisterFromStream(string content_type, string content_disposition,
 496                      uint64 length_hint,
 497                      handle<data_pipe_consumer> data,
 498                      pending_associated_remote<ProgressClient>? progress_client)
 499       => (SerializedBlob? blob);
 500
 501   // Returns a reference to an existing blob. Should not be used by new code,
 502   // is only exposed to make converting existing blob using code easier.
 503   // TODO(mek): Remove when crbug.com/740744 is resolved.
 504   [Sync] GetBlobFromUUID(pending_receiver<Blob> blob, string uuid) => ();
 505
 506   // Returns a BlobURLStore for a specific origin.
 507   URLStoreForOrigin(url.mojom.Origin origin,
 508                     pending_associated_receiver<blink.mojom.BlobURLStore> url_store);
 509 };
 510 ```
 511
 512 We can see that this interface references multiple other interfaces; there are
 513 several different kinds of reference that we need to worry about:
 514
 515 **Additional fuzzable interfaces** - if an interface method can return a
 516 pending_remote<> or take a pending_receiver<> to an interface Foo, then we
 517 want our fuzzer to fuzz those interfaces too.
 518
 519 Here we would want to add `blink.mojom.Blob.RemoteAction` and
 520 `blink.mojom.BlobURLStore.AssociatedRemoteAction` to the possible actions
 521 that our fuzzer protobufs can take.
 522
 523 **Renderer-hosted interfaces** - if an interface method takes a pending_remote<>
 524 (or returns a pending_receiver<>), then we'll also want to add response handling
 525 to our fuzzer. This lets the fuzzer send fuzzer-side implementations of mojo
 526 interfaces, and handle fuzzing the values returned if those methods are called.
 527
Here we can see that `blink.mojom.ProgressClient` is needed, but we can also see
that we pass `blink.mojom.DataElement` structures to `BlobRegistry.Register`.
These can contain `pending_remote<blink.mojom.Blob>`, so we also need to support
`blink.mojom.Blob`.
 532
 533 These are handled similarly to the `RemoteAction`s, but the type that we need to
 534 add to our proto is instead `blink.mojom.ProgressClient.ReceiverAction`, and so
 535 on.
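
The two rules above can be summarised as a small decision function. This is
only a sketch of the reasoning: the struct and function are hypothetical, and
the returned strings simply follow the action type names used in this section
rather than being derived from the generated code.

```c++
#include <string>

enum class Endpoint { kRemote, kReceiver };

// Hypothetical description of one interface reference found in a mojom method.
struct InterfaceRef {
  std::string name;   // e.g. "blink.mojom.Blob"
  Endpoint endpoint;  // pending_remote<> or pending_receiver<>
  bool associated;    // pending_associated_* variant
  bool returned;      // true if it appears in the response, not the parameters
};

// Picks the kind of action to add to the testcase proto for this reference.
std::string ActionFor(const InterfaceRef& ref) {
  // Passing a receiver in, or getting a remote back, leaves the fuzzer holding
  // the remote end, so it can make calls on that interface.
  const bool fuzzer_holds_remote =
      (ref.endpoint == Endpoint::kReceiver) != ref.returned;
  if (fuzzer_holds_remote) {
    return ref.name +
           (ref.associated ? ".AssociatedRemoteAction" : ".RemoteAction");
  }
  // Otherwise the fuzzer hosts the implementation and must answer calls.
  return ref.name + ".ReceiverAction";
}
```

Applied to `BlobRegistry`, the `pending_receiver<blink.mojom.Blob>` parameter
maps to `blink.mojom.Blob.RemoteAction`, the associated `BlobURLStore` receiver
maps to `blink.mojom.BlobURLStore.AssociatedRemoteAction`, and the
`ProgressClient` remote maps to `blink.mojom.ProgressClient.ReceiverAction`.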
 536
 537 We can continue applying this logic recursively to all of the interfaces that
 538 might be accessed - this comes down to a question of what dependencies are most
 539 likely to be important in getting good coverage, so the later step of examining
 540 code coverage may also help in guiding the addition of new interfaces here.
 541
 542 `blob_registry_mojolpm_fuzzer.proto` illustrates how these responses can be added
 543 to the testcase proto.
 544
 545 ## Start fuzzing
 546
Once the fuzzer is up and running, we probably want to remove
`dcheck_always_on` from our GN arguments:
 548
 549 ```
 550 enable_mojom_fuzzer = true
 551 is_asan = true
 552 is_component_build = true
 553 is_debug = false
 554 optimize_for_fuzzing = true
 555 use_goma = true
 556 use_libfuzzer = true
 557 ```
 558
 559 The reason for this is that while DCHECKs are often useful when fuzzing (and a
 560 good indication of potential bugs), the Mojo serialization code often contains
 561 quite a few DCHECKs, and our fuzzer is essentially serializing untrusted data
 562 before it can deserialize that data on the Browser-process side. This means
 563 that we can easily get blocked by a "completely valid" DCHECK during
 564 serialisation that a compromised renderer would bypass. Removing DCHECKs will
 565 sometimes let the fuzzer continue in these situations, and will reduce spurious
 566 results, but if your fuzzer doesn't trigger any of these cases it may be
 567 beneficial to also fuzz with DCHECKs enabled. We'll discuss this below under
 568 [triage](#triage-notes).
 569
 570 If your coverage isn't going up at all, then you've probably made a mistake and
 571 it likely isn't managing to actually interact with the interface you're trying
 572 to fuzz - try using the code coverage output from the next step to debug what's
 573 going wrong.
 574
 575
 576 ## (Optional) Run coverage
 577
 578 In many cases it's useful to check the code coverage to see if we can benefit
 579 from adding some manual testcases to get deeper coverage. For this example I
 580 used the following gn arguments and command:
 581
 582 ```
 583 enable_mojom_fuzzer = true
 584 is_component_build = false
 585 is_debug = false
 586 use_clang_coverage = true
 587 use_goma = true
 588 use_libfuzzer = true
 589 ```
 590
 591 ```
 592 python tools/code_coverage/coverage.py code_cache_host_mojolpm_fuzzer -b out/Coverage -o ManualReport -c "out/Coverage/code_cache_host_mojolpm_fuzzer -ignore_timeouts=1 -timeout=4 -runs=0 /dev/shm/corpus" -f content
 593 ```
 594
 595 With the CodeCacheHost, looking at the coverage after a few hours we could see
 596 that there's definitely some room for improvement:
 597
 598 ```c++
 599 /* 55       */ absl::optional<GURL> GetSecondaryKeyForCodeCache(const GURL& resource_url,
 600 /* 56 53.6k */ int render_process_id) {
 601 /* 57 53.6k */    if (!resource_url.is_valid() || !resource_url.SchemeIsHTTPOrHTTPS())
 602 /* 58 53.6k */      return absl::nullopt;
 603 /* 59 0     */
 604 /* 60 0     */    GURL origin_lock =
 605 /* 61 0     */        ChildProcessSecurityPolicyImpl::GetInstance()->GetOriginLock(
 606 /* 62 0     */            render_process_id);
 607 ```
 608
 609 ## (Optional) Improve corpus manually
 610
 611 It's fairly easy to improve the corpus manually, since our corpus files are just
 612 protobuf files that describe the sequence of interface calls to make.
 613
 614 There are a couple of approaches that we can take here - we'll try building a
 615 small manual seed corpus that we'll use to kick-start our fuzzer. Since it's
 616 easier to edit text protos, MojoLPM can automatically convert our seed corpus
 617 from text protos to binary protos during the build, making this slightly less
 618 painful for us, and letting us store our corpus in-tree in a readable format.
 619
 620 So, we'll create a new folder to hold this seed corpus, and craft our first
 621 file:
 622
 623 ```
 624 actions {
 625   new_code_cache_host {
 626     id: 1
 627     render_process_id: 0
 628     origin_id: ORIGIN_A
 629   }
 630 }
 631 actions {
 632   code_cache_host_remote_action {
 633     id: 1
 634     m_did_generate_cacheable_metadata {
 635       m_cache_type: CodeCacheType_kJavascript
 636       m_url {
 637         new {
 638           id: 1
 639           m_url: "http://aaa.com/test"
 640         }
 641       }
 642       m_data {
 643         new {
 644           id: 1
 645           m_bytes {
 646           }
 647         }
 648       }
 649       m_expected_response_time {
 650       }
 651     }
 652   }
 653 }
 654 sequences {
 655   action_indexes: 0
 656   action_indexes: 1
 657 }
 658 sequence_indexes: 0
 659 ```
 660
 661 We can then add some new entries to our build target to have the corpus
 662 converted to binary proto directly during build.
 663
 664 ```
 665   testcase_proto_kind = "content.fuzzing.code_cache_host.proto.Testcase"
 666
 667   seed_corpus_sources = [
 668     "code_cache_host_mojolpm_fuzzer_corpus/did_generate_cacheable_metadata.textproto",
 669   ]
 670 ```
 671
We can now run a new coverage report using this single-file seed corpus. Note
that the binary corpus files will be written to your output directory, in this
case as `code_cache_host_mojolpm_fuzzer_seed_corpus.zip`:
 675
 676 ```
 677 autoninja -C out/Coverage chrome
 678 rm -rf /tmp/corpus; mkdir /tmp/corpus; unzip out/Coverage/code_cache_host_mojolpm_fuzzer_seed_corpus.zip -d /tmp/corpus
 679 python tools/code_coverage/coverage.py code_cache_host_mojolpm_fuzzer -b out/Coverage -o ManualReport -c "out/Coverage/code_cache_host_mojolpm_fuzzer -ignore_timeouts=1 -timeout=4 -runs=0 /tmp/corpus" -f content
 680 ```
 681
We can see that we're now getting some more coverage:

```c++
/* 118   */ void CodeCacheHostImpl::DidGenerateCacheableMetadata(
/* 119   */     blink::mojom::CodeCacheType cache_type,
/* 120   */     const GURL& url,
/* 121   */     base::Time expected_response_time,
/* 122 2 */       mojo_base::BigBuffer data) {
/* 123 2 */     if (!url.SchemeIsHTTPOrHTTPS()) {
/* 124 0 */       mojo::ReportBadMessage("Invalid URL scheme for code cache.");
/* 125 0 */       return;
/* 126 0 */     }
/* 127 2 */
/* 128 2 */     DCHECK_CURRENTLY_ON(BrowserThread::UI);
/* 129 2 */
/* 130 2 */     GeneratedCodeCache* code_cache = GetCodeCache(cache_type);
/* 131 2 */     if (!code_cache)
/* 132 0 */       return;
/* 133 2 */
/* 134 2 */     absl::optional<GURL> origin_lock =
/* 135 2 */         GetSecondaryKeyForCodeCache(url, render_process_id_);
/* 136 2 */     if (!origin_lock)
/* 137 0 */       return;
/* 138 2 */
/* 139 2 */     code_cache->WriteEntry(url, *origin_lock, expected_response_time,
/* 140 2 */                            std::move(data));
/* 141 2 */ }
```

Much better!

## Triage notes

MojoLPM fuzzers have a number of common failure modes that are fairly easy to
distinguish from real bugs in the implementation being fuzzed.

The first of these is any crash on the `fuzzer_thread`. Code in the
implementation should never, under any circumstances, be running on this thread,
so any crash on this thread is the result of a bug in the fuzzer itself, or
one of the other causes mentioned below.

The second is a DCHECK or other failure during Mojo serialization. Various
traits assert that they are serializing reasonable values. Since we need to
reuse this serialization code in the fuzzer to produce input to the
implementation, we can trigger these assertions on the `fuzzer_thread` while
processing input to send to the implementation.

The ASAN error output below illustrates both of these cases: the error happens
on the `fuzzer_thread`, and during serialization.

```
==2940792==ERROR: AddressSanitizer: ILL on unknown address 0x7fbd9391d0f9 (pc 0x7fbd9391d0f9 bp 0x7fbd24deb3e0 sp 0x7fbd24deb3e0 T5)
    #0 0x7fbd9391d0f9 in unsigned int base::internal::CheckOnFailure::HandleFailure<unsigned int>() base/numerics/safe_conversions_impl.h:122:5
    #1 0x7fbd9391ba78 in unsigned int base::internal::checked_cast<unsigned int, base::internal::CheckOnFailure, unsigned long>(unsigned long) base/numerics/safe_conversions.h:114:16
    #2 0x7fbd9391ba28 in mojo::StructTraits<mojo_base::mojom::BigBufferSharedMemoryRegionDataView, mojo_base::internal::BigBufferSharedMemoryRegion>::size(mojo_base::internal::BigBufferSharedMemoryRegion const&) mojo/public/cpp/base/big_buffer_mojom_traits.cc:17:10
    #3 0x7fbd7f62fc2e in mojo::internal::Serializer<mojo_base::mojom::BigBufferSharedMemoryRegionDataView, mojo_base::internal::BigBufferSharedMemoryRegion>::Serialize(mojo_base::internal::BigBufferSharedMemoryRegion&, mojo::internal::Buffer*, mojo_base::mojom::internal::BigBufferSharedMemoryRegion_Data::BufferWriter*, mojo::internal::SerializationContext*) gen/mojo/public/mojom/base/big_buffer.mojom-shared.h:182:23
...
    #41 0x7fbd955376e8 in base::RunLoop::Run() base/run_loop.cc:124:14
    #42 0x7fbd95707f83 in base::Thread::Run(base::RunLoop*) base/threading/thread.cc:311:13
    #43 0x7fbd95708427 in base::Thread::ThreadMain() base/threading/thread.cc:382:3
    #44 0x7fbd957dfb40 in base::(anonymous namespace)::ThreadFunc(void*) base/threading/platform_thread_posix.cc:81:13
    #45 0x7fbd403866b9 in start_thread /build/glibc-LK5gWL/glibc-2.23/nptl/pthread_create.c:333
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: ILL (/mnt/scratch0/clusterfuzz/bot/builds/chromium-browser-libfuzzer_linux-release-asan_ae530a86793cd6b8b56ce9af9159ac101396e802/revisions/libfuzzer-linux-release-807440/libmojo_base_shared_typemap_traits.so+0x190f9)
Thread T5 (fuzzer_thread) created by T0 here:
    #0 0x56433ef70b3a in pthread_create third_party/llvm/compiler-rt/lib/asan/asan_interceptors.cpp:214:3
...
    #14 0x56433f15380c in main third_party/libFuzzer/src/FuzzerMain.cpp:19:10
    #15 0x7fbd3c38a82f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/libc-start.c:291
==2940792==ABORTING
```

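The two checks described above can be sketched as a small triage helper. This
is only an illustrative sketch, not part of MojoLPM: the function name
`triage_asan_report` is hypothetical, it assumes the report has the same shape
as the one above (a header ending in the crashing thread id, followed by a
`Thread T<n> (<name>) created by` line), and it treats any crash-stack frame
mentioning `mojo::internal::Serializer<` or `mojo::StructTraits<` as a hint
that the failure happened during serialization. Calling it on the report above
should flag both cases.

```python
import re

# Header of the report format shown above; it ends with the crashing
# thread id, e.g. "... sp 0x7fbd24deb3e0 T5)".
_HEADER_RE = re.compile(r"ERROR: AddressSanitizer: .*\bT(\d+)\)\s*$")
# Thread description line, e.g. "Thread T5 (fuzzer_thread) created by T0 here:".
_THREAD_RE = re.compile(r"^Thread T(\d+) \(([^)]*)\) created by")
# Assumption: frames containing these strings indicate Mojo serialization.
_SERIALIZATION_MARKERS = ("mojo::internal::Serializer<", "mojo::StructTraits<")


def triage_asan_report(report):
    """Summarize whether a crash looks like a fuzzer-side failure."""
    crash_thread = None
    thread_names = {}
    in_crash_stack = False
    in_serialization = False

    for raw_line in report.splitlines():
        line = raw_line.strip()

        header = _HEADER_RE.search(line)
        if header and crash_thread is None:
            crash_thread = header.group(1)
            in_crash_stack = True
            continue

        thread = _THREAD_RE.match(line)
        if thread:
            thread_names[thread.group(1)] = thread.group(2)
            in_crash_stack = False
            continue

        if in_crash_stack:
            if line.startswith("#"):
                # A frame of the crashing stack.
                if any(marker in line for marker in _SERIALIZATION_MARKERS):
                    in_serialization = True
            elif line != "...":
                # Anything other than a frame or an elision ends the stack.
                in_crash_stack = False

    thread_name = thread_names.get(crash_thread)
    return {
        "crash_thread": None if crash_thread is None else "T" + crash_thread,
        "thread_name": thread_name,
        "on_fuzzer_thread": thread_name == "fuzzer_thread",
        "in_serialization": in_serialization,
    }
```
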
[markbrand@google.com]:mailto:markbrand@google.com?subject=[MojoLPM%20Help]:%20&cc=fuzzing@chromium.org
[libfuzzer]: https://source.chromium.org/chromium/chromium/src/+/main:testing/libfuzzer/getting_started.md
[Protocol Buffers]: https://developers.google.com/protocol-buffers/docs/cpptutorial
[libprotobuf-mutator]: https://source.chromium.org/chromium/chromium/src/+/main:testing/libfuzzer/libprotobuf-mutator.md
[testing in Chromium]: https://source.chromium.org/chromium/chromium/src/+/main:docs/testing/testing_in_chromium.md
[interfaces]: https://source.chromium.org/search?q=interface%5Cs%2B%5Cw%2B%5Cs%2B%7B%20f:%5C.mojom$%20-f:test
