[Serialization] Add -ftime-trace block for reading loaded modules.
The existing ReadAST block only describes the top-level PCM file being
loaded, when in practice most of the time taken is dealing with other
PCM files which are loaded in turn.
Because this work isn't strictly recursive (first all the modules are
discovered, then processsed in several flat loops), we can't have a neat
recursive structure like processing of source files. Instead, slap a
timer on the largest of these boxes: reading the AST block for modules.
In practice this shows where most of the time goes, and in particular
which modules are most expensive.