src/doxygen.md

   1 Doxygen Internals {#mainpage}
   2 =================
   3
   4 Introduction
   5 ------------
   6
   7 This page provides a high-level overview of the internals of doxygen, with
   8 links to the relevant parts of the code. This document is intended for
   9 developers who want to work on doxygen. Users of doxygen are referred to the
  10 [User Manual](http://www.doxygen.org/manual.html).
  11
  12 The generic starting point of the application is of course the main() function.
  13
  14 Configuration options
  15 ---------------------
  16
  17 Configuration file data is stored in the singleton class Config and can be
  18 accessed using the wrapper macros
  19 Config_getString(), Config_getInt(), Config_getList(),
  20 Config_getEnum(), and Config_getBool(), depending on the type of the
  21 option.
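
As an illustration of the pattern described above (a singleton store fronted by typed accessor macros), here is a minimal, self-contained sketch. `MiniConfig` and the `MiniConfig_get*` macros are hypothetical names invented for this example; they are not doxygen's actual Config class or macros, and the real implementation may differ considerably.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for a singleton configuration store (NOT doxygen's Config).
class MiniConfig {
public:
    using Value = std::variant<bool, int, std::string>;

    // Single shared instance, created on first use.
    static MiniConfig &instance() {
        static MiniConfig cfg;
        return cfg;
    }

    // Store an option. Pass std::string explicitly for string options:
    // a bare string literal may be converted to bool by some compilers.
    void set(const std::string &name, Value v) { options_[name] = std::move(v); }

    // Typed lookup: throws if the option is unknown or has a different type.
    template <class T>
    const T &get(const std::string &name) const {
        auto it = options_.find(name);
        if (it == options_.end())
            throw std::out_of_range("unknown option: " + name);
        if (const T *p = std::get_if<T>(&it->second))
            return *p;
        throw std::invalid_argument("wrong type for option: " + name);
    }

private:
    MiniConfig() = default;
    std::map<std::string, Value> options_;
};

// Wrapper macros in the spirit of Config_getBool() and friends (illustrative names).
#define MiniConfig_getBool(name)   (MiniConfig::instance().get<bool>(name))
#define MiniConfig_getInt(name)    (MiniConfig::instance().get<int>(name))
#define MiniConfig_getString(name) (MiniConfig::instance().get<std::string>(name))
```

With this sketch, a caller would write `MiniConfig_getInt("TAB_SIZE")` and get a type error at run time if `TAB_SIZE` had been stored as a string, which is the kind of check a per-type accessor makes possible.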
  22
  23 The format of the configuration file (options and types) is defined
  24 by the file `config.xml`. As part of the build process,
  25 the Python script `configgen.py` creates the file `configoptions.cpp`
  26 from this, which serves as the input for the configuration file parser
  27 that is invoked using Config::parse(). The script `configgen.py` also
  28 generates the documentation for the configuration items in the file
  29 `config.doc`.
  30
  31 Gathering Input files
  32 ---------------------
  33
  34 After the configuration is known, the input files are searched using
  35 searchInputFiles() and any tag files are read using readTagFile().
  36
  37 Parsing Input files
  38 -------------------
  39
  40 The function parseFiles() takes care of parsing all files.
  41 It uses the ParserManager singleton factory to create a suitable parser object
  42 for each file. Each parser implements the abstract interface ParserInterface.
  43
  44 As a first step, if the parser indicates that it needs preprocessing
  45 via ParserInterface::needsPreprocessing(), doxygen calls preprocessFile()
  46 on the file.
  47
  48 A second step is to convert multiline C++-style comments into C-style comments
  49 for easier processing later on. As a side effect of this step,
  50 aliases (ALIASES option) are also resolved. The function that performs these
  51 two tasks is called convertCppComments().
  52
  53 *Note:* Alias resolution would be better done in a separate step, as it is
  54 now coupled to C/C++ code and does not work automatically for other languages!
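
To make the comment conversion more concrete, the following sketch rewrites runs of two or more `///` lines into a single C-style block while keeping the number of lines unchanged. It is a simplified illustration only: `sketchConvertCppComments()` is a hypothetical function, it ignores indentation, `//!` comments, and alias resolution, and it makes no claim to match what convertCppComments() really does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Position of a leading "///" (after optional blanks), or npos if there is none.
static std::size_t tripleSlashPos(const std::string &line) {
    std::size_t pos = line.find_first_not_of(" \t");
    if (pos != std::string::npos && line.compare(pos, 3, "///") == 0)
        return pos;
    return std::string::npos;
}

// Hypothetical helper: converts each run of >= 2 consecutive "///" lines into
// one "/** ... */" block. The line count is preserved by closing the block on
// the last line of the run. Single "///" lines are left untouched.
std::vector<std::string> sketchConvertCppComments(const std::vector<std::string> &lines) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < lines.size()) {
        // Find the end of the run of "///" lines starting at i.
        std::size_t j = i;
        while (j < lines.size() && tripleSlashPos(lines[j]) != std::string::npos)
            ++j;
        if (j - i < 2) {          // not a multi-line comment: copy one line as-is
            out.push_back(lines[i]);
            ++i;
            continue;
        }
        for (std::size_t k = i; k < j; ++k) {
            std::string text = lines[k].substr(tripleSlashPos(lines[k]) + 3);
            out.push_back((k == i ? "/**" : " *") + text);
        }
        out.back() += " */";      // close the block on its last line
        i = j;
    }
    return out;
}
```

Keeping the line count stable is a deliberate choice in this sketch, so that line numbers reported by later stages would still refer to the original file.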
  55
  56 The third step is the actual language parsing and is done by calling
  57 ParserInterface::parseInput() on the parser interface returned by
  58 the ParserManager.
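
The steps above can be sketched as a small pipeline: look up a parser, preprocess if the parser asks for it, convert comments, then parse. All names in this sketch (`SketchParser`, `SketchParserManager`, `sketchParseFile()` and the stand-in step functions) are hypothetical, and the steps are reduced to string tagging so that the control flow is visible; the real ParserInterface and ParserManager have different signatures.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical abstract parser interface, loosely modelled on the description above.
struct SketchParser {
    virtual ~SketchParser() = default;
    virtual bool needsPreprocessing(const std::string &extension) const = 0;
    virtual std::string parseInput(const std::string &text) = 0;
};

struct CLikeParser : SketchParser {
    bool needsPreprocessing(const std::string &) const override { return true; }
    std::string parseInput(const std::string &text) override { return "c:" + text; }
};

struct PythonLikeParser : SketchParser {
    bool needsPreprocessing(const std::string &) const override { return false; }
    std::string parseInput(const std::string &text) override { return "py:" + text; }
};

// Hypothetical registry mapping a file extension to a parser, with a fallback.
class SketchParserManager {
public:
    explicit SketchParserManager(std::shared_ptr<SketchParser> fallback)
        : fallback_(std::move(fallback)) {}

    void registerParser(const std::string &extension, std::shared_ptr<SketchParser> parser) {
        parsers_[extension] = std::move(parser);
    }

    SketchParser &getParser(const std::string &extension) const {
        auto it = parsers_.find(extension);
        return it != parsers_.end() ? *it->second : *fallback_;
    }

private:
    std::shared_ptr<SketchParser> fallback_;
    std::map<std::string, std::shared_ptr<SketchParser>> parsers_;
};

// Stand-ins for the preprocessing and comment conversion steps.
std::string sketchPreprocess(const std::string &text) { return "pp(" + text + ")"; }
std::string sketchConvertComments(const std::string &text) { return "cc(" + text + ")"; }

// The three steps in order: optional preprocessing, comment conversion, parsing.
std::string sketchParseFile(const SketchParserManager &manager,
                            const std::string &extension, const std::string &text) {
    SketchParser &parser = manager.getParser(extension);
    std::string buffer = text;
    if (parser.needsPreprocessing(extension))
        buffer = sketchPreprocess(buffer);
    buffer = sketchConvertComments(buffer);
    return parser.parseInput(buffer);
}
```

In this sketch the comment conversion runs for every language, which is an assumption made for simplicity.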
  59
  60 The result of parsing is a tree of Entry objects.
  61 These Entry objects are wrapped in an EntryNav object and stored on disk using
  62 Entry::createNavigationIndex() on the root node of the tree.
  63
  64 Each Entry object roughly contains the raw data for a symbol and is later
  65 converted into a Definition object.
  66
  67 When a parser finds a special comment block in the input, it does a first
  68 parsing pass via parseCommentBlock(). During this pass the comment block
  69 is split into multiple parts if needed, and data that is needed later,
  70 such as section labels, xref items, and formulas, is extracted.
  71 Markdown markup is also processed during this pass, using processMarkdown().
  72
  73 Resolving relations
  74 -------------------
  75
  76 The Entry objects created and filled during parsing are stored on disk
  77 (to keep memory needs low). The name, parent/child relation, and
  78 location on disk of each Entry is stored as a tree of EntryNav nodes, which is
  79 kept in memory.
  80
  81 Doxygen does a number of tree walks over the EntryNav nodes in the tree to
  82 build up the data structures needed to produce the output.
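
The idea of keeping only a light navigation tree in memory while the bulky data lives on disk can be sketched as follows. This is an illustration under stated assumptions: `SketchEntry`, `SketchNav`, `storeEntry()` and `loadEntry()` are hypothetical names, a `std::stringstream` stands in for the on-disk file, and the record layout is invented for the example.

```cpp
#include <cstdint>
#include <ios>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bulky record that lives in the backing store.
struct SketchEntry {
    std::string name;
    std::string body;
};

// Hypothetical light-weight node kept in memory: name, relations, and location.
struct SketchNav {
    std::string name;
    std::streamoff offset = -1;                       // where the record starts
    SketchNav *parent = nullptr;
    std::vector<std::unique_ptr<SketchNav>> children;
};

// Length-prefixed string helpers (layout invented for this sketch).
static void writeString(std::ostream &os, const std::string &s) {
    std::uint32_t n = static_cast<std::uint32_t>(s.size());
    os.write(reinterpret_cast<const char *>(&n), sizeof n);
    os.write(s.data(), n);
}

static std::string readString(std::istream &is) {
    std::uint32_t n = 0;
    is.read(reinterpret_cast<char *>(&n), sizeof n);
    std::string s(n, '\0');
    is.read(&s[0], n);
    return s;
}

// Appends the entry to the store and records its offset in a new child node.
SketchNav *storeEntry(std::stringstream &store, SketchNav &parent, const SketchEntry &entry) {
    auto node = std::make_unique<SketchNav>();
    node->name = entry.name;
    node->offset = store.tellp();   // the write position is always at the end here
    node->parent = &parent;
    writeString(store, entry.name);
    writeString(store, entry.body);
    parent.children.push_back(std::move(node));
    return parent.children.back().get();
}

// Loads the full record for a node on demand.
SketchEntry loadEntry(std::stringstream &store, const SketchNav &nav) {
    store.seekg(nav.offset);
    SketchEntry entry;
    entry.name = readString(store);
    entry.body = readString(store);
    return entry;
}
```

A tree walk in this model touches only `SketchNav` nodes and calls `loadEntry()` for the few nodes whose full data it actually needs.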
  83
  84 The resulting data structures all derive from the generic base class
  85 Definition, which holds all non-specific data for a symbol definition.
  86
  87 Definition is an abstract base class. Concrete subclasses are:
  88 - ClassDef: for storing class/struct/union related data
  89 - NamespaceDef: for storing namespace related data
  90 - FileDef: for storing file related data
  91 - DirDef: for storing directory related data
  92
  93 For doxygen-specific concepts the following subclasses are available:
  94 - GroupDef: for storing grouping related data
  95 - PageDef: for storing page related data
  96
  97 Finally, the data for members of classes, namespaces, and files is stored in
  98 the subclass MemberDef.
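
The shape of such a hierarchy can be sketched in a few lines. The classes below are hypothetical miniatures (`SketchDefinition`, `SketchClassDef`, `SketchMemberDef`) that only show the relationship between an abstract base holding the common data and subclasses adding kind-specific data; the real classes carry far more state and a different interface.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical abstract base: holds data common to every kind of symbol.
class SketchDefinition {
public:
    explicit SketchDefinition(std::string name) : name_(std::move(name)) {}
    virtual ~SketchDefinition() = default;

    const std::string &name() const { return name_; }
    virtual std::string kind() const = 0;   // makes the base class abstract

private:
    std::string name_;
};

// Hypothetical member symbol (function, variable, ...).
class SketchMemberDef : public SketchDefinition {
public:
    using SketchDefinition::SketchDefinition;
    std::string kind() const override { return "member"; }
};

// Hypothetical class symbol: adds class-specific data, here its members.
class SketchClassDef : public SketchDefinition {
public:
    using SketchDefinition::SketchDefinition;
    std::string kind() const override { return "class"; }

    void addMember(std::unique_ptr<SketchMemberDef> member) {
        members_.push_back(std::move(member));
    }
    const std::vector<std::unique_ptr<SketchMemberDef>> &members() const { return members_; }

private:
    std::vector<std::unique_ptr<SketchMemberDef>> members_;
};
```

Code that only needs the common data can work with `const SketchDefinition &` and stay independent of the concrete symbol kind.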
  99
 100 Producing debug output
 101 ----------------------
 102
  103 Within doxygen there are a number of ways to obtain debug output. Besides the
  104 invasive method of putting print statements in the code, there are a number of
  105 easy ways to get debug information.
 106
  107 - Compilation of `.l` files<br>
  108   This is also an invasive method, but it is carried out automatically by the
  109   `flex / lex` command. The result is that for each input line the (lex) rule(s)
  110   applied to it are shown.
  111   - Windows
 112     - in the Visual C++ GUI
 113       - find the required `.l` file
 114       - select the `Properties` of this file
 115       - set the item `Write used lex rules` to `Yes`
 116       - see to it that the `.l` file is newer than the corresponding `.cpp` file
 117         or remove the corresponding `.cpp` file
  118   - Unix-like systems
  119     - global change<br>
  120       In the chapter "Doxygen's internals" a `perl` script is given to toggle
  121       the generation of the rules debug information.
  122     - command line change<br>
  123       It is possible to pass the option `LEX="flex -d"` to the `make` command on the
  124       command line. In this case the `.l` files that are converted to the corresponding
  125       `.cpp` files during this `make` run get the rules debug information.<br>
  126       To remove the rules debug information output again, recompile the file with
  127       just `make`.<br>
  128       Note that this method applies to all the `.l` files that are rebuilt to `.cpp`
  129       files, so be sure that only the `.l` file(s) for which you want the
  130       rules debug information is (are) newer than the corresponding `.cpp`
  131       file(s).
 132 - Running doxygen<br>
  133   When running doxygen it is possible to specify the `-d` option with one of the
  134   following values (each value has to be preceded by its own `-d`):
  135   - findmembers<br>
  136     Shows the scope, arguments, and other relevant information for global, class, and module members.
  137   - functions<br>
  138     Shows the scope, arguments, and other relevant information for functions.
  139   - variables<br>
  140     Shows the scope and other relevant information for variables.
  141   - classes<br>
  142     Shows the scope and other relevant information for classes and modules.
 143   - preprocessor<br>
  144     Shows the results of the preprocessing phase, i.e. the effect of include files,
  145     <tt>\#define</tt> statements etc., and of settings in the doxygen configuration file such as
  146     `EXPAND_ONLY_PREDEF`, `PREDEFINED` and `MACRO_EXPANSION`.
 147   - commentcnv<br>
  148     Shows the results of the comment conversion. The comment conversion does the
  149     following:
 150      - It converts multi-line C++ style comment blocks (that are aligned)
 151        to C style comment blocks (if `MULTILINE_CPP_IS_BRIEF` is set to `NO`).
 152      - It replaces aliases with their definition (see `ALIASES`)
 153      - It handles conditional sections (<tt>\\cond ... \\endcond</tt> blocks)
 154   - commentscan<br>
 155     Will print each comment block before and after the comment is interpreted by
 156     the comment scanner.
 157   - printtree<br>
  158     Prints the results in a pretty-printed way, i.e. in an XML-like format with each
  159     level indented by a `"."` (dot).
 160   - time<br>
  161     Provides timing information for the different stages of the doxygen process.
 162   - extcmd<br>
 163     Shows which external commands are executed and which pipes are opened.
 164   - markdown<br>
 165     Will print each comment block before and after Markdown processing.
 166   - filteroutput<br>
  167     Shows the output produced by the filter command (when a filter
  168     command is specified).
 169   - validate<br>
 170     Currently not used
 171   - lex<br>
  172     Provides output about the `lex` files used. When a lexer is started and when a lexer
  173     ends, the name of the `lex` file is given, so it is possible to see in which lexer a
  174     problem occurs. This makes it easier to select the file to be compiled in `lex` debug mode.
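
The rule that every debug value needs its own `-d` can be illustrated with a small argument scanner that folds the values into a bit mask. This is a sketch only: `SketchDebugMask`, `sketchParseDebugArgs()` and the chosen subset of labels are assumptions for the example and do not describe how doxygen itself stores its debug flags.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical bit flags for a few of the debug values listed above.
enum SketchDebugMask : std::uint32_t {
    DebugNone         = 0,
    DebugPreprocessor = 1u << 0,
    DebugCommentCnv   = 1u << 1,
    DebugTime         = 1u << 2,
    DebugLex          = 1u << 3
};

// Scans an argument list for "-d <value>" pairs and returns the combined mask.
// Values that are not recognised are collected in `unknown`.
std::uint32_t sketchParseDebugArgs(const std::vector<std::string> &args,
                                   std::vector<std::string> &unknown) {
    static const std::map<std::string, std::uint32_t> labels = {
        {"preprocessor", DebugPreprocessor},
        {"commentcnv",   DebugCommentCnv},
        {"time",         DebugTime},
        {"lex",          DebugLex}
    };
    std::uint32_t mask = DebugNone;
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] != "-d")
            continue;
        const std::string &label = args[++i];   // consume the value after -d
        auto it = labels.find(label);
        if (it != labels.end())
            mask |= it->second;
        else
            unknown.push_back(label);
    }
    return mask;
}
```

For an argument list equivalent to `-d preprocessor -d time Doxyfile`, the sketch returns a mask with the preprocessor and time bits set and leaves `Doxyfile` untouched, since it is not preceded by `-d`.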
 175
 176 Producing output
 177 ----------------
 178
 179 TODO
 180
 181 Topics TODO
 182 -----------
 183 - Grouping of files in Model / Parser / Generator categories
 184 - Index files based on IndexIntf
 185   - HTML navigation
 186   - HTML Help (chm)
  187   - Documentation Sets (Xcode)
 188   - Qt Help (qhp)
 189   - Eclipse Help
 190 - Search index
  191   - JavaScript based
 192   - Server based
 193   - External
 194 - Citations
 195   - via bibtex
 196 - Various processing steps for a comment block
 197   - comment conversion
 198   - comment scanner
 199   - markdown processor
 200   - doc tokenizer
 201   - doc parser
 202   - doc visitors
 203 - Diagrams and Images
 204   - builtin
 205   - via Graphviz dot
 206   - via mscgen
 207   - PNG generation
 208 - Output formats: OutputGen, OutputList, and DocVisitor
  209   - HTML:  HtmlGenerator and HtmlDocVisitor
  210   - LaTeX: LatexGenerator and LatexDocVisitor
 211   - RTF:   RTFGenerator and RTFDocVisitor
 212   - Man:   ManGenerator and ManDocVisitor
 213   - XML:   generateXML() and XmlDocVisitor
 214   - print: debugging via PrintDocVisitor
 215   - text:  TextDocVisitor for tooltips
 216   - perlmod
 217 - i18n via Translator and language.cpp
 218 - Customizing the layout via LayoutDocManager
 219 - Parsers
 220   - C Preprocessing
 221     - const expression evaluation
  222   - C-like languages
 223   - Python
 224   - Fortran
 225   - VHDL
 226   - TCL
 227   - Tag files
 228 - Marshaling to/from disk
 229 - Portability functions
 230 - Utility functions
 231