TODO

   1 Before 1.1:
   2
   3 * Move perfcounter.h into sysprof namespace
   4
   5 * Check for the presence of __NR_perf_counter_open in
   6   syscall.h
   7
   8 Before 1.2:
   9
  10 * Filter out kernel stacktrace entries from the interrupts.
  11   (This can be done better in the kernel).
  12
  13 * Make sure errors are reported properly.
  14
  15 * Fork GtkLabel and change it to only request a resize when the new size is
  16   actually larger than the old one. It could also be special-purposed for
  17   the "Samples: <number>" format with the number left-aligned.
  18
  19 * When should the samples label be updated? On one hand we don't want
  20   it to be the only thing that shows up on the profile. On the other,
  21   when there are things going on, it should update quickly.
  22
  23   It is also desirable that it updates slowly if there is slow
  24   activity going on; for example if you are moving the mouse cursor
  25   around.
  26
  27   Cost of updating the samples label:   s       (in samples)
  28   Base rate:                            f
  29
  30   If we update the samples label c times per second, the cost is
  31
  32         c * s   samples per second.
  33
  34   If we instead update the samples label once every k samples, then
  35
  36         (k - s)/f   is the time it takes before updating
  37
  38   So if the update rate should be proportional to the base rate, then
  39   we get
  40
  41         (k - s)/f = d/f
  42
  43   which implies k = d + s. So we should pick some constant d and only
  44   update when that many samples have arrived.
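
  A minimal sketch of the k = d + s rule above. The struct and function
  names are made up for illustration and are not existing sysprof code;
  d and s are whatever constants end up being chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the "Samples: <number>" label. */
typedef struct {
    uint64_t d;            /* chosen constant: useful samples between updates */
    uint64_t s;            /* cost of one label update, in samples */
    uint64_t last_update;  /* sample count when the label was last updated */
} label_state_t;

/* k = d + s: number of samples that must arrive between two updates. */
static uint64_t
label_update_interval (const label_state_t *state)
{
    return state->d + state->s;
}

/* Returns true (and records the update) once k new samples have arrived. */
static bool
label_should_update (label_state_t *state, uint64_t n_samples)
{
    assert (n_samples >= state->last_update);

    if (n_samples - state->last_update < label_update_interval (state))
        return false;

    state->last_update = n_samples;
    return true;
}
```

  Because the decision depends only on the number of samples, the label
  updates quickly under load and slowly when little is going on.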
  45
  46 * The counters seem to not be disabled when looking at the
  47   profiles.
  48
  49 * Build system
  50    - Create RPM package? See fedora-packaging-list for information
  51      about how to package kernel modules. Lots of threads in
  52      June 2005 and forward.
  53
  54      See also http://www.fedoraproject.org/wiki/Extras/KernelModuleProposal
  55
  56      Someone already did create a package - should be googlable.
  57
  58 * The hrtimer in the kernel currently generates an event every time
  59   the timer fires. There are two problems with this:
  60
  61   - Essentially all the events are idle events and exclude_idle is
  62     ignored.
  63
  64   - If you make it obey exclude_idle, it still generates activity
  65     based on what is running currently. Unfortunately, the thing that
  66     is running will very often be the userspace tracker because it was
  67     handling the last sample generated. So this degenerates into an
  68     infinite loop of sorts. Or maybe the GUI is just too slow, but
  69     then why doesn't it happen with the real counters?
  70
  71     I think the solution here is to make the hrtimer fire at some
  72     reasonable interval, like 100000 ns. When the timer fires, if the
  73     current task is not the idle task, it increments a counter by
  74
  75                 cpu clock frequency * 100000 ns
  76
  77     If that overflows the sample period, an event is generated.
  78
  79     This is closer to the idea of a fake CPU cycle counter.
  80
  81   Also, for reasons I don't understand, it stops generating events
  82   completely if a process is running that spins on all CPUs. So this
  83   interface is not usable in its present state, but fortunately all
  84   CPUs we care about have hardware cycle counters.
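
  The fake cycle counter proposed above could look roughly like this.
  It is a plain C model of the arithmetic only, not kernel code; the
  struct, the function name and the way the idle state is passed in
  are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TIMER_INTERVAL_NS 100000ULL   /* timer interval suggested in the note */

typedef struct {
    uint64_t cpu_hz;         /* CPU clock frequency, cycles per second */
    uint64_t sample_period;  /* cycles between two sample events */
    uint64_t counter;        /* fake cycles accumulated so far */
} fake_counter_t;

/* Called every time the timer fires. Returns true if a sample event
 * should be generated. Idle ticks do not advance the counter.
 */
static bool
fake_counter_tick (fake_counter_t *fc, bool task_is_idle)
{
    assert (fc->sample_period > 0);

    if (task_is_idle)
        return false;

    /* cycles = frequency [1/s] * interval [ns] / 10^9 [ns/s] */
    fc->counter += fc->cpu_hz * TIMER_INTERVAL_NS / 1000000000ULL;

    if (fc->counter < fc->sample_period)
        return false;

    /* Keep the remainder so that no cycles are lost. */
    fc->counter -= fc->sample_period;
    return true;
}
```

  Note that this generates at most one event per tick, so it only
  behaves like a cycle counter if the sample period is not much shorter
  than the timer interval.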
  85
  86 * With more than one CPU, we can get events out of order, so userspace
  87   will have to deal with that. With serial numbers we could do it
  88   correctly, but even without them we can do a pretty reasonable job
  89   of putting them back in order. If a fork happens "soon" after a
  90   sample, it probably happened before the sample; if an mmap happens
  91   "soon" after a sample that would otherwise be unmapped, it probably
  92   happened before the sample. All we need is a way to determine what
  93   "soon" is.
  94
  95   Serial numbers would be useful to make "soon" an accurate measure.
  96
  97   There is also the issue of pid reuse, but that can probably be
  98   ignored.
  99
 100   If we ignore pid reuse, we can sort the event buffer where two
 101   events compare equal, unless both have the same pid and one is a
 102   fork and the other is not.
 103
 104   A system-wide serial number could be expensive to maintain though,
 105   so maybe time events would be just as good.
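
  One possible shape for the "soon" heuristic, for forks only. Here
  each event carries the time at which it was recorded, and a fork is
  moved in front of any sample of the same pid recorded at most
  soon_ns before it. The event layout, the field names and the choice
  of storing the child pid in fork events are assumptions; pid reuse is
  ignored as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { EVENT_SAMPLE, EVENT_FORK, EVENT_MMAP } event_type_t;

typedef struct {
    event_type_t type;
    int          pid;      /* for EVENT_FORK: pid of the new child (assumption) */
    uint64_t     time_ns;  /* time at which the event was recorded */
} event_t;

/* Move each fork event in front of the earliest sample of the same pid
 * that was recorded no more than soon_ns before the fork.
 */
static void
reorder_forks (event_t *events, size_t n_events, uint64_t soon_ns)
{
    assert (events != NULL || n_events == 0);

    for (size_t i = 1; i < n_events; i++) {
        if (events[i].type != EVENT_FORK)
            continue;

        event_t fork_event = events[i];
        size_t  target = i;

        /* Scan backwards through the events inside the "soon" window. */
        for (size_t j = i; j-- > 0; ) {
            if (events[j].time_ns + soon_ns < fork_event.time_ns)
                break;

            if (events[j].type == EVENT_SAMPLE &&
                events[j].pid == fork_event.pid)
                target = j;
        }

        if (target != i) {
            /* Shift the events in between one slot up, then insert. */
            memmove (&events[target + 1], &events[target],
                     (i - target) * sizeof (event_t));
            events[target] = fork_event;
        }
    }
}
```

  The same pass could be extended to mmap events, where the test would
  be whether the sample address falls inside the new mapping.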
 106
 107 * Another issue is processes that exit during the initial scan of
 108   /proc. Such a process will not cause sample events by itself, but it
 109   may fork a child that will. There is no way to get maps for that
 110   child.
 111
 112   A possible solution would be to optionally generate mmap events after
 113   forks. Userspace would turn this off when it was done with the
 114   initial gathering of processes.
 115
 116   Also, exec() events will delete the maps from a process, but all we
 117   get are 'comm' events, which are not quite the same thing.
 118
 119 * Find out why the busy cursor stays active after you hit start
 120
 121 * A kernel binary, when available, is better than kallsyms.
 122
 123 * GTK+ bugs:
 124   - Misposition window after click
 125   - Find out why gtk_tree_view_columns_autosize() apparently doesn't
 126     work on empty tree views.
 127   - Write my own tree model? There are still performance issues in
 128     FooTreeStore.
 129
 130 * Counters must not be destroyed during tracker setup. They have to
 131   exist but be disabled so that we can track process creation.
 132
 133 * Check that we don't use too much memory (in particular with the
 134   timeline).
 135
 136 * Fix names. "new process" is really "exec". (What does "comm"
 137   actually stand for? Command?)
 138
 139 * Fix ugly flash when double clicking in descendants view
 140
 141 * Find out what's up with weird two-step flash when you hit start when
 142   a profile is loaded.
 143
 144 * Make tracker creation faster. (Faster populating mainly)
 145
 146 * Share map reading code between vdso stuff in binfile.c and tracker.c
 147
 148 * Get rid of remaining gulongs (use uint64_t instead)
 149
 150 * Move binfile hash table to state_t.
 151
 152 * Get rid of process.c
 153
 154 * On 32 bit, NMI stackframes are not filtered out which leads to wrong
 155   kernel traces
 156
 157 * Can we track kernel modules being inserted/removed?
 158
 159 * Does it make sense to try and locate a kernel binary, or can we
 160   always just use kallsyms?
 161
 162 * open_inode() would be useful in many cases
 163
 164 * Is the double mapping of the ring buffer ok?
 165
 166 * Why do we get EINVAL when we try to track forks?
 167
 168 * Sometimes it gets samples for unknown processes. This may be due to
 169   forking without execing.
 170
 171 * Give an informative error message if not run as root
 172
 173 * What are the events generated for pid 0 and pid 1? They have no
 174   stacktrace, and an eip in the kernel.
 175
 176 * Delete the binparser stuff and fix the elf parser to just deal with
 177   32 bits vs 64 bits. Or use C++ like behdad does in harfbuzz?
 178
 179 * We often get "No map". I suspect this is because the vdso stackframe
 180   is strange.
 181
 182 * Hack to disable recursion for binaries without symbols causes the
 183   symbols to not work the way other symbols do.  A better approach is
 184   probably to simply generate a new symbol for every appearance except
 185   leaf nodes, which should still be considered one symbol (or maybe be
 186   considered the same symbol if they have the same parent). In fact
 187   "has same parent" may be the correct criterion in all cases. (done:
 188   master now doesn't fold those recursions anymore)
 189
 190   * See if we can make "In file <blah>" not be treated as a recursive
 191     function.  Maybe simply treat each individual address in the file
 192     as a function.  Or try to parse the machine code. Positions that
 193     are called are likely to be functions.
 194
 195     - Treat identical addresses as one function
 196
 197     - Treat all addresses within a library that don't have children
 198       as one function.
 199
 200     This will have the effect of coalescing adjacent siblings without
 201     children. Which is what you want since you can't tell them apart
 202     anyway. It will never be a great experience though.
 203
 204 * Make sure that labels look decent in case of "No Map" etc.
 205
 206 * Elf bugs:
 207         - error handling for bin_parser is necessary.
 208
 209         * Find out why all apps have an "In file /usr/bin/<app binary>" below
 210           _libc_main. If possible, maybe make up a name for it.
 211
 212 * vdso stuff:
 213         - the "[vdso]" string should be #defined somewhere
 214         - Does get_vdso_bytes() belong in process.c?
 215         - Is basing on "[vdso]" always correct?
 216
 217 * Convert things like [heap] and [stack] to more understandable labels.
 218
 219 * Strategies for taking reliable stacktraces.
 220
 221         Three different kinds of files
 222
 223         - kernel
 224         - vdso
 225         - regular elf files
 226
 227         - kernel
 228                 - eh_frame annotations, in kernel or in kernel debug
 229                 - /proc/kallsyms
 230                 - userspace can look at _stext and _etext to determine
 231                   start and end of kernel text segment
 232                 - copying kernel stack to userspace
 233                         - it's always 4096 bytes these days
 234                 - heuristically determine functions based on address
 235                         - callbacks on the stack can be identified
 236                           by having an offset of 0.
 237                         - even so there are a lot of false positives.
 238                 - is eh_frame usually loaded into memory during normal
 239                   operation? It is mapped, but probably not paged in,
 240                   so we will be taking a few major page faults when we
 241                   first profile something.
 242                         Unless of course, we store the entire stack in
 243                   the stackstash. This may use way too much memory though.
 244
 245                 - Locking, possibly useful code:
 246
 247                 /* In principle we should use get_task_mm() but
 248                  * that will use task_lock() leading to deadlock
 249                  * if somebody already has the lock
 250                  */
 251                 if (spin_is_locked (&current->alloc_lock))
 252                         printk ("alreadylocked\n");
 253                 {
 254                         struct mm_struct *mm = current->mm;
 255                         if (mm)
 256                         {
 257                                 printk (KERN_ALERT "stack size: %lu (%d)\n",
 258                                         mm->start_stack - regs->REG_STACK_PTR,
 259                                         current->pid);
 260
 261                                 stacksize = mm->start_stack - regs->REG_STACK_PTR;
 262                         }
 263                         else
 264                                 stacksize = 1;
 265                 }
 266
 267         - regular elf
 268                 - usually have eh_frame section which is mapped into memory
 269                   during normal operation
 270                 - do stackwalk in kernel based on eh_frame
 271                 - eh_frame section is usually mapped into memory, so
 272                   no file reading in kernel would be necessary.
 273                 - do stackwalk in userland based on eh_frame
 274                 - do ebp based stackwalk in kernel
 275                 - do ebp based stackwalk in userland
 276                 - do heuristic stackwalk in kernel
 277                 - do heuristic stackwalk in userland
 278
 279         - Send heuristic stack trace to user space, along with
 280           location on the stack. Then, in userspace analyze the
 281           machine code to determine the size of the stack frame at any
 282           point. The instructions that would need to be recognized are:
 283
 284                  subl <constant>, %esp
 285                  addl <constant>, %esp
 286                  leave
 287                  jcc
 288                  push
 289                  pop
 290
 291           GCC is unlikely to have different stack sizes at the entry
 292           to a basic block.
 293
 294           We can often find a vmlinux in /lib/modules/<uname-r>/build.
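
        A sketch of the frame-size analysis described above, for
        straight-line code only. It works on an already-decoded
        instruction list; the decoder, the enum and the function name
        are hypothetical. It assumes 32-bit x86 (4-byte push/pop) and a
        standard frame-pointer prologue, so that "leave" brings %esp
        back to the return address. jcc does not touch %esp and is
        treated like any other instruction here; following branches
        would need a walk over basic blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    INSN_SUB_ESP,   /* subl <constant>, %esp */
    INSN_ADD_ESP,   /* addl <constant>, %esp */
    INSN_PUSH,      /* push: 4 bytes on 32-bit x86 */
    INSN_POP,       /* pop:  4 bytes on 32-bit x86 */
    INSN_LEAVE,     /* leave: %esp = %ebp, then pop %ebp */
    INSN_OTHER      /* anything that does not change %esp (including jcc) */
} insn_kind_t;

typedef struct {
    insn_kind_t kind;
    int32_t     constant;   /* only used for INSN_SUB_ESP / INSN_ADD_ESP */
} insn_t;

/* Returns the number of bytes between %esp and the return address just
 * before insns[index] executes, following the code from function entry.
 */
static int32_t
frame_size_before (const insn_t *insns, size_t n_insns, size_t index)
{
    int32_t size = 0;

    assert (index <= n_insns);

    for (size_t i = 0; i < index; i++) {
        switch (insns[i].kind) {
        case INSN_SUB_ESP: size += insns[i].constant; break;
        case INSN_ADD_ESP: size -= insns[i].constant; break;
        case INSN_PUSH:    size += 4;                 break;
        case INSN_POP:     size -= 4;                 break;
        case INSN_LEAVE:   size = 0;                  break;
        case INSN_OTHER:                              break;
        }
    }

    return size;
}
```

        Given the stack location of a candidate return address and the
        frame size at the sampled instruction, userspace can check
        whether the candidate sits where the return address should be.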
 295
 296 * "Expand all" is horrendously slow because update_screenshot gets called
 297   for every "expanded" signal. In fact even normal expanding is really
 298   slow. It's probably hopeless to get decent performance out of GtkTreeView,
 299   so we will have to store a list of expanded objects and keep that up to date
 300   as rows are expanded and collapsed.
 301
 302 * Give more sensible 'error messages'. E.g., if you get permission denied for
 303   a file, put "Permission denied" instead of "No map"
 304
 305 * crc32 checking probably doesn't belong in elfparser.c
 306
 307 * Missing things in binparser.[ch]
 308
 309         - it's inconvenient that you have to pass in both a parser _and_
 310           a record. The record should just contain a pointer to the parser.
 311           On the other hand, the result does depend on the parser->offset.
 312           So it's a bit confusing that it's not passed in.
 313
 314         - the bin_parser_seek_record (..., 1); idiom is a little dubious
 315
 316         - Add error checking.
 318
 319         - "native endian" is probably not useful. Maybe go back to just
 320           having big/little endian.
 321
 322   Should probably rethink the whole thing. It's just not very convenient to use, even
 323   for simple things like ELF files.
 324
 325 * Rename stack_stash_foreach_by_address() to stack_stash_foreach_unique(),
 326   or maybe not ...
 327
 328   Which things are we actually using from stack stash now?
 329
 330 * Maybe report idle time? Although this would come for free with the
 331   timelines.
 332
 333 * Fix (deleted) problem. But more generally, whenever we can't display a
 334   symbol, display an error message instead, i.e.,
 335         - 'Binary file <xxxx> was deleted/replaced'
 336         - 'No mapping for address'
 337         - 'No symbols in binary file'
 338         - 'Address has no corresponding symbol in file'
 339         - etc.
 340   done: HEAD will not load files with the wrong inode now.
 341
 342 * Consider whether ProfileDescendant can be done with a StackStash. We
 343   need the "go-back-on-recursion" behavior.  That could be added of
 344   course ... the functions are otherwise very similar.
 345
 346 * Add spew infrastructure to make remote debugging easier.
 347
 348 * Make it compile and work on x86-64
 349
 350 - make the things we put in a stackstash real
 351   objects so that
 352         - we can save them
 353         - they will know how to delete the presentation
 354           names and themselves (through a virtual function)
 355         - they can contain markup etc.
 356         - The unique_dup() scheme is somewhat confusing.
 357   - a more pragmatic approach might be to just walk the tree and
 358     save it.
 359
 360 - plug all the leaks
 361         - loading and saving probably leak right now
 362
 363 - make it use less memory:
 364         - StackNodes are dominating
 365                 - fold 'toplevel' into 'size'
 366                 - allocate in blocks
 367                         - this will need to be coordinated with
 368                           profile.c which also creates stacknodes.
 369
 370                         - maybe simply make stackstashes able to
 371                           save themselves.
 372
 373 - rethink loading and saving. Goals
 374
 375         - Can load 1.0 profiles
 376         - Don't export too much of stackstashes to the rest of the
 377           app
 378
 379         Features:
 380
 381         * Have a compatibility module that can load 1.0 profiles
 382                 - reads in the call tree, then generates a stack stash
 383                   by repeatedly adding traces.
 384
 385         * Make stackstashes able to save themselves using callbacks.
 386
 387         * If loading a file fails, then try loading it using the
 388           1.0 loader.
 389                 - optimization: make sure we fail immediately if we
 390                   see an unknown object.
 391
 392         * Add versioning support to sfile.[ch]:
 393                 - sformat_new() should take doctype and version strings:
 394                         like  "sysprof-profile"   "version 1.2"
 395                 - there should be sfile_sniff() functionality that will
 396                         return the doctype and version of a file. Or None
 397                         if there aren't any.
 398
 399         * At this point, make the loader first check if the file has a version.
 400           If it doesn't, load as 1.0, otherwise as whatever the version claims
 401           it is.
 402
 403         * Make provisions for forward compatibility: maybe it should be
 404           possible to load records with more fields than specified.
 405
 406         * Figure out how to make sfile.[ch] use less memory.
 407
 408         - In general clean sfile.[ch] up a little:
 409
 410         - split out dfa in its own generic class
 411
 412         - make a generic representation of xml files with quarks for strings:
 413                 struct item {
 414                         int begin/end/text;
 415                         quark text:             -> begin/end/value
 416                         int id;                 -> for begins  that are pointed to
 417                 }
 418           perhaps even with iterators. Should be compact and suitable for both
 419           input and output. As a first cut, perhaps just split out the
 420           Instruction code.
 421
 422                 (done, somewhat).
 423
 424         - make the api saner; add format/content structs
 425                 Idea: all types have to be declared:
 426
 427                 SFormat *format = sformat_new()
 428                 SType   *object_list = stype_new (format);
 429                 SType   *object = stype_new (format);
 430                 ...
 431
 432                 stype_define_list (object_list, "objects", object);
 433                 stype_define_record (object, "object",
 434                                      name, total, self, NULL);
 435                 stype_define_pointer (...);
 436
 437 * See if the auto-expanding can be made more intelligent
 438         - "Everything" should be expanded exactly one level
 439         - all trees should be expanded at least one level
 440
 441 * Send entire stack to user space, then do stackwalking there. That would
 442   allow us to do more complex algorithms, like dwarf, in userspace. Though
 443   we'd lose the ability to do non-racy file naming. We could pass a list
 444   of the process mappings with each stack though. Doing this would also solve
 445   the problem of not being able to get maps of processes running as root.
 446         Might be too expensive though. User stacks seem to be on the order
 447   of 100K usually, which for 200 times a second means a bandwidth of
 448   20MB/s, which is probably too much. One question is how much of it
 449   usually changes.
 450         Actually it seems that the _interesting_ part of the stack
 451   (ie., from the stack pointer and up) is not that big in many cases. The
 452   average stacksize seemed to be about 7700 bytes for gcc compiling gtk+.
 453   Even deeply recursive apps like sysprof only generate about 55K stacks.
 454
 455   Other possibilities:
 456
 457         - Do heuristic stack walking where it lists all words on the stack
 458           that look like they might be return addresses.
 459
 460         - Somehow map the application's stack pages into the client. This
 461           is likely difficult or impossible.
 462
 463         - Another idea: copy all addresses that look like they could be
 464           return addresses, along with the location on the stack. This
 465           just might be enough for a userspace stack walker.
 466
 467         - Yet another: krh suggests hashing blocks of the stack, then
 468           only sending the blocks that changed since last time.
 469
 470                 - every time you send a stackblock, also send a cookie.
 471
 472                 - whenever you *don't* send a stackblock, send the cookie
 473                   instead. That way you always get a complete stacktrace
 474                   conceptually.
 475
 476                 - also, that would allow the kernel to just have a simple
 477                   hashtable containing the known blocks. Though, that could
 478                   become large. Actually there is no reason to store the
 479                   blocks; you can just send the hashcode. That way you
 480                   would only need to store a list of hashcodes that we
 481                   have generated previously.
 482
 483         - One problem with doing DWARF walking is that the debug code
 484           will have to be faulted in. This can be a substantial amount
 485           of disk access which is undesirable to have during a
 486           profiling run. Even if we only have to fault in the
 487           .eh_frame_hdr section, that's still 18 pages for gtk+. The
 488           .eh_frame section for gtk+ is 72 pages.
 489
 490           A possibility may be to consider two stacktraces identical
 491           if the only differing values are *outside* the text
 492           segments.  This may work since stack frames tend to be the
 493           same size.  Is there a way of determining the location of
 494           text segments without reading the ELF files? Maybe just
 495           check if it's inside an executable mapping.
 496
 497           It is then sufficient in user space to only store one
 498           representative for each set of considered-identical stack
 499           traces.
 500
 501           User space storage: Use the stackstash tree. When a new trace
 502           is added, just skip over nodes that differ, but where none of
 503           them points to text segments. Two possibilities then:
 504
 505                 - when two traces are determined to differ, store them
 506                   in completely separate trees. This ensures that we
 507                   will never run the dwarf algorithm on an invalid
 508                   stack trace, but also means that we won't get shared
 509                   prefixes for stacktraces.
 510
 511                 - when two traces are determined to differ, branch off
 512                   as currently. This will share more data, but the
 513                   dwarf algorithm could be run on invalid traces. It
 514                   may work in practice though if the compiler
 515                   generally uses fixed stack frames.
 516
 517                   A twist on this is to mark the complete stack traces as
 518                   "complete". Then after running the DWARF algorithm,
 519                   the generated stack trace can be saved with it. This
 520                   way incomplete stack traces branching off a complete
 521                   one can be completed using the DWARF information for
 522                   the shared part.
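
          The "identical if the only differing values are outside the
          text segments" test above could be sketched as follows. The
          range type and function names are invented for the example,
          and the text ranges are assumed to come from the executable
          mappings of the process. Both stacks must have the same
          number of words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t start;   /* inclusive */
    uintptr_t end;     /* exclusive */
} range_t;

static bool
points_to_text (uintptr_t word, const range_t *text, size_t n_text)
{
    for (size_t i = 0; i < n_text; i++) {
        if (word >= text[i].start && word < text[i].end)
            return true;
    }

    return false;
}

/* Two raw stacks are considered identical if every position where they
 * differ holds a non-text value in both of them.
 */
static bool
stacks_equivalent (const uintptr_t *a, const uintptr_t *b, size_t n_words,
                   const range_t *text, size_t n_text)
{
    assert (a != NULL && b != NULL);

    for (size_t i = 0; i < n_words; i++) {
        if (a[i] == b[i])
            continue;

        if (points_to_text (a[i], text, n_text) ||
            points_to_text (b[i], text, n_text))
            return false;
    }

    return true;
}
```

          Only one representative per equivalence class then has to be
          kept and run through the DWARF walker.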
 523
 524 * Notes on heuristic stack walking
 525
 526   - We can reject addresses that point exactly to the beginning of a
 527     function since these are likely callbacks. Note though that the
 528     first time a function in a shared library is called, it goes
 529     through dynamic linker resolution which will cause the stack to
 530     contain a callback of the function. This needs to be investigated
 531     in more detail.
 532
 533   - We are already rejecting addresses outside the text section
 534     (addresses of global variables and the like).
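
  A rough sketch of a heuristic scan that applies both filters above:
  keep only words that point into a text range, and drop words that
  point exactly at the beginning of a function. The types, the names
  and the linear searches are illustrative assumptions, not what the
  current code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t start, end; } text_range_t;   /* [start, end) */

typedef struct {
    uintptr_t address;        /* possible return address */
    size_t    stack_offset;   /* byte offset of the word from the stack pointer */
} candidate_t;

static bool
in_text (uintptr_t word, const text_range_t *text, size_t n_text)
{
    for (size_t i = 0; i < n_text; i++) {
        if (word >= text[i].start && word < text[i].end)
            return true;
    }
    return false;
}

static bool
is_function_start (uintptr_t word, const uintptr_t *starts, size_t n_starts)
{
    for (size_t i = 0; i < n_starts; i++) {
        if (starts[i] == word)
            return true;
    }
    return false;
}

/* Scans the copied stack and stores up to max_out candidates in out.
 * Returns the number of candidates stored.
 */
static size_t
scan_stack (const uintptr_t *stack, size_t n_words,
            const text_range_t *text, size_t n_text,
            const uintptr_t *func_starts, size_t n_funcs,
            candidate_t *out, size_t max_out)
{
    size_t n_out = 0;

    assert (out != NULL || max_out == 0);

    for (size_t i = 0; i < n_words && n_out < max_out; i++) {
        uintptr_t word = stack[i];

        if (!in_text (word, text, n_text))
            continue;   /* data pointer, integer, etc. */

        if (is_function_start (word, func_starts, n_funcs))
            continue;   /* offset 0: most likely a callback */

        out[n_out].address = word;
        out[n_out].stack_offset = i * sizeof (uintptr_t);
        n_out++;
    }

    return n_out;
}
```

  The dynamic linker case mentioned above would show up here as a
  rejected function-start address, so it would not add false positives,
  but this still needs to be checked.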
 535
 536 * How to get the user stack:
 537
 538    /* In principle we should use get_task_mm() but
 539     * that will use task_lock() leading to deadlock
 540     * if somebody already has the lock
 541     */
 542    if (spin_is_locked (&current->alloc_lock))
 543            printk ("alreadylocked\n");
 544    {
 545            struct mm_struct *mm = current->mm;
 546            if (mm)
 547            {
 548                    printk (KERN_ALERT "stack size: %lu (%d)\n",
 549                            mm->start_stack - regs->REG_STACK_PTR,
 550                            current->pid);
 551
 552                    stacksize = mm->start_stack - regs->REG_STACK_PTR;
 553            }
 554            else
 555                    stacksize = 1;
 556    }
 557
 558 * If interrupt happens in kernel mode, send both
 559   kernel stack and user space stack, have userspace stitch them
 560   together. Well, they could be stitched together in the kernel.
 561         Already done: we now take a stacktrace of the user space process
 562         when the interrupt happens in kernel mode. (Unfortunately, this
 563         causes lockups on many kernels (though I haven't seen that for
 564         a while)).
 565
 566         We don't take any stacktraces of the kernel though. Things that
 567         need to be
 568         investigated:
 569                 - does the kernel come with dwarf debug information?
 570                 - does the kernel come with some other debug info
 571                 - is there a place where the vmlinux binary is usually
 572                   placed? (We should avoid any "location of vmlinux" type
 573                   questions if at all possible).
 574
 575          We do now copy the kernel stack to userspace and do a
 576          heuristic stack walk there. It may be better at some point to
 577          use dump_trace() in the kernel since that will do some
 578          filtering by itself.
 579
 580   Notes about kernel symbol lookup:
 581
 582         - /proc/kallsyms is there, but it includes things like labels.
 583           There is no way to tell them from functions.
 584
 585         - We can search for a kernel binary with symbols. If the
 586           kernel-debug package is installed, or if the user compiled
 587           his own kernel, we will find one. This is a regular elf file.
 588           It also includes labels, but we can tell by the st_info field
 589           which is which.
 590
 591           Note though that for some labels we do actually want to
 592           treat them as functions. For example the "page_fault" label,
 593           which is a function by itself. We can recognize these by the
 594           fact that their symbols have a size. However, the _start
 595           function in normal applications does not have a size, so the
 596           heuristic should be something like this:
 597
 598              - If the address is inside the range of some symbol, use
 599                that symbol
 600
 601              - Otherwise, if the closest symbol is a function with
 602                size 0, use that function.
 603
 604           This means the datastructure will probably have to be done a
 605           little differently.
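
          The two-rule lookup above might be implemented along these
          lines. The symbol struct is made up for the example, "closest
          symbol" is taken to mean the closest symbol at or below the
          address, and the table is assumed to be sorted by address. A
          real version would use a binary search instead of a linear
          scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uintptr_t   address;
    size_t      size;          /* 0 if the symbol table gives no size */
    bool        is_function;   /* false for plain labels */
} symbol_t;

/* symbols must be sorted by ascending address. Returns NULL if neither
 * rule matches.
 */
static const symbol_t *
lookup_symbol (const symbol_t *symbols, size_t n_symbols, uintptr_t address)
{
    const symbol_t *closest = NULL;

    assert (symbols != NULL || n_symbols == 0);

    for (size_t i = 0; i < n_symbols; i++) {
        const symbol_t *sym = &symbols[i];

        if (sym->address > address)
            break;

        /* Rule 1: the address is inside the range of the symbol. */
        if (address - sym->address < sym->size)
            return sym;

        closest = sym;
    }

    /* Rule 2: the closest symbol is a function with size 0. */
    if (closest != NULL && closest->is_function && closest->size == 0)
        return closest;

    return NULL;
}
```

          With this, a sized label such as "page_fault" is found by rule
          1 and an unsized "_start" by rule 2, while an unsized label
          that is not a function is never returned.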
 606
 607 - See if there is a way to make it distcheck
 608
 609 - grep "FIXME - not10"
 610 - grep FIXME
 611
 612 - translation should be hooked up
 613
 614 - Consider adding "at least 5% inclusive cost" filter
 615
 616 - consider having the ability to group a function together with its nearest
 617   neighbours. That way we can eliminate some of the effect of
 618         "one function taking 10% of the time"
 619   vs.
 620         "the same function broken into ten functions each taking 1%"
 621   Not clear what the UI looks like though.
 622
 623 - Ability to generate "screenshots" suitable for mail/blog/etc
 624         UI: "generate screenshot" menu item pops up a window with
 625         a text area + radio buttons "text/html". When you flick
 626         them, the text area is automatically updated.
 627         - beginning in CVS:
 628                 - why does the window not remember its position when
 629                   you close it with the close button, but does remember
 630                   it when you use the wm button or the menu item? It actually
 631                   seems that it only forgets the position when you click the
 632                   button with the mouse. But not if you use the keyboard ...
 633                         This is a gtk+ bug.
 634
 635 - Find out how gdb does backtraces; they may have a better way. Also
 636   find out what dwarf2 is and how to use it. Look into libunwind.
 637   It seems gdb is capable of doing backtraces of code that neither has
 638   a framepointer nor has debug info. It appears gdb uses the contents
 639   of the ".eh_frame" section. There is also an ".eh_frame_hdr" section.
 640
 641 http://www.linuxbase.org/spec/booksets/LSB-Embedded/LSB-Embedded/ehframe.html
 642
 643   look in dwarf2-frame.[ch] in the gdb distribution.
 644
 645   Also look at bozo-profiler
 646         http://cutebugs.net/bozo-profiler/
 647   which has an elf32 parser/debugger
 648
 649 - Make busy cursors more intelligent
 650         - when you click something in the main list and we don't respond
 651                 within 50ms (or perhaps when we expect to not be able to do
 652                 so (can we know the size in advance?))
 653         - instead of what we do now: set the busy cursor unconditionally
 654
 655 - Consider adding ability to show more than one function at a time. Algorithm:
 656         Find all relevant nodes;
 657         For each relevant node
 658                 best_so_far = relevant node
 659                 walk towards root
 660                         if node is relevant,
 661                                 best_so_far = relevant
 662                 add best_so_far to interesting
 663         for each interesting
 664                 list leaves
 665                 for each leaf
 666                         add trace to tree (leaf, interesting)
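
        The first half of the algorithm above (finding the
        "interesting" nodes) could be written like this. The node
        struct is a stand-in for whatever the call tree uses, and the
        pseudocode does not say whether duplicates should be dropped;
        this sketch assumes they should.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node node_t;
struct node {
    node_t *parent;     /* NULL for the root */
    bool    relevant;   /* true if the node is one of the selected functions */
};

/* Walk towards the root and return the relevant node closest to it. */
static node_t *
topmost_relevant (node_t *node)
{
    node_t *best_so_far = node;

    for (node_t *n = node->parent; n != NULL; n = n->parent) {
        if (n->relevant)
            best_so_far = n;
    }

    return best_so_far;
}

/* Fills interesting[] (which must have room for n_relevant entries)
 * with the unique topmost relevant ancestors. Returns how many were
 * stored.
 */
static size_t
find_interesting (node_t **relevant, size_t n_relevant, node_t **interesting)
{
    size_t n_interesting = 0;

    assert (relevant != NULL || n_relevant == 0);

    for (size_t i = 0; i < n_relevant; i++) {
        node_t *best = topmost_relevant (relevant[i]);
        bool    seen = false;

        for (size_t j = 0; j < n_interesting; j++) {
            if (interesting[j] == best)
                seen = true;
        }

        if (!seen)
            interesting[n_interesting++] = best;
    }

    return n_interesting;
}
```

        Listing the leaves below each interesting node and adding the
        traces to the tree would then proceed as in the second loop of
        the pseudocode.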
 667
 668 - Consider adding KDE-style nested callgraph view
 669         - probably need a dependency on gtk+ 2.8 (cairo) for this.
 670         - Matthias has code for something like this.
 671         - See http://www.marzocca.net/linux/baobab.html
 672         - Also see http://www.cs.umd.edu/hcil/treemap-history/index.shtml
 673
 674 - Add support for line numbers within functions
 675         - Possibly a special "view details" mode, assuming that
 676           the details of a function are not that interesting
 677           together with a tree. (Could add radio buttons somewhere in
 678           the right pane). Or tabs.
 679         - Open a new window for the function.
 680
 681 - Add view->ancestors/descendants menu items
 682
 683 - rethink caller list, not terribly useful at the moment. Federico suggested
 684   listing all ancestors.
 685         Done: implemented this idea in CVS HEAD. If we keep it that way,
 686         should do a global s/callers/ancestors on the code.
 687         - not sure it's an improvement. Often it is more interesting to
 688         find the immediate callers.
 689         - Now it's back to just listing the immediate callers.
 690
 691 - Figure out how Google's pprof script works. Then add real call graph
 692   drawing. (google's script is really simple; uses dot from graphviz).
 693   KCacheGrind also uses dot to do graph drawing.
 694
 695 - hide internal stuff in ProfileDescendant
 696
 697 Later:
 698
 699 - Multithreading is possible in a number of places.
 700
 701 - If the stack trace ends in a memory access instruction, send the
 702   vma information to userspace. Then have user space
 703   produce statistics on what types of memory are accessed.
 704
 705 - somehow get access to VSEnterprise profiler and see how it works.
 706   somehow get access to vtune and see how it works.
 707
 708 - On SMP systems interrupts happen unpredictably, including when another
 709   one is running. Right now we are ignoring any interrupts that happen
 710   when another one is running, but we should probably just save the data
 711   in a per-cpu buffer.
 712
 713 - Find out if sysprof accurately reports time spent handling pagefaults.
 714   There is evidence that it doesn't:
 715         - a version of sysprof took 10 seconds to load a certain profile.
 716           Profiling itself it appeared that most of the time was spent
 717           in the GMarkup parser
 718         - a newer version of sysprof with significantly more compact
 719           Instructions structure took about 5 seconds, but the profile
 720           looked about the same.
 721   The difference between the two versions has to be in page faults/
 722   memory speed, but the profiles looked similar.
 723       Try and reproduce this in a more controlled experiment.
 724
 725 - See if it is possible to group the X server activity under the process that
 726   generated it.
 727
 728 - .desktop file
 729         [Is this worth it? You will often want to start it as root,
 730          and you will need to insert the module from the command line]
 731
 732 - Applications should be able to say "start profiling", "stop profiling"
 733   so that you can limit the profiling to specific areas.
 734         Idea:
 735                 Add a new kernel interface that applications use to say
 736                 begin/end.
 737                 Then add a timeline where you can mark interesting regions,
 738                 for example those that applications have marked interesting.
 739
 740 - Find out how to hack around gtk+ bug causing multiple double clicks
 741   to get eaten.
 742
 743 - Consider what it would take to take stacktraces of other languages such
 744   as perl, python, java, ruby, or bash. Or scheme.
 745
 746   Possible solution is for the script binaries to have a function
 747   called something like
 748
 749         __sysprof__generate_stacktrace (char **functions, int *n_functions);
 750
 751   that the sysprof kernel module could call (and make return to the kernel).
 752
 753   This function would behave essentially like a signal handler: couldn't
 754   call malloc(), couldn't call printf(), etc.
 755
 756   Note though that scripting languages will generally have a stack with
 757   both script-binary-stack, script stack, and library stacks. We wouldn't
 758   want scripts to need to parse dwarf. Also if we do that thing with
 759   sending the entire stack to userspace, things will be further
 760   complicated.
 761
 762   Also note languages like scheme that use heap allocated activation
 763   records.
 764
 765 - Consider this usecase:
 766         Someone is considering replacing malloc()/free() with a freelist
 767         for a certain data structure. All use of this data structure is
 768         confined to one function, foo(). It is now interesting to know
 769         how much time that particular function spends on malloc() and free()
 770         combined.
 771
 772         Possible UI:
 773
 774                 - Select foo(),
 775                 - find an instance of malloc()
 776                 - shift-click on it,
 777                 -       all traces with malloc are removed
 778                 -       a new item "..." appears immediately below foo()
 779                 -       malloc is added below "..."
 780                 - same for free
 781                 - at this point, the desired data can be read at cumulative
 782                   for "..."
 783
 784         Actually, with this UI, you could potentially get rid of the
 785         caller list: Just present the call tree under an <everything> root,
 786         and use ... to  single out the stuff you are interested in.
 787
 788         Maybe also get rid of 'callers' by having a new "show details"
 789         dialog or something.
 790
 791         The complete solution here degenerates into "expressions":
 792
 793                 "foo" and ("malloc" or "free")
 794
 795         Having that would also take care of the "multiple functions"
 796         above. No one would understand it, though.
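A rough sketch of what evaluating such an expression against a single
trace could look like. The Expr type and the function names are invented
for this note, and a trace is simplified to an array of function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { EXPR_NAME, EXPR_AND, EXPR_OR } ExprType;

/* Hypothetical expression tree: leaves name a function, inner nodes
 * combine two subexpressions. */
typedef struct Expr Expr;
struct Expr
{
    ExprType    type;
    const char *name;   /* EXPR_NAME only */
    const Expr *left;   /* EXPR_AND and EXPR_OR only */
    const Expr *right;
};

/* True if the function called name appears anywhere in the trace. */
static bool
trace_contains (const char **trace, size_t n_functions, const char *name)
{
    size_t i;

    for (i = 0; i < n_functions; ++i)
    {
        if (strcmp (trace[i], name) == 0)
            return true;
    }

    return false;
}

/* True if the trace satisfies the expression. */
static bool
expr_matches (const Expr *e, const char **trace, size_t n_functions)
{
    assert (e != NULL);

    switch (e->type)
    {
    case EXPR_NAME:
        return trace_contains (trace, n_functions, e->name);
    case EXPR_AND:
        return expr_matches (e->left, trace, n_functions) &&
               expr_matches (e->right, trace, n_functions);
    case EXPR_OR:
        return expr_matches (e->left, trace, n_functions) ||
               expr_matches (e->right, trace, n_functions);
    }

    return false;
}
```

The profile for an expression would then be built from only those traces
for which expr_matches() returns true.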
 797
 798 - figure out a way to deal with both disk and CPU. Need to make sure that
 799   things that are UNINTERRUPTIBLE while there are RUNNING tasks are not
 800   considered bad. Also figure out how to deal with more than one CPU/core.
 801
 802   Not entirely clear that the sysprof visualization is right for disk.
 803
 804   Maybe assign a size of n to traces with n *unique* disk accesses (ie.
 805   disk accesses that are not required by any other stack trace).
 806
 807   Or assign values to nodes in the calltree based on how many diskaccesses
 808   are contained in that tree. Ie., if I get rid of this branch, how many
 809   disk accesses would that get rid of.
 810
 811   Or turn it around and look at individual disk accesses and see what it
 812   would take to get rid of it. Ie., a number of traces are associated with
 813   any given diskaccess. Just show those.
 814
 815   Or for a given tree with contained disk accesses, figure out what *other*
 816   traces have the same diskaccesses.
 817
 818   Or visualize a set of squares with a color that is more saturated depending
 819   on the number of unique stack traces that access it. Then look for the
 820   lightly saturated ones.
 821
 822   The input to the profiler would basically be
 823
 824         (stack trace, badness, cookie)
 825
 826    For CPU:     badness=10ms,                                         cookie=<a new one always>
 827    For Disk:    badness=<calculated based on previous disk accesses>, cookie=<the accessed disk block>
 828
 829    For Memory:  badness=<cache line size not in cache>,               cookie=<the address>
 830
 831    Cookies are used to figure out whether an access is really the same, ie., for two identical
 832    cookies, the size is still just one, however
 833
 834    Memory is different from disk because you can't reasonably assume
 835    that stuff that has been read will stay in cache (for short profile
 836    runs you can assume that with disk, but not for long ones).
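One way to read the (stack trace, badness, cookie) idea, as a sketch only:
a sample is charged the first time its cookie is seen and ignored after
that. The Sample type and total_badness() are made-up names, the stack
trace is reduced to a single string, and the quadratic scan would be a
hash table in real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical profiler input record. */
typedef struct
{
    const char   *trace;    /* stand-in for a full stack trace */
    unsigned long badness;  /* e.g. 10 for a 10ms CPU sample */
    unsigned long cookie;   /* e.g. the accessed disk block */
} Sample;

/* Sum the badness of all samples, counting each cookie only once. */
static unsigned long
total_badness (const Sample *samples, size_t n_samples)
{
    unsigned long total = 0;
    size_t i, j;

    assert (samples != NULL || n_samples == 0);

    for (i = 0; i < n_samples; ++i)
    {
        bool seen = false;

        /* Was this cookie already charged by an earlier sample? */
        for (j = 0; j < i; ++j)
        {
            if (samples[j].cookie == samples[i].cookie)
            {
                seen = true;
                break;
            }
        }

        if (!seen)
            total += samples[i].badness;
    }

    return total;
}
```

CPU samples always get a fresh cookie, so they are never merged; two reads
of the same disk block share a cookie and count once.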
 837
 838    - Perhaps show a timeline with CPU in one color and disk in one
 839      color. Allow people to look at subintervals of this
 840      timeline. Is it useful to look at both CPU and disk at the same
 841      time? Probably not. See also marker discussion above. UI should
 842      probably allow double clicking on a marked section and all
 843      instances of that one would be marked.
 844
 845    - This also allows us to show how well multicore CPUs are being used.
 846
 847    - Other variation on the timeline idea: Instead of a disk timeline you could have a
 848      list of individual diskaccesses, and be able to select the ones you wanted to
 849      get rid of.
 850
 851    - The existing sysprof visualization is not terribly bad, the "self" column is
 852      more useful now.
 853
 854    - See what files are accessed so that you can get a better idea of what
 855      the system is doing.
 856
 857    - Optimization usecases:
 858
 859         - A lot of stuff is read synchronously, but it is possible to read
 860           it asynchronously.
 861           Visualization: A timeline with alternating CPU/disk activity.
 862
 863         - What function is doing all the synchronous reading, and what
 864           files/offsets is it reading. Visualization: lots of reads across
 865           different files out of one function
 866
 867         - A piece of the program is doing disk I/O. We can drop that
 868           entire piece of code. Sysprof visualization is ok, although seeing
 869           the files accessed is useful so that we can tell if those files are
 870           not just going to be used in other places. (Gnumeric plugin_init()).
 871
 872         - A function is reading a file synchronously, but there is other
 873           (CPU/disk) stuff that could be done at the same time. Visualization:
 874           A piece of the timeline is diskbound with little or no CPU used.
 875
 876         - Want to improve code locality of library or binary. Visualization:
 877           no GUI, just produce a list of functions that should be put first in
 878           the file. Then run the program again until the list converges.
 879           (Valgrind may be more useful here).
 880
 881         - Nautilus reads a ton of files, icons + all the files in the
 882           homedirectory. Normal sysprof visualization is probably useful
 883           enough.
 884
 885         - Profiling a login session.
 886
 887         - Many applications are running at the same time, doing IPC. It would
 888           be useful if we could figure out what other things a given process
 889           is waiting on. Eg., in poll, find out what processes have the other
 890           ends of the fd's open.
 891                 Visualization: multiple lines on a graph. Lines join up where
 892           one process is blocking on another. That would show processes holding
 893           up the progress very clearly.
 894           This was suggested by Federico.
 895
 896     - Need to report stat() as well. (Where do inode data end up? In the
 897       buffer-cache?) Also open() may cause disk reads (seeks).
 898
 899     - To generate the timeline we need to know when a disk request is
 900       issued and when it is completed. This way we can assign blame to all
 901       applications that have issued a disk request at a given point in time.
 902
 903       The disk timeline should probably vary in intensity with the number
 904       of outstanding disk requests.
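A small sketch of the bookkeeping this implies, assuming each disk request
is recorded with its issue and completion time. DiskRequest and
outstanding_at() are hypothetical names; times are plain integers
(say, milliseconds since the start of the profile).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one disk request. */
typedef struct
{
    long issued;     /* time the request was issued */
    long completed;  /* time the request completed */
    int  pid;        /* process that issued it, for assigning blame */
} DiskRequest;

/* Number of requests that are issued but not yet completed at time t.
 * The timeline intensity at t could be derived from this count. */
static size_t
outstanding_at (const DiskRequest *requests, size_t n_requests, long t)
{
    size_t i, n_outstanding = 0;

    assert (requests != NULL || n_requests == 0);

    for (i = 0; i < n_requests; ++i)
    {
        if (requests[i].issued <= t && t < requests[i].completed)
            n_outstanding++;
    }

    return n_outstanding;
}
```

The same loop, collecting the pid of every matching request instead of
counting, would give the set of applications to blame at time t.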
 905
 906
 907 -=-=-=-=-=-=-=-=-=-=-=-=-=-=- ALREADY DONE: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 908
 909 * Find out what is going on with kernel threads:
 910
 911   [(ksoftirqd/0)]              0.00   0.03
 912     No map ([(ksoftirqd/0)])   0.00   0.03
 913       kernel                   0.00   0.03
 914         do_softirq             0.00   0.03
 915           __do_softirq         0.00   0.03
 916
 917 * Make sure there aren't leftover stacktraces from last time when
 918   profiling begins.
 919
 920 * Is the move-to-front in process_locate_map() really worth it?
 921
 922 * Whenever we fail to lock the atomic variable, track this, and send the
 923   information to userspace as an indication of the overhead of the profiling.
 924   Although there is inherent aliasing here since stack scanning happens at
 925   regular intervals.
 926
 927 * Apparently, if you upgrade the kernel, then don't re-run configure,
 928   the kernel Makefile will delete all of /lib/modules/<release>/kernel
 929   if you run make install in the module directory. Need to find out what
 930   is going on.
 931
 932 * Performance:
 933         Switching between descendant views is slow:
 934                   - gtk_tree_store_get_path() is O(n^2) and accounts
 935                     for 43% of the time.
 936                   - GObject signal emission overhead accounts for 18% of
 937                     the time.
 938         Consider adding a forked version of GtkTreeStore with
 939         performance and bug fixes.
 940
 941 * If we end up believing the kernel's own stacktraces, maybe
 942   /proc/kallsyms shouldn't be parsed until the user hits profile.
 943
 944 * Make it compilable against a non-running kernel.
 945
 946 * With kernel module not installed, select Profiler->Start, then dismiss
 947   the alert. This causes the start button to appear prelighted. Probably
 948   just another gtk+ bug.
 949
 950 - Fix bugs/performance issues:
 951         - add_trace_to_tree() might be a little slow when dealing with deeply
 952           recursive profiles. Hypothesis: seen_nodes can grow large, and the
 953           algorithm is O(n^2) in the length of the trace.
 954
 955 - Have kernel module report the file the address was found in
 956         Should avoid a lot of potential brokenness/raciness with dlopen etc.
 957         Probably better to send a list of maps with each trace. Which
 958         shouldn't really be that expensive. We already walk the list of
 959         maps in process_ensure_map() on every trace. And we can do hashing
 960         if it turns out to be a problem.
 961                 Maybe replace the SysprofStackTrace with a union, so that
 962         it can be either a list of maps, or a stacktrace. Both map lists and
 963         stacktraces would come with a hashcode, allowing userspace to cache them. This avoids
 964         the problem that maps could take up a lot of extra bandwidth.
 965
 966                 struct MapInfo
 967                 {
 968                         long start;
 969                         long end;
 970                         long offset;
 971                         long inode;
 972                 };
 973
 974                 struct Maps
 975                 {
 976                         int hash_code;
 977                         int n_maps;
 978                         struct MapInfo info [127];
 979                         char filenames [2048];
 980                 };
 981
 982 - possibly add dependency on glib 2.8 if it is released at that point.
 983   (g_file_replace())
 984
 985 * Some notes about timer interrupt handling in Linux
 986
 987 On an SMP system APIC is used - the interesting file is arch/i386/kernel/apic.c
 988
 989 On UP systems, the normal IRQ0 is used
 990 When the interrupt happens,
 991         entry.S
 992                 calls do_IRQ, which sets up the special interrupt stack,
 993                 and calls __do_IRQ, which is in /kernel/irq/handle.c.
 994                 This calls the corresponding irqaction, which has previously
 995                 been setup by arch/i386/mach-default/setup.c to point to
 996                 timer_interrupt, which is in arch/i386/kernel/time.c.
 997                 This calls do_timer_interrupt_hooks() which is defined in
 998                 /include/asm-i386/mach-default/do_timer.h. This function
 999                 then calls profile_tick().
1000
1001         Note when the CPU switches from user mode to kernel mode, it
1002         pushes SS/ESP on top of the kernel stack, but when it switches
1003         from kernel mode to kernel mode, it does _not_ push SS/ESP.
1004         It does in both cases push EIP though.
1005
1006 * Rename sysprof-text to sysprof-cli
1007
1008 * Find out why the samples label won't right adjust
1009
1010 * It crashes sometimes.
1011
1012   I haven't seen any crashes in a long time
1013
1014 * Find out why the strings
1015
1016   _ZL11DisplayLineP20nsDisplayListBuilderRK6nsRectS3_R19nsLineList_iteratoriRiRK16nsDisplayListSetP12nsBlockFrame
1017   _ZL11DisplayRowsP20nsDisplayListBuilderP7nsFrameRK6nsRectRK16nsDisplayListSet   _ZL11DrawBordersP10gfxContextR7gfxRectS2_PhPdS4_PjPP14nsBorderColorsijiP6nsRect _ZL11HandleEventP10nsGUIEvent
1018   _ZL12IsContentLEQP13nsDisplayItemS0_Pv
1019   _ZL15expose_event_cbP10_GtkWidgetP15_GdkEventExpose
1020
1021   do not get demangled.
1022
1023 * For glibc, the debug files do not contain .strtab and .symtab, but
1024   the original files do. The algorithm in binfile.c must be modified
1025   accordingly.
1026
1027 * If we profile something that is not very CPU bound, sysprof itself
1028   seems to get a disproportionate amount of the samples. Should look
1029   into this.  Fixed by only returning from poll when there are more
1030   than eight traces available.
1031
1032 * regarding crossing system call barriers: Find out about the virtual dso
1033   that linux uses to do fast system calls:
1034
1035         http://lkml.org/lkml/2002/12/18/218
1036
1037   and what that actually makes the stack look like. (We may want to just
1038   special case this fake dso in the symbol lookup code).
1039
1040   Maybe get_user_pages() is the way forward at least for some stuff.
1041
1042   note btw. that the kernel pages are only one or two pages, so we
1043   could easily just dump them to userspace.
1044
1045 * In profile.c, change "non_recursive" to "cumulative", and
1046   "marked_non_recursive" to a boolean "charged". This is tricky code,
1047   so be careful. Possibly make it a two-pass operation:
1048         - first add the new trace
1049         - then walk from the leaf, charging nodes
1050   That would allow us to get rid of the marked field altogether. In fact,
1051   maybe the descendants tree could become a stackstash. We'll just have
1052   to make stack_stash_add_trace() return the leaf.
1053
1054   DONE: the name is now "cumulative"
1055
1056
1057 * vdso
1058         - assume it's the same across processes, just look at
1059           sysprof's own copy.
1060                 Done: vdso is done now
1061         - send copy of it to userspace once, or for every
1062           sample.
1063
1064 * Various:
1065         - decorate_node should be done lazily
1066         - Find out why we sometimes get completely ridiculous stacktraces,
1067           where main seems to be called from within Xlib etc. This happens
1068           even after restarting everything.
1069         - It looks like the stackstash-reorg code confuses "main" from
1070           unrelated processes. - currently it looks like if multiple
1071           "main"s are present, only one gets listed in the object list.
1072                 Seems to mostly happen when multiple processes are
1073                 involved.
1074         - Numbers in caller view are completely screwed up.
1075         - It looks like it sometimes gets confused with similar but different
1076           processes: Something like:
1077                 process a spends 80% in foo() called from bar()
1078                 process b spends 1% in foo() called from baz()
1079           we get reports of baz() using > 80% of the time.
1080           Or something.
1081
1082 * commandline version should check that the output file is writable
1083   before starting the profiling.
1084
1085 * See if we can reproduce the problem where libraries didn't get correctly
1086   reloaded after new versions were installed.
1087         This is just the (deleted) problem. Turns out that the kernel
1088   doesn't print (deleted) in all cases. Some possibilities:
1089
1090         - check that the inodes of the mapped file and the disk file
1091           are the same (done in HEAD).
1092
1093         - check that the file was not modified after being mapped?
1094           (Can we get the time it was mapped or opened?) If it was
1095           modified you'd expect the inode to change, right?
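The inode check could look roughly like this. It is a sketch, not the code
in HEAD: file_matches_mapping() is a made-up name, and it assumes the inode
of the mapped file is already known (for example from the inode field in
/proc/<pid>/maps).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* True if the file currently on disk at filename has the same inode as
 * the file that was mapped. A missing file counts as a mismatch. */
static bool
file_matches_mapping (const char *filename, ino_t mapped_inode)
{
    struct stat st;

    assert (filename != NULL);

    if (stat (filename, &st) != 0)
        return false;

    return st.st_ino == mapped_inode;
}
```

A stricter version would compare the device number (st_dev) as well, since
inode numbers are only unique within one filesystem.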
1096
1097 * Find out if the first sort order of a GtkTreeView column can be
1098   changed programmatically. It can't (and the GTK+ bug was wontfixed).
1099   A workaround is possible though. (Someone, please write a
1100   GtkTreeView replacement!)
1101
1102 * Missing things in binparser.[ch]
1103
1104         - maybe convert BIN_UINT32 => { BIN_UINT, 4 }
1105           we already have the width in the struct.
1106
1107 * Rethink binparser. Maybe the default mode should be:
1108         - there is a current offset
1109         - you can move the cursor
1110                 - _goto()
1111                 - _align()
1112         - you can read structs with "begin_struct (format) / end_struct()"
1113                 Or maybe just "set_format()" that would accept NULL?
1114         - when you are reading a struct, you can skip records with _index()
1115         - you can read fields with get_string/get_uint by passing a name.
1116         - you can read anonymous strings and uints by passing NULL for name
1117                 This is allowed even when not reading structs. Or maybe this
1118                 should be separate functions. Advantages:
1119                         - they can skip ahead, unlike field accessors
1120                         - you can access specific types (8,16,32,64)
1121                         - there is no "name" field
1122                 Disadvantage:
1123                         - the field accessors would need renaming.
1124                                 bin_parser_get_uint_field ()
1125                           is not really that bad though.
1126                 Maybe begin_record() could return a structure you could
1127                 use to access that particular record? Would nicely solve
1128                 the problems with "goto" and "index".
1129                         bin_record_get_uint();
1130                 What should begin/end be called? They will have different
1131                 objects passed.
1132                         bin_parser_get_record (parser) -> record
1133                         bin_record_free (record);
1134         - Maybe support for indirect strings? Ie., get_string() in elfparser
1135         - This will require endianness to be a per-parser property. Which is
1136           probably just fine. Although d-bus actually has
1137           per-message endianness. Maybe there could be a settable
1138           "endianness" property.
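To make the "current offset" model concrete, here is a tiny cursor-based
reader with goto, align, anonymous uint reads of a given width, and
per-parser endianness. All names (Cursor, cursor_goto, ...) are invented
for this sketch and are not the existing binparser API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over an in-memory buffer. */
typedef struct
{
    const uint8_t *data;
    size_t         length;
    size_t         offset;      /* the current offset */
    bool           big_endian;  /* per-parser endianness */
} Cursor;

/* Move the cursor to an absolute offset. */
static void
cursor_goto (Cursor *c, size_t offset)
{
    assert (offset <= c->length);
    c->offset = offset;
}

/* Advance the cursor to the next multiple of alignment, if needed. */
static void
cursor_align (Cursor *c, size_t alignment)
{
    size_t rest;

    assert (alignment > 0);

    rest = c->offset % alignment;
    if (rest != 0)
        c->offset += alignment - rest;
}

/* Read an anonymous unsigned integer of width bytes and skip past it. */
static uint64_t
cursor_get_uint (Cursor *c, size_t width)
{
    uint64_t value = 0;
    size_t i;

    assert (width == 1 || width == 2 || width == 4 || width == 8);
    assert (c->offset + width <= c->length);

    /* Consume bytes from most significant to least significant. */
    for (i = 0; i < width; ++i)
    {
        size_t index = c->big_endian ? i : width - 1 - i;

        value = (value << 8) | c->data[c->offset + index];
    }

    c->offset += width;
    return value;
}
```

Named field access inside a struct would sit on top of this, looking up
the field's offset and width in the format and then doing the same read.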
1139
1140 * Don't look in $(libdir) for separate debug files (since $libdir is
1141   the libdir for sysprof, not a system wide libdir). Tim Rowley.
1142   Fix is probably to hardcode /usr/lib, and also look in $libdir.
1143
1144 * Consider deleting cmdline hack in process.c and replace with something at
1145   the symbol resolution level. Will require more memory though. DONE: in
1146   head, processes are no longer coalesced based on cmdline. Need to add something
1147   at the symbol level.
1148
1149 * don't loop infinitely if there are cycles in the debuglink graph.
1150
1151 * Add "sysprof --version"
1152
1153 * Fix (potential) performance issues in symbol lookup.
1154
1155 - when an elf file is read, it should be checked that the various
1156   sections are of the right type. For example the debug information
1157   for emacs is just a stub file where all the sections are NOBITS.
1158
1159 * Try reproducing crash when profiling xrender demo
1160   - it looks like it crashes when it attempts to read /usr/bin/python
1161   - apparently what's going on is that one of the symbols in python's
1162     dynamic symbol table has a completely crazy 'st_name' offset.
1163   DONE: we didn't actually need to read the name at all,
1164   but still should find out why that value is so weird.
1165   It looks like there is something strange going on with that file.
1166   All the dynsyms have weird info/type values, yet nm and readelf
1167   have no problems displaying it.
1168
1169 - Can .gnu_debuglink recurse?
1170   yes, it can, and we should probably not crash if there are
1171   cycles in the graph.
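A sketch of following debuglinks without looping forever. The table of
links stands in for actually reading the .gnu_debuglink section of each
file; DebugLink, MAX_DEBUGLINK_DEPTH and resolve_debuglink() are names
made up for this note.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEBUGLINK_DEPTH 16

/* Hypothetical stand-in for "file X has a debuglink pointing at Y". */
typedef struct
{
    const char *file;
    const char *debuglink;
} DebugLink;

/* Follow debuglinks starting at file. Returns the last file in the
 * chain, or NULL if the chain contains a cycle or is too deep. */
static const char *
resolve_debuglink (const DebugLink *links, size_t n_links, const char *file)
{
    const char *visited[MAX_DEBUGLINK_DEPTH];
    size_t n_visited = 0;
    const char *current = file;

    assert (file != NULL);

    for (;;)
    {
        const char *next = NULL;
        size_t i;

        /* Seeing the same file twice means the graph has a cycle. */
        for (i = 0; i < n_visited; ++i)
        {
            if (strcmp (visited[i], current) == 0)
                return NULL;
        }

        if (n_visited == MAX_DEBUGLINK_DEPTH)
            return NULL;

        visited[n_visited++] = current;

        for (i = 0; i < n_links; ++i)
        {
            if (strcmp (links[i].file, current) == 0)
            {
                next = links[i].debuglink;
                break;
            }
        }

        if (next == NULL)
            return current;   /* no further debuglink: done */

        current = next;
    }
}
```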
1172
1173 * Find out why we are getting bogus symbols reported for /usr/bin/Xorg
1174   Like this:
1175
1176         Everything                                     0.00 100.00
1177           [/usr/bin/Xorg]                              0.00  94.79
1178             GetScratchPixmapHeader                     0.00  94.79
1179               __libc_start_main                        0.00  94.79
1180                 FindAllClientResources                 0.00  94.79
1181                   FreeFontPath                         0.00  94.79
1182                     SProcRenderCreateConicalGradient   0.00  94.56
1183                       ProcRenderTrapezoids             0.00  94.56
1184                         AllocatePicture                0.00  94.56
1185                     __glXDispatch                      0.00   0.16
1186                       __glXVendorPrivate               0.00   0.08
1187                       __glXRender                      0.00   0.08
1188                     SmartScheduleStartTimer            0.00   0.08
1189           [./sysprof]                                  0.00   2.76
1190           [sshd: ssp@pts/0]                            0.00   2.37
1191           [compiz]                                     0.00   0.08
1192
1193   What's going on here is that the computed load address for the X server
1194   binary is different for the debug binary. The lowest allocated address
1195   is 0x08047134 for the normal, and 0x8048134 for the debug. But it looks
1196   like the addresses are still the same for the symbols.
1197         The root of this problem may be that we should ignore the load
1198   address of the debug binary, and just lookup the address computed.
1199   The *text* segments have the same address though. Everything from
1200   "gnu version" on has the same address.
1201
1202   So:
1203         - find out where in memory the text segment is
1204         - take an address and compute its offset to the text segment
1205         - in elf parser, find address of text segment
1206         - add offset
1207         - lookup resulting address
1208
1209 So basically, "load address" should really be text address. Except of course
1210 that load_address is not used in process.c - instead the 'load address' of the
1211 first part of the file is computed and assumed to be equivalent to the
1212 load address. So to lookup something you probably actually need
1213 to know the load/offset at the elf parser level:
1214
1215         lookup_symbol (elf, map, offset, address)
1216
1217 then,
1218
1219         real load address of text (lta) = map - offset + text_offset
1220
1221         offset of called func (ocf):   addr - lta
1222
1223         thing to lookup in table:  ocf + text_addr.loadaddr in debug
1224
1225         addr - map + offset - text_offset
1226
1227
1228         hmmm ...
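The three steps above, written out as a sketch so the signs can be checked.
The function and parameter names are made up; it assumes the debug file's
text section address is what gets added back in the last step.

```c
#include <assert.h>
#include <stdint.h>

/* Translate a sampled address into the address to look up in the debug
 * file's symbol table.
 *
 *   addr            sampled address inside the mapping
 *   map             start address of the mapping
 *   offset          file offset the mapping starts at
 *   text_offset     file offset of the text section in the mapped file
 *   debug_text_addr address of the text section in the debug file
 */
static uint64_t
debug_lookup_address (uint64_t addr,
                      uint64_t map,
                      uint64_t offset,
                      uint64_t text_offset,
                      uint64_t debug_text_addr)
{
    /* real load address of text */
    uint64_t lta = map - offset + text_offset;

    /* offset of called func: addr - map + offset - text_offset */
    uint64_t ocf;

    assert (addr >= lta);
    ocf = addr - lta;

    /* thing to look up in the table */
    return ocf + debug_text_addr;
}
```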
1229
1230 * plug all the leaks
1231         - don't leak the presentation strings/objects
1232                 - maybe add stack_stash_set_free_func() or something
1233 * Delete elf_parser_new() and rename elf_parser_new_from_file()
1234
1235 * Add demangling again
1236
1237 * Restore filename => binfile cache.
1238
1239 * It is apparently possible to get another timer interrupt in the middle
1240   of timer_notify. If this happens the circular buffer gets screwed up and
1241   you get crashes. Note this gets much worse on SMP (in fact how did this
1242   work at all previously?)
1243
1244   Possible fixes
1245         - have a "in timer notify" variable, then simply reject nested
1246           interrupts
1247         - keep a "ghost head" that timers can use to allocate new traces,
1248           then update the real head whenever one of them completes. Note
1249           though, that the traces will get generated in the wrong order
1250           due to the nesting. In fact, only the outermost timernotify
1251           can update the real head, and it should update it to ghost_head.
1252         - do correct locking? Nah, that's crazy talk
1253   Also note: a race is a race, and on SMP we probably can't even make it
1254   unlikely enough to not matter.
1255
1256   Fixed by ignoring the nested interrupts using an atomic variable.
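A userspace sketch of that fix using C11 atomics; the kernel module would
use the kernel's own atomic primitives instead. The names here are
invented, and the simulate_nested_interrupt flag only exists to
demonstrate a second interrupt arriving in the middle of the first one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag in_timer_notify = ATOMIC_FLAG_INIT;
static atomic_ulong n_rejected;   /* nested interrupts that were ignored */
static unsigned long n_recorded;  /* traces written; only touched while
                                   * the flag is held */

/* Returns true if a trace was recorded, false if this call was nested
 * inside another one and was therefore ignored. */
static bool
timer_notify_sketch (bool simulate_nested_interrupt)
{
    /* test_and_set returns the previous value: true means someone is
     * already inside, so reject this interrupt. */
    if (atomic_flag_test_and_set (&in_timer_notify))
    {
        atomic_fetch_add (&n_rejected, 1);
        return false;
    }

    /* Stands in for writing one trace into the circular buffer. */
    n_recorded++;

    if (simulate_nested_interrupt)
    {
        bool recorded = timer_notify_sketch (false);

        assert (!recorded);
        (void) recorded;
    }

    atomic_flag_clear (&in_timer_notify);
    return true;
}
```

The n_rejected counter is the kind of number that could be sent to
userspace as an indication of profiling overhead.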
1257
1258 * When you load a file from the commandline, there is a weird flash of the toolbar.
1259   What's going on is this:
1260                 - this file is loaded into a tree model
1261                 - the tree model is set for the function list
1262                 - this causes the selection changed signal to be emitted
1263                 - the callback for that signal processes updates
1264                 - somehow in that update process, the toolbar flashes.
1265                 - turns out to be a gtk+ issue: 350517
1266
1267 - screenshot window must be cleared when you press start.
1268
1269 - Formats should become first-class, stand-alone objects that offer
1270   help with parsing and nothing else.
1271
1272         ParseContext* format_get_parse_context (format, err);
1273         gboolean parse_context_begin (parse_context, name, err);
1274         gboolean parse_context_end (parse_context, name, err);
1275
1276   basically, a Format encapsulates a DFA, and a ParseContext encapsulates
1277   the current state.
1278
1279 - make stackstash ref counted
1280
1281 - Charge 'self' properly to processes that don't get any stack trace at all
1282         (probably we get that for free with stackstash reorganisation)
1283
1284 - CVS head now has two radio buttons in the right pane, and
1285   caller pane is gone. (This turned out to be a bad idea, because it
1286   is often useful to click on ancestors to move up the tree).
1287
1288 * Don't build the GUI if gtk+ is not installed
1289
1290 * Find out why we sometimes get reports of time spent by [pid 0].
1291
1292 * - Run a.out generated normally with gcc.
1293   - Run sysprof
1294   - hit profile
1295   - Kill a.out
1296   - strip a.out
1297   - run a.out
1298   - hit start
1299   - hit profile
1300   At this point we should not get any symbols, but we do. There is some
1301   sort of bad caching going on.
1302
1303 * support more than one reader of the samples properly
1304         - Don't generate them if no one cares
1305
1306 * Correctness
1307         - When the module is unloaded, kill all processes blocking in read
1308                 - or block unloading until all processes have exited
1309           Unfortunately this is basically impossible to do with a /proc
1310           file (no open() notification). So, for 1.0 this will have to be
1311           a dont-do-that-then. For 1.2, we should do it with a sysfs and
1312           kobject instead.
1313
1314         - When the module is unloaded, can we somehow *guarantee* that no
1315           kernel thread is active? Doesn't look like it; however we can
1316           get close by decreasing a ref count just before returning
1317           from the module. (There may still be return instructions etc.
1318           that will get run). This may not be an issue with the timer
1319           based scanning we are using currently.
1320
1321 * Find out why we get hangs with rawhide kernels. This only happens with the
1322   'trace "current"' code. See this mail:
1323
1324         http://mail.nl.linux.org/kernelnewbies/2005-08/msg00157.html
1325
1326   esp0 points to top of kernel stack
1327   esp  points to top of user stack
1328
1329   (Reported by Kjartan Maraas).
1330
1331 - When not profiling, sysprof shouldn't keep the file open.
1332
1333 - Make things faster
1334         - Can I get it to profile itself?
1335         - speedprof seems to report that lots of time is spent in
1336           stack_stash_foreach() and also in generate_key()
1337 - add an 'everything' object. It is really needed for a lot of things
1338         - should be easy to do with stackstash reorganization.
1339
1340
1341 * Handle time being set back in the RESET_DEAD_PERIOD code.
1342
1343 - total should probably be cached so that compute_total() doesn't
1344   take 80% of the time to generate a profile.
1345
1346 - Fixing the oops in kernels < 2.6.11
1347
1348         - Probably just require 2.6.11 (necessary for timer interrupt
1349           based anyway).
1350
1351         - Make the process waiting in poll() responsible for extracting
1352           the backtrace. Give a copy of the entire stack rather than doing
1353           the walk inside the kernel.
1354
1355                 New model:
1356                         - Two arrays,
1357                                 one of actual scanned stacks
1358                                 one of tasks that need to be scanned
1359                           One wait queue,
1360                                 wait for data
1361
1362                         - in read() wait for stack data:
1363                                 scan_tasks()
1364                                 if (!stack_data)
1365                                         return -EWOULDBLOCK;
1366
1367                           in poll()
1368                                 while (!stack_data) {
1369                                         wait_for_data();
1370                                         scan_tasks();
1371                                 }
1372                                 return READABLE;
1373
1374                           scan_tasks() is a function that converts waiting
1375                           tasks into data, and wakes them up.
1376
1377                         - in timer interrupt:
1378
1379                                 if (someone waiting in poll() &&
1380                                     current && current != that_someone &&
1381                                     current is runnable)
1382                                 {
1383                                         stop current;
1384                                         add current to queue;
1385                                         wake wait_for_data;
1386                                 }
1387
1388                         This way, we will have a real userspace process
1389                         that can take the page faults.
1390
1391
1392                 - Different approach:
1393
1394                         pollable file where a regular userspace process
1395                         can read a pid. Any pid returned is guaranteed to be
1396                         UNINTERRUPTIBLE. Userspace process is required to
1397                         start it again when it is done with it.
1398
1399                         Also provide interface to read arbitrary memory of
1400                         that process.
1401
1402                         ptrace() could in principle do all this, but
1403                         unfortunately it sucks to continuously
1404                         ptrace() processes.
1405
1406                 - Yet another
1407
1408                         Userspace process can register itself as "profiler"
1409                         and pass in a filedescriptor where all sorts of
1410                         information is sent.
1411
1412                                 - could tie lifetime of module to profiler
1413                                 - could send "module going away" information
1414                                 - Can we map filedescriptors to files in
1415                                   a module?
1416
1417 * Make sure sysprof-text is not linked to gtk+
1418
1419 * Consider renaming profiler.[ch] to collector.[ch]
1420
1421 * Crash reported by Rudi Chiarito with n_addrs == 0.
1422
1423 * Find out what distributions it actually works on
1424    (ask for success/failure-stories in 1.0 releases)
1425
1426 * Add note in README about Ubuntu and Debian -dbg packages and how to get
1427   debug symbols for X there.
1428
1429 stackstash reorg:
1430
1431         - make loading and saving work again.
1432                 - make stashes loadable and savable.
1433                 - add a way to convert 1.0 files to stashes
1434
1435         - Get rid of remaining uses of stack_stash_foreach(), then
1436           rename stack_stash_foreach_reversed() to
1437           stack_stash_foreach()
1438
1439         - stackstash should just take traces of addresses without knowing
1440           anything about what those addresses mean.
1441
1442         - stacktraces should then begin with a process
1443
1444         - stackstash should be extended so that the "create_descendant"
1445           and "create_ancestor" code in profile.c can use it directly.
1446           At that point, get rid of the profile tree, and rename
1447           profile.c to analyze.c.
1448
1449         - the profile tree will then just be a stackstash where the
1450           addresses are presentation strings instead.
1451
1452         - Doing a profile will then amount to converting the raw stash
1453           to one where the addresses have been looked up and converted to
1454           presentation strings.
1455
1456         -=-=
1457
1458         - profile should take traces of pointers to presentation
1459           objects without knowing anything about these presentation
1460           objects.
1461
1462         - For each stack node, compute a presentation object
1463                 (probably need to export opaque stacknode objects
1464                   with set/get_user_data)
1465
1466         - Send each stack trace to the profile module, along with
1467           presentation objects. Maybe just a map from stack nodes
1468           to presentation objects.
1469
1470 - Make the Profile class use the stash directly instead of
1471   building its own copy.
1472         - store a stash in the profile class
1473         - make sure descendants and callers can be
1474           built from it.
1475         - get rid of other stuff in the profile
1476           struct
1477
1478
1479 * Before 1.0:
1480
1481    - Update version numbers in source
1482
1483    - Make tarball
1484
1485    - Check that tarball works
1486
1487    - cvs commit
1488
1489    - cvs tag sysprof-1-0
1490
1491    - Update website
1492
1493    - Announce on Freshmeat
1494
1495    - Announce on gnome-announce
1496    - Announce on kernel list.
1497
1498    - Announce on Gnomefiles
1499
1500    - Announce on news.gnome.org
1501    - Send to slashdot/developers
1502    - Announce on devtools list (?)
1503
1504    - Announce on Advogato
1505         link to archive
1506
1507 * The handling of the global variable in signal-handler.[ch] needs to be
1508   atomic - right now it isn't. The issue is what happens if a handled signal
1509   arrives while we are manipulating the list?
1510
1511 * User space stack must probably be done in a thread - kernel
1512   stack must probably be taken in the interrupt itself?
1513   - Why this difference? The page tables should still be loaded. Is it
1514     because pages_present() doesn't work? No, turning it off doesn't help.
1515   - It looks like this works. Get:
1516
1517        struct pt_regs *user_regs =
1518           (void *)current->thread.esp0 - sizeof (struct pt_regs);
1519
1520     then use pages_present as usual to trace with user_regs. There could be
1521     rare lockups, though.
1522
1523 * Non-GUI version that can save in a format the GUI can understand.
1524   Could be used for profiling startup etc. Would preferably be able to
1525   dump the data to a network socket. Should be able to react to e.g.
1526   SIGUSR1 by dumping the data.
1527
1528   Work done by Lorenzo:
1529
1530   http://www.colitti.com/lorenzo/software/gnome-startup/sysprof-text.diff
1531   http://www.colitti.com/lorenzo/software/gnome-startup/sysprof.log
1532   http://colitti.com/lorenzo/software/gnome-startup/
1533
1534 * consider caching [filename => bin_file]
1535
1536 * Check the kernel we are building against, if it is SMP or
1537   less than 2.6.11, print a warning and suggest upgrading.
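A rough sketch of the version comparison, assuming the check is done on
a release string such as the one printed by `uname -r`. The function name
kernel_older_than() is hypothetical, and the SMP half of the check is not
covered here.

```c
#include <stdio.h>

/* Return 1 if the release string names a kernel older than
 * major.minor.patch, 0 if it does not, and -1 if the string does not
 * start with three dot-separated numbers. */
int
kernel_older_than (const char *release, int major, int minor, int patch)
{
    int a, b, c;

    if (sscanf (release, "%d.%d.%d", &a, &b, &c) != 3)
        return -1;

    if (a != major)
        return a < major;
    if (b != minor)
        return b < minor;
    return c < patch;
}
```

A build script could print the warning whenever
kernel_older_than (release, 2, 6, 11) returns 1.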
1538
1539 * Timer interrupt based
1540
1541 * Interface
1542         - Consider expanding a few more levels of a new descendants tree
1543                 - Algorithm should be to expand in proportion to the
1544                   "total" percentage. Basically consider 'total' the
1545                   likelihood that the user is going to look at it.
1546                 - Maybe just keep expanding the biggest total until
1547                   there is no more room or we run out of things to expand.
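The "keep expanding the biggest total" idea could look roughly like the
sketch below. TreeNode, auto_expand() and MAX_CANDIDATES are hypothetical
names, and "room" is modelled as a number of free rows. This is one
possible reading of the note, not the GUI's actual code. It stops as soon
as the biggest remaining node does not fit.

```c
#include <stddef.h>

#define MAX_CANDIDATES 256   /* arbitrary limit for this sketch */

typedef struct TreeNode TreeNode;
struct TreeNode
{
    double     total;        /* "total" percentage of this node */
    TreeNode **children;
    size_t     n_children;
    int        expanded;
};

/* Expand nodes greedily, biggest total first, until the children of the
 * biggest remaining node no longer fit in the free rows or there is
 * nothing left to expand. Returns the number of rows used. */
size_t
auto_expand (TreeNode *root, size_t rows_free)
{
    TreeNode *candidates[MAX_CANDIDATES];
    size_t n = 0, used = 0, i;

    candidates[n++] = root;

    while (n > 0)
    {
        size_t best = 0;
        TreeNode *node;

        /* Find the candidate with the biggest total. */
        for (i = 1; i < n; i++)
            if (candidates[i]->total > candidates[best]->total)
                best = i;

        node = candidates[best];
        candidates[best] = candidates[--n];    /* remove it from the set */

        if (node->n_children == 0)
            continue;                          /* leaf: nothing to expand */
        if (node->n_children > rows_free - used)
            break;                             /* no more room */
        if (n + node->n_children > MAX_CANDIDATES)
            break;                             /* sketch limit reached */

        node->expanded = 1;
        used += node->n_children;
        for (i = 0; i < node->n_children; i++)
            candidates[n++] = node->children[i];
    }

    return used;
}
```

A linear scan for the biggest candidate is enough at the sizes a visible
tree can reach. A heap would only matter for much larger candidate sets.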
1548
1549 * Web page containing
1550
1551         - Screen shots
1552         - Explanation of what it is
1553         - Download
1554         - Bug reporting
1555         - Contact info
1556         - Ask for success/failure reports
1557 - hook up menu items view/start etc (or possibly get rid of them or
1558   move them)
1559 - Should do as suggested in the automake manual in the
1560   chapter "when automake is not enough"
1561 - add an "insert-module" target
1562 - need to run depmod on install
1563 - If the current profile has a name, display it in the title bar
1564 - auto*?
1565 - Find out if that PREFIX business in Makefile was really such
1566   a great idea.
1567 - Should just install the kernel module if it is running as root, pop up
1568   a dialog if not. Note we must be able to start without module now,
1569   since it is useful to just load profiles from disk.
1570         - Is there a portable way of asking for the root password?
1571         - Install a small suid program that only inserts the module?
1572                 (instant security hole ..)
1573 - Need to make "make install" work (how do you know where to install
1574   kernel modules?)
1575         - in /lib/modules/`uname -r`/kernel/drivers/
1576         - need to run depmod as root after that
1577         - Then modprobe run as root should correctly find it.
1578
1579 - grep FIXME
1580
1581 - give profiles on the command line
1582
1583 - Hopefully the oops at the end of this file is gone now that
1584   we use mmput/get_task_mm.  For older kernels those symbols
1585   are not exported though, so we will probably have to either
1586   use the old way (directly accessing the mm's) or just not
1587   support those kernels.
1588
1589 - Need an icon
1590
1591 - hook up about box
1592
1593 - Add busy cursors,
1594         - when you hit "Profile"
1595         - when you click something in the main list and we don't respond
1596                 within 50ms (or perhaps when we expect to not be able to do
1597                 so (can we know the size in advance?))
1598
1599 - kernel module should put process to sleep before sampling. Should get us
1600   more accurate data
1601
1602 - Make sure samples label shows correct number after Open
1603
1604 - Move "samples" label to the toolbar, then get rid of statusbar.
1605
1606 - crashes when you ctrl-click the selected item in the top left pane
1607    <ian__> ssp: looks like it doesn't handle the none-selected case
1608
1609 - loading and saving
1610
1611 - consider making ProfileObject more of an object.
1612
1613 - make an "everything" object
1614        maybe not necessary -- there is a libc_ctors_something()
1615
1616 - make presentation strings nicer
1617
1618   four different kinds of symbols:
1619
1620        a) I know exactly what this is
1621        b) I know in what library this is
1622        c) I know only the process that did this
1623        d) I know the name, but there is another similarly named one
1624
1625   (a) is easy,  (b) should be <in ...>    (c) should just become "???"
1626   (d) not sure
1627
1628 - processes with a cmdline of "" should get a [pid = %d] instead.
1629
1630 - make an "n samples" label
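For the empty-cmdline item above, a small sketch of the fallback. The
function name process_display_name() and the plain int pid are
assumptions made for illustration.

```c
#include <stdio.h>

/* Write a display name for a process into buf. Falls back to
 * "[pid = N]" when the command line is missing or empty. */
void
process_display_name (const char *cmdline, int pid,
                      char *buf, size_t buf_size)
{
    if (cmdline != NULL && cmdline[0] != '\0')
        snprintf (buf, buf_size, "%s", cmdline);
    else
        snprintf (buf, buf_size, "[pid = %d]", pid);
}
```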
1631 Process stuff:
1632
1633 - make threads be reported together
1634   (simply report pids with similar command lines together)
1635   (note: it seems separating by pid is way too slow (uses too much memory),
1636   so it has to be like this)
1637
1638 - stack stash should allow different pids to refer to the same root
1639   (ie. there is no need to create a new tree for each pid)
1640    The *leaves* should contain the pid, not the root. You could even imagine
1641    a set of processes, each referring to a set of leaves.
1642
1643 - when we see a new pid, immediately capture its mappings
1644
1645 Road map:
1646      - new object Process
1647        - hashable by pointer
1648        - contains list of maps
1649        - process_from_pid (pid_t pid, gboolean join_threads)
1650          - new processes get their maps immediately
1651          - resulting pointer must be unref()ed, but it is possible it
1652            just points to an existing process
1653          - processes with identical cmdlines are taken together
1654        - method lookup_symbol()
1655        - method get_name()
1656        - ref/unref
1657      - StackStash stores map from process to leaves
1658      - Profile is called with processes
1659
1660 It is possible that we simply need a better concept of Process:
1661
1662      If two pids have the same command line, consider them the same, period.
1663      This should save considerable amounts of memory.
1664
1665      The assumptions:
1666
1667          "No pids are reused during a profiling run"
1668          "Two processes with the same command line have the same mappings"
1669
1670      are somewhat dubious, but probably necessary.
1671
1672      (More complex kernel:
1673
1674            have the module report
1675
1676                 - new pid arrived (along with mappings)
1677                 - mapping changed for pid
1678                 - stacktrace)
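The "same command line, same process" rule from the notes above can be
sketched as a reference-counted lookup keyed by cmdline. Everything here
is hypothetical: the Process layout, process_from_cmdline() and the
linked list stand in for whatever the real process_from_pid() would use,
and reading the command line for a pid is left out.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Process Process;
struct Process
{
    char    *cmdline;
    int      ref_count;
    Process *next;          /* list of all known processes */
};

static Process *all_processes = NULL;

/* Return the Process for cmdline, creating it the first time the
 * cmdline is seen. The caller owns one reference and must release it
 * with process_unref(). Returns NULL on allocation failure. */
Process *
process_from_cmdline (const char *cmdline)
{
    Process *p;

    for (p = all_processes; p != NULL; p = p->next)
    {
        if (strcmp (p->cmdline, cmdline) == 0)
        {
            p->ref_count++;
            return p;
        }
    }

    p = malloc (sizeof *p);
    if (p == NULL)
        return NULL;

    p->cmdline = malloc (strlen (cmdline) + 1);
    if (p->cmdline == NULL)
    {
        free (p);
        return NULL;
    }
    strcpy (p->cmdline, cmdline);

    p->ref_count = 1;
    p->next = all_processes;
    all_processes = p;
    return p;
}

/* Drop one reference; unlink and free the Process when none remain. */
void
process_unref (Process *process)
{
    Process **link;

    if (--process->ref_count > 0)
        return;

    for (link = &all_processes; *link != NULL; link = &(*link)->next)
    {
        if (*link == process)
        {
            *link = process->next;
            break;
        }
    }

    free (process->cmdline);
    free (process);
}
```

Two pids with the same command line then share one Process, which is
where the memory saving mentioned above would come from.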
1679
1680 - make symbols in executable work
1681 - the hashtables used in profile.c should not accept NULL as the key
1682 - make callers work
1683 - autoexpand descendant tree
1684 - make double clicks work
1685 - fix leaks
1686
1687
1688 - Find out what happened here:
1689
1690 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Unable to handle kernel NULL pointer dereference at virtual address 000001b8
1691 Apr 11 15:42:08 great-sage-equal-to-heaven kernel:  printing eip:
1692 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: c017342c
1693 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: *pde = 00000000
1694 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Oops: 0000 [#1]
1695 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Modules linked in: sysprof_module(U) i2c_algo_bit md5 ipv6 parport_pc lp parport autofs4 sunrpc video button battery ac ohci1394 ieee1394 uhci_hcd ehci_hcd hw_random tpm_atmel tpm i2c_i801 i2c_core snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc e1000 floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata sd_mod scsi_mod
1696 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: CPU:    0
1697 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: EIP:    0060:[<c017342c>]    Not tainted VLI
1698 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: EFLAGS: 00010287   (2.6.11-1.1225_FC4)
1699 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: EIP is at grab_swap_token+0x35/0x21f
1700 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: eax: 0bd48023   ebx: d831d028   ecx: 00000282   edx: 00000000
1701 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: esi: c1b72934   edi: c1045820   ebp: c1b703f0   esp: c18dbdd8
1702 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: ds: 007b   es: 007b   ss: 0068
1703 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Process events/0 (pid: 3, threadinfo=c18db000 task=f7e62000)
1704 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: Stack: 000011a8 00000000 000011a8 c1b703f0 c0151731 c016f58f 000011a8 c1b72934
1705 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:        000011a8 c0166415 c1b72934 c1b72934 c0163768 ee7ccc38 f459fbf8 bf92e7b8
1706 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:        f6c6a934 c0103b92 bfdaba18 c1b703f0 00000001 c1b81bfc c1b72934 bfdaba18
1707 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: Call Trace:
1708 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0151731>] find_get_page+0x9/0x24
1709 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c016f58f>] read_swap_cache_async+0x32/0x83
     Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0166415>] do_swap_page+0x262/0x600
1710 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0163768>] pte_alloc_map+0xc6/0x1e6
1711 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0103b92>] common_interrupt+0x1a/0x20
1712 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c01673f0>] handle_mm_fault+0x1da/0x31d
1713 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c016488e>] __follow_page+0xa2/0x10d
1714 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0164a6f>] get_user_pages+0x145/0x6ee
1715 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0161f66>] kmap_high+0x52/0x44e
1716 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<c0103b92>] common_interrupt+0x1a/0x20
1717 Apr 11 15:42:09 great-sage-equal-to-heaven kernel:  [<f8cbb19d>] x_access_process_vm+0x111/0x1a5 [sysprof_module]
1718 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<f8cbb24a>] read_user_space+0x19/0x1d [sysprof_module]
1719 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<f8cbb293>] read_frame+0x35/0x51 [sysprof_module]
1720 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<f8cbb33a>] generate_stack_trace+0x8b/0xb4
1721 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<f8cbb3a2>] do_generate+0x3f/0xa0 [sysprof_module]
1722 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c0138d7a>] worker_thread+0x1b0/0x450
1723 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c0379ccd>] schedule+0x30d/0x780
1724 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c011bdb6>] __wake_up_common+0x39/0x59
1725 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<f8cbb363>] do_generate+0x0/0xa0 [sysprof_module]
1726 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c011bd71>] default_wake_function+0x0/0xc
1727 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c0138bca>] worker_thread+0x0/0x450
1728 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c013f3cb>] kthread+0x87/0x8b
1729 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c013f344>] kthread+0x0/0x8b
1730 Apr 11 15:42:10 great-sage-equal-to-heaven kernel:  [<c0101275>] kernel_thread_helper+0x5/0xb
1731 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: Code: e0 8b 00 8b 50 74 8b 1d c4 55 3d c0 39
1732 da 0f 84 9b 01 00 00 a1 60 fc 3c c0 39 05 30 ec 48 c0 78 05 83 c4 20 5b c3 a1 60 fc 3c c0 <3b> 82 b8 01 00 00 78 ee 81 3d ac 55 3d c0 3c 4b 24 1d 0f 85 78
1733