3 * Move perfcounter.h into sysprof namespace
5 * Check for existence and presence of __NR_perf_counter_open in
10 * Filter out kernel stacktrace entries from the interrupts.
11 (This can be done better in the kernel).
13 * Make sure errors are reported properly.
15 * Fork GtkLabel and change it to only request a resize when the new size is
16 actually larger than the old one. It could also be special-purposed for
17 the "Samples: <number>" format with the number left-aligned.
19 * When should the samples label be updated? On one hand we don't want
20 it to be the only thing that shows up on the profile. On the other,
21 when there are things going on, it should update quickly.
23 It is also desirable that it updates slowly if there is slow
24 activity going on; for example if you are moving the mouse cursor
27 Cost of updating the samples label: s (in samples)
30 If we update the samples label c times per second, the frequency is
34 If we update the samples label for every k samples.
36 (k - s)/f is the time it takes before updating
38 So if the update rate should be proportional to the base rate, then
43 which implies k = d + s. So we should pick some constant d and only
44 update when that many samples have arrived.
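A minimal sketch of the "update every k = d + s samples" rule derived above. The struct, its field names and the function name are made up for illustration and are not existing sysprof code; d and s are whatever constants end up being chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for the "Samples: <number>" label. */
typedef struct {
    uint64_t n_samples;   /* total samples seen so far */
    uint64_t last_shown;  /* value of n_samples at the last label update */
    uint64_t d;           /* constant: "useful" samples wanted between updates */
    uint64_t s;           /* cost of one label update, measured in samples */
} SampleLabel;

/* Record one sample; return true when the label should be redrawn,
 * i.e. when k = d + s samples have arrived since the last redraw. */
static bool
sample_label_add (SampleLabel *label)
{
    uint64_t k = label->d + label->s;

    assert (k > 0);

    label->n_samples++;
    if (label->n_samples - label->last_shown >= k) {
        label->last_shown = label->n_samples;
        return true;
    }
    return false;
}
```

Because the threshold counts samples rather than wall-clock time, the label updates quickly under heavy load and slowly when little is going on.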
46 * The counters seem to not be disabled when looking at the
50 - Create RPM package? See fedora-packaging-list for information
51 about how to package kernel modules. Lots of threads in
52 June 2005 and forward.
54 See also http://www.fedoraproject.org/wiki/Extras/KernelModuleProposal
56 Someone already did create a package - should be googlable.
58 * The hrtimer in the kernel currently generates an event every time
59 the timer fires. There are two problems with this:
61 - Essentially all the events are idle events and exclude_idle is
64 - If you make it obey exclude_idle, it still generates activity
65 based on what is running currently. Unfortunately, the thing that
66 is running will very often be the userspace tracker because it was
67 handling the last sample generated. So this degenerates into an
68 infinite loop of sorts. Or maybe the GUI is just too slow, but
69 then why doesn't it happen with the real counters?
71 I think the solution here is to make the hrtimer fire at some
72 reasonable interval, like 100000 ns. When the timer fires, if the
73 current task is not the idle task, it increments a counter by
75 cpu clock frequency * 100000 ns
77 If that overflows the sample period, an event is generated.
79 This is closer to the idea of a fake CPU cycle counter.
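The fake cycle counter idea can be sketched in plain C as below. This is a userspace model only, not kernel code: the FakeCounter type, the function name and the way the CPU frequency is obtained are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TIMER_INTERVAL_NS 100000ULL   /* interval suggested in the note above */

typedef struct {
    uint64_t cpu_hz;         /* assumed CPU clock frequency in Hz */
    uint64_t sample_period;  /* fake cycles between generated events */
    uint64_t cycles;         /* fake cycles accumulated since the last event */
} FakeCounter;

/* Model of one timer tick; returns how many events should be generated. */
static unsigned
fake_counter_tick (FakeCounter *c, bool task_is_idle)
{
    unsigned n_events = 0;

    assert (c->sample_period > 0);

    if (task_is_idle)
        return 0;   /* idle time never advances the fake counter */

    /* cpu clock frequency * 100000 ns, converted to cycles */
    c->cycles += c->cpu_hz * TIMER_INTERVAL_NS / 1000000000ULL;

    /* Every time the sample period is exceeded, one event is due. */
    while (c->cycles >= c->sample_period) {
        c->cycles -= c->sample_period;
        n_events++;
    }
    return n_events;
}
```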
81 Also, for reasons I don't understand, it stops generating events
82 completely if a process is running that spins on all CPUs. So this
83 interface is not usable in its present state, but fortunately all
84 CPUs we care about have hardware cycle counters.
86 * With more than one CPU, we can get events out of order, so userspace
87 will have to deal with that. With serial numbers we could do it
88 correctly, but even without them we can do a pretty reasonable job
89 of putting them back in order. If a fork happens "soon" after a
90 sample, it probably happened before the sample; if an mmap happens
91 "soon" after a sample that would otherwise be unmapped, it probably
92 happened before the sample. All we need is a way to determine what
95 Serial numbers would be useful to make "soon" an accurate measure.
97 There is also the issue of pid reuse, but that can probably be
100 If we ignore pid reuse, we can sort the event buffer where two
101 events compare equal, unless both have the same pid and one is a
102 fork and the other is not.
104 A system-wide serial number could be expensive to maintain though,
105 so maybe time events would be just as good.
107 * Another issue is processes that exit during the initial scan of
108 /proc. Such a process will not cause sample events by itself, but it
109 may fork a child that will. There is no way to get maps for that
112 A possible solution would be to optionally generate mmap events after
113 forks. Userspace would turn this off when it was done with the
114 initial gathering of processes.
116 Also, exec() events will delete the maps from a process, but all we
117 get is 'comm' events which is not quite the same thing.
119 * Find out why the busy cursor stays active after you hit start
121 * Kernel binary, when available, is better than kallsyms.
124 - Misposition window after click
125 - Find out why gtk_tree_view_columns_autosize() apparently doesn't
126 work on empty tree views.
127 - Write my own tree model? There are still performance issues in
130 * Counters must not be destroyed during tracker setup. They have to
131 exist but be disabled so that we can track process creation.
133 * Check that we don't use too much memory (in particular with the
136 * Fix names. "new process" is really "exec". (What does "comm"
137 actually stand for? Command?)
139 * Fix ugly flash when double clicking in descendants view
141 * Find out what's up with weird two-step flash when you hit start when
144 * Make tracker creation faster. (Faster populating mainly)
146 * Share map reading code between vdso stuff in binfile.c and tracker.c
148 * Get rid of remaining gulongs (use uint64_t instead)
150 * Move binfile hash table to state_t.
152 * Get rid of process.c
154 * On 32 bit, NMI stackframes are not filtered out which leads to wrong
157 * Can we track kernel modules being inserted/removed?
159 * Does it make sense to try and locate a kernel binary, or can we
160 always just use kallsyms.
162 * open_inode() would be useful in many cases
164 * Is the double mapping of the ring buffer ok?
166 * Why do we get EINVAL when we try to track forks?
168 * Sometimes it gets samples for unknown processes. This may be due to
169 forking without execing.
171 * Give an informative error message if not run as root
173 * What are the events generated for pid 0 and pid 1? They have no
174 stacktrace, and an eip in the kernel.
176 * Delete the binparser stuff and fix the elf parser to just deal with
177 32 bits vs 64 bits. Or use C++ like behdad does in harfbuzz?
179 * We often get "No map". I suspect this is because the vdso stackframe
182 * Hack to disable recursion for binaries without symbols causes the
183 symbols to not work the way other symbols do. A better approach is
184 probably to simply generate a new symbol for every appearance except
185 leaf nodes, which should still be considered one symbol (or maybe be
186 considered the same symbol if they have the same parent). In fact
187 "has same parent" may be the correct criterion in all cases. (done:
188 master now doesn't fold those recursions anymore)
190 * See if we can make "In file <blah>" not be treated as a recursive
191 function. Maybe simply treat each individual address in the file
192 as a function. Or try to parse the machine code. Positions that
193 are called are likely to be functions.
195 - Treat identical addresses as one function
197 - Treat all addresses within a library that don't have children
198 as one function.
200 This will have the effect of coalescing adjacent siblings without
201 children. Which is what you want since you can't tell them apart
202 anyway. It will never be a great experience though.
204 * Make sure that labels look decent in case of "No Map" etc.
207 - error handling for bin_parser is necessary.
209 * Find out why all apps have an "In file /usr/bin/<app binary>" below
210 _libc_main. If possible, maybe make up a name for it.
213 - the "[vdso]" string should be #defined somewhere
214 - Does get_vdso_bytes() belong in process.c?
215 - Is basing on "[vdso]" always correct?
217 * Convert things like [heap] and [stack] to more understandable labels.
219 * Strategies for taking reliable stacktraces.
221 Three different kinds of files
228 - eh_frame annotations, in kernel or in kernel debug
230 - userspace can look at _stext and _etext to determine
231 start and end of kernel text segment
232 - copying kernel stack to userspace
233 - it's always 4096 bytes these days
234 - heuristically determine functions based on address
235 - callbacks on the stack can be identified
236 by having an offset of 0.
237 - even so there are a lot of false positives.
238 - is eh_frame usually loaded into memory during normal
239 operation? It is mapped, but probably not paged in,
240 so we will be taking a few major page faults when we
241 first profile something.
242 Unless of course, we store the entire stack in
243 the stackstash. This may use way too much memory though.
245 - Locking, possibly useful code:
247 /* In principle we should use get_task_mm() but
248 * that will use task_lock() leading to deadlock
249 * if somebody already has the lock
251 if (spin_is_locked (&current->alloc_lock))
252 printk ("alreadylocked\n");
254 struct mm_struct *mm = current->mm;
257 printk (KERN_ALERT "stack size: %d (%d)\n",
258 mm->start_stack - regs->REG_STACK_PTR,
261 stacksize = mm->start_stack - regs->REG_STACK_PTR;
268 - usually have eh_frame section which is mapped into memory
269 during normal operation
270 - do stackwalk in kernel based on eh_frame
271 - eh_frame section is usually mapped into memory, so
272 no file reading in kernel would be necessary.
273 - do stackwalk in userland based on eh_frame
274 - do ebp based stackwalk in kernel
275 - do ebp based stackwalk in userland
276 - do heuristic stackwalk in kernel
277 - do heuristic stackwalk in userland
279 - Send heuristic stack trace to user space, along with
280 location on the stack. Then, in userspace analyze the
281 machine code to determine the size of the stack frame at any
282 point. The instructions that would need to be recognized are:
284 subl <constant>, %esp
285 addl <constant>, %esp
291 GCC is unlikely to have different stack sizes at the entry
294 We can often find a vmlinux in /lib/modules/<uname-r>/build.
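For the userspace frame-size analysis mentioned above, recognizing the subl/addl adjustments of %esp could look roughly like this. It is a sketch for 32-bit x86 only, it handles just these two instructions (opcodes 0x83 and 0x81 with ModRM bytes 0xEC and 0xC4), and the function name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode "subl $imm, %esp" or "addl $imm, %esp" at the start of code[].
 * Returns the instruction length in bytes, or 0 if the bytes are neither.
 * On success *delta is the change applied to %esp (negative for subl). */
static size_t
decode_esp_adjust (const uint8_t *code, size_t len, int64_t *delta)
{
    int32_t imm;
    size_t insn_len;

    assert (code != NULL && delta != NULL);

    if (len < 3)
        return 0;

    /* ModRM 0xEC selects "sub ..., %esp"; 0xC4 selects "add ..., %esp" */
    if (code[1] != 0xEC && code[1] != 0xC4)
        return 0;

    if (code[0] == 0x83) {
        /* 8-bit immediate, sign-extended */
        imm = (int8_t) code[2];
        insn_len = 3;
    } else if (code[0] == 0x81) {
        /* 32-bit little-endian immediate */
        if (len < 6)
            return 0;
        imm = (int32_t) ((uint32_t) code[2] |
                         (uint32_t) code[3] << 8 |
                         (uint32_t) code[4] << 16 |
                         (uint32_t) code[5] << 24);
        insn_len = 6;
    } else {
        return 0;
    }

    *delta = (code[1] == 0xEC) ? -(int64_t) imm : (int64_t) imm;
    return insn_len;
}
```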
296 * "Expand all" is horrendously slow because update_screenshot gets called
297 for every "expanded" signal. In fact even normal expanding is really
298 slow. It's probably hopeless to get decent performance out of GtkTreeView,
299 so we will have to store a list of expanded objects and keep that up to date
300 as the rows are expanded and collapsed.
302 * Give more sensible 'error messages'. Eg., if you get permission denied for
303 a file, put "Permission denied" instead of "No map"
305 * crc32 checking probably doesn't belong in elfparser.c
307 * Missing things in binparser.[ch]
309 - it's inconvenient that you have to pass in both a parser _and_
310 a record. The record should just contain a pointer to the parser.
311 On the other hand, the result does depend on the parser->offset.
312 So it's a bit confusing that it's not passed in.
314 - the bin_parser_seek_record (..., 1); idiom is a little dubious
317 Also need to add error checking.
319 - "native endian" is probably not useful. Maybe go back to just
320 having big/little endian.
322 Should probably rethink the whole thing. It's just not very convenient to use, even
323 for simple things like ELF files.
325 * Rename stack_stash_foreach_by_address() to stack_stash_foreach_unique(),
328 Which things are we actually using from stack stash now?
330 * Maybe report idle time? Although this would come for free with the
333 * Fix (deleted) problem. But more generally, whenever we can't display a
334 symbol, display an error message instead, ie.,
335 - 'Binary file <xxxx> was deleted/replaced'
336 - 'No mapping for address'
337 - 'No symbols in binary file'
338 - 'Address has no corresponding symbol in file'
340 done: HEAD will not load files with the wrong inode now.
342 * Consider whether ProfileDescendant can be done with a StackStash. We
343 need the "go-back-on-recursion" behavior. That could be added of
344 course ... the functions are otherwise very similar.
346 * Add spew infrastructure to make remote debugging easier.
348 * Make it compile and work on x86-64
350 - make the things we put in a stackstash real
353 - they will know how to delete the presentation
354 names and themselves (through a virtual function)
355 - they can contain markup etc.
356 - The unique_dup() scheme is somewhat confusing.
357 - a more pragmatic approach might be to just walk the tree and
361 - loading and saving probably leak right now
363 - make it use less memory:
364 - StackNodes are dominating
365 - fold 'toplevel' into 'size'
367 - this will need to be coordinated with
368 profile.c which also creates stacknodes.
370 - maybe simply make stackstashes able to
373 - rethink loading and saving. Goals
375 - Can load 1.0 profiles
376 - Don't export too much of stackstashes to the rest of the
381 * Have a compatibility module that can load 1.0 modules
382 - reads in the call tree, then generate a stack stash
383 by repeatedly adding traces.
385 * Make stackstash able to save themselves using callbacks.
387 * If loading a file fails, then try loading it using the
389 - optimization: make sure we fail immediately if we
390 see an unknown object.
392 * Add versioning support to sfile.[ch]:
393 - sformat_new() should take doctype and version strings:
394 like "sysprof-profile" "version 1.2"
395 - there should be sfile_sniff() functionality that will
396 return the doctype and version of a file. Or None
399 * At this point, make the loader first check if the file has a version
400 if it doesn't, load as 1.0, otherwise as whatever the version claims
403 * Make provisions for forward compatibility: maybe it should be
404 possible to load records with more fields than specified.
406 * Figure out how to make sfile.[ch] use less memory.
408 - In general clean sfile.[ch] up a little:
410 - split out dfa in its own generic class
412 - make a generic representation of xml files with quarks for strings:
415 quark text: -> begin/end/value
416 int id; -> for begins that are pointed to
418 perhaps even with iterators. Should be compact and suitable for both
419 input and output. As a first cut, perhaps just split out the
424 - make the api saner; add format/content structs
425 Idea: all types have to be declared:
427 SFormat *format = sformat_new()
428 SType *object_list = stype_new (format);
429 SType *object = stype_new (format);
432 stype_define_list (object_list, "objects", object);
433 stype_define_record (object, "object",
434 name, total, self, NULL);
435 stype_define_pointer (...);
437 * See if the auto-expanding can be made more intelligent
438 - "Everything" should be expanded exactly one level
439 - all trees should be expanded at least one level
441 * Send entire stack to user space, then do stackwalking there. That would
442 allow us to do more complex algorithms, like dwarf, in userspace. Though
443 we'd lose the ability to do non-racy file naming. We could pass a list
444 of the process mappings with each stack though. Doing this would also solve
445 the problem of not being able to get maps of processes running as root.
446 Might be too expensive though. User stacks seem to be on the order
447 of 100K usually, which for 200 times a second means a bandwidth of
448 20MB/s, which is probably too much. One question is how much of it
450 Actually it seems that the _interesting_ part of the stack
451 (ie., from the stack pointer and up) is not that big in many cases. The
452 average stacksize seemed to be about 7700 bytes for gcc compiling gtk+.
453 Even deeply recursive apps like sysprof only generate about 55K stacks.
457 - Do heuristic stack walking where it lists all words on the stack
458 that look like they might be return addresses.
460 - Somehow map the application's stack pages into the client. This
461 is likely difficult or impossible.
463 - Another idea: copy all addresses that look like they could be
464 return addresses, along with the location on the stack. This
465 just might be enough for a userspace stack walker.
467 - Yet another: krh suggests hashing blocks of the stack, then
468 only sending the blocks that changed since last time.
470 - every time you send a stackblock, also send a cookie.
472 - whenever you *don't* send a stackblock, send the cookie
473 instead. That way you always get a complete stacktrace
476 - also, that would allow the kernel to just have a simple
477 hashtable containing the known blocks. Though, that could
478 become large. Actually there is no reason to store the
479 blocks; you can just send the hashcode. That way you
480 would only need to store a list of hashcodes that we
481 have generated previously.
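A rough model of the block-hashing scheme, keeping only the list of hashcodes as suggested. The block size, the table size and the use of FNV-1a as the hash are assumptions picked for the sketch; a real implementation would want a proper hash table rather than a linear scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64     /* hypothetical stack block size in bytes */
#define MAX_KNOWN  1024   /* hypothetical cap on remembered hashcodes */

typedef struct {
    uint64_t known[MAX_KNOWN];   /* hashcodes generated previously */
    size_t   n_known;
} HashList;

/* FNV-1a, used here only as a stand-in hash function. */
static uint64_t
hash_block (const uint8_t *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL;

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    return h;
}

static bool
hash_list_contains (const HashList *list, uint64_t h)
{
    for (size_t i = 0; i < list->n_known; i++)
        if (list->known[i] == h)
            return true;
    return false;
}

/* Walk the stack in blocks. Returns the number of blocks that must be
 * sent in full (block + cookie); for all others the cookie is enough. */
static size_t
count_blocks_to_send (HashList *list, const uint8_t *stack, size_t len)
{
    size_t n_full = 0;

    for (size_t off = 0; off < len; off += BLOCK_SIZE) {
        size_t n = len - off < BLOCK_SIZE ? len - off : BLOCK_SIZE;
        uint64_t cookie = hash_block (stack + off, n);

        if (!hash_list_contains (list, cookie)) {
            assert (list->n_known < MAX_KNOWN);
            list->known[list->n_known++] = cookie;
            n_full++;
        }
    }
    return n_full;
}
```

A hash collision would make userspace reuse the wrong block, so a real version needs either a wide hash or some way to detect mismatches.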
483 - One problem with doing DWARF walking is that the debug code
484 will have to be faulted in. This can be a substantial amount
485 of disk access which is undesirable to have during a
486 profiling run. Even if we only have to fault in the
487 .eh_frame_hdr section, that's still 18 pages for gtk+. The
488 .eh_frame section for gtk+ is 72 pages.
490 A possibility may be to consider two stacktraces identical
491 if the only differing values are *outside* the text
492 segments. This may work since stack frames tend to be the
493 same size. Is there a way of determining the location of
494 text segments without reading the ELF files? Maybe just
495 check if it's inside an executable mapping.
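The "identical unless a differing word points into text" test might be sketched as follows. The TextRange type and function names are made up, and the ranges are assumed to come from the executable mappings of the process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One executable mapping, covering addresses [start, end). */
typedef struct {
    uint64_t start;
    uint64_t end;
} TextRange;

static bool
in_text (const TextRange *ranges, size_t n_ranges, uint64_t addr)
{
    for (size_t i = 0; i < n_ranges; i++)
        if (addr >= ranges[i].start && addr < ranges[i].end)
            return true;
    return false;
}

/* Two raw stacks of equal length are considered identical if every
 * position where they differ holds non-text values in both of them. */
static bool
stacks_equivalent (const uint64_t *a, const uint64_t *b, size_t n_words,
                   const TextRange *ranges, size_t n_ranges)
{
    assert (a != NULL && b != NULL);

    for (size_t i = 0; i < n_words; i++) {
        if (a[i] == b[i])
            continue;
        if (in_text (ranges, n_ranges, a[i]) ||
            in_text (ranges, n_ranges, b[i]))
            return false;
    }
    return true;
}
```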
497 It is then sufficient in user space to only store one
498 representative for each set of considered-identical stack
501 User space storage: Use the stackstash tree. When a new trace
502 is added, just skip over nodes that differ, but where none of
503 them points to text segments. Two possibilities then:
505 - when two traces are determined to differ, store them
506 in completely separate trees. This ensures that we
507 will never run the dwarf algorithm on an invalid
508 stack trace, but also means that we won't get shared
509 prefixes for stacktraces.
511 - when two traces are determined to differ, branch off
512 as currently. This will share more data, but the
513 dwarf algorithm could be run on invalid traces. It
514 may work in practice though if the compiler
515 generally uses fixed stack frames.
517 A twist on this is to mark the complete stack traces as
518 "complete". Then after running the DWARF algorithm,
519 the generated stack trace can be saved with it. This
520 way incomplete stack traces branching off a complete
521 one can be completed using the DWARF information for
524 * Notes on heuristic stack walking
526 - We can reject addresses that point exactly to the beginning of a
527 function since these are likely callbacks. Note though that the
528 first time a function in a shared library is called, it goes
529 through dynamic linker resolution which will cause the stack to
530 contain a callback of the function. This needs to be investigated
533 - We are already rejecting addresses outside the text section
534 (addresses of global variables and the like).
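The two rejection rules above can be put together as in the sketch below. It assumes a single text section [text_start, text_end) and a flat array of known function start addresses; both are simplifications, and the names are not taken from the existing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr is exactly the start of a known function. */
static bool
is_function_start (const uint64_t *starts, size_t n_starts, uint64_t addr)
{
    for (size_t i = 0; i < n_starts; i++)
        if (starts[i] == addr)
            return true;
    return false;
}

/* Copy the words of stack[] that look like return addresses into out[].
 * Returns the number of addresses written. */
static size_t
heuristic_walk (const uint64_t *stack, size_t n_words,
                uint64_t text_start, uint64_t text_end,
                const uint64_t *starts, size_t n_starts,
                uint64_t *out, size_t max_out)
{
    size_t n_out = 0;

    assert (text_start <= text_end);

    for (size_t i = 0; i < n_words && n_out < max_out; i++) {
        uint64_t word = stack[i];

        if (word < text_start || word >= text_end)
            continue;   /* outside the text section: data, not code */
        if (is_function_start (starts, n_starts, word))
            continue;   /* offset 0 into a function: likely a callback */
        out[n_out++] = word;
    }
    return n_out;
}
```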
536 * How to get the user stack:
538 /* In principle we should use get_task_mm() but
539 * that will use task_lock() leading to deadlock
540 * if somebody already has the lock
542 if (spin_is_locked (&current->alloc_lock))
543 printk ("alreadylocked\n");
545 struct mm_struct *mm = current->mm;
548 printk (KERN_ALERT "stack size: %d (%d)\n",
549 mm->start_stack - regs->REG_STACK_PTR,
552 stacksize = mm->start_stack - regs->REG_STACK_PTR;
558 * If interrupt happens in kernel mode, send both
559 kernel stack and user space stack, have userspace stitch them
560 together. Well, they could be stitched together in the kernel.
561 Already done: we now take a stacktrace of the user space process
562 when the interrupt happens in kernel mode. (Unfortunately, this
563 causes lockups on many kernels (though I haven't seen that for
566 We don't take any stacktraces of the kernel though. Things that
569 - does the kernel come with dwarf debug information?
570 - does the kernel come with some other debug info
571 - is there a place where the vmlinux binary is usually
572 placed? (We should avoid any "location of vmlinux" type
573 questions if at all possible).
575 We do now copy the kernel stack to userspace and do a
576 heuristic stack walk there. It may be better at some point to
577 use dump_trace() in the kernel since that will do some
580 Notes about kernel symbol lookup:
582 - /proc/kallsyms is there, but it includes things like labels.
583 There is no way to tell them from functions
585 - We can search for a kernel binary with symbols. If the
586 kernel-debug package is installed, or if the user compiled
587 his own kernel, we will find one. This is a regular elf file.
588 It also includes labels, but we can tell by the st_info field
591 Note though that for some labels we do actually want to
592 treat them as functions. For example the "page_fault" label,
593 which is a function by itself. We can recognize these by the
594 fact that their symbols have a size. However, the _start
595 function in normal applications does not have a size, so the
596 heuristic should be something like this:
598 - If the address is inside the range of some symbol, use
601 - Otherwise, if the closest symbol is a function with
602 size 0, use that function.
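A sketch of the two-step lookup heuristic. It assumes the symbols are sorted by ascending address and reads "closest symbol" as the nearest symbol at or below the address; the Symbol struct is invented for the example and is not the elfparser one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t    address;
    uint64_t    size;         /* 0 if the symbol has no size */
    bool        is_function;  /* false for plain labels */
    const char *name;
} Symbol;

/* syms[] must be sorted by ascending address. Returns NULL if neither
 * rule of the heuristic produces a symbol for addr. */
static const Symbol *
lookup_symbol (const Symbol *syms, size_t n_syms, uint64_t addr)
{
    const Symbol *closest = NULL;

    assert (syms != NULL || n_syms == 0);

    for (size_t i = 0; i < n_syms; i++) {
        if (syms[i].address > addr)
            break;

        /* Rule 1: the address is inside the range of this symbol. */
        if (addr - syms[i].address < syms[i].size)
            return &syms[i];

        closest = &syms[i];
    }

    /* Rule 2: the closest symbol is a function with size 0. */
    if (closest && closest->is_function && closest->size == 0)
        return closest;

    return NULL;
}
```

A linear scan keeps the sketch short; with sorted symbols a binary search would do for rule 2, but rule 1 needs care when a sized symbol overlaps later ones.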
604 This means the datastructure will probably have to be done a
607 - See if there is a way to make it distcheck
609 - grep "FIXME - not10"
612 - translation should be hooked up
614 - Consider adding "at least 5% inclusive cost" filter
616 - consider having the ability to group a function together with its nearest
617 neighbours. That way we can eliminate some of the effect of
618 "one function taking 10% of the time"
620 "the same function broken into ten functions each taking 1%"
621 Not clear what the UI looks like though.
623 - Ability to generate "screenshots" suitable for mail/blog/etc
624 UI: "generate screenshot" menu item pops up a window with
625 a text area + radio buttons "text/html". When you flick
626 them, the text area is automatically updated.
628 - why does the window not remember its position when
629 you close it with the close button, but does remember
630 it when you use the wm button or the menu item? It actually
631 seems that it only forgets the position when you click the
632 button with the mouse. But not if you use the keyboard ...
635 - Find out how gdb does backtraces; they may have a better way. Also
636 find out what dwarf2 is and how to use it. Look into libunwind.
637 It seems gdb is capable of doing backtraces of code that neither has
638 a framepointer nor has debug info. It appears gdb uses the contents
639 of the ".eh_frame" section. There is also an ".eh_frame_hdr" section.
641 http://www.linuxbase.org/spec/booksets/LSB-Embedded/LSB-Embedded/ehframe.html
643 look in dwarf2-frame.[ch] in the gdb distribution.
645 Also look at bozo-profiler
646 http://cutebugs.net/bozo-profiler/
647 which has an elf32 parser/debugger
649 - Make busy cursors more intelligent
650 - when you click something in the main list and we don't respond
651 within 50ms (or perhaps when we expect to not be able to do
652 so (can we know the size in advance?))
653 - instead of what we do now: set the busy cursor unconditionally
655 - Consider adding ability to show more than one function at a time. Algorithm:
656 Find all relevant nodes;
657 For each relevant node
658 best_so_far = relevant node
661 best_so_far = relevant
662 add best_so_far to interesting
666 add trace to tree (leaf, interesting)
668 - Consider adding KDE-style nested callgraph view
669 - probably need a dependency on gtk+ 2.8 (cairo) for this.
670 - Matthias has code for something like this.
671 - See http://www.marzocca.net/linux/baobab.html
672 - Also see http://www.cs.umd.edu/hcil/treemap-history/index.shtml
674 - Add support for line numbers within functions
675 - Possibly a special "view details" mode, assuming that
676 the details of a function are not that interesting
677 together with a tree. (Could add radio buttons somewhere
678 in the right pane). Or tabs.
679 - Open a new window for the function.
681 - Add view->ancestors/descendants menu items
683 - rethink caller list, not terribly useful at the moment. Federico suggested
684 listing all ancestors.
685 Done: implemented this idea in CVS HEAD. If we keep it that way,
686 should do a global s/callers/ancestors on the code.
687 - not sure it's an improvement. Often it is more interesting to
688 find the immediate callers.
689 - Now it's back to just listing the immediate callers.
691 - Figure out how Google's pprof script works. Then add real call graph
692 drawing. (google's script is really simple; uses dot from graphviz).
693 KCacheGrind also uses dot to do graph drawing.
695 - hide internal stuff in ProfileDescendant
699 - Multithreading is possible in a number of places.
701 - If the stack trace ends in a memory access instruction, send the
702 vma information to userspace. Then have user space
703 produce statistics on what types of memory are accessed.
705 - somehow get access to VSEnterprise profiler and see how it works.
706 somehow get access to vtune and see how it works.
708 - On SMP systems interrupts happen unpredictably, including when another
709 one is running. Right now we are ignoring any interrupts that happen
710 when another one is running, but we should probably just save the data
713 - Find out if sysprof accurately reports time spent handling pagefaults.
714 There is evidence that it doesn't:
715 - a version of sysprof took 10 seconds to load a certain profile.
716 Profiling itself it appeared that most of the time was spent
717 in the GMarkup parser
718 - a newer version of sysprof with significantly more compact
719 Instructions structure took about 5 seconds, but the profile
720 looked about the same.
721 The difference between the two versions has to be in page faults/
722 memory speed, but the profiles looked similar.
723 Try and reproduce this in a more controlled experiment.
725 - See if it is possible to group the X server activity under the process that
729 [Is this worth it? You will often want to start it as root,
730 and you will need to insert the module from the command line]
732 - Applications should be able to say "start profiling", "stop profiling"
733 so that you can limit the profiling to specific areas.
735 Add a new kernel interface that applications use to say
737 Then add a timeline where you can mark interesting regions,
738 for example those that applications have marked interesting.
740 - Find out how to hack around gtk+ bug causing multiple double clicks
743 - Consider what it would take to take stacktraces of other languages such
744 as perl, python, java, ruby, or bash. Or scheme.
746 Possible solution is for the script binaries to have a function
747 called something like
749 __sysprof__generate_stacktrace (char **functions, int *n_functions);
751 that the sysprof kernel module could call (and make return to the kernel).
753 This function would behave essentially like a signal handler: couldn't
754 call malloc(), couldn't call printf(), etc.
756 Note though that scripting languages will generally have a stack with
757 both script-binary-stack, script stack, and library stacks. We wouldn't
758 want scripts to need to parse dwarf. Also if we do that thing with
759 sending the entire stack to userspace, things will be further
762 Also note languages like scheme that use heap allocated activation
765 - Consider this usecase:
766 Someone is considering replacing malloc()/free() with a freelist
767 for a certain data structure. All use of this data structure is
768 confined to one function, foo(). It is now interesting to know
769 how much time that particular function spends on malloc() and free()
775 - find an instance of malloc()
777 - all traces with malloc are removed
778 - a new item "..." appears immediately below foo()
779 - malloc is added below "..."
781 - at this point, the desired data can be read at cumulative
784 Actually, with this UI, you could potentially get rid of the
785 caller list: Just present the call tree under an <everything> root,
786 and use ... to single out the stuff you are interested in.
788 Maybe also get rid of 'callers' by having a new "show details"
791 The complete solution here degenerates into "expressions":
793 "foo" and ("malloc" or "free")
795 Having that would also take care of the "multiple functions"
796 above. No one would understand it, though.
798 - figure out a way to deal with both disk and CPU. Need to make sure that
799 things that are UNINTERRUPTIBLE while there are RUNNING tasks are not
800 considered bad. Also figure out how to deal with more than one CPU/core.
802 Not entirely clear that the sysprof visualization is right for disk.
804 Maybe assign a size of n to traces with n *unique* disk accesses (ie.
805 disk accesses that are not required by any other stack trace).
807 Or assign values to nodes in the calltree based on how many diskaccesses
808 are contained in that tree. Ie., if I get rid of this branch, how many
809 disk accesses would that get rid of.
811 Or turn it around and look at individual disk accesses and see what it
812 would take to get rid of it. Ie., a number of traces are associated with
813 any given diskaccess. Just show those.
815 Or for a given tree with contained disk accesses, figure out what *other*
816 traces have the same diskaccesses.
818 Or visualize a set of squares with a color that is more saturated depending
819 on the number of unique stack traces that access it. Then look for the
820 lightly saturated ones.
822 The input to the profiler would basically be
824 (stack trace, badness, cookie)
826 For CPU: badness=10ms, cookie=<a new one always>
827 For Disk: badness=<calculated based on previous disk accesses>, cookie=<the accessed disk block>
829 For Memory: badness=<cache line size not in cache>, cookie=<the address>
831 Cookies are used to figure out whether an access is really the same, ie., for two identical
832 cookies, the size is still just one, however
834 Memory is different from disk because you can't reasonably assume
835 that stuff that has been read will stay in cache (for short profile
836 runs you can assume that with disk, but not for long ones).
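One possible reading of the cookie rule, sketched under the assumption that samples with an identical cookie should contribute their badness only once. The stack trace part of the tuple is left out, and the Sample type is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* (stack trace, badness, cookie) with the stack trace omitted. */
typedef struct {
    uint64_t cookie;    /* e.g. disk block or memory address */
    uint64_t badness;   /* e.g. cost of the access */
} Sample;

/* Sum the badness of all samples, counting each distinct cookie once.
 * Quadratic in n; good enough to illustrate the rule. */
static uint64_t
total_badness (const Sample *samples, size_t n)
{
    uint64_t total = 0;

    assert (samples != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        bool seen = false;

        for (size_t j = 0; j < i; j++) {
            if (samples[j].cookie == samples[i].cookie) {
                seen = true;
                break;
            }
        }
        if (!seen)
            total += samples[i].badness;
    }
    return total;
}
```

For CPU samples every cookie is new, so this degenerates into a plain sum.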
838 - Perhaps show a timeline with CPU in one color and disk in one
839 color. Allow people to look at subintervals of this
840 timeline. Is it useful to look at both CPU and disk at the same
841 time? Probably not. See also marker discussion above. UI should
842 probably allow double clicking on a marked section and all
843 instances of that one would be marked.
845 - This also allows us to show how well multicore CPUs are being used.
847 - Other variation on the timeline idea: Instead of a disk timeline you could have a
848 list of individual diskaccesses, and be able to select the ones you wanted to
851 - The existing sysprof visualization is not terribly bad, the "self" column is
854 - See what files are accessed so that you can get a better idea of what
857 - Optimization usecases:
859 - A lot of stuff is read synchronously, but it is possible to read
861 Visualization: A timeline with alternating CPU/disk activity.
863 - What function is doing all the synchronous reading, and what
864 files/offsets is it reading. Visualization: lots of reads across
865 different files out of one function
867 - A piece of the program is doing disk I/O. We can drop that
868 entire piece of code. Sysprof visualization is ok, although seeing
869 the files accessed is useful so that we can tell if those files are
870 not just going to be used in other places. (Gnumeric plugin_init()).
872 - A function is reading a file synchronously, but there is other
873 (CPU/disk) stuff that could be done at the same time. Visualization:
874 A piece of the timeline is diskbound with little or no CPU used.
876 - Want to improve code locality of library or binary. Visualization:
877 no GUI, just produce a list of functions that should be put first in
878 the file. Then run the program again until the list converges.
879 (Valgrind may be more useful here).
881 - Nautilus reads a ton of files, icons + all the files in the
882 homedirectory. Normal sysprof visualization is probably useful
885 - Profiling a login session.
887 - Many applications are running at the same time, doing IPC. It would
888 be useful if we could figure out what other things a given process
889 is waiting on. Eg., in poll, find out what processes have the other
890 ends of the fd's open.
891 Visualization: multiple lines on a graph. Lines join up where
892 one process is blocking on another. That would show processes holding
893 up the progress very clearly.
894 This was suggested by Federico.
896 - Need to report stat() as well. (Where do inode data end up? In the
897 buffer-cache?) Also open() may cause disk reads (seeks).
899 - To generate the timeline we need to know when a disk request is
900 issued and when it is completed. This way we can assign blame to all
901 applications that have issued a disk request at a given point in time.
903 The disk timeline should probably vary in intensity with the number
904 of outstanding disk requests.
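
A minimal sketch of the intensity idea, assuming each disk request is recorded with an issue time and a completion time. The DiskRequest type, the tick units and both function names are made up for illustration; nothing like this exists in the tree yet.

```c
#include <stddef.h>

/* Hypothetical record of one disk request; times are in arbitrary ticks. */
typedef struct {
    long issued;     /* when the request was issued */
    long completed;  /* when the request completed */
} DiskRequest;

/* Number of requests that are outstanding at time t, i.e. issued but
 * not yet completed. This is the value the timeline intensity would
 * follow. */
int outstanding_at(const DiskRequest *reqs, size_t n, long t)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (reqs[i].issued <= t && t < reqs[i].completed)
            count++;
    }
    return count;
}

/* Sample the outstanding count at regular intervals, one value per
 * timeline slot. */
void fill_intensity(const DiskRequest *reqs, size_t n,
                    long start, long step,
                    int *slots, size_t n_slots)
{
    for (size_t s = 0; s < n_slots; s++)
        slots[s] = outstanding_at(reqs, n, start + (long)s * step);
}
```

Blame for a slot could then be split among the applications owning the requests counted there. The linear scan per slot is only meant to show the idea.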
907 -=-=-=-=-=-=-=-=-=-=-=-=-=-=- ALREADY DONE: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
909 * Find out what is going on with kernel threads:
911 [(ksoftirqd/0)] 0.00 0.03
912 No map ([(ksoftirqd/0)]) 0.00 0.03
915 __do_softirq 0.00 0.03
917 * Make sure there aren't leftover stacktraces from last time when
920 * Is the move-to-front in process_locate_map() really worth it?
922 * Whenever we fail to lock the atomic variable, track this, and send the
923 information to userspace as an indication of the overhead of the profiling.
924 Although there is inherent aliasing here since stack scanning happens at
927 * Apparently, if you upgrade the kernel, then don't re-run configure,
928 the kernel Makefile will delete all of /lib/modules/<release>/kernel
929 if you run make install in the module directory. Need to find out what
933 Switching between descendant views is slow:
934 - gtk_tree_store_get_path() is O(n^2) and accounts
936 - GObject signal emission overhead accounts for 18% of
938 Consider adding a forked version of GtkTreeStore with
939 performance and bug fixes.
941 * If we end up believing the kernel's own stacktraces, maybe
942 /proc/kallsyms shouldn't be parsed until the user hits profile.
944 * Make it compilable against a non-running kernel.
946 * With kernel module not installed, select Profiler->Start, then dismiss
947 the alert. This causes the start button to appear prelighted. Probably
948 just another gtk+ bug.
950 - Fix bugs/performance issues:
951 - add_trace_to_tree() might be a little slow when dealing with deeply
952 recursive profiles. Hypothesis: seen_nodes can grow large, and the
953 algorithm is O(n^2) in the length of the trace.
955 - Have kernel module report the file the address was found in
956 Should avoid a lot of potential brokenness/raciness with dlopen etc.
957 Probably better to send a list of maps with each trace. Which
958 shouldn't really be that expensive. We already walk the list of
959 maps in process_ensure_map() on every trace. And we can do hashing
960 if it turns out to be a problem.
961 Maybe replace the SysprofStackTrace with a union, so that
962 it can be either a list of maps, or a stacktrace. Both map lists and
963 stacktraces would come with a hashcode, allowing userspace to recognize them. This avoids
964 the problem that maps could take up a lot of extra bandwidth.
979 char filenames [2048];
982 - possibly add dependency on glib 2.8 if it is released at that point.
985 * Some notes about timer interrupt handling in Linux
987 On an SMP system APIC is used - the interesting file is arch/i386/kernel/apic.c
989 On UP systems, the normal IRQ0 is used
990 When the interrupt happens,
992 calls do_IRQ, which sets up the special interrupt stack,
993 and calls __do_IRQ, which is in /kernel/irq/handle.c.
994 This calls the corresponding irqaction, which has previously
995 been setup by arch/i386/mach-default/setup.c to point to
996 timer_interrupt, which is in arch/i386/kernel/time.c.
997 This calls do_timer_interrupt_hooks() which is defined in
998 /include/asm-i386/mach-default/do_timer.h. This function
999 then calls profile_tick().
1001 Note when the CPU switches from user mode to kernel mode, it
1002 pushes SS/ESP on top of the kernel stack, but when it switches
1003 from kernel mode to kernel mode, it does _not_ push SS/ESP.
1004 It does in both cases push EIP though.
1006 * Rename sysprof-text to sysprof-cli
1008 * Find out why the samples label won't right adjust
1010 * It crashes sometimes.
1012 I haven't seen any crashes in a long time
1014 * Find out why the strings
1016 _ZL11DisplayLineP20nsDisplayListBuilderRK6nsRectS3_R19nsLineList_iteratoriRiRK16nsDisplayListSetP12nsBlockFrame
1017 _ZL11DisplayRowsP20nsDisplayListBuilderP7nsFrameRK6nsRectRK16nsDisplayListSet _ZL11DrawBordersP10gfxContextR7gfxRectS2_PhPdS4_PjPP14nsBorderColorsijiP6nsRect _ZL11HandleEventP10nsGUIEvent
1018 _ZL12IsContentLEQP13nsDisplayItemS0_Pv
1019 _ZL15expose_event_cbP10_GtkWidgetP15_GdkEventExpose
1021 do not get demangled.
1023 * For glibc, the debug files do not contain .strtab and .symtab, but
1024 the original files do. The algorithm in binfile.c must be modified
1027 * If we profile something that is not very CPU bound, sysprof itself
1028 seems to get a disproportionate amount of the samples. Should look
1029 into this. Fixed by only returning from poll when there is more
1030 than eight traces available.
1032 * regarding crossing system call barriers: Find out about the virtual dso
1033 that linux uses to do fast system calls:
1035 http://lkml.org/lkml/2002/12/18/218
1037 and what that actually makes the stack look like. (We may want to just
1038 special case this fake dso in the symbol lookup code).
1040 Maybe get_user_pages() is the way forward at least for some stuff.
1042 note btw. that the kernel pages are only one or two pages, so we
1043 could easily just dump them to userspace.
1045 * In profile.c, change "non_recursive" to "cumulative", and
1046 "marked_non_recursive" to a boolean "charged". This is tricky code,
1047 so be careful. Possibly make it a two-pass operation:
1048 - first add the new trace
1049 - then walk from the leaf, charging nodes
1050 That would allow us to get rid of the marked field altogether. In fact,
1051 maybe the descendants tree could become a stackstash. We'll just have
1052 to make stack_stash_add_trace() return the leaf.
1054 DONE: the name is now "cumulative"
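
A rough sketch of the second pass (walking from the leaf and charging nodes), assuming the first pass has already added the trace and returned its leaf. The Node layout, function_id and function_totals are invented for the example and do not match profile.c. A function is charged once per trace by checking whether it also appears higher up the path, so no marked field is needed.

```c
#include <stddef.h>

/* Hypothetical tree node; the real structures in profile.c differ. */
typedef struct Node Node;
struct Node {
    Node    *parent;       /* NULL for the root */
    int      function_id;  /* which function this node stands for */
    unsigned self;
    unsigned cumulative;
};

/* Returns 1 if the same function occurs on the path above 'node'. */
static int appears_above(const Node *node)
{
    for (const Node *n = node->parent; n != NULL; n = n->parent) {
        if (n->function_id == node->function_id)
            return 1;
    }
    return 0;
}

/* Second pass: charge every node on the path from leaf to root.
 * Per-function totals are charged only at the outermost occurrence,
 * so recursion does not count a function twice for one trace. */
void charge_from_leaf(Node *leaf, unsigned *function_totals)
{
    leaf->self++;

    for (Node *n = leaf; n != NULL; n = n->parent) {
        n->cumulative++;
        if (!appears_above(n))
            function_totals[n->function_id]++;
    }
}
```

The ancestor scan makes this quadratic in the trace depth, which is the price paid here for dropping the marked field.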
1058 - assume it's the same across processes, just look at
1060 Done: vdso is done now
1061 - send copy of it to userspace once, or for every
1065 - decorate_node should be done lazily
1066 - Find out why we sometimes get completely ridiculous stacktraces,
1067 where main seems to be called from within Xlib etc. This happens
1068 even after restarting everything.
1069 - It looks like the stackstash-reorg code confuses "main" from
1070 unrelated processes. - currently it looks like if multiple
1071 "main"s are present, only one gets listed in the object list.
1072 Seems to mostly happen when multiple processes are
1074 - Numbers in caller view are completely screwed up.
1075 - It looks like it sometimes gets confused with similar but different
1076 processes: Something like:
1077 process a spends 80% in foo() called from bar()
1078 process b spends 1% in foo() called from baz()
1079 we get reports of baz() using > 80% of the time.
1082 * commandline version should check that the output file is writable
1083 before starting the profiling.
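
One possible way to do the check, as a sketch only: try to open the output file for writing before any profiling starts. The function name is made up. Note that this creates an empty file if none exists, which may or may not be acceptable.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if 'path' can be opened for writing (creating it if
 * necessary), 0 otherwise. An existing file is not truncated. */
int output_file_is_writable(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0666);

    if (fd < 0)
        return 0;

    close(fd);
    return 1;
}
```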
1085 * See if we can reproduce the problem where libraries didn't get correctly
1086 reloaded after new versions were installed.
1087 This is just the (deleted) problem. Turns out that the kernel
1088 doesn't print (deleted) in all cases. Some possibilities:
1090 - check that the inodes of the mapped file and the disk file
1091 are the same (done in HEAD).
1093 - check that the file was not modified after being mapped?
1094 (Can we get the time it was mapped or opened?) If it was
1095 modified you'd expect the inode to change, right?
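
A sketch of the inode comparison, assuming the inode of the mapping has already been parsed out of /proc/<pid>/maps by the caller. The function name is hypothetical, and a complete check would also compare the device numbers, since inodes are only unique per filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if the file currently at 'path' has the same inode as the
 * file that was mapped, 0 if it differs or cannot be examined. */
int file_matches_mapping(const char *path, ino_t mapped_inode)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;   /* file is gone or unreadable */

    return st.st_ino == mapped_inode;
}
```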
1097 * Find out if the first sort order of a GtkTreeView column can be
1098 changed programmatically. It can't (and the GTK+ bug was wontfixed).
1099 A workaround is possible though. (Someone, please write a
1100 GtkTreeView replacement!)
1102 * Missing things in binparser.[ch]
1104 - maybe convert BIN_UINT32 => { BIN_UINT, 4 }
1105 we already have the width in the struct.
1107 * Rethink binparser. Maybe the default mode should be:
1108 - there is a current offset
1109 - you can move the cursor
1112 - you can read structs with "begin_struct (format) / end_struct()"
1113 Or maybe just "set_format()" that would accept NULL?
1114 - when you are reading a struct, you can skip records with _index()
1115 - you can read fields with get_string/get_uint by passing a name.
1116 - you can read anonymous strings and uints by passing NULL for name
1117 This is allowed even when not reading structs. Or maybe this
1118 should be separate functions. Advantages:
1119 - they can skip ahead, unlike field accessors
1120 - you can access specific types (8,16,32,64)
1121 - there is no "name" field
1123 - the field accessors would need renaming.
1124 bin_parser_get_uint_field ()
1125 is not really that bad though.
1126 Maybe begin_record() could return a structure you could
1127 use to access that particular record? Would nicely solve
1128 the problems with "goto" and "index".
1129 bin_record_get_uint();
1130 What should begin/end be called? They will have different
1132 bin_parser_get_record (parser) -> record
1133 bin_record_free (record);
1134 - Maybe support for indirect strings? Ie., get_string() in elfparser
1135 - This will require endianness to be a per-parser property. Which is
1136 probably just fine. Although d-bus actually has
1137 per-message endianness. Maybe there could be a settable
1138 "endianness" property.
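
To make the cursor idea concrete, here is a small sketch of a parser with a current offset, a movable cursor, width-specific anonymous reads and a settable per-parser endianness. BinCursor and the bin_cursor_* names are invented for this note and are not the existing binparser API.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { BIN_LITTLE_ENDIAN, BIN_BIG_ENDIAN } BinEndian;

/* Hypothetical cursor-style parser state. */
typedef struct {
    const uint8_t *data;
    size_t         length;
    size_t         offset;   /* the current offset */
    BinEndian      endian;   /* settable, per parser */
    int            error;    /* set when a read runs past the end */
} BinCursor;

/* Move the cursor. */
void bin_cursor_seek(BinCursor *c, size_t offset)
{
    c->offset = offset;
}

/* Read an anonymous unsigned integer of 'width' bytes (1 to 8) at the
 * current offset and skip ahead past it. */
uint64_t bin_cursor_get_uint(BinCursor *c, size_t width)
{
    uint64_t value = 0;

    if (width == 0 || width > 8 ||
        c->offset > c->length || width > c->length - c->offset) {
        c->error = 1;
        return 0;
    }

    for (size_t i = 0; i < width; i++) {
        /* Always consume the most significant byte first. */
        size_t idx = (c->endian == BIN_BIG_ENDIAN) ? i : width - 1 - i;
        value = (value << 8) | c->data[c->offset + idx];
    }

    c->offset += width;
    return value;
}
```

Because the width is an argument, a single function covers the 8, 16, 32 and 64 bit cases.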
1140 * Don't look in $(libdir) for separate debug files (since $libdir is
1141 the libdir for sysprof, not a system wide libdir). Tim Rowley.
1142 Fix is probably to hardcode /usr/lib, and also look in $libdir.
1144 * Consider deleting cmdline hack in process.c and replace with something at
1145 the symbol resolution level. Will require more memory though. DONE: in
1146 head, processes are no longer coalesced based on cmdline. Need to add something
1147 at the symbol level.
1149 * don't loop infinitely if there are cycles in the debuglink graph.
1151 * Add "sysprof --version"
1153 * Fix (potential) performance issues in symbol lookup.
1155 - when an elf file is read, it should be checked that the various
1156 sections are of the right type. For example the debug information
1157 for emacs is just a stub file where all the sections are NOBITS.
1159 * Try reproducing crash when profiling xrender demo
1160 - it looks like it crashes when it attempts to read /usr/bin/python
1161 - apparently what's going on is that one of the symbols in python's
1162 dynamic symbol table has a completely crazy 'st_name' offset.
1163 DONE: we didn't actually need to read the name at all,
1164 but still should find out why that value is so weird.
1165 It looks like there is something strange going on with that file.
1166 All the dynsyms have weird info/type values, yet nm and readelf
1167 have no problems displaying it.
1169 - Can .gnu_debuglink recurse?
1170 yes, it can, and we should probably not crash if there are
1171 cycles in the graph.
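
A sketch of one way to follow debuglinks without looping forever: remember the files already visited and also cap the chain length. The callback type, the depth limit of 16 and demo_links() are assumptions made for the example; the real lookup would read the .gnu_debuglink section.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEBUGLINK_DEPTH 16   /* arbitrary limit for this sketch */

/* Hypothetical callback: returns the debuglink target of 'file',
 * or NULL if the file has none. */
typedef const char *(*DebuglinkFunc)(const char *file);

/* Follow the chain of debuglinks starting at 'file' and return the
 * last file reached. Stops at a cycle or when the chain is too deep. */
const char *follow_debuglinks(const char *file, DebuglinkFunc get_link)
{
    const char *seen[MAX_DEBUGLINK_DEPTH];
    size_t n_seen = 0;

    while (n_seen < MAX_DEBUGLINK_DEPTH) {
        const char *next;

        seen[n_seen++] = file;
        next = get_link(file);
        if (next == NULL)
            return file;                 /* end of the chain */

        for (size_t i = 0; i < n_seen; i++) {
            if (strcmp(seen[i], next) == 0)
                return file;             /* cycle: stop here */
        }
        file = next;
    }
    return file;                         /* too deep, give up */
}

/* Demo link table: "a" and "a.debug" point at each other. */
static const char *demo_links(const char *file)
{
    if (strcmp(file, "a") == 0)       return "a.debug";
    if (strcmp(file, "a.debug") == 0) return "a";
    if (strcmp(file, "b") == 0)       return "b.debug";
    return NULL;
}
```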
1173 * Find out why we are getting bogus symbols reported for /usr/bin/Xorg
1176 Everything 0.00 100.00
1177 [/usr/bin/Xorg] 0.00 94.79
1178 GetScratchPixmapHeader 0.00 94.79
1179 __libc_start_main 0.00 94.79
1180 FindAllClientResources 0.00 94.79
1181 FreeFontPath 0.00 94.79
1182 SProcRenderCreateConicalGradient 0.00 94.56
1183 ProcRenderTrapezoids 0.00 94.56
1184 AllocatePicture 0.00 94.56
1185 __glXDispatch 0.00 0.16
1186 __glXVendorPrivate 0.00 0.08
1187 __glXRender 0.00 0.08
1188 SmartScheduleStartTimer 0.00 0.08
1189 [./sysprof] 0.00 2.76
1190 [sshd: ssp@pts/0] 0.00 2.37
1193 What's going on here is that the computed load address for the X server
1194 binary is different for the debug binary. The lowest allocated address
1195 is 0x08047134 for the normal, and 0x8048134 for the debug. But it looks
1196 like the addresses are still the same for the symbols.
1197 The root of this problem may be that we should ignore the load
1198 address of the debug binary, and just lookup the address computed.
1199 The *text* segments have the same address though. Everything from
1200 "gnu version" on has the same address.
1203 - find out where in memory the text segment is
1204 - take an address and compute its offset to the text segment
1205 - in elf parser, find address of text segment
1207 - lookup resulting address
1209 So basically, "load address" should really be text address. Except of course
1210 that load_address is not used in process.c - instead the 'load address' of the
1211 first part of the file is computed and assumed to be equivalent to the
1212 load address. So to lookup something you probably actually need
1213 to know the load/offset at the elf parser level:
1215 lookup_symbol (elf, map, offset, address)
1219 real load address of text (lta) = map - offset + text_offset
1221 offset of called func (ocf): addr - lta
1223 thing to lookup in table: ocf + text_addr.loadaddr in debug
1225 addr - map - offset + text_offset
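
The arithmetic above, written out as a sketch. The function and parameter names are made up, and text_addr stands for the address of the text section as recorded in the (debug) ELF file.

```c
#include <stdint.h>

/* Translate a sampled address into the address to look up in the
 * symbol table, following the steps in the notes:
 *
 *   lta = map - offset + text_offset   (real load address of text)
 *   ocf = addr - lta                   (offset of the called function)
 *   result = ocf + text_addr
 */
uint64_t symbol_table_address(uint64_t addr,
                              uint64_t map,
                              uint64_t offset,
                              uint64_t text_offset,
                              uint64_t text_addr)
{
    uint64_t lta = map - offset + text_offset;
    uint64_t ocf = addr - lta;

    return ocf + text_addr;
}
```

With invented values map = 0x40000000, offset = 0, text_offset = 0x1000, text_addr = 0x08049000 and addr = 0x40001234, this gives lta = 0x40001000, ocf = 0x234 and a lookup address of 0x08049234.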
1230 * plug all the leaks
1231 - don't leak the presentation strings/objects
1232 - maybe add stack_stash_set_free_func() or something
1233 * Delete elf_parser_new() and rename elf_parser_new_from_file()
1235 * Add demangling again
1237 * Restore filename => binfile cache.
1239 * It is apparently possible to get another timer interrupt in the middle
1240 of timer_notify. If this happens the circular buffer gets screwed up and
1241 you get crashes. Note this gets much worse on SMP (in fact how did this
1242 work at all previously?)
1245 - have a "in timer notify" variable, then simply reject nested
1247 - keep a "ghost head" that timers can use to allocate new traces,
1248 then update the real head whenever one of them completes. Note
1249 though, that the traces will get generated in the wrong order
1250 due to the nesting. In fact, only the outermost timernotify
1251 can update the real head, and it should update it to ghost_head.
1252 - do correct locking? Nah, that's crazy talk
1253 Also note: a race is a race, and on SMP we probably can't even make it
1254 unlikely enough to not matter.
1256 Fixed by ignoring the nested interrupts using an atomic variable.
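
A userspace analogue of the fix, using C11 atomics instead of the kernel's atomic types. It only illustrates the pattern of rejecting nested entries and counting the rejected ones; the names and the two demo callbacks are invented and this is not the module's code.

```c
#include <stdatomic.h>

static atomic_flag in_timer_notify = ATOMIC_FLAG_INIT;
static atomic_uint dropped_samples;   /* rejected nested calls */

/* Runs record_sample() unless another call is already in progress.
 * Returns 1 if the sample was recorded, 0 if it was rejected. */
int timer_notify_guarded(void (*record_sample)(void))
{
    if (atomic_flag_test_and_set(&in_timer_notify)) {
        /* Someone is already inside; drop this sample and count it. */
        atomic_fetch_add(&dropped_samples, 1);
        return 0;
    }

    record_sample();

    atomic_flag_clear(&in_timer_notify);
    return 1;
}

/* Demo callbacks: the second one simulates a nested interrupt. */
static void noop_sample(void)
{
}

static int nested_result = -1;

static void reentrant_sample(void)
{
    nested_result = timer_notify_guarded(noop_sample);
}
```

The dropped_samples counter is the kind of number that could be reported to userspace as an indication of profiling overhead.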
1258 * When you load a file from the commandline, there is a weird flash of the toolbar.
1259 What's going on is this:
1260 - this file is loaded into a tree model
1261 - the tree model is set for the function list
1262 - this causes the selection changed signal to be emitted
1263 - the callback for that signal processes updates
1264 - somehow in that update process, the toolbar flashes.
1265 - turns out to be a gtk+ issue: 350517
1267 - screenshot window must be cleared when you press start.
1269 - Formats should become first-class, stand-alone objects that offer
1270 help with parsing and nothing else.
1272 ParseContext* format_get_parse_context (format, err);
1273 gboolean parse_context_begin (parse_context, name, err);
1274 gboolean parse_context_end (parse_context, name, err);
1276 basically, a Format encapsulates a DFA, and a ParseContext encapsulates
1279 - make stackstash ref counted
1281 - Charge 'self' properly to processes that don't get any stack trace at all
1282 (probably we get that for free with stackstash reorganisation)
1284 - CVS head now has two radio buttons in the right pane, and
1285 caller pane is gone. (This turned out to be a bad idea, because it
1286 is often useful to click on ancestors to move up the tree).
1288 * Don't build the GUI if gtk+ is not installed
1290 * Find out why we sometimes get reports of time spent by [pid 0].
1292 * - Run a.out generated normally with gcc.
1300 At this point we should not get any symbols, but we do. There is some
1301 sort of bad caching going on.
1303 * support more than one reader of the samples properly
1304 - Don't generate them if no one cares
1307 - When the module is unloaded, kill all processes blocking in read
1308 - or block unloading until all processes have exited
1309 Unfortunately this is basically impossible to do with a /proc
1310 file (no open() notification). So, for 1.0 this will have to be
1311 a dont-do-that-then. For 1.2, we should do it with a sysfs and
1314 - When the module is unloaded, can we somehow *guarantee* that no
1315 kernel thread is active? Doesn't look like it; however we can
1316 get close by decreasing a ref count just before returning
1317 from the module. (There may still be return instructions etc.
1318 that will get run). This may not be an issue with the timer
1319 based scanning we are using currently.
1321 * Find out why we get hangs with rawhide kernels. This only happens with the
1322 'trace "current"' code. See this mail:
1324 http://mail.nl.linux.org/kernelnewbies/2005-08/msg00157.html
1326 esp0 points to top of kernel stack
1327 esp points to top of user stack
1329 (Reported by Kjartan Maraas).
1331 - When not profiling, sysprof shouldn't keep the file open.
1333 - Make things faster
1334 - Can I get it to profile itself?
1335 - speedprof seems to report that lots of time is spent in
1336 stack_stash_foreach() and also in generate_key()
1337 - add an 'everything' object. It is really needed for a lot of things
1338 - should be easy to do with stackstash reorganization.
1341 * Handle time being set back in the RESET_DEAD_PERIOD code.
1343 - total should probably be cached so that compute_total() doesn't
1344 take 80% of the time to generate a profile.
1346 - Fixing the oops in kernels < 2.6.11
1348 - Probably just require 2.6.11 (necessary for timer interrupt
1351 - Make the process waiting in poll() responsible for extracting
1352 the backtrace. Give a copy of the entire stack rather than doing
1353 the walk inside the kernel.
1357 one of actual scanned stacks
1358 one of tasks that need to be scanned
1362 - in read() wait for stack data:
1365 return -EWOULDBLOCK;
1368 while (!stack data) {
1374 scan_tasks() is a function that converts waiting
1375 tasks into data, and wakes them up.
1377 - in timer interrupt:
1379 if (someone waiting in poll() &&
1380 current && current != that_someone &&
1381 current is runnable)
1384 add current to queue;
1388 This way, we will have a real userspace process
1389 that can take the page faults.
1392 - Different approach:
1394 pollable file where a regular userspace process
1395 can read a pid. Any pid returned is guaranteed to be
1396 UNINTERRUPTIBLE. Userspace process is required to
1397 start it again when it is done with it.
1399 Also provide interface to read arbitrary memory of
1402 ptrace() could in principle do all this, but
1403 unfortunately it sucks to continuously
1408 Userspace process can register itself as "profiler"
1409 and pass in a filedescriptor where all sorts of
1410 information is sent.
1412 - could tie lifetime of module to profiler
1413 - could send "module going away" information
1414 - Can we map filedescriptors to files in
1417 * Make sure sysprof-text is not linked to gtk+
1419 * Consider renaming profiler.[ch] to collector.[ch]
1421 * Crash reported by Rudi Chiarito with n_addrs == 0.
1423 * Find out what distributions it actually works on
1424 (ask for success/failure-stories in 1.0 releases)
1426 * Add note in README about Ubuntu and Debian -dbg packages and how to get
1427 debug symbols for X there.
1431 - make loading and saving work again.
1432 - make stashes loadable and savable.
1433 - add a way to convert 1.0 files to stashes
1435 - Get rid of remaining uses of stack_stash_foreach(), then
1436 rename stack_stash_foreach_reversed() to
1437 stack_stash_foreach()
1439 - stackstash should just take traces of addresses without knowing
1440 anything about what those addresses mean.
1442 - stacktraces should then begin with a process
1444 - stackstash should be extended so that the "create_descendant"
1445 and "create_ancestor" code in profile.c can use it directly.
1446 At that point, get rid of the profile tree, and rename
1447 profile.c to analyze.c.
1449 - the profile tree will then just be a stackstash where the
1450 addresses are presentation strings instead.
1452 - Doing a profile will then amount to converting the raw stash
1453 to one where the addresses have been looked up and converted to
1454 presentation strings.
1458 - profile should take traces of pointers to presentation
1459 objects without knowing anything about these presentation
1462 - For each stack node, compute a presentation object
1463 (probably need to export opaque stacknode objects
1464 with set/get_user_data)
1466 - Send each stack trace to the profile module, along with
1467 presentation objects. Maybe just a map from stack nodes
1468 to presentation objects.
1470 - Make the Profile class use the stash directly instead of
1471 building its own copy.
1472 - store a stash in the profile class
1473 - make sure descendants and callers can be
1475 - get rid of other stuff in the profile
1481 - Update version numbers in source
1485 - Check that tarball works
1489 - cvs tag sysprof-1-0
1493 - Announce on Freshmeat
1495 - Announce on gnome-announce
1496 - Announce on kernel list.
1498 - Announce on Gnomefiles
1500 - Announce on news.gnome.org
1501 - Send to slashdot/developers
1502 - Announce on devtools list (?)
1504 - Announce on Advogato
1507 * The handling of the global variable in signal-handler.[ch] needs to be
1508 atomic - right now it isn't. The issue is what happens if a handled signal
1509 arrives while we are manipulating the list?
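
One conventional way to close this window, sketched under the assumption that the global is a linked list walked by the handler: block signals while the list is being modified. The Watch type and watch_list_prepend() are made-up names, not the contents of signal-handler.c. In a threaded program pthread_sigmask() would be needed instead of sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical list entry for one handled signal. */
typedef struct Watch Watch;
struct Watch {
    int    signo;
    Watch *next;
};

static Watch *watch_list;   /* the global that the handler also reads */

/* Prepend 'w' with all signals blocked, so a handler can never observe
 * a half-updated list. Returns 0 on success, -1 on failure. */
int watch_list_prepend(Watch *w)
{
    sigset_t all, old;

    sigfillset(&all);
    if (sigprocmask(SIG_BLOCK, &all, &old) != 0)
        return -1;

    w->next = watch_list;
    watch_list = w;

    /* Restore the previous mask; pending signals are delivered now. */
    return sigprocmask(SIG_SETMASK, &old, NULL);
}
```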
1511 * (User space stack must probably be done in a thread - kernel
1512 stack must probably be taken in the interrupt itself?
1513 - Why this difference? The page tables should still be loaded. Is it
1514 because pages_present() doesn't work? No, turning it off doesn't help.
1515 - It looks like this works. Get:
1517 struct pt_regs *user_regs =
1518 (void *)current->thread.esp0 - sizeof (struct pt_regs);
1520 then use pages_present as usual to trace with user_regs; There could be
1521 rare lockups though.
1523 * Non-GUI version that can save in a format the GUI can understand.
1524 Could be used for profiling startup etc. Would preferably be able to
1525 dump the data to a network socket. Should be able to react to eg.
1526 SIGUSR1 by dumping the data.
1528 Work done by Lorenzo:
1530 http://www.colitti.com/lorenzo/software/gnome-startup/sysprof-text.diff
1531 http://www.colitti.com/lorenzo/software/gnome-startup/sysprof.log
1532 http://colitti.com/lorenzo/software/gnome-startup/
1534 * consider caching [filename => bin_file]
1536 * Check the kernel we are building against, if it is SMP or
1537 less than 2.6.11, print a warning and suggest upgrading.
1539 * Timer interrupt based
1542 - Consider expanding a few more levels of a new descendants tree
1543 - Algorithm should expand in proportion to the
1544 "total" percentage. Basically consider 'total' the
1545 likelihood that the user is going to look at it.
1546 - Maybe just keep expanding the biggest total until
1547 there is no more room or we run out of things to expand.
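
The "keep expanding the biggest total" variant could look roughly like this. Row, find_best() and auto_expand() are hypothetical and independent of GtkTreeView; max_rows stands for however many rows fit in the view.

```c
#include <stddef.h>

/* Hypothetical row of the descendants tree. */
typedef struct Row Row;
struct Row {
    double  total;        /* the "total" percentage */
    Row   **children;
    size_t  n_children;
    int     expanded;
};

/* Among the currently visible rows under 'row', find the unexpanded
 * row with children that has the largest total. */
static Row *find_best(Row *row)
{
    Row *best = NULL;

    if (!row->expanded)
        return row->n_children > 0 ? row : NULL;

    for (size_t i = 0; i < row->n_children; i++) {
        Row *candidate = find_best(row->children[i]);

        if (candidate != NULL &&
            (best == NULL || candidate->total > best->total))
            best = candidate;
    }
    return best;
}

/* Expand the biggest total repeatedly until there is no more room or
 * nothing is left to expand. Returns the number of visible rows. */
size_t auto_expand(Row *root, size_t max_rows)
{
    size_t visible = 1;   /* the root row itself */

    for (;;) {
        Row *best = find_best(root);

        if (best == NULL || visible + best->n_children > max_rows)
            break;

        best->expanded = 1;
        visible += best->n_children;
    }
    return visible;
}
```

This version stops as soon as the biggest candidate does not fit, rather than trying smaller ones.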
1549 * Web page containing
1552 - Explanation of what it is
1556 - Ask for success/failure reports
1557 - hook up menu items view/start etc (or possibly get rid of them or
1559 - Should do as suggested in the automake manual in the
1560 chapter "when automake is not enough"
1561 - add an "insert-module" target
1562 - need to run depmod on install
1563 - If the current profile has a name, display it in the title bar
1565 - Find out if that PREFIX business in Makefile was really such
1567 - Should just install the kernel module if it is running as root, pop up
1568 a dialog if not. Note we must be able to start without module now,
1569 since it is useful to just load profiles from disk.
1570 - Is there a portable way of asking for the root password?
1571 - Install a small suid program that only inserts the module?
1572 (instant security hole ..)
1573 - Need to make "make install" work (how do you know where to install
1575 - in /lib/modules/`uname -r`/kernel/drivers/
1576 - need to run depmod as root after that
1577 - Then modprobe run as root should correctly find it.
1581 - give profiles on the command line
1583 - Hopefully the oops at the end of this file is gone now that
1584 we use mmput/get_task_mm. For older kernels those symbols
1585 are not exported though, so we will probably have to either
1586 use the old way (directly accessing the mm's) or just not
1587 support those kernels.
1594 - when you hit "Profile"
1595 - when you click something in the main list and we don't respond
1596 within 50ms (or perhaps when we expect to not be able to do
1597 so (can we know the size in advance?))
1599 - kernel module should put process to sleep before sampling. Should get us
1602 - Make sure samples label shows correct number after Open
1604 - Move "samples" label to the toolbar, then get rid of statusbar.
1606 - crashes when you ctrl-click the selected item in the top left pane
1607 <ian__> ssp: looks like it doesn't handle the none-selected case
1609 - loading and saving
1611 - consider making ProfileObject more of an object.
1613 - make an "everything" object
1614 maybe not necessary -- there is a libc_ctors_something()
1616 - make presentation strings nicer
1618 four different kinds of symbols:
1620 a) I know exactly what this is
1621 b) I know in what library this is
1622 c) I know only the process that did this
1623 d) I know the name, but there is another similarly named one
1625 (a) is easy, (b) should be <in ...> (c) should just become "???"
1628 - processes with a cmdline of "" should get a [pid = %d] instead.
1630 - make an "n samples" label
1633 - make threads be reported together
1634 (simply report pids with similar command lines together)
1635 (note: it seems separating by pid is way too slow (uses too much memory),
1636 so it has to be like this)
1638 - stack stash should allow different pids to refer to the same root
1639 (ie. there is no need to create a new tree for each pid)
1640 The *leaves* should contain the pid, not the root. You could even imagine
1641 a set of processes, each referring to a set of leaves.
1643 - when we see a new pid, immediately capture its mappings
1646 - new object Process
1647 - hashable by pointer
1648 - contains list of maps
1649 - process_from_pid (pid_t pid, gboolean join_threads)
1650 - new processes get their maps immediately
1651 - resulting pointer must be unref()ed, but it is possible it
1652 just points to an existing process
1653 - processes with identical cmdlines are taken together
1654 - method lookup_symbol()
1657 - StackStash stores map from process to leaves
1658 - Profile is called with processes
1660 It is possible that we simply need a better concept of Process:
1662 If two pids have the same command line, consider them the same, period.
1663 This should save considerable amounts of memory.
1667 "No pids are reused during a profiling run"
1668 "Two processes with the same command line have the same mappings"
1670 are somewhat dubious, but probably necessary.
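
A sketch of the "same command line means same process" idea with reference counting. The command line is passed in directly here (reading it from /proc is left out), the maps are omitted, and process_get() / process_unref() are invented names rather than the planned process_from_pid().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical Process object; the list of maps is omitted. */
typedef struct Process Process;
struct Process {
    char    *cmdline;
    int      ref_count;
    Process *next;
};

static Process *all_processes;

/* Returns the process for 'cmdline'. If one with an identical command
 * line already exists, that one is returned with an extra reference. */
Process *process_get(const char *cmdline)
{
    Process *p;

    for (p = all_processes; p != NULL; p = p->next) {
        if (strcmp(p->cmdline, cmdline) == 0) {
            p->ref_count++;
            return p;
        }
    }

    p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->cmdline = malloc(strlen(cmdline) + 1);
    if (p->cmdline == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->cmdline, cmdline);
    p->ref_count = 1;
    p->next = all_processes;
    all_processes = p;
    return p;
}

/* Drops one reference; frees the process when none are left. */
void process_unref(Process *process)
{
    Process **link;

    if (--process->ref_count > 0)
        return;

    for (link = &all_processes; *link != NULL; link = &(*link)->next) {
        if (*link == process) {
            *link = process->next;
            break;
        }
    }
    free(process->cmdline);
    free(process);
}
```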
1672 (More complex kernel:
1674 have the module report
1676 - new pid arrived (along with mappings)
1677 - mapping changed for pid
1680 - make symbols in executable work
1681 - the hashtables used in profile.c should not accept NULL as the key
1683 - autoexpand descendant tree
1684 - make double clicks work
1688 - Find out what happened here:
1690 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Unable to handle kernel NULL pointer dereference at virtual address 000001b8
1691 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: printing eip:
1692 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: c017342c
1693 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: *pde = 00000000
1694 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Oops: 0000 [#1]
1695 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Modules linked in: sysprof_module(U) i2c_algo_bit md5 ipv6 parport_pc lp parport autofs4 sunrpc video button battery ac ohci1394 ieee1394 uhci_hcd ehci_hcd hw_random tpm_atmel tpm i2c_i801 i2c_core snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc e1000 floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata sd_mod scsi_mod
1696 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: CPU: 0
1697 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: EIP: 0060:[<c017342c>] Not tainted VLI
1698 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: EFLAGS: 00010287 (2.6.11-1.1225_FC4)
1699 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: EIP is at grab_swap_token+0x35/0x21f
1700 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: eax: 0bd48023 ebx: d831d028 ecx: 00000282 edx: 00000000
1701 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: esi: c1b72934 edi: c1045820 ebp: c1b703f0 esp: c18dbdd8
1702 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: ds: 007b es: 007b ss: 0068
1703 Apr 11 15:42:08 great-sage-equal-to-heaven kernel: Process events/0 (pid: 3, threadinfo=c18db000 task=f7e62000)
1704 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: Stack: 000011a8 00000000 000011a8 c1b703f0 c0151731 c016f58f 000011a8 c1b72934
1705 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: 000011a8 c0166415 c1b72934 c1b72934 c0163768 ee7ccc38 f459fbf8 bf92e7b8
1706 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: f6c6a934 c0103b92 bfdaba18 c1b703f0 00000001 c1b81bfc c1b72934 bfdaba18
1707 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: Call Trace:
1708 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0151731>] find_get_page+0x9/0x24
1709 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c016f58f>] read_swap_cache_async+0x32/0x83
Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0166415>] do_swap_page+0x262/0x600
1710 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0163768>] pte_alloc_map+0xc6/0x1e6
1711 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0103b92>] common_interrupt+0x1a/0x20
1712 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c01673f0>] handle_mm_fault+0x1da/0x31d
1713 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c016488e>] __follow_page+0xa2/0x10d
1714 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0164a6f>] get_user_pages+0x145/0x6ee
1715 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0161f66>] kmap_high+0x52/0x44e
1716 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<c0103b92>] common_interrupt+0x1a/0x20
1717 Apr 11 15:42:09 great-sage-equal-to-heaven kernel: [<f8cbb19d>] x_access_process_vm+0x111/0x1a5 [sysprof_module]
1718 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<f8cbb24a>] read_user_space+0x19/0x1d [sysprof_module]
1719 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<f8cbb293>] read_frame+0x35/0x51 [sysprof_module]
1720 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<f8cbb33a>] generate_stack_trace+0x8b/0xb4
1721 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<f8cbb3a2>] do_generate+0x3f/0xa0 [sysprof_module]
1722 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c0138d7a>] worker_thread+0x1b0/0x450
1723 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c0379ccd>] schedule+0x30d/0x780
1724 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c011bdb6>] __wake_up_common+0x39/0x59
1725 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<f8cbb363>] do_generate+0x0/0xa0 [sysprof_module]
1726 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c011bd71>] default_wake_function+0x0/0xc
1727 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c0138bca>] worker_thread+0x0/0x450
1728 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c013f3cb>] kthread+0x87/0x8b
1729 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c013f344>] kthread+0x0/0x8b
1730 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: [<c0101275>] kernel_thread_helper+0x5/0xb
1731 Apr 11 15:42:10 great-sage-equal-to-heaven kernel: Code: e0 8b 00 8b 50 74 8b 1d c4 55 3d c0 39
1732 da 0f 84 9b 01 00 00 a1 60 fc 3c c0 39 05 30 ec 48 c0 78 05 83 c4 20 5b c3 a1 60 fc 3c c0 <3b> 82 b8 01 00 00 78 ee 81 3d ac 55 3d c0 3c 4b 24 1d 0f 85 78