Milian Wolff [Mon, 22 Dec 2014 13:03:30 +0000 (14:03 +0100)]
Make start time const.
Milian Wolff [Mon, 22 Dec 2014 13:03:21 +0000 (14:03 +0100)]
Don't fail when trying to join the timer thread.
Milian Wolff [Mon, 22 Dec 2014 12:23:15 +0000 (13:23 +0100)]
Silence GDB stdout messages by using --batch-silent.
Milian Wolff [Mon, 22 Dec 2014 12:22:27 +0000 (13:22 +0100)]
Also print exit message when killing heaptrack via CTRL+C.
Milian Wolff [Mon, 22 Dec 2014 12:19:16 +0000 (13:19 +0100)]
Do not load gdbinit file or symbols when attaching to process.
Instead, only load the two libs we actually need, libdl for dlopen
and libheaptrack_inject for the init function.
Milian Wolff [Thu, 18 Dec 2014 19:04:41 +0000 (20:04 +0100)]
Cleanup code for operator new skipping in backtraces.
Milian Wolff [Thu, 18 Dec 2014 18:36:54 +0000 (19:36 +0100)]
Also stop backtraces after __static_initialization_and_destruction_0.
Additionally, we can now cope with backtraces that either show
main or __libc_start_main, and stop at the first occurrence
of either symbol.
Milian Wolff [Wed, 17 Dec 2014 13:09:19 +0000 (14:09 +0100)]
Fix major regression: Get backtrace in malloc properly.
Zomg, I must have been asleep when I committed this - sorry.
Milian Wolff [Wed, 17 Dec 2014 12:49:44 +0000 (13:49 +0100)]
Remove explicit std:: qualification, we use that namespace.
Milian Wolff [Wed, 17 Dec 2014 12:46:24 +0000 (13:46 +0100)]
Instead of yielding the thread, sleep for one microsecond.
This significantly improves the performance of the threaded
test example. The reason is the far lower number of context
switches (down by about 90%):
Perf stat results averaged over ten runs are:
before:
1537.459666 task-clock (msec) # 2.741 CPUs utilized ( +- 1.98% )
164,214 context-switches # 0.107 M/sec ( +- 4.77% )
1,238 cpu-migrations # 0.805 K/sec ( +- 1.42% )
3,996 page-faults # 0.003 M/sec ( +- 0.26% )
3,531,767,774 cycles # 2.297 GHz ( +- 1.96% ) [56.85%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
4,404,564,712 instructions # 1.25 insns per cycle ( +- 1.81% ) [82.34%]
895,286,284 branches # 582.315 M/sec ( +- 1.83% ) [82.05%]
14,800,878 branch-misses # 1.65% of all branches ( +- 2.07% ) [83.04%]
0.560967210 seconds time elapsed ( +- 1.78% )
after:
940.204408 task-clock (msec) # 1.970 CPUs utilized ( +- 0.97% )
16,628 context-switches # 0.018 M/sec ( +- 2.09% )
1,709 cpu-migrations # 0.002 M/sec ( +- 2.02% )
3,733 page-faults # 0.004 M/sec ( +- 0.62% )
2,299,207,706 cycles # 2.445 GHz ( +- 0.95% ) [54.67%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
3,203,389,330 instructions # 1.39 insns per cycle ( +- 1.48% ) [84.25%]
717,416,828 branches # 763.043 M/sec ( +- 1.01% ) [77.86%]
11,127,629 branch-misses # 1.55% of all branches ( +- 0.98% ) [76.95%]
0.477194059 seconds time elapsed ( +- 0.34% )
Note that we utilize fewer CPUs (which is good, as it saves power), but
still decrease the total run time, as well as the number of cycles spent
and instructions executed.
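A minimal sketch of the change described above, not the actual heaptrack code: a polling loop that sleeps for one microsecond between checks instead of yielding. The helper name waitWithSleep and its predicate parameter are assumptions made for illustration.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: poll `ready` until it returns true and sleep between
// polls. Returns the number of unsuccessful polls.
template <typename Predicate>
unsigned waitWithSleep(Predicate ready)
{
    unsigned polls = 0;
    while (!ready()) {
        ++polls;
        // The yielding variant would call std::this_thread::yield() here.
        // Sleeping lets the scheduler park the thread for a short while
        // instead of rescheduling it immediately.
        std::this_thread::sleep_for(std::chrono::microseconds(1));
    }
    return polls;
}
```

In a spin lock, the predicate would be the attempt to acquire the lock.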
Milian Wolff [Wed, 17 Dec 2014 12:40:27 +0000 (13:40 +0100)]
Use acquire semantics after all... ;-)
According to this page, the relaxed semantics are a useless
optimization and might in fact degrade performance on some systems:
http://www.boost.org/doc/libs/master/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.example_spinlock
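For illustration, a spin lock along the lines of the Boost.Atomic example linked above, written with std::atomic. This is a sketch under the assumption that the lock is a single boolean flag; the class name SpinLock and its member names are hypothetical and not taken from heaptrack.

```cpp
#include <atomic>

// Hypothetical spin lock: exchange with acquire semantics when locking,
// store with release semantics when unlocking.
class SpinLock
{
public:
    // Returns true if the lock was free and is now held by the caller.
    bool tryLock() noexcept
    {
        return !m_locked.exchange(true, std::memory_order_acquire);
    }

    // Busy-waits until the lock could be acquired.
    void lock() noexcept
    {
        while (!tryLock()) {
        }
    }

    void unlock() noexcept
    {
        m_locked.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> m_locked{false};
};
```

The acquire/release pair ensures that writes made while the lock was held are visible to the next thread that acquires it.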
Milian Wolff [Wed, 17 Dec 2014 12:38:21 +0000 (13:38 +0100)]
Cleanup spinlock implementation.
- use atomic::exchange
- use relaxed memory order in loop
Milian Wolff [Mon, 15 Dec 2014 18:02:44 +0000 (19:02 +0100)]
Cope with broken backtraces.
Still output a 0 trace index and print ?? later.
Milian Wolff [Fri, 12 Dec 2014 23:57:02 +0000 (00:57 +0100)]
Refactor code once more for yet better thread safety at shutdown.
Instead of using flockfile, we use a simple spin lock now. This
is even a bit faster than using flockfile, and does not have any
issues at shutdown, when the file was deleted already.
Also, the spin lock can easily be combined with the check for
when the timer thread should be stopped.
Furthermore, the code is restructured to ensure the thread-unsafe
API is only ever getting called while the lock is held.
To simplify future development, debug logging can now be enabled
to trace the heaptrack execution.
Milian Wolff [Thu, 11 Dec 2014 14:04:20 +0000 (15:04 +0100)]
Set cmake requirement to 2.8.9 and use LINK_PRIVATE.
Hopefully this restores building it with an older CMake version.
Milian Wolff [Thu, 11 Dec 2014 13:37:40 +0000 (14:37 +0100)]
Only one trap can be added.
This prevents leakage of the /tmp/heaptrack_fifo$$ files.
Milian Wolff [Thu, 11 Dec 2014 13:32:59 +0000 (14:32 +0100)]
Adapt to libunwind changes with unw_set_cache_size.
Milian Wolff [Wed, 10 Dec 2014 22:08:17 +0000 (23:08 +0100)]
Silence build warnings, as was done before in vogl/apitrace.
Milian Wolff [Wed, 10 Dec 2014 18:30:29 +0000 (19:30 +0100)]
Add noexcept
Milian Wolff [Wed, 10 Dec 2014 18:27:11 +0000 (19:27 +0100)]
Compile in pedantic mode.
Milian Wolff [Wed, 10 Dec 2014 18:23:51 +0000 (19:23 +0100)]
Remove dead code
Milian Wolff [Wed, 10 Dec 2014 18:10:18 +0000 (19:10 +0100)]
Write accurate end time on shutdown.
Also print out bytes allocated per second and allocations per
second in heaptrack_print overview.
Milian Wolff [Wed, 10 Dec 2014 17:04:42 +0000 (18:04 +0100)]
Minimize changes against libbacktrace.
Instead, we export more elf internals and call that directly from
heaptrack_interpret. This should make it simpler to keep the
libbacktrace checkout in sync with upstream changes.
Milian Wolff [Wed, 10 Dec 2014 15:56:57 +0000 (16:56 +0100)]
Remove dead code and enable more warnings when building.
Milian Wolff [Wed, 10 Dec 2014 15:25:27 +0000 (16:25 +0100)]
Update libbacktrace to latest version from GCC 4.9 git branch.
This should fix a serious memory issue in libbacktrace, see:
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00547.html
Thanks for the report and analysis to André Wöbbeking.
CCMAIL: Woebbeking@kde.org
Milian Wolff [Wed, 10 Dec 2014 14:06:08 +0000 (15:06 +0100)]
Mark Boost and Threads as required dependencies.
Milian Wolff [Tue, 9 Dec 2014 20:54:07 +0000 (21:54 +0100)]
Overwrite symbols after dlopen in injected heaptrack.
Milian Wolff [Tue, 9 Dec 2014 20:42:30 +0000 (21:42 +0100)]
Mark module cache dirty on initialization.
This is required for multiple reattachments.
Milian Wolff [Tue, 9 Dec 2014 19:27:47 +0000 (20:27 +0100)]
Use a custom thread + sleep_for instead of a C timer.
This makes re-attaching work without deadlocks in timer_delete
for me. And we can clean the code up even more now:
Much less code, yet still more accurate. The overhead is negligible
in my tests. And this makes it trivial for the future to let users
configure the interval themselves.
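A rough sketch of such a timer loop, assuming a stop flag and a tick callback; the function name runTimerLoop and its parameters are made up for this example and do not reflect the real heaptrack API. The loop is written as a plain function so that it can run on a dedicated std::thread.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical timer loop: sleeps for `interval`, then calls `onTick`,
// until `stop` is set. Returns the number of ticks that were emitted.
inline unsigned runTimerLoop(std::chrono::milliseconds interval,
                             const std::atomic<bool>& stop,
                             const std::function<void()>& onTick)
{
    unsigned ticks = 0;
    while (!stop.load(std::memory_order_acquire)) {
        std::this_thread::sleep_for(interval);
        onTick();
        ++ticks;
    }
    return ticks;
}
```

To use it as a background timer, start it via std::thread with a lambda that calls runTimerLoop, set the stop flag on shutdown and join the thread. Making the interval configurable then only means passing a different value.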
Milian Wolff [Tue, 9 Dec 2014 17:59:59 +0000 (18:59 +0100)]
Refactor the code to allow multiple runtime-injections.
This is sometimes useful. This way, you can hop on/off
as you like and just investigate the areas you are
interested in.
Milian Wolff [Tue, 9 Dec 2014 14:33:55 +0000 (15:33 +0100)]
Refactor hook initialization in heaptrack_preload.
The previous code works with clang, but not with GCC. This new
code works with GCC, and, hopefully, also with clang. It does not
depend on static initialization order anymore, as the string
identifiers of the hook functions are initialized via constexpr now.
This fixes the heaptrack_preload usage for me on this machine.
Milian Wolff [Tue, 9 Dec 2014 10:19:56 +0000 (11:19 +0100)]
Remove code that does not compile with slightly older compilers.
Milian Wolff [Tue, 9 Dec 2014 10:18:08 +0000 (11:18 +0100)]
Add compile-check around unw_set_cache_log_size.
Milian Wolff [Tue, 9 Dec 2014 01:11:24 +0000 (02:11 +0100)]
Silence warning when debuggee quit already.
Milian Wolff [Tue, 9 Dec 2014 01:08:55 +0000 (02:08 +0100)]
Restore original function addresses when shutting down injected heaptrack.
This should bring back the original behavior and performance.
And we might also be able to re-attach then if desired.
Milian Wolff [Tue, 9 Dec 2014 00:51:06 +0000 (01:51 +0100)]
Return output stream
Milian Wolff [Tue, 9 Dec 2014 00:49:49 +0000 (01:49 +0100)]
Merge branch 'inject'
Milian Wolff [Tue, 9 Dec 2014 00:45:24 +0000 (01:45 +0100)]
Stop heaptrack when heaptrack_interpret goes away.
Milian Wolff [Tue, 9 Dec 2014 00:13:57 +0000 (01:13 +0100)]
Add -p/--pid option to attach heaptrack to a running process.
We use gdb to attach to the process, then call dlopen there and
finally an initialization hook in the new libheaptrack_inject.so.
To prevent an underflow in the total memory consumption, we
call malloc_info and parse that in heaptrack_interpret to get a
baseline for the current total memory consumption at the point
where we attached to the process.
The code is refactored into a shared libheaptrack.cpp with common
code, and two libraries, libheaptrack_inject.so and the old
libheaptrack_preload.so. The heaptrack bash script is adapted to
support both versions.
Milian Wolff [Mon, 8 Dec 2014 21:13:37 +0000 (22:13 +0100)]
Restructure argument parsing in shell script.
This makes it easier to maintain and the arguments become
position-independent, which is very useful.
Milian Wolff [Mon, 8 Dec 2014 11:19:29 +0000 (12:19 +0100)]
Get rid of temporary string allocations.
Milian Wolff [Mon, 8 Dec 2014 11:09:05 +0000 (12:09 +0100)]
Use lambdas instead of passing printf labels with magic numbers.
This also gets rid of useless string allocations to format the byte
sizes of data members that are never printed.
Milian Wolff [Mon, 8 Dec 2014 10:26:09 +0000 (11:26 +0100)]
Cleanup/fix: We add +1 above already, so no need to do it again.
Milian Wolff [Sun, 7 Dec 2014 14:53:29 +0000 (15:53 +0100)]
Use mprotect on addresses we want to overwrite.
This seems to be more reliable than the other method.
Milian Wolff [Sat, 6 Dec 2014 14:39:36 +0000 (15:39 +0100)]
Also overload posix_memalign.
Milian Wolff [Sat, 6 Dec 2014 14:35:07 +0000 (15:35 +0100)]
Overwrite realloc, calloc and cfree at runtime.
Milian Wolff [Sat, 6 Dec 2014 14:27:25 +0000 (15:27 +0100)]
Overwrite dlopen/dlclose as well at runtime.
Milian Wolff [Sat, 6 Dec 2014 13:51:09 +0000 (14:51 +0100)]
Ensure the hook function is convertible to the original function.
Milian Wolff [Sat, 6 Dec 2014 12:46:36 +0000 (13:46 +0100)]
Further cleanup the code
Milian Wolff [Sat, 6 Dec 2014 12:42:02 +0000 (13:42 +0100)]
Make the hook list easier to extend.
Milian Wolff [Sat, 6 Dec 2014 12:35:34 +0000 (13:35 +0100)]
Cleanup code a bit.
Milian Wolff [Fri, 5 Dec 2014 23:50:02 +0000 (00:50 +0100)]
Also overwrite free.
Milian Wolff [Fri, 5 Dec 2014 23:43:42 +0000 (00:43 +0100)]
Make runtime-patching work!
Wow, awesome. What was missing were the offsets, i.e. the dlpi_addr
must be added to the p_vaddr, as well as the r_offset. Or so it seems.
Now I don't see any crashes anymore, and my custom malloc gets called!
Milian Wolff [Fri, 5 Dec 2014 23:06:19 +0000 (00:06 +0100)]
first work towards runtime injection, not really working yet
Milian Wolff [Fri, 5 Dec 2014 20:35:32 +0000 (21:35 +0100)]
Remove spurious define, only required on some faulty machine of mine.
Milian Wolff [Thu, 4 Dec 2014 18:24:05 +0000 (19:24 +0100)]
Add LGPL v2.1+ text in COPYING file.
Milian Wolff [Wed, 3 Dec 2014 16:11:39 +0000 (17:11 +0100)]
Fix line information written by heaptrack_interpret.
The line numbers were written in decimal format, but heaptrack_print
expects hexadecimal numbers when reading. This led to completely
wrong line numbers in the end.
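The mismatch can be reproduced with a few lines of standalone code. The function names below are hypothetical and only illustrate the bug class; they are not the functions used in heaptrack_interpret or heaptrack_print.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Writer variant with the bug: formats the line number as decimal.
inline std::string writeDecimal(unsigned line)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%u", line);
    return buf;
}

// Fixed writer: formats the line number as hexadecimal.
inline std::string writeHex(unsigned line)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%x", line);
    return buf;
}

// Reader: always interprets the text as a hexadecimal number.
inline unsigned long readHex(const std::string& text)
{
    return std::strtoul(text.c_str(), nullptr, 16);
}
```

With these helpers, line 42 written as decimal text "42" is read back as 0x42, i.e. 66, whereas the hexadecimal writer round-trips correctly.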
Milian Wolff [Wed, 3 Dec 2014 11:07:27 +0000 (12:07 +0100)]
Always show the biggest heap in the total memory consumption.
This still has some issues, but is better than nothing.
Milian Wolff [Tue, 2 Dec 2014 14:11:48 +0000 (15:11 +0100)]
Add some platform checks to give better error messages at cmake time.
Milian Wolff [Tue, 2 Dec 2014 13:51:04 +0000 (14:51 +0100)]
Skip anything below main in massif backtraces.
Milian Wolff [Tue, 2 Dec 2014 13:48:58 +0000 (14:48 +0100)]
Properly handle case where __libc_start_main is encountered before main itself.
We still want to break in main.
Milian Wolff [Tue, 2 Dec 2014 13:34:29 +0000 (14:34 +0100)]
Sort massif output file.
This is expected by e.g. the massif-visualizer and probably other
tools as well.
Milian Wolff [Tue, 2 Dec 2014 11:22:19 +0000 (12:22 +0100)]
Introduce tunable massif threshold and detailed freq parameters.
Milian Wolff [Tue, 2 Dec 2014 03:30:53 +0000 (04:30 +0100)]
Print full backtraces in generated massif files.
These files easily become quite large. We really need our own
heaptrack-visualizer to efficiently read the data and handle it.
Milian Wolff [Tue, 2 Dec 2014 02:36:34 +0000 (03:36 +0100)]
Print first level of backtrace in generated massif files.
Milian Wolff [Tue, 2 Dec 2014 01:56:00 +0000 (02:56 +0100)]
Basic support for massif output file generation.
So far, it only includes the total heap memory usage. Backtraces
will be added in the next step.
Milian Wolff [Tue, 2 Dec 2014 01:54:57 +0000 (02:54 +0100)]
Use thread instead of signal and increase timestamp frequency.
Sadly, this creates a new thread for every timer apparently,
but still, it's better than using signals which randomly influence
other functions and sometimes fubar our output files. Or so I think.
Milian Wolff [Tue, 2 Dec 2014 01:00:23 +0000 (02:00 +0100)]
Remember the debuggee command line.
This is sometimes useful and also tracked by massif and other tools.
Milian Wolff [Tue, 2 Dec 2014 00:32:02 +0000 (01:32 +0100)]
Simplify code and write exe path directly to stream.
No need for a temporary string allocation here.
Milian Wolff [Tue, 2 Dec 2014 00:26:40 +0000 (01:26 +0100)]
Simplify code, it's not worth the imagined performance gain.
Milian Wolff [Fri, 28 Nov 2014 15:53:06 +0000 (16:53 +0100)]
Unset automatic locking of stdout and stdin.
This is not required in the heaptrack_interpret process.
Milian Wolff [Fri, 28 Nov 2014 15:50:04 +0000 (16:50 +0100)]
Prefer C I/O API for speed.
It's sad that this gives such a noticeable speedup :(
Milian Wolff [Fri, 28 Nov 2014 15:43:35 +0000 (16:43 +0100)]
Use fputs if we don't need to format anything.
Milian Wolff [Fri, 28 Nov 2014 15:22:44 +0000 (16:22 +0100)]
Cleanup module handling code and change the file format a bit.
We now output a single line per module and delegate the whole
interpretation of the dl_phdr_info data to heaptrack_interpret.
This allows us to reduce the size of Module a bit, hopefully leading to
faster binary searches, amongst other benefits.
Milian Wolff [Mon, 1 Dec 2014 14:50:10 +0000 (15:50 +0100)]
Remove code that depends on local libunwind patches.
Milian Wolff [Thu, 27 Nov 2014 17:23:44 +0000 (18:23 +0100)]
Fixup handling of posix_memalign.
The return value of zero indicates success here. This should fix
some inconsistencies on applications that use this function.
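As an illustration of the return value convention, here is a sketch of a tracking wrapper that only records the allocation when posix_memalign returns zero. The Tracker type and the wrapper name are assumptions for this example, not heaptrack's hook code.

```cpp
#include <cstddef>
#include <stdlib.h> // posix_memalign

// Hypothetical stand-in for the allocation bookkeeping.
struct Tracker
{
    std::size_t tracked = 0;
    void handleMalloc(void* /*ptr*/, std::size_t /*size*/)
    {
        ++tracked;
    }
};

// Forwards to posix_memalign and records the allocation on success only.
// posix_memalign returns 0 on success and an error number otherwise.
inline int trackedPosixMemalign(Tracker& tracker, void** memptr,
                                std::size_t alignment, std::size_t size)
{
    const int ret = posix_memalign(memptr, alignment, size);
    if (ret == 0) {
        tracker.handleMalloc(*memptr, size);
    }
    return ret;
}
```

Treating a non-zero value as success would record pointers that were never allocated, which matches the kind of inconsistency mentioned above.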
Milian Wolff [Thu, 27 Nov 2014 17:23:38 +0000 (18:23 +0100)]
Don't print 0B leaked lines.
Milian Wolff [Thu, 27 Nov 2014 17:19:36 +0000 (18:19 +0100)]
Properly take cfree into account.
Milian Wolff [Thu, 27 Nov 2014 16:46:11 +0000 (17:46 +0100)]
Ensure we only ever initialize once.
Milian Wolff [Thu, 27 Nov 2014 16:40:10 +0000 (17:40 +0100)]
Use the same error handler for both init error callbacks.
This happens rarely, and the error message is still helpful.
Milian Wolff [Thu, 27 Nov 2014 16:37:39 +0000 (17:37 +0100)]
Fixup callback handlers, the wrong data was passed in.
Milian Wolff [Thu, 27 Nov 2014 16:30:58 +0000 (17:30 +0100)]
Cleanup: get rid of obsolete isExe member in module.
It is only required for the call to backtrace_fileline_initialize.
Milian Wolff [Thu, 27 Nov 2014 14:50:16 +0000 (15:50 +0100)]
Add seconds elapsed to heaptrack log for future evaluation.
Milian Wolff [Fri, 17 Oct 2014 12:59:58 +0000 (14:59 +0200)]
Support building against libunwind from different include directory
Milian Wolff [Tue, 25 Nov 2014 16:12:05 +0000 (17:12 +0100)]
Delete obsolete files.
These are used in VOGL, but not in heaptrack, so get rid of them.
Milian Wolff [Tue, 25 Nov 2014 15:55:22 +0000 (16:55 +0100)]
Support atomic functions if possible.
Milian Wolff [Mon, 24 Nov 2014 16:22:40 +0000 (17:22 +0100)]
Prevent crashes in apps using QProcess.
Milian Wolff [Wed, 19 Nov 2014 12:33:01 +0000 (13:33 +0100)]
Prevent issues and file corruption when tracking a forking process.
Thankfully there is pthread_atfork which can be used to get notified
about fork calls. We stop heaptrack before forking and only continue
it in the parent process. In the child process we stop heaptrack
altogether.
TODO: make it possible to track child processes.
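The fork handling described above could look roughly like the following sketch. The global flag and the handler names are hypothetical; the only real API used is pthread_atfork, which takes a prepare handler, a parent handler and a child handler.

```cpp
#include <atomic>
#include <pthread.h>

// Hypothetical global state: whether allocations are currently tracked.
inline std::atomic<bool>& trackingEnabled()
{
    static std::atomic<bool> enabled{true};
    return enabled;
}

// Called in the forking thread right before fork(): pause tracking.
inline void prepareFork()
{
    trackingEnabled() = false;
}

// Called in the parent after fork(): resume tracking.
inline void parentFork()
{
    trackingEnabled() = true;
}

// Called in the child after fork(): keep tracking disabled.
inline void childFork()
{
    trackingEnabled() = false;
}

// Registers the three handlers; returns 0 on success.
inline int installForkHandlers()
{
    return pthread_atfork(&prepareFork, &parentFork, &childFork);
}
```

Because the child never re-enables tracking, it does not write to the output file it shares with the parent.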
Milian Wolff [Tue, 18 Nov 2014 19:04:47 +0000 (20:04 +0100)]
Also use LineReader in heaptrack_interpret.
Milian Wolff [Tue, 18 Nov 2014 18:44:20 +0000 (19:44 +0100)]
Be more forgiving when encountering bad data.
Still no clue where the bad data comes from though...
Milian Wolff [Tue, 18 Nov 2014 17:44:58 +0000 (18:44 +0100)]
Optimize findAllocation calls by leveraging monotonicity of indices.
By comparing the incoming trace index to the largest index we have
encountered so far, we can distinguish three cases:
a) if the index is equal to the largest one, the same location is used
repeatedly, i.e. in a loop, and we can directly return the last item.
b) if the index is larger than the last known one, we'll definitely
not find it and can thus insert directly and return.
c) only if the index is lower than the last index do we actually need
to do a binary search.
This yields a small but noticeable speedup from ~14.5s to 13.4s for
one of my files.
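The three cases can be sketched as follows, assuming the allocations are kept in a vector sorted by trace index. The type and member names are invented for this example. Inserting on a miss in case c) is a defensive assumption of the sketch, not something the commit message states.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Allocation
{
    std::uint32_t traceIndex;
    std::uint64_t allocations;
};

// Hypothetical container, kept sorted by traceIndex.
class AllocationTable
{
public:
    Allocation& find(std::uint32_t traceIndex)
    {
        // a) same index as the largest one seen so far: return the last item
        if (!m_data.empty() && m_data.back().traceIndex == traceIndex) {
            return m_data.back();
        }
        // b) larger than everything seen so far: append without searching
        if (m_data.empty() || m_data.back().traceIndex < traceIndex) {
            m_data.push_back(Allocation{traceIndex, 0});
            return m_data.back();
        }
        // c) smaller than the largest index: binary search
        auto it = std::lower_bound(
            m_data.begin(), m_data.end(), traceIndex,
            [](const Allocation& a, std::uint32_t index) {
                return a.traceIndex < index;
            });
        if (it == m_data.end() || it->traceIndex != traceIndex) {
            it = m_data.insert(it, Allocation{traceIndex, 0});
        }
        return *it;
    }

    std::size_t size() const
    {
        return m_data.size();
    }

private:
    std::vector<Allocation> m_data;
};
```

Cases a) and b) only look at the last element, so the binary search is skipped whenever indices arrive in non-decreasing order.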
Milian Wolff [Tue, 18 Nov 2014 17:01:12 +0000 (18:01 +0100)]
Slightly cleanup and optimize LineReader::readHex
Milian Wolff [Tue, 18 Nov 2014 16:42:27 +0000 (17:42 +0100)]
minor cleanup
Milian Wolff [Tue, 18 Nov 2014 16:41:20 +0000 (17:41 +0100)]
Don't fill sizeHistogram if we are not going to print it.
Milian Wolff [Tue, 18 Nov 2014 16:28:13 +0000 (17:28 +0100)]
Introduce optimized reader class for faster reading of hex numbers.
In my benchmark on a big data file this reduces the runtime of
heaptrack_print from 24s to 16s. The whole function is pretty simple
so I'm willing to sacrifice simplicity for this big performance win.
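To give an idea of what a hand-written hex reader looks like, here is a sketch that parses space-separated lowercase hex numbers from a line. It is not the LineReader class from heaptrack; the free function readHex and its signature are assumptions for illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical reader: parses one lowercase hex number starting at `it`,
// stops at a space or at `end`, and skips the separating space.
// Returns false if no digit was read or an invalid character was found.
inline bool readHex(std::string::const_iterator& it,
                    std::string::const_iterator end, std::uint64_t& out)
{
    const auto start = it;
    std::uint64_t value = 0;
    for (; it != end && *it != ' '; ++it) {
        const char c = *it;
        unsigned digit = 0;
        if (c >= '0' && c <= '9') {
            digit = static_cast<unsigned>(c - '0');
        } else if (c >= 'a' && c <= 'f') {
            digit = static_cast<unsigned>(c - 'a') + 10;
        } else {
            return false;
        }
        value = (value << 4) | digit;
    }
    if (it == start) {
        return false;
    }
    if (it != end) {
        ++it; // skip the separator
    }
    out = value;
    return true;
}
```

Compared to a stream-based parser, this touches each character once and does no locale handling.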
Milian Wolff [Tue, 18 Nov 2014 15:40:35 +0000 (16:40 +0100)]
Improve error checking when reading data file.
Milian Wolff [Tue, 18 Nov 2014 15:03:17 +0000 (16:03 +0100)]
Format byte sizes to KB/MB/GB/TB.
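A possible formatting helper, as a sketch only: the factor of 1024 between units, the two decimal places and the function name formatBytes are assumptions, since the commit message does not specify the exact output format.

```cpp
#include <cstdio>
#include <string>

// Hypothetical formatter: scales a byte count to B/KB/MB/GB/TB,
// assuming a factor of 1024 between units.
inline std::string formatBytes(double bytes)
{
    static const char* const units[] = {"B", "KB", "MB", "GB", "TB"};
    int unit = 0;
    while (bytes >= 1024.0 && unit < 4) {
        bytes /= 1024.0;
        ++unit;
    }
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.2f%s", bytes, units[unit]);
    return buf;
}
```

Values beyond the largest unit simply stay in TB.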
Milian Wolff [Tue, 18 Nov 2014 13:10:18 +0000 (14:10 +0100)]
Fallback to __libc_start_main when main is not found.
Milian Wolff [Tue, 18 Nov 2014 13:10:07 +0000 (14:10 +0100)]
Change the output format slightly for better readability.
Milian Wolff [Tue, 18 Nov 2014 12:44:54 +0000 (13:44 +0100)]
Merge allocations from equivalent locations.
Equivalent means the data is the same except for the instruction pointer
address itself. Useful when we are missing some debug information.