-* BFD:
- + New executable formats
- + Read list of libraries needed
- + Read list of undefined symbols in executables
- + Read list of exported symbols in libraries
- + Read debugging info from executables/libraries
-* Automatically update list of syscalls?
-* Improve documentation
-* Improve -e/-x options (regexp?)
-* Improve -l option
-* Improve C++ name demangling
-* Display different argument types
-* Update /etc/ltrace.conf
-* More architectures, cleaner way to port
-* More operating systems (solaris?)
-* Option -I (inter-library calls)
-* Modify ARGTYPE_STRING[0-5] types so that they don't stop displaying chars when '\0' is seen
-* Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
-* Cleaner way to use breakpoints:
- + BP is placed in the PLT
- + When control hits there:
- - write down return address
- - change return address with another one (handled by ltrace)
- - get arguments...
- - change the process' PC to be in the correct place,
- without removing breakpoint
- + When control hits one of our return addresses:
- - get return value...
- - change PC to the right place
-* To be able to work with processes sharing memory, we must:
- + ptrace() every single thread
- + place breakpoints only in places where the process control can continue
- without having to remove it
-* List source dependencies in Makefile
-* Create different ltrace processes to trace different children
-* After a clone(), syscalls may be seen as sysrets in s390 (see trace.c:syscall_p())
+-*-org-*-
+* TODO
+** Keep exit code of traced process
+ See https://bugzilla.redhat.com/show_bug.cgi?id=105371 for details.
+
+** Automatic prototype discovery:
+*** Use debuginfo if available
+ Alternatively, use debuginfo to generate a config file.
+*** Mangled identifiers contain partial prototypes themselves
+ They don't contain return type info, which can change the
+ parameter passing convention. We could use it and hope for the
+ best. Also they don't include the potentially present hidden this
+ pointer.
+** Automatically update list of syscalls?
+** More operating systems (solaris?)
+** Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
+** Implement displaced tracing
+ A technique used in GDB (and in uprobes, I believe), whereby the
+ instruction under the breakpoint is moved somewhere else and
+ followed by a jump back to the original place. When the breakpoint
+ hits, the IP is moved to the displaced instruction, and the process
+ is continued. This avoids all the fuss with single-stepping and
+ re-enabling the breakpoint.
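+
+ The control flow described above can be modeled with a toy
+ interpreter. This is only a sketch: the instruction set, the
+ addresses and the helper names are all hypothetical, and a real
+ implementation would also have to fix up PC-relative instructions
+ when moving them.
+

```python
# Toy model of displaced tracing. The instruction set, addresses and helper
# names are hypothetical; this illustrates control flow only and is not
# ltrace or GDB code.

def install_displaced_bp(mem, addr, scratch):
    """Move the instruction at addr to scratch, followed by a jump back."""
    mem[scratch] = mem[addr]               # displaced copy of the original
    mem[scratch + 1] = ("jmp", addr + 1)   # resume right after the breakpoint
    mem[addr] = ("brk", scratch)           # breakpoint remembers its scratch slot


def run(mem, pc=0):
    """Run until "halt"; return (accumulator, addresses of breakpoint hits)."""
    acc, hits = 0, []
    while True:
        insn = mem[pc]
        op = insn[0]
        if op == "brk":
            hits.append(pc)    # a tracer would report the event here
            pc = insn[1]       # move the IP to the displaced instruction
        elif op == "add":
            acc += insn[1]
            pc += 1
        elif op == "jmp":
            pc = insn[1]
        elif op == "jlt":      # jump to insn[2] while acc < insn[1]
            pc = insn[2] if acc < insn[1] else pc + 1
        elif op == "halt":
            return acc, hits
        else:
            raise ValueError(f"unknown instruction {insn!r}")


program = {0: ("add", 1), 1: ("add", 10), 2: ("jlt", 25, 0), 3: ("halt",)}
traced = dict(program)
install_displaced_bp(traced, addr=1, scratch=100)
```

+
+ Running the traced copy gives the same result as the plain program,
+ the breakpoint at address 1 is hit on every loop iteration, and it is
+ never removed or single-stepped over.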
+** Create different ltrace processes to trace different children
+** Config file syntax
+*** mark some symbols as exported
+ For PLT hits, only exported prototypes would be considered. For
+ symtab entry point hits, all would be.
+
+*** named arguments
+ This would be useful for replacing arg1, elt2, etc.
+
+*** parameter pack improvements
+ The above format tweaks require that packs that expand to no types
+ at all be supported. If this works, then it should be relatively
+ painless to implement conditionals:
+
+ | void ptrace(REQ=enum(PTRACE_TRACEME=0,...),
+ | if[REQ==0](pack(),pack(pid_t, void*, void *)))
+
+ This is of course dangerously close to a programming language, and
+ I think ltrace should be careful to stay as simple as possible.
+ (We can hook into Lua, or TinyScheme, or some such if we want more
+ general scripting capabilities. Implementing something ad-hoc is
+ undesirable.) But the above can be nicely expressed by pattern
+ matching:
+
+ | void ptrace(REQ=enum[int](...)):
+ | [REQ==0] => ()
+ | [REQ==1 or REQ==2] => (pid_t, void*)
+ | [true] => (pid_t, void*, void*);
+
+ Or:
+
+ | int open(string, FLAGS=flags[int](O_RDONLY=00,...,O_CREAT=0100,...)):
+ | [(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,...))
+
+ This would still require pretty complete expression evaluation,
+ _including_ pointer dereferences and such. In accept, for example,
+ we need subtraction:
+
+ | int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*);
+
+ Perhaps we should hook to something after all.
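+
+ To get a feel for how little machinery the pattern-matching form of
+ ptrace needs, here is a minimal sketch. It assumes first-match
+ semantics (the first guard that holds selects the pack), which the
+ proposal above does not spell out; the names are hypothetical.
+

```python
# Hypothetical model of guarded parameter packs. First-match semantics is an
# assumption of this sketch, not something the config syntax defines today.

PTRACE_RULES = [
    (lambda req: req == 0, []),                        # [REQ==0] => ()
    (lambda req: req in (1, 2), ["pid_t", "void*"]),   # [REQ==1 or REQ==2]
    (lambda req: True, ["pid_t", "void*", "void*"]),   # [true]
]


def select_pack(rules, value):
    """Return the parameter pack of the first rule whose guard accepts value."""
    for guard, pack in rules:
        if guard(value):
            return pack
    raise ValueError("no rule matched")
```

+
+ The guards here are plain predicates over one named argument. The
+ accept example needs arithmetic on a dereferenced value, which is
+ where a real expression evaluator becomes necessary.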
+
+*** system call error returns
+
+ This is closely related to the above. Take the following syscall
+ prototype:
+
+ | long read(int,+string0,ulong);
+
+ string0 means the same as string(array(char, zero(retval))*). But
+ if read returns a negative value, that value signifies an errno,
+ not a length. zero takes it at face value and complains:
+
+ | read@SYS(3 <no return ...>
+ | error: maximum array length seems negative
+ | , "\n\003\224\003\n", 4096) = -11
+
+ Ideally we would do what strace does, e.g.:
+
+ | read@SYS(3, 0x12345678, 4096) = -EAGAIN
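+
+ A minimal sketch of the decoding step, assuming the Linux
+ convention that raw return values in [-4095, -1] encode errors.
+ The function names are hypothetical, and treating an error return
+ as an empty array is just one possible policy.
+

```python
import errno

# Linux convention: raw syscall returns in [-4095, -1] are negated errno values.
MAX_ERRNO = 4095


def format_syscall_return(ret):
    """Render a raw syscall return value strace-style, e.g. -EAGAIN."""
    if -MAX_ERRNO <= ret < 0:
        name = errno.errorcode.get(-ret)
        if name is not None:
            return "-" + name
    return str(ret)


def array_length_from_retval(ret):
    """Length to use for zero(retval): an error return yields an empty array."""
    return 0 if ret < 0 else ret
```

+
+ With this, a failed read would display its buffer as an address or
+ empty array instead of tripping the negative-length check.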
+
+*** errno tracking
+ Some calls result in setting errno. Somehow mark those, and on
+ failure, show errno. System calls return errno as a negative
+ value (see the previous point).
+
+*** second conversions?
+ This definitely calls for some general scripting. The goal is to
+ have seconds in adjtimex calls show as e.g. 10s, 1m15s or some
+ such.
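+
+ The formatting itself is simple; a minimal sketch follows. The
+ function name is hypothetical, and the hour unit is an assumption
+ beyond the 10s and 1m15s examples above.
+

```python
def format_seconds(total):
    """Format a non-negative whole number of seconds, e.g. 10s or 1m15s."""
    if total < 0:
        raise ValueError("negative durations are not handled in this sketch")
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds or not parts:      # always print something, even for zero
        parts.append(f"{seconds}s")
    return "".join(parts)
```

+
+ The hard part is not this function but deciding where in the config
+ syntax such a conversion would be attached to a value.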
+
+*** format should take arguments like string does
+ Format should take a value argument describing the value that should
+ be analyzed. The following rewriting rules would then apply:
+
+ | format | format(array(char, zero)*) |
+ | format(LENS) | X=LENS, format[X] |
+
+ The latter expanded form would be canonical.
+
+ This depends on named arguments and parameter pack improvements
+ (we need to be able to construct parameter packs that expand to
+ nothing).
+
+*** More fine-tuned control of right arguments
+ Combination of named arguments and some extensions could take care
+ of that:
+
+ | void func(X=hide(int*), long*, +pack(X)); |
+
+ This would show long* as input argument (i.e. the function could
+ mangle it), and later show the pre-fetched X. The "pack" syntax is
+ utterly undeveloped as of now. The general idea is to produce
+ arguments that expand to some mix of types and values. But maybe
+ all we need is something like
+
+ | void func(out int*, long*); |
+
+ ltrace would know that out/inout/in arguments are given in the
+ right order, but left pass should display in and inout arguments
+ only, and right pass then out and inout. + would be
+ backward-compatible syntactic sugar, expanded like so:
+
+ | void func(int*, int*, +long*, long*); |
+ | void func(in int*, in int*, out long*, out long*); |
+
+ This is useful in particular for:
+
+ | ulong mbsrtowcs(+string(array(uint, zero(arg3))), string*, ulong, addr); |
+
+ Where we would like to render arg2 on the way in, and arg1 on the
+ way out.
+
+ But sometimes we may want to see a different type on the way in and
+ on the way out. E.g. in asprintf, what's interesting on the way in
+ is the address, but on the way out we want to see buffer contents.
+ Does something like the following make sense?
+
+ | void func(X=void*, long*, out string(X)); |
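+
+ The in/out/inout idea and the expansion of the + sugar can be
+ sketched as follows. Parameters are modeled as plain strings and
+ the helper names are hypothetical; type changes between the two
+ passes (the asprintf case) are not covered.
+

```python
def expand_plus(params):
    """Expand the '+' sugar: parameters from the first '+' onward become 'out'."""
    direction = "in"
    expanded = []
    for param in params:
        if param.startswith("+"):
            direction = "out"
            param = param[1:]
        expanded.append((direction, param))
    return expanded


def left_pass(expanded):
    """Types shown on function entry: in and inout arguments only."""
    return [ptype for direction, ptype in expanded if direction in ("in", "inout")]


def right_pass(expanded):
    """Types shown on function return: out and inout arguments only."""
    return [ptype for direction, ptype in expanded if direction in ("out", "inout")]
```

+
+ For void func(out int*, long*) the left pass would show only long*
+ and the right pass only int*, even though int* comes first in the
+ argument list.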
+
+** Support for functions that never return
+ This would be useful for __cxa_throw, presumably also for longjmp
+ (do we handle that at all?) and perhaps a handful of others.
+
+** Support flag fields
+ enum-like syntax, except disjunction of several values is assumed.
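+
+ A minimal sketch of how such a value could be rendered. O_RDONLY
+ and O_CREAT use the values from the open example above; O_WRONLY=01
+ is the Linux value and is added here for illustration. The function
+ name and the octal rendering of unknown bits are assumptions.
+

```python
OPEN_FLAGS = {"O_RDONLY": 0o0, "O_WRONLY": 0o1, "O_CREAT": 0o100}


def format_flags(value, names):
    """Render value as a '|'-joined disjunction of known flag names."""
    parts = []
    rest = value
    for name, bit in names.items():
        if bit != 0 and value & bit == bit:
            parts.append(name)
            rest &= ~bit
    if rest:
        parts.append(oct(rest))       # leftover bits with no known name
    if not parts:
        # Nothing set: use the name of a zero-valued flag if there is one.
        zero_names = [name for name, bit in names.items() if bit == 0]
        return zero_names[0] if zero_names else "0"
    return "|".join(parts)
```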
+** Support long long
+ We currently can't define time_t on 32-bit machines. That means we
+ can't describe a range of time-related functions.
+
+** Support signed char, unsigned char, char
+ Also, don't format it as a character by default; the string lens can
+ do that. Perhaps introduce byte and ubyte and leave 'char' as an alias
+ of one of those with the string lens applied by default.
+
+** Support fixed-width types
+ Really we should keep everything as {u,}int{8,16,32,64} internally,
+ and have long, short and others be translated to one of those
+ according to architecture rules. Maybe this could be achieved by a
+ per-arch config file with typedefs such as:
+
+ | typedef ulong = uint64_t; |
+
+** Support for ARM/AARCH64 types
+ - ARM and AARCH64 both support half-precision floating point
+ - there are two different half-precision formats, IEEE 754-2008
+ and "alternative". Both have 10 bits of mantissa and 5 bits of
+ exponent, and differ only in how exponent==0x1F is handled. In
+ IEEE format, we get NaNs and infinities; in alternative
+ format, this encodes the normalized value (-1)^S × 2¹⁶ × (1.mant)
+ - The FPCR.AHP bit of the Floating-Point Control Register (FPCR)
+ selects the half-precision format where applicable.
+ - AARCH64 supports fixed-point interpretation of {,double}words
+ - e.g. fixed(int, X) (int interpreted as a fixed-point number with X
+ binary digits of fraction).
+ - AARCH64 supports 128-bit quad words in SIMD
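+
+ The two half-precision formats and the fixed-point interpretation
+ can be sketched as below. The function names and the ahp parameter
+ (standing in for FPCR.AHP) are hypothetical; a formatter would have
+ to obtain the real bit from the traced process.
+

```python
import math


def decode_half(bits, ahp=False):
    """Decode a 16-bit half-precision pattern to a Python float.

    ahp=False selects IEEE 754-2008 binary16; ahp=True selects the ARM
    "alternative" format, where exponent 0x1F is an ordinary exponent.
    """
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    exp = (bits >> 10) & 0x1F
    mant = bits & 0x3FF
    if exp == 0:                           # zero and subnormals
        return sign * mant * 2.0 ** -24
    if exp == 0x1F and not ahp:            # IEEE only: infinities and NaNs
        return sign * math.inf if mant == 0 else math.nan
    # Normal numbers; with ahp and exp == 0x1F this is 2**16 * (1.mant).
    return sign * 2.0 ** (exp - 15) * (1 + mant / 1024)


def decode_fixed(raw, frac_bits):
    """Interpret the integer raw as fixed point with frac_bits fraction bits."""
    return raw / 2 ** frac_bits
```

+
+ The same bit pattern 0x7C00 is an infinity in IEEE format and 65536
+ in alternative format, so the format bit has to be known before a
+ half-precision argument can be displayed.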
+
+** Some more functions in vect might be made to take const*
+ Or even marked __attribute__((pure)).
+
+** pretty printer support
+ GDB supports Python pretty printers. We might want to hook this in
+ and use it to format certain types.
+
+* BUGS
+** After a clone(), syscalls may be seen as sysrets in s390 (see trace.c:syscall_p())