Development hints & tips
These tips mostly revolve around Genode's syslog-style "LOG" as it is so crucial in debugging.
Tracing the build system
As mentioned in the official documentation, the Makefile system supports tracing flags that may be passed to gmake, such as "VERBOSE=". Here's the whole complement of flags in one convenient line:
make -C build/x86_64/ VERBOSE= VERBOSE_MK= VERBOSE_DIR= MAKEFLAGS=--trace <target name goes here>
Obtaining the LOG contents
- qemu: use the command line argument "-serial mon:stdio". LOG contents will be displayed in the console from which you launched qemu. (As a side note, qemu also has its own tracing, for low-level hardware emulation debugging.) This is the most convenient method; however, not all run scenarios can be launched in qemu, as some need "bare metal", hence the next items.
- VirtualBox: enable COM (serial) output and capture (?)
- serial port (uart): xx
- UDP/IP: if you're not set up for serial port tracing, it is possible to forward LOG messages on the network as UDP packets; see this and this
- Terminal: if all else fails, and your system boots far enough to run nitpicker and friends, you may redirect LOG to a (genode terminal) window on the nitpicker desktop. This won't be as complete as the previous methods, as it won't capture the output of init, the microkernel, etc.
To elaborate on that last one: one may retrieve part of the LOG, namely the lines produced by a component or set of components. This can be mighty useful, if not as complete as getting the whole thing. Here it is, in Jam syntax (TODO: express in make/expect/tcl syntax too):
1) make the component you want to trace/debug redirect its output to a component called terminal_log:
<route> <service name=\"LOG\"> <child name=\"terminal_log\"/> </service> </route>
2) launch terminal_log as a "bridge" to a terminal, the terminal itself, and a terminal 'framebuffer' to display terminal contents on screen, and supporting fonts and libs:
AddRawComponent tts-dev.run : VeraMono.ttf : VeraMono.ttf ;
AddRawComponent tts-dev.run : vfs_ttf.lib.so : vfs_ttf.lib.so ;
AddComponentAsStart tts-dev.run : 4M : nit_fb : "name=\"terminal_fb\" caps=\"120\"" : "
    <binary name=\"nit_fb\"/>
    <provides> <service name=\"Framebuffer\"/> <service name=\"Input\"/> </provides>
    <config xpos=\"0\" ypos=\"10\" width=\"640\" height=\"480\" refresh_rate=\"25\"/>
    " ;
AddComponentService tts-dev.run : 1M : terminal_log : LOG ;
AddComponentService tts-dev.run : 3M : terminal : Terminal :
    <config>
        <vfs>
            <rom name=\"VeraMono.ttf\"/>
            <dir name=\"fonts\"> <dir name=\"monospace\">
                <ttf name=\"regular\" path=\"/VeraMono.ttf\" size_px=\"16\"/>
            </dir> </dir>
        </vfs>
    </config>
    <route>
        <service name=\"Input\"> <child name=\"terminal_fb\"/> </service>
        <service name=\"Framebuffer\"> <child name=\"terminal_fb\"/> </service>
        <any-service> <parent/> <any-child/></any-service>
    </route>
    ;
Memory leaks, corruption, etc.
One way to diagnose memory heap problems is to use a "guarded heap". It seems there is no guarded heap out of the box in Genode, but it should be possible to implement at least a "tracing heap" by following the lead of base/src/test/main.cc's "struct allocator".
TODO: also look into whether the "sliced heap" etc. could help.
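To give an idea of what a "tracing heap" could look like, here is a minimal host-side sketch in plain C++. It is not Genode's allocator interface and is not taken from base/src/test/main.cc; the class name Tracing_heap and its methods are made up for illustration. It wraps malloc/free, logs each call, and remembers the outstanding blocks so that leaks and bogus frees show up.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <map>

// Hypothetical illustration only: NOT Genode's Allocator interface.
class Tracing_heap
{
    private:

        // Outstanding blocks, keyed by address, with their sizes
        std::map<void *, std::size_t> _live;

    public:

        // Allocate 'size' bytes, log the call, and remember the block
        void *alloc(std::size_t size)
        {
            void *ptr = std::malloc(size);
            if (ptr) {
                _live[ptr] = size;
                std::printf("alloc %zu bytes -> %p\n", size, ptr);
            }
            return ptr;
        }

        // Release a block; returns false (and frees nothing) if 'ptr'
        // is not a block currently handed out by this heap
        bool release(void *ptr)
        {
            std::map<void *, std::size_t>::iterator it = _live.find(ptr);
            if (it == _live.end()) {
                std::printf("bogus release of %p\n", ptr);
                return false;
            }
            std::printf("release %zu bytes at %p\n", it->second, ptr);
            _live.erase(it);
            std::free(ptr);
            return true;
        }

        // Number of blocks not yet released (non-zero at exit = leak)
        std::size_t live_blocks() const { return _live.size(); }

        // Total size of the blocks not yet released
        std::size_t live_bytes() const
        {
            std::size_t sum = 0;
            for (std::map<void *, std::size_t>::const_iterator it = _live.begin();
                 it != _live.end(); ++it)
                sum += it->second;
            return sum;
        }
};
```

Checking live_blocks() when a component is supposed to have released everything gives a crude leak report; a real Genode version would presumably log via Genode::log() and implement the framework's own allocator interface instead.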
Obtaining and using backtraces
Note: there are many extra tips and a "hands on" case study over at "Pine fun - How did we come here?". It first reminds the reader of tracing basics to get off to a gentle start (always nice for shy readers ;) ):
Genode::log(__FILE__, ":", __LINE__);
And then gets to the nuts and bolts:
- Genode::raw() is available as an alternative to Genode::log() if the latter fails in extreme cases (e.g. if you're bootstrapping Genode on newly supported hardware)
- how to instrument the immediate parent call (first backtrace item) by adding Genode::log("called from ", __builtin_return_address(0));
- how to generate a full backtrace with Genode::backtrace(); and addr2line
Anyway here's my own 'case study' of sorts:
1. Getting a stack crawl/backtrace in case of e.g. a crash
Say you have a component issuing a Genode::error. For instance, audio_drv logs "Error: slab too large...". Grepping the source, you find where that error is produced, but don't know which part of audio_drv triggered it.
You could retrieve the return address, which points into the function immediately above the currently executed one:
- add this at the "leaf" call, which in this case is the memory slab:
- Genode::log( __builtin_return_address(0) );
- => that will log: 0x1034c6d ; that is the return address, i.e. the address of the instruction that follows the call inside the function which called the (unsuccessful) slab method. Now we have to find out which function contains that address.
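The same builtin can be tried out on a regular host with GCC or Clang before instrumenting a Genode component. The sketch below is an illustration only: report_caller() is a made-up name, and printf stands in for Genode::log(). The noinline attribute matters, since an inlined function has no call instruction and hence no return address of its own.

```cpp
#include <cstdio>

// Hypothetical helper: logs and returns the address the current call
// will return to, i.e. a location inside the calling function.
// __builtin_return_address() is a GCC/Clang builtin.
__attribute__((noinline)) void *report_caller()
{
    void *return_addr = __builtin_return_address(0);
    std::printf("called from %p\n", return_addr);
    return return_addr;
}
```

Calling report_caller() from two different places yields two different addresses, one per call site, which is exactly what makes the logged value useful for finding the caller.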
Or you could even get the whole backtrace:
#include <spec/x86_64/os/backtrace.h>
...
void function_where_the_problem_occurs()
{
    ...
    Genode::backtrace();
    ...
}
Before using backtrace(), make sure your code is compiled with the -fno-omit-frame-pointer GCC option, otherwise you'll get an exception like this:
Warning: unresolvable exception 13, pd 'init -> audio_drv', thread 'ep', cpu 0, ip=0x10455bb no signal handler
You may add "CC_OPT += -fno-omit-frame-pointer" to either...
- the target.mk of your project/component, or
- ..build-dir../etc/tools.conf, to affect all components
Next up: what to do with the retrieved address?
2. Mapping the address to source code, function names
See https://github.com/genodelabs/genode/issues/3451#issuecomment-513151548 :
Start by disassembling the concerned program with something like this:
cd (build-dir)/x86_64/debug
genode-x86-objdump -dlC audio_drv
Or as another example:
genode-x86-objdump -dSCl debug/ld-sel4.lib.so | less
Doing so will produce the (huge) disassembly of the binary with detailed offsets, and even source code line numbers.
In this case, the addresses in the disassembly (offsets from the beginning of the driver) are not shifted relative to the addresses seen at run time, so it's just a matter of using the unmodified value: we'll look for a disassembled address that is equal to, or closest to, the 0x1034c6d address mentioned before.
In this case, the lookup shows the closest address is 0x1034c6b, a two-byte call instruction that ends right where the return address 0x1034c6d begins:
1034c6b: ff d0 callq *%rax
Looking a few lines higher up, we see a header that mentions the corresponding source code line:
.....src/lib/audio/dev/pci/azalia_codec.c:1337
Finally, looking up line 1337 of file azalia_codec.c, we see a function called azalia_mixer_ensure_capacity(), which indeed is calling
newbuf = mallocarray(newmax, sizeof(mixer_item_t), M_DEVBUF, M_NOWAIT | M_ZERO);
So that's the one which was ultimately responsible for the failed call to enlarge the slab.
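The manual lookup above boils down to "find the instruction that starts just before the return address". Here is a small host-side sketch of that search, assuming the instruction start addresses have already been extracted from the objdump listing into a sorted vector; the function name call_site() and that extraction step are assumptions for illustration, not part of any Genode tool.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Given the sorted start addresses of disassembled instructions, return the
// start of the instruction immediately preceding 'return_addr', i.e. the
// largest listed address strictly below it. For a return address, that
// instruction is the call we are looking for.
// Returns 0 if no listed address lies below 'return_addr'.
std::uint64_t call_site(std::vector<std::uint64_t> const &sorted_addrs,
                        std::uint64_t return_addr)
{
    // First listed address that is >= return_addr
    std::vector<std::uint64_t>::const_iterator it =
        std::lower_bound(sorted_addrs.begin(), sorted_addrs.end(), return_addr);

    if (it == sorted_addrs.begin())
        return 0;

    return *std::prev(it);
}
```

With a listing that contains 0x1034c6b followed by 0x1034c6d, call_site(addrs, 0x1034c6d) returns 0x1034c6b, matching the callq line found by hand.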
Obtaining miscellaneous info
To retrieve the raw instruction pointer, the stack pointer, or other registers (flags, trapno..), query the state of the current thread; the available fields are declared in e.g. repos/base/include/spec/x86_64/cpu/cpu_state.h :
Cpu_thread_client thread_client(Thread::myself()->cap());
Thread_state state = thread_client.state();
log("my ip=", Hex(state.ip), " sp=", Hex(state.sp), " - done");
That will yield e.g.
my ip=0x837c2 sp=0x401fe4c0 - done
Other uses of objdump
(from the mailing-list):
If a run scenario fails with a message like this:
[init -> test-go] Error: LD: exception during program load: 'Genode::Region_map::Region_conflict'
It means the "loader is not able to load the binary because it cannot fulfill the memory-region requirements of the application". One may investigate the structure of the binary with objdump, e.g.:
genode-x86-objdump -p bin/test-log
This lists needed shared libraries (e.g. "NEEDED ld.lib.so") and at which RAM addresses the binary segments will be loaded (e.g. "LOAD off 0x0000000000001000 vaddr 0x0000000001000000 paddr 0x0000000001000000 align 2**12").
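When comparing such LOAD entries by hand, the question is usually whether two address ranges intersect. The sketch below shows that check in isolation. It is only an aid for reading the objdump output, under the assumption that each segment is described by its vaddr and its memsz (objdump -p prints memsz on the line following the LOAD line); the Segment struct is made up for illustration and says nothing about how Genode's dynamic linker detects the conflict internally.

```cpp
#include <cstdint>

// One LOAD segment as read off the "objdump -p" output (hypothetical type).
struct Segment
{
    std::uint64_t vaddr;  // start of the segment in virtual memory
    std::uint64_t memsz;  // size of the segment in memory, in bytes
};

// True if the half-open ranges [vaddr, vaddr + memsz) of 'a' and 'b' intersect.
bool overlaps(Segment const &a, Segment const &b)
{
    return a.vaddr < b.vaddr + b.memsz
        && b.vaddr < a.vaddr + a.memsz;
}
```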
The GDB debugger
Please refer to User-level debugging on Genode via GDB.
Trouble-shooting
Failure to boot
Problem: execution stops early; LOG shows just the initial listing of components, but nothing more.
=> might be due to an invalid "config", which makes core/init fail silently; this should be caught by xmllint at the image building stage though.
Failure to boot
Problem: LOG mentions something about "session_requests", then execution stops:
[init] Error: ROM-session creation failed (ram_quota=6144, cap_quota=3, label="session_requests")
[init] Error: Could not open ROM session for "session_requests"
[init] Error: Uncaught exception of type 'Genode::Rom_connection::Rom_connection_failed'
[init] Warning: abort called - thread: ep
child "init" exited with exit value 1
==> Occurs if you put e.g. this
<service name="Input">
    <default-policy> <child name="input_filter"/> </default-policy>
</service>
in the wrong place (e.g. in the header of the root init config)
Non-repainting or frozen window, non-responsive program
=> check the LOG: your component might be stuck asking for RAM or capabilities, which seems to freeze most or all entrypoints (ep) and threads. Tweak your config to increase the RAM and/or capability quota. Also check for more 'traditional' causes: deadlock, blocking "syscall", etc.
More:
1) If there are several apps open and none of them reacts to clicks, it's probably the input drivers or input_filter that are at fault, not your app ==> check the LOG for capability requests etc. from them.
2) A blockage on opening a session will typically block only the one thread. Example: the CC main window blocks opening Audio_out on qemu, but the Settings window (provided it was opened/made available prior to doing audio) still reacts to clicks afterwards.