strace -k (built with libunwind)

prints the stack trace with the system calls

Franck Pachot
Jul 5, 2019

PostgreSQL is Open Source, and you may think that there is no need to trace the system calls since we can read the source code. But even then, strace is a really nice tool for troubleshooting a running process.

https://postgreslondon.org/speaker/dmitrii-dolgov/

A little disclaimer here: attaching strace to a running process may cause that process to hang. Do not use it in production, except when this (small) risk is an acceptable price for troubleshooting a critical problem.

At PostgresLondon 2019, while attending Dmitry Dolgov’s session “PostgreSQL at low level: stay curious!”, I learned something about strace: the -k option displays the stack trace. That’s awesome: it can display the full stack of C functions in the software at the moment the system call is executed. It is an easy way to get the userspace context of a system call and to understand the reason for the call.

With Oracle, we are used to seeing the stack trace in dumps, so of course I immediately wanted to test this strace -k:

$ strace -k
strace: invalid option -- 'k'
Try 'strace -h' for more information.

OK, it is not available here (OEL7.6), but man strace gives more information:

-k Print the execution stack trace of the traced processes after each system call (experimental). This option is available only if strace is built with libunwind.

OK, no problem, let’s build strace with libunwind.

This is what I did, as root, to replace my current strace with the new one:

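# build prerequisites (git and make added in case they are not already installed)
sudo yum install -y git make autoconf automake libunwind libunwind-devel gcc
cd /var/tmp
git clone https://github.com/strace/strace.git
cd strace
# generate the configure script, then enable libunwind-based stack traces
./bootstrap
./configure --with-libunwind
make
sudo make install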

The strace man page is also updated:

-k Print the execution stack trace of the traced processes after each system call.
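
Before attaching to anything, it can be worth confirming that the strace found in the PATH really accepts -k. The following is only a minimal sketch: strace_supports_k is a hypothetical helper name, and it detects nothing more than the "invalid option" rejection shown earlier, so a "yes" does not guarantee that symbols will resolve or that tracing is permitted.

```shell
# Report whether a strace binary accepts the -k option.
# $1 is the binary to probe (assumption: defaults to the strace found in PATH).
strace_supports_k() {
  bin=${1:-strace}
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "not installed"
  # Trace a harmless command and discard the trace output (-o /dev/null);
  # a build without libunwind rejects -k with "invalid option".
  elif "$bin" -k -o /dev/null true 2>&1 | grep -q "invalid option"; then
    echo "no"
  else
    echo "yes"
  fi
}

strace_supports_k
```

Passing a path, for example strace_supports_k /usr/bin/strace, probes one specific binary, which helps when the distribution package and the freshly built one are both installed.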

Ready to test. Here is an example on the Oracle Log Writer for writes:

strace -k -e trace=desc -y -p $(pgrep -f ora_lgwr_CDB1A)
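
In this command, -k prints the stack, -e trace=desc restricts tracing to file-descriptor-related system calls, -y prints the path behind each file descriptor, and -p attaches to the given PID. Given the risk mentioned above, a more defensive variant can be sketched as follows. This is an assumption-laden sketch, not what I ran: pick_single_pid is a hypothetical helper that refuses to continue unless pgrep returned exactly one PID, and timeout (from coreutils) is used to bound the time strace stays attached.

```shell
# Read pgrep output on stdin and print the PID only if exactly one was found.
# pick_single_pid is a hypothetical helper name, not part of strace or pgrep.
pick_single_pid() {
  pids=$(cat)
  # Count the lines that consist only of digits.
  count=$(printf '%s\n' "$pids" | awk '/^[0-9]+$/{n++} END{print n+0}')
  if [ "$count" -eq 1 ]; then
    printf '%s\n' "$pids"
  else
    echo "expected exactly one PID, got $count" >&2
    return 1
  fi
}

# Intended usage (commented out so that nothing gets attached by accident):
# pid=$(pgrep -f ora_lgwr_CDB1A | pick_single_pid) &&
#   timeout 10 strace -k -e trace=desc -y -p "$pid"
```

After 10 seconds, timeout sends SIGTERM to strace, which should then detach from the process. The duration is arbitrary and only there to illustrate the idea.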

When I have a stack from the oracle executable, I format it the way we are used to seeing short stacks in Oracle dumps:

strace -k -e trace=desc -y -p $(pgrep -f ora_lgwr_CDB1A) 2>&1 |
awk '/^ > /{gsub(/[()]/," ");sub(/\+0x/,"()&");printf "<-%s",$3;next}{sub(/[(].*/,"()");printf "\n\n%s",$0}'
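
To see what this formatting does without attaching to anything, the awk program can be fed a few canned lines. The sketch below uses the same awk logic, written with an escaped + and explicit printf formats. The input is made up to have the shape of strace -k output: the paths, offsets, addresses and the function names func_a and func_b are placeholders, not real Oracle symbols.

```shell
# Same awk logic as above, wrapped in a function so that it can read any input.
format_stacks() {
  awk '/^ > /{gsub(/[()]/," ");sub(/\+0x/,"()&");printf "<-%s",$3;next}
       {sub(/[(].*/,"()");printf "\n\n%s",$0}'
}

# Made-up sample: one system call line followed by three stack frames.
sample() {
cat <<'EOF'
pwrite64(256</dev/sdb>, "data", 512, 1024) = 512
 > /lib64/libpthread.so.0(pwrite64+0x13) [0xf0f3]
 > /path/to/oracle(func_a+0x2a) [0x1234]
 > /path/to/oracle(func_b+0x1f) [0x5678]
EOF
}

sample | format_stacks
# prints two empty lines, then:
# pwrite64()<-pwrite64()+0x13<-func_a()+0x2a<-func_b()+0x1f
```

Each system call starts a new paragraph, with its arguments dropped, and each stack frame is appended to it as <-function()+offset, which is the short-stack layout found in Oracle dumps.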

and then I can paste it into Frits Hoogland’s http://orafun.info/stack/ to add some annotations about the Oracle C functions:

http://orafun.info/stack/ created by Frits Hoogland with a little help from Kamil Stawiarski

Again, there is always a risk in attaching strace to a running process, and there may be good reasons why Linux distributions do not build it with libunwind. So be careful: use it in labs and non-critical environments, or in critical ones where a blocking issue to troubleshoot justifies the risk.

Update 6-DEC-2019

I mentioned Dmitry Dolgov’s presentation. Here is a detailed article from him on the same subject:

Franck Pachot

Developer Advocate for YugabyteDB (Open-Source, PostgreSQL-compatible Distributed SQL Database). Oracle Certified Master and AWS Data Hero.