Joint stack traces

This document describes the final results of the “Joint Profiling of Native and R Code” project. It describes two new packages, winch, and its helper package, procmaps. After a definition of the problem, the packages’ features are described, with an outline of work still to be done. It concludes with a discussion of technical caveats and a connection to the first part of the project. The appendix is a scratch pad for links and other bits of information that don’t fit elsewhere.

The winch package combines native and R stack traces in a controlled way. For now this is outside of a profiling context. This deviates from the original goal of the project, with good reason: robust combined stack traces are a cornerstone for joint profiling and are useful independent of profiling.

Stack traces

A stack trace is a sequence of code locations that describes the execution path from a program’s main entry point to a specific point of interest. Each entry represents a nested function call.

Use cases include (source):

  1. In a debugger, to show the user the stack.
  2. In a profiler, to take a sample of the stack.
  3. From the program itself (e.g. from a crash handler to show the stack).

R code

In R, stack traces can be obtained via sys.calls():

foo <- function() {
  bar()
}

bar <- function() {
  baz()
}

baz <- function() {
  sys.calls()
}

tail(foo(), 3)
#> [[1]]
#> foo()
#> 
#> [[2]]
#> bar()
#> 
#> [[3]]
#> baz()

Native code cannot easily be deduced from the stack trace. This is seen when replacing function calls with winch::winch_call(), which simulates a round-trip from native code into R by calling a function passed as argument from C code. The example becomes a double round-trip: R → native → R → native → R.

library(winch)

foo <- function() {
  winch_call(bar)
}

bar <- function() {
  winch_call(baz)
}

tail(foo(), 5)
#> [[1]]
#> foo()
#> 
#> [[2]]
#> winch_call(bar)
#> 
#> [[3]]
#> (function() {
#>   winch_call(baz)
#> })()
#> 
#> [[4]]
#> winch_call(baz)
#> 
#> [[5]]
#> (function() {
#>   sys.calls()
#> })()

A bit of guesswork is required here: we can inspect the bodies of the functions in the call stack to check for calls to .Call(), .External() or .External2(). In our example the candidates are located at indexes 2 and 4:

baz <- function() {
  funs <- winch:::sys_functions()
  lapply(funs, body)
}

tail(foo(), 5)
#> [[1]]
#> UseMethod("tail")
#> 
#> [[2]]
#> {
#>     winch_call(bar)
#> }
#> 
#> [[3]]
#> {
#>     .Call(winch_c_call, fun, env)
#> }
#> 
#> [[4]]
#> {
#>     winch_call(baz)
#> }
#> 
#> [[5]]
#> {
#>     .Call(winch_c_call, fun, env)
#> }

It would be much more accurate if the information was available directly in the stack trace. This would require changes to base R.

Native code

R is an interpreted language. It can be expected that it knows how to track the execution state. (It is much more surprising that it allows programs to compute on their own execution state – R has very strong support for reflection indeed.) How do we achieve this for native code? Can an application programmatically generate its own stack trace?

For native code, this is provided by the compiler or by specialized libraries. The main result is a sequence of instruction pointers (IP for short): addresses in the process’s memory from where function calls occurred. In some cases, additional information such as the function name is provided. In general, function name, file + line information and even library name must be deduced separately, using other specialized libraries.

The winch package wraps two libraries (for portability across different OSes) that help obtain IP sequences and function names. This functionality is accessible via winch::winch_trace_back(). For demonstration purposes, this is what the native stack trace looks like for our double round-trip. Calls into native code can be deduced from the pathname column:

baz <- function() {
  winch_trace_back()
}

trace <- foo()
nrow(trace)
#> [1] 197
head(trace, 25)
#>               func               ip
#> 1           bcEval 0000000102933540
#> 2          Rf_eval 0000000102932cc0
#> 3    R_execClosure 000000010294f8c0
#> 4  Rf_applyClosure 000000010294e500
#> 5          Rf_eval 0000000102932cc0
#> 6         do_begin 0000000102952b40
#> 7          Rf_eval 0000000102932cc0
#> 8    R_execClosure 000000010294f8c0
#> 9  Rf_applyClosure 000000010294e500
#> 10         Rf_eval 0000000102932cc0
#> 11      winch_call 00000001034eb640
#> 12     R_doDotCall 0000000102905040
#> 13          bcEval 0000000102933540
#> 14         Rf_eval 0000000102932cc0
#> 15   R_execClosure 000000010294f8c0
#> 16 Rf_applyClosure 000000010294e500
#> 17          bcEval 0000000102933540
#> 18         Rf_eval 0000000102932cc0
#> 19   R_execClosure 000000010294f8c0
#> 20 Rf_applyClosure 000000010294e500
#> 21         Rf_eval 0000000102932cc0
#> 22      winch_call 00000001034eb640
#> 23     R_doDotCall 0000000102905040
#> 24          bcEval 0000000102933540
#> 25         Rf_eval 0000000102932cc0
#>                                                                                                     pathname
#> 1                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 2                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 3                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 4                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 5                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 6                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 7                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 8                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 9                                /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 10                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 11 /private/var/folders/cw/805_99kx0lq50gkr0hr81_y40000gp/T/RtmpuisoOf/Rinste1c15300e219/winch/libs/winch.so
#> 12                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 13                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 14                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 15                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 16                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 17                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 18                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 19                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 20                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 21                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 22 /private/var/folders/cw/805_99kx0lq50gkr0hr81_y40000gp/T/RtmpuisoOf/Rinste1c15300e219/winch/libs/winch.so
#> 23                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 24                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
#> 25                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib
table(trace$pathname)
#> 
#>                                   /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/bin/exec/R 
#>                                                                                                         1 
#>                               /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libR.dylib 
#>                                                                                                       194 
#> /private/var/folders/cw/805_99kx0lq50gkr0hr81_y40000gp/T/RtmpuisoOf/Rinste1c15300e219/winch/libs/winch.so 
#>                                                                                                         2

winch_trace_back() is a low-level function not intended to be called by the user. It is a cornerstone for joint backtraces provided by winch_add_trace_back(), as shown in the next section. For now, winch_trace_back() provides the native function name and the library name on all tested platforms (Linux, macOS, Windows). Information about file + line remains an open issue.

R code + native code

R has always been an interface language, with excellent integration of calls into foreign interfaces. R code often calls into specialized native code for speed or to avoid re-implementation in R. Due to this duality, it is helpful to access both kinds of stack trace: R and native.

These stack traces are not very helpful if taken separately. Native code is hidden from R backtrace, and R code is represented as sequence of function calls in libR.so or equivalent. Fusing these stack traces allows representing the chain of function calls from both domains when R code calls into native code (which then might call back into R).

The combination of R and native stack traces in winch relies on two heuristics:

  1. Calls from R into native code can be detected by searching for special function names in the stack trace: .Call(), .External() or .External2().
  2. In the native stack trace, interpreted R code can be distinguished from user-provided native code by checking the name of the shared library for each IP: R code corresponds to IPs pointing to libR.so.

For our example, a simplified view on the R and native stack traces, side by side, looks like this:

# R                  # Native
baz()                libR.so::... (repeated)
.Call(...)    <-->   winch.so::winch_call
bar()                libR.so::... (repeated)
.Call(...)    <-->   winch.so::winch_call
foo()                libR.so::... (repeated)

Combining stack traces now means pairing corresponding .Call() entries with native traces from other modules than libR.so, denoted by <--> in the above example.

The winch package implements this on top of rlang::trace_back(), a wrapper around sys.calls() that displays as a tree and creates better output in the presence of lazy evaluation.

baz <- function() {
  rlang::trace_back()
}

foo()
    █
 1. └─global::foo()
 2.   ├─winch::winch_call(bar)
 3.   └─(function () ... R/call.R:4:2
 4.     ├─winch::winch_call(baz)
 5.     └─(function () ... R/call.R:4:2

The function winch::winch_add_trace_back() enriches an rlang stack trace with relevant parts of the native stack trace, by pairing .Call() and similar entries with native code from outside of libR.so, as described above.

baz <- function() {
  trace <- rlang::trace_back()
  winch_add_trace_back(trace)
}

foo()
    █
 1. └─global::foo()
 2.   └─winch::winch_call(bar)
 3.     └─`/winch.so`::winch_call()
 4.       └─(function () ... R/call.R:4:2
 5.         └─winch::winch_call(baz)
 6.           └─`/winch.so`::winch_call()
 7.             └─(function () ... R/call.R:4:2

Stack traces must be collected at the time that the error occurs, this includes calls to rlang::abort() which already collect a stack trace. The integration is a small change to rlang that calls winch_add_trace_back() during collection if winch is installed and the rlang_trace_use_winch option is identical to 1L. With this option set, manual addition of the native stack trace is no longer necessary.

options(rlang_trace_use_winch = TRUE)

baz <- function() {
  rlang::trace_back()
}

foo()
    █
 1. └─global::foo()
 2.   └─winch::winch_call(bar)
 3.     └─`/winch.so`::winch_call()
 4.       └─(function () ... R/call.R:4:2
 5.         └─winch::winch_call(baz)
 6.           └─`/winch.so`::winch_call()
 7.             └─(function () ... R/call.R:4:2

The vctrs package uses C for performance-critical parts, and often calls back into R for e.g. S3 dispatch or error reporting. The example below triggers an error and native code calls into R to throw the error.

options(
  error = rlang::entrace,
  rlang_backtrace_on_error = "full",
  rlang_trace_use_winch = TRUE
)

vctrs::vec_as_location(quote, 2)
Error: Must subset elements with a valid subscript vector.
✖ Subscript has the wrong type `function`.
ℹ It must be logical, numeric, or character.
Backtrace:
    █
 1. └─vctrs::vec_as_location(quote, 2)
 2.   └─`/vctrs.so`::vctrs_as_location()
 3.     └─`/vctrs.so`::vec_as_location_opts()

A bare-bones solution that enhances sys.calls() and sys.frames() and is usable outside of rlang remains an open issue.

Limitations on Windows

Building an R package on Linux and Windows seems to include the necessary debugging information that helps libunwind and libbacktrace. On Windows, it seems necessary to set DEBUG=true, e.g. in ~/.R/Makevars. This adds -gdwarf-2 to the compiler flags.

Also, Windows users currently need winch::winch_init_library() to trace errors in their library. This is a limitation of the upstream library.

On Windows, 64-bit R is required.

Next steps

The current implementation is useful for displaying combined stack traces for debugging. Testing in practice is likely to discover problems or corner cases that were missed during initial testing.

A few things need to be done for fully portable joint profiling based on the new infrastructure:

For errors thrown from Rcpp code, the stack trace ends at the auto-generated stub, because this is the code that catches the C++ exception. It would be useful if stack trace information could be collected at the point of error and provided as part of the error condition.

Technical details

Stack unwinding, or retrieving the sequence of IPs for the stack trace, is dependent on the compiler, and also on the executable format used. As the blog post already linked above puts it,

stack unwinding … [is an] esoteric topic [that] lies at the intersection of compilers, linkers, loaders, debuggers, ABIs, and language runtimes. There is precious little documentation about it and, from what I can tell, a small handful of guys who answer all the questions about it.

It’s not surprising that there is no portable “one size fits all” solution. Some but not all options give the function name for each entry, sometimes file + line information is available too. Determining the shared library that provided the code seems to be an entirely separate topic.

Retrieving IPs

The following options were experimented with:

  1. The libc function execinfo() was a first obvious candidate. It works well enough on mainstream Linux, but not on Windows. Also, function names were not discovered reliably, in particular for optimized code.

  2. The libunwind library supports similar functionality and discovers function names more reliably. Unfortunately, it also does not work on Windows. What’s more, there are two implementations: the macOS implementation, part of clang, is a newer rewrite of the original non-GNU implementation that is available on GitHub.

  3. The libbacktrace library, part of GCC and also available standalone, with its own tests, seems to give slightly worse results for the detection of function names. On the up side, it is portable, with support for Windows (MSYS2, on which the most recent rtools is based). It also gives file + line information for each stack frame.

Wrapping the libbacktrace library looks like the best option in terms of portability and features. Unfortunately, it currently fails on macOS. Therefore, a combination of libbacktrace and libunwind is used depending on the platform:

The execinfo() approach is no longer used because libunwind is clearly better.

libbacktrace on the Mac

  • The libbacktrace tests succeed

  • The winch package can be built with libbacktrace enabled

  • Running the winch examples gives a segmentation fault

libbacktrace on Windows

Options not considered

  • boost stacktrace has not been tried yet

  • The stacktrace library is from 2009, was last changed 2013

  • A fully native route has not been considered. Without library support this would amount to rewriting large parts of existing library code.

Retrieving module name

Perhaps not surprising, mapping IPs to module names also requires platform-specific code. On Linux, the mapping is provided through the /proc/self/maps virtual file. Other systems require vastly different approaches. It appears that there is not a library dedicated specifically to this problem.

The gperftools repository contains relevant code targeting all platforms of interest. I extracted this code in the procmaps package, now on CRAN.

A first release, version 0.0.1, was flagged due to a too complex ./configure script. With further simplification it was possible to avoid configuration entirely. It should be possible to rewrite the relevant code in C to provide a standalone solution that is applicable in a wider context.

Retrieving file + line

When file and line are provided for function calls in the stack trace, debugging is even easier. This information is also relevant for profiling, so that the costs of each source line can be estimated.

The libbacktrace library optionally provides file + line information for each entry in the stack trace.

Unfortunately it currently does not work on macOS:

The gperftools library relies on the addr2line executable for retrieving file + line information (source, relevant StackOverflow question).

Once a fully portable solution is found, this information should be provided by winch_add_trace_back().

History: jointprof part 1

The main result of the first part of the project is the jointprof package, which implements a proof of concept for recording native and R stack traces in parallel, exclusively during profiling. It uses the gperftools library to record native stack traces, and daisy chains into the profiling routine provided by R to record R stack traces. A helper package, profile, processes the profiler data created by gperftools.

This approach works on Linux only. On macOS, a patch is required; Windows is completely unsupported. The approach is a bit brittle. Occasionally stack traces could not be matched. Fusion of stack traces for debugging purposes is not possible.

Will Landau contributed the proffer package to the project, a drop-in replacement to profvis that uses pprof to visualize profiles.

Appendix

Scratch pad for information that doesn’t fit elsewhere.

Accompanying work

Technical details