Generating Pinballs with the Logger

Recording a workload’s execution flow is called logging, and the component that performs it is the logger. The logger can begin recording when the application starts or when it receives a start event from the controller.

At the start of the recording, the register state is captured and the logger switches to logging mode. In this mode, every instruction is inspected and injections are emitted to guarantee a deterministic replay. There are two kinds of injections: register injections, which can modify the value of a register, and memory injections, which can modify the contents of a memory region.

Like any other binary instrumentation tool, the logger works only in user mode and has no visibility into ring 0. This means that recording pauses when a system call or exception is executed, and resumes when the kernel returns execution to user mode.

The logger uses the Intel® Pin system call and context change callbacks to handle these cases. It emits injections to the IP (the instruction pointer) to skip the execution of the system call or to jump to the exception/signal handler. The logger also uses shadow memory to detect memory regions that were modified by the kernel or by another external agent, and emits memory injections so that each memory instruction executes at replay time with the same inputs it had at log time.
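The shadow-memory idea described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not PinPlay’s implementation: memory is a Python dictionary, the class and method names (`ShadowLogger`, `on_write`, `on_read`) are hypothetical, and a first-touch read is treated as an injection as well.

```python
class ShadowLogger:
    """Toy model of shadow-memory based injection logging (hypothetical names)."""

    def __init__(self):
        self.shadow = {}       # address -> value the replay would see by itself
        self.injections = []   # (instruction count, address, value) records
        self.icount = 0        # number of recorded user-mode instructions

    def on_write(self, memory, addr, value):
        # A store by the recorded code: replay reproduces it, so no injection.
        self.icount += 1
        memory[addr] = value
        self.shadow[addr] = value

    def on_read(self, memory, addr):
        # Before a load, compare real memory with the shadow copy.
        self.icount += 1
        actual = memory[addr]
        if self.shadow.get(addr) != actual:
            # The value came from outside the recorded flow (for example a
            # system call), so it has to be injected at replay time.
            self.injections.append((self.icount, addr, actual))
            self.shadow[addr] = actual
        return actual


memory = {}
log = ShadowLogger()
log.on_write(memory, 0x1000, 7)   # application store
log.on_read(memory, 0x1000)       # matches the shadow copy, nothing emitted
memory[0x1000] = 99               # simulated write by the kernel
log.on_read(memory, 0x1000)       # mismatch, a memory injection is recorded
print(log.injections)             # [(3, 4096, 99)]
```

Only the value that the recorded code could not have produced itself ends up in the injection list, which is what keeps the replay deterministic without recording every memory access.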

Since a pinball is a collection of files, the user specifies the pinball base name, which is used as the prefix of the pinball files. Intel® SDE can create the directory in which the files are written, so the user does not need to create an empty directory for every pinball.

Examples of generating a `pinball` for single-threaded workloads:

% sde -skx -log -log:basename pinballs/myapp -- myapp

In this example, the entire run of myapp is recorded in the pinballs directory. Each of the pinball files has the prefix ‘myapp’.

You can specify analysis tools in the same run as the recording.

% sde -skx -log -log:basename pinballs/myapp -mix -- myapp

In this example, a mix file will also be generated.

You can specify start and stop events to define the region of interest.

% sde -skx -log -log:basename pinballs/myapp -control start:address:myfunc,stop:icount:10000 -- myapp

In this example, the recording starts at the entry to the function myfunc and stops after 10000 instructions.
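When the same recording is launched repeatedly from a script, the command lines above can be assembled programmatically. The following is a minimal sketch: the helper name `build_log_command` and its parameters are hypothetical, it uses only knobs that appear in the examples above, and it builds the argument list without running anything.

```python
import shlex


def build_log_command(basename, app_argv, chip_knob="-skx",
                      control=None, extra_knobs=()):
    """Return an argv list for an `sde -log` run; nothing is executed here."""
    argv = ["sde", chip_knob, "-log", "-log:basename", basename]
    argv += list(extra_knobs)           # for example ["-mix"] for an analysis tool
    if control:
        argv += ["-control", control]   # region of interest for the controller
    return argv + ["--"] + list(app_argv)


cmd = build_log_command(
    "pinballs/myapp", ["myapp"],
    control="start:address:myfunc,stop:icount:10000",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, assuming `sde` is on the `PATH`.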

Multi-Threading Support

When capturing a multi-threaded workload, it is not enough to record what happens in each thread. You must also capture the order in which threads access the same memory block. Intel® SDE also needs to guarantee that a memory block that was checked against the shadow memory is not changed by another thread between the time the memory is inspected and the time the instruction actually executes.

This means that logging multi-threaded workloads adds significant overhead that is not needed for single-threaded workloads. Therefore, the user must specify that the workload is multi-threaded so that PinPlay runs in multi-threaded mode.

Starting the recording in the middle of the run of a multi-threaded workload requires stopping the application’s threads before the initial architectural state can be captured for all the active threads. Intel® SDE uses the Intel® Pin API to stop all the application’s threads, which causes a small delay between receiving the start event and the actual start of the recording.

The pinball has global files and per-thread files. The per-thread files have the thread ID as part of the file name.

% sde -skx -log -log:basename pinballs/myapp -log:mt -- myapp

In this example, a multi-threaded pinball will be created for myapp.

% sde -skx -log -log:basename pp/myapp -control start:address:myfunc:tid2,stop:icount:10000:tid2 -- myapp

In this example the start event will be fired when myfunc is called in thread number 2, and the stop event will be delivered after thread number 2 executes 10000 instructions. Since the recording is global, all the execution of the other threads will be captured in the pinball.

Focus Thread

Intel® SDE provides an option to record only one thread from the execution of a multi-threaded workload. This is useful when the application uses a symmetric programming model such as OpenMP. Use the knob -log:focus_thread <tid> to select the thread to capture. You must also add the -log:mt knob so that the logger runs in multi-threaded mode.

Intel® SDE uses memory injections to apply memory modifications that were made by threads other than the focus thread and were read by the focus thread.

Sometimes you don’t know which thread will execute the “interesting region”. In that case, use the dynamic focus option to tell the logger to treat the thread that delivered the start event as the focus thread. Use the knob -log:dynamic_focus instead of -log:focus_thread.

% sde -skx -log -log:basename pinballs/myapp -log:mt -log:focus_thread 2 -- myapp

In this example, a single-threaded pinball will be created for thread number 2 of myapp.

Multi Process Mode

Intel® SDE works inside the process and automatically injects itself into child processes. This means that when an application calls fork on Linux or CreateProcess on Windows, Intel® SDE also traces the newly created process. A pinball is still a recording of a single process. Therefore, Intel® SDE creates a subdirectory in the parent process’s pinball directory and generates the pinball files for the child process there.

Intel® SDE has a special mechanism to handle memory accesses between multiple processes (using shared memory).

Logging Multiple Regions

Tracing applications with Intel® SDE often uses the controller to define the region of interest. The controller supports defining multiple regions, for example with the uniform controller or with the repeat modifier.

Normally, the logger traces only the first region that is reached. If the user adds the knob -log:region_id, each region is recorded separately, and its base name gets a suffix of the form <basename>_<id>.

% sde -skx -log -log:basename pinballs/myapp -log:region_id \
  -control start:enter_func:myfunc,stop:exit_func:myfunc,repeat -- myapp

In this example, the logger records each call to the function myfunc until it returns, in a separate region.
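After such a run, the output directory holds files for several regions. The sketch below groups file names by region using the <basename>_<id> form described above. It assumes the id is a decimal number followed by a dot and the rest of the file name; the function name `group_by_region` and the sample file names are hypothetical.

```python
import re
from collections import defaultdict


def group_by_region(file_names, basename):
    """Map region id -> file names, assuming '<basename>_<id>.' prefixes."""
    pattern = re.compile(r"^" + re.escape(basename) + r"_(\d+)\.")
    regions = defaultdict(list)
    for name in file_names:
        match = pattern.match(name)
        if match:                       # names without a region id are skipped
            regions[int(match.group(1))].append(name)
    return dict(regions)


# Illustrative names only; real pinball file names may differ.
names = ["myapp_1.0.reg", "myapp_1.global.log", "myapp_2.0.reg", "notes.txt"]
regions = group_by_region(names, "myapp")
print(sorted(regions))
```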

Note

The controller stop event does not terminate the run. You can use the -early-out knob to tell the controller to terminate the process upon getting the stop event.

Excluding Code

Intel® SDE provides a way to exclude code from the recording. This is useful for skipping code in spin loops or code in certain libraries (such as system libraries). When excluding code, Intel® SDE uses injections to skip the code and memory injections to apply the memory modifications that happened while the code was excluded.

Intel® SDE provides an option to exclude an entire image with the -log:exclude_image knob. It also provides a flexible way to define start and stop conditions for code exclusion: an additional instance of the controller that serves only the code-exclusion module. Specify -log:exclude_code to activate this mode, and use the -log:exclude:control knob to define the start and stop events of this controller.

Similarly, Intel® SDE allows excluding a thread with the -log:exclude_thread knob. In this mode, the thread is not captured in the pinball.

% sde -skx -log -log:basename pinballs/myapp -log:mt -log:exclude_code \
  -control start:address:myfunc,stop:icount:100000000 \
  -log:exclude:control start:enter_func:spin_lock,stop:exit_func:spin_lock,repeat \
  -log:exclude:control start:enter_func:spin_unlock,stop:exit_func:spin_unlock,repeat \
  -log:exclude_thread 0 -- myapp

In this example, thread 0 and the code in the functions spin_lock and spin_unlock are excluded from the pinball.

The Pinball Files

The pinball format is a collection of text files that represent the recorded region. Some files are per thread; in a single-threaded pinball they have .0 appended to the base name.

| Per-process type | Content | Per-thread type | Content |
| --- | --- | --- | --- |
| `.address` | recorded memory regions | `<tid>.dyn_text` | pages loaded dynamically |
| `.cpuid.def` | emulated CPUID definition | `<tid>.race` | metadata for thread synchronization |
| `.global.log` | global data: SDE command, attributes, IPs tables,… | `<tid>.reg` | register injections |
| `.log.txt` | debugging and information messages for the logger | `<tid>.result` | attributes and other information |
| `.replay.txt` | debugging and information messages for the replayer | `<tid>.sel` | memory injections |
| `.procinfo.xml` | symbolic and image information | `<tid>.sync_text` | synchronization of page injections |
| `.text` | static code and data pages loaded at the start | | |
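The file types listed above can be used to sort the contents of a pinball directory into per-process and per-thread files. The sketch below assumes file names of the form `<basename>.<type>` and `<basename>.<tid>.<type>`, inferred from the listing; the function name `classify` is hypothetical, and real pinballs may carry extra suffixes that this sketch does not handle.

```python
# File types taken from the listing above.
PER_PROCESS = {"address", "cpuid.def", "global.log", "log.txt",
               "replay.txt", "procinfo.xml", "text"}
PER_THREAD = {"dyn_text", "race", "reg", "result", "sel", "sync_text"}


def classify(file_name, basename):
    """Return ("process", None, type), ("thread", tid, type), or None."""
    prefix = basename + "."
    if not file_name.startswith(prefix):
        return None                      # not part of this pinball
    rest = file_name[len(prefix):]
    if rest in PER_PROCESS:
        return ("process", None, rest)
    tid, _, kind = rest.partition(".")   # split "<tid>.<type>"
    if tid.isdecimal() and kind in PER_THREAD:
        return ("thread", int(tid), kind)
    return None                          # unknown file type


print(classify("myapp.0.reg", "myapp"))
print(classify("myapp.global.log", "myapp"))
```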

The Logger Knobs

The logger options have the ‘log:’ prefix.

-log

Activate the logger [default 0]

-log:basename

Specify base name for log files [default log]

-log:compressed

Compression method, “none”, “gzip” or “bzip2” [default none]

-log:copy_cpuid_file

Copy the chip CPUID file to the pinball even when attaching to the process [default 0]

-log:dynamic_focus

Log dynamic focus thread according to start event [default 0]

-log:early_detach

Detach after generating the log-files for the first region [default 0]

-log:early_out

Exit after generating the log-files for the first region [default 0]

-log:exclude_code

Enable code exclusion. (Range to exclude to be specified by ‘-log:exclude:…’ knobs) [default 0]

-log:exclude_image

Images to exclude from log [default “”]

-log:exclude_thread

Threads to exclude from relog [default -1]

-log:fat

Shortcut for ‘-log:whole_image’ and ‘-log:pages_early 1’ [default 0]

-log:focus_thread

Specifies which thread id to log (defaults to all) [default -1]

-log:mp_mode

Use atomic instrumentation to support cross-process memory accesses ( values: 0 - Disabled, 1 - Enabled, 2 - to be determined according to other knobs) [default 2]

-log:mt

Trace multi-threaded program [default 0]

-log:optimize_injections

Optimize injections for single threaded relog [default 0]

-log:optimize_rep_injections

Optimize repeat string instructions injections [default 1]

-log:optimize_syscall_mem_injections

Support parsing system call to optimize and generate memory injections [default 1]

-log:page_size

Size of shadow memory page in bytes [default 4096]

-log:pages_early

Log pages of dynamically loaded libraries to the initial memory image [default 1]

-log:pid

Use PID for naming files [default 0]

-log:race_data_size

Size of data per race entry in bytes [default 128]

-log:region_id

Use region number for naming files [default 0]

-log:stop_at_special_inst

Stop logging when hit a special instruction which requires injection [default 0]

-log:stop_at_syscall

Stop log when hitting a system call instruction [default 0]

-log:stop_at_vsyscall

Stop log when hitting code at the VSYSCALL area [default 0]

-log:syminfo

Generate procinfo XML file [default 1]

-log:whole_image

Log all image pages loaded (even if not touched) [default 0]