Generating Pinballs with the Logger
Recording the execution flow of a workload is called logging, and it is performed by the logger. The logger can start recording at the start of the application’s run or upon receiving a start event from the controller.
At the start of the recording, the register state is captured and the logger switches to logging mode. In this mode every instruction is inspected, and injections are emitted to guarantee a deterministic replay. There are two kinds of injections: register injections, which can modify the value of a register, and memory injections, which can modify the value of a memory region.
Like any other binary instrumentation tool, the logger works only in user mode and has no visibility into ring 0. This means that the recording is paused when a system call or exception occurs, and is resumed when the kernel returns execution to user mode.
The logger uses the Intel® Pin system call and context change callbacks to handle these cases, and emits injections to the IP (i.e. the instruction pointer) to skip the execution of the system call or to jump to the exception/signal handler. The logger also uses shadow memory to detect memory regions that were modified by the kernel or by another external agent, and emits memory injections to ensure that a memory instruction executes at replay time with the same inputs it had at log time.
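The shadow-memory check described above can be illustrated with a toy model. This is only a sketch of the idea, not the PinPlay implementation: the class name `ShadowMemoryModel`, its methods, and the use of a Python dictionary as "memory" are all assumptions made for illustration.

```python
class ShadowMemoryModel:
    """Toy model of shadow-memory checking; not the PinPlay implementation."""

    def __init__(self, memory):
        self.memory = memory        # "real" memory: address -> value
        self.shadow = dict(memory)  # logger's copy, taken when recording starts
        self.injections = []        # (address, value) pairs written to the log

    def logged_write(self, address, value):
        # A write by recorded user-mode code is re-executed at replay,
        # so the shadow copy is updated and nothing is logged.
        self.memory[address] = value
        self.shadow[address] = value

    def logged_read(self, address):
        # Before a recorded read, compare real memory against the shadow copy.
        # A mismatch means something outside the recording changed the value,
        # so a memory injection is emitted and the shadow copy is refreshed.
        value = self.memory.get(address, 0)
        if self.shadow.get(address, 0) != value:
            self.injections.append((address, value))
            self.shadow[address] = value
        return value


memory = {0x1000: 1}
model = ShadowMemoryModel(memory)
model.logged_write(0x1000, 5)  # recorded store: no injection needed
model.logged_read(0x1000)      # value matches the shadow copy
memory[0x1000] = 9             # simulated kernel write, invisible to the logger
model.logged_read(0x1000)      # mismatch detected, injection emitted
print(model.injections)        # [(4096, 9)]
```

In this model an injection is emitted only when a recorded read would otherwise see a value that the recorded code did not produce, which is why the first read in the usage above logs nothing.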
Since a pinball is a collection of files, the user specifies the pinball base name, which is used as the prefix of the pinball files. Intel® SDE can create the directory in which the files will be created, so the user does not need to create an empty directory for each pinball.
Examples of generating a pinball for single-threaded workloads:
% sde -skx -log -log:basename pinballs/myapp -- myapp
In this example, the entire run of myapp is recorded in the pinballs directory. Each of the pinball files has the prefix ‘myapp’.
You can run analysis tools in the same run as the recording.
% sde -skx -log -log:basename pinballs/myapp -mix -- myapp
In this example, a mix file will also be generated.
You can specify start and stop events to define the region of interest.
% sde -skx -log -log:basename pinballs/myapp -control start:address:myfunc,stop:icount:10000 -- myapp
In this example, the recording starts at the entry to the function myfunc and stops after 10000 instructions.
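When many workloads are recorded from a script, the command lines above can be assembled programmatically. The following is a minimal sketch that uses only the knobs shown in the examples; the helper name `build_log_command` and its parameters are hypothetical, and it only builds the argument list without running anything.

```python
import shlex


def build_log_command(app_argv, basename, control=None, extra_knobs=(), chip="-skx"):
    """Assemble an sde logging command line as an argument list (hypothetical helper)."""
    cmd = ["sde", chip, "-log", "-log:basename", basename]
    cmd.extend(extra_knobs)           # e.g. ["-mix"] to run an analysis tool as well
    if control:
        cmd += ["-control", control]  # region of interest, e.g. start/stop events
    cmd.append("--")                  # everything after "--" is the application
    cmd.extend(app_argv)
    return cmd


cmd = build_log_command(
    ["myapp"],
    "pinballs/myapp",
    control="start:address:myfunc,stop:icount:10000",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as is, which avoids shell quoting issues in the control string.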
Multi-Threading Support
When capturing a multi-threaded workload, it is not enough to record what happens in each thread. You must also capture the order in which threads access the same memory block. Intel® SDE also needs to guarantee that a memory block that was checked against the shadow memory was not changed by another thread between the time the memory was inspected and the time the instruction actually executed.
This means that logging multi-threaded workloads adds significant overhead that is not needed for single-threaded workloads. Therefore, the user must specify that the workload is multi-threaded to let PinPlay run in multi-threaded mode.
Starting the recording in the middle of the run of a multi-threaded workload requires stopping the application’s threads before the initial architectural state can be captured (for all the active threads). Intel® SDE uses the Intel® Pin API to stop all of the application’s threads, which causes a small delay between receiving the start event and the actual start of the recording.
The pinball has global files and per-thread files. The per-thread files have the thread ID as part of the file name.
% sde -skx -log -log:basename pinballs/myapp -log:mt -- myapp
In this example, a multi-threaded pinball will be created for myapp.
% sde -skx -log -log:basename pp/myapp -control start:address:myfunc:tid2,stop:icount:10000:tid2 -- myapp
In this example, the start event is fired when myfunc is called in thread number 2, and the stop event is delivered after thread number 2 executes 10000 instructions. Since the recording is global, the execution of all the other threads is also captured in the pinball.
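The control strings in these examples follow a regular shape: colon-separated fields per event, an optional `tid<N>` qualifier, and commas between events. A small sketch of composing them is shown below; the function names `control_event` and `control_spec` are hypothetical, and the sketch covers only the forms that appear in this section, not the full controller syntax.

```python
def control_event(kind, trigger, value, tid=None):
    """Build one event such as 'start:address:myfunc:tid2' (hypothetical helper)."""
    parts = [kind, trigger, str(value)]
    if tid is not None:
        parts.append(f"tid{tid}")  # restrict the event to one thread
    return ":".join(parts)


def control_spec(*events, repeat=False):
    """Join events with commas, optionally adding the 'repeat' modifier."""
    items = list(events)
    if repeat:
        items.append("repeat")
    return ",".join(items)


spec = control_spec(
    control_event("start", "address", "myfunc", tid=2),
    control_event("stop", "icount", 10000, tid=2),
)
print(spec)  # start:address:myfunc:tid2,stop:icount:10000:tid2
```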
Focus Thread
Intel® SDE provides an option to record only one thread from the execution of a multi-threaded workload. This is useful when the application uses a symmetric programming model such as OpenMP. Use the knob -log:focus_thread <tid> to select the thread to capture. You must also add the -log:mt knob to make sure that the logger runs in multi-threaded mode.
Intel® SDE uses memory injections to apply the memory modifications that were made by other threads (i.e. threads other than the focus thread) and were read by the focus thread.
Sometimes you do not know which thread will execute the “interesting region”. In that case, use the dynamic focus option to tell the logger to treat the thread that delivered the start event as the focus thread. Use the knob -log:dynamic_focus instead of -log:focus_thread.
% sde -skx -log -log:basename pinballs/myapp -log:mt -log:focus_thread 2 -- myapp
In this example, a single-threaded pinball will be created for thread number 2 of myapp.
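Because focus-thread logging depends on -log:mt, a wrapper script may want to check the knob combination before launching a long run. The sketch below is one way to do that; the function names are hypothetical, the requirement that dynamic focus also needs -log:mt is an assumption drawn from it being a variant of focus-thread logging, and treating hyphenated and underscored spellings alike is a convenience of this sketch rather than a statement about which spellings SDE accepts.

```python
def _knob_name(arg):
    # Normalize "-log:focus-thread" and "-log:focus_thread" to "log:focus_thread".
    if not arg.startswith("-log"):
        return None
    return arg[1:].replace("-", "_")


def focus_problems(sde_args):
    """Return a list of problems with the focus-thread knobs (hypothetical check)."""
    # Only look at knobs before "--"; everything after it belongs to the application.
    if "--" in sde_args:
        sde_args = sde_args[: sde_args.index("--")]
    knobs = {_knob_name(a) for a in sde_args}
    problems = []
    focus = knobs & {"log:focus_thread", "log:dynamic_focus"}
    if focus and "log:mt" not in knobs:
        problems.append("focus-thread logging requires -log:mt")
    if len(focus) == 2:
        # Assumption: dynamic focus is used instead of a fixed focus thread.
        problems.append("use either -log:focus_thread or -log:dynamic_focus, not both")
    return problems


args = "-skx -log -log:basename pinballs/myapp -log:focus_thread 2 -- myapp".split()
print(focus_problems(args))  # ['focus-thread logging requires -log:mt']
```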
Multi Process Mode
Intel® SDE works inside the process and automatically injects itself into child processes. This means that when an application calls fork on Linux or CreateProcess on Windows, Intel® SDE also traces the newly created process. A collected pinball is still a recording of a single process. Therefore, Intel® SDE creates a subdirectory of the parent process’s pinball directory and generates the pinball files for the child process in that subdirectory.
Intel® SDE has a special mechanism to handle memory accesses between multiple processes (using shared memory).
Logging Multiple Regions
When tracing applications with Intel® SDE, the controller is often used to define the region of interest. The controller supports defining multiple regions, for example with the uniform controller or with the repeat modifier.
Normally, the logger traces only the first region that is reached. When the user adds the knob -log:region_id, each region is recorded in a separate pinball whose base name has the form <basename>_<id>.
% sde -skx -log -log:basename pinballs/myapp -log:region_id \
-control start:enter_func:myfunc,stop:exit_func:myfunc,repeat -- myapp
In this example, the logger records each call to the function myfunc until it returns, in a separate region.
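After such a run, the output directory holds one pinball per region. A short sketch of grouping the files by region is given below. It assumes that the region ID is a decimal number and that each file name is the region base name followed by a dot and the usual pinball suffix; both are assumptions about the naming, and `group_by_region` is a hypothetical helper.

```python
import re
from collections import defaultdict


def group_by_region(filenames, basename):
    """Map region id -> list of file suffixes, for names like '<basename>_<id>.<suffix>'."""
    pattern = re.compile(re.escape(basename) + r"_(\d+)\.(.+)$")
    regions = defaultdict(list)
    for name in filenames:
        match = pattern.match(name)
        if match:  # names that do not follow the assumed pattern are ignored
            regions[int(match.group(1))].append(match.group(2))
    return dict(regions)


# Illustrative file names only; a real listing would come from os.listdir().
names = ["myapp_1.address", "myapp_1.0.reg", "myapp_2.address", "notes.txt"]
print(group_by_region(names, "myapp"))
```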
Note
The controller stop event does not terminate the run. You can use the -early-out knob to tell the controller to terminate the process upon getting the stop event.
Excluding Code
Intel® SDE provides a way to exclude code from the recording. This is useful for skipping code in spin loops or code in certain libraries (such as system libraries). When excluding code, Intel® SDE uses injections to skip the code, and memory injections to apply the memory modifications made by the excluded code.
Intel® SDE provides an option to exclude an entire image by using the -log:exclude_image knob. It also provides a flexible way to define the start and stop conditions for code exclusion: Intel® SDE has an additional instance of the controller that serves only the code-exclusion module. Specify -log:exclude_code to activate this mode, and use the -log:exclude:control knob to define the start and stop events of this controller.
Similarly to excluding code, Intel® SDE allows excluding a thread with the -log:exclude_thread knob. In this mode the thread is not captured in the pinball.
% sde -skx -log -log:basename pinballs/myapp -log:mt -log:exclude_code \
-control start:address:myfunc,stop:icount:100000000 \
-log:exclude:control start:enter_func:spin_lock,stop:exit_func:spin_lock,repeat \
-log:exclude:control start:enter_func:spin_unlock,stop:exit_func:spin_unlock,repeat \
-log:exclude_thread 0 -- myapp
In this example, thread 0 and the code in the functions spin_lock and spin_unlock are excluded from the pinball.
The Pinball files
The pinball format is a collection of text files that represent the recorded region. Some files are per thread; in a single-threaded pinball these files get .0 appended to the base name.
| Per-process type | Per-process content | Per-thread type | Per-thread content |
|---|---|---|---|
| .address | recorded memory regions | <tid>.dyn_text | pages loaded dynamically |
| .cpuid.def | emulated CPUID definition | <tid>.race | meta data for thread synchronization |
| .global.log | global data: SDE command, attributes, IPs tables,… | <tid>.reg | register injections |
| .log.txt | debugging and information messages for the logger | <tid>.result | attributes and other information |
| .replay.txt | debugging and information messages for the replayer | <tid>.sel | memory injections |
| .procinfo.xml | symbolic and image information | <tid>.sync_text | synchronization of page injections |
| .text | static code and data pages loaded at the start | | |
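The file types listed above can be used to sort the contents of a pinball directory into per-process and per-thread files. The following is a minimal sketch under the assumption that each file is named `<basename>.<suffix>` or `<basename>.<tid>.<suffix>`; the helper `classify` is hypothetical, and compressed pinballs (which may carry an additional extension) are not handled.

```python
PER_PROCESS = {"address", "cpuid.def", "global.log", "log.txt",
               "replay.txt", "procinfo.xml", "text"}
PER_THREAD = {"dyn_text", "race", "reg", "result", "sel", "sync_text"}


def classify(filename, basename):
    """Return ('process', None, suffix), ('thread', tid, suffix), or None."""
    prefix = basename + "."
    if not filename.startswith(prefix):
        return None
    rest = filename[len(prefix):]
    if rest in PER_PROCESS:
        return ("process", None, rest)
    tid, _, suffix = rest.partition(".")  # per-thread files start with the thread id
    if tid.isdigit() and suffix in PER_THREAD:
        return ("thread", int(tid), suffix)
    return None  # not a file type from the table


print(classify("myapp.global.log", "myapp"))  # ('process', None, 'global.log')
print(classify("myapp.0.reg", "myapp"))       # ('thread', 0, 'reg')
```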
The Logger Knobs
The logger options have the ‘log:’ prefix.
- -log
Activate the logger [default 0]
- -log:basename
Specify base name for log files [default log]
- -log:compressed
Compression method, “none”, “gzip” or “bzip2” [default none]
- -log:copy_cpuid_file
Copy the chip CPUID file to the pinball even when attaching to the process [default 0]
- -log:dynamic_focus
Log dynamic focus thread according to start event [default 0]
- -log:early_detach
Detach after generating the log-files for the first region [default 0]
- -log:early_out
Exit after generating the log-files for the first region [default 0]
- -log:exclude_code
Enable code exclusion. (Range to exclude to be specified by ‘-log:exclude:…’ knobs) [default 0]
- -log:exclude_image
Images to exclude from log [default “”]
- -log:exclude_thread
Threads to exclude from relog [default -1]
- -log:fat
Shortcut for ‘-log:whole_image’ and ‘-log:pages_early 1’ [default 0]
- -log:focus_thread
Specifies which thread id to log (defaults to all) [default -1]
- -log:mp_mode
Use atomic instrumentation to support cross-process memory accesses ( values: 0 - Disabled, 1 - Enabled, 2 - to be determined according to other knobs) [default 2]
- -log:mt
Trace multi-threaded program [default 0]
- -log:optimize_injections
Optimize injections for single threaded relog [default 0]
- -log:optimize_rep_injections
Optimize repeat string instructions injections [default 1]
- -log:optimize_syscall_mem_injections
Support parsing system call to optimize and generate memory injections [default 1]
- -log:page_size
Size of shadow memory page in bytes [default 4096]
- -log:pages_early
Log pages of dynamically loaded libraries to the initial memory image [default 1]
- -log:pid
Use PID for naming files [default 0]
- -log:race_data_size
Size of data per race entry in bytes [default 128]
- -log:region_id
Use region number for naming files [default 0]
- -log:stop_at_special_inst
Stop logging when hit a special instruction which requires injection [default 0]
- -log:stop_at_syscall
Stop log when hitting a system call instruction [default 0]
- -log:stop_at_vsyscall
Stop log when hitting code at the VSYSCALL area [default 0]
- -log:syminfo
Generate procinfo XML file [default 1]
- -log:whole_image
Log all image pages loaded (even if not touched) [default 0]