What is SystemTap
When debugging ordinary programs, the logs printed by the application are usually enough. When they are not, strace, lsof, or perf will usually reveal the performance bottleneck. System programming is different: you cannot print logs freely, and many call stacks live in kernel space, so ordinary debugging tools fall short.
This is where SystemTap comes in handy: it attaches probes to kernel functions, aggregates statistics on kernel-space function calls, and can even intervene in them. Its support for user-space debugging, however, is not as good.
Installation
Local environment: DELL R720, Ubuntu 14.04 3.19.0-25-generic x86_64
On CentOS, yum install is likewise sufficient. In either case, you should also install the missing kernel debug packages by running stap-prep. For example, mine is:
If any other packages are missing, install them directly or download them from the Internet. It is best not to build SystemTap from source: packages that touch the kernel are troublesome, and their versions must match the running kernel, which you can check with uname -r.
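Because the debug packages have to match the running kernel exactly, it can help to print the names to look for before installing anything. A minimal sketch: the -dbgsym suffix follows Ubuntu's debug-symbol package naming and kernel-debuginfo follows CentOS naming; both are assumptions here and may differ on other distributions.

```shell
# Print the running kernel release and the debug package names that
# would have to match it. Nothing is installed by this snippet.
kver="$(uname -r)"
ubuntu_dbgsym="linux-image-${kver}-dbgsym"    # assumed Ubuntu naming
centos_debuginfo="kernel-debuginfo-${kver}"   # assumed CentOS naming

echo "running kernel:  ${kver}"
echo "Ubuntu package:  ${ubuntu_dbgsym}"
echo "CentOS package:  ${centos_debuginfo}"
```

If the installed debug package carries a different version string than the one printed here, stap will generally not be able to resolve kernel symbols.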
1. How to know which process is writing a file under Linux
Suppose a file is modified from time to time. If each write is brief, lsof cannot help, and even running it periodically may miss the write. This is where SystemTap comes into play. The code is as follows.
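A minimal sketch of such a script, modeled on the inodewatch.stp example that ships with SystemTap rather than reproduced from the original listing. The file name inodewatch.stp and the f_path->dentry->d_inode field path are assumptions (the path depends on the kernel version); the three arguments are the major number, minor number, and inode of the file to watch.

```shell
# Write the probe script to a file. The quoted 'EOF' stops the shell from
# expanding $file, $1, $2 and $3, which belong to SystemTap, not the shell.
cat > inodewatch.stp <<'EOF'
probe vfs.write, vfs.read
{
  # Assumed field path for this kernel; adjust if struct file differs.
  dev_nr   = $file->f_path->dentry->d_inode->i_sb->s_dev
  inode_nr = $file->f_path->dentry->d_inode->i_ino

  # $1 = major, $2 = minor, $3 = inode number of the file to watch
  if (dev_nr == MKDEV($1, $2) && inode_nr == $3)
    printf("%s(%d) %s 0x%x/%u\n",
           execname(), pid(), ppfunc(), dev_nr, inode_nr)
}
EOF
```

On SystemTap releases that do not provide ppfunc(), probefunc() is the older equivalent.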
The syntax resembles awk: probe defines a probe and is followed by a probe point, which can be a specific function name and supports * wildcards; the curly braces hold the action that runs when the probe fires.
file is the argument to the functions vfs.read and vfs.write; dev_nr and inode_nr are the device number and inode number derived from the file structure. Because the probe point is a kernel function, all of that function's arguments are available.
execname: name of the program executing vfs.write or vfs.read.
pid: ID of the process executing vfs.write or vfs.read.
ppfunc: name of the function at the probe point. This built-in function may differ between versions.
1.1 Open the terminal and execute dd
Open a terminal, run dd to write data continuously, and check the file's inode number.
Here /dev/sdb1 is the device mounted in the /disk1 directory.
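The values the probe needs can be collected from the shell. A sketch under assumptions: the target path below is a hypothetical stand-in for the file under /disk1, and a short dd run replaces the continuous write.

```shell
# Hypothetical path for illustration; in this walkthrough the file lives
# under /disk1.
target="${TARGET:-./stap_demo.dat}"

# Create the file with a short write of 16 blocks of 4 KiB.
dd if=/dev/zero of="$target" bs=4k count=16 2>/dev/null

# Inode number: the third argument passed to the probe script.
inode="$(ls -i "$target" | awk '{print $1}')"
echo "inode of ${target}: ${inode}"

# Major/minor numbers come from the backing block device node, e.g.
#   ls -l /dev/sdb1    # shows "8, 17" for major 8, minor 17
```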
1.2 Execute stap probe
stap runs the script in five passes: it parses the script, elaborates (resolves) it, translates it to C code, compiles that into a kernel module (.ko file), and finally loads and runs the module. In the output you can see that the dd task is writing the file by calling vfs_write.
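To watch the passes go by, stap can be run with -v, which prints a line per pass. The sketch below only assembles the command line and does not run it; 8 and 17 are the major and minor numbers of /dev/sdb1 used in this walkthrough, while the inode value and the script name inodewatch.stp are placeholders.

```shell
# Arguments for the probe script: major, minor, inode.
major=8
minor=17
inode=12   # hypothetical value: use the number printed by `ls -i`

# -v makes stap report each of its five passes as it goes.
stap_cmd="stap -v inodewatch.stp ${major} ${minor} ${inode}"
echo "run as root: ${stap_cmd}"
```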
2. Using SystemTap to inject delay to simulate IO device jitter
This is an interesting example from Master Ba: SystemTap simulates disk IO jitter, which is worth trying when stress testing storage systems. The principle is simple: sleep for a short time in vfs_write and vfs_read, and the duration can be random. The code is as follows.
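A minimal sketch of such an injection script, written along the lines of the description and not copied from the original. The file name inject.stp, the counter name ka_cnt, and the use of the tapset variable dev are assumptions; udelay comes from SystemTap's guru-mode delay tapset, so the script has to be run with stap -g.

```shell
# The quoted 'EOF' keeps $value, $1, $2 and $3 for SystemTap to interpret.
cat > inject.stp <<'EOF'
global inject, ka_cnt

# Reading the "cnt" procfs file reports how many calls were delayed.
probe procfs("cnt").read {
  $value = sprintf("%d\n", ka_cnt)
}

# Writing "on" or "off" to the "inject" procfs file toggles the delay.
probe procfs("inject").write {
  inject = $value
  printf("inject count %d, inject %s", ka_cnt, inject)
}

# Fires as vfs_read / vfs_write return.
probe vfs.read.return, vfs.write.return {
  # $1 = major, $2 = minor of the target device, $3 = delay in microseconds
  if (dev == MKDEV($1, $2) && inject == "on\n") {
    ka_cnt++
    udelay($3)
  }
}

probe begin {
  println("inject module started")
}
EOF
```

Assuming the module is named ik with -m, it could be started as stap -g -m ik inject.stp 8 17 400, after which the control files appear under /proc/systemtap/ik/.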
The code is a bit long. Start with the probes: probe vfs.read.return, vfs.write.return means the probe code runs just before the function returns. It checks whether dev_nr is the target device and whether inject is turned on; if so, it calls udelay for a short time. The other two probes, procfs("cnt") and procfs("inject"), are triggered when the corresponding files under /proc/systemtap are read or written, and the global variable inject is modified to decide whether IO injection is turned on.
2.1 Executing the code
Running this script may fail with a vfs_lookup_path error, which is annoying; I solved it by updating procfs.c by one version and commenting out the vfs_lookup_path part.
Here 8 and 17 are the disk's major and minor device numbers and 400 is the udelay time in microseconds. At this point the script blocks and IO injection has not started. Open another terminal and run the injection for 30 seconds.
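The 30-second injection can be wrapped in a small helper. This is a sketch: inject_for is a hypothetical name, and the control path in the usage comment assumes the script was loaded with -m ik so that its procfs files live under /proc/systemtap/ik/.

```shell
# inject_for CONTROL_FILE SECONDS
# Writes "on" to the control file, waits, then writes "off" again.
inject_for() {
  ctl="$1"
  secs="$2"
  echo on > "$ctl"
  sleep "$secs"
  echo off > "$ctl"
}

# Example (as root, while the stap script is running):
#   inject_for /proc/systemtap/ik/inject 30
#   cat /proc/systemtap/ik/cnt    # number of delayed calls so far
```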
At this point, you can see that stap has output.
2.2 Testing Disk Performance
Simply use dd to test the effect of IO latency on sequential writes.
Before injecting.
After injection.
You can see that dd performance drops significantly. By adjusting the udelay time you can simulate performance at different latencies. A random delay, or one drawn from a normal distribution, might be more realistic.
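To make the before/after comparison repeatable, the same dd command can be run once with injection off and once with it on. A sketch with assumed values: the target path, block size, and count are placeholders, and conv=fsync is used so that dd's reported time includes flushing the data to disk.

```shell
# Hypothetical target; point it at a file on the device being delayed.
target="${TARGET:-./dd_latency_test.dat}"

# Sequential write of 256 blocks of 4 KiB. dd prints its summary on stderr;
# the last line of that summary carries the elapsed time and throughput.
dd if=/dev/zero of="$target" bs=4k count=256 conv=fsync 2>&1 | tail -n 1
```

Since the delay is added per vfs_write call, a smaller block size means more calls for the same amount of data and should make the slowdown more visible.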
Summary
The official website has many SystemTap examples and introductions; you can also trace the network stack, which is very powerful. You do need some kernel knowledge, or at least need to know where to place the probes. OpenResty uses SystemTap heavily for debugging and is a good reference to learn from.
In addition, installation is a big problem: pay close attention to versions, because a release that is too new may not work. The Ubuntu apt source provides 2.3; I tried to build a newer version from source and ran into many errors.