Linux performance monitoring (an introduction)

1. Classification

Before getting into action, let’s split the “performance” problem into a couple of boxes, as the concept itself is quite general. First, we decide what we want to monitor (an entire system? a particular application?); second, we decide what type of performance monitoring we require (stats collection by the kernel? in-depth analysis?). Based on this classification, we end up with 4 categories, each with its own software selection:

 

              | Stats (Counters) | Tracing / Profiling / Debugging
 -------------+------------------+--------------------------------
 System Wide  |                  |
 Per Process  |                  |

NB:

  • netstat offers much more info beyond statistics on interface / protocol and may also be used to monitor individual connections.

  • dtrace and SystemTap can also trace individual applications.

2. Stats Collection

Where do the stats reported by the programs from the first column come from? “The kernel” is the easy answer, but the data interface must also be named: the /proc file system. For every process in the system, the kernel provides a “stat” interface that displays all the collected data:

# cat /proc/15588/stat
15588 (httpd) S 30837 30837 30837 0 -1 4202816 1845 0 0 0 2 1 0 0 20 0 1 0 381733777 671944704 4446 18446744073709551615 140071612739584 140071613080548 140735088901328 140735088899312 140071585579283 0 0 16781312 201344747 18446744071580587657 0 0 17 0 0 0 0 0 0
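
As a minimal sketch of what monitoring software does with this line, the Python snippet below labels its first 24 fields. The field names follow the proc(5) manual page; the function name parse_pid_stat is made up for this illustration, and SAMPLE is the output above, truncated after the 24th field.

```python
# Names of the first 24 fields of /proc/<pid>/stat, in order, as listed in proc(5).
STAT_FIELDS = [
    "pid", "comm", "state", "ppid", "pgrp", "session", "tty_nr", "tpgid",
    "flags", "minflt", "cminflt", "majflt", "cmajflt", "utime", "stime",
    "cutime", "cstime", "priority", "nice", "num_threads", "itrealvalue",
    "starttime", "vsize", "rss",
]

# The line shown above, truncated after the 24th field (rss).
SAMPLE = ("15588 (httpd) S 30837 30837 30837 0 -1 4202816 1845 0 0 0 2 1 0 0 "
          "20 0 1 0 381733777 671944704 4446")


def parse_pid_stat(line):
    """Label the leading fields of a /proc/<pid>/stat line (hypothetical helper)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = line.partition("(")
    comm, _, tail = rest.rpartition(")")
    values = [left.strip(), comm] + tail.split()
    fields = dict(zip(STAT_FIELDS, values))
    # Every labelled field except comm and state is an integer.
    return {k: (v if k in ("comm", "state") else int(v)) for k, v in fields.items()}


if __name__ == "__main__":
    info = parse_pid_stat(SAMPLE)
    # utime/stime are counted in clock ticks, vsize in bytes, rss in pages.
    for name in ("pid", "comm", "state", "ppid", "utime", "stime", "vsize", "rss"):
        print(name, info[name])
```

To inspect a live process, the same function could be fed the contents of /proc/<pid>/stat (or /proc/self/stat) instead of SAMPLE.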

NB: I agree, this interface is much better suited for being read by software. Nobody can tell which field is which without looking at the documentation. Luckily, there is also a human-readable version:

# cat /proc/15588/status
Name:   httpd
State:  S (sleeping)
Tgid:   15588
Pid:    15588
PPid:   30837
TracerPid:      0
Uid:    48      48      48      48
Gid:    48      48      48      48
Utrace: 0
FDSize: 64
Groups: 48 
VmPeak:   656708 kB
VmSize:   655684 kB
VmLck:         0 kB
VmHWM:     17920 kB
VmRSS:     17752 kB
VmData:    87788 kB
VmStk:       100 kB
VmExe:       336 kB
VmLib:     62692 kB
VmPTE:       872 kB
VmSwap:        0 kB
Threads:        1
SigQ:   0/63581
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000001001000
SigCgt: 000000018c0046eb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ff
Cpus_allowed_list:      0-7
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        166
nonvoluntary_ctxt_switches:     0
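
The “Key: value” layout is also easy to consume from a script. The sketch below turns such output into a dictionary and extracts the memory figures; parse_status and kb are hypothetical helper names, and SAMPLE is just an excerpt of the output above.

```python
# Excerpt of the /proc/15588/status output shown above.
SAMPLE = """\
Name:\thttpd
State:\tS (sleeping)
Pid:\t15588
PPid:\t30837
VmPeak:\t  656708 kB
VmRSS:\t   17752 kB
Threads:\t1
voluntary_ctxt_switches:\t166
"""


def parse_status(text):
    """Map each 'Key: value' line to a dict entry; values stay strings."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip anything that is not a 'Key: value' line
            result[key.strip()] = value.strip()
    return result


def kb(value):
    """Convert a value such as '17752 kB' to an integer number of kB."""
    number, unit = value.split()
    if unit != "kB":
        raise ValueError("unexpected unit: " + unit)
    return int(number)


if __name__ == "__main__":
    status = parse_status(SAMPLE)
    print(status["Name"], status["State"])
    print("resident set:", kb(status["VmRSS"]), "kB")
```

On a Linux machine, passing the contents of /proc/self/status to parse_status would report on the Python interpreter itself.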

There is a system-wide stat, too:

# cat /proc/stat 
cpu  20887396 0 10349385 3015436747 407563 3166 1467937 0 0
cpu0 10678475 0 3883808 362454876 232413 3166 1390223 0 0
cpu1 1661441 0 1011415 378955537 31275 0 11441 0 0
cpu2 746371 0 330726 380344321 23829 0 11456 0 0
cpu3 519138 0 214791 380738302 16767 0 10813 0 0
cpu4 5386158 0 3986695 371475949 54473 0 12596 0 0
cpu5 682554 0 324926 380359994 18308 0 11663 0 0
cpu6 705281 0 343150 380407382 15057 0 10218 0 0
cpu7 507974 0 253871 380700384 15438 0 9525 0 0
intr 5529843084 126 2 0 0 0 0 0 0 1 0 0 0 4 0 0 0 59 0 0 0 0 0 0 26 0 12727 0 0 0 0 0 0 13683346 0 751338256 592123250 2 80 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 5379451608
btime 1451035714
processes 6474048
procs_running 1
procs_blocked 0
softirq 11537062915 0 4052674276 583728909 992215181 16867459 0 13667315 1762309239 2086018 4113514518

There is no human-readable version of the system-wide /proc/stat, which is why one must rely on the software named above.
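
To give an idea of what such software does with these numbers, here is a rough sketch that turns the “cpu” lines into a busy percentage. It assumes the column order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal) and ignores the trailing guest columns, since guest time is already accounted in user time. The function names are made up for this example, and SAMPLE contains only the first lines of the output above.

```python
CPU_COLUMNS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]

# First lines of the /proc/stat output shown above.
SAMPLE = """\
cpu  20887396 0 10349385 3015436747 407563 3166 1467937 0 0
cpu0 10678475 0 3883808 362454876 232413 3166 1390223 0 0
cpu1 1661441 0 1011415 378955537 31275 0 11441 0 0
ctxt 5379451608
"""


def parse_cpu_lines(text):
    """Return {'cpu': {...}, 'cpu0': {...}, ...} with one counter dict per CPU line."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu"):
            values = [int(v) for v in parts[1:1 + len(CPU_COLUMNS)]]
            cpus[parts[0]] = dict(zip(CPU_COLUMNS, values))
    return cpus


def busy_percent(before, after):
    """Share of non-idle time between two samples of the same CPU line."""
    delta = {k: after.get(k, 0) - before.get(k, 0) for k in CPU_COLUMNS}
    total = sum(delta.values())
    if total == 0:
        return 0.0
    idle = delta["idle"] + delta["iowait"]  # iowait is counted as idle here
    return 100.0 * (total - idle) / total


if __name__ == "__main__":
    cpus = parse_cpu_lines(SAMPLE)
    # Comparing against an empty sample gives the average since boot.
    for name, counters in cpus.items():
        print(name, round(busy_percent({}, counters), 2), "% busy since boot")
```

The counters only ever grow, so a current utilisation figure needs two readings of /proc/stat taken a short interval apart, passed as before and after.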

3. Tracing

When it comes to tracing, things get more complicated. First, individual processes can be monitored by making use of signals such as SIGTRAP. This enables tracing with userspace-only tools, but it is not advisable performance-wise: the overall cost of tracing is significant. Still, userspace tools like strace and gdb are best in class for debugging and spotting erratic behaviour of individual programs.

The more advanced tools such as dtrace rely on the kernel to report events related to individual processes (or to all processes). This is sometimes described as “zero cost” tracing, and it can also be used in production environments.

Software like dtrace and SystemTap provides scripting languages for handling the results returned when a “probe” fires (e.g. a certain system call is run or execution reaches a certain line in the monitored program). This allows for things like:

  • Advanced data processing (e.g. counting certain behaviours);

  • Stats aggregation;

  • “Open heart surgery” on programs – intercepting data flows / replacing data that would otherwise trigger some undesired behaviour.
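
The counting and aggregation idea can be illustrated without a tracing framework. The sketch below is not dtrace or SystemTap code: it is a plain Python analogy that counts system calls by name in strace-style text output. The sample lines are invented for the example, and count_syscalls is a hypothetical helper.

```python
import re
from collections import Counter

# A syscall line in strace output starts with the call name followed by '('.
SYSCALL_RE = re.compile(r"^(\w+)\(")

# Invented sample in the style of strace output.
SAMPLE_LINES = [
    'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'read(3, "..."..., 832) = 832',
    'read(3, "", 4096) = 0',
    'close(3) = 0',
    '+++ exited with 0 +++',
]


def count_syscalls(lines):
    """Aggregate events: one counter per system call name."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match:  # lines such as '+++ exited with 0 +++' are skipped
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for name, hits in count_syscalls(SAMPLE_LINES).most_common():
        print(name, hits)
```

A probe-based tool performs this kind of aggregation at the moment the probe fires, so the raw events never have to be written out and parsed afterwards.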

Getting into a bit more detail, tracing can be split into 2 categories: static and dynamic. So-called static tracing looks for invariants such as system calls or well-known points within functionalities. These are well documented and do not change. Dynamic tracing is similar to setting breakpoints on individual lines of code; these breakpoints work for certain versions and may not be applicable to subsequent program versions.

Example (dtrace one-liner that prints a timestamp and the name of every system call made by a certain application):

# dtrace -n 'syscall:::entry /execname == "myapp"/ { printf("%d %s\n", timestamp, probefunc); }'

More dtrace one-liners can be found here: DTrace One Liners.

This was my introduction to the performance monitoring world. Hope you enjoyed the read!
