Note: this text is about Systems Programming on Linux platforms.
This type of interview focuses on finding out what you know about what happens below the “command line” surface. Down there, things get messier: processes are created and terminated, output is collected, system calls are performed. It gets complicated fast for the unaware or the unprepared.
Let’s take, for example, a single command being run:
$ ls
This is a classic interview question, asked for more than 15 years now. I’m not sure anyone still asks it in 2016, but the answer is still interesting. So, what happens when this is run? (No, not the file listing being displayed.)
From the beginning: the shell forks a child process, the child execs the ls binary, ls writes the directory listing to standard output, and the shell waits for the child to terminate and collects its exit status.
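A quick way to watch this happen is to trace the process-management system calls while a shell runs ls. This is a minimal sketch, assuming strace is installed; the trailing no-op ‘:’ keeps the shell from exec’ing ls directly, so the usual fork/exec/wait sequence stays visible:

# trace fork/clone, execve and wait4 while a shell runs ls
$ strace -f -e trace=process sh -c 'ls; :'

In the output you should see a clone() (or fork()) call creating the child, an execve() of the ls binary inside that child, and a wait4() in the parent shell collecting the exit status; the listing itself is simply written by ls to standard output.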
AWS provides a complete monitoring engine called CloudWatch. It works with metrics, including custom, user-provided ones, and it can raise alarms when any such metric crosses a certain threshold. It is the main tool AWS itself offers for performance monitoring tasks.
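As a rough sketch of how that looks in practice (the namespace, metric name, and threshold below are made up for illustration), a custom metric can be published and an alarm defined on it with the AWS CLI:

$ aws cloudwatch put-metric-data --namespace "MyApp" \
      --metric-name RequestLatency --value 42 --unit Milliseconds
$ aws cloudwatch put-metric-alarm --alarm-name myapp-latency-high \
      --namespace "MyApp" --metric-name RequestLatency \
      --statistic Average --period 300 --evaluation-periods 1 \
      --threshold 500 --comparison-operator GreaterThanThreshold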
This text covers a monitoring scenario: deploying an arbitrary application to the “Cloud” and then determining what limits its performance, whether the cause is in the application code itself or in limits enforced by Amazon.
Scenario
Let’s assume that you have just started using Amazon Web Services and are deploying applications on the free tier or on general purpose (T2) instances. You quickly learn that general purpose instances work with “credits” that let them handle short load spikes by bursting, but once these credits are exhausted, instance performance drops back to a baseline. On their own, these details do not tell you much; what you need to know is whether the application can meet the desired service targets while sticking to this setup.
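The credit situation itself is easy to check: T2 instances publish CPUCreditBalance and CPUCreditUsage metrics to CloudWatch. A minimal query with the AWS CLI looks like this (the instance ID and time range are placeholders):

$ aws cloudwatch get-metric-statistics --namespace AWS/EC2 \
      --metric-name CPUCreditBalance \
      --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
      --start-time 2016-05-01T00:00:00Z --end-time 2016-05-01T06:00:00Z \
      --period 300 --statistics Average

An alarm on CPUCreditBalance approaching zero is a cheap early warning that the instance is about to drop back to its baseline performance.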
1. Classification
Before getting into action, let’s split the “performance” problem into a couple of boxes, since the concept itself is quite general. First, decide what we want to monitor (an entire system? a particular application?); second, decide what type of performance monitoring we require (stats collection by the kernel? in-depth analysis?). Based on this classification, we end up with 4 categories, each with its own software selection (a few example tools for each quadrant are sketched after the notes below):
            | Stats (Counters) | Tracing / Profiling / Debugging
------------+------------------+--------------------------------
System Wide |                  |
Per Process |                  |
NB:
- netstat offers much more than interface / protocol statistics and may also be used to monitor individual connections.
- dtrace and SystemTap can also trace individual applications.
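To make the four quadrants more concrete, here are a few representative Linux tools and invocations for each; this is an illustrative selection rather than the exact set the table originally listed, and <pid> stands for the process of interest:

# system-wide stats (counters)
$ vmstat 5
$ iostat -x 5
$ netstat -s
# per-process stats (counters)
$ pidstat -p <pid> 5
# system-wide tracing / profiling
$ perf top
# per-process tracing / profiling / debugging
$ strace -c -p <pid>
$ perf record -p <pid>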