The long interview day is nearing its end. Googamazbook got the best and the worst out of you (well, neither of those, but I’m trying to put some literature in here); the last interviewer comes in, smiles condescendingly and greets you with:
Time for the easy interview, heh?
Yes, you have all the reasons to be concerned and feel you’re just one step away from failure (yes, why didn’t you spend the day by the pool in the basement of the many stars hotel they got you a room in for the interview?). But without further ado, the questions start pouring in:
How do you figure out if a process is CPU bound or I/O bound?
Tricky! Let’s not jump to conclusions. There are 2 variables here, which means we have 4 possibilities:
The process is CPU bound;
The process is I/O bound;
The process is both CPU and I/O bound;
The process does not experience such bounds.
In the real world, most processes fall into the last category and spend almost all of their life in the S state (interruptible sleep).
We have to add another detail here: a process does not usually execute a single operation or the same single set of operations and then exit. Throughout its lifespan, a process can have times when it is CPU bound and moments when it is I/O bound, even if these moments account for a very small percentage of the execution time. Using tools like top or ps to visually identify any longer period the process spends in states R (running) or D (uninterruptible sleep, usually I/O wait) may not bring any tangible results.
The conclusion is, if we limit ourselves to considering the process to be a black box, there is no way to answer the interviewer’s question. If we have source code access, then yes, we can identify blocks of code or put in some tracing that may help bring us to some answer, but otherwise we can only have answers along the lines of:
The process is using CPU in a way that could probably benefit from some more computing power for at least some operations;
The process reads or writes large amounts of data and could benefit from faster storage.
The trick is how to determine whether either of the 2 statements above is true for an arbitrary, “black box” process; it’s not hard, but one needs to look deep under the operating system’s hood:
High CPU usage can be spotted by looking in /proc/pid/sched; the attributes to check are nr_involuntary_switches and nr_voluntary_switches. If the involuntary switch count is higher than the voluntary one, the kernel has often forcibly taken the process off the CPU, which hints at a CPU-intensive process.
High I/O is also visible in /proc/pid/sched, in the se.statistics.iowait_sum attribute (printed only when scheduler statistics are enabled in the kernel). It is an absolute figure, so it needs to be correlated with the process running time and compared with other processes. For I/O, the best way may still be monitoring the process with ps or top.
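As a rough illustration of the context switch comparison above, here is a minimal sketch that reads the two counters from a sched file and compares them. The function name switch_profile is made up for this example, and it assumes the "name : value" layout of /proc/pid/sched; take the verdict as a hint, not as a measurement.

```shell
# Hypothetical helper: compare involuntary vs. voluntary context switches.
# Usage: switch_profile /proc/<pid>/sched
switch_profile() {
    awk -F: '
        # Strip the padding around the counter name and its value.
        { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
        $1 == "nr_voluntary_switches"   { vol = $2 + 0 }
        $1 == "nr_involuntary_switches" { invol = $2 + 0 }
        END {
            printf "voluntary=%d involuntary=%d\n", vol, invol
            if (invol > vol)
                print "hint: often preempted, possibly CPU intensive"
            else
                print "hint: mostly yields on its own (sleep or I/O)"
        }
    ' "$1"
}
```

Something like switch_profile /proc/1234/sched (1234 being a placeholder PID) would print the two counters and the hint. Sampling it a few times while the process is busy probably tells more than a single reading, since the counters are cumulative.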
Done? Yes, on to the next.
How do you debug an unresponsive service?
Too general. One must ask which service it is and whether the node is still accessible through some other means (e.g. ssh). Usually the scenario implies a single service is unusable while the node is still up and accessible; otherwise, what fun is an answer like “reboot the node”? The bar can’t be that low for such an interview.
The scenario usually goes like this:
What service is it? HTTP
Is it accessed by IP or with name resolution? DNS
Does DNS work, e.g. resolves the name of the service to an IP? yes
Is that the IP of the node with the unresponsive service? yes
On the node itself, does the service listen on the expected port? yes
Is there any firewall filtering access on that port? no
At this point it’s clear that all the trivial solutions are gone and the big guns need to be brought in. The scenario continues:
Is the service up & running? yes
How many processes for the service are there? about 50
(Getting smart) What’s the process limit for the service? around that figure
(Eureka moment) Oh, the service is overloaded, let’s increase the limits and do a graceful restart! yep, after a 2x increase the service is still unresponsive
You start to smell the scent of failure, like many before. Too bad, because one could, say, look at individual processes and see where they’re blocked.
Let’s do a strace -p on individual processes and see what they do. They’re blocked in a read() call from some file descriptor
(We’re past the mountain’s peak) Let’s see what an ls -la /proc/pid/fd/xx has for us. It’s a file in /www/service
Oh, how’s that mounted? an NFS directory
Is that server accessible, e.g. ping? yes
Let’s go to that node and do some top, ps… top reports 99% wa
(Yep, cause identified, now for the fix) Let’s do a ps and look for processes in the D state. /usr/local/bin/maintenance.sh is in the D state
What does it do? Should prune some logs but due to a bug it keeps on repeating the process
kill -9? and take it out of cron, chmod a-x…? This solved the issue
Ugh! It was crazy, wasn’t it? This, while completely made up by me, is a plausible scenario for such an interview. One needs to remember a few pointers:
The solution is never trivial (no client error, no external service error);
The problem is not immediately visible with the usual tools (e.g. ps, top, df);
The cause is almost never located on the node where the problem occurs, and the relationship between the nodes is anything but easy to figure out;
System call tracing is almost always required to figure out such a relationship.
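The tracing part of the scenario can be sketched roughly as below. The d_state_only helper is a name I made up; it only filters ps output for processes in the D state. The commented commands use placeholder values for the PID and the file descriptor number, so adapt them before running.

```shell
# Hypothetical helper: keep only the processes in uninterruptible sleep (D).
# Reads "PID STAT COMMAND" lines, e.g. from: ps -eo pid=,stat=,args=
d_state_only() {
    awk '$2 ~ /^D/'
}

# On the NFS server from the scenario:
#   ps -eo pid=,stat=,args= | d_state_only
#
# On the web node, for one stuck worker (pid and fd are placeholders):
#   strace -p "$pid"             # shows the blocked read() and its fd number
#   ls -la "/proc/$pid/fd/$fd"   # shows which file that descriptor points to
#   df /www/service              # shows the filesystem behind that path
```

The filter matches any state string starting with D, so variants such as D+ or Dl are caught too.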
On to the next question.
How would you deal with a “disk full” situation?
First things first, there are 2 resource types that, when exhausted, cause the “disk full” situation:
Storage space (data blocks);
Inodes.
This means that diagnosing such a problem requires 2 commands:
# df -h
# df -i
The approach to “disk full” differs by cause: running out of storage space can usually be attributed to one or a few very large files, while running out of inodes is caused by having lots of small files.
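To check both resources in one go, a small filter over df output may help. This is only a sketch: over_limit is a hypothetical name, the 90 percent threshold is arbitrary, and the parsing assumes the six-column layout of df -P with no spaces in mount points.

```shell
# Hypothetical helper: print mount points whose usage is at or above a limit.
# Reads `df -P` or `df -P -i` output on stdin; $1 is the limit in percent.
over_limit() {
    awk -v limit="$1" '
        NR > 1 {                          # skip the header line
            use = $5                      # 5th column: space or inode usage
            sub(/%/, "", use)
            if (use + 0 >= limit + 0) print $6, $5
        }
    '
}

# Storage space, then inodes:
#   df -P    | over_limit 90
#   df -P -i | over_limit 90
```

A filesystem that shows up only in the second run is the “lots of small files” case.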
How to find the largest files on the file system? With find:
# find / -type f -size +10M -printf '%k\t%p\n'
The command above prints the size on disk (kbytes) and the file path for all the files larger than 10 Mbytes. Usually there are some outliers with sizes of a couple of gigabytes or more. Deleting them is trivial but won’t always solve the problem; what if they are held open by some program pouring data into them for some reason?
To find out which program that is, use lsof:
# lsof | grep file_name
If stopping or restarting the program after deleting the file is not an option, truncate may help:
# truncate -s 0 file_name
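Going back to the find step for a moment: on a busy node its output can be long and unsorted, so a sorted variant may be easier to read. The sketch below wraps the same find expression; largest_files is a made-up name, and the -xdev flag (stay on one filesystem) is my addition, on the assumption that only the full filesystem is of interest.

```shell
# Hypothetical helper: the N largest files under a directory, biggest first.
# $1 = directory, $2 = minimum size in find -size syntax (e.g. +10M),
# $3 = how many entries to show. Needs GNU find for -printf.
largest_files() {
    find "$1" -xdev -type f -size "$2" -printf '%k\t%p\n' 2>/dev/null |
        sort -rn | head -n "$3"
}

# Example: the 20 largest files over 10 Mbytes on the root filesystem.
#   largest_files / +10M 20
```

The first column is still the size on disk in kbytes, as in the original command.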
One may also encounter the situation where a very large file was deleted while it is still kept open by some program. The find command above will be of no help here, but lsof will do the trick and then truncate will complete the job:
# lsof | grep "deleted"
# truncate -s 0 /proc/pid/fd/xx
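If lsof is not installed, the same information can be read straight from /proc, since the kernel appends " (deleted)" to the link target of such descriptors. The sketch below is one possible way to do it; deleted_open_files is a hypothetical name, and the optional argument exists only so the function can be tried against a mock directory tree.

```shell
# Hypothetical helper: list descriptors that still point to deleted files.
# $1 is the proc root, /proc by default. Run as root to see every process.
deleted_open_files() {
    root=${1:-/proc}
    for fd in "$root"/[0-9]*/fd/*; do
        # Processes may exit while we scan; skip links we cannot read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *" (deleted)") printf '%s\t%s\n' "$fd" "$target" ;;
        esac
    done
}
```

The first column of each output line is the path that can be passed to truncate -s 0. Expect some harmless entries as well, since plenty of programs keep deleted temporary files open on purpose.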
Now on to the second scenario (inode count exhaustion): this is harder to solve as there are no outliers to be dealt with but hundreds of thousands of files that may need to be deleted. Locating them is hard: you need to figure out what type of software runs on the node or if anything happened in the last few days. find comes to the rescue again:
# find / -mtime -1 -print
The command above shows the files created or modified during the last day. For a complex deletion example please see the second question in this interview text.
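Another angle, not the one used above, is to count files per directory and look at the busiest ones, since inode exhaustion usually comes from a few directories holding a huge number of entries. A minimal sketch, assuming GNU find and a made-up function name busiest_dirs:

```shell
# Hypothetical helper: directories holding the most files, busiest first.
# $1 = directory to scan, $2 = how many entries to show. Needs GNU find.
busiest_dirs() {
    find "$1" -xdev -type f -printf '%h\n' 2>/dev/null |
        sort | uniq -c | sort -rn | head -n "$2"
}

# Example: the 10 directories with the most files on the root filesystem.
#   busiest_dirs / 10
```

Each output line is a file count followed by the directory path. On a filesystem with millions of files the scan can take a while, so it may be worth starting from a suspect subtree instead of /.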
That’s it for today, thank you for reading!