Note: this text is about Systems Programming on Linux platforms.
This type of interview focuses on finding out what you know about what happens below the “command line” surface. Down there things get messier: processes are created and terminated, output is collected, system calls are performed. Things get complicated really fast for the unaware or the unprepared.
Let’s take for example a single command that is being run:
$ ls
This is a classic interview question, asked for more than 15 years now. I’m not sure if anyone still asks it in 2016, but the answer is still interesting. So, what happens when this is run? (No, “the file list is displayed” is not the answer.)
From the beginning:
The shell parses the command line, identifies the command and its parameters (none provided);
The shell tries to assess if the command is one of:
internal;
alias;
external binary to be looked up.
On many systems ls is an alias to the real binary plus a single parameter (e.g. --color=auto);
Safety step: the current directory is validated with stat(".") before actually executing any command;
The binary is then located by walking the directories listed in the PATH environment variable, appending the binary name to each one and issuing a stat() system call on the resulting path. The ls binary most likely lives in /bin, so the lookup stops as soon as /bin/ls is positively identified as the executable;
The shell creates a new process – fork() – and then issues an exec()-type system call with whatever parameters were determined in the previous stage in order to get the binary executed;
Some boiler-plate code gets executed: shared libraries get loaded, memory gets allocated, signal masks get set, signal handlers are defined and so on;
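The shell side of the story (PATH lookup, fork(), exec(), collecting the exit code) can be sketched in a few lines. This is a minimal illustration, not how any real shell is written: it uses Python's os module, which wraps the same system calls, and the helper names find_in_path and run are made up for this example.

```python
import os
import stat


def find_in_path(name, path=None):
    """Return the first executable regular file called `name` in PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        # An empty PATH entry traditionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        try:
            info = os.stat(candidate)      # one stat() per PATH entry
        except OSError:
            continue                       # not there: try the next directory
        if stat.S_ISREG(info.st_mode) and os.access(candidate, os.X_OK):
            return candidate               # stop at the first match
    return None


def run(argv):
    """Look the command up, fork(), exec() it in the child, wait in the parent."""
    binary = find_in_path(argv[0])
    if binary is None:
        return 127                         # conventional "command not found" status
    pid = os.fork()
    if pid == 0:                           # child process
        try:
            os.execv(binary, argv)         # does not return on success
        finally:
            os._exit(126)                  # reached only if exec failed
    _, status = os.waitpid(pid, 0)         # parent collects the exit status
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)       # shell convention for "killed by signal"
```

For example, run(["ls"]) lists the current directory and returns the exit code of ls. Aliases, built-ins and the stat(".") safety check are deliberately left out.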
The interesting part finally begins:
ls loads the /proc/filesystems contents and the current locale;
the current directory is opened for reading – opendir();
the directory is enumerated and then closed – readdir() and closedir() (some more memory may get allocated during this process);
the directory contents are sent to stdout – write(1, …).
At the end, ls frees up whatever memory it was using and exits.
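The core of what ls does (open the directory, enumerate it, close it, write the names to descriptor 1) fits in a short sketch. It is only an approximation: the function name list_directory is made up, the names are sorted by code point rather than by locale, and they are printed one per line, which is what ls does when its output is not a terminal. In CPython, os.scandir() is built on opendir(), readdir() and closedir().

```python
import os


def list_directory(path=".", out_fd=1):
    """Enumerate `path`, skip dot-files like plain ls, write the names to out_fd."""
    with os.scandir(path) as entries:      # opened here, closed when the block ends
        names = sorted(e.name for e in entries if not e.name.startswith("."))

    data = b"".join(os.fsencode(name) + b"\n" for name in names)

    # write() may accept fewer bytes than requested, so loop until all are sent.
    view = memoryview(data)
    while view:
        written = os.write(out_fd, view)
        view = view[written:]
    return names
```

Calling list_directory() with no arguments writes the visible entries of the current directory to stdout.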
More complicated than it originally looked, isn’t it?
Follow-up: what happens in the following scenario?
$ ls | tee list.txt
OK – the basics are easy to express: the output of ls – stdout – is piped as the input of tee (stdin).
NB: stderr is not redirected in this scenario and will most likely find its way directly to the console.
What happens behind the scenes?
First things first – this is the shell command line so everything is parsed and then executed by the shell. The details mentioned before continue to be valid: internal command / alias / external command, stat() in every directory in the PATH until the binary is found. The difference is that 2 external commands are identified and run by the shell (ls and tee), each in a process of its own.
The “magic” happens partly before the fork() calls and partly between each fork() and its exec(). A rough approximation of what happens can be expressed as:
An unnamed pipe, i.e. a pair of connected file descriptors, is requested from the kernel (the pipe() system call);
The write end of the pipe is duplicated with dup2() onto the stdout of the process that will execute ls;
The read end of the same pipe is duplicated with dup2() onto the stdin of the process that will execute tee.
The shell collects the exit codes of both child processes; even if the first one fails to produce any output, the second runs anyway (with empty input).
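The pipe wiring described above can be sketched as follows. Again, this is an approximation using Python's os wrappers around pipe(), fork(), dup2() and exec(); the names run_pipeline and exit_code are invented for the example, and a real shell also deals with job control, signals and more.

```python
import os


def exit_code(status):
    """Translate a waitpid() status into a shell-style exit code."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)


def run_pipeline(left, right):
    """Run `left | right` and return the exit codes of both commands."""
    read_end, write_end = os.pipe()        # pipe(): one kernel buffer, two descriptors

    left_pid = os.fork()
    if left_pid == 0:                      # child that becomes the producer
        try:
            os.dup2(write_end, 1)          # stdout now refers to the pipe's write end
            os.close(read_end)
            os.close(write_end)
            os.execvp(left[0], left)
        finally:
            os._exit(127)                  # reached only if something above failed

    right_pid = os.fork()
    if right_pid == 0:                     # child that becomes the consumer
        try:
            os.dup2(read_end, 0)           # stdin now refers to the pipe's read end
            os.close(read_end)
            os.close(write_end)
            os.execvp(right[0], right)
        finally:
            os._exit(127)

    # The parent must close its copies, or the consumer never sees end-of-file.
    os.close(read_end)
    os.close(write_end)

    codes = []
    for pid in (left_pid, right_pid):
        _, status = os.waitpid(pid, 0)
        codes.append(exit_code(status))
    return codes
```

With this in place, run_pipeline(["ls"], ["tee", "list.txt"]) behaves roughly like the command line above. Note that both children are started before either is waited for, so the two commands run concurrently.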
Sounds fun? An interviewer calibrated on this systems topic will dig even further, asking how fork() and exec() behave at the kernel level. A stronger candidate will have to provide answers like:
fork() duplicates the current process: from a logical point of view, almost everything is duplicated – memory, file descriptors, kernel resources such as references to shared memory segments or semaphores; the pid is different, though. If more than one thread is present in the process, only the thread that called fork() exists in the child. This can be the cause of many hard-to-find bugs.
The actual memory copy is performed using a copy-on-write approach: the memory pages are marked read-only, and if the child (or the parent) attempts to modify one of them, a page fault is generated, the kernel duplicates the page behind the scenes and then allows the change to go through.
exec() replaces the process image with that of the executed binary – the previously allocated process memory gets freed by the kernel. File descriptors that are not marked with the “close-on-exec” flag (FD_CLOEXEC, which can be requested at open time with O_CLOEXEC) are left open and can be used by the exec()-ed binary. By default, the standard descriptors (stdin, stdout, stderr) do not have this flag set.
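Two of these answers are easy to check from user space. The sketch below is an illustration with made-up function names. The first function shows that a change made by the child after fork() is not visible in the parent (it shows the logical separation only, not the page faults themselves). The second shows that a descriptor survives exec() only when close-on-exec is cleared. One Python-specific assumption matters here: os.pipe() returns descriptors with close-on-exec already set, and os.set_inheritable() toggles that flag.

```python
import os
import sys


def fork_memory_is_private():
    """The child changes a value; the parent's copy stays untouched."""
    value = ["parent"]
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                           # child
        value[0] = "child"                 # modifies the child's copy only
        os.write(w, value[0].encode())
        os._exit(0)
    os.close(w)
    seen_by_child = os.read(r, 64).decode()
    os.close(r)
    os.waitpid(pid, 0)
    return value[0], seen_by_child


def fd_survives_exec(inheritable):
    """Return True if the exec()-ed program can still write to our pipe."""
    r, w = os.pipe()                       # close-on-exec is set by default in Python
    os.set_inheritable(w, inheritable)     # True clears FD_CLOEXEC, False sets it
    pid = os.fork()
    if pid == 0:                           # child
        try:
            # The new program image only knows the descriptor by its number.
            code = "import os, sys; os.write(int(sys.argv[1]), b'hi')"
            os.execv(sys.executable, [sys.executable, "-c", code, str(w)])
        finally:
            os._exit(2)                    # reached only if exec failed
    os.close(w)
    data = os.read(r, 16)                  # b"" (end-of-file) if the fd was closed
    os.close(r)
    os.waitpid(pid, 0)
    return data == b"hi"
```

The standard descriptors behave like the inheritable case, which is why redirections set up with dup2() before exec() are still in place in the new program.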
As you might have noticed, such an interview can get tougher by the minute. Fortunately, only a few companies conduct this kind of “systems interview”, and only for certain positions. If you have one scheduled and you spent a couple of minutes reading this text, then you’re surely closer to succeeding than before. Or at least this is what I hope. Anyway, thank you for reading!
When the child or the parent attempts to modify the page, I think a protection fault will be generated, not a page fault.