Let’s assume you went against my advice from a few months ago and secured that job. That’s quite an achievement, but it’s not what I intend to write about. What you leave behind is, most of the time, more significant than what you think is being set in front of you.
Most likely your current job does not compare with the likes of Googamazbook. That’s OK. Or you feel that you did not get to achieve everything you hoped for when you signed up for your current, soon-to-be-previous position. That’s also OK. Let’s walk through some possible frustration points:
1. Promotions (lack of)
You feel you deserved more and were passed over for promotions for unclear, maybe unprofessional reasons. Most likely this is true, but it does not always have to be about you as an individual; it’s about the way companies work.
FOSDEM is an annual event for software developers, focused on open source software, held in Brussels during the first weekend of February.
This year's event was the 5th FOSDEM conference I have attended since 2010. Over the years I have seen it evolve: as the presentation focus moved along with the industry, some topics faded out and were replaced by newer things. Many of these newcomers never gained traction and faded out in turn.
One of the main transitions I have noticed was the one from abstract or overly general topics (e.g. discussions on the Linux Kernel or performance tricks in C/C++) towards end products and applying mature (or soon to be mature) technologies to get clear outcomes.
Note: The “magical” tool for XFS is (obviously) xfs_repair. Getting it to run can sometimes be the hard part.
Introduction
How could a filesystem corruption happen? There are several likely causes:
Kernel bugs: they are infrequent, but they have happened many times in the past and will happen again in the future. Not much can be done about them, other than applying patches / keeping the kernel up to date;
Memory issues, e.g. memory errors propagated into the file system's control structures: they are usually mitigated with ECC memory but they can never be ruled out;
Underlying storage issues: quite unlikely but nevertheless possible;
Using the reset button on running servers: journaling file systems are almost always able to recover from such an incident;
RAID controller issues: this could be the leading cause and is not easy to mitigate, even if a firmware upgrade is sometimes possible.
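Whatever the cause, a read-only check with xfs_repair is a reasonable first step before attempting any actual repair. The following is a minimal sketch, not a definitive procedure: it assumes the file system is unmounted, that xfs_repair is available in PATH, and that the device path is a placeholder you replace with your own. The helper names `build_check_command` and `check_filesystem` are hypothetical and introduced only for illustration. The `-n` flag is xfs_repair's no-modify mode, which scans and reports without writing to the device.

```python
import shutil
import subprocess
from typing import List


def build_check_command(device: str) -> List[str]:
    """Build an xfs_repair invocation in no-modify mode (-n).

    The device path is supplied by the caller; nothing is written to it.
    """
    return ["xfs_repair", "-n", device]


def check_filesystem(device: str) -> int:
    """Run the read-only check and return xfs_repair's exit status.

    Assumption: the file system on `device` is not mounted.
    """
    if shutil.which("xfs_repair") is None:
        raise FileNotFoundError("xfs_repair not found in PATH")
    result = subprocess.run(build_check_command(device), check=False)
    return result.returncode
```

According to the xfs_repair man page, in no-modify mode an exit status of 0 means no corruption was detected and 1 means corruption was found, so a call such as `check_filesystem("/dev/sdX1")` (placeholder device) can be used to decide whether a real repair run is needed.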