Python Cheat Sheet: Lists

Lists are linear sequences that provide constant-time lookup by index. They can be resized, searched, sorted (optionally with a custom key function) and are not restricted to a single data type (e.g. you can define lists with mixed data). Lists in Python are 0-indexed.
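
As a small illustration of the properties above, here is a sketch written for Python 3, where sorting takes a key function. The sample values and variable names are made up for the example.

```python
# A list can hold values of different types and is indexed from 0.
mixed = [0, 'a', {'b': 'c'}]
first = mixed[0]            # lookup by index takes constant time

# Lists grow and shrink in place.
mixed.append(3.5)
mixed.pop()

# Sorting with a custom rule: order words by length, then alphabetically.
words = ['pear', 'fig', 'banana', 'kiwi']
by_length = sorted(words, key=lambda w: (len(w), w))
print(by_length)            # ['fig', 'kiwi', 'pear', 'banana']
```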

Defining a pre-initialized list with 10 numeric values (all zeros):
A = [0] * 10
Defining a pre-initialized list with 10 numeric values (powers of 2):
A = [ 2**i for i in range(0, 10) ]

Note1: range above yields the values 0 through 9; the upper bound, 10, is excluded. (In Python 2, xrange plays the same role.)

Note2: this syntax is known as list comprehension.
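
For comparison, a list comprehension can always be written as an explicit loop. The sketch below (Python 3 syntax, illustrative variable names) shows that equivalence for the powers of 2, then a filtered comprehension and a pre-initialized two-dimensional list.

```python
# Explicit loop that builds the same list as [2**i for i in range(10)].
powers = []
for i in range(10):
    powers.append(2**i)

# A comprehension can also filter: keep only the even exponents.
even_powers = [2**i for i in range(10) if i % 2 == 0]
print(even_powers)          # [1, 4, 16, 64, 256]

# Pre-initialized 2 x 3 grid of zeros; each row is a separate list,
# so changing one row does not affect the other.
grid = [[0] * 3 for _ in range(2)]
grid[0][0] = 1
print(grid)                 # [[1, 0, 0], [0, 0, 0]]
```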

Iterating through all the elements (read only):
A = [ 0, 'a', {'b':'c'} ]
for e in A:
	print(e)
Iterating through all the elements (read/write):
A = [ 0, 'a', {'b':'c'} ]
for i in range(len(A)):
	print(i, A[i])	#A[i] can also be assigned to here
Adding new elements at the end of the list:
A = [ 0, 1, 2, 3 ]
A += [ 4 ]

Note: there is an append method that can also be used for this purpose.
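
A short sketch of how append differs from += (and from extend, its method form), assuming Python 3. append adds its argument as one element, while the other two add each element of the given list.

```python
A = [0, 1, 2, 3]
A.append(4)             # adds one element
A.extend([5, 6])        # adds each element of the given list
A += [7]                # same effect as extend for lists
print(A)                # [0, 1, 2, 3, 4, 5, 6, 7]

B = [0, 1]
B.append([2, 3])        # the whole list becomes a single element
print(B)                # [0, 1, [2, 3]]
```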

Insert new elements:
A = [ 0, 1, 2, 3 ]
A.insert(0, -1)	#position, value

Note: the insert above puts a new element at the front of the list.

Find elements in the list:
A = [ 0, 1, 2, 3 ]
A.index(2)

Note: index does a linear search for the first element with the value provided. A ValueError exception is raised if the element cannot be found.
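
One way to deal with the ValueError is to wrap the call, as in the sketch below. The helper name find_index and its default of -1 are hypothetical, chosen only for this example; the in operator is shown as a plain membership test.

```python
def find_index(items, value, default=-1):
    """Return the position of value in items, or default if it is absent."""
    try:
        return items.index(value)
    except ValueError:
        return default

A = [0, 1, 2, 3]
print(find_index(A, 2))     # 2
print(find_index(A, 42))    # -1
print(42 in A)              # False
```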

Remove elements from the list:
A = [ 0, 1, 2, 3 ]
A.remove(2)		#by value
del A[1]		#by index
Using a Python List as a Stack:
A = [ ]
A.append(1)
A.append(2)			#always add elements at the end
stacktop = A.pop()	#returns 2, the last element added

Note: pop raises an IndexError exception if the list is empty.

Using a Python List as a Queue:
A = [ ]
A.insert(0, 1)
A.insert(0, 2)	#always insert at the beginning of the list
elem = A.pop()	#returns 1, the first element added

Note: for an optimized implementation for both Stacks and Queues you may want to look at the collections.deque data structure.
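
A minimal sketch of the same stack and queue built on collections.deque. A deque adds and removes elements at either end in constant time, whereas inserting at the front of a list has to shift every existing element.

```python
from collections import deque

# Stack: push and pop at the same (right) end.
stack = deque()
stack.append(1)
stack.append(2)
top = stack.pop()           # 2, the last element added

# Queue: add at the right end, remove from the left end.
queue = deque()
queue.append(1)
queue.append(2)
head = queue.popleft()      # 1, the first element added

print(top, head)            # 2 1
```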

That’s it for today, have fun!


Crazy DevOps interview questions (2)

You can find the first article of the series here: Crazy DevOps interview questions.


Question 1:

Suppose you run the following commands:

# cd /tmp
# mkdir a
# cd a
# mkdir b
# cd b
# ln /tmp/a a

… what is the result?

At this point one may point out that the hard link being defined would create a circular reference, which is a correct answer on its own. It’s not complete, though: how would the operating system (the file system) handle such a command, anyway?

A command line guru may simply dismiss the question by saying that hard links are not allowed for directories, and that’s about it. Another guru may point out that we’re missing the -d parameter to ln, so the command will fail before anything else is considered. Correct, but still not the complete answer expected by the interviewer.

The complete answer must point out that:

  • Most file systems disallow directory hard links, but not all. The notable exception is HFS+ (Apple OS X).

  • Hard links are, by definition, multiple directory entries pointing to a single inode. The inode holds a link counter. Deleting a hard link decrements that counter, and the file itself is deleted only when the counter reaches 0 (that is, when the link being removed was the last one).

  • Directory hard links are not inherently dangerous enough to be disallowed by default. The major problem with them is the circular reference situation described above. Cycles can be detected with graph algorithms, but such an implementation is both CPU and memory intensive.

  • The decision to disallow hard links for directories was taken with this computational cost in mind. The cost grows with the size of the file system.
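
The link counter can be observed from Python, as in the sketch below. It assumes a POSIX system whose temporary directory supports hard links; the file and directory names are made up. The last step attempts a directory hard link, which Linux is expected to refuse with an OSError, though other systems may behave differently.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "file.txt")
    alias = os.path.join(tmp, "alias.txt")
    with open(original, "w") as f:
        f.write("data")

    counts = [os.stat(original).st_nlink]      # one directory entry so far
    os.link(original, alias)                   # second entry, same inode
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    counts.append(os.stat(original).st_nlink)
    os.remove(original)                        # counter drops, data survives
    counts.append(os.stat(alias).st_nlink)
    with open(alias) as f:
        surviving = f.read()

    # Try to hard-link a directory and record whether the OS refuses.
    subdir = os.path.join(tmp, "a")
    os.mkdir(subdir)
    try:
        os.link(subdir, os.path.join(tmp, "a_link"))
        dir_link_refused = False
    except OSError:
        dir_link_refused = True

print(counts, same_inode, surviving, dir_link_refused)
```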

I agree that a comprehensive answer like this is usually expected in an interview at one of the “Big Four” technology companies.


Doing Backups in AWS

This text is about doing backups of data that already lives in AWS, not of outside data, although some methods apply to both cases. But let’s start from the beginning:

What Data?

Your data can be located on EC2 nodes (virtual servers) or you may be using some dedicated database service such as RDS. The dedicated services have the backup functionality built-in already, with settings easily accessible through the interface. I won’t deal with those but rather with the “raw” data you may have on a node.

The data on the node falls into 2 categories, or can be viewed from 2 different perspectives:

  1. When one wants to capture the “system state” at a certain point in time. This perspective does not consider the data composition, but the functionality that is being captured for use at a later date as a known good fallback point.

  2. When one wants to get the state of a specific subsystem (e.g. a subset of the local storage, a subset of the local database). This is the “classical backup” as it is widely known.

Capturing State

AWS offers full support for taking snapshots of volumes:

[Figure: AWS Volume Snapshot example]

One is not limited to the web interface; all of this functionality is available programmatically. One may also want to look at the Boto Python library.
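
A minimal sketch of triggering a snapshot from code. It assumes the boto3 library (the current successor to Boto) and the create_snapshot call of its EC2 client. The function name snapshot_volume is hypothetical, and the client is passed in as an argument so that the function can be read and tried out without credentials.

```python
def snapshot_volume(ec2_client, volume_id, description):
    """Ask EC2 to snapshot one volume and return the new snapshot ID.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    return response["SnapshotId"]
```

With credentials configured, this could be called as snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0", "nightly state capture"), where the volume ID is a placeholder.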

Classical Backup

One can store files programmatically (from cron-based scripts to full-fledged backup software that runs on a schedule) in the following Amazon Cloud services:

  • Simple Storage Service (S3): this is the easiest to use as it offers instant storage, instant retrieval and also versioning (e.g. you may mirror some directory contents on the secure storage at various points in time). It is not a cost-effective method of storage for huge amounts of data (multiple terabytes) over long periods of time.

  • Glacier: this is the equivalent of tape storage. Retrieval is not instant (one must schedule it in advance). It does not support versioning by default. It is 3-4 times cheaper than S3, though.

  • A dedicated EC2 node (or multiple nodes organized as a backup storage cluster): this is not cost-effective but may work in certain scenarios (e.g. live data mirroring).

  • A dedicated database in RDS: this is far from cost-effective but is the solution if one wants to use some existing backup software that can store data to a database only.
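
As an illustration of the S3 option, the sketch below archives a directory and uploads it. It assumes the boto3 library and the upload_file method of its S3 client. The function name backup_directory, the bucket name and the "backups" key prefix are hypothetical, and the client is passed in so the function can be exercised without credentials.

```python
import os
import tarfile
import tempfile
import time

def backup_directory(s3_client, directory, bucket, prefix="backups"):
    """Archive directory as a .tar.gz and upload it to an S3 bucket.

    s3_client is assumed to behave like boto3.client("s3").
    Returns the object key that was written.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime())
    name = os.path.basename(os.path.abspath(directory))
    archive_name = "{}-{}.tar.gz".format(name, stamp)
    key = "{}/{}".format(prefix, archive_name)

    # Build the archive in a staging directory that is removed afterwards.
    with tempfile.TemporaryDirectory() as staging:
        archive_path = os.path.join(staging, archive_name)
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(directory, arcname=name)
        s3_client.upload_file(archive_path, bucket, key)
    return key
```

Run from cron, a call such as backup_directory(boto3.client("s3"), "/var/www", "my-backup-bucket") would leave one timestamped archive per run, where both the path and the bucket name are placeholders.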

That was my introduction to doing backups in AWS. Thanks for reading!
