Tag Archives: http
Instance Performance Monitoring in AWS

AWS provides a complete monitoring engine called CloudWatch; it works with metrics – including custom, user-provided metrics – and is able to raise alarms when any such metric crosses a certain threshold. This is the tool that is used for all perfomance monitoring tasks within AWS.

This text will cover a monitoring scenario regarding deploying an arbitrary appplication to the cloud and being able to determine what causes the performance limits to be met, be it the application code itself or resource limits enforced by Amazon.


Let’s assume that one has just started using Amazon Web Services and is deploying applications on free tier or other general purpose (T2) instances. One learns that the general purpose instances work with “credits” that allow dealing with short spikes through performance bursting – but once the credits are exhausted the performance is reverted to some baseline. All the particular details do not make a lot of sense but one needs to know if the application can meet the desired service limits with this setup.

Continue Reading →

Cloud Distribution with AWS

When do you need to use some form of Cloud Distribution?

If you consider this concept a form of caching, then you need it when your website passes a certain threshold in terms of users and the volume of data that is being sent out. The actual numeric figure depends entirely on your setup – you may have large files and sending them out of your server may fill up the available bandwidth. Or you may have a lot of users and may want to ease the load on the server by pushing some of this load to somewhere else.

Nevertheless, the purpose of this article is to show you how this is being done by using the AWS CloudFront.

AWS CloudFront

First, you need to go to the “CloudFront” option in the AWS Console and attempt to create a Web distribution. On the creation page you should first fill in your domain name as the distribution origin (the primary source of the content):

Create Cloud Distribution

Continue Reading →

Note to self: http server keepalive timeout

Note: While I found this issue on Apache httpd, it may apply to any http server out there.

HTTP KeepAlive

The “KeepAlive” concept is simple: the browser opens a connection to the server and sends out multiple requests (e.g. for the main page, for stylesheets, javascript includes and images) through a single connection. This effectively reduces the page load time, providing – or at least that’s what the theory says – a better customer experience. All good from this perspective.

Server Configuration

The Apache httpd server controls this feature through 3 configuration directives, all of them in the core module:

  • KeepAlive: on/off, default on;

  • KeepAliveTimeout: seconds, default 5;

  • MaxKeepAliveRequests: positive number, default 100.

There is no setting in Apache httpd to somehow allow KeepAlive and non-KeepAlive requests at the same time (e.g. to allow up to 100 KeepAlive requests, what comes above to be treated differently). One must choose the server behavior from the very beginning.

The Traffic

Now it’s math time. Starting from the MaxClients value (default 256), what is the request rate (new clients / second) that can be served without compromising the user experience? MaxClients you may say, but let’s not draw conclusions too fast. There are some issues to be considered:

  • The time between opening the connection and the KeepAliveTimeout expiration. On a default configuration, at the very least 5 seconds, but on a more typical side, maybe 7-10 seconds;

  • Traffic fluctuation (spikes);

  • Internal Apache httpd time (spawning new processes to handle new connections, etc).

It’s getting complicated. But assuming a typical 7 seconds time for every process that serves a client and an uniform behavior (all or almost all clients keep a server process busy or waiting for data for 7 seconds), on a default setting of 256 MaxClients, the uniform traffic rate that can be served is 256/7 = 36 (new) requests / second. Any spike larger than that will cause page loading delays and a poor user experience.

Better Planning

If disabling KeepAlive is not an option, all the planning should start from the worst expected spike that is to be handled (e.g. 2x or 3x the average during the busiest period). For a 50 req/s busy period average, the server should be able to allow 100 or even 150 req/s without compromising the user experience. If the configuration is already maxed out from the hardware point of view, then other solutions should be looked into (multiple servers, load balancers, I’m not dwelling into that).

Assuming a maximum rate of 150 req/s, with a KeepAliveTimeout at the 5 seconds default, one may need to adjust MaxClients (and ServerLimit if prefork mpm is used) to somewhere in the area of 1,000.


Don’t leave the defaults on; always start the parameter calculation from the desired outcome first. On suboptimal hardware (memory wise) disabling KeepAlive is the way to go to squeeze a bit more performance. Oh, and don’t assume the hardware is capable of keeping up with your calculations; turn any failure or mis-planning in a learning subject.

That’s it for today, have fun!