22 Oct 2014 Apache or Nginx: The ultimate argument
When it comes to web servers, two names stand out: Apache and Nginx.
So, what are the differences between these well-known web servers, and which one should you pick?
Introducing Apache
The Apache HTTP Server is the most popular web server on Linux. It was first officially released in April 1995 and has been so widely used since then that in 2009 it became the first web server to serve more than 100 million websites.
Apache has a process-driven architecture. At first, it didn’t support multi-threading, but the Apache MPMs (multi-processing modules) were added later on.
Apache is easy to set up, and how it works depends on which MPM is loaded at configuration. The default MPM differs from one OS to another (and, on Linux, from one distribution to another).
- Prefork MPM is a non-threaded module. When it is selected, Apache keeps a number of child processes, each handling one connection at a time. It is used for applications that need to avoid threading for compatibility with non-thread-safe libraries, e.g. mod_php. It’s the best MPM for isolating requests, but it eats up resources when dealing with a high number of concurrent connections.
- Worker MPM differs from prefork in that it supports multi-threading. It implements a multi-process, multi-threaded server with (n) child processes, each running (m) threads, and each thread handles one connection, so the server can handle (n × m) connections at a time. This results in handling more concurrent connections with fewer resources. The catch is that threads are attached to connections, not requests, which is a big deal when the keep-alive time is long: a thread stays tied to its connection for a relatively long time while it waits for further requests, in addition to the risk of deadlocks.
- Event MPM reached stable status in Apache 2.4. It works almost exactly like the worker MPM but fixes the keep-alive problem by attaching threads to requests instead of whole TCP connections. Listening sockets and idle keep-alive connections are handled by a dedicated listener thread in each child process, while the remaining threads handle incoming requests from different connections. Thus fewer processes/threads are needed than with the worker MPM.
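The difference between the three MPMs can be sketched with some simple arithmetic. The Python model below is only an illustration under simplifying assumptions: the function names and the numbers in the usage note are hypothetical, they are not Apache directives or measurements, and the event MPM's listener threads are left out of the count.

```python
def max_parallel_requests(mpm, processes, threads_per_process=1):
    """Requests that can be processed at the same moment in this simplified model."""
    if mpm == "prefork":
        # One connection per child process; no threads involved.
        return processes
    if mpm in ("worker", "event"):
        # Each child process runs several request-handling threads.
        return processes * threads_per_process
    raise ValueError(f"unknown MPM: {mpm}")


def busy_threads(mpm, open_connections, active_requests):
    """Request-handling threads tied up for a mix of idle and active connections."""
    if mpm == "worker":
        # A thread stays attached to its connection, even while the
        # connection sits idle in keep-alive.
        return open_connections
    if mpm == "event":
        # Idle keep-alive connections are parked with the listener thread,
        # so only requests in progress occupy a thread.
        return active_requests
    raise ValueError(f"unknown MPM: {mpm}")
```

For example, max_parallel_requests("prefork", 16) gives 16, while max_parallel_requests("worker", 4, 25) gives 100 with a quarter of the processes. With 100 open connections of which only 10 are sending a request, busy_threads("worker", 100, 10) gives 100, whereas busy_threads("event", 100, 10) gives 10.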
Over the years, Apache has added many modules and tools to its package and has become compatible with a wide range of operating systems.
Over time, the number of concurrent connections that a server needs to handle has grown. The challenge of handling 10,000 concurrent clients at a time is known as the C10K problem. This scalability problem can be solved in one of two ways: increasing the hardware capabilities (e.g. memory, CPU), or improving the web server architecture to make better use of the hardware resources.
As Apache fell short on the scalability side, development of another web server started in 2002 to avoid this problem. And here comes Nginx.
Nginx, and the comparison with Apache
Nginx is a lightweight, stable web server with a reputation for speed. It has an event-based, asynchronous architecture. This means that many users’ connections and requests can be handled concurrently without one blocking the others: if the resources an event needs are not free, Nginx works on another event until they become available, hence avoiding the deadlock problem.
How does Nginx work? Apart from the master process that starts Nginx, a certain number of worker processes are created (the rule of thumb is one worker process per core). A single worker process can handle multiple connections without spawning a new process or thread for each incoming connection, as Apache does.
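The idea of one process serving many connections can be sketched in Python with the standard selectors module. This only illustrates event-driven I/O multiplexing in general and is not how Nginx itself is implemented. The names serve_ready and demo are made up for this example, and in-process socket pairs stand in for real client connections.

```python
import selectors
import socket


def serve_ready(server_ends):
    """Answer one message per connection from a single thread.

    Returns the number of connections that were answered.
    """
    sel = selectors.DefaultSelector()
    for conn in server_ends:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)

    pending = len(server_ends)
    handled = 0
    while pending:
        # Wait until at least one connection has data; never block on a single one.
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable in time; stop instead of spinning
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(1024)
            if data:
                conn.sendall(data.upper())  # the "response" in this toy example
                handled += 1
            sel.unregister(conn)
            pending -= 1
    sel.close()
    return handled


def demo(n=3):
    """Open n socket pairs, send one message on each, and serve them all."""
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"hello {i}".encode())
    handled = serve_ready([server for _client, server in pairs])
    replies = [client.recv(1024).decode() for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return handled, replies
```

Calling demo(3) serves all three connections from one thread: no process or thread is created per connection, and the loop simply reacts to whichever socket is ready.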
With Nginx, scalability depends much less on hardware capabilities, whereas Apache’s scalability and performance are limited more directly by the underlying hardware.
One more thing to note about Nginx is that, at the time of writing, it can only use the modules selected at compile time; any additional module requires Nginx to be rebuilt. Apache doesn’t have this problem: modules can be added dynamically at any time, and reloading Apache is all that’s needed to make them ready for use.
Even though the Apache event MPM, which was introduced to boost performance, handles requests in the same asynchronous manner as Nginx, it still uses separate threads to handle incoming requests, unlike Nginx. Moreover, under heavy load, Apache has been reported to slow down significantly because of its need to create new processes/threads, which consumes more memory and CPU. That’s why Nginx is often preferred for high-traffic websites, especially those serving static content.
However, serving dynamic content is a different and more complicated matter for Nginx. Apache is designed to interpret PHP, Python, Perl and some other languages internally by loading their specific modules, e.g. mod_php and mod_python. Nginx works around this by using FastCGI to have these languages interpreted externally. An example is PHP-FPM (FastCGI Process Manager): Nginx is configured to pass requests for PHP files to php5-fpm, which interprets them and sends the results back. This works just fine, yet handing each request to an external process adds overhead, whereas Apache interprets the script inside the server process.
It’s also worth knowing that Apache and Nginx have many features in common, including SSL support, proxying and load balancing.
Use Cases
One common scenario is to use Nginx as a reverse proxy/cache, taking advantage of its great performance at serving static content, and to pass the dynamic content processing to Apache at the back end. However, some replace Apache and mod_php with Nginx and PHP-FPM altogether for the sake of scalability.
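The division of labour in this scenario comes down to a routing decision per request. The Python fragment below is purely illustrative: the route function and the STATIC_EXTENSIONS set are hypothetical names, and a real deployment would express the same rule in the Nginx configuration rather than in application code.

```python
import posixpath

# Assumed list of file types that the front-end serves by itself.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".ico", ".html"}


def route(path):
    """Return "static" if the front-end should serve the file, else "backend"."""
    path_only = path.split("?", 1)[0]  # ignore the query string
    extension = posixpath.splitext(path_only)[1].lower()
    return "static" if extension in STATIC_EXTENSIONS else "backend"
```

Here route("/img/logo.png") returns "static", while route("/index.php?id=3") and route("/") both return "backend", meaning the request would be proxied to the server that runs the application.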
Another scenario is to use Nginx in front of application servers such as Thin, Phusion Passenger and uWSGI, handling the load of serving the static content and adding a layer of security. It can also work as a load balancer in case of multiple application servers.
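Load balancing across several application servers can be pictured with a round-robin selector, which is one common strategy among several. The sketch below is a minimal illustration: the RoundRobinBalancer class and the backend addresses in the usage note are hypothetical and not part of any Nginx API.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._cycle)
```

With balancer = RoundRobinBalancer(["app1:8000", "app2:8000", "app3:8000"]), five consecutive calls to balancer.pick() return app1, app2, app3, app1 and app2, so each application server receives an equal share of the requests.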
Conclusion
Back to the question: Apache or Nginx? It can’t be answered easily, for it depends on multiple factors, including the ratio of dynamic to static content, the geographical location of the resources, the traffic load and a lot more. So, despite the fact that Apache has lost some of its market share to Nginx, it’s not right to favour one of them over the other in general. The decision should be made carefully, after clearly determining what exactly the server will be used for.