Nginx is a fantastic web server and a lot more. This section introduces some of its more important features.
More Than Just a Web Server
At its core, you can consider Nginx to be an event-based reverse proxy server. That may come as a surprise to many, because Nginx is usually described as a web server.
It has an extremely extensible architecture due to its support for plug-ins. Even basic things like SSL and compression are built as modules. The real power lies in the fact that you can rebuild Nginx from source and include only the modules you need, excluding everything you don't. This gives you a very focused executable that does precisely what you need. This approach has a downside too, though: if you decide to incorporate another module at a later point, you will need to recompile with the appropriate switches. The upside is that Nginx has a fairly robust way of upgrading its live processes, and it can be done without interrupting service levels.
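The rebuild itself comes down to passing the right switches to the configure script. A minimal sketch, with purely illustrative module choices (not a recommendation):

```shell
# Download and unpack the Nginx source first, then from the source directory:
# include SSL support, and drop a module you don't need
./configure --prefix=/etc/nginx --with-http_ssl_module --without-http_autoindex_module
make
sudo make install
```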
As of this writing, www.nginx.org hosts as many as 62 modules for very specific purposes. There are plenty of third-party Nginx modules available as well to make your job easier. The ecosystem is thriving and helping Nginx become even more powerful as time passes. You will learn more about modules in detail in the coming chapters.
Asynchronous Web Server
Nginx gains much of its performance from its asynchronous, event-based architecture, whereas Apache and IIS like to spin up a new thread per connection, and those threads are blocking in nature. Both IIS and Apache handle the threads using multithreaded programming techniques. Nginx differs in approach completely: it does not create a separate thread for each request. Instead, it relies on events.
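The event model surfaces directly in the main configuration file. A minimal sketch, with illustrative values:

```nginx
worker_processes  auto;   # typically one worker process per CPU core

events {
    worker_connections  1024;   # each worker juggles this many connections
}
```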
Reverse Proxy and Load Balancing Capability
Nginx analyzes the incoming request based on its URI and decides how to proceed with it. In other words, it is not looking at the file system to decide what to do; it makes that decision based on the URI alone. This differentiation enables Nginx to act as a very fast front end, serving as a reverse proxy and helping balance the load on the application servers. It's no exaggeration to say that Nginx is a reverse proxy first and a web server second.
Nginx can also fit very well in a hybrid setup, where the front-end job is taken care of by Nginx and everything else gets delegated to the back end (to Apache, for instance).
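A sketch of such a hybrid setup: Nginx serves static assets itself and proxies everything else to an Apache instance. The host name and back-end address are hypothetical:

```nginx
server {
    listen      80;
    server_name example.com;          # hypothetical host

    # Nginx serves static assets directly...
    location /static/ {
        root /var/www;
    }

    # ...and proxies everything else to the Apache back end.
    location / {
        proxy_pass       http://127.0.0.1:8080;   # assumed Apache address
        proxy_set_header Host $host;
    }
}
```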
Low Resource Requirement and Consumption
Small things that go a long way define Nginx. Where other web servers typically allow a simple plug-and-play architecture for plug-ins using configuration files, Nginx requires you to recompile the source with the required modules. Every module it requires is loaded directly inside the Nginx process. Such tweaks, along with smart architectural differences, ensure that Nginx has a very small memory and CPU footprint on the server and yields much better throughput than its competition. You will learn about the Nginx architecture in granular detail in the coming chapters.
Nginx is probably the best server today when it comes to serving static files. There are situations where it cannot be considered the best (dynamic content, for example), but even then, the fact that it plays well as a reverse proxy ensures that you get the best of both worlds. If configured well, you can save a lot of the cost you typically incur on caching, SSL termination, hardware load balancing, zipping/unzipping on the fly, and many more web-related tasks.
Multiple Protocol Support: HTTP(S), WebSocket, IMAP, POP3, SMTP
As a proxy server, Nginx can handle not only HTTP and HTTPS requests, but also mail protocols with equal grace. There are modules available that you can include while compiling your build so that Nginx will proxy your mail-related traffic too.
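A sketch of what such a mail proxy block might look like, assuming the build includes the mail modules (the host name and authentication endpoint are hypothetical):

```nginx
mail {
    server_name  mail.example.com;        # hypothetical
    auth_http    localhost:9000/auth;     # assumed authentication service

    server {
        listen    25;
        protocol  smtp;
    }
    server {
        listen    110;
        protocol  pop3;
    }
}
```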
Secure Sockets Layer (SSL) is a necessity for any website that deals with sensitive data. And, just like any other necessity, there is a cost involved. When it comes to web traffic, SSL also induces extra processing overhead on the server side, where every request has to be decrypted. Therein lies a catch-22: if you remove SSL, you open yourself up to attacks, and if you use SSL, you lose a little bit of speed (or incur additional cost due to scaling out)!
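Because SSL termination is among Nginx's core features (as you will see later in this chapter), it can absorb that decryption cost at the edge so the back-end servers deal only with plain HTTP. A minimal sketch, with placeholder certificate paths:

```nginx
server {
    listen               443 ssl;
    server_name          example.com;                  # hypothetical
    ssl_certificate      /etc/nginx/ssl/example.crt;   # placeholder paths
    ssl_certificate_key  /etc/nginx/ssl/example.key;

    location / {
        proxy_pass http://127.0.0.1:8080;   # back end sees plain HTTP
    }
}
```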
HTTP Video Streaming Using MP4/FLV/HDS/HLS
You have already learned that the input/output (IO) in Nginx doesn't block if the client is slow. Video streaming is typically a very IO-intensive process, and Nginx does a great job here. It has multiple modules that help you provide streaming services. To give a little perspective as to what is special about video streaming, imagine watching YouTube. You can easily seek from one position in the video to another, and it almost immediately starts serving the content. The key here is not to download the entire file in one shot. The request, hence, should be crafted in such a way that it has certain markers in the query string, like this:
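```
http://www.yoursite.com/yourfile.mp4?start=120.12
```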
The preceding request asks the server to send the content of yourfile.mp4 starting from 120.12 seconds (notice the start query string). This allows random seeking within a file in a very efficient way.
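On the Nginx side, this pseudo-streaming behavior comes from the MP4 module (compiled in with --with-http_mp4_module). A sketch, with illustrative buffer sizes:

```nginx
location /videos/ {
    mp4;                        # enables pseudo-streaming with ?start= support
    mp4_buffer_size     1m;
    mp4_max_buffer_size 5m;
}
```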
Extended Monitoring and Logging
Logging failures and finding problems in a production farm is extremely crucial if you are to run a successful web service. Monitoring a web server on a regular basis, however, is a challenging and time-consuming task for any IT pro.
The more servers you have, and the more traffic you get, the harder it becomes. There are all sorts of nasty people out there with ulterior motives to bring your website down and disrupt your web service. The best way to ensure safety, hence, is to be cautious and alert: log as much as possible and react to issues proactively.
Nginx writes information about issues it encounters to a file called an error log. Windows users may consider it similar to an event log. You can configure what it logs based on severity levels. For example, if you tell Nginx to log only at error severity and above, it will not log warnings at all.
It also has an access log that is similar to the W3C logs created by other web servers. You can change the fields you would like to log, and even configure it to ignore common status codes like 2xx and 3xx. This is a pretty neat feature, since it ends up creating much smaller log files instead of the huge ones that busy servers otherwise generate.
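In configuration terms, both ideas might look like the following sketch inside the http block (the paths and the status filter are illustrative):

```nginx
error_log  /var/log/nginx/error.log  warn;   # log warn severity and above

# Skip the common 2xx/3xx responses in the access log:
map $status $loggable {
    ~^[23]   0;
    default  1;
}
access_log /var/log/nginx/access.log combined if=$loggable;
```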
The way Nginx is designed, you can easily upgrade it. You can also update its configuration while the server is running, without losing client connections. This allows you to test your troubleshooting approach, and if something doesn't work as desired, you can simply revert the settings.
Nginx brings a very interesting way of controlling your processes. Instead of bringing the entire service down, you can send signal values to the master process by using the nginx command with a switch. You will learn about it in detail in upcoming chapters, but for now imagine saying something like nginx -s reload, a command that will simply apply the configuration changes gracefully, without dropping client connections. Simple, but effective!
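The -s switch accepts a handful of signals; a quick sketch:

```shell
nginx -s reload   # re-read the configuration; workers pick it up gracefully
nginx -s reopen   # reopen log files (handy for log rotation)
nginx -s quit     # graceful shutdown
nginx -s stop     # fast shutdown
```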
Upgrades without Downtime Using Live Binaries
This is probably one of the most powerful features of Nginx. In the IIS or Apache worlds, you can’t upgrade your web server without bringing the service down. Nginx spawns a master process when the service starts. Its main purpose is to read and evaluate configuration files. Apart from that, the master process starts one or more worker processes that do the real work by handling the client connections.
If you need to upgrade the binary, there are simple steps and commands you need to issue in order to make the new worker processes run in tandem with the older ones. New requests will be sent to the newer worker processes that have the latest configuration loaded. If, by any chance, you find that the upgrade is causing issues, you can simply issue another set of commands that will gracefully return the requests to the older processes, which still have the previous working configuration loaded. How neat is that?
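Assuming the new binary is already in place and $PID holds the old master's process id (from nginx.pid), the procedure sketches out roughly like this:

```shell
kill -USR2 $PID    # start a new master and workers using the new binary
kill -WINCH $PID   # gracefully shut down the old worker processes
kill -QUIT  $PID   # all good? retire the old master
# To roll back instead: send HUP to the old master to revive its workers,
# then QUIT the new master.
```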
Enterprise Features of Nginx Plus
Nginx has two versions. The basic version is free, and the paid option is called Nginx Plus. Nginx Plus has quite a few important features that are very helpful for managing busy sites. Choosing Nginx Plus helps you save a lot of time. It has features like load balancing, session persistence, cache control, and even health checks out of the box. You will be learning about the overall differences shortly in this chapter.
Support Available with Nginx Plus
Community support is free for Nginx, but e-mail and phone support are not. Nginx Plus comes packaged with support options. You can buy different kinds of support options based on your need and criticality of the business. Nginx Plus contains many additional benefits as you will see in the next section.
Nginx is a reliable solution for any website or service that is looking for scalability, high performance, and reliability. You can download it directly from the website and build the binaries yourself as discussed earlier. However, there are a few modules that are not available unless you license Nginx Plus. The key difference here is that while Nginx is available in source form that you can compile according to your needs, Nginx Plus is available only in binary form.
The core features (HTTP server, core worker process architecture, SPDY, SSL termination, authentication, bandwidth management, reverse proxy options for HTTP, TCP, and Mail) are available in both Nginx and Nginx Plus.
Load balancing and application delivery are not available in the same capacity, though. Nginx Plus provides features, discussed in this section, that are not available in Nginx.
Advanced HTTP and TCP Load Balancing
Nginx Plus enhances the reverse proxy capabilities of Nginx. Imagine Nginx Plus as Nginx running on steroids. There are four methods of load balancing in Nginx that are common to both versions: Round-Robin, Least Connections, Generic Hash, and IP Hash.
Nginx Plus adds the least time method in its stack (more on these methods later). The load balancing methods in Nginx Plus are extended to support multicore servers in an optimized way. The worker processes share the load balancing state among each other so that traffic can be distributed more evenly.
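As a rough sketch, switching methods is a one-line change in the upstream block (the server names are hypothetical):

```nginx
upstream backend {
    # round-robin is the default; uncomment one line to switch methods
    # least_conn;              # fewest active connections
    # ip_hash;                 # pin clients by IP address
    # hash $request_uri;       # generic hash on a key of your choice
    # least_time header;       # Nginx Plus only
    server app1.example.com;
    server app2.example.com;
}
```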
HTTP is a stateless protocol. You make a request, the server responds, and that’s it. But you may argue that this is not what it feels like. For instance, you go to your mail server, log in, and check your mail. If you right-click a message and open it in a new window, it doesn’t reauthenticate you. If the request was stateless, how would such a thing be possible?
The logging-in behavior makes it appear that the server knows you. To make this happen, plenty of things have to happen in the background. Cookies, sessions, and timeouts typically govern how the websites behave for logged-on users.
This implies that if your session or cookie is lost or tampered with, you will be logged out automatically. It also implies that there is "some" work done on the server side for every user. It makes a lot of sense, then, that if a request from User A has gone to Server 1, subsequent requests from User A go to the same Server 1. If this doesn't happen and the request ends up at Server 2, the user is asked to reauthenticate. This behavior is referred to as session persistence. The Nginx Plus load balancer identifies and pins all requests in a session to the same upstream server. It also provides a feature called session draining, which allows you to take a server down without interrupting established sessions.
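In Nginx Plus, this pinning is typically done with the sticky directive; a sketch (the cookie name and servers are illustrative):

```nginx
upstream backend {
    server app1.example.com;   # hypothetical servers
    server app2.example.com;
    sticky cookie srv_id expires=1h path=/;   # pin a session to one server
}
```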
Content Caching Enhanced Capabilities
Caching is an activity by which the server temporarily holds a static resource, so that it doesn't need to be retrieved from the back end every time a request comes in for the same resource. It improves speed and reduces load on the back-end servers.
Nginx Plus can cache content retrieved from upstream HTTP servers as well as responses returned by FastCGI, SCGI, and uwsgi services. The cached object is persisted on the local disk and served as if it were coming from the origin.
However, there is a caveat to caching: what if the content on the back end has changed? The server would keep sending older files to the client, which is not what you want. To avoid such scenarios, Nginx Plus allows purging of the cache. You can use one of the many tools available to purge the cache, removing a selected subset of requests or everything if you need to.
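Basic content caching itself might be sketched like this (the zone name, sizes, and lifetimes are illustrative, and the upstream is assumed to exist):

```nginx
proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m max_size=1g;

server {
    location / {
        proxy_cache       app_cache;
        proxy_cache_valid 200 302 10m;      # illustrative lifetimes
        proxy_pass        http://backend;   # assumed upstream group
    }
}
```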
Application Health Checks
Nobody likes to visit a site that is down. If your site suffers frequent outages, it is likely that people will lose trust soon. Health checks are a way to let Nginx handle failures gracefully. Who wouldn't like a self-healing and self-servicing robot? A health check is like a robot that drives itself to the service station when it senses it is not performing well.
Health checks continually test the upstream servers and instruct Nginx Plus to avoid servers that have failed. This simply implies that failing servers will be "taken care of" automatically, and your end users won't see the error pages they might have seen if no real person were monitoring the servers.
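A sketch of an active health check in Nginx Plus (the shared zone is required so that workers agree on server state; the names and intervals are illustrative):

```nginx
upstream backend {
    zone   backend 64k;          # shared memory for server state
    server app1.example.com;     # hypothetical servers
    server app2.example.com;
}

server {
    location / {
        proxy_pass   http://backend;
        health_check interval=5 fails=3 passes=2;   # Nginx Plus directive
    }
}
```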
If yours is a very busy site, this feature can be considered as one of the biggest reasons why you should go with Nginx Plus!
HTTP Live Streaming (HLS) and Video on Demand (VOD)
Before learning about HTTP live streaming, let us explain the differences between streaming, progressive downloads, and adaptive streaming. This will help you understand why Nginx plays a special role in this arena.
With bandwidth increasing every day and costs dropping, delivering rich content has never been easier. Technically, there is a media file that you send to the browser or mobile device so that it just plays. The problem is that the size can be overwhelming to download. Clients want the content to start playing as soon as possible, and there are multiple ways to make that happen.
When you stream content, you typically mean that the viewer clicks a button and video/audio starts playing after an initial amount of buffering. At the back end, you will need dedicated streaming software. This software will ensure that the data rate of the encoded file is less than the available bandwidth, so that the file can be streamed through the limited bandwidth at your disposal. Keep in mind that every piece of streaming software has its own requirements for media files so that it can function as expected.
In contrast to streaming, progressive download enables you to use simple HTTP web servers. The video delivered using this technique is typically stored at the client side and played directly from the hard drive. This is a big difference, since streaming media is not stored locally at all! From a user experience perspective, the software takes care of playing the file as soon as enough content is downloaded. Sites like YouTube, CNN, and many other video sites don't use streaming servers; they deliver content using progressive download. Since the data is stored locally before playing, the user experience is a lot better than with plain streaming.
Adaptive streaming, as the name suggests, is streaming with a twist: it automatically adapts to the client's bandwidth. It uses streams in such a way that when the connection is good, the viewer gets higher-quality content, and as you can guess, if the connection quality deteriorates, a lower data rate is chosen. This also means that the video might get too blurry at times, and users will blame the service rather than their own network connection. You will need dedicated streaming software to do adaptive streaming.
That little detour should have given you a reasonably decent understanding of where Nginx fits. Nginx is widely used to deliver MP4 and FLV video content using progressive downloads. It is very efficient at delivering content due to its non-blocking I/O architecture and support for a huge number of concurrent connections.
Nginx Plus takes it even further. It allows you to support adaptive streaming functionality for video-on-demand services. This way, the bitrate is automatically adjusted in real time. It also has bandwidth throttling capabilities so that the fast clients and download accelerators don’t suck up your entire bandwidth.
Nginx Plus uses the HLS/VOD module to provide even more flexibility and support for H.264/AAC. This helps a lot, since you don't have to repackage the MP4 content for adaptive streaming; it provides real-time transformation from MP4 to HLS/MPEG-TS. There are other modules that you can use together so that intellectual property is not compromised.
HTTP Dynamic Streaming (HDS/VOD)
HTTP Dynamic Streaming is an alternative method for delivering adaptive streaming media to your clients. It uses different file formats that are prepared ahead of time using Adobe's f4fpackager tool. This tool generates the files that are necessary for the clients; the Nginx f4f handler simply delivers them.
Bandwidth Management for MP4 Media
With Nginx Plus, you have multiple directives that can be used to limit the rate of download. Essentially, they define limits that activate after a specified time. This helps protect against denial-of-service attacks, because users who put more load on the servers are automatically identified and throttled.
Another smart thing it does is allow the content to stream without any limit for the first N seconds so that the data is buffered appropriately; after that, the limit automatically applies. On one hand this gives clients quicker play time, and on the other hand it discourages download accelerators.
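With the Nginx Plus MP4 module, that idea sketches out as follows (the rate factor and the unthrottled window are illustrative):

```nginx
location /videos/ {
    mp4;
    mp4_limit_rate       1.2;   # cap delivery at 1.2x the file's bitrate
    mp4_limit_rate_after 30s;   # but let the first 30 seconds fly unthrottled
}
```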
Live Activity Monitoring
Nginx Plus comes with a real-time activity monitoring interface. It is quite friendly and easy to use. For a live view of a demo website, try http://demo.nginx.com/status.html. As you can see, information about current connections, requests, and many other counters is listed there. Notice how clearly it shows that there are a couple of problems in the upstream servers. This interface is exposed through HTTP, which implies that you can access it from a browser without logging on to your server.
Nginx Commercial Support
Sometimes, when you face a challenge in a production farm and your team is not able to resolve issues, you can rely on community support. However, there is no direct accountability and no guarantee that your issue will be resolved.
At that point, the commercial support option offered with Nginx Plus comes to the rescue. You will have the experts from the Nginx support team covering your back. Standard support covers you during business hours (9 a.m. to 5 p.m.), whereas premium support covers you 24/7. With premium support, you get phone support as well.
In case it is found that the issue is due to a bug in the software, premium support can help you get the bug fixed as soon as possible. In short, premium support is the best and fastest support you can get from Nginx Inc.
Nginx and Apache are both versatile and powerful web servers. Together they serve more than 70 percent of the top million websites. At times they compete, but often they are found complementing each other. One important thing to point out here is that they are not entirely interchangeable; you will need to pick them carefully according to your workload.
The Apache Software Foundation (ASF) is the umbrella under which Apache is developed. Apache has been around since 1995 and has been developed under the ASF since 1999. It is the clear winner today in terms of overall market share. Apache has widespread support, and you will find plenty of expertise for hire to solve your hosting needs. Nginx is the new kid on the block and has seen widespread adoption since 2008. Between June 2008 and June 2015, it grew from 2 percent to 21 percent among the top million sites. For the top 10,000 websites, the story is even better. It has grown mostly at the cost of Apache, which saw its market share drop from 66 percent to 49 percent in the same time period.
For Apache users, there is a choice of multiprocessing modules (MPMs) that control the way requests are handled. You can choose among mpm_prefork, mpm_worker, and mpm_event. Basically, mpm_prefork spawns a process for every request; mpm_worker spawns processes that in turn spawn and manage threads; and mpm_event is a further optimization of mpm_worker in which Apache juggles the keep-alive connections using dedicated threads. If you haven't already noted, these changes are all for the better and evolutionary.
Nginx was created to solve the concurrency problem, and it did so with a new design altogether. It spawns multiple worker processes that can each handle thousands of connections! It is completely asynchronous, non-blocking, and event-driven. It consumes very few resources and helps reduce the cost of scaling out a web server. The web server can be upgraded on the fly without losing connected visitors, reducing the downtime of your service.
Nginx needs fewer resources than Apache because of its new architecture. Fewer resources = Lower cost = More profit.
Apache is present more widely across operating systems, whereas Nginx is not. Most of the popular Linux distros carry Nginx, which can be downloaded using rpm, yum, or apt-get, but it is almost always an extra step. Consider this: to install Apache (on CentOS 7), you can run the following command and everything is all set (there is a dedicated chapter coming up with all the details):
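```shell
sudo yum install httpd   # Apache's package is called httpd on CentOS
```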
Installing Nginx takes a few more steps. First, create a file called /etc/yum.repos.d/nginx.repo and add the following text:
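The repository definition below follows the official nginx.org packaging instructions for CentOS 7:

```
# /etc/yum.repos.d/nginx.repo
[nginx]
name=nginx repo
baseurl=http://nginx.org/packages/centos/7/$basearch/
gpgcheck=0
enabled=1
```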
Then use the following command:
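```shell
sudo yum install nginx
```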
It is not that it is hard; it just needs a few extra steps to make it work. With growing popularity, it is likely that over time it will become more readily available.
Proxy and Load Balancing Server
Nginx was designed as a reverse proxy that doubles up as a web server. This is quite different from Apache, which was designed as a general-purpose web server. This gives Nginx an edge, since it is more effective in dealing with a high volume of requests. It also has good load balancing capability. Quite often, Nginx acts as a web accelerator by handling requests at the front end and passing them to the back-end servers when required. So, Nginx on the front end and Apache on the back end gives you the best of both worlds. They are more complementary than competing from this perspective.
Static vs. Dynamic Content
As mentioned earlier, Nginx has a clear advantage when serving static content. The dynamic content story is quite different, though. Apache has a clear early-mover advantage here, with built-in support for PHP, Python, Perl, and many other languages. Nginx almost always requires extra effort to make it work with them. If you are a Python or Ruby developer, Apache might be a better choice, since it does not need CGI to execute your code. Even though PHP has good support on Nginx, you still need to dedicate a little time to getting PHP-based solutions to work directly on Nginx. For example, installing WordPress on a LAMP stack is super easy, and even though it can be done without much trouble on a LEMP stack, you will still need to configure some nuts here and some bolts there. You get the idea!
Apache's basic configuration ideology is drastically different from Nginx's. You can have a .htaccess file in every directory (if you like) through which you can give Apache additional directions about how to respond to requests for that specific directory. Nginx, on the other hand, interprets requests based on the URL rather than the directory structure (more on this later); it doesn't even process .htaccess files. This has both merits (better performance) and demerits (less configuration flexibility). Although for static files the requests are eventually mapped to files, the core power of parsing the URI comes into play when you use Nginx for scenarios like mail and proxy server roles.
If you come from an Apache background, you will need to unlearn a lot of concepts while migrating to Nginx. The differences are many from a configuration perspective, but most people who migrate from either side say that once you learn the nuances, you will find the Nginx configuration quite simple and straightforward!
Modules (or Plug-Ins)
Both Apache and Nginx have a robust set of modules that extend the platform. There is still a stark difference in the way these extensions are added and configured. In Apache, you can dynamically load or unload modules using configuration, but in Nginx you are supposed to build the binaries using different switches (more on this in the next chapter). It may sound limiting and less flexible (and it is), but it has its own advantages. For example, the binaries won't have any unnecessary code inside them. It requires, and forces, you to have a prior understanding of what you need your specific web server to do.
It is also good in a way, because it is often seen that web administrators end up installing many more modules than they need. Every unnecessary module loaded into memory means wasted memory and CPU cycles. Obviously, if you are wasting those cycles due to a lack of planning, it all adds up eventually and you get poorer performance from the same hardware!
Due to the early mover advantage, Apache has a lot to offer when it comes to documentation. The web is full of solid advice, books, blogs, articles, trainings, tools, use cases, forum support, configuration suggestions, and pretty much everything you will need from an Apache web-administration perspective.
The documentation for Nginx has been evolving and getting better rapidly, but it is still far less extensive than Apache's. That doesn't mean it is bad; it just means that it is still catching up in this area, and most likely it will become better as more and more people join in.
The support system for Apache is very mature. There are a lot of tools available to help you maintain your web server well. There are plenty of third-party companies that support Apache by providing different support levels. There are IRC channels available as well, which makes community support easier.
Nginx does have support as mentioned earlier, but lacks the richness and maturity because of its late entry. The fact that it is straightforward, simple to configure, and robust helps Nginx to a great extent. In a way, simplicity is where Nginx wins and with time and adoption, it can only get better!
Making a decision about which server to choose is often a debatable subject. What's more, IT pros typically get used to working with specific software. Nginx acts as a complementary solution to most web servers. The idea is not to replace your existing infrastructure completely, but to augment it with Nginx in ways that get you the best of both worlds. In the upcoming section you will learn about the reasons why you should seriously consider adding Nginx servers to your web farm.
Online users today have a very low threshold of tolerance for slow websites. With smartphones and tablets at everyone's fingertips and so much social data to consume, everybody seems to be in a rush. Innovation alone, hence, will not cut it; the website has to perform equally well. As if this were not enough, Google now incorporates page load time into its search rankings. In essence, poorly performing websites will find it increasingly difficult to succeed.
Fast page load times build trust in your site and lead to more returning visitors. If your site is slow, you are most certainly going to lose visitors to your competition. Recent surveys reveal that users expect a page to load in less than 2 seconds, and 40 percent of them will abandon your website if it takes more than 3 seconds!
Nginx has solved the performance problem, and that is one of the biggest reasons for all the praise and awards it bags. It is extremely fast and shines even under high load.
It Can Accelerate Your Application
Not only is Nginx extremely fast, but it can also act as an acceleration toolkit for your existing application. The idea is to drop Nginx in front of an existing set of web servers and let it take care of routing traffic to the back end intelligently. This way, you can offload a lot of tasks to Nginx and let your back-end servers handle the more data-intensive work. In effect, you will find that users have been served their content while your back end was still churning out data.
It Has a Straightforward Load Balancer
Setting up a hardware load balancer is quite costly and resource intensive. It requires a lot of expertise to handle and also takes a considerable amount of time to set up. After a physical installation of the devices, you can definitely reap the rewards from your hardware load balancer, but you are locked in with the solution and hardware that may require servicing at times. In any case, you add one more layer of complexity in your infrastructure by using a hardware load balancer.
With Nginx you can set up a pretty straightforward and fast software load balancer. It can immediately help you out by sharing load across your front-end web servers.
It Scales Well
With Apache and IIS, it is a common pain: the more connections, the more issues. These servers solved a big problem around bringing dynamic content to the web server instead of just static files, but scalability has always been a challenge. Keep in mind that scalability and performance are not the same problem.
Let's say you have a server that can handle 1,000 concurrent connections. As long as the requests are short and the server is able to handle 1,000 connections/second, you are good. But the moment a request starts taking 10 seconds to execute, the server simply starts crawling and you see a domino effect where one thing fails after another. If you have large files available for download, your server will most likely choke with a high number of concurrent connections. Apache and IIS are not suitable for this kind of load, simply because of the way they have been architected; they are also prone to denial-of-service (DoS) attacks. Unfortunately, adding more resources like CPU and RAM doesn't help much. For example, doubling the RAM or CPU doesn't mean the server will suddenly handle 2,000 concurrent connections. As you can see, the issue is not with performance, but with scale.
Nginx is one of the very few servers (along with Node.js) capable of addressing this issue, which is often referred to as the C10K problem (a term coined in 1999 by Dan Kegel for handling 10,000 concurrent connections).
You Can Upgrade It On the Fly
Nginx provides you the ability to reconfigure and upgrade running instances on the fly without interrupting customer activity. It is an extremely important capability, because every server and every service needs patching at times. With Nginx you can patch your production environment reliably without completely bringing down your service levels.
It’s Affordable to Install and Maintain
Nginx performs pretty well even on servers with a very low hardware footprint. Even with default settings, you can get much more throughput from an Nginx server compared to Apache or IIS.
It’s Easy to Use
Don't be intimidated by the lack of a user interface (UI). Nginx is easy if you understand how to use it. The configuration system is pretty well thought out, and once you get up to speed, you will thoroughly enjoy it!
In simple words, a web server is a server that hosts an application that listens for HTTP requests. It is the web server's responsibility to hear (i.e., to understand HTTP) what the browser is saying and respond appropriately. Sometimes this is as simple as fetching a file from the file system and delivering it to the web browser. At other times, it delegates the request to a handler that performs complicated logic and returns the processed response to the web server, which in turn transfers it back to the client! Typically, the server that hosts web server software is termed a web server or a web front-end server.

If you are new to the web server world, don't worry. By the time you are done reading, you will have a good grasp of the subject. Although there are quite a few web servers around, three dominate: Apache, Microsoft Internet Information Services (IIS), and Nginx combined have captured around 85 percent of the market. Each web server has its space and its user base. When you are making a choice, you should evaluate wisely based on your workload. It is extremely crucial to make a diligent effort while you are setting up your web server, since migration from one web server to another is typically a painful exercise; sometimes it is just not possible, and you have to rewrite a lot of code.

Historically, the fight for market share was between Apache and IIS, until Nginx showed up. Since then, Nginx has received its fifth consecutive "Web Server of the Year Award" from W3Techs in 2015. That is a testament to the power of Nginx, and why it should not be ignored for your web hosting needs.
A dedicated server may be compared to a safety deposit box held within a bank. A client ‘rents’ a safety deposit box in a Swiss bank, which allows him to store anything he desires within the box as long as it fits. If his items are too large, he can acquire a larger box. His rent entitles him to a ‘dedicated’ storage space for his personal items, which are kept separate from the items of the bank’s other clients. However, despite the extended freedom compared to ‘shared’ storage space, the client has no ownership or control over the actual physical box. In other words, the client has no right to put his own lock on the box, nor is he entitled to mark or paint it. Furthermore, the client has no control over the physical storage of the box or the manner in which it is physically protected, as this is a service historically offered by banks and over which banks retain strict control. In order to provide ‘dedicated’ space, the bank utilises physical equipment: the safety deposit box. Despite the use of the safety deposit box, the client is ‘renting’ storage space, not physical property of the bank. The ‘safety deposit box’ is not the object of the service but the equipment required to provide the object of the service: separate and safe storage. The dedicated server is a form of virtual ‘safety deposit box’ in respect of which the physical server equipment is merely a device that facilitates the provision of secure virtual storage.
In shared hosting, the client ‘rents’ purely virtual storage space shared by multiple users. A ‘shared server’ is comparable to storage space rented for gold shares acquired on the stock market, where the actual gold is held at Fort Knox, presupposing that Fort Knox holds privately owned gold rather than US federal gold reserves. On this hypothesis, the gold owner never has access to Fort Knox and controls neither the manner in which the gold is stored nor the security systems protecting it. Multiple investors own shares in the gold housed within Fort Knox, and while the investors retain the rights to, and ownership of, their investments, they have no say in the manner in which the gold is stored, protected, or maintained. Furthermore, the investor may not dictate the type of valuable stored in Fort Knox: he may store any type of gold bullion, but only gold bullion and not, for example, diamonds. Fort Knox in turn provides secure storage for the gold investment. ‘Shared hosting’ is the virtual equivalent of such warehouse storage; however, instead of gold, the client stores data and programs. The client may not change the core technologies, being “the most prevalent ‘stack’ of technologies in hosting”. The “core stack of technologies”, essentially comprising the “operating system”, “website application server software”, “the database server software”, and “language” of the server system, may not be changed by the client, nor may an alternative “core stack of technologies” be downloaded or installed. This is comparable with the Fort Knox storage restrictions, whereby any form of gold bullion may be stored, but not diamonds; in this example, the ‘gold bullion’ in Fort Knox is the “core stack”. Thus, the client may store any form of data and may download or install software, provided that such installation conforms to the storage restrictions applicable to the ‘shared hosting’ server.
Classroom computing in graduate education continues to grow as more and more schools include the use of sophisticated software programs in their curriculums. Unfortunately, many of these statistics and modeling applications are quite expensive and require significant processing power. The Graduate School of Business and Public Policy at the Naval Postgraduate School is using server-based computing to control software costs and improve the performance of applications. This paper describes the school’s use of Microsoft’s Remote Desktop Services to deliver applications to networked student computers. The virtual delivery of the software, which runs on a server, eliminates the need to install the software on every student computer. Depending on the software licensing structure, this can significantly reduce the required number of licenses. For some applications, it can also dramatically improve performance.
Introduction
In 2004 the Graduate School of Business and Public Policy (GSBPP) at the Naval Postgraduate School built a prototype smart classroom seating 45 students, with networked laptop PCs at every seat. Infusing computer technology into the traditional lecture-based classroom proved to be a resounding success, and that classroom quickly became the most frequently requested room every quarter. Faculty reported they could cover up to 20% more material in the same amount of time. The improved efficiency was the result of professors being able to optimize classroom time by using computer-based tools whenever appropriate, rather than having to wait for the specific hour of the week when they had access to a computer lab. In the past, courses would be divided between lecture-based classroom time and one or two hours per week of computer lab. Another instructional example is the use of the Internet to access on-line databases, such as federal budget information, in order to bring current budget issues into the classroom at the same time they are being addressed by the government. Thanks to the concurrency of access to the data and the issues at hand, the relevance of the materials becomes immediately apparent to the student (Doyle, 2010). Beyond the instructional advantages, research has also shown a significant increase in the level of student interaction when computer-mediated communications are incorporated into the education process (Brinkley, 2003). The success of the prototype project generated a demand to install computers for every student in as many classrooms as possible. As of May 2013, GSBPP maintains approximately 200 computers spread across six classrooms and another 35 laptops in a mobile cart, ready to deploy to any of the other classrooms as needed. The school tries to adhere to a three-year lifecycle replacement plan to keep the systems up to date. Unfortunately, budget constraints often preclude replacement of the systems exactly as planned.
The high cost of analytical software competes directly with the purchase of new hardware when limited resources are available.
Setting
The Graduate School of Business and Public Policy is one of four academic schools that make up the Naval Postgraduate School (NPS). NPS is located in Monterey, California, and was established in 1909 to serve the advanced educational needs of the United States Navy. It has since been expanded to support students from the other U.S. military services and foreign countries as well. The total student population consists of approximately 1,500 students drawn from all branches of the U.S. defense community and the military services of more than 25 allied nations. NPS is a well-diversified, fully accredited graduate school with a proud history of academic excellence. This paper focuses on the classroom technologies employed by the Graduate School of Business and Public Policy (GSBPP). Classrooms within GSBPP are designed to accommodate an average of thirty to forty students. Continuously seeking to improve, GSBPP evaluates and considers the adoption of new technologies that can further improve teaching effectiveness and efficiency.
Results for GSBPP
As stated earlier, GSBPP maintains approximately 235 computer systems for use in the school’s resident graduate programs. It is the school’s policy to give professors as much autonomy as possible in choosing the software they wish to teach with. One benefit of this approach is that it ensures the professors are proficient and comfortable teaching their respective applications to their students. A disadvantage is that some programs are used by only a few professors for just a few courses, which makes it impractical to buy and install every application on every GSBPP computer system. One example is a statistical analysis program called Crystal Ball. This program was requested by only one professor, whose class size was normally less than 25. GSBPP could not install the software in just one classroom because the school’s academic scheduling model is such that classroom assignments change every quarter; GSBPP had no way of knowing in which classroom Crystal Ball would be needed before the quarter began. Nor could GSBPP afford to install the software on all 235 systems: the total annual cost for 235 licenses would have been approximately $22,000. The GSBPP solution was to purchase the needed 25 licenses, at a cost of only $3,000, and install the software on the RDS application server. This facilitated the virtual delivery of the software to whichever classroom required it. The only restriction was the limit of 25 concurrent users. In this case, the use of the RDS application server meant the difference between approving the use of Crystal Ball and having to deny it due to budget constraints. Another important benefit of the RDS virtual delivery model is the reduced technical support man-hours needed to install, configure, and maintain the software.
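The licensing arithmetic in the Crystal Ball example works out as follows (the dollar figures are as reported above; per-seat prices are implied, not quoted):

```python
# Cost comparison for the Crystal Ball licensing example above.
all_seats_cost = 22_000  # approximate annual cost to license all 235 systems
rds_cost = 3_000         # cost of the 25 licenses served via RDS
savings = all_seats_cost - rds_cost

print(f"Annual savings: ${savings:,}")                    # $19,000
print(f"Cost reduction: {savings / all_seats_cost:.0%}")  # 86%
```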
GSBPP only needs to maintain a single instance of the software loaded on the server, instead of installing, configuring, and updating the software on each end-user system. Both the cost savings and the reduced technical support man-hours were anticipated outcomes, given the known benefits of centralization and virtualization. Another, unexpected benefit was realized during the post-installation testing phase: a significant increase in application performance. GSBPP developed a specific statistical model to evaluate and benchmark the performance of the software. The model was executed in several different usage scenarios in order to compare performance under different situations. The first test was designed to establish the performance baseline of how the software ran in the old stand-alone environment. For this test, the software was loaded and run from the local student computer hard drive. This represented what could be expected if GSBPP had purchased a license for each student computer instead of using the RDS application server. It took the stand-alone computer 33 seconds to finish the model’s calculations. The next test used the RDS application server to run the software with only one client connection. This time the model was completed in just 19 seconds. This 42% improvement is readily explained by the much higher grade of computer used for the server compared to the student computers. The next test was designed to measure performance with multiple client connections, since the job of an RDS application server is to support many users. There would be times when the same model would be executed by multiple students simultaneously. To test this scenario, several users were recruited to execute the model on seven different clients at exactly the same second. This instantaneous access would show GSBPP what to expect when a professor was teaching the software in a classroom environment.
GSBPP expected some decrease in performance, which would represent the sacrifice needed to achieve the many benefits of virtualization. Instead, there was yet another increase in performance, and the model completed in just 11 seconds. The hypothesis for this additional improvement is that much of the software’s code was still in the server’s cache memory when each subsequent client needed to access it. Having the needed code available in cache memory saved the server from having to load it from the internal hard drive.
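The reported timings translate into the following improvements (a quick check of the 42% figure cited above, using the same formula):

```python
# Run times, in seconds, from the three test scenarios described above.
standalone, single_client, seven_clients = 33, 19, 11

def improvement(old, new):
    # Fractional reduction in run time relative to the old time.
    return (old - new) / old

print(f"{improvement(standalone, single_client):.0%}")  # 42%, as reported
print(f"{improvement(standalone, seven_clients):.0%}")  # 67% vs. the stand-alone baseline
```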
Conclusion and Findings
As of May 2013, the school’s current iteration of the RDS application server had been operational for six months. During this time, all of the RDS benefits listed above were confirmed. The primary goals of achieving cost savings and a more efficient, centralized configuration management environment were clearly met. The added benefit of improved application performance was also sustained throughout the six months of real-world classroom teaching. GSBPP does not expect all applications to achieve the same performance benefits as measured for Crystal Ball. However, the testing does show that the RDS model offers significant benefits in the academic environment and should be considered as part of an overall IT strategy.
Today, most large companies maintain virtual private networks (VPNs) to connect their remote locations into a single secure network. VPNs can be quite large, covering more than 1,000 locations, and in most cases use standard Internet protocols and services. Such VPNs are implemented using a diverse set of technologies, such as Frame Relay, MPLS, or IPSEC, to achieve privacy and performance isolation from the public Internet. Using VPNs to distribute live content has recently received tremendous interest. For example, a VPN could be used to broadcast a CEO-employee town hall meeting. To distribute this type of content economically without overloading the network, the deployment of streaming caches or splitters is most likely required. We address the problem of optimally placing such caches to broadcast to a given set of VPN endpoints under the constraints typically found within a VPN. In particular, we introduce an efficient algorithm with complexity O(V), where V is the number of routers; it guarantees the optimal cache placement if interception is used for redirection. We prove that the general problem is NP-hard and introduce multiple heuristics for efficient and robust cache placement suitable under different constraints. At the expense of increased implementation complexity, each heuristic provides additional savings in the number of caches required. We evaluate the proposed solutions using extensive simulations; in particular, we show that our flow-based solution is very close to the optimal. We study the problem of placing cache servers in VPNs to provision for unicast-based video streaming events. Our goal is to satisfy a given client population with the minimal number of cache servers. Given a bound on the number of cache servers, we add the additional goal of placing them in such a way as to minimize the total bandwidth usage. We developed provably optimal algorithms to achieve both goals using a solution based on interception cache servers.
In addition, we prove that the problem is NP-hard in general. We then develop a set of simple and efficient heuristics to provide reasonable solutions to the cache placement problem if non-interception-based redirection is used. Each of these solutions provides an additional improvement in cache reduction, at the expense of increased implementation cost, compared to interception-based redirection. In addition to these theoretical results, we performed extensive simulations to evaluate the performance of our algorithms in realistic settings. We discovered in particular that if non-interception-based redirection systems are used, the number of caches can be reduced by more than 27% using our heuristics compared to the greedy strategy for interception-based redirection. Additionally, in large networks, if redirection is based on individual client IP addresses, our heuristics reduce the number of caches by 17% compared to the case where redirection is based on the router or entire IP prefix ranges. If technologies such as MPLS are available to perform source routing, we show that a flow-based algorithm can reduce the number of caches by up to 40% and is very close to the actual optimal. For future work, we intend to validate our algorithms by implementing them within a large VPN. We also plan to evaluate the robustness of the flow-based algorithm in the presence of inaccurate input data and to more thoroughly study the bandwidth savings achievable with our proposed dynamic programming algorithm.
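The paper’s own algorithms are not reproduced here, but the flavor of capacity-driven cache placement on a router tree can be sketched with a toy bottom-up greedy. The tree encoding, the per-node demand, and the single per-cache stream capacity are all simplifying assumptions made for illustration, not the paper’s model:

```python
# Toy greedy cache placement on a router tree. Each cache (or the
# origin at the root) can source at most `capacity` unicast streams;
# once a cache is installed at a router, that subtree needs only one
# feed stream from upstream.
def place_caches(children, demand, root, capacity):
    placements = []

    def visit(node):
        # Total streams this node must pull from its parent.
        need = demand.get(node, 0) + sum(visit(c) for c in children.get(node, []))
        if need > capacity and node != root:
            placements.append(node)  # install a cache here
            return 1                 # the cache itself needs one upstream feed
        return need

    root_need = visit(root)
    return placements, root_need

# Example: router A aggregates 6 client streams, exceeding capacity 4,
# so a cache lands at A and the root only sources 3 streams in total.
children = {"R": ["A", "B"], "A": ["a1", "a2"]}
demand = {"a1": 3, "a2": 3, "B": 2}
print(place_caches(children, demand, "R", 4))  # (['A'], 3)
```

The real problem is NP-hard in general, as the abstract notes; this sketch only illustrates why caches tend to appear where aggregated demand crosses a capacity threshold.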
The Secure Socket Layer (SSL) and its variant, Transport Layer Security (TLS), are used to secure communication with servers. In this paper, we characterize the cryptographic strength of public servers running SSL/TLS. We present a tool developed for this purpose, the Probing SSL Security Tool (PSST), and evaluate over 19,000 servers. We expose the great diversity in the levels of cryptographic strength supported on the Internet. Some of our discouraging results show that most sites still support the insecure SSL 2.0, weak export-grade encryption ciphers, or weak RSA key strengths. We also observe encouraging behavior, such as sensible default choices by servers when presented with multiple options, the quick adoption of AES (more than half the servers support strong-key AES as their default choice), and the use of strong RSA key sizes of 1024 bits and above. Comparing the results of running our tool over the last two years points to a trend that is moving in the right direction, though perhaps not as quickly as it should.
Cryptography is an essential component of modern electronic commerce. With the explosion of transactions being conducted over the Internet, ensuring the security of data transfer is critically important. Considerable amounts of money are being exchanged over the network, whether through e-commerce sites (e.g., Amazon, Buy.com), auction sites (e.g., eBay), on-line banking (e.g., Citibank, Chase), stock trading (e.g., Schwab), or even government sites (e.g., irs.gov). Communication with these sites is secured by the Secure Sockets Layer (SSL) or its variant, Transport Layer Security (TLS), which provide authentication, privacy, and integrity. A key component of the security of SSL/TLS is the cryptographic strength of the underlying algorithms used by the protocol. It is crucial to ensure that servers using the SSL protocol have employed it properly. For example, it should be determined whether site administrators are following best practices, are aware of their sites’ unaddressed vulnerabilities, and are promptly reacting to CERT advisories. Poor use of cryptography may be an indicator of poorly administered security, and experience in the related areas of patch management and virus/worm propagation is not encouraging. The recent interest in SSL-based VPNs only increases the need to study SSL.
One key feature of SSL/TLS is that it allows negotiation between two peers. Different implementations will not necessarily support the same cryptographic algorithms. Thus SSL allows two peers to determine a subset of common cryptographic routines. This allows for the interoperability and extensibility of the protocol. For example, SSL allows different algorithms to be used for authentication (e.g., RSA, DSS), key exchange (RSA, EDH), encryption (RC2, RC4, DES, 3-DES, AES), and integrity (MD5, SHA-1). This flexibility allows for new, stronger algorithms to be added over time (such as AES) and reduces dependence on any one algorithm, in case that algorithm is broken or succumbs to brute-force exhaustive search techniques (as with DES).
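The negotiation step can be modeled in a few lines: each side announces what it supports, and a common suite is chosen. The suite names and preference lists below are purely illustrative:

```python
# Toy model of SSL/TLS cipher-suite negotiation: the server picks its
# most preferred suite that the client also offers. Real TLS negotiates
# versions, key exchange, and extensions as well; this shows only the
# intersection-and-preference idea.
def negotiate(server_prefs, client_offers):
    for suite in server_prefs:       # server preference order wins here
        if suite in client_offers:
            return suite
    return None                      # no common suite: the handshake fails

server_prefs = ["AES256-SHA", "3DES-SHA", "RC4-MD5"]
client_offers = ["RC4-MD5", "3DES-SHA"]
print(negotiate(server_prefs, client_offers))  # 3DES-SHA
```

Note how the outcome depends on both lists: a server that keeps a weak suite in its list can still end up using it whenever a client offers nothing stronger, which is exactly the risk discussed next.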
While this flexibility improves interoperability, it may also compromise security. For example, server administrators may wish to support as wide a range of protocols as possible in order to maximize the number of clients that can access a site. However, they may be lax in removing features that have compromises in security. For example, if a site supports a weak form of encryption, a client may choose to use that algorithm for performance or power consumption reasons (e.g., on a wireless PDA), without recognizing the dangers. This could lead to a session being broken, a customer’s password being cracked, and an empty bank account. While this could be considered simply a case of clients suffering the consequences of their actions, there are reasons to prevent this from happening. This type of experience could alienate a customer, damage the reputation of a business, and perhaps even lead to legal action. More importantly, experience shows that clients do not understand security, and thus steps should be taken to minimize opportunities for clients to make the wrong decision. For these reasons, we believe the bulk of the burden for ensuring security falls on the provider, or server in this case. This then raises the question: do servers deployed in the Internet adhere to current best practices by employing strong cryptography?
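PSST itself is not reproduced here, but as an illustration only, a much smaller probe in the same spirit can be written with Python’s standard ssl module. The host name is a placeholder, and the key-size buckets simply mirror the thresholds used in this paper’s results:

```python
# A minimal, PSST-like probe: connect to a server, record the negotiated
# protocol version and cipher suite, and bucket RSA key sizes using the
# thresholds from this paper's results (at most 512 bits weak, 2048 and
# up very strong).
import socket
import ssl

def classify_rsa(bits):
    if bits <= 512:
        return "weak"
    if bits >= 2048:
        return "very strong"
    return "strong"

def probe(host, port=443, timeout=5.0):
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()

if __name__ == "__main__":
    print(probe("www.example.org"))  # placeholder host
```

A full survey tool additionally offers each protocol version and cipher family one at a time to map out everything a server supports, rather than accepting a single negotiated result.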
This paper characterizes the cryptographic strength of public servers in the Internet running SSL/TLS. We evaluate over 19,000 servers, and present a tool developed for this purpose, the Probing SSL Security Tool (PSST). We use PSST to evaluate which protocols and cryptographic options are supported by the servers, and which are chosen by the servers as a default when presented with several options. We show that a great variety of behavior can be found in the network, with both encouraging and discouraging results. Examples include:
• 85 percent of SSL sites support the SSL 2.0 protocol, even though it has significant security problems. Moreover, a small number of sites support only SSL 2.0.
• 93 percent of servers support (single) DES, despite the fact that DES is considered susceptible to exhaustive search.
• Many servers support the old export-grade encryption levels, even though US law has changed and these algorithms are considered susceptible to brute-force attacks.
• 765 (almost 4 percent) of the sites use RSA-based authentication with only 512-bit keys, even though RSA has announced that this level of security is insufficient. On the other hand, over 1200 sites use 2048 bits or greater.
• AES is already supported by over 57 percent of the sites we probed. Of these, about 94 percent default to AES when presented with all options (and the vast majority of them use a strong 256-bit key).
We have also run our tool periodically over the last two years in order to study the evolution of SSL use (and misuse) over time. The overall trend we discovered is a steady, if somewhat slow, improvement in the cryptographic strength of SSL/TLS servers. For example, within the past two years:
• Support of the weak SSL 2.0 protocol has been reduced by over 9 percentage points.
• Support of AES has grown by nearly 16 percentage points.
• Support of weak public key sizes has gone down by nearly 2 percentage points.
• Support of very strong public key sizes has gone up by nearly 2 percentage points.
Thus, our results show that most servers (though not all) support both weak and strong cryptography, while making the correct choice by default when given the option. This is in sharp contrast to the situation several years ago, when 20-30 percent of the servers used only weak cryptography. As a concrete example, in 2000, 25 percent of the servers probed by Murray supported a very weak server key size of at most 512 bits, compared to about 4 percent today. Our results are also useful in highlighting the most prevalent choices among the available options, thereby allowing future efforts at improving performance and enhancing security to focus on the most relevant set of cryptographic algorithms. Finally, our tool can be useful for regular security compliance testing, especially by large organizations that operate many servers.
Homin K. Lee, Department of Computer Science, Columbia University, New York, NY.
The continuing growth of computer networks has brought with it an increase in the interest in, and importance of, online games. Most commercial games use a client-server architecture, and many allow players to connect their clients to the game server of their choice. Since players can also deploy their own servers, many games offer a choice among a large pool of possible game servers. And server selection matters: not only do practical parameters such as player population and game map differ from server to server, but the latency between the game client and the server has been shown to degrade gameplay. The selection process is made more difficult when players want to play together as a group, co-located on the same game server.
In order to enhance server browsing to support simultaneous players, and to consider alterations supporting the increasingly wide range of online games, there is first a need for a better understanding of the network characteristics of current game server browsing. This paper provides this needed first step by gathering data on real game servers and clients on the Internet. Months of master server data for three different games provide a characterization of server uptimes and populations, allowing observation of time-of-day and day-of-week correlations. A week of game server data, gathered from custom software that emulates the server browsing of players seeking to play simultaneously on the same game server, provides insight into the ability of currently deployed game servers to support online gameplay.
The results allow us to draw the following conclusions:
1) There is no visible day-of-week correlation to server uptime or player population.
2) There is no visible time-of-day correlation to server uptime, but there is some correlation with player population. However, there is no corresponding correlation with server performance (latency).
3) Game server performance (latency) is nearly independent of game type and game generation.
4) The number of simultaneous players in a group directly reduces performance for all players by increasing the maximum latency.
5) Game server pools are well suited to support typical third-person games, such as role-playing games or real-time strategy games. The pool of available game servers for third-person games can fairly easily support up to 20 simultaneous game players.
6) Game server pools are less able to support typical first-person games, such as first-person shooters or racing games. Players selecting a server outside of a group can find an adequate number of suitable servers, but the pool of servers that provide acceptable performance decreases rapidly as the group size increases.
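Conclusion 4 suggests a simple selection rule for groups: choose the server that minimizes the maximum latency across the group’s members. A sketch, with all latency figures invented for illustration:

```python
# Pick a server for a group by minimizing the worst (maximum) latency
# any group member would experience, since it is that maximum latency
# that degrades gameplay for the whole group.
def best_server_for_group(latency):
    # latency: {server_name: {player_name: latency_in_ms}}
    return min(latency, key=lambda server: max(latency[server].values()))

latency = {
    "us-east": {"alice": 30, "bob": 120, "carol": 45},
    "eu-west": {"alice": 80, "bob": 70,  "carol": 90},
}
print(best_server_for_group(latency))  # eu-west (worst case 90 ms beats 120 ms)
```

A server browser already measures each client-to-server latency; the group case only requires sharing those measurements and applying a rule like the one above.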
Since the data obtained in this study has been made available to the public, additional processing of the data may provide other insights into game server browsing:
1) Game servers provide information on the latencies and scores of the players currently connected. This data can be analyzed to study the range of latencies currently in use, and perhaps correlated with user scores.
2) Geographic information may play a role in the ease (or difficulty) with which simultaneous users find a suitable game server. Additional analysis could examine the physical relationship among the clients and servers, both geographically (in terms of physical distance) and topologically (in terms of network distance), to better understand server browsing.
Some games have servers that are not set up or controlled by individual users, such as servers for one of the popular massively multiplayer online (MMO) games. These servers typically face similar server selection issues and so may benefit from the analysis in this paper, but the selection is often done implicitly, with a single head node redirecting players to appropriate servers. Study of server selection in this process, probably with support from industry, may be an interesting area of future work.