19 Nov 2014 CloudFront With Static And Dynamic Content
CloudFront is a CDN (Content Delivery Network) service offered by Amazon Web Services. CloudFront distributes web content (HTML, CSS, images, PHP, etc.) across more than 52 edge locations around the world so that the content is delivered more quickly.
CloudFront serves the original content from the origin servers, which can be web servers, S3 buckets, or both. The files uploaded to the origin servers are called objects.
The first step is to create a CloudFront distribution and specify the origin server(s). CloudFront then provides a new domain name from which the content is served.
When an object is requested from CloudFront, the edge server checks whether it already holds that object in its cache. If it doesn’t, CloudFront retrieves the file from the origin servers. The edge servers hold the content in their cache for 24 hours by default.
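The lookup described above can be sketched as a small pull-through cache. This is only a toy model of one edge location, not CloudFront's actual implementation; the EdgeCache class, the fetch_from_origin callback and the injectable clock are names assumed for illustration.

```python
import time
from typing import Callable, Dict, Tuple

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the 24-hour default mentioned above


class EdgeCache:
    """Toy model of one edge location's cache (hypothetical, for illustration)."""

    def __init__(self,
                 fetch_from_origin: Callable[[str], bytes],
                 ttl: int = DEFAULT_TTL_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl
        self._clock = clock
        # Maps an object path to (body, expiry timestamp).
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def get(self, path: str) -> bytes:
        entry = self._store.get(path)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # cache hit: the origin is not contacted
        # Cache miss or expired entry: pull from the origin and keep a copy.
        body = self._fetch(path)
        self._store[path] = (body, self._clock() + self._ttl)
        return body
```

With this sketch, a second request for the same path within the TTL never reaches the origin, which is the behavior that makes the edge locations faster than the origin for repeat requests.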
CloudFront Distribution
Within the distribution settings, start by adding an ID for the origin and the origin's domain name, which is the domain name of the HTTP web server or the S3 bucket.
The next step is to determine which protocol CloudFront will use to connect to the origin server. There are two options: HTTP Only, or Match Viewer.
Selecting “HTTP Only” means that CloudFront connects to the origin over HTTP (port 80) only, while selecting “Match Viewer” means that CloudFront uses the same protocol the viewer used to connect to CloudFront, whether HTTP or HTTPS.
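As a rough illustration of the two origin protocol options, the sketch below picks the scheme used for the origin request. The function name and the "http-only" / "match-viewer" labels are assumptions made for this example.

```python
def origin_scheme(origin_protocol_policy: str, viewer_scheme: str) -> str:
    """Return the scheme used for the request to the origin server."""
    if origin_protocol_policy == "http-only":
        # Always plain HTTP (port 80), whatever the viewer used.
        return "http"
    if origin_protocol_policy == "match-viewer":
        # Reuse the viewer's protocol, HTTP or HTTPS.
        return viewer_scheme
    raise ValueError(f"unknown origin protocol policy: {origin_protocol_policy}")
```

Note that with “Match Viewer” the origin has to be able to answer HTTPS requests whenever viewers are allowed to use HTTPS.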
The next step is to specify the cache behavior settings. The Viewer Protocol Policy can be:
- HTTP and HTTPS, which allows the viewer to connect with either protocol.
- Redirect HTTP to HTTPS, which redirects every HTTP request to HTTPS.
- HTTPS Only, which accepts only HTTPS requests.
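The three viewer policies can be summarized in a few lines of code. This is a sketch under assumed names: the policy labels and the "serve" / "redirect" / "reject" actions are chosen for the example and only describe what happens to a viewer request, not how CloudFront implements it.

```python
from typing import Optional, Tuple

POLICIES = ("allow-all", "redirect-to-https", "https-only")


def apply_viewer_policy(policy: str, scheme: str, host: str,
                        path: str) -> Tuple[str, Optional[str]]:
    """Return (action, redirect_location) for a viewer request."""
    if policy not in POLICIES:
        raise ValueError(f"unknown viewer protocol policy: {policy}")
    if scheme == "https" or policy == "allow-all":
        return ("serve", None)  # the request is accepted as-is
    if policy == "redirect-to-https":
        return ("redirect", f"https://{host}{path}")  # same URL over HTTPS
    return ("reject", None)  # https-only: plain HTTP is refused
```

For example, apply_viewer_policy("redirect-to-https", "http", "example.com", "/img.jpg") yields a redirect to https://example.com/img.jpg.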
Several HTTP methods can be allowed, although caching the responses to POST requests doesn’t make sense except in a few cases, such as sending more parameters than the GET method allows.
Next is to tell CloudFront whether or not to forward the viewer's request headers to the origin. Forwarding a header also means that CloudFront can use it as part of the cache key.
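To see what "using headers in the cache key" means in practice, here is a minimal sketch. The function name and the tuple-shaped key are assumptions for illustration; the real key format is internal to CloudFront.

```python
from typing import Iterable, Mapping, Tuple


def cache_key_with_headers(path: str,
                           request_headers: Mapping[str, str],
                           forwarded: Iterable[str]) -> Tuple:
    """Build a cache key from the path plus the forwarded headers only."""
    wanted = {name.lower() for name in forwarded}
    selected = sorted(
        (name.lower(), value)
        for name, value in request_headers.items()
        if name.lower() in wanted  # header names are case-insensitive
    )
    return (path, tuple(selected))
```

Two requests for the same path that differ in a forwarded header get different keys and are therefore cached separately, so forwarding many headers tends to lower the cache hit ratio.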
What follows is to specify the TTL of the cached content. “Use the origin cache headers” means that the TTL is taken from the Cache-Control header that the origin server sends in its response. An example of that with nginx is the expires directive, which controls how long the content is cached:
expires 24h;
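With expires 24h; nginx sends a Cache-Control: max-age=86400 header along with the response. The sketch below shows how a shared cache might turn such a header into a TTL. It is a simplified reading of the header (it ignores quoted values and the Expires header), and the function name and default are assumptions for the example.

```python
import re
from typing import Optional


def ttl_from_cache_control(header_value: Optional[str],
                           default_ttl: int = 86400) -> int:
    """Return the number of seconds a shared cache may keep a response."""
    if not header_value:
        return default_ttl  # no header: fall back to the default TTL
    directives = [d.strip().lower() for d in header_value.split(",")]
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return 0  # the origin asked shared caches not to reuse the response
    ages = {}
    for directive in directives:
        match = re.fullmatch(r"(s-maxage|max-age)\s*=\s*(\d+)", directive)
        if match:
            ages[match.group(1)] = int(match.group(2))
    # s-maxage targets shared caches, so it wins over max-age when both exist.
    if "s-maxage" in ages:
        return ages["s-maxage"]
    return ages.get("max-age", default_ttl)
```

For the nginx example above, ttl_from_cache_control("max-age=86400") gives 86400 seconds, i.e. 24 hours.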
Cookies can be forwarded to the origin server with the Forward Cookies option. It is also possible to forward only selected cookies to the origin servers by using the cookies whitelist. Note that CloudFront uses the forwarded cookies to cache users’ sessions separately, so that each user will have his own cached version of the website.
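The cookie whitelist can be pictured as a simple filter over the Cookie request header. This is an illustrative sketch with assumed function names and made-up cookie values; it does not handle quoted cookie values.

```python
from typing import Dict, Iterable


def parse_cookie_header(header: str) -> Dict[str, str]:
    """Split a Cookie request header into a name -> value mapping."""
    cookies: Dict[str, str] = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def forwarded_cookies(header: str, whitelist: Iterable[str]) -> Dict[str, str]:
    """Keep only the whitelisted cookies; these also vary the cached copy."""
    allowed = set(whitelist)
    return {name: value
            for name, value in parse_cookie_header(header).items()
            if name in allowed}
```

Whitelisting only the session cookie (PHPSESSID in a typical PHP site) keeps unrelated cookies, such as analytics cookies, from splitting the cache into one copy per visitor.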
The last option in this section is Forward Query Strings. If enabled, query strings are included as part of the cache key, which can be very important for caching dynamic content such as PHP files. Most PHP files take the query parameters as arguments to the script:
http://example.com/index.php?p=4
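A short sketch of how the query string changes the cache key. The cache_key function is hypothetical and only models the on/off behavior of the option.

```python
from urllib.parse import urlsplit


def cache_key(url: str, forward_query_strings: bool) -> str:
    """Return the key under which a response for this URL would be cached."""
    parts = urlsplit(url)
    if forward_query_strings and parts.query:
        # Each distinct query string gets its own cached copy.
        return f"{parts.path}?{parts.query}"
    # Query strings ignored: every variant shares one cached copy.
    return parts.path
```

With the option disabled, index.php?p=4 and index.php?p=5 would share a single cached response, which is why the option matters for dynamic pages.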
CloudFront has only recently added support for caching dynamic content.
Some points to be considered
Caching the static content and serving it from the CloudFront edges can improve the loading speed dramatically. However, the URLs of static content such as images and CSS files must be adjusted to point to the new CloudFront domain name.
One solution for this is the “W3 Total Cache” plugin for WordPress. Once the CloudFront domain name is added to the plugin as the CDN domain, every URL of a static file is automatically adjusted to point to the CloudFront domain name.
For example, http://example.com/img.jpg
will become:
http://aaabbbccc1234.cloudfront.net/img.jpg
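The rewrite itself is simple enough to sketch by hand. The code below is not how W3 Total Cache works internally; it is a hypothetical helper, with an assumed list of static extensions, showing the kind of URL adjustment involved.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of static file types for this example.
STATIC_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".css", ".js")


def to_cdn_url(url: str, site_host: str, cdn_host: str) -> str:
    """Point a static file URL of the site at the CDN domain instead."""
    parts = urlsplit(url)
    is_static = parts.path.lower().endswith(STATIC_EXTENSIONS)
    if parts.hostname != site_host or not is_static:
        return url  # leave dynamic pages and foreign URLs untouched
    return urlunsplit((parts.scheme, cdn_host, parts.path,
                       parts.query, parts.fragment))
```

Calling to_cdn_url("http://example.com/img.jpg", "example.com", "aaabbbccc1234.cloudfront.net") gives the CloudFront URL shown above, while http://example.com/index.php?p=4 is returned unchanged.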
By adding caching of the dynamic content to the scenario, the whole site can be delivered through CloudFront. All that is needed is to change the DNS settings so that your site's domain name is a CNAME for the CloudFront domain name.
Now every request, including requests for dynamic content, will be served from the CloudFront cache. If the content doesn’t exist in the cache, it will be pulled from the origin servers and then saved in the cache.