This is part 1 in the Front-end Series. As a front-end developer at a large organization, I work daily on architecture, performance, automation, localization, compatibility, and other topics, which I will share during this series. Feel free to post topic ideas in the comment section!
Large-scale web application architecture is much more than a server and a /www folder. The build process can scale with need, but the basic setup involves building assets to a development area and then a production environment, with caching and delivery via CDN.
This is an example architecture diagram, showing builds to different environments, pushes to CDN boxes, and caching.
Building static assets via source control
Working with other developers at any scale necessitates use of a version control system. Whether Git, Subversion, Mercurial, or the next new hotness, a VCS is an absolute must. After development in a local environment, assets are pushed to a version-controlled development server.
The assets here are never directly used by the application, but are pushed to a CDN server (or a mock CDN for development). Source control and the CDN can be synchronized by a few different methods:
- A cron task that rebuilds one folder from the other
- A post-commit hook that runs a similar script
- Third-party build software, like Anthill
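As a rough illustration of the kind of script a cron task or post-commit hook might run, here is a minimal Python sketch that mirrors a checked-out asset folder into a CDN (or mock CDN) folder. The function names and the folder layout are my own assumptions, not a description of any particular build tool, and a real setup would also need to handle deletions and cache invalidation.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_assets(source: Path, target: Path) -> list[str]:
    """Mirror `source` into `target`, copying only new or changed files.

    Returns the sorted relative paths that were copied. Files that exist
    only in `target` are left alone in this sketch.
    """
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        # Skip version-control metadata such as .git or .svn folders.
        if any(part.startswith(".") for part in rel.parts):
            continue
        dest_file = target / rel
        if dest_file.exists() and file_digest(dest_file) == file_digest(src_file):
            continue  # unchanged, nothing to copy
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dest_file)
        copied.append(rel.as_posix())
    return sorted(copied)
```

Running `sync_assets(Path("repo/assets"), Path("mock-cdn/assets"))` (both paths hypothetical) from a scheduled task or a hook would keep the two folders in step while only touching files whose contents changed.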
Server environments, or “lanes”
It’s necessary to duplicate asset servers for different audiences. The most basic are development, quality assurance (QA), staging, and production.
A typical pipeline is:
- Assets are developed locally
- Files are moved into the Development environment, where they are integrated with other developers' work and given preliminary testing
- Quality Assurance tests the application and provides feedback to developers (looping back to development as needed)
- Once testing is successful, the assets are moved to the staging server, where stakeholders preview the application and approve it for production
- Once approved, the files are moved to production, where they are publicly accessible. They are now “live”
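The pipeline above can be modeled as a small state machine. The following Python sketch is purely illustrative: the `Release` class, the lane identifiers, and the rule that a rejection always sends a release back to development are assumptions of mine, not part of any specific build system.

```python
from dataclasses import dataclass, field

# Lane names follow the pipeline described above, in promotion order.
LANES = ["local", "development", "qa", "staging", "production"]


@dataclass
class Release:
    """A set of assets moving through the lanes (hypothetical model)."""
    name: str
    lane: str = "local"
    history: list = field(default_factory=list)


def promote(release: Release, approved: bool = True) -> Release:
    """Move a release one lane forward, or back to development if rejected."""
    if release.lane == "production":
        raise ValueError("release is already live")
    release.history.append(release.lane)
    if approved:
        release.lane = LANES[LANES.index(release.lane) + 1]
    else:
        # Assumption: any rejection loops back to the development lane.
        release.lane = "development"
    return release
```

For example, a release that fails QA once would pass through `local`, `development`, `qa`, `development`, `qa`, `staging` before reaching `production`, and `release.history` records that path.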
Caching and Content Delivery
Linking directly to assets on a server is an excellent way to kill your site when it goes viral (the Reddit hug of death, or the Slashdot effect). The most basic level of defense is server-side caching. On smaller sites, this can be effective protection. Many CMSs (like WordPress) have plugins that cache dynamic content (PHP-generated pages saved as static HTML). WP Super Cache is an excellent example of this layer of defense.
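To make the idea of server-side caching concrete, here is a minimal Python sketch of a page cache with a time-to-live. It keeps rendered pages in memory rather than writing static HTML files to disk, so it is not how WP Super Cache itself works; the `PageCache` class and its parameters are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


class PageCache:
    """Cache rendered pages in memory for `ttl` seconds."""

    def __init__(self, render: Callable[[str], str], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._render = render   # the expensive function that builds a page
        self._ttl = ttl
        self._clock = clock     # injectable so expiry can be tested
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, path: str) -> str:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]     # cache hit: skip rendering entirely
        html = self._render(path)
        self._store[path] = (now, html)
        return html
```

With a cache like this in front of the renderer, a burst of requests for the same page within the TTL triggers only one expensive render; the rest are served from the stored copy.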
The second layer of defense is a Caching Appliance. This is a server (or cluster of servers) that specializes in serving static files with extreme speed. This appliance has only one function, which allows it to maximize its resources for the one purpose of serving content.
The top layer for serving files is a Content Delivery Network, which distributes and caches assets globally. At its core, a CDN is a network of servers placed in strategic locations around the world, each containing duplicate and synchronized files. These servers, ideally located on backbone connections, serve files quickly to customers in different areas of the globe. For example, a user in France is served assets from a UK server, rather than making the transatlantic trip to Virginia.
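The France example can be sketched as a nearest-server lookup. Real CDNs typically route with DNS or network-level measurements rather than pure geography, so treat this Python snippet as a simplified illustration; the edge names and coordinates are approximate and hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical edge locations as (latitude, longitude), approximate values.
EDGES = {
    "uk-london": (51.5, -0.1),
    "us-virginia": (39.0, -77.5),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_edge(user, edges=EDGES):
    """Return the name of the edge server geographically closest to `user`."""
    return min(edges, key=lambda name: haversine_km(user, edges[name]))
```

Calling `nearest_edge((48.9, 2.35))` for a user near Paris selects the London edge, since it is a few hundred kilometers away instead of across the Atlantic.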
For smaller sites (such as this one) there are free or inexpensive CDN options. This website uses CloudFlare to serve static assets to viewers (your CSS file was probably served from this CDN, reducing my server load. Yay!) An excellent CDN for large-scale websites is Akamai. You’ve almost certainly been served assets from Akamai (Facebook, Adobe, Microsoft, and many more use it). If you’re in the market, take a look at CloudFront as well.
So there you have it: The front-end build and architecture for one of the largest publishing organizations in the world!
What I’ve described is the front-end server and build process for a large-scale web application. This setup will typically be mirrored by the application codebase (back-end).
Comments, feedback or criticism? Let me know in the comments: