The ultimate beginner's guide to web performance and speed

Make your webpage blazingly fast! This beginner's guide includes 14 performance tips (from simple to complex) that will help you speed up any site.

Above is a video version of this guide. It goes in-depth into many of the topics, and includes a few more. The presentation was given to a group of web developers at The Church of Jesus Christ of Latter-day Saints, and is slightly tailored to some of their web properties. Enjoy!

This guide is loosely arranged by bang-to-buck ratio, so the simpler optimizations (which generally have an immediate impact) are listed first.

  1. Optimize images
  2. Don’t use images
  3. Lazy load images
  4. CSS at the top, JavaScript at the bottom
  5. Load scripts asynchronously
  6. Ditch the library
  7. Minify and Concatenate
  8. Take advantage of browser caching
  9. Enable compression
  10. Take advantage of server caching
  11. Optimize database queries
  12. Separate static assets onto a different domain
  13. Use a content delivery network
  14. Flush partial responses

1. Optimize images

It’s not uncommon for an image straight from Photoshop or a digital camera to weigh several megabytes. Large, unoptimized images can be the number-one cause of slow-loading pages. Luckily, this may be the simplest optimization as well.

This vacation image (right from my digital camera) is 12.8 MB. It is also 4000×6000 pixels — which is far too large for a webpage. Reducing the pixel size in Photoshop to 1000×667 takes the file size down to 480.7 KB. This is still far too large. In the Photoshop Save For Web menu, we can usually reduce the JPEG quality to between 60% and 70% without noticing any artifacting or degradation. My rule of thumb is to move the quality slider down until I notice a difference, then take it up by a third. I’ve reduced the JPEG quality on this image to 62%, and the new file size is 145.9 KB. This is still large for the web, but more acceptable for a photograph.

Optimized vacation photo

We can further reduce the size by using other image optimization tools, such as ImageOptim. This tool reduces image file size by making imperceptible quality adjustments, as well as by removing EXIF and other metadata from the file.

ImageOptim reducing image file size

2. Don’t use images

Images are really only needed for displaying photos, screenshots, or other graphics. With the advance of CSS3 support, there are almost no situations where image files are necessary for page layout or design.  border-radius and   linear-gradient alone have eliminated the need for most of our image dependencies.

Icons were another common image use-case. We can now replace these with the more semantic use of icon fonts or SVG “sprites.” Icomoon is an excellent tool for creating both icon fonts and SVG icons. Chris Coyier puts icon fonts and SVGs in an informative cage-match that excellently lays out the pros and cons of both methods.

Finally, the One Less JPG movement reminds us that “before you go worrying about how to minify every last library or shave tests out of Modernizr, try and see if you can remove just one photo from your design. It will make a bigger difference.”

3. Lazy-load images

Inline images

If an image falls below the viewport (not immediately visible), load that image “lazily” — or after the main content has loaded. We can accomplish this simply and semantically, and in a way that still works for non-javascript users.

There are many expertly programmed lazy-load image scripts, so I’ll just outline the logic here:
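The scripts handle the edge cases and older browsers; the core logic, though, looks something like this sketch (it assumes a  data-src  placeholder convention, which is one common approach, not the markup of any particular plugin):

```javascript
// Assumes images are authored as:
//   <img src="placeholder.gif" data-src="real.jpg">
// with the real image in a <noscript> fallback for non-JavaScript users.

// Pure helper: is the element's top edge within the viewport,
// plus a small "preload" buffer below the fold?
function shouldLoad(elementTop, scrollY, viewportHeight, buffer) {
  return elementTop < scrollY + viewportHeight + buffer;
}

// Browser wiring (guarded so the logic above can run anywhere):
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  var loadVisibleImages = function () {
    var images = document.querySelectorAll('img[data-src]');
    for (var i = 0; i < images.length; i++) {
      var img = images[i];
      var top = img.getBoundingClientRect().top + window.pageYOffset;
      if (shouldLoad(top, window.pageYOffset, window.innerHeight, 200)) {
        img.src = img.getAttribute('data-src'); // swap in the real image
        img.removeAttribute('data-src');        // never load it twice
      }
    }
  };
  window.addEventListener('scroll', loadVisibleImages);
  window.addEventListener('load', loadVisibleImages);
}
```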

CSS-Tricks has a modern lazy-load script which seems to work well.

Background images

Lazy loading background images in CSS uses the same logic as inline images, with a slightly modified implementation:
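A sketch of the idea (the class names and buffer size are placeholders, not from any particular implementation): the stylesheet attaches the background image only to a  loaded  class, and JavaScript adds that class as the element nears the viewport.

```javascript
// In the CSS, the expensive image is only referenced on the "loaded" class:
//   .hero        { background-image: none; }
//   .hero.loaded { background-image: url(hero-large.jpg); }

function shouldLoad(elementTop, scrollY, viewportHeight, buffer) {
  return elementTop < scrollY + viewportHeight + buffer;
}

// Adds the class (triggering the background download) to elements in view.
// Elements are treated as plain objects here so the logic has no DOM dependency.
function revealBackgrounds(elements, scrollY, viewportHeight) {
  var revealed = [];
  for (var i = 0; i < elements.length; i++) {
    if (shouldLoad(elements[i].top, scrollY, viewportHeight, 200)) {
      elements[i].className += ' loaded';
      revealed.push(elements[i]);
    }
  }
  return revealed;
}
```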

4. CSS at the top, JavaScript at the bottom

The order and placement of CSS files and scripts is important. CSS should be placed in the head element. As soon as the styles are downloaded and parsed, they can be applied to their matching DOM elements — improving perceived load time.

Included JavaScript files will block the parsing and rendering of a page, and should be placed at the bottom of the document to avoid the block.

5. Load scripts asynchronously

If a page is built with progressive-enhancement in mind, most scripts should be loaded asynchronously, or after the page itself. These scripts will enhance the experience of the users, and won’t be missed for the split-second before they are loaded.

Most popular libraries (read: jQuery) have built-in asynchronous script loading, but it is also simple to implement without a library. HTML5Rocks has a deep dive into async script loading. It’s long, but well worth the read.
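The gist of the no-library approach: inject a script element and mark it async so it never blocks parsing. (A minimal sketch; the function name is mine, not from the HTML5Rocks article.)

```javascript
function loadScriptAsync(src) {
  if (typeof document === 'undefined') return null; // no DOM available
  var script = document.createElement('script');
  script.src = src;
  script.async = true; // download in parallel, execute when ready
  document.head.appendChild(script);
  return script;
}

// Usage: loadScriptAsync('/js/comments-widget.js');
```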

6. Ditch the library

Speaking of popular front-end libraries: do you really need jQuery? The tongue-in-cheek site Vanilla JS offers up native JavaScript solutions for common jQuery uses. Indeed, many uses of jQuery can be reproduced with simple native JavaScript:
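A few typical translations (the selectors are illustrative):

```javascript
// $('#header')           →  document.querySelector('#header')
// $('.item')             →  document.querySelectorAll('.item')
// $(el).addClass('open') →  el.classList.add('open')
// $(el).on('click', fn)  →  el.addEventListener('click', fn)
// $.trim(str)            →  str.trim()

// The non-DOM helpers need no library at all:
var trimmed = '  hello  '.trim();
var doubled = [1, 2, 3].map(function (n) { return n * 2; });
```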

Your code can get a little more verbose when iterating over a set of elements:
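For example, querySelectorAll returns a NodeList rather than an Array, so older code borrows Array.prototype.forEach to iterate it:

```javascript
function eachElement(nodeListLike, callback) {
  // NodeLists are array-like but lack Array methods; borrow forEach
  Array.prototype.forEach.call(nodeListLike, callback);
}

// Works on any array-like object; a stub stands in for a NodeList here:
var fakeNodeList = { 0: 'a', 1: 'b', 2: 'c', length: 3 };
var seen = [];
eachElement(fakeNodeList, function (item) { seen.push(item); });
```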

 Case-by-case

Ditching the library isn’t a hard-and-fast rule. There are many things that jQuery provides that are difficult to achieve without it.

For example, jQuery’s event binding interface can be overloaded to provide event delegation — binding an event handler to an element’s parent in order to allow dynamic loading or modification of new child elements without losing the event handler.

Here is a native event delegation function I wrote to add this functionality natively:
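The original function isn’t reproduced here; this simplified sketch shows the same idea using the standard matches() API, without the legacy-browser shims discussed next:

```javascript
// One listener on the parent; the handler fires only when the event's
// target (or an ancestor of it, up to the parent) matches the selector.
function delegate(parent, eventType, selector, handler) {
  parent.addEventListener(eventType, function (event) {
    var el = event.target;
    while (el && el !== parent) {
      if (el.matches(selector)) {
        handler.call(el, event);
        return;
      }
      el = el.parentNode; // the match may be an ancestor of the target
    }
  });
}

// Usage: delegate(list, 'click', '.delete', function () { /* remove row */ });
```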

As you can see, much of the native implementation deals with the inconsistencies in browser implementation. jQuery and other libraries “pave over” these differences, and provide a consistent API for developers. Additionally, library implementation code has been tested by many users over many browsers. Library usage benefits from thousands of hours of bug and edge-case fixes.

When deciding whether or not to ditch the library, you need to carefully assess the needs of your site. Are there complex requirements or compatibility needs?  Do you have time to write and test native implementation? There are certainly use cases where a library is not needed, but consider carefully before you make that decision.

7. Minify and Concatenate

If your CSS or JavaScript files are being served to a user in the same form that you authored them, then there is a performance penalty to pay for downloading all the unnecessary whitespace and descriptive naming conventions that make for good development.

Minification is the process of compressing files: removing unnecessary whitespace, line breaks, and comments. It is often coupled with uglification, which transforms code by renaming variables and refactoring functions into the most compact form. Any CSS or JS that makes its way to an end user should be both minified and uglified.

Concatenation is the process of combining multiple files into one. This reduces the number of requests, and is especially useful when optimizing for mobile networks, where latency is often the bottleneck.

There are many good stand-alone minification and concatenation tools, but I recommend integrating both of these processes into your build-step, using Grunt, Gulp, or similar.

8. Take advantage of browser caching

Once a user has visited a page on your site, assets have been downloaded and can be cached on the user’s system. Typically, static assets (CSS, JavaScript, image files, etc.) are cached, as well as dynamic requests: search results that change infrequently, lookups, or other Ajax requests.

Asset caching is controlled via two headers sent with the file:  Cache-Control and ETag. These headers are sent by the server with the file response.

From Google’s Leverage Browser Caching:

  • Cache-Control  defines how, and for how long the individual response can be cached by the browser and other intermediate caches. To learn more, see caching with Cache-Control.
  • ETag  provides a revalidation token that is automatically sent by the browser to check if the resource has changed since the last time it was requested. To learn more, see validating cached responses with ETags.

There are different ways to set these headers depending on your website platform. This guide to caching headers does a good job of explaining the different options available for each header. Ultimately, you will need to use the method that is correct for your specific platform.

9. Enable compression

This is a simple optimization, and it comes enabled on most servers by default. With gzip compression enabled, assets are run through a compression algorithm before being served, then decompressed by the client. The details of enabling gzip compression differ for each platform, so find a guide that is specific to your technology.

10. Take advantage of server caching

As dynamic pages are built, there can be many database requests for each page. Consider the typical blog: the article content, related stories, and comments are each separate database queries. The comments will be updated frequently, but the article itself and the related stories will not change once published. We can save the results of these queries in memory or on disk for future use (saving an expensive database query). Like browser caching, the implementation will vary from platform to platform, but the theory is the same. Each platform will have its community-recommended caching solution, and honestly, the bang-to-buck ratio of creating your own is slim.

The logic of server caching is generally:

This pseudocode saves the cached page to the file system, but you could just as easily store the data in memory or in an object store. Google “caching in [your system]”, or read up on Varnish, memcached, or Redis for more information.

11. Optimize database queries

Expensive database queries can add hundreds of milliseconds to the lifecycle of a request. Optimizing queries is the process of:

  1. Identifying expensive database queries
  2. Refactoring
  3. Repeat

Identifying expensive queries

In general, look for queries that ask for more than they need. This can take the form of  SELECT * , when all you actually need is  SELECT first_name . Any query that returns more than you need, JOINs more tables than necessary, or contains correlated subqueries should be suspect. Every database system is different, so there is no silver bullet when it comes to identifying bad queries.

Refactoring

Once bad queries are identified, use your database’s tools to understand the workings of the query. Many databases support an EXPLAIN  keyword that returns information about how a query is interpreted and executed by the database engine.

With the help of EXPLAIN, you can see where you should add indexes to tables so that the statement executes faster by using indexes to find rows. You can also use EXPLAIN to check whether the optimizer joins the tables in an optimal order. – MySQL Documentation

12. Separate static assets onto a different domain

When a request is made to a server, the client sends along data that is relevant to the request to help the server complete it. This way, when a user requests a new page, data such as a session cookie is sent to the server. Without this interaction, dynamic and personalized user experiences would not be possible. However, there are many types of files for which cookies are not relevant. This is the case with static assets (CSS, JavaScript, images, etc). To make static asset requests smaller, serve assets from a static-only domain. You may have noticed, while loading a YouTube video, a network request to *.ytimg.com. ytimg.com is YouTube’s cookie-less image domain. That way, when any image (like a thumbnail) is requested from YouTube, the browser doesn’t have to send or receive cookies with the request.

You don’t necessarily need to host your static files on a different server, or even in a different application. You can configure most servers to use a separate path and domain for a folder using the rewrite rules system for your particular platform.

13. Use a Content Delivery Network

Content Delivery Networks, or CDNs, store the contents of your website on servers that are physically closer to the end user’s location. When a request for an asset is made, the CDN will look on its servers to see if it has a cached copy. If so, it will return that to the client — avoiding any traffic to your servers completely. If the file is not available, the CDN will request the asset from your server, store it, and then send it to the user. Not only does this decrease load on your server, it speeds up the request by delivering the asset from a physically nearby server.

Akamai and AWS both have very popular commercial offerings. However, you don’t have to be Facebook-scale to use a CDN. One company, CloudFlare, offers free CDN access for websites that serve under a certain amount of traffic. CloudFlare’s setup is relatively simple: you edit your DNS to point to their nameservers, and all traffic is routed through their network. Assets are cached as they are requested, and served to users from their servers around the world.

14. Flush partial responses

Some background: during a request lifecycle, there are three main phases that take the majority of the time to build a page:

  1. Network transmission time & latency
  2. Building the page on the server
  3. Rendering the page on the client

Often, while one part of the cycle is busy working, the other two are idle. This optimization puts the idle parts to work, reducing the total time of the request.

Consider a request for “page-2”:

  1. the request is sent from the client
  2. network time
  3. the server receives and routes the request
  4. page renderer is found
  5. database queries are made
  6. page is built
  7. page is sent to the client
  8. network time
  9. page is received by the client
  10. client renders the page
  11. client makes requests for linked assets

Now, during steps 4-6, the network and client are not utilized. Flushing a partial response means starting to send data over the network to the client before the entire response is ready. Segmenting a response in this way “pipelines” the process and decreases the time a client waits to receive the first response packet.

Each platform is different, but this pseudocode shows the general theory:
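A sketch of the idea (res stands in for anything with a write/end interface, like Node’s http.ServerResponse):

```javascript
function handleRequest(res, runQueries) {
  // 1. Flush the part of the page that never changes. The client can
  //    start downloading site.css while the server is still working.
  res.write('<html><head><link rel="stylesheet" href="site.css"></head><body>');

  // 2. Now do the slow work (database queries, templating).
  var content = runQueries();

  // 3. Send the rest and finish the response.
  res.write(content);
  res.end('</body></html>');
}
```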

Flushing the output of the page in this way allows the client to receive the header before the entire page has finished building, and it may start rendering immediately.

Front-end Series 3: Templating syntax review

  1. Template syntax families
  2. Binding simple values
  3. Binding values using Scope Blocks
  4. Grammars
  5. Parsing
  6. Consistency leads to Predictability
  7. Comparison of popular templating syntaxes
    1. String Insertion
    2. Iteration
    3. If

Templating syntax (and language syntax in general) is very interesting. How a template is written informs the mental model of how it behaves, and how the data should be structured.

The following is a syntax review of common features in front-end templating systems. This is an initial pass! If you see any errors or omissions, please point them out in the comments and I’ll quickly correct them. Also, this isn’t a review of the (sometimes very extensive) templating engines themselves — just the template syntax.

Template syntax families

At a very high-level, template syntax falls generally into two families: element-attribute renderers, and inline-token renderers.

Element-attribute style syntax binds data, logic, and state to a template by attributes on an individual node:  <element directive="property" /> . These templates are often parsed using traditional DOM parsing techniques, looking for element attributes that match a set of defined directives for that framework.

In inline-token renderers, binding is performed by code (or expressions) in-place, using special tokens to set it apart from the markup: like  {{ , <@ , or  <% . These templates are parsed using string or pattern matching techniques on the tokens, then evaluating their contents.

These are general distinctions, and some templating frameworks use both syntax styles.

Binding simple values

Inline-token style syntax really shines when binding single values to a template.

Because the templates are parsed as strings, using pattern matching regular expressions to find tokens, the binding can occur anywhere within the template, making them very powerful:
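A toy renderer makes the point; a single regular expression finds {{token}} expressions anywhere in the string, in text and attributes alike:

```javascript
function render(template, data) {
  // replace every {{key}} token with the matching property from data
  return template.replace(/\{\{\s*([a-z_]+)\s*\}\}/gi, function (match, key) {
    return key in data ? String(data[key]) : match;
  });
}

var html = render(
  '<a class="{{cls}}" href="/users/{{id}}">{{name}}</a>',
  { cls: 'profile', id: 42, name: 'Ada' }
);
```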

Element-attribute syntax expressions bind simple data easily, but quickly become verbose and unwieldy when replacing complex content.

 Binding values using Scope Blocks

Many templates also include some logic expressions, such as  if, foreach, and  using. Inline-token expressions can be limited when working with logic blocks — one reason is scope management. Element-attribute syntax declares expressions on the elements themselves, which have a natural scope within the DOM tree: they start, end, and have children. This makes data binding simple when scope is required.

Inline-token expressions are parsed with pattern-matching and could occur anywhere, so the parser must maintain the scope of any logic blocks used in a template. Additionally, the template syntax must contain closing tags for each opening logic block.

Grammars

Each templating syntax comes with its own grammar — or structure. A study of language grammar is a post of its own (or a degree of its own!), but a brief look at grammar is useful for understanding the different syntax of each templating framework.

A grammar is a logical structure that defines what is acceptable in a language. Although templating frameworks can have a grammar, they aren’t necessarily languages on their own.

Let’s look at the Handlebars templating framework:

A handlebars expression starts with the token  {{ , and contains three sections:

  • An optional additional  { token
  • An optional logic expression:  #with , #using , #if , #else , #unless
  • A variable or object

Once defined, a basic grammar can be represented as a regular expression. Regular expressions can be represented differently in different languages, but this one should be generic enough:

{{{?(#[a-z]+ )?[a-z]+(\.[a-z]+)*}?}}

Meaning:

  • Match the string  {{ (opening token)
  • followed by zero or one  { (unescape token)
  • followed by zero or one  #[a-z]  strings (logic directives)
  • followed by an  [a-z]  string (variable or expression)
  • followed by zero or more  .[a-z] strings (dot-operator variables)
  • followed by zero or one  } (closing unescape token)
  • followed by the string  }} (closing token)
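As a sanity check, the pattern can be run against sample expressions in JavaScript (note that in real regex syntax the dot operator needs escaping, and the whole pattern is anchored here):

```javascript
// The grammar above as an anchored JavaScript regular expression:
var handlebarsExpr = /^\{\{\{?(#[a-z]+ )?[a-z]+(\.[a-z]+)*\}?\}\}$/;

var valid   = ['{{name}}', '{{{raw}}}', '{{#with author}}', '{{user.name}}'];
var invalid = ['{name}', '{{}}', '{{#if}}']; // #if without a variable fails

var allValid  = valid.every(function (s) { return handlebarsExpr.test(s); });
var noneValid = invalid.every(function (s) { return !handlebarsExpr.test(s); });
```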

Grammars can be wildly complex, but these simple principles can be applied to any of the templating frameworks to get a better understanding of how they work.

Parsing

Regular expressions are not only useful for understanding how a template’s syntax is structured; they are also used to parse inline-token syntax templates.

Handlebars’ parser uses a technique which involves searching a template for the  {{ token, then “looking ahead” for the closing  }}, and saving everything between the two in a buffer. When the end token is found, the buffer contents are then parsed and types are returned based on their content.

The contents of those blocks are then passed again into the same parser, to look for further operators:
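A toy version of that scan, splitting a template into text and expression tokens:

```javascript
function tokenize(template) {
  var tokens = [];
  var pos = 0;
  while (pos < template.length) {
    var open = template.indexOf('{{', pos);
    if (open === -1) { // no more expressions: the rest is plain text
      tokens.push({ type: 'text', value: template.slice(pos) });
      break;
    }
    if (open > pos) {
      tokens.push({ type: 'text', value: template.slice(pos, open) });
    }
    var close = template.indexOf('}}', open); // "look ahead" for the close
    tokens.push({ type: 'expr', value: template.slice(open + 2, close) });
    pos = close + 2;
  }
  return tokens;
}

var tokens = tokenize('Hi {{name}}, you have {{count}} messages');
```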

After the parser has finished iterating on the template string, it is left with a stack of logical operations and functions which are then supplied the data for rendering.

Consistency leads to Predictability

One of the keys to a language’s (and templating engine’s) success and adoption is consistency in syntax. Based on previous experience with a given syntax, a programmer should be able to successfully create new code without diving into the documentation. “If that similar structure took a key-value-pair before, I bet this one here does too.”

Inline-token syntax parsers self-enforce consistency of the language, due to the nature of the parser. It must find a specific token, followed by a set of allowable expressions, which themselves might contain expressions. It’s easy to remember that an underscore block always looks like this  <%= ... %>, because it must, per the parser.

Element-attribute syntaxes have a harder time enforcing consistency; it must be created through the framework creator’s insistence on following a pattern for their “language.”

A generic element-attribute syntax could consist of a directive attribute with an expression value, something like  <element directive="expression" /> .

Framework creators must choose a pattern for their element-attributes and insist that the pattern is followed for new directives that are added to the language.

Let’s create a pretend framework, called Bob. Bob starts simply, and only binds strings:  bob-string="name" .

We continue to use Bob, and need to add some logic:

And later, repeating sections. Since repeat doesn’t really need a value, let’s write it like so:  bob="repeat" .

Now we need to replace multiple attributes, say with a key-value pair:  bob-bind="href: url" .

And finally, replace specific strings within attributes, this time with a JSON value:  bob-replace='{"title": "name"}' .

The moment a second programmer tries to implement Bob, it becomes clear that we’ve implemented a totally unpredictable syntax! Is a directive part of the attribute name ( bob-string="name"), or in the value ( bob="repeat")? What is the valid syntax for a  bob-bind attribute? A single value or a key-value pair? The  bob-replace value is JSON apparently?

Regardless of the framework’s speed or other technical features, a syntax implemented in this way will kill adoption.

Comparison of popular framework syntaxes

String insertion

angular
closure
handlebars
knockout
pure
underscore

Iteration

angular
closure
handlebars
knockout
pure
underscore

If

angular
closure
handlebars
knockout
pure
underscore

That’s all folks!

Templating frameworks are very useful, and their syntax is one of the main ways in which a programmer interacts with them. Understanding the differences in syntaxes and template philosophies can help you choose the right one for your use-case!

Front-end Series: JavaScript debugging tips and tricks

Note: many of these tips are written with Chrome in mind, but are applicable to any modern browser with developer tools.

  1. Understand what is happening in your code
  2. Set and use Breakpoints
  3. Use the console API
  4. De-minify
  5. Modify scripts locally
  6. Use a Unit-Testing framework

1. Understand what is happening in your code

A prerequisite to debugging is understanding the code in question. Before debugging, you should be able to answer: what is the purpose of this function? What should the parameters be? How will this line affect local and global variables?

Basically, run the code mentally before trying to debug.

2. Set and use Breakpoints

Once you understand how the code should function, set execution breakpoints at key places. When reached, a breakpoint will pause the execution of the script on that line. You can then “step through” the code one line at a time. This view also allows you to inspect any global or local values to ensure they are what you expect them to be. Oftentimes, you can observe a variable which should contain a value but is undefined. From there, it is easy to trace your way back up and see where the error is occurring.

Breakpoints can be set by clicking on the line number in the Sources tab of Chrome developer tools, or programmatically using the  debugger; statement.


3. Use the console API

While sprinkling a few  console.log calls throughout your code can be useful, the console API has many functions that might serve the purpose better. Here are a few console API methods I have found useful:

console.dir

Displays all the properties of an object.


 console.table

Formats arrays and objects as tables, and allows sorting on columns.
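For example, given an array of objects:

```javascript
var users = [
  { name: 'Ada',   visits: 12 },
  { name: 'Grace', visits: 47 }
];
// Renders a sortable table in the devtools console (ASCII in Node)
console.table(users);
```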

console.time

Starts and stops a timer (string tag based), logs time (in ms) of intermediate code.
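The start and stop calls share a string label:

```javascript
console.time('sum');
var total = 0;
for (var i = 0; i < 1e6; i++) total += i;
console.timeEnd('sum'); // logs something like "sum: 4.2ms"
```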

console.memory

Returns heap information for this process.

console.trace

Prints a stack trace from the point where it is called.


4. De-minify

When debugging, it can be very frustrating to see an error like  Uncaught TypeError: a is not a function  pointing at line 1 of a minified, one-line script.

When at all possible, swap out the minified version for a fully expanded and commented version of the script.

When a full version is not available, Chrome’s developer tools come to the rescue. The curly-brace button below the script will automatically expand lines for better breakpointing and debugging. It does not, however, rename obfuscated variable names.

Minification bugs

Every so often, you run into the maddening bug of a full script working and the same minified script breaking. Luckily, these bugs are often attributable to one of a few small problems (mostly having to do with missing semicolons).

For example:
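A contrived but classic case (simulated here with eval): a script that ends without a semicolon, concatenated with one that begins with an immediately-invoked function expression.

```javascript
// Script A ends without a semicolon...
var scriptA = 'var version = 1';
// ...and script B starts with an immediately-invoked function expression.
var scriptB = '(function () { /* plugin code */ })()';

// Concatenated, the parser reads "1(function () {...})()" and tries to
// call the number 1 as a function:
var threw = false;
try {
  eval(scriptA + '\n' + scriptB);
} catch (e) {
  threw = e instanceof TypeError; // "1 is not a function"
}

// A defensive leading semicolon terminates the stray statement:
eval(scriptA + '\n;' + scriptB); // runs cleanly
```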

These bugs can also creep in from preceding scripts, or when multiple scripts are concatenated together. This is the reason that jQuery plugins typically start with a  ; . It is a preventative measure against misbehaving code above it.

5. Modify scripts locally

Chrome allows you to modify script files and run them locally. Simply double-click into the source file, type, and press CTRL-S to save. There’s not much more to explain, but it’s a very useful trick!

6. Use a Unit Testing framework

There are a handful of very good third-party unit-testing frameworks for JavaScript (Jasmine, Mocha, and QUnit, to name a few), each with its own philosophy and syntax.

See this great Stack Overflow post for many more.

Front-end Series: Part 1 – Architecture

This is part 1 in the Front-end Series. As a front-end developer at a large organization, I work daily on architecture, performance, automation, localization, and compatibility, among other topics I will share during this series. Feel free to post topic ideas in the comment section!

Large-scale web application architecture is much more than a server and a /www folder. The build process can scale with need, but the basic setup involves building assets to a development area, then to a production environment, with caching and delivery via CDN.

This is an example architecture diagram — showing building to different environments, pushing to CDN boxes, and caching.


Building static assets via source control

Working with other developers at any scale necessitates the use of a version control system. Whether Git, Subversion, Mercurial, or the next new hotness, a VCS is an absolute must. After developing in a local environment, assets are pushed to a version-controlled development server.

The assets here are never used directly by the application, but are pushed to a CDN server (or a mock CDN for development). The source control and CDN can be synchronized by a few different methods:

  • CRON task that rebuilds one folder from the other
  • Post-commit hook that runs a similar script
  • 3rd party build software, like Anthill

Server environments, or “lanes”

It’s necessary to duplicate asset servers for different audiences. The most basic are:

  • Development
  • Testing
  • Staging
  • Production

A typical pipeline is:

  • Assets are developed locally
  • Files are moved into the Development environment to be integrated with other developers’ work and preliminarily tested
  • Quality Assurance tests the application, and feedback is provided to developers (loop back)
  • Once testing is successful, the assets are moved to the staging server, which serves as a place for stakeholders to preview the application and approve it for production
  • When okayed, the files are moved to production, where they are publicly accessible. They are now “live”

Caching and Content Delivery

Linking directly to assets on a single server is an excellent way to kill your site when it goes viral (the Reddit hug of death, or the Slashdot effect). The most basic level of defense is on-server caching. On smaller sites, this can be effective protection. Many CMSs (like WordPress) have plugins that can cache dynamic content (PHP-generated pages saved as static HTML). WP Super Cache is an excellent example of this layer of defense.

The second layer of defense is a caching appliance. This is a server (or cluster of servers) that specializes in serving static files with extreme speed. The appliance has only one job, which allows it to dedicate all of its resources to serving content.

The top layer for serving files is utilizing a Content Delivery Network to distribute and cache assets globally. At its core, a CDN is a network of servers placed in strategic locations around the world, each containing duplicate, synchronized files. These servers, ideally located on backbone connections, quickly serve files to customers in different areas of the globe. For example, a user in France is served assets from a UK server, rather than making the transatlantic trip to Virginia.

For smaller sites (such as this one) there are free or inexpensive CDN options. This website uses CloudFlare to serve static assets to viewers (your CSS file was probably served from this CDN, reducing my server load. Yay!) An excellent CDN for large-scale websites is Akamai. You’ve almost certainly been served assets from Akamai (Facebook, Adobe, Microsoft, and many more use it). If you’re in the market, take a look at CloudFront as well.

So there you have it: The front-end build and architecture for one of the largest publishing organizations in the world!

Big Picture

What I’ve described is the front-end server and building process for a large scale web application. This setup will typically be mirrored by the application codebase (back-end).


Comments, feedback or criticism? Let me know in the comments: