Category Archives: Development

How to get More Bang for your Heroku Buck While Making Your Rails Site Super Snappy [Redux]

I first wrote about how to get the most bang for your Heroku buck a year ago. Since then a few things have changed and we’ve learnt even more about how to deliver great performance from our Heroku hosted sites. Some of the advice remains the same, but there are some important changes. There is also an important caveat at the end. While this is written primarily for Rails developers using Heroku, much of it is applicable to any site hosted on any platform.

We love Heroku. It makes deployment easy and quick. However, it gets pricey when you add additional dynos at $35 per month each. With a bit of work you can get a lot more out of your Heroku dynos whilst drastically improving the performance of your site for your users and providing better scalability. You might need to spend a bit on other services, but a lot less than if you simply moved the dyno slider.

There are two sides to site performance: how many requests your site can handle, and how long it takes to display in the browser. These are intimately connected, but ultimately your users only care about the latter, while your boss or client probably cares more about the former. Shaving 50ms from your response time will increase your throughput, but it won’t help your users if they have 2mb of JavaScript to download.

0. Before you Dive in: Measure Your Performance [New]

Remember the golden rule:

Premature optimization is the root of all evil
- Donald Knuth

You don’t have a performance problem until you can show me a graph and some numbers. Luckily for you, that’s easily done on Heroku. The performance monitoring service New Relic is available as a free add-on for all Heroku users. Add it to your app and start digging. Not only will it help you work out the problem areas, it will also confirm whether your efforts are actually paying off (or not). Other useful tools are available in your browser. In Chrome’s Developer Tools (the other browsers have an equivalent), the Network view will show you exactly what happens when you load your page, and the Audit view will suggest ways to speed up your site. The audits are especially useful for spotting caching problems.

1. Use Phusion Passenger [New]

Use Phusion Passenger for Heroku. Really, it’s awesome. Phusion Passenger is a multi-threaded application server that now runs on Heroku using Nginx. On average we manage three or four concurrent threads per dyno, depending on memory use. Passenger has several advantages over the other application servers available for Rails on Heroku.

  • It’s consistently fast. I’m not convinced that it’s significantly faster than Unicorn, but it does seem to be more consistent. This may be related to its second advantage…
  • It’s more memory efficient than the alternatives. While it won’t drastically reduce the memory footprint of your app, it does seem to have shrunk at least one of our apps’ total footprint by 10-15%. That’s not masses, but on Heroku, with its 512mb limit, that can make all the difference. If you breach the 512mb limit Heroku will start swapping memory out to disk, at which point performance will get much less consistent as parts of your application are moved in and out of RAM.
  • Assets are served directly by Nginx, not Rails. While we still don’t want to serve lots of assets from our Heroku instance, doing so through Nginx is significantly better than doing so through the application stack.
  • Finally, and significantly for your users, Passenger/Nginx support HTTP compression out of the box, for both assets and application responses. You don’t have to do anything. If the browser sends the correct Accept-Encoding header the server will respond appropriately. This can radically reduce the size of the HTML, CSS, and JavaScript sent.
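
To get a rough feel for what that compression buys you, here is a small Ruby sketch that gzips a chunk of repetitive HTML using the standard library. The markup and method names are invented for illustration, and the savings on your own pages will differ. On Heroku, Nginx does this for you, so none of this needs to live in your app.

```ruby
require "zlib"

# Invented sample markup: repetitive HTML like this compresses very well.
def sample_html(rows = 200)
  items = (1..rows).map do |i|
    %(<li class="item"><a href="/items/#{i}">Item #{i}</a></li>)
  end
  "<ul>#{items.join}</ul>"
end

# Returns [original_bytes, gzipped_bytes] for a response body.
# Zlib.gzip needs Ruby 2.4 or later.
def gzip_sizes(body)
  [body.bytesize, Zlib.gzip(body).bytesize]
end

original, compressed = gzip_sizes(sample_html)
puts "original: #{original} bytes, gzipped: #{compressed} bytes"
```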

2. Keep Within the Memory Limits: Put Your App on a Diet and Don’t Get Greedy with Threads [New]

One of the main limitations of a Heroku dyno is the 512mb RAM limit (1gb if you pay for a 2x dyno). Once you hit that things start getting swapped out to disk, significantly affecting performance. Requests get slower on average, and response times get more unpredictable.

New Relic can give you an insight into your memory use, on a per instance (in our case Passenger threads) and total basis. You might even be able to squeeze in an extra thread to handle more requests.

Always keep your total memory footprint below the 512mb limit if you want consistently good performance.

There are three main approaches to reducing the size of your application:

First and most obvious, remove unused code from your app and Gems from your Gemfile. If you don’t need it, it shouldn’t be there.

Secondly, be fastidious about your Gemfile groups. Make sure that gems that are only used in test, development, or asset compilation are in the relevant groups. Don’t just dump everything in the default group, or all of it will be automatically required at startup, consuming memory. The Rails 4 default project has done away with the :assets Gemfile group, but you can easily add it back in by editing application.rb and changing

    Bundler.require(:default, Rails.env)

to

    Bundler.require(*Rails.groups(:assets => %w(development test)))

Finally, if there are any gems that are used solely for background workers or rake tasks, require them manually where you need them rather than auto-requiring them at startup.
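
As a rough sketch of that last point: in the Gemfile you can tell Bundler not to load a gem at boot with require: false, then require it at the point of use. The job class and gem name below are hypothetical, and the standard library’s json stands in for the heavy gem so the example runs on its own.

```ruby
# In the Gemfile you would mark the gem so Bundler skips it at boot:
#
#   gem "some_heavy_gem", require: false   # hypothetical gem name
#
# and then require it only where it is needed. Here "json" from the
# standard library plays the part of the heavy gem.
class ExportJob
  def perform(rows)
    require "json" # loaded the first time a job runs, not at app startup
    JSON.generate(rows)
  end
end

puts ExportJob.new.perform([{ "id" => 1, "name" => "Example" }])
```

Web processes that never run the job never pay the memory cost of loading the library.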

Don’t be tempted to use too many Passenger threads if it means going over the memory limit. The increase in concurrency will probably be outweighed by an overall reduction in performance of all the threads.
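
The arithmetic is worth doing before you pick a number. The sketch below assumes a fixed overhead plus a roughly equal amount of memory per Passenger process. The 110mb and 140mb figures are guesses chosen to line up with the 530mb and 390mb totals in this section, not measurements, so substitute your own numbers from New Relic.

```ruby
DYNO_LIMIT_MB = 512 # the standard dyno limit quoted in this article

# Total footprint for a given number of app processes.
def footprint_mb(processes:, per_process_mb:, overhead_mb:)
  overhead_mb + processes * per_process_mb
end

# Largest process count whose footprint stays within the limit.
def max_processes(per_process_mb:, overhead_mb:, limit_mb: DYNO_LIMIT_MB)
  usable = limit_mb - overhead_mb
  return 0 if usable < per_process_mb

  usable / per_process_mb # integer division rounds down
end

# Assumed figures (not measurements): 110mb overhead, 140mb per process.
[3, 2].each do |count|
  total = footprint_mb(processes: count, per_process_mb: 140, overhead_mb: 110)
  status = total <= DYNO_LIMIT_MB ? "fits" : "over the limit"
  puts "#{count} processes: #{total}mb (#{status})"
end
puts "max processes: #{max_processes(per_process_mb: 140, overhead_mb: 110)}"
```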

The graph below shows what happened when we reduced the number of threads on an application so that its memory consumption dropped from about 530mb to about 390mb. Throughput on the site was roughly comparable. Notice how much more consistent the performance is afterwards.

Application response times over six hours, compared with the same time the previous day, showing the effect of reducing the number of Passenger threads to fit within the Heroku memory limits.

3. Serve Static Assets and Uploads from a CDN on Multiple Subdomains – but Don’t Use asset_sync [Updated]

Last year I recommended using asset_sync to move your assets to S3, removing the need for your Heroku dyno to serve them. With the arrival of Passenger on Heroku this is no longer good advice. Because Passenger serves assets through Nginx and will serve the compressed versions where appropriate, serving your assets from your dyno through a CDN (content delivery network) such as Amazon CloudFront will give your users a much better experience than asset_sync, while not increasing the load on your dyno. Because the cache expiry of your assets is set, by default, to a very long time, the number of requests that actually hit your dyno will be tiny (around once per asset per year).

To really juice up the load times of your site, configure four subdomains for your assets, numbered from 0 to 3, e.g. assets0.myapp.com to assets3.myapp.com, pointing at your asset CDN and set the following in your production configuration:

    config.action_controller.asset_host = "assets%d.myapp.com"

Rails will cycle through each of these subdomains when it generates asset links. Browsers are generally restricted to only two concurrent requests per host name, so having assets served from four allows the browser to make eight concurrent requests. Page load speeds will now be constrained only by the speed of your user’s connection. If your user has a good connection then they will be able to download most of your assets in parallel.
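
To see how the %d placeholder spreads assets across hosts, here is a minimal sketch. It assumes the host number is derived from a checksum of the asset path, which keeps each file on the same host from one request to the next so that it stays cacheable. The method and constant names are made up for illustration, and this is a sketch of the idea rather than Rails’ actual source.

```ruby
require "zlib"

ASSET_HOST_TEMPLATE = "assets%d.myapp.com"
HOST_COUNT = 4 # assets0 to assets3

# Picks a stable host for a given asset path.
def asset_host_for(path)
  format(ASSET_HOST_TEMPLATE, Zlib.crc32(path) % HOST_COUNT)
end

%w[/assets/application.css /assets/application.js /assets/logo.png].each do |path|
  puts "#{path} -> #{asset_host_for(path)}"
end
```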

Heroku have documentation walking you through the CloudFront setup.

4. Turbo-Charge your Application with Memcache backed View Caching and In-app Caches [Updated]

If you’ve not encountered caching in Rails, stop reading this article right now, go read the Rails Guide to Caching and then DHH’s short guide to key based cache expiry. Caching in Rails 4 is even better, with improved support for “Russian Doll” caching.

View caching in Rails can have a profound effect on your application’s response time. In the past we have found that rendering pages, especially complex ones with lots of partials, can easily account for two-thirds of the total processing time, much more than you might expect. Use New Relic to guide your improvements.
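
If key-based expiry is new to you, the idea can be sketched in plain Ruby. The key includes the record’s updated_at, so changing the record produces a new key and the stale fragment is simply never read again. TinyCache and Product are hypothetical stand-ins for the Rails cache store and an ActiveRecord model, not real Rails APIs.

```ruby
# Stand-in for a model: the cache key changes whenever updated_at changes.
Product = Struct.new(:id, :name, :updated_at) do
  def cache_key
    "products/#{id}-#{updated_at.to_i}"
  end
end

# Stand-in for a cache store with a fetch-or-compute method.
class TinyCache
  attr_reader :misses

  def initialize
    @store = {}
    @misses = 0
  end

  # Returns the cached value, or runs the block once and stores its result.
  def fetch(key)
    @store.fetch(key) do
      @misses += 1
      @store[key] = yield
    end
  end
end

cache = TinyCache.new
product = Product.new(1, "Widget", Time.at(1_000))
render = ->(item) { "<li>#{item.name}</li>" }

cache.fetch(product.cache_key) { render.call(product) } # miss: renders
cache.fetch(product.cache_key) { render.call(product) } # hit: block not run
product.name = "Gadget"
product.updated_at = Time.at(2_000) # a save would bump this for you
puts cache.fetch(product.cache_key) { render.call(product) } # new key, renders again
puts "renders: #{cache.misses}"
```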

Memcache store is shared between your dynos so they all benefit from any cached item. The Memcachier addon gives you 25mb for free, and is pretty reasonably priced from there on up. Just adding a small cache store of 25mb can make a significant difference to the load time of your pages.

Don’t be afraid to de-normalise some of your data, where appropriate. Sometimes storing a precomputed value in a model, especially one based on complex transitive relationships with other models, makes up in performance improvement what it loses in programming purity and elegance. The most common example of this approach is ActiveRecord counter caches, but you can easily add your own.
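
In ActiveRecord the counter cache is a one-liner (belongs_to :post, counter_cache: true, plus a comments_count column on posts). The plain Ruby sketch below shows the underlying trade: a stored count that is kept in step on every write, so reads never have to count. The Post class here is a hypothetical stand-in, not a model.

```ruby
class Post
  attr_reader :comments_count

  def initialize
    @comments = []
    @comments_count = 0 # denormalised value, updated on every write
  end

  def add_comment(body)
    @comments << body
    @comments_count += 1
  end

  # Removes one matching comment; returns false if there was none.
  def remove_comment(body)
    index = @comments.index(body)
    return false unless index

    @comments.delete_at(index)
    @comments_count -= 1
    true
  end

  # What the cached value replaces: counting the collection on every read.
  def recount
    @comments.size
  end
end

post = Post.new
post.add_comment("First!")
post.add_comment("Nice article")
puts post.comments_count
```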

5. Offload Complex Search to a Dedicated Provider [Unchanged]

If you have an application that needs to perform complex searches over large datasets, don’t do it in your application directly. If searches regularly take a long time, consider using something like Solr (available as a Heroku add-on), Amazon CloudSearch, or one of the many Search as a Service providers. You’ll not only get faster search performance, but you’ll save vast amounts of development time trying to optimise your in-app search. If search is a significant aspect of your site the cost of a good search service will probably be better value than just scaling your database.

6. Use Background Processing the Smart Way with Delayed::Job and HireFire [Unchanged]

Background processing with Delayed::Job is a great way of speeding up your web requests. Potentially slow tasks like image processing or sending signup emails can happen outside of the request-response cycle, making it much snappier and freeing up your dyno to handle more requests. The downside is that you need to run a worker dyno at $35/month.

Michael van Rooijen’s HireFire modifies Delayed::Job and Resque to automatically scale the number of worker dynos based on the jobs in the queue. Because Heroku charge by the dyno/second, spinning up 10 workers for one minute costs the same as one worker for ten minutes, so with HireFire you can potentially get things done quicker while paying less than you would if you ran a dedicated worker dyno.
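
The per-second billing arithmetic is easy to check. This sketch assumes the $35 per dyno per month figure used in this article, spread evenly over a 30-day month; Heroku’s actual rate card may differ, and the method name is made up for illustration.

```ruby
DYNO_PRICE_PER_MONTH = 35.0           # assumed: the article's per-dyno price
SECONDS_PER_MONTH = 30 * 24 * 60 * 60 # assumed: a 30-day month

# Cost in dollars of running some number of dynos for some number of seconds.
def worker_cost(dynos:, seconds:)
  dynos * seconds * DYNO_PRICE_PER_MONTH / SECONDS_PER_MONTH
end

burst = worker_cost(dynos: 10, seconds: 60)      # ten workers for one minute
steady = worker_cost(dynos: 1, seconds: 10 * 60) # one worker for ten minutes
always_on = worker_cost(dynos: 1, seconds: SECONDS_PER_MONTH)

puts format("burst: $%.4f, steady: $%.4f, always on: $%.2f", burst, steady, always_on)
```

The burst and steady runs use the same number of dyno-seconds, so they cost the same; the saving comes from not paying for an idle worker the rest of the month.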

HireFire does have one limitation: it only works for jobs scheduled for immediate execution. If that is an issue, Michael has a HireFire service that will monitor your application for you, so jobs scheduled in the future will be run.

7. Don’t Upload and Process Files with your Web Dynos [Unchanged]

If you use something like CarrierWave or Paperclip, by default the uploading and processing of images is done by your dyno. While this is happening your dyno thread is completely tied up, unable to handle requests from any other user.

Decouple the upload process from your dyno using something like CarrierWave Direct. With a bit of client-side magic it uploads files to S3 directly, rather than through the dyno. The images then get resized by background processes using Delayed::Job or Resque. This obviously has the downside that you’ll need a worker running.

Another option, which we’ve used recently, is the awesome Cloudinary service. They provide direct image uploading, on-demand image processing (including face detection, which even seems to work on cats) and a worldwide CDN all in one package. There is a free tier to get you started, and for $39 (slightly more than one Heroku dyno) their Basic plan will be more than enough for many sites.

Putting it all Together

At the end of all this we’ve freed up our Heroku dyno from doing things it’s not very good at like serving static files and uploads, and juiced up its performance when doing what it’s great at, serving Rails application requests with no sys-admin in sight.

Each technique can be easily applied to your existing applications, but if you develop with them in mind from the start you get all the benefits with almost no additional work. On their own each one will help the performance of your application, but combining them together will significantly extend the amount of time before you have to start forking out for lots more dynos, and when you do you’ll get much more bang for each of your thirty-five Heroku bucks.

If you’ve got any other tips for getting the most out of a Rails application, whether or not it’s on Heroku, we’d love to hear about them!

Postscript: Caveat Developer

Heroku is fantastic for reducing developer overhead and with a bit of work you can serve large and popular sites on it for relatively little. We use it for many of the sites we build. However we also use other hosting platforms, especially Amazon AWS, so we can compare our experiences of the two and we’ve noticed a couple of issues.

We frequently see significant performance drops after deploying a new version. Response times sometimes treble, with all parts of the stack slowing by the same factor. Scaling the application down and then back up will often fix the problem. This is not a code issue: it can happen after deploying nothing more than a change to some CSS.

No matter how minimal an app is, the best response time I’ve ever seen in the browser is about 150ms, and that’s not consistent; it’s frequently longer. Now, 150ms is pretty quick, in fact it’s about a blink of an eye, but applications we’ve hosted on single Small EC2 instances have shown consistently better performance without any optimisation. Both of these issues are probably due to a combination of Heroku’s routing infrastructure and the way your dyno shares resources with others on the same host hardware.

The differences are only in the order of 100ms or so, less than the blink of an eye, so how much it matters will depend on your use case. Constant monitoring of your application is key.

Obviously, while you get by on a single free Heroku dyno you can’t complain too much, but once you start forking out for extra dynos you might want to look at Amazon Elastic Beanstalk as an alternative. It’s still quite immature compared to Heroku (but improving all the time), and you’ll have to get your hands a bit dirty setting it up, but it gives you most of the ease of maintenance of Heroku. If you are prepared to pay up front, the cost of a single Small EC2 instance is on a par (or less) with a Heroku dyno, but gives you more memory and more consistent performance. You also get the advantages of AWS’s other services like automatic Elastic Scaling for those busy periods.

As with all such decisions, how and where you host is going to depend on what you need and how you want to spend your cash. With a bit of work Heroku can form the core of a really good setup that will scale effortlessly, but it’s always worth keeping an eye on the other options.

Using Pow with RVM 1.19’s .ruby-version and .ruby-gemset files

With the upgrade to RVM 1.19 you are asked to convert your old .rvmrc file into .ruby-version and .ruby-gemset files.

You are using ‘.rvmrc’, it requires trusting, it is slower and it is not compatible with other ruby managers, you can switch to ‘.ruby-version’ using ‘rvm rvmrc to [.]ruby-version’ or ignore this warnings with ‘rvm rvmrc warning ignore /Users/danielkehoe/code/railsapps/rails-prelaunch-signup/.rvmrc’, ‘.rvmrc’ will continue to be the default project file in RVM 1 and RVM 2, to ignore the warning for all files run ‘rvm rvmrc warning ignore all.rvmrcs’.

A typical .ruby-version file would contain

ruby-2.0.0-p0

The .ruby-gemset file would have

MyProject

You can add the gemset name into the .ruby-version file as ruby-2.0.0-p0@MyProject but that breaks compatibility with other Ruby version launchers.

Pow needs RVM to be loaded and for it to select the correct Ruby version to run your app with.  You do this with the .powrc file.  Mine looks like this:

if [ -f "$rvm_path/scripts/rvm" ] && [ -f ".ruby-version" ] && [ -f ".ruby-gemset" ] ; then
  source "$rvm_path/scripts/rvm"
  rvm use `cat .ruby-version`@`cat .ruby-gemset`
fi

That’s all there is to it!

oAuth Twitter for PHP and WordPress developers: Version 2!

Yesterday, I released version 2.0 of our oAuth Twitter PHP class and WordPress plugin.

It’s a simple way of handling all of the oAuth requirements in Twitter’s API v1.1 that become mandatory on 5th March 2013. For more information about the plugin itself, you can read my original post on the first release.

Version 2.0 is vastly improved and allows you to:

  • Use multiple Twitter accounts rather than just the one defined in the configuration
  • Define a custom cache expiry (or disable the cache for debugging purposes)
  • Pass any custom parameters you want to Twitter’s API, and override the defaults easily

This update is fully backwards compatible with version 1.0. You can either upgrade the plugin using the WordPress admin panel, or just drop the updated class file in — but it will delete your cache the first time it runs, as the format has changed significantly to support the new features.

We suspect many people aren’t yet prepared for Twitter’s great API 1.0 switch off on 5th March 2013, so it’s worth checking any sites you manage or maintain to make sure they are using authenticated calls to Twitter.

How to save the uploaded file name with carrierwave_direct and S3

So you’ve set up carrierwave_direct and you’re happily uploading files to Amazon S3. In this example I’ve mounted CarrierWave on a field called csv_file, but that can be whatever is appropriate to your app.

You’ve probably got two controller methods

def upload
  @model = Model.new
  @model.save

  @uploader = @model.csv_file
  @uploader.success_action_callback = upload_successful_model_url(@model)
end

def upload_successful
  @model = Model.find(params[:id])

  # Now what??
end

You need to save the file name to the model so that it can be referenced later. The documentation (at the time of writing) offers no indication of how you might go about that. The secret is in the key attribute that CarrierWave adds to your model.

def upload_successful
  @model = Model.find(params[:id])
  @model.key = params[:key]
  @model.save

  redirect_to model_path(@model)
end

Simple. When you know how!

How to show comments on a separate page in WordPress

Struggling to give WordPress comments their own page without messing up your URL structure? I know the feeling.

Displaying a post’s comments separately from the main content can be useful in many circumstances. Although less common nowadays, traditionally many blogs chose to feature comments in a pop-up window or lightbox. It can also be desirable for editorial or legal reasons, for caching purposes, or perhaps if you are publishing individually art-directed posts. It can also be handy when using WordPress for a non-blog website.


6 Ways to get More Bang for your Heroku Buck While Making Your Rails Site Super Snappy

We love Heroku. It makes deployment so easy and quick. However, it can start to get pricey when you add additional dynos at $35 each a month.

With a small amount of work, you can get a lot more out of your Heroku hosting whilst drastically improving the performance of your site. You might need to spend a little bit of cash on other services, but a lot less than if you simply moved the dyno slider up a few notches, and the result will be much better scalability.

So how do we max out the performance of our Heroku apps? First we stop using Heroku for things it’s bad at, then we let it do more of what it is good at, running your application code.


Using tomdoc to document a scope in a Rails model

I’m playing around with Tomdoc for documenting my latest Rails project.  The documentation is (ironically) a bit thin on the ground.  It’s taking a bit of trial and error to get some things working.  The most recent brainteaser was how to get tomdoc (or even rdoc) to document a scope declared on a Rails model.


Is Google indexing pages from Twitter and messing with your analytics?

I just Googled for “WordPress RC” to find the release notes for the 3.5 Release Candidate.  I clicked on the result for wordpress.org and was taken to the correct page, nothing out of the ordinary.  I then copied the URL to share in team chat and noticed that the URL was quite long; there were some query string parameters.  The complete URL was:

http://wordpress.org/news/2012/11/wordpress-3-5-release-candidate/?utm_source=twitterfeed&utm_medium=twitter

The utm_source and utm_medium parameters are used by Google Analytics to segment traffic by source.  Normally you would expect to end up on this URL if you clicked from the WordPress Twitter feed.
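
To make the role of those parameters concrete, here is a small Ruby sketch that pulls the utm_* values out of a URL and produces a clean version for sharing. The method names are made up for illustration, and it treats any parameter whose name starts with utm_ as a tracking parameter.

```ruby
require "uri"

# Returns only the utm_* tracking parameters from a URL as a hash.
def utm_params(url)
  query = URI.parse(url).query.to_s
  URI.decode_www_form(query).select { |key, _value| key.start_with?("utm_") }.to_h
end

# Returns the URL with utm_* parameters removed, e.g. before sharing it.
def strip_utm(url)
  uri = URI.parse(url)
  kept = URI.decode_www_form(uri.query.to_s).reject { |key, _value| key.start_with?("utm_") }
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

url = "http://wordpress.org/news/2012/11/wordpress-3-5-release-candidate/" \
      "?utm_source=twitterfeed&utm_medium=twitter"
puts utm_params(url).inspect
puts strip_utm(url)
```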


oAuth Twitter Feed: PHP Library & WordPress plugin

Twitter’s new API 1.1 has been live for a couple of months now, and it brings with it a whole new set of requirements for using the Twitter API. From March 2013, everyone displaying tweets must update their website’s code to comply with the new terms, which include changes to the way you authenticate.

This basically means you won’t be able to use JavaScript to load in tweets for your website directly from Twitter, as all requests must be authenticated.

This is a major problem for the vast majority of websites that are currently using Twitter and one we needed to work around – so we built a PHP class (GitHub), and a WordPress plugin (GitHub), that implements all the new requirements for authentication and gives you an array of tweets out of the other end, for you to use in your WordPress themes, or PHP applications. We built the class and plugin on top of Abraham Williams’s Twitter OAuth class which makes the OAuth stuff really easy.

You can get the plugin from the WordPress plugin directory: oAuth Twitter Feed for Developers.

The plugin also caches results (for 1 hour by default). This stops you hitting the API limits, which is a real risk now that every request must be authenticated, and protects you from Twitter outages. You can change the cache duration yourself in the source code.

It’s definitely for developers though – you only get an array out of it that contains Twitter tweet objects (you’ll find more about them in the Twitter API documentation). You’ll still need to style the output and make it comply with the new display requirements.

The plugin provides a getTweets() function that you can call in your theme files. This returns an array you can then loop over and do whatever you want with it.

      $tweets = getTweets();
      var_dump($tweets);

      foreach($tweets as $tweet){
        var_dump($tweet);
      }

You can specify a number of tweets to return (up to 20) by passing a parameter to the function. For example, to display just the latest tweet you’d request getTweets(1).

This is an early release of the class and plugin, and both could offer a lot more functionality.  If you want to get involved, fork the code on GitHub and send us a Pull Request!

Debugging :active, :focus, :hover and :visited states in Chrome

When you interact with an element on a web page, various pseudo classes are applied dynamically that you can use in CSS to define styles. These changes are not reflected in the Chrome inspector in real-time – you can’t select an element, hover over it and see the :hover styles.  In the simple case of changing text colour of an <a> this isn’t a big problem.  However, as soon as you start to use these pseudo classes for more complex purposes, like building complex navigation menus, you hit a point where you really want to be able to inspect those styles.

