Web Performance for Rails Developers

[Diagram: the rendering waterfall from initial request through server processing, asset download and browser paint]

Web performance in Rails is not a frontend problem with a backend afterthought — it is a full-stack concern where your database queries, your fragment caches, your asset pipeline and your CDN configuration all collapse into a single number that a user experiences as "fast" or "not fast."

This guide covers the rendering waterfall from the Rails developer's perspective, Core Web Vitals and what they actually measure, asset budgets that work in practice, the caching layers available to you (fragment, HTTP, CDN), how database query time dominates Time to First Byte, Turbo and Hotwire performance characteristics, and lazy loading patterns that help without introducing layout shift. It connects to the broader Ruby performance topic, because server-side execution time is only one piece of the waterfall — but it is the piece you have the most direct control over.

After a decade of profiling Rails applications in production, I have found that most performance problems are not exotic. They are predictable, and they follow patterns. The MDN Web Performance reference is excellent for the browser side of this equation; what follows focuses on the decisions a Rails developer makes that determine whether that browser ever gets the chance to render quickly.

Understanding the rendering waterfall

When a user requests a page from your Rails application, the browser does not display content the instant it receives a response. The rendering waterfall describes the sequence of dependent steps between the initial request and the moment the user sees a fully interactive page.

The simplified sequence: DNS lookup → TCP connection → TLS handshake → server processing (your Rails code) → first byte received → HTML parsed → CSS fetched and parsed → render tree built → first paint → JavaScript fetched, parsed and executed → interactive.

Each step depends on the previous one completing. That is what makes it a waterfall — you cannot parallelise sequential dependencies. But you can shorten individual steps, and you can avoid adding unnecessary steps.

From the Rails side, you control server processing time directly. You influence CSS and JavaScript delivery through your asset pipeline configuration. You control what the HTML contains (and therefore what the browser needs to fetch next). And you control HTTP caching headers, which can eliminate multiple steps entirely for returning visitors.

The single most useful diagnostic tool for understanding the waterfall is your browser's Network panel. Open it, disable cache, load your page and read the waterfall chart from top to bottom. Every horizontal bar is time your user is waiting. The long bars are where you should spend your optimisation effort.

Core Web Vitals from a Rails perspective

Google's Core Web Vitals measure three things: Largest Contentful Paint (LCP), Interaction to Next Paint (INP), and Cumulative Layout Shift (CLS). These are field metrics — measured on real user devices, not in your development environment.

LCP measures when the largest visible content element finishes rendering. For most Rails applications, this is either a hero image or the main text block. LCP is directly affected by your server response time (TTFB), your CSS delivery strategy and whether you are blocking rendering with unnecessary JavaScript in the head.

The Rails-specific lever for LCP is TTFB reduction. If your server takes 800ms to respond, your LCP cannot be faster than 800ms plus everything the browser needs to do after receiving the response. Fragment caching, database query optimisation and efficient view rendering are your primary tools.

INP measures responsiveness — how long it takes the page to respond to user interaction. Turbo and Hotwire help here by avoiding full page reloads, but they can also hurt if Turbo Frame responses are slow or if you are loading heavy JavaScript that blocks the main thread.

CLS measures visual stability. In Rails applications, CLS problems usually come from images without explicit dimensions, fonts that swap after initial render, or Turbo Frame content that loads asynchronously and pushes existing content around. Setting width and height attributes on images and reserving space for dynamic content are the fixes.

The important thing to internalise: Core Web Vitals are not abstract scores. Each one maps to a specific user experience problem, and each one has concrete Rails-side causes.

Asset budgets that actually work

An asset budget is a hard limit on the total size of CSS, JavaScript and fonts your page loads. Without one, asset sizes grow invisibly until someone notices that the marketing page loads 2.4 MB of JavaScript.

A reasonable starting budget for a Rails application using Hotwire: 100 KB of JavaScript (compressed), 50 KB of CSS (compressed), and 100 KB of fonts. That is aggressive by modern standards, but Hotwire applications should not need much client-side JavaScript — that is the whole point.

How to enforce it: add a CI check that measures asset sizes after compilation. If you are using esbuild or Vite through jsbundling-rails or vite_ruby, you can inspect the output directory and fail the build if any bundle exceeds your threshold. This is crude but effective. The alternative — discovering asset bloat in production when your Lighthouse score drops — is worse.
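A minimal version of such a check, as a plain Ruby script run after asset compilation. The output path, globs and limits here are assumptions, not Rails conventions — adapt them to your bundler's output directory. Note that File.size measures the uncompressed on-disk file, so set the limits accordingly.

```ruby
# Hypothetical CI budget check. Run after asset compilation; the
# globs and byte limits below are assumptions to adapt per project.
BUDGETS = {
  "*.js"  => 300 * 1024,  # uncompressed on-disk JavaScript limit
  "*.css" => 150 * 1024   # uncompressed on-disk CSS limit
}.freeze

# Returns [filename, actual size, limit] for every file over budget.
def oversized_assets(dir, budgets = BUDGETS)
  budgets.flat_map do |glob, limit|
    Dir.glob(File.join(dir, glob))
       .select { |path| File.size(path) > limit }
       .map { |path| [File.basename(path), File.size(path), limit] }
  end
end

if __FILE__ == $PROGRAM_NAME
  offenders = oversized_assets("public/assets")
  offenders.each do |name, size, limit|
    warn format("%s is %d bytes (budget: %d)", name, size, limit)
  end
  exit(1) unless offenders.empty?  # non-zero exit fails the CI job
end
```

Wire it into CI as a step after the build; a failing exit status stops the merge, which is exactly the conversation the budget is meant to force.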

Where budgets get blown in Rails applications: adding a charting library "just for one page" that pulls in 400 KB of D3, including a rich text editor that bundles its own copy of ProseMirror, or importing a CSS framework alongside your existing stylesheets. Each of these is individually justifiable and collectively devastating. The budget forces the conversation before the merge.

Watch for hidden costs. Source maps, uncompressed assets served without gzip or Brotli, and duplicate dependencies in your JavaScript bundle all inflate what the user downloads without appearing in your source code.

Caching layers: fragment, HTTP and CDN

Rails gives you multiple caching layers, and using the right one (or the right combination) makes more difference than any amount of code optimisation. The layers, from closest to your code to closest to the user:

Fragment caching

Fragment caching stores rendered view partials in your cache store (usually Redis or Memcached). When the same partial is requested again with the same cache key, Rails serves the stored HTML instead of re-rendering it.

<% cache product do %>
  <%= render product %>
<% end %>

The cache key is derived from the object's cache_key_with_version, which includes the updated_at timestamp. When the product changes, the cache key changes and the old fragment is ignored.
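For intuition, the shape of that key can be reproduced in plain Ruby. This is a simplified sketch, not the real implementation — ActiveRecord's version also handles unpersisted records and recyclable cache keys — but it shows why touching a record invalidates its fragment.

```ruby
# Simplified sketch of a versioned fragment cache key: model prefix,
# record id, and updated_at with microsecond precision. The real
# logic lives in ActiveRecord's cache_key_with_version.
def sketch_cache_key(prefix, id, updated_at)
  "#{prefix}/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S%6N')}"
end

sketch_cache_key("products", 42, Time.utc(2024, 5, 1, 12, 0, 0))
# => "products/42-20240501120000000000"
```

Because the timestamp is part of the key, an update never has to delete the old fragment — it simply stops being read and eventually expires.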

Fragment caching is your first line of defence against slow view rendering. It is most effective for partials that are expensive to render (complex logic, multiple database queries, nested associations) and that do not change on every request. A product listing page that renders 50 products, each with category, pricing and review data, can go from 200ms of rendering time to 5ms with fragment caching.

The trap: fragment caching hides N+1 queries. If your partial triggers lazy-loaded associations, the first render is slow but subsequent renders are fast — until the cache expires and the N+1 problem comes back. Fix the underlying query, then add caching.
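Both fixes combine naturally. A sketch, assuming @products is eager-loaded in the controller: with cached: true, Rails renders the collection in one pass and fetches all existing fragments from the cache store with a single multi-read.

```erb
<%# Controller (sketch): @products = Product.includes(:category, :reviews) %>
<%= render partial: "product", collection: @products, cached: true %>
```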

HTTP caching

HTTP caching uses response headers to tell browsers and intermediary proxies how long to store a response. Rails provides this through stale?, fresh_when and manual Cache-Control header setting.

def show
  @product = Product.find(params[:id])
  fresh_when @product
end

This sets ETag and Last-Modified headers. On subsequent requests, the browser sends conditional headers and Rails returns 304 Not Modified without re-rendering the response if the product has not changed. The server still processes the request and checks the database, but skips rendering and response body generation.
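The same mechanism extends to collections via stale?. A sketch with an explicit ETag and Last-Modified — the action and variable names are illustrative:

```ruby
def index
  @products = Product.order(updated_at: :desc)
  # When the client's conditional headers match, stale? sends
  # 304 Not Modified and returns false, so render is skipped.
  if stale?(etag: @products, last_modified: @products.maximum(:updated_at))
    render :index
  end
end
```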

For pages that are truly static between deploys — your about page, your terms page — you can set aggressive Cache-Control headers and skip the conditional check entirely. The browser serves the cached version without contacting your server at all.
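A sketch of such a header using Rails' expires_in helper, which writes Cache-Control for you (the action name is illustrative):

```ruby
def terms
  # Cache-Control: public, max-age=86400 -- browsers and shared
  # caches may serve this for a day without revalidating.
  expires_in 1.day, public: true
end
```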

CDN caching

A CDN caches your responses at edge nodes close to the user, eliminating the network round trip to your origin server. For static assets, CDN caching is straightforward: set long Cache-Control headers and use fingerprinted filenames so new deploys serve new URLs.

For HTML pages, CDN caching requires more thought. You need to decide which pages can be cached at the edge (public pages with no user-specific content) and which must always hit your origin (authenticated pages, pages with personalised content). Getting this wrong means serving one user's personalised content to another, which is both a bug and a privacy violation.

Cloudflare, Fastly and CloudFront all support cache key customisation, allowing you to cache different versions of a page based on headers, cookies or query parameters. But complexity here is a maintenance burden — start with static assets and public pages, and only add dynamic page caching when you have the monitoring to verify it is working correctly.

Database query impact on TTFB

Time to First Byte is the interval between the browser sending a request and receiving the first byte of the response. For a server-rendered Rails application, TTFB is dominated by three things: routing and middleware overhead (small and mostly fixed), view rendering time, and database query time.

Database query time is almost always the largest component. A page that runs 15 queries averaging 10ms each spends 150ms just waiting on PostgreSQL before rendering starts. Add N+1 queries and that number climbs rapidly.

The diagnostic tool: rack-mini-profiler in production-safe mode, or your APM tool's transaction trace view. Both show you exactly which queries ran during a request, how long each took, and where in your code they originated.

Reducing query impact on TTFB:

  1. Eliminate N+1 queries. Use includes, preload or eager_load to batch association loading. The bullet gem detects N+1 queries in development.
  2. Add appropriate indexes. A missing index on a foreign key column used in a WHERE clause can turn a 2ms query into a 200ms sequential scan. See the PostgreSQL Indexing for Rails guide.
  3. Reduce query count. Sometimes the answer is not making each query faster but running fewer queries. Counter caches, denormalisation and materialised views all trade write complexity for read speed.
  4. Move slow queries out of the request cycle. If a dashboard page aggregates data across millions of rows, pre-compute the aggregation in a background job and serve the cached result.
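The payoff of batching (point 1) is visible even without a database. This plain-Ruby sketch uses a fake query counter to stand in for PostgreSQL — none of these names are Rails API, but the shape matches what includes and preload do:

```ruby
# Fake data source that counts queries, standing in for the database.
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # One call = one query, however many ids it receives (an IN list).
  def authors_for(post_ids)
    @query_count += 1
    post_ids.to_h { |id| [id, "author-#{id}"] }
  end
end

post_ids = (1..20).to_a

# N+1 pattern: one lookup per post.
n_plus_one_db = FakeDB.new
post_ids.each { |id| n_plus_one_db.authors_for([id]) }
n_plus_one_db.query_count  # => 20

# Batched pattern: one IN query for every post's author.
batched_db = FakeDB.new
batched_db.authors_for(post_ids)
batched_db.query_count     # => 1
```

At 10ms per round trip, that is the difference between 200ms and 10ms of query time for one association.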

Turbo and Hotwire performance

Turbo Drive, Turbo Frames and Turbo Streams change the performance characteristics of a Rails application in ways that are mostly positive but require attention.

Turbo Drive intercepts link clicks and form submissions, replacing full page loads with fetch requests that swap the page body. This eliminates the browser's HTML parsing, CSS re-evaluation and JavaScript re-execution overhead on navigation. For applications with substantial CSS and JavaScript, this alone can make navigation feel instantaneous.

But Turbo Drive does not reduce your server processing time. Every navigation still hits your Rails server, runs your controller action, renders your view and returns a full HTML response. If your server response time is 500ms, Turbo Drive makes the transition smoother but the user still waits 500ms for new content.

Turbo Frames allow partial page updates. A frame loads or replaces only a section of the page, which means you can render a smaller template and transfer less data. This is genuinely faster when used well — a frame that loads a comment form can return 2 KB instead of the 50 KB full page response.

The performance trap with Turbo Frames: lazy-loaded frames that trigger separate requests for each frame on the page. If your page has eight lazy Turbo Frames, the browser makes eight additional requests after the initial page load. Each one has its own round trip, its own server processing time and its own rendering cost. Eight frames × 100ms each is 800ms of additional loading time that the user sees as progressive content appearing in chunks. Sometimes that is acceptable. Often it is worse than loading everything in the initial response.

Turbo Streams over WebSocket add real-time updates without polling. The performance consideration here is connection management — each connected user holds a WebSocket connection to your server, which consumes memory and file descriptors. At scale, this requires tuning your Action Cable configuration and potentially introducing a dedicated WebSocket server.

Lazy loading done right

Lazy loading defers the loading of off-screen images, iframes or content until the user scrolls near them. The browser-native loading="lazy" attribute handles images and iframes without JavaScript.

<%= image_tag product.photo, loading: "lazy", width: 400, height: 300 %>

Always include width and height attributes on lazy-loaded images. Without them, the browser does not know how much space to reserve, and the image loading causes layout shift (hurting your CLS score).

Do not lazy-load images that are visible in the initial viewport — your hero image, your logo, your above-the-fold product photo. These should load eagerly so they are available for the first paint. Lazy loading above-the-fold images makes LCP worse, not better.

For Turbo Frame-based lazy loading of content sections, the same principle applies: content the user sees immediately should be in the initial response. Content below the fold or behind a tab can be lazy-loaded with a Turbo Frame that fetches on demand.
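A sketch of a lazy frame that reserves its space up front so the arriving content does not shift the layout — the path helper and the min-height value are illustrative:

```erb
<turbo-frame id="comments" src="<%= comments_path(@product) %>" loading="lazy">
  <%# Placeholder approximates the frame's final height, so the
      loaded content replaces it without pushing the page around. %>
  <div style="min-height: 200px">Loading comments…</div>
</turbo-frame>
```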

What usually goes wrong

The performance problems I see repeatedly in Rails applications:

Ignoring TTFB until it is unbearable. Developers focus on frontend metrics while the server takes 1.2 seconds to respond. No amount of image compression fixes a slow server.

Caching without monitoring cache hit rates. Fragment caching is only useful if the cache actually gets hit. If your cache keys change on every request because of a volatile dependency, you are paying the overhead of cache writes without the benefit of cache reads.
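A quick way to check, assuming Redis as the cache store (requires the redis gem). Redis reports hits and misses server-wide since startup, so run this against the dedicated cache instance, not one shared with Sidekiq:

```ruby
require "redis"

stats  = Redis.new.info("stats")
hits   = stats["keyspace_hits"].to_f
misses = stats["keyspace_misses"].to_f

# Rough hit rate across all keyspace reads since the server started.
puts format("cache hit rate: %.1f%%", hits / (hits + misses) * 100)
```

A fragment cache that hovers below roughly 80% hits is usually a sign of volatile cache keys, not insufficient memory.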

Adding JavaScript "just this once." Every library added to the bundle stays forever. Three months of "just this once" turns your 80 KB bundle into 400 KB.

Not setting explicit image dimensions. This causes CLS problems that are invisible in development (where images load instantly from localhost) but visible to real users on slower connections.

Turbo Frame waterfalls. Multiple lazy frames that each make a separate server request, turning one page load into eight sequential network requests.

Skipping gzip/Brotli compression. Serving uncompressed assets over the wire because the web server or CDN was not configured for compression. This doubles or triples transfer sizes for text-based assets.

Checklist summary

  • Measure your TTFB — if it exceeds 400ms, focus on server-side optimisation before touching the frontend
  • Run Lighthouse (from the DevTools Lighthouse panel or in CI) and review its Core Web Vitals scores
  • Set an asset budget and add a CI check to enforce it
  • Implement fragment caching on expensive partials, starting with the slowest pages
  • Set HTTP caching headers on static and semi-static pages
  • Configure your CDN to cache static assets with fingerprinted URLs
  • Add loading="lazy" to below-the-fold images with explicit width and height
  • Audit Turbo Frame usage for lazy-loading waterfalls
  • Enable gzip or Brotli compression on your web server and CDN
  • Profile database queries on your slowest pages using rack-mini-profiler or your APM tool

Frequently asked questions

How do I measure TTFB for my Rails application?

Browser DevTools shows TTFB in the Network panel as "Waiting for server response" for any request. For aggregate data, use your APM tool (New Relic, Scout, Datadog) or Real User Monitoring. The rack-mini-profiler gem gives per-request breakdowns in development and production.

Is Turbo Drive always faster than full page loads?

For most navigation, yes — it avoids re-parsing CSS and re-executing JavaScript. But if your JavaScript is minimal and your pages are small, the difference is negligible. Turbo Drive also introduces complexity around caching and page state that can cause subtle bugs. It is a net win for most applications, but not a free lunch.

Should I use Redis or Memcached for fragment caching?

Redis is the more common choice in the Rails ecosystem because it also serves as the backend for Sidekiq, Action Cable and other components, reducing operational complexity. Memcached is simpler and slightly faster for pure cache workloads. If you are already running Redis for background jobs, use it for caching too.

How do I know if my CDN is actually caching my assets?

Check the response headers. Cloudflare returns CF-Cache-Status: HIT for cached responses. CloudFront returns X-Cache: Hit from cloudfront. If you are seeing MISS on every request for fingerprinted assets, your cache headers or CDN configuration need attention.
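From the command line, a quick check might look like this — the URL is a placeholder for one of your fingerprinted assets:

```shell
# Request only the headers and look for the CDN's cache status.
curl -sI https://example.com/assets/application-abc123.css \
  | grep -iE 'cf-cache-status|x-cache|age|cache-control'
```

Request the same URL twice; the first response is often a legitimate MISS that primes the edge cache, and the second should be a HIT.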

What is a reasonable TTFB target for a Rails application?

Under 200ms for simple pages, under 400ms for complex pages with database queries, under 100ms for cached pages. If any page consistently exceeds 600ms, it needs investigation. These numbers assume a server geographically close to the user — add network latency for distant users, which is where CDN edge caching helps.