Caching with Varnish + ESI

If your website has almost no dynamic content, you can cache all of its pages and the backend will rarely be hit. But what if the website serves personalized data (authorization, a user block, banners)?

ESI

ESI (Edge Side Includes) lets you split a page into logical blocks; the web server makes an additional request for each block when it processes the page. It looks pretty simple:

<HTML>
<BODY>

Hello world!

<esi:include src="/news.php?UID"/>

Goodbye!!!

</BODY>
</HTML>

A web server with ESI support simply makes an additional request and substitutes the result for the ESI instruction. Let's say the "news.php" script contains the following code:
<?php
echo "<ul><li>News title 1</li></ul>";

After processing the first example, the web server returns the following page to the client:

<HTML>
<BODY>

Hello world!

<ul><li>News title 1</li></ul>

Goodbye!!!

</BODY>
</HTML>
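
The substitution step can be illustrated with a small Python sketch. It is only a toy model of what an ESI-capable server does, not Varnish's implementation: the `render_esi` function and the `fake_backend` stand-in are hypothetical names introduced here, and only self-closing `esi:include` tags are handled.

```python
import re

# Matches self-closing ESI include tags such as <esi:include src="/news.php?UID"/>
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]*)"\s*/>')


def render_esi(page, fetch):
    """Replace every ESI include in `page` with the body returned by `fetch(src)`."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), page)


def fake_backend(src):
    """Hypothetical stand-in for the backend: maps a URL to its fragment."""
    fragments = {"/news.php?UID": "<ul><li>News title 1</li></ul>"}
    return fragments[src]


page = 'Hello world!\n<esi:include src="/news.php?UID"/>\nGoodbye!!!'
print(render_esi(page, fake_backend))
```

Running it prints the same three lines of body text as the processed page above, with the include replaced by the news list.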

ESI requests can be cached, which gives you a convenient tool for working with dynamic content. All you need to do is divide each page into several blocks and cache those that are static or change infrequently.

How does it work?

A web server that supports ESI sends a request to the backend (in our case, PHP). After receiving the page, it handles every ESI include, making a separate request for each one to fetch the required content. Each response is then stored in the cache (or not, depending on the rules) and the final page is assembled. On subsequent requests the web server takes the data from the cache, so no additional ESI requests reach the backend.
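
The flow described above can be modelled in a few lines of Python. This is a rough sketch under simplifying assumptions: the `EsiCache` class and the `backend` function are hypothetical, cache expiry (TTL) is ignored, and every response is cached unconditionally.

```python
import re

ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]*)"\s*/>')


class EsiCache:
    """Toy model of an ESI-aware cache sitting in front of a backend."""

    def __init__(self, backend):
        self.backend = backend      # callable: url -> response body
        self.store = {}             # url -> cached response body
        self.backend_calls = 0

    def fetch(self, url):
        """Return the cached body for `url`, asking the backend only on a miss."""
        if url not in self.store:
            self.backend_calls += 1
            self.store[url] = self.backend(url)
        return self.store[url]

    def serve(self, url):
        """Fetch the page, then resolve each ESI include through the same cache."""
        page = self.fetch(url)
        return ESI_INCLUDE.sub(lambda m: self.fetch(m.group(1)), page)


def backend(url):
    """Hypothetical backend with one page and one fragment."""
    pages = {
        "/": 'Hello! <esi:include src="/news.php"/>',
        "/news.php": "<ul><li>News title 1</li></ul>",
    }
    return pages[url]


cache = EsiCache(backend)
first = cache.serve("/")
calls_after_first = cache.backend_calls    # the page plus one fragment
second = cache.serve("/")                  # answered entirely from the cache
print(first)
print(calls_after_first, cache.backend_calls)
```

The first request costs two backend calls (the page and its fragment); the second request costs none, which is where the speed-up comes from.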

ESI enabling procedure

  1. First, identify the blocks on the page and move each one into a separate script (each block must have its own URL).
  2. Replace the removed blocks with ESI instructions.
  3. Enable caching of ESI blocks on the web server.

Personalized blocks

The most difficult task is caching blocks that display unique content for each user, for example blocks with personal profile links, settings, or preview images. Such blocks have to be stored in the cache separately for each user.

Bear in mind, though, that the number of cached blocks grows in proportion to the number of users:

total number = number of users × number of personalized blocks on page
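
To get a feel for the numbers, the formula can be wrapped in a small Python helper. The user and block counts below are made up for illustration, and the `shared_blocks` term (one cached copy per block that is the same for everyone) is an addition of this sketch, not part of the formula above.

```python
def cached_objects(users, per_user_blocks, shared_blocks=0):
    """Upper bound on cached block copies: one copy per user for each
    personalized block, plus a single copy of each shared block."""
    return users * per_user_blocks + shared_blocks


# Hypothetical site: 10,000 logged-in users, 3 personalized blocks per page.
print(cached_objects(10_000, 3))       # 30000

# Personalizing only one block and sharing the other two shrinks the cache.
print(cached_objects(10_000, 1, 2))    # 10002
```

This is why it pays to keep the personalized part of a page as small as possible.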

Detailed example

Suppose we have a news website where the news is updated every hour. The website also has an authorization block and links for authorized users. The following blocks can be defined for ESI:

  • Authorization block
  • Menu
  • News block

Start page
<HTML>
<BODY>

<h1>ESI test</h1>

<esi:include src="/app/auth.php?UID"/>
<esi:include src="/app/menu.php?AUTH"/>

<h3>News</h3>
<esi:include src="/app/news.php"/>

</BODY>
</HTML>

# All scripts for ESI calls are located in the app folder

Authorization script
<?php

session_start();

// On login, store the submitted name in the session and go back to the start page
if ( !empty($_POST['user']) )
{
	$_SESSION['user'] = $_POST['user'];
	header('Location: /'); exit;
}

// null when the visitor is not logged in
$user = $_SESSION['user'] ?? null;

?>

<?php if ( $user ) { ?>
	<div>Hello, <b><?= htmlspecialchars($user) ?></b>!</div>
<?php } else { ?>
	<form method="post" action="/app/auth.php">
		Please, log in
		<input type="text" name="user" />
		<input type="submit" value="Log in">
	</form>
<?php } ?>

Menu script
<?php session_start(); ?>

<ul>
	<?php if ( !empty($_SESSION['user']) ) { ?>
		<li><a href="#">Menu item only for authorized users</a></li>
	<?php } ?>
	<li><a href="#">Public menu item</a></li>
</ul>

News script

<?php

// Fetch and parse the RSS feed
$rss = file_get_contents('http://feeds.nytimes.com/nyt/rss/HomePage');
$xml = simplexml_load_string($rss);

echo "<ul>";
foreach ( $xml->channel->item as $item )
{
	// Escape feed values before placing them into HTML
	$link  = htmlspecialchars((string) $item->link);
	$title = htmlspecialchars((string) $item->title);
	echo "<li><a href=\"{$link}\">{$title}</a></li>";
}
echo "</ul>";

Web server configuration

Nginx serves the application on port 8090, and Varnish forwards requests to it:

server {
    listen 8090;

    # Varnish has to receive uncompressed responses to parse ESI tags, so keep gzip off
    gzip off;

    location / {
        index index.php;
    }

    location ~* \.(php)$ {
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME  /home/golotyuk/www/localhost/esi/$fastcgi_script_name;
    }
}

Varnish + ESI

Let's configure the cache according to the following rules:

  • The start page is cached for 24 hours, blocks for 1 hour.
  • All requests except POST are cached.
  • For personalized blocks, the session cookie (PHPSESSID) is used to build the cache key.
  • To distinguish personalized blocks from blocks shared by all authorized users, markers in the request URL are used: UID (personalized blocks) and AUTH (shared blocks that depend only on whether the user is logged in).
Configuration (VCL syntax of Varnish 2.0; newer releases renamed several of these statements):

backend default { .host = "127.0.0.1"; .port = "8090"; }


# Cache key generation algorithm
sub vcl_hash {
       # Default parameters - server name and URL
        set req.hash += req.url;
        set req.hash += req.http.host;

       # If there is a session cookie, its value is stored in a request header
        if( req.http.cookie ~ "PHPSESSID" ) {
            set req.http.X-Varnish-Hashed-On =
                regsub( req.http.cookie, "^.*?PHPSESSID=([^;]*?);*.*$", "\1" );
        }

        # If request string contains "UID", then session value
        # should be added to cache parameters
        if( req.url ~ "/app/.*UID" && req.http.X-Varnish-Hashed-On ) {
             set req.hash += req.http.X-Varnish-Hashed-On;
        }

        # If request string contains "AUTH", then status flag (logged in)
        # should be added to cache parameters
        if( req.url ~ "/app/.*AUTH" && req.http.X-Varnish-Hashed-On ) {
            set req.hash += "logged in";
        }

        hash;
}

sub vcl_recv {
        # If request type is not POST, then the object is looked for in cache
        if ( req.request != "POST" )
        {
                lookup;
        }
}

sub vcl_fetch {
    # The start page ("/") is ESI-processed and cached for 1 day
    if (req.url == "/") {
        esi;
        set obj.ttl = 24h;
    }
    # For "/app" requests (ESI calls), the result is cached for 1 hour
    elseif (req.url ~ "^/app/") {
        set obj.ttl = 1h;
    }

    deliver;
}
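
To make the caching rules easier to follow, here is a Python sketch that roughly mirrors the decisions in `vcl_hash` and `vcl_fetch` above. It is an illustration, not a replacement for the VCL: the function names and the `example.com` host are hypothetical, and an empty PHPSESSID value is treated as "no session", which may differ from how Varnish handles that edge case.

```python
import re


def session_id(cookie_header):
    """Extract the PHPSESSID value from a Cookie header, or None if absent."""
    match = re.search(r"PHPSESSID=([^;]*)", cookie_header or "")
    return match.group(1) if match else None


def cache_key(host, url, cookie_header=None):
    """Build the key parts in the same order as vcl_hash: URL, host, then the
    session ID for UID blocks or a logged-in flag for AUTH blocks."""
    parts = [url, host]
    sid = session_id(cookie_header)
    if sid and re.search(r"/app/.*UID", url):
        parts.append(sid)            # one cache entry per session
    if sid and re.search(r"/app/.*AUTH", url):
        parts.append("logged in")    # one entry shared by all sessions
    return tuple(parts)


def ttl_seconds(url):
    """TTL rules from vcl_fetch: 24 hours for the start page, 1 hour for blocks."""
    if url == "/":
        return 24 * 3600
    if url.startswith("/app/"):
        return 3600
    return None                      # anything else is left to the defaults


alice = cache_key("example.com", "/app/auth.php?UID", "PHPSESSID=aaa; theme=dark")
bob = cache_key("example.com", "/app/auth.php?UID", "PHPSESSID=bbb")
menu_alice = cache_key("example.com", "/app/menu.php?AUTH", "PHPSESSID=aaa")
menu_bob = cache_key("example.com", "/app/menu.php?AUTH", "PHPSESSID=bbb")
menu_anon = cache_key("example.com", "/app/menu.php?AUTH")

print(alice != bob)             # True: UID blocks are cached per session
print(menu_alice == menu_bob)   # True: AUTH blocks are shared between sessions
print(menu_anon != menu_alice)  # True: visitors without a session get their own copy
```

Note that the AUTH flag is derived from the mere presence of a session cookie, exactly as in the VCL, not from whether the session actually holds a logged-in user.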

After testing performance we get the following results:

ab -n 100 -c 5 http://127.0.0.1/
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:     2    5   3.8      3      18
Waiting:        2    5   3.8      3      18
Total:          2    5   3.8      4      19


Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      5
  80%      8
  90%     12
  95%     15
  98%     17
  99%     19
 100%     19 (longest request)

For comparison, here is a similar script without ESI, which contains the same logic inline and calls PHP on every request:

ab -n 100 -c 5 http://127.0.0.1:8090/index_standard.php

Results:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   354  579 666.2    458    5484
Waiting:      354  579 666.2    458    5483
Total:        354  579 666.2    458    5484

Percentage of the requests served within a certain time (ms)
  50%    458
  66%    492
  75%    517
  80%    539
  90%    602
  95%    667
  98%   3572
  99%   5484
 100%   5484 (longest request)

As you can see, the mean response time drops from 579 ms to 5 ms, a more than 100-fold performance increase!

The bottom line

ESI makes caching possible for dynamic websites with large amounts of personalized content. The SSI + Nginx combination is an alternative worth considering.
