Amazon S3 pricing and cost optimization

Amazon S3 provides virtually unlimited storage, and for a small web project it is very cheap. But as the amount of data grows, you pay not only for storage but also for traffic and for GET, PUT, COPY, POST, and LIST requests.

Price

Current rates are listed on the AWS pricing page, and you can estimate your bill with the AWS calculator.

The free tier gives you 5 GB of storage, 20,000 GET requests, and 2,000 PUT requests for the first year. After that, a terabyte of storage costs about $30 a month, and each terabyte of outbound traffic about $90. A website with 100,000 views a day and 5 photos on every page generates roughly that much traffic, and if its popularity grows tenfold, the bill rises to about $1,000 a month.

Cloudfront

Amazon S3 is not a CDN. For a popular web application it is much cheaper to serve static files from your own servers or through a separate CDN provider.

Amazon has its own content delivery network, CloudFront. It is faster and cheaper than S3, and it is integrated with other AWS services.

Data transfer from S3 to CloudFront is free, pulling data into the CDN from an external origin costs only $20 per terabyte, and distributing the first 10 TB per month costs about $850. The per-terabyte price decreases as traffic grows.

The easiest way to set up CloudFront is through the AWS web console, or with the CLI:

aws cloudfront create-distribution \
  --origin-domain-name my-bucket.s3.amazonaws.com \
  --default-root-object index.html

# --origin-domain-name specifies the S3 bucket (or other origin) to serve from

Caching

Generated traffic has the most significant effect on the S3 bill.

Caching therefore reduces traffic and, with it, the monthly bill from Amazon: files are uploaded to S3 but served through your own cache server.

Varnish is a good fit for this purpose, since it works well in front of AWS:

backend s3 {
  .host = "s3.amazonaws.com";
  .port = "80";
}

sub vcl_recv {
  if (req.url ~ "\.(gif|jpg|jpeg|png)$") {
      unset req.http.cookie;
      unset req.http.cache-control;
      unset req.http.pragma;
      unset req.http.expires;
      unset req.http.etag;

      set req.backend = s3;
      set req.http.host = "test-bucket.s3.amazonaws.com";

      return (lookup);
  }
}

sub vcl_fetch {
    set beresp.ttl = 3w;
}

# Caches images from test-bucket for three weeks

BitTorrent

If you plan to serve large files from S3, it makes sense to take advantage of its built-in BitTorrent support. With a peer-to-peer network, each client both downloads and redistributes files, reducing your traffic.

Getting a torrent file is simple: just append ?torrent to the file's URL:

# Before
http://s3.amazonaws.com/test_bucket/somefile


# After
http://s3.amazonaws.com/test_bucket/somefile?torrent

# S3 returns a .torrent file that clients can open in any BitTorrent client

Glacier

If S3 is used for storing backups, consider Amazon Glacier, its archival storage service. It cannot replace S3, but it is well suited for rarely used files that are not needed in real time.

Given its limitations, Glacier is notably much cheaper than S3: storing 1 TB costs about $7 per month, while retrieving 1 TB costs about $90.
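At the prices quoted above, the storage savings are easy to see; for example, for a hypothetical 10 TB of backups:

```shell
# Monthly storage bill for 10 TB of backups, at the per-TB prices quoted
# in the text: S3 $30/TB-month, Glacier $7/TB-month
tb=10
s3_cost=$((tb * 30))
glacier_cost=$((tb * 7))
echo "S3: \$${s3_cost}/month, Glacier: \$${glacier_cost}/month"
```

The retrieval fee means the math only works out for data you almost never read back.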

To enable it, create a lifecycle policy that includes a transition to Glacier:

<LifecycleConfiguration>
    <Rule>
        <ID>sample-rule</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <Transition>
            <Days>30</Days>
            <StorageClass>GLACIER</StorageClass>
        </Transition>
        <Expiration>
            <Days>365</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>

# Automatically transitions objects to Glacier after 30 days and deletes them after a year
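The AWS CLI expects the same rule as JSON rather than XML. A minimal sketch of the equivalent configuration (the file name is arbitrary):

```shell
# Write the lifecycle rule as JSON, the format the AWS CLI accepts
cat > lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "sample-rule",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
    "Expiration": {"Days": 365}
  }]
}
EOF

# Sanity-check that the file parses and the transition targets Glacier
storage_class=$(python3 -c 'import json; print(json.load(open("lifecycle.json"))["Rules"][0]["Transitions"][0]["StorageClass"])')
echo "$storage_class"
```

It could then be applied with `aws s3api put-bucket-lifecycle-configuration --bucket your-bucket --lifecycle-configuration file://lifecycle.json`, where your-bucket is a placeholder.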

The bottom line

Do not use Amazon S3 as a CDN; caching through separate low-cost servers and enabling BitTorrent for large files will save money without compromising your web application's performance.
