How To HTTP Download Files In S3

As part of a project I've been working on, we host the vast majority of assets on S3 (Simple Storage Service), one of the storage solutions provided by AWS (Amazon Web Services).

Since this is a web project, we've also got a CloudFront CDN in front of the storage bucket, to ensure really fast delivery of content to users irrespective of their location on the planet. This is really rather easy to set up in AWS, and since I've been working on the opposite side of the globe to the data centre for the last three months, I've really noticed the difference a CDN makes.

Uploading the assets to S3 is performed via the admin interface the team built. The files are uploaded directly to S3 using the signed URLs feature. This means that our app can pass a URL to the client, to which that browser can upload a file. This is a really helpful feature of S3, as our web app never needs to see the data: it just gets uploaded straight to S3, rather than uploading to our app and then on to S3 afterwards.

The filenames of uploaded image assets are irrelevant: they are used directly in the HTML, so end users won't care about their names. This is a good thing, because careful choice of file names is the easiest way to prevent CDN caching problems:

Imagine I've uploaded a file named hello_sam.jpg to S3, and it gets served through the CDN. If I later discover a better image to use, and replace hello_sam.jpg with this new version, then how does the CDN know that it should re-request the new image from the source? It doesn't. At some point each endpoint will request the image from the source, but in the meantime, different users around the world will be seeing a different image. This is incredibly hard to diagnose.

There are lots of different approaches we could have taken to avoid this issue, but we chose a very simple one: at upload time, we name the file with a UUID. These aren't very user friendly, but it doesn't matter, right?

Wrong.

Not all files are images

This approach works fine for all files that are consumed solely by a machine, but some of the assets are meant to involve the user in some respect. For example, PDFs of books and zip files of accompanying materials.

Providing a user with a PDF named 56174a62-c1f7-4b42-8e95-cad71237d123.pdf is a pretty shitty experience, when it should probably be called 101_Things_You_Didn't_Know_About_Cheese.pdf.

At this point, I can finally introduce the other problem associated with the system I've outlined thus far.

Depending on the configuration of the browser, when a user visits the link pointing to that PDF (whether it be through S3 directly, or through the CDN in front of it), it's quite likely to open up inline.

Another shitty experience.

101 Things You Didn't Know About Cheese is a book that requires time to properly digest. And readers are sure to be reaching back into this weighty tome for years to come. They don't want to view it in Chrome like an animal.

Wouldn't it be great if there were a way to solve all of these problems:

  • Retain the amazing file naming scheme to prevent CDN caching problems
  • Automatically state that files should be downloaded instead of viewed inline
  • Specify the filename of the downloaded file

Content-Disposition to the Rescue

Every request that's made over HTTP includes not only the content (the stuff we actually see), but also a load of headers. Think of these as metadata fields that describe the nature and purpose of the request/response.

One of these headers is known as Content-Disposition, and it describes what the recipient should do with the content: should it be displayed inline in the browser, or downloaded as an attachment and saved as a file.

This is precisely what I'm looking for. It'll let me keep the cache-busting file naming scheme, whilst also forcing downloads and specifying pretty file names.

The syntax is really simple:

    Content-Disposition: attachment; filename="filename.jpg"
  • attachment specifies that the file should be downloaded instead of being displayed inline, which would be specified with inline.
  • filename= specifies what the downloaded file should be named.
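Tying the two ideas together in Ruby: the object is stored under a UUID-based key for cache busting, while the friendly name lives only in the header. This is a minimal sketch, and original_name is just an illustrative stand-in for whatever the user uploaded:

```ruby
require 'securerandom'

# Illustrative original filename from the user's upload.
original_name = "101_Things_You_Didn't_Know_About_Cheese.pdf"

# The object is stored under a UUID-based key, keeping the CDN cache honest...
object_key = "#{SecureRandom.uuid}#{File.extname(original_name)}"

# ...while the friendly name only appears in the Content-Disposition header.
content_disposition = "attachment; filename=\"#{original_name}\""

puts object_key  # e.g. "56174a62-c1f7-4b42-8e95-cad71237d123.pdf"
puts content_disposition
```

Replacing an asset means uploading under a fresh UUID, so the CDN never has a stale copy to serve, yet the user still downloads a sensibly named file.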

Now that I know what the header should be, how can I get it into S3?

If you've already uploaded a file to S3, you can locate it in the S3 management console, and then see its metadata. This includes the headers that will be sent to a client when it is requested.

Content-Type will already be populated: this would have been sent by your browser when you uploaded the file in the first place.

It's relatively simple to add the Content-Disposition header to this metadata list too:

Content-Disposition in the S3 Management Console

Note: This image is taken from what's currently (December 2016) known as the "New S3 Management Console". The old one looks quite different, but offers identical functionality in this regard.

Once you've made this update, the file will automatically be downloaded and have the friendly name next time you access it. Although, as ever, CDNs may take some time to update their caches, depending on how you have them configured.

This can hardly be thought of as "scalable" though. Every time somebody uploads a file that should be downloaded through our admin interface, somebody has to pop into the S3 console and update the Content-Disposition header on that file. I certainly shan't be signing up for that.

Specifying Content-Disposition at upload time

When you upload a file to S3, it stores the relevant headers as metadata. For example, the Content-Type field that you saw in the previous section.

Therefore, we can reduce this problem to simply specifying the Content-Disposition header at the same time. Easy.

It is actually fairly easy, although it took me a while to figure out precisely what we needed to do.

As I mentioned before, we use signed S3 URLs for the uploading process. We have an endpoint in the app which generates this URL and passes it back to the client, which in turn attempts to upload the file the user selected to this URL.

We only needed to augment this process with the additional headers.

Our app is written in Ruby, so we use the AWS SDK for Ruby to generate the signed URL. In addition to specifying the uploaded filename and the HTTP method, you can also add signed headers.

    def self.sign(filename = nil)
      return if filename.blank?

      resources = Aws::S3::Resource.new
      object = resources.bucket(ENV['AWS_BUCKET_NAME'])
                        .object("#{SecureRandom.uuid}#{File.extname(filename)}")

      headers = {
        'Content-Disposition' => "attachment; filename=\"#{filename}\""
      }

      {
        signed_url: object.presigned_url(:put, presigned_params(headers)),
        public_url: object.public_url,
        headers:    headers
      }
    end

    def self.presigned_params(headers = {})
      params = { acl: 'public-read' }
      headers.each do |header, value|
        params[header.parameterize.underscore.to_sym] = value
      end
      params
    end

There are a few things to note here:

  • The sign method is called at the point that a user requests to upload a new file.
  • It creates a UUID-based file name, with an extension that matches the original file.
  • The headers hash defines the Content-Disposition header as specified above. This is sent back to the client, along with the signed URL for upload.
  • The signed URL uses the presigned_url method on the AWS S3 object, specifying that the PUT HTTP method will be used, and the additional parameters returned by the presigned_params method.
  • The acl value is not a header, but represents a canned access policy, here choosing public-read.
  • Each of the headers is converted into a symbol that matches the form used by the AWS presigner.
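That last conversion uses parameterize and underscore, which come from ActiveSupport rather than plain Ruby. Outside Rails, a stand-in helper (my own hypothetical one, not part of the app's code) that produces the same symbols would look like this:

```ruby
# Convert an HTTP header name into the keyword symbol the AWS presigner
# expects, e.g. 'Content-Disposition' becomes :content_disposition.
def header_to_param(header)
  header.downcase.tr('-', '_').to_sym
end

puts header_to_param('Content-Disposition')  # content_disposition
puts header_to_param('Content-Type')         # content_type
```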

This process generates a presigned URL that specifies:

  • The uploaded file name
  • The ACL of the created file
  • The HTTP method through which the file is uploaded
  • The headers that are passed in the upload request
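To see where these attributes live, here's a completely fabricated presigned URL of the general shape the SDK produces (every value is a placeholder, not a real credential or signature). Note that the X-Amz-SignedHeaders query parameter only names the signed headers; their values never appear in the URL itself:

```ruby
require 'cgi'
require 'uri'

# A fabricated example URL; the signature is a placeholder.
url = "https://example-bucket.s3.amazonaws.com/0a1b2c3d.pdf" \
      "?X-Amz-Algorithm=AWS4-HMAC-SHA256" \
      "&X-Amz-SignedHeaders=content-disposition%3Bhost" \
      "&X-Amz-Expires=900" \
      "&X-Amz-Signature=placeholder"

# CGI.parse decodes the query string, so %3B becomes a semicolon.
params = CGI.parse(URI.parse(url).query)
signed_headers = params['X-Amz-SignedHeaders'].first.split(';')

puts signed_headers.inspect  # ["content-disposition", "host"]
```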

It's important to note that the URL doesn't provide all of these attributes, but that they must be provided for the signature to match. Therefore we must send the headers along with the request.
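The browser sends those headers for us in the jQuery code below, but the same rule applies to any client. As a sketch of a server-side upload with Ruby's Net::HTTP (the signed_url and headers values are placeholders standing in for what the sign method returns, and the actual network call is left commented out):

```ruby
require 'net/http'
require 'uri'

# Placeholders standing in for what the sign method returns.
signed_url = "https://example-bucket.s3.amazonaws.com/0a1b2c3d.pdf?X-Amz-Signature=placeholder"
headers    = { 'Content-Disposition' => 'attachment; filename="cheese.pdf"' }

uri = URI.parse(signed_url)
request = Net::HTTP::Put.new(uri)

# Every header that was signed must be sent verbatim, or S3 rejects the
# upload with a signature mismatch.
headers.each { |name, value| request[name] = value }
request.body = "file contents go here"

# The actual send, omitted from this sketch:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```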

We use jQuery on the front-end to create the request for the presigned URL, and can now use it to perform the upload itself:

    $.ajax "/sign",
      type: 'GET'
      data:
        filename: file.name
      success: (data, status, xhr) =>
        $.ajax data.signed_url,
          type: 'PUT'
          contentType: file.type
          headers: data.headers
          data: @options.file
          processData: false
          success: (data, status, xhr) =>
            console.log "Upload successful"

The above CoffeeScript shows quite how simple this can be.

  • First, a request to the /sign endpoint to get the presigned URL. This requires the name of the file that's being uploaded
  • This invokes the Ruby sign function from above. When that returns, it provides a signed URL and a hash of headers
  • Perform a second AJAX call, this time to the presigned URL on AWS S3, pushing the file that needs to be uploaded. This uses the PUT HTTP verb, and specifies the headers that were sent back from the Ruby.
  • If this succeeds, the code above simply outputs a log message. In reality you would want to make sure to update your data model to include this newly uploaded file.

The really important part (and the part that I missed) was that I needed to specify the Content-Disposition header both to the URL signer, and at upload time.

When you try this out, and check out your uploaded file in the S3 management console, you'll see that it has the Content-Disposition header set correctly in the metadata field. When you try to access the file in your browser, it'll automatically download, with the filename you uploaded it as, even though it's stored (and accessed) via a different filename.

Magic!

Final thoughts

I like AWS a lot, and we use a lot of their services. The reason it took me a while to get this figured out was that although the documentation is comprehensive, it's often not all that clear. I hope that if you're trying to achieve the same effect, and have happened across this post, that it helps you out.

Our file uploading system is a little more complex than this, because not all our files are publicly accessible, yet they are all accessible via the CloudFront CDN. I'll add CloudFront signing for private content stored in S3 buckets to my list of things I might write about one day.

Big thanks to Mic Pringle for this post. We very much worked on this aspect of the site together, and now, without telling him, I've written it up as if it's all my own work.

If you've ever read anything I've written in the past, or seen me speak, you'll very quickly realise that nothing is my own work. My role is very much to find the most efficient way to join other people's work together such that it kinda does what I want it to (=
