venable/software

Spring RestTemplate for Large Files

Several years ago at Allogy, users were unable to upload large video files. This was disruptive for our customers and important to resolve. Because we had taken care in our APIs to handle large files correctly, the issue surprised us. In this particular case, however, uploads went through a Zuul gateway, and the gateway was the cause: it loaded the entire file into memory as a byte array before forwarding it to the correct service.

This article will show how we can handle large files in our Spring gateway server. We have had customers successfully upload video files of several hundred megabytes over the web using this solution.

Background

Most Spring APIs work well with standard models and controllers. These models are normally deserialized from JSON or XML, and the entire model fits comfortably in memory. A large file, however, should not sit in memory.

If you use Spring in the standard way with large files, you end up with a byte array of some sort, whether as a byte[] or wrapped in a ByteArrayInputStream. Either way, the service must load the entire file into memory in that single byte array.

Spring services typically do not store files themselves. They move them somewhere else: a local or remote file system, another service, or a storage service like AWS S3. At Allogy we always save our files to S3, so our Spring services either move a file to S3 or pass it to another service. To do this without taking up memory, use a plain InputStream that reads bytes from the uploaded file as they arrive and writes them to your destination (S3 or another service).

For this particular issue, we had Spring acting as a gateway server for the destination Media Service. You can see how this works below.

Overview

The Web Browser belongs to the customer who wishes to upload a file. It makes an HTTP request directly to our Spring Gateway, which in turn forwards the request to the Media Service. Not shown here: the Media Service uploads the file to S3, which is where we were already using InputStream correctly.

Overall Approach

For the issue we had with our Zuul gateway, we ended up not using Zuul. Instead we created a Spring request mapping to handle the upload. In our case we use POST requests. Here is some slightly modified code to show how we created a proxy endpoint in place of Zuul.

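// FORWARD_HEADERS (a set of lowercase header names) and filesRestTemplate
// (a RestTemplate configured for streaming) are defined elsewhere in the class.
@RequestMapping(value = "/my/path/{id}", method = RequestMethod.POST)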
public ResponseEntity<InputStreamResource> upload(
        @PathVariable String id,
        @RequestHeader HttpHeaders httpHeaders,
        @RequestBody InputStreamResource inputBodyStreamResource)
{
    RequestCallback requestCallback = request -> {
        httpHeaders.entrySet().stream()
                .filter(header -> FORWARD_HEADERS.contains(header.getKey().toLowerCase()))
                .forEach(entry -> request.getHeaders().put(entry.getKey(), entry.getValue()));

        IOUtils.copy(inputBodyStreamResource.getInputStream(), request.getBody());
    };

    ResponseExtractor<ResponseEntity<InputStreamResource>> responseExtractor =
            response -> {
                InputStreamResource responseInputStreamResource;
                if (response.getHeaders().getContentLength() > 0)
                {
                    responseInputStreamResource = new InputStreamResource(response.getBody());
                }
                else
                {
                    responseInputStreamResource = null;
                }

                return new ResponseEntity<>(responseInputStreamResource, response.getHeaders(), response.getStatusCode());
            };

    try
    {
        return filesRestTemplate.execute("http://localhost:8080/my/path/{id}",
                HttpMethod.POST, requestCallback, responseExtractor, id);
    }
    catch (HttpStatusCodeException ex)
    {
        return new ResponseEntity<>(ex.getStatusCode());
    }
}

Breaking it Down

First, we will look at the request callback.

RequestCallback requestCallback = request -> {
    httpHeaders.entrySet().stream()
            .filter(header -> FORWARD_HEADERS.contains(header.getKey().toLowerCase()))
            .forEach(entry -> request.getHeaders().put(entry.getKey(), entry.getValue()));

    IOUtils.copy(inputBodyStreamResource.getInputStream(), request.getBody());
};

The code above creates a custom RequestCallback that does two things.

  1. It forwards specific headers, as defined in a set of strings named FORWARD_HEADERS. Choose what you wish to forward on to the destination. This acts as a whitelist of headers to forward.
  2. It copies the incoming data without buffering all the data into memory or a byte array. This is the heart of the solution.
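To make the first point concrete, here is a minimal sketch of the header filtering on its own, using plain JDK maps instead of Spring's HttpHeaders. The contents of FORWARD_HEADERS and the class name ForwardHeaders are assumptions for illustration; the original set is not shown in this article, so pick the headers your downstream service actually needs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class ForwardHeaders {
    // Hypothetical allowlist. Names are lowercase because the filter
    // lowercases each incoming header name before the lookup.
    static final Set<String> FORWARD_HEADERS =
            Set.of("content-type", "content-length", "authorization");

    /** Returns only the headers whose names appear in FORWARD_HEADERS. */
    static Map<String, List<String>> filter(Map<String, List<String>> incoming) {
        Map<String, List<String>> forwarded = new LinkedHashMap<>();
        incoming.forEach((name, values) -> {
            if (FORWARD_HEADERS.contains(name.toLowerCase(Locale.ROOT))) {
                forwarded.put(name, values);
            }
        });
        return forwarded;
    }

    public static void main(String[] args) {
        Map<String, List<String>> incoming = new LinkedHashMap<>();
        incoming.put("Content-Type", List.of("video/mp4"));
        incoming.put("Cookie", List.of("session=abc"));
        incoming.put("Authorization", List.of("Bearer token"));
        System.out.println(filter(incoming).keySet()); // prints [Content-Type, Authorization]
    }
}
```

Anything not on the list, such as the Cookie header here, is silently dropped before the request reaches the destination.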

The callback above uses org.apache.commons.io.IOUtils::copy, but all you really need is code that copies data from an InputStream to an OutputStream without reading the entire stream into memory.
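As a rough illustration of such a copy using only the JDK, the sketch below moves bytes through one fixed-size buffer, so memory use stays constant regardless of the file size. The class name StreamCopy and the 8 KiB buffer size are assumptions for this example, not part of the production code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    // Hypothetical buffer size; 8 KiB is a common choice, not a requirement.
    private static final int BUFFER_SIZE = 8192;

    /** Copies everything from in to out through one reusable buffer; returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the upload and the outgoing request body.
        InputStream in = new ByteArrayInputStream(new byte[100_000]);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out)); // prints 100000
    }
}
```

On Java 9 and later, InputStream.transferTo(OutputStream) performs the same kind of buffered copy without a helper method.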

This next snippet from the function above handles the response.

    ResponseExtractor<ResponseEntity<InputStreamResource>> responseExtractor =
            response -> {
                InputStreamResource responseInputStreamResource;
                if (response.getHeaders().getContentLength() > 0)
                {
                    responseInputStreamResource = new InputStreamResource(response.getBody());
                }
                else
                {
                    responseInputStreamResource = null;
                }

                return new ResponseEntity<>(responseInputStreamResource, response.getHeaders(), response.getStatusCode());
            };

We want to return the exact response from the downstream service back to the client. You may wish to strip some headers, or pass through only selected ones. In this case we can simply return an InputStreamResource and Spring MVC will handle it nicely. In this way, the proxy need not know anything about what is being returned.

Setting Up the RestTemplate

There are a few other tricks you must be aware of.

Do not use an @Autowired RestTemplate from the application context. In many cases the RestTemplate in the application context has been given any number of interceptors, which are classes that implement ClientHttpRequestInterceptor. The problem is that once any interceptor is registered, the RestTemplate buffers the outgoing request body into a byte array before sending it. This defeats the whole purpose of streaming the data.

You can see where this happens in InterceptingHttpAccessor from which RestTemplate inherits.

Along the same lines, create your RestTemplate as follows so that the request factory does not buffer the body either.

RestTemplate restTemplate = new RestTemplate();
SimpleClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory();
requestFactory.setBufferRequestBody(false);
restTemplate.setRequestFactory(requestFactory);

With bufferRequestBody set to false, SimpleClientHttpRequestFactory streams the request body instead of collecting it in memory first.
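To show what streaming means at the JDK level, here is a self-contained sketch that uses HttpURLConnection directly. SimpleClientHttpRequestFactory is built on HttpURLConnection, and to my understanding the unbuffered mode relies on that class's streaming modes, but treat that mapping as an assumption and check the Spring source for your version. The class name StreamingUploadDemo, the /upload path, and the 4096-byte chunk size are all made up for this example. A throwaway local server stands in for the Media Service.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicLong;

final class StreamingUploadDemo {
    /** POSTs body to url in chunks rather than buffering it whole; returns the HTTP status. */
    static int streamPost(URL url, InputStream body) throws Exception {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        // Without a streaming mode, HttpURLConnection holds the full body in memory.
        connection.setChunkedStreamingMode(4096);
        try (OutputStream out = connection.getOutputStream()) {
            body.transferTo(out);
        }
        int status = connection.getResponseCode();
        connection.disconnect();
        return status;
    }

    /** Starts a local server, uploads size bytes to it, and returns how many bytes it received. */
    static long uploadToLocalServer(int size) throws Exception {
        AtomicLong received = new AtomicLong();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/upload", exchange -> {
            try (InputStream in = exchange.getRequestBody()) {
                // Count the bytes and discard them, as a stand-in for a real destination.
                received.set(in.transferTo(OutputStream.nullOutputStream()));
            }
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/upload").toURL();
            int status = streamPost(url, new ByteArrayInputStream(new byte[size]));
            if (status != 204) {
                throw new IllegalStateException("unexpected status " + status);
            }
            return received.get();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        // Prints the number of bytes the local server counted.
        System.out.println(uploadToLocalServer(1_000_000));
    }
}
```

The sketch needs Java 11 or later because of OutputStream.nullOutputStream(). In a real gateway the ByteArrayInputStream would be replaced by the incoming upload stream.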

Simpler Approach

The following approach should also work. I have verified it in local tests, but it is not what we are using in production.

HttpEntity<InputStreamResource> inputEntity = new HttpEntity<>(inputBodyStreamResource, httpHeaders);
restTemplate.exchange("http://localhost:8080/my/path/{id}", HttpMethod.POST, inputEntity, InputStreamResource.class, id);

This works because Spring's message converter for resources (ResourceHttpMessageConverter) writes an InputStreamResource by copying from its InputStream to the request's OutputStream, rather than building a byte array first.

