Reuse InputStream for ResourceRegionHttpMessageConverter #24214

randomnicode · 2019-12-15T03:38:51Z

Dynamically generated input streams cannot be closed and reopened
multiple times. This fix reuses the same InputStream across multiple
ResourceRegion writes.

This solution does not presume anything about the underlying Resources
across different ResourceRegions. It will attempt to reuse an InputStream
if it's already open for a given Resource. If the range request is out of order,
then the solution closes the existing open stream and attempts to open a
new one. This is, at worst (all ranges are out of order or overlapping),
consistent with the existing solution, and at best (all ranges are non
overlapping and in order), reuses the same InputStream to help matters.

This is a proposal for and closes #24210

Dynamically generated input streams cannot be closed and reopened multiple times. This fix reuses the same InputStream across multiple ResourceRegion writes. This is a proposal for and closes spring-projects#24210

rstoyanchev

First, it's reasonable to assume 1 request serves (regions for) 1 resource. This is codified in HttpRange#toResourceRegions which creates regions given List<HttpRange> and a single Resource. So there doesn't have to be a Map.

A real problem with this solution however is that start positions for ranges are relative to the beginning of the InputStream but after the first region, the InputStream is no longer at the beginning. This is confirmed by ResourceRegionHttpMessageConverterTests failing to pass partialContentMultipleByteRanges.

One could keep a current index of where the InputStream is at and adjust start positions accordingly. However two more issues would still remain. It isn't illegal for ranges to overlap. It is also not illegal for them to be out of order. The spec says:

 A server that supports range requests MAY ignore or reject a Range header field that consists of more than two overlapping ranges, or a set of many small ranges that are not listed in ascending order, since both are indications of either a broken client or a deliberate denial-of-service attack (Section 6.1). A client SHOULD NOT request multiple ranges that are inherently less efficient to process and transfer than a single range that encompasses the same data. A client that is requesting multiple ranges SHOULD list those ranges in ascending order (the order in which they would typically be received in a complete representation) unless there is a specific need to request a later part earlier. For example, a user agent processing a large representation with an internal catalog of parts might need to request later parts first, particularly if the representation consists of pages stored in reverse order and the user agent wishes to transfer one page at a time.

The only option for a solution I can think of then is to check whether regions are in ascending order and do not overlap, and if so allow re-use of the InputStream.

Feel free to update the solution.

randomnicode · 2019-12-16T18:15:21Z

Thanks for the feedback!

While 1 Resource per request is a reasonable expectation, (and indeed my original PR did not have a Map), I reconsidered it because the issue is this: the method argument gets a Collection<ResourceRegion> with no guarantees about the underlying Resource for each of those objects. Extracting the Resource from the first ResourceRegion of the collection, and thereafter using it for all other ResourceRegion (even though they have their own Resource) feels like code smell to me (how is this method supposed to know that the underlying Resource for all the ResourceRegions provided is the same?). Thus in my issue (#24210), I suggested rearchitecting ResourceRegion. It seemed more reasonable to me to have a ResourceRegions class that consisted of 1 Resource and a collection of ranges, rather than have a collection of ResourceRegions each of which has a Resource object (that actually ends up being the same Resource object anyway). This then causes the downstream issues and necessitates assumptions such as what we're trying to make. If I'm allowed to, I can try rewriting those classes and submitting that solution, but it'll be a longer one (as it seems the ResourceRegion class is used elsewhere too).

The issue about the stream index being out of place is a good one and easier to mitigate with the aforementioned rewrite, when a ResourceRegion itself contains a list of ranges and one Resource, but I can try and fix it here within this scope.

Lastly, about the range order. I had pegged that point in an earlier revision of #24210, noting that we'd likely need to sort (and merge) ranges for this to be a viable option but I wasn't sure what the spec said. Is this still an assumption we could make? We could just sort and merge the ranges and respond with the sorted (and merged) order. The client wouldn't get the ranges in the order they requested, but as the spec mentions it would likely be a broken client or a DoS attack anyway. If someone actually wants out-of-order ranges, then they could just submit multiple requests (which they would realistically anyway, for instance for a zip directory: get the last part, process it, get a portion of the earlier parts based on the processing result)?

Part of my assumption comes from this piece of code: https://github.com/spring-projects/spring-framework/blob/master/spring-web/src/main/java/org/springframework/http/HttpRange.java#L191-L194 which implies that even though overlapping ranges are allowed by spec, we de facto don't allow it or at least don't expect it (if there are overlapping ranges, then by definition the total byte count for some requests would be greater than the size of the Resource, which we don't allow). Consider a byte range request of 0-499,0-499 for a 500 byte resource as an example, which is a valid range, but total byte count of 1000 for the ranges > resource size of 500 would throw an exception.

randomnicode · 2019-12-16T23:58:03Z

I've pushed out a fix addressing the issues and updated the PR description. The new solution should be consistent with the existing solution for out-of-order (and overlapping) requests.

rstoyanchev · 2019-12-18T16:50:55Z

the method argument gets a Collection with no guarantees about the underlying Resource for each of those objects

In practice however ResourceRegion is only ever instantiated through HttpRangeToResourceRegions. This could be verified through an assertion, or more likely a fallback that simply re-opens the stream if it isn't the same Resource, which should never happen.

Thus in my issue (#24210), I suggested rearchitecting ResourceRegion

Not an option at this time. The downsides of such refactoring, in a framework at least, outweigh any benefits.

Part of my assumption comes from this piece of code: https://github.com/spring-projects/spring-framework/blob/master/spring-web/src/main/java/org/springframework/http/HttpRange.java#L191-L194 which implies that even though overlapping ranges are allowed by spec, we de facto don't allow it or at least don't expect it

That rules is meant to put a boundary on the amount of content that can be sent. It does not prevent overlap, e.g. 0-10,5-15 out of 100.

The framework tests do not pass with changes from the PR. However, I'm taking a closer look and based on the proposal will implement a similar solution.

Reuse InputStream for ResourceRegionHttpMessageConverter

5efe7a2

Dynamically generated input streams cannot be closed and reopened multiple times. This fix reuses the same InputStream across multiple ResourceRegion writes. This is a proposal for and closes spring-projects#24210

spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Dec 15, 2019

rstoyanchev self-assigned this Dec 16, 2019

rstoyanchev added in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Dec 16, 2019

rstoyanchev added this to the 5.2.3 milestone Dec 16, 2019

rstoyanchev mentioned this pull request Dec 16, 2019

ResourceRegions converter opens and closes InputStream multiple times for multiple HttpRanges #24210

Closed

rstoyanchev requested changes Dec 16, 2019

View reviewed changes

rstoyanchev removed this from the 5.2.3 milestone Dec 16, 2019

rstoyanchev added status: waiting-for-feedback We need additional information before we can continue status: waiting-for-triage An issue we've not yet triaged or decided on labels Dec 16, 2019

spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Dec 16, 2019

Remember stream position and account for out-of-order ranges

a3c9657

rstoyanchev removed status: feedback-provided Feedback has been provided status: waiting-for-triage An issue we've not yet triaged or decided on labels Dec 18, 2019

rstoyanchev added this to the 5.2.3 milestone Dec 18, 2019

rstoyanchev closed this in 0eacb44 Dec 19, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reuse InputStream for ResourceRegionHttpMessageConverter #24214

Reuse InputStream for ResourceRegionHttpMessageConverter #24214

Uh oh!

randomnicode commented Dec 15, 2019 •

edited

Loading

rstoyanchev left a comment •

edited

Loading

randomnicode commented Dec 16, 2019 •

edited

Loading

randomnicode commented Dec 16, 2019 •

edited

Loading

rstoyanchev commented Dec 18, 2019

Labels

3 participants

Reuse InputStream for ResourceRegionHttpMessageConverter #24214

Reuse InputStream for ResourceRegionHttpMessageConverter #24214

Uh oh!

Conversation

randomnicode commented Dec 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

rstoyanchev left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

randomnicode commented Dec 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

randomnicode commented Dec 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

rstoyanchev commented Dec 18, 2019

Labels

3 participants

randomnicode commented Dec 15, 2019 •

edited

Loading

rstoyanchev left a comment •

edited

Loading

randomnicode commented Dec 16, 2019 •

edited

Loading

randomnicode commented Dec 16, 2019 •

edited

Loading