- Notifications
You must be signed in to change notification settings - Fork 38.8k
Reuse InputStream for ResourceRegionHttpMessageConverter #24214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse InputStream for ResourceRegionHttpMessageConverter #24214
Conversation
Dynamically generated input streams cannot be closed and reopened multiple times. This fix reuses the same InputStream across multiple ResourceRegion writes. This is a proposal for and closes spring-projects#24210
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, it's reasonable to assume 1 request serves (regions for) 1 resource. This is codified in HttpRange#toResourceRegions which creates regions given List<HttpRange> and a single Resource. So there doesn't have to be a Map.
A real problem with this solution however is that start positions for ranges are relative to the beginning of the InputStream but after the first region, the InputStream is no longer at the beginning. This is confirmed by ResourceRegionHttpMessageConverterTests failing to pass partialContentMultipleByteRanges.
One could keep a current index of where the InputStream is at and adjust start positions accordingly. However two more issues would still remain. It isn't illegal for ranges to overlap. It is also not illegal for them to be out of order. The spec says:
A server that supports range requests MAY ignore or reject a Range header field that consists of more than two overlapping ranges, or a set of many small ranges that are not listed in ascending order, since both are indications of either a broken client or a deliberate denial-of-service attack (Section 6.1). A client SHOULD NOT request multiple ranges that are inherently less efficient to process and transfer than a single range that encompasses the same data. A client that is requesting multiple ranges SHOULD list those ranges in ascending order (the order in which they would typically be received in a complete representation) unless there is a specific need to request a later part earlier. For example, a user agent processing a large representation with an internal catalog of parts might need to request later parts first, particularly if the representation consists of pages stored in reverse order and the user agent wishes to transfer one page at a time. The only option for a solution I can think of then is to check whether regions are in ascending order and do not overlap, and if so allow re-use of the InputStream.
Feel free to update the solution.
| Thanks for the feedback! While 1 The issue about the stream index being out of place is a good one and easier to mitigate with the aforementioned rewrite, when a ResourceRegion itself contains a list of ranges and one Resource, but I can try and fix it here within this scope. Lastly, about the range order. I had pegged that point in an earlier revision of #24210, noting that we'd likely need to sort (and merge) ranges for this to be a viable option but I wasn't sure what the spec said. Is this still an assumption we could make? We could just sort and merge the ranges and respond with the sorted (and merged) order. The client wouldn't get the ranges in the order they requested, but as the spec mentions it would likely be a broken client or a DoS attack anyway. If someone actually wants out-of-order ranges, then they could just submit multiple requests (which they would realistically anyway, for instance for a zip directory: get the last part, process it, get a portion of the earlier parts based on the processing result)? Part of my assumption comes from this piece of code: https://github.com/spring-projects/spring-framework/blob/master/spring-web/src/main/java/org/springframework/http/HttpRange.java#L191-L194 which implies that even though overlapping ranges are allowed by spec, we de facto don't allow it or at least don't expect it (if there are overlapping ranges, then by definition the total byte count for some requests would be greater than the size of the Resource, which we don't allow). Consider a byte range request of |
| I've pushed out a fix addressing the issues and updated the PR description. The new solution should be consistent with the existing solution for out-of-order (and overlapping) requests. |
In practice however
Not an option at this time. The downsides of such refactoring, in a framework at least, outweigh any benefits.
That rules is meant to put a boundary on the amount of content that can be sent. It does not prevent overlap, e.g. 0-10,5-15 out of 100. The framework tests do not pass with changes from the PR. However, I'm taking a closer look and based on the proposal will implement a similar solution. |
Dynamically generated input streams cannot be closed and reopened
multiple times. This fix reuses the same InputStream across multiple
ResourceRegion writes.
This solution does not presume anything about the underlying Resources
across different ResourceRegions. It will attempt to reuse an InputStream
if it's already open for a given Resource. If the range request is out of order,
then the solution closes the existing open stream and attempts to open a
new one. This is, at worst (all ranges are out of order or overlapping),
consistent with the existing solution, and at best (all ranges are non
overlapping and in order), reuses the same InputStream to help matters.
This is a proposal for and closes #24210