
Conversation

@david-driscoll
Member

This replaces the existing logic for reading and writing from the given streams using System.IO.Pipelines.

Goal is to make input and output as memory efficient as is logically possible.

For input parsing:

  • We only care about the Content-Length header, so currently we just throw away any other headers (without allocations, as far as I can tell); see the sketch below for the rough shape of the read loop
  • We're still using Newtonsoft.Json, so we won't see the full benefit until that is replaced (in a future PR)

For output:

  • Just simplified the flow to write out using a PipeWriter

closes #151
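
For a rough sense of the shape, here is a minimal sketch of a pipelines-based read loop. This is illustrative only; TryParseMessage and HandlePayload are hypothetical stand-ins, not the PR's actual code, and the real InputHandler also parses headers, schedules requests, and handles errors.

using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

static class ReadLoopSketch
{
    public static async Task ReadMessagesAsync(PipeReader reader)
    {
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            // Consume as many complete "headers\r\n\r\nbody" frames as are buffered.
            while (TryParseMessage(ref buffer, out ReadOnlySequence<byte> payload))
            {
                HandlePayload(payload);
            }

            // Mark everything before buffer.Start as consumed and everything up
            // to buffer.End as examined, so the pipe waits for more data before
            // scheduling another read.
            reader.AdvanceTo(buffer.Start, buffer.End);

            if (result.IsCompleted && buffer.IsEmpty) break;
        }

        reader.Complete();
    }

    // Hypothetical: slices one complete message payload out of the buffer.
    private static bool TryParseMessage(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> payload)
    {
        payload = default;
        return false; // placeholder
    }

    private static void HandlePayload(ReadOnlySequence<byte> payload)
    {
        // placeholder: hand the payload to the JSON-RPC dispatch
    }
}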

@codecov

codecov bot commented May 4, 2020

Codecov Report

Merging #236 into master will decrease coverage by 0.27%.
The diff coverage is 74.74%.


@@            Coverage Diff             @@
##           master     #236      +/-   ##
==========================================
- Coverage   44.83%   44.56%   -0.28%
==========================================
  Files         360      359       -1
  Lines        8261     8316      +55
  Branches     1022     1045      +23
==========================================
+ Hits         3704     3706       +2
- Misses       4258     4305      +47
- Partials      299      305       +6
Impacted Files Coverage Δ
src/JsonRpc/Connection.cs 0.00% <0.00%> (ø)
src/JsonRpc/JsonRpcServer.cs 0.00% <0.00%> (ø)
src/Server/LanguageServer.cs 0.00% <0.00%> (ø)
src/JsonRpc/OutputHandler.cs 57.14% <57.57%> (-26.20%) ⬇️
src/JsonRpc/ResponseRouter.cs 78.94% <66.66%> (ø)
src/JsonRpc/InputHandler.cs 74.11% <80.00%> (-12.66%) ⬇️
src/JsonRpc/ProcessScheduler.cs 100.00% <100.00%> (+1.38%) ⬆️
src/Protocol/DocumentUri.cs 67.87% <0.00%> (+4.68%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


@NTaylorMullen left a comment


These are some super exciting changes @david-driscoll!

}

public void Start()
public void Send(object value)


@TanayParikh this OutputHandler class is essentially what our AutoFlushingStreamWriter tries to work around. The changes here are a lot like what we'd have to do on our side until we consume these changes 😄

<PackageReference Update="MediatR" Version="8.0.1"/>
<PackageReference Update="MediatR.Extensions.Microsoft.DependencyInjection" Version="8.0.0"/>
<PackageReference Update="System.IO.Pipelines" Version="4.7.1"/>
<PackageReference Update="Nerdbank.Streams" Version="2.4.60"/>


We actually depend on this in VS to enable our LSP experience there. /cc @ToddGrun

Member Author


I just found them, and it was code I didn't have to comprehend 100% (I'm slowly catching on...)

// some time to attach a debugger
// System.Threading.Thread.Sleep(TimeSpan.FromSeconds(5));
// TODO: This might be simplified with SequenceReader...
// Not sure we can move to purely .netstandard 2.1
Collaborator


Please don't move to netstandard2.1, since it doesn't work on .NET Framework :(

Member Author


Yeah, that's an old comment. This PR works on .NET Framework, netcoreapp2.1, and netcoreapp3.1, and they will be supported for the foreseeable future.

There are just things that netstandard2.1 makes easier... either way, I learned a heck of a lot by not having it 😄

@david-driscoll
Member Author

@NTaylorMullen hopefully it's not terrible pipelines code... this is my first time delving into them deeply.

_inputThread = new Thread(ProcessInputStream) { IsBackground = true, Name = "ProcessInputStream" };
_scheduler = new ProcessScheduler(loggerFactory, concurrency,
new EventLoopScheduler(_ => new Thread(_) {IsBackground = true, Name = "InputHandler"}));
_headersBuffer = new Memory<byte>(new byte[HeadersFinishedLength]);
Member Author


@NTaylorMullen I was originally renting this from the shared memory pool, but I discovered some intermittent unit test failures where it seems like the memory was getting shared somehow. I'm really not sure how or why, but this is a small enough amount of memory that it probably makes more sense to just hard-code it here.


Ya, all of this is super foreign to me sadly; I wish I could be of more help :(. @halter73 @davidfowl @Tratcher and @pakrym are the real pipelines experts, especially when it comes to reading headers.


It sounds like there might have been a use-after-free issue. If the memory was being returned to the pool by the CompositeDisposable instead of at the end of ProcessInputStream(), it could very well have been returning the pooled memory before ProcessInputStream was done using it. I don't think CompositeDisposable should be completing the PipeReader either for the same reason. It would be better to complete it in a finally at the end of ProcessInputStream.

If possible, it would be best to have IInputHandler implement DisposeAsync (or, if that's too breaking, have the caller try to cast the IInputHandler to IAsyncDisposable and call DisposeAsync instead of Dispose if it's available), and have DisposeAsync wait on ProcessInputStream to complete, to ensure everything is cleaned up by the time DisposeAsync completes.
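
A minimal sketch of that shape (assuming Microsoft.Bcl.AsyncInterfaces for IAsyncDisposable on netstandard2.0; the names and the Task-based loop here are illustrative, since the real handler uses a dedicated thread):

using System;
using System.IO.Pipelines;
using System.Threading.Tasks;

public sealed class InputHandlerSketch : IAsyncDisposable
{
    private readonly PipeReader _pipeReader;
    private Task _processingTask = Task.CompletedTask;

    public InputHandlerSketch(PipeReader pipeReader) => _pipeReader = pipeReader;

    public void Start() => _processingTask = Task.Run(ProcessInputStreamAsync);

    private async Task ProcessInputStreamAsync()
    {
        try
        {
            await Task.Yield(); // placeholder for the actual read loop over _pipeReader
        }
        finally
        {
            // Complete the reader (and return any rented memory) only once
            // the loop can no longer touch it.
            _pipeReader.Complete();
        }
    }

    public async ValueTask DisposeAsync()
    {
        _pipeReader.CancelPendingRead(); // ask the loop to wind down
        await _processingTask;           // cleanup is guaranteed done after this
    }
}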

Member Author


After reading this about 10 times, that actually makes sense. Thanks @halter73

@david-driscoll
Member Author

Here is the benchmark run:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19603
Intel Core i7-7700K CPU 4.20GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.300-preview-015048
  [Host]        : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT
  .NET 4.7.2    : .NET Framework 4.8 (4.8.4161.0), X64 RyuJIT
  .NET Core 2.1 : .NET Core 2.1.17 (CoreCLR 4.6.28619.01, CoreFX 4.6.28619.01), X64 RyuJIT
  .NET Core 3.1 : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT
Method Job Runtime Payload Count Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
Classic .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 100 126.93 μs 2.529 μs 2.706 μs 31.4941 - - 129.02 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 100 89.35 μs 1.750 μs 3.064 μs 1.9531 - - 8.31 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 100 109.01 μs 2.177 μs 2.235 μs 26.3672 - - 108.22 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 100 79.10 μs 1.276 μs 1.366 μs 1.9531 - - 8.25 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 100 83.14 μs 1.289 μs 1.007 μs 26.2451 - - 107.51 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 100 63.28 μs 1.204 μs 1.386 μs 1.8311 - - 7.54 KB
Classic .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 1000 1,314.14 μs 26.173 μs 50.427 μs 310.5469 - - 1278.52 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 1000 876.58 μs 17.374 μs 28.055 μs 16.6016 - - 71.78 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 1000 1,176.67 μs 22.441 μs 61.808 μs 259.7656 - - 1071.5 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 1000 800.52 μs 16.009 μs 40.456 μs 16.6016 - - 71.53 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 1000 868.05 μs 17.167 μs 29.612 μs 260.7422 - - 1066.09 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 1000 648.84 μs 12.799 μs 16.642 μs 15.6250 - - 66.13 KB
Classic .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 100 293.44 μs 5.816 μs 11.480 μs 91.3086 - - 375.07 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 100 92.00 μs 1.835 μs 2.631 μs 1.9531 - - 8.31 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 100 235.29 μs 4.558 μs 4.681 μs 83.9844 - - 344.94 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 100 78.60 μs 1.152 μs 1.021 μs 1.9531 - - 8.25 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 100 127.74 μs 2.522 μs 3.617 μs 84.7168 - - 346.22 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 100 72.26 μs 1.424 μs 1.695 μs 2.3193 - - 9.53 KB
Classic .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 1000 2,871.02 μs 53.686 μs 44.830 μs 910.1563 - - 3739 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 1000 994.96 μs 19.417 μs 17.213 μs 15.6250 - - 71.78 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 1000 2,578.41 μs 51.201 μs 76.636 μs 835.9375 - - 3438.69 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 1000 838.74 μs 16.657 μs 29.173 μs 16.6016 - - 71.53 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 1000 1,617.39 μs 31.404 μs 51.598 μs 921.8750 292.9688 7.8125 4093.51 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 1000 874.86 μs 17.203 μs 32.731 μs 208.9844 55.6641 0.9766 726.35 KB
@david-driscoll david-driscoll force-pushed the feature/input-output branch from b1e0a69 to 865a41b on May 5, 2020 01:42
* Fixed an issue where, if a cancellation exception was thrown and not caught, the scheduling thread would die (and potentially the entire process if the exception went unhandled)
* Deal with unhandled exceptions as well, and report them down through logging
* Removed unused constants
* Added some more work, continued tests, and a benchmark for comparisons
* Made improvements to unit tests for input and output; fixed a small issue related to not emptying out the shared memory for content-length parsing
* Added reference assemblies and removed debugging buffer
@david-driscoll david-driscoll force-pushed the feature/input-output branch from 0c2a521 to a54a9e8 on May 5, 2020 01:59
@NTaylorMullen

NTaylorMullen hopefully it's not terrible pipelines code... this is my first time delving into them deeply.

You know more than me to be honest. I've never done it to the extent that you do in this PR 😄

@davidfowl
Member

I'll take a look.

@david-driscoll
Member Author

Oh no... @davidfowl is going to tear me apart now 😭

@davidfowl
Member

I have lots of questions about the InputHandler, it looks more complicated than it needs to be (outside of the pipelines usage). Remind me what TFMs this supports?

@bjorkstromm
Member

@davidfowl

Remind me what TFMs this supports?

We support netstandard2.0 as we have some users on .NET Framework (#236 (comment))

@davidfowl
Member

So I've been reading the code here and I want to suggest small changes to improve it, but I think it might be better to suggest a rewrite 😄. The IInputHandler and IOutputHandler can be drastically simplified using Channels and Pipelines (the Rx usage here makes things way more complicated than they need to be).


// don't be async: We already allocated a separate thread for this.
private void ProcessInputStream()
private bool TryParseHeaders(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
Member


The parsing logic in here can be simplified using SequenceReader<byte>. It's currently only in .NET Core 3.0, but Nerdbank.Streams has https://github.com/AArnott/Nerdbank.Streams/blob/master/doc/SequenceTextReader.md that you should be able to use to simplify this logic.

It's slightly allocatey, though; it might make sense to copy the SequenceReader sources from .NET Core into this ns2.0 project.
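
For reference, with SequenceReader<byte> the header-line splitting collapses to roughly this (a sketch assuming .NET Core 3.0+ or vendored SequenceReader sources; not the PR's actual parser):

using System;
using System.Buffers;

static class HeaderParsingSketch
{
    private static ReadOnlySpan<byte> CrLf => new byte[] { (byte)'\r', (byte)'\n' };

    // Sketch: pull one "Name: value" header line out of the buffer;
    // an empty line means the headers are finished.
    public static bool TryReadHeaderLine(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
    {
        var reader = new SequenceReader<byte>(buffer);
        if (reader.TryReadTo(out line, CrLf, advancePastDelimiter: true))
        {
            buffer = buffer.Slice(reader.Position); // consume the line + delimiter
            return true;
        }
        return false; // need more data
    }
}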

@david-driscoll
Member Author

If you have some docs on how to do fan-in/fan-out (including concurrency control) with channels, I'd love to hear it.

The ProcessScheduler is there to manage back pressure; there are certain requests that are blocking (text changes), and we have to hold everything until the text requests are processed. We can easily make it take a Func<CancellationToken, Task> instead of an observable and do the translation internally; it really doesn't matter (I used to have an extension method, but it didn't seem worthwhile as the code is internal anyway, and it makes unit testing the process scheduler easier).

@davidfowl
Member

Back pressure doesn't work if it isn't happening everywhere. Rx can't do back pressure unless you're going to drop messages or block. Fan-out is a reasonable use case for Rx as long as the consumers are fire and forget (there goes your back pressure). Where are you fanning in?

@david-driscoll
Member Author

The spec doesn't specifically define response ordering, so we have certain handlers defined as Serial and others as Parallel, with a default of Serial if none is defined.

So we might see a textDocument/didChange event; this will often lead to textDocument/completion and textDocument/codeLens, and then we might get 3 textDocument/didChange events followed by textDocument/completion and textDocument/codeLens.

So we can fan out to handle all the parallel requests but then we have to fan back in to handle the serial ones.

Something along the lines of textDocument/didChange -> [textDocument/completion, textDocument/codeLens] -> textDocument/didChange -> textDocument/didChange -> textDocument/didChange -> [textDocument/completion, textDocument/codeLens].

Now, if the editor is smart enough (which, if I recall, vscode is, but it's been a while), it will cancel any in-flight completion or code lens requests if text changes continue to be made; but if the editor isn't doing that, we are handling requests such that our state is "correct".

Now, it's possible for an editor to fire off like 100 events, and we may not have a server that can handle that kind of load (or we want to constrain resources in some way). So we also have the ability to say "we only want at most 10 parallel requests in flight at a time", and the ProcessScheduler will ensure that even if we have 100 parallel requests pending, at most 10 are processing. As each request completes, the next pending request takes its place. This way, if an expensive request happens, say documentSymbols takes a while, we will continue to serve completions, signature help, etc.
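
A stripped-down sketch of that throttling behavior (not the actual ProcessScheduler, which is Rx-based; this just shows the bounded-concurrency idea with a SemaphoreSlim, and assumes serial work is submitted one item at a time):

using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SchedulerSketch
{
    private readonly SemaphoreSlim _slots;
    private readonly int _concurrency;

    public SchedulerSketch(int concurrency)
    {
        _concurrency = concurrency;
        _slots = new SemaphoreSlim(concurrency, concurrency);
    }

    // Parallel requests beyond the limit queue here until a slot frees up.
    public async Task RunParallelAsync(Func<Task> work)
    {
        await _slots.WaitAsync();
        try { await work(); }
        finally { _slots.Release(); }
    }

    // Serial work takes every slot, so nothing else is in flight while it runs.
    public async Task RunSerialAsync(Func<Task> work)
    {
        for (var i = 0; i < _concurrency; i++) await _slots.WaitAsync();
        try { await work(); }
        finally { _slots.Release(_concurrency); }
    }
}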

@david-driscoll
Member Author

However, the output handler should be converted to Channels; that's a clear and easy win I think. Frankly, I just forgot they existed as a part of pipelines 😄

_pipeReader.AdvanceTo(buffer.Start, buffer.End);

// Stop reading if there's no more data coming.
if (result.IsCompleted && buffer.GetPosition(0, buffer.Start).Equals(buffer.GetPosition(0, buffer.End)))
Member


result.IsCompleted plus an empty buffer (buffer.IsEmpty) is how you check for done.
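
That is, the termination check above can collapse to something like:

// Done: the writer completed and nothing is left to parse.
if (result.IsCompleted && buffer.IsEmpty)
{
    break;
}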

@davidfowl
Member

While I brew over how to refactor this code, check out these guidance docs on how to use pipelines: https://docs.microsoft.com/en-us/dotnet/standard/io/pipelines

@TylerLeonhardt
Collaborator

@davidfowl thanks for taking a look at this PR. It's really awesome to see you getting involved in this - a ton of extensions out in the wild thank you 😃

@david-driscoll and friends have done an incredible job on this library so far and I'm excited to see how it grows in the future!

@davidfowl
Member

However, the output handler should be converted to Channels; that's a clear and easy win I think. Frankly, I just forgot they existed as a part of pipelines 😄

You can use a semaphore and write directly to the pipe. You only need the queue or channels if you want some more slack (and potentially even more write batching).
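
A sketch of that shape (illustrative only; Serialize and the class name are stand-ins, not the PR's actual OutputHandler):

using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public sealed class OutputHandlerSketch
{
    private readonly PipeWriter _pipeWriter;
    private readonly SemaphoreSlim _writeLock = new SemaphoreSlim(1, 1);

    public OutputHandlerSketch(PipeWriter pipeWriter) => _pipeWriter = pipeWriter;

    public async Task SendAsync(object value)
    {
        await _writeLock.WaitAsync(); // one writer at a time, no queue needed
        try
        {
            byte[] payload = Serialize(value);
            var header = Encoding.ASCII.GetBytes($"Content-Length: {payload.Length}\r\n\r\n");
            _pipeWriter.Write(header);             // buffered, not yet flushed
            await _pipeWriter.WriteAsync(payload); // writes the body and flushes both
        }
        finally
        {
            _writeLock.Release();
        }
    }

    // Hypothetical stand-in for the actual serialization step.
    private static byte[] Serialize(object value) =>
        Encoding.UTF8.GetBytes(Newtonsoft.Json.JsonConvert.SerializeObject(value));
}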

@david-driscoll and friends have done an incredible job on this library so far and I'm excited to see how it grows in the future!

Yes indeed, it's come a long way since I was last here 😄. It's so much more mature now (I use it daily).

try
{
payload = JToken.Parse(request);
using var textReader = new StreamReader(request.AsStream());
Member


If you could get rid of this, you could reduce allocations even further.

Member Author


@davidfowl baby steps my friend!

Pipelines first, then I have this branch where I explored moving to System.Text.Json, and it mostly works. However, I have to refactor the language client, as it doesn't really mesh well with the server at this point.
https://github.com/OmniSharp/csharp-language-server-protocol/tree/feature/systemtextjson

Member Author


And by works I mean it compiles 😆 (without the client project)

@davidfowl
Member

Whyyyyyy

@david-driscoll david-driscoll deleted the feature/input-output branch June 2, 2020 04:47
@TylerLeonhardt
Collaborator

@davidfowl the changes were included in this :)

#248

