
Conversation

@david-driscoll
Member

This replaces the existing logic for reading and writing from the given streams using System.IO.Pipelines.

Goal is to make input and output as memory efficient as is logically possible.

For input parsing:

  • We only care about the Content-Length header, so currently we just throw away any other headers (without allocations, as far as I can tell); see the sketch below for the rough shape of the read loop
  • We're still using Newtonsoft.Json, so we won't see the full benefit until that is replaced (in a future PR)

For output:

  • Just simplified the flow to write out using a PipeWriter

closes #151
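
For a rough sense of the shape, here is a minimal sketch of a pipelines-based read loop. This is illustrative only; TryParseMessage and HandlePayload are hypothetical stand-ins, not the PR's actual code, and the real InputHandler also parses headers, schedules requests, and handles errors.

using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

static class ReadLoopSketch
{
    public static async Task ReadMessagesAsync(PipeReader reader)
    {
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            // Consume as many complete "headers\r\n\r\nbody" frames as are buffered.
            while (TryParseMessage(ref buffer, out ReadOnlySequence<byte> payload))
            {
                HandlePayload(payload);
            }

            // Mark everything before buffer.Start as consumed and everything up
            // to buffer.End as examined, so the pipe waits for more data before
            // scheduling another read.
            reader.AdvanceTo(buffer.Start, buffer.End);

            if (result.IsCompleted && buffer.IsEmpty) break;
        }

        reader.Complete();
    }

    // Hypothetical: slices one complete message payload out of the buffer.
    private static bool TryParseMessage(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> payload)
    {
        payload = default;
        return false; // placeholder
    }

    private static void HandlePayload(ReadOnlySequence<byte> payload)
    {
        // placeholder: hand the payload to the JSON-RPC dispatch
    }
}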

@codecov

codecov bot commented May 4, 2020

Codecov Report

Merging #236 into master will decrease coverage by 0.27%.
The diff coverage is 74.74%.


@@            Coverage Diff             @@
##           master     #236      +/-   ##
==========================================
- Coverage   44.83%   44.56%   -0.28%
==========================================
  Files         360      359       -1
  Lines        8261     8316      +55
  Branches     1022     1045      +23
==========================================
+ Hits         3704     3706       +2
- Misses       4258     4305      +47
- Partials      299      305       +6
Impacted Files Coverage Δ
src/JsonRpc/Connection.cs 0.00% <0.00%> (ø)
src/JsonRpc/JsonRpcServer.cs 0.00% <0.00%> (ø)
src/Server/LanguageServer.cs 0.00% <0.00%> (ø)
src/JsonRpc/OutputHandler.cs 57.14% <57.57%> (-26.20%) ⬇️
src/JsonRpc/ResponseRouter.cs 78.94% <66.66%> (ø)
src/JsonRpc/InputHandler.cs 74.11% <80.00%> (-12.66%) ⬇️
src/JsonRpc/ProcessScheduler.cs 100.00% <100.00%> (+1.38%) ⬆️
src/Protocol/DocumentUri.cs 67.87% <0.00%> (+4.68%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


@NTaylorMullen left a comment


These are some super exciting changes @david-driscoll!

}

public void Start()
public void Send(object value)


@TanayParikh this OutputHandler class is essentially what our AutoFlushingStreamWriter tries to work around. The changes here are a lot like what we'd have to do on our side until we consume these changes 😄

<PackageReference Update="MediatR" Version="8.0.1"/>
<PackageReference Update="MediatR.Extensions.Microsoft.DependencyInjection" Version="8.0.0"/>
<PackageReference Update="System.IO.Pipelines" Version="4.7.1"/>
<PackageReference Update="Nerdbank.Streams" Version="2.4.60"/>


We actually depend on this in VS to enable our LSP experience there. /cc @ToddGrun

Member Author


I just found them, and it was code I didn't have to comprehend 100% (I'm slowly catching on...)

// some time to attach a debugger
// System.Threading.Thread.Sleep(TimeSpan.FromSeconds(5));
// TODO: This might be simplified with SequenceReader...
// Not sure we can move to purely .netstandard 2.1
Collaborator


Please don't move to netstandard2.1, since it doesn't work on .NET Framework :(

Member Author


Yeah, that's an old comment. This PR works on .NET Framework, netcoreapp2.1, and netcoreapp3.1, and they will be supported for the foreseeable future.

There are just things that netstandard2.1 makes easier... either way, I learned a heck of a lot by not having it 😄

@david-driscoll
Member Author

@NTaylorMullen hopefully it's not terrible pipelines code... this is my first time delving into them deeply.

_inputThread = new Thread(ProcessInputStream) { IsBackground = true, Name = "ProcessInputStream" };
_scheduler = new ProcessScheduler(loggerFactory, concurrency,
new EventLoopScheduler(_ => new Thread(_) {IsBackground = true, Name = "InputHandler"}));
_headersBuffer = new Memory<byte>(new byte[HeadersFinishedLength]);
Member Author


@NTaylorMullen I was originally renting this from the shared memory pool, but I discovered some intermittent unit test failures where it seems like the memory was getting shared somehow. I'm really not sure how or why, but this is a small enough amount of memory that it probably makes more sense to just hard-code it here.


Ya, all of this is super foreign to me sadly; I wish I could be of more help :(. @halter73 @davidfowl @Tratcher and @pakrym are the real pipelines experts, especially when it comes to reading headers.


It sounds like there might have been a use-after-free issue. If the memory was being returned to the pool by the CompositeDisposable instead of at the end of ProcessInputStream(), it could very well have been returning the pooled memory before ProcessInputStream was done using it. I don't think CompositeDisposable should be completing the PipeReader either for the same reason. It would be better to complete it in a finally at the end of ProcessInputStream.

If possible, it would be best to have IInputHandler implement DisposeAsync (or, if that's too breaking, have the caller try to cast the IInputHandler to IAsyncDisposable and call DisposeAsync instead of Dispose if it's available), and have DisposeAsync wait on ProcessInputStream to complete, to ensure everything is cleaned up by the time DisposeAsync completes.
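
A minimal sketch of that shape (assuming Microsoft.Bcl.AsyncInterfaces for IAsyncDisposable on netstandard2.0; the names and the Task-based loop here are illustrative, since the real handler uses a dedicated thread):

using System;
using System.IO.Pipelines;
using System.Threading.Tasks;

public sealed class InputHandlerSketch : IAsyncDisposable
{
    private readonly PipeReader _pipeReader;
    private Task _processingTask = Task.CompletedTask;

    public InputHandlerSketch(PipeReader pipeReader) => _pipeReader = pipeReader;

    public void Start() => _processingTask = Task.Run(ProcessInputStreamAsync);

    private async Task ProcessInputStreamAsync()
    {
        try
        {
            await Task.Yield(); // placeholder for the actual read loop over _pipeReader
        }
        finally
        {
            // Complete the reader (and return any rented memory) only once
            // the loop can no longer touch it.
            _pipeReader.Complete();
        }
    }

    public async ValueTask DisposeAsync()
    {
        _pipeReader.CancelPendingRead(); // ask the loop to wind down
        await _processingTask;           // cleanup is guaranteed done after this
    }
}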

Member Author


After reading this about 10 times, that actually makes sense. Thanks @halter73

@david-driscoll
Member Author

Here is the benchmark run:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19603
Intel Core i7-7700K CPU 4.20GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.300-preview-015048
  [Host]        : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT
  .NET 4.7.2    : .NET Framework 4.8 (4.8.4161.0), X64 RyuJIT
  .NET Core 2.1 : .NET Core 2.1.17 (CoreCLR 4.6.28619.01, CoreFX 4.6.28619.01), X64 RyuJIT
  .NET Core 3.1 : .NET Core 3.1.3 (CoreCLR 4.700.20.11803, CoreFX 4.700.20.12001), X64 RyuJIT
Method Job Runtime Payload Count Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
Classic .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 100 126.93 μs 2.529 μs 2.706 μs 31.4941 - - 129.02 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 100 89.35 μs 1.750 μs 3.064 μs 1.9531 - - 8.31 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 100 109.01 μs 2.177 μs 2.235 μs 26.3672 - - 108.22 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 100 79.10 μs 1.276 μs 1.366 μs 1.9531 - - 8.25 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 100 83.14 μs 1.289 μs 1.007 μs 26.2451 - - 107.51 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 100 63.28 μs 1.204 μs 1.386 μs 1.8311 - - 7.54 KB
Classic .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 1000 1,314.14 μs 26.173 μs 50.427 μs 310.5469 - - 1278.52 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...):{}} [110] 1000 876.58 μs 17.374 μs 28.055 μs 16.6016 - - 71.78 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 1000 1,176.67 μs 22.441 μs 61.808 μs 259.7656 - - 1071.5 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...):{}} [110] 1000 800.52 μs 16.009 μs 40.456 μs 16.6016 - - 71.53 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 1000 868.05 μs 17.167 μs 29.612 μs 260.7422 - - 1066.09 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...):{}} [110] 1000 648.84 μs 12.799 μs 16.642 μs 15.6250 - - 66.13 KB
Classic .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 100 293.44 μs 5.816 μs 11.480 μs 91.3086 - - 375.07 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 100 92.00 μs 1.835 μs 2.631 μs 1.9531 - - 8.31 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 100 235.29 μs 4.558 μs 4.681 μs 83.9844 - - 344.94 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 100 78.60 μs 1.152 μs 1.021 μs 1.9531 - - 8.25 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 100 127.74 μs 2.522 μs 3.617 μs 84.7168 - - 346.22 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 100 72.26 μs 1.424 μs 1.695 μs 2.3193 - - 9.53 KB
Classic .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 1000 2,871.02 μs 53.686 μs 44.830 μs 910.1563 - - 3739 KB
Pipelines .NET 4.7.2 .NET 4.7.2 Cont(...)}]}} [917] 1000 994.96 μs 19.417 μs 17.213 μs 15.6250 - - 71.78 KB
Classic .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 1000 2,578.41 μs 51.201 μs 76.636 μs 835.9375 - - 3438.69 KB
Pipelines .NET Core 2.1 .NET Core 2.1 Cont(...)}]}} [917] 1000 838.74 μs 16.657 μs 29.173 μs 16.6016 - - 71.53 KB
Classic .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 1000 1,617.39 μs 31.404 μs 51.598 μs 921.8750 292.9688 7.8125 4093.51 KB
Pipelines .NET Core 3.1 .NET Core 3.1 Cont(...)}]}} [917] 1000 874.86 μs 17.203 μs 32.731 μs 208.9844 55.6641 0.9766 726.35 KB
@david-driscoll david-driscoll force-pushed the feature/input-output branch from b1e0a69 to 865a41b on May 5, 2020 01:42
* Fixed an issue where, if a cancellation exception was thrown and not caught, the scheduling thread would die (and potentially the entire process if the exception went unhandled)
* Deal with unhandled exceptions as well, and report them down through logging
* Removed unused constants
* Added some more work, continued tests, and a benchmark for comparisons
* Made improvements to unit tests for input and output; fixed a small issue related to not emptying out the shared memory for content-length parsing
* Added reference assemblies and removed debugging buffer
@david-driscoll david-driscoll force-pushed the feature/input-output branch from 0c2a521 to a54a9e8 on May 5, 2020 01:59
@NTaylorMullen

NTaylorMullen hopefully it's not terrible pipelines code... this is my first time delving into them deeply.

You know more than me to be honest. I've never done it to the extent that you do in this PR 😄

@davidfowl
Member

I'll take a look.

@david-driscoll
Member Author

Oh no... @davidfowl is going to tear me apart now 😭

@davidfowl
Member

I have lots of questions about the InputHandler, it looks more complicated than it needs to be (outside of the pipelines usage). Remind me what TFMs this supports?

@bjorkstromm
Member

@davidfowl

Remind me what TFMs this supports?

We support netstandard2.0 as we have some users on .NET Framework (#236 (comment))

@davidfowl
Member

So I've been reading the code here and I want to suggest small changes to improve it, but I think it might be better to suggest a rewrite 😄. The IInputHandler and IOutputHandler can be drastically simplified using Channels and Pipelines (the Rx usage here makes things way more complicated than they need to be).


// don't be async: We already allocated a separate thread for this.
private void ProcessInputStream()
private bool TryParseHeaders(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
Member


The parsing logic in here can be simplified using SequenceReader<byte>. It's currently only in .NET Core 3.0, but Nerdbank.Streams has https://github.com/AArnott/Nerdbank.Streams/blob/master/doc/SequenceTextReader.md that you should be able to use to simplify this logic.

It's slightly allocatey, though; it might make sense to copy the SequenceReader sources from .NET Core into this ns2.0 project.
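
For reference, with SequenceReader<byte> the header-line splitting collapses to roughly this (a sketch assuming .NET Core 3.0+ or vendored SequenceReader sources; not the PR's actual parser):

using System;
using System.Buffers;

static class HeaderParsingSketch
{
    private static ReadOnlySpan<byte> CrLf => new byte[] { (byte)'\r', (byte)'\n' };

    // Sketch: pull one "Name: value" header line out of the buffer;
    // an empty line means the headers are finished.
    public static bool TryReadHeaderLine(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
    {
        var reader = new SequenceReader<byte>(buffer);
        if (reader.TryReadTo(out line, CrLf, advancePastDelimiter: true))
        {
            buffer = buffer.Slice(reader.Position); // consume the line + delimiter
            return true;
        }
        return false; // need more data
    }
}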

@david-driscoll
Member Author

If you have some docs on how to do fan-in/fan-out (including concurrency control) with channels, I'd love to hear it.

The ProcessScheduler is there to manage back pressure; there are certain requests that are blocking (text changes), and we have to hold everything until the text requests are processed. We can easily make it take a Func<CancellationToken, Task> instead of an observable and do the translation internally; it really doesn't matter (I used to have an extension method, but it didn't seem worthwhile as the code is internal anyway, and it makes unit testing the process scheduler easier).

@davidfowl
Member

Back pressure doesn't work if it isn't happening everywhere. Rx can't do back pressure unless you're going to drop messages or block. Fan-out is a reasonable use case for Rx as long as the consumers are fire and forget (there goes your back pressure). Where are you fanning in?

@david-driscoll
Member Author

The spec doesn't specifically define response ordering, so we have certain handlers defined as Serial and others as Parallel, with a default of Serial if none is defined.

So we might see a textDocument/didChange event; this will often lead to textDocument/completion and textDocument/codeLens, and then we might get 3 textDocument/didChange events followed by textDocument/completion and textDocument/codeLens.

So we can fan out to handle all the parallel requests but then we have to fan back in to handle the serial ones.

Something along the lines of textDocument/didChange -> [textDocument/completion, textDocument/codeLens] -> textDocument/didChange -> textDocument/didChange -> textDocument/didChange -> [textDocument/completion, textDocument/codeLens].

Now, if the editor is smart enough (which, if I recall, vscode is, but it's been a while), it will cancel any in-flight completion or code lens requests if text changes continue to be made; but if the editor isn't doing that, we are handling requests such that our state is "correct".

Now, it's possible for an editor to fire off like 100 events, and we may not have a server that can handle that kind of load (or we want to constrain resources in some way). So we also have the ability to say "we only want at most 10 parallel requests in flight at a time", and the ProcessScheduler will ensure that even if we have 100 parallel requests pending, at most 10 are processing. As each request completes, the next pending request takes its place. This way, if an expensive request happens, say documentSymbols takes a while, we will continue to serve completions, signature help, etc.
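
A stripped-down sketch of that throttling behavior (not the actual ProcessScheduler, which is Rx-based; this just shows the bounded-concurrency idea with a SemaphoreSlim, and assumes serial work is submitted one item at a time):

using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SchedulerSketch
{
    private readonly SemaphoreSlim _slots;
    private readonly int _concurrency;

    public SchedulerSketch(int concurrency)
    {
        _concurrency = concurrency;
        _slots = new SemaphoreSlim(concurrency, concurrency);
    }

    // Parallel requests beyond the limit queue here until a slot frees up.
    public async Task RunParallelAsync(Func<Task> work)
    {
        await _slots.WaitAsync();
        try { await work(); }
        finally { _slots.Release(); }
    }

    // Serial work takes every slot, so nothing else is in flight while it runs.
    public async Task RunSerialAsync(Func<Task> work)
    {
        for (var i = 0; i < _concurrency; i++) await _slots.WaitAsync();
        try { await work(); }
        finally { _slots.Release(_concurrency); }
    }
}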

@david-driscoll
Member Author

However, the output handler should be converted to Channels; that's a clear and easy win I think. Frankly, I just forgot they existed as a part of pipelines 😄

_pipeReader.AdvanceTo(buffer.Start, buffer.End);

// Stop reading if there's no more data coming.
if (result.IsCompleted && buffer.GetPosition(0, buffer.Start).Equals(buffer.GetPosition(0, buffer.End)))
Member


result.IsCompleted plus an empty buffer (buffer.IsEmpty) is how you check for done.
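
That is, the termination check above can collapse to something like:

// Done: the writer completed and nothing is left to parse.
if (result.IsCompleted && buffer.IsEmpty)
{
    break;
}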

@davidfowl
Member

While I brew over how to refactor this code, check out these guidance docs on how to use pipelines: https://docs.microsoft.com/en-us/dotnet/standard/io/pipelines

@TylerLeonhardt
Collaborator

@davidfowl thanks for taking a look at this PR. It's really awesome to see you getting involved in this - a ton of extensions out in the wild thank you 😃

@david-driscoll and friends have done an incredible job on this library so far and I'm excited to see how it grows in the future!

@davidfowl
Member

However, the output handler should be converted to Channels; that's a clear and easy win I think. Frankly, I just forgot they existed as a part of pipelines 😄

You can use a semaphore and write directly to the pipe. You only need the queue or channels if you want some more slack (and potentially even more write batching).
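
A sketch of that shape (illustrative only; Serialize and the class name are stand-ins, not the PR's actual OutputHandler):

using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public sealed class OutputHandlerSketch
{
    private readonly PipeWriter _pipeWriter;
    private readonly SemaphoreSlim _writeLock = new SemaphoreSlim(1, 1);

    public OutputHandlerSketch(PipeWriter pipeWriter) => _pipeWriter = pipeWriter;

    public async Task SendAsync(object value)
    {
        await _writeLock.WaitAsync(); // one writer at a time, no queue needed
        try
        {
            byte[] payload = Serialize(value);
            var header = Encoding.ASCII.GetBytes($"Content-Length: {payload.Length}\r\n\r\n");
            _pipeWriter.Write(header);             // buffered, not yet flushed
            await _pipeWriter.WriteAsync(payload); // writes the body and flushes both
        }
        finally
        {
            _writeLock.Release();
        }
    }

    // Hypothetical stand-in for the actual serialization step.
    private static byte[] Serialize(object value) =>
        Encoding.UTF8.GetBytes(Newtonsoft.Json.JsonConvert.SerializeObject(value));
}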

@david-driscoll and friends have done an incredible job on this library so far and I'm excited to see how it grows in the future!

Yes indeed, it's come a long way since I was last here 😄. It's so much more mature now (I use it daily).

try
{
payload = JToken.Parse(request);
using var textReader = new StreamReader(request.AsStream());
Member


If you could get rid of this, you could reduce allocations even further.

Member Author


@davidfowl baby steps my friend!

Pipelines first, then I have this branch where I explored moving to System.Text.Json, and it mostly works. However, I have to refactor the language client, as it doesn't really mesh well with the server at this point.
https://github.com/OmniSharp/csharp-language-server-protocol/tree/feature/systemtextjson

Member Author


And by works I mean it compiles 😆 (without the client project)

@davidfowl
Member

Whyyyyyy

@david-driscoll david-driscoll deleted the feature/input-output branch June 2, 2020 04:47
@TylerLeonhardt
Collaborator

@davidfowl the changes were included in this :)

#248

