Skip to content

Conversation

jmoh932
Copy link

@jmoh932 jmoh932 commented May 21, 2019

According issue #240 , there is no reason why we don't, so I've tried.

And we are facing the real problems, the json.Unmarshal has been in our service's hot path. The Iterator.ReadString and Iterator.readStringSlowPath has been called multiple times per second. Any reasonable patch which could reduce the memory allocation will be valuable.

I've tried to avoid to use any incompatible trick of Golang. So if there is any plan to support new features of golang v1.10, this PR should be helpful I think.

@codecov
Copy link

codecov bot commented May 21, 2019

Codecov Report

Merging #367 into master will increase coverage by 0.31%.
The diff coverage is 89.83%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #367 +/- ## ========================================== + Coverage 81.46% 81.78% +0.31%  ========================================== Files 41 41 Lines 5008 5034 +26 ========================================== + Hits 4080 4117 +37  + Misses 809 796 -13  - Partials 119 121 +2
Impacted Files Coverage Δ
config.go 89.58% <100%> (+0.22%) ⬆️
pool.go 100% <100%> (ø) ⬆️
iter.go 89.89% <100%> (+0.15%) ⬆️
iter_str.go 90.34% <87.23%> (-0.57%) ⬇️
stream_float.go 91.17% <0%> (-8.83%) ⬇️
reflect_struct_decoder.go 48.74% <0%> (+2.5%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1bc9828...6461c1c. Read the comment docs.

@jmoh932
Copy link
Author

jmoh932 commented May 22, 2019

I've checked the issue of codeclimate, it's just facing a bug. The codeclimate has counted the nested function return statement. For the code readability, I think the code should be like this.

@taowen
Copy link
Contributor

taowen commented May 22, 2019

is there any benchmark to support this is actual a improvement? why ret = strf.NewString(iter.buf[iter.head:i]) is faster?

@jmoh932
Copy link
Author

jmoh932 commented May 22, 2019

here is a benchmark in github.com/junchih/stringx package, https://github.com/junchih/stringx/blob/master/factory_test.go#L18 . you can go with go test -gcflags=-N -bench=. -benchmem ./... to get the result.

the reason why it is faster, you could figure out from the implementation of stringx and this function https://golang.org/src/strings/builder.go?s=1379:1412#L36 .

I'm not sure should I copy the code to json-iterator/go or just imported it. I created junchih/stringx separately cause our production might need it, and I might be able to do a PR for golang.

I'm working some benchmarks to this PR.

@taowen
Copy link
Contributor

taowen commented May 23, 2019

iter.buf might be reused, so we are decoding from a reader. Cast the buffer to string, will cause the string point to invalid region later. I used the impl in very early days, and fall back to current impl due to the bug introduced.

@jmoh932
Copy link
Author

jmoh932 commented May 24, 2019

hi @taowen , sorry for so long delay. I just made the benchmark to compare before this PR and after. the logs has been a little longer, below.

there are two benchmarks cost a little logger time comparing to before. Benchmark_jsoniter_nested and Benchmark_jsoniter_skip. I did the pprof, the line where ReadString and Parse both stands made the different time costing. so I just add another benchmark Benchmark_jsoniter_string. now everything make sense, you could figure out by yourself from the code.

and at the latest two commits, I just move the string factory object to the Iterator struct to reduce the time costing.

at last, to answer your question above, I'm not sure I got your point, sorry. but those calling trace might answer your worry. here junchih/stringx/factory.go and then here strings/builder.go.

for any further questions, feel free to comment, I'd like to fix any.

json-iterator:master $ go test -bench=. -benchmem ./... ... goos: linux goarch: amd64 pkg: github.com/json-iterator/go/benchmarks Benchmark_encode_string_with_SetEscapeHTML 2000000 759 ns/op 760 B/op 5 allocs/op Benchmark_jsoniter_large_file 100000 12801 ns/op 4888 B/op 79 allocs/op Benchmark_json_large_file 50000 36893 ns/op 6912 B/op 18 allocs/op ... pkg: github.com/json-iterator/go/misc_tests Benchmark_jsoniter_array 10000000 202 ns/op 0 B/op 0 allocs/op Benchmark_json_array 1000000 1656 ns/op 480 B/op 12 allocs/op Benchmark_jsoniter_float 50000000 29.5 ns/op 0 B/op 0 allocs/op Benchmark_json_float 3000000 518 ns/op 312 B/op 4 allocs/op Benchmark_jsoniter_encode_int 100000000 19.1 ns/op 0 B/op 0 allocs/op Benchmark_itoa 30000000 61.8 ns/op 16 B/op 1 allocs/op Benchmark_jsoniter_int 100000000 23.3 ns/op 0 B/op 0 allocs/op Benchmark_json_int 3000000 507 ns/op 312 B/op 4 allocs/op Benchmark_jsoniter_nested 3000000 527 ns/op 256 B/op 8 allocs/op Benchmark_jsoniter_string 3000000 480 ns/op 96 B/op 2 allocs/op Benchmark_json_nested 500000 2671 ns/op 552 B/op 10 allocs/op ... pkg: github.com/json-iterator/go/skip_tests Benchmark_jsoniter_skip 1000000 1662 ns/op 224 B/op 16 allocs/op Benchmark_json_skip 200000 6948 ns/op 480 B/op 10 allocs/op 
junchih:master $ go test -bench=. -benchmem ./... ... goos: linux goarch: amd64 pkg: github.com/json-iterator/go/benchmarks Benchmark_encode_string_with_SetEscapeHTML 2000000 764 ns/op 760 B/op 5 allocs/op Benchmark_jsoniter_large_file 100000 12973 ns/op 5432 B/op 14 allocs/op Benchmark_json_large_file 50000 34490 ns/op 6912 B/op 18 allocs/op ... pkg: github.com/json-iterator/go/misc_tests Benchmark_jsoniter_array 10000000 206 ns/op 0 B/op 0 allocs/op Benchmark_json_array 1000000 1681 ns/op 480 B/op 12 allocs/op Benchmark_jsoniter_float 50000000 28.4 ns/op 0 B/op 0 allocs/op Benchmark_json_float 3000000 516 ns/op 312 B/op 4 allocs/op Benchmark_jsoniter_encode_int 100000000 19.2 ns/op 0 B/op 0 allocs/op Benchmark_itoa 30000000 62.1 ns/op 16 B/op 1 allocs/op Benchmark_jsoniter_int 100000000 23.1 ns/op 0 B/op 0 allocs/op Benchmark_json_int 3000000 506 ns/op 312 B/op 4 allocs/op Benchmark_jsoniter_nested 2000000 708 ns/op 344 B/op 7 allocs/op Benchmark_jsoniter_string 3000000 454 ns/op 92 B/op 1 allocs/op Benchmark_json_nested 500000 2643 ns/op 552 B/op 10 allocs/op ... pkg: github.com/json-iterator/go/skip_tests Benchmark_jsoniter_skip 1000000 1955 ns/op 440 B/op 7 allocs/op Benchmark_json_skip 200000 6583 ns/op 480 B/op 10 allocs/op ... 
@jmoh932
Copy link
Author

jmoh932 commented May 31, 2019

hi @taowen , will this be merged or not?! it's been about a week, we really need this improvement.

if there is any blur, please comment it.

@taowen
Copy link
Contributor

taowen commented Jun 21, 2019

ret = iter.strf.NewString(iter.buf[iter.head:i]) the underlying iter.buf will change. it is not safe to use unsafe pointer to cast the byte array to string. It may work for some benchmark, it will cause problem when the input is a reader, and reading data in chunks.

@jmoh932
Copy link
Author

jmoh932 commented Jun 24, 2019

ret = iter.strf.NewString(iter.buf[iter.head:i]) the underlying iter.buf will change. it is not safe to use unsafe pointer to cast the byte array to string. It may work for some benchmark, it will cause problem when the input is a reader, and reading data in chunks.

do you read the implementation of function iter.strf.NewString? you should not give two comments based on your same old illusion in this PR discussion, since I've already give you code line number like below

the reason why it is faster, you could figure out from the implementation of stringx and this function https://golang.org/src/strings/builder.go?s=1379:1412#L36 .

Very sorry, since so long there is no more progress about this discussion, I'll just close this PR for no more responsibilities from me. Happy day!

@jmoh932 jmoh932 closed this Jun 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants