Zlib::GzipWriter generated large amounts of garbage from (struct zstream).input. Reuse the .input field when it is hidden, and recycle it when its lifetime is over. This change alone reduced memory usage of the writer from 90MB to 4.5MB.
For the detached buffer of compressed data used by gzfile_write_raw, we can only clear the string (not recycle it) since user code may hold references to it (but the data would be clobbered, anyways). This reduced memory usage slightly by around 0.5MB (because it's smaller compressed data).
Combined, these changes reduce the anonymous RSS memory of a dedicated writer process from over 90MB to under 4MB.
before:
# user system total real writer 7.823332 0.053333 7.876665 ( 7.881464) writer RssAnon: 92944 kB reader 6.969999 0.076666 7.046665 ( 7.906377) reader RssAnon: 109820 kB
def stats(pfx, bm) str = "#{bm}#{File.readlines("/proc/#$$/status").grep(/^RssAnon:/)[0]}" puts str.gsub!(/^/m, pfx) end
rd, wr = IO.pipe pid = fork do buf = ((0..255).map(&:chr).join * 128).freeze rd.close gzip = Zlib::GzipWriter.new(wr) bm = Benchmark.measure do nr.times { gzip.write(buf) } gzip.close wr.close end stats('writer ', bm) end
wr.close buf = '' gunzip = Zlib::GzipReader.new(rd) n = 0 bm = Benchmark.measure do begin gunzip.readpartial(16384, buf) n += buf.size rescue EOFError break end while true end stats('reader ', bm) Process.waitall¶
ext/zlib/zlib.c (zstream_discard_input): reuse or recycle hidden input (zstream_reset_input): clear hidden input (zstream_run): detach input and recycle after use (gzfile_write_raw): clear buffer after write [ruby-core:84638] [Feature #14315]
zlib: reduce garbage on gzip writes (deflate)
Zlib::GzipWriter generated large amounts of garbage from
(struct zstream).input. Reuse the .input field when it is
hidden, and recycle it when its lifetime is over. This change
alone reduced memory usage of the writer from 90MB to 4.5MB.
For the detached buffer of compressed data used by
gzfile_write_raw, we can only clear the string (not recycle it)
since user code may hold references to it (but the data would be
clobbered, anyways). This reduced memory usage slightly by
around 0.5MB (because it's smaller compressed data).
Combined, these changes reduce the anonymous RSS memory of a
dedicated writer process from over 90MB to under 4MB.
before:
after:
Script used:¶
require 'zlib'
require 'benchmark'
nr = 16384 * 2
def stats(pfx, bm)
str = "#{bm}#{File.readlines("/proc/#$$/status").grep(/^RssAnon:/)[0]}"
puts str.gsub!(/^/m, pfx)
end
rd, wr = IO.pipe
pid = fork do
buf = ((0..255).map(&:chr).join * 128).freeze
rd.close
gzip = Zlib::GzipWriter.new(wr)
bm = Benchmark.measure do
nr.times { gzip.write(buf) }
gzip.close
wr.close
end
stats('writer ', bm)
end
wr.close
buf = ''
gunzip = Zlib::GzipReader.new(rd)
n = 0
bm = Benchmark.measure do
begin
gunzip.readpartial(16384, buf)
n += buf.size
rescue EOFError
break
end while true
end
stats('reader ', bm)
Process.waitall¶
(zstream_reset_input): clear hidden input
(zstream_run): detach input and recycle after use
(gzfile_write_raw): clear buffer after write
[ruby-core:84638] [Feature #14315]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61631 b2dd03c8-39d4-4d8f-98ff-823fe69b080e