don't allocate for struct key lookups #9

alicebob · 2015-01-30T14:18:16Z

Isn't really faster, somehow, but it saves a few allocs. The disadvantage is there is now a max key size (which I've set to 32).

master:

BenchmarkSlurp 300000 4338 ns/op 540 B/op 21 allocs/op

this PR:

BenchmarkSlurp 300000 4448 ns/op 414 B/op 9 allocs/op

alicebob · 2015-01-30T14:22:31Z

Together with #8 and the change from yesterday I now get this benchmark:

BenchmarkSlurp 500000 3529 ns/op 89 B/op 2 allocs/op

for this stripped City struct:

type GeoipCity struct { City struct { Names struct { En string `maxminddb:"en"` } `maxminddb:"names"` } `maxminddb:"city"` Country struct { GeoNameID uint `maxminddb:"geoname_id"` IsoCode string `maxminddb:"iso_code"` } `maxminddb:"country"` Location struct { Latitude float64 `maxminddb:"latitude"` Longitude float64 `maxminddb:"longitude"` MetroCode uint `maxminddb:"metro_code"` } `maxminddb:"location"` Postal struct { Code string `maxminddb:"code"` } `maxminddb:"postal"` }

oschwald · 2015-01-30T14:46:09Z

I am a bit unsure on this one. I think a better approach might be to cache the key string offsets for the open database. However, we would have to be careful when doing this as short key string may be reproduced and nothing in the file spec prevents long strings from being reproduced.

alicebob · 2015-01-30T14:50:42Z

Fair enough, but the problem is that you can't use byte slices as a map key. I'll think about this a bit more.

oschwald · 2015-01-30T14:54:02Z

I think we would only need to cache the mapping between the offset and the field index for the type. However, the scope of the cache will need to be limited to the particular reader instance.

alicebob · 2015-01-30T15:08:23Z

I'm not sure I follow. AFAICS the offsets of the fields depend on the length of the values, which vary for every object in the db.

We use unsafe, but the lifetime of the strings build with unsafe is very short, they are never returned.

alicebob · 2015-01-30T15:15:50Z

Using unsafe we can use strings of arbitrary lengths, see the last commit. Unsafe should be fine, since it's used to build strings we only use to compare; they are never returned to the user.

That said, it is use of unsafe ;)

oschwald · 2015-01-30T16:36:32Z

With the MaxMind DB writer, the strings over a certain size occur only once in the database. If you cache the offset within the file to the string, you don't actually need to read out the bytes to know it is that string.

alicebob · 2015-01-30T16:49:19Z

As long as all map keys are always pointers, even small ones, and all equivalent keys point to the same address in the database, that would be a nice solution indeed.

oschwald · 2015-01-30T16:54:09Z

The issue isn't so much whether they are pointers, as we would use the offset to the actual string in the file, but that small strings get repeated. We would need some way to limit the cache size or not cache the repeated strings.

alicebob · 2015-01-30T17:14:32Z

In that case I think the updated PR as it currently stands avoids all those problems, while still not needlessly allocating struct field names. It might not be the optimal way, but it's also reasonably simple.

Up to you, of course :)

oschwald · 2015-01-31T00:34:42Z

I've merged this. The use of the unsafe seems safe in this situation and I don't have any serious concerns now that the key size limitation has been removed. That said, I think it is still worth testing the caching of the keys by offset, but the changes here don't conflict with that.

alicebob · 2015-01-31T11:25:49Z

ok, thanks for the quick responses!

oschwald · 2015-01-31T16:38:10Z

Thanks for the PRs! 👍

don't allocate for struct key lookups

2c42af7

Don't limit the key length.

ba218fe

We use unsafe, but the lifetime of the strings build with unsafe is very short, they are never returned.

oschwald merged commit ba218fe into oschwald:master Jan 31, 2015

alicebob deleted the bytekey branch January 31, 2015 11:25

oschwald mentioned this pull request Feb 23, 2015

Do not use unsafe on Google App Engine #10

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

don't allocate for struct key lookups #9

don't allocate for struct key lookups #9

Uh oh!

alicebob commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 31, 2015

alicebob commented Jan 31, 2015

oschwald commented Jan 31, 2015

Labels

2 participants

Uh oh!

don't allocate for struct key lookups #9

don't allocate for struct key lookups #9

Uh oh!

Conversation

alicebob commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 30, 2015

alicebob commented Jan 30, 2015

oschwald commented Jan 31, 2015

alicebob commented Jan 31, 2015

oschwald commented Jan 31, 2015

Labels

2 participants