This library includes ultra-fast, Rust-based query string and urlencoded parsers. These parsers are used by Litestar, but are developed separately and can, of course, be used on their own.
```shell
pip install fast-query-parsers
```

The library exposes two functions: `parse_query_string` and `parse_url_encoded_dict`.
`parse_query_string` is used to parse a query string into a list of key/value tuples.
```python
from fast_query_parsers import parse_query_string

result = parse_query_string(b"value=1&value=2&type=dollar&country=US", "&")
# [("value", "1"), ("value", "2"), ("type", "dollar"), ("country", "US")]
```

The first argument to this function is a byte string containing the query string to be parsed; the second argument is the separator to use.
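Because the separator is an argument, query strings that use a different delimiter can be parsed the same way. A minimal sketch (the output shown is what one would expect from the example above, not verified output):

```python
from fast_query_parsers import parse_query_string

# parse a semicolon-delimited query string by passing ";" as the separator
result = parse_query_string(b"value=1;value=2;type=dollar", ";")
# expected: [("value", "1"), ("value", "2"), ("type", "dollar")]
```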
Query string parsing is more than five times faster than the standard library's:
```
stdlib parse_qsl parsing query string: Mean +- std dev: 2.86 us +- 0.03 us
parse_query_string parsing query string: Mean +- std dev: 916 ns +- 13 ns
stdlib parse_qsl parsing urlencoded query string: Mean +- std dev: 8.30 us +- 0.10 us
parse_query_string urlencoded query string: Mean +- std dev: 1.50 us +- 0.03 us
```

`parse_url_encoded_dict` is used to parse URL-encoded form data into a dictionary of the Python equivalents of JSON types.
```python
from urllib.parse import urlencode
from fast_query_parsers import parse_url_encoded_dict

encoded = urlencode(
    [
        ("value", "10"),
        ("value", "12"),
        ("veggies", '["tomato", "potato", "aubergine"]'),
        ("nested", '{"some_key": "some_value"}'),
        ("calories", "122.53"),
        ("healthy", "true"),
        ("polluting", "false"),
        ("json", "null"),
    ]
).encode()

result = parse_url_encoded_dict(encoded, parse_numbers=True)
# result == {
#     "value": [10, 12],
#     "veggies": ["tomato", "potato", "aubergine"],
#     "nested": {"some_key": "some_value"},
#     "calories": 122.53,
#     "healthy": True,
#     "polluting": False,
#     "json": None,
# }
```

This function handles type conversions correctly, unlike the standard library function `parse_qs`. Additionally, it does not nest all values inside lists.
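To illustrate the difference, here is what the standard library returns for the same payload; every value is left as a string and wrapped in a list:

```python
from urllib.parse import parse_qs

# `encoded` is the same urlencoded byte string built in the example above
result = parse_qs(encoded.decode())
# {
#     "value": ["10", "12"],
#     "veggies": ['["tomato", "potato", "aubergine"]'],
#     "nested": ['{"some_key": "some_value"}'],
#     "calories": ["122.53"],
#     "healthy": ["true"],
#     "polluting": ["false"],
#     "json": ["null"],
# }
```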
Note: the second argument passed to `parse_url_encoded_dict` dictates whether numbers should be parsed. If `True`, numeric values are parsed into `int` or `float` as appropriate; otherwise they are kept as strings. This argument defaults to `True`.
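A minimal illustration of that flag (the expected output is inferred from the description above, not verified output):

```python
from fast_query_parsers import parse_url_encoded_dict

# with parse_numbers=False, numeric values are kept as strings
result = parse_url_encoded_dict(b"calories=122.53&value=10", parse_numbers=False)
# expected: {"calories": "122.53", "value": "10"}
```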
URL-encoded parsing is more than twice as fast as the standard library, without accounting for the parsing of values:
```
stdlib parse_qs parsing url-encoded values into dict: Mean +- std dev: 8.99 us +- 0.09 us
parse_url_encoded_dict parse url-encoded values into dict: Mean +- std dev: 3.77 us +- 0.08 us
```

To actually mimic the parsing done by `parse_url_encoded_dict`, we would need a utility along these lines:
```python
from collections import defaultdict
from contextlib import suppress
from json import loads, JSONDecodeError
from typing import Any, DefaultDict, Dict, List
from urllib.parse import parse_qsl


def parse_url_encoded_form_data(encoded_data: bytes) -> Dict[str, Any]:
    """Parse URL-encoded form data into a dict of parsed values."""
    decoded_dict: DefaultDict[str, List[Any]] = defaultdict(list)
    for k, v in parse_qsl(encoded_data.decode(), keep_blank_values=True):
        with suppress(JSONDecodeError):
            v = loads(v) if isinstance(v, str) else v
        decoded_dict[k].append(v)
    return {k: v if len(v) > 1 else v[0] for k, v in decoded_dict.items()}
```

With the above, the benchmarks look like this:
```
python parse_url_encoded_form_data parsing url-encoded values into dict: Mean +- std dev: 19.7 us +- 0.1 us
parse_url_encoded_dict parsing url-encoded values into dict: Mean +- std dev: 3.69 us +- 0.03 us
```

All contributions are of course welcome!
- Run `cargo install` to set up the Rust dependencies and `poetry install` to set up the Python dependencies.
- Install the pre-commit hooks with `pre-commit install` (requires pre-commit).
- Run `poetry run maturin develop --release --strip` to install a release wheel (without debugging info). This wheel can be used in tests and benchmarks.
There are basic benchmarks using pyperf in place. To run these, execute `poetry run python benchrmarks.py`.
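A minimal pyperf comparison along these lines can serve as a starting point for local measurements. This is a hypothetical sketch, not the repository's benchmark script:

```python
import pyperf
from urllib.parse import parse_qsl

from fast_query_parsers import parse_query_string

QUERY = b"value=1&value=2&type=dollar&country=US"

# pyperf spawns worker processes and reports mean +- std dev for each benchmark
runner = pyperf.Runner()
runner.bench_func("stdlib parse_qsl parsing query string", parse_qsl, QUERY.decode())
runner.bench_func("parse_query_string parsing query string", parse_query_string, QUERY, "&")
```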