- Notifications
You must be signed in to change notification settings - Fork 10
Typing Support (ingress.pyi file) #95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[tool.cibuildwheel] | ||
# See: https://cibuildwheel.readthedocs.io/en/stable/options/#configuration-file | ||
build-verbosity = "3" | ||
build-verbosity = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe build-verbosity
is supposed to be an integer, not a string: https://cibuildwheel.pypa.io/en/stable/options/#configuration-file:~:text=pass%20%3D%20%5B%5D%0Abuild%2D-,verbosity,-%3D%200%0A%0Abefore
"""Maximum length of a table or column name.""" | ||
return self._max_name_len | ||
| ||
@property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a duplicate of lines #792-795
@amunra I've realised I've conflated a few things here - please let me know if you'd like them split over different PRs. |
No, that's not an issue. They are functionally effectively equivalent to header file in C, but have the downside (unless I'm wrong - and I hope I am) that there's nothing to ensure that they are in sync with the Cython implementation. |
Agreed. I tried looking for solutions, as mentioned in the initial PR comment, for autocompleting a stub file. I found all solutions, as it stands, have drawbacks, which means they do not produce a reliable file (and in most cases, just error out, even with extreme modification, as all these scripts text/symbol parsing).
I don't think you're wrong 😔 I think the best long-term solution would be to have something like Also, given how little this interface is changing (at least for the time being), I wasn't sure whether it would be ok for now either way? |
Auto-generation packages I tried:
|
These are the same ones I would have experimented with. What issues did you encounter, specifically? |
Agreed, and good question. Initially the logs output:
To look deeper, I made a throwaway script: from cythonbuilder.pyigenerator import pyx_to_pyi with open("./src/questdb/ingress.pyx", "r") as f: output = pyx_to_pyi(f) with open("./src/questdb/ingress.pyi", "w") as f: f.writelines(output) Which resulted in:
After fixing their counter which doesn't handle a line length of 1 correctly... def get_line_indentation_spacecount(line: str) -> int: """ How many spaces is the provided line indented? """ spaces: int = 0 # My code ------ for c in line: if c != ' ': break spaces += 1 return spaces # End my code --- if (len(line) > 0): if (line[0] == ' '): for letter_idx in range(len(line)): if (line[letter_idx + 1] != ' '): spaces = letter_idx + 1 break return spaces Another issue crops up:
Removing that check (for the time being), and also modifying their processor to remove lines which are all spaces by adding pyx_lines = [l for l in pyx_lines if not l.isspace()] We do yield a result of the interface file: """ import cython from enum import Enum from typing import List, Tuple, Dict, Union, Any, Optional, Callable, \ import pathlib import sys import os def _has_gil(gs:pythreadstate**) -> bool: ... def _ensure_doesnt_have_gil(gs:pythreadstate**) -> bool: ... def _ensure_has_gil(gs:pythreadstate**) -> None: ... class IngressErrorCode(Enum): """Category of Error.""" CouldNotResolveAddr = line_sender_error_could_not_resolve_addr InvalidApiCall = line_sender_error_invalid_api_call SocketError = line_sender_error_socket_error InvalidUtf8 = line_sender_error_invalid_utf8 InvalidName = line_sender_error_invalid_name InvalidTimestamp = line_sender_error_invalid_timestamp AuthError = line_sender_error_auth_error TlsError = line_sender_error_tls_error HttpNotSupported = line_sender_error_http_not_supported ServerFlushError = line_sender_error_server_flush_error ConfigError = line_sender_error_config_error BadDataFrame = <int>line_sender_error_server_flush_error + 1 def __str__(self) -> str: ... class IngressError(Exception): """An error whilst using the ``Sender`` or constructing its ``Buffer``.""" def __init__(self, code, msg): ... @property def code(self) -> IngressErrorCode: ... def c_err_code_to_py(code:line_sender_error_code): ... def c_err_to_code_and_msg(err:line_sender_error*): ... def c_err_to_py(err:line_sender_error*): ... def c_err_to_py_fmt(err:line_sender_error*, fmt:str): ... def _fqn(obj:type) -> str: ... def datetime_to_micros(dt:datetime) -> int64_t: ... def datetime_to_nanos(dt:datetime) -> int64_t: ... class _ServerTimestamp: """ A placeholder value to indicate using a server-generated-timestamp. """ pass ServerTimestamp = _ServerTimestamp() class TimestampMicros: """ A timestamp in microseconds since the UNIX epoch (UTC). You may construct a ``TimestampMicros`` from an integer or a ``datetime.datetime``, or simply call the :func:`TimestampMicros.now` method. .. code-block:: python TimestampMicros.now() TimestampMicros(time.time_ns() // 1000) TimestampMicros(1657888365426838) ``TimestampMicros`` can also be constructed from a ``datetime.datetime`` object. .. code-block:: python TimestampMicros.from_datetime( datetime.datetime.now(tz=datetime.timezone.utc)) We recommend that when using ``datetime`` objects, you explicitly pass in the timezone to use. This is because ``datetime`` objects without an associated timezone are assumed to be in the local timezone and it is easy to make mistakes (e.g. passing ``datetime.datetime.utcnow()`` is a likely bug). """ cdef int64_t _value def __cinit__(self, value: int): ... @classmethod def from_datetime(cls, dt: datetime): ... @classmethod def now(cls): ... @property def value(self) -> int: ... def __repr__(self): ... class TimestampNanos: """ A timestamp in nanoseconds since the UNIX epoch (UTC). You may construct a ``TimestampNanos`` from an integer or a ``datetime.datetime``, or simply call the :func:`TimestampNanos.now` method. .. code-block:: python TimestampNanos.now() TimestampNanos(time.time_ns()) TimestampNanos(1657888365426838016) ``TimestampNanos`` can also be constructed from a ``datetime`` object. .. code-block:: python TimestampNanos.from_datetime( datetime.datetime.now(tz=datetime.timezone.utc)) We recommend that when using ``datetime`` objects, you explicitly pass in the timezone to use. This is because ``datetime`` objects without an associated timezone are assumed to be in the local timezone and it is easy to make mistakes (e.g. passing ``datetime.datetime.utcnow()`` is a likely bug). """ cdef int64_t _value def __cinit__(self, value: int): ... @classmethod def from_datetime(cls, dt: datetime): ... @classmethod def now(cls): ... @property def value(self) -> int: ... def __repr__(self): ... def -1: ... def _is_tcp_protocol(protocol:line_sender_protocol) -> bool: ... def _is_http_protocol(protocol:line_sender_protocol) -> bool: ... class SenderTransaction: """ A transaction for a specific table. Transactions are not supported with ILP/TCP, only ILP/HTTP. The sender API can only operate on one transaction at a time. To create a transaction: .. code_block:: python with sender.transaction('table_name') as txn: txn.row(..) txn.dataframe(..) """ cdef Sender _sender cdef str _table_name cdef bint _complete def __cinit__(self, Sender sender, str table_name): ... def __enter__(self): ... def __exit__(self, exc_type, _exc_value, _traceback): ... def commit(self): ... def rollback(self): ... class Buffer: """ Construct QuestDB-flavored InfluxDB Line Protocol (ILP) messages. The :func:`Buffer.row` method is used to add a row to the buffer. You can call this many times. .. code-block:: python from questdb.ingress import Buffer from questdb.ingress import Sender, Buffer def __cinit__(self, init_buf_size: int=65536, max_name_len: int=127): ... def _cinit_impl(self, init_buf_size:size_t, max_name_len:size_t) -> inline: ... def __dealloc__(self): ... @property def init_buf_size(self) -> int: ... @property def max_name_len(self) -> int: ... def reserve(self, additional: int): ... def capacity(self) -> int: ... def clear(self): ... def __len__(self) -> int: ... def __str__(self) -> str: ... def _to_str(self): ... def -1: ... def -1: ... def _clear_marker(self) -> inline: ... def -1: ... def _cleared_b(self): ... def -1: ... def -1: ... def -1: ... def -1: ... def -1: ... def -1: ... def -1: ... import pandas as pd import questdb.ingress as qi def _timedelta_to_millis(timedelta:object) -> uint64_t: ... def auto_flush_rows_default(protocol:line_sender_protocol) -> int64_t: ... class TaggedEnum(Enum): """ Base class for tagged enums. """ @property def tag(self): ... @property def c_value(self): ... @classmethod def parse(cls, tag): ... class Protocol(TaggedEnum): """ Protocol to use for sending data to QuestDB. See :ref:`sender_which_protocol` for more information. """ Tcp = ('tcp', 0) Tcps = ('tcps', 1) Http = ('http', 2) Https = ('https', 3) @property def tls_enabled(self): ... class TlsCa(TaggedEnum): """ Verification mechanism for the server's certificate. Here ``webpki`` refers to the `WebPKI library <https://github.com/rustls/webpki-roots>`_ and ``os`` refers to the operating system's certificate store. See :ref:`sender_conf_tls` for more information. """ WebpkiRoots = ('webpki_roots', line_sender_ca_webpki_roots) OsRoots = ('os_roots', line_sender_ca_os_roots) WebpkiAndOsRoots = ('webpki_and_os_roots', line_sender_ca_webpki_and_os_roots) PemFile = ('pem_file', line_sender_ca_pem_file) def c_parse_conf_err_to_py(err:questdb_conf_str_parse_err*) -> object: ... class Sender: """ Ingest data into QuestDB. See the :ref:`sender` documentation for more information. """ cdef object __weakref__ cdef line_sender_protocol _c_protocol cdef line_sender_opts* _opts cdef line_sender* _impl cdef Buffer _buffer cdef auto_flush_mode_t _auto_flush_mode cdef int64_t* _last_flush_ms cdef size_t _init_buf_size cdef size_t _max_name_len cdef bint _in_txn cdef void_int _set_sender_fields( self, qdb_pystr_buf* b, object protocol, str bind_interface, str username, str password, str token, str token_x, str token_y, object auth_timeout, object tls_verify, object tls_ca, object tls_roots, object max_buf_size, object retry_timeout, object request_min_throughput, object request_timeout, object auto_flush, object auto_flush_rows, object auto_flush_bytes, object auto_flush_interval, object init_buf_size, object max_name_len) except -1: """ Set optional parameters for the sender. """ cdef line_sender_error* err = NULL cdef str user_agent = 'questdb/python/' + VERSION cdef line_sender_utf8 c_user_agent cdef line_sender_utf8 c_bind_interface cdef line_sender_utf8 c_username cdef line_sender_utf8 c_password cdef line_sender_utf8 c_token cdef line_sender_utf8 c_token_x cdef line_sender_utf8 c_token_y cdef uint64_t c_auth_timeout cdef bint c_tls_verify cdef line_sender_ca c_tls_ca cdef line_sender_utf8 c_tls_roots cdef uint64_t c_max_buf_size cdef uint64_t c_retry_timeout cdef uint64_t c_request_min_throughput cdef uint64_t c_request_timeout self._c_protocol = protocol.c_value str_to_utf8(b, <PyObject*>user_agent, &c_user_agent) if not line_sender_opts_user_agent(self._opts, c_user_agent, &err): raise c_err_to_py(err) if bind_interface is not None: str_to_utf8(b, <PyObject*>bind_interface, &c_bind_interface) if not line_sender_opts_bind_interface(self._opts, c_bind_interface, &err): raise c_err_to_py(err) if username is not None: str_to_utf8(b, <PyObject*>username, &c_username) if not line_sender_opts_username(self._opts, c_username, &err): raise c_err_to_py(err) if password is not None: str_to_utf8(b, <PyObject*>password, &c_password) if not line_sender_opts_password(self._opts, c_password, &err): raise c_err_to_py(err) if token is not None: str_to_utf8(b, <PyObject*>token, &c_token) if not line_sender_opts_token(self._opts, c_token, &err): raise c_err_to_py(err) if token_x is not None: str_to_utf8(b, <PyObject*>token_x, &c_token_x) if not line_sender_opts_token_x(self._opts, c_token_x, &err): raise c_err_to_py(err) if token_y is not None: str_to_utf8(b, <PyObject*>token_y, &c_token_y) if not line_sender_opts_token_y(self._opts, c_token_y, &err): raise c_err_to_py(err) if auth_timeout is not None: if isinstance(auth_timeout, int): c_auth_timeout = auth_timeout elif isinstance(auth_timeout, timedelta): c_auth_timeout = _timedelta_to_millis(auth_timeout) else: raise TypeError( '"auth_timeout" must be an int or a timedelta, ' f'not {_fqn(type(auth_timeout))}') if not line_sender_opts_auth_timeout(self._opts, c_auth_timeout, &err): raise c_err_to_py(err) if tls_verify is not None: if (tls_verify is True) or (tls_verify == 'on'): c_tls_verify = True elif (tls_verify is False) or (tls_verify == 'unsafe_off'): c_tls_verify = False else: raise ValueError( '"tls_verify" must be a bool, "on" or "unsafe_off", ' f'not {tls_verify!r}') if not line_sender_opts_tls_verify(self._opts, c_tls_verify, &err): raise c_err_to_py(err) if tls_roots is not None: tls_roots = str(tls_roots) str_to_utf8(b, <PyObject*>tls_roots, &c_tls_roots) if not line_sender_opts_tls_roots(self._opts, c_tls_roots, &err): raise c_err_to_py(err) if tls_ca is not None: c_tls_ca = TlsCa.parse(tls_ca).c_value if not line_sender_opts_tls_ca(self._opts, c_tls_ca, &err): raise c_err_to_py(err) elif protocol.tls_enabled and tls_roots is None: c_tls_ca = line_sender_ca_webpki_and_os_roots if not line_sender_opts_tls_ca(self._opts, c_tls_ca, &err): raise c_err_to_py(err) if max_buf_size is not None: c_max_buf_size = max_buf_size if not line_sender_opts_max_buf_size(self._opts, c_max_buf_size, &err): raise c_err_to_py(err) if retry_timeout is not None: if isinstance(retry_timeout, int): c_retry_timeout = retry_timeout if not line_sender_opts_retry_timeout(self._opts, c_retry_timeout, &err): raise c_err_to_py(err) elif isinstance(retry_timeout, timedelta): c_retry_timeout = _timedelta_to_millis(retry_timeout) if not line_sender_opts_retry_timeout(self._opts, c_retry_timeout, &err): raise c_err_to_py(err) else: raise TypeError( '"retry_timeout" must be an int or a timedelta, ' f'not {_fqn(type(retry_timeout))}') if request_min_throughput is not None: c_request_min_throughput = request_min_throughput if not line_sender_opts_request_min_throughput(self._opts, c_request_min_throughput, &err): raise c_err_to_py(err) if request_timeout is not None: if isinstance(request_timeout, int): c_request_timeout = request_timeout if not line_sender_opts_request_timeout(self._opts, c_request_timeout, &err): raise c_err_to_py(err) elif isinstance(request_timeout, timedelta): c_request_timeout = _timedelta_to_millis(request_timeout) if not line_sender_opts_request_timeout(self._opts, c_request_timeout, &err): raise c_err_to_py(err) else: raise TypeError( '"request_timeout" must be an int or a timedelta, ' f'not {_fqn(type(request_timeout))}') _parse_auto_flush( self._c_protocol, auto_flush, auto_flush_rows, auto_flush_bytes, auto_flush_interval, &self._auto_flush_mode) self._init_buf_size = init_buf_size or 65536 self._max_name_len = max_name_len or 127 self._buffer = Buffer( init_buf_size=self._init_buf_size, max_name_len=self._max_name_len) self._last_flush_ms = <int64_t*>calloc(1, sizeof(int64_t)) def __cinit__(self): ... @staticmethod @staticmethod def new_buffer(self): ... @property def init_buf_size(self) -> int: ... @property def max_name_len(self) -> int: ... @property def auto_flush(self) -> bint: ... @property def auto_flush_rows(self) -> Optional[int]: ... @property def auto_flush_bytes(self) -> Optional[int]: ... @property def auto_flush_interval(self) -> Optional[timedelta]: ... def establish(self): ... def __enter__(self) -> Sender: ... def __str__(self) -> str: ... def __len__(self) -> int: ... def transaction(self, table_name: str): ... import pandas as pd import questdb.ingress as qi def _close(self): ... def close(self, flush:bool=True): ... def __exit__(self, exc_type, _exc_val, _exc_tb): ... def __dealloc__(self): ... It was at this point, I figured I was facing an uphill battle with their parser, so opted to go the manual route, at least on my fork to allow for easier dev on my side. Hope some of this info is helpful |
I've just tried it too. It's not going to have decent mileage if it isn't using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried on my end too, and there's really no better way of approaching this for now.
I appreciate the minor improvements too, thanks for the changes here!
As mentioned in this issue, the inclusion of a
.pyi
file makes type checkers and IDEs happier to show function signatures, required types and defaults if they exist.Unfortauntely, I tried a few
pyi
generation scripts/modules, and nothing worked flawlessly. Otherwise, I'd have added a script to thesetup.py
to auto-generate, allowing for a more streamlined process to any updates to the Cython file (ingress.pyx
).For now, this is a quick crack at manually getting the stub file complete using the existing Cython public methods, types and docstrings.
This PR also includes adjusting one of the TOML fields which showed an error as an unexpected type, as well as removing a duplicate function delcaration from the Cython file.
It also removes redundant built logic given the underlying issue has been solved, and adjusts checks for
fastparquet
due to no 3.13 binaries being available yet.