_ = load_dotenv(find_dotenv(), override=True)
service_url = os.environ['TIMESCALE_SERVICE_URL']
Client
uuid_from_time
uuid_from_time (time_arg=None, node=None, clock_seq=None)
Converts a datetime or timestamp to a type 1 uuid.UUID.
| | Type | Default | Details |
|---|---|---|---|
| time_arg | NoneType | None | The time to use for the timestamp portion of the UUID. This can be either a datetime object or a timestamp in seconds (as returned by time.time()). |
| node | NoneType | None | Bytes for the UUID (up to 48 bits). If not specified, this field is randomized. |
| clock_seq | NoneType | None | Clock sequence field for the UUID (up to 14 bits). If not specified, a random sequence is generated. |
| Returns | uuid.UUID | | The UUID for the given time, node, and clock sequence. |
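To make the timestamp portion concrete, here is a minimal sketch of how a datetime can be packed into a version 1 UUID using only the standard library. The helper names `sketch_uuid_from_time` and `sketch_time_from_uuid` are hypothetical and this is not the library's implementation; it assumes timezone-aware datetimes and uses fixed defaults for `node` and `clock_seq` rather than random ones.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-nanosecond intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sketch_uuid_from_time(when: datetime, node: int = 0, clock_seq: int = 0) -> uuid.UUID:
    """Hypothetical helper: pack a timezone-aware datetime into a version 1 UUID."""
    micros = (when - UNIX_EPOCH) // timedelta(microseconds=1)
    intervals = micros * 10 + UUID_EPOCH_OFFSET  # 60-bit timestamp
    return uuid.UUID(
        fields=(
            intervals & 0xFFFFFFFF,      # time_low
            (intervals >> 32) & 0xFFFF,  # time_mid
            (intervals >> 48) & 0x0FFF,  # time_hi (version bits are set by version=1)
            (clock_seq >> 8) & 0x3F,     # clock_seq_hi (variant bits are set by version=1)
            clock_seq & 0xFF,            # clock_seq_low
            node & 0xFFFFFFFFFFFF,       # 48-bit node
        ),
        version=1,
    )


def sketch_time_from_uuid(u: uuid.UUID) -> datetime:
    """Recover the embedded timestamp from a version 1 UUID."""
    micros = (u.time - UUID_EPOCH_OFFSET) // 10
    return UNIX_EPOCH + timedelta(microseconds=micros)
```

Because the timestamp is stored inside the id itself, a time range can be checked against the id column without a separate timestamp column.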
Index Definitions
DiskAnnIndex
DiskAnnIndex (search_list_size:Optional[int]=None, num_neighbors:Optional[int]=None, max_alpha:Optional[float]=None, storage_layout:Optional[str]=None, num_dimensions:Optional[int]=None, num_bits_per_dimension:Optional[int]=None)
Timescale’s vector index.
HNSWIndex
HNSWIndex (m:Optional[int]=None, ef_construction:Optional[int]=None)
Pgvector’s hnsw index.
IvfflatIndex
IvfflatIndex (num_records:Optional[int]=None, num_lists:Optional[int]=None)
Pgvector’s ivfflat index.
BaseIndex
BaseIndex ()
Initialize self. See help(type(self)) for accurate signature.
Query Params
HNSWIndexParams
HNSWIndexParams (ef_search:int)
Initialize self. See help(type(self)) for accurate signature.
IvfflatIndexParams
IvfflatIndexParams (probes:int)
Initialize self. See help(type(self)) for accurate signature.
DiskAnnIndexParams
DiskAnnIndexParams (search_list_size:Optional[int]=None, rescore:Optional[int]=None)
Initialize self. See help(type(self)) for accurate signature.
QueryParams
QueryParams (params:dict[str,typing.Any])
Initialize self. See help(type(self)) for accurate signature.
Query Builder
UUIDTimeRange
UUIDTimeRange (start_date:Union[datetime.datetime,str,NoneType]=None, end_date:Union[datetime.datetime,str,NoneType]=None, time_delta:Optional[datetime.timedelta]=None, start_inclusive=True, end_inclusive=False)
A UUIDTimeRange is a time range predicate on the UUID Version 1 timestamps.
Note that naive datetime objects are interpreted as local time on the Python client side and converted to UTC before being sent to the database.
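The signature's defaults (`start_inclusive=True`, `end_inclusive=False`) describe a half-open interval. The sketch below illustrates that behavior with a hypothetical `TimeRangeSketch` class; it is not the library class, and it assumes that `time_delta` is used to derive whichever endpoint is missing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TimeRangeSketch:
    """Hypothetical stand-in for a time range predicate (not the library class)."""
    start_date: Optional[datetime] = None
    end_date: Optional[datetime] = None
    time_delta: Optional[timedelta] = None
    start_inclusive: bool = True
    end_inclusive: bool = False

    def __post_init__(self) -> None:
        # Assumption: time_delta fills in whichever endpoint was not given.
        if self.time_delta is not None:
            if self.start_date is not None and self.end_date is None:
                self.end_date = self.start_date + self.time_delta
            elif self.end_date is not None and self.start_date is None:
                self.start_date = self.end_date - self.time_delta

    def contains(self, ts: datetime) -> bool:
        """Return True if ts falls inside the range, honoring inclusivity flags."""
        if self.start_date is not None:
            if ts < self.start_date or (ts == self.start_date and not self.start_inclusive):
                return False
        if self.end_date is not None:
            if ts > self.end_date or (ts == self.end_date and not self.end_inclusive):
                return False
        return True
```

With these defaults, a range that ends exactly at a record's timestamp excludes that record, while a range that starts at it includes the record.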
Predicates
Predicates (*clauses:Union[ForwardRef('Predicates'),Tuple[str,Union[str,int,float,datetime.datetime,list,tuple]],Tuple[str,str,Union[str,int,float,datetime.datetime,list,tuple]],str,int,float,datetime.datetime,list,tuple], operator:str='AND')
Predicates class defines predicates on the object metadata. Predicates can be combined using logical operators (&, |, and ~).
| | Type | Default | Details |
|---|---|---|---|
| clauses | Union | | Predicate clauses. Can be either another Predicates object or a tuple of the form (field, operator, value) or (field, value). |
| operator | str | AND | |
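Combining predicates with `&`, `|`, and `~` relies on Python operator overloading. The following is a minimal sketch of that pattern using a hypothetical `Pred` class that evaluates against an in-memory metadata dict; the real class generates SQL conditions instead, so treat this purely as an illustration of the composition mechanics.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Metadata = Dict[str, Any]


@dataclass(frozen=True)
class Pred:
    """Hypothetical composable predicate over a metadata dict."""
    test: Callable[[Metadata], bool]

    def __and__(self, other: "Pred") -> "Pred":
        return Pred(lambda m: self.test(m) and other.test(m))

    def __or__(self, other: "Pred") -> "Pred":
        return Pred(lambda m: self.test(m) or other.test(m))

    def __invert__(self) -> "Pred":
        return Pred(lambda m: not self.test(m))


def eq(field: str, value: Any) -> Pred:
    """Build a predicate that checks metadata[field] == value."""
    return Pred(lambda m: m.get(field) == value)


a = eq("key", "val2")
b = eq("key_2", "val_2")
either = a | b       # matches when at least one clause holds
both = a & b         # matches only when both clauses hold
not_a = ~a           # matches when the clause does not hold
```

Note that Python's `and`, `or`, and `not` keywords cannot be overloaded: `a and b` simply evaluates to `b` when `a` is truthy, so it does not build a combined predicate. Use `&`, `|`, and `~` when the intent is to combine clauses.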
QueryBuilder
QueryBuilder (table_name:str, num_dimensions:int, distance_type:str, id_type:str, time_partition_interval:Optional[datetime.timedelta], infer_filters:bool, schema_name:Optional[str])
Initializes a base Vector object to generate queries for vector clients.
| | Type | Details |
|---|---|---|
| table_name | str | The name of the table. |
| num_dimensions | int | The number of dimensions for the embedding vector. |
| distance_type | str | The distance type for indexing. |
| id_type | str | The type of the id column. Can be either ‘UUID’ or ‘TEXT’. |
| time_partition_interval | Optional | The time interval for partitioning the table (optional). |
| infer_filters | bool | Whether to infer start and end times from the special __start_date and __end_date filters. |
| schema_name | Optional | The schema name for the table (optional, uses the database’s default schema if not specified). |
| Returns | None | |
QueryBuilder.get_create_query
QueryBuilder.get_create_query ()
Generates a query to create the tables, indexes, and extensions needed to store the vector data.
Async Client
Async
Async (service_url:str, table_name:str, num_dimensions:int, distance_type:str='cosine', id_type='UUID', time_partition_interval:Optional[datetime.timedelta]=None, max_db_connections:Optional[int]=None, infer_filters:bool=True, schema_name:Optional[str]=None)
Initializes an async client for storing vector data.
| | Type | Default | Details |
|---|---|---|---|
| service_url | str | | The connection string for the database. |
| table_name | str | | The name of the table. |
| num_dimensions | int | | The number of dimensions for the embedding vector. |
| distance_type | str | cosine | The distance type for indexing. |
| id_type | str | UUID | The type of the id column. Can be either ‘UUID’ or ‘TEXT’. |
| time_partition_interval | Optional | None | The time interval for partitioning the table (optional). |
| max_db_connections | Optional | None | |
| infer_filters | bool | True | Whether to infer start and end times from the special __start_date and __end_date filters. |
| schema_name | Optional | None | The schema name for the table (optional, uses the database’s default schema if not specified). |
| Returns | None | | |
Async.create_tables
Async.create_tables ()
Creates necessary tables.
Async.search
Async.search (query_embedding:Optional[List[float]]=None, limit:int=10, filter:Union[Dict[str,str],List[Dict[str,str]],NoneType]=None, predicates:Optional[__main__.Predicates]=None, uuid_time_filter:Optional[__main__.UUIDTimeRange]=None, query_params:Optional[__main__.QueryParams]=None)
Retrieves similar records using a similarity query.
| | Type | Default | Details |
|---|---|---|---|
| query_embedding | Optional | None | The query embedding vector. |
| limit | int | 10 | The number of nearest neighbors to retrieve. |
| filter | Union | None | A filter for metadata. Should be specified as a key-value object or a list of key-value objects (where any objects in the list are matched). |
| predicates | Optional | None | A Predicates object to filter the results. Predicates support more complex queries than the filter parameter. Predicates can be combined using logical operators (&, \|, and ~). |
| uuid_time_filter | Optional | None | A UUIDTimeRange object to filter the results by time using the id column. |
| query_params | Optional | None | |
| Returns | List | | List of similar records. |
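The `filter` semantics described in the table (every key in a single object must match; with a list, any one object may match) can be modeled in memory as follows. `matches_filter` is a hypothetical helper written for illustration; the client itself evaluates the filter in the database rather than in Python.

```python
from typing import Any, Dict, List, Union

Metadata = Dict[str, Any]
Filter = Union[Dict[str, str], List[Dict[str, str]], None]


def matches_filter(metadata: Metadata, flt: Filter) -> bool:
    """Hypothetical in-memory model of the metadata filter semantics."""
    if flt is None:
        return True  # no filter: every record qualifies
    candidates = flt if isinstance(flt, list) else [flt]
    # A record matches if ANY candidate object is fully contained in its metadata.
    return any(
        all(metadata.get(key) == value for key, value in candidate.items())
        for candidate in candidates
    )


rows = [
    {"key_1": "val_1", "key_2": "val_2"},
    {"key2": "val2"},
    {"key2": "val"},
]
hits = [r for r in rows if matches_filter(r, [{"key_1": "val_1"}, {"key2": "val2"}])]
```

Here `hits` contains the first two rows: one satisfies the first object in the list and the other satisfies the second.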
Usage Example
for schema in ["tschema", None]:
    vec = Async(service_url, "data_table", 2, schema_name=schema)
    await vec.create_tables()
    empty = await vec.table_is_empty()
    assert empty
    await vec.upsert([(uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])])
    empty = await vec.table_is_empty()
    assert not empty

    await vec.upsert([
        (uuid.uuid4(), '''{"key":"val"}''', "the brown fox", [1.0, 1.3]),
        (uuid.uuid4(), '''{"key":"val2", "key_10": "10", "key_11": "11.3"}''', "the brown fox", [1.0, 1.4]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.5]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.6]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.6]),
        (uuid.uuid4(), '''{"key2":"val2"}''', "the brown fox", [1.0, 1.7]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.9]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 100.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 101.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key_1":"val_1", "key_2":"val_2"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key0": [1,2,3,4]}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key0": [8,9,"A"]}''', "the brown fox", [1.0, 1.8]),  # mixed types
        (uuid.uuid4(), '''{"key0": [5,6,7], "key3": 3}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key0": ["B", "C"]}''', "the brown fox", [1.0, 1.8]),
    ])

    await vec.create_embedding_index(IvfflatIndex())
    await vec.drop_embedding_index()
    await vec.create_embedding_index(IvfflatIndex(100))
    await vec.drop_embedding_index()
    await vec.create_embedding_index(HNSWIndex())
    await vec.drop_embedding_index()
    await vec.create_embedding_index(HNSWIndex(20, 125))
    await vec.drop_embedding_index()
    await vec.create_embedding_index(DiskAnnIndex())
    await vec.drop_embedding_index()
    await vec.create_embedding_index(DiskAnnIndex(50, 50, 1.5, "memory_optimized", 2, 1))

    rec = await vec.search([1.0, 2.0])
    assert len(rec) == 10
    rec = await vec.search([1.0, 2.0], limit=4)
    assert len(rec) == 4
    rec = await vec.search(limit=4)
    assert len(rec) == 4
    rec = await vec.search([1.0, 2.0], limit=4, filter={"key2": "val2"})
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, filter={"key2": "does not exist"})
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, filter={"key_1": "val_1"})
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], filter={"key_1": "val_1", "key_2": "val_2"})
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, filter={"key_1": "val_1", "key_2": "val_3"})
    assert len(rec) == 0
    rec = await vec.search(limit=4, filter={"key_1": "val_1", "key_2": "val_3"})
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 2
    rec = await vec.search(limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 2
    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}, {"no such key": "no such val"}])
    assert len(rec) == 2
    assert isinstance(rec[0][SEARCH_RESULT_METADATA_IDX], dict)
    assert isinstance(rec[0]["metadata"], dict)
    assert rec[0]["contents"] == "the brown fox"

    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(("key", "val2")))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(("key", "==", "val2")))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key", "==", "val2"))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_10", "<", 100))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_10", "<", 10))
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_10", "<=", 10))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_10", "<=", 10.0))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_11", "<=", 11.3))
    assert len(rec) == 1
    rec = await vec.search(limit=4, predicates=Predicates("key_11", ">=", 11.29999))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_11", "<", 11.299999))
    assert len(rec) == 0

    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [1, 2]))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [3, 7]))
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [42]))
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [4]))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [9, "A"]))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", ["A"]))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", ("C", "B")))
    assert len(rec) == 1

    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(*[("key", "val2"), ("key_10", "<", 100)]))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(("key", "val2"), ("key_10", "<", 100), operator='AND'))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(("key", "val2"), ("key_2", "val_2"), operator='OR'))
    assert len(rec) == 2
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_10", "<", 100) & (Predicates("key","==", "val2",) | Predicates("key_2", "==", "val_2")))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key_10", "<", 100) and (Predicates("key","==", "val2") or Predicates("key_2","==", "val_2")))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [6,7]) and Predicates("key3","==", 3))
    assert len(rec) == 1
    rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates("key0", "@>", [6,7]) and Predicates("key3","==", 6))
    assert len(rec) == 0
    rec = await vec.search(limit=4, predicates=~Predicates(("key", "val2"), ("key_10", "<", 100)))
    assert len(rec) == 4

    raised = False
    try:
        # can't upsert using both keys and dictionaries
        await vec.upsert([
            (uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2]),
            (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.2])
        ])
    except ValueError as e:
        raised = True
    assert raised

    raised = False
    try:
        # can't upsert using both keys and dictionaries opposite order
        await vec.upsert([
            (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.2]),
            (uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])
        ])
    except BaseException as e:
        raised = True
    assert raised

    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 2
    await vec.delete_by_ids([rec[0]["id"]])
    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 1
    await vec.delete_by_metadata([{"key_1": "val_1"}, {"key2": "val2"}])
    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key2": "val"}])
    assert len(rec) == 4
    await vec.delete_by_metadata([{"key2": "val"}])
    rec = await vec.search([1.0, 2.0], limit=4, filter=[{"key2": "val"}])
    assert len(rec) == 0

    assert not await vec.table_is_empty()
    await vec.delete_all()
    assert await vec.table_is_empty()

    await vec.drop_table()
    await vec.close()

    vec = Async(service_url, "data_table", 2, id_type="TEXT")
    await vec.create_tables()
    empty = await vec.table_is_empty()
    assert empty
    await vec.upsert([("Not a valid UUID", {"key": "val"}, "the brown fox", [1.0, 1.2])])
    empty = await vec.table_is_empty()
    assert not empty
    await vec.delete_by_ids(["Not a valid UUID"])
    empty = await vec.table_is_empty()
    assert empty
    await vec.drop_table()
    await vec.close()

    vec = Async(service_url, "data_table", 2, time_partition_interval=timedelta(seconds=60))
    await vec.create_tables()
    empty = await vec.table_is_empty()
    assert empty
    id = uuid.uuid1()
    await vec.upsert([(id, {"key": "val"}, "the brown fox", [1.0, 1.2])])
    empty = await vec.table_is_empty()
    assert not empty
    await vec.delete_by_ids([id])
    empty = await vec.table_is_empty()
    assert empty

    raised = False
    try:
        # can't upsert with uuid type 4 in time partitioned table
        await vec.upsert([
            (uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])
        ])
    except BaseException as e:
        raised = True
    assert raised

    specific_datetime = datetime(2018, 8, 10, 15, 30, 0)
    await vec.upsert([
        # current time
        (uuid.uuid1(), {"key": "val"}, "the brown fox", [1.0, 1.2]),
        # time in 2018
        (uuid_from_time(specific_datetime), {"key": "val"}, "the brown fox", [1.0, 1.2])
    ])
    assert not await vec.table_is_empty()

    # check all the possible ways to specify a date range
    async def search_date(start_date, end_date, expected):
        # using uuid_time_filter
        rec = await vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(start_date, end_date))
        assert len(rec) == expected
        rec = await vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(str(start_date), str(end_date)))
        assert len(rec) == expected

        # using filters
        filter = {}
        if start_date is not None:
            filter["__start_date"] = start_date
        if end_date is not None:
            filter["__end_date"] = end_date
        rec = await vec.search([1.0, 2.0], limit=4, filter=filter)
        assert len(rec) == expected

        # using filters with string dates
        filter = {}
        if start_date is not None:
            filter["__start_date"] = str(start_date)
        if end_date is not None:
            filter["__end_date"] = str(end_date)
        rec = await vec.search([1.0, 2.0], limit=4, filter=filter)
        assert len(rec) == expected

        # using predicates
        predicates = []
        if start_date is not None:
            predicates.append(("__uuid_timestamp", ">=", start_date))
        if end_date is not None:
            predicates.append(("__uuid_timestamp", "<", end_date))
        rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(*predicates))
        assert len(rec) == expected

        # using predicates with string dates
        predicates = []
        if start_date is not None:
            predicates.append(("__uuid_timestamp", ">=", str(start_date)))
        if end_date is not None:
            predicates.append(("__uuid_timestamp", "<", str(end_date)))
        rec = await vec.search([1.0, 2.0], limit=4, predicates=Predicates(*predicates))
        assert len(rec) == expected

    await search_date(specific_datetime-timedelta(days=7), specific_datetime+timedelta(days=7), 1)
    await search_date(specific_datetime-timedelta(days=7), None, 2)
    await search_date(None, specific_datetime+timedelta(days=7), 1)
    await search_date(specific_datetime-timedelta(days=7), specific_datetime-timedelta(days=2), 0)

    # check timedelta handling
    rec = await vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(start_date=specific_datetime, time_delta=timedelta(days=7)))
    assert len(rec) == 1
    # end is exclusive
    rec = await vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(end_date=specific_datetime, time_delta=timedelta(days=7)))
    assert len(rec) == 0
    rec = await vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(end_date=specific_datetime+timedelta(seconds=1), time_delta=timedelta(days=7)))
    assert len(rec) == 1

    rec = await vec.search([1.0, 2.0], limit=4, query_params=DiskAnnIndexParams(10, 5))
    assert len(rec) == 2
    rec = await vec.search([1.0, 2.0], limit=4, query_params=DiskAnnIndexParams(100))
    assert len(rec) == 2

    await vec.drop_table()
    await vec.close()
Sync Client
Sync
Sync (service_url:str, table_name:str, num_dimensions:int, distance_type:str='cosine', id_type='UUID', time_partition_interval:Optional[datetime.timedelta]=None, max_db_connections:Optional[int]=None, infer_filters:bool=True, schema_name:Optional[str]=None)
Initializes a sync client for storing vector data.
| | Type | Default | Details |
|---|---|---|---|
| service_url | str | | The connection string for the database. |
| table_name | str | | The name of the table. |
| num_dimensions | int | | The number of dimensions for the embedding vector. |
| distance_type | str | cosine | The distance type for indexing. |
| id_type | str | UUID | The type of the primary id column. Can be either ‘UUID’ or ‘TEXT’. |
| time_partition_interval | Optional | None | The time interval for partitioning the table (optional). |
| max_db_connections | Optional | None | |
| infer_filters | bool | True | Whether to infer start and end times from the special __start_date and __end_date filters. |
| schema_name | Optional | None | The schema name for the table (optional, uses the database’s default schema if not specified). |
| Returns | None | | |
Sync.create_tables
Sync.create_tables ()
Creates necessary tables.
Sync.upsert
Sync.upsert (records)
Performs upsert operation for multiple records.
| | Type | Details |
|---|---|---|
| records | | Records to upsert. |
| Returns | None | |
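In the usage examples on this page, each record is a tuple of (id, metadata, contents, embedding), where metadata is either a dict or a JSON string, and mixing the two forms in one call raises an error. The sketch below shows one way such a batch could be checked before sending it. `check_records` is a hypothetical helper, not part of the client, and the dimension check is an added assumption.

```python
import json
import uuid
from typing import Any, List, Sequence, Tuple

Record = Tuple[Any, Any, str, Sequence[float]]


def check_records(records: List[Record], num_dimensions: int) -> None:
    """Hypothetical pre-flight check for a batch of upsert records."""
    kinds = set()
    for rec_id, metadata, contents, embedding in records:
        if isinstance(metadata, dict):
            kinds.add("dict")
        elif isinstance(metadata, str):
            json.loads(metadata)  # raises ValueError on malformed JSON
            kinds.add("str")
        else:
            raise ValueError(f"unsupported metadata type for id {rec_id!r}")
        if len(embedding) != num_dimensions:
            raise ValueError(f"embedding for id {rec_id!r} has wrong length")
    if len(kinds) > 1:
        raise ValueError("metadata must be all dicts or all JSON strings")


records = [(uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])]
check_records(records, num_dimensions=2)  # passes silently
```

Validating on the client side keeps a malformed batch from reaching the database, at the cost of one extra pass over the records.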
Sync.search
Sync.search (query_embedding:Optional[List[float]]=None, limit:int=10, filter:Union[Dict[str,str],List[Dict[str,str]],NoneType]=None, predicates:Optional[__main__.Predicates]=None, uuid_time_filter:Optional[__main__.UUIDTimeRange]=None, query_params:Optional[__main__.QueryParams]=None)
Retrieves similar records using a similarity query.
| | Type | Default | Details |
|---|---|---|---|
| query_embedding | Optional | None | The query embedding vector. |
| limit | int | 10 | The number of nearest neighbors to retrieve. |
| filter | Union | None | A filter for metadata. Should be specified as a key-value object or a list of key-value objects (where any objects in the list are matched). |
| predicates | Optional | None | A Predicates object to filter the results. Predicates support more complex queries than the filter parameter. Predicates can be combined using logical operators (&, \|, and ~). |
| uuid_time_filter | Optional | None | A UUIDTimeRange object to filter the results by time using the id column. |
| query_params | Optional | None | |
| Returns | List | | List of similar records. |
Usage Example:
for schema in [None, "tschema"]:
    vec = Sync(service_url, "data_table", 2, schema_name=schema)
    vec.create_tables()
    empty = vec.table_is_empty()
    assert empty
    vec.upsert([(uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])])
    empty = vec.table_is_empty()
    assert not empty

    vec.upsert([
        (uuid.uuid4(), '''{"key":"val"}''', "the brown fox", [1.0, 1.3]),
        (uuid.uuid4(), '''{"key":"val2"}''', "the brown fox", [1.0, 1.4]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.5]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.6]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.6]),
        (uuid.uuid4(), '''{"key2":"val2"}''', "the brown fox", [1.0, 1.7]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.9]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 100.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 101.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key_1":"val_1", "key_2":"val_2"}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key0": [1,2,3,4]}''', "the brown fox", [1.0, 1.8]),
        (uuid.uuid4(), '''{"key0": [5,6,7], "key3": 3}''', "the brown fox", [1.0, 1.8]),
    ])

    vec.create_embedding_index(IvfflatIndex())
    vec.drop_embedding_index()
    vec.create_embedding_index(IvfflatIndex(100))
    vec.drop_embedding_index()
    vec.create_embedding_index(HNSWIndex())
    vec.drop_embedding_index()
    vec.create_embedding_index(HNSWIndex(20, 125))
    vec.drop_embedding_index()
    vec.create_embedding_index(DiskAnnIndex())
    vec.drop_embedding_index()
    vec.create_embedding_index(DiskAnnIndex(50, 50, 1.5))

    rec = vec.search([1.0, 2.0])
    assert len(rec) == 10
    rec = vec.search(np.array([1.0, 2.0]))
    assert len(rec) == 10
    rec = vec.search([1.0, 2.0], limit=4)
    assert len(rec) == 4
    rec = vec.search(limit=4)
    assert len(rec) == 4
    rec = vec.search([1.0, 2.0], limit=4, filter={"key2": "val2"})
    assert len(rec) == 1
    rec = vec.search([1.0, 2.0], limit=4, filter={"key2": "does not exist"})
    assert len(rec) == 0
    rec = vec.search(limit=4, filter={"key2": "does not exist"})
    assert len(rec) == 0
    rec = vec.search([1.0, 2.0], limit=4, filter={"key_1": "val_1"})
    assert len(rec) == 1
    rec = vec.search([1.0, 2.0], filter={"key_1": "val_1", "key_2": "val_2"})
    assert len(rec) == 1
    rec = vec.search([1.0, 2.0], limit=4, filter={"key_1": "val_1", "key_2": "val_3"})
    assert len(rec) == 0
    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 2
    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}, {"no such key": "no such val"}])
    assert len(rec) == 2

    raised = False
    try:
        # can't upsert using both keys and dictionaries
        vec.upsert([
            (uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2]),
            (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.2])
        ])
    except ValueError as e:
        raised = True
    assert raised

    raised = False
    try:
        # can't upsert using both keys and dictionaries opposite order
        vec.upsert([
            (uuid.uuid4(), '''{"key2":"val"}''', "the brown fox", [1.0, 1.2]),
            (uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])
        ])
    except BaseException as e:
        raised = True
    assert raised

    rec = vec.search([1.0, 2.0], filter={"key_1": "val_1", "key_2": "val_2"})
    assert rec[0][SEARCH_RESULT_CONTENTS_IDX] == 'the brown fox'
    assert rec[0]["contents"] == 'the brown fox'
    assert rec[0][SEARCH_RESULT_METADATA_IDX] == {'key_1': 'val_1', 'key_2': 'val_2'}
    assert rec[0]["metadata"] == {'key_1': 'val_1', 'key_2': 'val_2'}
    assert isinstance(rec[0][SEARCH_RESULT_METADATA_IDX], dict)
    assert rec[0][SEARCH_RESULT_DISTANCE_IDX] == 0.0009438353921149556
    assert rec[0]["distance"] == 0.0009438353921149556

    rec = vec.search([1.0, 2.0], limit=4, predicates=Predicates("key","==", "val2"))
    assert len(rec) == 1

    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 2
    vec.delete_by_ids([rec[0][SEARCH_RESULT_ID_IDX]])
    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 1
    vec.delete_by_metadata([{"key_1": "val_1"}, {"key2": "val2"}])
    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key_1": "val_1"}, {"key2": "val2"}])
    assert len(rec) == 0
    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key2": "val"}])
    assert len(rec) == 4
    vec.delete_by_metadata([{"key2": "val"}])
    rec = vec.search([1.0, 2.0], limit=4, filter=[{"key2": "val"}])
    assert len(rec) == 0

    assert not vec.table_is_empty()
    vec.delete_all()
    assert vec.table_is_empty()

    vec.drop_table()
    vec.close()

    vec = Sync(service_url, "data_table", 2, id_type="TEXT", schema_name=schema)
    vec.create_tables()
    assert vec.table_is_empty()
    vec.upsert([("Not a valid UUID", {"key": "val"}, "the brown fox", [1.0, 1.2])])
    assert not vec.table_is_empty()
    vec.delete_by_ids(["Not a valid UUID"])
    assert vec.table_is_empty()
    vec.drop_table()
    vec.close()

    vec = Sync(service_url, "data_table", 2, time_partition_interval=timedelta(seconds=60), schema_name=schema)
    vec.create_tables()
    assert vec.table_is_empty()
    id = uuid.uuid1()
    vec.upsert([(id, {"key": "val"}, "the brown fox", [1.0, 1.2])])
    assert not vec.table_is_empty()
    vec.delete_by_ids([id])
    assert vec.table_is_empty()

    raised = False
    try:
        # can't upsert with uuid type 4 in time partitioned table
        vec.upsert([
            (uuid.uuid4(), {"key": "val"}, "the brown fox", [1.0, 1.2])
        ])
        #pass
    except BaseException as e:
        raised = True
    assert raised

    specific_datetime = datetime(2018, 8, 10, 15, 30, 0)
    vec.upsert([
        # current time
        (uuid.uuid1(), {"key": "val"}, "the brown fox", [1.0, 1.2]),
        # time in 2018
        (uuid_from_time(specific_datetime), {"key": "val"}, "the brown fox", [1.0, 1.2])
    ])

    def search_date(start_date, end_date, expected):
        # using uuid_time_filter
        rec = vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(start_date, end_date))
        assert len(rec) == expected
        rec = vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(str(start_date), str(end_date)))
        assert len(rec) == expected

        # using filters
        filter = {}
        if start_date is not None:
            filter["__start_date"] = start_date
        if end_date is not None:
            filter["__end_date"] = end_date
        rec = vec.search([1.0, 2.0], limit=4, filter=filter)
        assert len(rec) == expected

        # using filters with string dates
        filter = {}
        if start_date is not None:
            filter["__start_date"] = str(start_date)
        if end_date is not None:
            filter["__end_date"] = str(end_date)
        rec = vec.search([1.0, 2.0], limit=4, filter=filter)
        assert len(rec) == expected

        # using predicates
        predicates = []
        if start_date is not None:
            predicates.append(("__uuid_timestamp", ">=", start_date))
        if end_date is not None:
            predicates.append(("__uuid_timestamp", "<", end_date))
        rec = vec.search([1.0, 2.0], limit=4, predicates=Predicates(*predicates))
        assert len(rec) == expected

        # using predicates with string dates
        predicates = []
        if start_date is not None:
            predicates.append(("__uuid_timestamp", ">=", str(start_date)))
        if end_date is not None:
            predicates.append(("__uuid_timestamp", "<", str(end_date)))
        rec = vec.search([1.0, 2.0], limit=4, predicates=Predicates(*predicates))
        assert len(rec) == expected

    assert not vec.table_is_empty()

    search_date(specific_datetime-timedelta(days=7), specific_datetime+timedelta(days=7), 1)
    search_date(specific_datetime-timedelta(days=7), None, 2)
    search_date(None, specific_datetime+timedelta(days=7), 1)
    search_date(specific_datetime-timedelta(days=7), specific_datetime-timedelta(days=2), 0)

    # check timedelta handling
    rec = vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(start_date=specific_datetime, time_delta=timedelta(days=7)))
    assert len(rec) == 1
    # end is exclusive
    rec = vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(end_date=specific_datetime, time_delta=timedelta(days=7)))
    assert len(rec) == 0
    rec = vec.search([1.0, 2.0], limit=4, uuid_time_filter=UUIDTimeRange(end_date=specific_datetime+timedelta(seconds=1), time_delta=timedelta(days=7)))
    assert len(rec) == 1

    rec = vec.search([1.0, 2.0], limit=4, query_params=DiskAnnIndexParams(10, 5))
    assert len(rec) == 2
    rec = vec.search([1.0, 2.0], limit=4, query_params=DiskAnnIndexParams(100, rescore=2))
    assert len(rec) == 2

    vec.drop_table()
    vec.close()