Closed
Description
Creating a `StaticModel` with the `from_pretrained` method fails for the model `potion-multilingual-128M` with the following error: `Tokenizer JSON declared unk_token="<unk>" but it’s not in the vocab`.
See this minimal example provoking the error:
```rust
use model2vec_rs::model::StaticModel;

fn main() {
    let model = StaticModel::from_pretrained("minishlab/potion-multilingual-128M", None, None, None)
        .unwrap();
    let embeddings = model.encode(&["Hello World".to_string()]);
    dbg!(embeddings);
}
```

The CLI also fails with the same error for that model, for example:

```shell
model2vec-rs encode "Hello world" "minishlab/potion-multilingual-128M"
```
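For context, here is a rough sketch of the kind of consistency check that could produce a message like the one above. It is only an illustration, not the actual model2vec-rs code: the helper name `resolve_unk_id` and the plain `HashMap` vocab are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of model2vec-rs): resolve the id of the
/// declared unknown token, failing if the vocab does not contain it.
fn resolve_unk_id(
    vocab: &HashMap<String, u32>,
    unk_token: Option<&str>,
) -> Result<Option<u32>, String> {
    match unk_token {
        // No unk_token declared: nothing to validate.
        None => Ok(None),
        // A declared unk_token must map to an id in the vocab.
        Some(tok) => vocab
            .get(tok)
            .copied()
            .map(Some)
            .ok_or_else(|| format!("unk_token={tok:?} is declared but not in the vocab")),
    }
}

fn main() {
    // Toy vocab standing in for the one stored in the tokenizer JSON.
    let mut vocab = HashMap::new();
    vocab.insert("hello".to_string(), 0u32);
    vocab.insert("world".to_string(), 1u32);

    // Declared but missing: mirrors the situation reported in this issue.
    println!("{:?}", resolve_unk_id(&vocab, Some("<unk>")));

    // Once the token is present in the vocab, the lookup succeeds.
    vocab.insert("<unk>".to_string(), 2);
    println!("{:?}", resolve_unk_id(&vocab, Some("<unk>")));
}
```

If the real check works along these lines, the failure would mean the model's tokenizer file declares `<unk>` without listing it in its vocab, rather than a problem with the input text.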