Make numpy and pandas optional for ~7 times smaller deps #153
Changes from 14 commits
9e83480 acd8b93 658a4ca 41fb5d2 49941e4 69a42c6 8bd45b2 184248c 1d4a5af cbe9446 054f9b4 129e6ba 1ffae5d be99210 4721f67
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| """ | ||
| This module helps make data libraries like `numpy` and `pandas` optional dependencies. | ||
| | ||
| The libraries add up to 130MB+, which makes it challenging to deploy applications | ||
| using this library in environments with code size constraints, like AWS Lambda. | ||
| | ||
| This module serves as an import proxy and provides a few utilities for dealing with the optionality. | ||
| | ||
| Since the primary use case of this library (talking to the OpenAI API) doesn’t generally require data libraries, | ||
| it’s safe to make them optional. The rare case when data libraries are needed in the client is handled through | ||
| assertions with instructive error messages. | ||
| | ||
| See also `setup.py`. | ||
| | ||
| """ | ||
| try: | ||
| import numpy | ||
| except ImportError: | ||
| numpy = None | ||
| | ||
| try: | ||
| import pandas | ||
| except ImportError: | ||
| pandas = None | ||
| | ||
| HAS_NUMPY = bool(numpy) | ||
| HAS_PANDAS = bool(pandas) | ||
| | ||
| INSTRUCTIONS = """ | ||
| | ||
| OpenAI error: | ||
| | ||
| missing `{library}` | ||
| | ||
| This feature requires additional dependencies: | ||
| | ||
| $ pip install openai[datalib] | ||
| | ||
| """ | ||
| | ||
| NUMPY_INSTRUCTIONS = INSTRUCTIONS.format(library="numpy") | ||
| PANDAS_INSTRUCTIONS = INSTRUCTIONS.format(library="pandas") | ||
| | ||
| | ||
| class MissingDependencyError(Exception): | ||
| pass | ||
| | ||
| | ||
| def assert_has_numpy(): | ||
| if not HAS_NUMPY: | ||
| raise MissingDependencyError(NUMPY_INSTRUCTIONS) | ||
| | ||
| | ||
| def assert_has_pandas(): | ||
| if not HAS_PANDAS: | ||
| raise MissingDependencyError(PANDAS_INSTRUCTIONS) |
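
For reviewers, here is a minimal, self-contained sketch of how a call site could use the guard pattern from the module above. It re-declares a trimmed-down `assert_has_numpy` so the snippet runs on its own. The `cosine_similarity` helper and the shortened error text are assumptions made for illustration and are not part of this diff.

```python
# Sketch only: mirrors the optional-import guard from the diff above.
try:
    import numpy
except ImportError:
    numpy = None  # numpy is optional; callers must check before use


class MissingDependencyError(Exception):
    pass


def assert_has_numpy():
    # Fail with an actionable message instead of a bare AttributeError on None.
    if numpy is None:
        raise MissingDependencyError(
            "missing `numpy`; this feature requires: pip install openai[datalib]"
        )


def cosine_similarity(a, b):
    # Hypothetical helper: guard first, then use numpy as usual.
    assert_has_numpy()
    a = numpy.asarray(a, dtype=float)
    b = numpy.asarray(b, dtype=float)
    return float(numpy.dot(a, b) / (numpy.linalg.norm(a) * numpy.linalg.norm(b)))
```

The point of calling the guard at the top of the function is that users without the extra installed get the install hint at the first use of a data feature, while importing the package itself never fails.
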
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -2,8 +2,6 @@ | |
| from typing import List, Optional | ||
| | ||
| import matplotlib.pyplot as plt | ||
| import numpy as np | ||
| import pandas as pd | ||
| import plotly.express as px | ||
| from scipy import spatial | ||
| from sklearn.decomposition import PCA | ||
| | @@ -12,6 +10,8 @@ | |
| from tenacity import retry, stop_after_attempt, wait_random_exponential | ||
| | ||
| import openai | ||
| from openai.datalib import numpy as np | ||
| from openai.datalib import pandas as pd | ||
| Comment on lines +13 to +14 | ||
| I wonder if we should call [...] | ||
| The [...] https://github.com/jakubroztocil/openai-python/blob/jakub/data-libraries-optional/setup.py#L46-L53 It could be improved, though. I think each optional extra [...] I wasn't sure whether you’d be interested in the PR, but it looks like you are, so I’ll polish it a bit: I’m thinking maybe throwing an [...] It’s to a degree a backward-incompatible change (for existing users who don’t install [...] | ||
| Oh, you're right, this is an embeddings file, so it will have the right dependencies. Regarding the backward incompatibility: yes, it's unfortunate, but personally I think it's probably OK as long as the error is clear and explains how to resolve the problem. Also the line in [...] See #124 for some historical context about how deps have been handled. | ||
| | ||
| | ||
| @retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6)) | ||
| | ||
Very nice