
Conversation

@hwchase17
Contributor

No description provided.

dongreenberg and others added 2 commits February 19, 2023 08:42
…GCP, Azure, Lambda (#978)

New modules to facilitate easy use of embedding and LLM models on one's own cloud GPUs. Uses [Runhouse](https://github.com/run-house/runhouse) to facilitate the cloud RPC. Supports AWS, GCP, Azure, and Lambda today (auto-launching), plus BYO hardware by IP and SSH credentials (e.g. for on-prem or other clouds like Coreweave, Paperspace, etc.).

**APIs**

The API mirrors HuggingFaceEmbeddings and HuggingFaceInstructEmbeddings, but accepts an additional "hardware" parameter:

```
from langchain.embeddings import SelfHostedHuggingFaceEmbeddings, SelfHostedHuggingFaceInstructEmbeddings
import runhouse as rh

gpu = rh.cluster(name="rh-a10x", instance_type="A100:1")
hf = SelfHostedHuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2", hardware=gpu)
# Will run on the same GPU
hf_instruct = SelfHostedHuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large", hardware=gpu)
```

The `rh.cluster` above will launch the A100 on GCP, Azure, or Lambda, whichever is enabled and cheapest (thanks to SkyPilot). You can specify a particular provider with `provider='gcp'`, as well as `use_spot`, `region`, `image_id`, and `autostop_mins`. For AWS, you'd just need to switch to "A10G:1". For a BYO cluster, you can do:

```
gpu = rh.cluster(ips=['<ip of the cluster>'],
                 ssh_creds={'ssh_user': '...', 'ssh_private_key': '<path_to_key>'},
                 name='rh-a10x')
```

**Design**

All we're doing here is sending a pre-defined inference function to the cluster through Runhouse, which brings up the cluster if needed, installs the dependencies, and returns a callable that sends requests to run the function over gRPC. The function takes the model_id as an input, but the model is cached, so it only needs to be downloaded once. We can improve performance further fairly easily by pinning the model to GPU memory on the cluster; let me know if that's of interest.

**Testing**

Added new tests embeddings/test_self_hosted.py (which mirror test_huggingface.py) and llms/test_self_hosted_llm.py. Tests all pass on Lambda Labs (which is surprising, because the first two test_huggingface.py tests are supposedly segfaulting?). We can pin the provider used in the tests to whichever is used by your CI, or you can choose to only run these on a schedule to avoid spinning up a GPU (which can take ~5 minutes including installations).
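The description above only shows the embeddings classes. As a companion illustration, here is a minimal sketch of how the LLM counterpart introduced in this PR (SelfHostedHuggingFaceLLM, listed in the checklist below) might be used. The "hardware" parameter and the model_id input come from the description above; the exact constructor shape and the `model_reqs` argument are assumptions, not verbatim from this PR.

```
# Hedged sketch: mirrors the embeddings API above. model_reqs and the exact
# constructor signature are assumptions, not confirmed by this PR.
from langchain.llms import SelfHostedHuggingFaceLLM
import runhouse as rh

# Reuse the same auto-launched (or BYO) cluster as in the embeddings example.
gpu = rh.cluster(name="rh-a10x", instance_type="A100:1")

llm = SelfHostedHuggingFaceLLM(
    model_id="gpt2",                                  # downloaded once, then cached on the cluster
    hardware=gpu,                                     # same "hardware" parameter as the embeddings classes
    model_reqs=["pip:./", "transformers", "torch"],   # dependencies installed on the cluster (assumed name)
)

# The call is forwarded to the inference function running on the cluster over gRPC.
print(llm("What is the capital of France? "))
```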
- [x] Introduce SelfHostedPipeline and SelfHostedHuggingFaceLLM (see the sketch after this list)
- [x] Introduce SelfHostedEmbeddings, SelfHostedHuggingFaceEmbeddings, and SelfHostedHuggingFaceInstructEmbeddings
- [x] Add tutorials for self-hosted LLMs and embeddings
- [x] Implement the chat-your-data tutorial with self-hosted models - https://github.com/dongreenberg/chat-your-data

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com>
Co-authored-by: jeff <tangj1122@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Charles Frye <cfrye59@gmail.com>
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com>
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Enrico Shippole <henryshippole@gmail.com>
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
Co-authored-by: Jeff Huber <jeffchuber@gmail.com>
Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com>
Co-authored-by: seanaedmiston <seane999@gmail.com>
Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu>
Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com>
Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>
Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>
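To make the "pre-defined inference function" design concrete, here is a hedged sketch of how a custom load and inference function could be shipped to the cluster via the generic SelfHostedPipeline class from the checklist. The class name comes from the checklist and the flow (cluster launch, dependency install, gRPC-backed callable, model_id caching) from the Design section; the constructor argument names (model_load_fn, inference_fn, model_reqs) are assumptions for illustration only.

```
# Hedged sketch: argument names (model_load_fn, inference_fn, model_reqs) are
# assumptions based on the design description, not confirmed signatures from this PR.
from langchain.llms import SelfHostedPipeline
import runhouse as rh

gpu = rh.cluster(name="rh-a10x", instance_type="A100:1")

def load_pipeline():
    # Runs on the cluster; the downloaded model is cached there after the first call.
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    return pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10)

def inference_fn(pipeline, prompt, stop=None):
    # Also runs on the cluster; only the prompt and generated text cross the gRPC boundary.
    return pipeline(prompt)[0]["generated_text"][len(prompt):]

llm = SelfHostedPipeline(
    model_load_fn=load_pipeline,                      # assumed parameter name
    inference_fn=inference_fn,                        # assumed parameter name
    hardware=gpu,
    model_reqs=["pip:./", "transformers", "torch"],   # dependencies installed on the cluster (assumed name)
)
```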
@hwchase17 merged commit 9d6d8f8 into master Feb 19, 2023
@hwchase17 deleted the harrison/self-hosted-runhouse branch February 19, 2023 17:53
@blob42 mentioned this pull request Feb 21, 2023
@RasoulNik

Can I use Runhouse with a local GPU? I am asking because LangChain uses Runhouse as the backend for the self-hosted models.

zachschillaci27 pushed a commit to zachschillaci27/langchain that referenced this pull request Mar 8, 2023
