JetStream is a throughput and memory optimized engine for LLM inference on XLA devices.

About

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs. GPU support is planned for the future (PRs welcome).

JetStream Engine Implementation

Two reference engine implementations are currently available: one for JAX models and one for PyTorch models.

JAX

PyTorch

Documentation

JetStream Standalone Local Setup

Getting Started

Setup

make install-deps

Run local server & Testing

Use the following commands to run a server locally:

# Start a server
python -m jetstream.core.implementations.mock.server

# Test local mock server
python -m jetstream.tools.requester

# Load test local mock server
python -m jetstream.tools.load_tester
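The commands above assume the server runs in one terminal while the requester runs in another. As a rough sketch, the same sequence can be driven from a single Python script. The helper name `run_client_against_server` and the fixed startup wait are assumptions made for illustration and are not part of JetStream; a real setup would likely poll the server for readiness instead of sleeping.

```python
import subprocess
import sys
import time

# Module paths taken from the commands above.
SERVER_CMD = [sys.executable, "-m", "jetstream.core.implementations.mock.server"]
CLIENT_CMD = [sys.executable, "-m", "jetstream.tools.requester"]


def run_client_against_server(server_cmd, client_cmd, startup_wait_s=5.0):
    """Start a server process, run a client once, then stop the server.

    Hypothetical helper: the fixed wait is an assumption, not a JetStream API.
    Returns the client's exit code.
    """
    server = subprocess.Popen(server_cmd)
    try:
        # Give the server time to start listening (assumed to be enough).
        time.sleep(startup_wait_s)
        if server.poll() is not None:
            raise RuntimeError("server exited before the client could run")
        return subprocess.run(client_cmd, check=False).returncode
    finally:
        # Always stop the server, even if the client failed.
        if server.poll() is None:
            server.terminate()
            try:
                server.wait(timeout=10)
            except subprocess.TimeoutExpired:
                server.kill()
                server.wait()


# Example call (requires JetStream to be installed):
# exit_code = run_client_against_server(SERVER_CMD, CLIENT_CMD)
```

Passing the commands in as arguments keeps the helper reusable for the load tester: substitute `jetstream.tools.load_tester` for the requester module in the client command.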

Test core modules

# Test JetStream core orchestrator
python -m unittest -v jetstream.tests.core.test_orchestrator

# Test JetStream core server library
python -m unittest -v jetstream.tests.core.test_server

# Test mock JetStream engine implementation
python -m unittest -v jetstream.tests.engine.test_mock_engine

# Test mock JetStream token utils
python -m unittest -v jetstream.tests.engine.test_token_utils
python -m unittest -v jetstream.tests.engine.test_utils
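To run all of the test modules listed above in one pass, a small wrapper along these lines may help. It is a sketch only: `build_command` and `run_modules` are hypothetical helper names, not JetStream tooling, and the module list is copied from the commands above.

```python
import subprocess
import sys

# Test modules named in the commands above.
TEST_MODULES = [
    "jetstream.tests.core.test_orchestrator",
    "jetstream.tests.core.test_server",
    "jetstream.tests.engine.test_mock_engine",
    "jetstream.tests.engine.test_token_utils",
    "jetstream.tests.engine.test_utils",
]


def build_command(module, python=sys.executable):
    """Return the argv list for `python -m unittest -v <module>`."""
    return [python, "-m", "unittest", "-v", module]


def run_modules(modules, runner=subprocess.run):
    """Run each module in turn and return the names of those that failed.

    `runner` is injectable so the loop can be exercised without JetStream.
    """
    failed = []
    for module in modules:
        result = runner(build_command(module))
        if result.returncode != 0:
            failed.append(module)
    return failed


# Example call (requires JetStream to be installed):
# failed = run_modules(TEST_MODULES)
# sys.exit(1 if failed else 0)
```

Running the modules one at a time, rather than in a single `unittest` invocation, mirrors the separate commands above and makes it clear which module failed.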
