This repo has:
- compiled model libraries for webgpu
- instructions on how to build all the necessary dependencies from source
Note: The instructions are ultra-specific to my setup (MacBook Air M2). I have no idea if they'll work for you.
Installed conda using miniconda
The first time, or whenever you start a new session, make sure to run:
```bash
# activate conda for session
source ~/miniconda3/bin/activate
conda init

# if environment is already created
conda activate mlc-dev

# or create a new one, and then activate
conda create -n mlc-dev
```

Install the necessary packages for tvm and mlc-llm:
```bash
conda install -c conda-forge \
  "llvmdev>=15" \
  "cmake>=3.24" \
  compilers \
  git \
  rust \
  numpy \
  psutil \
  python=3.13
```

Some initial environment variables to use:
```bash
export CC=$CONDA_PREFIX/bin/clang
export CXX=$CONDA_PREFIX/bin/clang++
export LDFLAGS="-L$CONDA_PREFIX/lib"
export CPPFLAGS="-I$CONDA_PREFIX/include"
```

More will be added later.
I'm using direnv and a .env file to manage these.
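For reference, here is a minimal sketch of what that file ends up collecting as the steps below add more variables. It assumes direnv's `.envrc` export syntax (a plain `.env` can be pulled in with direnv's `dotenv` directive) and that `$CONDA_PREFIX` is already set by `conda activate`; the tvm paths are examples relative to wherever you cloned it:

```bash
# .envrc (sketch): exports accumulated over the course of this guide
export CC=$CONDA_PREFIX/bin/clang
export CXX=$CONDA_PREFIX/bin/clang++
export LDFLAGS="-L$CONDA_PREFIX/lib"
export CPPFLAGS="-I$CONDA_PREFIX/include"

# added in the tvm / mlc-llm steps below
export TVM_HOME=$PWD/tvm
export TVM_LIBRARY_PATH=$TVM_HOME/build
export PYTHONPATH=$TVM_HOME/python:$PYTHONPATH
export KMP_DUPLICATE_LIB_OK=TRUE
```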
Following: https://tvm.apache.org/docs/install/from_source.html
```bash
# clone from GitHub
git clone --recursive https://github.com/apache/tvm.git && cd tvm

# create the build directory
rm -rf build && mkdir build && cd build

# specify build requirements in `config.cmake`
cp ../cmake/config.cmake .

# update config
echo -e "set(CMAKE_BUILD_TYPE RelWithDebInfo)\n\
set(USE_LLVM \"llvm-config --ignore-libllvm --link-static\")\n\
set(HIDE_PRIVATE_SYMBOLS ON)" >> config.cmake

# Configure cmake flags
cmake \
  -DCMAKE_PREFIX_PATH=$CONDA_PREFIX \
  -DCMAKE_FIND_ROOT_PATH=$CONDA_PREFIX \
  ..

# Build
cmake --build . --parallel $(sysctl -n hw.ncpu)

# install the tvm-ffi package
cd ../3rdparty/tvm-ffi
pip install -e .

# install tvm in root environment
cd ../../..

# Add to the session or .env file (update accordingly)
export TVM_LIBRARY_PATH=./tvm/build
export TVM_HOME=./tvm
export PYTHONPATH=$TVM_HOME/python:$PYTHONPATH

pip install -e ./tvm/python

# validate it works
python -c "import tvm; print(tvm.__file__)"
```
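As an optional extra check, you can print the flags the library was actually built with (for instance, that LLVM was picked up). This assumes `tvm.support.libinfo()` is available in the revision you checked out:

```bash
# optional: dump the build flags baked into libtvm (assumes tvm.support.libinfo() exists)
python -c "import tvm, pprint; pprint.pprint(tvm.support.libinfo())"
```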
Following: https://llm.mlc.ai/docs/install/mlc_llm.html#option-2-build-from-source

From the root:
```bash
git clone --recursive https://github.com/mlc-ai/mlc-llm.git && cd mlc-llm/

# create build directory
mkdir -p build && cd build

# generate build configuration
python ../cmake/gen_cmake_config.py

# build mlc_llm libraries
cmake \
  -DCMAKE_PREFIX_PATH=$CONDA_PREFIX \
  -DCMAKE_FIND_ROOT_PATH=$CONDA_PREFIX \
  -DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
  ..
make -j $(sysctl -n hw.ncpu) && cd ..

# install as a pip project, but with M2 modifications:
# edit ./mlc-llm/python/requirements.txt and comment out flashinfer-python

# from the root of the repo
pip install -e ./mlc-llm/python

# add to the .env
export KMP_DUPLICATE_LIB_OK=TRUE

# verify
mlc_llm --help

# if this error occurs:
#   ./tvm/3rdparty/tvm-ffi/python/tvm_ffi/_optional_torch_c_dlpack.py:559: UserWarning: Failed to load
#   torch c dlpack extension: Ninja is required to load C++ extensions (pip install ninja to get it),
#   EnvTensorAllocator will not be enabled.
#     warnings.warn(
# then
pip install ninja
```
Follow the official docs: https://emscripten.org/docs/getting_started/downloads.html

But make sure to use version 3.1.56.
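For the first-time setup, the standard emsdk workflow from those docs, pinned to 3.1.56, looks like this:

```bash
# first-time emsdk setup, pinned to 3.1.56
git clone https://github.com/emscripten-core/emsdk.git && cd emsdk
./emsdk install 3.1.56
./emsdk activate 3.1.56
source ./emsdk_env.sh
```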
If returning to a new session:
```bash
cd emsdk
./emsdk activate 3.1.56
source ./emsdk_env.sh

# validate
emcc --version
```

Next, follow here, especially the first time:
```bash
cd mlc-llm
./web/prep_emcc_deps.sh
```

Follow the guide here: https://llm.mlc.ai/docs/deploy/webllm.html#bring-your-own-model-library
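For orientation, the WebGPU library compile from that guide looks roughly like the sketch below. The model directory, quantization, conversation template, and output names are placeholders; take the exact flags for your model from the guide:

```bash
# sketch only: placeholder model/paths, flags per the WebLLM guide
# convert weights and generate the chat config
mlc_llm convert_weight ./dist/models/MY_MODEL/ --quantization q4f16_1 -o ./dist/MY_MODEL-q4f16_1-MLC
mlc_llm gen_config ./dist/models/MY_MODEL/ --quantization q4f16_1 \
  --conv-template MY_CONV_TEMPLATE -o ./dist/MY_MODEL-q4f16_1-MLC

# compile the WebGPU model library (.wasm) that WebLLM loads
mlc_llm compile ./dist/MY_MODEL-q4f16_1-MLC/mlc-chat-config.json \
  --device webgpu -o ./dist/libs/MY_MODEL-q4f16_1-webgpu.wasm
```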