
Conversation

@CharlieFRuan CharlieFRuan commented Aug 23, 2024

This PR adds the newly released Phi-3.5-mini, adding the following `model_id`s to our prebuilt model list:

- `Phi-3.5-mini-instruct-q4f16_1-MLC` (4k KVCache)
- `Phi-3.5-mini-instruct-q4f32_1-MLC` (4k KVCache)
- `Phi-3.5-mini-instruct-q4f16_1-MLC-1k` (1k KVCache)
- `Phi-3.5-mini-instruct-q4f32_1-MLC-1k` (1k KVCache)

See mlc-ai/binary-mlc-llm-libs#136 for the commits of TVM and MLC-LLM this was compiled with.

Note that Phi-3.5-mini supports up to a 128K context (unlike Phi-3-mini, which only has 4k) thanks to RoPE scaling, which MLC-LLM supports. You can take advantage of this in WebLLM by increasing `ModelRecord.overrides.context_window_size` or specifying it in `ChatOptions` when loading a model, as long as there is enough VRAM; see the sketch below.
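For reference, a minimal sketch of the `ChatOptions` route, assuming the `CreateMLCEngine(modelId, engineConfig?, chatOpts?)` entry point and the OpenAI-style `chat.completions` API; the 8192-token window is an illustrative value, not a recommendation:

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Load Phi-3.5-mini with a context window larger than the prebuilt
  // default. Phi-3.5-mini supports up to 128K via RoPE scaling; 8192
  // here is an arbitrary example value, bounded by available VRAM.
  const engine = await CreateMLCEngine(
    "Phi-3.5-mini-instruct-q4f16_1-MLC",
    { initProgressCallback: (report) => console.log(report.text) },
    { context_window_size: 8192 }, // ChatOptions override
  );

  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize a long document..." }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```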
@CharlieFRuan CharlieFRuan marked this pull request as ready for review August 23, 2024 15:55
@CharlieFRuan CharlieFRuan merged commit 055f568 into mlc-ai:main Aug 23, 2024
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request Aug 23, 2024
jingyi-zhao-01 pushed a commit to jingyi-zhao-01/web-llm that referenced this pull request Dec 8, 2024
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request Oct 4, 2025
