Conversation


@CharlieFRuan CharlieFRuan commented Jul 12, 2024

Compiled at:

The only expected change in the WASM is the creation of the bitmask in grammar, fixing grammar correctness for models like Phi-3 and Qwen2. For more see:
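A minimal sketch of why the bitmask sizing matters, assuming Phi-3-style numbers (`config.json` reports `vocab_size` 32064 while `tokenizer.json` records roughly 32011 tokens; the helper names here are hypothetical, not the actual mlc-llm API):

```typescript
// One bit per token id, packed into 32-bit words.
function bitmaskSize(vocabSize: number): number {
  return Math.ceil(vocabSize / 32);
}

// Check whether a token id is set in the grammar bitmask.
// Ids at or beyond the mask's capacity are invalid, which is
// the kind of failure the PR describes for id 32042.
function isTokenAllowed(bitmask: Int32Array, tokenId: number): boolean {
  if (tokenId < 0 || tokenId >= bitmask.length * 32) {
    throw new Error(`Invalid token id ${tokenId}`);
  }
  return ((bitmask[tokenId >> 5] >> (tokenId & 31)) & 1) === 1;
}

// Sizing the mask from the tokenizer's token count leaves ids
// 32032..32063 out of range; sizing from config.json's vocab_size
// covers the model's full output space.
const maskFromTokenizer = new Int32Array(bitmaskSize(32011));
const maskFromConfig = new Int32Array(bitmaskSize(32064));
```

Token id 32042 falls outside `maskFromTokenizer` (capacity 32032 bits) but inside `maskFromConfig` (capacity 32064 bits), matching the `Invalid token id 32042` check failure quoted below.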

@CharlieFRuan CharlieFRuan marked this pull request as draft July 12, 2024 06:51
@CharlieFRuan CharlieFRuan marked this pull request as ready for review July 12, 2024 16:44
@CharlieFRuan CharlieFRuan merged commit 7ce7510 into mlc-ai:main Jul 12, 2024
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request Jul 12, 2024
Before this PR, we could run into errors like the following when using JSON mode with Phi3:

```
index.js:1913 [FATAL] /Users/cfruan/Documents/mlc-llm/cpp/grammar/grammar_state_matcher.cc:202: Check failed: (token_id >= 0 && token_id < init_ctx_->vocab_size) is false: Invalid token id 32042 for GrammarStateMatcher
```

This PR fixes the issue for models like Phi3, where the `vocab_size` in `config.json` may be larger than the number of vocab entries recorded in `tokenizer.json`. For more see:

- mlc-ai/mlc-llm#2651

As a result, we had to recompile all WASMs, hence updating the WASM version to v0_2_48:

- mlc-ai/binary-mlc-llm-libs#129

To support this at runtime, `ChatConfig` adds a new field `vocab_size`, read from `mlc-chat-config.json`, and the signature of the internal grammar function `findNextTokenBitmask()` is updated.
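A hedged sketch of how a runtime might thread the new `vocab_size` field through to the grammar matcher; apart from the `vocab_size` field name itself, the interface and helper below are hypothetical stand-ins, not the actual web-llm API:

```typescript
// Shape of the relevant slice of mlc-chat-config.json (other fields omitted).
interface ChatConfigLike {
  vocab_size: number; // matches the compiled WASM's full vocabulary
}

// Hypothetical stand-in for the internal findNextTokenBitmask() path:
// allocate the token bitmask from the config's vocab_size rather than
// from the tokenizer's token count, so padded/special token ids fit.
function allocTokenBitmask(config: ChatConfigLike): Int32Array {
  return new Int32Array(Math.ceil(config.vocab_size / 32));
}

const config: ChatConfigLike = { vocab_size: 32064 }; // Phi-3-style value
const bitmask = allocTokenBitmask(config);
// bitmask.length * 32 >= config.vocab_size, so every valid token id
// has a corresponding bit.
```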
CharlieFRuan added a commit to mlc-ai/web-llm that referenced this pull request Jul 12, 2024
### Changes
- Fixed a potential error in JSON mode for some models like Phi3 and Qwen2; for more see #501

### Model WASM Version
Due to the fix, updated WASMs to `v0_2_48`:
- mlc-ai/binary-mlc-llm-libs#129

### TVMjs
No change; compiled at apache/tvm@32e9a48
jingyi-zhao-01 pushed a commit to jingyi-zhao-01/web-llm that referenced this pull request Dec 8, 2024
jingyi-zhao-01 pushed a commit to jingyi-zhao-01/web-llm that referenced this pull request Dec 8, 2024
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request Oct 4, 2025
atebites-hub pushed a commit to atebites-hub/web-llm that referenced this pull request Oct 4, 2025