This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Conversation

Contributor

@zhenwei-intel zhenwei-intel commented Nov 15, 2023

Type of Change

feature

Description

Support load_in_nbit in llm runtime
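As a rough illustration of what a `load_in_nbit`-style option involves, the sketch below resolves boolean flags into a weight bit width. The helper name `resolve_weight_bits` is hypothetical and is not taken from this PR; the `load_in_4bit` flag name is inferred from the branch name, and `load_in_8bit` is assumed by analogy.

```python
from typing import Optional


def resolve_weight_bits(load_in_4bit: bool = False,
                        load_in_8bit: bool = False) -> Optional[int]:
    """Hypothetical helper: map load_in_* flags to a weight bit width.

    Returns None when neither flag is set, meaning no n-bit loading
    was requested. This is an illustration, not the PR's actual code.
    """
    # The two flags request different bit widths, so reject both at once.
    if load_in_4bit and load_in_8bit:
        raise ValueError("load_in_4bit and load_in_8bit are mutually exclusive")
    if load_in_4bit:
        return 4
    if load_in_8bit:
        return 8
    return None
```

A caller could pass the result to whatever quantization configuration the runtime uses; how the runtime consumes it is outside what this conversation shows.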

Expected Behavior & Potential Risk

The expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
@kevinintel
Contributor

Please add the new API to the main page and the graph README.

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
@zhenwei-intel
Contributor Author

Please add the new API to the main page and the graph README.

updated

Contributor

@a32543254 a32543254 left a comment

LGTM

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
@hshen14 hshen14 merged commit 4423f70 into main Nov 15, 2023
@hshen14 hshen14 deleted the lzw/load_in_4bit branch November 15, 2023 11:36
@VincyZhang
Contributor

VincyZhang commented Nov 15, 2023

This PR, combined with PR #679, causes the NeuralChat llm_runtime_int4_server to hang without producing output. Could you please investigate the cause? @zhenwei-intel @lvliang-intel

VincyZhang added a commit that referenced this pull request Nov 15, 2023
sywangyi pushed a commit that referenced this pull request Nov 21, 2023
* support load_in_nbit in llm runtime Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

8 participants