[LLM Runtime] Support load_in_nbit in llm runtime #688

zhenwei-intel · 2023-11-15T06:35:27Z

Type of Change

feature

Description

Support load_in_nbit in llm runtime
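As a rough illustration of what a `load_in_nbit`-style switch could look like, the sketch below maps two boolean flags onto a quantization config. The flag names `load_in_4bit` and `load_in_8bit` are assumptions inferred from the PR title and the branch name `lzw/load_in_4bit`. `QuantConfig` and `resolve_quant_config` are hypothetical names used only for illustration; this is not the code in `modeling_auto.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantConfig:
    """Hypothetical stand-in for a weight-only quantization config."""
    weight_dtype: str


def resolve_quant_config(load_in_4bit: bool = False,
                         load_in_8bit: bool = False) -> Optional[QuantConfig]:
    """Translate the (assumed) load_in_nbit flags into a config object.

    Returns None when neither flag is set, meaning no quantization.
    """
    # The two flags request different bit widths, so reject both at once.
    if load_in_4bit and load_in_8bit:
        raise ValueError("load_in_4bit and load_in_8bit are mutually exclusive")
    if load_in_4bit:
        return QuantConfig(weight_dtype="int4")
    if load_in_8bit:
        return QuantConfig(weight_dtype="int8")
    return None
```

A caller would pass at most one flag, for example `resolve_quant_config(load_in_4bit=True)`, and hand the resulting config to whatever loads the model.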

Expected Behavior & Potential Risk

The expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

intel_extension_for_transformers/transformers/modeling/modeling_auto.py

kevinintel · 2023-11-15T07:19:53Z

please add new API in main page and graph readme

zhenwei-intel · 2023-11-15T08:05:43Z

> please add new API in main page and graph readme

updated

README.md

a32543254

LGTM

intel_extension_for_transformers/llm/runtime/graph/README.md

VincyZhang · 2023-11-15T13:48:11Z

This PR, combined with PR #679, causes the NeuralChat llm_runtime_int4_server to hang without output. Could you please figure out the reason? @zhenwei-intel @lvliang-intel

support load_in_nbit in llm runtime

9f50610

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

zhenwei-intel requested a review from PenghuiCheng as a code owner November 15, 2023 06:35

hshen14 reviewed Nov 15, 2023

intel_extension_for_transformers/transformers/modeling/modeling_auto.py

update readme

f471e65

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

zhenwei-intel requested a review from airMeng as a code owner November 15, 2023 08:04

zhenwei-intel requested a review from a32543254 November 15, 2023 08:06

airMeng added the ITREX.cpp label Nov 15, 2023

a32543254 reviewed Nov 15, 2023

README.md

a32543254 approved these changes Nov 15, 2023

DDEle reviewed Nov 15, 2023

intel_extension_for_transformers/llm/runtime/graph/README.md

update readme

b9684a1

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

hshen14 approved these changes Nov 15, 2023

hshen14 merged commit 4423f70 into main Nov 15, 2023

hshen14 deleted the lzw/load_in_4bit branch November 15, 2023 11:36

VincyZhang added a commit that referenced this pull request Nov 15, 2023

Revert "[LLM Runtime] Support load_in_nbit in llm runtime (#688)"

786becc

This reverts commit 4423f70.

sywangyi pushed a commit that referenced this pull request Nov 21, 2023

[LLM Runtime] Support load_in_nbit in llm runtime (#688)

bdfac3c

* support load_in_nbit in llm runtime

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

8 participants
