This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit d9b484a

Fix bloom ffn fusion (#620)
1 parent 81c3262 commit d9b484a

1 file changed, +3 -1 lines changed

intel_extension_for_transformers/llm/runtime/graph/models/bloom/bloom.cpp

Lines changed: 3 additions & 1 deletion
@@ -213,13 +213,15 @@ static bool bloom_model_eval_internal(model_context& lctx, const model_token* to
                            model.layers[il].ffn[5], cur);
     } else {
       cur = ne_mul_mat(ctx0, model.layers[il].ffn[2], cur);
+
       cur = ne_add(ctx0, ne_repeat(ctx0, model.layers[il].ffn[3], cur), cur);
 
       cur = ne_gelu(ctx0, cur);
 
       cur = ne_mul_mat(ctx0, model.layers[il].ffn[4], cur);
+
+      cur = ne_add(ctx0, ne_repeat(ctx0, model.layers[il].ffn[5], cur), cur);
     }
-    cur = ne_add(ctx0, ne_repeat(ctx0, model.layers[il].ffn[5], cur), cur);
   }
 
   cur = ne_add(ctx0, cur, inpFF);
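
The commit carries no description beyond its title, so the following is only one plausible reading of the change. It assumes that the fused branch, which receives model.layers[il].ffn[5] as an argument, already applies that output bias internally. Under that assumption, adding the bias again after the if/else would count it twice on the fused path, and moving the add into the else branch removes the duplicate. The scalar sketch below illustrates this; ToyFfn, ffn_fused, ffn_unfused_core, eval_before and eval_after are hypothetical names for illustration and are not part of the ne_* API.

```cpp
#include <cmath>

// Hypothetical scalar stand-in for one FFN layer: weights w1, w2 and
// biases b1, b2 play the roles of ffn[2], ffn[4] and ffn[3], ffn[5].
struct ToyFfn {
  float w1, b1, w2, b2;
};

// tanh approximation of GELU.
static float gelu(float x) {
  const float k = 0.7978845608f;  // sqrt(2 / pi)
  return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * x * x * x)));
}

// Fused path. ASSUMPTION: it applies both biases itself.
static float ffn_fused(const ToyFfn& f, float x) {
  return f.w2 * gelu(f.w1 * x + f.b1) + f.b2;
}

// Unfused path up to the second matmul; the output bias is not yet added.
static float ffn_unfused_core(const ToyFfn& f, float x) {
  return f.w2 * gelu(f.w1 * x + f.b1);
}

// Layout before the commit: output bias added after the if/else,
// so it is applied on both paths.
static float eval_before(const ToyFfn& f, float x, bool fused) {
  float cur = fused ? ffn_fused(f, x) : ffn_unfused_core(f, x);
  cur += f.b2;
  return cur;
}

// Layout after the commit: output bias added only in the unfused branch.
static float eval_after(const ToyFfn& f, float x, bool fused) {
  float cur;
  if (fused) {
    cur = ffn_fused(f, x);
  } else {
    cur = ffn_unfused_core(f, x);
    cur += f.b2;
  }
  return cur;
}
```

With the old layout, the fused result in this sketch exceeds the unfused one by exactly the output bias (up to rounding); with the new layout the two paths agree.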
