
Conversation


@w5688414 w5688414 commented Apr 3, 2024

PR types

New features

  • deepset/deberta-v3-large-squad2
  • microsoft/deberta-v2-xlarge
  • microsoft/deberta-v3-base
  • microsoft/deberta-v3-large
  • microsoft/deberta-base

PR changes

Description

Borrowed from the previous PR: #5414

```python
import numpy as np
import paddle
import torch

from paddlenlp.transformers import DebertaV2Tokenizer


def test_precision(model_name):
    pp_model = PaddleDebertaModel.from_pretrained(model_name)
    # pp_model = PaddleDebertaModel.from_pretrained(model_name.split('/')[-1])
    hf_model = HuggingfaceModel.from_pretrained(model_name)
    input_ids = np.random.randint(1, 1000, size=(2, 10))
    pp_inputs = paddle.to_tensor(input_ids)
    hf_inputs = torch.tensor(input_ids)
    pp_model.eval()
    hf_model.eval()
    with paddle.no_grad():
        pp_output = pp_model(pp_inputs, output_hidden_states=True, return_dict=True)
    with torch.no_grad():
        hf_output = hf_model(hf_inputs, output_hidden_states=True)
    if "start_logits" in hf_output.keys():
        for key in ["start_logits", "end_logits"]:
            diff = abs(hf_output[key].detach().numpy() - pp_output[key].numpy())
            print(f"{key} max diff: {np.max(diff)}, min diff: {np.min(diff)}")
    for i in range(pp_model.config.num_hidden_layers + 1):
        diff = abs(hf_output["hidden_states"][i].detach().numpy() - pp_output["hidden_states"][i].numpy())
        print(f"layer {i} max diff: {np.max(diff)}, min diff: {np.min(diff)}")


from transformers import AutoModelForQuestionAnswering as HuggingfaceModel
from paddlenlp.transformers import DebertaV2ForQuestionAnswering as PaddleDebertaModel

model_name = "deepset/deberta-v3-large-squad2"
test_precision(model_name)
```

output is:

```
start_logits max diff: 5.0067901611328125e-06, min diff: 1.862645149230957e-08
end_logits max diff: 3.3080577850341797e-06, min diff: 8.940696716308594e-08
layer 0 max diff: 9.5367431640625e-07, min diff: 0.0
layer 1 max diff: 2.86102294921875e-06, min diff: 0.0
layer 2 max diff: 4.291534423828125e-06, min diff: 0.0
layer 3 max diff: 7.152557373046875e-06, min diff: 0.0
layer 4 max diff: 5.7220458984375e-06, min diff: 0.0
layer 5 max diff: 6.198883056640625e-06, min diff: 0.0
layer 6 max diff: 8.106231689453125e-06, min diff: 0.0
layer 7 max diff: 6.67572021484375e-06, min diff: 0.0
layer 8 max diff: 6.198883056640625e-06, min diff: 0.0
layer 9 max diff: 8.106231689453125e-06, min diff: 0.0
layer 10 max diff: 1.0728836059570312e-05, min diff: 0.0
layer 11 max diff: 9.775161743164062e-06, min diff: 0.0
layer 12 max diff: 1.1086463928222656e-05, min diff: 0.0
layer 13 max diff: 9.298324584960938e-06, min diff: 0.0
layer 14 max diff: 8.106231689453125e-06, min diff: 0.0
layer 15 max diff: 1.3113021850585938e-05, min diff: 0.0
layer 16 max diff: 1.2874603271484375e-05, min diff: 0.0
layer 17 max diff: 3.4332275390625e-05, min diff: 0.0
layer 18 max diff: 1.9073486328125e-05, min diff: 0.0
layer 19 max diff: 1.1682510375976562e-05, min diff: 0.0
layer 20 max diff: 1.52587890625e-05, min diff: 0.0
layer 21 max diff: 2.384185791015625e-05, min diff: 0.0
layer 22 max diff: 2.5510787963867188e-05, min diff: 0.0
layer 23 max diff: 3.337860107421875e-05, min diff: 0.0
layer 24 max diff: 1.71661376953125e-05, min diff: 0.0
```

The model's parameters are fp16, which produces some tiny differences: torch loads the model in fp32 (casting to fp16 raises an error because some operators are unsupported), while paddle loads it in fp16, so the computed results differ slightly.
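The eyeball comparison above can also be expressed as an automatic tolerance check. A minimal sketch using NumPy alone — the `atol=1e-4` threshold is an assumption chosen to absorb the fp16-vs-fp32 rounding described above, and the synthetic arrays stand in for real hidden states:

```python
import numpy as np


def assert_outputs_close(hf_arrays, pp_arrays, atol=1e-4):
    """Check that per-layer outputs from two frameworks agree within atol."""
    for i, (hf, pp) in enumerate(zip(hf_arrays, pp_arrays)):
        np.testing.assert_allclose(hf, pp, atol=atol, err_msg=f"layer {i} exceeds atol={atol}")


# Synthetic stand-ins for hidden states; the perturbation mimics the tiny
# fp16/fp32 rounding noise reported in the diffs above.
rng = np.random.default_rng(0)
hidden = [rng.standard_normal((2, 10, 8)).astype("float32") for _ in range(3)]
perturbed = [h + 1e-6 for h in hidden]
assert_outputs_close(hidden, perturbed)
print("all layers within tolerance")
```

With the max diffs reported above (all below 3.5e-05), every layer of the real comparison would pass this check.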

Added documentation:

[screenshot of the added documentation]

There are the following two differences from the huggingface source code:

These two operators do not affect inference, but may affect training alignment.


paddle-bot bot commented Apr 3, 2024

Thanks for your contribution!

@w5688414 w5688414 requested a review from sijunhe April 3, 2024 09:23
@w5688414 w5688414 self-assigned this Apr 3, 2024

codecov bot commented Apr 11, 2024

Codecov Report

Attention: Patch coverage is 72.99509%, with 495 lines in your changes missing coverage. Please review.

Project coverage is 55.23%. Comparing base (7b493a8) to head (78af468).
Report is 9 commits behind head on develop.

| Files | Patch % | Lines |
| --- | --- | --- |
| paddlenlp/transformers/deberta_v2/modeling.py | 65.83% | 234 Missing ⚠️ |
| paddlenlp/transformers/deberta/modeling.py | 76.26% | 155 Missing ⚠️ |
| paddlenlp/transformers/deberta_v2/tokenizer.py | 62.17% | 101 Missing ⚠️ |
| paddlenlp/transformers/deberta/tokenizer.py | 97.36% | 4 Missing ⚠️ |
| paddlenlp/transformers/deberta_v2/configuration.py | 97.36% | 1 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##           develop    #8227      +/-  ##
==========================================
+ Coverage    55.15%   55.23%    +0.08%
==========================================
  Files          601      609        +8
  Lines        91764    94218     +2454
==========================================
+ Hits         50611    52040     +1429
- Misses       41153    42178     +1025
```

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@w5688414 w5688414 requested a review from JunnYu April 11, 2024 03:22
@sijunhe sijunhe merged commit 814e9c4 into PaddlePaddle:develop Apr 11, 2024
@seetimee

Could you please add a Chinese version of DeBERTa?

@w5688414

Which Chinese version?

@seetimee

It seems only Erlangshen (二郎神) has a v2.

@w5688414

Could you share the link to the corresponding DeBERTa on hf?

@w5688414

Developers are welcome to contribute:

`def _get_name_mappings(cls, config):`
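For anyone picking this up: this method pairs each HuggingFace parameter name with its Paddle counterpart (typically transposing `nn.Linear` weights along the way). Below is a self-contained, hypothetical sketch of such a table — the parameter names and the `build_name_mappings` helper are illustrative only, not the actual PaddleNLP API:

```python
def build_name_mappings(num_hidden_layers):
    """Return illustrative (torch_name, paddle_name, action) triples for weight conversion."""
    mappings = [
        # Embedding tables usually keep the same name and layout in both frameworks.
        ("embeddings.word_embeddings.weight", "embeddings.word_embeddings.weight", None),
        ("embeddings.LayerNorm.weight", "embeddings.LayerNorm.weight", None),
    ]
    for i in range(num_hidden_layers):
        prefix = f"encoder.layer.{i}."
        # Linear weights are stored transposed between torch and paddle.
        mappings.append((prefix + "attention.self.query_proj.weight",
                         prefix + "attention.self.query_proj.weight", "transpose"))
        mappings.append((prefix + "attention.self.query_proj.bias",
                         prefix + "attention.self.query_proj.bias", None))
    return mappings


mappings = build_name_mappings(2)
print(len(mappings))  # 2 shared entries + 2 per layer * 2 layers = 6
```

The real implementation derives these entries from the model config (number of layers, tied weights, and so on) rather than hard-coding them.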
