add mteb evaluation #8538

cxa-unique · 2024-06-04T09:10:33Z

PR types

New features

PR changes

Others

Description

Add evaluation scripts for the MTEB benchmark to the pipeline example "contrastive_training".

paddle-bot · 2024-06-04T09:10:38Z

Thanks for your contribution!

w5688414 · 2024-06-04T09:22:38Z

pipelines/examples/contrastive_training/mteb_evaluation/README.md

@@ -0,0 +1,97 @@
+# MTEB Benchmark Evaluation


Merge the MTEB code into the original evaluation directory instead of creating a separate directory. For the structure, refer to:

Added an mteb directory under the original evaluation directory.

w5688414 · 2024-06-04T09:27:13Z

pipelines/examples/contrastive_training/mteb_evaluation/README.md

+## Model Evaluation
+Use the evaluation script `eval_mteb.py`:
+
+- `base_model_name_or_path`: model name or path


Merge this README.md into the main README; only the main README needs to be maintained.

Merged into the main README and tested.

w5688414 · 2024-06-04T09:30:28Z

After merging the code, please check whether the content of the original README still runs.

codecov · 2024-06-05T08:05:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 53.97%. Comparing base (f36ed75) to head (c9379e8).
Report is 247 commits behind head on develop.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #8538   +/-   ##
========================================
  Coverage    53.97%   53.97%
========================================
  Files          618      618
  Lines        96827    96827
========================================
+ Hits         52258    52259    +1
+ Misses       44569    44568    -1
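The unchanged percentage is consistent with the counts in the diff: one extra hit out of 96827 lines is too small to show at two decimal places. A quick check, using only the numbers reported above (the variable names are illustrative):

```python
# Counts copied from the coverage diff above.
lines = 96827
base_hits, head_hits = 52258, 52259

# Coverage as a percentage, rounded to two decimals as in the report.
base_cov = round(100 * base_hits / lines, 2)
head_cov = round(100 * head_hits / lines, 2)

# Misses are simply the lines that were not hit.
base_misses = lines - base_hits
head_misses = lines - head_hits

print(base_cov, head_cov, base_misses, head_misses)
```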

w5688414 · 2024-06-05T08:28:08Z

pipelines/examples/contrastive_training/README.md

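+is a large-scale text embedding benchmark that includes a wide range of vector retrieval evaluation tasks and datasets.
+This repository mainly targets its Chinese and English retrieval tasks (Retrieval), using the SciFact dataset as the main example.
+
+Use the evaluation script `evaluation/mteb/eval_mteb.py`: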


Put the parameter explanations after the run command, consistent with the style used above.
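For readers unfamiliar with the retrieval tasks mentioned in the README text quoted above: MTEB retrieval tasks such as SciFact are typically scored with nDCG@10. The sketch below shows how that metric could be computed for a single query. It is not the code in `eval_mteb.py`; the function name `ndcg_at_k`, the linear-gain variant, and the toy document ids are assumptions made for illustration.

```python
import math


def ndcg_at_k(ranked_doc_ids, relevance, k=10):
    """nDCG@k for one query (hypothetical helper, linear gains).

    ranked_doc_ids: document ids ordered by descending model score.
    relevance: dict mapping document id -> graded relevance; missing ids count as 0.
    """

    def dcg(gains):
        # Position 0 is rank 1, so the discount is log2(rank + 1) = log2(index + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [relevance.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    # A query with no relevant documents gets a score of 0.
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy example: the only relevant document is retrieved at rank 2.
score = ndcg_at_k(["d2", "d1", "d3"], {"d1": 1})
print(f"nDCG@10 = {score:.4f}")
```

A benchmark score would average this per-query value over all queries in the dataset.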

w5688414

LGTM

sijunhe

lgtm

add mteb evaluation

f7fe7ae

paddle-bot bot added the contributor label Jun 4, 2024

paddle-bot bot assigned wawltor Jun 4, 2024

w5688414 reviewed Jun 4, 2024

w5688414 assigned w5688414 and unassigned wawltor Jun 4, 2024

w5688414 requested a review from sijunhe June 4, 2024 09:35

w5688414 assigned cxa-unique and unassigned w5688414 Jun 4, 2024

cxa-unique added 2 commits June 5, 2024 15:32

add mteb evaluation

7831752

add mteb evaluation

d42cd96

Merge branch 'PaddlePaddle:develop' into add_mteb_eval

4269175

w5688414 reviewed Jun 5, 2024

add mteb evaluation

c9379e8

w5688414 approved these changes Jun 5, 2024

sijunhe approved these changes Jun 5, 2024

sijunhe merged commit 1cf780e into PaddlePaddle:develop Jun 5, 2024

sijunhe added the Beijing Innovation Consortium label Jul 29, 2024
