Skip to content

Conversation

@nikita1503
Copy link

@nikita1503 nikita1503 commented Apr 14, 2024

While trying to run the steps given in leaderboard README, found following improvements

  • 1-Setup
    model variable to be initialised before creating generations and metrics directories

  • 2-Generations
    save_generations flag is missing in while running generations
    max_length to be 1024 for some tasks, based on your tokeniser (Fix for max_length_generation parameter #207)

  • 3-Evaluations
    Generations file is saved in save_generations_path_$task, while running evaluations it should load from this path(_$task is missing in the path in README).

    save_generations_path = f"{os.path.splitext(args.save_generations_path)[0]}_{task}.json"

Copy link
Collaborator

@loubnabnl loubnabnl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the fixes!

fi
gen_suffix=generations_$task\_$model.json
gen_suffix=generations_$task\_$model\_$task.json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to have the same path format as here

save_generations_path = f"{os.path.splitext(args.save_generations_path)[0]}_{task}.json"
because during evaluation we call --load-generations_path which can be anything. So let's maybe keep the original path to not have task twice?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because during evaluation we call --load-generations_path which can be anything.

Right however, current README steps for Evaluation passes $gen_suffix variable in --load-generations_path argument

gen_suffix=generations_$task\_$model.json
metric_suffix=metrics_$task\_$model.json
echo "Evaluation of $model on $task benchmark, data in $generations_path/$gen_suffix"
sudo docker run -v $(pwd)/$generations_path/$gen_suffix:/app/$gen_suffix:ro -v $(pwd)/$metrics_path:/app/$metrics_path -it evaluation-harness-multiple python3 main.py \
--model $org/$model \
--tasks $task \
--load_generations_path /app/$gen_suffix \

Since $gen_suffix is missing the _task suffix, Running evaluations results in the following error

image

After adding _task suffix in $gen_suffix, evaluations run successfully.

I was able to run evaluations for the Artigenz-Coder-DS-6.7B here after these changes

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it shouldn't throw an error if you used save_generations_path=generations_$task\_$model.json in the generations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants