danesherbs

🍹 Poolside

Pinned

  1. openai/evals (Public)

    Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

    Python · 17.5k stars · 2.9k forks

  2. openai/frontier-evals (Public)

    OpenAI Frontier Evals

    Python · 967 stars · 114 forks

  3. openai/mle-bench (Public)

    MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering.

    Python · 1.2k stars · 192 forks

  4. bitblaster-16 (Public)

    BitBlaster-16 is a 16-bit computer built from scratch using only NAND gates and data flip-flops as primitives! :)

    Python · 2 stars

  5. fermi-poker (Public)

    Want to get better at making estimates under uncertainty? No? Well, now you can!

    Python · 4 stars

  6. summarizing-from-human-feedback (Public)

    Implementation of OpenAI's "Learning to Summarize from Human Feedback".

    Jupyter Notebook · 7 stars · 1 fork