Best AI LLM Evaluation Tools
8 tools, ranked by community signals and data.
1
lm-evaluation-harness
A framework for few-shot evaluation of language models.
Free, 8 points
2
lighteval
A lightweight LLM evaluation suite that Hugging Face has been using internally.
Free, 8 points
3
simple-evals
Eval tools by OpenAI.
Free, 8 points
4
OLMO-eval
A repository for evaluating open language models.
Free, 8 points
5
HELM
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models.
Free, 8 points
6
instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out
Free, 8 points
7
Giskard
Testing and evaluation library for LLM applications, in particular RAG pipelines.
Free, 8 points
8
Ragas
A framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines.
Free, 8 points