The most inspiring discoveries in evaluation frameworks

DeepEval is an open-source framework for evaluating large language models, offering customizable metrics and seamless integration with popular AI frameworks.

A framework for evaluating language models with a focus on few-shot tasks, supporting various model backends and benchmarks.

Promptfoo is a CLI tool for evaluating and securing LLM applications through automated testing and red teaming.