Hacker News

by fzaninotto on 10/3/24, 4:35 AM with 1 comment

by fzaninotto on 10/3/24, 4:35 AM

Evaluating the quality of AI agent responses used to be tricky. It required knowledge of eval criteria as well as third-party tools like promptfoo, ragas, or prometheus. Now OpenAI makes it ridiculously easy with a new API endpoint. It can grade a completion against a reference response and assess its format and tone, and you can even prompt the eval with your own criteria.
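For anyone unfamiliar with the idea, here is a rough local sketch of what "grading a completion against a reference, plus your own criteria" can look like. It does not call the OpenAI endpoint, and it is not how OpenAI implements it. Every name in it (`grade`, `close_to_reference`, `is_single_sentence`, the 0.8 threshold) is made up for illustration, and plain string similarity stands in for the model-based grading a real eval would use.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical type: a criterion takes (completion, reference) and says pass/fail.
Criterion = Callable[[str, str], bool]


def close_to_reference(completion: str, reference: str, threshold: float = 0.8) -> bool:
    """Pass when the completion is textually similar to the reference.

    The 0.8 threshold is an arbitrary assumption, not a recommended value.
    """
    ratio = SequenceMatcher(None, completion.lower(), reference.lower()).ratio()
    return ratio >= threshold


def is_single_sentence(completion: str, reference: str) -> bool:
    """Example of a custom format criterion: at most one sentence terminator."""
    return sum(completion.count(ch) for ch in ".!?") <= 1


def grade(completion: str, reference: str, criteria: dict[str, Criterion]) -> dict[str, bool]:
    """Run every criterion and return a name -> passed report."""
    return {name: check(completion, reference) for name, check in criteria.items()}


criteria = {
    "matches_reference": close_to_reference,
    "single_sentence": is_single_sentence,
}

if __name__ == "__main__":
    print(grade("Paris.", "Paris.", criteria))
```

Adding a criterion is just another entry in the `criteria` dict, which is roughly the role the custom prompt plays in a hosted eval.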

OpenAI Eval