AI Model Evaluation

Chatbot Arena

Chatbot Arena is a benchmark platform for Large Language Models (LLMs) that conducts anonymous, randomized battles through crowdsourcing. The project is led by LMSYS Org, a research organization co-founded by the University of California, Berkeley, the University of California, San Diego, and Carnegie Mellon University.
To participate, open the battle platform through the demo link, enter a question you are interested in, and submit it; two anonymous models then compete side by side to generate answers. Users evaluate the answers by choosing one of four options: Model A is better, Model B is better, Tie, or Both are bad. Multi-round dialogue is supported. Finally, the Elo rating system is used to aggregate the crowdsourced votes into an overall ranking of the models. (You can also pick a specific model to see how it performs, but those battles are not counted toward the final ranking.)
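To make the ranking step concrete, below is a minimal Python sketch of how an Elo rating update works for pairwise battles with win/loss/tie outcomes. The K-factor and starting rating here are illustrative assumptions, not the actual parameters used by Chatbot Arena.

K = 32          # update step size (assumed, not Chatbot Arena's value)
BASE = 1000.0   # initial rating for every model (assumed)

ratings = {}    # model name -> current Elo rating

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def record_battle(model_a: str, model_b: str, outcome: str) -> None:
    """Update both ratings after one battle.

    outcome: "a" (Model A is better), "b" (Model B is better),
             or "tie" (tied / both very poor).
    """
    r_a = ratings.setdefault(model_a, BASE)
    r_b = ratings.setdefault(model_b, BASE)
    score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[outcome]
    e_a = expected_score(r_a, r_b)
    ratings[model_a] = r_a + K * (score_a - e_a)
    ratings[model_b] = r_b + K * ((1.0 - score_a) - (1.0 - e_a))

# Example: three crowdsourced votes between two hypothetical models
record_battle("model-x", "model-y", "a")
record_battle("model-x", "model-y", "tie")
record_battle("model-y", "model-x", "b")
print(ratings)

A model that wins against a higher-rated opponent gains more points than one that beats a lower-rated opponent, which is why aggregating many random anonymous battles this way yields a stable overall ranking.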
