AI Model Outperforming Human Experts

Google's Gemini Ultra scored 90.1% on the Massive Multitask Language Understanding (MMLU) benchmark, becoming the first AI model to outperform human experts on this test.

What is the MMLU Test?

  • MMLU is a comprehensive benchmark that evaluates an AI model's knowledge and problem-solving ability across 57 subjects (mathematics, physics, history, law, medicine, ethics, etc.); a simplified illustration of how such a score is computed is given below.

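The following is a minimal, illustrative Python sketch of how a multi-subject, multiple-choice benchmark score of this kind can be calculated. The subject names, answer keys, and model responses here are hypothetical examples, not drawn from the actual MMLU dataset or from Google's evaluation code.

```python
# Illustrative sketch only: computing an MMLU-style score.
# Each subject contributes multiple-choice questions; the model's chosen
# options are compared against the answer key, per-subject accuracies are
# computed, and a simple average gives the overall percentage.

# Hypothetical answer keys and model predictions, keyed by subject.
answer_key = {
    "mathematics": ["A", "C", "B", "D"],
    "law":         ["B", "B", "A", "C"],
    "medicine":    ["D", "A", "C", "C"],
}
model_answers = {
    "mathematics": ["A", "C", "B", "A"],
    "law":         ["B", "B", "A", "C"],
    "medicine":    ["D", "A", "B", "C"],
}

def subject_accuracy(predictions, key):
    """Fraction of questions answered correctly within one subject."""
    correct = sum(p == k for p, k in zip(predictions, key))
    return correct / len(key)

# Per-subject accuracy, then a simple average across subjects.
per_subject = {
    subject: subject_accuracy(model_answers[subject], key)
    for subject, key in answer_key.items()
}
overall = sum(per_subject.values()) / len(per_subject)

for subject, acc in per_subject.items():
    print(f"{subject}: {acc:.1%}")
print(f"Overall score: {overall:.1%}")
```

In this illustration the overall score is simply the mean of the per-subject accuracies; actual reported scores such as Gemini Ultra's 90.1% follow the benchmark's own aggregation and prompting protocol.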
Score Comparison

  • Average human score: ~34.5%
  • Expert human score: ~89.8%
  • GPT-4 score: ~86.4%
  • Gemini Ultra score: 90.0%–90.1%

Why is this achievement important?

  1. First AI model to outperform human experts.
  2. Best in 30 out of 32 major AI benchmarks.
  3. Multimodal capability: Leading performance in understanding text, image, audio, and video data.

Impact and Prospects

  • AI Reasoning and Knowledge: Equals ....