AI Model Outperforming Human Experts
Google's Gemini Ultra scored 90.0% on the Massive Multitask Language Understanding (MMLU) test, becoming the first AI model to outperform human experts on the benchmark.
What is the MMLU Test?
- MMLU is a comprehensive benchmark evaluating AI’s knowledge and problem-solving ability across 57 subjects (mathematics, physics, history, law, medicine, ethics, etc.).
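To make the benchmark concrete, here is a minimal, self-contained Python sketch of how MMLU-style scoring works: each item is a four-option multiple-choice question tagged with a subject, and the reported score is simply the fraction of items answered correctly. The two items and the `model_predict` stub below are hypothetical placeholders for illustration, not actual MMLU questions or a real model call.

```python
from dataclasses import dataclass

@dataclass
class MMLUItem:
    subject: str
    question: str
    choices: list[str]  # four answer options (A-D)
    answer: int         # index of the correct choice

# Two illustrative items (hypothetical, not drawn from the real dataset).
ITEMS = [
    MMLUItem(
        subject="high_school_mathematics",
        question="What is the derivative of x^2?",
        choices=["x", "2x", "x^2", "2"],
        answer=1,
    ),
    MMLUItem(
        subject="world_history",
        question="In which year did World War II end?",
        choices=["1943", "1944", "1945", "1946"],
        answer=2,
    ),
]

def model_predict(item: MMLUItem) -> int:
    """Stand-in for a real model call; this stub always picks choice B."""
    return 1

def accuracy(items: list[MMLUItem]) -> float:
    """MMLU reports accuracy: the fraction of multiple-choice answers that match."""
    correct = sum(model_predict(it) == it.answer for it in items)
    return correct / len(items)

if __name__ == "__main__":
    print(f"Accuracy: {accuracy(ITEMS):.1%}")  # 50.0% for this stub
```

A score like Gemini Ultra's 90.0% means this accuracy, aggregated over thousands of such questions across all 57 subjects.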
Score Comparison
- Average human score: ~34.5%
- Expert human score: ~89.8%
- GPT-4 score: ~86.4%
- Gemini Ultra score: ~90.0%
Why is this achievement important?
- The first AI model to surpass human-expert performance on MMLU.
- Exceeded state-of-the-art results on 30 of 32 widely used academic benchmarks.
- Multimodal capability: leading performance in understanding text, image, audio, and video data.
Impact and Prospects
- AI reasoning and knowledge: now equal to human-expert level on broad knowledge benchmarks.