AI Newsletter
Issue #1 by ODVI AI Committee
More than a year ago, Mors & Ritz gave a short presentation entitled "ChatGPT in the workplace" to showcase what was then the cutting edge of AI: OpenAI's ChatGPT, at that time based on GPT-3.5. Since then, the AI landscape has evolved rapidly, making it challenging to stay informed. To bridge that gap, the AI Committee was formed, and we're excited to launch a quarterly newsletter, starting today, to keep you updated on the latest advancements and opportunities in AI. Below is some AI news, followed by tool recommendations that you can use to improve your own work in 2025.
AI Showdown: Google Surpasses OpenAI
The AI arena is crowded with formidable players: OpenAI with ChatGPT, Google with Gemini, Anthropic with Claude, Meta with Llama, and Alibaba with Qwen. Each has its strengths. OpenAI pioneered mainstream conversational AI. Google leverages its vast data and computational resources. Anthropic's Claude is popular with developers for the quality of the code it produces, and it was recently demonstrated controlling a computer to perform tasks for the user. Meta pursues open-source innovation with Llama.
For the longest time, OpenAI's ChatGPT held the crown as the leading chat model. Recently, however, Google has edged slightly ahead: its Gemini Exp and Gemini 2.0 Flash models now outperform GPT-4o, the model behind ChatGPT.
This lead extends beyond LLMs. In media generation, Google's Veo 2 outshines OpenAI's Sora in text-to-video tasks, and Imagen 3 surpasses DALL-E 3 in text-to-image creation. Google's advantage in video likely stems from its ownership of YouTube, which provides an unmatched dataset for training these models.
AGI Achieved: A Milestone in Artificial Intelligence
While Google may have recently taken the crown in Chatbot Arena, OpenAI has made headlines with its newest model, o3, which has not yet been released to the public. The model scored an unprecedented 87.5% on the ARC-AGI benchmark. This benchmark is designed to evaluate genuine intelligence by presenting novel problems that the test taker, human or machine, has not encountered before. It is essentially a test of reasoning and problem-solving rather than of memorized knowledge.
For context, human-level performance on this benchmark is set at a score of about 85% (vs. the 87.5% that o3 scored), suggesting that AI has now surpassed general human reasoning on this test, in a way comparable to Deep Blue defeating Garry Kasparov at chess in 1997. However, unlike Deep Blue's domain-specific superiority, o3 demonstrates its reasoning on novel problems that are not tied to a single domain.
If this sounds unsettling, there's a silver lining: it reportedly cost OpenAI $250,000 in compute resources to achieve this score. This means AGI is not yet cost-effective for practical, widespread use, which keeps human domain experts safe, at least for the moment.