February 2025

Claude 3.7 Sonnet and Claude Code Anthropic

Claude 3.7 Sonnet, Anthropic's latest model and a hybrid reasoning model, offers both quick responses and an extended thinking mode, with particular strength in coding and web development. Alongside it, Anthropic introduced Claude Code, a tool that lets developers delegate coding tasks directly from the terminal. The model is available across various plans at the same price as its predecessors. It excels at real-world coding tasks, gives users control over how long it thinks, and improves on its predecessor's reasoning while making fewer unnecessary refusals. The model aims to integrate reasoning and coding in a user-friendly manner, bringing AI closer to enhancing human capabilities.

https://www.anthropic.com/news/claude-3-7-sonnet

A New Generation of AIs: Claude 3.7 and Grok 3

Two new AIs, Claude 3.7 and Grok 3, show significant advances driven by increased computing power and new capabilities, improving performance on complex tasks and coding. Grok 3, trained with over 10x the computing power of GPT-4, leads benchmarks, while Claude 3.7 excels in practical applications. These models reflect two key Scaling Laws: larger models trained with more computing power perform better, and giving a model more time to think before answering improves its results. As AI systems evolve, organizations need to shift focus from automation to capability enhancement, adapting their strategies to rapid advances and redefining metrics to value innovative contributions over mere efficiency.

https://www.oneusefulthing.org/p/a-new-generation-of-ais-claude-37

There Is No AI Revolution

The article frames generative AI as a costly, unsustainable industry that lacks real profitability. OpenAI, the leading player, reportedly lost $5 billion in 2024: it generated $4 billion in revenue but spent $9 billion overall because of high operational and talent costs. The business is described as unscalable, losing money on every prompt. Metrics like “weekly active users” can be misleading; OpenAI's user base is over 400 million, yet conversion to paying customers is very low. The market is dominated by OpenAI, with limited competition and widespread unprofitability across the AI sector, raising doubts about the industry's legitimacy.

https://www.wheresyoured.at/wheres-the-money/

General Reasoning

Open Reasoning Data includes 1,748,344 questions and 300,119 thought traces for model training. The data is categorized by subject (Mathematics, Medical, Chemistry, Physics, Biology, Languages, Engineering, Social Sciences, Humanities, and Coding), with detailed counts of questions and traces for each category.

https://gr.inc/

The Deep Research Problem — Benedict Evans

OpenAI's Deep Research tool appears useful for data analysis but often fails on accuracy, providing misleading information. While it saves time compiling data, results require meticulous verification, limiting its effectiveness. The tool is not geared for precise data retrieval, reflecting a broader issue with LLMs—being good at creative tasks but poor at accurate, deterministic ones. There are uncertainties about whether these models will improve sufficiently to become reliable. Overall, while they facilitate research, they still rely heavily on user oversight for accuracy.

https://www.ben-evans.com/benedictevans/2025/2/17/the-deep-research-problem

Here’s How Four Major Newsrooms Are Using AI

Major newsrooms are increasingly integrating AI into their operations.

  1. New York Times: Offers AI training for journalists, uses internal AI tools like “Echo,” and allows AI for minor revisions while cautioning against drafting entire articles.

  2. Quartz: Relies heavily on AI for generating earnings reports and blogs, but lacks strong human oversight, leading to potentially unreliable content.

  3. AP: Utilizes AI for translation, research, and summarizing content but maintains human control over most blog posts.

  4. Washington Post: Implements an AI-driven search tool, “Ask the Post AI,” to provide responses based on published content, encouraging users to verify information.

Caution is advised as reliance on AI can affect accuracy and creative input across these platforms.

https://lifehacker.com/tech/how-major-newsrooms-are-using-ai

Why I Think AI Take-off Is Relatively Slow

The post argues that AI take-off will be relatively slow for several reasons: inefficiencies in less productive sectors (Baumol-Bowen cost disease), human bottlenecks that limit adoption and productivity, challenges in human-AI collaboration (the O-Ring model), unclear economic impacts of AI, the historically slow diffusion of new technologies, and the long-run stability of GDP growth. Despite optimism about AI capabilities, significant barriers to rapid transformation remain. Predictions suggest AI could raise growth rates modestly over time without drastic changes in daily experience for most people.

https://marginalrevolution.com/marginalrevolution/2025/02/why-i-think-ai-take-off-is-relatively-slow.html

The Most Underreported and Important Story in AI Right Now Is That Pure Scaling Has Failed to Produce AGI

Scaling AI has not led to AGI, despite heavy investment. The scaling hypothesis, which holds that adding data and computational power will keep making models more capable, is failing, as models continue to exhibit unreliability and hallucinations. Experts, including Gary Marcus, argue that new approaches are needed and that existing strategies are hitting limits, as past technological trends eventually did. Major companies must innovate beyond scaling to achieve reliable AI.

https://fortune.com/2025/02/19/generative-ai-scaling-agi-deep-learning/

The Generative AI Con

The article argues that generative AI is a bubble fueled by hype and unsustainable practices. ChatGPT's popularity, with its claimed huge user numbers, does not equate to profitability or meaningful industry impact. Reporting often exaggerates the significance of generative AI, leading to inflated valuations and unrealistic revenue projections for companies like OpenAI and Anthropic. Current offerings, such as Deep Research, are described as expensive, low-quality products, underscoring a pattern of financial losses and operational inefficiencies. The narrative around generative AI is driven more by investor hype than by actual utility, creating an unstable economic environment and the potential for a crash.

https://www.wheresyoured.at/longcon/
