Anthropic Launches Claude Opus 4.5: 80.9% Coding Score & New USD 5 Pricing
Anthropic has released Claude Opus 4.5, a new frontier model engineered for complex coding and agentic workflows. Achieving a record 80.9% on the SWE-Bench Verified benchmark, the system outperforms key rivals while introducing variable "Effort" settings that can reduce token consumption by up to 76%. With API pricing lowered to US$5 per million input tokens, Opus 4.5 targets heavy enterprise adoption across major cloud platforms including AWS, Google Cloud, and Microsoft Azure.
Moonshot Launches Kimi K2 Thinking: 1T Model Rivals GPT-5
On November 6, 2025, Moonshot AI introduced Kimi K2 Thinking, a 1-trillion-parameter open-weight model designed for complex reasoning and tool use. With a reported 44.9% score on Humanity’s Last Exam and highly competitive inference costs, the model offers a new alternative to proprietary systems like GPT-5 and Claude Sonnet 4.5 for agentic workflows and data analysis.
AI Giants Release 3 Major Upgrades in One Week: GPT-5.1, Grok 4.1 and Gemini 3 Pro
Three leading AI developers — OpenAI, xAI and Google — released major model updates within a single week, introducing GPT-5.1, Grok 4.1 and Gemini 3 Pro. The upgrades emphasise personalisation, reliability and enhanced multimodal reasoning, marking a tightening race in frontier AI development.
Google Launches Gemini 3 Pro: 37.5% HLE Score and Instant Search Integration
Google has unveiled Gemini 3 Pro, its most advanced AI model to date, with same-day availability in the Gemini app and AI-enhanced Search for US subscribers. The release introduces improved reasoning, generative interfaces, new agentic tools, and competitive benchmark results, including a 37.5% score on Humanity’s Last Exam.
Grok 4.1 Launch: xAI’s Model Scores 1586 EQ and 1483 Elo After Major Upgrade
xAI has launched Grok 4.1 with notable gains in emotional intelligence, writing quality, and factual reliability. The model ranks near the top of several public benchmarks, scoring 1586 on EQ-Bench3 and 1483 Elo on LMArena, while offering improved stability and wider availability for users.
Gemini 2.0 Flash Hits 0.7% Hallucination Rate, but Trails ChatGPT o4-mini in Reasoning Benchmarks
Google’s Gemini 2.0 Flash has achieved a 0.7% hallucination rate on Vectara’s benchmark, placing it second globally behind Finix-S1-32B. While the model delivers strong factual accuracy, speed and a 1M-token context window, third-party evaluations show it still trails o4-mini and Gemini 2.5 Pro in deeper reasoning and coding tasks.
ChatGPT 5.1 vs Rivals: New Benchmarks Show 0.6%–1.4% Hallucination Gap in 2025
OpenAI’s ChatGPT 5.1 aims to deliver more reliable responses, but new independent benchmarks show that several competing models achieve lower hallucination rates. Current results place Finix-S1-32B at 0.6%, Gemini-2.0-Flash-001 at 0.7%, and GPT-5-high at 1.4%, highlighting a competitive landscape as developers push toward safer and more accurate AI systems.
OpenAI Rolls Out GPT-5.1 to 800M Users with 8 New ChatGPT Personalities
OpenAI is deploying GPT-5.1 across ChatGPT, enhancing the Instant and Thinking models with eight new personality presets and improved tone control. The update focuses on more natural conversation and consistent responses, following GPT-5’s release in August.
HKU Fertility Study Faces Scrutiny After AI Reference Errors in Paper on 69% Marriage Decline Impact
A study on Hong Kong’s fertility trends has come under review after several of its references were found to have been generated incorrectly by AI tools. The senior author has apologised and is preparing corrected citations, though the study’s analysis of declining marriage and fertility patterns remains unchanged.
Study Finds AI Models Increase Disinformation by Up to 188% When Competing for Engagement
A new Stanford study examining how AI models behave in competitive online environments found that optimizing for audience engagement can lead to increased disinformation and harmful messaging. Even when models were instructed to remain truthful, competitive incentives pushed them toward fabricated details, emotional language, and misleading claims to win user approval.
U.S. Firm Finds DeepSeek AI Produces Less Secure Code on Sensitive Political Prompts
A recent CrowdStrike analysis has identified notable variations in code quality produced by the Chinese AI model DeepSeek when prompts involve politically sensitive topics. While the model performs reliably under neutral conditions, tests showed increased security flaws and refusal rates in scenarios tied to groups or regions considered sensitive in China. The results have prompted broader discussions on AI impartiality, data governance, and risks for developers relying on AI-generated code worldwide.
Nova AI Chatbot Surpasses 50M Downloads as Multi-Model Assistants Gain Momentum
Nova is an AI-powered chatbot developed by HubX that has seen rapid global adoption since its release in 2023. The app integrates multiple language models to assist with everyday tasks, gaining millions of downloads and strong user reviews. At the same time, it has faced criticism over subscription practices and feature transparency, prompting ongoing discussion about user expectations in AI-assisted apps.
OpenAI Launches ChatGPT Pulse: Daily Personalized Briefings for Pro Users
OpenAI has launched ChatGPT Pulse, a preview feature for Pro users that provides daily AI-curated briefings based on chat history, preferences, and connected apps. Designed to make ChatGPT more proactive, Pulse organizes personalized insights into scannable visual cards and reflects OpenAI’s broader push toward agentic AI experiences.
AI ‘Big Bang’ Study 2025: ChatGPT Captures 48% of AI Traffic, 83% of Top 10 Chatbots
The AI ‘Big Bang’ Study 2025 reveals ChatGPT’s commanding position with 46.59 billion visits—48.36% of AI tool traffic—yet also highlights a diversifying field. DeepSeek, Gemini, Claude, and emerging players such as Grok and Perplexity are carving distinct paths through integration, retention, and specialization, signaling a maturing but competitive chatbot ecosystem.
Top 10 Enterprise AI Chatbot Platforms Ranked by Bitcot as Market Grows to US$27.29B by 2030
Bitcot’s 2025 guide on AI chatbot platforms identifies the top 10 solutions shaping enterprise customer service and automation. The report reviews tools like Botsify, IBM Watson Assistant, and Pandorabots, assessing usability, integration, scalability, and pricing. It highlights the expanding role of AI chatbots in streamlining operations and enhancing customer experiences across industries.
UK Universities See Surge in AI-Assisted Cheating, Cases Triple in One Year
Almost 7,000 students at UK universities were caught using AI tools such as ChatGPT to cheat in the 2023-24 academic year, more than triple the figure from the year before. The findings, based on FOI requests to 155 institutions, reveal the sharp rise of AI-related misconduct as traditional plagiarism cases decline. Universities now face mounting challenges in detection, ethics, and the future of assessment.
DeepSeek Delays R2 AI Model Launch After Huawei Chip Training Challenges
DeepSeek has delayed the rollout of its R2 model after facing technical setbacks training on Huawei’s Ascend chips. The company reverted to Nvidia’s H20 for training while using Ascend for inference. The delay underscores challenges in China’s AI hardware push and opens opportunities for rivals such as Alibaba’s Qwen3.
Direct vs. Indirect AI Access: ChatGPT, Grok, Claude and DeepSeek on Poe Compared
This report examines the differences between direct access to ChatGPT, Grok, Claude, and DeepSeek on their native platforms and indirect access via Poe. It reviews verified features, usage limits, regional access considerations, and potential changes in performance or user experience, highlighting which claims are fact-supported and which remain unverified.
ChatGPT Leads U.S. Generative AI Market at 60.4% as Rivals Perplexity, Claude Gain Ground
ChatGPT remains the dominant generative AI chatbot in the U.S. with 60.4% market share, according to August 2025 data from First Page Sage. Although ChatGPT still leads, established players are seeing their growth rates outpaced by newer entrants such as Perplexity and Claude AI, reflecting a shift toward specialized applications.
GPT-5, Grok 4, and Claude Opus 4.1: Comparing the Latest AI Model Advancements
OpenAI’s GPT-5, xAI’s Grok 4, and Anthropic’s Claude Opus 4.1 arrived in mid-2025 with notable improvements in reasoning, coding, and multimodal capabilities. While benchmarks reveal varied strengths, no model dominates across all tasks, underscoring the competitive and specialized nature of the current AI landscape.
