
09 October 2025
9th October - AI News Daily - Google's Gemini 2.5 Unleashes Browser Automation, Reshaping Agent Capabilities
🌍 INAI • The Open AI Hub
The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day.
https://github.com/inai-sandy/inAI-wiki
TOP HIGHLIGHTS
- Google's Gemini 2.5 introduces "computer use" capabilities for browser automation, bringing agent automation to the mainstream
- AMD secures multi-billion GPU deal with OpenAI while Nvidia tightens direct sales, intensifying AI compute competition
- Security concerns emerge with first malicious MCP server discovery and Figma MCP vulnerability
- CoreWeave launches Serverless RL with Weights & Biases integration to simplify agent training
- Disney and Universal sue Midjourney over character imagery, escalating copyright debates
NEW TOOLS & FRAMEWORKS
- Microsoft unifies AutoGen and Semantic Kernel into enterprise-ready Agent Framework
- Anthropic releases Petri for open-source LLM auditing
- Google's Opal no-code app builder expands to 15 countries
- Stripe adds model pricing and usage tracking APIs
- Python 3.14 stabilizes GIL-free interpreter with Pydantic 2.12 support
LLM INNOVATIONS
- Ling-1T debuts trillion-parameter open-source reasoner
- Samsung's 7M-parameter Tiny Recursive Model outperforms larger systems
- AI21's Jamba Reasoning 3B offers efficient reasoning trade-offs
- Alibaba releases Qwen3 Omni multimodal model and Qwen Image Edit
- LiquidAI demonstrates on-device reasoning for iPhone 17 Pro
RESEARCH HIGHLIGHTS
- Drax achieves SOTA speech recognition with discrete flow matching
- ModernVBERT outperforms larger models through architecture innovation
- Multi-vector embeddings improve retrieval precision
- CAIS updates "Humanity's Last Exam" to rolling benchmark
- VChain introduces chain-of-visual-thought for video reasoning
- Research shows quantization resilience must be built into training
INDUSTRY & POLICY DEVELOPMENTS
- USPTO pilots AI-assisted prior-art discovery for patent applications
- Google faces DOJ scrutiny over Gemini integration in core services
- Hidden Unicode payload attacks affect some LLMs, including Gemini-class models
PRACTICAL RESOURCES
- Step-by-step RAG implementation guide for beginners
- Guide on when to parse vs. extract in document workflows
- Strategies for Sora 2 guardrails and watermarking
- Prompt optimization techniques for agent reliability
- Privacy best practices for biometric data handling
DEMOS & APPLICATIONS
- Intercom showcases LangGraph powering Fin_ai customer support
- Pika's Predictive Video enables prompt-to-clip creation
- Sora-powered "viral video recreator" teased
- Seedream mobile agent enables on-device image generation
- Cristiano Ronaldo reportedly used Perplexity AI for speech preparation
THOUGHT-PROVOKING DISCUSSIONS
- JEPAs may bridge generative and contrastive learning
- Quality over quantity emphasized for RL training signals
- Studies show sycophantic AI undermines relationship repair
- LLM checks identify 80M+ inconsistent Wikipedia facts
- Industry consolidation raises concerns about AI infrastructure access
- Sora's upside-down exploit highlights evaluation gaps