tandemly.ai
Briefing · MAY 4 2026

May 4, 2026

AI daily briefing

🎯 Top 3 Things to Know

1. Google's Pentagon Gemini deal goes "any lawful purpose" — ~600 Googlers sign an open letter against it. GenAI.mil now runs Gemini 3.1 Pro on classified data for "any lawful government purpose" — the exact language Anthropic refused and got blacklisted for. The internal backlash is now public; Fortune's read is this won't be a Project Maven repeat, since tech-sector layoffs have eroded employee leverage. Anthropic remains the only major lab that walked away from the procurement standard everyone else signed. Fortune · 9to5Google

2. IBM: 76% of large orgs now have a Chief AI Officer, up from 26% a year ago. IBM Institute for Business Value's annual CEO study (2,000 leaders, 33 countries) shows CAIO penetration tripled in twelve months. By 2030, CEOs expect 48% of operational decisions where consistency and guardrails can be codified to be made by AI without human review (up from 25% today). 29% of employees expect to need reskilling for a different role between 2026 and 2028; 53% need upskilling for their current role. IBM Newsroom

3. Grafana open-sources o11y-bench: an agent benchmark that runs on a real Grafana stack, not a static dataset. 63 tasks across PromQL, LogQL, TraceQL, incident investigations, and dashboard editing — graded on what the agent changes in the system, not what it claims to do. Across 29 model variants, Claude Opus 4.7 (reasoning off) leads on consistency, Qwen 3.6 Plus is the top open-source. The methodology — measure system mutation, not transcript — is more interesting than the leaderboard. Grafana Labs · GitHub


🚀 Frontier Models & Features

Quiet day — no new frontier-model releases in the last 24 hours. Cost compression keeps playing out from late-April releases (DeepSeek V4, Mistral Medium 3.5, Gemini 3.1 Flash-Lite) rather than fresh launches.


🔬 Research Worth Reading


🏢 Enterprise in the Wild


🛠️ Tooling & Ecosystem


⚖️ Policy & Regulation


📌 Watch List