May 13, 2026 — Tandemly Briefings

🎯 Top 3 Things to Know

1. Google and SpaceX are in talks to put AI data centers in orbit, the WSJ reported, formalizing Project Suncatcher into a launch-vehicle conversation. Google has explored solar-powered satellite constellations carrying TPUs and free-space optical links since late 2025. Prototype satellites are slated for 2027. Earth-side AI buildouts are constrained by grid interconnect queues, water for cooling, and power costs that compound at gigawatt scale. Orbital sites get continuous solar and passive radiative cooling, but launch economics and on-orbit servicing remain unsolved. Worth watching whether the prototype launches a single TPU pod or a small mesh, since the optical-link claim only matters with multiple nodes to connect. TechCrunch · Bloomberg

2. South Korea's presidential policy chief floated a "citizen dividend" funded by AI sector tax revenue, briefly knocking the Kospi down 5.1% before officials clarified it would draw from excess revenue, not a corporate windfall tax. Kim Yong-beom posted the proposal on Facebook Monday night. Markets read it as a new levy on Samsung and SK Hynix and sold off sharply on Tuesday. President Lee Jae-myung walked it back the same day, calling it a personal idea about redistributing already-collected excess tax revenue rather than a new tax instrument. The episode is the cleanest example yet of how AI infrastructure profits are becoming politically contested in markets that depend on chip exports. It also previews the language other governments will start using. Worth watching whether the proposal resurfaces in formal budget discussions or quietly dies, and whether any peer economy adopts the framing. Bloomberg · Bloomberg follow-up

3. OpenAI will extend GPT-5.5-Cyber preview access to vetted EU cybersecurity teams. Anthropic has not yet offered EU teams equivalent access to Claude Mythos. GPT-5.5-Cyber is OpenAI's defender-focused variant with fewer refusal guardrails for verified security work. Anthropic's Mythos, released a month ago, is restricted to roughly 40 organizations under Project Glasswing. UK AISI testing reported Mythos completing a simulated 32-step corporate attack in three of ten attempts versus GPT-5.5 at two of ten. The split shows how the two labs read the same dual-use risk differently. OpenAI is widening the vetted pool. Anthropic is keeping the pool narrow and named. Worth watching whether EU regulators treat the two access models as functionally equivalent or whether the narrower Glasswing approach becomes the de facto safety standard for cyber-capable models. CNBC · Anthropic Project Glasswing

🚀 Frontier Models & Features

Quiet day on releases. Apple's WWDC date is confirmed for June 8 with a Siri overhaul expected to include third-party model extensions (Claude, Gemini, and others) routable through a new on-device agent surface. Google I/O opens May 19 with Gemini 4 widely expected. The week's signal is that the next round of model news will arrive bundled with platform integration rather than as standalone model cards. Tom's Guide WWDC preview · Google I/O

🔬 Research Worth Reading

ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdependent, and Large-Scale Tool Sandboxes (Li, Yang, Wang et al.). arXiv 2605.10787
- TL;DR: A new MCP-based benchmark with 150+ interdependent stateful tools plus 150+ stateless APIs across seven sandbox domains (office, finance, etc.), seeded so that a single random seed governs both environment setup and runtime perturbations like API failures.
- Stat: Three failure modes recur across frontier agents under analysis: tool-retrieval saturation as the action space scales, over-confidence skipping environment verification, and "strategic defeatism" where the agent rationalizes failure instead of recovering.
- Apply it: If your agent eval suite uses only stateless tool stubs, add at least one stateful sandbox with seeded API failures. The "strategic defeatism" pattern is the most useful new vocabulary here. Grep your eval traces for moments where the model accepts failure mid-trajectory and you will likely find a recoverable retry that would have moved the run forward.
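
A minimal sketch of the "stateful sandbox with seeded API failures" idea, assuming a toy ledger tool. The names LedgerSandbox, ToolFailure, and run_episode are hypothetical and are not taken from ComplexMCP; the point is only that one seed drives both the initial state and the injected failures, so a failing run can be replayed exactly.

```python
import random


class ToolFailure(Exception):
    """Raised when the sandbox injects a simulated API failure."""


class LedgerSandbox:
    """Hypothetical stateful tool: balances persist across calls."""

    def __init__(self, seed: int, failure_rate: float = 0.3):
        # One RNG drives both the initial state and the runtime failures,
        # so a single seed reproduces the whole episode.
        self._rng = random.Random(seed)
        self.failure_rate = failure_rate
        self.balances = {
            "ops": self._rng.randint(100, 500),
            "travel": self._rng.randint(100, 500),
        }

    def transfer(self, src: str, dst: str, amount: int) -> dict:
        # Injected failure happens before any state change.
        if self._rng.random() < self.failure_rate:
            raise ToolFailure("simulated upstream timeout")
        if self.balances[src] < amount:
            raise ValueError("insufficient funds")
        self.balances[src] -= amount
        self.balances[dst] += amount
        return dict(self.balances)


def run_episode(seed: int, max_retries: int = 5) -> list[str]:
    """Replay a fixed one-step script, retrying on injected failures."""
    sandbox = LedgerSandbox(seed)
    trace = []
    for attempt in range(1, max_retries + 1):
        try:
            sandbox.transfer("ops", "travel", 50)
            trace.append(f"attempt {attempt}: ok")
            break
        except ToolFailure as exc:
            trace.append(f"attempt {attempt}: {exc}")
    return trace
```

Calling run_episode twice with the same seed yields the same trace, which is what makes a mid-trajectory give-up reproducible enough to debug. A real harness would replace the fixed script with the agent's own tool calls.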

AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use (Chenglin Yang). arXiv 2605.04785
- TL;DR: A runtime layer that sits between agents like Claude Code or OpenDevin and the tool surface. Each tool call is intercepted, scored, and returned with a structured verdict (allow, warn, block, review) before execution.
- Stat: The interception layer adds bounded latency overhead while catching destructive shell and HTTP calls that would otherwise reach the OS. Specific numbers vary by tool class in the paper's evaluation matrix.
- Apply it: Treat agent tool calls the same way you treat outbound HTTP from production services: a place to put a policy layer. If your coding agent is one rm -rf away from a bad day, a tool-call firewall belongs in the harness, not in the model's training data.
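
A rough sketch of a tool-call firewall in the harness, using the four verdicts named above (allow, warn, block, review). This is not AgentTrust's implementation or API: the rule list, the evaluate and guarded_call functions, and the regex patterns are all illustrative assumptions, and a pattern list like this is far from a complete policy.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"
    REVIEW = "review"


@dataclass
class Decision:
    verdict: Verdict
    reason: str


# Hypothetical shell rules, checked in order; the first match wins.
SHELL_RULES = [
    # rm with both -r and -f among its flags, in either order
    (re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+", re.IGNORECASE),
     Verdict.BLOCK, "recursive forced delete"),
    (re.compile(r"\bcurl\b.*\|\s*(sh|bash)\b"),
     Verdict.BLOCK, "pipes a download into a shell"),
    (re.compile(r"\bsudo\b"), Verdict.REVIEW, "privilege escalation"),
    (re.compile(r"\bgit\s+push\b.*--force"), Verdict.WARN, "force push"),
]


def evaluate(tool: str, argument: str) -> Decision:
    """Score a proposed tool call before it runs."""
    if tool == "shell":
        for pattern, verdict, reason in SHELL_RULES:
            if pattern.search(argument):
                return Decision(verdict, reason)
    return Decision(Verdict.ALLOW, "no rule matched")


def guarded_call(
    tool: str, argument: str, execute: Callable[[str], str]
) -> tuple[Decision, Optional[str]]:
    """Run the tool only when the verdict is allow or warn."""
    decision = evaluate(tool, argument)
    if decision.verdict in (Verdict.BLOCK, Verdict.REVIEW):
        return decision, None  # never reaches the OS
    return decision, execute(argument)
```

In use, the agent harness would pass its real executor as execute and surface decision.reason back to the model or to a human reviewer. Keeping the policy outside the model means it can be audited and changed without retraining.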

🏢 Enterprise in the Wild

Judgment Labs raised $32M from Lightspeed to build evaluation and observability infrastructure for AI agents. The thesis is that production agent reliability has shifted from a modeling problem to a measurement problem. The funding comes the same week that ComplexMCP arrived on arXiv, and as Anthropic's Bloom evaluator continues to gain adoption. The pattern is consistent: agent eval tooling is now a separately funded category, not a feature inside an agent framework. Crescendo AI roundup

🛠️ Tooling & Ecosystem

The MCP Python SDK shipped 1.27.1 on May 8 and the PHP SDK posted an update on May 11. Both are minor maintenance releases, but the cadence matters: MCP server tooling is moving from beta-style churn to steady incremental shipping across languages, which makes it materially easier to maintain production servers without breaking client compatibility. MCP Python SDK on PyPI · MCP PHP SDK

⚖️ Policy & Regulation

China's State Council issued its 2026 Legislative Work Plan on May 12, with AI legislation on the agenda. The plan signals that Beijing is preparing a unified domestic AI law rather than continuing to govern through sectoral measures. No draft text has been published. Read alongside the South Korea citizen-dividend episode above, the plan suggests AI revenue distribution is becoming a near-term legislative concern in Asia ahead of any U.S. federal equivalent. China State Council coverage

📌 Watch List

Orbital data centers as a serious infrastructure conversation, not a slide-deck pitch.
Citizen-dividend framing for AI windfalls, and whether other governments adopt the language.
Divergent EU access strategies for cyber-capable frontier models (OpenAI broad vetting vs. Anthropic narrow named partners).
Agent runtime safety as a separate harness layer, not a training-time concern.
Apple WWDC June 8 and Google I/O May 19. Platform-level AI integration likely to overshadow standalone model launches this month.