engblogs

summaries of the latest blog articles from your favorite tech companies.
Apple ML

Asynchronous Verified Semantic Caching for Tiered LLM Architectures

Proposes Krites, an asynchronous, LLM-judged caching policy for tiered LLM architectures that expands static cache coverage by validating near-threshold prompts with an LLM and promoting verified matches into the dynamic cache, boosting static hits without increasing critical-path latency.

2/16/2026
Salesforce

How Agentforce Achieved Accurate Flow Generation Across 461 Billion Monthly Executions Using a Constrained DSL

Shows how Agentforce replaced fine-tuned models with a constrained, multi-stage DSL to deliver deterministic, high-accuracy Flow generation at scale across 461 billion monthly executions, backed by validation at every stage and metadata discipline.

2/16/2026
AWS ML

Supercharge regulated workloads with Claude Code and Amazon Bedrock

Combines Anthropic Claude Code with Amazon Bedrock in AWS GovCloud (US) to enable compliant, agentic AI-assisted coding workflows for regulated workloads.

2/16/2026
Snorkel AI

Coding agents don’t need to be perfect, they need to recover

Eight frontier models are analyzed for how they recover from errors in agentic coding tasks, showing that recovery, not perfection, is the differentiator and outlining actionable patterns and fixes to boost resilience in automated agents.

2/13/2026
Pinterest

GPU-Serving Two-Tower Models for Lightweight Ads Engagement Prediction

GPU-accelerated serving of two-tower models enables lightweight ads engagement prediction.

2/13/2026
AWS ML

Customize AI agent browsing with proxies, profiles, and extensions in Amazon Bedrock AgentCore Browser

Guidance on configuring AgentCore Browser for AI agents with proxy routing, persistent browser profiles, and Chrome extensions to enable secure, stateful, enterprise web automation.

2/13/2026
Snorkel AI

What Separates Success from Failure?

Analyzes error patterns and recovery dynamics across eight frontier models on the Agentic Coding benchmark to reveal how resilience, not perfection, separates success from failure.

2/13/2026
OpenAI

Scaling social science research

A concise, technical guide to scaling social science research by applying scalable data collection, analysis, and workflow methods to large datasets.

2/13/2026
OpenAI

GPT-5.2 derives a new result in theoretical physics

GPT-5.2 derives a novel result in theoretical physics, highlighting the AI-assisted approach and its implications for future research.

2/13/2026
OpenAI

Beyond rate limits: scaling access to Codex and Sora

A practical guide to scaling access to Codex and Sora beyond rate limits, outlining high-throughput API patterns and resilient access strategies.

2/13/2026
OpenAI

Introducing Lockdown Mode and Elevated Risk labels in ChatGPT

Technical overview of Lockdown Mode and Elevated Risk labels in ChatGPT, detailing the security controls and risk-aware behavioral changes.

2/13/2026
Apple ML

A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation

Cadmus is a small-scale autoregressive program-synthesis system with an integer VM and a DSL, enabling controlled experimentation to study inductive reasoning, training-distribution control, and affordable, transparent model analysis.

2/13/2026