05/14/26 - Subquadratic Attention Economics, Active Parameter Efficiency in MoE, Default Tier Competition Moves to Hallucination

Episode description

This episode covers SubQ’s commercial deployment of subquadratic attention with a twelve-million-token context at one fifth of frontier cost; Zyphra’s ZAYA1-8B, trained on AMD Instinct hardware, whose seven hundred sixty million active parameters compete at the performance level of thirty-two to forty billion parameter models; OpenAI and Google shifting default-tier competition from benchmark scores to hallucination reduction in regulated domains; community signals from Ollama download counts favoring fine-tuned over abliterated uncensored models; the operational requirements for multi-stage retrieval architectures with provenance and access control; and NVIDIA and Ineffable Intelligence co-designing reinforcement learning infrastructure for the Grace Blackwell and Vera Rubin platforms.