YAN: Mixture-of-Experts Flow Matching Achieves 40× Speedup Over Autoregressive LMs with Just 3 Sampling Steps
YAN is a new generative language model that combines Transformer and Mamba architectures with a Mixture-of-Experts Flow Matching approach. It reaches quality comparable to autoregressive (AR) models in just 3 sampling steps, which translates to a 40× speedup over AR baselines and up to 1000× over diffusion language models. Rather than learning a single global vector field, the model decomposes global transport geometries into locally specialized vector fields.
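
To make the idea of few-step sampling over locally specialized vector fields concrete, here is a minimal toy sketch in Python. It is not YAN's implementation: the distance-based softmax gate, the straight-line expert fields, the fixed-step Euler solver, and every name (`make_expert`, `moe_velocity`, `sample`, `centers`) are assumptions made for illustration. It also works on 2-D points rather than token representations.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(target):
    """Hypothetical expert: a straight-line field that carries x toward `target`.

    v(x, t) = (target - x) / (1 - t) is the velocity that reaches `target` at t = 1.
    """
    def expert(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return expert


def moe_velocity(x, t, experts, centers):
    """Blend the expert fields with a gate that favors nearby experts (assumed gating rule)."""
    # Score each expert by negative squared distance from x to its center.
    scores = [-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    weights = softmax(scores)
    velocities = [expert(x, t) for expert in experts]
    return [
        sum(w * v[d] for w, v in zip(weights, velocities))
        for d in range(len(x))
    ]


def sample(x0, experts, centers, num_steps=3):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # evaluate at the start of each step, so t never reaches 1
        v = moe_velocity(x, t, experts, centers)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Two experts, each responsible for one region of the 2-D plane.
targets = [[-2.0, 0.0], [2.0, 0.0]]
experts = [make_expert(t) for t in targets]

# A starting point on the right-hand side is routed mostly to the second expert.
result = sample([1.0, 0.5], experts, targets, num_steps=3)
```

Because each toy expert follows a straight path, three Euler steps are enough to land close to the second target here. Whether so few steps suffice in a real model depends on how straight the learned trajectories are, which this sketch does not address.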