2026

SPLA: Block Sparse Plus Linear Attention for Long Context Modeling
Bailin Wang, Dan Friedman, Tao Lei, and Chong Wang
Preprint, 2026. [arXiv]

2025

RAttention: Towards the Minimal Sliding Window Size in Local-Global Attention Models
Bailin Wang, Chang Lan, Chong Wang, and Ruoming Pang
Preprint, 2025. [arXiv] [Code]

2024

Gated Linear Attention Transformers with Hardware-Efficient Training
Songlin Yang*, Bailin Wang*, Yikang Shen, Rameswar Panda, and Yoon Kim
ICML, 2024. [arXiv] [Code]

In-Context Language Learning: Architectures and Algorithms
Ekin Akyürek, Bailin Wang, Yoon Kim, and Jacob Andreas
ICML, 2024. [arXiv] [Code]

Parallelizing Linear Transformers with the Delta Rule over Sequence Length
Songlin Yang, Bailin Wang, Yu Zhang, Yikang Shen, and Yoon Kim
NeurIPS, 2024. [arXiv] [Code]

2021

Structured Reordering for Modeling Latent Alignments in Sequence Transduction
Bailin Wang, Mirella Lapata, and Ivan Titov
NeurIPS, 2021. [arXiv] [Code]