Publications

2026

  1. SPLA: Block Sparse Plus Linear Attention for Long Context Modeling
    Bailin Wang, Dan Friedman, Tao Lei, and Chong Wang
    Preprint, 2026

2025

  1. RAttention: Towards the Minimal Sliding Window Size in Local-Global Attention Models
    Bailin Wang, Chang Lan, Chong Wang, and Ruoming Pang
    Preprint, 2025

2024

  1. Gated Linear Attention Transformers with Hardware-Efficient Training
    Songlin Yang*, Bailin Wang*, Yikang Shen, Rameswar Panda, and Yoon Kim (* equal contribution)
    ICML, 2024
  2. In-Context Language Learning: Architectures and Algorithms
    Ekin Akyürek, Bailin Wang, Yoon Kim, and Jacob Andreas
    ICML, 2024
  3. Parallelizing Linear Transformers with the Delta Rule over Sequence Length
    Songlin Yang, Bailin Wang, Yu Zhang, Yikang Shen, and Yoon Kim
    NeurIPS, 2024

2021

  1. Structured Reordering for Modeling Latent Alignments in Sequence Transduction
    Bailin Wang, Mirella Lapata, and Ivan Titov
    NeurIPS, 2021