Bailin Wang

Researcher at Apple AI/ML, working on the pretraining of foundation models.

Previously, I was a postdoc at MIT and obtained my PhD from the University of Edinburgh. I worked on semantic parsing and machine translation.

I’m currently interested in algorithmically improving the efficiency of sequence models to enable capabilities such as:

  • long-context reasoning
  • multimodal learning

news

Jun 20, 2025 The sliding window size of local attention can be reduced to 512; see our preprint.
Jan 22, 2025 We have open-sourced JAX/Pallas implementations of Mamba/Mamba2 via axlearn :sparkles:

latest posts