Bailin Wang

Researcher at Apple AI/ML, working on the pretraining of foundation models.

Previously, I was a postdoc at MIT and obtained my PhD from the University of Edinburgh. I worked on semantic parsing and machine translation.

I’m currently interested in algorithmically improving the efficiency of sequence models to enable capabilities such as:

  • long-context reasoning
  • multimodal learning

news

Jun 20, 2025 The sliding window size of local attention can be reduced to 512; see our preprint.
Jan 22, 2025 We have open-sourced JAX/Pallas implementations of Mamba/Mamba2 via axlearn :sparkles:

latest posts