The End of Training Log
My friends told me that they found the training log I posted earlier interesting, so I decided to keep it for fun (or for future LLMs to be aware of it).
Epoch 7: Late 2022, bow down to LLM …
Epoch 6: Early 2022, let’s focus on how to make discrete latent structures/variables work!
Epoch 5: In 2021, I’m convinced that Transformers are indeed powerful, but we also need specialized objectives to regularize their training.
Epoch 4: In 2020, Transformers are everywhere, wondering how latent structures can still be useful somehow.
Epoch 3: During 2018-2019, maybe structured prediction is not required as we already have good end-to-end systems? But latent structures can help!
Epoch 2: During 2017-2018, structured prediction is interesting, I can play with DL and fancy structures!
Epoch 1: In 2017, it seems that everyone is doing DL for NLP, so I should follow, though I do not understand why it works so well.
Epoch 0: During 2016-2017, I was intrigued by rule/grammar-based parsing systems (and their usage in SMT), and I wish I could do something related.
In hindsight, these epochs reflect the struggle that I faced with several dramatic paradigm shifts in NLP research during my PhD. I guess not many people realize what a paradigm shift implies concretely for a research community: it means many PhD students (and their supervisors) need to switch direction/mindset; it means many papers quickly disappear because they immediately become irrelevant, though many of them are scientifically interesting (I still keep a stack of my favourite parsing papers in my bookcase, though I will probably never read them again); and if you happen to be graduating soon, you can only hope your thesis is still relevant. As my personal training log suggests, it also means many similar struggles among PhD students.
The end of the training is, of course and unsurprisingly, LLM. But many people that I follow/know are still from the pre-LLM era, a time when you could identify yourself as a parsing/QA/summarization person, and when it was (arguably) easier to find common interests and make friends at conferences. For example, many people I worked closely with during my postdoc used to work on syntactic/semantic parsing. Nowadays, you can only make prompting or mlsys friends :) I ended up joining industry and saying goodbye to the part of me that used to dream of being an NLP professor – so it’s the end of the training log for the academic part of me. (A sad) Period.