Lessons from a Nobel Laureate (on deep learning and academia/industry)
Last week I saw John Jumper give a presentation on his team’s development of AlphaFold 1, 2 and 3. He started by explaining how AlphaFold 1 worked: essentially a convolutional neural network. He then spent most of the talk explaining the many changes required to produce AlphaFold 2, which went from mediocre predictive accuracy to changing the whole field.
I took away two main points.
There was no single trick that led to the big performance gain of AF2 over AF1, no single piece of intuition behind such a huge shift in performance. Instead, there were lots of good ideas, each of which increased performance slightly.
Figure 4 of the paper summarises this nicely. The best performing of these ideas accounted for about 10% of the performance gain. I expect this is broadly applicable across deep learning problems in biology. The use of a clear quantitative benchmark from the start enabled an approach of testing each idea objectively.
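That workflow, scoring every candidate idea against the same fixed benchmark, can be sketched in a few lines. This is purely illustrative and assumes nothing about the AlphaFold codebase: the `evaluate` function, the variant names and the toy numeric "models" are all hypothetical stand-ins for whatever metric (e.g. a CASP-style accuracy score) a real project would use.

```python
# Hypothetical sketch of ablation-style testing against a fixed benchmark.
# The names here are invented; the point is only that each idea is scored
# independently on the same held-out metric, relative to a shared baseline.

def evaluate(model_variants, benchmark):
    """Score each variant on the same benchmark and report its gain over baseline."""
    baseline_score = benchmark(model_variants["baseline"])
    gains = {}
    for name, model in model_variants.items():
        if name == "baseline":
            continue
        gains[name] = benchmark(model) - baseline_score
    return baseline_score, gains

# Toy stand-in: "models" are just numbers and the benchmark is the identity,
# so the arithmetic is visible at a glance.
variants = {"baseline": 60.0, "idea_a": 63.0, "idea_b": 61.5}
base, gains = evaluate(variants, lambda m: m)
print(base, gains)  # 60.0 {'idea_a': 3.0, 'idea_b': 1.5}
```

With a structure like this, no single idea needs to be dramatic; what matters is that each one moves the shared metric, and the gains accumulate.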
At the end of his talk, he also commented that the team – of what looked like around twenty people – was very small. It sounds like they also benefitted from being embedded in an environment with so much AI expertise. It’s vanishingly rare in academia to have a highly expert team this large, well-paid and secure, all focused on the same problem.
More speculatively, I imagine it’s rare in industry to be able to work on a curiosity-driven problem through such a long period of incremental results. So I left wondering how unique this environment and this particular problem (with good data and a clear evaluation technique) were to the success here, and whether we’d ever be able to replicate it in academia, even in miniature. Or are we more likely to publish the incremental findings for others to build on, rather than keep working towards a single big advance? Or to drop the project, and its individually underwhelming ideas, and work on something different instead?
(As a side note, I was also surprised that the compute use described was not wildly beyond what we are used to working with, and definitely not OpenAI/Meta scale.)