🦄 State of AI Report 2021

Oct 17, 2021

Hi all,

After a summer hiatus from writing this monthly newsletter, I'm excited to bring you the State of AI Report 2021! This is the fourth annual collaboration between Ian Hogarth and me. Our aim is to compile, analyse and distill the most important work in AI research, industry, talent, and politics over the last 12 months to inform conversation about the state of AI. Our report is open-access to all. Huge thanks to our inaugural Research Associate, Othmane Sebbouh, for working with us this year.

Below I'll share my director's cut - a snapshot of key themes and ideas that stood out to me.

We'd appreciate your help in spreading the report far and wide, thanks in advance! Any comments, critique, or suggestions, please hit reply :-)

🔬 Research

In our 2020 Report, we predicted that transformers would cross the chasm from NLP into computer vision, setting new state-of-the-art. The pace of transformer proliferation has been astounding - over 5% of the slides in this report contain transformers!

Transformer models, which focus machine learning algorithms on important relationships between data points to extract meaning more comprehensively for better predictions, have powered key breakthroughs in several fields. Notably, transformers underpinned DeepMind's AlphaFold 2 results (another 2020 Report prediction).
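For readers who want to see what "focusing on important relationships between data points" looks like in practice, here is a minimal sketch of scaled dot-product attention, the core operation inside a transformer. It is an illustration in plain Python, not code from the report or from any particular library: the function names `softmax` and `attention` and the toy inputs are my own assumptions, and real models add learned projections, multiple heads, and batching on top of this.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (illustrative only).

    Each query is compared with every key; the resulting weights decide
    how much of each value flows into that query's output.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy example: two identical keys receive equal weight,
# so the output is the plain average of the two values.
result = attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[2.0], [4.0]])
```

When a query lines up much more strongly with one key than with the others, nearly all of the weight goes to that key's value, which is the sense in which the model "attends" to the most relevant data points.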

Indeed, we can now use large transformer-based language models (LLMs) to generate functional proteins that do not exist in nature.

And multimodal self-supervision, zero-shot learning, and image generation too...

In the wake of GPT-3, LLMs are emerging in specific languages - a form of nationalism - built by private and public companies, academic research labs, and independent open-source initiatives.

But while LLMs such as OpenAI's Codex can generate functional computer code in a dozen programming languages, don't expect these systems to crack the coding interview or math tests just yet. In fact, you should know that LLMs tend to be less truthful than their smaller peers in domains such as health, law, conspiracies and fiction.

We also saw advances in model-based reinforcement learning with an agent, DreamerV2, that learns behaviors purely within a simulation that it has learned from pixels.

And evidence that humans can indeed improve their skills by means of coaching from an AI agent, in this case AlphaGo.

Data is coming back into the spotlight as the key ingredient to successful machine learning systems in production. Many practitioners aren't aware of "data cascades" - an adverse data domino effect. The result is a movement from model-centric AI to data-centric AI.

And with all the innovation in model architectures, we're seeing the emergence of a new framework challenger: JAX. Researchers are now first-class citizens in the ML framework marketplace.

📚 Talent

The most powerful stat we found was the rapid ascension of the Chinese Academy of Sciences in the AI research world, a proverbial "started from the bottom, now we're here". The institution published 0 AI papers in 1980 and now publishes the largest volume of top-25%-quality AI research, eclipsing Western institutions.

Meanwhile in the West, the Great Academic Brain Drain continues. In 2019, 85% of the professors who were hired from academia into industry were tenured, meaning they were highly experienced and otherwise had permanent employment at their university.

Strikingly, 88% of top AI faculty have received funding from big technology companies (namely Google, Microsoft, Apple, et al.).

This is altogether not surprising given many academic groups struggle to make ends meet in a field that is increasingly costly to compete in.

And governments cut or fail to expand their investment into higher education and research.

🤖 Industry

This year, we have seen AI become increasingly pivotal to breakthroughs across industry. Notably, both Exscientia and Recursion - leaders in AI-first drug discovery - IPO'd this year (a prediction from our 2020 Report!) and are advancing drugs into the clinic.

Meanwhile, the UK's National Grid ESO worked with non-profit Open Climate Fix to deploy a transformer-based forecasting service that slashed the error of electricity demand forecasting. This system has been delivering forecasts to the control room since May 2021.

And as the world moved online almost overnight putting our logistics infrastructure to the test, deep learning systems helped automate 98% of stock replenishment decisions for Ocado's online grocers every day.

In industrial facilities across 30 cities in 15 countries, Intenseye's real-time computer vision software protects employees from >35 types of health and safety incidents that would otherwise go unseen. In just over 18 months, the system has detected over 1.8M unsafe acts.

Computer vision is spreading across more and more visual tasks, ranging from KYC on new customers joining trading platforms en masse during the pandemic to the interpretation of 3D medical scans. Model-in-the-loop training for high-quality data comes to the fore (ref: V7).

In 2020, we predicted that NVIDIA would not complete its acquisition of Arm. At the time, this wasn't obvious - many pushed back. However, NVIDIA has received mounting resistance from customers, governments, regulators, and competitors.

As the world realised how critical the semiconductor supply chain is to our everyday lives, a new chokepoint emerged beyond the fab: ASML, the leading and only manufacturer of extreme ultraviolet lithography machines that are critical to leading-edge semiconductors. Spurred by geopolitical tailwinds, we predict that ASML will grow into a $500B company by our 2022 Report.

In our 2019 Report, we predicted that recent breakthroughs in NLP (namely transformers) would lead to NLP startups raising over $100M. Although we were 12 months too early, this year we saw half a dozen NLP startups raise over $375M to translate transformer research into industry. Recall that only a few years ago, similar companies were acquired for half this amount (re: Maluuba, SwiftKey et al.)

And today, AI-first businesses are the real deal. There are now over 182 active AI unicorns globally that are worth a combined $1.3T in enterprise value. Prime examples include Darktrace, Exscientia, DataRobot, Scale, SenseTime and more.

We're also seeing real enterprise value creation through M&A, secondaries, IPOs, and SPACs to the tune of €750B in 2021 alone. That's almost 3x more than 2020.

🧑‍⚖️ Politics

AI researchers have traditionally seen the AI arms race as a figurative one -- simulated dogfights between competing AI systems carried out in labs -- but that is changing with reports of recent use of autonomous weapons by various militaries including the US and Israel.

Governments are not only revving up the rhetoric, but also matching it with real money.

This money momentum is also in full force for startups (ref: Anduril et al.)

Against this backdrop, and with the growing capabilities of AI systems and their role in society and the economy now evident, we set out to determine just how many researchers are actively working on AI safety (near-term) and AI alignment (long-term). We highlight that research into AI safety and the impact of AI still lags behind its rapid commercial, civil, and military deployment. Notably, <100 people work on AI alignment in 7 leading AI orgs.

Meanwhile, new governance experiments are taking shape in the AI ecosystem: Anthropic as a public benefit corporation, Hugging Face as an open source private company, or even EleutherAI as an open source Discord server-based community with no company attached.

🔮 Predictions

Reviewing our predictions from 2020, 5/8 were a YES!, 2/8 were a NOPE, and 1/8 is debatable...

So what do we have in store for 2022?

🙏 Thanks!

---

Signing off,

Nathan Benaich, 17 October 2021

Air Street Capital is a venture capital firm investing in AI-first technology and life science companies. We’re an experienced team of investors and founders based in Europe and the US with a shared passion for working with entrepreneurs from the very beginning of their company-building journey.

Guide to AI