Your guide to AI: October 2020
Dear readers,
Welcome to the October 2020 issue of my newsletter, Your guide to AI. Here you'll find an analytical narrative covering key developments in AI tech, geopolitics, health/bio, startups, research and blogs.
This Friday I announced the launch of Air Street Capital, a venture capital firm I founded to invest in AI-first technology and life science companies. As members of the technology ecosystem, many of you are here because you too see the vast potential of AI as the most compelling force multiplier on technological progress.
Air Street Capital represents my conviction that tomorrow's category-leading companies will be AI-first by design. The fund signals that generalism is dead for technical fields like machine learning: investors need to be AI-native themselves to spot opportunities at the earliest stages. The playbook for what "great" looks like in this category of companies is still being written as we speak. Join us and share the news with friends!
If you'd like to chat about something you're working on, just hit reply! If you enjoy the issue, I'd appreciate you hitting forward to a couple of friends.
Technology news, trends and opinions
Life (and) science
The Nobel Prize in Chemistry was awarded to Jennifer Doudna and Emmanuelle Charpentier for the development of CRISPR-Cas9, the popular gene-editing method. Keep an eye out for machine learning-based improvements to the technology :-)
A new paper evaluated the current regulatory frameworks for the development and evaluation of AI-based diagnostic imaging algorithms and identified several factors that limit the trustworthiness of these systems: conflating the diagnostic task with the diagnostic algorithm, a superficial definition of the diagnostic task, difficulties in comparing similar algorithms, insufficient characterization of safety and performance elements, and a lack of resources to assess performance at each installed site. Mirroring the pharma industry's multi-step clinical trial process, the authors propose four phases of development and evaluation.
Facebook is collaborating with CMU on Open Catalyst 2020, the largest dataset of quantum mechanical simulations, built for the purpose of finding better electrocatalysts for renewable energy storage. The company is contributing its computing infrastructure and expertise in graph neural networks.
Nature featured a fun historical foray into the science of mapping neuron connections in the brain, and how advances in high-content microscopy, super-resolution, and machine learning are accelerating a once painstakingly manual workflow. An excerpt: "Reconstructing these circuits with then-standard techniques (moving from slice to slice, manually tracing each nerve cell) would have taken hundreds of thousands of hours, Helmstaedter estimates. So, his team combined automated image-processing algorithms with machine learning approaches and focused human effort on marking neuron branches while letting computers handle the volumetric reconstruction. This cut the workload to 20,000 hours, still the equivalent of 10 people working full-time for a year. Further AI improvements sped up the process still more, by training computers to evaluate the machine-assembled reconstructions and requesting human help only when needed."
A recent FDA 510(k) clearance for medical imaging went to Ezra, which develops a prostate segmentation AI for MRI. The system measures the volume of the prostate and the size of a lesion, and segments the lesion or the gland. In the US, there will be almost 200k new cases of prostate cancer in 2020.
Google Health announced a collaboration with the Mayo Clinic, focused on using computer vision technology to segment head and neck cancers before radiotherapy.
The (geo)politics of AI
In our State of AI Report 2020, we highlighted the proliferation of facial recognition systems around the world. Only 3 countries have partial bans on the technology. While some companies (e.g. Apple, MSFT) are taking more thoughtful approaches and new legal precedents are being set in the UK, an investigation by Amnesty International revealed that three companies based in France, Sweden, and the Netherlands sold facial recognition technology to key players in the Chinese mass surveillance apparatus. This finding highlights the tension between calls to strengthen export rules with strong human rights safeguards and corporate profit motives. Meanwhile, home-grown facial recognition systems are popping up elsewhere, for example in Africa.
A paper that examines the funding sources of tenure-track faculty in computer science departments at MIT, University of Toronto, Stanford, and Berkeley finds that 52% of faculty with known funding sources have been directly funded by Big Tech (GAFAMI). Of note, the percentage increases to 58% when the analysis is limited to faculty who have published at least one ethics or fairness paper since 2015. The study warns that Big Tech can have an (in)direct influence on the output of fairness research through these funding mechanisms.
Interesting talent stat: between 2009 and 2018, Germany saw 90% growth in the number of university-trained computer scientists, but over the same period the number of faculty grew only 15%...
Autonomous everything
Tesla has begun shipping its full self-driving software update to beta testers, some of whom posted videos of it in action. The verdict so far: progress is impressive, but human supervision is still needed at all times.
Waymo will relaunch and expand its autonomous ride-hailing service in Phoenix, opening access to all customers in a 50-square-mile area. The company has also moved into trucking, which it believes is a natural extension of its technology.
Cruise joins Waymo, Nuro, and AutoX as the fourth company to receive a permit from the California DMV to remove the human backup driver from its self-driving cars. It will be sending unmanned cars onto the streets of SF before 2020 is out.
Here's a fun consumer review of Comma.ai OpenPilot, the $1,199 device that adds driverless capabilities to your car.
Hardware
The big evolving story is NVIDIA's pending acquisition of Arm for $40B from its current owner, SoftBank. The deal has vocal proponents and detractors. In our State of AI Report, we predicted that the transaction will not end up being completed. Even though NVIDIA is adding sweeteners to the deal, e.g. building a £40M supercomputer for health AI research in Cambridge (UK), it appears that Chinese technology companies are pushing the state to block the deal unless access to Arm's designs remains unhindered. Given the hundreds of licensees who depend on Arm's RISC technology, there is concern that NVIDIA ownership could jeopardize that unhindered access (geopolitical issues aside). What's more, there is talk that large enterprises are exploring the main alternative to Arm's architecture, RISC-V, championed by SiFive (founded by RISC-V's inventors), as a long-term hedge. Indeed, SiFive just announced its Intelligence VIU7 Series, a vector processor designed for AI and graphics workloads. Recall that China put up a drawn-out fight against NVIDIA's acquisition of Mellanox. If you're looking for a primer on the history of semiconductors and their role in geopolitics, I encourage you to read this piece.
Arm announced a new addition to its micro neural processing unit (microNPU) accelerator line, offering more performance at low power consumption.
In other big-chip news, AMD announced a $35B all-stock deal to acquire Xilinx, the maker of FPGAs, in a bid to compete with Intel in the data center.
Apple introduced the iPhone 12 Pro, which comes packed with a LiDAR sensor. The company demonstrated its use in augmented reality and in 6x faster low-light auto-focusing for photos and videos.
Enterprise software and big tech
Earlier this year, DocuSign acquired Seal Software, an AI-based contract analysis company, for $188M in cash. Now, DocuSign has announced the availability of this technology to its customers. The feature uses machine learning to identify clauses and conduct a risk assessment based on an organization's own legal and business standards. The goal is to improve the thoroughness of review in less time.
Google is adding further improvements to language understanding in Search. After pushing BERT-based search into production, the company shipped a new 680M-parameter model that powers the "did you mean" feature, i.e. understanding what you meant even if your query was full of typos. Meanwhile, Apple is moving into search.
Developer tools for machine learning (both open and closed source) are on fire right now. New startups and projects pop up every month to attack various points of the developer workflow: version control, pipelining, experiment tracking, serving, monitoring, and more. However, there are several issues with this product segment, and this post does a great job of itemizing them. Of note (and I have noticed this first hand over the years) is that there is still no dominant design pattern for machine learning: there are point tools, end-to-end products, and everything in between. The ground has still not firmed up, and enterprises of all shapes and sizes have their own particularities. It's the wild west ;-) a16z had a nice piece on emerging architectures for modern data infrastructure.
Adobe (finally) released ML-powered image and video editing features, in addition to tools that help creators prove that their images are real (and not generated).
NVIDIA created lots of buzz by replacing video codecs with a neural network, resulting in an order of magnitude lower bandwidth use. They call it NVIDIA Maxine, a cloud-native video streaming AI SDK. Magic Pony's vision is still alive :-)
Research & Development
Great news for AI research: arXiv now allows researchers to submit code with their manuscripts! The code is linked via Papers With Code in a drive towards increasing the openness and reproducibility of AI research, a key issue we highlighted in the State of AI Report 2020. To dive deeper into the topic of reproducibility and replicability, I recommend this article.
Now, here's a selection of impactful work that caught my eye, grouped by category:
NLP
Recent Advances in Google Translate, Google AI. This post shows how the company's popular translation system has averaged a +5 BLEU score improvement across all 100+ languages in the last 12 months. The performance increase is thanks to improvements to "model architecture and training, improved treatment of noise in datasets, increased multilingual transfer learning through M4 modeling, and use of monolingual data."
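For readers new to the metric: BLEU scores a candidate translation by its n-gram overlap with reference translations, on a 0-100 scale where +5 is a big jump. A minimal example using the sacrebleu library (the sentences are made up for illustration):

```python
import sacrebleu

# One system output scored against one set of reference translations.
hypothesis = ["the cat sat on the mat"]
references = [["the cat was sitting on the mat"]]

score = sacrebleu.corpus_bleu(hypothesis, references)
print(score.score)  # corpus-level BLEU on a 0-100 scale
```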
Rethinking attention with Performers, Cambridge, DeepMind, Google. This paper extends the Transformer architecture so that it uses only linear, as opposed to quadratic, space and time complexity as the number of tokens in the input sequence grows. This means that Performers can scale with a much smaller compute footprint, which makes the model more palatable in large data domains where endless computing infrastructure cannot be taken for granted.
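To make the linear-vs-quadratic point concrete, here is a minimal NumPy sketch of kernelized attention. One caveat: it uses a simple positive feature map for phi, not the random-feature softmax approximation (FAVOR+) that the Performer paper actually introduces:

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the L x L score matrix costs O(L^2) time and memory.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (weights / weights.sum(axis=-1, keepdims=True)) @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Kernel trick: attention ~ phi(Q) @ (phi(K)^T V), normalized per row.
    # phi(K)^T V is only d x d, so cost grows linearly with sequence length L.
    Qp, Kp = phi(Q), phi(K)
    context = Kp.T @ V                  # (d, d_v), independent of L
    norm = Qp @ Kp.sum(axis=0)          # (L,) row-wise normalizer
    return (Qp @ context) / norm[:, None]

L, d = 1024, 64
Q, K, V = (np.random.randn(L, d) for _ in range(3))
out = linear_attention(Q, K, V)         # O(L * d^2) instead of O(L^2 * d)
```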
Learning to summarize from human feedback, OpenAI. An outstanding issue with the large-scale unsupervised pre-training of language models is that human oracles have little ability to tune the system to their preferences. This paper explores the context of summarization and presents a method of improving summary quality by training a model to optimize for human preferences. The authors train a model on a high-quality dataset of human comparisons between summaries to predict which summary is preferred. They then use this model as a reward function to fine-tune a summarization model with reinforcement learning. The paper shows that humans prefer summaries generated by this system over those produced through unsupervised pre-training alone or through supervised learning.
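The heart of the method is the reward model trained on pairwise human judgments. Here is a hedged sketch of that pairwise objective in PyTorch; the linear "reward model" and the pre-computed summary embeddings are stand-ins of mine, not OpenAI's actual setup:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in: a reward head scoring 768-dim summary embeddings.
reward_model = torch.nn.Linear(768, 1)

def preference_loss(emb_preferred, emb_rejected):
    # Pairwise (Bradley-Terry style) objective: the reward of the summary
    # humans preferred should exceed the reward of the rejected one.
    r_pref = reward_model(emb_preferred)
    r_rej = reward_model(emb_rejected)
    return -F.logsigmoid(r_pref - r_rej).mean()

# Toy batch of 8 human comparison pairs.
loss = preference_loss(torch.randn(8, 768), torch.randn(8, 768))
loss.backward()  # the trained reward model then drives the RL fine-tuning
```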
Customizing triggers with concealed data poisoning, Berkeley. Progress in NLP has been fueled by large-scale dataset creation using public text on the web. A concern with this approach is data poisoning attacks, defined as an adversary "inserting a few malicious examples into a victim's training set in order to manipulate their trained model." This paper shows that an adversary can control a model's predictions whenever the desired trigger phrase appears in the input. To defend against these attacks, the authors show that training for fewer epochs can mitigate the effects of poisoning, at the cost of accuracy. Stronger defenses require manual inspection of a large portion of the training set.
Beyond English-centric multilingual machine translation, Facebook AI Research. This work presents a many-to-many multilingual translation model that is capable of translating directly between any pair of 100 languages. The paper also offers a new dataset that covers thousands of language directions with supervised data that is built using large-scale text mining. Blog post here.
Computer vision
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Google Research. This paper applies a standard Transformer directly to images by splitting an image into patches and feeding the sequence of linear embeddings of these patches into a Transformer. Image patches are treated the same way as tokens (words) in an NLP application, and the model is trained on image classification in a supervised fashion. They find that when trained on sufficiently large datasets (e.g. 14M-300M images) and transferred to tasks with fewer data points, the vision transformer produces results that are close to or above the state of the art on ImageNet.
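The core trick is how little changes relative to NLP. A sketch of turning an image into a sequence of patch tokens (dimensions follow the ViT-Base configuration; the class token and positional embeddings are simplified stand-ins here):

```python
import torch

patch, dim = 16, 768
to_token = torch.nn.Linear(3 * patch * patch, dim)  # learned patch embedding

def image_to_patch_tokens(images):
    # images: (B, 3, 224, 224) -> 14 x 14 = 196 patches of 16x16x3 = 768 values
    B, C, H, W = images.shape
    x = images.unfold(2, patch, patch).unfold(3, patch, patch)
    x = x.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * patch * patch)
    tokens = to_token(x)                    # (B, 196, dim)
    cls = torch.zeros(B, 1, dim)            # a learned [class] token in the paper
    return torch.cat([cls, tokens], dim=1)  # add positional embeddings, then feed
                                            # a standard Transformer encoder

tokens = image_to_patch_tokens(torch.randn(2, 3, 224, 224))  # (2, 197, 768)
```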
"Less than one"-shot learning: Learning N classes from M<N samples, University of Waterloo. This paper evaluates whether it's possible to train a neural network to classify more labels than the number of examples it was trained on (i.e. less than one-shot learning). To do this, the authors used the MNIST digits dataset and created composite images to train on: images that blend multiple digits together and are labeled with hybrid (or "soft") labels. The paper explores the theory and boundaries of the approach. More here.
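A toy illustration of the composite-image idea (my own minimal construction, not the paper's exact recipe):

```python
import numpy as np

def make_composite(img_a, img_b, class_a, class_b, w=0.6):
    # Blend two digit images into a single training example whose label
    # is a hybrid "soft" distribution over both source classes.
    image = w * img_a + (1 - w) * img_b
    soft_label = np.zeros(10)
    soft_label[class_a], soft_label[class_b] = w, 1 - w
    return image, soft_label

# Toy stand-ins for two 28x28 MNIST digits (a "3" and a "5").
img, label = make_composite(np.random.rand(28, 28), np.random.rand(28, 28), 3, 5)
```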
Reinforcement learning
Mastering Atari with discrete world models, Google Brain, DeepMind, University of Toronto. The Dreamer team is back at it with their new RL agent, DreamerV2. As a recap, the authors use learned world models to enable RL agents to generalize from past experience to achieve goals in new environments. What's neat is that the agent learns behaviors from imagined outcomes and, in the case of DreamerV2, this happens purely from predictions in the compact latent space of its world model. This contrasts with a model-free RL setup, which has no world model from which outcomes can be predicted, so the agent must learn purely through trial and error. Using a single GPU and a single environment instance, DreamerV2 outperforms top single-GPU Atari agents such as Rainbow.
Massively large-scale distributed reinforcement learning with Menger, Google Research. This work describes a method for scaling the training of thousands of RL agents across multiple processing clusters with the goal of reducing overall training time. To do this, they optimize for the large number of read requests from agents to a learner model and for the TPU input pipeline that feeds training data into the chip's compute cores. On the chip placement task from a recent study, they show that with Menger an RL agent can train almost 10x faster on 512 cores.
Learning quadrupedal locomotion over challenging terrain, ETH-Zurich, Intel, KAIST. This paper demonstrates how a blind quadruped (Boston Dynamics-style) robot can be driven by a controller trained in simulation with reinforcement learning using proprioceptive signals from the joint encoders and an IMU as input. The robot demonstrates dynamic locomotion in diverse, complex natural environments (the Swiss Alps).
Speech
Self-training and Pre-training are Complementary for Speech Recognition, Facebook AI Research. This paper explores whether self-training and unsupervised pre-training (two methods that are effective on unlabelled data on their own) can be combined to greater effect on speech recognition. The approach is to first "pre-train a wav2vec 2.0 model on the unlabeled data, fine-tune it on the available labeled data, use the model to label the unlabeled data, and finally use the pseudo-labeled data to train the final model." They find that the combined system learns quickly from a few minutes of labeled data. Comparing with DeepSpeech2 (which trained on 10k hours of English), however, is tricky on such small academic datasets.
Science (bio, health, etc.)
ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction, University of Toronto, Reverie Labs, DeepChem. Molecular property prediction is the task of taking a chemical structure as input and predicting targets like solubility, toxicity, and more. Graph neural networks have been extremely useful on this task because they are more expressive representations of chemical structures than strings of characters. In this paper, however, the authors evaluate transformers (which model sequences) on the task of property prediction. They adapt the RoBERTa transformer implementation in HuggingFace, pre-train on 77M unique SMILES strings from PubChem, and evaluate on MoleculeNet, a set of property classification tasks. While not state of the art, this work shows that transformers scale well on this task and could be a future area of research to accelerate drug discovery.
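Because it rides on the standard HuggingFace stack, trying the model takes a few lines. A sketch assuming the checkpoint the authors published on the HuggingFace hub (the repo name is my assumption; check the hub for the current one):

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Checkpoint name is an assumption; the authors host ChemBERTa on the HF hub.
name = "seyonec/ChemBERTa-zinc-base-v1"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

# Treat a molecule like a sentence: mask one SMILES token (the final "O"
# of aspirin here) and let the pre-trained model fill it in, BERT-style.
fill = pipeline("fill-mask", model=model, tokenizer=tok)
print(fill(f"CC(=O)Oc1ccccc1C(=O){tok.mask_token}"))
```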
A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability, University of Birmingham, Moorfields Eye Hospital, UCL. The availability of clinical data is crucial to the progress of AI in healthcare. This work focuses on ophthalmological (eye) images and indexes all publicly available datasets, their accessibility, which diseases they describe, and the populations they represent, as well as the completeness of the metadata. They point to issues with the curation of metadata, dataset imbalance, discoverability issues, and the underrepresentation of certain diseases.
A deep learning approach to programmable RNA switches, Harvard and MIT. Predicting the function of biological material from its sequence has significant consequences for speeding up synthetic biology. In this paper, the authors focus on programmable RNA switches, which are a way to build cellular sensors for small molecules, proteins, and nucleic acids. Today, available datasets of switch designs and their empirically-validated functions are small, and models of this sequence-to-function relationship use rational thermodynamic and kinetic analyses. This paper first generates a dataset of 10^5 switch designs, then trains a deep learning model to predict switch function. The system is shown to outperform thermodynamic and kinetic models and also to provide human-readable visualizations of how the model works. A separate paper on the same subject appears here.
Leveraging uncertainty in machine learning accelerates biological discovery and design, Harvard and MIT. This paper addresses the potential of ML-guided experimental design that relies on uncertainty and the use of empirical data to improve model robustness. The authors apply a Gaussian process-based uncertainty prediction model to the task of predicting antibiotic potency from molecular structure. They also show how this uncertainty-based approach can be used for generative molecular design, protein function prediction, and single-cell transcriptomics.
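A minimal sketch of that uncertainty-guided loop with scikit-learn's Gaussian process, where a 1-D toy feature stands in for real molecular descriptors:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy data: x is a 1-D "molecular feature", y a measured potency.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, (20, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(20)

gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X_train, y_train)

# Predict mean and uncertainty over a pool of candidate compounds, then
# send the highest upper-confidence candidates to the lab next.
X_pool = np.linspace(0, 10, 500).reshape(-1, 1)
mean, std = gp.predict(X_pool, return_std=True)
ucb = mean + std                             # one simple acquisition rule
next_experiments = X_pool[np.argsort(-ucb)[:5]]
```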
Computational planning of the synthesis of complex natural products, Polish Academy of Sciences, Northwestern. This paper builds on the already hot area of chemical synthesis planning (the stepwise process of making molecules from their building blocks) and extends it to naturally-occurring products. This is interesting because natural products tend to be more chemically complex than synthetic ones and need synthesis routes that are longer. The authors also validate their synthesis plans in the lab.
Resources
Blogs and reports
The team at Stanford's Institute for Human-Centered AI published a workshop paper on Measurement in AI Policy. It tackles the opportunities and challenges that the industry faces, and offers perspectives on defining AI, what contributes to AI progress and how this can be tracked through papers, measuring the economic and societal impact of AI, and measuring its risks.
Here's a blog post that highlights several exciting papers in ML, some of which are under review at major conferences.
ML use cases in video conferencing are finally heating up! Here's an example of real-time, automatic sign language detection.
The patent landscape for computer vision in the US and China, a report by CSET.
The London-based self-driving company Wayve shares its approach to measuring autonomy performance.
What is a machine learning feature store and why is it useful?
Twitter sponsored the RecSys conference and shared learnings from recommender systems research here.
How DoorDash scaled its data platform.
Curious who the top authors, affiliations, countries, and collaborations are at NeurIPS 2020? Sergey's post has the answers!
Videos, talks
Google produced a documentary video about how they made use of BERT to revamp Search.
Open source and tooling
Facebook describes its internal tool for dataset discovery, though the tool itself is not open-sourced.
ML framework use is a hot topic, in part due to the rapid adoption of PyTorch over TensorFlow in many research venues. Lyft outlined its rationale for and use of PyTorch for self-driving.
Spotify open-sourced Klio, an ecosystem that allows you to process audio files, or any binary files, easily and at scale.
CC100, a dataset derived from CommonCrawl, is now available. It includes 2.5TB of clean unsupervised text from 100 languages.
A curated list of resources for MLOps.
Venture capital financings and exits
Here's a financing round highlight reel:
Applied Intuition, the maker of simulation software infrastructure for autonomous robots, raised a $125M Series C led by existing investors Lux Capital, a16z, and General Catalyst. The 3-year-old business is now valued at $1.25B. In previous newsletters we explained the importance of simulation to vehicle development: Applied Intuition is the SaaS option for companies that do not want to (or cannot) build and maintain this function themselves.
LabGenius, the London-based AI-first therapeutic protein engineering startup, raised a $15M round led by Atomico. Disclosure: Air Street Capital is an investor.
Hyperscience, a New York-based enterprise process automation company, raised an $80M round led by Tiger Global, hot on the heels of a $60M round just 4 months ago. The product focuses on data entry and extraction within sectors such as finance and insurance.
Your.MD, a London-based health tech company using ML to help patients check their symptoms before seeing a doctor, raised a €25M round led by Reckitt Benckiser.
Syte, an Israeli company offering visual search technology, raised a $30M Series C led by Viola Ventures.
Zest.ai, a US fintech company offering machine learning solutions for lending, raised a $15M round led by Insight Partners.
And a few M&A deals:
SigOpt, an SF-based provider of Bayesian optimization tooling, was acquired by Intel for an undisclosed sum.
Vilynx, a 10-year-old startup working on content understanding across audio, video, and text, was acquired by Apple for a reported $50M. Its machine learning system automatically generates 5-second video previews and is sold to publishers, video creators, and online media companies that use it to increase video view rates.
---
Signing off,
Nathan Benaich, 1 November 2020
Air Street Capital | Twitter | LinkedIn | State of AI Report | RAAIS | London.AI
Air Street Capital is a venture capital firm investing in AI-first technology and life science companies. We're an experienced team of investors and founders based in Europe and the US with a shared passion for working with entrepreneurs from the very beginning of their company building journey.