🏡Your guide to AI: May 2020
Dear readers,
The horrific murders of Ahmaud Arbery, Breonna Taylor, and George Floyd, and the inexcusable violent racism that has shocked our societies, are painful reminders of deeply rooted problems and injustice. Our communities and our work are enriched by having people from all walks of life collaborate towards the common goal of improving our collective future. There can only be strength in diversity. Racism has no place in global society.
Nathan Benaich
---
Welcome back to regular readers and hello to everyone who is new. Enclosed you'll find Your Guide to AI: May 2020. I'll cover key developments in AI tech, geopolitics, health/bio, startups, research and blogs.
Drop me a line if you enjoyed the issue or want to share a critique, and thanks for sharing this issue with your friends and collaborators.
Before we kick off, a reminder that we're 3 weeks away from the 6th Research and Applied AI Summit on June 26th. You can sign up here - all proceeds from ticket sales fund open source AI projects, research and education via our non-profit RAAIS Foundation. Our lineup includes senior leadership from Cerebras, Lyft Level 5, ZOE, and Zama, research group leaders from Oxford, University of Vermont, and Moorfields Eye Hospital, and AI researchers from DeepMind and Google AI.
🆕 Technology news, trends and opinions
🏥 Healthcare, life science, and COVID
As I’ve discussed in several previous editions of this newsletter, the pharma industry is broadly under-indexed on AI-first computational and robotic platform approaches to drug discovery and development (DDD). Several startups in this field are making strong progress, e.g. Exscientia, LabGenius, Cellarity, Recursion, Insitro, and others. A piece in Nature Biotechnology surveyed this landscape and focused on how iterative cycles of machine learning, empirical wet-lab experimentation, and human feedback can accelerate DDD. Pharma is of course evolving too. Of note, Roche appointed Aviv Regev of the Broad Institute of MIT and Harvard as the new Head of Genentech Research and Early Development. This is noteworthy because Aviv has built a storied career in data-driven biology, including single-cell RNA sequencing and ML-based analytical pipelines, while also co-chairing the Human Cell Atlas project.
The COVID pandemic has shown that technology startups collaborating with clinical research centers play an important role in public health. In a citizen science study with >2.6M participants reported in Nature Medicine, the health science company ZOE, Mass General Hospital, and KCL’s NHS Trust were able to validate “non-canonical” COVID symptoms far earlier than governmental health services. In particular, a combination of a) loss of smell+taste, b) fatigue, c) a persistent cough and d) a loss of appetite is most strongly correlated with COVID-19, adjusting for age, sex and BMI. Using a model trained to predict COVID from the reported symptoms of 18,401 participants who had undergone a COVID RT-PCR test, the study predicted that close to 18% of symptom-reporting participants are likely to have COVID.
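The setup described above can be sketched as a symptom-to-status classifier. The snippet below is a minimal illustration (synthetic data and made-up weights, not the study's actual model or cohort) of fitting a logistic regression on binary symptom reports, where loss of smell+taste carries the strongest signal:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Hypothetical binary symptom features: loss of smell/taste, fatigue,
# persistent cough, loss of appetite (1 = reported, 0 = not reported).
X = rng.integers(0, 2, size=(n, 4)).astype(float)

# Synthetic labels: smell/taste loss is given the largest true weight,
# mirroring the study's finding that it is the strongest predictor.
true_w = np.array([2.5, 0.8, 0.6, 0.7])
logits = X @ true_w - 2.0
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(float)

# Fit a logistic regression by plain gradient descent on the log loss.
w, b = np.zeros(4), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / n)
    b -= 0.5 * float(np.mean(p - y))

# Per-participant probability of COVID given reported symptoms.
probs = 1 / (1 + np.exp(-(X @ w + b)))
```

The real study additionally adjusts for age, sex and BMI as covariates; those would simply be extra columns in `X`.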
Meanwhile, multiple research groups and companies have run and released drug screening data to find potential COVID treatments. However, of the 200+ active compounds reported across several studies, only 32 were identified as being active in more than one study. Many of the drugs used in COVID clinical studies do not have published in vitro or in vivo studies on bioRxiv, which suggests that we know little about how they work. Without screening standards, it is also hard to draw comparisons across studies.
NinesAI, a Palo Alto-based company, gained 510(k) FDA clearance for its automated radiological review tool, which analyzes head CT images to detect brain bleeds. Detecting this condition quickly is imperative to avoid long-term damage and death: approximately half of the 30-day mortality occurs within the first 24 hours.
🌎 The (geo)politics of AI
The US government signed an extension of last year’s executive order that barred US companies from using telecoms equipment made by firms that present a national security risk (such as Huawei and ZTE Corp). The government’s Department of Commerce also added 24 Chinese companies and institutions to a sanction list for “supporting the procurement of items for military end-use in China”. A further 8 companies and the Institute for Forensic Science were placed on a second list that restricts access to US technology because they are “complicit in human rights violations and abuses…against Uygurs, ethnic Kazakhs, and other members of Muslim minority groups in the Xinjiang Uygur Autonomous Region”. The list includes Qihoo 360 (antivirus software and web browser), CloudMinds (cloud robotics), and CloudWalk (facial recognition software). Even so, CloudWalk raised $254M from Chinese provincial and municipal funds as it eyes a public listing on the Shanghai exchange this year. The company was incubated at the Chongqing Research Institute and was deeply involved in guiding the national strategy for facial recognition. SenseTime, another leading Chinese facial recognition startup, is in the market to raise $1B.
In January the UK government decided to cap Huawei’s 5G equipment footprint at 35% and barred its use in the critical core of mobile networks where data is stored and routed. Now, the government is drawing up a 3-year plan to remove Huawei from 5G networks entirely.
TSMC, the world’s largest semiconductor fabricator, said it would spend $12B to create a chip fab in Arizona. This is undoubtedly a geopolitically-influenced decision against the backdrop of US-China trade tensions and Trump’s aforementioned executive order. The factory would focus on TSMC’s 5-nanometer process.
There’s been a flurry of government and defense contractor agreements around AI technology. This includes 1) a 5-year, $800M contract between the US DoD’s Joint AI Center and Booz Allen Hamilton, 2) the DoD’s Defense Innovation Unit selecting Google Cloud to build a solution to detect, protect against, and respond to cyber threats, and 3) Canada’s DarwinAI announcing a strategic collaboration with aerospace contractor Lockheed Martin around explainable AI solutions.
🚗 Autonomous everything
The latest in the self-driving world is a mix of layoffs, consolidation, fundraising, and talent poaching due to the COVID-induced industry timeout. GM’s Cruise laid off 8% of its staff (circa 150 employees) as it sought to cut costs through COVID. The company added Regina Dugan (former head of Google’s Advanced Technology and Projects group and former DARPA Director) to its board. GM also announced an enhanced version of its “Super Cruise” system, a hands-free driving feature for pre-mapped highways that takes aim at Tesla’s AutoPilot.
Reports surfaced that Zoox, the startup that reimagined the vehicle for the era of autonomy, is in trouble. Word on the street is that Amazon is soon set to acquire the business for around $1B, which would be the sum of the venture capital that Zoox raised. While the deal is in the process of closing, competitors such as Cruise are poaching talent.
As others cut their headcount, Aurora reported passing the 500 employee mark. The company added two senior engineering leaders both of whom developed their academic and professional careers in Europe (Free University of Berlin and KTH in Stockholm). Interestingly, Aurora is launching an in-house university to upskill their employees for roles that are supply-constrained on the market.
Meanwhile, the latest poll of 1,200 Americans by a coalition of industry players and non-profits called Partners for Automated Vehicle Education revealed that ¾ of Americans say that AV technology “is not ready for primetime”. Of note, 48% said they would not get in a self-driving taxi. Having said that, half of the people polled said they owned vehicles with ADAS features and responded favorably to having a vehicle that supports their driving as long as the driver is in full control.
Volvo showcased its integration of Luminar’s LiDAR system that will go into production from 2022. Luminar’s Iris system offers up to 500m range, which is set to help Volvo deliver ADAS on dedicated highway stretches.
For a roundup of the development stage and progress of major self-driving players, check out this piece in WIRED.
💪 The giants
Automation in the enterprise is a very hot topic. Vendors across the stack from startups to public companies are selling enterprises on the virtues of implementing software to automate repetitive workflows. This software often comes in the form of robotic process automation (RPA), low- or even no-code builders, and AI-based features such as document digitization (OCR). A recent survey of 796 executives by Bain found that while companies expect to double their use of automation technologies in the next two years, 44% of respondents said their automation projects have not achieved the savings they expected. There are a few reasons. Although automation can cut labor time by 20-30%, the median payback for these savings is 13 to 18 months. Those organizations that see higher levels of savings from automation tend to 1) have C-level sponsorship for the project (i.e. establishing automation as a key priority), 2) have a “center of excellence” for automation (i.e. centralized coordination and capabilities), and 3) spend >20% of their IT budget on automation.
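The payback dynamic in the Bain survey is simple arithmetic. The numbers below are illustrative assumptions (not Bain's), chosen to land in the reported 13-18 month median payback range:

```python
# Hypothetical automation project: labor-time savings vs. implementation cost.
annual_labor_cost = 2_000_000    # assumed annual cost of the automated workflow
labor_time_saved = 0.25          # mid-point of the 20-30% range Bain reports
implementation_cost = 600_000    # assumed one-off project cost

monthly_savings = annual_labor_cost * labor_time_saved / 12
payback_months = implementation_cost / monthly_savings  # ≈ 14.4 months
```

With these assumptions the project repays itself in roughly 14 months, which is why organizations without C-level sponsorship often lose patience before the savings materialize.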
Twitter announced the appointment of Fei-Fei Li to its Board of Directors. Fei-Fei is known for her work on computer vision (e.g. ImageNet), as a Professor at Stanford, and most recently as Head of AI for Google Cloud.
Google’s open-source project, TensorFlow, has surpassed 100 million downloads since its launch in 2015. In the last month alone, there were 10 million downloads. Nonetheless, it appears that Facebook’s PyTorch has overtaken TensorFlow in research code.
Facebook is also upping the ante on its Marketplace product thanks to new computer vision models from Facebook AI and the GrokStyle team that was acquired in 2019. They deployed the GrokNet computer vision system, which can identify fine-grained product attributes across billions of photos, to help Marketplace tag photos and make them shoppable on Facebook Pages. The release also introduced Rotating View, which lets smartphone users capture multi-dimensional panoramic views of their listings. With these features, the company hopes to drive more transactions on its Marketplace.
In response to a new Greenpeace report that detailed 14 separate contracts between Amazon, Microsoft, and Google with major oil firms, Google responded by saying they “will not...build customer AI algorithms to facilitate upstream extraction in the oil and gas industry”. The company’s Cloud revenues from oil and gas customers were roughly $65M in 2019 (from a pool worth $113B according to HG Insights). Greenpeace isn’t fully satisfied, arguing that upstream extraction was never a defensible use case for AI in the first place.
At the Microsoft Build 2020 conference, OpenAI demonstrated a pretty impressive code generation demo of a large language model trained on thousands of GitHub repositories. The developer writes prompts that describe what a program should do, and the language model outputs the corresponding code. OpenAI also released its third-generation GPT language model, GPT-3, which has 175 billion parameters (10x more than any previous non-sparse language model).
Niantic announced the rollout of two new AR features that create more realistic, real-world experiences. This includes improved occlusion detection and the use of Portal Scanning videos to generate dynamic 3D maps of physical places. This helps the company capture crowdsourced data of the real world from which to build semantic and depth maps to empower AR features. We can expect a ton more to come from Niantic as it continues to deepen its talent bench in machine learning and AR.
🍪 Hardware
It’s no secret in the industry that the compute requirements for training modern AI systems are going up and to the right. A recent analysis by OpenAI showed there are two distinct eras in the timeline of AI training compute requirements. The first is pre-2012, i.e. pre/early-deep learning (Belief Networks, BiLSTMs, RNNs), and the second is from 2012 onwards, i.e. the age of modern deep learning (ResNet, BERT, AlphaGoZero). At the same time, there has been a concomitant increase in algorithmic efficiency. That is to say, modern deep learning systems can now achieve AlexNet-level performance with 44x less compute.
NVIDIA has unveiled its Ampere-based DGX A100 system, which delivers 5 petaflops of performance (20x its Volta-based predecessor) from 8 A100 Tensor Core GPUs and 320GB of GPU memory. Each A100 chip contains 54 billion transistors and is made with TSMC’s 7-nanometer process; the DGX A100 system costs $199,000. NVIDIA also released a whitepaper on their A100 Tensor Core GPU architecture here, which they call “the greatest generational leap in NVIDIA GPU accelerated computing ever”. Coupling the depth of NVIDIA’s software stack with these new chips makes NVIDIA a serious force to reckon with.
Sony showcased an integrated AI processor and image sensor product that is positioned to run ML workloads directly on their cameras. This could be interesting in photography!
About a year ago, Microsoft announced a $1B strategic investment into OpenAI to develop an Azure-based supercomputer for AI workloads. Now, Microsoft has announced the result: a 285,000-core supercomputer running on Azure that is exclusively available to OpenAI.
🔬Research and Development🔬
Here’s a selection of impactful work that caught my eye, grouped in categories:
NLP
Imitation Attacks and Defenses for Black-box Machine Translation Systems, UC Berkeley. It is known that hosted prediction APIs are vulnerable to adversarial attacks. This paper shows that one can train a model to imitate a black-box MT system by inputting monolingual phrases into the MT system and learning to imitate its outputs. The imitation model can then be used to craft adversarial attacks that transfer to the MT system, causing it to output semantically incorrect translations, dropped content, and vulgar outputs. The authors also demonstrate how to defend against such an attack.
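The core extraction loop can be sketched in a few lines. Everything below is a toy stand-in (the real attack queries a commercial MT API and fits a seq2seq Transformer on the harvested pairs, not a lookup table):

```python
def black_box_translate(sentence):
    # Toy stand-in for a hosted MT API that the attacker can only query.
    lexicon = {"hello": "bonjour", "world": "monde", "good": "bon"}
    return " ".join(lexicon.get(w, w) for w in sentence.split())

# Step 1: harvest (input, output) pairs by feeding monolingual phrases
# into the black box -- the attacker needs no parallel training data.
monolingual_queries = ["hello world", "good world", "hello good world"]
distillation_set = [(q, black_box_translate(q)) for q in monolingual_queries]

# Step 2: "train" an imitation model on the harvested pairs. Here a
# lookup table stands in for the paper's seq2seq model.
imitation_model = dict(distillation_set)
```

In the paper, the trained imitation model generalizes to unseen inputs, so adversarial examples crafted against it (white-box) transfer to the victim system.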
WT5?! Training Text-to-Text Models to Explain their Predictions, Google Research. This paper uses the text-to-text framework and its pre-trained T5 Transformer to train a model to generate explanations for its natural language predictions. The model can be queried for explanations by prepending “explain” to the input.
The cost of training NLP models, AI21 Labs. Access to scaled computation is a rate limiter to training ML models. In fact, much of the innovation in large-scale NLP models comes from industrial research groups with access to large R&D budgets (e.g. OpenAI, Facebook, DeepMind, et al.). This paper reviews the economics of modern-day NLP models by asking: “How much does it cost to train a model?”. At list-price, training an 11 billion parameter variant of Google’s T5 NLP model would cost well above $1.3M for a single run. Furthermore, “assuming 2-3 runs of the large model and hundreds of the small ones, the list-price tag for the entire project may have been $10M.”
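The headline figures come from straightforward cloud-price arithmetic. The function below sketches it with assumed numbers (accelerator count, duration, and hourly list price are illustrative, not quoted from the paper):

```python
def training_cost_usd(gpu_count, days, price_per_gpu_hour):
    # Cost of one training run at on-demand list prices.
    return gpu_count * days * 24 * price_per_gpu_hour

# e.g. 512 accelerators running for 25 days at an assumed $4/hour:
cost_single_run = training_cost_usd(512, 25, 4.0)  # ≈ $1.2M for one run
# Multiply by 2-3 large runs plus hundreds of small experiments and the
# total project bill quickly reaches the ~$10M the authors estimate.
```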
Here is a selection of NLP papers presented at ICLR 2020, curated by Hugging Face.
Computer vision
YOLOv4: Optimal Speed and Accuracy of Object Detection, Institute of Information Science Academia Sinica (Taiwan). This paper expands upon the state-of-the-art object detector network YOLOv3 created by Joseph Redmon (who quit the field because of military use of open-source computer vision) and Ali Farhadi (whose startup, Xnor.ai, was since acquired by Apple). Their updated network is both 2x faster and more accurate (by circa 10%) than YOLOv3 and EfficientDet, the SOTA AutoML object detector from Google.
Reinforcement learning
Unsupervised Meta-Learning for Reinforcement Learning, Berkeley. Much of machine learning revolves around task selection, dataset curation, labeling, and adjusting model parameters. In RL, we design task distributions and rewards such that an agent can learn a useful behavior. However, manually designing these task distributions is rate-limiting to progress. This is where meta-learning comes in: learning how to learn, so that new tasks can be acquired quickly. In this work, the authors devise a meta-learning algorithm that does not require the manual design of meta-training tasks. They do this by using unlabeled data to construct task distributions and then use meta-learning to quickly solve these self-proposed tasks. The approach results in agents that learn faster than RL from scratch or with random task proposals. Blog post.
Dream to Control: Learning Behaviors by Latent Imagination, Toronto, DeepMind, and Google Brain. This paper introduces Dreamer, an RL agent that learns to solve continuous control tasks from pixel inputs by learning a world model and "imagining" trajectories within it. From past experience, the agent learns a world model that encodes compact model states and predicts rewards. Behaviors are then derived inside the world model using two neural networks: an actor network that predicts successful actions and a value network that evaluates the value of a state. Dreamer outperforms previous model-based and model-free approaches on a benchmark of 20 continuous control tasks with better data efficiency, less computation time, and stronger overall final performance.
Systems and methods
Why we need DevOps for ML data, Tecton.ai. Software development today is much faster than it was 20 years ago, in no small part thanks to DevOps practices. Engineers work on a well-defined shared codebase where incremental changes are made, tested, versioned, integrated into production, and monitored. Machine learning development, however, is much slower because it tends to lack a well-defined, fully automated end-to-end process, and because models take a long time to train, converge, or surface issues that need debugging. This piece dives into many of these challenges and proposes a data platform for ML. For more on the challenges facing ML system engineers, check out this piece from Seldon.
Monitoring dataset quality at scale with statistical modeling, Uber. This blog post runs through how Uber built up its Data Quality Monitor solution. The system leverages statistical modeling to find the most destructive anomalies in a dataset and alerts the data table owner to check the source. The next steps include making root cause analysis more automated and intelligent.
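The gist of such a monitor can be sketched as a robust outlier test on a dataset health metric, e.g. daily row counts. This is a hedged illustration of the general idea, not Uber's implementation (which uses richer statistical models over many metrics):

```python
import statistics

def flag_anomalies(series, threshold=3.5):
    # Robust outlier detection via the median absolute deviation (MAD),
    # using the standard Iglewicz-Hoaglin modified z-score. The median is
    # preferred over the mean so one broken day doesn't mask itself.
    med = statistics.median(series)
    mad = statistics.median(abs(x - med) for x in series)
    if mad == 0:
        return []
    return [i for i, x in enumerate(series)
            if 0.6745 * abs(x - med) / mad > threshold]

daily_row_counts = [1000, 1020, 990, 1005, 995, 10, 1010]  # day 5 collapsed
# flag_anomalies(daily_row_counts) flags day 5 for the table owner to check.
```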
Machine learning on graphs: A model and comprehensive taxonomy, Stanford and Google AI. Machine learning using graph representations is hot right now, in particular for biology, chemistry, and large-scale social network graphs. This paper proposes a comprehensive taxonomy of representation learning methods for graph-structured data, aiming to unify several disparate bodies of work. The authors propose a Graph Encoder-Decoder Model (GRAPHEDM), which “generalizes popular algorithms for semi-supervised learning on graphs (e.g. GraphSage, Graph Convolutional Networks, Graph Attention Networks), and unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc) into a single consistent approach.”
ZeRO-2 & DeepSpeed: Shattering barriers of deep learning speed & scale, Microsoft Research. Since announcing a novel memory optimization technology to speed up large neural network training, MSR has released the second generation of ZeRO, which enables training that is up to 10x larger and faster than the first generation on models with 100 billion parameters.
Science (bio, health, etc.)
ZeroCostDL4Mic: an open platform to simplify access and use of deep learning in microscopy, UCL, Turku, Goethe-University Frankfurt, et al. This paper is neat because it presents a democratized interface for biologists to use deep learning in their microscopy workflows. The service is based on Google Colab and lets users leverage DL networks for cell segmentation, nuclei segmentation, 2D and 3D denoising, and label-free prediction all while using little or no coding. The project is hosted here and a longer piece on open source AI in biology can be found here.
nucleAIzer: A Parameter-free Deep Learning Framework for Nucleus Segmentation Using Image Style Transfer, University of Szeged, Broad Institute. Related to the paper above, this work uses image style transfer to automatically generate augmented training examples for nuclei labeling. It can be used for free here.
Video-based AI for beat-to-beat assessment of cardiac function, Stanford. This paper addresses the problem that human assessment of cardiac function is limited to the selective sampling of cardiac cycles and suffers significant inter-observer variability between cardiologists on the same case. The authors present a video-based deep learning model called EchoNet-Dynamic, which segments the left ventricle, estimates the ejection fraction, and assesses for cardiomyopathy. The model has a variance that is comparable to or lower than that of human experts, while being more reliable and quicker to produce a result. EchoNet-Dynamic uses standard echocardiogram videos as input and uses spatiotemporal convolutions to generate frame-level semantic segmentations.
Machine learning analysis of whole mouse brain vasculature, TU Munich, Helmholtz Zentrum München et al. In the study of mammalian brains, it is important to quantitatively analyze the entire brain vasculature. However, quantifying extremely small-scale changes in this brain vascular network is difficult because labeling and imaging of complete brains are not yet routine and the automated analysis of large 3D imaging datasets is complicated. This paper develops a two-pronged vascular staining strategy for mouse brains coupled with a 3D segmentation pipeline that leverages transfer learning from a synthetically generated vessel-like dataset. This approach lets the researchers subsequently register entire 3D segmented mouse brains against the reference Allen brain atlas in order to make quantitative comparisons automatically.
Accelerated discovery of CO2 electrocatalysts using active machine learning, University of Toronto. This paper focuses on catalysts that accelerate the electrochemical reduction of CO2 to chemical feedstocks, a process that consumes CO2 and can be powered by renewable energy. The authors use machine learning to guide the experimental exploration of multi-metallic systems and develop a novel, more efficient electrocatalyst.
A deep learning system for differential diagnosis of skin diseases, Google Health, UCSF, MIT, et al. This paper presents a computer vision model that can distinguish between 26 common skin conditions while also providing a secondary prediction covering 419 skin conditions.
Predicting conversion to wet age-related macular degeneration (exAMD) using deep learning, Moorfields Eye Hospital, DeepMind, Google AI. Patients with exAMD in one eye are at risk of developing the condition in their second eye. This paper shows how a computer vision system can process a 3D OCT scan, generate a semantic segmentation tissue map, and predict the conversion of the healthy eye to exAMD with 6 months lead time.
Molecular generation for desired transcriptome changes with adversarial autoencoders, Insilico Medicine and Neuromation. This paper demonstrates a model that can generate molecular structures of chemical compounds that will give rise to a desired transcriptional response. Such an approach is neat because drugs ultimately impart their biological effect by modulating pathways or gene expression.
Privacy and federated learning
Generative models for effective ML on private, decentralized datasets, Google. This is a neat paper that was presented at ICLR 2020 online. In the “non-private” domain, ML engineers rely on manual inspection of raw data to become familiar with the dataset they are working on. In the “privacy” and federated learning domain, however, the same dataset introspection would void privacy guarantees. The solution the authors present is to introspect data problems indirectly: train generative models locally on federated devices (over both good and bad data), then sample from those models to obtain representative synthetic examples that can be debugged without ever inspecting raw user data.
📑 Resources
Blogs and reports
Metric Management for machine learning: A Comprehensive Stack to Track and Adapt Metrics to Fit Your Business.
A neat landscape of ML infrastructure tools for data preparation.
This report details the global use of facial recognition services. I was surprised to read that 98 countries actively use facial recognition and only 3 countries have banned the technology.
Videos/lectures
Springer released 408 books for free, including 65 machine learning and data books.
AI for full-self driving, a great talk by Andrej from Tesla.
ML for systems and chip design, a talk from Google Brain researchers Azalia and Anna at Caltech. Slides here.
Datasets and benchmarks
A big bad NLP database with 481 datasets and their download links.
Facebook created and open-sourced the Hateful Memes dataset, which contains 10k+ new multimodal examples that unimodal classifiers (using just text or just images) struggle to classify correctly.
Scale AI collaborated with LiDAR maker Hesai to launch an open-source dataset called PandaSet.
Open source tools
An AR+ML prototype showed how to enable cutting elements from your surroundings (e.g. a potted plant on a desk) using your smartphone and pasting them into image editing software on your laptop. It’s pretty neat. The visual AR AirDrop :-)
Similarly, Google Lens now lets you copy/paste text from handwritten notes directly to your laptop.
pySLAM is a Python implementation of a deep learning-based monocular visual odometry pipeline.
A home for results in ML. The Papers with Code project now directly links results in tables from arXiv papers, semi-automatically extracts results from papers, and displays this as leaderboards to facilitate progress tracking on a task and dataset basis.
GraphLog: a multi-purpose, multi-relationship graph dataset built using rules grounded in first-order logic.
Hugging Face, the open-source NLP company, released v2.9.1 of its Transformers library, which includes 1,008 machine translation models covering 140 different languages.
💰Venture capital financings
Waymo raised a further $750M from T. Rowe Price and Fidelity Management to expand its first external investment round to $3B in total.
GRAIL, the Menlo Park-based liquid biopsy company, raised a $390M Series D from Canadian pension investors. This brings the company’s total capital raised to $1.9B. GRAIL presented positive data on detecting more than 50 cancer types from a blood draw with a less than 1% false discovery rate and 76% accuracy, while also identifying the tissue of origin.
Covariant, a Berkeley-based warehouse robotics company, raised a $40M Series B led by Index Ventures. The company was founded out of Pieter Abbeel’s robotics research group at UC Berkeley in 2017. The company announced a partnership with ABB in February and with Germany’s Knapp in March for order-pick automation.
Didi’s autonomous driving subsidiary raised over $500M from existing investor SoftBank.
insitro, the SF-based drug discovery company led by Daphne Koller, raised a $143M Series B led by a16z. The company produces disease-specific models using induced pluripotent stem cells (iPSCs).
Immunai, a Boston-based company that’s decoding the human immune system to improve human health, has raised a $20M Seed led by Viola Group and TLV Partners.
Arterys, an SF-based medical imaging startup with an FDA-cleared oncology imaging suite, raised a $28M Series C led by Temasek. The company now has six 510(k) pre-market approvals from the FDA.
Lilt, an SF- and Berlin-based enterprise machine translation company, raised a $25M Series B led by Intel Capital.
Invisible AI, an SF-based computer vision startup offering assembly line visual inspection software, raised a $3.6M Seed led by 8VC. The camera system tracks the movements of someone assembling parts and computes cycle time, total cycles, and the sequence of events.
FortressIQ, the SF-based enterprise process mining company, raised a $30M Series B led by M12 and Tiger Global.
Arculus, the German modular production platform, raised a €16M Series A led by Atomico. Instead of a conveyor belt production line, the company uses mobile robots to deliver a modular production process that can overcome bottlenecks when they occur.
Xwing, the SF-based autonomous aviation startup, raised a $10M Series A led by R7 Partners.
M&As in May 2020:
Intel acquired Israeli startup Moovit for $900M, which will join forces with Mobileye but will still operate as a standalone company and service. Founded in 2012, Moovit is a mobile and web app that helps people move around cities by combining all options (public transport, scooters, car sharing, Uber/Lyft, taxis) for real-time trip planning and payment. The company serves over 800M people across 103 countries. It captures over 6 billion anonymous data points a day and has a network of 685,000 data gathering curators who maintain maps. The company has 200 employees and raised $132M in venture capital; Intel had already invested in Moovit’s Series D via Intel Capital.
Microsoft acquired UK-based Softomotive to expand its low-code robotic process automation within Microsoft Power Automate. Softomotive reportedly serviced >9,000 customers who used the WinAutomation product to automate business processes across both legacy and modern desktop applications. The RPA space has heated up significantly since UiPath raised its Series A in 2017. Finally, the major tech companies are playing offense against RPA startups that run on their clouds :-)
Apple acquired Inductiv, a Canadian startup that developed AI systems to automate the identification and remediation of errors in data. The company was co-founded by well-known professors, including Chris Ré of Stanford, who was previously involved with Lattice Data, which Apple acquired in 2017.
---
Signing off,
Nathan Benaich, 7 May 2020
Air Street Capital | Twitter | LinkedIn | RAAIS | London.AI
Air Street Capital is a venture capital firm that invests in AI-first technology and life science companies. We’re a team of experienced investors, engineering leaders, entrepreneurs and AI researchers from the world’s most innovative technology companies and research institutions.