🏡 Your guide to AI: Q1 2020, part 2
Dear readers,
I hope this reaches you safe and well at 🏡. Following on from last Sunday's guide to AI in Q1 2020 (part 1 of 2), here is part 2! In this edition we'll focus on AI research (NLP, vision, RL, science, systems) and startup activity (investments and M&A).
London.AI livestreams every 2-4 weeks! This past Thursday we moved our meetup online and hosted hundreds of viewers for talks from PolyAI, Graphcore, and ZOE/KCL (in fact, Tim went live on CNBC right after to spread the word about the COVID Symptom Tracker!).
Join us on the Facebook group here and please hit reply if you're interested in discussing your research or applied AI work at a future event.
Referred by a friend? Sign up here and share the newsletter on Twitter.
🔬 Research
Here’s a selection of impactful work that caught my eye, grouped by category:
NLP
Language models as knowledge bases? Facebook and UCL. This paper investigates whether pre-trained language models build up their own relational knowledge bases that can serve as question/answer systems. They find that “(i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches.”
Scaling Laws for Neural Language Models, OpenAI. This is one of the few papers that provide empirical evidence and theory around how neural language models scale. The authors focus on transformer models and show that performance scales as a power law with more data, more model parameters, and more compute spent on training.
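To make the headline concrete: a power law L(N) ∝ N^(-α) is a straight line in log-log space, so the scaling exponent falls out of a simple linear regression. A minimal sketch (the loss and parameter-count values below are invented for illustration, not taken from the paper):

```python
import numpy as np

# Hypothetical (parameter count, test loss) measurements. The paper reports
# that loss falls as a power law in model size over many orders of magnitude.
N = np.array([1e6, 1e7, 1e8, 1e9])
L = np.array([5.2, 4.1, 3.3, 2.6])

# A power law L = c * N**(-alpha) is linear in log-log space:
# log L = -alpha * log N + log c
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
print(f"estimated scaling exponent alpha ≈ {-slope:.2f}")
```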
A visual analysis tool to explore learned representations in transformer models, Harvard. exBERT is an interactive tool that gives insights into the meaning of the contextual representations learned by language models.
Towards a Human-like Open-Domain Chatbot, Google AI. This paper presents “Meena”, an evolved transformer model that has 2.6 billion parameters and is trained on 341 GB of text filtered from public domain social media conversations. This neural conversational model has 1.7x the capacity of OpenAI’s GPT-2 and is trained on 8.5x as much data. The authors describe a new evaluation metric called Sensibleness and Specificity Average (SSA), which captures basic but important attributes of natural conversation according to crowd workers. They show that perplexity, which neural models compute automatically, correlates well with SSA, thus providing a quicker evaluation method than sampling crowd workers.
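For context, perplexity is just the exponentiated average negative log-likelihood a model assigns to each token, which is why it falls out of evaluation for free. A minimal sketch (the example probabilities are invented):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(average negative log-likelihood per token).
    Lower perplexity means the model is less 'surprised' by the text."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 0.2 to each token of a 4-token reply:
print(perplexity([math.log(0.2)] * 4))   # -> 5.0
```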
Microsoft released Turing-NLG, a 17 billion parameter language model capable of generation, Q&A, and summarisation. They show improved performance over GPT-2 and Megatron.
Then, SambaNova published a blog post saying that they’d trained a 100 billion parameter language model on their Dataflow-optimised compute system. They suggest that it is conceivable to run a 1 trillion parameter model soon (!).
Computer vision
MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices, Google AI and Google Brain. The authors present a mobile-friendly search space for the detection head and combine it with a latency-aware architecture search to produce efficient object detection models.
Transfusion: Understanding Transfer Learning for Medical Imaging, Google Research. This paper studies the effect of transfer learning on medical imaging model performance. They find that pre-training on ImageNet doesn’t actually improve model performance on diagnosing diabetic retinopathy or classifying lung disease from chest X-rays.
Single-stage monocular 3D object detection with virtual cameras, Mapillary Research. In this paper, the authors show how video footage from a monocular RGB-only camera can be used to predict 3D object bounding boxes. The approach involves generating synthetic camera views from different angles of a scene.
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data, Uber. The authors seek to speed up neural architecture search - an exciting approach to automatically design neural networks with high predictive performance. They do so by using a GAN-style generator to synthesize artificial training data, and show that training a candidate network for just a few steps on this unbounded synthetic data predicts whether the architecture will perform well on real data. This means you can cheaply evaluate lots of neural architectures on synthetic data to find the right one, then move to real data to complete training.
On the relationship between self-attention and convolutional layers, EPFL. Attention-based neural networks, which have taken NLP by storm due to their ability to model sequences, have also been shown to match CNNs on computer vision tasks. As a result, this paper explores whether learned attention layers operate similarly to convolutional layers. They show that multi-head self-attention layers attend to pixel-grid patterns similarly to CNN layers.
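The paper's core construction is easy to reproduce: give each attention head logits that depend only on the relative position between pixels, centred on a head-specific offset. With a sharp enough concentration, each head attends to exactly one neighbouring pixel and the heads jointly cover a convolutional receptive field. A minimal numpy sketch (the function name and alpha value are ours):

```python
import numpy as np

def positional_attention(H, W, centers, alpha=10.0):
    """Per-head attention over an HxW pixel grid where the logits depend only
    on relative position: logit_h(q, k) = -alpha * ||(k - q) - center_h||^2.
    Large alpha makes each head attend to a single offset, so heads centred
    on the offsets of a 3x3 kernel mimic a convolutional receptive field."""
    coords = np.array([(i, j) for i in range(H) for j in range(W)])  # (HW, 2)
    rel = coords[None, :, :] - coords[:, None, :]                    # (HW, HW, 2)
    heads = []
    for c in centers:
        logits = -alpha * np.sum((rel - np.array(c)) ** 2, axis=-1)
        e = np.exp(logits - logits.max(axis=-1, keepdims=True))
        heads.append(e / e.sum(axis=-1, keepdims=True))              # row softmax
    return np.stack(heads)                                           # (heads, HW, HW)

# Nine heads, one per offset of a 3x3 convolution kernel:
attn = positional_attention(5, 5, [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)])
```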
Turning any 2D photo into 3D using CNNs, Facebook AI. The authors used neural architecture search to train a CNN that respects mobile phone resource constraints, using millions of pairs of 3D images and their accompanying depth maps.
EfficientDet: Scalable and Efficient Object Detection, Google Brain. The authors present several optimizations to neural architecture search to develop the EfficientDet family of object detector models. Their best model is more accurate than the next best on the COCO dataset while using 4x fewer parameters and 13x fewer FLOPs.
Predicting the future, Wayve.ai. This is a cool paper that addresses the biggest problem in self-driving: predicting how a given scene will evolve and planning accordingly. The approach involves learning a model of the probability of future events, trained on observed future sequences. They then learn a second distribution to reflect the present world, which only has access to past data. During inference, they jointly predict future scene representations (semantic segmentation, depth, and optical flow).
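A rough sketch of the two-distribution trick, assuming diagonal Gaussian latents and stub feature encoders (all names and dimensions below are illustrative, not Wayve's code): the future head is conditioned on features that include the observed future, the present head sees past features only, and a KL term pulls the present distribution towards what actually happened. At inference time only the present head exists, so you sample from it.

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Map a feature vector to the mean/log-variance of a diagonal Gaussian."""
    def __init__(self, dim, latent):
        super().__init__()
        self.mu = nn.Linear(dim, latent)
        self.logvar = nn.Linear(dim, latent)
    def forward(self, h):
        return self.mu(h), self.logvar(h)

def kl_future_present(mu_f, lv_f, mu_p, lv_p):
    """KL(future || present) between two diagonal Gaussians."""
    return 0.5 * (lv_p - lv_f + (lv_f.exp() + (mu_f - mu_p) ** 2) / lv_p.exp() - 1).sum()

feat_dim, latent = 128, 32
present = GaussianHead(feat_dim, latent)   # conditioned on past features only
future = GaussianHead(feat_dim, latent)    # conditioned on past + future features
h_past, h_full = torch.randn(1, feat_dim), torch.randn(1, feat_dim)  # stub encoders

mu_p, lv_p = present(h_past)
mu_f, lv_f = future(h_full)
kl_term = kl_future_present(mu_f, lv_f, mu_p, lv_p)  # added to the prediction loss

# Inference: sample the latent from the present distribution and decode it
# into future segmentation / depth / flow (decoder omitted here).
z = mu_p + torch.randn_like(mu_p) * (0.5 * lv_p).exp()
```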
Reinforcement learning
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction, Google Brain. As the next step in their work on learning world models, the authors test whether RL agents can learn a world model through a messier, slower process of evolution instead of by minimizing a forward-predictive loss. To do so, they artificially constrain the probability that an agent is allowed to observe its real environment at each training step. As a result, the agent has to fill in its observation gaps to build out its world model. Even though the agent was never explicitly trained to predict the future, the resulting world model gives it the key skills it needs to succeed in its environment.
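Schematically, the constraint amounts to "observational dropout": at each step the agent sees the real observation only with a small probability, and otherwise must continue from its own world model's prediction. A minimal sketch with stub environment, model, and policy functions (peek_prob is our name for the paper's observation probability):

```python
import numpy as np

def rollout(env_step, world_model, policy, peek_prob=0.1, horizon=200):
    """Observational dropout: the policy only ever acts on the world model's
    state; reality leaks in with probability peek_prob per step."""
    real_obs = np.zeros(4)            # stub initial observation
    model_obs = real_obs.copy()
    for _ in range(horizon):
        action = policy(model_obs)
        real_obs = env_step(real_obs, action)       # true dynamics
        pred_obs = world_model(model_obs, action)   # model's one-step guess
        # Peek at reality occasionally; otherwise continue from the model's
        # own prediction, forcing it to fill in the observation gaps.
        model_obs = real_obs if np.random.rand() < peek_prob else pred_obs
    return model_obs

# Trivial stubs just to show the plumbing:
rollout(lambda s, a: s + 0.1 * a, lambda s, a: s, lambda s: np.ones_like(s))
```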
Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video, University of Freiburg. Designing reward functions that enable agents to learn desired behaviors is challenging in the real world. Unsupervised learning can help. This paper presents an approach to learn a task-agnostic skill embedding space from unlabeled multiview videos. They show that the learned embedding can guide an RL agent to solve a wide range of tasks by composing previously unseen skills.
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference, Google Research. The authors propose a training architecture to increase the number of frames per second that an RL agent can learn from on a given amount of computing resources. Here, a central learner trains the model and performs policy inference on accelerators, while actors distributed across hundreds of machines step their environments and send observations back. The result is a significant speed-up in wall-clock time and computational efficiency over existing methods like IMPALA that run neural network inference on the actors' CPUs.
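The architectural shift is simple to caricature: rather than every actor running a policy copy on its own CPU, actors ship observations to a central server that batches them into one accelerator-friendly forward pass. A toy illustration using Python threads and queues (the real system uses gRPC and TPUs; the stub policy and environment below are ours):

```python
import queue
import threading
import numpy as np

NUM_ACTORS, OBS_DIM, STEPS = 8, 4, 10
obs_q = queue.Queue()                                # actors -> learner
act_qs = [queue.Queue() for _ in range(NUM_ACTORS)]  # learner -> each actor
W = np.random.randn(OBS_DIM, 2)                      # stub policy weights

def actor(i):
    """Actor: step a stub environment and wait for the central policy."""
    obs = np.zeros(OBS_DIM)
    for _ in range(STEPS):
        obs_q.put((i, obs))
        action = act_qs[i].get()         # block until central inference replies
        obs = np.random.randn(OBS_DIM)   # stub environment transition

def central_inference():
    """Learner: batch observations from all actors into ONE forward pass."""
    for _ in range(STEPS):
        batch = [obs_q.get() for _ in range(NUM_ACTORS)]
        ids, obs = zip(*batch)
        actions = np.argmax(np.stack(obs) @ W, axis=1)  # batched on accelerator
        for i, a in zip(ids, actions):
            act_qs[i].put(a)

threads = [threading.Thread(target=actor, args=(i,)) for i in range(NUM_ACTORS)]
for t in threads:
    t.start()
central_inference()
for t in threads:
    t.join()
```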
Systems and methods
Overton: A Data System for Monitoring and Improving Machine-Learned Products, Apple. This paper describes an internal ML system at Apple that automates the life cycle of model construction, deployment, and monitoring by providing a set of novel high-level, declarative abstractions for developers so they don’t need to write TensorFlow code.
Causal Discovery from Incomplete Data: A Deep Learning Approach, Eindhoven University of Technology and MIT. This paper addresses the problem of learning causal structure when some data is missing. The authors propose a deep learning framework called Imputated Causal Learning (ICL) that performs iterative missing-data imputation and causal structure discovery, producing both imputed data and causal skeletons. Simulations on both synthetic and real data show that ICL can outperform state-of-the-art methods under different missing-data mechanisms.
Uber ATG’s ML infrastructure and version control platform for self-driving vehicles, Uber. A nice description of how Uber handles the training and deployment of ML models, with particular focus on the DevOps side of things using a tool called VerCD. This lets Uber track and manage versioned dependencies for ML artefacts through automated continuous delivery.
Science (bio, health, etc.)
A Deep Learning Approach to Antibiotic Discovery, MIT and Harvard. This paper is exciting because it demonstrates how deep learning on molecular graphs can be used to predict molecules with antibacterial activity. They use this approach to screen a large pool of chemical molecules and discover a molecule, Halicin, that is structurally divergent from conventional antibiotics. Halicin displays bactericidal activity against a wide phylogenetic spectrum of pathogens including the bacterium that causes tuberculosis.
Learning to grow: control of materials self-assembly using evolutionary reinforcement learning, Lawrence Berkeley National Lab and Vector Institute. The authors study molecular self-assembly, a process by which molecules or nanoparticles naturally come together into ordered structures. Today, if we’re given a set of molecules, conditions and a time period, it is not possible to predict the structure, phase, and yield of the structures that will form as a result. This paper shows how neuroevolutionary RL can learn a network that can “enact a time-dependent protocol of temperature and chemical potential in order to promote the self-assembly of the desired structure or choose between two competing polymorphs. In both cases the network identifies strategies different from those informed by human intuition, but which can be analyzed and used to provide new insight.”
Detection of anaemia from retinal fundus images via deep learning, Google Health and Google Research. Anemia manifests as a reduction in red blood cell or hemoglobin count, and the condition affects an estimated 1.6B people worldwide. Testing for it currently requires a blood draw. This paper shows that anemia can be regularly screened for using non-invasive retinal fundus images.
Machine learning on DNA-encoded libraries: A new paradigm for hit-finding, Google Applied Science, X-Chem, ZebiAI, and Cognitive Dataworks. This paper is great. It focuses on DNA-encoded libraries (DELs), a powerful technique for large-scale screening of small molecules in drug discovery. The technique works by barcoding chemical molecules, mixing them all together with a drug target of interest (e.g. a protein), then deconvoluting which molecules bound to the target by using next-generation sequencing and barcode counting. The paper throws ML into the mix by training a graph CNN on round 1 of a DEL experiment to identify which molecules bind to a target. The model is then used to virtually screen large libraries (approx. 88M compounds) and predict which molecules are worth empirically testing in the next round of DEL experiments. The authors report hit rates between 29% and 72%, compared to 1% hit rates in non-ML-guided DEL experiments. Air Street Capital has made an investment in a related company called Anagenex.
Unified rational protein engineering with sequence-based deep representation learning, Harvard and MIT. This paper applies representation learning techniques from NLP to proteins. The authors train LSTMs on approx. 24 million unlabelled amino acid sequences to learn statistical representations of proteins. The model summarises arbitrary protein sequences into fixed-length vectors that approximate fundamental protein features (function, stability, secondary structure). They show that these representations can be used to predict the structural and functional properties of proteins. While solved 3D protein structures remain the gold standard for developing new proteins, this approach should help accelerate things! Code here.
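As a rough sketch of the idea (not the paper's exact architecture, which trains a multiplicative LSTM at much larger scale), a recurrent network can summarise a variable-length amino acid sequence into a fixed-length vector by pooling its hidden states:

```python
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"                 # 20 canonical residues
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

class SequenceEmbedder(nn.Module):
    """Map an amino acid string to a fixed-length vector by average-pooling
    LSTM hidden states (dimensions here are illustrative)."""
    def __init__(self, embed_dim=64, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(len(AMINO_ACIDS), embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, seq):
        ids = torch.tensor([[VOCAB[aa] for aa in seq]])   # (1, L)
        hidden, _ = self.lstm(self.embed(ids))            # (1, L, hidden_dim)
        return hidden.mean(dim=1).squeeze(0)              # (hidden_dim,)

vec = SequenceEmbedder()("MKTAYIAKQR")   # same-sized vector for any length
```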
International evaluation of an AI system for breast cancer screening, Google Health et al. This paper reports a large-scale screening mammography trial in the US and UK. It shows that deep learning models can predict biopsy-confirmed cancer cases and deliver an absolute reduction in false positives and false negatives. This could reduce the workload of a second reviewer (in the UK’s two-reviewer system) by 88%. After its release, the paper drew criticism from some physicians, who argued that predicting biopsy-confirmed cancer isn’t the point of screening, which is to find more curable cancers. Further criticism centred on the absence of a detailed methods section and open-source code, both impediments to reproducibility and transparency. Another paper, from an NYU group and published around the same time, evaluated CNNs on over 1M breast cancer images. The code and trained models are available here.
Learning to Simulate Complex Physics with Graph Networks, DeepMind. This paper shows how to learn a realistic simulator of complex physics. This is done by representing the state of a physical system with particles, expressed as nodes in a graph, and computing dynamics via learned message-passing. They show how this model accurately simulates fluids, rigid solids, and deformable materials interacting with one another.
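The data flow is easy to caricature: connect nearby particles with edges, compute a message along each edge, aggregate messages per particle, and decode an acceleration. In the sketch below, random matrices stand in for the learned edge and node MLPs, so the dynamics are meaningless but the encode-process-decode pattern matches the paper:

```python
import numpy as np

def simulate_step(pos, vel, radius=0.1, dt=0.01):
    """One schematic step of a learned particle simulator: message passing
    over a radius graph, then a per-particle update. Random matrices stand
    in for the learned edge/node MLPs."""
    n, d = pos.shape
    W_edge = np.random.randn(2 * d, d) * 0.01
    W_node = np.random.randn(2 * d, d) * 0.01
    messages = np.zeros((n, d))
    for i in range(n):
        for j in range(n):
            if i != j and np.linalg.norm(pos[i] - pos[j]) < radius:
                edge_in = np.concatenate([pos[i] - pos[j], vel[j]])
                messages[i] += edge_in @ W_edge               # message j -> i
    accel = np.concatenate([vel, messages], axis=1) @ W_node  # node decoder
    vel = vel + dt * accel
    return pos + dt * vel, vel

pos, vel = np.random.rand(50, 2), np.zeros((50, 2))
pos, vel = simulate_step(pos, vel)
```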
A Survey of Deep Learning for Scientific Discovery, Google. This is an overview of “many widely used deep learning models, spanning visual, sequential and graph-structured data, associated tasks and different training methods, along with techniques to use deep learning with fewer data and better interpret these complex models --- two central considerations for many scientific use cases.”
Other papers and posts
A review of several cool NeurIPS 2019 papers here.
Deep learning learns to solve math problems here.
Graphcore Research directions in 2020 here.
Reliance on metrics is a fundamental challenge in ML here.
On the Measure of Intelligence here. This work argues that task-based, metric-driven development in AI is not a rigorous path towards developing intelligent systems.
💰 Venture capital financings and exits
Here’s a highlight of the most intriguing financing rounds:
Automation Anywhere, a US-based provider of enterprise robotic process automation software, raised a $290M Series B round at a post-money valuation of $6.8B. Like others in the market, AA is pushing its marketplace ecosystem of RPA bots and third-party integrations to drive vendor lock-in.
Berkshire Grey, makers of warehouse robots and software focused on automating pick/place/parcel movement, raised a $263M Series B led by SoftBank.
Accel Robotics, a San Diego-based provider of checkout-free stores, raised a $30M Series A from SoftBank.
AMP Robotics, which produces recycling sorting robots, raised a $16M Series A led by Sequoia.
Thought Machine, a London-based provider of cloud-native core banking software, raised an $83M Series B led by Draper.
Observe.ai, a US/Indian startup offering sales call analysis and coaching, raised a $26M Series A.
Relatedly, CallMiner, a US-based call center agent voice training system, raised a $75M round led by Goldman Sachs.
Neolix, a China-based maker of AV delivery robots in the style of Nuro.ai, raised a 200M RMB Series A.
Graphcore, the Bristol-based developer of the Intelligence Processing Unit, raised an additional $150M to add to its $200M Series D closed last year.
Hugging Face, authors of a leading open-source NLP library called Transformers, raised a $15M Series A led by Lux Capital.
Pachama, an AI-first climate-focused startup, raised a $4.1M Seed round.
Deep Genomics, an AI-first therapeutics discovery company, raised a $40M Series B.
K Health, which develops a symptom checking app with telehealth services, raised a $48M Series C.
SambaNova Systems, makers of a specialized AI chipset, raised a $250M Series C led by BlackRock.
Soul Machines, which develops super realistic human avatars, raised a $40M Series B.
Five, the London-based self-driving software company, raised a $41M Series B to go to market with B2B software products instead of offering a B2C self-driving ride-sharing service.
Behavox, a London-based employee compliance monitoring software company, raised a $100M round led by SoftBank Vision Fund 2.
Hailo, the Israeli AI chipmaker focused on edge computing, raised a $60M Series B.
M&A deals:
Intel acquired AI chipmaker Habana for $2B; Habana will remain an independent unit.
DataRobot acquired 100-person data preparation startup Paxata to eat earlier steps of the ML pipeline. No acquisition price was reported. Paxata had raised some $90M in venture financing.
Waymo expanded its engineering effort into Oxford, UK by acquiring Latent Logic for an undisclosed sum. The startup was led by Shimon Whiteson, Professor at the University of Oxford, whose work spans multi-agent systems and inverse RL on video data to learn safe driving. This should help Waymo develop human-inspired driving behaviors (more on this from Lyft here).
Snap acquired AI Factory, a computer vision startup it had been working with to create Snap’s Cameos feature, for $166M. AI Factory’s founder had previously built and sold Looksery to Snap in 2015, which kick-started Snap’s facial filter features.
Apple acquired a few companies:
The big deal was for Seattle-based Xnor.ai, which was acquired for a reported $200M. Xnor.ai was building low-power, edge-based AI chips. It spun out of the Allen Institute for AI and was led by Ali Farhadi, Associate Professor at the University of Washington.
Dark Sky, a hyperlocal weather app, was acquired. The app will survive on iOS but is shut down on Android and elsewhere. Users aren't happy. More on how the system works.
Voysis, an Irish speech and NLP startup that had generated press attention around speech synthesis and making WaveNet work on a very small footprint, was acquired.
DocuSign acquired Seal Software for $188M in cash. Seal’s product was an AI-driven contract analysis tool that makes it simpler and faster to find, analyze, and extract data from contracts.
Square acquired Toronto-based Dessa for an undisclosed sum. The team will expand Square’s efforts to infuse ML across its product lines.
---
Signing off,
Nathan Benaich, 5 April 2020
Air Street Capital | Twitter | LinkedIn | RAAIS | London.AI
Air Street Capital is a venture capital firm that invests in AI-first technology and life science companies. We’re a team of experienced investors, engineering leaders, entrepreneurs and AI researchers from the world’s most innovative technology companies and research institutions.