☀️Your guide to AI in Q1 2019, by nathan.ai
Hello from London 🇬🇧! I’m Nathan Benaich from Air Street Capital. We invest in entrepreneurs who build intelligent systems to solve problems for both enterprises and consumers.
Welcome to the Q1 2019 edition of my AI newsletter! I’ll synthesise a narrative that analyses and links important news, data, research and startup activity from the AI world. Grab your beverage of choice ☕ and enjoy the read!
Before we start:
📣Hit reply if you’d like to chat about building AI-first products, new research papers, interesting market opportunities or if you’re considering a career move in the startup world.
🤓 Back by popular demand, Ian Hogarth and I will be publishing the 2019 Edition of the State of AI Report on 28th June 2019. If you’d like to suggest high-profile research, companies, products or trends for inclusion in the Report, please hit reply.
🎫 RAAIS 2019 is now 6 weeks away! Here is a YouTube live link to follow all of the talks online. We’re fortunate to host speakers from Graphcore, Recursion, Zymergen, Uber, Stanford, Berkeley, DeepMind, Google, PROWLER.io, LabGenius, Broad Institute, Partnership on AI and Dropout Labs. This event supports the non-profit RAAIS Foundation’s work in funding common good AI research, open source projects and education initiatives.
Referred by a friend? Sign up here. Help share by giving it a tweet :)
🆕 Technology news, trends and opinions
🚗 Autonomous everything
Aurora is the latest autonomous vehicle startup unicorn after its $530M Series B financing. What’s interesting is that Amazon corporate has invested in the round, potentially as a means to seal a partnership to procure an autonomous technology stack for their logistics. Aurora also released a report on their safety protocols, standards and development practices.
Instead of skipping over Level 3, Ford has adjusted its strategy to take the stepping-stone approach.
Amazon announced field trials of their ground delivery robot, Scout. Astute observers will notice a strikingly similar design to Starship Technologies’ delivery robot. In fact, Amazon acquired a startup called Dispatch.ai in 2017, which was itself inspired by and designed to replicate Starship, which had launched in Europe a few years prior. Amazon is also beginning to roll out automated packing robots that operate on factory conveyor belts. These are four or five times faster than a human packer, processing up to 700 orders an hour. Amazon also launched SageMaker Neo, which helps developers train a model once and then output optimised code for various edge hardware substrates.
Both Uber and Lyft are now public companies, much to the delight of existing investors who (especially for Uber) made extraordinary returns on early investments. Before going public, news leaked that Uber was spending $20m a month to sustain its self-driving organisation. After talks of Uber’s self-driving unit raising $1B from SoftBank and Toyota, the deal did end up closing pre-IPO. This new self-driving company, sitting within the Uber group, has its own board of directors with representation from Uber and investors and is valued at $7.25B.
Cruise raised another $1.15B of capital from T. Rowe Price Associates and existing investors, SoftBank Vision Fund, Honda and GM. It’s quite clear now that having a shot at building self-driving services requires several billion dollars as table stakes.
Meanwhile, Apple axed 200 employees from its Project Titan self-driving team.
Tesla ran a live broadcast centred around its self-driving technology capabilities. The event was a bold move (kudos to them, honestly) during which Elon and team leads presented their approach to self-driving and fielded questions from public investment analysts. The most interesting part to me was how Tesla uses its fleet infrastructure to a) encounter odd out-of-sample events, b) call on the wider fleet to report back with similar footage and c) use this footage to update its Autopilot systems. The company didn’t showcase any simulation work to round out these edge cases, however. Tesla put a lot of emphasis on their in-house designed silicon for self-driving, which removes its dependency on the prior NVIDIA system (response from them here). This shows two things: a) full stack companies don’t like third-party dependencies on potentially competitive providers and b) custom silicon is really a thing in the age of ML. A few weeks later, Tesla reported a third fatal crash that occurred only 10 seconds after the driver engaged the Autopilot system.
Even Blackberry and the Canadian government are throwing their million dollar hats into the self-driving software ring with a $350M budget to catalyse the development of technology built in Canada.
Waymo introduced a new in-house built LiDAR with 95-degree vertical field of view and up to a 360-degree horizontal field of view.
💪 The giants
Microsoft made several AI-related announcements at their Build conference in Seattle. I attended the event and sat down with David Carmona, GM for Cloud and Enterprise AI, and Lance Olson, Partner Director of Program Management for the Azure AI Platform. Through these conversations, I learned two main strategy points. Firstly, Microsoft is really doubling down on enabling the business user to make use of ML features in their workflows. Here, data scientists (and Azure ML itself) are creating a growing number of ML models that are published through Office 365 and Power BI products so that the huge user base of business users can make use of them in these tools. For example, think about using a fraud detection model directly in Excel. Azure’s ML focus is also on enabling business users to build their own models for their data without relying too much on engineering resources. Secondly, Microsoft’s cloud ML value proposition is built around enabling users to create, train and containerise models so they can export and run them in whatever environment and infrastructure they choose. This means no vendor lock-in on Microsoft (whereas AWS and GCP don't let you export their services). This makes management teams more comfortable with data privacy, ownership and security. The company also published several features that fit nicely into the robotic process automation field, such as Form Recognizer, which is an unsupervised learning based data extraction API that only requires seeing 4 examples of a form to work.
Cloudera and Hortonworks seal their merger in a bid to consolidate their offering for enterprise-grade AI readiness. Using their tools, one can capture, store, manage and analyze data, as well as train and serve machine learning models.
Google shared loads of announcements at I/O a few days ago. ML has made it into battery management on the Pixel 3 smartphone; ML is being used to pinpoint the accurate location of a Google Maps user using the camera (not dissimilar to startups such as Scape or Blue Vision Labs, now Lyft Level 5); the new Google Assistant will ship on-device so it can run inference offline; and ML Kit now offers on-device translation between 59 languages as well as on-device object detection and tracking. Google has also implemented an entirely neural on-device speech recogniser as input for Gboard. As a company built around monetising predictions, Google is staying true to its goal of injecting ML into as many of its current and future products as possible.
Separately, Google announced a new AI Ethics Board, which included professors, scientists, and a former US deputy secretary of state. Very shortly thereafter, the Board was dissolved because of concerns over Board members’ political views and the extent to which they would actually be able to scrutinise Google’s work. This is the latest development in a series of ethics oversight challenges at Google, DeepMind and OpenAI.
OpenAI has switched its corporate structure away from being a non-profit to a new “capped-profit” company such that it can raise billions of dollars to invent general intelligence. Here, OpenAI pitches investors a return capped at 100x their investment if the organisation succeeds at this goal.
Facebook open sourced a general-purpose platform for managing, deploying and automating AI experiments, as well as a tool to run Bayesian hyperparameter optimisation. More on their F8 announcements here and an NYT piece on the company’s significant efforts to clamp down on bad actors here.
Here are some new details about Facebook’s ML hardware infrastructure. The company has made it no secret that it is designing custom chips for inference workloads.
Boston Dynamics showcased videos of their robots being repurposed for warehouse pick and place problems here, in addition to the product of its latest acquisition, Kinema Systems.
Google has rebooted its work on robotics via a group called Robotics at Google. A profile piece ran in the NYT on this topic.
On the topic of robots, data compiled by the Robotic Industries Association shows that US factories received 35,880 robots last year, 7% more than in 2017. The largest number are in the automotive components sector.
Here’s a new interview with Nigel Toon of Graphcore, which dives into a few of the technical specs and strategies for the company.
🏥 Healthcare and life science
A wave of pharma and biotech companies have signed co-development agreements with AI-based drug discovery startups:
Atomwise partnered with a large contract research organisation, Charles River Laboratories, to support their hit discovery and hit to lead development process. If they’re successful, Atomwise could net $2.4B in royalties over the next few years. Big bucks!
Exscientia partnered with Celgene, a pharma company focused on cancer and inflammatory disease. This deal offers an initial $25M upfront payment and the promise of milestone and royalty payments upon success.
LabGenius signed a two-year partnership with Tillotts Pharma to identify and develop new drug candidates for the treatment of inflammatory bowel diseases (IBD), such as Crohn’s disease.
Insitro entered into a three-year collaboration with Gilead to create disease models for nonalcoholic steatohepatitis (NASH), a chronic form of liver disease that has limited treatment options and can result in cancer. Insitro will receive an upfront payment of $15M, with additional near-term payments up to $35M based on operational milestones. Insitro will be eligible to receive up to $200M for the achievement of preclinical, development, regulatory and commercial milestones for each of the five potential Gilead targets.
Schrodinger, a market leading molecular simulation company founded in the 1990s, has raised $110M from Bill Gates, D.E. Shaw and GV to vertically integrate its drug development efforts. In the past, two cancer drugs have been discovered by its customers using Schrodinger software. Both have gone on to win FDA approval. The new financing sees Schrodinger leverage its software to run its own development programs.
Meanwhile, IBM Watson is pulling their product for drug discovery, citing sluggish revenue growth.
🇨🇳 AI in China
Here is an overview of AI semiconductor work in both small and large Chinese companies.
Here is a look at the extent to which Baidu, Alibaba, Tencent and Huawei are entrenched leaders of China’s AI landscape across infrastructure, technology tools and applications.
Analysis of 2 million publications up to 2018 run by the Allen Institute for AI in Seattle showed that while China has already “surpassed the US in number of published AI papers, the country’s AI researchers are poised to be in the top 50% of most cited papers this year and in the top 10 per cent next year”. The results show that the US share of citations in the top 10% of AI papers has declined gradually from 47% in 1982 to 29% last year. China, on the other hand, has risen to over 26% of citations in 2018.
Geopolitical tensions between the US and China don’t seem to be abating. This piece outlines how intertwined Chinese capital is in the US tech and venture market. What’s more, President Trump has blacklisted Huawei and 70 of its affiliates because they are deemed to be threats to national security. This means US companies are blocked from using or purchasing Huawei equipment. As a result, suppliers to Huawei are scrambling to understand the repercussions to their own businesses.
AI around the 🌍
Finland was the first European country to put a national AI strategy in place (back in October 2017). What began as a free-access university course is now being scaled nationally to 55,000 people in partnership with government and companies. For example, technology companies Elisa and Nokia said they would train their entire workforce to be literate in AI. The Economy Minister Mika Lintilä pledged that Finland will become a world leader in practical applications of AI.
🔮 Where AI is heading next
AutoML refers to the overall problem of automating otherwise manual steps in ML architecture, modelling and parameter training. For Google, the emphasis is on searching model architecture space to test and converge on new structures that improve model performance and other variables such as latency. For Microsoft, the emphasis is on model selection depending on the task and input data. For others, AutoML is about hyperparameter tuning. A recent post by Waymo described how their team used Google’s AutoML to automatically design CNN architectures using pre-trained cells for the task of semantic segmentation on LiDAR data (i.e. a transfer learning-based method). Compared to manually-designed networks, the AutoML outputs showed significantly lower latency with a similar quality or even higher quality with similar latency. Next, the team designed a proxy segmentation task that could be rapidly computed and applied either random search or reinforcement learning to conduct end-to-end search while optimising for network quality and latency. Using the proxy task meant exploring over 10,000 candidate architectures over two weeks on a TPU cluster. The graph below shows thousands of individual resulting architectures from random search (green dot/line) compared to the prior transfer learning architecture (red dot), random search on a refined set of architectures (yellow dot/line) and reinforcement learning-based search (blue dot/line). What’s interesting is that the architectures discovered using RL-based search exhibited 20–30% lower latency at the same quality as those developed manually. They also yielded models with an 8–10% lower error rate at the same latency as the previous architectures. Separately, neural architecture search was recently used to improve the hand-designed Transformer architecture.
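To make the random search idea above a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the search space, the `proxy_evaluate` scoring formula and the function names are made up and are not Waymo's or Google's actual setup. It samples candidate architectures, scores each on a cheap stand-in for the proxy task, and keeps the candidates that trade off error against latency best.

```python
import random

# Hypothetical search space: these knobs and values are illustrative only.
SEARCH_SPACE = {
    "depth": [2, 4, 6, 8],
    "width": [16, 32, 64, 128],
    "kernel": [1, 3, 5],
}

def sample_architecture(rng):
    """Draw one candidate by picking a value for every knob."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def proxy_evaluate(arch):
    """Made-up stand-in for training on a cheap proxy task.

    Returns (error, latency): bigger networks get lower error but higher latency.
    """
    capacity = arch["depth"] * arch["width"] * arch["kernel"]
    error = 1.0 / (1.0 + capacity / 100.0)
    latency = capacity / 50.0
    return error, latency

def pareto_front(results):
    """Keep candidates that no other candidate beats on both error and latency."""
    front = []
    for arch, err, lat in results:
        dominated = any(
            (e <= err and l <= lat) and (e < err or l < lat)
            for _, e, l in results
        )
        if not dominated:
            front.append((arch, err, lat))
    return front

def random_search(num_candidates, seed=0):
    """Sample candidates at random, score them, and return the trade-off frontier."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_candidates):
        arch = sample_architecture(rng)
        err, lat = proxy_evaluate(arch)
        results.append((arch, err, lat))
    return pareto_front(results)

front = random_search(200)
for arch, err, lat in sorted(front, key=lambda r: r[2]):
    print(arch, round(err, 3), round(lat, 2))
```

A reinforcement learning-based search would replace `sample_architecture` with a learned controller that is rewarded for good quality and latency. The evaluation loop would stay the same.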
Massive language models: OpenAI stirred up quite a storm in the AI world when they refrained from publishing a huge Transformer-based unsupervised language model called GPT-2. Trained with text data scraped from 8 million webpages, GPT-2 has 1.5 billion parameters that allow it to predict the next likely word in a sentence to generate pretty coherent sentences. It can perform well on question answering, reading comprehension, summarization, and translation without being trained with task-specific data. GPT-2 can also invent stories about talking unicorns :-/ OpenAI made a model slightly larger than the one published online available to select researchers and journalists. The system is exposed here for the public to play with. OpenAI is concerned about malicious applications of the full-scale model and has experimented with self-censorship in a research ecosystem that is used to radical transparency. More on this here. For more on how the Transformer architecture works, have a read of this blog post and this one too. At RAAIS 2019, we’re delighted to host Ashish Vaswani of Google AI, who is the first author on the paper that proposed the Transformer architecture, Attention is all you need.
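The "predict the next likely word, then repeat" loop described above can be sketched in a few lines of Python. To be clear, this is not a Transformer and has nothing to do with GPT-2's actual code: it swaps in a simple bigram count model (an assumption made purely to keep the example tiny) and only illustrates the autoregressive generation loop. The corpus and function names are made up.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_words=5):
    """Greedy autoregressive decoding: append the most likely next word, repeat."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never seen with a follower
        next_word = followers.most_common(1)[0][0]
        words.append(next_word)
    return " ".join(words)

# Tiny made-up corpus, purely for illustration.
corpus = "the unicorn spoke english . the unicorn spoke softly . the unicorn spoke english"
model = train_bigram(corpus)
print(generate(model, "the", max_new_words=3))  # prints "the unicorn spoke english"
```

A model like GPT-2 conditions on the whole preceding context rather than just the last word, and typically samples from the predicted distribution instead of always taking the top word.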
Decentralised ML: The current ML paradigm posits that organisations should centralise their data as much as possible in order to facilitate the training and deployment of ML-based products. This certainly makes a lot of sense. Notwithstanding this trend, a countercurrent is emerging: decentralisation of ML. Here, the thesis is that developers should push the model training to where the data is created and lives using federated learning, rather than the other way around. Google is really pushing the envelope on this thread. The company has open sourced TF-Federated, a TensorFlow-based framework that implements federated learning, an approach that enables many participating clients to train shared ML models while keeping their data local. Here is a talk from I/O on the subject. I’m very excited to track progress in this space. At RAAIS 2019, we’re running a fascinating session on privacy-preserving ML with the pioneer of federated learning, Brendan McMahan, the lead of OpenMined, Andrew Trask, the lead of TF-Encrypted, Morten Dahl, and the Director of Research at Partnership on AI, Peter Eckersley.
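For readers who want a feel for "push the training to where the data lives", here is a minimal sketch of federated averaging in plain Python. It does not use TF-Federated or any real library API. The one-dimensional linear model, the learning rate, the client datasets and all function names are assumptions chosen to keep the example self-contained. Each client trains locally on its own data and only the resulting model weights are sent back and averaged.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run a few gradient steps on one client's private data (model: y = w*x + b)."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by how much data each holds."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def federated_training(clients, rounds=50):
    """Repeat: broadcast the global model, train locally, average the results."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights

# Each hypothetical client holds its own samples of y = 2x + 1.
# Only model weights are exchanged; the (x, y) pairs stay on the client.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]
w, b = federated_training(clients)
print(round(w, 3), round(b, 3))
```

Production systems layer more on top of this loop, such as client sampling, secure aggregation and compression of the updates.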
Encoding vs learning knowledge: Rich Sutton’s essay, The Bitter Lesson, posits that “Search and learning are the two most important classes of techniques for utilizing massive amounts of computation in AI research.” He shares examples from several task domains where best-in-class models took advantage of massive compute to learn to problem solve. For example, in Go researchers' “initial effort was directed towards utilizing human understanding (so that less search was needed) and only much later was much greater success had by embracing search and learning.” Indeed, he argues that we want “AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done.” This argument was well received by the deep learning crowd but countered by those who believe in encoding primitive knowledge of the world to help agents learn the more complex tasks faster.
Here’s a selection of impactful work that caught my eye:
Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network, Stanford University, UCSF, iRhythm Technologies. This study showed that single lead electrocardiogram traces in the ambulatory setting can be processed in a raw format by a deep learning model to detect 12 rhythm classes. The neural network system was trained on data from 54,000 patients. It achieved an average area under the ROC curve of 0.97 and, with specificity fixed at the average specificity of cardiologists, the network was more sensitive for all rhythm classes. It remains to be seen if this approach works on multi-lead ECGs, which are more common in the clinic.
Towards reconstructing intelligible speech from the human auditory cortex, Columbia University, Hofstra Northwell School of Medicine, The Feinstein Institute for Medical Research. Researchers at Columbia used invasive electrocorticography to measure neural activity in 5 patients undergoing treatment for epilepsy while they listened to continuous speech sounds. Inverting the mapping from speech to neural activity enabled the researchers to synthesize speech through a vocoder from brain activity. The system achieved 75% accuracy when tested on single digits ‘spoken’ via a vocoder. The deep learning method improved the intelligibility of speech by 65% over the baseline linear regression method. The research indicates the potential for brain-computer interfaces to restore communication for paralysed patients.
See, feel, act: Hierarchical learning for complex manipulation skills with multisensory fusion, MIT. This paper brings tactile reasoning into robotic manipulation. The authors use the Jenga game as a testbed to develop a methodology that emulates hierarchical reasoning and multisensory fusion. The abstract pretty much says it all: “The game mechanics were formulated as a generative process using a temporal hierarchical Bayesian model, with representations for both behavioral archetypes and noisy block states. This model captured descriptive latent structures, and the robot learned probabilistic models of these relationships in force and visual domains through a short exploration phase. Once learned, the robot used this representation to infer block behavior patterns and states as it played the game. Using its inferred beliefs, the robot adjusted its behavior with respect to both its current actions and its game strategy, similar to the way humans play the game. We evaluated the performance of the approach against three standard baselines and show its fidelity on a real-world implementation of the game.”
Approximating CNNs with bag-of-local features models works surprisingly well on ImageNet, University of Tübingen. This is a pretty cool paper. It tests the hypothesis that state-of-the-art CNN models (e.g. ResNet50) actually use relatively simple methods to perform well on ImageNet recognition tasks. They build a simple variant of the ResNet50 architecture called BagNet, which classifies an image based on the occurrences of small local image features without taking into account their spatial ordering. In this way, BagNet is similar to bag-of-word models for NLP tasks, where a combination of words in a sentence can be used to classify its topic, for example. For the BagNet experiment, a test image is split into small image patches, each of which is passed through a CNN to get class evidence for that patch. Then, the class evidence is summed over all patches to reach an image-level decision. Interestingly, image features of size 17 x 17 pixels are enough to reach AlexNet-level performance while features of size 33 x 33 pixels are sufficient to reach around 87% top-5 accuracy. Higher performance values might be achievable with more careful placement of the 3 x 3 convolutions and additional hyperparameter tuning. You can read more via the blog post here.
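The patch-then-sum recipe in the BagNet summary above can be sketched as follows. This is a toy under loud assumptions: the image is a 4 x 4 grid of brightness values, the patches do not overlap, and `patch_logits` is a made-up hand-written scorer standing in for the CNN the paper uses. The point is only to show that summing per-patch class evidence ignores where each patch came from.

```python
import random

def extract_patches(image, size):
    """Split a 2-D image (list of rows) into non-overlapping size x size patches."""
    patches = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches

def patch_logits(patch):
    """Hypothetical patch classifier: evidence for the classes (dark, bright)."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    return [0.5 - mean, mean - 0.5]

def classify(patches):
    """Sum class evidence over all patches, ignoring each patch's position."""
    totals = [0.0, 0.0]
    for patch in patches:
        for k, logit in enumerate(patch_logits(patch)):
            totals[k] += logit
    return totals.index(max(totals)), totals

# Made-up 4 x 4 "image": three bright quadrants and one dark one.
image = [
    [0.9, 0.8, 0.1, 0.2],
    [0.7, 0.9, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.7],
    [0.6, 0.8, 0.8, 0.9],
]
patches = extract_patches(image, size=2)
label, totals = classify(patches)

# Shuffling the patches scrambles the spatial layout but not the decision.
shuffled = patches[:]
random.Random(0).shuffle(shuffled)
shuffled_label, _ = classify(shuffled)
print(label, shuffled_label)
```

Because the decision is a plain sum, the contribution of every patch to the final class can be read off directly, which is a large part of what makes this style of model easy to inspect.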
Learning to follow directions in Street View, DeepMind. In this paper, the authors describe a new language grounding and navigation task that uses realistic images of real places, together with real (if not natural) language-based directions for this purpose. Here, agents must learn how to navigate visually accurate environments of real places using textual directions. The agents are given driving instructions which they must learn to interpret in order to successfully navigate in this environment.
Learning latent plans from play, Google Brain. As children, we acquire complex skills and behaviors by learning and practising diverse strategies and behaviors in a low-risk fashion called “play”. This work proposes play-supervised robotic skill learning to yield robotic control that is more robust to perturbations than if trained using expert skill-supervised demonstrations. Here, a human remotely teleoperates the robot in a playground environment, interacting with all the objects available in as many ways as they can think of. A human operator provides the necessary properties of curiosity, boredom, and affordance priors to guide rich object play. Despite not being trained on task-specific data, this system is capable of generalizing to 18 complex user-specified manipulation tasks with an average success of 85.5%, outperforming individual models trained on expert demonstrations (success of 70.3%).
Reinforcement Learning, Fast and Slow, DeepMind. A nice review paper that compares and contrasts RL with theories of neuroscience.
Machine behaviour, many institutions. This review outlines a set of questions that are fundamental to the emerging field of AI and then explores the technical, legal and institutional constraints on the study of machine behaviour.
Why are machine learning projects hard to manage? It’s a mix of challenges: it’s hard to predict what will be easy vs. hard, ML fails in unexpected ways, models require lots of relevant training data and companies often don’t know a priori how much effort it will take. What’s more, implementing successful ML projects within your business is often 100x harder than you think. In fact, Palantir President Shyam Sankar goes on to argue that data is the new snake oil, effectively meaning that if your company’s revenue is not derived from directly monetising ML-based predictions, then you must seriously focus down on using ML for the tasks that really matter.
How might one adapt deep learning models that are effective on natural images to medical images? Read more here.
Here is a nice overview post about how deep multi-agent reinforcement learning works.
Generative Adversarial Networks, which learn to create examples resembling the data on which they have been trained, can also be used to reduce the need for labelled training data.
How should we go about formally verifying the behaviour of ML models? DeepMind offers three approaches they are pursuing here. Another strand of research on this theme comes from ETH Zurich’s Secure, Reliable, and Intelligent Systems Lab here. We hosted Petar Tsankov, Senior Researcher, who works on this problem, at our latest London.AI held at Facebook.
Why is unsupervised learning worth pursuing? More here.
A visual exploration of Gaussian Processes here.
Generate training data for robotic manipulation tasks with the UnrealROX photorealistic simulation environment.
Researchers from Stanford’s radiology and machine learning groups published an updated chest x-ray dataset that now brings the total number of available images to 500k. Paper here. As pointed out by Luke, the labels still contain errors because they are generated by using NLP to mine doctors’ notes that don’t necessarily describe the images directly. Given the scale of the dataset and the work put into generating and validating it, another paper could have been released with just dataset documentation.
Open source tools
Making building, training, testing and deploying ML models easier for non-coders with Uber’s Ludwig project.
Several resources focused on distributed machine learning training: 1) Google released GPipe, an open source library for efficiently training large neural network models. The library uses synchronous stochastic gradient descent and pipeline parallelism for training. 2) Another solution in this space is Uber’s Horovod, a distributed training framework that allows engineers to lever thousands of GPUs. 3) DeepMind released TF-Replicator, a software library that helps developers deploy their TensorFlow code between CPUs, GPUs and TPUs with minimal effort.
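One idea behind the synchronous training mentioned above is that a mini-batch can be split into smaller micro-batches whose gradients are accumulated before a single weight update, which gives the same result as processing the whole mini-batch at once. The sketch below shows only that equivalence in plain Python. It does not use GPipe, Horovod or TF-Replicator, it does not model the pipeline scheduling across devices, and the one-parameter model and function names are assumptions for illustration.

```python
def gradient(w, batch):
    """Sum of squared-error gradients for the model y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch)

def split_micro_batches(batch, num_micro):
    """Split a mini-batch into contiguous micro-batches."""
    size = (len(batch) + num_micro - 1) // num_micro
    return [batch[i:i + size] for i in range(0, len(batch), size)]

def synchronous_step(w, batch, num_micro, lr=0.01):
    """Accumulate gradients over all micro-batches, then apply one update."""
    accumulated = 0.0
    for micro in split_micro_batches(batch, num_micro):
        accumulated += gradient(w, micro)
    return w - lr * accumulated / len(batch)

# Made-up data drawn from y = 2x.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_full = synchronous_step(0.0, batch, num_micro=1)   # whole mini-batch at once
w_micro = synchronous_step(0.0, batch, num_micro=4)  # four micro-batches
print(w_full, w_micro)
```

In a pipelined setup, each micro-batch would flow through model partitions placed on different accelerators so that the devices stay busy. The single update at the end is what keeps the training synchronous.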
Here’s a cool blog post, Probabilistic Model-Based Reinforcement Learning Using The Differentiable Neural Computer. Here, the author replaces an LSTM with a Differentiable Neural Computer (DNC) to learn a game environment entirely from pixels in order to train a complex reinforcement learning or evolution strategies agent. The experiments showed the DNC outperforming the LSTM.
💰 Venture capital financings and exits
Here’s a highlight of the most intriguing financing rounds:
Nuro, a Bay Area company developing an on-road autonomous logistics fleet and local delivery service, raised a whopping $940M Series B from SoftBank Vision Fund. Nuro is barely three years old and has moved at incredible speed to develop a full-stack hardware and software autonomous logistics proposition. The founders spun out of Google’s self-driving project.
Aurora, the self-driving software company founded by veterans of Google’s self-driving project, Tesla Autopilot and Uber, raised a $530M Series B led by Sequoia. Interestingly, Amazon joined the round as a corporate investor, which suggests it plans to strengthen its logistics offering. On a related note, Amazon also recently led a $575M Series G in Deliveroo, which agrees with the narrative of creating a hyper-utilised multi-payload (and potentially autonomous) logistics fleet for super fast city deliveries.
Two Bay Area-based self-driving truck startups, both of which include alums from Uber/Otto, raised large Series A rounds. Ike closed a $52M Series A led by Bain Capital Ventures, while Kodiak Robotics raised a $40M Series A led by Battery Ventures. It’s too early in either company’s life to tell the difference at this stage :-)
PolyAI, the London-based developer of conversational dialogue systems to automate contact center operations, raised a $12M Series A led by Point72 Ventures. The founders bring experience from VocalIQ (acq. Apple), Facebook AI Research, Google AI, and Cambridge University’s famed Spoken Dialogue Systems research group.
Tessian, the London-based email cybersecurity company, raised a $42M Series B led by Sequoia. The service protects enterprise users from sending emails to the wrong recipients and acts against increasingly common spear phishing attacks.
Horizon Robotics, a Chinese AI chip startup, raised a $600M round that values the company at $3B.
Databricks, a unified analytics platform that brings together data science, engineering and business, raised a $250M Series E from existing investor Andreessen Horowitz. The product is used by over 2,000 organisations according to the company.
Megvii, the Beijing-based face recognition startup, raised a further $750M in capital for its Series D. This makes it one of the most funded computer vision startups on the market. The round included Chinese government funds as well as Australia’s Macquarie Group, and a subsidiary of the sovereign wealth fund Abu Dhabi Investment Authority (ADIA).
PathAI, a Boston-based histology image analytics company, raised a $60M Series B led by General Atlantic and General Catalyst. The solution is used by, amongst others, pharma companies to rapidly analyse the results of drugs they are testing on animal models.
Rasa, the Berlin-based open source framework for bot building, raised a $13M Series A led by Accel in a bid to become part of the unstructured-to-structured data infrastructure for robotic process automation.
A couple of M&A deals, including:
Figure Eight, one of the leading data labelling service companies, was acquired in a deal that could be worth up to $300M. The buyer, Appen, is a 20-year-old publicly listed data annotation company with 500 employees. Appen paid $175M in cash up front for 10-year-old Figure Eight, which pays off 3x the capital it has raised from venture investors. An extra $125M is up for grabs based on certain commercial targets. To me, this substantiates the thesis that labelling is indeed a commodity services business.
Dynamic Yield, an Israeli startup offering AI-based personalisation software for customer experience, was acquired by McDonald’s for $300M. The product enables more than 300 brands in six continents – including industry leaders across retail, gaming, finance, travel, and publishing – to build unified customer profiles, launch and optimize personalization campaigns, and automate decision-making. More on the acquisition here.
Canvas Technologies, a Series A stage indoor robotics company focused on logistics, was acquired by Amazon for an undisclosed sum. The business had raised a $15M round from Playground Global.
Nathan Benaich, 19 May 2019
Air Street Capital | Twitter | LinkedIn | RAAIS | London.AI