News in artificial intelligence and machine learning: Aug-Sept 2016
From 11th August through 3rd October 2016
Welcome to issue #15 of my newsletter covering news in AI. If anything piques your interest or you’d like to share a piece that’s caught your eye that I missed, just hit reply.
Are you building an AI-driven startup? I suggest you flick through these slides. Are you keen to dive into trending AI research? Here is your salvation.
I’m actively looking for entrepreneurs building companies that leverage AI to solve interesting, high-value problems in any industry. Do get in touch 👍
Referred by a friend? Sign up here. Want to share? Give it a tweet :)
Technology news, trends and opinions
💪 From the big boys
The Chinese search behemoth, Baidu, announced a $200m investment initiative focused on AI and pre-released their new open-source deep learning framework, PaddlePaddle. The company has a way to go to compete given that developers are still more likely to use TensorFlow, which holds the lead along with Caffe, Keras and Theano. Speaking of the growing number of hardware and software configurations available today, this research paper provides helpful benchmarks.
Backchannel run a rare piece on how Apple uses machine learning. It states that a 200MB software package runs on the iPhone, encompassing “app usage data, interactions with contacts, neural net processing, a speech modeler and a natural language event modeling system”. I’ve held the view for a while now that today’s AI techniques and infrastructure will re-open a class of historically intractable problems while also enabling us to rethink how products and features should be designed. Apple seem to think the same: “Machine learning is enabling us to say yes to some things that in past years we would have said no to. It’s becoming embedded in the process of deciding the products we’re going to do next.”
Salesforce announced their internal umbrella AI initiative, modestly called Einstein, which will go on to power many of the company’s cloud services, as well as expose AI tools to end users. The team of 175 data scientists includes talent from acquired startups MetaMind, PredictionIO and RelateIQ. The company’s flagship event, Dreamforce, will attract 170k people into SF next week.
Six of the most powerful technology companies have set up the Partnership on AI, a non-profit aimed at advancing public understanding of AI and formulating best practices on the challenges and opportunities within the field. An important catalyst to this end will undoubtedly be the continuation of open source technology development, which Seldon’s founder articulates in this piece.
🌎 On the importance and impact of AI on the World
Stanford’s 100 year study on AI published their first report. It finds “no cause for concern that AI is an imminent threat to humankind. No machines with self-sustaining long-term goals and intent have been developed, nor are they likely to be developed in the near future”. From a public policy perspective, it recommends that policymakers:
Define a path toward accruing technical expertise in AI at all levels of government.
Remove the perceived and actual impediments to research on the fairness, security, privacy, and social impacts of AI systems.
Increase public and private funding for interdisciplinary studies of the societal impacts of AI.
Princeton postdoc Aaron Bornstein writes a fascinating piece on the interpretability of AI systems. Given that deep learning models create internal representations of input data using features that weren’t handcrafted, their inner workings are difficult (but not entirely impossible) to understand. Without interpretability, practitioners in high-stakes domains (e.g. healthcare, government, finance) just won’t adopt models that otherwise deliver state-of-the-art performance. On the same subject, this piece runs through several initiatives focused on verification, including a new DARPA project on Explainable AI.
a16z’s Chris Dixon sets out 11 reasons to be excited about the future of technology with short soundbites for each. Five of these are either directly related to or will be enabled by AI and machine learning.
👍 User-friendly AI
UC Berkeley announced a new Center for Human-Compatible AI to study how AI systems used for mission-critical tasks can act in ways that are aligned with human values. One enabling technique is inverse reinforcement learning, where an agent (e.g. a robot) infers the reward function behind a task by observing human actions, instead of optimising a hand-specified objective on its own.
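To make that concrete, here’s a heavily simplified numpy sketch of the reward-inference step (apprenticeship-learning style; all the numbers are toy assumptions): find reward weights under which the expert’s observed behaviour scores at least as well as any alternative policy.

```python
import numpy as np

rng = np.random.default_rng(0)
# Average feature counts observed along trajectories (toy values):
mu_expert = np.array([0.8, 0.1, 0.1])     # the demonstrating human
mu_others = rng.dirichlet(np.ones(3), 5)  # five alternative policies

w = np.zeros(3)  # unknown reward weights, reward(s) = w . features(s)
for _ in range(200):
    j = np.argmax(mu_others @ w)              # most competitive alternative
    if mu_expert @ w > mu_others[j] @ w + 1e-3:
        break                                 # expert already clearly preferred
    w += 0.1 * (mu_expert - mu_others[j])     # push expert above it
    w /= max(1.0, np.linalg.norm(w))          # keep ||w|| <= 1
print(w)  # weights under which the expert's behaviour looks (near-)optimal
```

A full system would alternate this with re-solving for the optimal policy under the current reward estimate; the point is simply that the reward is recovered from demonstrations rather than specified by hand.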
Designer Ines Montani makes the case for how front-end development can improve AI. Music to my ears! I take the view that although AI can be used to solve fascinatingly complex problems, wrapping a service with an API for others to dream up the most powerful use case isn’t the path to building a valuable company. Instead, one should productise technology with user-centered design as a top priority. Ines walks through how design can “improve the collection of annotated data, communicate the capabilities of the technology to key stakeholders and explore the system's behaviours and errors.”
💻 AI running at scale
Google has published a high-level description of their deep learning-based recommendation system for YouTube, built on TensorFlow. The system uses two networks: one to generate potential candidates from the corpus of videos and a second to rank those candidates using video features, user history and context. In contrast to many deep learning models, the ranking model uses hundreds of engineered features because the raw data doesn’t lend itself well to direct input. Two weeks later, the company open sourced a dataset of 8 million YouTube video URLs along with labels from a set of 4,800 classes.
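Not Google’s code, but a minimal numpy sketch of that two-stage design with made-up data: a cheap similarity pass narrows thousands of videos to a shortlist, then a richer model (a stand-in linear scorer here) ranks only the shortlist.

```python
import numpy as np

rng = np.random.default_rng(42)
n_videos, dim = 10_000, 32

# Stage 1 (candidate generation): embed user and videos in a shared space,
# then retrieve the nearest videos -- a cheap pass over the full corpus.
video_emb = rng.normal(size=(n_videos, dim))
user_emb = rng.normal(size=dim)              # e.g. derived from watch history
candidates = np.argsort(-(video_emb @ user_emb))[:100]

# Stage 2 (ranking): score only the candidates with a richer feature set
# (video features + user history + context); a random linear model stands in
# for the trained ranking network here.
def features(v):
    return np.concatenate([video_emb[v], user_emb])

w = rng.normal(size=2 * dim)
ranked = sorted(candidates, key=lambda v: -(features(v) @ w))
print(ranked[:10])                            # final recommendations
```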
Spotify takes us through the evolution of the machine learning teams that drive the recommendations behind their Discover Weekly and Radio products.
Bloomberg run a piece on how 12 hedge funds that use machine learning-based quantitative strategies outperformed an index of all hedge funds this year. On a similar note, Numerai, the distributed quant hedge fund, explain their thesis on ensembling machine intelligence from thousands of data scientists around the world to achieve breakthroughs in stock market prediction accuracy.
Ten years after the original release of Google Translate, the Google Brain team announce a new state-of-the-art Neural Machine Translation system (paper here). The system feeds the entire input sentence into a recurrent neural network instead of first breaking it into words and phrases. At each decoding step, the network attends to a weighted distribution over the encoded source words (e.g. Chinese words), focusing on those most relevant to generating the next output word (e.g. an English word). Of note, the Chinese-to-English Google Translate service now runs entirely on this neural system, producing 18 million translations per day!
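The attention step is easy to sketch in numpy (illustrative only; GNMT scores relevance with a small learned network rather than the dot product used here):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(6, 128))  # one vector per source word
decoder_state = rng.normal(size=128)        # state just before the next word

# Score each source word's relevance to the next output word, normalise to
# a distribution, then take the attention-weighted average of encoder states.
weights = softmax(encoder_states @ decoder_state)
context = weights @ encoder_states          # feeds the next-word prediction
print(weights.round(2), context.shape)
```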
🔬 AI in healthcare and life sciences
Google DeepMind announced a research partnership with the Radiotherapy Department at University College London Hospitals NHS Foundation Trust. The project focuses on improving the process of segmenting normal tissue from cancer in the head and neck region so that radiotherapy causes less collateral damage to non-cancer regions.
The Next Platform has tracked research publications in deep learning since the summer and finds a particular emphasis on medical applications: prenatal ultrasound, breast mammography, brain cancer and melanoma.
Slightly more left field, Elon Musk announced that he’s made progress on a design for a neural lace. This would effectively serve as an interface between our brains and machines, to avoid even the benign scenario in which humans become “house cats” in the age of superintelligent AI.
🚗 Department of Driverless Cars
I attended NVIDIA’s GPU Technology Conference (GTC) in Amsterdam last week and was positively taken aback by the extent of the company’s investment in driving autonomy. Jen-Hsun Huang, who founded the company in 1993 and still leads as CEO, spent the better part of his 1.5-hour opening keynote talking through the integrated hardware and software platform NVIDIA is launching to power autonomy. These products and services are pluggable such that their 80+ partners can choose what they want to buy vs build. NVIDIA is clearly positioned to provide the shovels for the self-driving gold rush, much like Google’s TensorFlow enables the company to sell more compute infrastructure time. Announcements included:
DRIVE PX 2, an in-car GPU computing platform available in three configurations to enable automated highway driving (1 GPU @ 10 watts), point-to-point travel (2 mobile processors + 2 GPUs) or full autonomy (multiple PX 2 systems).
DRIVEWORKS, a software development kit that provides a runtime pipeline framework for environment detection, localisation, planning and a visualisation dashboard for the passenger.
DGX-1, a deep learning “supercomputer” to train the multiple networks running on the DRIVE PX 2.
The BB8 self-driving car (watch this video), which learned to drive in both rainy and dark conditions, take hard corners, navigate around cones and construction sites, and drive without needing any lane markings.
An HD mapping partnership with TomTom built on the DRIVE PX 2 platform.
Ben Thompson of Stratechery weighs in on Google, Uber and the evolution of transportation as a service. He makes the case that Uber, which is rolling out its self-driving fleet in Pittsburgh, is in pole position in a race that Travis Kalanick calls “existential”.
Fortune run a piece on the journey of the Justin.tv founders from building a live streaming business to a self-driving car technology company, both of which sold for over $1bn.
The US Federal Government released its first rulebook on autonomous vehicles, covering the safe testing and deployment of AVs (including data sharing) as well as a model US state policy framework for regulating AVs.
Mapillary, the Swedish company operating a crowdsourced street-level imagery service, joined UC Berkeley’s DeepDrive consortium, where it will focus on semantic segmentation of real-world imagery and structure from motion to help drive research in deep learning and computer vision for autonomy.
Research, development and resources
The majority of machine learning models we talk about in the real world are discriminative: they model the dependence of an unobserved variable y on an observed variable x in order to predict y from x, and are therefore used for supervised classification or regression tasks. Generative models, on the other hand, are fully probabilistic models of all variables, from which randomly generated observable data points can be obtained. They’re all the rage at the moment because they have been shown to synthesise artificial content (text, images, video, sound) that looks and sounds real from small, unlabeled datasets. Here’s a range of research and resources that help us understand how they work, why they’re fascinating and where they can be used (with two short sketches after the list):
A UCLA undergrad walks us through how generative adversarial networks work.
The Twitter Cortex Vx team (Magic Pony Technology) has had a busy summer publishing three papers: super-resolution of images and video on a single K2 GPU using a convolutional neural network architecture (paper), super-resolution of a 4x downsampled image using a generative adversarial network (paper) and an extended discussion of their work (paper).
Google DeepMind publish WaveNet, a generative model for raw audio (paper here). The work uses a generative convolutional neural network architecture that operates directly on the raw audio waveform, modelling the conditional probability of each sample given all of the samples before it. The network generates audio at 16,000 samples per second, and each predicted sample is fed back through the network to predict the next one (a loop sketched after this list). The results on text-to-speech tasks are impressive!
Shakir Mohamed, Research Scientist at Google DeepMind, presents his work on building machines that imagine and reason at this summer’s Deep Learning Summer School. He is also co-author on a paper, Unsupervised Learning of 3D Structure from Images, which uses generative models to infer 3D representations given a 2D image.
Researchers in Edinburgh publish the Neural Photo Editor, a novel interface for exploring the learned latent space of generative models and for making specific semantic changes to natural images. The method allows a user to produce said changes in the output image by use of a "contextual paintbrush" that indirectly modifies the latent vector.
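First, to make the discriminative/generative distinction above concrete, here’s a minimal scikit-learn comparison on toy data: logistic regression models p(y|x) only, while Gaussian naive Bayes models the full joint distribution and can therefore also synthesise new data points.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = make_blobs(n_samples=500, centers=2, random_state=0)

disc = LogisticRegression().fit(X, y)  # discriminative: models p(y|x) only
gen = GaussianNB().fit(X, y)           # generative: models p(x|y) and p(y)

# Both can classify...
print(disc.score(X, y), gen.score(X, y))

# ...but only the generative model describes how the data itself is
# distributed, so we can draw a synthetic point from a class's fitted Gaussian.
rng = np.random.default_rng(0)
new_x = rng.normal(gen.theta_[1], np.sqrt(gen.var_[1]))  # a new class-1 sample
print(new_x)
```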
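Second, the autoregressive loop at WaveNet’s heart, sketched with a dummy network standing in for the real stack of dilated causal convolutions (everything here is a stand-in): each new sample is drawn from a predicted distribution over 256 amplitude levels and then fed back in.

```python
import numpy as np

rng = np.random.default_rng(0)
LEVELS = 256                # WaveNet quantises amplitudes to 256 mu-law values

def next_sample_dist(history):
    """Stand-in for the trained network: p(next sample | all samples so far).
    The real model computes this with dilated causal convolutions."""
    logits = rng.normal(size=LEVELS)
    e = np.exp(logits - logits.max())
    return e / e.sum()

samples = [128]             # seed with a mid-range amplitude
for _ in range(16_000):     # one second of audio at 16 kHz
    p = next_sample_dist(samples)
    samples.append(int(rng.choice(LEVELS, p=p)))  # draw, append, feed back
audio = np.array(samples)   # (white noise here; speech with the real network)
```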
Attention and Augmented Recurrent Neural Networks, Google Brain. In this piece, the authors describe how RNNs are useful for modelling sequences of data and explore four extensions thereof. The piece includes many visualisations to help distill the important bits.
Show and Tell: image captioning open sourced in TensorFlow, Google Brain. The system encodes a target image with a vision model (a convolutional network), and this encoding initialises a recurrent language model that generates a descriptive caption word by word. In this work, the authors add a fine-tuning step during which the captioning system is improved by jointly training its vision and language components on human-generated captions. Interestingly, the new model implemented in TensorFlow can generate novel, natural-sounding English descriptions of scenes that weren’t included in the training data. Paper here.
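A toy numpy sketch of that encode-then-decode loop (random stand-in weights and a six-word vocabulary, so the output is gibberish; the structure is the point): the image encoding seeds the recurrent state, and decoding proceeds greedily one word at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<start>", "a", "dog", "on", "grass", "<end>"]
V, H = len(vocab), 64

image_encoding = rng.normal(size=H)   # stand-in for the CNN's image encoding
Wxh, Whh, Why = (0.1 * rng.normal(size=s) for s in [(V, H), (H, H), (H, V)])

h = np.tanh(image_encoding)           # the image encoding seeds the RNN state
word, caption = vocab.index("<start>"), []
for _ in range(10):                   # greedy decoding, one word per step
    x = np.eye(V)[word]               # one-hot embedding of the previous word
    h = np.tanh(x @ Wxh + h @ Whh)
    word = int(np.argmax(h @ Why))    # most probable next word
    if vocab[word] == "<end>":
        break
    caption.append(vocab[word])
print(" ".join(caption))
```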
Stealing machine learning models via prediction APIs, EPFL. With the rise of cloud-based ML-as-a-service offerings, the security of training data and predictions moves front and center. This work investigates model extraction attacks that seek to duplicate the functionality of a model by reverse engineering its parameters. The authors demonstrate that successful attacks rely on the outputs of prediction APIs, namely the high-precision confidence values and class labels, to iteratively identify the model parameters. They show that services built by Google, Amazon, Microsoft and BigML are susceptible to such attacks.
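The equation-solving flavour of the attack is striking in its simplicity. Here’s a sketch for logistic regression (a toy victim model trained locally plays the role of the remote API): since logit(p) = w·x + b, d+1 well-chosen queries returning high-precision confidences pin down the parameters exactly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# The "victim": a model hidden behind a prediction API returning confidences.
X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
victim = LogisticRegression().fit(X, y)
api = lambda q: victim.predict_proba(q)[:, 1]    # all the attacker sees

# Attack: logit(p) = w . x + b, so querying the origin and the d unit vectors
# yields b and then each w_i by subtraction -- a solvable linear system.
d = X.shape[1]
queries = np.vstack([np.zeros(d), np.eye(d)])
logits = np.log(api(queries) / (1 - api(queries)))
b_hat, w_hat = logits[0], logits[1:] - logits[0]
print(np.allclose(w_hat, victim.coef_[0]), np.allclose(b_hat, victim.intercept_[0]))
```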
Venture capital financings and exits
The big deal: Movidius, the European company that makes low-power computer vision chips and software for connected devices, sold to Intel for $400m. The company was founded in 2005 and raised $95m over 7 rounds from Draper Esprit, Atlantic Bridge Capital, Robert Bosch Venture Capital and others. According to Intel, Movidius’ algorithms tuned for deep learning, navigation and mapping, and natural interactions “will be deployed across Intel’s efforts on augmented, virtual and merged reality (AR/VR/MR), drones, robotics, digital security cameras and beyond.”
61 companies raised $305m over 62 financing rounds from 112 investors. Median deal size was $2m (down from $2.8m in last issue) at a pre-money valuation of $7.7m (up from $6.3m in last issue). Deals include:
Quanergy, the maker of 3D time-of-flight LiDAR sensors, raised a $100m Series B at a $1bn post-money valuation from public companies Sensata Technologies and Delphi Automotive. The company has raised $134m since its founding in 2012.
Blackwood Seven, the Danish media analytics and predictive sales modelling platform, raised a $15.1m round from JOLT Capital, Sunstone Capital and Conor Venture Partners. The company has raised $22.6m since its founding in 2012.
Skymind, the creator of an open-source enterprise deep-learning platform including Deeplearning4j, announced a $3m seed round from Westlake Ventures, Tencent Holdings, YC and a range of angel investors.
The London-based pre-idea and pre-team accelerator, Entrepreneur First, graduated their 6th cohort with over a dozen companies utilising AI to solve problems in healthcare, manufacturing, sales, bioinformatics and finance.
For a roundup of the European landscape for AI companies, check out this research piece by Project Juno AI.
---
Anything else catch your eye? Do you have feedback on the content/structure of this newsletter? Just hit reply!
I’m actively looking for entrepreneurs building companies that build/use AI to rethink the way we live and work.