News in artificial intelligence and machine learning [July/August]
From 12th July through 10th August 2016.
Referred by a friend? Sign up here. Want to share? Give it a tweet :)
Welcome to the July/August issue of my AI newsletter, featuring headlines from big tech co’s, research on image/video manipulation and simulation, and $309m of investments and $550m of acquisitions!
If anything piques your interest or you’d like to share a piece that’s caught your eye that I missed, just hit reply!
I’m actively looking for entrepreneurs building companies that leverage AI to solve interesting, high-value problems across the board. Get in touch 👍
Technology news, trends and opinions
💪 Headlines from the big tech co’s
Google has now publicised the fact that DeepMind’s deep reinforcement learning approaches yielded a 15% improvement in power usage effectiveness for their data centres by optimising 120 variables. They’ve also increased the conversion rate for app downloads in the Google Play store by 4%, as well as unlocked the quantitative early detection of diabetic retinopathy.
In my view, this evidence validates the thesis I hold on AI:
It can unlock commercially valuable problems that were previously difficult to solve or were intractable. For example, listen to Kenn Cukier’s Economist podcast with Tractable to hear how the processing of car insurance claims is ripe for deep learning and transfer learning.
Considering the spectrum of possible approaches to a given problem (past, present and future), AI can reveal solutions that represent a new local maximum, beyond the results achieved by optimising the current state of the art. Think how Lee Sedol has improved his gameplay since facing AlphaGo, an experience that prompted him to explore new strategies because those he had been exploiting weren’t up to scratch.
Google, Microsoft and Amazon are pushing hard to democratise ML tools that sit in their cloud ecosystems to enable businesses to extract more value from their data. Exemplifying the value Microsoft, for example, places on this strategy, their director of product management was quoted saying:
“Every company I talk with has someone extremely senior tasked with thinking about how to make this technology [cloud machine learning] work for them.”
From the research vantage point, Yann LeCun, Facebook’s director of AI Research, outlines his view on the positioning of Apple, Google, Microsoft, DeepMind and Facebook re: AI research. In short, “If you can’t publish, it’s not research. At best, it’s technology development.”
In a longer profile piece on Facebook’s 10-year vision, Zuck explains the importance of bringing the entire world online, solving core questions in general AI and unsupervised learning, and how the Messenger bot platform arose organically from users asking questions of Facebook business pages.
Baidu, the Chinese search powerhouse, released two interesting projects:
The first is a new augmented reality platform (DuSee), which uses computer vision and deep learning to project animations into 3D space. They plan on implementing this functionality into their mobile apps, including Mobile Baidu search, which has hundreds of millions of users. This looks awfully similar to UK-based Blippar, which launched in 2012 and has raised $126m to date in 6 rounds.
The second is a collaboration with The Ullens Center for Contemporary Art in Beijing, where the company presents a system that can compose music on the basis of the content of an image. For example, a picture of a beach fed into a neural network will output attribute labels that are associated with a matrix of “music units”, which instruct a composer network to create a piece of music to suit the picture (a rough sketch of such a pipeline is below).
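For intuition, here’s a minimal, hypothetical sketch of that image → attributes → music-units → composer pipeline in PyTorch. Every module, dimension and name below is an assumption for illustration; Baidu haven’t published their architecture.

```python
# Hypothetical sketch of an image-to-music pipeline in the spirit of
# Baidu's demo: image -> attribute labels -> "music units" -> composer RNN.
# All names and sizes are illustrative, not Baidu's actual system.
import torch
import torch.nn as nn

NUM_ATTRIBUTES = 50    # e.g. "beach", "sunset", "crowd" ... (assumed)
NUM_MUSIC_UNITS = 128  # short melodic/rhythmic fragments (assumed)
NUM_NOTES = 88         # output vocabulary, e.g. piano pitches (assumed)

class AttributeTagger(nn.Module):
    """Tiny CNN standing in for a pretrained image classifier."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, NUM_ATTRIBUTES)

    def forward(self, img):
        h = self.features(img).flatten(1)
        return torch.sigmoid(self.head(h))  # multi-label attribute scores

class Composer(nn.Module):
    """LSTM that turns a bag of music units into a note sequence."""
    def __init__(self):
        super().__init__()
        # Learned matrix mapping attributes to "music unit" activations.
        self.attr_to_units = nn.Linear(NUM_ATTRIBUTES, NUM_MUSIC_UNITS)
        self.rnn = nn.LSTM(NUM_MUSIC_UNITS, 256, batch_first=True)
        self.to_notes = nn.Linear(256, NUM_NOTES)

    def forward(self, attrs, steps=64):
        units = torch.relu(self.attr_to_units(attrs))  # (B, units)
        x = units.unsqueeze(1).repeat(1, steps, 1)     # condition every step
        h, _ = self.rnn(x)
        return self.to_notes(h)                        # (B, steps, notes)

img = torch.randn(1, 3, 224, 224)               # stand-in beach photo
notes = Composer()(AttributeTagger()(img))
print(notes.argmax(-1))                          # an (untrained) "melody"
```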
As more non-tech companies jump on the AI bandwagon, they’ll need to be aware of the business implications imposed by learning systems. This piece brings home the point of building “reciprocal data applications”, whose purpose is to collect task-specific data through experiential interfaces that return incrementally more value to the user over time. I think it’s important to drive home the point that AI is one of the few technologies that can demonstrably drive accelerating returns today. *Thanks to Aris for sharing.
Given all the hype around chat bots for use cases like customer service (which some of you know I’m more bearish on than the average VC), I’m relieved to finally see Zendesk leverage their dominant position in this space to push out “automatic answers” to customer queries. Aside from the argument that companies often use more than one comms channel to communicate with their customers (which might warrant a cross-platform 3rd party vendor), I don’t see many reasons to believe that Intercom, Desk.com, Facebook and others won’t follow Zendesk in developing their own bot-style technology, given their existing relationships with clients and data.
🚗 A driverless future
News emerged that key members of Google’s self-driving car unit have recently departed, including their CTO/tech lead and a principal software engineer who helped set the program up. It’s worth keeping an eye on these movements given that a) former members of this team are setting up their own companies (e.g. Otto) and b) Google’s dominance isn’t as set in stone as it once was.
Elon Musk released his second 10-year plan for Tesla in which he weaves a compelling story of democratising access to environmentally-friendly and energy efficient transport, while painting a future picture for autonomous car sharing (spoiler: your car can earn money for you while you sleep).
🔏 Controlling AI systems as they perform increasingly important tasks
Jack Clark (now ex-Bloomberg) runs an interview with DeepMind’s Demis Hassabis exploring how his team scopes high-value applications within Google, who should control general AI (he suggests donating it to the UN) and how to organise science in such a way as to promote discovery. Indeed, as we make progress towards general AI, it’s important that we appreciate where the field has come from, as explained by Tom Chatfield in his piece on the etymology of AI and other technologies we take for granted today.
The authors of a high-profile piece studying the social dilemma of autonomous vehicles (video summary of the paper here) have released Moral Machine. The platform will crowd-source human opinion on how machines should make decisions when faced with moral dilemmas, as well as scenarios of moral consequence. The experiment asks some tough questions; have a try!
A piece ran in TechCrunch arguing about the true cost of free services, a well-known (at least in the tech world) ploy for large companies to acquire user data for the benefit of product development. In particular, the piece takes aim at the DeepMind/NHS collaboration, framing it as a trade of free services to the NHS in exchange for patient records. I don’t think this is necessarily fair given that we need software to solve problems like early detection of conditions, generating holistic views of patient data for diagnosis and management, as well as improving physician workflows to make the health system function more efficiently. It’s surely in our interests to see this technology take form.
The White House Office of Science and Technology Policy recently closed their call for information soliciting public input on the subject of artificial intelligence, specifically on the tools, technologies and scientific training that are needed. IBM published their in-depth response here — particularly interesting are their application areas and the core research problems they’re working on.
Will Knight’s piece on why we’re having a hard time building AI to solve natural language understanding ignited quite a fascinating discussion on Hacker News that is worth a read.
Research, development and resources
Here’s a selection of impactful research papers that are worth a look. Special focus on image and video processing (always fun results!):
Learning a driving simulator, comma.ai and the University of Florida. Here, the driverless car startup releases 7.25 hours of real-world video footage captured from a front-facing camera, along with several telemetric measurements including car speed, acceleration, steering angle, GPS and gyroscope angles. They use this dataset to learn a video predictor built on an autoencoder trained with generative adversarial methods. A generator network receives random input samples from the encoder’s latent space and is tasked with synthesising realistic-looking images, which are classified as real/fake by a discriminator network with access to a bank of real-world images. The authors also train a recurrent neural network transition model to ensure that the sequence of generated images properly recapitulates the structure of a highway road. This work is exciting because it demonstrates how an AI training simulator can be built from scratch by learning from real-world data. Resulting videos can be seen here. *Thanks to Moritz for pointing me to this!
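To make the moving parts concrete, here’s a bare-bones PyTorch sketch of the architecture as I read it: an adversarially-trained autoencoder plus an RNN transition model over the latent space. All layer sizes, the 64×64 resolution and the two control inputs are my assumptions, not the paper’s exact configuration.

```python
# Minimal sketch of a comma.ai-style learned simulator: an adversarially
# trained autoencoder embeds road frames into a latent code, and an RNN
# learns transitions in that latent space. Shapes are illustrative only.
import torch
import torch.nn as nn

Z = 512  # latent dimension (assumed)

encoder = nn.Sequential(
    nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
    nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, Z),
)
decoder = nn.Sequential(  # generator: latent code -> 64x64 frame
    nn.Linear(Z, 128 * 8 * 8), nn.ReLU(), nn.Unflatten(1, (128, 8, 8)),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
)
discriminator = nn.Sequential(  # scores real vs. synthesised frames
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1),
)
# Transition model: predict the next latent code from the current code plus
# controls (steering, speed), so rollouts stay on a plausible highway.
transition = nn.LSTM(input_size=Z + 2, hidden_size=Z, batch_first=True)

frames = torch.randn(4, 3, 64, 64)   # a batch of camera frames
controls = torch.randn(4, 1, 2)      # (steering, speed) per step
z = encoder(frames)                  # (4, Z)
z_next, _ = transition(torch.cat([z.unsqueeze(1), controls], dim=-1))
fake = decoder(z_next.squeeze(1))    # predicted next frame
score = discriminator(fake)          # adversarial training signal
```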
On the subject of driving simulators, Craig Quiter published a driving simulator environment (DeepDrive) based on the Grand Theft Auto (GTA) video game, with hooks for perception and control. This serves as a testbed for training self-driving cars that leverage deep (reinforcement) learning approaches. He publishes a demo video of a vehicle trained with a CNN (AlexNet) that maps raw GTA image inputs from a forward-mounted camera to steering, throttle, yaw and speed.
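A behavioural-cloning setup like this takes only a few lines to sketch; the four-dimensional head and the training targets below are my assumptions, not Quiter’s actual code.

```python
# Rough sketch of DeepDrive-style behavioural cloning: an AlexNet backbone
# regresses control signals (steering, throttle, yaw, speed) from raw frames.
import torch
import torch.nn as nn
from torchvision import models

net = models.alexnet(weights=None)
net.classifier[-1] = nn.Linear(4096, 4)  # -> [steering, throttle, yaw, speed]

frames = torch.randn(8, 3, 224, 224)     # batch of (stand-in) GTA screenshots
targets = torch.randn(8, 4)              # recorded controls for those frames

loss = nn.functional.mse_loss(net(frames), targets)
loss.backward()                          # one behavioural-cloning step
```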
Anticipating visual representations from unlabelled video, MIT and University of Maryland. Following on from the theme of the comma.ai paper, this study presents a framework that uses the temporal structure of unlabelled video to learn to anticipate human actions and objects. Instead of predicting pixels or labelled categories directly, the approach trains a deep regression network to predict the representation of frames in the future. The network estimates several plausible representations of the future, and recognition algorithms (such as object or action classifiers) are run over these to anticipate high-level concepts. This produces a distribution over the categories most likely to appear in each future representation, thereby ‘predicting the future’.
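Schematically, the trick is to regress future features rather than future pixels. Here’s a toy PyTorch version; the feature extractor, predictor architecture and 10-way action head are all illustrative stand-ins for the paper’s setup.

```python
# Toy version of "predict the representation of the future frame":
# regress fc7-style features of a frame ~1s ahead, then classify them.
import torch
import torch.nn as nn
from torchvision import models

feature_net = models.alexnet(weights=None).eval()  # stand-in pretrained CNN

def fc7(x):
    """Penultimate-layer features of the stand-in network."""
    with torch.no_grad():
        h = feature_net.avgpool(feature_net.features(x)).flatten(1)
        return feature_net.classifier[:-1](h)  # (B, 4096)

predictor = nn.Sequential(  # current frame -> predicted *future* features
    nn.Conv2d(3, 16, 7, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
    nn.Linear(16 * 4 * 4, 4096),
)
action_classifier = nn.Linear(4096, 10)  # e.g. hug / handshake / high-five

now, future = torch.rand(2, 3, 224, 224).chunk(2)  # frames ~1 second apart
loss = nn.functional.mse_loss(predictor(now), fc7(future))  # feature regression
loss.backward()

# At test time: classify the *predicted* representation to anticipate actions.
anticipated = action_classifier(predictor(now)).argmax(-1)
```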
Adversarial examples in the physical world, Google Brain and OpenAI. Ian Goodfellow, who published a seminal paper introducing generative adversarial networks in 2014, now presents work demonstrating the extent to which adversarial examples (input data that is subtly perturbed yet perceptually indistinguishable from the original) can fool ML classifiers. Previous studies assumed direct access to the ML classifier, such that adversarial examples are fine-grained per-pixel modifications fed directly to the model. This work instead shows that adversarial examples created to fool a pre-trained ImageNet Inception classifier are also misclassified when the images are perceived through a cell-phone camera (i.e. a Nexus 5 takes the photo, which is then run through the classifier). *Here’s a longer write-up in MIT Tech Review.
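One standard recipe from this line of work is the fast gradient sign method (FGSM) from Goodfellow’s earlier adversarial-examples paper: nudge every pixel a tiny step in the direction that increases the classifier’s loss. A toy sketch, using a stand-in model rather than Inception and an untuned epsilon:

```python
# FGSM sketch: perturb an image along the sign of the loss gradient.
import torch
import torch.nn as nn
from torchvision import models

# Stand-in classifier (the paper attacks a pretrained ImageNet Inception).
model = models.resnet18(weights=None).eval()

x = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in image in [0, 1]
label = torch.tensor([0])                           # its (assumed) true class

loss = nn.functional.cross_entropy(model(x), label)
loss.backward()

eps = 0.007  # perturbation budget; small enough to be invisible
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()

# x_adv looks identical to x yet is often misclassified; the paper shows the
# effect can even survive being printed and re-photographed by a phone.
print(model(x).argmax(1).item(), model(x_adv).argmax(1).item())
```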
DeepWarp: Photorealistic image resynthesis for gaze manipulation, Skolkovo Institute of Science and Technology. The authors present an end-to-end, feed-forward neural architecture that can ingest an image of an eye region and synthesise an image of the same eye but with its gaze redirected by an angle that is arbitrarily selected by the user. The model is trained on pairs of images corresponding to eye appearance before and after redirection, where the network has to predict the warping field of the eye. This problem of gaze manipulation is of longstanding interest in computer vision research and the ability to synthesise output images of high realism is relevant for photo and video post-production editing, as well as digital avatars. *This project also provides example results, including President Obama rolling his eyeballs. Pretty neat!
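In spirit (the paper’s actual architecture is coarse-to-fine and more involved), the model predicts a per-pixel sampling offset conditioned on the desired angle and bilinearly warps the input eye image. A minimal, assumed sketch:

```python
# Warping-based gaze resynthesis sketch: a CNN predicts (dx, dy) offsets
# from (eye image, redirection angle); output is bilinearly resampled.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GazeWarp(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 image channels + 1 channel broadcasting the redirection angle
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1), nn.Tanh(),  # (dx, dy) offsets
        )

    def forward(self, eye, angle):
        B, _, H, W = eye.shape
        angle_map = angle.view(B, 1, 1, 1).expand(B, 1, H, W)
        offsets = self.net(torch.cat([eye, angle_map], dim=1))  # (B,2,H,W)
        # Identity sampling grid in [-1, 1], plus the predicted offsets.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
        grid = torch.stack([xs, ys], dim=-1).expand(B, H, W, 2)
        grid = grid + offsets.permute(0, 2, 3, 1) * 0.1  # keep warps small
        return F.grid_sample(eye, grid, align_corners=True)

out = GazeWarp()(torch.rand(1, 3, 48, 64), torch.tensor([15.0]))  # 15° up
```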
Here are several resources commenting on key aspects of AI research:
Adam Geitgey has now published four parts to his terrific “Machine Learning is Fun” series, which has walk-throughs of what machine learning is (Part 1), how deep learning works with a practical example (Part 2), how to use deep learning for image recognition (Part 3) as well as face recognition (Part 4).
From the commercial standpoint, many investors and product owners talk about data being the core differentiator for companies utilising AI techniques. This piece by a Kaggle Grandmaster is a great read because it demonstrates that the majority of time spent building learning systems is actually on pre-processing (data cleaning and integration, largely). So yes, data is important, but so is your ability to handle it properly before modelling.
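If you’ve never lived this, here’s a flavour of what that pre-processing looks like in practice; a pandas sketch on invented data (all column names and rules are made up for the example):

```python
# The unglamorous cleaning + integration that dominates real ML projects.
import pandas as pd

# Two messy "sources" standing in for real exports (invented data).
claims = pd.DataFrame({
    "claim_id": [1, 1, 2, 3],
    "policy_id": [10, 10, 11, 12],
    "amount": ["1200", "1200", "n/a", "95000"],
    "filed_at": ["2016-07-01", "2016-07-01", "2016-07-03", "bad date"],
})
policies = pd.DataFrame({"policy_id": [10, 11, 12],
                         "region": ["UK", "FR", "DE"]})

df = (
    claims.drop_duplicates(subset="claim_id")        # kill duplicate records
    .assign(
        amount=lambda d: pd.to_numeric(d["amount"], errors="coerce"),
        filed_at=lambda d: pd.to_datetime(d["filed_at"], errors="coerce"),
    )
    .dropna(subset=["amount", "filed_at"])           # drop unparseable rows
    .merge(policies, on="policy_id", how="left")     # integrate the sources
)
print(df)  # only now is the data ready for feature engineering/modelling
```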
Facebook AI Research director, Yann LeCun, has been on a roll recently replying to threads on Quora relating to new research frontiers in AI. My favorite questions are: “are there some exciting but overlooked developments in ML research?”, “when will we see a theoretical and mathematical foundation for deep learning”, “what vision/perception problems is deep learning close to solving” and “what will the likely AI advancements be in the next 5–10 years”.
Want to train an LSTM RNN to write books for you? Check this out, with example Harry Potter and Star Wars passages.
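The recipe is the classic char-rnn one: train an LSTM to predict the next character of a corpus, then sample from it one character at a time. A toy, untrained sketch (corpus and sizes invented):

```python
# Character-level LSTM: teacher-forced training step, then free-running sampling.
import torch
import torch.nn as nn

text = "It was the best of times, it was the worst of times. "  # toy corpus
chars = sorted(set(text))
ix = {c: i for i, c in enumerate(chars)}

class CharLSTM(nn.Module):
    def __init__(self, vocab, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, 32)
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.out(h), state

model = CharLSTM(len(chars))

# One training step: predict each next character from the one before it.
seq = torch.tensor([[ix[c] for c in text]])
logits, _ = model(seq[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, len(chars)), seq[:, 1:].reshape(-1))
loss.backward()

# Sampling: feed the model's own output back in, one character at a time.
x, state, out = seq[:, :1], None, []
for _ in range(40):
    logits, state = model(x, state)
    x = torch.multinomial(logits[:, -1].softmax(-1), 1)
    out.append(chars[x.item()])
print("".join(out))  # gibberish until trained on real books
```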
Venture capital financings and exits
52 companies raised $309m over 55 financing rounds from 99 investors. Median deal size was $2.8m at a pre-money valuation of $6.3m. Deals include:
Turi (formerly Dato/GraphLab) was acquired by Apple for $200m! The company was founded in 2013 in Seattle and provided a large ecosystem of ML products deployed as microservices for developers. It employed 66 people (+56% in the last two years), of whom 27% worked in engineering and 17% in research. The company raised two rounds worth a total of $25m from NEA, Madrona Venture Group, Opus Capital and Vulcan Capital, reaching a $75m post-money valuation.
Nervana Systems was acquired by Intel in a deal worth at least $350m! The San Diego-based company was founded in 2014 and developed a fully-optimised software and hardware stack for deep learning. It raised $24m in three rounds from Data Collective, AME Cloud, Lux Capital, Allen & Co and others, most recently at an $83m post-money. Nervana employed 47 people (95% of whom were in engineering/research/IT).
Darktrace, a London-based enterprise cybersecurity company focusing on threat detection, raised $65m from KKR at a reported $400m post-money.
Sift Science, an SF-based fraud detection and prevention company, raised a $30m Series C led by Insight Venture Partners at a $120m post-money.
CS Disco, a Houston-based company helping lawyers find evidence by reviewing documents in major cases and investigations, raised an $18m Series C led by Bessemer Venture Partners.
Behavox, a London-based company providing software for compliance officers and forensics teams to detect cases of market abuse, fraud, collusion or reckless behaviour, raised a £2m Series A led by Hoxton Ventures and Promus Ventures.
FiveAI, a London-based autonomous vehicle software company, raised a $2.7m Seed round led by Amadeus Capital Partners.
Innoviz Technologies, an Israeli company developing a high definition solid state LiDAR, object detection, sensor fusion and mapping products for autonomous vehicles, raised a $9m Series A from Magma Venture Partners and others.
---
Anything else catch your eye? Do you have feedback on the content/structure of this newsletter? Just hit reply!
I’m actively looking for entrepreneurs building companies that build/use AI to rethink the way we live and work.