News in artificial intelligence and machine learning
From 3rd May through 1st June 2016. Referred by a friend? Sign up here. Want to share? Give it a tweet.
*This newsletter started as a collection of links 6 months ago with 40 believers. Now, we're 1,017 subscribers! Thanks for the support :)
*Reminder #1: only 4 weeks left until our 2nd annual Playfair AI Summit 2016 in London featuring a dozen brilliant academic researchers and entrepreneurs showcasing impactful applications of AI in the real world. Just hit reply if you want to hear more.
*Reminder #2: I'm now co-organising London.AI, the city's product-focused applied AI meetup with 100 engineers, researchers and entrepreneurs. Sign up here for news or reply if you'd like to speak.
Technology news, trends and opinions
Healthcare, cars and digital assistants
Making the front page of next month’s WIRED, Cade Metz writes a tour-de-force piece on DeepMind’s AlphaGo. While AlphaGo already feels like ancient history in the fast-moving AI world, the piece makes a powerful case for how human-machine collaboration could help us improve our own mastery of complex tasks.
Google DeepMind also made headlines for the data sharing agreement it signed with three hospitals that are part of the UK’s National Health Service. The data on 1.6 million patients includes live and historical medical records stretching back 5 years. Its stated use is for “real time clinical analytics, detection, diagnosis and decision support”, with an initial focus on the company’s Streams app for measuring the risk of acute kidney injury. While many engaged in heated debate over data privacy (see this headline, courtesy of The Daily Mail), my view is that medicine should be moving towards real-time monitoring of health and prediction of future conditions. Data sharing is a trade we should choose to make so that medicine can become proactive rather than reactive.
Switching over to autonomous driving, Tesla’s Director of Autopilot Programs, Sterling Anderson, announced that their fleet of Autopilot hardware-equipped vehicles has collectively driven 780M miles, of which 100M were driven with Autopilot engaged. To put this into context, Tesla is now capturing more miles’ worth of data (camera, GPS, radar and ultrasound) in a day than Google’s program has logged since its inception in 2009.
Big news in the virtual assistant world! Viv, the stealth AI assistant built by the Siri founders, was finally unveiled to the public at TechCrunch in New York (watch it here) and Pioneers in Vienna. This is noteworthy because Viv takes a different approach from Siri or Cortana, one the team calls a “dynamically evolving cognitive architecture system”. In short, Viv listens to a user query, understands the intent by parsing it into action and concept objects, and creates a sequential plan of action on the fly. Of note, Viv draws on an ecosystem of 3rd-party services for the knowledge needed to fulfil the query. If we’re to depend on digital assistants for every possible use case we can imagine, this approach is certainly more scalable than hard-coding connections between predefined words/phrases, domain expertise and ontologies.
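To make the idea concrete, here is a minimal, entirely hypothetical sketch of such a dynamic planner: a query is parsed into action/concept objects and a chain of 3rd-party service calls is composed at runtime rather than hard-coded per phrase. Every function and service name below is invented for illustration and is not Viv’s actual architecture or API.

```python
# Hypothetical sketch of dynamic plan generation (not Viv's real system).

def parse_query(query):
    # Toy intent parser; a real assistant would use a learned NLU model here.
    if "flowers" in query and "mom" in query:
        return {"action": "send_gift",
                "concepts": {"item": "flowers", "recipient": "mom"}}
    raise ValueError("unhandled query")

# Registry of 3rd-party capabilities the planner can draw on (all invented).
SERVICES = {
    "resolve_contact": lambda state: {**state, "address": "123 Main St"},
    "find_florist":    lambda state: {**state, "vendor": "LocalFlorist"},
    "place_order":     lambda state: {**state, "status": "ordered"},
}

def build_plan(intent):
    # Compose a sequential plan from available services based on the parsed
    # intent, rather than from a hand-written rule for this exact phrase.
    if intent["action"] == "send_gift":
        return ["resolve_contact", "find_florist", "place_order"]
    return []

def execute(query):
    intent = parse_query(query)
    state = intent["concepts"]
    for step in build_plan(intent):
        state = SERVICES[step](state)   # each service call enriches the state
    return state

print(execute("send flowers to my mom"))
```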
More or less simultaneously, rumors surfaced that Apple will release a Siri SDK for developers and a competitor to Amazon’s Alexa and Google Home. These products, possibly to be announced at WWDC in two weeks, are believed to use a beefed-up version of Siri that leverages VocalIQ’s technology to retain semantic context across conversations. Watch VocalIQ founder Blaise Thomson describe their system in a rare talk at the Playfair Capital AI Summit 2015. Apple needs to dramatically step up its AI R&D game to have a chance of competing in a world where AI will underpin our digital experiences.
AI takes center stage at Google I/O
Jeff Dean (Google Brain), John Giannandrea (Search) and Aparna Chennapragada (Google Now) took to the stage with Tom Simonite (MIT Tech Review) to discuss Google’s vision for machine learning (video).
After several years of R&D, the company revealed a proprietary application-specific integrated circuit (ASIC), the Tensor Processing Unit, built to run TensorFlow. While the tech specs aren’t public, Google claims a 10x performance-per-watt boost (though it’s not clear over what baseline!). So, Google bets on ASICs, Intel on FPGAs and NVIDIA is long GPUs!
As Sundar mentioned in his shareholder letter last month, Google delved further into the digital assistant world by releasing its rather unimaginatively named Google Assistant. It goes beyond Google Now in that it provides a persistent, cross-device and contextual experience to power a suite of Google products including Google Home (à la Amazon Echo) and Allo (messaging). Welcome, context awareness!
20%: proportion of queries made via the Google Android app that are voice rather than text.
140B: words translated per day with Google Translate, which handles over 100 languages vs. just 2 a decade ago. For comparison, Facebook handles 2B translations per day (approx. 25B words) across 40 languages using its in-house tech (having ditched Bing!).
200M: monthly active users on Google Photos and the number of photo labels automatically applied to user photos. Brilliant showcase for AI.
What to optimise when building AI and tech unemployment
Joshua Bloom, Professor at Berkeley and co-founder of Wise.io, raises the question of how we should optimise the value chain for building AI systems. While contemporary AI R&D focuses on optimising accuracy above all else, we must also consider time to train, time to predict, model size and load time. Indeed, we should articulate the business value tradeoff between these quantifiable variables, with the goal of creating smaller, faster, more stable and interpretable models. As more AI moves into production and is attributed a $ value, this will become paramount.
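As a quick illustration of measuring more than accuracy, here is a minimal sketch using scikit-learn; the dataset and model are placeholders, and the pickled size is only a rough proxy for deployment and load cost.

```python
# Sketch: report accuracy alongside train time, predict time and model size.
import pickle
import time

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_digits(return_X_y=True), random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)

t0 = time.time()
model.fit(X_train, y_train)
train_time = time.time() - t0

t0 = time.time()
accuracy = model.score(X_test, y_test)
predict_time = time.time() - t0

model_size_mb = len(pickle.dumps(model)) / 1e6  # rough proxy for size/load cost

print(f"accuracy={accuracy:.3f}  train={train_time:.2f}s  "
      f"predict={predict_time:.3f}s  size={model_size_mb:.1f}MB")
```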
On the topic of building AI, Moritz shares 10 strategies for creating training data to kickstart the holy grail in AI: the data network effect flywheel. I think the most powerful and aligned strategy is to create vertically-integrated AI products with the user-in-the-loop. Here, the dataset builds itself while the user forms a habitual behaviour. Win-win.
Putting fears of technology-driven unemployment into context, Louis Anslow curates a wonderful selection of historical quotes, illustrations and events on the topic from the 1920s to the present day. One could argue that “robots have been about to take all the jobs for more than 200 years”!
Research, development and resources
Google released SyntaxNet, an open-source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding (NLU) systems (thanks Cole Winans for the heads up!). Machines aren’t at their best when it comes to breaking down the syntax of human language, which can be notoriously ambiguous. Indeed, a 20–30 word sentence can have tens of thousands of potential syntactic structures. With SyntaxNet, a sentence is processed by a feed-forward neural network (without recurrence) that outputs a distribution over possible syntactic dependencies with each incremental word (“hypotheses”). Using a heuristic search algorithm (beam search), SyntaxNet keeps multiple partial hypotheses alive as each word is processed and only discards a hypothesis when more highly-ranked alternatives remain under consideration. The SyntaxNet English-language parser, Parsey McParseface, is the most performant model out there, surpassing human-level accuracy in some cases (see paper results).
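For intuition, here is a toy beam search over partial hypotheses. The transition set and scoring function are stand-ins; SyntaxNet’s real transition system and neural scorer are far richer than this.

```python
# Toy beam search: extend partial hypotheses word by word, keep only the top-k.
import heapq
import math

def beam_search(words, score_transition, transitions, beam_size=8):
    # Each hypothesis is (cumulative log-probability, transitions so far).
    beam = [(0.0, [])]
    for word in words:
        candidates = []
        for logp, history in beam:
            for t in transitions:
                candidates.append(
                    (logp + score_transition(word, history, t), history + [t]))
        # Prune: keep only the highest-scoring partial hypotheses.
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return beam[0]  # best-scoring complete hypothesis

# Example with made-up transitions and a dummy scorer.
best = beam_search(
    words=["Alice", "saw", "Bob"],
    score_transition=lambda w, h, t: math.log(0.6 if t == "SHIFT" else 0.2),
    transitions=["SHIFT", "LEFT_ARC", "RIGHT_ARC"])
print(best)
```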
A team of researchers from Georgia Tech has presented control algorithms that enable a 1/5-scale rally car to drive at the edge of its handling limits on a dirt track while maintaining stability. The car, AutoRally, carries an inertial measurement unit, two front-facing cameras, a GPS, rotation sensors on each wheel to measure speed, an Intel quad-core i7 processor, an Nvidia GPU and 32GB of RAM, and requires no other external sensing or computing resources. The vehicle’s algorithms are first trained on several minutes of piloted driving around the track. Sensor measurements are then used to combine control and planning for autonomous driving (these fun videos showcase its aggressive driving capabilities). Specifically, every 16 milliseconds the vehicle plans 2.5 seconds into the future, computing its trajectory as a weighted average of 2,560 sampled candidate trajectories so as to maintain stability at a given speed. This work could help ensure that the autonomous cars that hit our roads are safe when driving in difficult conditions.
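The planning loop resembles sampling-based model predictive control: sample many candidate control sequences, roll them out through a dynamics model, and average them weighted by cost. Below is a hedged numpy sketch with toy dynamics and a toy cost; none of this is the AutoRally team’s actual implementation.

```python
# Sampling-based trajectory optimisation sketch (toy dynamics and cost).
import numpy as np

HORIZON = 50        # planning steps covering ~2.5 s (illustrative)
N_SAMPLES = 2560    # candidate trajectories, matching the figure above
TEMPERATURE = 1.0

def dynamics(state, control):
    # Placeholder kinematics: state = [x, y, heading], control = [steer, speed].
    x, y, heading = state
    steer, speed = control
    return np.array([x + speed * np.cos(heading),
                     y + speed * np.sin(heading),
                     heading + steer])

def cost(state):
    # Placeholder cost: stay close to the centreline y = 0.
    return state[1] ** 2

def plan(state, nominal_controls):
    noise = 0.1 * np.random.randn(N_SAMPLES, HORIZON, 2)
    controls = nominal_controls + noise          # perturbed control sequences
    costs = np.zeros(N_SAMPLES)
    for i in range(N_SAMPLES):
        s = state
        for t in range(HORIZON):
            s = dynamics(s, controls[i, t])
            costs[i] += cost(s)
    # Cost-weighted average of the sampled control sequences is the new plan.
    weights = np.exp(-(costs - costs.min()) / TEMPERATURE)
    weights /= weights.sum()
    return (weights[:, None, None] * controls).sum(axis=0)

new_controls = plan(np.zeros(3), np.zeros((HORIZON, 2)))
```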
One shot learning with memory-augmented neural networks, Google DeepMind. Learning a new behaviour by drawing valid inferences from small amounts of data (“one-shot learning”) is a particularly complex task for a machine, but a trivial one for a human. This is largely because deep learning models typically rely on gradient-based optimisation to tune the weights of each neuron in the network, an approach that requires lots of data and many iterative passes through the network. When applied to one-shot learning tasks, this strategy performs poorly. Instead, a two-tiered learning approach (“meta-learning”) is considered a better fit. Here, the authors show that neural networks with memory capacity (e.g. DeepMind’s Neural Turing Machine and memory networks) are capable of meta-learning when applied to the Omniglot classification task (1,600 classes with only a few examples per class). The network performs better than the state of the art and can even outperform humans. It does this by slowly learning a useful representation of the raw data and then using external memory to rapidly bind new information.
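To sketch the external-memory mechanism, here is a minimal numpy example of content-based (cosine-similarity) addressing, the ingredient that lets a newly written item be retrieved after a single exposure. The sizes and softmax sharpness are arbitrary, and this is not DeepMind’s model.

```python
# Content-based read from an external memory via cosine similarity.
import numpy as np

def cosine_similarity(key, memory):
    key_norm = key / (np.linalg.norm(key) + 1e-8)
    mem_norm = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + 1e-8)
    return mem_norm @ key_norm

def read(memory, key, sharpness=10.0):
    # Soft attention over memory rows: similar rows get most of the read weight.
    scores = sharpness * cosine_similarity(key, memory)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ memory, weights

memory = np.random.randn(128, 40)   # 128 slots, 40-dimensional contents
memory[7] = np.ones(40)             # "bind" a new item into one slot
value, weights = read(memory, key=np.ones(40))
print(weights.argmax())             # slot 7 dominates the read
```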
Algorithmic transparency via quantitative input influence, Carnegie Mellon University. In the last issue, I featured a piece arguing for algorithmic accountability and model explainability when employing AI to make substantive real-world decisions, especially in fault-intolerant domains like healthcare or defense. Here, the authors report a family of Quantitative Input Influence (QII) measures that describe the extent to which model inputs influence its outputs, accounting for input correlations and individual/combinatorial effects. QII requires access to the model in question and input data with well-understood semantics (e.g. credit decisions and online personalisation), but not images or video. To measure the impact of a given input on an output of interest (e.g. a classification outcome), the authors replace the actual value of that input with a randomly and independently chosen value and compute the probability that this input (actual vs. random value) is pivotal to the classification of interest.
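A rough sketch of that intervention for a single feature follows; it captures the flavour of the pivotal-probability calculation but none of the paper’s machinery for correlated inputs or feature sets. The model is assumed to expose a scikit-learn-style predict method.

```python
# Estimate one feature's influence by randomising it and counting decision flips.
import numpy as np

def influence(model, X, x, feature, n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    base = model.predict(x.reshape(1, -1))[0]
    flips = 0
    for _ in range(n_samples):
        x_perturbed = x.copy()
        # Replace the feature with an independent draw from its marginal in X.
        x_perturbed[feature] = rng.choice(X[:, feature])
        flips += model.predict(x_perturbed.reshape(1, -1))[0] != base
    return flips / n_samples  # probability the feature is pivotal for this decision

# Usage (with any fitted classifier): influence(clf, X_train, X_test[0], feature=3)
```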
Few quick mentions:
Google Brain and OpenAI published a paper, Unsupervised learning for physical interaction through video prediction, describing a predictive model that outputs the motion of raw, unlabelled video pixels relative to the appearance of pixels in previous frames. Trained on videos of 50k robot pushing motions, the network can make long-range predictions about these different actions (see videos here). This work has implications for training agents in the real world without labelled data.
Oxford, CIFAR and Google DeepMind explore the topic of Learning to communicate with deep multi-agent reinforcement learning. The setting involves multiple agents with partial observability and a collective goal of maximising the same discounted sum of rewards. Using two methods, one end-to-end trainable within each agent and the other across agents, the authors show that multiple agents can learn communication protocols in a complex environment involving sequences and raw input images.
Andrej Karpathy published a great post on deep reinforcement learning: why it's a big deal, what it's about, how it developed and what might come next.
Part 1 and Part 2 of an easily digestible review by Adam Geitgey exploring core concepts of machine learning and deep learning with example code and diagrams.
Venture capital financings and exits
$355M worth of deal making (57 financings and 13 acquisitions), including:
Fractal Analytics raised $100M in growth equity for its marketing analytics suite for understanding customer behaviour.
Zoox, the full-stack autonomous car company, raised $20M of a targeted $252M financing round. Aside from Cruise Automation (now owned by GM), Zoox is the only startup among the 12 companies with a permit to test autonomous vehicles in California.
Apixio, which mines clinical charts and medical claims for valuable data on patient chronic conditions and risk, raised a $19.3M Series D led by Bain Capital.
Apical, a London-based developer of imaging and embedded computer vision technology, was acquired by the chip manufacturer ARM for $350M in cash. ARM says that it will use Apical to grow into markets such as connected vehicles, robotics and smart cities. Apical was founded in 2002, employed 100 people and had its technology shipped in more than 1.5B smartphones. Michael Tusch, founding CEO of Apical, owned 96% of the Company — the rest was owned by 4 individuals. No venture money in sight!
---
Anything else catch your eye? Just hit reply! I’m actively looking for entrepreneurs building companies that build/use AI to rethink the way we live and work.