News in artificial intelligence and machine learning you should know about
Covering 18th March through 12th April. Referred by a friend? Sign up here. Want to share? Give it a tweet.
*I recorded a two-part podcast discussion on AI with Nick Moran @ The Full Ratchet. Check out Part 1 here and let me know what you think! Loads of tough questions :) Part 2 to come.
Technology news, trends and opinions
Health-related AI 🔬💊
Mapping the brain to build better machines. The Intelligence Advanced Research Projects Activity (IARPA) has funded the Machine Intelligence from Cortical Networks program (MICrONS) to the tune of $100m over 5 years. Three teams will monitor activity from tens of thousands of neurons in a target cube of the visual cortex to create a 3D model of neuronal circuitry. This model will be used to discover rules governing the circuit that could help us understand feedback loops between neurons, in order to build more biologically inspired artificial neural networks.
Massachusetts General Hospital in Boston, the #1 hospital in the US, launched their Clinical Data Science Center to create a hub focused on using AI technologies to diagnose and treat disease. We've seen a number of startups set out to tackle this problem, but seeing a heavyweight healthcare provider make this announcement, with Nvidia as a founding technology partner, is big news.
Autonomous vehicles 🏁🚗
Toyota announces their new Ann Arbor, MI-based AI and robotics research site following those in Palo Alto, CA and Cambridge, MA. Of note, Toyota already has two Technical Centers conducting autonomous vehicle research in the area, which is also home to a 23-acre mini-city testing ground for pilot vehicles. Remember, Toyota is also a likely bidder for the Google-owned Boston Dynamics, the maker of Big Dog and humanoid robots.
A convoy of supervised self-driving trucks completes the first European cross-border trip, with trucks from manufacturers DAF, Daimler, Iveco, MAN, Scania and Volvo. This form of transport looks like low-hanging fruit!
Hardware and developer tools ⌨
Microsoft presented a new vision project at their Build developer conference in SF. Seeing AI (demo video here) uses computer vision and NLP to describe a person’s surroundings, read text, answer questions and identify emotions on people’s faces. This is akin to Baidu's DuLight product and similar in result to Facebook's recent announcement making the site 'accessible' to the blind. Microsoft also announced CaptionBot to caption any photograph, as well as their new Bot Framework (the plumbing required to manage and deploy bots) and Bot Builder (including their natural language understanding product, LUIS). This goes to show that 'bot technology' in itself is unlikely to be a core differentiator in the market.
Three movements on the deep learning hardware front! Nvidia announced the fruit of a $2bn R&D programme, the Pascal architecture-based Tesla P100 GPU, as well as the world's first deep learning supercomputer, the DGX-1. The DGX-1 combines eight 16GB Tesla P100 GPUs in a single box to deliver the throughput of 250 CPU-based servers, along with their networking, cables and racks. These systems train 12x faster than four-way Maxwell architecture-based systems from a year ago. Meanwhile, Lawrence Livermore and IBM collaborate to build a new brain-inspired supercomputer.
Tech concepts, explained
MIT Tech Review produced a well-rounded compilation of interviews (e.g. Jeff Dean/Google, Andrew Ng/Baidu), project profiles (e.g. Skype Translator, Toyota driverless cars), important research papers and opinion pieces to showcase how AI is hitting the mainstream.
Deep learning: the truth behind the hype - a very solid read on symbolic vs. connectionist models in AI, and how the path forward might lie in uniting the two.
What is behind the success of AI systems these days and how does it work? This annotated presentation walks through lots of important concepts. Still wondering whether these technical advances will see broader application? This (nontechnical) piece argues they will, by leaning on the ability of networks to capture 'intuition'.
Research, development and resources
Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks, University of Washington, code here. 3D movies are growing in popularity (remember Avatar in 2009?), but they're expensive to produce using either 3D cameras or 2D video manually converted to 3D. To automatically convert 2D to 3D, one needs to infer a depth map for each pixel in an image (i.e. how far each pixel is from the camera) such that an image for the opposing eye can be produced. Existing automated neural network-based pipelines require image-depth pairs for training, which are hard to procure. Here, the authors use stereo-frame pairs that exist in already-produced 3D movies to train a deep convolutional neural network to predict the novel view (right eye's view) from the given view (left eye's view) using an internally estimated soft (probabilistic) disparity map.
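To make the 'soft disparity map' idea concrete, here is a minimal numpy sketch of the selection step: the right view is rendered as a probability-weighted sum of horizontally shifted copies of the left view. The names, shapes and shift direction are my own illustration, not the paper's code.

```python
import numpy as np

def render_right_view(left, disparity_probs):
    """Illustrative Deep3D-style 'selection' step.

    left            : (H, W, 3) left-eye image
    disparity_probs : (D, H, W) per-pixel softmax over D candidate disparities,
                      as predicted by the convolutional network
    Returns the synthesised right-eye view as a probability-weighted sum of
    horizontally shifted copies of the left image.
    """
    D, H, W = disparity_probs.shape
    right = np.zeros_like(left, dtype=np.float32)
    for d in range(D):
        # shift the left image by d pixels (wrap-around at the border is
        # ignored here for simplicity)
        shifted = np.roll(left, shift=-d, axis=1)
        right += disparity_probs[d][..., None] * shifted
    return right

# Toy usage: a uniform disparity distribution simply averages the shifted views.
left = np.random.rand(4, 8, 3).astype(np.float32)
probs = np.full((5, 4, 8), 1.0 / 5, dtype=np.float32)
print(render_right_view(left, probs).shape)  # (4, 8, 3)
```

Because this rendering step is differentiable, the network can be trained end-to-end against the true right-eye frame without ever seeing ground-truth depth.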
“Why Should I Trust You?” Explaining the Predictions of Any Classifier, University of Washington. Code here. A key hurdle to the mass adoption of machine learning models in fault-intolerant commercial settings (e.g. finance, healthcare, security) is the ability to provide explanations as to why certain predictions were made. Many models, especially neural networks, are today functionally black boxes, with trust in their performance relying on cross-validation accuracy. The authors present a model-agnostic algorithm that produces textual or visual artifacts using interpretable representations of underlying data (not necessarily a model's features) to provide the user with a qualitative understanding of what a given model is basing its classification predictions on. This is very nifty work. Further explanation here.
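The core recipe is simple: perturb the instance being explained, query the black box on the perturbations, weight them by proximity to the original, and fit an interpretable (here, linear) surrogate locally. Below is a toy sketch of that idea under my own assumptions (Gaussian perturbations, an exponential proximity kernel, a ridge surrogate); it is not the authors' LIME implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(predict_proba, x, n_samples=500, scale=0.5):
    """Explain one prediction of an arbitrary black-box classifier.

    predict_proba : callable mapping an (n, d) array to an (n,) array of
                    probabilities for the class of interest (the black box)
    x             : (d,) instance to explain
    Returns per-feature weights; large magnitudes indicate features the
    black box relied on in the neighbourhood of x.
    """
    d = x.shape[0]
    # 1. Perturb the instance around x
    samples = x + np.random.normal(0.0, scale, size=(n_samples, d))
    # 2. Query the black box on the perturbations
    preds = predict_proba(samples)
    # 3. Weight samples by proximity to x
    dists = np.linalg.norm(samples - x, axis=1)
    weights = np.exp(-(dists ** 2) / (2 * scale ** 2))
    # 4. Fit an interpretable linear surrogate on the weighted samples
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(samples, preds, sample_weight=weights)
    return surrogate.coef_

# Toy usage with a hand-written "black box"
black_box = lambda X: 1 / (1 + np.exp(-(3 * X[:, 0] - 2 * X[:, 1])))
print(explain_locally(black_box, np.array([0.2, -0.1])))
```

The surrogate's coefficients are the explanation: they describe the model's behaviour only near x, which is exactly what makes the approach model-agnostic.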
Dynamic Memory Networks for Visual and Textual Question Answering, MetaMind. A year ago, the MetaMind team published the dynamic memory network, a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. In this work, the team introduces a new input module to handle images instead of text, such that the network can now answer natural language questions from its understanding of features in the image. Specifically, the new input module splits an image into small local regions and treats each region as the equivalent of a sentence in the textual version.
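A rough numpy sketch of that region-as-sentence idea is below: each spatial cell of a CNN feature map is projected into the same embedding space the network uses for sentences, and the resulting vectors are fed to the episodic memory module as 'facts'. The shapes and the projection are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def image_to_facts(feature_map, W_proj):
    """Turn a CNN feature map into a sequence of 'sentence-like' facts.

    feature_map : (H, W, C) activations from a convolutional layer,
                  e.g. (14, 14, 512)
    W_proj      : (C, E) projection into the text embedding space of size E
    Returns (H*W, E) region embeddings, consumed by the dynamic memory
    network exactly as it would consume a sequence of sentence embeddings.
    """
    H, W, C = feature_map.shape
    regions = feature_map.reshape(H * W, C)   # one vector per local region
    return np.tanh(regions @ W_proj)          # project into the 'sentence' space

# Toy usage: 14x14 regions become 196 facts of embedding size 80
facts = image_to_facts(np.random.rand(14, 14, 512), np.random.rand(512, 80))
print(facts.shape)  # (196, 80)
```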
The Curious Robot: Learning Visual Representations via Physical Interactions, Carnegie Mellon University. The task of learning visual representations in the real world with CNNs typically requires a large dataset of labeled image examples. This group instead explores whether a Baxter robotic arm can learn visual representations only by performing four physical interactions: push, poke, grasp and active vision. They show that by experiencing 130k of these interactions with household objects (e.g. cups, bowls, bottles) and using each data point for back-propagation through a CNN, the network can learn generalised features that help it classify household object images on ImageNet without having seen any labeled images before.
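Architecturally, the trick is to share one convolutional trunk across the interaction tasks, so that every push, poke or grasp outcome supplies a gradient to the same visual representation. Here is a small PyTorch sketch of that shared-trunk, multi-head pattern; the layer sizes, head outputs and number of heads are placeholders of my own, not the paper's network.

```python
import torch
import torch.nn as nn

class InteractionNet(nn.Module):
    """One shared convolutional trunk supervised by several physical
    interaction tasks instead of image labels (illustrative sketch)."""

    def __init__(self):
        super().__init__()
        # Shared trunk that learns the visual representation
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One head per interaction; output sizes are placeholders
        self.grasp_head = nn.Linear(64, 18)  # e.g. grasp-angle bins
        self.push_head = nn.Linear(64, 4)    # e.g. push direction / outcome
        self.poke_head = nn.Linear(64, 1)    # e.g. tactile response

    def forward(self, image):
        z = self.trunk(image)
        return self.grasp_head(z), self.push_head(z), self.poke_head(z)

# Each logged interaction trains its own head, but all gradients flow back
# into the shared trunk, which is what ends up transferring to recognition.
net = InteractionNet()
grasp, push, poke = net(torch.randn(2, 3, 128, 128))
print(grasp.shape, push.shape, poke.shape)
```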
Deep learning for chatbots, part 1 - Introduction. Given the excitement around chat interfaces and their ability to evolve user experiences for today's generation of technophiles, here's a piece that describes where we're at technically, what's possible and what will stay nearly impossible for at least a little while. This series will follow up with implementation details in upcoming posts.
Venture capital financings and exits
34 investment rounds totalling $116m of announced value and 5 acquisitions, including:
x.ai, the (increasingly) automated NLP-based digital personal assistant for meeting scheduling, raised a $23m Series B round led by Two Sigma Ventures. Hats off to Dennis and the team for a great demonstration of how human-AI collaboration can tackle a clear workflow problem. Watch him present at this year's Virtual Assistant Summit in SF by Re.Work.
Twiggle, the fairly quiet Tel Aviv-based startup working on an improved core technology stack focused on e-commerce search, announced a $12.5m Series A led by Naspers, the publicly traded South African internet and media group.
Kreditech, the German online lender underwriting loans using non-traditional data points, closed out the final $11m of its $103m Series C with an investment from the International Finance Corporation (a division of The World Bank). *Jose Garcia Moreno-Torres, Kreditech's Chief Data Science Officer, is presenting at our second Playfair AI event on July 1st in London.
Gauss Surgical, the maker of Triton, an FDA-cleared mobile vision system running on iPad that accurately estimates intra-operative hemoglobin and blood loss on sponges in real time, raised a $12.6m Series A led by Providence Ventures.
Drive.ai, the stealth autonomous vehicle software company founded by Carol Reiley (who incidentally is also Andrew Ng's partner, of Stanford/Google/Baidu fame), raised a $12m Seed round from undisclosed investors.
Salesforce acquired MetaMind, founded by Stanford PhD Richard Socher, who was working on NLP and later vision, for an undisclosed sum (purportedly an acquihire). The business had raised $8m from Khosla Ventures and Salesforce Founder/CEO Marc Benioff. Of note, Richard writes "[Salesforce will use MetaMind to] automate and personalize customer support, marketing automation, and [improve] many other business processes. We'll extend Salesforce's data science capabilities by embedding deep learning within the Salesforce platform." Very exciting indeed.
Anything else catch your eye? Just hit reply! I’m actively looking for entrepreneurs building companies that build/use AI to rethink the way we live and work.