How Nutonian’s Machine Intelligence Model Shows How People And Technology Can Be Partners

There’s a great scene in “The Hunt for Red October” in which Seaman Jones, played by the actor Courtney B. Vance, is tracking down Russian submarines based on advanced analysis of underwater sounds. In the movie, Seaman Jones notices a sound categorized by the machine learning algorithms as seismic activity. It turns out that this sound was actually the signature of a top-secret caterpillar drive created by the Russians. Seaman Jones understands that the system was originally created to recognize and track seismic activity and that, when it finds something it doesn’t recognize, it just calls it seismic activity. The system “comes home to mama,” as he puts it.

This is a great example of a powerful machine learning system working in tandem with a talented person and creating an important result. The analyst used the machine learning system to put together a larger picture. This only happened because the analyst was knowledgeable not only about the domain, but also about the way the machine learning system worked. Seaman Jones knew the personality of the system and was able to use this knowledge to make the system more useful.

Here’s what the captain said: “Have I got this straight, Jonesy? A $40 million computer tells you you’re chasing an earthquake, but you don’t believe it, and you come up with this on your own?”

The problem with most modern machine learning systems is that they are built just like the one Seaman Jones used. The systems have a personality, but usually that personality is tribal knowledge. The systems often have a pipeline of analysis, but the analyst isn’t privy to what is happening along that pipeline. In other words, the systems are black boxes. If you know how the black box works, you can make better use of it than someone who doesn’t.

But it shouldn’t be this way. As I pointed out in “The Man-Machine Framework: How to Build Machine-Learning Applications the Right Way,” the right vision for machine learning is as a prosthetic, something that extends but does not replace human intelligence. The article describes how machines are great at iconic and indexic thinking, while humans provide more advanced symbolic thinking, the crystallization of the story from fragmentary evidence. Seaman Jones engaged in this sort of symbolic thinking, aided by the power of machine learning.

Machine Intelligence: A Model for Open Book Systems

How, then, do you build machine learning systems to support this new collaborative model? Nutonian, a Boston-based startup, has a vision of machine intelligence that is the most advanced example I’ve come across of a system that brings the idea of a man-machine partnership to life in a product. Instead of a black box, Nutonian’s system for creating predictive models is an open book that describes its workings to those who use it. You don’t have to rely on tribal knowledge. If you want to dive in and see what is happening inside of Nutonian, the product is created to give you a guided tour.

To understand the way that Nutonian opens itself up, we must first review what Nutonian is trying to do through its Eureqa product. Nutonian’s mission is to empower the widest possible group of people to use the power of data science. Nutonian achieves this mission by automating the process of discovering predictive models, one of the core activities of data science. Nutonian’s Eureqa product works its magic by starting with a series of independent variables and a dependent variable that we are trying to predict. Using a proprietary blend of advanced statistical techniques within an evolutionary search process, Eureqa industrializes the discovery and creation of predictive models. Here’s where we can start describing how Eureqa is an open book.

First of all, you don’t just get one model, but a range of models ranked along two dimensions, accuracy and simplicity. Accuracy tells you how strong the prediction is; simplicity tells you how easy it is to see how the model works. If you don’t have a window into how the model works, it would be natural to choose the most accurate model, no matter how complex it is. But if you can evaluate the underlying assumptions driving a model, you can apply Occam’s Razor and choose a simpler model that is not only still accurate, but easier and quicker to apply in practice, especially when certain machine-generated assumptions don’t match business realities. Usually, when using Eureqa, analysts gravitate toward the model that achieves the most accuracy in as simple a way as possible.
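The accuracy-versus-simplicity ranking can be illustrated with a small sketch. This is not Eureqa's code: the `Candidate` class, the error and complexity numbers, and the `pareto_front` helper are all hypothetical, chosen only to show how a list of candidate models can be reduced to those that no other model beats on both dimensions at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate model with made-up scores."""
    name: str
    error: float      # lower means more accurate
    complexity: int   # lower means simpler


def pareto_front(candidates):
    """Keep models that no other model matches or beats on both dimensions."""
    front = []
    for c in candidates:
        dominated = any(
            o.error <= c.error
            and o.complexity <= c.complexity
            and (o.error < c.error or o.complexity < c.complexity)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    # Present the survivors from simplest to most complex.
    return sorted(front, key=lambda c: c.complexity)


# Illustrative numbers only; they do not come from any real run.
candidates = [
    Candidate("linear", 0.30, 3),
    Candidate("quadratic", 0.12, 7),
    Candidate("bloated quadratic", 0.12, 15),
    Candidate("deep expression", 0.10, 40),
    Candidate("noisy", 0.35, 9),
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.name}: error={c.error}, complexity={c.complexity}")
```

In this toy list, an analyst applying Occam's Razor might well pick the "quadratic" model: it gives up a little accuracy compared with the most complex survivor but is far easier to inspect.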

Second of all, the system describes in a simple form the way each model works, including how much variance is explained by each type of data, so you can get a feel for what the model is like. This sort of explanation can lead an analyst to consider new types of data that may be driving a system. In addition, Eureqa provides an interactive method to simulate changes to each model, so you can evaluate in real time how it would react to different inputs. Real-time simulation can be far more visually informative than static numbers and graphs in helping users understand a complex system.
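The idea of simulating changes to a model can be sketched in a few lines. The model below, its coefficients, and the `what_if` helper are invented for illustration and are not part of Eureqa; the point is only that once a model is an explicit formula, an analyst can hold most inputs at a baseline and sweep one of them to see how the prediction responds.

```python
def predicted_sales(price, ad_spend):
    """A hypothetical, human-readable model; the coefficients are made up."""
    return 500.0 - 30.0 * price + 4.0 * ad_spend ** 0.5


def what_if(model, baseline, name, values):
    """Vary one input across `values` while holding the others at baseline."""
    rows = []
    for v in values:
        inputs = dict(baseline, **{name: v})
        rows.append((v, model(**inputs)))
    return rows


baseline = {"price": 10.0, "ad_spend": 100.0}
sweep = what_if(predicted_sales, baseline, "price", [8.0, 10.0, 12.0])

for price, sales in sweep:
    print(f"price={price:5.1f} -> predicted sales={sales:6.1f}")
```

An interactive tool would redraw a chart as the analyst drags a slider; the loop here is the same calculation in static form.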

Third of all, you can make adjustments to the model and then use it to re-seed the process to begin another cycle of assisted analysis. This is a crucial way in which the man-machine partnership comes to life, because it allows users to quickly apply and encode their domain expertise into the machine. One of the main weaknesses of data-driven modeling is the inability of the machines to identify spurious correlations. For example, if you happen to turn on the coffee machine at the same time that you turn on the assembly line every morning, machine learning can’t tell which is significant. But you, the human, can, and you can swap out the “coffee” signal for the “assembly line” signal as the true causative agent. Similarly, from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer (both dropped sharply). A machine may not be able to tell the difference, but a human can understand that IE’s market share has no causal connection to the US murder rate.
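The coffee machine example can be made concrete with a short sketch. The data below is fabricated for illustration, and the signal names and the `allowed_signals` set are hypothetical. It shows that two signals that always switch on together are statistically indistinguishable, so the choice between them has to come from the analyst.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up hourly readings: both switches are flipped at the same time.
signals = {
    "coffee_on": [0, 0, 1, 1, 1, 1, 0, 0],
    "line_on": [0, 0, 1, 1, 1, 1, 0, 0],
}
power_kw = [5, 6, 52, 55, 54, 53, 6, 5]

scores = {name: pearson(vals, power_kw) for name, vals in signals.items()}
print(scores)  # the two scores are identical, so the data cannot pick one

# The analyst encodes domain knowledge: only the assembly line can
# plausibly drive factory power draw, so only that signal is kept.
allowed_signals = {"line_on"}
kept = {name: vals for name, vals in signals.items() if name in allowed_signals}
```

Re-seeding a search with only the `kept` signals is one simple way an analyst's judgment could be fed back into the next modeling cycle.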

Fourth, Eureqa has a “Signal Discovery Engine” that helps analysts find the most important data from among many sources. Signals are another name for the statistical idea of features, which in essence are ways to group and summarize data to find the information that helps to predict future outcomes. Good features lead to good algorithms. Because Eureqa is an open book, analysts can see which signals have been discovered and then put them to use in any other analytical work they are doing outside of Eureqa.

Finally, Eureqa explores the entire space of models, creating new ones that may never have occurred to other machine learning methods or human analysts. Contemporary machine learning relies on fitting data to an existing algorithm (making plenty of assumptions about the data along the way), then seeing how well the data actually fits. Instead, Eureqa’s free-form modeling starts from scratch to determine the optimal model. Eureqa also expands the analysis and creates models using methods beyond what a particular human analyst may know or have access to or have time to program and test.
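To give a feel for what searching a space of model forms means, here is a deliberately tiny sketch. Eureqa's actual method is a proprietary evolutionary search and is not reproduced here; this example swaps in a brute-force enumeration over a handful of hypothetical building blocks and coefficients, and the data is generated from a known formula so the search has something to recover.

```python
import itertools
import math

# Hypothetical building blocks and coefficient grid for the toy search.
BLOCKS = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
}
COEFFS = [-2, -1, 1, 2]


def search(xs, ys):
    """Try every a*f(x) + b*g(x) and return (squared error, formula)."""
    best = None
    for (n1, f1), (n2, f2) in itertools.combinations(BLOCKS.items(), 2):
        for a, b in itertools.product(COEFFS, repeat=2):
            err = sum((a * f1(x) + b * f2(x) - y) ** 2 for x, y in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, f"{a}*{n1} + {b}*{n2}")
    return best


# Synthetic data from y = 2x^2 + x; the search is not told this form.
xs = [-2, -1, 0, 1, 2, 3]
ys = [2 * x * x + x for x in xs]

best = search(xs, ys)
print(best)
```

The search is handed no functional form up front; it assembles one from parts and reports it as a readable formula. A real system would explore a vastly larger space, which is why an evolutionary strategy is used instead of exhaustive enumeration.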

The point is that Nutonian guides you through its internal workings, showing you the important details. With all of this information in view, the symbolic intelligence of the analyst, the ability to put the big-picture story together, can come into play.

The principles Nutonian uses to create the open book, such as offering choices, creating simple explanations, and allowing the analyst to influence and control the system, could become a checklist for how to create a machine learning product that is an open book.

What Nutonian has done, and the next generation needs to do, is to help Seaman Jones and analysts like him chase after earthquakes and find Russian subs. In this way, people and technology create a powerful partnership.

Dan Woods is on a mission to help people find the technology they need to succeed. Users of technology should visit CITO Research, a publication where early adopters find technology that matters. Vendors should visit Evolved Media for advice about how to find the right buyers. See list of Dan’s clients on this page.



