You are probably the type of executive who wants to fully understand the emerging opportunities in tech but, when it comes to artificial intelligence, is left wondering if the field is just full of hype. There certainly has been plenty of hype lately and, combined with all of those articles about the looming threat from advances in the field, the claims are straining credibility. Unfortunately, this confusion is somewhat by design because there are as many truly astounding benefits as there are exaggerations. The fact is, no executive leader can afford to entirely lose track of the rapid advances in technology, especially where artificial intelligence is concerned. Work in the broad field of artificial intelligence enables more efficiency, personalized treatment of your customers, automation that can competitively distinguish your business, and the opportunity to identify and troubleshoot problems before they have a catastrophic impact.

Below I have broken down some of the vocabulary, its origin and its context in order to help the busy executive sound smart when conversing on the topic. No executive can afford to ignore this emerging opportunity, and certainly no busy executive has time to read an “Introduction to Artificial Intelligence” textbook. Hopefully, just exploring the origin of these terms and how they have evolved into our modern vocabulary will help you distinguish between practical application and marketing hype.

Lately, we have heard plenty about Artificial Intelligence (AI), Machine Learning, Deep Learning, and Data Science. Anyone remember when Data Mining was the big hype? What I will highlight in this treatment is that these terms are all deeply interdependent, overlapping and even inseparable. We’ll go through the history, but it’s important to recognize there are a lot of people in this field with similar skills. All of these people are trying to find work because paychecks are fun, and the value of their skills hasn’t always been immediately clear to businesses. Unfortunately, this has led to cycles of over-promising and under-delivering. The trend has been so predictable over the decades that the moment when people lose enough faith in artificial intelligence to stop investing has come to be referred to as an AI Winter. The advances have accomplished tremendous things for us, but they take time. In order to capture the interest and budget of business stakeholders, the immediate benefits of these techniques get exaggerated. As one term falls out of favor and appears worth less investment, another pops up with greater promise. I don’t believe the potential of the field is diminished in any way; only that the appetite for immediate results drives hype about how much intelligent machines can achieve right away.

There has been an interest in passing off intellectual tasks to machines for thousands of years, and plenty of hype to go with it. There was a surprising amount of progress in this regard during the Age of Reason. Even Leibniz (yes, the Leibniz you heard about in calculus class) dreamed of a reasoning machine in which any argument could be settled through logical application of facts and rules. He imagined any two people in a dispute could declare “Calculemus!” and submit their arguments to a machine which would logically settle the matter. The hype of the field is exemplified by the centuries-old “Mechanical Turk,” constructed in 1770, in which a machine in the form of a large box with arms on the top very effectively played chess against a human opponent. It was later proved to be an elaborate hoax in which a skilled chess player hid inside the box and operated the arms. The myths surrounding intelligent machines have been exploited ever since.

Mechanical Turk Reconstruction


Artificial Intelligence

If you don’t know who Alan Turing was, I highly recommend at least looking at the CliffsNotes. He was one of the greatest minds of the twentieth century, who led breakthroughs in mathematics and biology and, along the way, founded modern computing. Turing threw down the gauntlet for modern objectives in the field of what is now called AI in 1950 with “Computing Machinery and Intelligence,” in which he devised a test (later referred to as the Turing Test).

The test was simple: a human judge would interact through teletype with two players hidden behind a curtain. One hidden player would be a human and the other a computer. If the judge could not correctly identify which player was the human, then the computer had imitated a human well enough to be considered intelligent. This idea, that machine intelligence should feel so natural that we don’t easily distinguish it from a human, is key to what became artificial intelligence.

John McCarthy, meanwhile, is widely credited with coining the term Artificial Intelligence in the 1950s. Note that at that time the term “artificial” had a positive connotation, unlike today. McCarthy made many contributions to advanced symbolic processing with machines and even invented the programming language LISP, which continues to be used in AI circles. While decades of development have happened since then, the term Artificial Intelligence is still associated with machines that we interact with naturally, even if in a limited domain like chess games.

The term also has derivatives with more specific meanings, such as weak AI and strong AI, and some will refer specifically to Artificial General Intelligence. In the interest of keeping this concise I’ll save those for another treatment.

Alan Turing while at Bletchley.

Machine Learning

Machine Learning is a term widely thought to have been coined by Arthur Samuel in 1959 from work in pattern recognition as it relates to applications of Artificial Intelligence. In most contexts, this definition still holds and it is seen as a component of study in the wider field of artificial intelligence. For example, machine learning might be used to recognize a face in a robot’s camera feed or to translate speech to text for further analysis with a smart home device, but is not the field that drives the overall experience with the robot or smart home device. Making the device feel natural and intelligent is generally considered the work of the broad field of Artificial Intelligence. This is one reason why machine learning has found its way into business applications more easily than artificial intelligence. Having algorithms that can look at a large set of records and learn to classify or predict by example is widely applicable in operational, R&D and even executive leadership scenarios.
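For readers who want to see what “learning to classify by example” can look like, here is a minimal sketch. It uses a nearest-neighbor rule, which is only one of many possible techniques and is my choice for illustration. The customer records, the two features and the labels are all made up.

```python
import math

def nearest_neighbor_classify(examples, new_record):
    """Give new_record the label of the most similar labeled example."""
    _, closest_label = min(
        examples, key=lambda example: math.dist(example[0], new_record)
    )
    return closest_label

# Hypothetical records: (monthly_spend, support_tickets) paired with a known outcome.
examples = [
    ((120.0, 0.0), "stays"),
    ((110.0, 1.0), "stays"),
    ((20.0, 6.0), "leaves"),
    ((35.0, 4.0), "leaves"),
]

# A new customer we have no outcome for yet.
prediction = nearest_neighbor_classify(examples, (30.0, 5.0))
print(prediction)
```

Nobody wrote a rule that says low spend plus many tickets means a customer leaves. The answer comes entirely from the labeled examples, which is the essence of the approach.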

Having an intelligence that feels natural has been a less common business case until recently. Machine learning has been responsible for the building blocks to make self-driving cars possible, but AI will likely be required to make them feel natural and trustworthy. An interesting distinction that helps reveal the difference between the studies of machine learning and artificial intelligence is also in how their respective systems are evaluated. Often the Turing test is used to evaluate artificial intelligence (or some variation of it) but in order to evaluate machine learning applications we use typical statistical measures such as number of false positives, and deviation from accurate predictions.
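To make those statistical measures concrete, here is a small sketch of the two mentioned above. The function names, the fraud flags and the sales figures are hypothetical, and I am using mean absolute error as one common way to express “deviation from accurate predictions.”

```python
def false_positives(actual, predicted):
    """Count cases the system flagged as positive that were actually negative."""
    return sum(1 for a, p in zip(actual, predicted) if p and not a)

def mean_absolute_error(actual, predicted):
    """Average size of the gap between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical classification results: was each transaction really fraud?
actual_fraud = [True, False, False, True, False]
flagged_fraud = [True, True, False, False, False]
fp_count = false_positives(actual_fraud, flagged_fraud)

# Hypothetical forecast results: actual vs. predicted monthly sales.
actual_sales = [100.0, 150.0, 200.0]
forecast_sales = [110.0, 140.0, 215.0]
mae = mean_absolute_error(actual_sales, forecast_sales)

print(fp_count, mae)
```

Both numbers can be computed without a human judge in the loop, which is exactly what separates this kind of evaluation from a Turing-style test.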

Data Science

My favorite quote about the definition of data science comes from Gil Press in this Forbes article:

“There is more or less a consensus about the lack of consensus regarding a clear definition of data science.”

That pretty much sums it up today. The article was published just two years after Harvard Business Review published its “Sexiest Job” piece. That was when I first recall data science becoming a job title, and a series of job postings began seeking people with those skills. There is more history to it than that, though, including Patil and Hammerbacher claiming to have used the term to define their jobs at LinkedIn and Facebook. The term is peppered throughout the literature of the preceding decades in relation to statistics, data mining and even computer science.

I imagine this term had a purpose several years ago, as the explosion of big data tools was taking hold and new tools to connect and visualize data were becoming available. A lot of people who understood data mining, machine learning and statistics didn’t really understand how to use databases or big data tools to get the most out of those fields (the way Google, LinkedIn, Facebook and Apple were able to do), and therefore a new profession was deemed necessary. As skills in those tools become the norm for practitioners of machine learning, the term seems to be on the decline. I don’t think this term was created simply for promotional purposes, but there are certainly plenty of people who believe that was the case because there is no real consensus on what it means today.

Deep Learning

I have never been very good with cars. When I was growing up fuel injection was a big deal and it was regularly advertised in cars. I didn’t care, and still don’t. I never understood why I needed to know about whether it was a fuel injection system that enhanced the internal combustion engine or any other features. People who really liked cars thought it was important, though, and I still remember one explaining to me in that excited voice that you could spray a fine mist into the manifold and create a much more efficient combustion.

That’s exactly how I feel about deep learning. It’s a wonderful innovation and has led to some real advances in how effective machine learning can be, but the discussion is reaching a broad audience that really just wants results and clarity. Deep learning is a very well-defined term that has to do with a specific technique for making machine learning very effective (specifically to make it learn quickly on fewer training examples).

(I’ll explain it at a high level, but skip this paragraph if you care about deep learning as much as I care about fuel injection. But you might find it more fascinating than a fuel injector).

Biologically inspired learning systems have been a very hot research topic over the last couple of decades. Neural networks are usually at the center of developments. The work is interesting because occasionally, as with neural networks themselves, by modeling what we know about the brain, we also happen upon a system that can learn pretty well, too. Neural networks, since Minsky’s pioneering work in the 60’s, have had their own hype cycle. Some years they are treated as an inferior technique to other more recent advances, and other times they capture the imagination of the public. (Remember Schwarzenegger kept referring to his neural network in the Terminator, a movie that was made during one of the periods of great excitement about neural networks.)

Fundamentally they simulate connected neurons that signal each other and integrate into an interesting result, even though each neuron is only reacting to the connections around it. It turns out you can combine neural networks that each serve a specific purpose to accomplish a greater level of adaptation in ensemble approaches. Stacking the networks so that one pre-processes the input of another is one way to do this. I did my Master’s thesis on combining networks in ways that model how the brain combines functional regions, way before deep learning came out. I was part of the research zeitgeist of the time, trying to find new ways to combine what we could do in machine learning to advance our understanding of learning to the next level. These kinds of deep architectures have been widely explored. The real contribution of deep learning, in the simplest way I can describe it, is that it uses one layer to simplify the meaningful input from the information and a subsequent layer to then learn from the simplified information. This enables greater learning and is also known to be analogous to the approach the human brain takes when inferring from perceived stimulus.
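If you did keep reading, the stacking idea can be sketched in a few lines. This is only an illustration under loud assumptions: the weights below are hand-picked by me rather than learned, the four-value input is made up, and nothing here is the actual training procedure used in deep learning. It only shows one layer boiling the raw input down to two summary signals and a second layer deciding from those signals alone.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each simulated neuron sums its weighted inputs, adds a bias, and squashes the result."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hypothetical hand-picked weights; a real network would learn these from data.
simplify_weights = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
simplify_biases = [-1.0, -1.0]
decide_weights = [[2.0, -2.0]]
decide_biases = [0.0]

def predict(raw_input):
    # First layer: reduce four raw values to two simpler signals.
    simplified = layer(raw_input, simplify_weights, simplify_biases)
    # Second layer: decide using only the simplified signals.
    return layer(simplified, decide_weights, decide_biases)[0]

score_left = predict([1.0, 1.0, 0.0, 0.0])
score_right = predict([0.0, 0.0, 1.0, 1.0])
print(score_left, score_right)
```

With these made-up weights, activity in the first two inputs pushes the score above 0.5 and activity in the last two pushes it below. The second layer never sees the raw input at all.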

We have only covered four of these terms, but with all of the hype developed, the potential for tremendous competitive advantages for businesses and the rich history, staying abreast of the developments is intractable for most executives. There are easily 100 machine learning papers published per year, each one advancing developments in the field. Add to that the clumsy attempts to communicate the value of every new idea and why organizations desperately need it, and you are left with a confusing soup in a subject no one can afford to ignore.

If only we had Leibniz’s Calculemus. In the presence of someone introducing a new term in this field we could simply declare “Calculemus!” and the machine would tell us if it was a necessary new description or just a term that creates more hype.