Differentiating AI, Machine Learning and Deep Learning

Many people treat Artificial Intelligence, Machine Learning, and Deep Learning as interchangeable terms, or as varying degrees of the same thing. Neither view is accurate.

Understanding AI: Clearing up the Confusion

Artificial intelligence (AI), machine learning (ML), and deep learning (DL) are often mistaken for being the same thing, but they each have their own unique meanings. By understanding the scope of these terms, we can gain insight into the tools that leverage AI.

Artificial Intelligence (AI)

AI is a broad term that encompasses systems capable of automating reasoning and approximating aspects of the human mind, and it includes sub-disciplines such as machine learning (ML), representation learning (RL), and deep learning (DL). AI can refer to systems that follow explicitly programmed rules as well as those that autonomously gain understanding from data. The latter form, which learns from data, is the foundation of technologies like self-driving cars and virtual assistants.

Machine Learning (ML)

ML is a sub-discipline of AI where system actions are learned from data rather than being explicitly dictated by humans. These systems can process massive amounts of data to learn how to represent and respond to new instances of data optimally.


Video: Machine Learning Fundamentals for Cybersecurity Professionals

Representation Learning (RL)

RL, often overlooked, is crucial to many of the AI technologies in use today. It involves learning abstract representations from data; for example, transforming images into fixed-length lists of numbers that capture the essence of the original images. This abstraction enables downstream systems to better process new types of data.
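As a rough sketch of that idea (PCA here is only a stand-in for a learned representation, and scikit-learn's bundled digits dataset a stand-in for the images described above), each image can be compressed into a short, fixed-length vector:

    # Sketch: turning images into fixed-length numeric representations.
    # PCA is used as a simple stand-in for a learned representation.
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    digits = load_digits()              # 8x8 grayscale digit images, flattened to 64 values
    pca = PCA(n_components=16)          # learn a 16-number representation of each image
    embeddings = pca.fit_transform(digits.data)

    print(digits.data.shape)            # (1797, 64)  original pixel values
    print(embeddings.shape)             # (1797, 16)  compact representation for downstream models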

Deep Learning (DL)

DL builds on ML and RL by discovering hierarchies of abstractions that represent inputs in a more complex manner. Inspired by the human brain, DL models use layers of neurons with adaptable synaptic weights. Deeper layers in the network learn new abstract representations, which simplify tasks like image categorization and text translation. It's important to note that while DL is effective for solving certain complex problems, it is not a one-size-fits-all solution for automating intelligence.

The relationship between different AI sub-disciplines.
Reference: Goodfellow, Bengio & Courville, “Deep Learning” (2016).
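As a minimal sketch of the layered idea (scikit-learn's MLPClassifier stands in for a full deep learning framework, and the layer sizes are arbitrary), a small feed-forward network stacks layers of adaptable weights:

    # Sketch: a small feed-forward neural network with two hidden layers.
    # Each hidden layer learns a more abstract representation of its input.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = MLPClassifier(hidden_layer_sizes=(64, 32),  # two layers of adaptable weights
                          max_iter=500, random_state=0)
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))  # accuracy on held-out images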

The types of algorithm learning techniques

ML algorithms can sort data into different categories. The two main types of learning, supervised and unsupervised, determine how that sorting ability is acquired.

Supervised Learning

Supervised learning trains a model on labeled data, enabling it to predict labels for new data. For example, a model trained on images labeled as cats or dogs can classify new images it has never seen. The trade-off is that supervised learning requires labeled training data up front, but once trained it can label new data points effectively.

Supervised learning uses labeled data to identify factors that distinguish labels. Effective models can then label new data.
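A minimal supervised-learning sketch, using scikit-learn's bundled iris data in place of the cat-and-dog images described above:

    # Sketch: supervised learning - fit a classifier on labeled data,
    # then predict labels for data it has never seen.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print(clf.predict(X_test[:5]))      # predicted labels for new, unseen examples
    print(clf.score(X_test, y_test))    # fraction of held-out labels predicted correctly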

Unsupervised Learning

On the other hand, unsupervised learning works with unlabeled data. These models learn the patterns within the data and can determine where new data fit into those patterns. Because it doesn't require labeled training data, unsupervised learning is well suited to identifying anomalies, but it struggles to assign meaningful labels to them.

Unsupervised learning discovers the structure of unlabeled data. Successful models assess the fit of new data to the learned structure.
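A minimal unsupervised sketch, with made-up two-dimensional data and KMeans standing in for any pattern-finding algorithm:

    # Sketch: unsupervised learning - discover structure in unlabeled data,
    # then assess where new points fall within that structure.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    unlabeled = np.vstack([rng.normal(0, 1, (100, 2)),    # one natural grouping
                           rng.normal(5, 1, (100, 2))])   # another natural grouping

    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(unlabeled)
    new_points = np.array([[0.2, -0.1], [5.3, 4.8]])
    print(km.predict(new_points))   # cluster ids only - the model cannot name the clusters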

Both approaches offer a range of learning algorithms, and the list keeps growing as researchers develop new ones. Algorithms can also be combined to build more complex systems. Knowing which algorithm to use for a specific problem is a real challenge for data scientists. Is there one superior algorithm that can solve any problem?

There are numerous machine learning algorithms with varying strengths and weaknesses for different problem types.

The 'No Free Lunch Theorem': No Universal Algorithm Exists

The "No Free Lunch Theorem" states that there is no one perfect algorithm that will outperform all others for every problem. Instead, each problem requires a specialized algorithm that is tailored to its specific needs. This is why there are so many different algorithms available. For example, a supervised neural network is ideal for certain problems, while unsupervised hierarchical clustering works best for others. It's important to choose the right algorithm for the task at hand, as each one is designed to optimize performance based on the problem and data being used.

For instance, the algorithm used for image recognition in self-driving cars cannot be used to translate between languages. Each algorithm serves a specific purpose and is optimized for the problem it was created to solve and the data it operates on.

No free lunch theorem: No one algorithm excels at all problems.
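A rough sketch of the theorem in practice, loosely in the spirit of scikit-learn's classifier-comparison example (the specific models and datasets are arbitrary, and results will vary with the random seeds): the same two models can rank differently on different datasets.

    # Sketch: no single algorithm wins on every dataset.
    from sklearn.datasets import make_moons, make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier

    datasets = {"moons": make_moons(noise=0.3, random_state=0),
                "linear": make_classification(n_features=2, n_redundant=0,
                                               n_informative=2, random_state=1)}
    models = {"naive_bayes": GaussianNB(), "knn": KNeighborsClassifier()}

    for d_name, (X, y) in datasets.items():
        for m_name, model in models.items():
            score = cross_val_score(model, X, y, cv=5).mean()
            print(f"{d_name:>7} | {m_name:>11} | {score:.2f}")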

Selecting the Right Algorithm in Data Science

Choosing the right algorithm as a data scientist is a blend of art and science. By considering the problem statement and thoroughly understanding the data, the data scientist can be guided in the right direction. It's crucial to recognize that making the wrong choice can lead to not just suboptimal results, but completely inaccurate ones. Take a look at the example below:

Comparison of machine learning algorithm results (x-axis) against different datasets (y-axis), with true labels highlighted in yellow. An X marks wrong predictions that would lead to undesirable results; no single algorithm is effective for every dataset. Adapted from scikit-learn.org.

Choosing the right algorithm for a dataset can significantly impact the results obtained. Each problem has an optimal algorithm choice, but more importantly, certain choices can lead to unfavorable outcomes. This highlights the critical importance of selecting the appropriate approach for each specific problem.

How to Measure an Algorithm's Success?

Choosing the right model as a data scientist involves more than just accuracy. While accuracy is important, it can sometimes hide the true performance of a model.

Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives)

Let's consider a classification problem with two labels, A and B. If label A is much more likely to occur than label B, a model can achieve high accuracy by always choosing label A. However, this means it will never correctly identify anything as label B. So accuracy alone is not enough if we want to find B cases. Fortunately, data scientists have other metrics to help optimize and measure a model's effectiveness.
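To make the imbalance problem concrete, here is a small sketch with made-up numbers: a model that always predicts label A looks accurate, yet never finds a single B.

    # Sketch: high accuracy on imbalanced data can hide a useless model.
    import numpy as np
    from sklearn.metrics import accuracy_score, recall_score

    y_true = np.array([0] * 950 + [1] * 50)   # label A = 0 (common), label B = 1 (rare)
    y_pred = np.zeros(1000, dtype=int)        # a "model" that always predicts label A

    print(accuracy_score(y_true, y_pred))              # 0.95 - looks great
    print(recall_score(y_true, y_pred, pos_label=1))   # 0.0  - never finds label B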

One such metric is precision, which measures how often a model's predictions of a particular label are correct, relative to the total number of times it predicted that label. Data scientists aiming for high precision will build models that avoid generating false alarms.

Precision = True Positives / (True Positives + False Positives)

But precision only tells us part of the story. It doesn't reveal whether the model fails to identify cases that are important to us. This is where recall comes in. Recall measures how often a model correctly finds a particular label relative to all instances of that label. Data scientists aiming for high recall will build models that won't miss instances that are important.

Recall = True Positives / (True Positives + False Negatives)

By tracking and balancing both precision and recall, data scientists can effectively measure and optimize their models for success.
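As a small worked sketch with hypothetical counts, both metrics can be computed directly from the confusion counts:

    # Sketch: precision and recall from raw confusion counts (made-up values).
    true_positives, false_positives, false_negatives = 80, 20, 40

    precision = true_positives / (true_positives + false_positives)   # 80 / 100 = 0.80
    recall = true_positives / (true_positives + false_negatives)      # 80 / 120 ~ 0.67

    print(f"precision={precision:.2f}, recall={recall:.2f}")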