Supervised Learning Algorithms

Anjan Parajuli
Published in Analytics Vidhya
4 min read · Sep 15, 2020

Hmmm… algorithms, huh!

As I promised in my last article, this one is all about algorithms.

Here I am, buddies.

Algorithms are at the core of building machine learning models. Here I go through the most commonly used supervised learning algorithms to give you an intuitive understanding of where to use each one and where not to.

By the end of this article, you should have an intuitive feel for how each of these algorithms behaves.

CAVEAT: I AM NOT DESCRIBING THE MATHS BEHIND THESE ALGORITHMS, ONLY HOW THEY WORK AND WHERE TO USE THEM.

So, folks, here we go.

1. NAIVE BAYES

Naive Bayes is a family of classification algorithms based on Bayes' theorem, and it is one of the foundational algorithms to know in machine learning.

Advantages:

  1. It is fast to train, so it scales well to large datasets and often generalizes reasonably well on them.
  2. It is applied mostly to classification problems, e.g. spam filtering, sentiment analysis, fraud detection, recommendation engines, etc.

Disadvantages:

  1. It is naive, i.e. it assumes the features are independent of each other, so it ignores order in the data, such as word order in text. (It is still preferred for its speed and ease of use.)
[Image: Stock price prediction]
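
To make the intuition concrete, here is a minimal sketch of a word-count (multinomial) Naive Bayes spam filter in plain Python. The four training messages, the `spam`/`ham` labels, and the helper names `nb_train` and `nb_predict` are all made up for illustration; a real project would more likely use a library implementation. Notice that the prediction only looks at which words appear, not at their order, which is the "naive" part.

```python
import math
from collections import Counter

# Toy data invented for this example (an assumption, not a real dataset).
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]

def nb_train(examples):
    """Count how often each word appears in each class."""
    word_counts = {}          # label -> Counter of words
    class_counts = Counter()  # label -> number of messages
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab

def nb_predict(text, model):
    """Return the class with the highest log-probability for the text."""
    word_counts, class_counts, vocab = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from log P(class).
        score = math.log(class_counts[label] / total_messages)
        words_in_class = sum(word_counts[label].values())
        for word in text.split():
            # Add log P(word | class), with add-one (Laplace) smoothing
            # so unseen words do not produce a zero probability.
            count = word_counts[label][word] + 1
            score += math.log(count / (words_in_class + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = nb_train(TRAIN)
print(nb_predict("free money", model))             # spam
print(nb_predict("money free", model))             # spam (word order is ignored)
print(nb_predict("team meeting tomorrow", model))  # ham
```

Each word contributes its own independent term to the score, so shuffling the words of a message never changes the prediction.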

2. LOGISTIC REGRESSION

Despite its name, logistic regression is a classification algorithm, not a regression one. It is a linear model and one of the simplest classifiers.

Advantages:

  1. It is simple and interpretable.
  2. It works best when the classes can be separated reasonably well by a straight line (or a hyperplane in higher dimensions), i.e. when they are roughly linearly separable.

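Disadvantages:

  1. When the boundary between the classes is non-linear, it performs poorly unless you engineer extra features.
  2. It struggles with complex relationships between the features.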
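
As a rough illustration, here is a minimal sketch of logistic regression with one feature, trained by plain gradient descent. The "hours studied vs. passed" data, the learning rate, the number of epochs, and the names `fit_logistic` and `predict_class` are assumptions chosen for the example, not part of any library.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_class(x, w, b):
    """Predict 1 when the modelled probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Toy data invented for this example: hours studied and pass (1) / fail (0).
hours = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(hours, passed)
print(predict_class(2.0, w, b))  # 0
print(predict_class(8.0, w, b))  # 1
```

The model is just a straight line `w * x + b` passed through a sigmoid, which is why it is easy to interpret and also why it cannot bend around a non-linear class boundary on its own.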

3. LINEAR REGRESSION

Linear regression is also a linear model, but it is used for regression problems, i.e. predicting a continuous value.

Advantages:

  1. It is also simple and interpretable, and it is relatively hard to overfit when there are only a few features.
  2. It works best when the relationship between the input and output variables is linear.

Disadvantages:

  1. It will underfit when the relationship between input and output is non-linear, i.e. it fails to capture non-linear data accurately.
  2. It also can't model complex relationships.
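
The two sides of this trade-off can be sketched with ordinary least squares on a single feature. The data below is made up: one target is an exact line, the other is a curve. The helper names `fit_line` and `mean_squared_error` are only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between the fitted line and the targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data invented for this example.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear_ys = [2 * x + 1 for x in xs]  # y = 2x + 1, a perfectly linear target
curved_ys = [x ** 2 for x in xs]     # y = x^2, a non-linear target

slope, intercept = fit_line(xs, linear_ys)
print(slope, intercept)                                      # 2.0 1.0
print(mean_squared_error(xs, linear_ys, slope, intercept))   # 0.0

slope_c, intercept_c = fit_line(xs, curved_ys)
print(slope_c, intercept_c)                                  # 0.0 2.0
print(mean_squared_error(xs, curved_ys, slope_c, intercept_c))
```

On the linear target the fit is exact. On the curved target the best straight line is flat (slope 0), so it misses the shape entirely and the error stays well above zero: that is underfitting.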

4. K-NEAREST NEIGHBORS

K-nearest neighbors (KNN) predicts a new point from the k training points closest to it, so it can model non-linear data as well as linear data. It is used for both regression and classification problems.

Advantages:

  1. Despite being simple and interpretable, it is flexible enough to learn complex, non-linear relationships.
  2. It is used in recommender systems, like those of Netflix, Spotify, etc.

Disadvantages:

  1. It doesn't work well as the number of observations and features grows: every prediction has to be compared against the training data, and distances become less meaningful when there are many features.
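
Here is a minimal sketch of a KNN classifier in plain Python, assuming Euclidean distance and a majority vote. The six points, the labels `a`/`b`, and the function name `knn_predict` are invented for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to EVERY training point (this is the costly part).
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data invented for this example: two small clusters.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.5, 1.5)))  # a
print(knn_predict(points, labels, (5.5, 5.5)))  # b
```

There is no training step at all: the "model" is the stored data. The flip side is that each call scans all the training points, which is why this simple version slows down as the dataset grows.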

5. SUPPORT VECTOR MACHINES (SVM)

SVMs are highly flexible algorithms that find the boundary (a line, or a hyperplane in higher dimensions) that separates the classes with the widest possible margin. They can be used for both regression and classification.

Advantages:

  1. They can handle complex datasets as well.
  2. They work for non-linear data too, by using kernels.

Disadvantages:

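  1. They are sensitive to noise, especially when the classes overlap.
  2. They don't scale well to large datasets, because training gets slow as the number of samples grows.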
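
The "separating line with a margin" idea can be sketched with a linear SVM trained by sub-gradient descent on the hinge loss. This is only one simple way to train a linear SVM and it does not cover kernels; the data, the learning rate, the regularization strength, and the names `fit_linear_svm` and `svm_side` are assumptions made for the example.

```python
def fit_linear_svm(points, labels, lr=0.01, reg=0.01, epochs=1000):
    """Learn a line w1*x1 + w2*x2 + b = 0; labels must be -1 or +1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # Regularization: gently shrink the weights (prefers a wide margin).
            w[0] -= lr * reg * w[0]
            w[1] -= lr * reg * w[1]
            # Hinge loss: only points inside the margin or on the wrong side
            # push the boundary.
            if margin < 1:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b

def svm_side(point, w, b):
    """Return +1 or -1 depending on which side of the line the point falls."""
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1

# Toy data invented for this example: two linearly separable clusters.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (5, 6), (6, 5)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = fit_linear_svm(points, labels)
print(svm_side((1.0, 1.0), w, b))
print(svm_side((6.0, 6.0), w, b))
```

Only the points near the boundary keep triggering updates; points far from it are ignored once they are safely outside the margin. That is the intuition behind the name "support vectors".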

6. TREE-BASED METHODS

Tree-based methods are among the most effective algorithms for solving very complex problems. They work for both classification and regression problems.

There are many tree-based methods:

  1. Decision tree
  2. Bagging
  3. Random forests
  4. Boosting (gradient boosting, AdaBoost, XGBoost)

Advantages:

  1. These methods are strong default choices for supervised prediction problems.
  2. They handle complex relationships well, and many implementations also cope with missing data and categorical features.

Disadvantages:

  1. Ensembles such as random forests and boosting are difficult to interpret (a single decision tree is the exception), and they can take a long time to train.
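
As a small illustration, here is a minimal sketch of a single decision tree classifier that picks splits by Gini impurity. Bagging, random forests, and boosting all combine many trees of roughly this kind. The toy data and the function names (`gini`, `best_split`, `build_tree`, `tree_predict`) are invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 means every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Find the (feature, threshold) that lowers impurity the most, or None."""
    best, best_score = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Return a leaf label, or a tuple (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    feature, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def tree_predict(tree, row):
    """Walk down the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree

# Toy data invented for this example: "yes" only in the middle band of values,
# a pattern that a single straight-line boundary cannot capture.
rows = [(1,), (2,), (4,), (5,), (8,), (9,)]
labels = ["no", "no", "yes", "yes", "no", "no"]

tree = build_tree(rows, labels)
print(tree)                        # (0, 2, 'no', (0, 5, 'yes', 'no'))
print(tree_predict(tree, (4.5,)))  # yes
print(tree_predict(tree, (10,)))   # no
```

The tree needs two splits (at 2 and at 5) to carve out the middle band. One small tree like this is easy to read; the interpretability problem appears once hundreds of them are combined.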

7. NEURAL NETWORKS

Neural networks are a state-of-the-art technique for modelling even very complex problems. They fall under deep learning, which is the most complex family of models covered here but often the most effective one for hard problems. Since these methods are really complex, we should first try the simpler models above before getting our hands dirty with neural networks.
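
To give a feel for why an extra layer helps, here is a minimal sketch of a two-layer network that computes XOR, a pattern no single linear model can represent. The weights are picked by hand for illustration and are NOT learned; a real network would find its weights by training. The names `relu` and `tiny_network` are invented for the example.

```python
def relu(z):
    """A common activation function: keep positives, zero out negatives."""
    return max(0.0, z)

def tiny_network(x1, x2):
    """Forward pass of a 2-input, 2-hidden-neuron, 1-output network."""
    # Hidden layer: weights and biases chosen by hand (an assumption for
    # this example, not the result of training).
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a plain linear combination of the hidden activations.
    return 1.0 * h1 - 2.0 * h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), tiny_network(a, b))
# (0, 0) 0.0
# (0, 1) 1.0
# (1, 0) 1.0
# (1, 1) 0.0
```

Each hidden neuron is just a linear model followed by a simple non-linearity. Stacking them is what lets the network bend its decision boundary, and it is also why these models have many more knobs to tune than the simpler ones above.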

Hooo 🥱… finally the article is over, but not the learning process. I have given you a basic, intuitive understanding of these machine learning algorithms so that you can grasp them with ease. Next, it's up to you to dig deeper into these topics.

I hope you picked up a few concepts about these algorithms from this article. I put my pen down here. Oops, I mean I take my hands off my keyboard 😂😂.

Anyway…

Thank you. And yeah, be happy and don't worry. Just take one small step at a time and you will reach the summit in a jiffy.
