Hello folks! If you have read my previous three posts on Artificial Intelligence (AI), congratulations: you already have a basic knowledge of Machine Learning algorithms. If not, please read them first. Today I would like to discuss some of the most commonly asked interview questions in the field of Machine Learning and AI, which should help you crack your Machine Learning interviews. Most of the basics are already covered in those posts; we will learn the rest here.
Let's get started.
- What is Gradient Descent?
- Gradient descent is an optimization algorithm that minimizes a given function. It starts with an initial set of parameters and iteratively moves, one step at a time in the direction of the negative gradient, toward a set of parameters that gives a minimum of that function. It is a little difficult to visualize, so I will try to give an example with a figure for better understanding.
- In the figure, the blue dots are the actual house prices (y_Actual) corresponding to the house size, the green line is the predicted house price (y_Prediction), and the yellow dotted lines are the prediction errors (prediction error = y_Prediction - y_Actual). So, the aim is to improve the prediction by minimizing the prediction error (y_Prediction - y_Actual). Gradient descent is the algorithm used to minimize the prediction error and optimize the function.
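The house-price idea above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the data points, the learning rate, the number of steps, and the names `gradient_descent` and `mean_squared_error` are all made up for this example, and the error being minimized is the mean of the squared prediction errors.

```python
# Hypothetical toy data (not from the post): house sizes in 1000 sq ft
# and prices in $100k, generated from price = 2 * size.
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [2.0, 3.0, 4.0, 5.0, 6.0]


def mean_squared_error(w, b):
    """Average of (y_Prediction - y_Actual)^2 for the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(sizes, prices)) / len(sizes)


def gradient_descent(learning_rate=0.05, steps=5000):
    """Start from w = b = 0 and repeatedly step against the gradient."""
    w, b = 0.0, 0.0
    n = len(sizes)
    for _ in range(steps):
        # Prediction error for every house with the current parameters.
        errors = [w * x + b - y for x, y in zip(sizes, prices)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, sizes))
        grad_b = (2 / n) * sum(errors)
        # Move a small step in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = gradient_descent()
print(f"slope={w:.3f}, intercept={b:.3f}, error={mean_squared_error(w, b):.6f}")
```

With this toy data the fitted slope should end up close to 2 and the intercept close to 0. A learning rate that is too large would make the updates diverge, which is why the step size is usually treated as a tuning parameter.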
- What are the differences between Random Forest and Gradient Boosting? Or, explain the difference between bagging and boosting algorithms.
The differences between Random Forest and Gradient Boosting are as follows:
- Random forest uses bagging: each tree is trained on a random sample of the data. Gradient boosting uses boosting: each new tree is trained with an increased emphasis on the samples that the previous trees got wrong.
- Because all the trees in a random forest are built without any consideration for any of the other trees, the training is incredibly easy to parallelize, which means that it can train really quickly. Gradient boosting, on the other hand, is iterative: each tree relies on the results of the tree before it, in order to apply a higher weight to the samples that the previous tree got wrong. So, the trees in boosting can't be built in parallel, and it takes much longer to train.
- The final predictions for random forest are typically an unweighted average or an unweighted vote, while boosting uses a weighted vote.
- Lastly, random forest is easier to tune, faster to train and harder to overfit, while gradient boosting is harder to tune, slower to train, and easier to overfit.
So, with all that, why would you go with gradient boosting? Well, the trade-off is that gradient boosting is typically more powerful and performs better if tuned properly.
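To make the bagging versus boosting contrast concrete, here is a rough sketch that uses one-split regression trees (stumps) on a single made-up feature. Everything in it is an assumption for illustration: the data, the helper names, the learning rate, and the number of trees. The bagging part is a simplified stand-in for a random forest (it has no feature sampling), and the boosting part fits each stump to the residuals of the previous ones, which is the usual way gradient boosting concentrates on what was previously predicted badly.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest x so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left) + sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    if best is None:  # every x identical: fall back to a constant prediction
        mean = sum(ys) / len(ys)
        return xs[0], mean, mean
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def bagging(xs, ys, n_trees, rng):
    """Bagging: independent stumps on bootstrap samples, combined by an unweighted average."""
    n = len(xs)
    stumps = []
    for _ in range(n_trees):  # each iteration is independent, so this loop could run in parallel
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(predict_stump(s, x) for s in stumps) / n_trees


def boosting(xs, ys, n_trees, learning_rate=0.5):
    """Boosting: each stump is fit to what the earlier stumps still get wrong."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_trees):  # sequential: every stump depends on the previous residuals
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        residuals = [r - learning_rate * predict_stump(stump, x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical noise-free data: y = x^2 on 20 evenly spaced points.
xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]
bagged = bagging(xs, ys, n_trees=100, rng=random.Random(0))
boosted = boosting(xs, ys, n_trees=100)
print("bagging training error: ", mse(bagged, xs, ys))
print("boosting training error:", mse(boosted, xs, ys))
```

On this toy data the boosted stumps fit the training points far more closely than the bagged ones. That shows both sides of the trade-off: boosting is more powerful, but the same behavior is what makes it easier to overfit when the data is noisy.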
- What are the benefits of using gradient boosting?
- Well, it's one of the most powerful machine learning algorithms out there. It also accepts various types of inputs, just like random forest, which makes it very flexible. It can be used for classification or regression, and it outputs feature importances, which can be super useful. But it's not perfect. Some of the drawbacks are that it takes longer to train because its trees can't be built in parallel, it's more likely to overfit because it obsesses over the samples that it got wrong, and it can get lost pursuing outliers that don't really represent the overall population.
- What are Bias and Variance?
- The prediction error of a machine learning algorithm can be divided into three types:
o Bias error,
o Variance error and
o Irreducible error
- The irreducible error cannot be reduced, whatever algorithm is used. So, we will focus on bias and variance error.
- Bias is the error introduced by the simplifying assumptions a model makes to make the target function easier to approximate. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
- Variance is the amount by which the estimate of the target function will change given different training data. High variance can cause an algorithm to model the random noise in the training data, rather than the intended outputs (overfitting).
- What is the Bias-Variance trade-off?
- The bias-variance trade-off is an important aspect of machine learning algorithms. To get an accurate model, an engineer's goal is to reduce both the bias and the variance as much as possible. However, that is not feasible in real life. For a learning algorithm to have low bias, it must be very flexible so that it can fit any data. But if the learning algorithm is too flexible, it will fit every training data set differently, which increases the variance error. So, there has to be a trade-off between bias and variance when selecting models of different flexibility or complexity, and when selecting appropriate training sets, to minimize these sources of error.
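The trade-off can be seen in a small simulation. The sketch below is only an illustration under assumptions of my own: the true function, the noise level, the test point, and the two models are invented for the example. One model is deliberately inflexible (it always predicts the mean of the training targets) and the other is deliberately flexible (it copies the target of the nearest training point). Both are retrained on many random training sets, and the bias and variance of their predictions at one point are estimated.

```python
import random


def true_f(x):
    """Hypothetical target function that the models try to learn."""
    return x * x


def make_training_set(rng, n=20, noise=0.5):
    """Draw n noisy samples of true_f on the interval [0, 2]."""
    xs = [rng.uniform(0, 2) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


def mean_model(xs, ys, x0):
    """Inflexible model: ignores x0 and always predicts the average target."""
    return sum(ys) / len(ys)


def nearest_neighbor_model(xs, ys, x0):
    """Flexible model: returns the (noisy) target of the closest training point."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def bias_and_variance(model, x0, rng, trials=2000):
    """Estimate squared bias and variance of model's prediction at x0."""
    preds = []
    for _ in range(trials):
        xs, ys = make_training_set(rng)
        preds.append(model(xs, ys, x0))
    avg = sum(preds) / trials
    bias_sq = (avg - true_f(x0)) ** 2
    variance = sum((p - avg) ** 2 for p in preds) / trials
    return bias_sq, variance


for name, model in [("mean model", mean_model), ("1-nearest neighbor", nearest_neighbor_model)]:
    bias_sq, variance = bias_and_variance(model, x0=1.8, rng=random.Random(0))
    print(f"{name}: bias^2={bias_sq:.3f}, variance={variance:.3f}")
```

The inflexible model should show a large squared bias and a small variance, while the flexible one should show the opposite pattern. The irreducible error is the noise itself, which neither model can remove.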
- Explain the difference between L1 and L2 regularization
- L2 regularization tends to spread the shrinkage among all the terms, keeping the weights small but non-zero, while L1 is sparser: it drives many weights to exactly 0, so the corresponding variables are effectively switched off while the rest stay in the model.
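A tiny example makes the sparsity difference visible. It is a sketch for the simplest possible setting, which is an assumption on my part: a single weight with a squared loss around its unregularized value. In that setting the L1 penalty leads to soft thresholding and the L2 penalty leads to proportional shrinkage. The weights, the strength `lam`, and the function names are hypothetical.

```python
def l1_shrink(weight, lam):
    """Minimizer of 0.5 * (w - weight)**2 + lam * abs(w): soft thresholding."""
    if weight > lam:
        return weight - lam
    if weight < -lam:
        return weight + lam
    return 0.0  # small weights are set to exactly zero


def l2_shrink(weight, lam):
    """Minimizer of 0.5 * (w - weight)**2 + 0.5 * lam * w**2: proportional shrinkage."""
    return weight / (1.0 + lam)


unregularized = [3.0, -0.4, 0.2, -1.5, 0.05]  # hypothetical weights before regularization
lam = 0.5                                      # hypothetical regularization strength

l1_weights = [l1_shrink(w, lam) for w in unregularized]
l2_weights = [l2_shrink(w, lam) for w in unregularized]

print("L1:", l1_weights)
print("L2:", [round(w, 3) for w in l2_weights])
```

Here L1 gives `[2.5, 0.0, 0.0, -1.0, 0.0]`: the three small weights become exactly zero. L2 makes every weight smaller but leaves all of them non-zero.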
- Difference between K-means and KNN (K Nearest Neighbors) algorithms
- The main difference is that K-means clustering is unsupervised, whereas KNN is a supervised machine learning algorithm. This means KNN needs labelled data to make predictions, but K-means does not.
- K-means is used for clustering problems, whereas KNN is used for classification and regression problems.
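The difference shows up directly in the function signatures. In the minimal sketch below, `knn_predict` needs the labels while `kmeans` only sees the points. The data, the label names, and the choice of starting centroids are hypothetical, and the starting centroids are passed in by hand to keep the example deterministic (real implementations usually pick them at random).

```python
import math
from collections import Counter


def knn_predict(points, labels, query, k=3):
    """Supervised: vote among the labels of the k training points nearest to query."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def nearest_centroid(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, initial_centroids, iterations=10):
    """Unsupervised: group points around centroids without using any labels."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[c]
            for c, cluster in enumerate(clusters)
        ]
    assignments = [nearest_centroid(p, centroids) for p in points]
    return centroids, assignments


# Hypothetical 2-D data: two well separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["small", "small", "small", "large", "large", "large"]  # only KNN uses these

print(knn_predict(points, labels, query=(4.9, 5.0)))
centroids, assignments = kmeans(points, initial_centroids=[points[0], points[3]])
print(assignments)
```

KNN answers "which known class does this new point belong to?", while K-means answers "which points belong together?" and returns cluster numbers that have no meaning beyond grouping.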
- What are the different Machine Learning techniques?
- The different types of machine learning algorithms are:
o Supervised Machine Learning Algorithms,
o Unsupervised Machine Learning Algorithms,
o Semi-Supervised Machine Learning Algorithms,
o Reinforcement Machine Learning Algorithms
For details, please read my previous post here: Supervised, Un-Supervised, Semi-Supervised machine and Reinforcement Learning algorithms
- Difference between Supervised and Unsupervised machine learning algorithms
- Please read my previous post here: Supervised, Un-Supervised, Semi-Supervised machine and Reinforcement Learning algorithms
- What are the most commonly used Machine Learning Algorithms?
- Please read my previous post here: 10 Most Commonly Used Machine Learning Algorithms
If you have any other questions that I can add to this list, please let me know in the comment section. Any feedback or suggestions are always welcome. Stay tuned for the next post. Regards, Mostafiz
Next post: Linear Regression Implementation with Python