ProjectBeta

Ensemble ML Algorithms

Difficulty: Hard
6 completions
~25 hours
Rating: 4.7

Explore the stochastic gradient descent, decision tree, k-nearest neighbors, and support vector classification algorithms. Learn how to set up pipelines, customize scoring metrics, and tune hyperparameters with grid search. Understand how the majority voting rule works by ensembling models with the VotingClassifier, and compare the results with the RandomForestClassifier.
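A minimal sketch of the workflow described above (not the project's exact code): build one pipeline per base algorithm, tune each with grid search on the macro F1 metric, combine the tuned models with VotingClassifier, and compare against RandomForestClassifier. The dataset and the search spaces here are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score

# Stand-in multi-class data; the real project uses a music-and-emotions dataset.
X, y = make_classification(n_samples=500, n_classes=3, n_informative=5,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# One pipeline per algorithm; scaling matters for SGD, k-NN, and SVC.
pipes = {
    "sgd": Pipeline([("scale", StandardScaler()),
                     ("clf", SGDClassifier(random_state=42))]),
    "tree": Pipeline([("clf", DecisionTreeClassifier(random_state=42))]),
    "knn": Pipeline([("scale", StandardScaler()),
                     ("clf", KNeighborsClassifier())]),
    "svc": Pipeline([("scale", StandardScaler()),
                     ("clf", SVC(random_state=42))]),
}
grids = {  # hypothetical hyperparameter search spaces, for illustration only
    "sgd": {"clf__alpha": [1e-4, 1e-3]},
    "tree": {"clf__max_depth": [3, 5, None]},
    "knn": {"clf__n_neighbors": [3, 5, 7]},
    "svc": {"clf__C": [0.1, 1.0, 10.0]},
}

# Tune each pipeline with grid search, scored by macro F1.
tuned = {}
for name, pipe in pipes.items():
    search = GridSearchCV(pipe, grids[name], scoring="f1_macro", cv=3)
    search.fit(X_train, y_train)
    tuned[name] = search.best_estimator_

# Hard (majority) voting over the tuned pipelines.
voting = VotingClassifier(estimators=list(tuned.items()), voting="hard")
voting.fit(X_train, y_train)

# Baseline ensemble for comparison.
forest = RandomForestClassifier(random_state=42).fit(X_train, y_train)

print("voting macro F1:",
      f1_score(y_test, voting.predict(X_test), average="macro"))
print("forest macro F1:",
      f1_score(y_test, forest.predict(X_test), average="macro"))
```

Which ensemble wins depends on the data and the tuning budget; the point of the project is precisely to run this comparison yourself.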

Provided by

JetBrains Academy

About

Data scientists often train and optimize different machine learning models and then choose the one with the best performance. But did you know that combining these models can make them even better? This project will teach you how to train and optimize multiple models and then combine them to get better results. You will discover how ensemble learning can outperform using a single model by working on a multi-class classification task about music and emotions.

Graduate project

This project covers the core topics of the Data Scientist course, making it sufficiently challenging to be a proud addition to your portfolio.

At least one graduate project is required to complete the course.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the topics from your course you need to build it. Here's what awaits you:
Train and evaluate classifiers using the macro F1 score.
Tune the hyperparameters of the classifiers for better performance.
Select non-optimized and optimized models for ensemble learning.
Compare the performance of a tuned random forest classifier with the voting classifier.
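To make the last two points concrete, here is a toy illustration (assumed, not taken from the project) of the hard-voting rule that VotingClassifier applies: each model casts one vote per sample, and the class with the most votes wins.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: a list of per-model label lists, all the same length.
    Returns the majority-voted label for each sample."""
    n_samples = len(predictions[0])
    result = []
    for i in range(n_samples):
        votes = [model_preds[i] for model_preds in predictions]
        # The most common vote wins; ties go to the earliest-listed model.
        result.append(Counter(votes).most_common(1)[0][0])
    return result

# Hypothetical emotion labels from three models; the majority decides each sample.
preds = [
    ["happy", "calm",  "sad"],   # model A
    ["happy", "calm",  "calm"],  # model B
    ["sad",   "angry", "sad"],   # model C
]
print(majority_vote(preds))  # -> ['happy', 'calm', 'sad']
```

Scikit-learn's `VotingClassifier(voting="hard")` implements the same rule across fitted estimators, which is why its accuracy can exceed that of any single base model when their errors are not strongly correlated.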