Project

Decision Tree from Scratch

Challenging
28 completions
~ 26 hours
4.5

Master functions and Python classes. Study the decision tree architecture, implement your algorithm, and apply NumPy and Pandas features to solve a real-world problem.

Provided by

JetBrains Academy JetBrains Academy

About

A decision tree is one of the most widely used machine learning algorithms due to its ease of interpretation. This algorithm is similar to the way we make decisions in our daily life.
In this project, you will take a closer look at the algorithm and write it from scratch with the help of Python, NumPy, and Pandas. Teach the model to process categorical and numerical features to make data-based decisions. Implement a decision tree for classification and apply it to a real dataset.

Graduate project icon

Graduate project

This project covers the core topics of the Coding Machine Learning Algorithms course, making it sufficiently challenging to be a proud addition to your portfolio.

At least one graduate project is required to complete the course.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the necessary topics from your course to get it built. Here’s what awaits you:
Take a first look at the dataset and start with building several functions
Write the splitting function for one node.
Implement a recursive split function for your classifier.
Gather all the functions in a Python class.
Add a predicting method to your decision tree.
Measure the quality of your model with categorical features.
Upgrade the split function by adding numerical features.
Update the recursive split and predicting functions so that they can deal with numerical features.
Measure the quality of your model on the full dataset.

Reviews

Aneurin Sutton avatar
Aneurin Sutton
11 months ago
I learned how to calculate weighted gini impurities for both categorical and numerical features to best choose how to split data when fitting a custom decision tree.
Roland Onderka avatar
Roland Onderka
12 months ago
I finally understood how a decision tree is working. Great project!
Andrei Raugas avatar
Andrei Raugas
2 years ago
While working on this project, I gained valuable hands-on experience in creating and training models. This helped me gain a deeper understanding of how to effectively use 'pandas' for data analysis and manipulation, including data preprocessing, feature engineering, and model evaluation.

4.5

Learners who completed this project within the Coding Machine Learning Algorithms course rated it as follows:
Usefulness
4.8
Fun
4.7
Clarity
4.1