Project

House Classification

Hard
68 completions
~ 19 hours
4.3

This project aims to show you the basic workflow of training a machine learning algorithm, from importing the data to evaluating the model’s performance. Moreover, you will learn how to use different data encoders.

Provided by

JetBrains Academy

About

Welcome to a real estate company located in Amsterdam. Your supervisor has asked you to create an ML model that predicts the price category of a house based on various parameters. The key difficulty is that most of the input data is categorical, so it has to be encoded before a model can use it. In this project, you will learn to prepare the input data and to apply a ready-to-use machine learning algorithm (a decision tree) to it.
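To make the encoding problem concrete, here is a minimal pure-Python sketch of what the three encoders used in this project do to a single categorical column. The column values and the binary price category are made-up toy data, and the helper functions are hypothetical illustrations rather than the sklearn API. Library target encoders typically add smoothing and should be fitted on the training split only.

```python
from collections import defaultdict

# Hypothetical toy data: one categorical feature and a binary price category.
areas = ["centre", "north", "centre", "south", "north", "centre"]
price_category = [1, 0, 1, 0, 1, 0]


def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def ordinal_encode(values):
    """Map each category to an integer index (alphabetical order here)."""
    mapping = {c: i for i, c in enumerate(sorted(set(values)))}
    return mapping, [mapping[v] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target observed for it (no smoothing)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    means = {c: sums[c] / counts[c] for c in counts}
    return means, [means[v] for v in values]


categories, one_hot = one_hot_encode(areas)
ordinal_map, ordinal = ordinal_encode(areas)
target_map, target = target_encode(areas, price_category)

print(categories, one_hot[0])
print(ordinal)
print(target_map)
```

One-hot encoding adds a column per category, ordinal encoding keeps a single column but imposes an order, and target encoding keeps a single column whose values depend on the labels, which is why it can leak information if fitted on test data.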

Training project

This project allows you to practice and strengthen your coding skills, helping you get ready for more advanced tasks ahead.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the necessary topics from your course to get it built. Here’s what awaits you:
Split the data into the features and the target variable and get the training and test splits.
Use the one-hot encoder to transform the categorical variables and train the decision tree classifier.
Use the ordinal encoder to transform the categorical variables and train the decision tree classifier.
Use the target encoder to transform the categorical variables and train the decision tree classifier.
Use the built-in sklearn methods to calculate the accuracy, recall, precision, F1 score, and ROC AUC of each classifier. Which encoder performed the best?
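As a rough illustration of what the evaluation step measures, the sketch below computes the listed metrics by hand for a binary target. In the project itself these would come from sklearn.metrics (accuracy_score, precision_score, recall_score, f1_score, roc_auc_score). The labels, predictions, and scores are made-up values, and the binary setup is an assumption: the real price category may have more than two classes, in which case an averaging strategy is needed.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels, from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outputs of one classifier on a six-row test split.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

Running the same evaluation on the predictions of each encoder's classifier gives one row of metrics per encoder, which is what the final comparison question asks for.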

Reviews

User 619502234
8 months ago
Various ways to encode and how to evaluate and compare the results.
Andrzej Molenda
1 year ago
Mainly to use different Encoders - it wasn't that clear to me before
Jay PB
1 year ago
This is a very good project. Learned about classification and decision trees.

4.3

Learners who completed this project within the Introduction to Data Science course rated it as follows:
Usefulness
4.6
Fun
4.2
Clarity
4.1