Project

Naive Bayes Classifier with Pen and Paper

Easy
6 completions
~ 7 hours
4.8

This project gives you a better understanding of how to turn words into numbers and how the Naive Bayes classifier works under the hood. You will also practice solving a classification problem on a simple dataset to get more comfortable with this kind of task.

Provided by

JetBrains Academy

About

Language identification is a Natural Language Processing task: determining which language a piece of text is written in. Essentially, it is a classification problem where you have to assign language labels to sentences or texts. Could you ever imagine that all you need for a language identification system is a pen and a piece of paper? Yes, you got it right: no programming skills, no linguistic expertise. Let’s try it out! We will use the Naive Bayes approach to create a simple classifier and solve this problem.

Training project

This project allows you to practice and strengthen your coding skills, helping you get ready for more advanced tasks ahead.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the necessary topics from your course to get it built. Here’s what awaits you:
Calculate the priors and likelihoods for words and languages.
Use Laplace smoothing to account for unknown words.
Identify the language of four short phrases using the Naive Bayes classifier and the data you prepared in the previous stages.
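
Although the project itself is done with pen and paper, the three steps above can also be sketched in a few lines of Python for readers who want to check their hand calculations. This is a minimal sketch, not the project's method or dataset: the training sentences, the function names (`train`, `log_scores`, `classify`) and the choice to use the training vocabulary size in the smoothing denominator are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy training data, invented for illustration (NOT the project's dataset).
TRAIN = [
    ("en", "the cat sat on the mat"),
    ("en", "the dog ate the bone"),
    ("es", "el gato come el pescado"),
]


def train(samples):
    """Count sentences per language and word occurrences per language."""
    doc_counts = Counter()   # language -> number of training sentences
    word_counts = {}         # language -> Counter of word frequencies
    vocab = set()            # every distinct word seen in training
    for lang, text in samples:
        words = text.lower().split()
        doc_counts[lang] += 1
        word_counts.setdefault(lang, Counter()).update(words)
        vocab.update(words)
    return doc_counts, word_counts, vocab


def log_scores(text, doc_counts, word_counts, vocab, alpha=1.0):
    """Return log P(lang) + sum of log P(word | lang) for each language."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for lang, n_docs in doc_counts.items():
        counts = word_counts[lang]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # prior
        for word in text.lower().split():
            # Laplace (add-alpha) smoothing: a word never seen in this
            # language still gets a small non-zero likelihood.
            # Assumption: the denominator uses the training vocabulary size.
            likelihood = (counts[word] + alpha) / (total_words + alpha * len(vocab))
            score += math.log(likelihood)
        scores[lang] = score
    return scores


def classify(text, model):
    """Pick the language with the highest score."""
    scores = log_scores(text, *model)
    return max(scores, key=scores.get)


model = train(TRAIN)
print(classify("the cat", model))
print(classify("el gato", model))
```

Working in log space replaces the product of many small probabilities with a sum, which avoids numerical underflow on longer texts; on paper, multiplying the raw fractions gives the same ranking.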

Reviews

synth
2 years ago
Moderator
I learned about the Naive Bayes classifier! It looks overwhelming at first, but it is in fact simple, and no hard calculations are required.
Amirhossein Biglari
3 years ago
An excellent project to start learning the Naive Bayes classifier.
Ayush
3 years ago
I have never enjoyed data science before, but now I believe it is going to be a joyful journey. This project seems difficult, but trust me, it is not. If you read carefully, it is going to be so much fun, and yes, with pen and paper.

4.8

Learners who completed this project within the Coding Machine Learning Algorithms course rated it as follows:
Usefulness
5.0
Fun
4.6
Clarity
4.8