Project

Goods Category Prediction

Challenging
9 completions
~34 hours
4.2

Master the art of text preprocessing with the Natural Language Toolkit (NLTK) by learning to tokenize, lemmatize, and apply part-of-speech tags to text descriptions. Learn the fundamentals of Word2Vec, a popular technique for generating word embeddings, and explore its application in capturing semantic relationships within text. Gain hands-on experience in training a Word2Vec model and a Random Forest model for category predictions. Build a user-friendly web application with Streamlit and leverage Streamlit Cloud and GitHub for wider accessibility.

Provided by

JetBrains Academy

About

Retail platforms are flooded with millions of products, making accurate product categorization crucial for both retailers listing their products and customers searching for products to purchase. This project equips you with the skills to build a system that does just that! You will use Natural Language Processing (NLP) techniques to understand product descriptions and discover how word embeddings capture the contextual meanings of words in these descriptions. By the end of the project, you will not only have a solid grasp of NLP but also have the satisfaction of showcasing your work in a user-friendly web app!

Graduate project

This project covers the core topics of the MLOps Engineer course and is challenging enough to be a worthy addition to your portfolio.

At least one graduate project is required to complete the course.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the necessary topics from your course to get it built. Here’s what awaits you:
Preprocess the descriptions using tokenization, lemmatization, and part-of-speech tagging
Train a Word2Vec model and transform the descriptions into a machine-readable format
Fit a Random Forest classifier on the word embedding vectors to make product category predictions
Create a user-friendly web application to showcase predictions
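
To make the step from word embeddings to classifier input more concrete, here is a minimal sketch of one common approach: averaging the vectors of a description's tokens into a single fixed-length feature vector. It is not the project's reference solution. The embedding table `TOY_EMBEDDINGS`, its values, and the helper names `tokenize` and `description_vector` are hypothetical and stand in for a trained Word2Vec model and NLTK preprocessing.

```python
from statistics import fmean

# Hypothetical 3-dimensional embeddings (made-up values) standing in for
# the vectors a trained Word2Vec model would provide.
TOY_EMBEDDINGS = {
    "cotton": [0.9, 0.1, 0.0],
    "shirt": [0.7, 0.3, 0.0],
    "usb": [0.0, 0.2, 0.9],
    "cable": [0.1, 0.0, 0.8],
}


def tokenize(text):
    """Rough stand-in for NLTK tokenization: lowercase and split on whitespace."""
    return text.lower().split()


def description_vector(text, embeddings, dim=3):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    vectors = [embeddings[token] for token in tokenize(text) if token in embeddings]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) groups the values component by component.
    return [fmean(component) for component in zip(*vectors)]


# One fixed-length feature vector per description, ready to be used as a
# row of the feature matrix passed to a classifier.
features = description_vector("Cotton shirt", TOY_EMBEDDINGS)
```

In the actual project the vectors would presumably come from the trained model rather than a hand-written table, and each description's averaged vector would become one input row for the Random Forest classifier. Averaging is only one option; it ignores word order, which is often acceptable for short product descriptions.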

Reviews

Rimmary
12 months ago
This project is even worse than Random Forest from Scratch. At least there you had to understand how a random forest works. Here you use almost nothing from the study plan. Most of it can be done with ctrl-c + ctrl-v. The theory questions are harder and more useful than this project. I did it withou ...

4.2

Learners who completed this project within the MLOps Engineer course rated it as follows:
Usefulness
4.0
Fun
3.5
Clarity
5.0