Project

Key Terms Extraction

381 completions
~ 28 hours
4.0

By completing this project, you will learn and implement two crucial text preprocessing stages, tokenization and lemmatization; use NLTK, an essential NLP library; work with XML files; and turn math formulas into code. Along the way, you will create a useful tool and learn how to read and write files with confidence.

Provided by

JetBrains Academy

About

Extracting keywords can help you get to the meaning of a text. It can also help you split texts into different categories. In this project, you will learn how to extract relevant words from a collection of news stories. There are many ways to do it, but we will focus on three methods: word frequencies, part-of-speech search, and TF-IDF. Note that each method can yield results of varying accuracy depending on the text. In practice, it is a good idea to try several methods and choose the one that works best for your data.
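
To make the frequency-based idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the project's reference solution: the function name `top_terms`, the tiny `STOP_WORDS` set, and the regex tokenizer are all hypothetical stand-ins for the NLTK tools the project uses.

```python
import re
from collections import Counter

# Hypothetical mini stop-word list, for illustration only.
# A real pipeline would likely use a fuller list, such as the one NLTK provides.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "for", "as"}


def top_terms(text, n=5):
    """Return the n most frequent non-stop-word tokens in text."""
    # Naive tokenization: lowercase the text and keep runs of letters only,
    # which also drops digits and punctuation.
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    counts = Counter(tokens)
    # Sort by descending frequency; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


story = "The markets rallied as the markets absorbed news of the rate cut and the rate outlook."
print(top_terms(story, n=2))
```

Raw frequency tends to favor words that are common everywhere, which is one reason the project goes on to part-of-speech filtering and TF-IDF.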

Training project

This project allows you to practice and strengthen your coding skills, helping you get ready for more advanced tasks ahead.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the necessary topics from your course to get it built. Here’s what awaits you:
Improve the results by applying lemmatization and deleting stop-words, digits, and punctuation.
Discover how to use part-of-speech tagging to extract the most frequent nouns and refine your keywords.
Find out how to identify words with the highest TF-IDF score.
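
As a rough picture of what the TF-IDF step computes, the sketch below scores terms by hand using one common textbook form: term frequency multiplied by the logarithm of (number of documents / number of documents containing the term). This is an assumption for illustration; libraries often use smoothed or normalized variants, so their numbers will differ. The function name `tfidf_scores` and the sample documents are hypothetical, and the input is assumed to be already tokenized and preprocessed.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Score each term in each document.

    docs is a list of token lists. Returns one dict per document,
    mapping term -> tf * log(N / df), an unsmoothed textbook variant.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        scores.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in tf.items()}
        )
    return scores


# Hypothetical, already-preprocessed news stories.
docs = [
    ["rate", "cut", "bank", "rate"],
    ["bank", "merger", "deal"],
    ["bank", "storm", "coast"],
]
scores = tfidf_scores(docs)
best = max(scores[0], key=scores[0].get)
print(best, round(scores[0][best], 3))
```

In this toy collection, "bank" appears in every story, so its score is zero, while "rate" is both frequent in the first story and absent from the others, so it ranks highest there.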

Reviews

Shashank Gupta
4 months ago
The Key Terms Extraction Project is a solid NLP tool that pulls out the most important terms from news articles using a clean TF-IDF workflow. With strong preprocessing steps like lemmatization, POS filtering, and stopword removal, the results come out accurate and relevant. The setup works well acr ...
Brian Smith
7 months ago
In the final stage, setting min_df=0.1, max_df=0.6 within TfidVectorizer does not yield any results. Remove them and you get the expected results. Why is it mentioned as a parameter requirement in the instructions?
Joydeep Chatterjee
11 months ago
The project was murky at the end in terms of specifying what exact steps were needed to produce the final correct result, but otherwise great learning experience on NLP in general.

4.0

Learners who completed this project within the course rated it as follows:
Usefulness
4.4
Fun
3.9
Clarity
3.6