Project

Read Quality Control

155 completions
~ 15 hours
3.9

Dive into the world of bioinformatics by exploring one of its basic data types: sequencing data. Learn the key parameters of data quality and automate data evaluation with Python. See how programming can be applied to real-world biological tasks. If you love solving problems at the intersection of the sciences, this project is for you.

Provided by

Edvancium

About

Good afternoon, it's the HR department!

We've received your application for the bioinformatics researcher position. However, it is not clear from your CV whether you have the skills to assess the quality of raw sequencing data. We have a small test for you: select the best bacterial data archive and explain why. The archives are attached below. It would be excellent if your program could also report various data quality metrics. Good luck!

We are looking forward to hearing from you,

Your HR.

Training project

This project allows you to practice and strengthen your coding skills, helping you get ready for more advanced tasks ahead.

What you'll learn

Once you choose a project, we'll provide you with a study plan that includes all the necessary topics from your course to get it built. Here’s what awaits you:
Evaluate the first quality parameter, which is based on the length of reads.
Learn how to spot contamination.
Identify repeated reads and extract them.
Detect the unsequenced sections in reads and calculate the fifth quality parameter.
Choose one of the three archives and explain your choice based on what you have learned.
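
The steps above can be sketched in Python. This is only a rough illustration, not the project's reference solution. It assumes the archives are gzip-compressed FASTQ files (four lines per record) and uses GC content as one common proxy for contamination. The function names (`read_fastq`, `summarize_reads`, `summarize_archive`, `pick_best`) and the ranking rule in `pick_best` are hypothetical and not taken from the project.

```python
import gzip
import io


def read_fastq(handle):
    """Yield the sequence line (2nd of every 4 lines) from a FASTQ text stream."""
    for index, line in enumerate(handle):
        if index % 4 == 1:
            yield line.strip().upper()


def summarize_reads(reads):
    """Return basic quality metrics for an iterable of read sequences."""
    reads = [r for r in reads if r]  # skip empty sequences
    if not reads:
        raise ValueError("no reads to summarize")
    total = len(reads)
    # Per-read GC share and per-read share of unsequenced bases (N), in percent.
    gc = [100 * (r.count("G") + r.count("C")) / len(r) for r in reads]
    n = [100 * r.count("N") / len(r) for r in reads]
    return {
        "reads": total,
        "average_length": round(sum(len(r) for r in reads) / total),
        # Reads that duplicate an earlier read.
        "repeats": total - len(set(reads)),
        "reads_with_n": sum(1 for r in reads if "N" in r),
        "gc_average": round(sum(gc) / total, 2),
        "n_per_read": round(sum(n) / total, 2),
    }


def summarize_archive(path):
    """Summarize a gzip-compressed FASTQ file (assumed archive format)."""
    with gzip.open(path, "rt") as handle:
        return summarize_reads(read_fastq(handle))


def pick_best(summaries):
    """Hypothetical rule: fewest reads with N, ties broken by fewest repeats."""
    return min(
        summaries,
        key=lambda name: (summaries[name]["reads_with_n"], summaries[name]["repeats"]),
    )


# Small in-memory example: two identical reads and one read containing N.
sample = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nGGNN\n+\nIIII\n"
print(summarize_reads(read_fastq(io.StringIO(sample))))
```

To compare real archives, one could build a dictionary such as `{path: summarize_archive(path) for path in paths}` and pass it to `pick_best`. The ranking criteria are a judgment call, and explaining that call is part of the task.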

Reviews

nahandove
2 years ago
I know some bioinformatics, but this is still a nice data science project to play with. I learned how to read from files and archives, and a bit about the use of objects and OOP in Python (which I have not used in this project--I used dictionaries instead).
Paweł Albrycht
2 years ago
Actually not much. Some selected info about one technique out of a million from biology. Tasks were complicated only because of the poor description.
Viacheslav Shalisko
3 years ago
Regexps, string counting, gunzip. All in IDE only. There were some issues with the last stage 6, as the latest IDE was not able to apply tests correctly (the .gz files it downloaded for the test were corrupted), but I resolved it by installing an older IDE version.

3.9

Learners who completed this project within the course rated it as follows:
Usefulness: 4.3
Fun: 3.9
Clarity: 3.6