Data Analysis for Hospitals. Stage 4/5

The statistics

Description

You have cleared your dataset of empty rows and values. Some values have also been corrected, and now we can start a comprehensive study of our data. In this stage, we will find the main statistical characteristics of our data, consider data distributions, and so on.

Answer the following questions and output the answers in the specified format.

  1. Which hospital has the highest number of patients?
  2. What share of the patients in the general hospital suffers from stomach-related issues? Round the result to the third decimal place.
  3. What share of the patients in the sports hospital suffers from dislocation-related issues? Round the result to the third decimal place.
  4. What is the difference in the median ages of the patients in the general and sports hospitals?
  5. After data processing at the previous stages, the blood_test column has three values: t = a blood test was taken, f = a blood test wasn't taken, and 0 = there is no information. In which hospital was the blood test taken most often (that is, which hospital has the largest number of t values in the blood_test column)? How many blood tests were taken there?
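Questions 1-4 can be approached with a few standard pandas calls. The following is a minimal sketch on a toy DataFrame, not the reference solution: the column names hospital, diagnosis, and age, the values used in them, and the helper diagnosis_share are assumptions made for illustration and may differ from your data.

```python
import pandas as pd

# Toy stand-in for the DataFrame from the previous stage.
# Column names and values here are assumptions, not the real dataset.
df = pd.DataFrame({
    "hospital": ["general", "general", "general", "sports", "sports", "prenatal"],
    "diagnosis": ["stomach", "cold", "stomach", "dislocation", "sprain", "pregnancy"],
    "age": [30, 40, 50, 20, 24, 28],
})

# Question 1: the hospital that appears most often in the hospital column.
top_hospital = df["hospital"].value_counts().idxmax()


def diagnosis_share(frame, hospital, diagnosis):
    """Share of one hospital's patients with the given diagnosis, rounded to 3 places."""
    subset = frame.loc[frame["hospital"] == hospital, "diagnosis"]
    return round(float((subset == diagnosis).mean()), 3)


# Questions 2 and 3: shares within a single hospital.
stomach_share = diagnosis_share(df, "general", "stomach")
dislocation_share = diagnosis_share(df, "sports", "dislocation")

# Question 4: difference between the median ages of two hospitals.
medians = df.groupby("hospital")["age"].median()
age_diff = medians["general"] - medians["sports"]

print(top_hospital, stomach_share, dislocation_share, age_diff)
```

Taking the mean of a boolean Series gives the fraction of True values, which is why no explicit division by the group size is needed.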

Tip: One way to solve the last problem is pandas.pivot_table(). Set aggfunc='count' and read the documentation to set the other parameters.
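As a rough illustration of the pivot_table approach, the sketch below counts blood_test values per hospital on a toy DataFrame. The hospital column name and the helper column n are assumptions introduced for the example; with aggfunc='count', any column without missing values could serve as values.

```python
import pandas as pd

# Toy data; only blood_test is named in the task, "hospital" is an assumed column name.
df = pd.DataFrame({
    "hospital": ["general", "general", "general", "sports", "sports", "prenatal"],
    "blood_test": ["t", "t", "f", "t", "0", "f"],
})

# Count rows for every (hospital, blood_test) pair.
# The helper column "n" only gives pivot_table something to count.
counts = pd.pivot_table(
    df.assign(n=1),
    values="n",
    index="hospital",
    columns="blood_test",
    aggfunc="count",
    fill_value=0,
)

# The hospital with the most "t" values, and how many there were.
best_hospital = counts["t"].idxmax()
n_tests = int(counts["t"].max())

print(best_hospital, n_tests)
```

Filtering the rows where blood_test equals t and calling value_counts() on the hospital column would be an alternative that avoids the helper column.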

Objectives

Use the DataFrame from the previous stage. It's not necessary here to set the maximum number of columns to display. The fourth stage requires completing one step:

Answer questions 1-5 using pandas library methods. Output each answer on a separate line in the format given in the Example section.

If your CSV files are corrupted, please download them again and unzip them in your working directory.

Example

The input is 3 CSV files, test/general.csv, test/prenatal.csv, and test/sports.csv.

The output: the following answers are given for reference only; the actual answers might be different.

The answer to the 1st question is Brighton
The answer to the 2nd question is 0.645
The answer to the 3rd question is 0.873
The answer to the 4th question is 35
The answer to the 5th question is Oxford, 178 blood tests
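One possible way to produce lines in this format is with f-strings. The sketch below reuses the reference values from the example above as placeholders; the variable names are hypothetical, and in a real solution the values would come from your own computations.

```python
# Placeholder values copied from the reference output above, not real results.
top_hospital = "Brighton"
stomach_share = 0.645
dislocation_share = 0.873
median_age_diff = 35
blood_hospital, blood_count = "Oxford", 178

lines = [
    f"The answer to the 1st question is {top_hospital}",
    f"The answer to the 2nd question is {stomach_share}",
    f"The answer to the 3rd question is {dislocation_share}",
    f"The answer to the 4th question is {median_age_diff}",
    f"The answer to the 5th question is {blood_hospital}, {blood_count} blood tests",
]

# One answer per line, as the stage requires.
print("\n".join(lines))
```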