Types of Data

When you dive into the world of statistics, marketing research, or data science, you encounter different types of data. These data types act like tools in a toolbox, and understanding them helps you choose the right one for your task.

Whether you're a student, a business enthusiast, a marketer, or someone interested in data science, learning about these data types is essential. Why is it important? Well, it's like learning to use the right tool for the job. When you understand data types, you can measure things accurately, and that helps you make better decisions.

Data types

Before embarking on a scientific adventure, it's vital to grasp the different forms of data, which are classified by their measurement scales. Just like a detective needs the right tools to solve different kinds of mysteries, you'll need to match your analytical methods to the type of data you're investigating during Exploratory Data Analysis (EDA).

Let's look at the schema below to get a basic understanding of what types of statistical data exist:

To present your data effectively, you'll need to match it with a suitable visualization method. Think of data types as organizational labels for different variables. Further on we'll delve into the primary variable categories and examine an example for each.

Qualitative or categorical data

Categorical data, also known as qualitative data, describes information that fits into categories. It's about labels that group things together, not about quantities. Categorical data is used for features like a person's gender, hometown, and more. Sometimes, categorical data can include numbers, but those numbers don't have a mathematical meaning: adding or averaging them tells you nothing. Below are some examples of categorical data:

  • Gender (categories: Male, Female)
  • Favorite Sport (categories: Soccer, Basketball, Tennis)
  • School Postcode (categories: 12345, 67890)
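The point about numbers without mathematical meaning can be illustrated with a minimal Python sketch. The postcode values are made up for illustration, and storing them as strings is just one common way to guard against accidental arithmetic.

```python
from collections import Counter

# Hypothetical postcodes: digits used as labels, not as quantities.
school_postcodes = ["12345", "67890", "12345", "12345", "67890"]

# Counting how often each label occurs is meaningful for categorical data.
postcode_counts = Counter(school_postcodes)
print(postcode_counts.most_common())

# An "average postcode" can be computed, but it describes nothing real.
meaningless_average = sum(int(code) for code in school_postcodes) / len(school_postcodes)
print(meaningless_average)
```

The average does not correspond to any postcode in the data, which is a quick sign that the values are labels rather than measurements.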

Nominal Data

Nominal data is one type of categorical data that's all about labeling variables without any order or quantitative value. It's like giving names or titles to things without ranking them. For example:

  • Letters in the Alphabet (categories: A, B, C)
  • Words (categories: Apple, Banana, Cherry)
  • Colors (categories: Red, Blue, Green)

Nominal data can be displayed using pie charts to show the percentage of each category.
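As a rough sketch of what a pie chart would encode, the share of each category can be computed directly. The list of answers below is invented sample data, and the variable names are only illustrative.

```python
from collections import Counter

# Hypothetical survey answers: the favorite color of each respondent.
colors = ["Red", "Blue", "Green", "Red", "Blue", "Red", "Red", "Green"]

counts = Counter(colors)
total = len(colors)

# Share of each category in percent, which is what the pie slices represent.
shares = {color: 100 * n / total for color, n in counts.items()}

# The mode (most frequent category) is the sensible "typical value" here,
# because nominal categories cannot be ordered or averaged.
mode_color, mode_count = counts.most_common(1)[0]

for color, share in shares.items():
    print(f"{color}: {share:.1f}%")
print("Most frequent:", mode_color)
```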

Ordinal Data

Ordinal data is another type of categorical data, one whose categories follow a natural order, like a ranking. The important thing is, you know which comes first, second, and so on, but you can't measure the exact difference between them. Ordinal data is commonly found in surveys, finance, questionnaires, and more. For example:

  • Educational Levels (categories: High School, Bachelor's, Master's, Ph.D.)
  • Customer Satisfaction (categories: Very Satisfied, Satisfied, Not Satisfied)
  • Movie Ratings (categories: Excellent, Good, Average)

To understand ordinal data, you can use tools like bar charts to visualize the order or ranking. It helps when you want to see which category is better or worse, but you can't precisely say how much better or worse they are.
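One possible way to work with ordinal data in code is to state the order explicitly. The sketch below uses the educational levels from the list above with invented responses. The `LEVELS` and `RANK` names are assumptions made for this example.

```python
# Explicit order for the educational levels, from lowest to highest.
LEVELS = ["High School", "Bachelor's", "Master's", "Ph.D."]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical survey responses (an odd number, so the middle one is unique).
responses = ["Master's", "High School", "Ph.D.", "Bachelor's", "High School"]

# Sorting by rank respects the natural order; plain alphabetical sorting
# would put "Bachelor's" before "High School".
ordered = sorted(responses, key=RANK.__getitem__)

# The median is meaningful for ordinal data: it is the middle response.
median_response = ordered[len(ordered) // 2]

print(ordered)
print("Median:", median_response)
```

The ranks only encode position. Subtracting or averaging them would assume equal gaps between levels, which ordinal data does not guarantee.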

Quantitative data

Quantitative data is the easiest to comprehend. It provides answers to fundamental questions like "how many", "how much", and "how often".

Quantitative data can be represented as numbers, making it quantifiable and measurable through numerical variables.

It lends itself well to statistical analysis and can be effectively illustrated through various types of statistical charts and graphs, such as line charts, bar graphs, and scatter plots.

Here are some examples of quantitative data:

  • Total monthly sales
  • Temperature in degrees Celsius
  • Time in seconds to complete a task
  • Distance in kilometers

Quantitative data can be further divided into two categories: discrete data and continuous data. In statistics, marketing research, and data science, many decisions hinge on whether the underlying data is discrete or continuous.

Discrete Data

Discrete data consists of whole numbers only; it cannot be broken down into fractions.

For example, the number of employees in a company is a discrete data point. You can count complete individuals; you can't count 1.5 employees.

In simple terms, discrete data can only take specific, separate values, and these values cannot be subdivided further. The possible values can be counted one by one, even if there are very many of them.

Here are some examples of discrete data:

  • Number of website visitors per day
  • Number of items in a shopping cart
  • Number of cars in a parking lot
  • Number of employees in a department

Continuous Data

Continuous data can be divided into ever smaller units. It is measured on a scale or continuum and can take on any numeric value within a range.

For instance, you can measure your weight with extreme precision using kilograms, grams, or milligrams.

Continuous data can be collected across various measurements, including width, temperature, time, and more. The key distinction from discrete data is that continuous variables can assume values between any two numbers.

Here are some examples of continuous data:

  • Height of individuals in millimeters
  • Temperature in degrees Fahrenheit
  • Time in milliseconds to complete a task
  • Distance in meters from one point to another
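The practical difference between discrete and continuous data often shows up when summarizing them. Here is a minimal sketch with made-up numbers: discrete counts are tallied directly, while continuous measurements are grouped into intervals first. The bin width of 5 seconds is an arbitrary choice for illustration, and floating-point numbers only approximate a true continuum.

```python
from collections import Counter

# Discrete: hypothetical numbers of items in shopping carts (whole numbers).
items_per_cart = [1, 3, 2, 3, 1, 3]
# Exact values repeat, so a plain frequency table works well.
item_frequencies = Counter(items_per_cart)

# Continuous: hypothetical task times in seconds (any value within a range).
task_times = [12.4, 15.1, 12.9, 18.7, 16.3, 14.0]
# Exact values rarely repeat, so they are grouped into intervals (bins).
BIN_WIDTH = 5.0
binned_times = Counter(int(t // BIN_WIDTH) * BIN_WIDTH for t in task_times)

# Between two different continuous values another value is always possible.
midpoint = (task_times[0] + task_times[2]) / 2

print(dict(item_frequencies))
print(dict(binned_times))
print(midpoint)
```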

Interval vs. ratio data

Interval and ratio data are two essential forms of quantitative data used in the world of statistics and research. They offer a higher level of precision compared to ordinal and nominal data, making them invaluable for in-depth analysis.

Interval Data

Interval data, also called interval-level data, is measured on a scale where the gaps between neighboring values are equal. However, it doesn't have a "real zero point": a value of zero doesn't mean there's none of that thing. You can add and subtract interval values, but multiplying or dividing them isn't meaningful. Some examples are temperatures in Celsius or Fahrenheit, IQ scores, and calendar years like 2020, 2021, 2022. With interval data, differences between two numbers mean something, but you can't say one thing is "twice as much" as another. For example, you can say, "Today is 10 degrees warmer," but you can't say, "It's twice as hot."

Ratio Data

Ratio data, also called ratio-level data, is a lot like interval data, but it's special because it has a "real zero point." This zero really means there's none of that stuff. You can do all sorts of math with ratio data, like multiplying and dividing. Some examples include height, weight, age, income, and distance. With ratio data, both differences and ratios make sense. You can say things like "The second person is 10 cm taller than the first person," and "The third person's height is twice that of the first person."

Basically, the big difference between interval and ratio data is whether or not they have a "real zero point." Interval data doesn't, while ratio data does. This makes a big difference in what kind of math you can do with them.
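A small sketch can show why the zero point matters. With interval data, the apparent ratio of two values changes when the unit changes. With ratio data, it stays the same. The temperatures and heights below are invented for illustration.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def cm_to_inches(cm: float) -> float:
    """Convert a length from centimeters to inches."""
    return cm / 2.54


# Interval data: the "ratio" of two temperatures depends on the unit.
cold_c, warm_c = 10.0, 20.0
ratio_celsius = warm_c / cold_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cold_c)

# Ratio data: the ratio of two heights is the same in any unit.
short_cm, tall_cm = 90.0, 180.0
ratio_cm = tall_cm / short_cm
ratio_inches = cm_to_inches(tall_cm) / cm_to_inches(short_cm)

print(ratio_celsius, ratio_fahrenheit)  # the two numbers disagree
print(ratio_cm, ratio_inches)           # the two numbers agree (up to rounding)
```

Because 20 °C is "twice" 10 °C only on the Celsius scale, the statement "twice as hot" carries no real meaning, while "twice as tall" holds in centimeters and inches alike.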

Conclusion

In this topic, we found out about the following types of data used in statistics:

  • Qualitative data, which can be nominal or ordinal;
  • Quantitative data, which can be discrete or continuous;
  • Quantitative data measured on an interval or a ratio scale.

Understanding these various data types is like having a diverse set of tools at your disposal. Each tool serves a specific purpose, and your knowledge of data types helps you select the right one for your analytical needs. Whether you're dissecting a complex dataset, conducting market research, or delving into the world of data science, a firm grasp of these data types is your key to accurate measurement and informed decision-making. So, embrace the world of data types, and let your analytical journey begin with some practice.
