Modern applications and services use databases for different purposes: from supporting key operations like user authorization, messaging, and payments, to conducting advanced analytical queries that allow businesses to get insights and make better decisions. Depending on the usage patterns and type of workload, databases and other storages can be classified into two large groups: OLTP and OLAP.
In this topic, we will be reviewing the differences between the OLTP and OLAP systems. By the end of the topic, you will have a clear understanding, of which model is right in your case.
OLTP and OLAP
Both terms OLTP and OLAP relate to online query processing but have different focuses.
OLTP stands for Online Transaction Processing. It focuses on processing multiple transactions for conducting key business operations.
OLAP refers to Online Analytical Processing. It focuses on executing complex analytical queries to get insights about customers, processes, hidden patterns in data, and so on.
Keep in mind that neither of these approaches is "better" because they are not rivals. OLTP and OLAP complement each other. Your choice between the systems should be guided by the problem you want to deal with.
The terms OLTP and OLAP have been used long before NoSQL emerged. Since that time the situation has changed a lot and now NoSQL is almost as common as classic relational databases. Not all modern databases and storages can be easily classified as OLTP or OLAP, but this classification still can be partially applied to NoSQL. So, we will refer to the terms when applicable to keep this information up-to-date with the current situation.
In the next sections, we are going to examine OLTP and OLAP in detail and then provide you with a table featuring the key differences between these approaches. Let's start with OLTP!
What is OLTP?
Online Transaction Processing (OLTP) systems are usually used to manage day-to-day business operations and work in almost real-time. Good examples of OLTP systems are online banking, retail and delivery, hotel booking systems, various messengers, and other multi-user systems.
Originally, OLTP highly relied on traditional relational databases (Oracle, MySQL, PostgreSQL, etc) that comply with the ACID (Atomicity, Consistency, Isolation, and Durability) guarantees, as the data is critical to everyday business operations and must not be lost or become inconsistent. Databases that follow the OLTP approach often use highly-normalized structures to protect data integrity from incorrect write operations (inserts or updates).
From the workload point of view, OLTP systems are optimized for processing a massive number of short and relatively simple transactions that represent CRUD-like (create, read, update, and delete) operations on domain objects. At the same time, modern OLTP systems can process a tremendous amount of such queries, up to hundreds of thousands or even a million transactions per second. Such types of OLTP systems are considered to be highly loaded and, as a rule, they should operate 24/7 and be constantly available.
As you may know, the popularity of NoSQL databases has grown a lot over the last few years. Nowadays they can be considered for typical OLTP workload, even if they don't fully comply with ACID. A good example here is document storage named MongoDB, which can outperform classic relational databases in some cases (especially in horizontal scaling) and provides much more flexibility in schema design. Another example would be Apache Cassandra, a column-oriented database that is often used for OLTP-like workload. Even without strong consistency guarantees, it provides fast writes and is highly available due to its distributed nature.
What is OLAP?
Online Analytical Processing (OLAP) allows analysts, managers, and executives to get insights from data for financial reports, sales forecasting, decision-making purposes, and other data analysis tasks. Results produced by OLAP can be used by executives to plan the future of the entire company.
Unlike OLTP, where transactions are the main source of data in the system, OLAP gets data as inputs from different already existing OLTP databases through various data loading procedures.
What about the structure of OLAP systems, such systems usually store huge amounts of historical data in a denormalized form that is more suitable for executing aggregations, groupings, calculations, and other operations. It allows OLAP to be optimized for executing long and complex analytical queries on huge amounts of data. The queries might last from seconds to hours.
In practice, OLAP systems can be implemented using relational databases (Oracle, MySQL, PostgreSQL, etc). This technique is named ROLAP (or relational OLAP). However, the most common and flexible way is to use specific software called Data Warehouse (DWH) that stores data in form of multi-dimensional hypercubes or columnar databases. Modern examples of data warehouses are Snowflake, Google BigQuery, Amazon Redshift, and others.
When OLAP should load or process extremely large amounts of data, it is often aided by BigData technologies such as Apache Hadoop, Apache Spark, and so on.
The key differences
After reviewing both OLTP and OLAP, it is time to compare all the key differences. Please, take a look at the table below. You don't need to memorize everything, just to understand the basic differences between OLTP and OLAP.
# | Key difference | OLTP systems | OLAP systems |
1 | Used for | key business operations in near real-time | data analysis, planning, decision-making |
2 | Data source | data used in transactions | historical data from different OLTP databases |
3 | Backed by | initially, relational databases; | mostly data warehouses; |
4 | Optimized for | processing a large number of short transactions | processing large amounts of data |
5 | Types of operations | all types of read and write operations (selects, inserts, updates, deletes) | mostly read operations; write operations are rare |
6 | Queries | relatively simple queries | complex and slow queries |
7 | Processing time | milliseconds | from seconds to hours |
8 | Data updates | multiple inserts and updates initiated by end-users of the system | data is usually inserted by batches regularly (for example, once a day) |
9 | Schema normalization | usually, a highly-normalized schema to preserve data integrity | usually denormalized |
10 | Backup and recovery | require rigorous backup and recovery procedures to keep data safe | require backup only from time to time; |
The border between OLTP and OLAP
After looking at the differences, you may think that OLTP and OLAP have nothing in common. However, in practice, OLAP and OLTP are not strict categories, they are more like a spectrum. Most real data storages focus on either OLTP or OLAP but also provide some features for dealing with the opposite kind of workload if it ever occurs. Moreover, modern companies often maintain several data storages that work together to provide both transaction processing and analytics facilitation.
There is also a relatively new category called Hybrid Transactional/Analytical Processing (HTAP) for handling both types of workload within a single database management system. Some people also call this type of work augmented transactions or translytical.
Conclusion
In this topic, you have learned the main distinctions between the two approaches to online data processing. Whereas OLTP handles multiple transactions from users to support key business operations, OLAP is an online historical storage that allows analysts to retrieve data and perform data analysis. These two approaches also differ in the type of workload they can serve: OLTP mostly deals with quick and simple CRUD-like queries, while OLAP serves complex long-running queries that involve large amounts of data. Despite plenty of differences, they are not adversaries. Moreover, they are often used together: for instance, when data produced by operations in OLTP is sent to OLAP for accumulation and further analysis.