SQL vs NoSQL

SQL vs NoSQL

Last updated
Sep 6, 2022 10:48 AM
Published
September 6, 2022
Tags
Systems Design
Reading Time

4 minutes

Intro

Ya’ll know how it is

High level differences

SQL Databases
NoSQL Databases
Storage Model
Each row represents an entity while each column represents a data point.
Document: JSON documents Key-value: key-value pairs Wide-column: tables with rows and dynamic columns Graph: nodes and edges
Schema
Each row conforms to a strict, fixed schema. Columns must be decided and chosen beforehand before data entry can commence. Rigid
Flexible
Purpose
General Purpose
Document: General purpose Key-value: Large amounts of data with simple lookup Wide-column: Large amounts of data with predictable query patterns Graph: Analysing and traversing relationships between connected graph-like data
Real world examples
Oracle, MySQL, Microsoft SQL Server, and PostgreSQL
Document: MongoDB and CouchDB, Key-value: Redis and DynamoDB, Wide-column: Cassandra and HBase, Graph: Neo4j and Amazon Neptune
Scaling
Vertical
Horizontal
ACID
Compliant
Not compliant. Typically eventually consistent, but can do multi-record ACID transactions at the cost of latency
History
Developed in the 1970s with a focus on the de-duplication of data. ie normalisation
Developed in the late 2000s with a focus on scaling and allowing for rapid application change.

Benefits of SQL

Data is ACID compliant

Here’s a quick refresher on what that means:

Atomicity: All transactions must succeed or fail completely and cannot be left partially complete, even in the case of system failure.

Consistency: The database must follow rules that validate and prevent corruption at every step. Data is always in a consistent state.

Isolation: Concurrent transactions cannot affect each other. Achieved through locking of various database resources.

Durability: Transactions are final, and even system failure cannot “roll back” a complete transaction.

Benefits of NoSQL

  • Flexible data models
  • Horizontal Scaling
  • High performance with simple queries
  • Rapid development

The key driving force in whether you would use one or the other:

If you’ve read a few articles or watched a few videos on this topic, you will notice that these info pieces will often mention the fact that SQL is scaled vertically while NoSQL is scaled horizontally and move on to some other difference. But there is actually a very deep implication with this:

image

CAP theorem informs us that in a distributed system we can only achieve two out of three tenets when it comes to Consistency, availability, and Partition Tolerance. This applies to databases as well.

Truth bomb: The choice of what database you use will always be informed by what the most appropriate consistency model for your domain is.

We will explore this concept in depth in my article on CAP theory and relate it back to distributed noSQL databases. But in short here’s a quick reference you can use when choosing your database:

CA (Consistency & Availability)

  • SQL RDBMS
    • MSSQL
    • MySQL
    • PostgresQL

AP (Availability & Partition Tolerance)

  • CouchDB
  • CassandraDB
  • Dynamodb

CP (Consistency & Availability)

  • Redis
  • MongoDB

sources

https://www.mongodb.com/nosql-explained/nosql-vs-sql

https://www.integrate.io/blog/the-sql-vs-nosql-difference

https://www.educative.io/courses/grokking-the-system-design-interview

https://www.ibm.com/cloud/blog/sql-vs-nosql