To NoSQL or Not to NoSQL

Explore a few considerations when deciding if NoSQL is right for you.

NoSQL was a revolutionary advance in technology when it first began to gain popularity. As time has passed and more teams have gained hands-on experience with it in long-lived projects, the hype has died down and opinions have changed.

In this article we will take a look at how NoSQL compares to SQL and the considerations to keep in mind when deciding whether or not to use a NoSQL database.

What is NoSQL?

The first step to understanding when you should and shouldn't use NoSQL is having a firm grasp of NoSQL as a concept.

"NoSQL" itself is not a great name as it doesn't really encompass all that a NoSQL database has to offer. There are many different flavors of NoSQL. For example:

  • Document databases: e.g. MongoDB, DynamoDB
  • Key-Value databases: e.g. Redis, Memcached, Couchbase
  • Wide-column stores: e.g. Cassandra, BigTable, CosmosDB
  • Graph databases: e.g. Neo4j, ArangoDB, TigerGraph

What all of these have in common is that they are non-relational databases, meaning they do not store data in a fixed schema of relational tables. Often when someone refers to a NoSQL database, they mean a document store such as MongoDB. As the list above shows, however, there are many variations to keep in mind.

NoSQL refers to any database that stores data in non-relational constructs.

Primary features

Below are just a few of the notable features of NoSQL databases that make them such a useful technology:

Flexible data structures

The most obvious feature of NoSQL is the fact that it does not have to follow a schema pre-defined at the database-level. This allows you as a developer to have the flexibility of storing data in a chosen shape per-record. One record in a collection may look completely different than the next!

{
   "id": "sdf8s7d76sfds9d0fs",
   "firstName": "Sabin",
   "lastName": "Adams",
   "favoriteColor": "#000000"
}

You may have another record in that same collection that looks more like this:

{
   "id": "k3j4h5g6f7d8s9a0pq",
   "firstName": "Thomas",
   "lastName": "Newman",
   "favoriteColor": "#1f1f1f",
   "hobby": "music composition"
}

Notice the new hobby property.

This is perfectly acceptable in a NoSQL database and allows you as a developer to not have to worry about having your long-term data structure defined right from the start.
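As a minimal sketch of what this flexibility means for application code, the snippet below uses a plain Python list of dictionaries as a stand-in for a document collection. No real database driver is involved, and the `users` list, the short ids, and the `describe` helper are all hypothetical names chosen for illustration.

```python
# Hypothetical in-memory "collection": a list of dicts standing in for a
# document store. The ids are placeholders, not values from a real database.
users = [
    {"id": "u1", "firstName": "Sabin", "lastName": "Adams",
     "favoriteColor": "#000000"},
    {"id": "u2", "firstName": "Thomas", "lastName": "Newman",
     "favoriteColor": "#1f1f1f", "hobby": "music composition"},
]


def describe(user: dict) -> str:
    """Build a display string, tolerating documents that lack a hobby."""
    hobby = user.get("hobby", "no hobby listed")
    return f"{user['firstName']} {user['lastName']}: {hobby}"


summaries = [describe(user) for user in users]
```

The trade-off is visible in `describe`: because the database does not guarantee a shape, the code that reads the records has to handle missing fields itself.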

Quick iterations

A side-effect of having flexible schemas is that NoSQL allows developers and teams to iterate very quickly as their applications progress and evolve.

When an application requires data structure changes within the database, making those changes is as simple as adding or removing fields in the records you write. The database doesn't care what data you give it, so long as it is in a valid format.

Potentially faster queries

NoSQL has the potential for faster queries compared to similar queries against a traditional relational database.

In NoSQL, relations are not handled via an underlying schema. Instead, they are often maintained manually using nested data structures. For example, consider a model where a user can have many posts:

{
   "id": "9879sdfs98df",
   "fullName": "Sabin Adams",
   "posts": [{ ... }]
}

As shown above, the posts data is nested within the user record itself, which makes it easy and efficient to fetch a user's related posts. Rather than following a foreign key to a separate table and looking up the linked rows there, everything is packaged in one spot.
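The two access patterns can be compared side by side in a rough sketch. The embedded version is a plain dictionary standing in for a document, while the relational version uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample posts are assumptions made for this example.

```python
import sqlite3

# Document style: posts are embedded in the user record (hypothetical data).
user_doc = {
    "id": "u1",
    "fullName": "Sabin Adams",
    "posts": [{"title": "Hello"}, {"title": "World"}],
}
embedded_titles = [post["title"] for post in user_doc["posts"]]

# Relational style: posts live in their own table and point back to the user.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id TEXT PRIMARY KEY, full_name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL REFERENCES users(id),
        title TEXT NOT NULL
    );
    INSERT INTO users VALUES ('u1', 'Sabin Adams');
    INSERT INTO posts (user_id, title) VALUES ('u1', 'Hello'), ('u1', 'World');
    """
)
rows = conn.execute(
    "SELECT p.title FROM users u JOIN posts p ON p.user_id = u.id "
    "WHERE u.id = ? ORDER BY p.id",
    ("u1",),
).fetchall()
joined_titles = [title for (title,) in rows]
```

Both approaches return the same titles. The difference is that the embedded read touches a single record, while the relational read has to combine two tables.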

Horizontally scalable

The last point I will touch on is that NoSQL databases excel at being horizontally scalable.

Horizontal scaling refers to increasing the capacity of a system by adding additional machines (nodes), as opposed to increasing the capability of the existing machines. This is also called scaling out. - Cockroach Labs

The way data is structured in a NoSQL database makes it easier to shard and replicate across multiple machines (or nodes). When using a NoSQL database, you often give up some of the built-in features a traditional relational database gives you, such as referential integrity and multi-record transactions, in return for an easily partitionable pool of data.

When a NoSQL database runs up against performance walls, you can add more nodes to the cluster and spread the data and the processing load across them.
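One common way to decide which node stores a record is to hash its key. The sketch below illustrates that idea only. The node names and the `shard_for` and `place` helpers are hypothetical, and real databases generally use more elaborate schemes (such as consistent hashing) so that adding a node does not move most of the keys.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def shard_for(key: str, nodes: list[str]) -> str:
    """Pick a node from a stable hash of the record key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def place(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node that would store them."""
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[shard_for(key, nodes)].append(key)
    return placement


keys = [f"user-{i}" for i in range(1000)]
placement = place(keys, NODES)
```

Because each record can be located from its key alone, no node needs to know about the others' data to serve a single-record read or write.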

How Is It Different From SQL?

The sections above focused specifically on the features of NoSQL, but now let's take a look at how those features set NoSQL apart from traditional relational databases:

Rigid data structure

A SQL database has a very rigid data structure. In fact, you cannot begin using and storing data in your database without first defining a data structure and the relationships within that data structure.

This has positive and negative implications. With regard to how this sets SQL apart from NoSQL, rigid data structures:

  • Make iteration much slower. Changing your SQL data model requires you to update your schema via a migration before you can store new data or remove data.
  • Cause a higher barrier of entry. Migrations are difficult to manage and easy to get wrong. Because of this, it is best practice to think all the way through your data needs before applying your initial schema to minimize the number of major structure changes. This takes time and resources.
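To make the migration step concrete, here is a small sketch using Python's built-in `sqlite3` module and an in-memory database. The `users` table and its columns are assumptions for illustration, and real projects usually manage this step with a dedicated migration tool rather than ad hoc statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO users VALUES ('u1', 'Sabin', 'Adams')")

insert_with_hobby = (
    "INSERT INTO users (id, first_name, last_name, hobby) "
    "VALUES ('u2', 'Thomas', 'Newman', 'music composition')"
)

# Before the migration, the table rejects a column it does not know about.
try:
    conn.execute(insert_with_hobby)
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The migration: alter the schema first, then store the new shape.
conn.execute("ALTER TABLE users ADD COLUMN hobby TEXT")
conn.execute(insert_with_hobby)
hobbies = conn.execute("SELECT id, hobby FROM users ORDER BY id").fetchall()
```

Existing rows receive `NULL` for the new column, so every row still conforms to one shared shape after the change.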

The above are presented as negatives. However, the time spent defining and updating a rigid data model allows you to construct well-defined relations in your data, which in turn allow you to:

  • Enforce relation rules
  • Establish ACID properties, allowing safe transactional operations across many tables of data
  • Improve data compatibility
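The first two benefits can be sketched with Python's built-in `sqlite3` module. The table names and sample values below are hypothetical. Note that SQLite in particular only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE users (id TEXT PRIMARY KEY);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL REFERENCES users(id),
        title TEXT NOT NULL
    );
    INSERT INTO users VALUES ('u1');
    """
)

try:
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "INSERT INTO posts (user_id, title) VALUES ('u1', 'Valid post')"
        )
        # 'ghost' is not a known user, so the relation rule rejects this row.
        conn.execute(
            "INSERT INTO posts (user_id, title) VALUES ('ghost', 'Orphan post')"
        )
    committed = True
except sqlite3.IntegrityError:
    committed = False

post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

The second insert violates the foreign key, so the whole transaction is rolled back and the valid first insert is discarded along with it.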

Vertically scalable

Unlike NoSQL, which is horizontally scalable, traditional SQL databases are scaled vertically.

Vertical scaling refers to increasing the capacity of a system by adding capability to the machines it is using (as opposed to increasing the overall number of machines). This is also called scaling up. - Cockroach Labs

Relational data, due to the nature of the underlying data structures that make up relations, does not lend itself as naturally to horizontal scaling. A SQL query with joins across different tables assumes it has immediate access to the related data. If the database were distributed across several nodes, the related data for such a query might exist only on another node, requiring a network request to fetch that segment of the data.

This poses problems because the database cannot easily tell, when generating a query plan, whether all of the data it needs is available locally.
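The cost of a cross-node join can be illustrated with a toy model. Everything below is hypothetical: the `NODES` dictionary stands in for two database servers, and the counter stands in for network round trips. It is not how any particular database implements distributed joins.

```python
# Hypothetical layout: the users and posts tables have been split across two
# nodes, so a user's posts are not guaranteed to sit beside the user row.
NODES = {
    "node-a": {
        "users": [{"id": "u1", "full_name": "Sabin Adams"}],
        "posts": [{"user_id": "u1", "title": "Hello"}],
    },
    "node-b": {
        "users": [],
        "posts": [{"user_id": "u1", "title": "World"}],
    },
}


def posts_for_user(user_id: str, local_node: str) -> tuple[list[str], int]:
    """Collect a user's post titles and count the remote nodes consulted."""
    titles: list[str] = []
    remote_requests = 0
    for name, tables in NODES.items():
        if name != local_node:
            remote_requests += 1  # stands in for a network round trip
        titles.extend(
            post["title"] for post in tables["posts"] if post["user_id"] == user_id
        )
    return titles, remote_requests


titles, remote_requests = posts_for_user("u1", "node-a")
```

Without knowing where the matching rows live, the query has to ask every other node, which is the overhead a single-machine relational database avoids.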

Because of this, SQL databases are typically scaled vertically. When a SQL database runs up against performance issues, the database server itself is upgraded to a more powerful machine that can handle the desired workload.

Considerations

At this point, you should have a general overview of some of the primary differences between NoSQL and SQL databases. But, as we saw above, both have their own pros and cons.

So we now come to the question this article is aiming to answer: To NoSQL or Not to NoSQL. The answer to this question is the typical: It depends.

Every application is different and has different functional and organizational needs. SQL and NoSQL each solve specific problems the other can't account for. The answer to the question will depend on what your specific application's needs are.

Below are a few considerations to keep in mind when deciding if NoSQL is a fit for your needs:

Structure of data

One major consideration is the structure and nature of the data your application needs.

If you are expecting your data to consist primarily of non-relational data or large sets of unstructured data, NoSQL is a fantastic choice.

A few examples of when you might opt for a NoSQL database:

  • Social media-like applications where data and relationships are user-defined
  • Storage for unstructured variable data such as logs or IoT data
  • Applications such as content management systems where the data is primarily non-relational

Evolution of data

As discussed briefly above, the lack of a structured schema allows you to modify your data structure with little to no effort. This can be important for a team working to get an application out as soon as possible.

If your application is in its early phases and you are hoping to iterate and evolve it quickly, a NoSQL database is often a strong option.

Often teams will use a NoSQL database during experimentation phases. Once the product is complete, you can always take a look at the data needs of that application and determine whether a switch to a traditional SQL database would be beneficial.

Your team's capabilities

Another thing to consider when choosing a database type is the knowledge your team members possess. Working with a traditional SQL database requires the developer to understand how to write efficient SQL queries, often complex ones.

One of the reasons NoSQL was invented was frustration with the SQL language. With most NoSQL providers, your queries are run via an API built in your language of choice.

If your team consists primarily of developers who are not SQL-savvy, you may be better served by a NoSQL database.

Cost

Due to the flexible nature of hosting and deploying NoSQL databases, you also gain access to more granular control of the costs that go into your database.

As your database scales up and down, the cost of hosting your data scales with it. This can often lead to much lower costs compared to a traditional SQL database which is run 24/7 on a machine with static allocations of resources.

Key Takeaways

The key takeaway here is that the NoSQL vs. SQL debate does not have any one definitive answer. Each type of database has its own special attributes that make it more suitable for specific situations.

When choosing your database, be sure to consider the strengths and weaknesses of each kind of database, the human resources available on your team and the future data needs of your application.