SQL stands for Structured Query Language and it is used for classic databases that most people today know how to use. They don’t require a person to have a lot of technical skill or training to be able to use the basics, making it one of the most popular choices among programmers. Even when it’s not the choice of the programmer, they often use subsets of SQL or create their own language similar to SQL.
SQL and traditional databases are the baseline of databases. SQL flavor databases, which are most databases today, are relational databases. Which means that they are really good at relating pieces of data to other pieces of data, finding overlaps and common ground, where data intersects and where it doesn’t.
For example, let’s say you are doing research for your business. Every employee is supposed to take certain compliance courses on an annual basis. For your company, there are three courses with tests that have to be completed by a certain time. That time is up in two weeks, so you want to know who has completed the required training. You can query a database to return only the people who have completed all of the training, or you can query it to return the ones who still have work to do. It will pull the list of courses, the list of employees and relate them together.
Much of programming in general is about relating data to other data. There are parent-child -type, cross sectional, and overlapping data relationships. These data relationships are how programming languages make applications and systems work properly.
NoSQL refers to any not-SQL database. Traditionally, these are object-oriented databases, which is a really helpful when you have a lot of data about an individual thing. An example of a website which uses object-oriented data is Facebook. Facebook users continually add more and more data about themselves to their profiles. Messages, photos, posts, likes, groups, etc. all link up on your profile, everything is about and stored under you.
The biggest problem with object-oriented databases isn’t really a problem with them. There are many SDK’s for OO Databases which make them feel like relational databases except they aren’t relational and the data is denormalized. For instance, if there are 20 services you can use in Facebook, then Facebook has to store data about those 20 services. Except most people don’t use all 20 services, and likely only use pieces of those services. Facebook only has to store the data that you use. Because it’s denormalized and because the data is linked together through you, it doesn’t matter if what’s stored under you is completely different than what’s stored under someone else. You only want your messages, your friends, your groups. What’s under someone else doesn’t matter because you only want what’s highly related to you.
This becomes a problem when you want to find something that includes other people. Let’s say you want to find all the groups that 40-year-olds belong to, but the object data isn’t designed to allow for that data to be elicited from it. So now you have to go through every single person and find if they’re 40 and then find whatever groups they belong to. In a relational database, you would simply query in one table “people who are 40” and in another table by a foreign key or another link, look up groups.
That example shows how relational databases can be more efficient than object-oriented databases if data is relational. A relational database completes this task with essentially a single query, where an object-oriented database is far more inefficient and essentially scans every object. With an object-oriented database, that backlog will start to build up and cause problems if you treat it like a relational database. When datastores get really big and you try to do cross sectional relationships and figure out how data is grouped, it begins to be a huge and arduous task because all of the data is attached to the person and there is no other way to look at it. So when you need to find one data attribute to group people’s messages together by time of day they were sent, you end up having to fish through all of the data to find what you’re looking for because you’re no longer looking by the object. You’re now looking for an attribute of the object’s objects, which makes it take a lot longer because you have to search through a many unnecessary data nodes on the tree.
NoSQL is object-oriented. SQL is relational. There are SDK’s to make one look like the other if needed, but under the hood, that is how they work. Understanding the difference is a key component into understanding what they are good for and when to use them and when not to use them.