This is a blog series on using the Neo4j NoSQL database.
Introduction
I have been around databases for 30 years, starting out with IDMS on ICL mainframes. That progressed into Sybase and Microsoft SQL Server. I cut my teeth on relational databases; however, one size no longer fits all. Those previous eras can be referred to as the first and second database revolutions, targeting mainframes, client/server and the beginnings of the web.
The third database revolution is NoSQL. With the growth of the cloud, social media, big data, mobile and IoT, a new wave of technologies needed to be designed to cope with the different demands placed on databases.
NoSQL databases can be categorized as follows:
- Document Orientated
- Key Value Store
- Wide Column
- Graph
It is often easier to talk about NoSQL by comparing it against a relational database. The following comparison is by no means complete; it covers just some of the common differences.
| SQL | NoSQL |
| --- | --- |
| Predefined schema | No schema enforcement (all data has a schema) |
| Tabular | Various data structures |
| ACID compliance | |
| Scales vertically | Scales horizontally |
| Full transaction support | Partial transaction support |
| SQL query language | SQL-like language or custom language |
There are other NoSQL types, but the four listed above are the most popular. The DB-Engines Ranking website lists databases ranked by popularity, based on data extracted from search engines. It uses the data to determine how often a particular database engine is searched for, and thereby assesses how popular it is. At the time of writing, the most popular graph database is Neo4j. In case you are wondering, the name was originally Neo, and the 4j was “for Java”, since Java is the language it is written in. Although it can be accessed from a multitude of other languages, the name stuck and is now the actual product name.
Neo4j
What exactly is a graph database? It is built on graph theory, which models the relationships between pairs of objects. It really outshines relational and other NoSQL databases when you have a lot of highly related data. It easily answers questions that would require a multitude of hierarchical and self-joins in a relational database. The following table lists the features of Neo4j.
| Feature | Details |
| --- | --- |
| Platform | |
| Version | |
| Data Durability | |
| Indexes | |
| Query Language | |
| Compelling Features | |
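To make the idea of "objects and the relationships between them" a little more concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data and it does not use any Neo4j API; the `Node` and `Relationship` classes, the `FRIEND_OF` and `LIKES` relationship types, and the sample people are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A thing in the graph, e.g. a person or an interest (hypothetical model)."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed link between two nodes (hypothetical model)."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Illustrative sample data, not taken from any real dataset.
alice = Node("Person", {"name": "Alice"})
bob = Node("Person", {"name": "Bob"})
martial_arts = Node("Interest", {"name": "Martial Arts"})

relationships = [
    Relationship(alice, "FRIEND_OF", bob, {"since": 2015}),
    Relationship(bob, "LIKES", martial_arts),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` by following `rel_type` relationships."""
    return [r.end for r in relationships
            if r.start is node and r.rel_type == rel_type]


friend_names = [n.properties["name"] for n in outgoing(alice, "FRIEND_OF")]
print(friend_names)
```

The point of the sketch is that the relationship is a first-class piece of data with its own type and properties, rather than something reconstructed at query time through joins.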
When would you use Neo4j?
The simple answer is any data that is highly related and requires nested traversals. You can view the Use Cases website for a list of common applications. I always think of an application like Facebook: “How would I find the friends of a friend who are potential friends of mine, who like martial arts and have other friends with the same like?” It sounds like a bit of gobbledygook, but imagine trying to answer that question in SQL. A simpler, more familiar example is a sat nav: “What is the fastest way from A to B via C while avoiding motorways?” You get the picture.
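The friends-of-a-friend question can be sketched as a short traversal over an in-memory graph. This is only a rough illustration in plain Python, not Neo4j code: the `friends` and `likes` dictionaries, the names in them, and the `suggest_friends` function are all invented for the example.

```python
# Hypothetical sample data: who is friends with whom, and who likes what.
friends = {
    "me": {"anna", "ben"},
    "anna": {"me", "carl", "dina"},
    "ben": {"me", "dina", "eric"},
    "carl": {"anna"},
    "dina": {"anna", "ben", "eric"},
    "eric": {"ben", "dina"},
}
likes = {
    "me": {"Martial Arts"},
    "anna": {"Cooking"},
    "ben": {"Martial Arts"},
    "carl": {"Chess"},
    "dina": {"Martial Arts"},
    "eric": {"Martial Arts"},
}


def suggest_friends(person, interest):
    """Friends of friends who share `interest` and have another friend who does too."""
    direct = friends[person]

    # Step 1: hop twice, skipping the person and anyone already a direct friend.
    candidates = set()
    for friend in direct:
        for friend_of_friend in friends[friend]:
            if friend_of_friend != person and friend_of_friend not in direct:
                candidates.add(friend_of_friend)

    # Step 2: keep candidates who like the interest and have at least one
    # other friend (not the person asking) with the same like.
    suggestions = []
    for candidate in candidates:
        if interest not in likes[candidate]:
            continue
        if any(interest in likes[f] for f in friends[candidate] if f != person):
            suggestions.append(candidate)
    return sorted(suggestions)


print(suggest_friends("me", "Martial Arts"))
```

Each extra "of a friend" in the question is just one more hop in the loop. In a relational database the same question would typically mean another self-join on the friendship table for every hop.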
Installing Neo4j
Neo4j runs on all major platforms. It is available on cloud platforms, in Docker containers, or installed as a standalone system. At the time of writing there are two editions available: Community and Enterprise. There is also a Desktop installation that installs the Enterprise Edition with a developer licence and includes all the tools and features needed to try it out.
The rest of this demo uses the Desktop installation, which can be found here.
In the next blog I will discuss some of the basics.