In other words, this is a classic graph database problem. Relationships matter as much as, if not more than, the data itself. Neo4j is the most popular graph database on the market these days. While graph databases are part of the NoSQL movement, they solve different problems than, say, Couchbase or MongoDB. We aren't necessarily concerned with handling massive scale or doing analytics across terabytes of big data a la Hadoop's HBase. In fact, most graph databases are transactional. They are NoSQL because SQL is simply inadequate to express these problems, as you can see from the amount of code it took in the findSuggestions method.
For the Granny4j version using Neo4j, the main query comes down to this:
// select friends and friends of friends, order by depth of the relationship
String findFriendsQuery = "start n=node(*), person=node({userNode}) "
    + "MATCH p = (person)-[:FRIEND*1..2]-(friend) "
    + "return distinct p order by length(p)";
As you can see there is a lot less code -- and it does the job. It's also more efficient. Check out all the code for Granny4j.
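To make concrete what the pattern `(person)-[:FRIEND*1..2]-(friend)` asks for, here is a minimal in-memory sketch in plain Java. It is not the Granny4j code and does not use the Neo4j API; the `FriendGraph` class and its method names are hypothetical, invented for illustration. It also simplifies one thing: the Cypher query returns distinct paths, while this sketch returns each person once, at the shortest depth found.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of Granny4j or the Neo4j API.
class FriendGraph {
    // Undirected FRIEND relationships stored as adjacency sets.
    private final Map<String, Set<String>> friends = new HashMap<>();

    void addFriendship(String a, String b) {
        friends.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    // Breadth-first search limited to maxDepth hops. BFS reaches nearer
    // people first, so the result is already ordered by depth.
    List<String> findFriends(String person, int maxDepth) {
        Map<String, Integer> depth = new HashMap<>();
        depth.put(person, 0);
        Deque<String> queue = new ArrayDeque<>();
        queue.add(person);
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            String current = queue.remove();
            int d = depth.get(current);
            if (d == maxDepth) {
                continue; // do not expand beyond the depth limit
            }
            Set<String> neighbours =
                friends.getOrDefault(current, Collections.emptySet());
            for (String next : neighbours) {
                if (!depth.containsKey(next)) {
                    depth.put(next, d + 1);
                    result.add(next);
                    queue.add(next);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FriendGraph g = new FriendGraph();
        g.addFriendship("granny", "ann");
        g.addFriendship("ann", "bob");
        g.addFriendship("bob", "carol"); // three hops away, so excluded
        System.out.println(g.findFriends("granny", 2)); // prints [ann, bob]
    }
}
```

Even this toy version needs a queue, a depth map, and a loop to say what the graph query states in a single pattern, which is the point of comparing line counts.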
Why is this important? Theoretically, you can hire offshore developers who know SQL for as little as $15 an hour -- meaning the technology and the people who know it are commoditized. Neo4j presumably requires more expensive expertise that is in lower demand. Nonetheless, there's always a correlation between lines of code and the number of bugs. We can decrease downtime and errors by decreasing the number of bugs per line, but that's an expensive process, and ultimately it's easier to decrease the number of lines of code.
There's also a big efficiency issue. Even on my laptop, the unit test for GrannyJPA takes considerably longer than the one for Granny4j. Consider this at the kind of scale a major retailer would require, take into account the law of diminishing returns, and you have a real performance and scalability issue.
The biggest objections to introducing a new structured storage technology usually relate either to the organization's experience with the technology or to the "single source of record." While the latter concern is indeed a problem when combining many types of NoSQL databases with an existing SQL database, it wouldn't be a problem with Neo4j. Like most graph databases, Neo4j is transactional. As for the former consideration, that exists with any new or, in this case, different technology.
Personally, I'd rather be moving forward at a deliberate pace and finding new efficiencies than standing in place because it's what I've always done. Moreover, graph database technology isn't that much younger than the RDBMS.
As I've mentioned in the past, it all comes down to data structures. By using the RDBMS for everything over the last few decades, the industry has done the equivalent of using a list for every data structure. You wouldn't use only one data structure for every type of data in memory. Why would you do that just because you're storing the data?
Sadly, my mom refuses to use any of the fancy tools I've developed. She's just stopped asking me what to get the kids and asks my wife instead.