I’m still new to Cassandra, and I’m writing this post as a small first step for others who are new to it as well. I’m still in the early stages of understanding Cassandra, but by writing a blog post I aim to build a solid foundation for myself and to give others something dependable to progress from.
A few months back I was looking for something different from my day-to-day coding. I searched through different topics and had been exploring options beyond JPA, which is one of my favorite areas to explore. From there the topic moved to NoSQL databases.
There are a number of NoSQL databases, such as Cassandra, HBase, MongoDB, SimpleDB, Voldemort, etc., but most of them have little in common with one another.
Where does Cassandra fit in? Why Cassandra?
Cassandra addresses some of the common issues that relational databases run into with high-volume data. The amount of structured data becomes massive, especially for internet-based companies like Google, Amazon and Facebook. Vertical scaling of hardware is not a good option for relational databases, and some of them cannot scale vertically far enough to handle such large amounts of structured data.
Along with high scalability and availability, these companies also wanted to run effectively on commodity hardware instead of high-end, high-cost hardware. Google and Amazon, in parallel, came up with similar ideas for achieving highly scalable mass storage on commodity hardware: Google came up with Bigtable and Amazon with Dynamo. For a better understanding of the general NoSQL context, I would recommend reading the article "Finding the Right Data Solution for Your Application in the Data Storage Haystack".
Cassandra can be described as a combination of ideas from Bigtable and Dynamo. It combines the shared-nothing architecture of Dynamo with the key-value (column family) data model of Bigtable to create a next generation of databases that provide high scalability and the other expected features, including the same effectiveness on commodity hardware. For a better understanding of Cassandra, I would recommend reading the Cassandra white paper published by DataStax.
Furthermore, if you want a quick overview of Cassandra without much reading, the video below from DataStax is a good place to start.