Apache Cassandra Key-Value Store Platform
An Apache Cassandra Key-Value Store Platform is an open source distributed key-value store platform managed by the Apache Cassandra Project.
- Context:
- It can (typically) support Eventually Consistent Semantics.
- It can (typically) have no Single Point of Failure.
- It can (typically) Scale Linearly.
- It can (typically) be used to create Stateful Services.
- It can be instantiated in a Cassandra DBMS Instance.
- It can support Cassandra Tables (a type of associative array).
- It can include a Cassandra CLI and Cassandra CQL (Cassandra Query Language).
- It can be associated with a Cassandra Connector (such as a Cassandra Spark connector).
- Example(s):
- Apache Cassandra v3.11.10 [1] (~2021-02).
- Apache Cassandra v3.11.4 [2] (~2019-02).
- Apache Cassandra v3.11 (~2017-06).
- Apache Cassandra v2.2.10 (~2017-06).
- …
http://cassandra.apache.org/download/
- as running in Amazon Keyspaces (for Apache Cassandra).
- …
- Counter-Example(s):
- AWS DynamoDB.
- ScyllaDB.
- Apache HBase.
- Aerospike DBMS.
- Google BigTable.
- Redis, fast writes, supports transactions.
- Voldemort Data Store System, horizontally scalable, no key iteration or indexing.
- MongoDB, uses sharding and memory-mapped files.
- Couchbase.
- See: Distributed Data Store, Cross-Platform Framework, Single Point of Failure, Computer Cluster, DataStax, Inc.
References
2020
- https://zdnet.com/article/a-closer-look-at-amazon-keyspaces/
- QUOTE: ... It's ironic. Apache Cassandra was arguably the first NoSQL platform to introduce a truly distributed operational database into the wild. But it's also one of the last to get its own managed DBaaS (Database-as-a-Service) cloud service, which is something for which AWS – and DataStax – have gotten plenty of demand. Both have had managed services in preview over the past few months, and now, AWS has gotten it ready for release. AWS offers a native, optimized implementation of Cassandra that it terms a "serverless Apache Cassandra-compatible service." …
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Apache_Cassandra Retrieved:2017-7-27.
- Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. Cassandra also places a high value on performance. In 2012, University of Toronto researchers studying NoSQL systems concluded that "In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments" although "this comes at the price of high write and read latencies."
2012
- http://wiki.apache.org/cassandra/DataModel
- Cassandra is a partitioned row store, where rows are organized into tables with a required primary key.
The first component of a table's primary key is the partition key; within a partition, rows are clustered by the remaining columns of the PK. Other columns may be indexed independent of the PK.
This allows pervasive denormalization to "pre-build" resultsets at update time, rather than doing expensive joins across the cluster.
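The partitioned row store described above can be sketched in plain Python. This is a minimal illustrative model, not Cassandra's implementation: the `PartitionedTable` class, its method names, and the sample `sensor_id`/`ts`/`temp` columns are all hypothetical. It only shows how a partition key groups rows and how the remaining primary-key columns order rows within a partition.

```python
from bisect import insort
from collections import defaultdict


class PartitionedTable:
    """Toy model of a partitioned row store (illustrative only)."""

    def __init__(self, partition_key, clustering_keys):
        self.partition_key = partition_key
        self.clustering_keys = tuple(clustering_keys)
        # partition key value -> list of (clustering tuple, row), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, row):
        """Insert a row; a row with the same full primary key is replaced."""
        pk = row[self.partition_key]
        ck = tuple(row[c] for c in self.clustering_keys)
        partition = self._partitions[pk]
        # Drop any existing row with the same clustering key (upsert).
        partition[:] = [(k, r) for k, r in partition if k != ck]
        # Keep rows ordered by their clustering columns.
        insort(partition, (ck, dict(row)))

    def read_partition(self, pk):
        """Return the rows of one partition in clustering order."""
        return [row for _, row in self._partitions.get(pk, [])]


table = PartitionedTable("sensor_id", ["ts"])
table.insert({"sensor_id": "a", "ts": 3, "temp": 21.5})
table.insert({"sensor_id": "a", "ts": 1, "temp": 20.0})
table.insert({"sensor_id": "b", "ts": 2, "temp": 18.2})
table.insert({"sensor_id": "a", "ts": 3, "temp": 22.0})  # replaces the earlier ts=3 row
rows_a = table.read_partition("a")
```

Reading one partition touches a single key and returns rows already in clustering order, which is the property that makes "pre-built" result sets cheap to read.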
2011
- http://wiki.apache.org/cassandra/
- Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.
Cassandra was open sourced by Facebook in 2008, where it was designed by Avinash Lakshman (one of the authors of Amazon's Dynamo) and Prashant Malik (Facebook Engineer). In a lot of ways you can think of Cassandra as Dynamo 2.0 or a marriage of Dynamo and BigTable. Cassandra is in production use at Facebook but is still under heavy development.
2009
- http://www.eflorenzano.com/blog/post/my-thoughts-nosql/
- … Originally developed by Facebook, it was developed by some of the key engineers behind Amazon's famous Dynamo database.
Cassandra can be thought of as a huge 4-or-5-level associative array, where each dimension of the array gets a free index based on the keys in that level. The real power comes from that optional 5th level in the associative array, which can turn a simple key-value architecture into an architecture where you can now deal with sorted lists, based on an index of your own specification. That 5th level is called a SuperColumn, and it's one of the reasons that Cassandra stands out from the crowd. Cassandra has no single points of failure, and can scale from one machine to several thousands of machines clustered in different data centers. It has no central master, so any data can be written to any of the nodes in the cluster, and can be read likewise from any other node in the cluster. It provides knobs that can be tweaked to slide the scale between consistency and availability, depending on your particular application and problem domain. And it provides a high availability guarantee, that if one node goes down, another node will step in to replace it smoothly.
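The consistency/availability "knobs" mentioned above are usually explained through replica counts: if each key is stored on N replicas, a write is acknowledged by W of them, and a read consults R of them, then the read and write sets must share at least one replica whenever R + W > N. The sketch below only illustrates that arithmetic; the function names are hypothetical and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas: floor(N / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True if any R-replica read must intersect any W-replica write."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


n = 3
q = quorum(n)                                    # majority of 3 replicas
quorum_both = read_overlaps_write(n, q, q)       # quorum writes + quorum reads
one_both = read_overlaps_write(n, 1, 1)          # fastest, weakest setting
write_all_read_one = read_overlaps_write(n, n, 1)
```

Lower values of W and R favor availability and latency, because fewer replicas need to respond; higher values favor consistency, at the cost of failing when too few replicas are reachable.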