Apache Hive Distributed RDBMS Platform

From GM-RKB

An Apache Hive Distributed RDBMS Platform is an RDBMS platform released by the Apache Hive Project.



References

2014


2011


  • (Wikipedia, 2011) ⇒ http://en.wikipedia.org/wiki/Apache_Hive
    • Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis.[1] While initially developed by Facebook, Apache Hive is now used and developed by other companies such as Netflix. Hive is also included in Amazon Elastic MapReduce on Amazon Web Services.

      Apache Hive supports analysis of large datasets stored in Hadoop compatible file systems such as Amazon S3 filesystem. It provides an SQL-like language called HiveQL while maintaining full support for map/reduce. To accelerate queries, it provides indexes such as the bitmap index.

      By default, Hive stores metadata in an embedded Apache Derby database; other client/server databases such as MySQL can optionally be used.

      Hive currently supports three file formats: TEXTFILE, SEQUENCEFILE, and RCFILE.

  1. Venner, Jason (2009). Pro Hadoop. Apress. ISBN 978-1430219422. 
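
The HiveQL and file-format points in the excerpt above can be illustrated with a minimal Python sketch that only assembles HiveQL statements as strings; it does not connect to Hive. The helper names (create_table_ddl, summary_query), the page_views table, its columns, and the s3n://example-bucket location are hypothetical and chosen for illustration. Only the three STORED AS format names come from the excerpt.

```python
# Illustrative sketch: builds HiveQL text only, nothing is executed against Hive.
# Table, column, and bucket names are hypothetical.

# The three storage formats named in the excerpt above.
SUPPORTED_FORMATS = ("TEXTFILE", "SEQUENCEFILE", "RCFILE")


def create_table_ddl(table, columns, location, file_format="TEXTFILE"):
    """Build a HiveQL CREATE EXTERNAL TABLE statement as a string."""
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file format: {file_format}")
    # columns is a list of (name, hive_type) pairs, e.g. ("url", "STRING").
    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({column_list}) "
        f"STORED AS {fmt} "
        f"LOCATION '{location}'"
    )


def summary_query(table, group_column):
    """Build a HiveQL aggregation that counts rows per group."""
    return (
        f"SELECT {group_column}, COUNT(*) AS n "
        f"FROM {table} GROUP BY {group_column}"
    )


ddl = create_table_ddl(
    "page_views",
    [("user_id", "BIGINT"), ("url", "STRING")],
    "s3n://example-bucket/page_views/",  # hypothetical Hadoop-compatible location
    file_format="rcfile",
)
query = summary_query("page_views", "url")
print(ddl)
print(query)
```

The statements are built by plain string interpolation, which is acceptable for a fixed illustration but would need identifier validation before being used with untrusted input.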

2009