
What is the difference between an RDBMS and Hive? [closed]

Developer https://www.devze.com 2023-03-25 16:02 Source: Internet
Closed. This question needs to be more focused. It is not currently accepting answers.

Want to improve this question? Update the question so it focuses on one problem only by editing this post.


Closed 7 years ago.


In an RDBMS like MySQL there are databases. Are there databases in Hive as well? From what I read in the manual, Hive only has tables, so I am a bit confused about it.

Also, what is the conceptual difference between an RDBMS and Hive?

Thanks in advance.


The main difference between RDBMS databases and Hive is specialization. While MySQL is a general-purpose database suited both to transactional processing (OLTP) and to analytics (OLAP), Hive is built for analytics only. Technically, the main difference is the lack of update/delete functionality: data can only be added and selected. At the same time, Hive is capable of processing data volumes that cannot be processed by MySQL or another conventional RDBMS (on a tight budget).
MPP (massively parallel processing) databases are the closest to Hive in functionality: they have full SQL support and scale up to hundreds of computers. Another serious difference is the query language.
Hive does not support full SQL, even in SELECT, because of its implementation. In my view the main difference is the lack of joins on any condition other than equality. The Hive query language syntax is also a bit different, so you cannot connect report generation software directly to Hive.
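
To illustrate why equality joins fit this execution model so naturally, here is a minimal Python sketch of a reduce-side equi-join. It is not Hive's actual implementation; the function name `equi_join` and the sample tables are made up for illustration. Rows are grouped by join key (the "shuffle"), and each group is then joined on its own. A condition such as `a.x < b.y` would need rows from different groups to be compared, which this partition-by-key scheme cannot do.

```python
from collections import defaultdict

def equi_join(left, right, left_key, right_key):
    """Join two lists of dicts where left[left_key] == right[right_key]."""
    # "Shuffle" step: rows from both inputs are grouped by join-key value.
    buckets = defaultdict(lambda: ([], []))
    for row in left:
        buckets[row[left_key]][0].append(row)
    for row in right:
        buckets[row[right_key]][1].append(row)

    # "Reduce" step: every bucket is joined independently of all others,
    # which is what lets the work be spread over many machines.
    joined = []
    for key in sorted(buckets):
        left_rows, right_rows = buckets[key]
        for l_row in left_rows:
            for r_row in right_rows:
                joined.append({**l_row, **r_row})
    return joined

# Hypothetical sample data.
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [{"uid": 1, "total": 30}, {"uid": 1, "total": 5}, {"uid": 3, "total": 9}]
result = equi_join(users, orders, "user_id", "uid")
```

Only user 1 has matching orders here, so `result` holds two joined rows; user 2 and the order for the unknown uid 3 drop out, as in an inner join.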


Basically, Hive is a SQL-like scripting language built on MapReduce. When you issue commands, they are interpreted and run over the distributed system. Since the files being crunched are flat, this is equivalent to running the corresponding code in Hadoop and gathering the data. The whole flow is much slower than it would be if you used MySQL.
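
As a rough picture of what "interpreted and run over the distributed system" means, the sketch below does in plain Python what a query like `SELECT page, COUNT(*) ... GROUP BY page` boils down to over flat files: a map step, a shuffle by key, and a reduce step. Everything runs in one process here, and the `user,page` line layout and function names are assumptions for the example, not anything Hive generates.

```python
from collections import defaultdict

def map_phase(lines):
    # Each flat-file line is assumed to look like "user,page".
    # Emit one (page, 1) pair per record.
    for line in lines:
        _user, page = line.strip().split(",")
        yield page, 1

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the ones for each key to get the per-page count.
    return {key: sum(values) for key, values in groups.items()}

lines = ["u1,home", "u2,home", "u1,cart"]
counts = reduce_phase(shuffle(map_phase(lines)))
```

On a real cluster each phase is spread over many machines and the intermediate data goes through disk and network, which is where much of the latency compared with a single MySQL server comes from.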


Hive vs. traditional database

Hive --> Schema on READ: it does not verify the schema when the data is loaded.
Traditional database --> Schema on WRITE: the table schema is enforced at data load time, i.e. if the data being loaded does not conform to the schema, it is rejected.

Hive --> Very easily scalable at low cost.
Traditional database --> Not very scalable; scaling up is costly.

Hive --> Based on the Hadoop notion of write once, read many times.
Traditional database --> We can read and write many times.

Hive --> Record-level updates are not possible.
Traditional database --> Record-level updates, insertions and deletes, transactions, and indexes are possible.

Hive --> OLTP (On-line Transaction Processing) is not yet supported, but OLAP (On-line Analytical Processing) is.
Traditional database --> Both OLTP and OLAP are supported.
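
The schema-on-read vs. schema-on-write point can be sketched in a few lines of Python. This is a toy model, not either system's real loader: `SCHEMA`, the function names, and the sample lines are all hypothetical. The "write" path validates each row at load time and rejects bad ones; the "read" path stores the raw lines untouched and only applies the schema when queried, turning fields that do not parse into `None` (to my understanding, Hive's text tables behave similarly and return NULL for such fields).

```python
SCHEMA = [("id", int), ("name", str), ("age", int)]  # hypothetical table

def parse(line):
    """Apply SCHEMA to one comma-separated line; ValueError on mismatch."""
    fields = line.split(",")
    if len(fields) != len(SCHEMA):
        raise ValueError("wrong column count")
    return {name: cast(value) for (name, cast), value in zip(SCHEMA, fields)}

def load_schema_on_write(lines):
    # Validation happens while loading; bad rows never reach the table.
    table, rejected = [], []
    for line in lines:
        try:
            table.append(parse(line))
        except ValueError:
            rejected.append(line)
    return table, rejected

def load_schema_on_read(lines):
    # Loading is just storing the raw lines; nothing is checked.
    return list(lines)

def query_schema_on_read(stored):
    # The schema is applied only now; unparseable fields become None.
    for line in stored:
        fields = line.split(",")
        row = {}
        for i, (name, cast) in enumerate(SCHEMA):
            try:
                row[name] = cast(fields[i])
            except (ValueError, IndexError):
                row[name] = None
        yield row

raw = ["1,alice,30", "2,bob,unknown"]
table, rejected = load_schema_on_write(raw)
rows = list(query_schema_on_read(load_schema_on_read(raw)))
```

With this input, the schema-on-write loader keeps one row and rejects the second line, while the schema-on-read path returns both rows, the second with `age` set to `None`.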

Otherwise, please see the URL below:

https://sensaran.wordpress.com/2016/01/30/comparison-with-hive-with-traditional-database/


This is not quite a response to the original question, but it exceeded the maximum comment size by 47 characters.

When you use an OLAP data warehouse built on HDFS and Hive, you are not quite barred from updating the fact data. You can do it in the very same way as many good RDBMS-based data warehouses do: by exchanging partitions between the stage and the warehouse. Table partitions in Hive are implemented as HDFS directories, so exchanging partitions is (almost) instantaneous: it takes only the time needed to rename an HDFS directory. Admittedly, you will have to call HDFS directly, bypassing the Hive interface, and you would likely employ straight MapReduce to maintain the stage, but in the data warehouses developed by the company I work for, it proved to be a good approach.
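
A minimal sketch of the rename-based exchange, using a local temporary directory as a stand-in for HDFS (an assumption: on a real cluster the renames would go through the HDFS API or CLI instead of `os.rename`). The `dt=...` directory naming follows Hive's partition layout; the table name `sales`, the paths, and the helper `exchange_partition` are hypothetical.

```python
import os
import tempfile

def exchange_partition(stage_part, live_part):
    """Swap a rebuilt partition directory into place using renames only."""
    retired = live_part + ".retired"
    if os.path.exists(live_part):
        os.rename(live_part, retired)  # move the old data aside
    os.rename(stage_part, live_part)   # publish the rebuilt partition
    return retired

# Build a toy layout: one live partition and one corrected copy in stage.
root = tempfile.mkdtemp()
live = os.path.join(root, "warehouse", "sales", "dt=2016-01-30")
stage = os.path.join(root, "stage", "sales", "dt=2016-01-30")
for path, content in [(live, "old rows\n"), (stage, "corrected rows\n")]:
    os.makedirs(path)
    with open(os.path.join(path, "part-00000"), "w") as f:
        f.write(content)

retired = exchange_partition(stage, live)
with open(os.path.join(live, "part-00000")) as f:
    published = f.read()
```

No data files are copied: the cost is two directory renames regardless of partition size, which is why the swap is nearly instantaneous. The old partition is left in a `.retired` directory so it can be inspected or deleted later.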


Hive was invented at Facebook and is much like SQL, but with little support for inner queries (subqueries). It allows you to use all types of joins and group functions as in SQL, and it also provides user-defined functions (UDFs), which can be written in Java or another language and used in Hive.

Hive is mainly used when the data is large, so that partitioning or clustering can be done; it is not generally used for single-row inserts or updates as we do in SQL.
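
On the "any other language" part: one way non-Java code gets used from Hive is the streaming TRANSFORM clause, which pipes rows to an external script as tab-separated lines and reads tab-separated lines back. Below is a sketch of what the body of such a script might look like in Python. The transformation itself (upper-casing the second column) and the function name are just an example, and the sample input stands in for what Hive would send on standard input.

```python
def transform(lines):
    """Upper-case the second column of each tab-separated row."""
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) > 1:
            cols[1] = cols[1].upper()
        yield "\t".join(cols)

# In a real streaming script the input would be sys.stdin and each
# result would be printed; a small in-memory sample is used here instead.
sample = ["1\tann\n", "2\tbob\n"]
output = list(transform(sample))
```

Keeping the row logic in a generator that takes any iterable of lines makes it easy to test without a cluster and to wire up to standard input later.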
