hive
Replacing an SQL query with unix sort, uniq and awk
We currently have some data on an HDFS cluster on which we generate reports using Hive. The infrastructure is in the process of being decommissioned and we are left with the task of coming up with an...
2023-03-28 10:34 Category: Q&A
Hive out-of-the-box JSON parser
I have a text file containing JSON records that I would like to load into Hive. My JSON looks like: {"vr":1,"tm":1312816191516,"tms":"08-08-2011 15:09:51.516 GMT","as":1002,"pb":1102,"cts":[12...
2023-03-27 00:21 Category: Q&A
Is a collocated join (à la Netezza) theoretically possible in Hive?
When you join tables that are distributed on the same key and use those key columns in the join condition, each SPU (machine) in Netezza works 100% independently of the others (see nz-interview).
2023-03-26 15:46 Category: Q&A
How does Hive/Hadoop ensure that each mapper works on data that is local to it?
Two basic questions that trouble me: How can I be sure that each of the 32 files Hive uses to store my tables sits on its own machine?
2023-03-26 08:49 Category: Q&A
Make OLAP with Hadoop Hive from OLTP MySQL
I am a bit confused about Hadoop Hive, which, according to the wiki, is used for OLAP. Now I want to build OLAP on Hive from an OLTP database that uses MySQL.
2023-03-26 07:52 Category: Q&A
Register a Hive UDF using the Hue API
How do I register a UDF using the Hue API? I am using the code below, but it fails to register the UDF.
2023-03-25 22:54 Category: Q&A
What is the difference between an RDBMS and Hive? [closed]
Closed. This question needs to be more focused. It is not currently accepting answers. Want to improve this question? Update the question so it focuses on one problem only by editing this po...
2023-03-25 16:02 Category: Q&A
Using sorted tables in Hive
In summary: I feel that my system is ignoring the concept of pre-sorted tables. I expected to save time on the sorting step because I was using...
2023-03-24 21:38 Category: Q&A
Real-time querying/aggregating millions of records - Hadoop? HBase? Cassandra?
I have a solution that can be parallelized, but I don't (yet) have experience with Hadoop/NoSQL, and I'm not sure which solution is best for my needs. In theory, if I had unlimited CPUs, my results s...
2023-03-23 14:58 Category: Q&A
How to sort (order by) big data with Hive efficiently?
I want to sort a big dataset efficiently (i.e. with a custom partitioner, as described here: How does the MapReduce sort algorithm work?), but I want to do it with Hive.
2023-03-19 10:24 Category: Q&A