Hive - create a table from zip file

Developer https://www.devze.com 2023-03-13 18:38 Source: Internet
I have a bunch of zip files of CSVs that I want to create a Hive table from. I'm trying to figure out the best way to do so.

  • Unzip the files, upload them to HDFS.
  • Is there a way to copy the files to HDFS, unzip the
  • Or is there any other better / recommended way?


It's common practice to convert CSV files to a tab-separated, Ctrl-A, or Ctrl-B delimited format and then upload them to Hadoop/Hive.
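As a rough illustration of that conversion, here is a minimal Python sketch that reads the CSV members of a zip archive and rewrites them with Ctrl-A (`\x01`) as the field delimiter. The function name `zip_csvs_to_ctrl_a`, the `.txt` output naming, and the `skip_header` flag are hypothetical choices made for this example. It assumes UTF-8 input and that no field contains a Ctrl-A character or an embedded newline.

```python
import csv
import io
import zipfile
from pathlib import Path

CTRL_A = "\x01"  # Ctrl-A, Hive's default field delimiter for text tables


def zip_csvs_to_ctrl_a(zip_path, out_dir, skip_header=False):
    """Rewrite every .csv member of zip_path as a Ctrl-A delimited text file.

    Returns the list of files written. Assumes UTF-8 input and that no field
    contains a Ctrl-A character or an embedded newline (hypothetical helper).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            target = out_dir / (Path(name).stem + ".txt")
            with archive.open(name) as raw, \
                    open(target, "w", encoding="utf-8", newline="") as out:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text)  # handles quoted commas correctly
                if skip_header:
                    next(reader, None)
                for row in reader:
                    out.write(CTRL_A.join(row) + "\n")
            written.append(target)
    return written
```

Using `csv.reader` rather than splitting on commas keeps quoted fields such as `"Smith, J"` intact. Members with the same base name in different folders of the archive would overwrite each other in this sketch.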

To upload files to HDFS, you can use the following command:

hadoop fs -put file_to_upload hdfs_path

I assume you would like to automate this. In that case, the following steps will be helpful.

  1. Create a Hive table with columns that map to the CSV files' fields (you can drop unnecessary fields at this step). Choose your delimiter in the Hive CREATE TABLE statement.

  2. Convert the CSV files to a delimited format (Ctrl-A or Ctrl-B).

  3. Upload the files to the Hive table's location.

You can automate the above steps using Python batch processing scripts or a framework.
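One way such a script might look is sketched below. It only builds the CREATE TABLE statement and the `hadoop fs -put` commands, and prints them unless `dry_run` is turned off. The table name, column list, and HDFS directory are placeholders you would supply. The helper names are hypothetical, and the sketch assumes the `hive` and `hadoop` command-line tools are on the PATH and that files have already been converted to Ctrl-A delimited text.

```python
import subprocess


def build_create_table(table, columns, delimiter_octal="001"):
    """Return a HiveQL CREATE TABLE statement for a delimited text table.

    columns is a list of (name, hive_type) pairs; delimiter_octal is the
    field delimiter as an octal escape ("001" is Ctrl-A, "002" is Ctrl-B).
    """
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\{delimiter_octal}'\n"
        "STORED AS TEXTFILE"
    )


def build_put_command(local_file, hdfs_dir):
    """Return the hadoop fs -put command as an argument list."""
    return ["hadoop", "fs", "-put", str(local_file), hdfs_dir]


def load(table, columns, local_files, hdfs_dir, dry_run=True):
    """Create the table, then upload each file to the table's HDFS directory.

    With dry_run=True the commands are only printed, not executed.
    """
    commands = [["hive", "-e", build_create_table(table, columns)]]
    commands += [build_put_command(f, hdfs_dir) for f in local_files]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if a command fails
    return commands
```

For example, `load("sales", [("id", "INT"), ("name", "STRING")], ["a.txt"], "/user/hive/warehouse/sales")` prints the commands it would run. The warehouse path shown here is only an assumption, since the actual table location depends on your Hive configuration.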

For further reading: http://wiki.apache.org/hadoop/Hive/GettingStarted
