I have a large MySQL table that I would like to transfer to a Hadoop/Hive table. Are there standard commands or techniques to transfer a simple (but large) table from MySQL to Hive? The table stores mostly analytics data.
First, download the MySQL JDBC driver (mysql-connector-java-5.0.8) and copy the jar into the lib and bin folders of Sqoop.
Next, create the table definition in Hive with the same field names and equivalent types as in MySQL.
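As a rough sketch of this step, the Hive definition could be created from the shell with `hive -e`. The columns below (id, name, salary) are hypothetical, since the question does not show the schema of employee, so substitute the real columns and their Hive types. The RUN_HIVE switch is also made up for illustration. The comma delimiter is chosen to match the --fields-terminated-by ',' option used in the import.

```shell
#!/bin/sh
# Hypothetical schema: the real columns of 'employee' are not shown in the question.
HIVE_DDL="CREATE TABLE IF NOT EXISTS employee (
  id INT,
  name STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;"

# Print the statement so it can be reviewed before anything is executed.
printf '%s\n' "$HIVE_DDL"

# Execute only when explicitly requested (RUN_HIVE=1, an illustrative switch)
# and when the hive CLI is available on the PATH.
if [ "${RUN_HIVE:-0}" = "1" ] && command -v hive >/dev/null 2>&1; then
  hive -e "$HIVE_DDL"
fi
```

Run it once as-is to check the generated statement, then again with RUN_HIVE=1 to create the table.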
sqoop import --verbose --connect jdbc:mysql://localhost/test --table employee --hive-import --warehouse-dir /user/hive/warehouse --fields-terminated-by ',' --split-by id --hive-table employee
test - Database name
employee - Table name (present in test)
/user/hive/warehouse - Directory in HDFS where the data has to be imported
--split-by id - Column used to split the import into parallel tasks; id can be the primary key of the table 'employee'
--hive-table employee - employee table whose definition is present in Hive
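The same import can be wrapped in a small script so that the values explained above are easy to change. This is only a sketch: the variable names, the sqoop_import function and the DRY_RUN switch are made up for illustration, while the Sqoop options are exactly the ones from the command above.

```shell
#!/bin/sh
# Illustrative wrapper; variable names and DRY_RUN are assumptions, not Sqoop features.
DB_NAME="test"                        # MySQL database name
TABLE="employee"                      # table in MySQL, and the Hive table name
SPLIT_COLUMN="id"                     # column used to split the import
WAREHOUSE_DIR="/user/hive/warehouse"  # HDFS directory for the imported data

sqoop_import() {
  # Keep the options as positional parameters so each stays a separate word.
  set -- import --verbose \
    --connect "jdbc:mysql://localhost/${DB_NAME}" \
    --table "${TABLE}" \
    --split-by "${SPLIT_COLUMN}" \
    --fields-terminated-by ',' \
    --hive-import \
    --hive-table "${TABLE}" \
    --warehouse-dir "${WAREHOUSE_DIR}"

  # By default only print the command; set DRY_RUN=0 to run the import.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "sqoop $*"
  else
    sqoop "$@"
  fi
}

sqoop_import
```

Printing the command first makes it easy to check the connection string and table names before starting a long-running import of a large table.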
Sqoop User Guide (one of the best guides for learning Sqoop)
Apache Sqoop is a tool that solves this problem:
Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.