1 Extract and install
hadoop@Master:~$ sudo tar xvfz sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
hadoop@Master:~$ sudo mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha /usr/local/sqoop1/
hadoop@Master:~$ sudo chmod -R 775 /usr/local/sqoop1
hadoop@Master:~$ sudo chown -R hadoop:hadoop /usr/local/sqoop1
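To double-check before moving on, list the directory; given the chmod and chown above, the mode column should read drwxrwxr-x and the owner and group should both be hadoop:

hadoop@Master:~$ ls -ld /usr/local/sqoop1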
2 Edit the environment variables
hadoop@Master:~$ sudo nano /etc/profile

#sqoop
export SQOOP_HOME=/usr/local/sqoop1
export PATH=$SQOOP_HOME/bin:$PATH

hadoop@Master:~$ source /etc/profile
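A quick way to confirm the variables took effect in the current shell (sqoop version should report the 1.4.6 build installed above):

hadoop@Master:~$ echo $SQOOP_HOME
/usr/local/sqoop1
hadoop@Master:~$ sqoop version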
3 Configure Sqoop's environment variables
Download the MySQL JDBC driver (mysql-connector-java-5.1.27.jar) and drop the jar into $SQOOP_HOME/lib.

hadoop@Master:/usr/local/sqoop1/conf$ cp sqoop-env-template.sh sqoop-env.sh

Set each variable in sqoop-env.sh to its actual path:

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=

#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
#export HIVE_HOME=

In most cases, though, /etc/profile already defines the relevant variables:

export JAVA_HOME=/usr/lib/jvm/
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:/usr/local/hive/lib
export PATH=${JAVA_HOME}/bin:$PATH

#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END

export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin:/usr/local/hbase/bin
export JAVA_LIBRARY_PATH=/usr/local/hadoop/lib/native
export SCALA_HOME=/usr/lib/scala
export PATH=$PATH:$SCALA_HOME/bin

#sqoop
export SQOOP_HOME=/usr/local/sqoop1
export PATH=$SQOOP_HOME/bin:$PATH

#HBASE
export HBASE_HOME=/usr/local/hbase
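If you prefer to point Sqoop at these locations explicitly rather than relying on /etc/profile, a minimal sqoop-env.sh sketch using the install paths from this article would be:

# Paths assume the directory layout used throughout this article
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
export HBASE_HOME=/usr/local/hbase
export HIVE_HOME=/usr/local/hive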
4 Start testing
Copy the MySQL driver into Sqoop's lib directory:

hadoop@Master:~/mysql-connector-java-5.0.8$ sudo cp mysql-connector-java-5.0.8-bin.jar /usr/local/sqoop1/lib/

The examples below use MySQL with the following details:

IP: 192.168.1.178
Username: chu888chu888
Password: skybar
Database: hivetestdb
Table: cdsgus

[root@hadoop01 ~]# sqoop help
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

List all databases:

hadoop@Master:/usr/local/sqoop1/lib$ sqoop list-databases --connect jdbc:mysql://192.168.1.178 --username chu888chu888 --password skybar

List all tables in a database:

hadoop@Master:/usr/local/sqoop1/lib$ sqoop list-tables --connect jdbc:mysql://192.168.1.178/hivetestdb --username chu888chu888 --password skybar

Import a MySQL table into HDFS:

hadoop@Master:/$ hdfs dfs -mkdir /user/sqoop
hadoop@Master:/$ hdfs dfs -chown sqoop:hadoop /user/sqoop
hadoop@Master:~$ sqoop import --connect jdbc:mysql://192.168.1.178/hivetestdb --username chu888chu888 --password skybar --table cdsgus --m 2 --target-dir /user/sqoop/cdsgus
Warning: /usr/local/sqoop1/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop1/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop1/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/03/03 01:28:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/03/03 01:28:13 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/03/03 01:28:13 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/03/03 01:28:13 INFO tool.CodeGenTool: Beginning code generation
16/03/03 01:28:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM cdsgus AS t LIMIT 1
16/03/03 01:28:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM cdsgus AS t LIMIT 1
16/03/03 01:28:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
Note: /tmp/sqoop-hadoop/compile/7b9cf86a577c124c063ff5dc2242b3fb/cdsgus.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/03/03 01:28:17 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/7b9cf86a577c124c063ff5dc2242b3fb/cdsgus.jar
16/03/03 01:28:17 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/03/03 01:28:17 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/03/03 01:28:17 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/03/03 01:28:17 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/03/03 01:28:17 INFO mapreduce.ImportJobBase: Beginning import of cdsgus
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/03/03 01:28:17 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/03/03 01:28:18 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/03/03 01:28:18 INFO client.RMProxy: Connecting to ResourceManager at Master/192.168.1.80:8032
16/03/03 01:28:23 INFO db.DBInputFormat: Using read commited transaction isolation
16/03/03 01:28:23 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(id), MAX(id) FROM cdsgus
16/03/03 01:28:23 INFO mapreduce.JobSubmitter: number of splits:2
16/03/03 01:28:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1456939431067_0002
16/03/03 01:28:24 INFO impl.YarnClientImpl: Submitted application application_1456939431067_0002
16/03/03 01:28:24 INFO mapreduce.Job: The url to track the job: http://Master:8088/proxy/application_1456939431067_0002/
16/03/03 01:28:24 INFO mapreduce.Job: Running job: job_1456939431067_0002
16/03/03 01:28:38 INFO mapreduce.Job: Job job_1456939431067_0002 running in uber mode : false
16/03/03 01:28:38 INFO mapreduce.Job:  map 0% reduce 0%
16/03/03 01:32:11 INFO mapreduce.Job:  map 50% reduce 0%
16/03/03 01:32:13 INFO mapreduce.Job:  map 100% reduce 0%
16/03/03 01:32:14 INFO mapreduce.Job: Job job_1456939431067_0002 completed successfully
16/03/03 01:32:14 INFO mapreduce.Job: Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=247110
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=218
                HDFS: Number of bytes written=3130492684
                HDFS: Number of read operations=8
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=4
        Job Counters
                Killed map tasks=1
                Launched map tasks=3
                Other local map tasks=3
                Total time spent by all maps in occupied slots (ms)=422821
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=422821
                Total vcore-seconds taken by all map tasks=422821
                Total megabyte-seconds taken by all map tasks=432968704
        Map-Reduce Framework
                Map input records=20050144
                Map output records=20050144
                Input split bytes=218
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=19391
                CPU time spent (ms)=206680
                Physical memory (bytes) snapshot=313565184
                Virtual memory (bytes) snapshot=3757293568
                Total committed heap usage (bytes)=65142784
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=3130492684
16/03/03 01:32:14 INFO mapreduce.ImportJobBase: Transferred 2.9155 GB in 235.5966 seconds (12.672 MB/sec)
16/03/03 01:32:14 INFO mapreduce.ImportJobBase: Retrieved 20050144 records.
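As the log warns, putting --password on the command line is insecure. Sqoop also accepts -P (interactive prompt) or --password-file; a sketch of the latter, where the password file path is an arbitrary choice:

hadoop@Master:~$ echo -n "skybar" > /home/hadoop/.mysql.pwd    # -n: no trailing newline
hadoop@Master:~$ chmod 400 /home/hadoop/.mysql.pwd
hadoop@Master:~$ sqoop import --connect jdbc:mysql://192.168.1.178/hivetestdb \
    --username chu888chu888 --password-file file:///home/hadoop/.mysql.pwd \
    --table cdsgus --m 2 --target-dir /user/sqoop/cdsgus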
Import an entire MySQL table into Hive. First create a target database in Hive:

hive> create database test_sqoop;
OK
Time taken: 0.81 seconds
hive> show databases;
OK
chu888chu888
default
test_sqoop
Time taken: 0.247 seconds, Fetched: 3 row(s)
hive>

Then use Sqoop to create the Hive table and import the data:

sqoop import --connect jdbc:mysql://192.168.1.178/hivetestdb --username chu888chu888 --password skybar --table cdsgus --hive-import --hive-table test_sqoop.cdsgus

Export data from Hive back into MySQL. First truncate the MySQL table so the exported rows are visible afterwards:

mysql> use hivetestdb;
Database changed
mysql> show tables;
+----------------------+
| Tables_in_test_sqoop |
+----------------------+
| cdsgus               |
+----------------------+
1 row in set (0.00 sec)

mysql> truncate cdsgus;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from cdsgus;
Empty set (0.00 sec)

Then run the export (Hive's default field delimiter is \0001, i.e. Ctrl-A, hence the --input-fields-terminated-by flag):

sqoop export --connect jdbc:mysql://192.168.1.178/hivetestdb --username chu888chu888 --password skybar --table cdsgus --export-dir /user/hive/warehouse/test_sqoop.db/cdsgus/ --input-fields-terminated-by '\0001'

Incremental import, appending only rows whose id exceeds the last recorded value (a saved-job sketch that automates this bookkeeping follows the HBase example below):

sqoop import --connect jdbc:mysql://192.168.1.178/hivetestdb --username chu888chu888 --password skybar --table cdsgus --hive-import --hive-table test_sqoop.cdsgus --check-column id --incremental append --last-value 2

Import into HBase:

sqoop import --connect jdbc:mysql://192.168.1.178/hive_hadoop --username chu888chu888 --password skybar --table TBLS --hbase-table TBLS --hbase-create-table --hbase-row-key TBL_ID --column-family SD_ID
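Running the incremental import by hand means tracking --last-value yourself between runs. A Sqoop saved job records and updates that value automatically; a minimal sketch (the job name cdsgus_inc is arbitrary, and by default each --exec prompts for the database password):

sqoop job --create cdsgus_inc -- import --connect jdbc:mysql://192.168.1.178/hivetestdb \
    --username chu888chu888 --table cdsgus \
    --hive-import --hive-table test_sqoop.cdsgus \
    --check-column id --incremental append --last-value 2

sqoop job --exec cdsgus_inc    # run the import; the stored last-value advances on success
sqoop job --show cdsgus_inc    # inspect the saved definition and current last-value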
Troubleshooting
hadoop@Master:/$ sqoop import --connect jdbc:mysql://192.168.1.178/hivetestdb --username chu888chu888 --password skybar --table cdsgus
Warning: /usr/local/sqoop1/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop1/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop1/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/03/03 00:32:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/03/03 00:32:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/03/03 00:32:16 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/03/03 00:32:16 INFO tool.CodeGenTool: Beginning code generation
16/03/03 00:32:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM cdsgus AS t LIMIT 1
16/03/03 00:32:16 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@654f0d9c is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@654f0d9c is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:914)
        at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2181)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1542)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1723)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:3277)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:3206)
        at com.mysql.jdbc.Statement.executeQuery(Statement.java:1232)
        at com.mysql.jdbc.Connection.getMaxBytesPerChar(Connection.java:3673)
        at com.mysql.jdbc.Field.getMaxBytesPerCharacter(Field.java:482)
        at com.mysql.jdbc.ResultSetMetaData.getPrecision(ResultSetMetaData.java:443)
        at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:286)
        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1833)
        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
16/03/03 00:32:17 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1651)
        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

If you hit this error, replace the MySQL connector jar (mysql-connector-java-*.jar) under $SQOOP_HOME/lib with a newer version. ISSUE: https://issues.apache.org/jira/browse/SQOOP-1400

Another possibility is that a second MySQL driver jar sits in hadoop/share/hadoop/common, and that copy may be very old:

hadoop@Master:/usr/local/hadoop/share/hadoop/common$ ls
hadoop-common-2.6.0.jar  hadoop-common-2.6.0-tests.jar  hadoop-nfs-2.6.0.jar  jdiff  lib  mysql-connector-java-5.0.8-bin.jar  sources  templates
hadoop@Master:/usr/local/hadoop/share/hadoop/common$ sudo rm -rf mysql-connector-java-5.0.8-bin.jar
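Before deleting anything, it can help to locate every copy of the driver that might end up on the classpath (a quick sketch; adjust the search roots to your own layout):

hadoop@Master:~$ find /usr/local/hadoop /usr/local/sqoop1 /usr/local/hive -name 'mysql-connector-java*.jar' 2>/dev/null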
FAQ
Every Sqoop invocation prints warnings like the ones below for optional components that are not installed:

hadoop@hadoopmaster:/usr/local/hive/lib$ sqoop list-databases --connect jdbc:mysql://hadoopslave2 --username hive --password hive
Warning: /usr/local/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.

How do we get rid of them? Edit the configure-sqoop script:

hadoop@hadoopmaster:/usr/local/sqoop/bin$ nano configure-sqoop
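Inside configure-sqoop the warnings come from directory-existence checks. Commenting out the blocks for components you do not use silences them; a sketch of what the edited checks look like (based on the Sqoop 1.4.x script, so compare against your copy before editing):

## HCatalog and Accumulo are not installed on this cluster, so skip the checks
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
#  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi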