------ Solution --------------------------------------------
The official documentation seems to say this is supported. In a test I was able to import BLOB data through Sqoop into HDFS, but importing it into HBase has not been successful. Looking forward to answers.
------ Solution --------------------------------------------
Sorry, I didn't make this clear in my first reply.
As far as I know, Sqoop does not support importing BLOB data directly into HBase.
Let me explain my approach and give the OP an example.
- Create a table in Oracle with a BLOB column:
- temp_01 (idx int, bstr blob, aaa varchar2(10))
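The DDL for that might look roughly like this (a minimal sketch reconstructed from the description above):
CREATE TABLE temp_01 (
    idx  INT,
    bstr BLOB,
    aaa  VARCHAR2(10)
);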
- Insert test data (remember to commit, since Sqoop will read the data from a separate session):
insert into temp_01 (idx, bstr, aaa) values (1, to_blob('aaa'), 'aaa');
insert into temp_01 (idx, bstr, aaa) values (2, to_blob('bbb'), 'bbb');
insert into temp_01 (idx, bstr, aaa) values (3, to_blob('ccc'), 'ccc');
insert into temp_01 (idx, bstr, aaa) values (4, to_blob('ddd'), 'ddd');
commit;
- Query
SELECT t.* FROM temp_01 t
/*****************
idx    bstr      aaa
1      (BLOB)    aaa
2      (BLOB)    bbb
3      (BLOB)    ccc
4      (BLOB)    ddd
******************/
Run the Sqoop import into the HDFS directory /tmp/temp01_tmp01:
sqoop import --append --connect jdbc:oracle:thin:@10.126.24.31:1521:FRDB --username XXX --password XXX --table temp_01 --columns "idx,aaa,bstr" --target-dir /tmp/temp01_tmp01 -m 1 --inline-lob-limit 16777216
Here is the output
13/05/23 12:21:22 INFO manager.SqlManager: Using default fetchSize of 1000
13/05/23 12:21:22 INFO tool.CodeGenTool: Beginning code generation
13/05/23 12:21:23 INFO manager.OracleManager: Time zone has been set to GMT
13/05/23 12:21:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM temp_01 t WHERE 1=0
13/05/23 12:21:23 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/hadoop
Note: /tmp/sqoop-hadoop/compile/f2d5ca24a56102a5c4495ab1a0609d56/temp_01.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/05/23 12:21:24 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/f2d5ca24a56102a5c4495ab1a0609d56/temp_01.jar
13/05/23 12:21:24 INFO manager.OracleManager: Time zone has been set to GMT
13/05/23 12:21:24 INFO mapreduce.ImportJobBase: Beginning import of temp_01
13/05/23 12:21:25 INFO mapred.JobClient: Running job: job_201305151834_0074
13/05/23 12:21:26 INFO mapred.JobClient: map 0% reduce 0%
13/05/23 12:21:39 INFO mapred.JobClient: map 100% reduce 0%
13/05/23 12:21:44 INFO mapred.JobClient: Job complete: job_201305151834_0074
......................................................................
13/05/23 12:21:44 INFO mapreduce.ImportJobBase: Transferred 48 bytes in 20.0827 seconds (2.3901 bytes/sec)
13/05/23 12:21:44 INFO mapreduce.ImportJobBase: Retrieved 4 records.
13/05/23 12:21:44 INFO util.AppendUtils: Creating missing output directory - temp01_tmp01
View the Sqoop results:
hadoop dfs -cat /tmp/temp01_tmp01/part-m-00000
Results:
1,aaa,0aaa
2,bbb,0bbb
3,ccc,0ccc
4,ddd,0ddd
The last comma-separated column is the BLOB data, hex-encoded (to_blob('aaa') becomes the bytes 0x0AAA, written out as 0aaa).
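If you need the original bytes back later, the hex string can be decoded. For example in Hive (a minimal sketch; the external table name temp01_ext is made up for illustration):
-- hypothetical external table over the Sqoop output directory
CREATE EXTERNAL TABLE temp01_ext (idx INT, aaa STRING, bstr STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/tmp/temp01_tmp01';
-- unhex() converts the hex text back to raw binary
SELECT idx, aaa, unhex(bstr) FROM temp01_ext;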
From there it is up to the OP: you can map the data into Hive, and from Hive you can import it into HBase.
I have never put BLOB data (MySQL or Oracle) directly into HBase; I always import it into HDFS in this manner,
and then create the HBase mapping over the directory holding the BLOB data. A sketch of such a mapping follows below.
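One way to do that mapping is with the Hive HBase storage handler, roughly like this (a minimal sketch; the HBase table name hbase_temp01 and column family cf are made up, and it reuses the hypothetical temp01_ext table from above):
-- Hive table backed by an HBase table
CREATE TABLE hbase_temp01 (idx INT, aaa STRING, bstr STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:aaa,cf:bstr")
TBLPROPERTIES ("hbase.table.name" = "hbase_temp01");
-- copy the HDFS data into HBase through the mapping
INSERT OVERWRITE TABLE hbase_temp01 SELECT idx, aaa, bstr FROM temp01_ext;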
------ For reference only --------------------------------------------
How can this code be compiled and executed for testing?