Problems encountered when creating a Hive external table

乡里伢崽



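1: I have just started looking into Hive and ran into a confusing problem. I want to load the file a.txt from Hadoop (HDFS) into the table testHiveDriverTable under /user/hive/warehouse/. I only want to import the data; I do not want the file itself to be moved.
2: However, every time I run the program, the file is automatically moved to /user/hive/warehouse/. The tutorials say that with an external table the file's location on HDFS does not change, but I have tried that and the problem is still there. Could someone take a look and tell me how to handle this?
3: The code is as follows: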

Create-table statement: ResultSet res = stmt.executeQuery("create external table " + tableName + " (key string, value string)");
Load statement: sql = "load data inpath '"+ROOT_PATH+"/home2/hadoop/a.txt' into table " +tableName;

Output:

Running: show tables 'testHiveDriverTable'
testhivedrivertable
Running: describe testHiveDriverTable
key string
value string
Running: load data inpath 'hdfs:////home2/hadoop/a.txt' into table testHiveDriverTable
Running: select * from testHiveDriverTable
1 a null
2 b null
3 c null
Running: select count(1) from testHiveDriverTable

Before running:
http://dl2.iteye.com/upload/attachment/0086/0198/4e698ec4-b3c1-3221-87d5-dcc948b4cf50.png

After running:
http://dl2.iteye.com/upload/attachment/0086/0200/705dd992-a158-30dd-aea3-b59355877945.png

The problem is solved. My own create-table statement was at fault; I had misunderstood how external tables work.

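1: My original assumption was that creating an external table is just a plain create-table statement, and that a load is still needed afterwards to bring the Hadoop data into the table (purely as a link to the data, not a move or a copy), so the data in the original directory would stay unchanged after the load. But every attempt moved the file, and that is exactly what confused me. My understanding was wrong.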

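2: In fact, Hive is very explicit about load: it exists to copy or move data. With load data inpath on an HDFS path, as in my code, the file is moved into the table's directory, so it is bound to change location.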

3: The correct approach is:

CREATE EXTERNAL TABLE testHiveDriverTable(hostname string,logdate string, type string,class string,demo array<string>) row format delimited fields terminated by '|' COLLECTION items terminated BY '@' stored as textfile location 'hdfs://IP:port/home2/hadoop/'
The key part is 'hdfs://IP:port/home2/hadoop/': this is what tells Hive where the files are located in Hadoop.

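4: Once the table has been created, if you do not want to move the data there is no need to load at all; a select statement can query it directly.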
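
Steps 3 and 4 can be sketched over JDBC, in the same style as the code in the question. This is only an illustration under stated assumptions: the class name ExternalTableSketch and its helper methods are hypothetical, the DDL is copied from step 3, 'hdfs://IP:port/home2/hadoop/' is the placeholder from the post rather than a real address, and the JDBC URL passed on the command line (something like jdbc:hive2://host:10000/default for HiveServer2) depends on your setup and needs the Hive JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ExternalTableSketch {

    // Builds the DDL from step 3. The LOCATION clause points Hive at the
    // existing HDFS directory, so no file has to be moved into the warehouse.
    public static String createExternalTableSql(String tableName, String location) {
        return "CREATE EXTERNAL TABLE " + tableName
                + " (hostname string, logdate string, type string, class string, demo array<string>)"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'"
                + " COLLECTION ITEMS TERMINATED BY '@'"
                + " STORED AS TEXTFILE"
                + " LOCATION '" + location + "'";
    }

    // Step 4: after the table exists, query it directly. No LOAD is issued.
    public static String selectAllSql(String tableName) {
        return "SELECT * FROM " + tableName;
    }

    // Runs both statements on an open connection and prints the first two columns.
    public static void createAndQuery(Connection conn, String tableName, String location)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            // execute() is used for the DDL because it does not return rows.
            stmt.execute(createExternalTableSql(tableName, location));
            try (ResultSet res = stmt.executeQuery(selectAllSql(tableName))) {
                while (res.next()) {
                    System.out.println(res.getString(1) + "\t" + res.getString(2));
                }
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        String table = "testHiveDriverTable";
        String location = "hdfs://IP:port/home2/hadoop/"; // placeholder from the post

        // Always show the statement; only talk to Hive if a JDBC URL was given.
        System.out.println(createExternalTableSql(table, location));
        if (args.length > 0) {
            try (Connection conn = DriverManager.getConnection(args[0])) {
                createAndQuery(conn, table, location);
            }
        }
    }
}
```

Run without arguments, the program only prints the create-table statement; with a JDBC URL as the first argument it also creates the table and selects from it. Note that there is no load statement anywhere, which is the whole point: the files stay where they are under the LOCATION directory.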


2014-07-01 16:46