HDFS path changing when trying to update files in HDFS

Posted on https://www.devze.com, 2023-03-31 22:57. Source: Internet
I am new to Hadoop and HDFS, so maybe it is something I am doing wrong when I copy from local (Ubuntu 10.04) to HDFS on a single node on localhost. The initial copy works fine, but when I modify my local input folder and try to copy back to HDFS, the HDFS path changes.

~$ $HADOOP_HOME/bin/hadoop dfs -copyFromLocal /tmp/anagram /user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram
Found 1 items
-rw-r--r--   1 hduser supergroup    4067675 2011-08-29 05:44 /user/hduser/anagram/SINGLE.TXT

After adding another file (COMMON.TXT) to the same local directory, I run the same copy from the local directory to HDFS, but this time the files end up in a different location than the first time (/user/hduser/anagram/anagram instead of /user/hduser/anagram).

~$ $HADOOP_HOME/bin/hadoop dfs -copyFromLocal /tmp/anagram /user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram
Found 2 items
-rw-r--r--   1 hduser supergroup    4067675 2011-08-29 05:44 /user/hduser/anagram/SINGLE.TXT
drwxr-xr-x   - hduser supergroup          0 2011-08-29 05:48 /user/hduser/anagram/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram/anagram
Found 2 items
-rw-r--r--   1 hduser supergroup     805232 2011-08-29 05:48 /user/hduser/anagram/anagram/COMMON.TXT
-rw-r--r--   1 hduser supergroup    4067675 2011-08-29 05:48 /user/hduser/anagram/anagram/SINGLE.TXT

Has anyone run into this? The only fix I have found is to remove the existing directory and then copy everything over again:

~$ $HADOOP_HOME/bin/hadoop dfs -rmr /user/hduser/anagram/anagram
Deleted hdfs://localhost:54310/user/hduser/anagram/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -rmr /user/hduser/anagram
Deleted hdfs://localhost:54310/user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -copyFromLocal /tmp/anagram /user/hduser/anagram
~$ $HADOOP_HOME/bin/hadoop dfs -ls /user/hduser/anagram
Found 2 items
-rw-r--r--   1 hduser supergroup     805232 2011-08-29 05:55 /user/hduser/anagram/COMMON.TXT
-rw-r--r--   1 hduser supergroup    4067675 2011-08-29 05:55 /user/hduser/anagram/SINGLE.TXT

Does anyone know how to do this without having to delete the directory every time?


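This looks like a side effect of how the destination is resolved; see the static method FileUtil.checkDest(String srcName, FileSystem dstFS, Path dst, boolean overwrite) in FileUtil.java. When the destination already exists as a directory, the source is copied into it under its own name, which is why the second run produces /user/hduser/anagram/anagram. To update the directory, copy the files rather than the directory itself:

hadoop dfs -copyFromLocal /tmp/anagram/*.TXT /user/hduser/anagram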
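
The nesting seen in the question can be reproduced without a cluster. The sketch below is only a local analogy: it uses plain cp -r rather than hadoop dfs, on the assumption that both follow the same rule for an existing destination directory. The scratch paths under $work are hypothetical and stand in for /tmp/anagram and /user/hduser/anagram.

```shell
#!/bin/sh
# Local analogy only (plain cp, not HDFS): when the destination already
# exists as a directory, the source lands inside it under its own name.
set -e
work=$(mktemp -d)                      # hypothetical scratch area
mkdir -p "$work/local/anagram"
echo "single" > "$work/local/anagram/SINGLE.TXT"

# First copy: the destination does not exist yet, so it becomes the copy.
cp -r "$work/local/anagram" "$work/dest"

# Second copy: the destination now exists, so the source is nested in it.
echo "common" > "$work/local/anagram/COMMON.TXT"
cp -r "$work/local/anagram" "$work/dest"

# List what ended up where.
find "$work/dest" -type f
```

The listing should show SINGLE.TXT directly under dest, and both SINGLE.TXT and COMMON.TXT under dest/anagram, which mirrors the /user/hduser/anagram/anagram layout above. Copying the files instead of the directory avoids the extra level.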
