Does changing dfs.blocksize affect existing data?

My Hadoop version is 2.5.2. I am modifying the dfs.blocksize property in the hdfs-site.xml file on the master node. I have the following questions:

1) Will this change affect existing data in HDFS?
2) Do I need to propagate this change to all nodes in the Hadoop cluster or only to the NameNode?

0
4 answers

You must make the change in the hdfs-site.xml of all slaves as well; dfs.blocksize must be consistent across all DataNodes.
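If you want to double-check which value a given node actually resolves from its configuration, something like this should work (assuming the hdfs command is on your PATH):

 # prints the block size this node's client configuration resolves to, in bytes
 hdfs getconf -confKey dfs.blocksize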

+1

1) Will this change affect existing data in HDFS?

No, it will not. Existing files keep the block size they were written with. To apply the new block size you have to rewrite the data, for example with hadoop fs -cp or distcp. The new copy will use the new block size, and you can then delete the old data.
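A rough sketch of that rewrite step (the /data/logs paths are just placeholders; adjust to your layout):

 # rewrite the data so it picks up the new default block size
 hadoop distcp /data/logs /data/logs_newblocks
 # (for small datasets, "hadoop fs -cp /data/logs /data/logs_newblocks" works too)
 # once you have verified the copy, drop the old data and move the new copy into place
 hadoop fs -rm -r /data/logs
 hadoop fs -mv /data/logs_newblocks /data/logs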

2) Do I need to propagate this change to all nodes in the Hadoop cluster or only to the NameNode?

I believe that in this case you only need to change it on the NameNode. However, relying on that is a very bad idea. You should keep all of your configuration files in sync, for a number of good reasons. Once you take your Hadoop deployment more seriously, you should probably use something like Puppet or Chef to manage your configs.

Also note that whenever you change the configuration, you need to restart the NameNode and DataNodes so that they pick up the new settings.
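On a plain Apache tarball install, that restart would look roughly like this (assuming the standard sbin scripts and that HADOOP_HOME is set; your paths may differ):

 # on the master: restart the NameNode
 $HADOOP_HOME/sbin/hadoop-daemon.sh stop namenode
 $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
 # on each slave: restart the DataNode
 $HADOOP_HOME/sbin/hadoop-daemon.sh stop datanode
 $HADOOP_HOME/sbin/hadoop-daemon.sh start datanode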

Interesting note: you can set the block size of individual files as they are written, overriding the default block size. For example, hadoop fs -D dfs.blocksize=134217728 -put ab
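You can confirm which block size a file actually ended up with, for example:

 # %o prints the file's block size in bytes
 hadoop fs -stat %o ab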

+3

Changing the block size in hdfs-site.xml only affects new data.

+1

Which distribution are you using? From your question, it looks like you are using the Apache distribution. The easiest way is to write a shell script that first removes hdfs-site.xml on the slaves, e.g.

 ssh username@domain.com 'rm /some/hadoop/conf/hdfs-site.xml'
 ssh username@domain2.com 'rm /some/hadoop/conf/hdfs-site.xml'
 ssh username@domain3.com 'rm /some/hadoop/conf/hdfs-site.xml'

and then copy hdfs-site.xml from the master to all slaves:

 scp /hadoop/conf/hdfs-site.xml username@domain.com:/hadoop/conf/
 scp /hadoop/conf/hdfs-site.xml username@domain2.com:/hadoop/conf/
 scp /hadoop/conf/hdfs-site.xml username@domain3.com:/hadoop/conf/
+1

Source: https://habr.com/ru/post/985061/

