How does the HDFS client know the block size during a write?

The HDFS client lives outside the HDFS cluster. When an HDFS client writes a file, it breaks the file into blocks and then writes those blocks to the datanodes.

The question is: how does the HDFS client know the block size? The block size is configured on the NameNode, and the HDFS client has no idea about it, so how can it split the file into blocks?

+4
3 answers

HDFS is designed so that the block size for a particular file is part of its metadata.

Let's just check what that means.

The client can tell the NameNode that it will write data to HDFS with a particular block size. The client has its own hdfs-site.xml that can contain this value, and it can also specify it on a per-request basis using the -Ddfs.blocksize parameter.

If this parameter is not defined in the client configuration, it defaults to org.apache.hadoop.hdfs.DFSConfigKeys.DFS_BLOCK_SIZE_DEFAULT, which is 128 MB.
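To make that resolution order concrete, here is a minimal sketch of how the value can be set and read on the client side, equivalent in effect to the -Ddfs.blocksize command-line override (the 64 MB override is just an illustration, and a Hadoop client classpath is assumed):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

public class BlockSizeResolution {
    public static void main(String[] args) {
        // Loads core-site.xml / hdfs-site.xml from the client's classpath
        Configuration conf = new Configuration();

        // Per-request override, same effect as -Ddfs.blocksize=67108864
        conf.setLong(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, 64L * 1024 * 1024);

        // If neither hdfs-site.xml nor an override sets it, the client falls
        // back to DFS_BLOCK_SIZE_DEFAULT (128 MB); getLongBytes also accepts
        // suffixed values such as "128m" in the XML files
        long blockSize = conf.getLongBytes(DFSConfigKeys.DFS_BLOCK_SIZE_KEY,
                DFSConfigKeys.DFS_BLOCK_SIZE_DEFAULT);
        System.out.println("Effective client-side block size: " + blockSize);
    }
}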

The NameNode can throw an error back to the client if the requested block size is smaller than dfs.namenode.fs-limits.min-block-size (1 MB by default).

There is nothing magical in this: the NameNode knows nothing about the data and lets the client decide the optimal splitting, as well as the replication factor for the blocks of a file.
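To illustrate that last point, the long-form create() overload lets the client pick both the block size and the replication factor per file; a block size below the NameNode's minimum would fail the create call with an IOException. This is only a sketch: the cluster URI and path are hypothetical.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PerFileBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

        FSDataOutputStream out = fs.create(
                new Path("/tmp/demo.txt"), // hypothetical destination
                true,                      // overwrite if it exists
                4096,                      // io buffer size
                (short) 3,                 // replication factor, chosen by the client
                256L * 1024 * 1024);       // 256 MB block size, for this file only
        out.writeUTF("block size is per-file metadata");
        out.close();
    }
}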

+3

(Adding more details, from the 4th edition of the Hadoop Definitive Guide)

Anatomy of a File Write, from the Hadoop Definitive Guide:

" , create() DistributedFileSystem ( 1 3-4). DistributedFileSystem RPC namenode , ( 2). namenode , , , . , ; , IOException. DistributedFileSystem FSDataOutputStream . , FSDataOutputStream DFSOutputStream, datanodes namenode. ( 3), DFSOutputStream , , . "

Consider the code sample below:

Copying a local file to HDFS (example from the Hadoop Definitive Guide)

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class FileCopyWithProgress {
    public static void main(String[] args) throws Exception {
        String localSrc = args[0];
        String dst = args[1];

        // Buffered stream over the local source file
        InputStream in = new BufferedInputStream(new FileInputStream(localSrc));

        // Resolve the FileSystem implementation from the destination URI
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(dst), conf);

        // create() asks the namenode to record the new file; a dot is
        // printed each time data is flushed to the pipeline
        OutputStream out = fs.create(new Path(dst), new Progressable() {
            public void progress() {
                System.out.print(".");
            }
        });

        IOUtils.copyBytes(in, out, 4096, true);
    }
}

The create() method on FileSystem calls getDefaultBlockSize(), which in turn resolves dfs.blocksize from the client-side configuration, falling back to the hadoop default described above; the chosen value is sent to the namenode as part of the create request and recorded in the file's metadata.
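As a small companion sketch (the destination path comes from the command line, as in the example above): the client can ask the FileSystem which block size create() would use for a path, and after the write it can read back the per-file value stored in the NameNode's metadata.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path dst = new Path(args[0]); // e.g. hdfs://namenode:8020/tmp/demo.txt
        FileSystem fs = FileSystem.get(dst.toUri(), conf);

        // What create() would use for this path, resolved from the client configuration
        System.out.println("default block size: " + fs.getDefaultBlockSize(dst));

        // What was actually recorded for an existing file, from the NameNode metadata
        System.out.println("stored block size:  " + fs.getFileStatus(dst).getBlockSize());
    }
}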

Hope this helps.

0

In simple words: you pass a URI to the client, and from that URI the Client knows which filesystem and cluster to address. Combined with the configuration deployed on the Client machine, which supplies settings such as the block size, it then communicates with the NameNode and the DataNodes.

P.S: Client = EdgeNode
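A tiny sketch of the URI point (the host and port are hypothetical): the scheme of the URI selects which FileSystem implementation the client talks through, so the same code reaches HDFS or the local disk.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class SchemeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem hdfs  = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
        FileSystem local = FileSystem.get(URI.create("file:///"), conf);
        System.out.println(hdfs.getClass().getSimpleName());  // DistributedFileSystem
        System.out.println(local.getClass().getSimpleName()); // LocalFileSystem
    }
}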

0

Source: https://habr.com/ru/post/1658726/

