30 MB limit when uploading to Azure Data Lake using DataLakeStoreFileSystemManagementClient

I get an error when using

_adlsFileSystemClient.FileSystem.Create(_adlsAccountName, destFilePath, stream, overwrite) 

to upload files to Data Lake. The error occurs only for files larger than 30 MB; it works fine with smaller files.

Error:

at Microsoft.Azure.Management.DataLake.Store.FileSystemOperations.d__16.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Management.DataLake.Store.FileSystemOperationsExtensions.d__23.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Management.DataLake.Store.FileSystemOperationsExtensions.Create(IFileSystemOperations operations, String accountName, String directFilePath, Stream streamContents, Nullable`1 overwrite, Nullable`1 syncFlag)
at AzureDataFunctions.DataLakeController.CreateFileInDataLake(String destFilePath, Stream stream, Boolean overwrite) in F:\GitHub\ZutoDW\...

Has anyone else come across this, or observed similar behaviour? For now I can get around it by splitting the file into 30 MB pieces and uploading them.

However, this is not practical in the long run, because the source file is 380 MB and could potentially be much larger. I do not want to end up with 10-15 split files in my Data Lake; I would like to upload it as a single file.

I am able to upload the same file to the Data Lake through the portal interface.

+5
2 answers

Please try using DataLakeStoreUploader to upload a file or directory to Data Lake; for more demo code, please refer to the GitHub sample. I tested the demo and it works correctly for me. We can get the Microsoft.Azure.Management.DataLake.Store and Microsoft.Azure.Management.DataLake.StoreUploader SDKs from NuGet. Below are my detailed steps:

  • Create a C# console application
  • Add the following code

      using Microsoft.Azure.Management.DataLake.Store;
      using Microsoft.Azure.Management.DataLake.StoreUploader;
      using Microsoft.Rest.Azure.Authentication;

      var applicationId = "your application Id";
      var secretKey = "secret Key";
      var tenantId = "your tenantId";
      var adlsAccountName = "adls account name";

      var creds = ApplicationTokenProvider.LoginSilentAsync(tenantId, applicationId, secretKey).Result;
      var adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(creds);

      var inputFilePath = @"c:\tom\ForDemoCode.zip";
      var targetStreamPath = "/mytempdir/ForDemoCode.zip"; // path inside the Data Lake store, not the full local path
      var parameters = new UploadParameters(inputFilePath, targetStreamPath, adlsAccountName,
          isOverwrite: true, maxSegmentLength: 268435456 * 2); // the default maxSegmentLength is 256 MB; we can set it ourselves

      var frontend = new DataLakeStoreFrontEndAdapter(adlsAccountName, adlsFileSystemClient);
      var uploader = new DataLakeStoreUploader(parameters, frontend);
      uploader.Execute();
  • Debug the application.


  • Validate from the Azure portal


For SDK version details, please refer to the packages.config file:

 <?xml version="1.0" encoding="utf-8"?>
 <packages>
   <package id="Microsoft.Azure.Management.DataLake.Store" version="1.0.2-preview" targetFramework="net452" />
   <package id="Microsoft.Azure.Management.DataLake.StoreUploader" version="1.0.0-preview" targetFramework="net452" />
   <package id="Microsoft.IdentityModel.Clients.ActiveDirectory" version="3.13.8" targetFramework="net452" />
   <package id="Microsoft.Rest.ClientRuntime" version="2.3.2" targetFramework="net452" />
   <package id="Microsoft.Rest.ClientRuntime.Azure" version="3.3.2" targetFramework="net452" />
   <package id="Microsoft.Rest.ClientRuntime.Azure.Authentication" version="2.2.0-preview" targetFramework="net452" />
   <package id="Newtonsoft.Json" version="9.0.2-beta1" targetFramework="net452" />
 </packages>
+2

This was answered here.

There is currently a per-request size limit of 30,000,000 bytes. You can work around it by creating the file with an initial chunk and then appending the remaining data with streams each smaller than the limit.
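For illustration, here is a minimal sketch of that create-then-append workaround, assuming the same Microsoft.Azure.Management.DataLake.Store client as in the question; the chunk size and the UploadLargeFile helper name are my own choices, not from the original post, and the FileSystem.Append call is assumed to be available in the same SDK:

    using System.IO;
    using Microsoft.Azure.Management.DataLake.Store;

    // Sketch of the create-then-append workaround (helper name and chunk size are illustrative).
    static void UploadLargeFile(DataLakeStoreFileSystemManagementClient client,
                                string accountName, string localPath, string destFilePath)
    {
        const int chunkSize = 25 * 1024 * 1024; // comfortably below the 30,000,000-byte limit
        var buffer = new byte[chunkSize];

        using (var file = File.OpenRead(localPath))
        {
            // First chunk: create (and overwrite) the target file in the store.
            int read = file.Read(buffer, 0, buffer.Length);
            client.FileSystem.Create(accountName, destFilePath,
                new MemoryStream(buffer, 0, read), overwrite: true);

            // Remaining chunks: append to the same file, each request staying under the limit.
            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                client.FileSystem.Append(accountName, destFilePath,
                    new MemoryStream(buffer, 0, read));
            }
        }
    }

The DataLakeStoreUploader approach in the first answer sidesteps the same limit by uploading in segments, so it is usually the simpler option.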

+4

Source: https://habr.com/ru/post/1262292/

