U-SQL output in Azure Data Lake

Is it possible to automatically split a table into several files based on column values if I do not know how many distinct key values are in the table? Can I put the key value in the file name?

2 answers

This is our top ask (and it has been asked before, fooobar.com/questions/1275874 / ... too :). We are currently working on it and hope to have it available by the summer.

Until then, you need to write a script generator. I use U-SQL to generate the script, but you can do it with PowerShell, T4, etc.

Here is an example:

Suppose you want to write one file per value of the name column in the following table / rowset @x:

 name | value1 | value2
 -----+--------+-------
 A    | 10     | 20
 A    | 11     | 21
 B    | 10     | 30
 B    | 100    | 200

You would write a script that generates the actual output script, as follows:

 @x = SELECT * FROM (VALUES ("A", 10, 20), ("A", 11, 21), ("B", 10, 30), ("B", 100, 200)) AS T(name, value1, value2);

 // Generate the script to do partitioned output based on name column:
 @stmts =
     SELECT "OUTPUT (SELECT value1, value2 FROM @x WHERE name == \""+name+"\") TO \"/output/"+name+".csv\" USING Outputters.Csv();" AS output
     FROM (SELECT DISTINCT name FROM @x) AS x;

 OUTPUT @stmts TO "/output/genscript.usql" USING Outputters.Text(delimiter:' ', quoting:false);

Then you take genscript.usql, prepend the @x calculation, and submit it to get the data split into two files.
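
For the sample rowset, the script that finally gets submitted might look roughly like the sketch below: the @x definition followed by the generated statements. This assumes the generator emitted one OUTPUT statement per distinct name; the order of the generated lines is not guaranteed, and the comments are added here for illustration only.

```usql
// Same rowset definition as in the generator script (added by hand).
@x = SELECT * FROM (VALUES ("A", 10, 20), ("A", 11, 21), ("B", 10, 30), ("B", 100, 200)) AS T(name, value1, value2);

// Contents of genscript.usql for this sample data: one statement per distinct name.
OUTPUT (SELECT value1, value2 FROM @x WHERE name == "A") TO "/output/A.csv" USING Outputters.Csv();
OUTPUT (SELECT value1, value2 FROM @x WHERE name == "B") TO "/output/B.csv" USING Outputters.Csv();
```

Because the key value is spliced into both the WHERE clause and the file path, values containing quotes or characters that are not valid in a path would need escaping in the generator first.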


Great question! I will be interested to see how Mr. Rhys answers.

Sorry, but this is only half the answer.

My first thought is to partition an ADL table using your key value. But then I'm not sure how you would deal with the separate outputs if the potential WHERE clause is not deterministic. Maybe a CROSS JOIN in every result and ... I'll pass!

It would be nice to have a WHILE loop with some dynamic code!

Check out this post on the MS forums about dynamic input datasets, just as an FYI.

https://social.msdn.microsoft.com/Forums/en-US/aa475035-2d57-49b8-bdff-9cccc9c8b48f/usql-loading-a-dynamic-set-of-files?forum=AzureDataLake


Source: https://habr.com/ru/post/1275873/

