How to load grouped data using SSIS

I have a complex data source with flat files. Data is grouped as follows:

    Country  City
    US       New York
             Washington
             Baltimore
    Canada   Toronto
             Vancouver

But I want this to be the format when it is loaded into the database:

    Country  City
    US       New York
    US       Washington
    US       Baltimore
    Canada   Toronto
    Canada   Vancouver

Has anyone run into this problem before? Any ideas on how to handle it?
The only idea I have so far is to use a cursor, but that is too slow.
Thanks!

2 answers

cha's answer will work, but here is another option if you need to do this in SSIS without temporary / intermediate tables:

You can run the data flow through a Script transformation that uses a DataFlow-level variable. As each row enters the script, check the value of its Country column.

If it has a non-empty value, store that value in the variable and pass the row on down the data flow.

If Country is empty, overwrite it with the value of the variable, which will be the last non-empty Country value you received.
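To make the row-by-row logic concrete, here is a minimal sketch in plain Python rather than SSIS script code. The function name `fill_down_country` and the sample rows are only illustrative; in a real package the same check would live in the Script transformation's per-row method.

```python
def fill_down_country(rows):
    """Yield (country, city) pairs, replacing blank countries
    with the last non-blank country seen so far."""
    last_country = None  # plays the role of the variable described above
    for country, city in rows:
        if country and country.strip():
            # Non-empty value: remember it for the rows that follow.
            last_country = country.strip()
        # Empty value: fall back to the remembered country.
        yield last_country, city


# Sample rows taken from the question (blank string = empty Country).
grouped = [
    ("US", "New York"),
    ("", "Washington"),
    ("", "Baltimore"),
    ("Canada", "Toronto"),
    ("", "Vancouver"),
]

flat = list(fill_down_country(grouped))
for country, city in flat:
    print(country, city)
```

This relies on the rows arriving in file order, which is also what the Script transformation approach assumes.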

EDIT: I looked at your error message and learned something new about Script Components (the Data Flow tool, not Script Tasks, the Control Flow tool):

The ReadWriteVariables collection is only available in the PostExecute method, to maximize performance and minimize the risk of locking conflicts. Therefore you cannot directly increment the value of a package variable as you process each row of data. Increment the value of a local variable instead, and set the value of the package variable to the value of the local variable in the PostExecute method after all data has been processed. You can also use the VariableDispenser property to work around this limitation, as described later in this topic. However, writing directly to a package variable as each row is processed will negatively impact performance and increase the risk of locking conflicts.

This comes from this MSDN article, which also has more information about the VariableDispenser workaround if you want to go that route. Apparently I misled you above when I said you can set the value of a package variable in the script. You must use a variable that is local to the script and then update the package variable in the PostExecute method. I can't tell from the article whether this means you also cannot read the variable in the script; if so, the VariableDispenser would be the only option. Or, I suppose, you could create another variable that the script has read-only access to, and set its value to an expression so that it always holds the value of the read-write variable. That might work.
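The pattern from the quoted passage (keep state in a script-local field while rows are processed, write to the package variable only once at the end) might look roughly like this. It is a plain-Python stand-in, not the SSIS API: the class `FillDownComponent`, its method names, and the `LastCountry` variable are all hypothetical names chosen for illustration.

```python
class FillDownComponent:
    """Plain-Python stand-in for a Script Component (not the SSIS API)."""

    def __init__(self, package_variables):
        # Hypothetical stand-in for the package's read-write variables.
        self.package_variables = package_variables
        # Script-local state: cheap and safe to update on every row.
        self.last_country = ""

    def process_input_row(self, row):
        country = (row.get("Country") or "").strip()
        if country:
            self.last_country = country
        # Every row leaves with the most recent non-empty country.
        row["Country"] = self.last_country
        return row

    def post_execute(self):
        # The package variable is written once, after all rows are done.
        self.package_variables["LastCountry"] = self.last_country


variables = {"LastCountry": ""}
component = FillDownComponent(variables)
rows = [
    {"Country": "US", "City": "New York"},
    {"Country": "", "City": "Washington"},
    {"Country": "Canada", "City": "Toronto"},
]
output = [component.process_input_row(dict(r)) for r in rows]
component.post_execute()
```

Note that the per-row logic never touches the package variable at all, so for this particular problem a local field alone may be enough.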


Yes it is possible. First you need to load the data into a table with an IDENTITY column:

    -- drop table #t
    CREATE TABLE #t (
        id      INTEGER IDENTITY PRIMARY KEY,
        Country VARCHAR(20),
        City    VARCHAR(20)
    )

    INSERT INTO #t (Country, City)
    SELECT a.Country, a.City
    FROM OPENROWSET(
        BULK 'c:\import.txt',
        FORMATFILE = 'c:\format.fmt',
        FIRSTROW = 2
    ) AS a;

    select * from #t

The result will be:

    id          Country              City
    ----------- -------------------- --------------------
    1           US                   New York
    2                                Washington
    3                                Baltimore
    4           Canada               Toronto
    5                                Vancouver

And now with a little recursive CTE magic, you can fill in the missing details:

    ;WITH a AS (
        SELECT Country, City, ID
        FROM #t
        WHERE ID = 1
        UNION ALL
        SELECT COALESCE(NULLIF(LTrim(#t.Country), ''), a.Country), #t.City, #t.ID
        FROM a
        INNER JOIN #t ON a.ID + 1 = #t.ID
    )
    SELECT * FROM a
    OPTION (MAXRECURSION 0)

Result:

    Country              City                 ID
    -------------------- -------------------- -----------
    US                   New York             1
    US                   Washington           2
    US                   Baltimore            3
    Canada               Toronto              4
    Canada               Vancouver            5

Update:

As Tab Alleman suggests below, the same result can be achieved without a recursive query:

    SELECT ID
         , COALESCE(NULLIF(LTrim(a.Country), ''),
                    -- most recent earlier row with a non-empty Country
                    (SELECT TOP 1 Country
                     FROM #t t
                     WHERE t.ID < a.ID AND LTrim(t.Country) <> ''
                     ORDER BY t.ID DESC)) AS Country
         , City
    FROM #t a

By the way, here is the format file for your input (if you want to try the scripts, save the input as c:\import.txt and the format file below as c:\format.fmt):

    9.0
    2
    1   SQLCHAR   0   11    ""       1   Country   SQL_Latin1_General_CP1_CI_AS
    2   SQLCHAR   0   100   "\r\n"   2   City      SQL_Latin1_General_CP1_CI_AS

Source: https://habr.com/ru/post/1246989/

