Is the CSV format incorrect?

I create a CSV file with Export-Csv in PowerShell and then load it into a Perl script, but Perl cannot import the file.

I compared the CSV file with a working version (which was exported by the same Perl script rather than by PowerShell) and found no difference. The columns are the same and both files use a semicolon as the separator. If I open the PowerShell file in Excel, everything ends up in the first cell of each row (so I have to run Text to Columns). The working file is split into separate cells right away.

To add to the confusion: when I open the file in Notepad and copy and paste the contents into a new file, the import works!

So what am I missing? Are there any hidden properties that I cannot detect with Notepad? Do I need to change the encoding type?

Please help :)

+4
5 answers

To inspect CSV files more closely, use Notepad++. It shows the encoding of the file in the status bar. Also enable hidden characters (View > Show Symbol > Show All Characters). This shows whether the lines end in line feeds only or in carriage return + line feed, and it reveals tabs, spaces, and so on. You can also change the encoding of the file in the "Encoding" menu. This can help you spot the differences. Notepad does not display this information.
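The same check can be done from the Perl side by looking at the raw bytes. The following is a minimal sketch, not a complete encoding detector: describe_bytes and describe_file are hypothetical helper names, and only the three most common byte order marks are recognized.

```perl
use strict;
use warnings;

# Hypothetical helper: report the BOM and the line-ending style of raw bytes.
sub describe_bytes {
    my ($bytes) = @_;
    my $bom = $bytes =~ /\A\xEF\xBB\xBF/ ? 'UTF-8'
            : $bytes =~ /\A\xFF\xFE/     ? 'UTF-16LE'
            : $bytes =~ /\A\xFE\xFF/     ? 'UTF-16BE'
            :                              'none';
    my $crlf = () = $bytes =~ /\r\n/g;        # Windows line endings
    my $lf   = () = $bytes =~ /(?<!\r)\n/g;   # bare LF (Unix line endings)
    return { bom => $bom, crlf => $crlf, lf => $lf };
}

# Hypothetical helper: slurp a file without any decoding and describe it.
sub describe_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "$path: $!";
    local $/;                                 # slurp mode
    my $bytes = <$fh>;
    close $fh;
    return describe_bytes($bytes);
}

# Usage: perl describe.pl test.csv
if (@ARGV) {
    my $info = describe_file($ARGV[0]);
    printf "BOM: %s, CRLF: %d, LF-only: %d\n", @{$info}{qw(bom crlf lf)};
}
```

Running it on both the working file and the PowerShell export should make any difference in BOM or line endings visible, even though the two files look identical in Notepad.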

Update: here's how to convert a text file from Windows to Unix format in code:

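    # Read the whole file and normalize CRLF (or a lone CR) to LF
    $allText = [IO.File]::ReadAllText("C:\test.csv") -replace "`r`n?", "`n"
    # ASCIIEncoding writes no BOM; non-ASCII characters are replaced with "?"
    $encoding = New-Object System.Text.ASCIIEncoding
    [IO.File]::WriteAllText("C:\test2.csv", $allText, $encoding)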

Or you can use Notepad++ (Edit > EOL Conversion > Unix Format).

+6

This may be an encoding problem when you use Export-Csv.

ASCII is used by default, which should be fine, but try setting -Encoding UTF8 in the Export-Csv command.
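One thing to watch for: in Windows PowerShell, -Encoding UTF8 writes a byte order mark at the start of the file. If the Perl script decodes the file without removing it, the mark stays attached to the first column name. A minimal sketch of stripping it on the Perl side, where strip_bom and read_header are hypothetical helper names:

```perl
use strict;
use warnings;

# Hypothetical helper: remove a leading U+FEFF from an already decoded string.
sub strip_bom {
    my ($text) = @_;
    $text =~ s/\A\x{FEFF}//;
    return $text;
}

# Hypothetical helper: return the header line of a UTF-8 file
# without the BOM and without the trailing line ending.
sub read_header {
    my ($path) = @_;
    open my $fh, '<:encoding(UTF-8)', $path or die "$path: $!";
    my $header = <$fh>;
    close $fh;
    return undef unless defined $header;
    $header =~ s/\r?\n\z//;    # accept both CRLF and LF
    return strip_bom($header);
}
```

This only covers the UTF-8 case. A file written as UTF-16 would need a different PerlIO layer when opening it.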

+2

From the CPAN documentation for Text::CSV:

    use Text::CSV;

    my @rows;
    my $csv = Text::CSV->new ( { binary => 1 } )  # should set binary attribute.
        or die "Cannot use CSV: ".Text::CSV->error_diag();

    open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
    while ( my $row = $csv->getline( $fh ) ) {
        $row->[2] =~ m/pattern/ or next;  # 3rd field should match
        push @rows, $row;
    }
    $csv->eof or $csv->error_diag();
    close $fh;

Never try to parse CSV yourself. At first glance it seems easy, but it has many deep pitfalls.
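Because the file in the question is separated by semicolons rather than commas, the parser has to be told so. A minimal sketch, assuming Text::CSV is installed; parse_semicolon_line is a hypothetical helper name, and the sample line is made up for illustration:

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: split one semicolon-separated line into its fields.
sub parse_semicolon_line {
    my ($line) = @_;
    my $csv = Text::CSV->new({ binary => 1, sep_char => ';' })
        or die "Cannot use CSV: " . Text::CSV->error_diag();
    $csv->parse($line) or die "Parse failed: " . $csv->error_diag();
    return [ $csv->fields() ];
}

# A quoted field may itself contain the separator.
my $fields = parse_semicolon_line('"Smith; John";42;Berlin');
print join(' | ', @$fields), "\n";    # prints: Smith; John | 42 | Berlin
```

The quoted first field is exactly the kind of case that a plain split on ";" gets wrong.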

+1

Excel tends to assume that files saved with a .csv extension really are comma-separated. However, it looks like you are using semicolons. You can try switching to commas or, if that is not an option, changing the extension to .txt. With commas, Excel should recognize the columns automatically; with a .txt extension, Excel walks you through the import wizard when you open the file.

0

Given what was discovered in the other answers, I think your best bet is:

  • Convert to a CSV string (which uses Unix-style line endings, not Windows ones)
  • Send this to a file, making sure that the encoding is not ASCII.

    $str = $object | ConvertTo-Csv -NoTypeInformation | ForEach-Object { $_ -replace "`"","" }

The ForEach-Object step is a hack to remove the extra quotes added by ConvertTo-Csv. If your data may contain double quotes, you will need to find an alternative.

    $str | Out-File -FilePath "path\to\newcsv" -Encoding UTF8
0

Source: https://habr.com/ru/post/1392270/