Make sure fgetcsv() reads the entire line

I use PHP to import data from a CSV file using fgetcsv(), which returns an array for each row. Initially, I had the character limit set to 1024, for example:

    while ($data = fgetcsv($fp, 1024)) {
        // do stuff with the row
    }

However, a CSV with 200+ columns exceeded the 1024 limit on many rows. This caused reading to stop in the middle of a line; the next call to fgetcsv() then picked up where the previous one left off, and so on, until the end of the line was reached.
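For illustration, a minimal sketch of that behavior (the file contents and limit here are made up, much smaller than the real 1024):

```php
<?php
// Sketch: with a limit shorter than the physical line, one CSV line
// gets split across consecutive fgetcsv() calls.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "aaaaaaaaaa,bbbbbbbbbb\n");

$fp = fopen($tmp, 'r');
$first  = fgetcsv($fp, 8); // stops partway through the line
$second = fgetcsv($fp, 8); // resumes where the previous call stopped
fclose($fp);
unlink($tmp);

// $first does not contain the whole 21-character line
var_dump(implode(',', $first) === "aaaaaaaaaa,bbbbbbbbbb"); // bool(false)
```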

Since then, I have increased the limit to 4096, which should take care of most cases, but I would like to add a check that makes sure the entire line was read after each fetch. How can I do that?

I was thinking of checking the end of the last element of the array for line-termination characters (\n, \r, \r\n), but wouldn't those be consumed by fgetcsv()?
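One workable heuristic is a sketch like the following (the helper name is mine, not a library function): since a length-limited read can return at most roughly limit - 1 bytes, a row whose reconstructed raw length reaches that boundary has probably been cut short.

```php
<?php
// Hypothetical helper: flag rows that may have been truncated by the
// length limit passed to fgetcsv().
function row_may_be_truncated(array $row, int $limit): bool {
    // Approximate the raw line length: field contents plus delimiters.
    // (Quoting would only make the raw line longer, so this errs on
    // the side of flagging truncation, not missing it.)
    $approx = strlen(implode(',', $row));
    return $approx >= $limit - 1;
}

var_dump(row_may_be_truncated(['a', 'b', 'c'], 1024));         // bool(false)
var_dump(row_may_be_truncated([str_repeat('x', 1023)], 1024)); // bool(true)
```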

4 answers

Thanks for the suggestions, but these solutions do not really solve the problem of accounting for the longest line while still keeping a limit in place. I was able to do it by using the wc -L UNIX command via shell_exec() to determine the longest line in the file before starting the line fetching. Code below:

    // open the CSV file to read lines
    $fp = fopen($sListFullPath, 'r');

    // use wc to figure out the longest line in the file
    $longestArray = explode(" ", shell_exec('wc -L ' . $sListFullPath));
    $longest_line = (int)$longestArray[0] + 4; // add a little padding for EOL chars

    // check against a user-defined maximum length
    if ($longest_line > $line_length_max) {
        // alert user that the length of at least one line in the CSV is too long
    }

    // read in the data
    while ($data = fgetcsv($fp, $longest_line)) {
        // do stuff with the row
    }

This approach ensures that every line is read in full, and still provides a safety net for really long lines without having to step through the entire file line by line in PHP.


Just omit the length parameter. It is optional as of PHP 5.

    while ($data = fgetcsv($fp)) {
        // do stuff with the row
    }
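A quick sanity check of that behavior (a sketch using a throwaway temp file):

```php
<?php
// With no length argument, fgetcsv() returns the whole row regardless
// of how long the physical line is.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, str_repeat('x', 5000) . ",second\n");

$fp = fopen($tmp, 'r');
$row = fgetcsv($fp);
fclose($fp);
unlink($tmp);

var_dump(count($row));     // int(2)
var_dump(strlen($row[0])); // int(5000) -- nothing was cut off
```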

Just do not specify a limit, and fgetcsv() will read as much as necessary to capture the full line. If you do specify a limit, then it is entirely up to you to check the file stream and make sure you are not cutting something off in the middle.

However, note that not specifying a limit can be risky if you do not control the generation of that .csv in the first place. It would be easy to swamp your server with a malicious CSV that has many terabytes of data on a single line.


I would be careful with your final solution. I was able to upload a file named /.;ls -a;.csv and get the command executed. Make sure you validate the file path if you use this approach. It might also be a good idea to provide a $default_length fallback in case wc fails for some reason.

    // use wc to find max line length
    // falls back to a hardcoded default if wc fails
    // this is relatively safe from command injection
    // since the file path is a tmp file
    $wc = explode(" ", shell_exec('wc -L ' . $validated_file_path));
    $longest_line = (int)$wc[0];
    $length = ($longest_line) ? $longest_line + 4 : $default_length;
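Beyond validating the path, one extra hardening step (a sketch; the path below is illustrative) is to wrap it in escapeshellarg(), which single-quotes the argument so a crafted filename like the one above cannot inject extra commands:

```php
<?php
// Sketch: quote the path before handing it to the shell, so a crafted
// filename such as "/.;ls -a;.csv" is passed as a single literal argument.
$path = '/.;ls -a;.csv';
$cmd  = 'wc -L ' . escapeshellarg($path);
echo $cmd . "\n"; // wc -L '/.;ls -a;.csv'
```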

Source: https://habr.com/ru/post/916607/