Firstly, Excel (2003, at home) is not so smart. If your column 1000 * chi2 contains spaces, for example. 1000 * chi2, excel will guess incorrectly.
Trivial case: if your data was originally separated by tabs (not spaces), and multiple tabs were used to display empty columns, then, at least in TCL, it's easy to split each row into the contents of the tab, and I think the trivial is also in Python
But I guess your problem is that they used only whitespace characters. The biggest clue I see to solve this problem was to insert text into a notepad and select a fixed-size font. Everything builds neatly, and you can use the number of characters in each line as a measure of length.
So, IF you can rely on this function of your input, then you can use the sieve approach to determine where the column breaks. When you break lines in the first pass, pay attention to the "positions" along the line that are occupied by non-white space, excluding the position from your list if it is EVER occupied by non-white space. As you walk, you will quickly reach a number of positions that are NEVER occupied by white space. These are therefore your column delimiters. In your example, your "sieve" will be in positions 10-16, 23-24.32, 42-47, which have never been practiced without spaces (provided that I can count). Thus, the addition of this set is your set of column positions in which your data should lie. Thus, the foreach line, each block of unsafe space will fit into one of your columns from the set of positions (i.e., the set of additions) indicated above. I have never been encoded in Python, so the binding is a TCL script that will identify using the sieve approach where the column in the text is split and emit a new text file, and these spaces are replaced by one tab - i.e. 10-16 are replaced with one tab, 23-24 with another, etc. The resulting file is divided into a tab, i.e. The trivial case. I admit, I just tried this on YOUR little data copied to an ex.txt text file; the output goes to ex_.txt. I suspect this could also be a problem if the headers contain spaces.
Hope this helps!
set fh [open ex.txt] set contents [read $fh];#ok for small-to-medium files. close $fh #first pass set occupied {} set lines [split $contents \n];#split contents at line breaks. foreach line $lines { set chrs [split $line {}];#split each line into chars. set pos 0 foreach chr $chrs { if {$chr ne " "} { lappend occupied $pos } incr pos } } #drop out with long list of occupied "positions": sort to create #our sieve. set datacols [lsort -unique -integer $occupied] puts "occupied: $datacols" #identify column boundaries. set colset {} set start [lindex $datacols 0];#first occupied pos might be > 0?? foreach index $datacols { if {$start < $index} { set end $index;incr end -1 lappend colset [list $start $end] puts "col break starts at $start, ends at $end";#some instro! set start $index } incr start } #Now convert input file to trivial case output file, replacing #sieved space chars with tab characters. set tesloc [lreverse $colset];#reverse the column list! set fh [open ex_.txt w] foreach line $lines { foreach ele $tesloc { set line [string replace $line [lindex $ele 0] [lindex $ele 1] "\t" ] } puts "newline is $line" puts $fh $line } close $fh