Define single characters in CSV using python

Question

I have data in tab delimited format that looks like this:

0/0:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00

I'm only interested in the first 3 characters of each record (i.e. 0/0 and 0/1). I decided that the best way to do this was to use re.match together with genfromtxt from numpy. This is as far as I have got:

import re
from numpy import genfromtxt

csvfile = 'home/python/batch1.hg19.table'
data = genfromtxt(csvfile, delimiter="\t", dtype=None)
for i in data[1]:
    m = re.match('[0-9]/[0-9]', i)
    if m:
        print m.group(0),
    else:
        print "NA",

This works for the first line of data, but I can't work out how to extend it to every line of the input file.

Should I make it a function and apply it to each line separately, or is there a more Pythonic way to do this?

+3

python numpy csv

Stedy Dec 03 '10 at 0:26

4 answers

You don't need NumPy for this:

with open('home/python/batch1.hg19.table') as f:
    for line in f:
        for cell in line.split('\t'):
            print(cell[:3])

+4

The Maniac Dec 03 '10 at 0:37

For example:

for line in open('yourfile').read().split('\n'):
    for token in line.split('\t'):
        print(token[:3] if token else 'NA')

+1

nate c Dec 03 '10 at 0:35

This can be done in plain Python, without regular expressions:

with open("home/python/batch1.hg19.table") as f:
    for line in f:
        # Split the tab-delimited line into its records
        for column in line.split("\t"):
            print(column[:3])

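The same column-by-column slicing can also be written with the standard library's csv module, which takes care of the tab splitting and the line endings. This is only a minimal sketch, assuming Python 3; the in-memory `sample` (the data line from the question) stands in for the real file, and the name `prefixes` is made up for illustration.

```python
import csv
import io

# In-memory stand-in for the tab-delimited file (the data line from the question).
sample = io.StringIO(
    "0/0:23:-1.03,-7.94,-83.75:69.15\t"
    "0/1:34:-1.01,-11.24,-127.51:99.00\t"
    "0/0:74:-1.02,-23.28,-301.81:99.00\n"
)

# csv.reader yields one list of cells per line, split on tabs.
reader = csv.reader(sample, delimiter="\t")

# Keep only the first three characters of every cell.
prefixes = [[cell[:3] for cell in row] for row in reader]
print(prefixes)
```

Note that plain slicing does not check what the three characters are, so a malformed record is passed through rather than reported as NA.
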
0

JonMR Dec 03 '10 at 0:51

unutbu · Accepted Answer · 2010-12-03T00:38:59+0000

Numpy is at its best with arrays of numbers. Here every record is a string, so numpy adds little.

Without numpy:

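import re

csvfile = 'home/python/batch1.hg19.table'  # path from the question
result=[]
with open(csvfile,'r') as f:
    for line in f:
        row=[]
        for text in line.split('\t'):
            # Look for a genotype such as 0/0 or 0/1 in the record
            match=re.search('([0-9]/[0-9])',text)
            if match:
                row.append(match.group(1))
            else:
                row.append("NA")
        result.append(row)
print(result)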

# [['0/0', '0/1', '0/0'], ['NA', '0/1', '0/0']]

on this data:

0/0:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00
---:23:-1.03,-7.94,-83.75:69.15    0/1:34:-1.01,-11.24,-127.51:99.00    0/0:74:-1.02,-23.28,-301.81:99.00
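To address the "should I make it a function" part of the question, the loop can be wrapped in a small function and applied to every line. What follows is a sketch rather than the definitive way to do it. It assumes Python 3, uses the hypothetical names `genotype` and `extract_genotypes`, and anchors the pattern at the start of each record with `re.match` (as in the question) instead of `re.search`. An in-memory sample replaces the real file.

```python
import io
import re

# A genotype call such as 0/0 or 0/1 at the very start of a record.
GENOTYPE = re.compile(r"[0-9]/[0-9]")


def genotype(cell):
    """Return the leading genotype of one record, or "NA" if there is none."""
    m = GENOTYPE.match(cell)
    return m.group(0) if m else "NA"


def extract_genotypes(lines):
    """Return one list of genotypes per tab-delimited input line."""
    return [
        [genotype(cell) for cell in line.rstrip("\n").split("\t")]
        for line in lines
    ]


# In-memory stand-in for the file; the two rows mirror the sample data above.
sample = io.StringIO(
    "0/0:23:-1.03,-7.94,-83.75:69.15\t0/1:34:-1.01,-11.24,-127.51:99.00\t"
    "0/0:74:-1.02,-23.28,-301.81:99.00\n"
    "---:23:-1.03,-7.94,-83.75:69.15\t0/1:34:-1.01,-11.24,-127.51:99.00\t"
    "0/0:74:-1.02,-23.28,-301.81:99.00\n"
)

result = extract_genotypes(sample)
print(result)
```

With a real file, the same function can be called as `extract_genotypes(f)` inside a `with open(csvfile) as f:` block, since a file object iterates over its lines.
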
