Python CSV to JSON

Here is my code, very simple stuff ...

    import csv
    import json

    csvfile = open('file.csv', 'r')
    jsonfile = open('file.json', 'w')
    fieldnames = ("FirstName", "LastName", "IDNumber", "Message")
    reader = csv.DictReader(csvfile, fieldnames)
    out = json.dumps([row for row in reader])
    jsonfile.write(out)

I declare some field names, the reader uses the csv module to read the file, and the field names are used to dump the file in JSON format. Here's the problem ...

Each entry in the CSV file is on a separate line, and I want the JSON output to be the same. The problem is that it dumps everything onto one giant long line.

I tried using something like for line in csvfile: and then running my code with reader = csv.DictReader( line, fieldnames) , which loops over each line, but it still puts the whole file on one line, then the whole file again on the next line ... and so on until it runs out of lines.

Any suggestions for fixing this?

Edit: To clarify, I currently have (all entries on line 1):

 [{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}] 

What I'm looking for: (2 entries on 2 lines)

    {"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
    {"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

Not every field on a separate line, but each record on its own line.

Sample input:

    "John","Doe","001","Message1"
    "George","Washington","002","Message2"
+47
json python csv
Oct 31 '13 at 3:15
9 answers

The problem with your desired output is that it is not a valid JSON document; it is a stream of JSON documents!

That's fine if that's what you need, but it means that for each document you want in your output, you have to call json.dumps separately.

Since the newline you want between your documents is not contained in those documents, you are on the hook for supplying it yourself. So we just need to pull the loop out of the json.dumps call and write a newline after each document.

    import csv
    import json

    csvfile = open('file.csv', 'r')
    jsonfile = open('file.json', 'w')
    fieldnames = ("FirstName", "LastName", "IDNumber", "Message")
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        json.dump(row, jsonfile)
        jsonfile.write('\n')
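To see the difference, here is a self-contained sketch of the same loop run against in-memory streams instead of files (the sample rows are made up to match the question's field names):

```python
import csv
import io
import json

# Made-up CSV content matching the question's four field names
csv_text = 'John,Doe,123,None\nGeorge,Washington,001,Something\n'
fieldnames = ("FirstName", "LastName", "IDNumber", "Message")

reader = csv.DictReader(io.StringIO(csv_text), fieldnames)
out = io.StringIO()
for row in reader:
    json.dump(row, out)   # one json.dump call per record ...
    out.write('\n')       # ... followed by the newline separator

print(out.getvalue())
```

Each line of the result is an independent JSON document, which is exactly the "one record per line" shape asked for.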
+70
Oct 31 '13 at 12:49

You can try this:

    import csvmapper

    # how the object should look
    mapper = csvmapper.DictMapper([
        [
            {'name': 'FirstName'},
            {'name': 'LastName'},
            {'name': 'IDNumber', 'type': 'int'},
            {'name': 'Messages'}
        ]
    ])

    # parser instance
    parser = csvmapper.CSVParser('sample.csv', mapper)
    # conversion service
    converter = csvmapper.JSONConverter(parser)
    print converter.doConvert(pretty=True)

Edit:

Simplified approach

    import csvmapper

    fields = ('FirstName', 'LastName', 'IDNumber', 'Messages')
    parser = csvmapper.CSVParser('sample.csv', csvmapper.FieldMapper(fields))
    converter = csvmapper.JSONConverter(parser)
    print converter.doConvert(pretty=True)
+7
08 Feb '15 at 15:20

I took @SingleNegationElimination's answer and simplified it into a three-liner that can be used in a pipeline:

    import csv
    import json
    import sys

    for row in csv.DictReader(sys.stdin):
        json.dump(row, sys.stdout)
        sys.stdout.write('\n')
+5
Nov 25 '15 at 10:25

Add the indent parameter to json.dumps :

    data = {'this': ['has', 'some', 'things'], 'in': {'it': 'with', 'some': 'more'}}
    print(json.dumps(data, indent=4))

Also note that you can simply use json.dump with the open jsonfile :

 json.dump(data, jsonfile) 
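Combining the two points, a minimal sketch (with io.StringIO standing in for the open file, and the same made-up data):

```python
import io
import json

data = {'this': ['has', 'some', 'things'], 'in': {'it': 'with', 'some': 'more'}}

buf = io.StringIO()             # stands in for an open file object
json.dump(data, buf, indent=4)  # indent works with json.dump too
print(buf.getvalue())
```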
+2
Oct 31 '13 at 3:17

How about using Pandas to read the csv file into a DataFrame ( pd.read_csv ), then manipulating the columns if you want (dropping them or updating values), and finally converting the DataFrame back to JSON ( pd.DataFrame.to_json )?

Note: I have not tested how efficient this will be, but it is certainly one of the easiest ways to manipulate and convert a large csv to json.
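A minimal sketch of what that might look like (the DataFrame is built inline here to stand in for pd.read_csv('file.csv') , and the dropped column is just an example manipulation):

```python
import pandas as pd

# Stand-in for: df = pd.read_csv('file.csv')
df = pd.DataFrame([
    {'FirstName': 'John', 'LastName': 'Doe', 'IDNumber': '123', 'Message': 'None'},
    {'FirstName': 'George', 'LastName': 'Washington', 'IDNumber': '001', 'Message': 'Something'},
])

df = df.drop(columns=['Message'])  # example manipulation: drop a column

# orient='records', lines=True gives one JSON record per line,
# matching the output shape the question asks for
out = df.to_json(orient='records', lines=True)
print(out)
```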

+1
Jul 07 '16 at 17:10

I see this is old, but I needed the code from SingleNegationElimination; however, I had a problem with data containing non-utf-8 characters. These appeared in fields I didn't care much about, so I chose to ignore them. However, that took some effort. I am new to python, so with some trial and error I got it to work. The code is a copy of SingleNegationElimination's with additional utf-8 handling. I tried to do this with https://docs.python.org/2.7/library/csv.html but eventually gave up. The code below worked.

    import csv, json

    csvfile = open('file.csv', 'r')
    jsonfile = open('file.json', 'w')
    fieldnames = ("Scope", "Comment", "OOS Code", "In RMF", "Code", "Status", "Name",
                  "Sub Code", "CAT", "LOB", "Description", "Owner", "Manager", "Platform Owner")
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        try:
            print('+' + row['Code'])
            for key in row:
                row[key] = row[key].decode('utf-8', 'ignore').encode('utf-8')
            json.dump(row, jsonfile)
            jsonfile.write('\n')
        except:
            print('-' + row['Code'])
            raise
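On Python 3 the per-field decode('utf-8', 'ignore') trick is unnecessary: you can open the file with errors='ignore' so undecodable bytes are dropped at read time. A hedged sketch of that idea, using an in-memory byte stream with a deliberately invalid byte so it is self-contained (the two-field layout is made up for the example):

```python
import csv
import io
import json

# Simulated file contents with an invalid UTF-8 byte (\xff) in one field
raw = b'ABC,Alice\nX\xffYZ,Bob\n'

# errors='ignore' silently drops bytes that are not valid UTF-8;
# for a real file: open('file.csv', 'r', encoding='utf-8', errors='ignore')
csvfile = io.TextIOWrapper(io.BytesIO(raw), encoding='utf-8', errors='ignore')
jsonfile = io.StringIO()

reader = csv.DictReader(csvfile, ('Code', 'Name'))
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

print(jsonfile.getvalue())
```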
+1
Aug 18 '16 at 15:50

As a minor improvement to @MONTYHS's answer, iterating through a tuple of field names:

    import csv
    import json

    csvfilename = 'filename.csv'
    jsonfilename = csvfilename.split('.')[0] + '.json'
    csvfile = open(csvfilename, 'r')
    jsonfile = open(jsonfilename, 'w')
    reader = csv.DictReader(csvfile)
    fieldnames = ('FirstName', 'LastName', 'IDNumber', 'Message')

    output = []
    for each in reader:
        row = {}
        for field in fieldnames:
            row[field] = each[field]
        output.append(row)

    json.dump(output, jsonfile, indent=2, sort_keys=True)
0
Mar 05 '14 at 19:43

You can use a Pandas DataFrame for this, as in the following example:

    import pandas as pd

    csv_file = pd.DataFrame(pd.read_csv("path/to/file.csv", sep=",", header=0, index_col=False))
    csv_file.to_json("/path/to/new/file.json", orient="records", date_format="epoch",
                     double_precision=10, force_ascii=True, date_unit="ms", default_handler=None)
0
Feb 02 '17 at 12:13
    import csv
    import json

    csvfile = csv.DictReader(open('filename.csv', 'r'))
    output = []
    for each in csvfile:
        row = {}
        row['FirstName'] = each['FirstName']
        row['LastName'] = each['LastName']
        row['IDNumber'] = each['IDNumber']
        row['Message'] = each['Message']
        output.append(row)

    json.dump(output, open('filename.json', 'w'), indent=4, sort_keys=False)
-1
Oct 31 '13 at 12:03


