Convert query results to DataFrame in python

Question

Convert query results to DataFrame in python

I am trying to manipulate the result of a query using psycopg2, so I need the result in a pandas DataFrame. But when I use the following code and print, only the column names are printed, not the rows. I also tried 'pd.DataFrame.from_records', but that did not work either.

    import psycopg2
    import pandas as pd
    import numpy as np

    conn_string = "Connect_Info"
    conn = psycopg2.connect(conn_string)
    cursor = conn.cursor()
    cursor.execute(query)
    rows = pd.DataFrame(cursor.fetchall(), columns=['page_num', 'Frequency'])
    for row in rows:
        print(row)
    conn.commit()
    conn.close()

The result of cursor.fetchall() is

 (1L, 90990L) (3L, 6532L) (2L, 5614L) (4L, 4016L) (5L, 2098L) (6L, 1651L) (7L, 1158L) (8L, 854L) (9L, 658L) (10L, 494L) (11L, 345L) (12L, 301L) (13L, 221L) (15L, 152L) (14L, 138L) (16L, 113L) (17L, 93L) (18L, 73L) (20L, 62L) (19L, 55L) (22L, 44L) (21L, 35L) (23L, 29L) (25L, 24L) (27L, 19L) (26L, 18L)

+2

python pandas dataframe psycopg2

maggs Jul 17 '15 at 10:12

3 answers

This may not be the answer to your question, but you should use read_sql_query for this instead of doing a fetchall and wrapping the result in a DataFrame yourself. It would look like this:

    conn = psycopg2.connect(...)
    rows = pd.read_sql_query(query, conn)

instead of all your code above.
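A minimal runnable sketch of the read_sql_query approach. Since a PostgreSQL server is not available in a snippet, it assumes an in-memory SQLite database as a stand-in for the psycopg2 connection. The table name `pages` is hypothetical, and the rows are the first three tuples from the question's fetchall() output.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the PostgreSQL
# connection. The table name "pages" is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (page_num INTEGER, frequency INTEGER)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [(1, 90990), (3, 6532), (2, 5614)],
)

# read_sql_query runs the query and builds the DataFrame in one step,
# taking the column names from the query result.
query = "SELECT page_num, frequency AS Frequency FROM pages"
df = pd.read_sql_query(query, conn)
conn.close()

print(df)
```

With a psycopg2 connection the call has the same shape. Recent pandas versions may warn that raw DBAPI connections other than sqlite3 are not officially tested and suggest a SQLAlchemy connectable instead.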

As for your actual question, see http://pandas.pydata.org/pandas-docs/stable/basics.html#iteration for an explanation and the different options.
The gist is that iterating over a DataFrame iterates over its column names. To iterate over the rows you can use other methods, such as .iterrows() and .itertuples(). Keep in mind, though, that in most cases iterating over the rows manually is not needed.
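To make the difference concrete, here is a small sketch. It builds the DataFrame from the first three tuples of the question's output, so no database is needed. The `sum()` at the end is only one illustrative example of a column-wise operation that avoids a loop.

```python
import pandas as pd

# First three tuples from the question's fetchall() output.
df = pd.DataFrame(
    [(1, 90990), (3, 6532), (2, 5614)],
    columns=["page_num", "Frequency"],
)

# Iterating over the DataFrame itself yields the column labels,
# which is what the loop in the question printed.
labels = [item for item in df]
print(labels)

# iterrows() yields (index, Series) pairs, one per row.
for idx, row in df.iterrows():
    print(idx, row["page_num"], row["Frequency"])

# Often no loop is needed: column operations act on all rows at once.
total = df["Frequency"].sum()
print(total)
```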

+10

joris Jul 17 '15 at 10:43

Another suggestion is to use itertuples, which yields (index, row_value1, row_value2, ...) tuples.

    for tup in rows.itertuples():
        print(tup)

which prints:

    (0, 1, 90990)
    (1, 3, 6532)
    (2, 2, 5614)
    (3, 4, 4016)
    ...

As you can see, the first position is the index, the second is the value of the first column, and the third is the value of the second column.
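Note that newer pandas versions return namedtuples from itertuples by default, so the printed form can differ from the plain tuples shown above. A short sketch, using the first three rows from the question as sample data, of both the default behaviour and the `index` / `name` parameters:

```python
import pandas as pd

# First three tuples from the question's fetchall() output.
df = pd.DataFrame(
    [(1, 90990), (3, 6532), (2, 5614)],
    columns=["page_num", "Frequency"],
)

# Default: namedtuples with an Index field plus one field per column.
named = list(df.itertuples())
print(named[0].Index, named[0].page_num, named[0].Frequency)

# name=None gives plain tuples; index=False drops the leading index.
plain = list(df.itertuples(index=False, name=None))
print(plain)
```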

0

omri_saadon Jul 17 '15 at 10:41

Padraic Cunningham · Accepted Answer · 2015-07-17T10:26:34+0000

That is exactly what should happen when you iterate over a DataFrame: you see the column names. If you want to see the df, just print the df. To see the rows:

    for ind, row in df.iterrows():
        print(row.values)

Or use .values:

    for row in df.values:
        print(row)
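For reference, a small sketch of the array-based variant, again using the first three rows from the question as sample data. It uses `to_numpy()`, which the pandas documentation recommends over `.values` in newer versions. Both give a 2-D NumPy array whose rows can be looped over.

```python
import pandas as pd

# First three tuples from the question's fetchall() output.
df = pd.DataFrame(
    [(1, 90990), (3, 6532), (2, 5614)],
    columns=["page_num", "Frequency"],
)

# to_numpy() returns a 2-D array with one row per DataFrame row.
arr = df.to_numpy()

# Each row is a 1-D array; tolist() converts it to plain Python ints.
rows = [row.tolist() for row in arr]
print(rows)
```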
