It seems that loading data from a CSV file is faster than loading it from SQL (PostgreSQL) with pandas. (I have an SSD.)
Here is my test code:
import time

import numpy as np
import pandas as pd
from sqlalchemy import create_engine

# Time loading from the CSV file
start = time.time()
df = pd.read_csv('foo.csv')
df *= 3
duration = time.time() - start
print('{0}s'.format(duration))

# Time loading the same data from PostgreSQL
engine = create_engine('postgresql://user:password@host:port/schema')
start = time.time()
df = pd.read_sql_query("select * from mytable", engine)
df *= 3
duration = time.time() - start
print('{0}s'.format(duration))
foo.csv and the database table hold the same data (the same 4 columns and 100,000 rows, filled with random integers).
CSV takes 0.05 s
SQL takes 0.5 s
Is it normal for CSV to be 10 times faster than SQL? I wonder if I missed something here...
This is normal behavior: reading a CSV file is one of the fastest ways to simply load data.
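A CSV file is very naive and simple, so loading directly from it is very quick. For a massive database with a complex structure, CSV is not an option. SQL is very fast at selecting data from a table and returning it to you, but naturally, the ability to select, modify, and manipulate data adds overhead to each call.
Imagine that you have a time series in a CSV file running from 1920 to 2017, but you only want the data from 2010 to today.
The CSV approach would be to load the whole CSV file and then select the years 2010 to 2017.
The SQL approach would be to pre-select those years with a SQL SELECT statement.
In that scenario, SQL would be much faster.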
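The 1920 to 2017 time-series scenario can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for PostgreSQL, and the table name `series`, the column names `year` and `value`, and the placeholder values are all hypothetical. It shows the difference in where the filtering happens, not a benchmark.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical yearly series for 1920-2017; the values are placeholders.
years = list(range(1920, 2018))
frame = pd.DataFrame({"year": years, "value": [y % 7 for y in years]})

# CSV approach: parse every row first, then filter in pandas.
csv_buffer = io.StringIO(frame.to_csv(index=False))
all_rows = pd.read_csv(csv_buffer)
from_csv = all_rows[all_rows["year"] >= 2010].reset_index(drop=True)

# SQL approach: the database filters, so only matching rows are returned.
conn = sqlite3.connect(":memory:")  # assumption: stand-in for PostgreSQL
frame.to_sql("series", conn, index=False)
from_sql = pd.read_sql_query(
    "SELECT year, value FROM series WHERE year >= ? ORDER BY year",
    conn,
    params=(2010,),
)
conn.close()

# Rows parsed from CSV, rows kept after filtering, rows returned by SQL.
print(len(all_rows), len(from_csv), len(from_sql))
```

With a table this small the difference is negligible; the point is that the CSV path always parses all rows, while the SQL path only transfers the rows that match.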
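It is perfectly normal for CSV to be faster than SQL here, but they are not meant for the same thing, even though you can use them for the same thing:
CSV is for sequential access: you start at the beginning of the file, read each row one after the other, and deal with it as you need.
SQL is for indexed access, i.e. you look at an index and then go to the row or rows you are looking for. You can also perform a full table scan, which uses no index at all and makes the table essentially a bloated CSV.
Your query is a full table scan: it does not look at any index, it fetches all the data, so yes, the result is normal.
On the other hand, if you try a query like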
select * from mytable where myindex = 'myvalue';
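You will get a tremendous speedup compared with searching for the same rows in the CSV file. That is because of indexes in SQL.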
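One way to see the effect of an index on the query above is to ask the database for its query plan before and after creating one. The sketch below is a minimal illustration that assumes SQLite (from the Python standard library) as a stand-in for PostgreSQL; the `payload` column, the index name `idx_myindex`, the key values, and the `plan` helper are hypothetical. PostgreSQL has its own `EXPLAIN` command with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for PostgreSQL
conn.execute("CREATE TABLE mytable (myindex TEXT, payload INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    ((f"key{i}", i) for i in range(100_000)),
)


def plan(sql, params=()):
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM mytable WHERE myindex = ?"

# Without an index, the lookup has to scan the whole table.
plan_without_index = plan(query, ("key500",))

# With an index on the filtered column, the lookup becomes a search.
conn.execute("CREATE INDEX idx_myindex ON mytable (myindex)")
plan_with_index = plan(query, ("key500",))

print(plan_without_index)
print(plan_with_index)

row = conn.execute(query, ("key500",)).fetchone()
print(row)
```

The first plan should report a scan of `mytable` and the second a search that uses `idx_myindex`; the exact wording depends on the SQLite version.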
Source: https://habr.com/ru/post/1676768/