As I understand it, a Dask DataFrame is the right way to handle tabular data. I have a table in PostgreSQL, and I know how to load it into a pandas.DataFrame.
I know that odo can be used to convert a pandas.DataFrame to a dask.dataframe. But this is not a lazy operation: such a conversion presumably loads the entire PostgreSQL table into memory, and that is bad. I would prefer to read the rows one by one or in chunks. How can I do that?
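For reference, this is roughly the eager approach I mean (a minimal sketch using `dd.from_pandas`, which I believe is equivalent to what odo does here; the connection string and table name are placeholders for my actual setup):

```python
import pandas as pd
import dask.dataframe as dd
from sqlalchemy import create_engine

# Placeholder credentials and table name.
engine = create_engine("postgresql://user:password@localhost:5432/mydb")

# This pulls the whole PostgreSQL table into memory at once...
pdf = pd.read_sql_table("my_table", engine)

# ...and only then wraps the already-materialized data in a Dask collection,
# so the laziness of Dask buys me nothing.
ddf = dd.from_pandas(pdf, npartitions=8)
```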
- I have a similar problem with Cassandra. Cassandra, however, is distributed storage, so access to it could presumably be optimized for distributed reads. But how do I do that with Dask?