How to write a Pandas DataFrame to an existing Django model

I am trying to insert data from a Pandas DataFrame into an existing Django model, Agency, that uses the SQLite backend. However, following the answers to "How to write a Pandas Dataframe to Django model" and "Saving a Pandas DataFrame to a Django Model" replaces the whole SQLite table and breaks the Django code. Specifically, the id primary key column that Django generates automatically is replaced by index, which causes errors when rendering templates (no such column: agency.id).

Here is the code and the result of using Pandas to_sql on the SQLite table agency.

In models.py :

    from django.db import models


    class Agency(models.Model):
        name = models.CharField(max_length=128)

In myapp/management/commands/populate.py :

    import pandas as pd
    from django.core.management.base import BaseCommand
    from sqlalchemy import create_engine


    class Command(BaseCommand):

        def handle(self, *args, **options):
            # Open a SQLAlchemy connection to the database Django is using
            from django.conf import settings
            database_name = settings.DATABASES['default']['NAME']
            database_url = 'sqlite:///{}'.format(database_name)
            engine = create_engine(database_url, echo=False)

            # Insert data
            agencies = pd.DataFrame({"name": ["Agency 1", "Agency 2", "Agency 3"]})
            agencies.to_sql("agency", con=engine, if_exists="replace")

Calling 'python manage.py populate' successfully adds the three agencies to the table:

    index  name
    0      Agency 1
    1      Agency 2
    2      Agency 3

However, this changed the DDL of the table from:

    CREATE TABLE "agency" (
        "id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
        "name" varchar(128) NOT NULL
    )

to:

    CREATE TABLE agency (
        "index" BIGINT,
        name TEXT
    );
    CREATE INDEX ix_agency_index ON agency ("index")
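For reference, the schema change can be reproduced outside Django. The following is only a minimal sketch: it assumes pandas is installed, uses an in-memory SQLite database as a stand-in for the Django database file, and passes a plain sqlite3 connection to to_sql instead of a SQLAlchemy engine. The helper name table_ddl is made up for this illustration.

```python
import sqlite3

import pandas as pd

# In-memory stand-in for the Django SQLite file (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "agency" ('
    '"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, '
    '"name" varchar(128) NOT NULL)'
)


def table_ddl(connection, table):
    """Return the CREATE TABLE statement SQLite stores for a table (hypothetical helper)."""
    row = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row[0] if row else None


ddl_before = table_ddl(conn, "agency")

# if_exists="replace" drops the existing table and lets pandas recreate it
agencies = pd.DataFrame({"name": ["Agency 1", "Agency 2", "Agency 3"]})
agencies.to_sql("agency", con=conn, if_exists="replace")

ddl_after = table_ddl(conn, "agency")

print(ddl_before)
print(ddl_after)
```

The second statement printed no longer contains the id column, which is why Django later reports that agency.id is missing.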

How can I add the DataFrame to a Django-managed model and keep the Django ORM intact?

1 answer

To answer my own question, since I now import data into Django with Pandas quite often: the mistake I made was trying to use Pandas' built-in SQLAlchemy support, which modified the definition of the underlying database table. In the context above, you can simply use the Django ORM to connect and insert the data:

    import pandas as pd
    from django.core.management.base import BaseCommand

    from myapp.models import Agency


    class Command(BaseCommand):

        def handle(self, *args, **options):
            # Process data with Pandas
            agencies = pd.DataFrame({"name": ["Agency 1", "Agency 2", "Agency 3"]})

            # Iterate over the DataFrame and create one object per row
            for row in agencies.itertuples():
                Agency.objects.create(name=row.name)

However, you may often want to import data using an external script rather than a management command, as above, or the Django shell. In this case, you must first connect to the Django ORM by calling the setup method:

    import os, sys

    import django
    import pandas as pd

    sys.path.append('../..')  # add path to project root dir
    os.environ["DJANGO_SETTINGS_MODULE"] = "myproject.settings"

    # for more sophisticated setups, if you need to change connection settings
    # (e.g. when using django-environ):
    # os.environ["DATABASE_URL"] = "postgres://myuser:mypassword@localhost:54324/mydb"

    # Connect to Django ORM
    django.setup()

    # process data
    from myapp.models import Agency
    Agency.objects.create(name='MyAgency')

  • Here I export my settings module myproject.settings to DJANGO_SETTINGS_MODULE so that django.setup() can pick up the project settings.

  • Depending on where you run the script from, you may need to add the project path to the system path so that Django can find the settings module. In this case, I run my script two directories below my project root.

  • You can change any settings before calling setup. This is useful if your script needs to connect to the database differently from what is configured in settings, for example when running a script locally against Django / Postgres Docker containers.

Note that the example above uses django-environ to specify the database settings.
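The relative path '../..' in the script above only works when the script is started from its own directory. A minimal sketch of a variant that derives the project root from the script's location instead is shown here; the helper name configure_django_env and the example path are made up for illustration, and the settings module name is taken from the answer.

```python
import os
import sys
from pathlib import Path


def configure_django_env(script_file, levels_up=2,
                         settings_module="myproject.settings"):
    """Add the project root to sys.path and set DJANGO_SETTINGS_MODULE.

    levels_up is the number of directories between the script's own
    directory and the project root (2 in the answer's layout).
    """
    root = Path(os.path.abspath(script_file)).parents[levels_up]
    if str(root) not in sys.path:
        sys.path.append(str(root))
    # setdefault keeps a value that was already exported in the environment
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    return root


# Hypothetical script location; in a real script you would pass __file__
root = configure_django_env("/srv/myproject/scripts/import/load_agencies.py")
print(root)
```

After this call, the script would continue with import django and django.setup() exactly as in the example above.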


Source: https://habr.com/ru/post/1262428/

