I am new to data science and am currently improving my skills. I used a dataset from Kaggle, planned how to present the data, and ran into a problem.
What I am trying to achieve is to fill several data frames using a for loop. Following an example I saw, I store the data frames in a dictionary, but the data in each data frame gets overwritten.
I have a list of data frames:
    continents_list = [african_countries, asian_countries, european_countries, north_american_countries,
                       south_american_countries, oceanian_countries]
This is an example of my data frame from one of the continents:
         Continent      Country Name  Country Code  2010  2011  2012  2013  2014
    7      Oceania         Australia           AUS  11.4  11.4  11.7  12.2  13.1
    63     Oceania              Fiji           FJI  20.1  20.1  20.2  19.6  18.6
    149    Oceania       New Zealand           NZL  17.0  17.2  17.7  15.8  14.6
    157    Oceania  Papua New Guinea           PNG   5.4   5.3   5.4   5.5   5.4
    174    Oceania   Solomon Islands           SLB   9.1   8.9   9.3   9.4   9.5
First, I select the entire row of the country with the highest rate for a given year:
    def select_highest_rate(continent, year):
        highest_rate_idx = continent[year].idxmax()
        return continent.loc[highest_rate_idx]
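To show what this function returns, here is a small check against the Oceania rows shown above (only the 2010 and 2014 columns are copied, to keep it short). The point worth noticing is that `.loc` with a single index label gives back one row as a `Series`, not a `DataFrame`:

```python
import pandas as pd

# Rows copied from the Oceania sample above (2010 and 2014 columns only).
oceanian_countries = pd.DataFrame(
    {
        "Continent": ["Oceania"] * 5,
        "Country Name": ["Australia", "Fiji", "New Zealand",
                         "Papua New Guinea", "Solomon Islands"],
        "Country Code": ["AUS", "FJI", "NZL", "PNG", "SLB"],
        "2010": [11.4, 20.1, 17.0, 5.4, 9.1],
        "2014": [13.1, 18.6, 14.6, 5.4, 9.5],
    },
    index=[7, 63, 149, 157, 174],
)

def select_highest_rate(continent, year):
    # idxmax() returns the index label of the largest value in the column.
    highest_rate_idx = continent[year].idxmax()
    # A single label passed to .loc returns that row as a Series.
    return continent.loc[highest_rate_idx]

row = select_highest_rate(oceanian_countries, "2010")
print(type(row).__name__)                # Series
print(row["Country Name"], row["2010"])  # Fiji 20.1
```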
Then, for each continent and each year, I store the result in a dictionary:
    def show_highest_countries(continents_list):
        df_highest_countries = {}
        years_list = ['2010','2011','2012','2013','2014']
        for continent in continents_list:
            for year in years_list:
                highest_country = select_highest_rate(continent, year)
                highest_countries = highest_country[['Continent','Country Name',year]]
                df_highest_countries[year] = pd.DataFrame(highest_countries)
        return df_highest_countries
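The problem: the dictionary entry for each year is overwritten on every pass through the continents, so each data frame ends up holding only one country.
How can I keep the highest country of every continent in each year's data frame?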
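One possible way around the overwrite, as a minimal sketch rather than a definitive fix: collect the selected rows in a list per year, and build each data frame only after the loops finish. Two things here are assumptions and not part of the original code: `years_list` is passed in as a parameter, and `placeholder_countries` is a made-up second continent with invented values, included only so that there are two data frames to loop over.

```python
import pandas as pd

def select_highest_rate(continent, year):
    highest_rate_idx = continent[year].idxmax()
    return continent.loc[highest_rate_idx]

def show_highest_countries(continents_list, years_list):
    # One list of rows per year, instead of one value that gets replaced.
    rows_by_year = {year: [] for year in years_list}
    for continent in continents_list:
        for year in years_list:
            highest_country = select_highest_rate(continent, year)
            rows_by_year[year].append(
                highest_country[["Continent", "Country Name", year]]
            )
    # A list of Series becomes a data frame with one row per Series.
    return {year: pd.DataFrame(rows) for year, rows in rows_by_year.items()}

# Rows copied from the Oceania sample (2010 and 2011 columns only).
oceanian_countries = pd.DataFrame(
    {
        "Continent": ["Oceania"] * 5,
        "Country Name": ["Australia", "Fiji", "New Zealand",
                         "Papua New Guinea", "Solomon Islands"],
        "2010": [11.4, 20.1, 17.0, 5.4, 9.1],
        "2011": [11.4, 20.1, 17.2, 5.3, 8.9],
    },
    index=[7, 63, 149, 157, 174],
)

# Hypothetical data: invented names and values, for illustration only.
placeholder_countries = pd.DataFrame(
    {
        "Continent": ["Placeholder"] * 2,
        "Country Name": ["Country A", "Country B"],
        "2010": [1.0, 2.0],
        "2011": [3.0, 2.5],
    },
    index=[0, 1],
)

highest = show_highest_countries(
    [placeholder_countries, oceanian_countries], ["2010", "2011"]
)
print(highest["2010"])
```

With this version, each year's data frame has one row per continent, and the row labels are the index labels from the original continent data frames.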