How to fill an empty matrix with data values

I am desperately trying to populate a matrix with values from a data frame. This is trade data, and the data frame looks something like this:

country1 country2 value
1 Afghanistan  Albania    30
2 Afghanistan  Albania    81
3 Afghanistan    China     5
4     Albania  Germany     6
5       China  Germany     8
6       China   Turkey   900
7     Germany   Turkey    12
8     Germany      USA     3
9     Germany   Zambia   700

Using the unique and sort commands, I created a list of all the countries that appear in df (and converted them to a matrix):

     countries_sorted
[1,] "Afghanistan"   
[2,] "Albania"       
[3,] "China"         
[4,] "Germany"       
[5,] "Turkey"        
[6,] "USA"           
[7,] "Zambia"    

Using this "list", I created an empty trade matrix (7x7):

             Afghanistan Albania China Germany Turkey USA Zambia
Afghanistan          NA      NA    NA      NA     NA  NA     NA
Albania              NA      NA    NA      NA     NA  NA     NA
China                NA      NA    NA      NA     NA  NA     NA
Germany              NA      NA    NA      NA     NA  NA     NA
Turkey               NA      NA    NA      NA     NA  NA     NA
USA                  NA      NA    NA      NA     NA  NA     NA
Zambia               NA      NA    NA      NA     NA  NA     NA

Now I cannot work out how to fill this matrix with the numbers (sums) from the value column of df. I tried something like this:

a<-cast(df, country1~country2 , sum)

which works to some extent, but the result does not keep the original 7x7 format: the Afghanistan column and the Turkey, USA and Zambia rows are missing. I need the full square matrix, with 0s on the diagonal.

> a
     country1 Albania China Germany Turkey USA Zambia
1 Afghanistan     111     5       0      0   0      0
2     Albania       0     0       6      0   0      0
3       China       0     0       8    900   0      0
4     Germany       0     0       0     12   3    700

Does anyone have a solution?

3 answers

First, read in the data and collect the unique countries:

#your data.frame
# reads the table that was copied to the clipboard
df <- read.table(header = TRUE, file = 'clipboard', stringsAsFactors = FALSE)
#the list of unique countries
countries <- unique(c(df$country1,df$country2))

Then add every country pair with a value of 0 and build the table:

#create all the country combinations
newdf <- expand.grid(countries, countries)
#change names
colnames(newdf) <- c('country1', 'country2')
#add a value of 0 for the new combinations (won't affect outcome)
newdf$value <- 0
#row bind with original dataset
df2 <- rbind(df, newdf)


#and create the table using xtabs:
#the aggregate function will create the sum of the value for each combination
> xtabs(value ~ country1 + country2, aggregate(value~country1+country2,df2,sum))
             country2
country1      Afghanistan Albania China Germany Turkey USA Zambia
  Afghanistan           0     111     5       0      0   0      0
  Albania               0       0     0       6      0   0      0
  China                 0       0     0       8    900   0      0
  Germany               0       0     0       0     12   3    700
  Turkey                0       0     0       0      0   0      0
  USA                   0       0     0       0      0   0      0
  Zambia                0       0     0       0      0   0      0
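
A possible variation on the xtabs approach above, as a minimal sketch: if both country columns are converted to factors that share the same levels, xtabs keeps the unused levels and returns the square table directly, without the expand.grid padding. The data frame is rebuilt inline here so the example is self-contained; the name `trade` is only illustrative.

```r
# Sample data from the question, rebuilt inline
df <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

# Every country that appears on either side, sorted
countries <- sort(unique(c(df$country1, df$country2)))

# Give both columns the same factor levels
df$country1 <- factor(df$country1, levels = countries)
df$country2 <- factor(df$country2, levels = countries)

# xtabs sums value per pair and keeps unused levels, so the result is 7x7
trade <- xtabs(value ~ country1 + country2, data = df)
trade
```

Pairs that never occur in the data come out as 0, including the diagonal.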

An alternative to @LyzandeR's approach, using dplyr and tidyr:

dt = read.table(text=
"country1 country2 value
Afghanistan  Albania    30
Afghanistan  Albania    81
Afghanistan    China     5
Albania  Germany     6
China  Germany     8
China   Turkey   900
Germany   Turkey    12
Germany      USA     3
Germany   Zambia   700", header=T, stringsAsFactors=F)

library(dplyr)
library(tidyr)

dt2 = 
    dt %>% 
      group_by(country1,country2) %>%    # for every combination of countries
      summarise(SumValue = sum(value))   # get the sum of value

# get all possible countries that appear in your dataset
list_countries = union(dt2$country1, dt2$country2)

expand.grid(country1=list_countries, country2=list_countries, stringsAsFactors = F) %>%  # create all possible combinations of countries
  left_join(dt2, by=c("country1","country2")) %>%  # join back info whenever it is found
  mutate(SumValue = ifelse(is.na(SumValue),0,SumValue)) %>%  # replace NAs with 0s
  spread(country2,SumValue)  # reshape data

#     country1 Afghanistan Albania China Germany Turkey USA Zambia
# 1 Afghanistan           0     111     5       0      0   0      0
# 2     Albania           0       0     0       6      0   0      0
# 3       China           0       0     0       8    900   0      0
# 4     Germany           0       0     0       0     12   3    700
# 5      Turkey           0       0     0       0      0   0      0
# 6         USA           0       0     0       0      0   0      0
# 7      Zambia           0       0     0       0      0   0      0
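
If a plain matrix is wanted rather than a data frame, the same sums can also be written straight into a pre-built matrix like the one in the question. This is a base R sketch under that assumption; it starts from zeros instead of NA, and the names `agg` and `trade` are illustrative.

```r
# Sample data from the question, rebuilt inline
df <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

countries <- sort(unique(c(df$country1, df$country2)))

# One row per country pair, with the values summed
agg <- aggregate(value ~ country1 + country2, data = df, FUN = sum)

# Square matrix of zeros, named by country on both dimensions
trade <- matrix(0, nrow = length(countries), ncol = length(countries),
                dimnames = list(countries, countries))

# A two-column character matrix indexes cells by (row name, column name)
idx <- cbind(as.character(agg$country1), as.character(agg$country2))
trade[idx] <- agg$value
trade
```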

Since only the upper triangle of the matrix is filled and the diagonal is 0, your result is already the same as the matrix you want, except for the first column, which was dropped because it carries no information (only zeros). You can simply add it back to the matrix using cbind:

# a column of zeros with as many rows as the existing matrix
Z = matrix(0, nrow = nrow(oldMatrix), ncol = 1)
newMatrix = cbind(Z, oldMatrix)
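
The cast output shown in the question is missing the Turkey, USA and Zambia rows as well as the Afghanistan column, so a single cbind may not be enough to reach 7x7. One way to pad both dimensions at once, sketched here under the assumption that the smaller result is available as a numeric matrix with row and column names (called `old` below, a hypothetical name), is to copy it into a zero matrix by name:

```r
# The 4x6 result from the question, rebuilt inline as a named matrix
old <- matrix(
  c(111, 5, 0,   0, 0,   0,
      0, 0, 6,   0, 0,   0,
      0, 0, 8, 900, 0,   0,
      0, 0, 0,  12, 3, 700),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Afghanistan", "Albania", "China", "Germany"),
    c("Albania", "China", "Germany", "Turkey", "USA", "Zambia")
  )
)

# All countries that appear as a row or a column
countries <- sort(union(rownames(old), colnames(old)))

# Full square matrix of zeros
full <- matrix(0, nrow = length(countries), ncol = length(countries),
               dimnames = list(countries, countries))

# Copy the existing block into place by row and column name
full[rownames(old), colnames(old)] <- old
full
```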

Source: https://habr.com/ru/post/1608639/

