How to fill an empty matrix with data values

Question

I am desperately trying to populate a matrix with values from a data frame. This is trade data, so the data frame looks something like this:

country1 country2 value
1 Afghanistan  Albania    30
2 Afghanistan  Albania    81
3 Afghanistan    China     5
4     Albania  Germany     6
5       China  Germany     8
6       China   Turkey   900
7     Germany   Turkey    12
8     Germany      USA     3
9     Germany   Zambia   700

Using the unique and sort commands, I created a list of all the countries that appear in df (and converted them to a matrix):

     countries_sorted
[1,] "Afghanistan"   
[2,] "Albania"       
[3,] "China"         
[4,] "Germany"       
[5,] "Turkey"        
[6,] "USA"           
[7,] "Zambia"

Using this “list”, I created an empty trading matrix (7x7):

             Afghanistan Albania China Germany Turkey USA Zambia
Afghanistan          NA      NA    NA      NA     NA  NA     NA
Albania              NA      NA    NA      NA     NA  NA     NA
China                NA      NA    NA      NA     NA  NA     NA
Germany              NA      NA    NA      NA     NA  NA     NA
Turkey               NA      NA    NA      NA     NA  NA     NA
USA                  NA      NA    NA      NA     NA  NA     NA
Zambia               NA      NA    NA      NA     NA  NA     NA
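For readers who want to reproduce the setup described above, here is a minimal sketch. It assumes the data frame is called df as in the question; the name trade for the empty matrix is hypothetical, and countries_sorted is kept as a plain character vector rather than the one-column matrix shown above.

```r
# Example data copied from the table in the question.
df <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

# Sorted vector of every country that appears in either column.
countries_sorted <- sort(unique(c(df$country1, df$country2)))

# Empty square matrix (all NA) with the countries as row and column names.
# "trade" is a hypothetical name chosen for this sketch.
trade <- matrix(NA_real_,
                nrow = length(countries_sorted),
                ncol = length(countries_sorted),
                dimnames = list(countries_sorted, countries_sorted))
```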

Now I cannot work out how to fill this matrix with the sums of the value column from df. I tried something like this:

a<-cast(df, country1~country2 , sum)

This works to some extent, but the result does not keep the original 7x7 format. I need a square matrix with every country on both rows and columns, so that the diagonal is all 0s.

> a
     country1 Albania China Germany Turkey USA Zambia
1 Afghanistan     111     5       0      0   0      0
2     Albania       0     0       6      0   0      0
3       China       0     0       8    900   0      0
4     Germany       0     0       0     12   3    700

Does anyone have a solution?

casting matrix r

samyandi Sep 23 '15 at 9:15

3 answers

LyzandeR · Answer 1 · 2015-09-23T09:33:56+0000

First, read in the data and collect the countries:

#your data.frame (read from the clipboard after copying the table in the question)
df <- read.table(header=TRUE, file='clipboard', stringsAsFactors = FALSE)
#the list of unique countries
countries <- unique(c(df$country1,df$country2))

Then pad the data with every country pair and tabulate:

#create all the country combinations
newdf <- expand.grid(countries, countries)
#change names
colnames(newdf) <- c('country1', 'country2')
#add a value of 0 for the new combinations (won't affect outcome)
newdf$value <- 0
#row bind with original dataset
df2 <- rbind(df, newdf)


#and create the table using xtabs:
#the aggregate function will create the sum of the value for each combination
> xtabs(value ~ country1 + country2, aggregate(value~country1+country2,df2,sum))
             country2
country1      Afghanistan Albania China Germany Turkey USA Zambia
  Afghanistan           0     111     5       0      0   0      0
  Albania               0       0     0       6      0   0      0
  China                 0       0     0       8    900   0      0
  Germany               0       0     0       0     12   3    700
  Turkey                0       0     0       0      0   0      0
  USA                   0       0     0       0      0   0      0
  Zambia                0       0     0       0      0   0      0
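A possible variation on the same idea, sketched here under the assumption that the data frame has the three columns shown in the question: instead of padding the data with zero rows, both country columns can be turned into factors that share the same levels. xtabs keeps unused factor levels by default, so the table comes out square. The name trade is hypothetical.

```r
# Example data copied from the table in the question.
df <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

# Every country that appears in either column, sorted.
countries <- sort(unique(c(df$country1, df$country2)))

# Give both columns the same factor levels so that countries
# missing from one column still get a row or column.
df$country1 <- factor(df$country1, levels = countries)
df$country2 <- factor(df$country2, levels = countries)

# xtabs sums value within each (country1, country2) cell;
# cells with no observations are 0.
trade <- xtabs(value ~ country1 + country2, data = df)
```

If a plain matrix is needed afterwards, unclass(trade) drops the table classes while keeping the dimnames.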

AntoniosK · Answer 2 · 2015-09-23T09:46:53+0000

Along the same lines as @LyzandeR's answer, but using dplyr and tidyr.

dt = read.table(text=
"country1 country2 value
Afghanistan  Albania    30
Afghanistan  Albania    81
Afghanistan    China     5
Albania  Germany     6
China  Germany     8
China   Turkey   900
Germany   Turkey    12
Germany      USA     3
Germany   Zambia   700", header=T, stringsAsFactors=F)

library(dplyr)
library(tidyr)

dt2 = 
    dt %>% 
      group_by(country1,country2) %>%    # for every combination of countries
      summarise(SumValue = sum(value))   # get the sum of value

# get all possible countries that appear in your dataset
list_countries = union(dt2$country1, dt2$country2)

expand.grid(country1=list_countries, country2=list_countries, stringsAsFactors = F) %>%  # create all possible combinations of countries
  left_join(dt2, by=c("country1","country2")) %>%  # join back info whenever it is found
  mutate(SumValue = ifelse(is.na(SumValue),0,SumValue)) %>%  # replace NAs with 0s
  spread(country2,SumValue)  # reshape data

#     country1 Afghanistan Albania China Germany Turkey USA Zambia
# 1 Afghanistan           0     111     5       0      0   0      0
# 2     Albania           0       0     0       6      0   0      0
# 3       China           0       0     0       8    900   0      0
# 4     Germany           0       0     0       0     12   3    700
# 5      Turkey           0       0     0       0      0   0      0
# 6         USA           0       0     0       0      0   0      0
# 7      Zambia           0       0     0       0      0   0      0
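For readers without dplyr and tidyr, the same steps (sum per pair, build all combinations, left join, replace NAs, reshape) can be sketched in base R. This is an illustration rather than part of the answer; the names sums, grid, full and wide are hypothetical.

```r
# Example data copied from the table in the question.
dt <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

# Sum of value for every observed pair of countries.
sums <- aggregate(value ~ country1 + country2, data = dt, FUN = sum)

# All countries that appear in either column.
countries <- union(dt$country1, dt$country2)

# Every possible pair of countries.
grid <- expand.grid(country1 = countries, country2 = countries,
                    stringsAsFactors = FALSE)

# Left join: keep every pair from the grid, NA where no trade was recorded.
full <- merge(grid, sums, by = c("country1", "country2"), all.x = TRUE)

# Replace the NAs with 0s.
full$value[is.na(full$value)] <- 0

# Reshape from long to wide: one row per country1, one column per country2.
wide <- xtabs(value ~ country1 + country2, data = full)
```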

Jan van der Vegt · Answer 3 · 2015-09-23T09:28:31+0000

Since only the upper triangle of the matrix is populated and the diagonal is 0, the cast result is the same as the matrix you want, except for the first column, which is dropped because it carries no information (only zeros). You can simply add it back to the matrix using cbind:

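# oldMatrix is the cast result (a in the question);
# the zero column needs one entry per row of oldMatrix
Z = matrix(0, nrow = nrow(oldMatrix), ncol = 1)
newMatrix = cbind(Z, oldMatrix)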
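A more general way to pad a partial result is to copy it into a full zero matrix by row and column names, which also restores rows that are missing (Turkey, USA and Zambia in the question). The sketch below assumes the partial result is a plain numeric matrix with country names as dimnames; the names partial and full are hypothetical, and the cast output from the question would first need its country1 column moved into the row names.

```r
# Hypothetical partial result shaped like the cast output in the question:
# 4 rows, 6 columns, country names as dimnames.
partial <- matrix(
  c(111, 5, 0,   0, 0,   0,
      0, 0, 6,   0, 0,   0,
      0, 0, 8, 900, 0,   0,
      0, 0, 0,  12, 3, 700),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Afghanistan", "Albania", "China", "Germany"),
    c("Albania", "China", "Germany", "Turkey", "USA", "Zambia")
  )
)

# Every country that appears as a row or a column, sorted.
countries <- sort(union(rownames(partial), colnames(partial)))

# Full square matrix of zeros with the countries on both axes.
full <- matrix(0, nrow = length(countries), ncol = length(countries),
               dimnames = list(countries, countries))

# Copy the known block into place; matching is by name, not by position.
full[rownames(partial), colnames(partial)] <- partial
```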
