Suppose I have a distance matrix in which both the cost to the destination and the cost from the origin must be below a certain threshold (for example, US$100) for two areas to count as linked. My difficulty is obtaining a common set once these links have been classified. For example, A1 links with A2, A3 and A4 (cost below the threshold in both directions); A2 links with A1 and A4; and A4 links with A1 and A2. Thus A1, A2 and A4 should be classified in the same group, since they form the set with the most connections among themselves. Below is the matrix as an example:
      A1   A2   A3   A4   A5   A6   A7
A1     0   90   90   90  100  100  100
A2    80    0   90   90   90  110  100
A3    80  110    0   90  120  110   90
A4    90   90  110    0   90  100   90
A5   110  110  110  110    0   90   80
A6   120  130  135  100   90    0   90
A7   105  110  120   90   90   90    0
I am programming this in Stata, and I have not stored the matrix above as a matrix object (as in Mata). The column listing the letter A plus a number is a variable holding the row names of the matrix, and the remaining columns are named after each locality (for example, A1, etc.).
I obtained a list of links between the localities with the following code, which is admittedly crude, as I was in a hurry:
clear all
set more off
// input the matrix (same values as the table above)
input A1 A2 A3 A4 A5 A6 A7
0 90 90 90 100 100 100
80 0 90 90 90 110 100
80 110 0 90 120 110 90
90 90 110 0 90 100 90
110 110 110 110 0 90 80
120 130 135 100 90 0 90
105 110 120 90 90 90 0
end
// generate the row-name variable
gen locality=""
forv i=1/7{
    replace locality="A`i'" in `i'
}
*
order locality, first
// flag the cells that fall below the threshold of 100 (diagonal excluded)
forv i=1/7{
    gen r_`i'=0
    replace r_`i'=1 if A`i'<100 & A`i'!=0
}
*
// check both directions (origin and destination below the threshold)
forv i=1/7{
    gen check_`i'=.
    forv j=1/7{
        local v=r_`i'[`j']
        local vv=r_`j'[`i']
        replace check_`i'=`v'+`vv' in `j'
    }
    *
}
*
// build the list of links, one temporary file per locality
gen locality_x=""
forv i=1/7{
    preserve
    local name = locality[`i']
    keep if check_`i'==2
    replace locality_x="`name'"
    keep locality locality_x
    save "C:\Users\user\Desktop\temp_`i'", replace
    restore
}
*
// stack the temporary files into one listing
use "C:\Users\user\Desktop\temp_1", clear
forv i=2/7{
    append using "C:\Users\user\Desktop\temp_`i'"
}
*
// locality_x now shows whether A1 has links with A2, A3, etc., and so on.
// the difficulty lies in finding a common intersection between the groups.
Which returns the following listing:
locality_x locality
A1 A2
A1 A3
A1 A4
A2 A1
A2 A4
A3 A1
A4 A1
A4 A2
A4 A7
A5 A6
A5 A7
A6 A5
A6 A7
A7 A4
A7 A5
A7 A6
I am trying to find the intersections between these sets of links, but I do not know how to do this in Stata. I would like to be able to change the threshold and find the common set again. I would also appreciate a solution in R, since I can program a little in it.
A similar way to get the list in R (as @user2957945 shows in his answer below):
m <- structure(c(0L, 80L, 80L, 90L, 110L, 120L, 105L, 90L, 0L, 110L,
90L, 110L, 130L, 110L, 90L, 90L, 0L, 110L, 110L, 135L, 120L,
90L, 90L, 90L, 0L, 110L, 100L, 90L, 100L, 90L, 120L, 90L, 0L,
90L, 90L, 100L, 110L, 110L, 100L, 90L, 0L, 90L, 100L, 100L, 90L,
90L, 80L, 90L, 0L), .Dim = c(7L, 7L), .Dimnames = list(c("A1",
"A2", "A3", "A4", "A5", "A6", "A7"), c("A1", "A2", "A3", "A4",
"A5", "A6", "A7")))
id = m < 100
m_new = (id + t(id) == 2) & m !=0
result = subset(reshape2::melt(m_new), value)
result[order(result[[1]], result[[2]]), 1:2]
Var1 Var2
8 A1 A2
15 A1 A3
22 A1 A4
2 A2 A1
23 A2 A4
3 A3 A1
4 A4 A1
11 A4 A2
46 A4 A7
40 A5 A6
47 A5 A7
34 A6 A5
48 A6 A7
28 A7 A4
35 A7 A5
42 A7 A6
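Since the goal is to rerun this with different thresholds, the R steps above could be wrapped in a function. The following is only a minimal sketch in base R: `mutual_links` is a hypothetical helper name, it keeps the strict `<` comparison used above, and it assumes that zeros occur only on the diagonal of the matrix.

```r
# Distance matrix from the question (rows = origin, columns = destination).
loc <- paste0("A", 1:7)
m <- matrix(c(
    0,  90,  90,  90, 100, 100, 100,
   80,   0,  90,  90,  90, 110, 100,
   80, 110,   0,  90, 120, 110,  90,
   90,  90, 110,   0,  90, 100,  90,
  110, 110, 110, 110,   0,  90,  80,
  120, 130, 135, 100,  90,   0,  90,
  105, 110, 120,  90,  90,  90,   0
), nrow = 7, byrow = TRUE, dimnames = list(loc, loc))

# mutual_links() is a hypothetical helper, not part of any package.
# It returns one row per ordered pair of localities whose cost is
# below `threshold` in both directions.
mutual_links <- function(m, threshold = 100) {
  below <- m < threshold
  adj <- below & t(below)          # TRUE only if both directions qualify
  diag(adj) <- FALSE               # a locality is not linked to itself
  idx <- which(adj, arr.ind = TRUE)
  out <- data.frame(from = rownames(m)[idx[, "row"]],
                    to   = colnames(m)[idx[, "col"]],
                    stringsAsFactors = FALSE)
  out <- out[order(out$from, out$to), ]
  rownames(out) <- NULL
  out
}

links_100 <- mutual_links(m, 100)  # same pairs as the listing above
links_90  <- mutual_links(m, 90)   # a stricter threshold
```

Calling `mutual_links(m, 100)` should give the same 16 ordered pairs as the listing above, while a threshold of 90 leaves no mutual links in this particular matrix, because no pair has a cost below 90 in both directions.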
" ", , , intersect R. id, id (). , A.1 A.2 A.4, A.2 A.1 A.4, A.4 A.1 A.2, id (). , . , , A.1 A.2 A.6, A.2 A.1 A.6, A.6 - A.1 A.2 ( A.6 A.4, ). A.6 , , A.1, A.2 A.4 A.6 .