Search for common links in a matrix and classification by common intersection

Question

Search for common links in a matrix and classification by common intersection

Suppose I have a distance matrix in which the cost of fate and the cost of origin must be below a certain threshold value (for example, US 100) for link exchange. My difficulty is to achieve a common set after classifying these areas: A1 (cost of fate and origin below the threshold) with A2 and (the same thing) A3 and A4; A2 links from A1 and A4; A4 with A1 and A2. Thus, A1, A2 and A4 will be classified in the same group as the group with the highest frequency of connections between each other. Below I set the matrix as an example:

    A1  A2  A3  A4  A5  A6  A7
A1  0   90  90  90  100 100 100
A2  80  0   90  90  90  110 100
A3  80  110 0   90  120 110 90
A4  90  90  110 0   90  100 90
A5  110 110 110 110 0   90  80
A6  120 130 135 100 90  0   90
A7  105 110 120 90  90  90  0

I program this with Stata, and I have not put the matrix higher in matrix form, as in mata. The column listing the letters A plus the number is a variable with the names of the matrix growths, and the remaining columns are named with each locale name (for example, A1, etc.).

I returned a list of links between each locality with the following code, which maybe I did it very "rudely", as I was in a hurry:

    clear all

    set more off

    //inputting matrix

    input A1 A2 A3 A4 A5 A6 A7
    0 90 90 90 100 100 100
    80 0 90 90 90 100 100
    80 110 0 90 120 110 90
    90 90 110 0 90 100 90
    110 110 110 110 0 90 90
    120 130 135 100 90 0 90
    105 110 120 90 90 90 0

    end

    //generate row variable

    gen locality=""

    forv i=1/7{

        replace locality="A`i'" in `i'

    }
    *

    order locality, first


    //generating who gets below the threshold of 100

    forv i=1/7{

        gen r_`i'=0

        replace r_`i'=1 if A`i'<100 & A`i'!=0

    }
    *

    //checking if both ways (origin and destiny below threshold)

    forv i=1/7{

        gen check_`i'=.

    forv j=1/7{

            local v=r_`i'[`j']

            local vv=r_`j'[`i']

            replace check_`i'=`v'+`vv' in `j'

                }

    *
        }
    *

    //creating list of links

    gen locality_x=""

    forv i=1/7{

        preserve

        local name = locality[`i']

        keep if check_`i'==2

        replace locality_x="`name'"

        keep locality locality_x

        save "C:\Users\user\Desktop\temp_`i'", replace

        restore

    }
    *

    use "C:\Users\user\Desktop\temp_1", clear

    forv i=2/7{

        append using "C:\Users\user\Desktop\temp_`i'"
    }
    *

    //now locality_x lists if A.1 has links with A.2, A.3 etc. and so on.
    //the dificulty lies in finding a common intersection between the groups.

Which returns the following listing:

locality_x  locality
A1  A2
A1  A3
A1  A4
A2  A1
A2  A4
A3  A1
A4  A1
A4  A2
A4  A7
A5  A6
A5  A7
A6  A5
A6  A7
A7  A4
A7  A5
A7  A6

I am trying to get to know a lot of intersections, but I don't know how to do this in Stata. I want to do something to reprogram the threshold and find a common set. I would appreciate it if you could create a solution in R, given that I can program a little in it.

A similar way to get a list in R (as @ user2957945 insert in his answer below):

structure(c(0L, 80L, 80L, 90L, 110L, 120L, 105L, 90L, 0L, 110L, 
90L, 110L, 130L, 110L, 90L, 90L, 0L, 110L, 110L, 135L, 120L, 
90L, 90L, 90L, 0L, 110L, 100L, 90L, 100L, 90L, 120L, 90L, 0L, 
90L, 90L, 100L, 110L, 110L, 100L, 90L, 0L, 90L, 100L, 100L, 90L, 
90L, 80L, 90L, 0L), .Dim = c(7L, 7L), .Dimnames = list(c("A1", 
"A2", "A3", "A4", "A5", "A6", "A7"), c("A1", "A2", "A3", "A4", 
"A5", "A6", "A7")))

# get values less than threshold
id = m < 100 
# make sure both values are less than threshold, and dont include diagonal
m_new = (id + t(id) == 2) & m !=0 
# melt data and subset to keep TRUE values (TRUE if both less than threshold and not on diagonal)
result  = subset(reshape2::melt(m_new), value)
# reorder to match question results , if needed 
result[order(result[[1]], result[[2]]), 1:2] 

   Var1 Var2
8    A1   A2
15   A1   A3
22   A1   A4
2    A2   A1
23   A2   A4
3    A3   A1
4    A4   A1
11   A4   A2
46   A4   A7
40   A5   A6
47   A5   A7
34   A6   A5
48   A6   A7
28   A7   A4
35   A7   A5
42   A7   A6

" ", , , intersect R. id, id (). , A.1 A.2 A.4, A.2 A.1 A.4, A.4 A.1 A.2, id (). , . , , A.1 A.2 A.6, A.2 A.1 A.6, A.6 - A.1 A.2 ( A.6 A.4, ). A.6 , , A.1, A.2 A.4 A.6 .

+4

matrix r graph-theory set-intersection stata

John Doe 13 '17 18:23

2

R

# get values less then threshold
id = m < 100 
# make sure both values are less then threshold, and dont include diagonal
m_new = (id + t(id) == 2) & m !=0 
# melt data and subset to keep TRUE values (TRUE if both less than threshold and not on diagonal)
result  = subset(reshape2::melt(m_new), value)
# reorder to match question results , if needed 
result[order(result[[1]], result[[2]]), 1:2] 

   Var1 Var2
8    A1   A2
15   A1   A3
22   A1   A4
2    A2   A1
23   A2   A4
3    A3   A1
4    A4   A1
11   A4   A2
46   A4   A7
40   A5   A6
47   A5   A7
34   A6   A5
48   A6   A7
28   A7   A4
35   A7   A5
42   A7   A6

.

structure(c(0L, 80L, 80L, 90L, 110L, 120L, 105L, 90L, 0L, 110L, 
90L, 110L, 130L, 110L, 90L, 90L, 0L, 110L, 110L, 135L, 120L, 
90L, 90L, 90L, 0L, 110L, 100L, 90L, 100L, 90L, 120L, 90L, 0L, 
90L, 90L, 100L, 110L, 110L, 100L, 90L, 0L, 90L, 100L, 100L, 90L, 
90L, 80L, 90L, 0L), .Dim = c(7L, 7L), .Dimnames = list(c("A1", 
"A2", "A3", "A4", "A5", "A6", "A7"), c("A1", "A2", "A3", "A4", 
"A5", "A6", "A7")))

+2

user2957945 13 '17 18:53

Andrei · Accepted Answer · 2017-05-16T17:52:31+0000

, , , igraph:

# Load necessary libraries
library(igraph)

# Define global parameters
threshold <- 100

# Compute the adjacency matrix
# (distances in both directions need to be smaller than the threshold)
am <- m < threshold & t(m) < threshold

# Make an undirected graph given the adjacency matrix
# (we set diag to FALSE so as not to draw links from a vertex to itself)
gr <- graph_from_adjacency_matrix(am, mode = "undirected", diag = FALSE)

# Find all the largest complete subgraphs
lc <- largest_cliques(gr)

# Output the list of complete subgraphs as a list of vertex names
lapply(lc, (function (e) e$name))

, Stata . , ( ), Stata (.. clustermat).

Search for common links in a matrix and classification by common intersection

More articles: