This can be done without a full matrix multiplication, using only element-wise multiplication.
Let the matrices be A and B. Multiply t(A), the transpose of A, element-wise by B, then sum each column.
In R: colSums(t(A) * B)
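Why this works can be written out as a short derivation. The index notation is an assumption for illustration: A is taken to be n x m and B to be m x n, and \circ denotes the element-wise product.

```latex
\[
(AB)_{ii} \;=\; \sum_{k=1}^{m} A_{ik}\, B_{ki}
          \;=\; \sum_{k=1}^{m} (A^{\top})_{ki}\, B_{ki}
          \;=\; \sum_{k=1}^{m} \bigl(A^{\top} \circ B\bigr)_{ki},
\qquad i = 1, \dots, n .
\]
```

The last sum is the sum of column i of t(A) * B, which is what colSums returns. It needs n*m multiplications, whereas forming the full n x n product takes n^2*m.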
Example:
n = 5
m = 10000;
A = matrix(runif(n*m), n, m);
B = matrix(runif(n*m), m, n);
Full matrix multiplication:
diag(A %*% B)
# [1] 2492.198 2474.869 2459.881 2509.018 2477.591
Element-wise multiplication:
colSums(t(A) * B)
# [1] 2492.198 2474.869 2459.881 2509.018 2477.591
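A reproducible variant of the same check, as a minimal sketch. The seed value and the names d_full, d_fast and d_alt are my own choices, not part of the original answer. all.equal is used rather than == because the two results may differ by floating-point rounding.

```r
set.seed(42)  # arbitrary seed (assumption), only to make the run repeatable
n <- 5
m <- 10000
A <- matrix(runif(n * m), n, m)
B <- matrix(runif(n * m), m, n)

d_full <- diag(A %*% B)       # builds the whole n x n product, keeps only the diagonal
d_fast <- colSums(t(A) * B)   # element-wise product, then column sums
d_alt  <- rowSums(A * t(B))   # same sums, transposing B instead of A

# Both shortcuts should agree with the full product up to rounding
stopifnot(isTRUE(all.equal(d_full, d_fast)))
stopifnot(isTRUE(all.equal(d_full, d_alt)))
```

The rowSums form is equivalent, so it may be convenient to transpose whichever matrix is cheaper to transpose in your case.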