I know this is an old question, but I was looking for exactly this. Since SSE has no instruction for it, I implemented the 64-bit multiplication myself using pmuldq, as Paul R mentioned. Here's what I came up with:
// requires g++ -msse4.1 ...
The code is also on Godbolt: SSE Multiply64Bit.
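For anyone who wants the idea spelled out, here is a minimal sketch of the usual decomposition. It is not the original listing. Split each 64-bit lane into 32-bit halves, a = a1·2^32 + a0 and b = b1·2^32 + b0, so that the low 64 bits of a·b are a0·b0 + ((a0·b1 + a1·b0) << 32). One deliberate difference: the sketch uses the unsigned pmuludq (`_mm_mul_epu32`, plain SSE2) rather than pmuldq, because the a0·b0 term needs the full unsigned 32x32 product. The names `mul64_lo_sse2` and `mul64_lo_pairs` are made up for illustration.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: multiplies the two 64-bit lanes of a and b
// and keeps the low 64 bits of each product.
static inline __m128i mul64_lo_sse2(__m128i a, __m128i b)
{
    // Bring the high 32 bits of every 64-bit lane down to the low half.
    __m128i a_hi = _mm_srli_epi64(a, 32);
    __m128i b_hi = _mm_srli_epi64(b, 32);

    // pmuludq: unsigned 32x32 -> 64 multiply of the low half of each lane.
    __m128i lo_lo = _mm_mul_epu32(a, b);      // a0 * b0
    __m128i lo_hi = _mm_mul_epu32(a, b_hi);   // a0 * b1
    __m128i hi_lo = _mm_mul_epu32(a_hi, b);   // a1 * b0

    // Only the low 32 bits of the cross terms survive the shift by 32.
    __m128i cross = _mm_slli_epi64(_mm_add_epi64(lo_hi, hi_lo), 32);

    // a1 * b1 would land entirely above bit 63, so it is not needed.
    return _mm_add_epi64(lo_lo, cross);
}

// Hypothetical wrapper for trying the routine on plain arrays.
static inline void mul64_lo_pairs(const std::uint64_t a[2],
                                  const std::uint64_t b[2],
                                  std::uint64_t out[2])
{
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), mul64_lo_sse2(va, vb));
}
```

Because the result is taken modulo 2^64, the same routine works for signed and unsigned 64-bit inputs. On x86-64 it should compile without extra flags since it only relies on SSE2.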