Intel has made great strides in its SSE implementation over the past 5 years or so, whereas AMD has not really kept pace. Initially, both vendors really only had 64-bit execution units, so each 128-bit operation was split into 2 micro-ops. Since the introduction of Core and Core 2, Intel processors have had a full 128-bit implementation of SSE, which means a 128-bit operation issues as 1 micro-op instead of 2, effectively doubling throughput. More recent Intel processors also have several SSE execution units, which means you can get more than 1 instruction per clock for 128-bit SIMD instructions.
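To make "128-bit operation" concrete, here is a minimal sketch of the kind of code this affects, assuming an x86 compiler that provides `<xmmintrin.h>`. The function name `add_floats_sse` is made up for illustration. Each `_mm_add_ps` is a single 128-bit SSE add over four floats; whether it executes as one micro-op or two depends on the CPU, as described above, and nothing in the source code changes.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n). */
void add_floats_sse(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);

    size_t i = 0;

    /* Main loop: four floats (128 bits) per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  /* unaligned 128-bit load */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The unaligned load/store intrinsics are used so the sketch works on any pointer; with 16-byte aligned data you could use `_mm_load_ps` and `_mm_store_ps` instead.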