I am going to assume that you want to keep the same algorithm. This should be at least a slightly more efficient implementation. The main difference is that the code in the loop uses registers, not memory.
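    int lcm(int a, int b) {
        __asm {
            xor ecx, ecx        ; ecx = candidate, starts at 0
            mov esi, a
            mov edi, b
        lstart:
            inc ecx             ; try the next candidate
            mov eax, ecx
            xor edx, edx
            idiv esi            ; edx = candidate % a
            test edx, edx
            jne lstart
            mov eax, ecx        ; edx is already 0 here
            idiv edi            ; edx = candidate % b
            test edx, edx
            jnz lstart
            mov eax, ecx        ; result in eax
            leave               ; returns directly, bypassing the compiler's epilogue
            ret
        }
    }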
However, as Jason noted, this is really not a very efficient algorithm: multiplying, finding the GCD, and dividing will normally be faster (unless a and b are quite small).
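For comparison, here is a rough sketch of the GCD-based approach in plain C. The names `gcd` and `lcm_gcd` are just mine for illustration, and it assumes both inputs are positive and that the result fits in an `int`. It divides before multiplying so the intermediate value stays as small as possible.

```c
/* Euclid's algorithm. Assumes a > 0 and b > 0. */
static int gcd(int a, int b)
{
    while (b != 0) {
        int r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* lcm(a, b) = a / gcd(a, b) * b.
   Assumes a > 0, b > 0, and that the result fits in an int. */
int lcm_gcd(int a, int b)
{
    return a / gcd(a, b) * b;
}
```

Euclid's loop takes a number of steps roughly proportional to the number of digits in the inputs, whereas the counting loop takes as many steps as the LCM itself.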
Edit: there is another algorithm that is almost simpler to understand and should also be much faster (than the original, that is, not than multiplying and then dividing by the GCD). Instead of generating consecutive numbers until you find one that both a and b divide evenly, generate consecutive multiples of one of them (preferably the larger) until you find one that is evenly divisible by the other:
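    int lcm2(int a, int b) {
        __asm {
            xor ecx, ecx        ; ecx = current multiple of a, starts at 0
            mov esi, a
            mov edi, b
        lstart:
            add ecx, esi        ; next multiple of a
            mov eax, ecx
            xor edx, edx
            idiv edi            ; edx = multiple % b
            test edx, edx
            jnz lstart
            mov eax, ecx        ; result in eax
            leave               ; returns directly, bypassing the compiler's epilogue
            ret
        }
    }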
It remains dead simple to understand, but should be a significant improvement over the original.
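Note that lcm2 as written always steps by a, whichever operand is larger. A portable C sketch of the same idea that picks the larger value as the step might look like this. The name `lcm_multiples` is made up for the example, and it assumes both inputs are positive and that the result fits in an `int`.

```c
/* Walk through multiples of the larger operand until one is
   divisible by the smaller.
   Assumes a > 0, b > 0, and that the result fits in an int. */
int lcm_multiples(int a, int b)
{
    int step  = (a > b) ? a : b;   /* larger operand: fewer iterations */
    int other = (a > b) ? b : a;   /* smaller operand: the divisor */
    int m = step;

    while (m % other != 0)
        m += step;

    return m;
}
```

Stepping by the larger value means the loop runs at most as many times as the smaller value, since the LCM can never exceed a * b.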