Paul R has a pretty good answer, but I would like to expand on it a bit. If you think of sound as a series of pulses (which is what it is), then a higher pitch has more pulses per second (higher frequency), and a lower pitch has fewer (lower frequency). To lower the pitch of an existing sound, you have to spread those pulses out (move them farther apart). As a result, the sound gets longer, because you haven't reduced the number of pulses; you have only spaced them farther apart (fewer per second). The opposite happens when you raise the pitch: the pulses are squeezed closer together, which makes the sound shorter.
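To make that concrete, here is a minimal sketch of the naive approach: resampling a tone with linear interpolation. The names `make_tone`, `resample`, and `pitch_factor` are made up for this illustration, and the 8000 Hz sample rate is an arbitrary assumption. It only demonstrates that pitch and duration change together; it is not meant as a production resampler.

```python
import math

def make_tone(freq_hz, duration_s, sample_rate=8000):
    """Generate a sine tone as a plain list of float samples."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def resample(samples, pitch_factor):
    """Naive pitch change by linear-interpolation resampling.

    pitch_factor > 1 squeezes the pulses together (higher pitch, shorter);
    pitch_factor < 1 spreads them apart (lower pitch, longer).
    """
    out_len = int(len(samples) / pitch_factor)
    out = []
    for i in range(out_len):
        pos = i * pitch_factor               # fractional read position in the input
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]  # clamp at the last sample
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

tone = make_tone(440.0, 1.0)      # 1 second at 8000 samples/s
lower = resample(tone, 0.5)       # one octave down
higher = resample(tone, 2.0)      # one octave up
print(len(tone), len(lower), len(higher))  # 8000 16000 4000
```

Played back at the same sample rate, the octave-down version lasts twice as long and the octave-up version half as long, which is exactly the side effect described above.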
If you want the duration to stay constant regardless of changes in pitch, you have to either discard information (lower pitch) or manufacture information (higher pitch). This is where the fancy processing comes in. What can safely be discarded? What can safely be duplicated or constructed?
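As a deliberately crude illustration of those two choices, the sketch below forces a sample list back to a target length by cutting off the tail (discarding) or looping from the start (duplicating). The function name `fit_to_length` is hypothetical, and this is not how real pitch shifters decide what to drop or repeat; it would produce audible cuts and clicks. It only shows where the information has to be lost or invented.

```python
def fit_to_length(samples, target_len):
    """Force `samples` to exactly `target_len` items.

    Too long (pitch was lowered): throw away the tail.
    Too short (pitch was raised): repeat material from the start.
    """
    if not samples or target_len <= 0:
        return []
    if len(samples) >= target_len:
        return samples[:target_len]          # discard information
    # Duplicate information by wrapping around to the beginning.
    return [samples[i % len(samples)] for i in range(target_len)]

stretched = [1, 2, 3, 4, 5, 6]   # stand-in for a sound that got longer
squeezed = [1, 2, 3]             # stand-in for a sound that got shorter

print(fit_to_length(stretched, 4))  # [1, 2, 3, 4]
print(fit_to_length(squeezed, 7))   # [1, 2, 3, 1, 2, 3, 1]
```

The fancy processing is essentially a much smarter answer to the same question: which small pieces can be dropped or repeated without the listener noticing.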