For an application like this, you probably do not want to record to an audio file. Instead, record the notes and their timings, which is a much more compact format, and then play them back as if the user had pressed the notes at the recorded times.
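Recording notes and timings could look roughly like the sketch below. It is only an illustration: `NoteEvent`, `NoteRecorder`, `replay`, and the `trigger` callback are hypothetical names, and the note identifiers are arbitrary integers. The clock and sleep functions are passed in so the logic can be exercised without real delays.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteEvent:
    time: float  # seconds since recording started
    note: int    # hypothetical note identifier (e.g. a key index)


class NoteRecorder:
    """Stores which note was pressed and when, instead of audio data."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.events = []

    def press(self, note):
        # Timestamp is relative to the moment recording began.
        self.events.append(NoteEvent(self._clock() - self._start, note))


def replay(events, trigger, sleep=time.sleep):
    """Call trigger(note) for each event, waiting out the recorded gaps."""
    elapsed = 0.0
    for event in sorted(events, key=lambda e: e.time):
        sleep(max(0.0, event.time - elapsed))
        elapsed = event.time
        trigger(event.note)
```

In a real app `trigger` would be whatever function already plays a note when the user presses it. Sleeping between events is the simplest approach but can drift slightly; an app with an update loop might instead compare each event's time against the loop's own clock.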
If you do want to export to an audio file format, you can write a simple mixer that adds the source samples for each note together at the correct offsets and writes the results into the output audio buffer. You may also need a very simple compressor to keep the summed samples within the sample range, so that clipping does not cause distortion. One way to do this is to scale down (divide) any summed sample that exceeds 95% of the maximum sample value. There may also be a way to have OpenAL do the mixing for you and render into a sound buffer.
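A minimal sketch of such a mixer and compressor follows, assuming 16-bit signed samples held in plain Python lists and offsets expressed in sample frames (for example `round(seconds * sample_rate)`). The names `mix_notes` and `compress` and the `ratio` parameter are assumptions for illustration. Dividing only the part of a sample above the threshold is one possible reading of "dividing", not necessarily the exact method intended.

```python
def mix_notes(events, samples, length):
    """Sum each note's samples into one buffer at the event's offset.

    events:  list of (offset_in_frames, note) pairs
    samples: dict mapping note -> list of integer sample values
    """
    out = [0] * length
    for offset, note in events:
        for i, value in enumerate(samples[note]):
            pos = offset + i
            if 0 <= pos < length:
                out[pos] += value  # may exceed the 16-bit range for now
    return out


def compress(mixed, max_value=32767, threshold=0.95, ratio=4.0):
    """Reduce any sample whose magnitude exceeds threshold * max_value.

    Only the excess above the threshold is divided by `ratio`
    (a hypothetical parameter); the result is then hard-limited.
    """
    knee = threshold * max_value
    result = []
    for value in mixed:
        magnitude = abs(value)
        if magnitude > knee:
            magnitude = knee + (magnitude - knee) / ratio
        magnitude = min(int(magnitude), max_value)
        result.append(magnitude if value >= 0 else -magnitude)
    return result
```

The output of `compress(mix_notes(...))` always fits in the 16-bit range, so it can be written to the output file. Very loud sums still hit the hard limit at the end, so lowering the level of each source sample before mixing may also help.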