Just for completeness, since the question is a bit older:
Supplement 30 did indeed add audio support, with the Basic Voice Audio Waveform object. However, it only supports a sampling frequency of 8000 Hz.
DICOM also defines the General Audio Waveform object, which supports a sampling frequency of 44,100 Hz.
The details are in Part 3 of the DICOM standard.
In addition, DICOM supports MPEG2 and MPEG4; see Part 5 of the DICOM standard.
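To make the choice between the two waveform objects concrete, here is a minimal Python sketch that maps a sampling frequency to the matching storage SOP class. The SOP Class UIDs are taken from the DICOM UID registry; the table name `AUDIO_SOP_CLASSES` and the helper `pick_audio_sop_class` are hypothetical names made up for this illustration, and the rates are simply the ones mentioned above.

```python
# Sampling rate (Hz) -> (SOP class name, SOP Class UID).
# Rates are the ones stated in this answer; UIDs come from the DICOM UID registry.
AUDIO_SOP_CLASSES = {
    8000: ("Basic Voice Audio Waveform Storage", "1.2.840.10008.5.1.4.1.1.9.4.1"),
    44100: ("General Audio Waveform Storage", "1.2.840.10008.5.1.4.1.1.9.4.2"),
}


def pick_audio_sop_class(sampling_frequency_hz):
    """Hypothetical helper: return (name, uid) for a rate, or None if neither object fits."""
    return AUDIO_SOP_CLASSES.get(int(sampling_frequency_hz))


if __name__ == "__main__":
    for rate in (8000, 44100, 48000):
        print(rate, pick_audio_sop_class(rate))
```

A rate that matches neither object (48000 in the loop above) returns `None`, which is the case where you would have to resample the audio or use one of the video encodings instead.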