juhara.com

Playing WAV and MIDI files using DirectX Audio with Delphi

Zamrony P Juhara
15 September 2006 17:13:00
This article discusses how to set up DirectX Audio version 8 or later to load and play WAV and MIDI files, including initializing the performance, loader, sound segment and audiopath. This article also disc

DirectX Overview

If someone ran a survey on which API most PC games use, I am sure DirectX would be at the top of the list. Nowadays, DirectX has become a mature API for game programming on the Windows platform, and it makes a game developer's life easier. Back in the old days, when all PC games ran on DOS, game developers had to write code for everything under the sun, from input hardware such as mice, keyboards and joysticks to output hardware such as graphics cards and sound cards. Every piece of hardware needed its own code (phew, being a game developer must have been a horrible job, mustn't it?). Now DirectX comes to the rescue: it lets game developers focus on game programming rather than on programming each piece of hardware. One of the interesting parts of game programming is sound programming. Without sound, games would be dry, plain and lame.

DirectX has several components for sound programming, namely DirectSound and DirectMusic. From version 8.0 on, they are presented together as a single component, DirectX Audio. I will explain it for DirectX 8.0. The same interface names apply to version 9.0, because the DirectMusic interfaces keep their 8 suffix there (for example, IDirectMusicPerformance8). For this article I use the DirectX header conversion from here. Download the project source code accompanying this article here.

Hold Your Breath and Dive In....

Ok, enough with the overview. Let's start with the real stuff. Using DirectX Audio takes several steps, which can be divided into three main parts as follows:

  1. Initialization.
  2. Playing and stopping sound sample.
  3. Finalization (Shutdown).

Initialization Part

To start using DirectX Audio, we must first initialize COM by calling the CoInitialize function. This function is declared in the ActiveX unit, so add that unit to your uses clause. It takes one parameter; pass nil when calling CoInitialize. Once that's done, we are ready to initialize DirectX Audio. Let's continue with the next step.
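
As a minimal sketch of this step, the console program below initializes COM and shuts it down again. The call to CoUninitialize is not covered in this article; it is the standard counterpart of CoInitialize and is included here as an assumption about how you would clean up.

```pascal
program ComInitSketch;

{$APPTYPE CONSOLE}

uses
  Windows, ActiveX;

var
  hr: HResult;
begin
  // Initialize COM for this thread; the single parameter is reserved, pass nil.
  hr := CoInitialize(nil);
  if Succeeded(hr) then
  begin
    try
      // Create and use the DirectX Audio objects here. Release every
      // interface (set it to nil) before leaving this block.
      Writeln('COM is ready');
    finally
      // Balance the successful CoInitialize call.
      CoUninitialize;
    end;
  end
  else
    Writeln('CoInitialize failed');
end.
```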

Initializing Performance

The performance object is the overall manager for music playback. An application usually needs only one performance. To create the performance, we use the CoCreateInstance function, which is declared in the ActiveX unit. For example:

CoCreateInstance(CLSID_DirectMusicPerformance,nil,CLSCTX_INPROC,IID_IDirectMusicPerformance8,FPerformance);

The code above creates an instance of IDirectMusicPerformance8 and stores its address in FPerformance, a variable of type IDirectMusicPerformance8. The class ID CLSID_DirectMusicPerformance and the interface ID IID_IDirectMusicPerformance8 are both declared in the DirectMusic unit, so be sure to include this unit in your uses clause. The second parameter is for COM aggregation, which is not supported, so we set it to nil. The constant CLSCTX_INPROC is used here because DirectX is an in-process COM server. We can check the result of CoCreateInstance with the Failed or Succeeded function.
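
A minimal sketch of such a check follows. The helper name CreatePerformance is hypothetical; the units and identifiers are the ones named in this article.

```pascal
uses
  Windows, ActiveX, DirectMusic;

// Hypothetical helper: returns True only when the performance was created.
function CreatePerformance(out Performance: IDirectMusicPerformance8): Boolean;
var
  hr: HResult;
begin
  Performance := nil;
  hr := CoCreateInstance(CLSID_DirectMusicPerformance, nil, CLSCTX_INPROC,
                         IID_IDirectMusicPerformance8, Performance);
  // Succeeded is True for any non-negative HResult.
  Result := Succeeded(hr) and (Performance <> nil);
end;
```

Checking the result once here means the rest of the code can rely on the performance not being nil.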

After creating FPerformance, call InitAudio, which is a member function of IDirectMusicPerformance8. For example:


if (FPerformance<>nil) then
begin
   FPerformance.InitAudio(nil,nil,
      Handle,
      DMUS_APATH_SHARED_STEREOPLUSREVERB,
      64,
      DMUS_AUDIOF_ALL,
      nil);
end;

The first and second parameters are pointers to IDirectMusic and IDirectSound instances respectively. We set them to nil because we want DirectX Audio to create the IDirectMusic and IDirectSound instances automatically for the performance's internal use. The third parameter is the window handle to use for DirectSound creation. It can be 0, which indicates that the foreground window will be used. The fourth parameter is the type of the default audiopath; DMUS_APATH_SHARED_STEREOPLUSREVERB tells DirectX Audio to create an ordinary music setup with stereo output and reverb, while 0 means no default audiopath is created. The fifth parameter is the number of performance channels to allocate to that audiopath. The sixth parameter is a set of flags that indicates which features we want to use. We want all features, so we set it to DMUS_AUDIOF_ALL. The last parameter is a pointer to a data structure of synthesizer parameters. We want the defaults, so we pass nil. At this point, we have finished initializing the performance. Let's continue the journey...
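
To illustrate the fourth parameter, here is a hedged sketch of the other choice: initializing the performance without a default audiopath, so that every sound must later be played on an audiopath of its own. The helper name InitAudioNoDefaultPath is hypothetical, and the sketch assumes the same header conversion as the call above.

```pascal
uses
  Windows, DirectMusic;

// Hypothetical helper: initializes the performance without a default audiopath.
function InitAudioNoDefaultPath(const Performance: IDirectMusicPerformance8;
  Wnd: HWND): HResult;
begin
  Result := Performance.InitAudio(nil, nil,
    Wnd,              // 0 means the foreground window is used
    0,                // 0 = do not create a default audiopath
    64,               // only meaningful when a default audiopath is requested
    DMUS_AUDIOF_ALL,  // request all features
    nil);             // default synthesizer parameters
end;
```

With this variant, the last parameter of PlaySegmentEx must always name an audiopath, because there is no default one to fall back on.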

Initializing Loader

The purpose of the loader object is to load DirectMusic objects, such as sound samples, from a file or resource before they can be used with the performance object. An application should keep one loader object at a time and not free it until there is no more loading to be done. Microsoft recommends that we create a single global loader object so that object caching works efficiently.

To initialize the loader object, we use the CoCreateInstance function again:

CoCreateInstance(CLSID_DirectMusicLoader,nil,CLSCTX_INPROC,IID_IDirectMusicLoader8,FLoader);

Can you see that? Initializing the loader object is almost the same as initializing the performance object. All we need to change is the class ID, the interface ID (now that of IDirectMusicLoader8) and, of course, the variable that will hold the pointer to the IDirectMusicLoader8 interface.

Loading Sound Sample From File

We have a performance. We have a loader. Now we need something to play, right? Let's torture our sound card by loading some sounds from a file so we can hear our PC screaming.

To load a sound sample from a file, we use the LoadObjectFromFile function, which is a member function of IDirectMusicLoader8. The following code is one of many ways to do it.

FLoader.LoadObjectFromFile(CLSID_DirectMusicSegment,IID_IDirectMusicSegment8,wchFilename,FSoundSegment);

LoadObjectFromFile() creates a sound segment and loads the sound sample stored in the file named wchFilename into FSoundSegment. wchFilename is of type PWideChar, so be sure to convert your filename if you use the string type for it. Because we want to store the sound sample in a sound segment, we set the first and second parameters to the class ID CLSID_DirectMusicSegment and the interface ID of IDirectMusicSegment8. LoadObjectFromFile will not close the file until the segment is released, so you may receive an error message if you try to open the file while a segment is using it.
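
The filename conversion mentioned above can be sketched as follows. The helper name LoadSegmentFromFile is hypothetical, and the sketch assumes, as the call above implies, that the header conversion declares the last parameter of LoadObjectFromFile as an out parameter.

```pascal
uses
  Windows, DirectMusic;

// Hypothetical helper: loads a WAV or MIDI file into a segment.
function LoadSegmentFromFile(const Loader: IDirectMusicLoader8;
  const FileName: string; out Segment: IDirectMusicSegment8): HResult;
var
  WideName: WideString;
begin
  // Assigning a string to a WideString converts it; the PWideChar cast
  // then yields the null-terminated wide string the loader expects.
  WideName := FileName;
  Result := Loader.LoadObjectFromFile(CLSID_DirectMusicSegment,
                                      IID_IDirectMusicSegment8,
                                      PWideChar(WideName),
                                      Segment);
end;
```

The caller can test the returned HResult with Failed before touching the segment.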

To load a sound sample from memory, we use GetObject, a member function of IDirectMusicLoader8.

  ZeroMemory(@objdesc,sizeof(TDMus_ObjectDesc));
  objdesc.dwSize:=sizeof(TDMus_ObjectDesc);
  objdesc.dwValidData:=DMUS_OBJ_CLASS or DMUS_OBJ_MEMORY;
  objdesc.guidClass:=CLSID_DirectMusicSegment;
  objdesc.llMemLength:=FInternalStream.Size;
  objdesc.pbMemData:=FInternalStream.Memory;

  FLoader.GetObject(objdesc,
                     IDirectMusicSegment8,
                     FSegment);

Before we create the segment and load it with the sound samples in memory, we need to initialize a variable of type TDMus_ObjectDesc, which holds information about the object we want to get. First we fill it with zeros. Most DirectX function calls fail if we do not initialize the data structures passed to them.

Because we want to load data from a location in memory into a segment, the only TDMus_ObjectDesc fields relevant to this task are guidClass, llMemLength and pbMemData. To retrieve an IDirectMusicSegment instance, guidClass must hold the class ID CLSID_DirectMusicSegment. llMemLength and pbMemData hold the size and address of the sound data to be loaded.

To let DirectX know that guidClass, llMemLength and pbMemData hold valid data, we must set dwValidData to DMUS_OBJ_CLASS or DMUS_OBJ_MEMORY. The last thing to do is set dwSize to the size of the TDMus_ObjectDesc structure.

Download Segment

To make the sound sample available to the performance so we can play it later, we must download the segment containing the sound sample to the performance. For this we use Download, a member function of IDirectMusicSegment8.

FSoundSegment.Download(FPerformance);

Creating Audiopath

Audiopath? Maybe some of you will shout at me and say, "What new monster are you bringing now?". Well, this new monster is actually responsible for managing the flow of sound data. When we initialize the performance, DirectX Audio also creates a default audiopath for us. If you only need to play one sound sample at a time, you can skip this part, because you can play it with the default audiopath.

In a game, we often have to play multiple sounds at the same time, such as background music plus an explosion sound when the player pushes the fire button. Using only the default audiopath will prevent us from playing those sounds simultaneously.

To be able to mix those sounds and play them simultaneously, we need to create an audiopath for each of our sound segments. DirectX Audio provides two functions for this task, CreateAudioPath and CreateStandardAudioPath. Both are member functions of IDirectMusicPerformance8. The difference is that an audiopath created by the first function has its settings loaded from a configuration object, while with the second the caller picks a standard audiopath type and supplies the settings. I prefer the second one because it's simpler.

FPerformance.CreateStandardAudioPath(DMUS_APATH_SHARED_STEREOPLUSREVERB,64,TRUE,FAudioPath);

This code creates a stereo audiopath with reverb and 64 channels. We set the third parameter to TRUE because we want to activate the audiopath we have just created. The last parameter is where we want to store the pointer to the IDirectMusicAudioPath8 interface. At this point, we're done with the initialization part. Let's move on to the most interesting part of this article... playing our sounds! Hip hip hurray.
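
As a sketch of the one-audiopath-per-sound idea from this section, the hypothetical helper below creates one audiopath for background music and another for sound effects. The DMUS_APATH_DYNAMIC_STEREO type and the channel count of 16 are illustrative assumptions, and the sketch assumes the header conversion declares that constant.

```pascal
uses
  Windows, DirectMusic;

// Hypothetical helper: one audiopath for music, one for effects.
function CreateGamePaths(const Performance: IDirectMusicPerformance8;
  out MusicPath, EffectPath: IDirectMusicAudioPath8): Boolean;
var
  hr: HResult;
begin
  // Music path: stereo with reverb, as in the article's example.
  hr := Performance.CreateStandardAudioPath(
          DMUS_APATH_SHARED_STEREOPLUSREVERB, 64, True, MusicPath);
  // Effect path: plain stereo (assumed choice), activated immediately.
  if Succeeded(hr) then
    hr := Performance.CreateStandardAudioPath(
            DMUS_APATH_DYNAMIC_STEREO, 16, True, EffectPath);
  Result := Succeeded(hr);
end;
```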

Playing and stopping your stuff

To play a sound segment on the default audiopath, we can use PlaySegment. To play a sound segment on an audiopath we created earlier, we must use PlaySegmentEx. The latter function takes more parameters but is more flexible. PlaySegment and PlaySegmentEx are both member functions of IDirectMusicPerformance8.

FPerformance.PlaySegmentEx(FSoundSegment,nil,nil,DMUS_SEGF_SECONDARY,0,FSegmentState,nil,FAudioPath);

The first parameter is, of course, the segment that holds the sound we want to play. The second is the segment name. We can set it to nil because it is currently not supported. The third is for a transition effect, such as fading from one sound to another. We set it to nil because in a game we want the sound to play immediately. The fourth is the set of segment flags to use. DMUS_SEGF_SECONDARY tells DirectX Audio to play our segment as a secondary segment, which lets us mix the sound with other sounds that are currently playing. The fifth parameter is the time at which to start playing our sound. We want to play it immediately, so we set it to zero. The sixth, FSegmentState, is a variable of type IDirectMusicSegmentState that receives the state of the segment being played. The seventh parameter is the audiopath or segment state that will be stopped when the new segment starts playing. We set it to nil because we want the other segments to keep playing as they were. The last parameter is the audiopath that we want to use. If we want to play on the default audiopath, we set it to nil.

If you can play, then you can stop. To stop a sound segment from playing, call Stop or StopEx. StopEx is an extension of Stop. I will only explain StopEx because it's more flexible. Both are member functions of IDirectMusicPerformance8.
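
For contrast, here is a hedged sketch of the other combination described above: playing a segment as the primary segment on the default audiopath, which suits a single piece of background music. The helper name PlayAsPrimary is hypothetical, and the sketch assumes InitAudio was called with a non-zero audiopath type so that a default audiopath exists.

```pascal
uses
  Windows, DirectMusic;

// Hypothetical helper: plays a segment as the primary segment.
function PlayAsPrimary(const Performance: IDirectMusicPerformance8;
  const Segment: IDirectMusicSegment8;
  out State: IDirectMusicSegmentState): HResult;
begin
  Result := Performance.PlaySegmentEx(Segment, nil, nil,
    0,      // no DMUS_SEGF_SECONDARY flag: play as the primary segment
    0,      // start immediately
    State,  // receives the segment state
    nil,    // do not stop anything else explicitly
    nil);   // nil = use the default audiopath
end;
```

Only one primary segment plays at a time, so this is not a replacement for the secondary segments used for sound effects.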

FPerformance.StopEx(FSoundSegment,0,0);

The first item in the StopEx parameter list is the segment to be stopped. The second is the time at which to stop and the third is a set of flags; passing 0 for both tells DirectX Audio that we want to stop playback immediately.

That's it for playing and stopping, folks. Let's discuss how to properly shut down DirectX Audio...
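
Two related calls may be useful, sketched here with hypothetical helper names: StopEx also accepts an audiopath as the object to stop, and Stop with nil for both the segment and the segment state stops all playback.

```pascal
uses
  Windows, DirectMusic;

// Hypothetical helper: stops everything playing on one audiopath.
function StopPath(const Performance: IDirectMusicPerformance8;
  const AudioPath: IDirectMusicAudioPath8): HResult;
begin
  Result := Performance.StopEx(AudioPath, 0, 0);
end;

// Hypothetical helper: stops every segment in the performance at once.
function StopEverything(const Performance: IDirectMusicPerformance8): HResult;
begin
  // nil segment and nil segment state mean "all music"; 0, 0 = immediately.
  Result := Performance.Stop(nil, nil, 0, 0);
end;
```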

Finalization (shutdown)

Of course, to shut down properly, we must politely release all the interface instances we have created. Otherwise, DirectX Audio will curse us with a bad thing: a memory leak. We must pay special attention when releasing the performance object. Before we release it, we must call its CloseDown function, because we have made a call to its InitAudio function.

FPerformance.CloseDown;

Optionally, we can unload a previously downloaded segment before releasing it by calling its Unload function. According to Microsoft, CloseDown will also unload all downloaded segments that have not been unloaded, so I think it is safe to skip this unloading step.
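
The shutdown order described in this section can be sketched as one procedure. The name ShutdownAudio is hypothetical, and the sketch assumes the segment was downloaded to the performance, as in the Download Segment section.

```pascal
uses
  Windows, DirectMusic;

// Hypothetical helper: releases everything in a safe order.
procedure ShutdownAudio(var Segment: IDirectMusicSegment8;
  var AudioPath: IDirectMusicAudioPath8;
  var Loader: IDirectMusicLoader8;
  var Performance: IDirectMusicPerformance8);
begin
  // Optional: unload the segment from what it was downloaded to.
  if (Segment <> nil) and (Performance <> nil) then
    Segment.Unload(Performance);

  // Setting an interface variable to nil releases the instance.
  Segment := nil;
  AudioPath := nil;
  Loader := nil;

  // CloseDown must come before the performance itself is released.
  if Performance <> nil then
  begin
    Performance.CloseDown;
    Performance := nil;
  end;
end;
```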

Go to Surface and Get Some Air...

We have enough information now. Let us continue the journey by creating wrapper classes that simplify playing sound with DirectX Audio. We will create the following two classes.

  1. TSoundCollection class
  2. TSoundItem class

TSoundCollection class

We will encapsulate the performance and the loader in this class. It maintains one performance object, one loader and a list of sound samples. This class is responsible for setting up the environment and for creating and destroying all instances of TSoundItem. TSoundCollection inherits from the TCollection class because they share common functionality. The following code is the declaration of TSoundCollection. I will not show the detailed implementation of this class; you can grab the source code here. I have commented the code as clearly as I can, so hopefully everybody will understand it.

type 
   TSoundCollection=class(TCollection)
   private
     FLoader:IDirectMusicLoader8;
     FPerformance:IDirectMusicPerformance8;
   public
     constructor Create(ItemClass:TCollectionItemClass);
     destructor Destroy;override;
   end;

We need to override the constructor and destructor, because we need to set up the DirectX Audio performance and loader and initialize audio with a call to the InitAudio function.

constructor TSoundCollection.Create(ItemClass: TCollectionItemClass);
begin
  inherited;
  CoCreateInstance(CLSID_DirectMusicPerformance,
                   nil,
                   CLSCTX_INPROC,
                   IDirectMusicPerformance8,
                   FPerformance);
  CoCreateInstance(CLSID_DirectMusicLoader,
                   nil,
                   CLSCTX_INPROC,
                   IDirectMusicLoader8,
                   FLoader);

  FPerformance.InitAudio(nil,
                         nil,
                         0,
                         DMUS_APATH_SHARED_STEREOPLUSREVERB,
                         64,
                         DMUS_AUDIOF_ALL,
                         nil
                         );
end;

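destructor TSoundCollection.Destroy;
begin
  // Free all TSoundItem instances first: their destructors unload segments
  // and release audiopaths, which should happen before the performance closes.
  Clear;
  FLoader:=nil;
  if FPerformance<>nil then
    FPerformance.CloseDown;
  FPerformance:=nil;
  inherited;
end;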

TSoundItem class

This class is responsible for loading and playing back sound. We give it methods for playing and stopping a sound and for loading a sound from a file or a stream. We also give this class two properties, Looped and IsPlaying. Looped is a boolean that selects between playing the sound in an infinite loop and playing it once. If Looped is TRUE, the sound is played until it is stopped, which is very suitable for background music. The following is the declaration of TSoundItem.

   TSoundItem=class(TCollectionItem)
   private
     FSegment:IDirectMusicSegment8;
     FSegmentState:IDirectMusicSegmentState8;
     FAudioPath:IDirectMusicAudioPath8;
     FSoundCollection:TSoundCollection;
     FInternalStream:TMemoryStream;
     FLooped: boolean;
     function GetIsPlaying:boolean;
     procedure SetSegmentLoop(const loop:boolean);
     procedure SetLooped(const Value: boolean);
   public
     constructor Create(Collection:TCollection);override;
     destructor Destroy;override;
     function Play:HResult;
     function Stop:HResult;
     procedure LoadFromFile(const filename:string);
     procedure LoadFromStream(Stream:TStream);
   published
     property IsPlaying:boolean read GetIsPlaying;
     property Looped:boolean read FLooped write SetLooped;
   end;

Constructor

To be able to mix the sound segments, we need to create an audiopath for every instance of TSoundItem. The constructor of TSoundItem contains the code that creates the audiopath for the instance. To create an audiopath, we need the IDirectMusicPerformance8 instance, which is maintained by TSoundCollection.

constructor TSoundItem.Create(Collection: TCollection);
begin
  inherited;
  FSoundCollection:=Collection as TSoundCollection;
  FSoundCollection.FPerformance.CreateStandardAudioPath(
                       DMUS_APATH_SHARED_STEREOPLUSREVERB,
                       64,true,
                       FAudioPath);
end;

Destructor

The destructor Destroy unloads the segment and makes sure the segment, segment state and audiopath instances are released by setting them to nil. We also free FInternalStream, the internal buffer that holds the sound data.

destructor TSoundItem.Destroy;
begin
  // FSegment is nil if no sound was ever loaded into this item.
  if FSegment<>nil then
    FSegment.Unload(FAudioPath);

  FSegment:=nil;
  FSegmentState:=nil;
  FAudioPath:=nil;

  FInternalStream.Free;
  inherited;
end;

Loading Sound

TSoundItem has two methods for loading sound, LoadFromFile and LoadFromStream. Loading from a file is made simple by using LoadObjectFromFile, a member function of the IDirectMusicLoader8 interface, while loading from a stream uses GetObject of IDirectMusicLoader8.

procedure TSoundItem.LoadFromFile(const filename: string);
var afilename:widestring;
begin
  afilename:=filename;
  FSoundCollection.FLoader.LoadObjectFromFile(CLSID_DirectMusicSegment,
                               IDirectMusicSegment8,PWideChar(aFilename),
                               FSegment);
  FSegment.Download(FAudioPath);
  SetSegmentLoop(FLooped);
end;

procedure TSoundItem.LoadFromStream(Stream: TStream);
var objdesc:TDMus_ObjectDesc;
begin
  if FInternalStream=nil then
    FInternalStream:=TMemoryStream.Create
  else
  begin
    FInternalStream.Clear;
  end;

  FInternalStream.CopyFrom(Stream,0);

  ZeroMemory(@objDesc,sizeof(TDMus_ObjectDesc));
  objdesc.dwSize:=sizeof(TDMus_ObjectDesc);
  objdesc.dwValidData:=DMUS_OBJ_CLASS or DMUS_OBJ_MEMORY;
  objdesc.guidClass:=CLSID_DirectMusicSegment;
  objdesc.llMemLength:=FInternalStream.Size;
  objdesc.pbMemData:=FInternalStream.Memory;

  FSoundCollection.FLoader.GetObject(objdesc,
                     IDirectMusicSegment8,
                     FSegment);
  FSegment.Download(FAudioPath);
  SetSegmentLoop(FLooped);
end;

Playing and Stopping Sound

The Play and Stop methods start and stop playback. These two methods are wrappers for the PlaySegmentEx and StopEx functions.

function TSoundItem.Play: HResult;
var aseg:IDirectMusicSegmentState;
begin
  result:=FSoundCollection.FPerformance.PlaySegmentEx(FSegment,
                  nil,
                  nil,
                  DMUS_SEGF_SECONDARY,
                  0,
                  aseg,
                  nil,
                  FAudioPath
                  );
  // aseg stays nil when playback could not start, so only query it on success.
  if Succeeded(result) and (aseg<>nil) then
    aseg.QueryInterface(IDirectMusicSegmentState8,FSegmentState);
end;

function TSoundItem.Stop: HResult;
begin
  result:=FSoundCollection.FPerformance.StopEx(FSegment,0,0);
end;

In the Play implementation, we keep a pointer to IDirectMusicSegmentState8. We need it to retrieve the status of a segment, for example to find out whether the segment is currently playing.

Are we playing something?

To find out whether a segment is playing or not, we can use the IDirectMusicPerformance8.IsPlaying function. In the TSoundItem class, this functionality is encapsulated in the GetIsPlaying function, which is the accessor of the IsPlaying property.

function TSoundItem.GetIsPlaying: boolean;
begin
  result:=(FSegment<>nil) and
          (FSegmentState<>nil) and
          (FSoundCollection.FPerformance.IsPlaying(FSegment,
                              FSegmentState)=S_OK);
end;

Playing Once or Forever?

For background sound, we need to play it continuously until we explicitly stop it. So how do we instruct DirectX Audio to play a segment forever or just once? IDirectMusicSegment8 has a SetRepeats function, which sets how many times the segment repeats. A value of DMUS_SEG_REPEAT_INFINITE plays the segment until it is stopped; 0 plays it once.

procedure TSoundItem.SetLooped(const Value: boolean);
begin
  if FLooped<>Value then
  begin
    FLooped := Value;
    SetSegmentLoop(FLooped);
  end;
end;

procedure TSoundItem.SetSegmentLoop(const loop: boolean);
var loop_count:cardinal;
begin
  if FSegment<>nil then
  begin
    if loop then
      loop_count:=DMUS_SEG_REPEAT_INFINITE
    else
      loop_count:=0;

    FSegment.SetRepeats(loop_count);
  end;
end;
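
As a usage sketch, the hypothetical procedure below starts looping background music with these classes. It assumes the classes live in the udxaudio unit of the accompanying project, and the file name samples/music.mid is made up for illustration.

```pascal
uses
  udxaudio;  // assumed to declare TSoundCollection and TSoundItem

// Hypothetical helper: adds a looping music item and starts it.
function StartBackgroundMusic(Sounds: TSoundCollection): TSoundItem;
begin
  Result := Sounds.Add as TSoundItem;
  // Setting Looped before loading is fine: LoadFromFile applies the
  // stored value to the segment once the segment exists.
  Result.Looped := True;
  Result.LoadFromFile('samples/music.mid');  // hypothetical file
  Result.Play;
end;
```

The returned item can be kept by the caller so that Stop can be called on it later.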

That's all, I guess. Now let us create a sample application that uses these classes. Create a project and drop 9 buttons onto the form. Name them btnLoadSample1, btnLoadSample2, btnLoadSample3, btnPlaySample1, btnPlaySample2, btnPlaySample3, Button1, Button2 and Button3. Complete the OnClick handlers of the buttons as follows:

unit ufrmMain;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs,udxaudio, StdCtrls;

type
  TfrmMain = class(TForm)
    btnLoadSample1: TButton;
    btnLoadSample2: TButton;
    btnLoadSample3: TButton;
    btnPlaySample1: TButton;
    btnPlaySample2: TButton;
    btnPlaySample3: TButton;
    Button1: TButton;
    Button2: TButton;
    Button3: TButton;
    procedure btnLoadSample1Click(Sender: TObject);
    procedure btnLoadSample2Click(Sender: TObject);
    procedure btnLoadSample3Click(Sender: TObject);
    procedure btnPlaySample1Click(Sender: TObject);
    procedure btnPlaySample2Click(Sender: TObject);
    procedure btnPlaySample3Click(Sender: TObject);
    procedure Button1Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
    procedure Button3Click(Sender: TObject);
  private
    SoundCollection:TSoundCollection;
    Sample1,Sample2,Sample3:TSoundItem;
    { Private declarations }
  public
    constructor Create(AOwner:TComponent);override;
    destructor Destroy;override;
  end;

var
  frmMain: TfrmMain;

implementation

{$R *.dfm}

procedure TfrmMain.btnLoadSample1Click(Sender: TObject);
begin
  if Sample1=nil then
    Sample1:=SoundCollection.Add as TSoundItem;

  Sample1.LoadFromFile('samples/bark.wav');
end;

procedure TfrmMain.btnLoadSample2Click(Sender: TObject);
begin
  if Sample2=nil then
    Sample2:=SoundCollection.Add as TSoundItem;

  Sample2.LoadFromFile('samples/horse.wav');
end;

procedure TfrmMain.btnLoadSample3Click(Sender: TObject);
begin
  if Sample3=nil then
    Sample3:=SoundCollection.Add as TSoundItem;

  Sample3.LoadFromFile('samples/elephant.wav');
end;

constructor TfrmMain.Create(AOwner: TComponent);
begin
  inherited;
  SoundCollection:=TSoundCollection.Create(TSoundItem);
end;

destructor TfrmMain.Destroy;
begin
  SoundCollection.Free;
  inherited;
end;

procedure TfrmMain.btnPlaySample1Click(Sender: TObject);
begin
  if sample1<>nil then
    Sample1.Play;
end;

procedure TfrmMain.btnPlaySample2Click(Sender: TObject);
begin
  if sample2<>nil then
    Sample2.Play;
end;

procedure TfrmMain.btnPlaySample3Click(Sender: TObject);
begin
  if sample3<>nil then
    Sample3.Play;
end;

procedure TfrmMain.Button1Click(Sender: TObject);
var data:TFileStream;
begin
  if Sample1=nil then
    Sample1:=SoundCollection.Add as TSoundItem;

  data:=TFileStream.Create('samples/bark.wav',fmOpenRead);
  try
    Sample1.LoadFromStream(data);
  finally
    data.Free;
  end;
end;

procedure TfrmMain.Button2Click(Sender: TObject);
var data:TFileStream;
begin
  if Sample2=nil then
    Sample2:=SoundCollection.Add as TSoundItem;

  data:=TFileStream.Create('samples/horse.wav',fmOpenRead);
  try
    Sample2.LoadFromStream(data);
  finally
    data.Free;
  end;

end;

procedure TfrmMain.Button3Click(Sender: TObject);
var data:TFileStream;
begin
  if Sample3=nil then
    Sample3:=SoundCollection.Add as TSoundItem;

  data:=TFileStream.Create('samples/elephant.wav',fmOpenRead);
  try
    Sample3.LoadFromStream(data);
  finally
    data.Free;
  end;

end;

end.

I will not explain every event handler, because they all do basically the same thing. The handler btnLoadSample1Click(), as its name implies, loads sound data from a WAV file into Sample1. Before calling LoadFromFile(), we check whether the Sample1 instance has already been created; if it hasn't, we create one by calling the Add method of TSoundCollection.

procedure TfrmMain.btnLoadSample1Click(Sender: TObject);
begin
  if Sample1=nil then
    Sample1:=SoundCollection.Add as TSoundItem;

  Sample1.LoadFromFile('samples/bark.wav');
end;

Button1Click() does the same thing as btnLoadSample1Click(), i.e. it loads sound data into a segment, but it does so in a different way: it loads the sound data from a stream. The btnPlaySample1Click() procedure is pretty straightforward; it calls the Play method of Sample1 if Sample1 is not nil.
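
The sample form has no button for stopping a sound. If you add one, its handler could look like the sketch below; the button name btnStopSample1 is hypothetical, and the handler would also need to be declared in the TfrmMain class.

```pascal
// Hypothetical handler for an extra button named btnStopSample1.
procedure TfrmMain.btnStopSample1Click(Sender: TObject);
begin
  // Stop wraps StopEx, so playback of Sample1 ends immediately.
  if Sample1<>nil then
    Sample1.Stop;
end;
```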

Conclusion

In this article, we have discussed how to set up the performance object and the loader object. We also discussed how to load and set up a segment and play it simultaneously with other sounds. As you can see, it's all quite easy, though it might seem complicated at first. Hopefully this article helps you and has not scared you away from game programming. See you in the next topic.
