DirectShow Tips: Splitting Audio - Video Data from Movie file (AVI)

Zamrony P Juhara
22 November 2006 15:26:00
This tutorial shows how to split the audio and video data of a movie file such as AVI using DirectShow and Delphi.

Introduction

I had a video clip of a song, but my MP3 player could not play video files, so I wanted to extract the audio data from that file and save it as an MP3 file. That got me thinking about how to split the audio and video data of a movie file. After digging through the DirectShow documentation, I finally figured out how to do it. In this article, I will only cover AVI files.

Prerequisite

This article continues the article DirectShow Tips: Developing WAV - MP3 Converter. We are going to use the framework we built there, and we will improve and add features to the unit uDirectShowPlayer.pas. If you haven't read that article, I suggest you read it first, because here I will only explain the construction of the filter graph for splitting audio and video.

Software needed

We need the following software:

  • DirectX version 8 or newer.
  • DirectX header conversion for Delphi. You can download it here (DirectX 9) or here (DirectX 8.1). I use the DirectX 8.1 headers.
  • Delphi compiler.
  • MP3 codec and Indeo® video 5.10 codec. The Indeo® video 5.10 codec is already included in the Windows XP installation. To check whether you have it installed, open Control Panel » Sounds and Audio Devices » Hardware tab. Under Devices, select Video Codecs and click the Properties button. In the Video Codecs Properties window, look for Indeo® video 5.10. If you can't find it, you need to download and install it; many websites that offer video codec downloads have it.
  • wavdest.ax filter. This filter is included in the download distribution.

Filter Graph for Separating Audio - Video from an AVI File.

To split the audio and video data of an AVI file, we need several filters: a File Source filter to read the AVI file, an AVI Splitter filter to split the AVI stream into an audio stream and a video stream, and an AVI Decompressor filter to decode the video stream. For the audio branch, we need the MPEG Layer-3 filter to compress the audio data into MP3 data. Its output is then sent to WAV Dest, which turns it into a stream that the File Writer filter stores in a file.

Figure 1. Filter graph for separating audio - video data from an AVI file.

The output of the AVI Decompressor is a huge amount of uncompressed video frame data, so we pass it to a video compressor filter, Indeo video 5.10, to compress it before it is sent to the AVI Mux filter. AVI Mux is a multiplexer filter that combines and synchronizes video and audio. Pin Input 01 accepts video data and pin Input 02 accepts audio data. We leave pin Input 02 unconnected because we want AVI Out to contain video data only. The resulting AVI stream is sent to File Writer to be stored in a file.
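
As a rough outline of the graph described above, the two branches can be listed as plain chains. This is only an illustrative console program (the program name and the PrintBranch helper are made up for this sketch); it creates no DirectShow objects and simply prints the filter order taken from the text:

```pascal
program FilterGraphOutline;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

const
  // Both branches share the File Source and AVI Splitter filters.
  // Each branch ends in its own File Writer instance.
  VideoBranch: array[0..5] of string =
    ('File Source', 'AVI Splitter', 'AVI Decompressor',
     'Indeo video 5.10', 'AVI Mux', 'File Writer');
  AudioBranch: array[0..4] of string =
    ('File Source', 'AVI Splitter', 'MPEG Layer-3',
     'WAV Dest', 'File Writer');

// Prints the filter names of one branch separated by arrows.
procedure PrintBranch(const Title: string; const Names: array of string);
var
  i: Integer;
begin
  Write(Title, ': ');
  for i := Low(Names) to High(Names) do
  begin
    if i > Low(Names) then
      Write(' -> ');
    Write(Names[i]);
  end;
  Writeln;
end;

begin
  PrintBranch('Video', VideoBranch);
  PrintBranch('Audio', AudioBranch);
end.
```

The video branch has one more stage than the audio branch because the decoded frames are compressed again before they reach the multiplexer.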

Design

We will encapsulate the extraction process in TBasicAudioVideoExtractor, derived from the TBasicPlayer class. This is the base class for extracting audio and video. The AVI-specific part is encapsulated in the TAVIAudioVideoExtractor class, derived from TBasicAudioVideoExtractor.

Figure 2. UML diagram of the encapsulation of audio - video extraction.

TBasicAudioVideoExtractor adds three properties, SrcFilename, AudioFilename and VideoFilename, which hold the source filename, the audio output filename and the video output filename respectively. We also add an Extract() method that wraps the BuildFilterGraph() and Run() calls.

Implementation

Fixing the GetPin() Bug.

The previous TBasicPlayer.GetPin() method had an annoying bug: requesting the pin at a particular index did not always return the correct IPin interface. The bug went unnoticed in the previous demo because we always retrieved the first pin, for which the correct pin was returned. The code below is the improved GetPin().

function TBasicPlayer.GetPin(aFilter: IBaseFilter;
  const PinDir: TPin_Direction; const indx: integer): IPin;
var pPins:IEnumPins;
    ctr:integer;
    aPin:IPin;
    CurPinDir:TPin_Direction;
begin
  result:=nil;
  //start pin enumeration
  aFilter.EnumPins(pPins);
  if pPins<>nil then
  begin
    //pin enumeration ok
    ctr:=0;
    while pPins.Next(1,aPin,nil)=S_OK do
    begin
      aPin.QueryDirection(curPinDir);
      if (curPinDir=PinDir) then
      begin
        if (ctr=indx) then
        begin
          result:=aPin;
          exit;
        end;
        inc(ctr);
      end
    end;
  end;
end;
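
The indexing rule used by GetPin() above, where only pins whose direction matches are counted, can be tried out without DirectShow. The following is a simplified model, not the article's code: the pin layout, the TPinDir type and FindPin are hypothetical names introduced only for this sketch.

```pascal
program PinIndexModel;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  TPinDir = (pdInput, pdOutput);

const
  // Hypothetical filter with one input pin followed by two output pins.
  Pins: array[0..2] of TPinDir = (pdInput, pdOutput, pdOutput);

// Returns the position in Pins of the Indx-th pin with direction Dir,
// or -1 when the filter has no such pin.
function FindPin(Dir: TPinDir; Indx: Integer): Integer;
var
  i, ctr: Integer;
begin
  Result := -1;
  ctr := 0;
  for i := Low(Pins) to High(Pins) do
    if Pins[i] = Dir then
    begin
      // ctr advances only for pins of the requested direction
      if ctr = Indx then
      begin
        Result := i;
        Exit;
      end;
      Inc(ctr);
    end;
end;

begin
  Writeln(FindPin(pdOutput, 0)); // prints 1
  Writeln(FindPin(pdOutput, 1)); // prints 2
  Writeln(FindPin(pdInput, 1));  // prints -1
end.
```

Because the counter ignores pins of the other direction, index 1 for the output direction means "second output pin" regardless of how many input pins the enumerator returns first.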

Adding Connection to a Particular Pin to the ConnectFilter Method.

The old ConnectFilter() always connected OutFilter and InFilter through the first output pin and the first input pin. The new ConnectFilter() can connect any output pin to any input pin we want. So that old code compiles without changes, OutIndx and InIndx are declared as default parameters with a default value of 0.

function TBasicPlayer.ConnectFilter(OutFilter,
  InFilter: IBaseFilter;
  const OutIndx:integer=0;
  const InIndx:integer=0): HResult;
var outpin,inPin:IPin;
begin
  outPin:=GetPin(OutFilter,PINDIR_OUTPUT,OutIndx);
  inPin:=GetPin(InFilter,PINDIR_INPUT,InIndx);
  result:=FFilterGraph.Connect(outPin,inPin);
end;

Implementation of TBasicAudioVideoExtractor.

The TBasicAudioVideoExtractor class is declared as follows:

     TBasicAudioVideoExtractor=class(TBasicPlayer)
     private
       FAudioFilename: string;
       FVideoFilename: string;
       FSrcFilename: string;
       procedure SetAudioFilename(const Value: string);
       procedure SetSrcFilename(const Value: string);
       procedure SetVideoFilename(const Value: string);
     published
       property SrcFilename:string read FSrcFilename write SetSrcFilename;
       property AudioFilename:string read FAudioFilename write SetAudioFilename;
       property VideoFilename:string read FVideoFilename write SetVideoFilename;
     public
       function Extract:boolean;
     end;

The code snippet below is the complete implementation:

{ TBasicAudioVideoExtractor }

function TBasicAudioVideoExtractor.Extract:boolean;
begin
  try
    BuildFilterGraph;
    Run;
    result:=true;
  except
    RemoveAllFilters;
    result:=false;
  end;
end;

procedure TBasicAudioVideoExtractor.SetAudioFilename(const Value: string);
begin
  FAudioFilename := Value;
end;

procedure TBasicAudioVideoExtractor.SetSrcFilename(const Value: string);
begin
  FSrcFilename := Value;
end;

procedure TBasicAudioVideoExtractor.SetVideoFilename(const Value: string);
begin
  FVideoFilename := Value;
end;

Implementation of TAVIAudioVideoExtractor.

We only override BuildFilterGraph and fill it with the code that constructs the filter graph for audio - video extraction from an AVI file.

     TAVIAudioVideoExtractor=class(TBasicAudioVideoExtractor)
     public
       procedure BuildFilterGraph;override;
     end;

The complete code of BuildFilterGraph is shown in the following snippet:

procedure TAVIAudioVideoExtractor.BuildFilterGraph;
var pMP3FileSink,pAVIFileSink:IFileSinkFilter;
    pFileSource:IFileSourceFilter;

    afileReader,aAVISplitter,aWaveDest,
    aMPEGLayer3,
    aMP3FileWriter,
    aAVIFileWriter,
    aAVIMux,
    aAVIDecompressor,
    aAVICompressor:IBaseFilter;

    aSrcFilename,aDestFilename:widestring;

    hr:HResult;
begin
  if (FFilterGraph<>nil) then
  begin
    //add file reader filter
    afileReader:=AddFilterByCLSID(CLSID_AsyncReader,'File Reader');
    //add AVI splitter filter
    aAVISplitter:=AddFilterByCLSID(CLSID_AVISplitter,'AVI Splitter');

    //Add MPEG Layer 3 Encoder
    aMPEGLayer3:=FindFilterByFriendlyName(CLSID_AudioCompressorCategory,
                                         'MPEG Layer-3');
    //only add the encoder to the graph when it was actually found
    if aMPEGLayer3<>nil then
      FFilterGraph.AddFilter(aMPEGLayer3,'MPEG Layer-3');

    //add Wave Dest filter
    awaveDest:=AddFilterByCLSID(CLSID_WaveDest,'Wave Dest');
    //add MP3 file writer filter
    aMP3FileWriter:=AddFilterByCLSID(CLSID_FileWriter,'MP3 File Writer');
    //add AVI file writer filter
    aAVIFileWriter:=AddFilterByCLSID(CLSID_FileWriter,'AVI File Writer');
    //add AVI Mux filter
    aAVIMux:=AddFilterByCLSID(CLSID_AVIDest,'AVI Mux');

    //add AVI Decompressor filter
    aAVIDecompressor:=AddFilterByCLSID(CLSID_AVIDec,'AVI Decompressor');
    //Add Indeo video 5.10 Encoder
    aAVICompressor:=FindFilterByFriendlyName(CLSID_VideoCompressorCategory,
                                         'Indeo® video 5.10');
    //only add the encoder to the graph when it was actually found
    if aAVICompressor<>nil then
      FFilterGraph.AddFilter(aAVICompressor,'Indeo video 5.10');

    if (afileReader<>nil) and
       (aAVISplitter<>nil) and
       (aMPEGLayer3<>nil) and
       (aWaveDest<>nil) and
       (aMP3FileWriter<>nil) and
       (aAVIFileWriter<>nil) and
       (aAVIMux<>nil)  and
       (aAVIDecompressor<>nil) and
       (aAVICompressor<>nil) then
    begin
      //get the IFileSourceFilter instance
      afileReader.QueryInterface(IID_IFileSourceFilter,pFileSource);
      if pFileSource<>nil then
      begin
        aSrcFilename:=FSrcFilename;
        pFileSource.Load(PWideChar(aSrcFilename),nil);
      end;

      //connect the file reader output pin to
      //the AVI splitter input pin
      hr:=ConnectFilter(aFileReader,aAVISplitter);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect File Reader to AVI Splitter. ');

      //connect the AVI splitter output pin to
      //the AVI Decompressor input pin
      hr:=ConnectFilter(aAVISplitter,aAVIDecompressor);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect AVI Splitter to AVI Decompressor. ');

      {--------begin video extractor----------}

      //connect the AVI Decompressor output pin to
      //the AVI Compressor input pin
      hr:=ConnectFilter(aAVIDecompressor,aAVICompressor);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect AVI Decompressor to AVI Compressor. ');

      //connect the AVI Compressor output pin to
      //the AVI Mux input pin
      hr:=ConnectFilter(aAVICompressor,aAVIMux);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect AVI Compressor to AVI Mux. ');

      //get the IFileSinkFilter instance
      aAVIFileWriter.QueryInterface(IID_IFileSinkFilter,pAVIFileSink);
      if pAVIFileSink<>nil then
      begin
        aDestFilename:=FVideoFilename;
        pAVIFileSink.SetFileName(PWideChar(aDestFilename),nil);
      end;

      //connect the AVI Mux output pin to
      //the AVI File Writer input pin
      hr:=ConnectFilter(aAVIMux,aAVIFileWriter);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect AVI Mux to AVI File Writer. ');

      {--------end video extractor----------}
      {--------begin audio extractor----------}

      //connect the second AVI splitter output pin to
      //the MPEG Layer 3 input pin
      hr:=ConnectFilter(aAVISplitter,aMPEGLayer3,1,0);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect AVI Splitter to MPEG Layer 3. ');


      //connect the MPEG Layer 3 output pin to
      //the Wave Dest input pin
      hr:=ConnectFilter(aMPEGLayer3,aWaveDest);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect MPEG Layer 3 to Wave Dest. ');

      //get the IFileSinkFilter instance
      aMP3fileWriter.QueryInterface(IID_IFileSinkFilter,pMP3FileSink);
      if pMP3FileSink<>nil then
      begin
        aDestFilename:=FAudioFilename;
        pMP3FileSink.SetFileName(PWideChar(aDestFilename),nil);
      end;

      //connect the Wave Dest output pin to
      //the File Writer input pin
      hr:=ConnectFilter(aWaveDest,aMP3FileWriter);
      if hr<>S_OK then
        RaiseDirectShowException(hr,'Failed to connect Wave Dest to File Writer. ');

      {--------end audio extractor----------}
    end;
  end;
end;

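I think it's not too hard to understand; the comments explain how it works step by step. Note that the code assumes the AVI Splitter exposes the video stream on its first output pin (index 0) and the audio stream on its second output pin (index 1).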
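
BuildFilterGraph repeats the same "connect, then raise on failure" pattern for every pair of filters. One possible way to shorten it is a small helper method. The sketch below is not part of the original unit: ConnectOrRaise is a hypothetical name, it would also have to be declared in the TAVIAudioVideoExtractor class, and it assumes that ConnectFilter and RaiseDirectShowException have the signatures used in the listing above.

```pascal
// Hypothetical helper, not part of the original uDirectShowPlayer.pas.
// Connects output pin OutIndx of OutFilter to input pin InIndx of InFilter
// and raises a DirectShow exception with ErrMsg when the connection fails.
procedure TAVIAudioVideoExtractor.ConnectOrRaise(OutFilter,
  InFilter: IBaseFilter;
  const OutIndx, InIndx: integer;
  const ErrMsg: string);
var hr:HResult;
begin
  hr:=ConnectFilter(OutFilter,InFilter,OutIndx,InIndx);
  //same check as in BuildFilterGraph: anything other than S_OK is an error
  if hr<>S_OK then
    RaiseDirectShowException(hr,ErrMsg);
end;
```

With such a helper, the audio branch connection would read `ConnectOrRaise(aAVISplitter,aMPEGLayer3,1,0,'Failed to connect AVI Splitter to MPEG Layer 3. ');`, which keeps the pin indexes and the error text on one line.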

Main Application Design.

Figure 3. Application user interface.

Create a new application and drop controls onto the form so that it looks like the screenshot above. Rename each control (see the declaration below) and add a variable of type TAVIAudioVideoExtractor.

Main Application Implementation.

Flesh out the event handler of each control so that the unit looks like the following code:

unit ufrmMain;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls,DirectShow,uDirectShowPlayer, ComCtrls, ExtCtrls;

type
  TfrmMain = class(TForm)
    grpbxSource: TGroupBox;
    lblFilename: TLabel;
    edSourceFilename: TEdit;
    grpbxDestination: TGroupBox;
    edAudioFile: TEdit;
    edVideoFile: TEdit;
    lblAudioFile: TLabel;
    lblVideoFile: TLabel;
    btnBrowse: TButton;
    btnBrowseAudio: TButton;
    btnBrowseVideo: TButton;
    btnExtract: TButton;
    btnClose: TButton;
    OpenDialog1: TOpenDialog;
    SaveVideoDialog: TSaveDialog;
    SaveAudioDialog: TSaveDialog;
    ProgressBar1: TProgressBar;
    progressTime: TTimer;
    procedure btnExtractClick(Sender: TObject);
    procedure btnBrowseClick(Sender: TObject);
    procedure btnBrowseAudioClick(Sender: TObject);
    procedure btnBrowseVideoClick(Sender: TObject);
    procedure btnCloseClick(Sender: TObject);
    procedure progressTimeTimer(Sender: TObject);
  private
    extractor:TAVIAudioVideoExtractor;
    { Private declarations }
    procedure WM_MMNotify(var msg: TMessage);message WM_MMNOTIFY;
  public
    constructor Create(AOwner:TComponent);override;
    destructor Destroy;override;
    { Public declarations }
  end;

var
  frmMain: TfrmMain;

implementation


{$R *.dfm}

{ TfrmMain }

constructor TfrmMain.Create(AOwner: TComponent);
begin
  inherited;
  extractor:=TAVIAudioVideoExtractor.Create;
  extractor.Handle:=Handle;
end;

destructor TfrmMain.Destroy;
begin
  extractor.Free;
  inherited;
end;

procedure TfrmMain.WM_MMNotify(var msg: TMessage);
var aplayer:TBasicPlayer;
    evCode,param1,param2:integer;
begin
  aplayer:=TBasicPlayer(msg.LParam);
  aplayer.EventObj.GetEvent(evCode,param1,param2,0);
  case evCode of
    EC_COMPLETE:begin
                  aplayer.RemoveAllFilters;
                  btnExtract.Enabled:=true;
                  progressbar1.Position:=0;
                  progressTime.Enabled:=false;
                end;
  end;
  aplayer.EventObj.FreeEventParams(evCode,param1,param2);
end;

procedure TfrmMain.btnExtractClick(Sender: TObject);
begin
  btnExtract.Enabled:=false;

  extractor.SrcFilename:=edSourceFilename.Text;
  extractor.AudioFilename:=edAudioFile.Text;
  extractor.VideoFilename:=edVideoFile.Text;

  if (extractor.Extract) then
  begin
    progressbar1.Max:=extractor.Duration;
    progressbar1.Min:=0;
    progressbar1.Position:=0;
    progressTime.Enabled:=true;
  end else
    btnExtract.Enabled:=true;
end;

procedure TfrmMain.btnBrowseClick(Sender: TObject);
begin
  if OpenDialog1.Execute then
    edSourceFilename.Text:=OpenDialog1.FileName;
end;

procedure TfrmMain.btnBrowseAudioClick(Sender: TObject);
begin
  if SaveAudioDialog.Execute then
    edAudioFile.Text:=SaveAudioDialog.FileName;
end;

procedure TfrmMain.btnBrowseVideoClick(Sender: TObject);
begin
  if SaveVideoDialog.Execute then
    edVideoFile.Text:=SaveVideoDialog.FileName;
end;

procedure TfrmMain.btnCloseClick(Sender: TObject);
begin
  Close;
end;

procedure TfrmMain.progressTimeTimer(Sender: TObject);
begin
  progressbar1.Position:=extractor.Position.Current;
end;

end.
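
The WM_MMNotify handler above reads a single event per notification. Several events can be queued for one notification, so the usual DirectShow pattern is to call GetEvent in a loop until it fails. A possible variant is sketched below, under the assumptions that EventObj exposes GetEvent and FreeEventParams with the same parameter types as in the handler above and that Succeeded comes from the Windows unit already listed in the uses clause.

```pascal
procedure TfrmMain.WM_MMNotify(var msg: TMessage);
var aplayer:TBasicPlayer;
    evCode,param1,param2:integer;
    completed:boolean;
begin
  aplayer:=TBasicPlayer(msg.LParam);
  completed:=false;
  //drain the event queue; with a timeout of 0, GetEvent
  //fails as soon as no more events are waiting
  while Succeeded(aplayer.EventObj.GetEvent(evCode,param1,param2,0)) do
  begin
    if evCode=EC_COMPLETE then
      completed:=true;
    //release the parameters of every event we retrieved
    aplayer.EventObj.FreeEventParams(evCode,param1,param2);
  end;
  //tear down the graph only after all events have been released
  if completed then
  begin
    aplayer.RemoveAllFilters;
    btnExtract.Enabled:=true;
    progressbar1.Position:=0;
    progressTime.Enabled:=false;
  end;
end;
```

Deferring RemoveAllFilters until after the loop means the event parameters are always freed while the graph is still intact.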

To download the source code of the demo, please click here.

Summary.

In this article, we discussed how to construct a filter graph for splitting audio and video from an AVI file with DirectShow. We also implemented wrapper classes that make this audio - video extraction easier to do.
