Wednesday, March 7, 2012

Implementing a DirectShow H264 source filter

After not finding a suitable DirectShow source filter able to render raw H.264 files, we decided to roll our own (available at the Video Processing Project). I'm sure that many developers have written one of these, and perhaps it's time to stop reinventing the wheel. Should anyone want to contribute improvements or extensions to this filter, please drop us a line.

The H.264 reference software allows one to take a YUV file and encode it into a .264 file. These .264 files consist of a sequence of NAL units, each prepended with a start code (0x00000001, or the shorter three-byte form 0x000001). A source filter thus has to read one of these files, break it up into separate NAL units, and then pass one frame at a time to the decoder. Windows 7 features a built-in H.264 decoder.
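The splitting step can be sketched in portable C++ as shown below. This is a minimal illustration and not the code used in the filter: the names NalUnitSpan and SplitAnnexB are hypothetical, and the sketch assumes the whole file has already been read into memory. It accepts both the four-byte and the three-byte start code, and it relies on the rule that a NAL unit never ends in a zero byte, so zeros in front of a start code are treated as part of that start code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Location of one NAL unit inside the buffer (start code excluded).
struct NalUnitSpan
{
  std::size_t offset; // index of the NAL unit header byte
  std::size_t size;   // number of bytes in the NAL unit
};

// Hypothetical helper: finds all NAL units in a start-code delimited buffer.
std::vector<NalUnitSpan> SplitAnnexB(const std::vector<std::uint8_t>& data)
{
  std::vector<NalUnitSpan> units;
  const std::size_t n = data.size();
  bool open = false;      // true once the first start code has been seen
  std::size_t start = 0;  // first byte of the NAL unit currently being scanned

  // Closes the current NAL unit at position 'end', dropping trailing zero
  // bytes (the leading zero of a four-byte start code, or padding).
  auto close = [&](std::size_t end)
  {
    while (end > start && data[end - 1] == 0) --end;
    if (end > start) units.push_back({start, end - start});
  };

  std::size_t i = 0;
  while (i + 3 <= n)
  {
    if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1)
    {
      if (open) close(i);
      start = i + 3;  // the NAL unit begins right after the 00 00 01 pattern
      open = true;
      i += 3;
    }
    else
    {
      ++i;
    }
  }
  if (open && start < n) close(n);  // last NAL unit runs to the end of the file
  return units;
}
```

Each returned span points at a NAL unit header byte, so the unit type can be read directly from data[span.offset]. Grouping the NAL units into frames before delivering them to the decoder is a separate step that this sketch does not cover.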

The IFileSourceFilter interface is implemented to facilitate loading of .264 files. This causes GraphEdit/GraphStudio to display a dialog box in which one can select the desired .264 file.

One of the first steps in writing a source filter is to provide the correct output pin media type, which allows the DirectShow framework to render the graph. In this case, MEDIASUBTYPE_H264 was selected since using it requires the least amount of effort. The implementation of GetMediaType looks as follows:


  HRESULT H264OutputPin::GetMediaType(CMediaType *pMediaType)
  {
    CAutoLock cAutoLock(m_pFilter->pStateLock());
    CheckPointer(pMediaType, E_POINTER);

    pMediaType->InitMediaType();    
    pMediaType->SetType(&MEDIATYPE_Video);
    pMediaType->SetSubtype(&MEDIASUBTYPE_H264);
    pMediaType->SetFormatType(&FORMAT_VideoInfo2);
    VIDEOINFOHEADER2* pvi2 = (VIDEOINFOHEADER2*)pMediaType->AllocFormatBuffer(
                                                                       sizeof(VIDEOINFOHEADER2));
    if (pvi2 == NULL) return E_OUTOFMEMORY;
    ZeroMemory(pvi2, sizeof(VIDEOINFOHEADER2));
    pvi2->bmiHeader.biBitCount = 24;
    // biSize is the size of the header structure itself (40 bytes), not of the image
    pvi2->bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    pvi2->bmiHeader.biPlanes = 1;
    pvi2->bmiHeader.biWidth = m_pFilter->m_iWidth;
    pvi2->bmiHeader.biHeight = m_pFilter->m_iHeight;
    // DIBSIZE derives the image size from biWidth, biHeight and biBitCount
    pvi2->bmiHeader.biSizeImage = DIBSIZE(pvi2->bmiHeader);
    // FourCC "avc1"; same value as the multi-character constant '1cva' under MSVC
    pvi2->bmiHeader.biCompression = MAKEFOURCC('a', 'v', 'c', '1');
    const REFERENCE_TIME FPS_25 = UNITS / 25;
    pvi2->AvgTimePerFrame = FPS_25;
    SetRect(&pvi2->rcSource, 0, 0, m_pFilter->m_iWidth, m_pFilter->m_iHeight);
    pvi2->rcTarget = pvi2->rcSource;
    pvi2->dwPictAspectRatioX = m_pFilter->m_iWidth;
    pvi2->dwPictAspectRatioY = m_pFilter->m_iHeight;
    return S_OK;
  }

This code is sufficient to allow DirectShow to insert the Windows H.264 decoder into the pipeline. Here the width and height seem to be of little importance, since they are in any case communicated in the (H.264) Sequence Parameter Set. The parameter sets are found by scanning the .264 file for the appropriate NAL units. The NAL unit type is stored in the low five bits of the NAL unit header, the first byte after the start code, and can be extracted as follows:

  unsigned char uiNalUnitType = nalUnitHeader & 0x1f;


Sequence parameter sets have type 7, picture parameter sets type 8, and IDR slices type 5. One typically needs to pass these to the decoder before other encoded frames.
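As an illustration of how these values might be used while scanning the file, here is a small sketch. The enum and function names are assumptions made for this example and are not taken from the filter source; only the mask and the three type values come from the text above.

```cpp
#include <cstdint>

// NAL unit type values mentioned above.
enum NalUnitType : std::uint8_t
{
  NAL_IDR_SLICE = 5, // slice of an IDR picture
  NAL_SPS       = 7, // sequence parameter set
  NAL_PPS       = 8  // picture parameter set
};

// The type occupies the low five bits of the NAL unit header byte.
inline std::uint8_t GetNalUnitType(std::uint8_t nalUnitHeader)
{
  return static_cast<std::uint8_t>(nalUnitHeader & 0x1f);
}

// True for the NAL units that carry a sequence or picture parameter set.
inline bool IsParameterSet(std::uint8_t nalUnitHeader)
{
  const std::uint8_t type = GetNalUnitType(nalUnitHeader);
  return type == NAL_SPS || type == NAL_PPS;
}
```

For example, a header byte of 0x67 yields type 7 (a sequence parameter set) and 0x65 yields type 5 (an IDR slice).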

In closing, the filter is currently also able to output a custom media type, namely MEDIASUBTYPE_H264M. This makes it easy to test our own H.264 decoder filter. In the property pages of the source filter, one can select what the output media type of the H.264 source filter should be. It should be noted that our H.264 decoder has limitations regarding the implemented parts of the specification, as described in this post.

Should you wish to be able to drag and drop .264 files into GraphStudio, run the registry scripts in the videoprocessing\Projects\Win32\Launch directory. Unfortunately, these have only been tested on Windows 7.

Improvements/suggestions/corrections are of course welcome!
