I am working on a hardware-accelerated H.264 decoder using the Media Foundation Source Reader, but have run into a problem. I followed this tutorial and relied on the Windows SDK Media Foundation samples.
My application works fine when hardware acceleration is disabled, but it does not provide the performance I need. When I turn on acceleration by passing an IMFDXGIDeviceManager in the IMFAttributes used to create the reader, things get more complicated.
If I create the ID3D11Device using the D3D_DRIVER_TYPE_NULL driver, the application works fine and the frames are processed faster than in software mode, but judging by the CPU and GPU usage, it still does most of the processing on the CPU.
On the other hand, when I create the ID3D11Device using the D3D_DRIVER_TYPE_HARDWARE driver and run the application, one of these four things can happen.
1. I get an unpredictable number of frames (usually 1-3) before the IMFMediaBuffer::Lock function returns 0x887a0005, which is described as "The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action". When I call ID3D11Device::GetDeviceRemovedReason, I get 0x887a0020, which is described as "The driver encountered a problem and was put into the device removed state", which is not as helpful as I would like.
2. The application crashes in an external dll when IMFMediaBuffer::Lock is called. Which dll it is seems to depend on the GPU used. For the integrated Intel GPU it is igd10iumd32.dll, and for the Nvidia mobile GPU it is mfplat.dll. The message for this particular crash is: "Exception thrown at 0x53C6DB8C (mfplat.dll) in decoder_tester.exe: 0xC0000005: Access violation reading location 0x00000024". The addresses differ between runs, and the violation is sometimes a read and sometimes a write.
3. The graphics driver stops responding, the system freezes for a short time, and then the application either crashes as in case 2 or ends as in case 1.
4. The application works fine and processes all frames using hardware acceleration.
In most cases the outcome is 1 or 2; 3 and 4 are rare.
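For reference, the two error codes from case 1 can be decoded without pulling in the Windows headers. The following is only a minimal sketch: `split_hresult` and `dxgi_error_name` are hypothetical helper names of my own, the bit layout follows the HRESULT_FACILITY and HRESULT_CODE macros, and the table lists just a handful of DXGI codes with the values I know from winerror.h.

```cpp
#include <cstdint>
#include <string>

// The three fields of a 32-bit HRESULT: failure bit, facility, and code.
struct HresultParts {
    bool failed;             // top bit set means failure
    std::uint32_t facility;  // 13-bit facility, 0x87A for DXGI
    std::uint32_t code;      // low 16 bits
};

// Splits a raw HRESULT value into its fields (hypothetical helper).
HresultParts split_hresult(std::uint32_t hr)
{
    return { (hr >> 31) != 0, (hr >> 16) & 0x1FFFu, hr & 0xFFFFu };
}

// Returns the symbolic name for a few DXGI errors, or "unknown"
// for anything outside this deliberately short table.
std::string dxgi_error_name(std::uint32_t hr)
{
    switch (hr) {
    case 0x887A0001u: return "DXGI_ERROR_INVALID_CALL";
    case 0x887A0005u: return "DXGI_ERROR_DEVICE_REMOVED";
    case 0x887A0006u: return "DXGI_ERROR_DEVICE_HUNG";
    case 0x887A0007u: return "DXGI_ERROR_DEVICE_RESET";
    case 0x887A0020u: return "DXGI_ERROR_DRIVER_INTERNAL_ERROR";
    default:          return "unknown";
    }
}
```

With these helpers, 0x887a0005 (returned by Lock) maps to DXGI_ERROR_DEVICE_REMOVED and 0x887a0020 (returned by GetDeviceRemovedReason) maps to DXGI_ERROR_DRIVER_INTERNAL_ERROR, both in the DXGI facility (0x87A).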
Here is the CPU / GPU usage when processing without throttling in the different modes on my machine (Intel Core i5-6500 with HD Graphics 530, Windows 10 Pro).
- NULL - CPU: ~ 90%, GPU: ~ 15%
- HARDWARE - CPU: ~ 15%, GPU: ~ 60%
- SOFTWARE - CPU: ~ 40%, GPU: ~ 7%
I tested the application on three machines. All of them have integrated Intel GPUs (HD 4400, HD 4600, HD 530). One of them also has a switchable NVIDIA GPU (GF 840M). It behaves the same on all of them; the only difference is that it crashes in a different dll when the Nvidia GPU is used.
I have no previous experience with COM or DirectX, but all this is inconsistent and unpredictable, so to me it looks like memory corruption. However, I do not know where I am making a mistake. Could you help me find what I am doing wrong?
Below is the most minimal code example I could come up with. I am using Visual Studio Professional 2015 to compile it as a C++ project. I prepared defines for enabling hardware acceleration and for selecting the hardware driver. Comment them out to change the behavior. In addition, the code expects this video file to be present in the project directory.

    #include <cstdio>
    #include <cstdlib>
    #include <iostream>
    #include <string>

    #include <atlbase.h>
    #include <d3d11.h>
    #include <mfapi.h>
    #include <mfidl.h>
    #include <mfreadwrite.h>
    #include <windows.h>

    #pragma comment(lib, "d3d11.lib")
    #pragma comment(lib, "mf.lib")
    #pragma comment(lib, "mfplat.lib")
    #pragma comment(lib, "mfreadwrite.lib")
    #pragma comment(lib, "mfuuid.lib")

    // Comment out to decode without hardware acceleration.
    #define ENABLE_HW_ACCELERATION
    // Comment out to use D3D_DRIVER_TYPE_NULL instead of D3D_DRIVER_TYPE_HARDWARE.
    #define ENABLE_HW_DRIVER

    // Prints the system message for a failed HRESULT and aborts.
    void handle_result(HRESULT hr)
    {
        if (SUCCEEDED(hr))
            return;

        WCHAR message[512];
        FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS, nullptr, hr,
                      MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), message, ARRAYSIZE(message), nullptr);
        printf("%ls", message);
        abort();
    }

    int main(int argc, char** argv)
    {
        handle_result(CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE));
        handle_result(MFStartup(MF_VERSION));

        {
            CComPtr<IMFAttributes> attributes;
            handle_result(MFCreateAttributes(&attributes, 3));

    #if defined(ENABLE_HW_ACCELERATION)
            CComPtr<ID3D11Device> device;
            D3D_FEATURE_LEVEL levels[] = { D3D_FEATURE_LEVEL_11_1, D3D_FEATURE_LEVEL_11_0 };

    #if defined(ENABLE_HW_DRIVER)
            handle_result(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
                                            D3D11_CREATE_DEVICE_SINGLETHREADED | D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
                                            levels, ARRAYSIZE(levels), D3D11_SDK_VERSION, &device, nullptr, nullptr));
    #else
            handle_result(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_NULL, nullptr,
                                            D3D11_CREATE_DEVICE_SINGLETHREADED,
                                            levels, ARRAYSIZE(levels), D3D11_SDK_VERSION, &device, nullptr, nullptr));
    #endif

            // Hand the device to the source reader through a DXGI device manager.
            UINT token;
            CComPtr<IMFDXGIDeviceManager> manager;
            handle_result(MFCreateDXGIDeviceManager(&token, &manager));
            handle_result(manager->ResetDevice(device, token));

            handle_result(attributes->SetUnknown(MF_SOURCE_READER_D3D_MANAGER, manager));
            handle_result(attributes->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE));
            handle_result(attributes->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE));
    #else
            handle_result(attributes->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, TRUE));
    #endif

            CComPtr<IMFSourceReader> reader;
            handle_result(MFCreateSourceReaderFromURL(L"Rogue One - A Star Wars Story - Trailer.mp4", attributes, &reader));

            // Ask the reader to deliver decoded RGB32 frames.
            CComPtr<IMFMediaType> output_type;
            handle_result(MFCreateMediaType(&output_type));
            handle_result(output_type->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video));
            handle_result(output_type->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32));
            handle_result(reader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, output_type));

            unsigned int frame_count{};
            std::cout << "Started processing frames" << std::endl;

            while (true)
            {
                CComPtr<IMFSample> sample;
                DWORD flags;
                handle_result(reader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, nullptr, &flags, nullptr, &sample));

                if (flags & MF_SOURCE_READERF_ENDOFSTREAM || sample == nullptr)
                    break;

                std::cout << "Frame " << frame_count++ << std::endl;

                CComPtr<IMFMediaBuffer> buffer;
                BYTE* data;
                handle_result(sample->ConvertToContiguousBuffer(&buffer));
                handle_result(buffer->Lock(&data, nullptr, nullptr));

                // Use the frame here.

                buffer->Unlock();
            }

            std::cout << "Finished processing frames" << std::endl;
        }

        MFShutdown();
        CoUninitialize();

        return 0;
    }