Metal kernels misbehave on the new MacBook Pro (late 2016)

I am working on a macOS project that uses Swift and Metal to process images on the GPU. Last week I received a new 15-inch MacBook Pro (late 2016) and noticed something strange with my code: the kernels that were supposed to write to a texture did not seem to do so.

After a lot of searching, I found that the problem depends on which GPU Metal uses for the computation (AMD Radeon Pro 455 or Intel(R) HD Graphics 530).

MTLCopyAllDevices() returns an array of devices representing the Radeon and Intel GPUs, while MTLCreateSystemDefaultDevice() returns the default device, which is the Radeon. The code works as expected on the Intel GPU, but not on the Radeon GPU.
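To make the device choice explicit rather than relying on array order, the GPUs can be told apart by their properties. The following is a minimal sketch, not code from the original post: `firstDevice(lowPower:)` is a hypothetical helper, and it is written against the current Metal Swift API, where `name` is a non-optional `String`. It assumes an Intel-based Mac with automatic graphics switching, where `isLowPower` is true for the integrated GPU and false for the discrete one.

```swift
import Metal

// Hypothetical helper (not from the original post): returns the first GPU
// whose isLowPower flag matches the request, or nil if there is none.
func firstDevice(lowPower: Bool) -> MTLDevice? {
    return MTLCopyAllDevices().first { $0.isLowPower == lowPower }
}

// List every GPU Metal can see, with the flags that distinguish them.
for device in MTLCopyAllDevices() {
    print("\(device.name): lowPower=\(device.isLowPower), headless=\(device.isHeadless)")
}

// On a dual-GPU MacBook Pro these would be the Intel and Radeon devices.
let integrated = firstDevice(lowPower: true)
let discrete = firstDevice(lowPower: false)
print("integrated: \(integrated?.name ?? "none"), discrete: \(discrete?.name ?? "none")")
```

Selecting by `isLowPower` avoids depending on the index order of the array returned by MTLCopyAllDevices().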

Let me show you an example.

To get started, here is a simple kernel that takes an input texture and copies its color into the output texture:

    kernel void passthrough(texture2d<uint, access::read> inTexture [[texture(0)]],
                            texture2d<uint, access::write> outTexture [[texture(1)]],
                            uint2 gid [[thread_position_in_grid]])
    {
        uint4 out = inTexture.read(gid);
        outTexture.write(out, gid);
    }

To use this kernel, I run the following code:

    let devices = MTLCopyAllDevices()
    for device in devices {
        print(device.name!) // [0] -> "AMD Radeon Pro 455", [1] -> "Intel(R) HD Graphics 530"
    }

    let device = devices[0]
    let library = device.newDefaultLibrary()
    let commandQueue = device.makeCommandQueue()
    let passthroughKernelFunction = library!.makeFunction(name: "passthrough")
    let cps = try! device.makeComputePipelineState(function: passthroughKernelFunction!)
    let commandBuffer = commandQueue.makeCommandBuffer()
    let commandEncoder = commandBuffer.makeComputeCommandEncoder()
    commandEncoder.setComputePipelineState(cps)

    // Texture setup
    let width = 16
    let height = 16
    let byteCount = height*width*4
    let bytesPerRow = width*4
    let region = MTLRegionMake2D(0, 0, width, height)
    let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Uint, width: width, height: height, mipmapped: false)

    // inTexture
    var inData = [UInt8](repeating: 255, count: Int(byteCount))
    let inTexture = device.makeTexture(descriptor: textureDescriptor)
    inTexture.replace(region: region, mipmapLevel: 0, withBytes: &inData, bytesPerRow: bytesPerRow)

    // outTexture
    var outData = [UInt8](repeating: 128, count: Int(byteCount))
    let outTexture = device.makeTexture(descriptor: textureDescriptor)
    outTexture.replace(region: region, mipmapLevel: 0, withBytes: &outData, bytesPerRow: bytesPerRow)

    commandEncoder.setTexture(inTexture, at: 0)
    commandEncoder.setTexture(outTexture, at: 1)
    commandEncoder.dispatchThreadgroups(MTLSize(width: 1,height: 1,depth: 1), threadsPerThreadgroup: MTLSize(width: width, height: height, depth: 1))
    commandEncoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()

    // Get the data back from the GPU
    outTexture.getBytes(&outData, bytesPerRow: bytesPerRow, from: region , mipmapLevel: 0)

    // Validation
    // outData should be exactly the same as inData
    for (i,outElement) in outData.enumerated() {
        if outElement != inData[i] {
            print("Dest: \(outElement) != Src: \(inData[i]) at \(i))")
        }
    }

When I run this code with let device = devices[0] (Radeon GPU), outTexture is never written to (that is my guess), and as a result outData remains unchanged. On the other hand, when I run it with let device = devices[1] (Intel GPU), everything works as expected, and outData is updated with the values from inData.

1 answer

I think that whenever the GPU writes to an MTLStorageModeManaged resource, such as a texture, and you then want to read that resource from the CPU (for example, using getBytes()), you need to synchronize it using a blit encoder. Try putting the following lines before commandBuffer.commit() (and after commandEncoder.endEncoding()):

    let blitEncoder = commandBuffer.makeBlitCommandEncoder()
    blitEncoder.synchronize(outTexture)
    blitEncoder.endEncoding()

You can get away without this on the integrated GPU because it uses system memory for the resource, so there is nothing to synchronize.
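Putting the pieces together, here is a rough, self-contained sketch of the question's pipeline with the synchronization step added. It is not the asker's exact code. It uses the current Metal Swift API names (for example `setTexture(_:index:)` and `synchronize(resource:)`), which differ from the Swift 3-era names above. `runPassthrough(on:)` and `SketchError` are hypothetical names. The kernel is compiled from source at run time only to keep the example in one file, and the texture usage is set explicitly because this sketch writes to the texture from a shader.

```swift
import Metal

// Kernel from the question, compiled at run time so the sketch is one file.
let kernelSource = """
#include <metal_stdlib>
using namespace metal;
kernel void passthrough(texture2d<uint, access::read> inTexture [[texture(0)]],
                        texture2d<uint, access::write> outTexture [[texture(1)]],
                        uint2 gid [[thread_position_in_grid]]) {
    outTexture.write(inTexture.read(gid), gid);
}
"""

enum SketchError: Error { case setupFailed }  // hypothetical error type

// Hypothetical helper: copies a 16x16 texture filled with 255 into a second
// texture on `device` and returns the bytes read back from that texture.
func runPassthrough(on device: MTLDevice) throws -> [UInt8] {
    let library = try device.makeLibrary(source: kernelSource, options: nil)
    guard let function = library.makeFunction(name: "passthrough"),
          let queue = device.makeCommandQueue(),
          let commandBuffer = queue.makeCommandBuffer() else { throw SketchError.setupFailed }
    let pipeline = try device.makeComputePipelineState(function: function)

    let width = 16, height = 16, bytesPerRow = width * 4
    let region = MTLRegionMake2D(0, 0, width, height)
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba8Uint, width: width, height: height, mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]  // the kernel reads and writes
    guard let inTexture = device.makeTexture(descriptor: descriptor),
          let outTexture = device.makeTexture(descriptor: descriptor) else {
        throw SketchError.setupFailed
    }

    let inData = [UInt8](repeating: 255, count: bytesPerRow * height)
    var outData = [UInt8](repeating: 128, count: bytesPerRow * height)
    inTexture.replace(region: region, mipmapLevel: 0, withBytes: inData, bytesPerRow: bytesPerRow)
    outTexture.replace(region: region, mipmapLevel: 0, withBytes: outData, bytesPerRow: bytesPerRow)

    guard let compute = commandBuffer.makeComputeCommandEncoder() else { throw SketchError.setupFailed }
    compute.setComputePipelineState(pipeline)
    compute.setTexture(inTexture, index: 0)
    compute.setTexture(outTexture, index: 1)
    compute.dispatchThreadgroups(MTLSize(width: 1, height: 1, depth: 1),
                                 threadsPerThreadgroup: MTLSize(width: width, height: height, depth: 1))
    compute.endEncoding()

    // Managed textures keep separate CPU and GPU copies, so the GPU's writes
    // must be synchronized back before the CPU reads them.
    if outTexture.storageMode == .managed, let blit = commandBuffer.makeBlitCommandEncoder() {
        blit.synchronize(resource: outTexture)
        blit.endEncoding()
    }

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    outTexture.getBytes(&outData, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)
    return outData
}

if let device = MTLCreateSystemDefaultDevice() {
    let bytes = try runPassthrough(on: device)
    print("copied correctly:", bytes.allSatisfy { $0 == 255 })
}
```

Checking `storageMode` before encoding the blit keeps the same code path usable on GPUs where the texture is not managed and no synchronization is needed.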


Source: https://habr.com/ru/post/1260302/

