Metal backend refactor #893

aleph1 · 2025-02-28T08:48:57Z

This is almost ready to integrate, but there is one outstanding issue.

There is a conflict between two functions, kinc_g5_command_list_set_vertex_constant_buffer and kinc_g5_command_list_set_vertex_buffers, in that they both attempt to set a vertex buffer at index 0.

I added some logging on my local build, and I think the problem stems from the fact the the functions are called in this order:

kinc_g5_command_list_set_vertex_buffers (writes to buffers 0...n)
kinc_g5_command_list_set_vertex_constant_buffer (writes to buffer 0, overwriting the previously set vertex buffer)

The Vulkan backend deals with this by adding three variables static uint32_t lastVertexConstantBufferOffset, static uint32_t lastFragmentConstantBufferOffset, static uint32_t lastComputeConstantBufferOffset and then alters the code in commandlist.m.h in some pretty major ways. I guess it works, but I haven't tried the Vulkan backend as I am on Mac.

I haven't delved into the whole kinc rendering pipeline so I am not sure if it is easy to switch the order that kinc_g5_command_list_set_vertex_buffers and kinc_g5_command_list_set_vertex_constant_buffer are called in, or if it might have a negative effect somewhere else. I also don't know if it will fully fix the issue as I think the kinc_g5_command_list_set_vertex_buffers if it wrote to indexes 1...n I would still need to account for the buffer size of the vertex constant buffer. I couldn't see a way to access that from outside G4.c.h. Is it possible to add a function to retrieve this?

If you want to see the instancing sample working you have to disable the constant buffer being set. You can do that by commenting out the last line of the kinc_g5_command_list_set_vertex_constant_buffer function ([encoder setVertexBuffer:buf offset:offset atIndex:0];).

Now uses all buffers like in the Vulkan backend.

Correctly assembles vertexDescriptor with correct offsets and strides, as well as perInstance and perVertex step functions.

aleph1 · 2025-02-28T23:25:31Z

I have discovered what the other issue is with instancing and it has to do with how vertex fragments constants seem to be handled. The current metal backend has the constant index hardcoded as 1, which is what the outputted metal code uses as the expected buffer index for uniform data. The problem is when we use instancing there are effectively two (or more) buffers assigned based on the generated memory layout.

For example, we create a buffer pipeline like this (I am using a single float as the only uniform as a “most simple” test case before moving to more complex uniform structures):

kinc_g4_vertex_structure_t structure;
kinc_g4_vertex_structure_init(&structure);
kinc_g4_vertex_structure_add(&structure, "pos", KINC_G4_VERTEX_DATA_F32_3X);
kinc_g4_vertex_structure_t structure_inst;
kinc_g4_vertex_structure_init(&structure_inst);
kinc_g4_vertex_structure_add(&structure_inst, "off", KINC_G4_VERTEX_DATA_F32_3X);
structure_inst.instanced = true;
kinc_g4_pipeline_init(&pipeline);
pipeline.vertex_shader = &vertex_shader;
pipeline.fragment_shader = &fragment_shader;
pipeline.input_layout[0] = &structure;
pipeline.input_layout[1] = &structure_inst;
pipeline.input_layout[2] = NULL;
kinc_g4_pipeline_compile(&pipeline);
tmp_loc = kinc_g4_pipeline_get_constant_location(&pipeline, "tmp");

And have shader code like this:

#version 450

in vec3 pos;
in vec3 off;

uniform float tmp;

void main() {
	gl_Position = vec4(pos.x + off.x + tmp, pos.y + off.y, 0.5, 1.0);
}

But the resulting kinc/make metal code expects the uniforms to be at buffer 1, but they have to be placed at buffer 2 due to the pipelines vertex structure.

// shader_vert_main

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct shader_vert_main_uniforms
{
    float tmp;
};

struct shader_vert_main_out
{
    float4 gl_Position [[position]];
};

struct shader_vert_main_in
{
    float3 off [[attribute(0)]];
    float3 pos [[attribute(1)]];
};

vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant shader_vert_main_uniforms& uniforms [[buffer(1)]])
{
    shader_vert_main_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + uniforms.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5;       // Adjust clip-space for Metal
    return out;
}

If I simply modify the compiled metal vertex shader and replace uniforms [[buffer(1)]] with uniforms [[buffer(2)]] it works as expected. Do you have any insight this as I would very much appreciate them?

aleph1 · 2025-02-28T23:59:51Z

This is a quick video illustrating the aforementioned issue of buffer indexes. If I manually adjust the compiled shader to point at the correct buffer index then everything works as expected and the triangles are drawn with instancing and they utilize a uniform to move.

kore-metal-fix.mp4

You can see how I am having to ensure the constants vertex buffer is set as the last vertex buffer, as index 1 might already be occupied when doing things like instancing. Is there a reason why index 1 was chose for the constants vertex buffer, when 0 could have been used, and then all other vertex buffers would simply use 1...n?

aleph1 · 2025-03-01T20:46:59Z

I did a bit more digging, and I think this might be considered a krafix issue.

https://github.com/Kode/krafix/blob/43f9df54d0624b63e7c98a4019b60d07bc8c9202/Sources/MetalTranslator.cpp#L386-L390

The issue seems to boil down to the fact that when Metal vertex buffers user per vertex and per instance step functions in a single memory layout, they take up more than one buffer slot. I think that is why in my test case changing the compiled metal vertex shader input functions uniforms [[buffer(1)]] with uniforms [[buffer(2)]] fixes things. Though I'm not sure whether MetalTranslator or MetalTranslator2 is used, so I could be wrong about this.

If it is MetalTranslator then given the else in the krafix translator putting the constants at buffer 0, I wonder whether the constants could always be in buffer zero even when there are other vertex buffers. There might be a reason why this isn’t possible, but in theory it might fix things in terms of the metal shader compilation working as expected.

RobDangerous · 2025-03-01T20:51:44Z

MetalTranslator2 is used.

aleph1 · 2025-03-01T22:36:19Z

Thanks. I think MetalTranslator2 might require a refactor to get shader compilation working with more complex structures (like having a memory layout that has multiple layouts). I did some research and this seems relevant:

KhronosGroup/SPIRV-Cross#1072

As does the link from within the comments:

https://github.com/KhronosGroup/SPIRV-Cross/wiki/Reflection-API-user-guide#mapping-reflection-output-to-vulkan-cheat-sheet

Compiling my new example with instancing and uniforms works as expected in OpenGL on Mac and I will try it with Vulkan on my Windows machine in the next week or two.

I'm leaving detailed information in case I don’t finish this work.

aleph1 · 2025-03-02T04:18:28Z

A bit more research.

Assigning the constant buffer in the metal backend to index 1 is problematic because the other vertex buffers can take up positions 0 to n. So, do we want to write it consistently to index 0, with the other vertex buffers being in positions 1 to n? I’m still trying to understand if the existing metal implementation chose these indexes for a reason, or were they just because it was only partially implemented? I might lean towards the vertex constant buffer being in position 0 as otherwise I think krafix needs an overhaul for metal as because the shaders are compiled at build time vs runtime it has no way of knowing which buffer index the constants will end up in.
krafix’s integration with spirv seems to lose bindings that are written by the user into the glsl code that it compiles. If I write a vertex shader like the following and compile it from the CLI with glslang -G and use the resultingspirv-cross --msl --msl-decoration-binding the binding position is maintained where krafix loses the bindings somewhere in the compilation. I have included the two compiled versions after the original glsl. Specifically note the difference in [[buffer(n)]] positions.

#version 450 core

layout(binding = 2, set = 0) uniform uniforms
{
    float tmp;
};

layout(location=0) in vec3 pos;
layout(location=1) in vec3 off;

void main() {
    gl_Position = vec4(pos.x + off.x + tmp, pos.y + off.y, 0.5, 1.0);
}

Using glslang and spirv-cross on the cli (buffer position is maintained):

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct uniforms
{
    float tmp;
};

struct main0_out
{
    float4 gl_Position [[position]];
};

struct main0_in
{
    float3 pos [[attribute(0)]];
    float3 off [[attribute(1)]];
};

vertex main0_out main0(main0_in in [[stage_in]], constant uniforms& _29 [[buffer(2)]])
{
    main0_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + _29.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    return out;
}

Using krafix (buffer position is changed from 2 to 1):

// shader_vert_main

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct uniforms
{
    float tmp;
};

struct shader_vert_main_out
{
    float4 gl_Position [[position]];
};

struct shader_vert_main_in
{
    float3 off [[attribute(1)]];
    float3 pos [[attribute(0)]];
};

vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant uniforms& _29 [[buffer(1)]])
{
    shader_vert_main_out out = {};
    out.gl_Position = float4((in.pos.x + in.off.x) + _29.tmp, in.pos.y + in.off.y, 0.5, 1.0);
    out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5;       // Adjust clip-space for Metal
    return out;
}

RobDangerous · 2025-03-02T22:56:23Z

2 is intentional and is done for every graphics backend. Very tricky to get things working cross-platform when we can not assign bindings ourselves (as 1 illustrates). Will check you suggestion.

aleph1 added 3 commits February 28, 2025 02:43

Fixed kinc_g5_command_list_set_vertex_buffers

b510c66

Now uses all buffers like in the Vulkan backend.

Refactored kinc_g5_pipeline_compile

bd204cf

Correctly assembles vertexDescriptor with correct offsets and strides, as well as perInstance and perVertex step functions.

Added back missing line

165bd8d

aleph1 changed the title ~~Metal instancing fix~~ Metal backend refactor Feb 28, 2025

Make constant vertex buffer last of all vertex buffers.

121d4f9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Metal backend refactor #893

Metal backend refactor #893

aleph1 commented Feb 28, 2025

aleph1 commented Feb 28, 2025 •

edited

Loading

aleph1 commented Feb 28, 2025

aleph1 commented Mar 1, 2025

RobDangerous commented Mar 1, 2025

aleph1 commented Mar 1, 2025

aleph1 commented Mar 2, 2025 •

edited

Loading

RobDangerous commented Mar 2, 2025

Metal backend refactor #893

Are you sure you want to change the base?

Metal backend refactor #893

Conversation

aleph1 commented Feb 28, 2025

aleph1 commented Feb 28, 2025 • edited Loading

aleph1 commented Feb 28, 2025

aleph1 commented Mar 1, 2025

RobDangerous commented Mar 1, 2025

aleph1 commented Mar 1, 2025

aleph1 commented Mar 2, 2025 • edited Loading

RobDangerous commented Mar 2, 2025

aleph1 commented Feb 28, 2025 •

edited

Loading

aleph1 commented Mar 2, 2025 •

edited

Loading