-
-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metal backend refactor #893
base: main
Are you sure you want to change the base?
Conversation
Now uses all buffers like in the Vulkan backend.
Correctly assembles vertexDescriptor with correct offsets and strides, as well as perInstance and perVertex step functions.
I have discovered what the other issue is with instancing and it has to do with how vertex fragments constants seem to be handled. The current metal backend has the constant index hardcoded as 1, which is what the outputted metal code uses as the expected buffer index for uniform data. The problem is when we use instancing there are effectively two (or more) buffers assigned based on the generated memory layout. For example, we create a buffer pipeline like this (I am using a single float as the only uniform as a “most simple” test case before moving to more complex uniform structures): kinc_g4_vertex_structure_t structure;
kinc_g4_vertex_structure_init(&structure);
kinc_g4_vertex_structure_add(&structure, "pos", KINC_G4_VERTEX_DATA_F32_3X);
kinc_g4_vertex_structure_t structure_inst;
kinc_g4_vertex_structure_init(&structure_inst);
kinc_g4_vertex_structure_add(&structure_inst, "off", KINC_G4_VERTEX_DATA_F32_3X);
structure_inst.instanced = true;
kinc_g4_pipeline_init(&pipeline);
pipeline.vertex_shader = &vertex_shader;
pipeline.fragment_shader = &fragment_shader;
pipeline.input_layout[0] = &structure;
pipeline.input_layout[1] = &structure_inst;
pipeline.input_layout[2] = NULL;
kinc_g4_pipeline_compile(&pipeline);
tmp_loc = kinc_g4_pipeline_get_constant_location(&pipeline, "tmp"); And have shader code like this: #version 450
in vec3 pos;
in vec3 off;
uniform float tmp;
void main() {
gl_Position = vec4(pos.x + off.x + tmp, pos.y + off.y, 0.5, 1.0);
} But the resulting kinc/make metal code expects the uniforms to be at buffer 1, but they have to be placed at buffer 2 due to the pipelines vertex structure. // shader_vert_main
#include <metal_stdlib>
#include <simd/simd.h>
using namespace metal;
struct shader_vert_main_uniforms
{
float tmp;
};
struct shader_vert_main_out
{
float4 gl_Position [[position]];
};
struct shader_vert_main_in
{
float3 off [[attribute(0)]];
float3 pos [[attribute(1)]];
};
vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant shader_vert_main_uniforms& uniforms [[buffer(1)]])
{
shader_vert_main_out out = {};
out.gl_Position = float4((in.pos.x + in.off.x) + uniforms.tmp, in.pos.y + in.off.y, 0.5, 1.0);
out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5; // Adjust clip-space for Metal
return out;
} If I simply modify the compiled metal vertex shader and replace |
This is a quick video illustrating the aforementioned issue of buffer indexes. If I manually adjust the compiled shader to point at the correct buffer index then everything works as expected and the triangles are drawn with instancing and they utilize a uniform to move. kore-metal-fix.mp4You can see how I am having to ensure the constants vertex buffer is set as the last vertex buffer, as index 1 might already be occupied when doing things like instancing. Is there a reason why index 1 was chose for the constants vertex buffer, when 0 could have been used, and then all other vertex buffers would simply use 1...n? |
I did a bit more digging, and I think this might be considered a krafix issue. The issue seems to boil down to the fact that when Metal vertex buffers user per vertex and per instance step functions in a single memory layout, they take up more than one buffer slot. I think that is why in my test case changing the compiled metal vertex shader input functions If it is MetalTranslator then given the |
MetalTranslator2 is used. |
Thanks. I think MetalTranslator2 might require a refactor to get shader compilation working with more complex structures (like having a memory layout that has multiple layouts). I did some research and this seems relevant: As does the link from within the comments: Compiling my new example with instancing and uniforms works as expected in OpenGL on Mac and I will try it with Vulkan on my Windows machine in the next week or two. I'm leaving detailed information in case I don’t finish this work. |
A bit more research.
#version 450 core
layout(binding = 2, set = 0) uniform uniforms
{
float tmp;
};
layout(location=0) in vec3 pos;
layout(location=1) in vec3 off;
void main() {
gl_Position = vec4(pos.x + off.x + tmp, pos.y + off.y, 0.5, 1.0);
} Using glslang and spirv-cross on the cli (buffer position is maintained): #include <metal_stdlib>
#include <simd/simd.h>
using namespace metal;
struct uniforms
{
float tmp;
};
struct main0_out
{
float4 gl_Position [[position]];
};
struct main0_in
{
float3 pos [[attribute(0)]];
float3 off [[attribute(1)]];
};
vertex main0_out main0(main0_in in [[stage_in]], constant uniforms& _29 [[buffer(2)]])
{
main0_out out = {};
out.gl_Position = float4((in.pos.x + in.off.x) + _29.tmp, in.pos.y + in.off.y, 0.5, 1.0);
return out;
} Using krafix (buffer position is changed from 2 to 1): // shader_vert_main
#include <metal_stdlib>
#include <simd/simd.h>
using namespace metal;
struct uniforms
{
float tmp;
};
struct shader_vert_main_out
{
float4 gl_Position [[position]];
};
struct shader_vert_main_in
{
float3 off [[attribute(1)]];
float3 pos [[attribute(0)]];
};
vertex shader_vert_main_out shader_vert_main(shader_vert_main_in in [[stage_in]], constant uniforms& _29 [[buffer(1)]])
{
shader_vert_main_out out = {};
out.gl_Position = float4((in.pos.x + in.off.x) + _29.tmp, in.pos.y + in.off.y, 0.5, 1.0);
out.gl_Position.z = (out.gl_Position.z + out.gl_Position.w) * 0.5; // Adjust clip-space for Metal
return out;
} |
2 is intentional and is done for every graphics backend. Very tricky to get things working cross-platform when we can not assign bindings ourselves (as 1 illustrates). Will check you suggestion. |
This is almost ready to integrate, but there is one outstanding issue.
There is a conflict between two functions,
kinc_g5_command_list_set_vertex_constant_buffer
andkinc_g5_command_list_set_vertex_buffers
, in that they both attempt to set a vertex buffer at index 0.I added some logging on my local build, and I think the problem stems from the fact the the functions are called in this order:
The Vulkan backend deals with this by adding three variables
static uint32_t lastVertexConstantBufferOffset
,static uint32_t lastFragmentConstantBufferOffset
,static uint32_t lastComputeConstantBufferOffset
and then alters the code in commandlist.m.h in some pretty major ways. I guess it works, but I haven't tried the Vulkan backend as I am on Mac.I haven't delved into the whole kinc rendering pipeline so I am not sure if it is easy to switch the order that kinc_g5_command_list_set_vertex_buffers and kinc_g5_command_list_set_vertex_constant_buffer are called in, or if it might have a negative effect somewhere else. I also don't know if it will fully fix the issue as I think the kinc_g5_command_list_set_vertex_buffers if it wrote to indexes 1...n I would still need to account for the buffer size of the vertex constant buffer. I couldn't see a way to access that from outside G4.c.h. Is it possible to add a function to retrieve this?
If you want to see the instancing sample working you have to disable the constant buffer being set. You can do that by commenting out the last line of the kinc_g5_command_list_set_vertex_constant_buffer function (
[encoder setVertexBuffer:buf offset:offset atIndex:0];
).