Subject: Clarification on RT2 model output structure and usage
Hi [Support Team/RT2 Developers],
I’m working on integrating the RT2 model into a ROS2-based robotic arm control system. The model is being used to generate control commands based on visual input (images) and natural language instructions. However, I’m encountering some challenges in understanding the structure and purpose of the model's output.
Here’s what I observe:
The model output is a tuple with two elements.
The first element is a tensor with shape [1, 1023, 20000].
The meaning and intended use of this tensor are unclear in my application, as the size seems too large for direct use as robotic arm joint control commands.
I have not yet inspected the second element of the tuple.
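For reference, the observations above could be gathered with a small helper like the one below. This is a minimal sketch, not code from the RT2 repository: `describe_output` and `FakeTensor` are hypothetical names, and `FakeTensor` only stands in for a real tensor so the snippet runs without loading the model. The `None` in second position is a placeholder, since the real second element has not been examined.

```python
def describe_output(output):
    """Return one summary line per element of a model output tuple."""
    lines = []
    for index, element in enumerate(output):
        # Tensors (PyTorch, NumPy, ...) expose .shape and .dtype; other objects may not.
        shape = getattr(element, "shape", None)
        dtype = getattr(element, "dtype", None)
        name = type(element).__name__
        if shape is not None:
            lines.append(f"[{index}] {name} shape={tuple(shape)} dtype={dtype}")
        else:
            lines.append(f"[{index}] {name} value={element!r}")
    return lines


class FakeTensor:
    """Hypothetical stand-in for a real tensor, used only for this demo."""

    def __init__(self, shape, dtype="float32"):
        self.shape = shape
        self.dtype = dtype


# Placeholder tuple: first element mimics the observed shape, second is unknown.
summary = describe_output((FakeTensor((1, 1023, 20000)), None))
for line in summary:
    print(line)
```

With the real model, the tuple returned by the forward pass would be passed to `describe_output` in place of the placeholder tuple.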
Could you please clarify the following:
What is the meaning of each element in the model’s output tuple?
Is the first tensor (shape [1, 1023, 20000]) designed for use in robotic control, or does it require additional decoding or processing? If so, what is the recommended approach?
If the second element of the tuple is relevant to robotic control, could you provide guidance or examples on how to interpret and use it?
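To make the second question concrete: one possible reading, which is an assumption and not something confirmed by the repository, is that the first tensor holds per-position logits over a 20000-token vocabulary. If so, decoding might involve taking the most likely token at each position and mapping action tokens back to continuous values through uniform bins. The sketch below illustrates that idea in plain Python on toy data. The function names, the bin count, and the value range are all hypothetical.

```python
def argmax_last_dim(logits):
    """Greedy decode: pick the highest-scoring token ID at each position.

    `logits` is a nested list indexed as [batch][position][vocab].
    """
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in sequence]
        for sequence in logits
    ]


def bin_to_value(bin_index, low, high, num_bins):
    """Map a discrete bin index to the centre of its interval in [low, high]."""
    if not 0 <= bin_index < num_bins:
        raise ValueError(f"bin index {bin_index} outside [0, {num_bins})")
    width = (high - low) / num_bins
    return low + (bin_index + 0.5) * width


# Toy logits: batch of 1, 2 positions, vocabulary of 3 tokens.
toy_logits = [[[0.1, 2.0, 0.3], [1.5, 0.2, 0.1]]]
token_ids = argmax_last_dim(toy_logits)

# Hypothetical de-tokenization: 4 bins spanning a normalized range of [-1, 1].
values = [bin_to_value(t, low=-1.0, high=1.0, num_bins=4) for t in token_ids[0]]
print(token_ids, values)
```

On a real PyTorch tensor the first step would be `logits.argmax(dim=-1)`. Which token IDs (if any) correspond to action bins, and what range each joint uses, is exactly what I am hoping the maintainers can clarify.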
Any examples, documentation, or best practices related to using RT2 for robotic arm control would be greatly appreciated.
Thank you for your assistance!
Best regards