Build failure on Ubuntu 24.10 #17
I confirm both the build failure and the fix proposed by @NeQuissimus, on Ubuntu Rockchip 24.04 LTS for RK3588. With the fix it now installs; however, when giving a prompt to a model it's still not working as it should, possibly because I'm using Python 3.12 instead of 3.10 or 3.8, which is on me.

Edit: I haven't checked entirely how everything works, but a Qwen model converted with the latest toolkit works, while the Qwen models from this repo's page fail altogether. In case anyone needs it, it's the qwen.rkllm from here. It's now talking to itself, so I guess it's not an instruct model and I'm abusing it, but it's the most recent one, supposedly the only one publicly available that was converted with the latest toolkit. I'm still using Python 3.12. The command now requires specifying the max new tokens and max context length:
I'm a complete noob here so there might be mistakes, but hopefully it helps someone else. A big shoutout to the repo's maintainer Pelochus, and to NeQuissimus for the fix!
@MartynaKowalska - any RKLLM models converted with the 1.1.* toolkit are compatible with each other, and even with the older 0.9.7 kernel module. I have a bunch on Huggingface that I have tested with Armbian Noble on an Orange Pi 5 Plus. Feel free to try them out: https://huggingface.co/c01zaut
@c0zaut Thank you for your reply and contribution! I downloaded and tested a model of yours. This is the command I gave and its initial output:
After that, a massive block of text in English and Chinese was output, and I could type my input to the LLM. I said "Hello! How are you?", but it then started conversing with itself about artificial intelligence. Did I do something wrong? Did I pick the wrong model?
@MartynaKowalska You did not - it has to do with the chat template. Change the prompt prefix and postfix in your script to align with the model's template: https://huggingface.co/c01zaut/RK3588-Prompt If you are just using it for general chat, I made a Gradio app that automatically handles chat templates: https://github.com/c0zaut/RKLLM-Gradio
@c0zaut Thank you very much, I managed to set everything up properly and I was able to chat with Qwen. Your app is fantastic, keep up the phenomenal work💜!! Can I ask you why each model has so many different files? What should one choose?
@MartynaKowalska - Those are just different conversions. RKLLM has a bunch of options for quantization parameters, so I try to do a range for folks to test out the performance/accuracy trade-off of each. Glad you're enjoying it! Once I have some extra time, I plan to finish implementing multi-modal and image generation support (I'm also looking at web search, since a couple of models support that kind of tool calling).
It was suggested that I add the #include under Ubuntu 24.04, and it works.
This is the solution. Sorry for the late reply; I haven't checked the issues in a while. In the future I will add this fix so that you don't need to apply it yourself.
Building this fails with the following error:
The following fixes the issue, but I am not sure whether it needs to be made conditional for Armbian or for Ubuntu < 24.10