FineTuning BLIP2 - various issues #376
I have tried messing around with BLIP-2 T5-XXL with the same LoraConfig settings (BLIP-2 OPT-6.7B was working fine); it outputs gibberish and the loss converges to 0 way too quickly.
Figured it out: the T5 model expects input_ids as the instructions, and labels (decoder_input_ids) as your captions.
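A minimal sketch of what that split looks like for a T5-based BLIP-2 checkpoint (the checkpoint name, prompt, and caption below are placeholders, not taken from this thread):

```python
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-flan-t5-xl"  # assumed checkpoint; any T5-based BLIP-2 follows the same pattern
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint)

image = Image.new("RGB", (224, 224))  # stand-in for a real training image

# The instruction/prompt goes into input_ids, the caption goes into labels;
# the T5 head builds decoder_input_ids from the labels internally.
inputs = processor(images=image, text="Describe the picture.", return_tensors="pt")
labels = processor.tokenizer("a dog playing in the snow", return_tensors="pt").input_ids

outputs = model(pixel_values=inputs["pixel_values"],
                input_ids=inputs["input_ids"],
                labels=labels)
print(outputs.loss)
```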
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
I am getting the following error (only when I use PEFT): `TypeError: forward() got an unexpected keyword argument 'inputs_embeds'`. I was wondering if you knew what might be the issue? Or do you have an example notebook I could look at?
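One generic way to narrow this down (a diagnostic suggestion, not a fix from this thread) is to check whether the base model's forward accepts inputs_embeds at all; if it does not, anything that injects that argument, such as a PEFT wrapper configured for a text-only task, will raise exactly this TypeError:

```python
import inspect
from transformers import Blip2ForConditionalGeneration

# Assumed checkpoint; swap in whichever BLIP-2 variant you are fine-tuning.
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

# List the keyword arguments the unwrapped forward() actually accepts.
print(inspect.signature(model.forward).parameters.keys())
```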
I'm also getting the error that the loss ends up being all …
pinging @younesbelkada here
Still an issue for me after trying various versions of PEFT and PyTorch. A currently non-working system setup:
Hi @z3ugma, have you found a solution yet?
Hi @bryanchiaws,
Hello,
Thank you again for the fantastic work on this library and all the examples you are including!!
Big up @younesbelkada for all the support as well...
I have been trying to play around with BLIP2 and PEFT using the example notebook (https://colab.research.google.com/drive/16XbIysCzgpAld7Kd9-xz-23VPWmqdWmW?usp=sharing#scrollTo=6cCVhsmJxxjH), and a few things came up that I was hoping to get your help with:
The q_proj and k_proj layers don't exist, so I used "q","v" (or just the default values), and the loss converged to 0 extremely quickly. However, the model was really just outputting gibberish, so I'm likely not using the right target_modules... How are you supposed to tweak this parameter? In general, is there a heuristic for these, such as T5 -> q,v and OPT -> q_proj,k_proj, and is that different for the regular model vs. BLIP2?
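For picking target_modules, one generic approach (a sketch, not taken from the notebook) is to list the Linear layers inside the language model and use whichever attention-projection names actually exist for that backbone; T5 blocks name them "q"/"k"/"v"/"o", while OPT uses "q_proj"/"k_proj"/"v_proj"/"out_proj":

```python
import torch
from transformers import Blip2ForConditionalGeneration

# Assumed checkpoint; the same inspection works for the OPT-based variants.
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-flan-t5-xl")

# Print every Linear layer inside the language model; the strings passed to
# LoraConfig(target_modules=...) must match names that appear here.
for name, module in model.language_model.named_modules():
    if isinstance(module, torch.nn.Linear):
        print(name)
```

That also covers the heuristic question: the right strings differ per backbone because the underlying layer names differ, not because of BLIP-2 itself.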
The notebook's training step calls `outputs = model(input_ids=input_ids, pixel_values=pixel_values, labels=input_ids)`. From my understanding, this would imply that we are already passing the label we want to predict into the model as an input?
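For the decoder-only OPT variant that pattern is standard: the model shifts the labels internally, so each position is predicted only from earlier tokens and the label is never visible to the step that predicts it. A common refinement (a sketch with an assumed checkpoint, not the notebook's exact code) is to mask padding tokens out of the loss:

```python
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-opt-2.7b"  # assumed OPT-based checkpoint
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint)

image = Image.new("RGB", (224, 224))  # stand-in for a real image
batch = processor(images=image, text="a photo of a dog", return_tensors="pt")

# labels = input_ids is the usual causal-LM setup; replacing pad tokens with
# -100 keeps padding out of the cross-entropy loss.
labels = batch["input_ids"].clone()
labels[labels == processor.tokenizer.pad_token_id] = -100

outputs = model(input_ids=batch["input_ids"],
                pixel_values=batch["pixel_values"],
                labels=labels)
print(outputs.loss)
```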
I also tried to modify the notebook to go beyond just image captioning and try to train a VQA model by modifying the following:
But then it didn't really seem to converge as well as the regular image captioning despite always having the same prompt throughout my dataset... Anything I could be doing wrong?
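Tying this back to the earlier comment about the T5 variant: for VQA-style training the question typically goes into input_ids and the answer into labels. A minimal, self-contained sketch with placeholder data (not the original poster's dataset or code):

```python
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-flan-t5-xl"  # assumed checkpoint
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint)

image = Image.new("RGB", (224, 224))        # stand-in for a real VQA image
question = "Question: what is the dog doing? Answer:"
answer = "playing in the snow"

# Question/prompt -> input_ids, answer -> labels (decoder_input_ids are
# derived from the labels inside the model).
inputs = processor(images=image, text=question, return_tensors="pt")
labels = processor.tokenizer(answer, return_tensors="pt").input_ids

loss = model(pixel_values=inputs["pixel_values"],
             input_ids=inputs["input_ids"],
             labels=labels).loss
loss.backward()
```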
Thanks in advance!