
Can't load state_dict for GPT2ForSequenceClassification (Unexpected key(s) in state_dict) #1

Open
xiyiyia opened this issue Jul 12, 2023 · 3 comments

xiyiyia commented Jul 12, 2023

Hi @malik727 @Hunaid2000,
I suspect the problem is in GPTGC.pt.
Could I get a fresh copy of the pre-trained model file?
Thanks a lot!

Traceback (most recent call last):
  File "/home/genre-detector-gpt2/main.py", line 13, in <module>
    model = GPTGC(device) # Loading model for inference.
  File "/home/genre-detector-gpt2/GPTGC.py", line 59, in __init__
    self.load_model()
  File "/home/genre-detector-gpt2/GPTGC.py", line 165, in load_model
    self.model.load_state_dict(torch.load(f"{params['MODEL_DIR']}/{params['NAME']}.pt"))
RuntimeError: Error(s) in loading state_dict for GPT2ForSequenceClassification:
	Unexpected key(s) in state_dict: "transformer.h.0.attn.bias", "transformer.h.0.attn.masked_bias", "transformer.h.1.attn.bias", "transformer.h.1.attn.masked_bias", "transformer.h.2.attn.bias", "transformer.h.2.attn.masked_bias", "transformer.h.3.attn.bias", "transformer.h.3.attn.masked_bias", "transformer.h.4.attn.bias", "transformer.h.4.attn.masked_bias", "transformer.h.5.attn.bias", "transformer.h.5.attn.masked_bias", "transformer.h.6.attn.bias", "transformer.h.6.attn.masked_bias", "transformer.h.7.attn.bias", "transformer.h.7.attn.masked_bias", "transformer.h.8.attn.bias", "transformer.h.8.attn.masked_bias", "transformer.h.9.attn.bias", "transformer.h.9.attn.masked_bias", "transformer.h.10.attn.bias", "transformer.h.10.attn.masked_bias", "transformer.h.11.attn.bias", "transformer.h.11.attn.masked_bias".
xiyiyia (Author) commented Jul 12, 2023

To get the code running, I decided to drop the attention-bias entries that the current model does not expect.
If accuracy is not your primary concern, you can make the following change:

Replace the load_state_dict() call in load_model() (line 165 of GPTGC.py) with this snippet:

# Load the checkpoint, then keep only the keys that exist in the current
# model, discarding the stale attn.bias / attn.masked_bias buffers.
loaded_dict = torch.load(f"{params['MODEL_DIR']}/{params['NAME']}.pt")
model_dict = self.model.state_dict()
loaded_dict = {k: v for k, v in loaded_dict.items() if k in model_dict}
model_dict.update(loaded_dict)
self.model.load_state_dict(model_dict)

This modification will allow you to proceed with your desired testing.
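An equivalent shortcut is PyTorch's `load_state_dict(..., strict=False)`, which ignores unexpected keys and reports them in its return value instead of raising. A minimal standalone demo (using an `nn.Linear` stand-in rather than the real GPTGC model, and a made-up `extra.bias` key to simulate the stale buffer):

```python
import torch
import torch.nn as nn

# Stand-in model; its state_dict has only "weight" and "bias".
model = nn.Linear(2, 2)

# Simulate a checkpoint carrying a stale buffer key.
ckpt = model.state_dict()
ckpt["extra.bias"] = torch.zeros(2)

# strict=False skips the unexpected key and returns it for inspection.
result = model.load_state_dict(ckpt, strict=False)
print(result.unexpected_keys)  # ['extra.bias']
```

This avoids hand-filtering the dict, at the cost of silently accepting any mismatch, so it is worth logging `result.unexpected_keys` and `result.missing_keys`.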

malik727 (Collaborator) commented Jul 12, 2023

@xiyiyia Can you tell me the Python, PyTorch, and Hugging Face transformer versions you're using? I tested on the following configuration and it runs fine:

Python = 3.9.13
HuggingFace-Hub = 0.11.1
PyTorch = 1.13.1
Transformers = 4.27.4

In principle, we should've used the save_pretrained() and from_pretrained() functions provided by the Transformers library, as they don't bind the checkpoint to a specific version of the library.

To fix the unexpected-keys issue, we'll need to alter the key names in the stored checkpoint to match those expected by the GPT-2 architecture.
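A one-off cleanup along those lines could look like this. This is a sketch, not the project's actual fix: the `strip_attn_buffers` helper is hypothetical, and it drops the `attn.bias` / `attn.masked_bias` entries rather than renaming them, on the assumption that recent Transformers releases no longer persist these buffers in the state dict at all:

```python
# Hypothetical helper: strip stale GPT-2 attention buffers from a
# checkpoint dict so load_state_dict() no longer sees unexpected keys.
def strip_attn_buffers(state_dict):
    drop_suffixes = (".attn.bias", ".attn.masked_bias")
    return {k: v for k, v in state_dict.items()
            if not k.endswith(drop_suffixes)}

# Example with tensors stubbed out as ints:
ckpt = {
    "transformer.h.0.attn.bias": 0,
    "transformer.h.0.attn.masked_bias": 0,
    "transformer.h.0.attn.c_attn.weight": 1,
    "score.weight": 2,
}
cleaned = strip_attn_buffers(ckpt)
print(sorted(cleaned))  # ['score.weight', 'transformer.h.0.attn.c_attn.weight']
```

In the real repo this would run once over `torch.load("GPTGC.pt")` and the cleaned dict would be re-saved with `torch.save`.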

xiyiyia (Author) commented Jul 12, 2023

> @xiyiyia Can you tell me the Python, PyTorch, and Hugging Face transformer versions you're using? I tested on the following configuration and it runs fine:
>
> Python = 3.9.13 HuggingFace-Hub = 0.11.1 PyTorch = 1.13.1 Transformers = 4.27.4
>
> In principle, we should've used the "save_pretrained()" and "load_pretrained()" functions provided by the transformers library as they don't bind the functionality to a specific version of the library.
>
> To fix the unexpected keys issue we'll need to alter the key names of the stored model to match the key names expected by the GPT-2 architecture.

Here are the versions of the packages you mentioned:

Python = 3.8.13
HuggingFace-Hub = 0.15.1
PyTorch = 1.11.0
Transformers = 4.31.0

I will create a new environment for testing.

Thanks for your nice work.
