Chapter 12 #32

ritesh2014 · 2025-01-27T13:43:54Z

The code below raises an error:

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    max_seq_length=512,

    # Leave this out for regular SFT
    peft_config=peft_config,
)

Error:
TypeError: SFTTrainer.__init__() got an unexpected keyword argument 'dataset_text_field'

I tried to resolve this using Gemini, but each suggested fix threw new errors and the code got messed up.
Has anyone resolved this?


MaartenGr · 2025-01-27T14:44:05Z

Where are you using the above code, in Colab? If so, can you use the versions of transformers/trl as shown in the requirements.txt?

ritesh2014 · 2025-01-27T15:19:01Z

Using Colab.

So I changed:
!pip install -q accelerate peft bitsandbytes transformers trl sentencepiece

To:
!pip install -q accelerate==0.31.0 peft==0.11.1 bitsandbytes==0.43.1 transformers==4.41.2 trl==0.9.4 sentencepiece==0.2.0

I had to pin versions for all the libraries, as additional errors kept cropping up otherwise.
Now it works. When you can, please update the notebook on GitHub to prevent this occurrence for others.
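For anyone who wants to confirm that the runtime actually picked up these pins (Colab keeps preinstalled packages until the runtime restarts), here is a minimal sketch using only the standard library. The `PINS` dictionary and the `find_mismatches` helper are illustrative names, not part of the notebook; the versions are copied from the install command above.

```python
from importlib import metadata

# Pins copied from the install command in this thread.
PINS = {
    "accelerate": "0.31.0",
    "peft": "0.11.1",
    "bitsandbytes": "0.43.1",
    "transformers": "4.41.2",
    "trl": "0.9.4",
    "sentencepiece": "0.2.0",
}


def find_mismatches(pins):
    """Return {package: (expected, installed)} for every pin that is not met.

    `installed` is None when the package is not installed at all.
    (Hypothetical helper, written for this thread.)
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Print one line per package whose installed version differs from the pin.
    for name, (expected, installed) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

If this prints nothing, every pinned package is installed at the expected version. Note that it reports installed package metadata, so modules already imported in a running session may still be an older version until the runtime is restarted.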

MaartenGr · 2025-02-05T16:44:02Z

@ritesh2014 Thanks for testing the updated requirements! I finally had some time to update this and it should work now. Also, the reason why this took a bit longer is a nice one (cool update in a couple of hours 😉).

amina-mardiyyah · 2025-02-07T10:31:00Z

For future reference, or for anyone who bumps into this error on Colab: you can also replace transformers' TrainingArguments with SFTConfig from trl, and pass dataset_text_field and max_seq_length there instead of to SFTTrainer. Define the parameters as below:

from trl import SFTConfig, SFTTrainer

training_arguments = SFTConfig(
    output_dir=output_dir,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    logging_steps=10,
    max_seq_length=512,
    fp16=True,
    gradient_checkpointing=True,
    # Moved here from the SFTTrainer call
    dataset_text_field="text",
)

Then:

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    tokenizer=tokenizer,
    args=training_arguments,
)
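If it is unclear which of the two call styles the installed trl expects, one option is to inspect the constructor signature before building the trainer. The sketch below is an illustration, not something from the notebook: `accepts_kwarg`, `OldStyleTrainer`, and `NewStyleTrainer` are hypothetical names, and the two stand-in classes only mimic the two behaviours described in this thread so the example runs without trl installed.

```python
import inspect


def accepts_kwarg(callable_obj, name):
    """Return True if `callable_obj` can be called with keyword `name`.

    A parameter with that name counts, and so does a catch-all **kwargs.
    (Hypothetical helper, written for this thread.)
    """
    params = inspect.signature(callable_obj).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins that mimic the two behaviours seen in this thread.
class OldStyleTrainer:
    def __init__(self, model, args=None, dataset_text_field=None):
        self.dataset_text_field = dataset_text_field


class NewStyleTrainer:
    def __init__(self, model, args=None):
        self.args = args


if __name__ == "__main__":
    for cls in (OldStyleTrainer, NewStyleTrainer):
        where = "trainer" if accepts_kwarg(cls, "dataset_text_field") else "config"
        print(f"{cls.__name__}: pass dataset_text_field to the {where}")
```

With trl installed, the same check would presumably be `accepts_kwarg(SFTTrainer, "dataset_text_field")`: a False result suggests the SFTConfig approach above, while True suggests the original call still works.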

MaartenGr added a commit that referenced this issue Feb 5, 2025

Fix #32

e2bfd91

MaartenGr closed this as completed in 57e8572 Feb 5, 2025
