hi, it depends on your use case and the hardware you have access to. generally speaking, you can finetune a model in a couple of ways:
1. Adapter - a form of prefix-tuning that prepends a learnable adaption prompt to the model
2. Low-rank adaptation (LoRA) - this significantly reduces the number of trainable parameters and speeds up training with little impact on the final performance of the model (see the sketch after this list).
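here's a rough LoRA sketch using the PEFT library - the model id, rank, and target modules are just placeholders, adjust them for your own model and task:

```python
# minimal LoRA setup with the PEFT library (hyperparameters are illustrative only)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder, any causal LM works
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# LoRA injects small low-rank matrices into the attention projections,
# so only a tiny fraction of the parameters is actually trained
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # rank of the update matrices
    lora_alpha=16,      # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the model architecture
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters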
Finetuning requires at least one GPU with ~24 GB of memory (e.g. an RTX 3090) for 7B models. QLoRA comes into play if you have limited GPU VRAM.
just keep in mind that quantizing to 8 bits reduces precision and can cost some quality.
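a QLoRA-style setup usually means loading the base model in 4-bit with bitsandbytes and then attaching LoRA adapters on top - something like the sketch below (again, the model id and values are just examples):

```python
# QLoRA-style sketch: 4-bit quantized base model + LoRA adapters
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)

model = prepare_model_for_kbit_training(model)  # gradient checkpointing, layer norm casting, etc.
model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16))
```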
a lot also depends on how you configure your training parameters. you can find more details here: https://huggingface.co/docs/transformers/main_classes/trainer
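as a starting point, a bare-bones Trainer setup could look like this - the hyperparameters are illustrative only, and `model` / `train_dataset` are assumed to come from the sketches above and your own data pipeline:

```python
# bare-bones Trainer setup; see the linked Trainer docs for the full list of options
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./finetuned-model",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size of 16
    num_train_epochs=3,
    learning_rate=2e-4,
    fp16=True,                       # mixed precision to save VRAM
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,                   # e.g. the PEFT model from the sketches above
    args=training_args,
    train_dataset=train_dataset,   # your tokenized dataset
)
trainer.train()
```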
hope this helps, cheers!