# Supervise Finetuning This experiment is used for supervised fine-tuning (SFT) on UltraChat. This README provides instructions for setting up and running the experiment. ### Data Preparation To begin, you'll need to obtain the following datasets: 1. [UltraChat Dataset](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) ### Supervised Fine-Tuning (SFT) To perform Supervised Fine-Tuning on the UltraChat dataset, use the following script: ``` export DATA_PATH=data/ultrachat.json export CONV_TEMPLATE=llama-3-instruct export MODEL_PATH=pretrained_models/Meta-Llama-3-8B export GA=16 export OUTPUT_DIR=saved_models/llama-3-8b_ultrachat bash scripts/train_sft.sh ``` ### Sequence Parallel (SP) for Supervised Fine-Tuning (SFT) We have implemented sequence parallel (SP) for SFT. If you want to use SP, please add these three parameters: ```bash --sequence_parallel_size 8 \ --sequence_parallel_mode "ulysses" \ --cutoff_len 16000 ``` - **`sequence_parallel_size`**: Used to set the number of GPUs to process a single sequence together. The default value is 1, which means SP is not used. - **`sequence_parallel_mode`**: Specifies the specific implementation method of SP. We currently only support `ulysses`. - **`cutoff_len`**: Used to specify the maximum length that the model can handle. When using SP, gradient_accumulation_steps needs to be multiplied by sequence_parallel_size to equal the original batch size when not using SP. Here is the comparison chart of the results with and without sp: ![sp](../assets/sp.png) Currently, we have only tested the performance of the Qwen 2.5 and Qwen 3 models on this code, and there have been no issues. A significant portion of our SP functionality implementation is based on the open-source repository from [360-LLaMA-Factory](https://github.com/Qihoo360/360-LLaMA-Factory). ### References ``` @article{ding2023enhancing, title={Enhancing Chat Language Models by Scaling High-quality Instructional Conversations}, author={Ding, Ning and Chen, Yulin and Xu, Bokai and Qin, Yujia and Zheng, Zhi and Hu, Shengding and Liu, Zhiyuan and Sun, Maosong and Zhou, Bowen}, journal={arXiv preprint arXiv:2305.14233}, year={2023} } ```