This project showcases fine-tuning of the Qwen2.5-3B model using the Chain-of-Thought (CoT) prompting technique to enhance mathematical reasoning capabilities. With efficient parameter-efficient fine-tuning (PEFT) via LoRA and the high-speed Unsloth framework, the model is optimized for clarity, depth, and accuracy in problem-solving.


Chain-of-Thought Mathematical Reasoning with Qwen 3B

Overview

This repository demonstrates fine-tuning the Qwen2.5-3B model for mathematical reasoning using the Chain-of-Thought (CoT) prompting strategy. By combining structured prompts with efficient fine-tuning techniques, the implementation aims to strengthen the model's step-by-step reasoning and problem-solving capabilities.

Features

  • Model Architecture: Qwen2.5-3B fine-tuned with LoRA (Low-Rank Adaptation).
  • Prompt Strategy: Structured Chain-of-Thought prompting to improve reasoning clarity.
  • Fine-tuning Method: Parameter-Efficient Fine-Tuning (PEFT) using the Unsloth framework.
  • Dataset: Mathematical reasoning tasks extracted from the "Ashkchamp/Openthoughts_math_filtered_30K" dataset.

Technologies Used

  • Unsloth: For rapid, memory-efficient fine-tuning of large language models.
  • Hugging Face Transformers & TRL: Leveraged for model manipulation, training, and evaluation.
  • PEFT (LoRA): Enables efficient fine-tuning by adapting only select model parameters.
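To see why LoRA is parameter-efficient: instead of updating a full d×d weight matrix, LoRA trains two low-rank factors of shapes d×r and r×d. A back-of-the-envelope sketch in plain Python (the hidden dimension 2048 is an illustrative number; r = 32 matches the LoRA rank used in this project):

```python
# Illustrative comparison of trainable parameters: full fine-tuning vs. LoRA.

def lora_param_ratio(d: int, r: int) -> float:
    """Fraction of a d x d weight's parameters that LoRA trains
    (a d x r factor B plus an r x d factor A)."""
    full = d * d          # parameters updated by full fine-tuning
    lora = 2 * d * r      # parameters in the low-rank pair B and A
    return lora / full

# For a hypothetical 2048-dim layer at rank 32, LoRA trains ~3% of the weights:
ratio = lora_param_ratio(d=2048, r=32)
print(f"{ratio:.2%}")  # → 3.12%
```

This is why LoRA fits a 3B-parameter model's fine-tuning into modest GPU memory: only the small adapter factors receive gradients and optimizer state.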

Installation

pip install -q unsloth

Dataset Preparation

Dataset used: Ashkchamp/Openthoughts_math_filtered_30K (Hugging Face Hub).

Data preprocessing involves structured prompts containing the system context, mathematical questions, detailed Chain-of-Thought reasoning, and explicitly formatted solutions.
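The preprocessing described above can be sketched as a simple formatting function. The exact template lives in the training notebook; the section labels and wording below are illustrative assumptions, not the repository's actual template:

```python
# Sketch of assembling one structured CoT training example from its parts:
# system context, question, step-by-step reasoning, and the final answer.

SYSTEM_CONTEXT = "You are a careful mathematician. Reason step by step."

def build_cot_example(question: str, reasoning: str, answer: str) -> str:
    """Combine system context, question, CoT reasoning, and final answer
    into a single training string."""
    return (
        f"System: {SYSTEM_CONTEXT}\n"
        f"Question: {question}\n"
        f"Reasoning: {reasoning}\n"
        f"Answer: {answer}"
    )

example = build_cot_example(
    question="What is 12 * 8?",
    reasoning="12 * 8 = 12 * (10 - 2) = 120 - 24 = 96.",
    answer="96",
)
print(example)
```

A function like this would be mapped over the dataset rows to produce the text field consumed by the trainer.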

Training

Fine-tuning configurations:

  • Batch Size: 2 (with gradient accumulation of 4 steps)
  • Sequence Length: 8192 tokens
  • Learning Rate: 2e-5
  • Optimizer: AdamW (8-bit)
  • Precision: Automatic FP16/BF16 based on hardware support
  • LoRA Rank: 32
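The hyperparameters above map onto a TRL `SFTTrainer` configuration roughly as follows. This is a hedged sketch, not the notebook's exact code: the `model`, `tokenizer`, and `train_dataset` objects are assumed to already exist, and argument names may differ across TRL versions.

```python
# Configuration fragment mirroring the listed hyperparameters.
from trl import SFTConfig, SFTTrainer
from unsloth import is_bfloat16_supported

config = SFTConfig(
    per_device_train_batch_size=2,    # batch size 2
    gradient_accumulation_steps=4,    # effective batch size 2 * 4 = 8
    max_seq_length=8192,              # sequence length
    learning_rate=2e-5,
    optim="adamw_8bit",               # 8-bit AdamW
    fp16=not is_bfloat16_supported(), # pick FP16 or BF16 automatically
    bf16=is_bfloat16_supported(),
    output_dir="outputs",
)

trainer = SFTTrainer(
    model=model,                      # assumed: LoRA-wrapped model (rank 32)
    tokenizer=tokenizer,
    train_dataset=train_dataset,      # assumed: preformatted CoT examples
    args=config,
)
trainer.train()
```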

Execute training:

python train.py  # if encapsulated in a script, otherwise run cells sequentially in the provided notebook

Usage

After training, the model can answer mathematical queries with clear, step-by-step Chain-of-Thought reasoning:

from unsloth import FastLanguageModel

# Load the model; replace model_name with the path to your saved fine-tuned
# checkpoint (e.g. the LoRA adapters produced by training) to use the
# fine-tuned weights instead of the base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-3B",
    max_seq_length=8192,
    dtype=None,          # auto-detect FP16/BF16
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

prompt = "Your structured math problem here."
tokenized_input = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**tokenized_input, max_new_tokens=512)

print(tokenizer.decode(output[0], skip_special_tokens=True))

Contribution

Feel free to contribute improvements, report issues, or suggest new features via pull requests or issue submissions.

License

Distributed under the MIT License. See the LICENSE file for more information.
