This project focuses on recognizing handwritten mathematical expressions using a deep learning-based system. The approach leverages a hybrid model combining Convolutional Neural Networks (CNNs) with Transformers to accurately interpret and convert handwritten mathematical symbols into machine-readable formats like LaTeX.
## Table of Contents

- Introduction
- Features
- System Architecture
- Methodology
- Results and Discussion
- Future Work
- Deployed Link
- Authors
- Model and Code Access
## Introduction

Mathematical Expression Recognition (MER) plays a crucial role in educational and professional settings, enabling the digitization of handwritten mathematical content. This project aims to design a CNN-Transformer model that converts handwritten expressions into LaTeX for easy editing and digital storage.
## Features

- Hybrid CNN-Transformer Model: Combines a CNN for visual feature extraction with a Transformer for interpreting the spatial relationships between symbols.
- Real-time Recognition: Optimized for quick processing of mathematical symbols.
- Output Formats: Supports LaTeX or MathML output for recognized expressions (see the example after this list).
- Enhanced Accuracy: Developed with a focus on high recognition accuracy across a wide range of mathematical symbols.
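To make the output format concrete, this is the kind of LaTeX string the system would emit for a handwritten fraction; the expression is an illustrative example, not actual model output:

```latex
% Illustrative LaTeX output for a handwritten "(x^2 + 1) / 2"
\frac{x^{2} + 1}{2}
```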
## System Architecture

The architecture pairs a CNN for symbol recognition with a Transformer that models the relationships between symbols. This combination allows the model to handle the two-dimensional structure of handwritten mathematical expressions effectively.
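A minimal PyTorch sketch of this design is shown below. The class name, layer sizes, and vocabulary size are illustrative assumptions rather than the project's actual configuration, and positional encodings are omitted for brevity:

```python
# Minimal sketch of a CNN encoder + Transformer decoder for MER.
# All hyperparameters here are assumptions, not the project's real values.
import torch
import torch.nn as nn


class CNNTransformerMER(nn.Module):
    def __init__(self, vocab_size=512, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        # CNN encoder: extracts a grid of visual features from the image.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer decoder: attends over the feature grid and emits
        # LaTeX tokens one step at a time.
        decoder_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers)
        self.token_embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, images, token_ids):
        # images: (B, 1, H, W) grayscale; token_ids: (B, T) LaTeX token ids.
        feats = self.cnn(images)                   # (B, d_model, H', W')
        memory = feats.flatten(2).transpose(1, 2)  # (B, H'*W', d_model)
        tgt = self.token_embed(token_ids)          # (B, T, d_model)
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        hidden = self.decoder(tgt, memory, tgt_mask=mask)
        return self.out(hidden)                    # (B, T, vocab_size)
```

In practice, 2-D positional encodings on the CNN feature grid and teacher forcing during training would sit on top of this skeleton.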
## Methodology

- Data Acquisition: Collect handwritten expressions from sources such as scanned documents or digital input devices.
- Preprocessing: Convert images to grayscale, reduce noise, and apply thresholding (see the preprocessing sketch after this list).
- Segmentation: Separate images into individual symbols or symbol groups for processing.
- Model Inference: Run the CNN for feature extraction and the Transformer for interpreting spatial relationships (see the decoding sketch after this list).
- Output Generation: Convert the recognized symbol sequence into LaTeX or MathML.
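A minimal preprocessing and segmentation sketch using OpenCV is shown below. The filter choices and parameter values are assumptions for illustration, not the project's exact pipeline:

```python
import cv2

def preprocess(image_path):
    # Load the image directly as grayscale.
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Reduce noise with a small Gaussian blur.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Otsu's method picks a global threshold automatically; THRESH_BINARY_INV
    # makes the ink white on a black background.
    _, binary = cv2.threshold(blurred, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return binary

def segment(binary):
    # Connected-component analysis as a simple symbol-segmentation baseline;
    # multi-stroke symbols (e.g. "=") would need grouping on top of this.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    # Each stats row is (x, y, width, height, area); row 0 is the background.
    return [tuple(stats[i][:4]) for i in range(1, n)]
```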
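For inference and output generation, a greedy decoding loop over the model sketched in the architecture section could look like the following; the special token ids and the `id_to_token` vocabulary mapping are illustrative assumptions:

```python
import torch

SOS, EOS = 1, 2  # assumed start/end-of-sequence token ids

@torch.no_grad()
def recognize(model, image, id_to_token, max_len=128):
    # image: (1, H, W) grayscale tensor, e.g. a normalized preprocessed image.
    model.eval()
    tokens = [SOS]
    for _ in range(max_len):
        token_ids = torch.tensor([tokens])
        logits = model(image.unsqueeze(0), token_ids)  # (1, T, vocab_size)
        next_id = int(logits[0, -1].argmax())          # most likely next token
        if next_id == EOS:
            break
        tokens.append(next_id)
    # Map the predicted token ids back to a LaTeX string.
    return " ".join(id_to_token[t] for t in tokens[1:])
```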
## Results and Discussion

The model demonstrates high accuracy in symbol recognition and effectively converts basic mathematical expressions. However, challenges remain with complex structures and diverse handwriting styles, suggesting areas for improvement in preprocessing and attention mechanisms.
## Future Work

- Implement advanced attention mechanisms for complex expressions.
- Improve preprocessing for better symbol isolation.
- Expand training data for broader handwriting diversity.
- Optimize model for deployment on mobile and resource-limited devices.
## Authors

- Chime Gyeltshen Dorji
- Chencho Wangdi
- Tenzin Kinchap
- Karma Wangchuk
## Model and Code Access

Due to GitHub's 100 MB file size limit, the trained model and code could not be uploaded to this repository. You can access both on Google Drive at the following link: