A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Updated Apr 9, 2025 - C++
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
CLIP inference in plain C/C++ with no extra dependencies
multimodal routing, geocoding, and map tiles
LLaVA server (llama.cpp).
The simulation system for robotic general intelligence™
[ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation
ROS2 package that integrates L3CAM sensors using L3CAM SDK
Highway Driving (project 7 of 9 from Udacity Self-Driving Car Engineer Nanodegree)
Dual WiFi 2.4 & 5GHz + Bluetooth Scanner with RTL8720DN BW16
ROS2 package for the visualization of the fusion of the L3Cam device sensors
🔥 🔥 Alternative to Ollama 🔥 🔥 multi-model <1ms LLM switching
Repository to document and advertise our McGill Capstone Group 22 Project
🚀 Accelerate large language model training and fine-tuning with Surogate’s high-performance, mixed-precision framework in C++ and Python.
🔍 Scan WiFi 2.4-5GHz and Bluetooth quickly with WIFIBLE, featuring touch support and optimized performance for seamless connectivity.