Skip to content

An AI-powered medical assistant that uses image analysis and voice interaction using Meta Llama3 Vision for multimodal understanding, OpenAI Whisper for speech-to-text, and Gradio for an interactive healthcare assistant app

Notifications You must be signed in to change notification settings

fahim-ysr/HealthIntuit

Repository files navigation

HealthIntuit: - An AI-powered healthcare assistant


HealthIntuit Demo

Project Overview

HealthIntuit is a multimodal AI-powered web app that analyzes patient-submitted images and voice queries to generate preliminary medical insights and prescription drafts. Designed for educational purposes, it leverages Groq's ultra-fast LLMs and ElevenLabs' lifelike speech synthesis to simulate doctor-patient interactions while emphasizing the importance of professional consultation. The app streamlines triage workflows, reduces administrative burdens, and empowers patients with accessible preliminary guidance.

Mission

Leverage multimodal AI to streamline preliminary patient triage and prescription drafting, supporting medical education and workflow efficiency without replacing licensed healthcare providers.

  • Medical Education: Students can simulate doctor-patient interactions for training.
  • Workflow Automation: Generates draft prescriptions for review (saves ~15 mins/case).

Tech Stack

CategoryTechnologies Used
AI/MLGroq API (Llama-4-Scout-17B, Whisper-Large-v3), ElevenLabs TTS
FrontendGradio (Python-based UI), Custom CSS Styling
BackendPython 3.10
Audio/ImagePyDub, SpeechRecognition, Base64 Encoding
SecurityEnvironment Variables, Input Validation

Problem Statement

  • Healthcare Access Crisis (Canada Focus):
    • 4.8 million Canadians lack a family doctor (2025 Statistics Canada), with average wait times of 25.6 weeks for specialist referrals.
    • Rural disparities: 23% of rural patients travel >100 km for basic care, exacerbating health inequities.
  • Pain Points Addressed:
    • Long Wait Times: Patients face delays in receiving initial feedback for non-emergency conditions (e.g., skin irritations).
    • Prescription Errors: 12% of manual prescriptions contain dosage inaccuracies (Canadian Medical Association, 2024).
    • Workflow Inefficiencies: Doctors spend 37% of their time on administrative tasks instead of patient care.
  • Impact:
    • Misdiagnosis costs Canada’s healthcare system $1.2B annually in preventable complications.
    • 68% of patients report frustration with "black box" medical jargon in preliminary diagnoses.

System Architecture

High-Level UML Diagram (Text Representation)

[User Interface (Gradio)]
       ↑ ↓
[Backend Logic (Python)]
       ↑ ↓
[Groq AI Model (Llama-4-Scout-17B, Whisper-Large-v3)]
       ↑
[Prescription Formatter (AI-powered)]
       ↑
[ElevenLabs/gTTS TTS]
    
Workflow Diagram

Key Components

  • Frontend Operations:
    • Responsive, accessible UI using Gradio.
    • Secure image (medical photo) and audio (voice) upload.
    • Real-time feedback: Progress indicators and error messages for invalid or missing inputs.
  • Backend Operations:
    • All logic handled within a Python backend (Gradio app).
    • Handles file processing, AI inference, and output formatting.
    • Uses environment variables for secure API key management.
  • AI Integrations:
    • Groq LLM: For multimodal (image + text) medical analysis.
    • Whisper (Groq): For speech-to-text transcription of patient queries.
    • ElevenLabs/gTTS: For converting doctor’s AI response to natural-sounding speech.
  • Prescription Generator:
    • AI-driven prescription formatter: Converts doctor’s analysis into a structured, downloadable text prescription.
    • Downloadable as a .txt file.
  • Multi-modal Input:
    • Accepts both voice (audio) and medical image uploads for comprehensive analysis.
  • Dynamic Prescription Templating:
    • Uses Groq AI to generate a clear, structured prescription based on the doctor’s response.

Sample Prescription Output:

Patient Name: John Doe
Date: 2025-05-20 14:30:00

Diagnosis: Mild seborrheic dermatitis

Prescription:
- Ketoconazole 2% shampoo: Apply 10ml to scalp daily

Recommendations:
- Avoid harsh hair products
- Follow up in 2 weeks

Diagonosed by ⚕️HealthIntuit

Disclamer: AI Medical Assistant (Educational Use Only)

Technical Highlights

  • Code Quality & Principles:
    • Partial adherence to SOLID (modular functions, easy to extend AI models).
    • Clean, PEP8-compliant code; documented functions.
    • Test coverage not yet implemented (future: pytest, target 90%+).
  • Security:
    • API keys managed via environment variables.
    • Basic input validation and temporary file cleanup.
  • Scalability:
    • Single-instance Gradio app (suitable for small user base).
    • Groq AI APIs handle inference scaling.
  • Observability:
    • Basic logging for audio recording and error handling.
    • Future: centralized logging, monitoring, and error tracking.

Product Features

  • Multi-modal Input:
    • Accepts both patient queries (voice) and medical images (photo upload).
  • Educational Disclaimers:
    • Prominently displayed in all outputs and prescription downloads.
    • Logic prevents use without user acknowledgment of educational purpose.

Business & Product Vision

  • Market Fit:
    • Addresses needs in medical education, telehealth, and workflow automation.
  • Monetization:
    • SaaS for institutions, API for EMR vendors, open-source core for community adoption.
  • Growth:
    • Roadmap includes multilingual support, integration with Canadian EMRs, and expansion to global markets.

For educational use only. Consult a licensed healthcare provider for medical advice.


Watch the video

About

An AI-powered medical assistant that uses image analysis and voice interaction using Meta Llama3 Vision for multimodal understanding, OpenAI Whisper for speech-to-text, and Gradio for an interactive healthcare assistant app

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published