Skip to content

Latest commit

 

History

History
178 lines (143 loc) · 5.83 KB

File metadata and controls

178 lines (143 loc) · 5.83 KB

Office Add-in to AI Analysis - Complete Flow Analysis

Current State - What Actually Happens

1. User Clicks "Save to Spaarke" in Outlook

Component: SaveFlow.tsxuseSaveFlow.ts

1. Client collects metadata (subject, sender, recipients, attachments list)
2. Client calls POST /api/office/save with:
   - contentType: 'Email'
   - triggerAiProcessing: true (default from DEFAULT_PROCESSING_OPTIONS)
   - aiOptions: { profileSummary: true, ragIndex: true, deepAnalysis: false }

2. API Receives Save Request

Component: OfficeEndpoints.csOfficeService.SaveAsync()

1. Creates ProcessingJob record in Dataverse
2. Retrieves email body + attachments via Microsoft Graph (OBO token)
3. Converts to .eml format using IEmailToEmlConverter
4. Uploads .eml to SharePoint Embedded via SpeFileStore
5. Creates Document record in Dataverse (via IDataverseService.CreateDocumentAsync)
6. Updates Document with SPE pointers, email metadata
7. Queues OfficeJobMessage to "office-upload-finalization" queue
8. Returns 202 Accepted with jobId

Status: Job marked as COMPLETE immediately (Option B flow) Issues:

  • Creates Document record BEFORE worker runs
  • Job status = Complete but workers haven't run yet

3. UploadFinalizationWorker Processes Message

Component: UploadFinalizationWorker.cs

1. Receives message from "office-upload-finalization" queue
2. Checks if file already in SPE (TempFileLocation starts with "spe://")
3. Uses existing DocumentId from payload (created in step 2)
4. Creates EmailArtifact record
5. Extracts + uploads attachments as child Documents
6. IF payload.TriggerAiProcessing:
   - Queues message to "office-profile" queue
   ELSE:
   - Marks job Complete

Issues Observed:

  • Failing with "Incorrect attribute value type System.Int32" when updating ProcessingJob
  • Dead-lettering messages → profile queue never receives them

4. ProfileSummaryWorker (SHOULD Process)

Component: ProfileSummaryWorker.cs

1. Receives message from "office-profile" queue
2. Calls IAppOnlyAnalysisService.AnalyzeDocumentAsync()
3. Creates Analysis record with Document Profile playbook
4. IF payload.RagIndex:
   - Queues to "office-indexing" queue
   ELSE:
   - Marks job Complete

Status: NEVER RUNS because UploadFinalizationWorker fails

What SHOULD Happen - Existing AI Infrastructure

Existing AI Analysis Flow for Documents

Entry Points:

  1. POST /api/ai/analyze - Direct analysis endpoint
  2. Service Bus - document-events queue with topic subscription
  3. Background Workers - ServiceBusJobProcessor listening to queue

Existing Components:

  • IAnalysisOrchestrationService - Orchestrates analysis pipeline
  • IAppOnlyAnalysisService - App-only document analysis
  • IAiToolService - AI tool execution framework
  • IFileIndexingService - RAG indexing pipeline

Existing Flow:

Document Created
  ↓
Message to document-events queue
  ↓
ServiceBusJobProcessor picks up
  ↓
ProfileSummaryJobHandler processes
  ↓
Calls IAppOnlyAnalysisService.AnalyzeDocumentAsync()
  ↓
Creates Analysis record with playbook
  ↓
Queues indexing job if needed

Problems Identified

1. Duplicate Infrastructure

We created NEW workers (UploadFinalizationWorker, ProfileSummaryWorker) that duplicate existing infrastructure:

  • Existing: ServiceBusJobProcessor + ProfileSummaryJobHandler
  • New (Office): ProfileSummaryWorker

Result: Two separate pipelines doing the same thing

2. Premature Job Completion

OfficeService marks job as COMPLETE immediately after upload, but workers haven't run yet.

Should: Job stays Running until ProfileSummaryWorker completes

3. Missing Integration with Existing AI Services

The Office flow creates its own queue (office-profile) instead of using:

  • Existing document-events queue
  • Existing ProfileSummaryJobHandler

4. Type Mismatch Bugs

UpdateProcessingJobAsync has attribute type mismatches causing worker failures

Recommended Architecture

Option A: Use Existing Infrastructure (RECOMMENDED)

User Saves Email in Outlook
  ↓
POST /api/office/save
  ├─ Upload .eml to SPE
  ├─ Create Document record
  ├─ Create EmailArtifact record
  └─ Publish to EXISTING document-events queue  <-- USE EXISTING
       ↓
       EXISTING ServiceBusJobProcessor picks up
       ↓
       EXISTING ProfileSummaryJobHandler processes
       ↓
       EXISTING IAppOnlyAnalysisService creates Analysis
       ↓
       EXISTING IFileIndexingService indexes if requested

Changes Required:

  1. Remove ProfileSummaryWorker (duplicate)
  2. Keep UploadFinalizationWorker for Office-specific tasks (EmailArtifact, attachments)
  3. UploadFinalizationWorker publishes to document-events queue instead of office-profile
  4. Let existing AI infrastructure handle analysis

Option B: Keep Separate But Fix

Keep office-profile queue and ProfileSummaryWorker
BUT: ProfileSummaryWorker just calls existing IAppOnlyAnalysisService

Changes Required:

  1. Fix type mismatch bug in UpdateProcessingJobAsync
  2. Ensure ProfileSummaryWorker properly calls existing IAppOnlyAnalysisService
  3. Fix job status flow (don't mark Complete until workers finish)

Recommendation

Use Option A - eliminate duplication and use existing proven AI infrastructure.

Why:

  1. Existing ProfileSummaryJobHandler already does what we need
  2. Existing document-events queue already exists
  3. Less code to maintain
  4. Single pipeline = easier debugging
  5. AI enhancements benefit all document sources (not just Office)

Implementation:

  1. UploadFinalizationWorker should publish to document-events instead of office-profile
  2. Remove ProfileSummaryWorker.cs
  3. Remove office-profile queue from Service Bus
  4. Existing ServiceBusJobProcessor + ProfileSummaryJobHandler handles analysis