RomuloGatto/opstream_challenge

File Upload & Metadata Extraction Service

This project is a RESTful API that allows users to upload files, stores them in AWS S3, and triggers a process to extract metadata using an AWS Lambda function. The extracted metadata is stored in DynamoDB and can be retrieved via an API endpoint.
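
As a rough illustration of the extraction step, the sketch below shows how a Lambda handler might read the uploaded object's location from its input. It assumes the function is triggered by an S3 "object created" event notification, which this README does not confirm; `extractS3Objects` and the sample event values are hypothetical.

```javascript
// Hypothetical helper: pull bucket, key, and size out of an S3 event notification.
// Assumption: the Lambda is invoked by S3 ObjectCreated events (not confirmed here).
function extractS3Objects(event) {
  const records = Array.isArray(event?.Records) ? event.Records : [];
  return records
    .filter((record) => record.s3)
    .map((record) => ({
      bucket: record.s3.bucket.name,
      // S3 URL-encodes object keys in notifications and encodes spaces as "+".
      key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
      size: record.s3.object.size,
    }));
}

// Sample event shaped like an S3 "ObjectCreated:Put" notification (made-up values).
const sampleEvent = {
  Records: [
    {
      eventName: "ObjectCreated:Put",
      s3: {
        bucket: { name: "example-upload-bucket" },
        object: { key: "uploads/Lorem+Ipsum%281%29.pdf", size: 151112 },
      },
    },
  ],
};

console.log(extractS3Objects(sampleEvent));
```

Decoding the key before calling S3 matters because a file name containing spaces or parentheses would otherwise not match the stored object.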


Prerequisites

  • Node.js (v18 or higher)
  • AWS CLI (configured with appropriate permissions)
  • Terraform (v1.0 or higher)

Deployment

First-Time Setup

The first deployment is a two-step process: the Lambda zip file must be uploaded to the S3 bucket before Terraform can configure the function, so the bucket is created first. Subsequent Lambda code updates also require manual steps (see step 4).

  1. Clone the Repository:

    git clone https://github.com/romulogatto/opstream_challenge.git
    cd opstream_challenge
  2. Install Dependencies:

    npm install
  3. First-Time Workflow:

    • Partial Terraform Deployment:

      cd terraform
      terraform init
      terraform apply -target=aws_s3_bucket.file_upload_bucket -target=aws_dynamodb_table.metadata_table -target=aws_iam_role.lambda_role

      Terraform will output the S3 bucket name where the Lambda zip file should be uploaded.

    • Package and Upload Lambda Function:

      cd lambda/metadataExtractor
      zip -r metadataExtractor.zip index.js node_modules
      aws s3 cp metadataExtractor.zip s3://<your-s3-bucket-name>/lambda/metadataExtractor.zip
    • Complete Terraform Deployment (run from the terraform directory):

      terraform apply
  4. Subsequent Lambda Updates:

    • When updating the Lambda code:
      • Repackage and upload the Lambda function as described in Step 3.
      • Manually redeploy the Lambda function via the AWS Management Console to apply the updates.
  5. Run the Server:

    npm start
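
The manual console redeploy in step 4 could in principle be scripted with the AWS CLI's `aws lambda update-function-code` command. The sketch below is one way to wrap it from Node.js. The function name `metadataExtractor` in the usage comment is a guess (the real name is defined in the Terraform configuration), and `buildUpdateArgs` and `redeployLambda` are hypothetical helpers, not part of this repository.

```javascript
const { execFile } = require("node:child_process");

// Build the argument list for `aws lambda update-function-code`,
// pointing the function at a zip that is already in S3.
function buildUpdateArgs({ functionName, bucket, key }) {
  return [
    "lambda", "update-function-code",
    "--function-name", functionName,
    "--s3-bucket", bucket,
    "--s3-key", key,
  ];
}

// Run the AWS CLI; resolves with its stdout and rejects on a non-zero exit.
function redeployLambda(options) {
  return new Promise((resolve, reject) => {
    execFile("aws", buildUpdateArgs(options), (error, stdout, stderr) => {
      if (error) reject(new Error(stderr || error.message));
      else resolve(stdout);
    });
  });
}

// Usage (assumed function name; replace the bucket placeholder with the Terraform output):
// redeployLambda({
//   functionName: "metadataExtractor",
//   bucket: "<your-s3-bucket-name>",
//   key: "lambda/metadataExtractor.zip",
// }).then(console.log, console.error);
```

This still requires repackaging and uploading the zip first, as in step 3; it only replaces the console step.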

API Endpoints

1. Upload File

  • Endpoint: POST /api/files/upload

  • Request:

    • Form-data:
      • file: The file to upload (required).
      • Any additional fields in the request body will be included as metadata.
    • Example:
      curl -X POST http://localhost:3000/api/files/upload \
          -F "file=@/path/to/file.pdf" \
          -F "author=John Doe" \
          -F "expirationDate=2025-12-31"
  • Response:

    • Success: { "fileId": "unique-file-id" }
    • Error: { "error": "File upload failed" }
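
The same upload can be made from Node.js (v18 or higher) using the built-in `fetch`, `FormData`, and `Blob` globals. This is a minimal client sketch, not code from this repository: `buildUploadForm` and `uploadFile` are hypothetical names, and the error handling assumes only the response shapes listed above.

```javascript
// Build the multipart form: the file goes under "file",
// and every other field is sent along as metadata.
function buildUploadForm(fileBytes, fileName, extraFields = {}) {
  const form = new FormData();
  form.append("file", new Blob([fileBytes]), fileName);
  for (const [name, value] of Object.entries(extraFields)) {
    form.append(name, String(value));
  }
  return form;
}

// POST the form to the upload endpoint and return the new fileId.
async function uploadFile(baseUrl, fileBytes, fileName, extraFields) {
  const response = await fetch(new URL("/api/files/upload", baseUrl), {
    method: "POST",
    body: buildUploadForm(fileBytes, fileName, extraFields),
  });
  const body = await response.json();
  if (!response.ok || body.error) {
    throw new Error(body.error || `Upload failed with status ${response.status}`);
  }
  return body.fileId;
}
```

A call such as `uploadFile("http://localhost:3000", bytes, "file.pdf", { author: "John Doe", expirationDate: "2025-12-31" })`, with `bytes` read via `fs.promises.readFile`, would mirror the curl example.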

2. Retrieve Metadata

  • Endpoint: GET /api/metadata/:fileId
  • Response:
    • Success:
      {
        "expirationDate": "2025-12-31",
        "uploadDate": "2025-01-10T21:14:49.273Z",
        "metadata": {
          "pages": 2,
          "textPreview": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed ullamcorper ante eget nulla sagittis blandit. Cras commodo elit a justo ultrices mollis. Duis tempor dictum nisi, ac lobortis ex aliquam id. Nunc sollicit",
          "fileSize": 151112,
          "fileType": "application/pdf"
        },
        "fileName": "LoremIpsumDolor.pdf",
        "fileId": "13af05cd-5143-4e67-a0dc-763e5c6e972e",
        "author": "John Doe"
      }
    • Error:
      {
        "error": "Metadata not found"
      }
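
A matching client sketch for this endpoint, again using Node's built-in `fetch`. The helper names are hypothetical. Because extraction runs in a separate Lambda function, the metadata may not exist immediately after an upload, so this sketch returns `null` for the "Metadata not found" error rather than throwing, which leaves the caller free to try again later.

```javascript
// Build the metadata URL, escaping the fileId so it stays a single path segment.
function metadataUrl(baseUrl, fileId) {
  return new URL(`/api/metadata/${encodeURIComponent(fileId)}`, baseUrl).toString();
}

// Fetch metadata for a file; resolves to null when it is not (yet) available.
async function getMetadata(baseUrl, fileId) {
  const response = await fetch(metadataUrl(baseUrl, fileId));
  const body = await response.json();
  if (body.error === "Metadata not found") return null;
  if (!response.ok || body.error) {
    throw new Error(body.error || `Request failed with status ${response.status}`);
  }
  return body;
}

// Usage:
// getMetadata("http://localhost:3000", "13af05cd-5143-4e67-a0dc-763e5c6e972e")
//   .then((metadata) => console.log(metadata ?? "not available yet"));
```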

Testing

To run all tests:

npm test

Notes:

  • Ensure that Terraform has been applied before running integration tests, as these require the AWS infrastructure to be active.
  • Unit tests will work independently of the AWS infrastructure.

Challenges and Improvements

  1. Two-Step Terraform Workflow:

    • The initial deployment requires manual steps for uploading the Lambda zip file and completing the configuration.
  2. Lambda Redeployment:

    • Subsequent updates to the Lambda code require manual redeployment through the AWS Console.
  3. Enhanced Error Handling:

    • Improve logging and retry mechanisms for better fault tolerance when interacting with AWS services.
  4. CI/CD Integration:

    • Automate the packaging, uploading, and redeployment of the Lambda function using CI/CD pipelines.
  5. Scalability:

    • Optimize the metadata extraction process to handle larger file sizes and more complex file types.
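
As one possible shape for the retry mechanism mentioned in item 3, the sketch below wraps any async operation in exponential backoff. It is not part of the current code base; `withRetry`, `backoffDelay`, and the default timings are illustrative assumptions. Note that the AWS SDK has its own retry settings, so a wrapper like this is mainly useful for calls the SDK does not cover.

```javascript
// Delay before the next attempt: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 200, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Run `operation`, retrying up to `retries` times after a failure.
async function withRetry(operation, { retries = 3, baseMs = 200, maxMs = 5000 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= retries) throw error;
      const delay = backoffDelay(attempt, baseMs, maxMs);
      console.warn(`Attempt ${attempt + 1} failed (${error.message}); retrying in ${delay} ms`);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage (uploadToS3 is a placeholder for any async call):
// const result = await withRetry(() => uploadToS3(file), { retries: 5 });
```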
