9 changes: 7 additions & 2 deletions .dockerignore
@@ -1,2 +1,7 @@
*
!packages/
.venv/
__pycache__/
*.pyc
.git/
dist/
build/
.pytest_cache/
45 changes: 45 additions & 0 deletions .github/workflows/ci-tests.yml
@@ -0,0 +1,45 @@
name: CI - MarkItDown tests

on:
  push:
    branches: [ "chore/dev-docker-tests" ]
  pull_request:
    branches: [ "main" ]

jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout source
        uses: actions/checkout@v4

      - name: Setup Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"

      - name: Install system deps (ffmpeg, exiftool)
        run: |
          sudo apt-get update -y
          sudo apt-get install -y ffmpeg exiftool

      - name: Install project deps
        run: |
          python -m pip install -U pip
          pip install -e 'packages/markitdown[all]'
          pip install -e packages/markitdown-sample-plugin
          pip install pytest

      - name: Run tests (skip network-heavy)
        env:
          PYTEST_ADDOPTS: -k "not convert_url and not convert_http_uri"
        run: |
          # without pipefail the step reports tee's exit status and hides test failures
          set -o pipefail
          pytest -q packages/markitdown/tests | tee last-run.txt

      - name: Upload test summary
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: pytest-summary
          path: last-run.txt
47 changes: 20 additions & 27 deletions Dockerfile
@@ -1,33 +1,26 @@
FROM python:3.13-slim-bullseye
# Debian image (not Alpine); supports onnxruntime and magika
FROM python:3.11-slim
WORKDIR /work

ENV DEBIAN_FRONTEND=noninteractive
ENV EXIFTOOL_PATH=/usr/bin/exiftool
ENV FFMPEG_PATH=/usr/bin/ffmpeg
# install compilers and system dependencies
RUN apt-get update && apt-get install -y \
build-essential git ffmpeg libxml2 libxslt1-dev \
libffi-dev libssl-dev python3-dev \
&& rm -rf /var/lib/apt/lists/*

# Runtime dependency
RUN apt-get update && apt-get install -y --no-install-recommends \
ffmpeg \
exiftool
# copy the repository code
COPY . .

ARG INSTALL_GIT=false
RUN if [ "$INSTALL_GIT" = "true" ]; then \
apt-get install -y --no-install-recommends \
git; \
fi
# install markitdown and pytest
RUN pip install --no-cache-dir -U pip \
&& pip install --no-cache-dir -e 'packages/markitdown[all]' pytest

# Cleanup
RUN rm -rf /var/lib/apt/lists/*
# set the package directory and run the tests
WORKDIR /work

variable files
pre-commit

# no CMD: Docker would append its arguments to the ENTRYPOINT below, passing a stray "pytest -q" to pytest

WORKDIR /app
COPY . /app
RUN pip --no-cache-dir install \
/app/packages/markitdown[all] \
/app/packages/markitdown-sample-plugin
# ensure the variable can be supplied from outside without breaking the entrypoint
ENV PYTEST_ADDOPTS=""

# Default USERID and GROUPID
ARG USERID=nobody
ARG GROUPID=nogroup

USER $USERID:$GROUPID

ENTRYPOINT [ "markitdown" ]
# run the main package's tests; the relative path requires WORKDIR=/work
ENTRYPOINT ["pytest", "-q", "packages/markitdown/tests"]
36 changes: 36 additions & 0 deletions README.md
@@ -246,3 +246,39 @@ trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.

## Local test setup (Docker + WSL)

Build:
```bash
docker build -t markitdown-tests .
```

Run (skip the remote-URL tests to avoid HTTP 429 rate-limit responses):
```bash
docker run --rm -it -v "$(pwd):/work" \
-e PYTEST_ADDOPTS='-k "not convert_url and not convert_http_uri"' \
markitdown-tests
```
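
If the list of tests to skip grows, the `-k` expression can be generated instead of typed by hand. The following is a minimal sketch, not part of the repository: `build_skip_filter` is a hypothetical helper name, and it assumes every argument is a plain test-name substring with no spaces.

```bash
#!/usr/bin/env bash
# Hypothetical helper: join the given names into a pytest -k expression
# of the form "not a and not b and not c".
build_skip_filter() {
  local expr="" name
  for name in "$@"; do
    if [ -n "$expr" ]; then
      expr="$expr and "
    fi
    expr="${expr}not ${name}"
  done
  printf '%s\n' "$expr"
}

# Same two names the commands above skip.
PYTEST_ADDOPTS="-k \"$(build_skip_filter convert_url convert_http_uri)\""
printf '%s\n' "$PYTEST_ADDOPTS"
```

The resulting value can then be passed to the container with `-e PYTEST_ADDOPTS="$PYTEST_ADDOPTS"`.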

## Installation and how to run tests (without Docker)

Requirements:
- Python 3.11 (recommended)
- pip
- gcc / build-essential (for compiled deps)
- ffmpeg and exiftool available on PATH (some tests rely on them)
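
A quick way to check these requirements before installing is to look each tool up on `PATH`. This is a sketch under assumptions: `missing_tools` is a hypothetical helper name, and the executable names (`python3`, `pip`, `gcc`, `ffmpeg`, `exiftool`) are the usual ones on Debian/Ubuntu and may differ on other systems.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every command in "$@" that is not found on PATH,
# one name per line. Prints nothing when all of them are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Executable names assumed from the requirements list above.
missing="$(missing_tools python3 pip gcc ffmpeg exiftool)"
if [ -n "$missing" ]; then
  printf 'Missing from PATH:\n%s\n' "$missing" >&2
else
  echo "All required tools found."
fi
```

The check only reports; it does not verify versions, so Python 3.11 still has to be confirmed separately with `python3 --version`.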

Setup:
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e 'packages/markitdown[all]'
pip install -e 'packages/markitdown-sample-plugin'
```

Run tests:
```bash
pytest -q packages/markitdown/tests -k "not convert_url and not convert_http_uri"
```
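
To keep a local record similar to the `last-run.txt` artifact that the CI workflow uploads, the output can be piped through `tee` and the final summary line extracted afterwards. A minimal sketch: `summary_line` is a hypothetical helper, and it assumes that in `-q` mode the last non-empty line of the output is the pass/fail summary. The sample log below is made up for illustration only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the last non-empty line of a saved pytest log.
summary_line() {
  awk 'NF { last = $0 } END { if (last != "") print last }' "$1"
}

# Illustration with a made-up log; in real use the file would come from
#   pytest -q packages/markitdown/tests | tee last-run.txt
log="$(mktemp)"
printf '....\n4 passed in 0.10s\n\n' > "$log"
summary_line "$log"
rm -f "$log"
```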