arnaudlemaignen/README.md

👋 Hi, I’m Arnaud

Lead Principal Software Engineer | DevOps & SRE Expert | System Reliability & Observability

With 20+ years of hands-on experience in software engineering, infrastructure, and system reliability, I build and maintain high-availability, scalable systems, with deep expertise in observability, CI/CD, and modern cloud architecture.

Since I was a kid, I have been curious about technology. I started by typing BASIC commands on a Commodore VIC-20 and later built my first PC with my older brother. These early experiences gave me a strong interest in both hardware and software, even when mistakes (like formatting a hard drive or burning a processor) taught me valuable lessons.

I have always been interested in numbers, metrics, and KPIs. This focus on measurement naturally led me to my first observability project at my current company more than 10 years ago, using early versions of Grafana, Prometheus, and Collectd with Ansible on bare metal. Later, I helped migrate our large monolithic system to Kubernetes, writing Dockerfiles and Helm charts, which led me to embrace DevOps and SRE practices at the core of my work.

Today, AI and ML are essential tools that help software engineers develop faster and more reliably. They are also valuable for sizing and forecasting, which I have applied to improve the resource usage model of my company’s applications.


🛠 My Strengths & Tech Stack

Here are the areas I excel in and the tools I use regularly:

| Domain | Technologies & Tools |
|---|---|
| Cloud & Infrastructure | AWS, Kubernetes, Docker, Terraform, Ansible |
| Reliability, Observability & Monitoring | Prometheus, Grafana, ELK, Logging / Tracing / Metrics stacks |
| CI/CD & Automation | GitHub Actions, GitLab CI/CD, GitOps, Flux, Jenkins, automated testing |
| Backend & Systems Engineering | Go, Java, Python, React.js, Bash, REST & gRPC services, microservices, event-driven systems |
| DevOps / SRE Practices | Incident response, on-call best practices, SLA/SLO/SLI design, resilience, performance tuning |
| FinOps | AWS Billing, dimensioning/sizing engineering, cost optimization |
| DevSecOps | Kyverno, Falco, SAST |
| Storage | SAN, NAS, Ceph, FSx, EFS, EBS |

📌 Pinned Repositories

Here are some repositories I’m proud to showcase:

  • vpr-exporter – FinOps tool to optimize infra cost by aligning requests with real usage.
  • resource-model-exporter – ML tool to generate a model of resource consumption.
  • service-availability-exporter – A Prometheus exporter that aggregates and exposes service availability metrics.
  • grafana-dashboards – Grafana observability dashboards for many different areas (using vpr-exporter/node-exporter based on Prometheus and AWS datasources).

🌟 Minor Contributions & Projects

Here are some of my contributions to the observability ecosystem:

  • Grafana – which I have loved since version 2.0!
  • Jenkins Exporter – to collect metrics from Jenkins CI.
  • Logstash Exporter – to collect metrics from Logstash.

📈 GitHub Statistics

Top Langs Arnaud’s GitHub Stats


🔭 What I’m Working On / Interested In

  • Evolving reliability practices: service level objectives (SLOs), error budgets, resilience engineering
  • Deepening involvement in observability stack integrations (OpenTelemetry, tracing, distributed logging)
  • Architecture for large-scale systems: multi-region, microservices, serverless, fault tolerance, HPC
  • Cloud security best practices: IAM, Infrastructure Security, Data Protection
  • Appetite for AI / MLOps: infrastructure resource usage prediction
  • Mentoring / sharing knowledge: writing about best practices, contributing to open source, speaking

📫 Let’s Connect

LinkedIn


My two mantras from Lord Kelvin:

“To measure is to know.”
“If you cannot measure it, you cannot improve it.”

Thanks for dropping by. Feel free to explore my repos, open issues / PRs, or reach out if you want to collaborate.

Pinned

  1. resource-model-exporter Public

    Resource Model Exporter is a Prometheus exporter that collects resource usage and dimensioning metrics, then finds a model and exposes it as metrics.

    Go 2 1

  2. grafana-dashboards Public

    This repository showcases custom Grafana dashboards for monitoring and visualizing key metrics in various areas.

    1

  3. vpr-exporter Public

    The Vertical Pod Recommender (aka VPR) is a microservice that recommends CPU and memory requests/limits. Unlike KRR and VPA, the VPR is JVM-aware and computes the memory requests/limits.

    Go 1

  4. grafana/grafana Public

    The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many mo…

    TypeScript 71k 13.2k

  5. jenkins-exporter Public

    Forked from simplesurance/jenkins-exporter

    Export Jenkins Build Metrics to Prometheus

    Go 1

  6. service-availability-exporter Public

    A Prometheus exporter that aggregates and exposes service availability metrics

    Go 1