Senior Software Engineer-MLOps & Observability

Warsaw, hybrid

CloudFerro is a provider of cloud data processing services. It provides and supports cloud computing for specialized markets, including the European space industry, climate research and science.

It has extensive experience in storing and processing large data sets, including multi-petabyte Earth observation satellite data repositories. CloudFerro solutions are used by leading European companies and scientific institutions across market sectors that process large data sets: the European Space Agency (ESA), EUMETSAT, the European Centre for Medium-Range Weather Forecasts (ECMWF), Mercator Ocean International, the German Aerospace Center (DLR), EGI, and many others.

We are currently looking for people who want to help us build a sovereign European cloud platform from the ground up, creating a real alternative to global hyperscalers. We are building a complete PaaS stack based on Kubernetes and open source technologies, from serverless and managed databases to MLOps, monitoring, and observability. By joining us, you become part of a team designing the foundations of the platform rather than developing an already‑existing product. We work in a full‑ownership model, where every architectural decision stays with the product for years. This is a place for people who want to genuinely influence the shape of the European cloud.

Due to our dynamic development, we are seeking talented individuals to join our team as a Senior Software Engineer (MLOps & Observability).

Role

In this role, you will develop the MLOps and Observability offering for our cloud platform. You will be responsible end‑to‑end for Kubeflow, Airflow, Jupyter Notebooks, and the monitoring platform, from the Kubernetes layer to APIs and documentation. You will work in a "you build it, you own it" model, defining how customers train models, build data pipelines, and monitor their applications. This is a role with real impact on the direction of the entire ML platform.

Your responsibilities

  • Full ownership of MLOps and Observability: responsibility for Kubeflow, Airflow, Jupyter Notebooks, and Observability as a Service in a "you build it, you own it" model, from architecture and code to APIs, documentation, and operational stability.
  • Developing the monitoring platform: building observability services for customers on top of the foundation prepared by Platform DevOps / SRE Engineers.
  • End‑to‑end delivery: from operator/integration, through API endpoints and CLI commands, to Terraform resources and documentation.
  • Integration with shared services: connecting your services with IAM, billing, and quota systems developed by the Common Services team.
  • Co‑creating technical standards: working with other Product Engineers on the platform’s direction and best practices.

Requirements

  • Experience with Kubernetes: independently building or operating complex systems running on K8s.
  • Distributed systems design: practical ability to create scalable, reliable services.
  • Proficiency in Go and good knowledge of Python: building operators, controllers, APIs, and CLI tools.
  • Integrations and APIs: ability to build API wrappers and integrations with REST/gRPC frameworks.
  • AI‑assisted development: comfortable using tools like Claude Code or Copilot as a natural part of daily work.

Nice to have

  • Experience with ML platforms: Kubeflow, JupyterHub, GPU scheduling on Kubernetes.
  • Knowledge of Airflow or other workflow orchestration tools.
  • Observability stack: Prometheus/Thanos, Grafana, Loki, OpenTelemetry.
  • Multi‑tenant monitoring: exposing metrics and logs per customer.
  • Terraform providers: creating custom resources.
  • Experience at a cloud provider: working in a cloud‑services environment.
  • Open‑source contributions, especially within the CNCF ecosystem.

Why join us

  • Building a cloud platform: you help create a modern ecosystem from scratch, supporting the European space industry, climate research, and scientific projects with real impact.
  • Autonomous team: you join a newly formed group operating like a startup within a stable, large organization with access to resources smaller companies cannot offer.
  • CNCF technologies in production: you work with Kubernetes, Knative, Cilium, Argo CD, Kubeflow, and other open‑source projects, with opportunities to contribute back.
  • Real influence: you have a voice in architectural and product decisions, and your ideas truly shape the platform’s direction.
  • AI‑native workflow: we use tools like Claude Code daily, not as an experiment, but as an integral part of our work.
  • Autonomy and stability: we offer significant freedom in organizing your work, transparent collaboration principles, and stable employment with competitive compensation.
  • Benefits package: medical care, multisport, life insurance.
  • Language classes available.

Want to know more?

Feel free to contact our recruitment team:

Małgorzata Duda

Recruitment Expert

mduda@cloudferro.com

LinkedIn

Wioletta Dobkowska

Recruitment Expert

wdobkowska@cloudferro.com

LinkedIn

Software Developers

Software Development

We create and develop advanced distributed systems that provide services for large-scale processing of the satellite data we collect and catalogue in vast quantities. For these systems to work efficiently, we need services to retrieve, describe, and record this data.