What are we looking for?
Senior MLOps Engineer – LLM Infrastructure & Performance Engineering
San Sebastián (Spain) | Hybrid
About the opportunity
We are partnering with a fast-growing deep-tech company working at the intersection of large-scale AI systems, high-performance computing, and next-generation model optimization.
The team focuses on building production-grade infrastructure for advanced AI models, solving real-world challenges where performance, scalability, and efficiency are critical.
This is a highly technical environment, ideal for engineers who want to operate close to the limits of modern GPU systems and large-scale ML workloads.
Role mission
You will take ownership of the infrastructure layer powering large-scale LLM training and inference, ensuring models are not only state-of-the-art but also efficient, scalable, and production-ready.
This role sits at the intersection of systems engineering and machine learning, with a strong focus on performance optimization and distributed systems.
What you’ll be doing
Design and scale distributed training pipelines for large language models.
Optimize GPU utilization, memory usage, and training efficiency.
Build and improve high-throughput inference systems for LLM serving.
Implement advanced techniques to reduce latency and maximize throughput.
Orchestrate workloads across cloud and on-premise environments.
Define best practices for model lifecycle (training → deployment → monitoring).
Perform deep performance analysis across the full stack (from low-level GPU to application layer).
Drive engineering standards, mentor team members, and contribute to technical decisions.
What makes you stand out?
What we’re looking for
5+ years of experience in MLOps, DevOps, or backend/systems engineering.
Proven experience working with LLM infrastructure or large-scale ML systems.
Strong expertise in:
PyTorch ecosystem
GPU computing (CUDA, distributed training)
Experience with modern LLM tooling (training or inference).
Solid background in distributed systems and performance optimization.
Strong programming skills in Python (C++/Rust is a plus).
Experience deploying systems in cloud or hybrid environments.
Fluent English.
Strong plus if you have
Experience with:
High-performance inference systems
Model optimization (quantization, distillation, compression)
HPC environments or large-scale clusters
Familiarity with orchestration tools (Ray, Slurm, etc.)
Experience with Kubernetes and containerized workloads
Contributions to open-source ML infrastructure
Experience with observability and monitoring systems
Key competencies
Systems thinking and performance mindset
Strong problem-solving ability in complex environments
Ownership and autonomy
Ability to operate in high-performance teams
Curiosity for cutting-edge AI infrastructure
What do we offer?
Why this role stands out
Work on LLM infrastructure at scale, not toy problems
Direct impact on real-world AI systems used in production
Highly technical environment with top-tier engineers
Ownership of critical systems and architecture decisions
Fast-paced, high-growth deep-tech setting
Relocation & lifestyle
The company will fully support you with all administrative, legal, and logistical processes to ensure a smooth relocation to San Sebastián.
Initial housing support will be provided, including temporary accommodation and assistance with your home search.
A unique opportunity to settle in one of Spain's most desirable cities, offering an exceptional quality of life with beaches, world-class gastronomy, surf, and a vibrant cultural scene.