Senior GPU Engineer

  • $150k-$220k
  • ID: 4297
  • Posted: 06.11.25

 Senior GPU Engineer – Decentralised Compute Infrastructure – Global Remote

Plexus are working with one of the most exciting teams in the decentralised computing space, with a focus on enhanced scalability, efficiency, cost and security.

With a recent eight-figure raise, they are looking to onboard a hands-on Senior GPU Engineer into the team.

Responsibilities

  • Design and manage multi-tenant GPU clusters using Kubernetes, Slurm, or similar platforms.
  • Develop schedulers and resource-sharing tools to optimize GPU utilization and efficiency.
  • Build autoscaling systems for training and inference workloads (Ray, Run:AI, Volcano, Kubeflow).
  • Implement monitoring and observability for GPUs, network, and job performance (Prometheus, Grafana, OpenTelemetry).
  • Profile and optimize compute throughput and cost efficiency (NCCL, CUDA, ROCm, GPUDirect, RDMA, InfiniBand).
  • Collaborate on high-throughput I/O and data pipelines (Lustre, Ceph, S3, NVMe-oF, Alluxio).

Requirements

  • 2+ years managing distributed GPU infrastructure in production.
  • Strong experience with Kubernetes or Slurm, and Linux systems.
  • Skilled in Python/Go/C++, automation, and infrastructure-as-code (Terraform, Helm).
  • Familiar with CUDA/NCCL/ROCm, Ray/Run:AI/Volcano, and high-speed networking (InfiniBand, RoCE).
  • Knowledge of AI storage systems (Lustre, Ceph, S3) and performance optimization.
  • Excellent collaboration and communication skills across technical teams.

Offer/Benefits

  • Up to $220k base 
  • Fully remote, work-from-anywhere 

Sound like you, or someone you know? Please apply.