ML Engineer, Convergence AI

Competitive
UK
Senior
Hybrid
Machine Learning
AI
GCP
Terraform
Python
Convergence AI
Tech
B2C
20-50
Full-Time
  The Role  

Responsibilities

  • Design, implement, and maintain our ML-focused cloud infrastructure on GCP using Infrastructure as Code (Terraform)
  • Build and manage HPC clusters with Slurm for distributed ML workloads, focusing on GPU/TPU utilization and job scheduling
  • Develop and maintain ML pipeline automation tools and ML-specific CI/CD workflows in Python
  • Design and optimize data storage solutions for ML datasets, model artifacts, and feature stores
  • Implement comprehensive monitoring, logging, and alerting solutions for ML model performance and infrastructure health
  • Collaborate with ML engineers and data scientists to provide robust infrastructure for model training and deployment
  • Lead and implement security best practices for ML systems, including model security and data protection

Requirements

  • 3+ years of experience in ML infrastructure or ML platform engineering
  • Strong proficiency in Python for ML pipeline automation and tooling
  • Extensive experience with Slurm cluster management for large-scale ML workloads
  • Proven track record with Terraform and Infrastructure as Code for ML environments
  • Solid understanding of GCP's ML-specific services (Vertex AI, AI Platform, etc.)
  • Experience with distributed training systems and model serving infrastructure
  • Experience with ML observability tools and performance monitoring
  • Excellent problem-solving skills with a focus on ML system reliability and optimization

Bonus Qualifications

  • Experience scaling large language model (LLM) infrastructure
  • Knowledge of ML-specific orchestration tools (e.g., MLflow, Ray)
  • Experience with high-performance computing for ML training
  • Contributions to ML infrastructure-related open-source projects
  • Experience with GPU/TPU cluster management and optimization
  • Background in ML operations (MLOps) or AI reliability engineering
  • Familiarity with vector databases and efficient embedding storage/retrieval
  About  
Founded
2014
Equity Available
Visa Sponsorship Available
Certified B-Corp

At Convergence, our mission is to build a future of abundance for all of humanity.

We are developing the first generation of truly general agents that can pick up any skill by actively working and learning from experience.

  Benefits  
  • Be at the cutting edge of AI and LLM technology
  • Work on challenging problems that impact users' daily lives
  • Collaborative and innovative work environment
  • Opportunities for professional growth and learning
  • Competitive salary and benefits package