Building high-performance AI/ML infrastructure at scale
AI Infrastructure Engineer with 7+ years of experience designing and operating high-performance cloud infrastructure for ML/AI workloads at scale. Proven expertise in architecting GPU-accelerated compute environments, Kubernetes-based ML platforms, and MLOps/CI-CD pipelines supporting model training, deployment, and inference.
Deep technical background in AWS cloud infrastructure, container orchestration (EKS), distributed systems, and Python/Go automation for AI/ML frameworks (PyTorch, TensorFlow). Successfully deployed production AI systems serving real-time inference, RAG pipelines, and LLM integrations.
NVIDIA-certified in Generative AI LLMs with comprehensive knowledge of GPU infrastructure, AI model optimization, and enterprise-scale ML operations. Demonstrated ability to collaborate with data science, engineering, and cross-functional teams.
A comprehensive toolkit of AI/ML infrastructure technologies, frameworks, and tools I use to build production-grade machine learning systems at scale.
Retrieval-Augmented Generation pipelines with pgvector
GPT-4, Claude, Llama model serving & integration
GPU-accelerated compute environments & scheduling
Model training, deployment, and inference pipelines
ML model training and deployment at scale
Deep learning research and production models
Machine learning and neural networks
LLM application development framework
Transformers and model hub integration
Vector similarity search for AI applications
EC2, EKS, S3, Lambda, SageMaker, ECS, RDS
EKS, GPU scheduling, Helm, RBAC, CRDs
Infrastructure as Code for multi-cloud
Container optimization & multi-stage builds
GitHub Actions, Jenkins, GitLab CI, ArgoCD
AI/ML metrics, GPU monitoring, alerting
ML dashboards and observability
Configuration management and automation
FastAPI, Flask, Pandas, NumPy, async
High-performance CLI tools & microservices
Infrastructure automation scripting
Real-time data streaming at scale
Event-driven architectures
RDS, Aurora, query optimization
Caching and session management
Interested in learning more about my technical expertise?
A track record of building AI/ML infrastructure, deploying production AI systems, and leading technical teams across healthcare, biotech, and consumer technology companies.
Proposed reference architectures for production ML/AI infrastructure, demonstrating how I approach platform design across different environments and constraints.
A proposed production-grade MLOps platform for autonomous robotics workloads on bare-metal GPU clusters. Leverages SchedMD Slinky to unify Kubernetes and Slurm scheduling on a shared GPU pool, eliminating resource silos and enabling seamless orchestration of training, simulation, and inference workloads.
A structured approach to ramping up as an AI Infrastructure Engineer, from discovery through delivery and strategic roadmap.
Focus on understanding the current infrastructure, ML workflows, team dynamics, and organizational priorities. Identify quick wins while building institutional knowledge.
This plan is adaptable based on organizational maturity, team size, and immediate priorities. The core philosophy: listen first, deliver quick wins, then architect the future.
Validated expertise across AI/ML, cloud platforms, infrastructure as code, and container orchestration. Click any badge to verify on Credly.
GPU infrastructure & LLM deployment
AI infrastructure design & management
ML model training & deployment
AI/ML services on AWS
Cluster management & operations
Application development on K8s
Infrastructure as Code
Distributed systems design
AWS application development
Ready to discuss your next cloud project or explore opportunities to work together? I'd love to hear from you.
Whether you're looking for a cloud engineer to join your team, need consultation on AWS infrastructure, or want to collaborate on AI/ML projects, I'm always open to discussing new opportunities and challenges.
© 2026 Martial D. Bah Bioh. All rights reserved.