Principal Machine Learning Engineer - Johannesburg (Hybrid, Permanent)
A senior, hands-on engineering leadership role within an elite Data & AI environment, leading the design and delivery of complex, real-time, production-scale machine learning systems. The role combines deep technical capability with architectural leadership, strategy input, mentorship, and thought leadership.
Key Technical Requirements
Architecture & Engineering
* Proven experience designing end-to-end ML ecosystems, not just building models
* Strong capability in real-time, event-driven, and streaming architecture
* Deep experience building highly scalable, fault-tolerant production systems
* Strong understanding of data governance, robustness, security, and quality engineering
* Extensive experience working with microservices architectures
Data Streaming & Real-Time Processing
* Expert-level hands-on experience with:
  * Kafka
  * Flink
  * Beam (advantageous)
* Strong understanding of real-time ingestion, processing, and event pipelines
Cloud & Infrastructure
* Significant experience architecting and delivering on multi-cloud platforms
* Primary strength required: AWS
* Plus meaningful experience across GCP and Azure
* Expected depth across services such as:
  * Lambda, S3, RDS, DynamoDB, VPC (or equivalents in other clouds)
* Strong containerisation & orchestration capability:
  * Kubernetes
  * Docker
Programming & Software Engineering
* Strong multi-language engineering background, ideally across:
  * Python
  * Go
  * Java
  * C#
  * JavaScript (beneficial)
* Strong CI/CD mindset and modern software engineering best practices
* Comfortable building full applications, APIs, services, and ML-supporting infrastructure
Machine Learning & Data
* Proven production experience with:
  * Structured and unstructured data
  * Real-world deployment of ML solutions (not only experimentation)
* Experience with:
  * Jupyter / notebook-driven environments
  * ML platform tooling (e.g., SageMaker or equivalent)
* Semi-supervised learning approaches useful but not mandatory
* Exposure to:
  * Large Language Models
  * Generative AI and emerging applied AI capabilities
Databases & Storage
* Strong across:
  * SQL databases (e.g., MS SQL, MySQL)
  * NoSQL platforms (e.g., MongoDB)
  * Graph databases (e.g., Neo4j)
Ways of Working
* Experience working in modern Agile environments (Scrum / Kanban)
