The Data Engineer is the architect and builder of Dis-Chem Life's data foundation, creating the infrastructure that turns raw information into a strategic asset. This role goes far beyond moving data from A to B: it is about designing high-performance, future-proof systems that make data accurate, accessible, and truly powerful.
By developing best-in-class data pipelines, warehouse systems, architecture, and governance frameworks, the Data Engineer enables the entire organisation, from the actuarial, data science, and analytics teams to general operations, to work with clean, structured, and reliable datasets at scale while protecting our customers' data privacy as stipulated in the Protection of Personal Information Act (POPIA).
The role involves solving hard engineering problems: building resilient ingestion frameworks, handling messy and complex source systems, optimising cloud architecture for cost and performance, and ensuring that every downstream user can focus on insight and innovation rather than data wrangling.
The ultimate purpose is to build and continuously evolve a scalable, intelligent data platform that grows with Dis-Chem Life's ambitions, fuels advanced analytics and modelling, unlocks automation, and sets a new benchmark for how data drives customer intelligence and operational excellence in the South African insurance industry.
Summary of the Role
The Data Engineer is responsible for designing, implementing, and maintaining the core technical solutions that keep Dis-Chem Life's data platform running at peak performance. This includes building scalable and resilient data ingestion frameworks, integrating complex source systems, and optimising cloud architecture for both performance and cost efficiency. The role requires deep hands-on experience with modern data engineering tools, ETL/ELT processes, workflow orchestration, and cloud platforms. Strong problem-solving skills, precision, and the ability to collaborate seamlessly with analytics, AI, and automation teams are essential. The Data Engineer continuously drives improvements in data processes and platform efficiency, ensuring the organisation can rely on high-quality, reliable data to make faster, smarter, and more impactful decisions.
Benefits
Competitive salary
Direct and significant influence over how the company's data backbone is built, as we are still in the early stages of development
Exposure to advanced analytics and AI projects with real-world business impact
Access to modern cloud, orchestration, and automation technologies
Hybrid working model with flexibility and autonomy
Interesting datasets spanning health data, customer behaviour, payments, and retail spend
Key Responsibilities
Build & Maintain Data Pipelines, Architecture, and Software
Design, develop, optimise, and monitor scalable ETL/ELT pipelines and warehouse systems.
Implement, monitor, and maintain reporting and analytics software.
Architect robust, future-proof data infrastructure to support advanced analytics, AI, and automation.
Ensure performance, reliability, and security across all data systems.
Ensure Data Quality, Reliability & Accessibility
Implement rigorous data quality validation, monitoring, and governance to guarantee data integrity.
Deliver clean, well-structured datasets that downstream teams can confidently use.
Minimise time spent on data cleaning and wrangling for actuaries, data scientists, and operational BI analysts.
Enable AI, Analytics & Automation
Prepare AI-ready datasets with consistency, scalability, and timeliness in mind.
Collaborate with data scientists to build feature pipelines for predictive modelling.
Support advanced automation workflows and real-time data requirements.
Scale Data Architecture
Design and optimise best-in-class data architecture capable of handling increasing data volumes and complexity.
Leverage cloud-native solutions to enable rapid scaling, flexibility, and cost efficiency.
Continuously enhance data infrastructure performance and reduce operational costs.
Handle Complex Engineering Challenges
Own the technical work of data ingestion, transformation, and orchestration.
Solve challenging engineering problems to allow teams to focus on insights, models, and decisions.
Act as the go-to expert for ensuring data is accessible, accurate, and usable.
Collaboration & Knowledge Sharing
Work closely with analysts, actuaries, and data scientists to understand evolving data needs.
Document data flows, definitions, and system processes to ensure transparency and reusability.
Mentor colleagues and promote best-practice data engineering across the organisation.
Soft Skills
Obsessed with clean, high-quality data and how it drives better models/decisions
Collaborative mindset, thriving at the intersection of engineering and analytics
Strong communicator, able to explain complex engineering choices to non-technical users
Detail-driven but pragmatic, balancing precision with speed in delivery
Curious, innovative, and always seeking ways to improve
Technical Skills
Data Architecture
- designing and implementing scalable, maintainable data systems, defining data flows, and establishing architectural patterns for enterprise-scale solutions
Advanced SQL
- extraction, transformation, and optimisation
Python Programming
- strong skills (pandas, PySpark) for data pipelines and scientific workflows
Big Data Frameworks
- hands-on experience with at least one major framework (Hadoop, Spark, Kafka, Elastic Stack, or Databricks)
Database Expertise
- proficiency across industry-standard types, including relational (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra); understanding of less common types such as time-series (InfluxDB, TimescaleDB) and graph databases (Neo4j)
Data Modelling
- dimensional modelling, normalisation, star/snowflake schemas, and designing efficient data structures for analytical workloads
Data Lifecycle Management
- end-to-end data management including ingestion, storage, processing, archival, retention policies, and data quality monitoring throughout the pipeline
Data Science Integration
- familiarity with feature stores and model-serving pipelines
ETL/ELT Tools
- hands-on experience with tools such as dbt, Windmill, Airflow, and Fivetran
Cloud Platforms
- experience with AWS, Azure, or GCP and modern warehouses (Snowflake, BigQuery, Redshift)
Streaming Data
- knowledge of real-time data processing (Kafka, Spark, Flink)
Infrastructure Management
- experience with Docker, Kubernetes, container orchestration, and managing scalable data infrastructure deployments is advantageous
APIs & Integrations
- understanding of APIs, integrations, and data interoperability
Version Control
- Git and CI/CD practices for production data pipelines
Data Governance
- familiarity with governance and compliance (POPIA, FAIS)
Experience
3-5 years in a Data Engineering or related technical role
Proven ability to deliver clean, scalable pipelines supporting analytics and AI
Hands-on work with cloud-native and warehouse systems
Experience collaborating with Data Science teams to deliver AI/ML-ready datasets
Exposure to regulated industries (insurance/finance) advantageous
Qualifications
Bachelor's degree in Data Engineering, Computer Science, Information Systems, or related field
Cloud certifications (AWS, Azure, GCP) or Data Engineering credentials preferred
Advanced SQL and Python certifications are advantageous