Data Engineer (expert) 1551

Pretoria, GP, ZA, South Africa

Job Description

We are seeking a hands-on Data Engineer with strong experience in building scalable data pipelines and analytics solutions on Databricks. You will design, implement, and maintain end-to-end data flows, optimize performance, and collaborate with data scientists, analysts, and business stakeholders to turn raw data into trusted insights.



Key Responsibilities:

• Design, develop, test, and maintain robust data pipelines and ETL/ELT processes on Databricks (Delta Lake, Spark, SQL, and Python/Scala notebooks); an illustrative sketch of such a pipeline follows this list.
• Architect scalable data models and data vault / dimensional schemas to support reporting, BI, and advanced analytics.
• Implement data quality, lineage, and governance practices; monitor data quality metrics and resolve data issues proactively.
• Collaborate with Data Platform Engineers to optimize cluster configuration, performance tuning, and cost management in cloud environments (Azure Databricks).
• Build and maintain data ingestion from multiple sources (RDBMS, SaaS apps, files, streaming queues) using modern data engineering patterns (CDC, event-driven pipelines, change streams, Lakeflow Declarative Pipelines).
• Ensure data security and compliance (encryption, access controls) in all data pipelines.
• Develop and maintain CI/CD pipelines for data workflows; implement versioning, testing, and automated deployments.
• Partner with data scientists and analysts to provision clean data, notebooks, and reusable data products; support feature stores and model deployment pipelines where applicable.
• Optimize Spark jobs for speed and cost; implement job scheduling, monitoring, and alerting.
• Document data lineage, architecture, and operational runbooks; participate in architectural reviews and best-practice governance.
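For orientation, below is a minimal PySpark sketch of the kind of Databricks/Delta Lake batch pipeline described above. It is illustrative only: the paths, table, and column names are hypothetical, and it assumes a Databricks notebook where the ambient `spark` session and Delta Lake are available.

```python
# Illustrative sketch only -- names and paths are hypothetical.
# Assumes a Databricks notebook where `spark` (SparkSession) and Delta Lake exist.
from pyspark.sql import functions as F

RAW_PATH = "/mnt/raw/orders/"            # hypothetical ADLS landing zone
TARGET_TABLE = "analytics.orders_clean"  # hypothetical Delta table

# 1. Ingest raw JSON files from the landing zone
raw = spark.read.json(RAW_PATH)

# 2. Basic cleansing plus a simple data-quality gate
clean = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .dropDuplicates(["order_id"])
    .filter(F.col("order_id").isNotNull())
)

dropped = raw.count() - clean.count()
if dropped:
    print(f"Data-quality gate removed {dropped} duplicate or invalid rows")

# 3. Publish to a governed Delta table for BI and analytics
(
    clean.write
    .format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .saveAsTable(TARGET_TABLE)
)
```

In practice a job like this would be parameterized, scheduled via Databricks Jobs, and wrapped in the CI/CD, monitoring, and data-quality practices listed above.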



Qualifications/Experience:

• Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
• 3+ years of hands-on data engineering experience.



Essential Skills:

• Expertise with Apache Spark (PySpark), Databricks notebooks, Delta Lake, and SQL.
• Strong programming skills in Python for data processing.
• Experience with cloud data platforms (Azure) and their Databricks offerings; familiarity with object storage (ADLS).
• Proficient in building and maintaining ETL/ELT pipelines, data modeling, and performance optimization.
• Knowledge of data governance, data quality, and data lineage concepts.
• Experience with CI/CD for data pipelines and orchestration tools (GitHub Actions, Databricks Asset Bundles, or Databricks Jobs).
• Strong problem-solving skills, attention to detail, and ability to work in a collaborative, cross-functional team.



Advantageous Skills:

• Experience with streaming data (Structured Streaming, Kafka, Delta Live Tables); a brief sketch follows this list.
• Familiarity with materialized views, streaming tables, data catalogs, and metadata management.
• Knowledge of data visualization and BI tools (Splunk, Power BI, Grafana).
• Experience with data security frameworks and compliance standards relevant to the industry.
• Certifications in Databricks or cloud provider platforms.
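Similarly, a minimal Structured Streaming sketch of the streaming ingestion mentioned above, reading from Kafka into a Delta table. The broker, topic, table, and checkpoint path are hypothetical; it assumes a Databricks notebook with the Kafka source and Delta Lake available.

```python
# Illustrative sketch only -- broker, topic, and paths are hypothetical.
# Assumes a Databricks notebook with the Kafka source and Delta Lake available.
from pyspark.sql import functions as F

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # hypothetical broker
    .option("subscribe", "orders-events")                # hypothetical topic
    .load()
    .select(
        F.col("key").cast("string").alias("order_id"),
        F.col("value").cast("string").alias("payload"),
        F.col("timestamp").alias("event_ts"),
    )
)

# Append decoded events to a Delta table, with checkpointing for fault tolerance
query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders_events")  # hypothetical
    .outputMode("append")
    .toTable("analytics.orders_events_raw")  # hypothetical streaming table
)
```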



Job Detail

  • Job Id: JD1595882
  • Industry: Not mentioned
  • Total Positions: 1
  • Job Type: Full Time
  • Salary: Not mentioned
  • Employment Status: Permanent
  • Job Location: Pretoria, GP, ZA, South Africa
  • Education: Not mentioned