Matillion: Using the Matillion ETL platform for data integration.
Cloud Data Warehouses: Familiarity with cloud data warehouses like Snowflake, AWS Redshift, or Google BigQuery.
Key Responsibilities:
Design & Develop Data Pipelines: Build and optimize scalable, reliable, and automated ETL/ELT pipelines using AWS services (e.g., AWS Glue, AWS Lambda, Redshift, S3) and Databricks.
Cloud Data Architecture: Design, implement, and help maintain data infrastructure in AWS, ensuring high availability, security, and scalability. Work with lakehouses, data lakes, data warehouses, and distributed computing.
DBT Core Implementation: Lead the implementation of DBT Core to automate data transformations, develop reusable models, and maintain efficient ELT processes.
Data Modelling: Build efficient data models to support analytics and reporting requirements.
Optimize Data Workflows: Monitor, troubleshoot, and optimize data pipelines for performance and cost-efficiency in cloud environments. Utilize Databricks for processing large-scale data sets and streamlining data workflows.
Data Quality & Monitoring: Ensure high-quality data by implementing data validation and monitoring systems. Troubleshoot data issues and create solutions to ensure data reliability.
Automation & CI/CD: Implement CI/CD practices for data pipeline deployment and maintain automation for monitoring and scaling data infrastructure in AWS and Databricks.
Documentation & Best Practices: Maintain comprehensive documentation for data pipelines, architectures, and best practices in AWS, Databricks, and DBT Core. Ensure knowledge sharing across teams.
Skills & Qualifications:
Required:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
4+ years of experience as a Data Engineer or in a similar role.
Extensive hands-on experience with AWS services (S3, Redshift, Glue, Lambda, Kinesis, etc.) for building scalable and reliable data solutions.
Advanced expertise in Databricks, including the creation and optimization of data pipelines, notebooks, and integration with other AWS services.
Strong experience with DBT Core for data transformation and modelling, including writing, testing, and maintaining DBT models.
Proficiency in SQL and experience designing and optimizing complex queries for large datasets.
Strong programming skills in Python/PySpark, with the ability to develop custom data processing logic and automate tasks.
Experience with Data Warehousing and knowledge of OLAP and OLTP concepts.
Expertise in building and managing ETL/ELT pipelines, automating data workflows, and performing data validation.
Familiarity with CI/CD concepts, version control (e.g., Git), and deployment automation.
Experience working in an Agile project environment.
Preferred:
Experience with Apache Spark and distributed data processing in Databricks.
Familiarity with streaming data solutions (e.g., AWS Kinesis, Apache Kafka).