Established in 2001, RSAWEB is South Africa's fastest-growing internet service provider (ISP), with a focus on providing connectivity to home customers and a wide array of technology solutions to businesses. We are obsessed with ensuring that all our customers receive the best possible digital experience and exceptional customer service. Thousands of customers have given RSAWEB a 5-star rating, and our average rating of 4.7 out of 5 on Google makes us the best-rated ISP in South Africa. We are extremely proud of winning Best ISP in KFM's Best of the Cape Awards in 2021 and 2022, of being one of the fastest streaming ISPs on Netflix, and of being a consistently top-rated ISP on MyBroadband. These accolades are not for nothing, as we constantly strive to improve our products, services, and solutions to enhance each customer's experience. Having invested heavily in infrastructure, RSAWEB has built a strong presence in South Africa with Data Centres in Johannesburg and Cape Town.
Our Products and Services:
Fibre-to-the-Home (FTTH)
Fibre-to-the-Business (FTTB)
Enterprise connectivity
Mobile connectivity and data management
Cloud infrastructure and more!
At RSAWEB, we are passionate about using our creativity to provide innovative solutions and services that allow our customers to succeed in all areas of life. We believe that we are in the business of connecting customers and businesses with each other, and with a world of infinite possibility and opportunity, through technology. Our mission transcends our values through every customer, every interaction, every connection, every day.
Our values:
We Build Trust and Ownership
We Honour & Respect People
We Cultivate Passion & Creativity
We Innovate Feverishly
We Go the Extra Mile
We Believe in Humility
We Communicate Openly & Honestly
We Make it Fun
We Teach, Grow & Learn
We Do More, With Less
As an SRE Tech Lead, you will lead a team responsible for ensuring the reliability, availability, and scalability of critical systems and services. You will collaborate closely with software development, infrastructure, and product teams to design, implement, and optimize services and systems. This role focuses on leading efforts to build, monitor, and scale infrastructure, solve complex operational challenges, and improve system performance.
Key Responsibilities:
Leadership & Mentorship:
Lead and mentor a team of SREs to build and maintain highly reliable and scalable systems.
Provide technical guidance and drive the adoption of best practices in SRE methodologies (incident management, monitoring, disaster recovery, etc.).
Collaborate with cross-functional teams (Software Engineering, Network Engineering, and Cloud Operations) to ensure the best outcomes.
System Reliability:
Design and implement solutions to increase system reliability and operational efficiency.
Lead the creation and maintenance of service-level indicators (SLIs), service-level objectives (SLOs), and service-level agreements (SLAs).
Implement robust monitoring and alerting solutions to detect and address issues proactively.
Incident & Problem Management:
Oversee and coordinate incident response, ensuring fast and effective resolution of critical issues.
Conduct post-mortems and drive continuous improvement to prevent future incidents.
Participate in on-call rotation and ensure the on-call team is equipped to respond to incidents.
Automation & Efficiency:
Automate repetitive operational tasks and processes to reduce manual intervention and increase reliability.
Develop and maintain infrastructure-as-code (IaC) solutions using tools like Terraform, Ansible, or similar.
Implement CI/CD pipelines and improve development workflows for better system deployment and rollback strategies.
Capacity Planning & Optimization:
Ensure systems are designed to handle capacity and scaling requirements based on usage trends.
Optimize system performance.
Research and Development:
Lead research and development initiatives within the SRE team to keep teams informed of emerging technologies.
Document new technologies and their potential applications within an internal technology roadmap.
Requirements
Qualifications
Degree/diploma in Computer Science, Engineering, IT, or related field.
Industry certifications (AWS, Azure, GCP, Kubernetes, DevOps) are advantageous.
Technical Skills
Strong experience with Linux systems and distributed systems.
Deep understanding of cloud platforms (AWS/Azure/GCP).
Strong scripting/programming skills (Python, Go, Bash).
Experience with:
+ CI/CD (GitHub Actions, GitLab CI, Jenkins)
+ Infrastructure as Code (Terraform/Ansible)
+ Kubernetes & container orchestration
+ Monitoring/observability tools
+ Networking concepts (DNS, load balancing, routing)
+ SQL/NoSQL databases
Solid knowledge of security, performance optimisation, and system design.
Experience
5-8+ years in SRE, DevOps, Cloud Engineering, or Platform Engineering.
2+ years leading technical teams or driving technical direction.
Experience in large-scale, high-availability environments.
Benefits
Medical Aid (Discovery)
Reduced Gap Cover Rates (Turnberry Premier)
Retirement Annuity Contribution (Allan Gray)
Medical Insurance (Momentum - Health4Me)
Discounted Internet Connectivity
Free Employee Wellness Programme (Lyra Wellbeing, formerly ICAS)
Exposure to the latest industry technologies and standards
Lastly, a work environment that rivals the very best!
If you have not heard from us within 2 weeks of submitting your application, please consider your application unsuccessful.