Job Summary

Key Responsibilities:

Monitoring and Alerting: Implementing and maintaining monitoring systems to track system health and performance, alerting on symptoms rather than just outages.

Incident Response: Responding to and resolving production incidents, troubleshooting across the entire stack, and providing…