Description:
Must have a good understanding of SRE aspects and be able to guide teams. Strong experience in Azure cloud operations and large-scale distributed systems. Hands-on expertise with observability tools: Dynatrace, New Relic/Datadog, Prometheus, and Grafana. Solid understanding of ITSM processes (incident, problem, change). Strong scripting and troubleshooting skills (Python, Shell) for complex production issues. Implement and maintain disaster recovery, failover mechanisms, and backup strategies. Pro
Feb 28, 2026; from: dice.com