We are looking for a Monitoring Engineer to work with our SRE and Development teams. The role involves reviewing and optimising our monitoring and tooling to improve how we work and increase availability by proactively finding trends.
Responsibilities:
Own the monitoring of our applications and systems
Work with other teams to ensure that the tooling and processes around monitoring are fit for purpose
Coach team members on how to effectively monitor services, systems and even business metrics
Facilitate communication and effective collaboration
Help with the standardisation and automation of monitoring across multiple environments
Regularly review costs to ensure that we are not over-spending
Help build a productive environment where team members ‘own’ their product end to end
Preferred Skills & Experience:
In-depth knowledge of monitoring tools. We use Datadog for all of our monitoring.
Experience working with both cloud and physical datacentres.
Experience working with tools to help set up and manage on-call rotations. We use Opsgenie.
Knowledge of how to integrate different agents and monitoring libraries for improved visibility.
Understanding of metrics and logs, and how to correlate them.
Excellent communication and leadership skills.
Problem-solving and conflict-resolution ability.
Outstanding organisational skills.
Experience in an agile environment.