Site Reliability Engineer
Company
O.C. Tanner
Location
Salt Lake City, UT
Type
Full Time
Job Description
O.C. Tanner develops employee recognition and rewards programs that help companies appreciate people who do great work. As part of that effort, we build large-scale, international web and mobile applications with millions of users, used by Fortune 500 companies.
We are looking for a Site Reliability Engineer who can help enhance the availability of our services. With a focus on uptime and performance, you'll shape areas like infrastructure orchestration, continuous integration and deployment, security, and configuration/change management, and you'll participate in incident and problem management. With a passion for innovation and an instinct for reliability and scalability, you will make a long-lasting impact on the organization.
If you are passionate about quality, operations, everything cloud, and automation, care strongly for others, and enjoy contributing to best-of-breed technologies, you may have found a great home with O.C. Tanner. The position is ideal for a self-starter and quick learner with a love of people, infrastructure, and automation who enjoys collaborative work on leading-edge technologies.
Responsibilities:
- Own SolarWinds discoveries and identify outliers
- Troubleshoot existing scripts written in PowerShell, Linux shell, and similar languages
- Design and develop automation to achieve self-healing systems
- Create and implement state-of-the-art system and customer experience monitoring solutions
- Own SRE application health and ensure apps are up-to-date and resourced appropriately
- Drive internal improvements and projects to maximize proactive monitoring and alerting
- Coordinate with partners across the organization to monitor critical systems
- Provide rotating on-call support
- Schedule and drive forward-thinking discussions to optimize monitoring/logging tools architecture
Job Requirements
- 3+ years of experience as an SRE or in a related role applying DevOps principles
- Strong interest in analyzing and troubleshooting highly-available services with a distributed architecture
- Strong understanding of Linux systems, networking, and security fundamentals
- Experience with major cloud providers like AWS and GCP
- Experience maintaining infrastructure as code using tools like Terraform, Chef, or Puppet
- Experience building and maintaining pipelines with CI/CD tools
- Experience with monitoring services and tools like Datadog, New Relic, Grafana, Prometheus, or equivalent
- The ability to explain your ideas clearly, give and receive feedback, and work well with team members
- Pride in fostering stellar uptime and in contributing your unique views and original ideas
- Kubernetes experience preferred
Date Posted
12/17/2023