Site Reliability Engineer
Company
Egen
Location
West Suburbs
Type
Full Time
Job Description
Egen is a fast-growing and entrepreneurial company with a data-first mindset. We bring together the best engineering talent working with the most advanced technology platforms, including Google Cloud and Salesforce, to help clients drive action and impact through data and insights. We are committed to being a place where the best people choose to work so they can apply their engineering and technology expertise to envision what is next for how data and platforms can change the world for the better. We are dedicated to learning, thrive on solving tough problems, and continually innovate to achieve fast, effective results.
We are seeking a Site Reliability Engineer to ensure system reliability and support our infrastructure. You will be responsible for scalability, performance optimization, incident management, and incident analysis.
Responsibilities:
- Ensure system reliability and application uptime in line with the SLAs
- Monitor system performance metrics and identify ways to optimize the system
- Lead incident management efforts using the available methodology, and document the RCA (Root Cause Analysis), lessons learned, and any SOPs for resolving the issue in the future
- Work closely with DevOps and Application teams to align priorities, share knowledge and drive continuous improvement initiatives
- Prioritize response efforts based on issue severity, potential impact on users, and business priorities
- Evaluate and approve changes to production systems, balancing the need for innovation with the need for stability and reliability
- Optimize resource usage and manage costs by identifying inefficiencies, rightsizing infrastructure resources, and implementing cost-saving measures
What we're looking for:
- 3+ years of SRE experience with Azure and/or AWS
- Bachelor’s degree preferred; relevant experience will be considered as an equivalent
- Programming: Java, Spring Boot, SQL, Bash
- Monitoring: Datadog, Splunk, Grafana
- Docker, Kubernetes, Linux
- Incident/Alerts Management: VictorOps, PagerDuty
- Git, Bitbucket
- Troubleshooting complex, intertwined distributed services
- Attention to detail
- Testing, Monitoring, Logging, Alerting
- Documentation
- Incident Management
Date Posted
11/14/2024