Staff Site Reliability Engineer

Agiloft • Remote

Company

Agiloft

Location

Remote

Type

Full Time

Job Description

As the most trusted global leader in data-first contract lifecycle management (CLM) software, Agiloft helps organizations manage the end-to-end process of proposing, negotiating, signing, and leveraging contracts using our flexible Data-first Agreement Platform (DAP). With contract data as the foundation, customers quickly and collaboratively reach agreement and leverage contract visibility to thrive with competitive advantage. Employing powerful, pragmatic artificial intelligence as a legal force multiplier, and robust integration capabilities as a data liberator, organizations around the world trust Agiloft’s certified implementers to deliver connected, intelligent, and autonomous solutions across the entire contract lifecycle.


Top analysts like Gartner, Forrester, and IDC agree, all recognizing Agiloft as a leader in the CLM space. Our no-code platform is easily managed and administered by business users, which is why Agiloft is the contract you keep: nearly 100% of new customers are satisfied with their initial implementations, and some 97% of customers renew every year. Ours is a growing, vibrant, successful company at the forefront of a market that is becoming a must-have for all organizations.


We believe that the way to build the strongest, most vibrant place to work is to bring in individuals from all walks of life, and to support them in bringing their authentic selves to their day, every day. Our working philosophy is that “EX = CX”: when employee experience is excellent, so is customer experience. We support multiple Employee Resource Groups (ERGs), and offer a working environment that supports healthy work/life balance, including floating holidays and a quarterly, no-questions-asked wellness day.


Position Overview


As a Staff Site Reliability Engineer (SRE), you will be responsible for developing and implementing highly reliable and scalable systems. You will work closely with different functional teams to create a stable, efficient, and scalable environment, leading complex projects that require collaboration with multiple stakeholders.

Job Responsibilities

  • Define and enforce SRE best practices and standards.
  • Architect and implement highly reliable and scalable systems.
  • Lead complex post-incident reviews and implement systemic improvements.
  • Collaborate with product and engineering teams to set reliability targets.
  • Manage high-impact incidents and coordinate incident response.
  • Contribute to budget planning and resource allocation.
  • Lead efforts to establish disaster recovery strategies.
  • Provide technical leadership and mentorship to the SRE team.
  • Continuously track and improve metrics (for example, DORA) to optimize software delivery and operational performance.
  • Participate in on-call rotation.
  • Other duties as assigned.

Required Qualifications

  • 8-10 years of experience in a similar or related role
  • Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent experience)
  • In-depth knowledge of Cloud Ops technologies including Amazon Web Services (AWS) and Terraform or other Infrastructure as Code (IaC)
  • Advanced knowledge in Linux operating systems and troubleshooting OS issues
  • Expertise in setting up and managing monitoring tools (such as Prometheus, Grafana, Datadog, Nagios, OpenTelemetry, ELK, or similar tools)
  • In-depth understanding of monitoring and alerting systems and of networking principles (such as load balancing, CDN, and disaster recovery)
  • Strong understanding of:
      • Incident management
      • Capacity planning
      • Disaster recovery
      • Observability practices (in tools such as OpenTelemetry and Jaeger)
  • Advanced experience with or knowledge of security measures and practices (for example, threat modeling, compliance, and secure coding practices)
  • Strong analytical and problem-solving skills
  • Knowledge of Linux systems and common system administration tasks
  • Strong understanding of programming/scripting languages (such as Python), including scripting skills in multiple languages to automate SRE operations
  • Excellent communication and teamwork skills
  • A willingness to learn and adapt in a fast-paced, dynamic environment

Preferred Qualifications

  • Familiarity with DevOps practices, Infrastructure as Code (IaC) tools, and Agile methodologies is a plus

Ensuring a diverse and inclusive workplace is our priority. We are committed to an environment of acceptance where you are free to bring your full self to work. All employment decisions at Agiloft are based on business needs, job requirements, and individual qualifications without regard to race, color, religion or belief, national or social ethnic origin, sex, age, sexual orientation, gender identity and/or expression, parental status, marital status, Veteran status, or any other status protected by the laws or regulations in the locations where we operate. If you have a need that requires accommodation during the recruiting process, please let us know by contacting Director, Talent Acquisition, Brad Toothman at [email protected].

 

Applicants from underrepresented groups such as minorities, veterans, or individuals with disabilities are encouraged to apply.


Applications will be reviewed as submitted. There will be no application deadline for this opportunity.

Date Posted

01/24/2025
