Site Reliability Engineer
Company
Anywhere365
Location
South Africa
Type
Full Time
Job Description
Founded in 2010 in the Netherlands, Anywhere365 is a global leader in Enterprise Dialogue Management, with a vision to ensure every employee and customer feels heard, understood, and valued. With around 240 employees working from 22 different countries, we partner with over 2000 leading enterprises, including Mazda, the UN International Organization for Migration, Adecco Group, and the University of Cape Town, to deliver exceptional customer experiences through the power of Microsoft Teams and AI-driven insights. Our commitment to innovation, customer focus, and accountability drives our success.
We are looking for a highly skilled and driven Site Reliability Engineer (SRE) to join our team, with a strong emphasis on communications technologies, cloud operations, and system performance. This role requires expertise in monitoring, alerting, anomaly detection, automation, security, and performance tuning across our critical communications platforms. You will be responsible for the reliability, availability, and performance of services such as SIP, Skype for Business, and Azure Communication Services (ACS). Your role will also focus on optimizing resource utilization, managing costs, and ensuring disaster recovery and business continuity (BCP/DR).
Main responsibilities:
- Develop and maintain real-time monitoring and alerting systems using tools like Prometheus, Grafana, and the ELK stack to ensure system health and performance.
- Identify and resolve anomalies and bottlenecks proactively, reducing downtime through automated detection and alert mechanisms.
- Automate infrastructure provisioning, scaling, and patching using tools like Terraform and Azure DevOps across Kubernetes, Windows, and Linux environments.
- Build self-healing systems and leverage Kubernetes operators, CI/CD pipelines, and event-driven automation to improve reliability.
- Analyze and optimize system performance for latency-sensitive services, including VoIP, video, and messaging.
- Implement cloud cost optimization strategies, such as using Reserved Instances, rightsizing virtual machines, and leveraging Azure Cost Management tools.
- Strengthen system security by enforcing best practices for hardening, vulnerability patching, and incident management in collaboration with security teams.
- Design and execute robust disaster recovery plans, ensuring fault-tolerant architectures and reliable backup and restore strategies.
Why we would like to have a dialogue with you
We pick competencies over skills and experience. Can you convince us that you possess the following competencies:
- Communication: The ability to communicate clearly and effectively with individuals across the organization and to be responsive to their needs and concerns.
- Action-oriented: The ability to act quickly and decisively, even in the face of uncertainty, to move projects forward and achieve business goals.
- Commitment: The ability to consistently meet or exceed the quality standards expected by stakeholders.
- Collaboration: The ability to work effectively with others and to build strong relationships based on trust and mutual respect, recognizing that everyone has something to contribute.
- Taking ownership: The ability to take full responsibility and accountability for tasks, projects, or actions, demonstrating a sense of commitment and dedication towards achieving desired outcomes.
Competencies are key, but to be successful in this role, you need to bring a few essentials to kickstart the conversation:
Key skills & experience:
- 5+ years of experience as an SRE, Systems Engineer, or in a similar role, with a focus on communications technologies.
- Proven experience with cloud platforms, with a strong focus on Azure, and experience with Azure resource patching for Kubernetes (AKS) and VMs (Windows and Linux).
- Experience with Microsoft Teams/Skype for Business and Azure Communication Services.
- Strong understanding of SIP, VoIP, and related protocols.
- Strong understanding of networking concepts and experience with Cisco networking technologies (e.g. routers, switches, firewalls).
- Experience with scripting languages (e.g. Python, PowerShell, Bash), infrastructure-as-code tools (e.g. Terraform, Helm, Pulumi), and automation tools.
- Experience with network performance monitoring tools (e.g. Wireshark, tcpdump) is a plus.
Some last notes:
We are in the process of establishing a legal entity in South Africa, and you will be employed directly by us once it is finalized. In the interim, we expect you to work as a contractor.
Currently, this role is remote. However, we are also planning to open an office in South Africa, where you will be expected to work on-site four days a week once it is operational.
As this position involves supporting regions such as Europe and the US, we require flexibility in working shifts to cover these time zones. Additionally, occasional weekend work will be expected. Rest assured, you will be compensated for any irregular hours worked.
Anywhere365 is committed to creating a diverse environment and is proud to be an equal-opportunity employer. We accept difference, and we thrive on it for the benefit of our employees, our products, and our community.
Please note that we have a background check policy. The background check differs per country and position. If you would like to know more, the Talent Acquisition Specialists are happy to answer any questions!
Date Posted
01/27/2025