Full-Time Site Reliability Engineer
SGS is hiring a remote Full-Time Site Reliability Engineer. The career level for this opening is Experienced, and the role is open to remote applicants based in Winnipeg, Calgary, or Toronto, Canada. Read the complete job description before applying.
SGS
Job Details
The Site Reliability Engineer will ensure the reliability, supportability, scalability, and performance of .NET stack applications (ASP.NET MVC, Angular, Web API).
Partner with development and product operations teams to understand application requirements and translate them into operational practices.
Design, implement, and maintain infrastructure automation tools using Infrastructure as Code (IaC) methodologies.
Monitor application health and performance metrics, proactively identifying and resolving potential issues.
Implement incident response procedures to ensure timely resolution of outages and service disruptions.
Establish and improve best practices for product solution design, architecture, and development. Participate in peer and team code reviews, and develop comprehensive coding standards and guidelines that promote consistency, maintainability, and quality in software development.
Collaborate with engineers to develop and implement disaster recovery plans.
Continuously improve monitoring and alerting processes to ensure efficient problem identification and resolution.
Stay updated on the latest advancements in .NET infrastructure and SRE best practices.
Must Haves:
Actively applies architectural and coding best practices and conducts code reviews.
Strong understanding of supportability and maintainability KPIs.
Strong development background in C#/.NET.
Experience with at least one programming language.
Senior-level understanding of Azure cloud services and Azure DevOps.
Deep knowledge of at least one of: Azure Pipelines, Releases, Ansible, Puppet, or Chef.
Bachelor's degree required.
3+ years of experience in a related technical role (Systems Administrator, Network Engineer).
Experience with configuration management tools (Ansible, Puppet, Chef) preferred.
Azure experience required.
Familiarity with monitoring and alerting tools (.NET performance counters, Azure Application Insights, Prometheus, Grafana) preferred.
Ability to manage and coordinate multiple projects.
Strong understanding of system administration principles, including operating systems (Windows Server preferred) and networking concepts.
Ability to work independently and as part of a team.