Full-Time Site Reliability Engineering (Senior)
Promptly Health is hiring a remote, full-time Senior Site Reliability Engineer. The career level for this opening is Senior Manager, and the role is open to remote applicants based in Portugal. Read the complete job description before applying.
Promptly Health
Job Details
Job Description
Design, implement, and maintain highly available and scalable infrastructure using Terraform.
Build and maintain robust monitoring and alerting systems using tools like Prometheus, Grafana, and Alertmanager.
Manage and optimize Kubernetes clusters, ensuring reliability, performance, and security.
Implement best practices for CI/CD pipelines, automating deployments and minimizing downtime.
Work closely with engineering and product teams to ensure seamless integration of new services and technologies.
Troubleshoot and resolve infrastructure-related issues, ensuring high availability and reliability.
Enforce security and compliance best practices across the infrastructure.
Lead initiatives to improve observability, reliability, and performance of systems.
Collaborate with talented engineers to drive DevOps culture.
Stay up to date with industry trends and emerging technologies for continuous improvement.