Full-Time Senior Site Reliability Engineer
PandaDoc is hiring a remote, full-time Senior Site Reliability Engineer. The career level for this opening is Senior Manager, and applicants based anywhere in the world are accepted. Read the complete job description before applying.
Job Details
Site Reliability Engineers are critical to PandaDoc's success. We keep PandaDoc services reliable and minimize downtime for our customers. Site Reliability Engineers are the driving force behind smooth system operation and proactively identify bottlenecks.
Responsibilities:
- Build software, frameworks, and tools for reliable PandaDoc service operations.
- Manage stability and operation of critical production applications, including application reviews, capacity planning, and performance tuning.
- Continuously develop automations/tooling for improved platform reliability and availability.
- Collaborate with other engineers and cross-functional teams, fostering strong engineering principles and representing our values.
- Participate in Proof-of-Concept (PoC) projects on new product/platform frameworks.
- Improve observability as a developer/maintainer of systems/frameworks, and as a mentor to product development teams.
Qualifications:
- 3+ years of experience with higher-level languages (e.g., Python or Go).
- Strong experience configuring and maintaining observability tools (e.g., Prometheus, Grafana, Kibana).
- Experience supporting critical production services.
- Strong troubleshooting skills in distributed Linux environments.
- Strong understanding of application, system, and network tracing.
- Strong experience with AWS and Kubernetes.
- Experience with industry-standard DevOps tools (e.g., GitLab, Jenkins, Terraform).
- Strong communication and knowledge-sharing skills related to reliability.
- Proactive ownership and pride in technical and team contributions.