Full-Time Site Reliability Engineer
Vantaca is hiring a remote, full-time Site Reliability Engineer. The position is at the Experienced career level and is open to remote applicants based in the USA. Read the complete job description before applying.
Job Details
Vantaca's Vision: Leading community management performance software, enabling improved business performance for owners, operators, community management teams, and associations.
Modern Cloud Architecture: Adaptable single-platform design for 100% of business processes, proactive reporting, and integration with preferred software and banking partners.
Team Culture: A collaborative, visionary team focused on results, a nurturing environment that helps people adapt to change, and a customer-centric approach.
Role Overview: Champion the integration of monitoring practices into software development and operations. Collaborate with developers, operations, and system experts to ensure system reliability, performance, and availability throughout the software development lifecycle.
Key Responsibilities:
- Design and implement monitoring measures into the software development and deployment pipeline.
- Conduct monitoring assessments, availability testing, and system performance reviews to improve reliability.
- Collaborate with development and operations teams to automate alerting processes and integrate monitoring tools into CI/CD.
- Provide guidance and support for incident response and contribute to incident response plan development.
- Stay informed about monitoring tools, technologies, and best practices, proposing innovative solutions to enhance reliability.
Expectations: Maintain operational compliance, implement site controls, and ensure monitoring practices align with industry standards and service level objectives.
Personal Characteristics: Always growing, team player, accountable, committed to customer experience.
Requirements:
- Collaborate with development and operations to embed monitoring in DevOps.
- Implement monitoring best practices for cloud and containerized applications.
- Conduct comprehensive analysis of events.
- Improve monitoring insights across environments.
- Diagnose performance constraints.
- Define and maintain monitoring policies.
- Provide incident response expertise.
- Stay current with industry trends.
Qualifications:
- Bachelor's degree in a relevant field or equivalent experience.
- Proven experience in site reliability or a similar role.
- Strong understanding of monitoring principles and best practices.
- Experience with monitoring tools and performance analysis.
- Proficiency in scripting and automation (e.g., PowerShell, Python, Bash).
- Ability to query and analyze data.
- Passion for diagnostic analysis and troubleshooting.
- Familiarity with cloud environments and disaster recovery frameworks.
- Understanding of version control (Git).
- Experience with Agile and DevOps methodologies.