Full-Time Senior Site Reliability Engineer
Daxko is hiring a remote, full-time Senior Site Reliability Engineer. The career level for this opening is Senior Manager, and the role accepts remote applicants based in Birmingham, AL. Read the complete job description before applying.
Daxko
Job Details
Senior Site Reliability Engineer - Linux is a role for the motivated coder/hacker/engineer who wants to solve problems at the root cause, in an elegant and sustainable way.
In this position, you will be an instrumental part of our TechOps team, which exists to build and support the foundational tools that our product teams use to build products our customers love and trust.
We care deeply about our delivery pipeline being simple, reliable, consistent, and fast.
You will be successful if you have a deep love for automation, building scalable systems, embracing new technologies, and sharing with teammates.
Your responsibilities include:
- Supporting all Daxko software offerings and integrated third-party tools
- Collaborating on cases escalated to TechOps Support and building long-term, automated solutions for recurring cases
- Identifying and resolving technical debt items that could make other engineers more efficient
- Coordinating with agile development teams, DBAs, implementation, and support to ensure the production environment is healthy and stable
- Identifying repetitive tasks and automating them (spinning up new environments, deployments, etc.)
- Building, supporting, and administering all aspects of Daxko's continuous product delivery pipeline
- Working with core components such as load balancers, firewalls, etc.
- Making it painless for product teams to develop, test, deploy, and monitor by providing clear, documented frameworks around our operational systems
- Executing our disaster recovery plan and ensuring it is up to date and thoroughly tested
- Mentoring team members as a subject-matter expert
- Monitoring system activity 24x7 as part of an on-call rotation
- Troubleshooting system jobs and services that fail, and working with core development teams as needed to ensure operational stability and efficiency
Requirements:
- Bachelor's degree in a technical discipline or equivalent experience
- 5+ years of experience
- Extensive experience with automation tools (Terraform, Chef, or Ansible)
- Scripting experience with Python, Ruby, Bash
- Experience with modern Git repository platforms (GitHub, Bitbucket, GitLab)
- Experience with CI/CD technologies (Jenkins, GitLab CI)
- Problem-solving skills and attitude
- Ability to work independently and as part of a team
- Advanced understanding of Linux, networking, and Internet principles
- Fantastic attention to detail
- Ability to prioritize and work well under pressure
- Effective interpersonal skills (written and oral)
- Strong understanding of internet technologies (DNS, SNMP, HTTP, TCP/IP, CDNs)
- Strong understanding of serverless technologies (AWS Lambda)
- Experience with virtualization and cloud technologies (VMware, AWS)
Preferred Experience:
- Experience with containers and orchestration (Docker, Kubernetes, Rancher, EKS, ECS)
- Experience with monitoring technologies (LogicMonitor, Instana, New Relic, Rapid7, CloudPassage, etc.)
- Experience working tickets and managing priorities within issue tracking systems (Jira, etc.)
- Experience with modern web technologies (HTML5, CSS3, AJAX, jQuery, etc.)
- Experience developing or supporting C#, Java, or PHP applications
- General knowledge of relational databases (MySQL, MSSQL preferred)
- Experience supporting NoSQL and caching systems (Redis, MongoDB, DynamoDB, ElastiCache, etc.)
- Understanding of event-driven architecture and related systems (Kafka, Kinesis, SNS, Redshift)