Full-Time Senior Engineer, NSO
Life360 is hiring a remote, full-time Senior Engineer, NSO. The role is at the Experienced career level and is open to remote applicants based in the USA and Canada. Read the complete job description before applying.
Life360
Job Details
About the Job
This role will monitor the day-to-day operations of Life360’s services while also working to improve our overall observability and reporting capabilities. Life360’s system contains dozens of microservices, all of which are candidates for observation, tracking, and reporting. Onboarding new services and maintaining visibility of metrics for existing services are part of your responsibilities. This position is part of a tightly knit team where good communication is a must. All Life360 engineers have on-call responsibilities, which this position both supports and participates in. Responding to alerts as they come in, methodically executing runbooks, and escalating major issues to service owners will be daily events. The ability to quickly understand large systems at scale is critical. Thinking creatively about our system, gaining strong familiarity with the tools we use to maximize their benefits, and finding ways to automate manual tasks are behaviors that will elevate great engineers in this role.
What You’ll Do
- Use tools such as Prometheus, Grafana, and Datadog to create and maintain observability infrastructure and tooling, including alerts, production reporting, and documentation.
- Manage observability infrastructure.
- Serve as a member of “follow the sun” L1 on-call support, working alone or with teammates to answer pages for all onboarded services and resolve or escalate issues in a timely manner.
- Utilize anomaly detection and alerting, respond to alerts in PagerDuty, drive incidents to their conclusion, and lead the effort to strengthen the system based on post-mortem action items.
- Coordinate cross-team and cross-functional efforts with processes, documentation, and tooling to ensure operational excellence.
What We’re Looking For
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
- 5+ years of experience writing, reading, and debugging code in one or more languages such as Java, Python, Shell, or Ruby
- 5+ years of experience working with large-scale distributed systems and managing Linux-based systems in a cloud environment such as AWS
- In-depth experience with large-scale observability and reporting systems (New Relic, Datadog, Elasticsearch, Prometheus, etc.)
- 3+ years of experience with solutions such as Docker, Kubernetes, system virtualization, cloud monitoring, and logging
- 3+ years of experience with IaC and config management tools such as Terraform, CloudFormation, Chef, Ansible, and similar
- Experience working as part of a team, applying analytical and problem-solving skills
- Excellent troubleshooting skills and attention to detail
- Ability to quickly learn new technologies and follow industry trends
- Ability to analyze and optimize high-traffic internet applications