Full-Time Site Reliability Engineer
Blackpoint Cyber is hiring a remote, full-time Site Reliability Engineer. This is an experienced-level role open to remote applicants based in Australia. Read the complete job description before applying.
Job Details
Blackpoint Cyber is the leading provider of world-class cybersecurity threat hunting, detection and remediation technology. Founded by former NSA experts, Blackpoint Cyber is in hyper-growth mode.
Job Overview: We're seeking a passionate and experienced Site Reliability Engineer to join our high-impact team. You'll lead in designing, building, and scaling robust infrastructure, CI/CD pipelines, and build systems.
Key Responsibilities:
- Design, build, and maintain highly scalable infrastructure using Terraform and Terragrunt to automate cloud resource provisioning.
- Manage and optimize AWS cloud environments for cost-efficiency, security, and high availability.
- Manage and scale Kafka and Confluent Cloud platforms.
- Deploy and maintain Redis instances for caching.
- Implement robust monitoring and alerting systems using Prometheus, Grafana, Alertmanager, and Opsgenie.
- Troubleshoot complex system issues, ensuring optimal performance and uptime.
- Manage Kubernetes clusters using tools like Helm, ArgoCD, Istio, and Kustomize.
- Enable feature flag management and controlled rollouts.
- Work closely with development teams to seamlessly integrate new features and services.
- Foster continuous improvement by evaluating and adopting emerging SRE tools and best practices.
Skills & Qualifications:
- 4+ years of SRE experience
- Strong problem-solving and communication skills
- Expertise in Infrastructure as Code (IaC) using Terraform and Terragrunt
- Deep knowledge of AWS services
- Experience with Confluent Cloud and Kafka
- Strong experience with Redis and OpenSearch/Elasticsearch/ChaosSearch
- Proficiency in monitoring and alerting using Prometheus, Grafana, and Alertmanager
- Experience managing Kubernetes clusters, package management with Helm, deployment with ArgoCD, and service mesh configurations using Istio
- Familiarity with Kustomize
- Development experience in Node.js, Python, or Go
Nice to Have:
- Multi-cloud experience (GCP, Azure)
- Security and compliance knowledge
- Familiarity with serverless architectures and CI/CD tools (Jenkins, GitHub Actions)