Full-Time Lead Site Reliability Engineer (AZURE)
Hitachi Solutions is hiring a remote, full-time Lead Site Reliability Engineer (AZURE). The career level for this opening is Expert, and the role accepts Greenville-based applicants working remotely. Read the complete job description before applying.
This job was posted 7 months ago and is likely no longer active. We encourage you to explore more recent opportunities on our site, although you may still apply using the 'Apply Now' link below.
Company: Hitachi Solutions
Job Title: Lead Site Reliability Engineer (AZURE)
Posted: 7 months ago
Job Type: Full-Time
Career Level: Expert
Locations Accepted: Greenville
Salary: $142,500 - $198,750 per year
Job Details
This is a full-time role for an expert in systems design with considerable skill in large-scale software development in an Azure development environment.
Responsibilities include:
- Designs and implements CI/CD tooling (GitHub Actions / Azure DevOps, etc.)
- Defines and implements build and test pipelines for containerized architectures.
- Develops infrastructure as code (IaC) for stateful deployments.
- Implements Role-Based Access Control (RBAC).
- Implements linting and other code quality controls.
- Manages SaaS deployment APIs.
- Assists in the design, engineering, development, planning, and administration of Azure Kubernetes Service (AKS) clusters.
- Works closely with application, engineering, security, and operations teams to engineer and build Kubernetes and Azure PaaS & IaaS solutions.
- Focuses on availability, latency, performance, efficiency, monitoring/observability, emergency response, capacity planning, and SLO/SLI/Error Budget management.
- Analyzes, troubleshoots, and resolves operational challenges.
- Manages site stability, performance, and reliability.
- Develops a fully automated multi-environment observability stack.
- Strives for automation to reduce toil and increase development velocity.
- Performs application-specific production support, incident management, and change management.
- Identifies changes for product architecture related to reliability, performance, and availability.
- Analyzes and addresses complex technical challenges and issues.
- Creates and maintains technical documentation (design specifications, user guides, run books, etc.).
- Looks for opportunities to improve system availability and performance.
- Collaborates with software development teams, product managers, and other engineers.
- Participates in Agile ceremonies.
- Mentors engineers and fosters a culture of continuous learning.
- Stays up to date with the latest technologies, tools, and cloud computing practices.
- Collaborates with customers for support and feedback.
- Triages incoming Web Support escalations.
- Contributes to incident root cause analysis and service restoration.
- Serves as an incident commander during outages.
- Implements and participates in best practices for CI/CD.
- Designs, develops, and maintains infrastructure using IaC tools and technologies.
Requirements include:
- Strong background in SRE supporting 24x7 highly available production environments.
- Solid experience with monitoring/APM/observability tools.
- Strong background with Azure resources such as Key Vault, Data Factory, Azure Databricks, and Storage Accounts.
- Experience implementing observability plans around logs, metrics, and traces.
- Experience developing software in an agile development team.
- Experience with cloud infrastructure environments (preferably Azure) and infrastructure as code (Terraform, Bicep, ARM).
- Strong experience with containerization technology and/or Kubernetes.
- Experience with release automation, system administration, and configuration management.
- Experience with programming languages (Python, Go, etc.).
- Strong understanding of Linux, Windows, software development, systems, networking, and cloud concepts.
- Strong interpersonal and teamwork skills to set and enforce process.
- Strong analytical and programming skills.
- Bonus: Experience with MLflow and other MLOps pipeline technology.
FAQs
What is the last date for applying to the job?
The deadline to apply for Full-Time Lead Site Reliability Engineer (AZURE) at Hitachi Solutions is 9th of May 2025. We consider jobs older than one month to have expired.
Which locations are accepted for this remote job?
This job accepts Greenville-based applicants.