Full-Time Senior Infrastructure Engineer
TrueML is hiring a remote Full-Time Senior Infrastructure Engineer. The career level for this opening is Senior Manager, and the role accepts remote applicants worldwide. Read the complete job description before applying.
Job Details
Why TrueML? TrueML is a mission-driven financial software company that aims to create better customer experiences for distressed borrowers. Consumers today want personal, digital-first experiences that align with their lifestyles, especially when it comes to managing finances. TrueML's approach uses machine learning to engage each customer digitally and adjust strategies in real time in response to their interactions. The TrueML team includes inspired data scientists, financial services industry experts, and customer experience fanatics. Together they build technology that serves people in a way that recognizes their unique needs and preferences as human beings, and they work to ensure that nobody gets locked out of the financial system.
About the team/role: The DevEx team at TrueML is responsible for improving both our existing AWS infrastructure and our newly designed infrastructure as we modernize and rearchitect the platform. We also drive the continued improvement of DevOps practices as TrueML becomes a cloud-native, DevOps-enabled organization. To make this happen, we work closely with other teams to ensure the scalability and long-term maintainability of the systems and services we build at TrueML to help consumers.
Goal/Impact of work: Our mission is to streamline and enhance the end-to-end software development lifecycle (SDLC) for engineers at TrueML. By providing robust tooling, automated workflows, and improved infrastructure, we empower our teams to innovate more rapidly, operate with greater efficiency, and deliver increased value to our clients. Additionally, by enhancing visibility into system performance, operational costs, and overall business impact, we enable engineers to make more informed, data-driven decisions, which reduces friction, boosts productivity, and fosters a culture of continuous improvement.
Key Responsibilities:
- Develop and manage Infrastructure as Code (IaC) using tools like Terraform.
- Design, implement, and maintain scalable, resilient systems on AWS or other cloud platforms.
- Build, manage, and optimize CI/CD pipelines with tools such as GitHub Actions, ArgoCD, AWS CodePipeline, or Jenkins.
- Deploy and operate Kubernetes clusters, using tools like Helm for configuration.
- Collaborate with development teams to identify bottlenecks in the software development lifecycle (SDLC) and build tools or automation to optimize workflows.
- Create and maintain CLI tools, scripts, and frameworks to simplify processes like infrastructure management, monitoring, and secrets handling.
- Troubleshoot and resolve infrastructure and application issues, focusing on root cause analysis.
- Promote platform security and stability by implementing best practices and designing resilient systems.
- Work cross-functionally to empower teams to manage their services independently by providing robust tools, processes, and documentation.
Required Skills and Experience:
- Deep knowledge of AWS cloud infrastructure and related services.
- Solid understanding of networking fundamentals, including DNS, HTTP, and cloud-based networking.
- Proficiency in a programming language such as Python, TypeScript, or Go.
- Experience with CI/CD processes and tools.
- Hands-on expertise in managing Kubernetes clusters and associated tools.
- Familiarity with Linux fundamentals, including basic troubleshooting and command-line usage.
- Understanding of network security best practices and database management.
- Proven ability to troubleshoot and debug complex systems.
- Experience building tools and automation to optimize software development processes.
- A security-first mindset with experience designing and operating secure systems.