Lead Site Reliability Engineer

Boston, MA 02109

Posted: 02/15/2024
Employment Type: Contract To Hire
Job Category: Cloud & DevOps
Job Number: 613152
Country: United States
Is job remote?: Yes

Job Description

Location: 100% Remote
Hourly Rate: up to $80/hr DOE
Benefits: Health, Dental, Vision and more

Scope of Position:
Planet Technology's client in the healthcare tech industry is seeking a Lead SRE to join their team. As Lead SRE, you will play a critical role in their hosting department. The hosting department ensures the telehealth platform operates securely, efficiently, and in compliance with healthcare regulations, facilitating the delivery of digital health services to patients and healthcare providers.

The team integrates SRE (Site Reliability Engineering), DevOps, Software Development, and Information Security expertise to create an agile and elastic Cloud infrastructure. This approach ensures the seamless and secure operation of the client's telehealth services. Specifically, you will design, build, and maintain scalable and resilient systems using Infrastructure as Code and automation.

Job Qualifications:
  • 10 or more years of experience operating infrastructure in a 24/7 environment
  • 4 or more years of programming experience with at least one modern language such as Python, Java, or Bash
  • 10 or more years of experience with web application architecture
  • 10 or more years of development experience working with IaC tools like Terraform and CDK
  • 10 or more years of hands-on experience building and troubleshooting Kubernetes from the ground up
  • 10 or more years working with Cloud computing providers, with a solid understanding of AWS concepts, services, and tools
  • 10 or more years of experience managing Linux-based systems
  • Well-versed in TCP/IP networking concepts
  • Technical and non-technical communication skills

Job Responsibilities:
  • Build, manage, and troubleshoot large-scale production Kubernetes clusters, including deep hands-on work with the control plane
  • Design and build infrastructure & systems that provide high levels of scalability, reliability, and performance, while balancing security, maintainability, and operational excellence
  • Automate the deployment and management of infrastructure on public cloud platforms
  • Write high-quality, standardized scripts, configurations, and tests to increase platform reliability
  • Design and build the cloud infrastructure platform using Infrastructure as Code
  • Ensure compliance with security and data handling policies
  • Create and maintain technical documentation
  • Collaborate with product developers to ensure successful deployment and hosting of products
  • Lead investigations of projects, defining and outlining their scope, and documenting findings
  • Provide clear instructions to the team based on the project scope and requirements, ensuring documentation in the ticketing system
  • Take full leadership responsibility for the hands-on execution of complex projects, actively contributing to project work
  • Play a pivotal role in project planning, actively participating in the development of detailed Methods of Procedure (MOP) documents
  • Foster collaboration within the team, ensuring a comprehensive understanding of project requirements
  • Actively contribute to the development and enhancement of team processes and best practices
  • Provide mentorship and guidance to SRE team members, coaching them in making informed technical decisions
  • Track and reduce architectural, process, and technical debt for the SRE toolset
  • Participate in the on-call rotation