Job Details

Manager, Site Reliability Engineer

Avalon Administrative Services LLC

Tampa, Florida, United States

(on-site)

Posted 21 hours ago

Job Function

Other

Description

About Avalon Healthcare Solutions:

Avalon Healthcare Solutions is the nation’s leader in diagnostic intelligence, uniquely focused on transforming the role of diagnostic testing across the healthcare ecosystem. Our proprietary Diagnostic Insights Platform delivers evidence-based policies, curated lab networks, and real-time analytics that simplify complex diagnostics, accelerate innovation adoption, and optimize diagnostic investments.

Supporting over 30 health plans and 44 million members nationwide, Avalon partners with payers and providers to ensure diagnostic testing is performed appropriately, efficiently, and at the right time. Our flexible solutions span routine and genetic testing management, automated adherence, and end-to-end diagnostics support, driving measurable value, reduced waste, and improved clinical outcomes.

With unmatched scientific rigor, deep clinical expertise, and a performance-based model, Avalon is redefining how diagnostics power personalized care and healthcare value.
Learn more at www.avalonhcs.com.

You will be part of a team that shapes a new market and business. Most importantly, you will help Avalon achieve its mission and improve clinical outcomes and healthcare affordability for the people we serve.

For more information about Avalon, please visit www.avalonhcs.com.

Avalon Healthcare Solutions, and its affiliates, is an equal opportunity employer. This position description is subject to change at any time. As determined by the company based upon business needs, an employee in this position may be required to perform duties and take responsibility for work other than as described in this document.

About the Manager, Site Reliability Engineer:

We are seeking a hands-on Manager of Site Reliability Engineering (SRE) to lead a team of 5–8 SREs responsible for ensuring the availability, scalability, and performance of our production and cloud infrastructure. This role is central to driving operational excellence, establishing modern observability practices, and maturing our cloud security posture within AWS.

This position is eligible for remote work, but quarterly travel to Avalon's corporate office in Tampa, Florida, is required.

Manager, Site Reliability Engineer – Essential Functions and Responsibilities:

Key Responsibilities:

Leadership & People Management

Lead, mentor, and grow a high-performing team of SREs through coaching, training, goal-setting, and performance feedback.
Foster a culture of operational excellence and continuous improvement.
Collaborate closely with development, security, and product teams to balance reliability with feature delivery velocity.

Incident Response & Reliability Operations

Own the end-to-end incident response process, including on-call management (PagerDuty), escalation handling, and RCA facilitation.
Establish and enforce SLIs, SLOs, and error budgets in alignment with business priorities.
Lead regular game days, failover tests, and resilience reviews.

Observability & Performance

Implement and maintain end-to-end monitoring, alerting, and observability using tools such as CloudWatch, Prometheus, and Grafana.
Drive visibility into system health, performance, and capacity trends through dashboards and metrics.
Collaborate with development teams to optimize service performance and latency.

Infrastructure & Automation

Oversee infrastructure operations and deployment pipelines leveraging AWS (ECS/Fargate, EC2, Lambda, RDS, CloudFront, ALB/NLB).
Manage container orchestration, ensuring secure and efficient image management, scaling, and deployments.
Advance Infrastructure as Code (IaC) practices using Terraform and CloudFormation.
Drive automation across environment provisioning, CI/CD, and compliance checks.

Security & Compliance

Partner with Security to implement and monitor AWS security controls (IAM least privilege, KMS, Secrets Manager, GuardDuty, Config, Security Hub, and Control Tower).
Ensure adherence to compliance frameworks (e.g., SOC 2, ISO 27001, HIPAA).
Conduct vulnerability remediation and security posture reviews across cloud and container environments.

Continuous Improvement

Define and report on SRE metrics (MTTR, MTBF, change failure rate, incident frequency).
Champion service reliability reviews and architecture improvements to reduce toil and improve resilience.
Stay abreast of emerging AWS services and SRE best practices to evolve the platform.

Key Technologies

Cloud: AWS (ECS/Fargate, EC2, RDS, Lambda, CloudWatch, IAM, CloudTrail, VPC, S3, Route53)
Containers: Docker, ECS/Fargate, ECR
Monitoring/Observability: Prometheus, Grafana, OpenTelemetry, AWS CloudWatch
Incident Management: PagerDuty, Opsgenie, Slack integrations
CI/CD & IaC: GitHub Actions, Jenkins, Terraform, Harness
Security & Compliance: AWS Security Hub, GuardDuty, IAM, KMS, Config, CloudTrail, vulnerability scanning tools

Manager, Site Reliability Engineer – Minimum Qualifications:

7-10 years of experience in SRE, DevOps, or Infrastructure Engineering, and 2+ years in a leadership or management role.
Bachelor of Science (4-year) degree in a technical field such as engineering or computer science, or extensive relevant work experience.
Strong background in AWS operations, containerized workloads, and modern observability stacks.
Experience leading incident response programs and implementing operational runbooks.
Proven track record of automating infrastructure and enforcing security best practices.
Excellent communication and cross-functional leadership skills.

Manager, Site Reliability Engineer – Preferred Qualifications:

Exposure to financial or healthcare compliance and audit frameworks (PCI, SOC 2, ISO 27001, or HITRUST).
Familiarity with chaos engineering and capacity planning methodologies.
Exposure to Snowflake and data/ETL.

Job ID: 81284071
