Senior Site Reliability Engineer
Location: Hybrid (Alpharetta, GA – 3 days/week in office)
Type: Full-Time
We are seeking a Senior Site Reliability Engineer to join our team and play a key role in enhancing the stability, performance, and reliability of our production systems. You’ll work closely with development, DevOps, and security teams to improve observability, optimize system performance, and ensure production readiness. From monitoring to automation, you’ll make a direct impact on our cloud infrastructure and service reliability.
In this role, you will work hand-in-hand with our development, operations, and security teams worldwide to implement best practices, automate deployments, and ensure our platforms are reliable, secure, and scalable. Troubleshooting in Kubernetes requires a deep understanding of pods, nodes, networking, scaling, logs, and service-to-service communication.
This role requires a deep understanding of SRE best practices and a strong ability to troubleshoot complex issues.
Your responsibilities in this role will include:
- Maintain and enhance monitoring tools (New Relic, Graylog) for service health and performance metrics.
- Implement and maintain high-availability systems with capacity planning, performance optimization, and fault tolerance.
- Define and monitor Service Level Indicators, Objectives, and Agreements with teams.
- Deploy and manage Kubernetes workloads to AWS EKS(A) using Helm and ArgoCD.
- Automate operational processes to reduce manual interventions.
- Manage Kubernetes workloads on AWS EKS for secure and stable deployments.
- Participate in on-call rotation, troubleshoot production issues, and implement permanent fixes.
- Work with DevOps to improve CI/CD pipelines and with development teams to embed resilience and observability.
- Document operational runbooks, escalation procedures, and production playbooks.
We are looking for you to have the following skills and experience:
- 8+ years of experience as a Site Reliability Engineer, or equivalent
- Experience with tools like New Relic for monitoring and Graylog for logging.
- 3+ years of experience with Amazon Web Services (AWS) or Microsoft Azure
- 3+ years of experience with Kubernetes clusters, including performance monitoring in Kubernetes.
- Proficiency with public cloud environments (AWS preferred)
- Proficiency in a scripting language such as Bash, Groovy, or Python
- Excellent debugging and troubleshooting skills.
- Ability to prioritize tasks efficiently and independently under minimal supervision.
Nice to Have
- AWS Cloud certification
- Familiarity with .NET applications.
- Knowledge of Terraform, Ansible, and monitoring tools
This is a full-time role, and we are unable to provide sponsorship, so you must be a U.S. citizen or a Green Card holder. We work onsite three days each week in our Alpharetta offices, so you must live in the Atlanta area, within commuting distance of our office. If you thrive on solving complex technical challenges, have a passion for automation, and want to influence how enterprise platforms evolve and modernize, this is an ideal opportunity for you.
Ready to take the next step in your SRE career? Apply now and help us build the future of reliable systems!