Senior Site Reliability Engineer

  • Full-time

Company Description

What started in 2007 with a pizza and a plan has grown into a fast-moving SaaS company empowering manufacturers, distributors, and wholesalers to thrive in complex B2B commerce.

Our mission is simple: help businesses build stronger relationships through seamless digital commerce.

At Sana Commerce, you’ll join a team that’s bold, growth-oriented, and customer-obsessed, where every engineer has real ownership and impact.

About the role

We’re looking for a Senior Site Reliability Engineer (SRE) to enhance the reliability, performance, and scalability of our global e-commerce platform.

You’ll play a critical role in building resilient systems, automating infrastructure, and driving observability across Azure and Kubernetes environments.

This is a hands-on engineering role where you’ll blend reliability strategy with real-world execution, keeping today’s systems healthy while shaping the ones we’ll depend on tomorrow.

Job Description

What you’ll do

  • Lead incident response and postmortems: drive investigations, document learnings, and implement permanent fixes to prevent recurrence.
  • Manage and optimize Azure Kubernetes environments: own cluster configuration, performance, cost control, and security best practices.
  • Build observability systems: develop dashboards, alerts, and metrics using Dynatrace, Honeycomb, Elasticsearch, Grafana/Kibana, and Azure Monitor (KQL).
  • Automate for resilience: write reliable scripts in PowerShell, Bash, Python, or C#, with logging, rollback, and version control built in.
  • Implement Infrastructure-as-Code: design and maintain Terraform, Bicep, or ARM templates to standardize and automate deployments.
  • Optimize system performance: identify bottlenecks through deep monitoring and dump analysis, and right-size cloud resources.
  • Collaborate across engineering teams: integrate reliability principles into CI/CD pipelines and the broader SDLC.
  • Participate in on-call rotations: lead during critical incidents and ensure lasting fixes and operational excellence.

Qualifications

What you’ll bring

  • 5+ years in SRE, DevOps, or Cloud Infrastructure roles with experience in large-scale, distributed systems.
  • Strong Azure and Kubernetes expertise (production-level).
  • Proven ability in observability engineering using Dynatrace, Honeycomb, Elastic, Grafana/Kibana, or Azure Monitor.
  • Skilled in PowerShell, Bash, Python, or C#, with an automation-first mindset.
  • Proficient in Infrastructure-as-Code (Terraform, Bicep, ARM).
  • Solid grasp of TCP/IP, networking fundamentals, and performance tuning.
  • Strong communicator able to translate complex technical findings into clear, actionable insights.
  • Certifications preferred:
    • Microsoft Certified: Azure Administrator Associate
    • Certified Kubernetes Administrator (CKA)

Why you’ll love working here

  • Impact from day one – Join a scale-up where your ideas shape how global businesses operate online.
  • Continuous learning – Benefit from structured onboarding (rated 9.1/10 by previous hires), mentorship, and a strong feedback culture.
  • Hybrid flexibility – Work from our Alexandria office 3 days per week and from home 2 days.
  • Career growth – Expand your technical and leadership scope in a company built for long-term success.

Our values

At Sana Commerce, our values drive everything we do:

  • Champions of Our League – We deliver lasting success, balancing quick wins with long-term value.
  • Supercharge Our Customers – We’re revolutionizing B2B commerce together, helping our customers to lead and succeed.
  • Determined to Grow – We embrace challenges, growing and raising the bar for ourselves and our industry.
  • Bold Together – We dare to be bold because we have each other’s back.

Ready to build reliability that scales?

Apply now and help shape the foundation of our next-generation SaaS platform.


 
