Sr. Site Reliability Engineer
- 1919 North Lynn Street, Arlington, VA, United States
Higher Logic is an industry leader in cloud-based engagement platforms. Our data-driven approach gives organizations an expanded suite of engagement capabilities, including online communities and marketing automation. From the initial web visit to renewal and ongoing engagement, we help you track and manage interactions along each stage of the digital customer experience.
Organizations worldwide use Higher Logic to bring people together by giving their community a home where they can interact, share ideas, answer questions, and stay connected. Everything we do - the tools and features in our software, our services, partnerships, best practices - drives our ultimate goal of making your organization successful.
This is a full-time on-site position with the DevOps Team. Senior SREs work independently on concurrent complex projects to deliver technical solutions, execute roadmaps and promote best practices. Success in this role depends on performing with a high degree of technical skill in a 24x7x365 global production environment, while maintaining a positive attitude, a focus on solutions, and good working relationships with coworkers.
DevOps at Higher Logic has primary responsibility for service reliability in the production environment. We live at the juncture of Engineering, Operations, and Support, which means that we interact with large swathes of the company on a daily basis. The company frequently introduces new products, features and services; we also constantly modify and scale existing services. Managing this environment requires strong individual knowledge and capabilities, coupled with optimism, focus, and close teamwork across the organization and the company. Higher Logic operates at large scale, serving tens of millions of end users every day. The entire technical stack is well on the way to being fully cloud-native. No matter how much you know, you will learn and grow here.
- Coordinate process, communication, documentation and remediation for operational incidents.
- Actively support security and compliance functions.
- Decrease incidence, scope and severity of operational failures (improve MTTR and MTBF).
- Guide products to Production Readiness (scalability, observability, operability, resiliency, etc.).
- Create, maintain and operate build and deployment automation and operations (CI/CD pipelines).
- Provide tier three on-call technical support.
- Significant experience in AWS configuration, operations, deployment and troubleshooting.
- Desire to improve product, technology, people and process.
- Experience with EC2, SQL Server, IIS, Docker, S3, IAM, HAProxy, ELB.
- Appreciation of the value of diversity of opinions, approaches, and backgrounds.
- Substantial knowledge in one or more of: Terraform, Chef, Puppet, Ansible.
- Excellent communication and collaboration skills.
- Understanding of the value provided by incremental solution delivery, POCs, MVPs, etc.
- Bachelor’s degree or higher in Computer Science, MIS, or equivalent commercial experience.
- Four or more years of SRE work on Windows and Linux.
- AWS Certified DevOps Engineer credential; Professional certificate is a plus.
- Experience with Auto Scaling, RDS, Aurora, PostgreSQL, ECS, Fargate, Redis, Memcached, S3, SQS, SES, SNS, Secrets Manager, Lambda, CloudWatch, Active Directory forest & ADFS, CI/CD, containers at scale.
- Proficiency in at least one high-level language, C# preferred.
- Familiarity with Agile (Kanban and Scrum).
All your information will be kept confidential according to EEO guidelines.