Site Reliability Engineer [Remote-USA]
- 5305 Village Center Dr, Columbia, MD 21044, USA
- Employees can work remotely
credativ U.S. is the North American branch of credativ, an international open source support company. At credativ U.S. we focus on support and services for open source based web operations technology. Primarily based on the east coast, we are a distributed team, working with clients in all different sectors, at home and abroad, helping them to get the most from their technology solutions.
The credativ U.S. reliability engineering team is a mix of DBA/SRE/DBRE oriented folks who strive to help our clients get the most from their technology solutions. We work closely with developers, operations, and client groups to provide a "full stack" perspective on providing highly available services at scale. We believe a polyglot approach is the best way to learn and solve today's challenging technology problems. Sure, everyone has their favorites, but understanding engineering fundamentals and how systems interact across different applications and platforms is a key to our success.
As a member of the team, you'll work with a wide variety of technologies supporting multiple application stacks across a mix of infrastructure options, working to obtain a thorough understanding of modern technology stacks, including networking and systems level knowledge. You'll work directly with clients and development teams to ensure their goals are met through automation, analysis, and infrastructure configuration. Common problems you'll be working on include:
- Monitoring for problems and diagnosing and addressing those problems as needed.
- Maintaining scalable, reliable, and robust environments.
- Performance tuning and optimization.
- Building tools and scripts to simplify, automate, or solve the problems vendors leave behind.
- Installing and configuring database oriented systems.
- Participating in a 24x7 on-call support rotation.
Note that while you will be working from home on a distributed team, with access to multiple customer environments, potentially with people and computers around the world, we are looking for U.S. based applicants at this time.
No one knows it all, so don't get hung up on any one item, but these are some of the things we'll be asking about.
*Requirements & Education:*
- 4+ years working in Unix/Linux environments, particularly with web-facing systems.
- Experience architecting and managing systems in cloud vendor environments.
- Experience with configuration management tools (Terraform, Puppet, Ansible).
- Ability to work cooperatively with software engineers and system administrators.
- Exceptional problem-solving expertise and attention to detail.
- The ability to remain calm in the face of extreme crises.
- BS in Computer Science or equivalent experience.
*Super Bonus Items*
- You can build a complete web environment armed with a network connection and a shell prompt.
- You like databases, and don't mind writing SQL.
- Comfortable with C.
- You don't believe there is a OneTrueWay™ of server administration.
- You understand that efficiency and thoroughness trade-offs happen every day.
- You crave communication and collaboration, including with application developers and non-operations folks.
- You don't believe that all systems have the same requirements of consistency, availability, and partition-tolerance.
- You love experimenting, testing, and QA so much, you learned how to do it in production.
- You don't believe in root causes.
- You accept humor as a coping mechanism.
We believe in diversity as a core asset. From the tools we use to the technologies we choose to the people we work with, diversity in approach has always led us to better success. We take pride in the diversity of our staff, and seek diversity in our applicants.