Senior Site Reliability Engineer - Databases
- Full-time
Company Description
Twitter is committed to serving the public conversation by helping people stay informed, inform others, and discuss what matters. Our Curation team is on a mission to better facilitate this through the curation of the best, most relevant, and timely content that reaches, engages and delights one of the largest daily audiences in the world.
Twitter’s Curation team sits within its Consumer Product team, which works directly on developing the core Twitter product with our customers in mind.
Job Description
Who We Are:
As a member of the organization, you will be dedicated to improving the reliability of our end-to-end data and database infrastructure. Your work will integrate directly with our products.
Our core infrastructure receives hundreds of millions of tweets per day and serves tens of billions of API requests. We also serve more than 2 billion search queries per day, render millions of ad impressions, and process hundreds of terabytes of log and interaction data daily.
We investigate difficult operational issues from the software, systems, automation, and process perspectives. We work to understand the challenges around integrating disparate infrastructures into a new facility, processes, and procedures.
We develop services and tooling to automate repetitive tasks and provide self-service applications.
We actively participate in the vision to move away from tasks with high operational cost, and contribute to services that can shrink and expand based on demand, self-heal, roll out automatically, and so on.
We train and invest in our team members and make sure that they are successful in supporting a large variety of systems and products.
We’re looking for an industry-experienced SRE to join us and help us further Twitter’s reliability by building services and automating some of our biggest operational tasks. The candidate must have relevant experience building and operating production systems, as well as a strong programming background.
Your responsibilities include:
Working closely with engineering teams to design, build, and maintain systems, and helping them choose a database, design schemas, and tune queries
Using your expertise to tune and push our databases beyond their normal limits
Solving issues across the entire stack: hardware, software, application and network
Mentoring other SREs on standard methodology for everything from monitoring to solving complex code and database issues
Identifying and driving opportunities to improve automation for the company; scoping and building automation for deployment, management, and visibility of our services
Actively participating in and contributing to code reviews and technical design documents, with an eye toward identifying performance and reliability bottlenecks
Representing the SRE organization in design reviews and operational readiness exercises for new and existing services
Participating in on-call (24x7) and customer support (8x5) rotations
Qualifications
Who You Are:
5+ years of proven experience in software support, reliability, or operations engineering in production environments
Strong, proven ability to write modular and well-tested code in Python or Go
Experience in driving and delivering multi-quarter, cross-team projects to completion
Demonstrated ability supporting any or all of the following: relational databases (MySQL, Postgres, etc.), Hadoop, Druid, BigQuery, and other data management services on-prem and in the public cloud
Ability to work well with and influence a wide range of personalities at all levels
Adaptable and able to focus on the simplest, most efficient, and most reliable solutions
A track record of successful practical problem solving, along with excellent written communication, social communication, and documentation skills
Desired: Ability to lead technical teams through design and implementation across an organization
Desired: Experience with open source projects like Vitess, Orchestrator, Percona, Airflow, and other database tools
Additional Information
We care about making work happy and productive for everyone, with a permanent option to work remotely or regularly work from home when our offices reopen; a home office expense budget; wellness benefits; regular #NoMeetingFridays; and up to 20 weeks of parental leave.
A few other things we value:
Challenge - We work with Twitter’s product and standards teams to solve some of the industry’s hardest content problems. Come to be challenged, learn, and thrive as a curator.
Diversity - Diversity makes us a better organization and team. We value diverse backgrounds, ideas, and experiences.
Work-Life Balance - We work hard, but we believe that hard work should come with balance.
We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran, genetic information, marital status or any other legally protected status.