Edge SRE (Systems Engineer) - US Remote

  • Full-time

Company Description

Twitter is what’s happening and what people are talking about right now. For us, life's not about a job, it's about purpose. We believe real change starts with conversation. Here, your voice matters. Come as you are and together we'll do what's right (not what's easy) to serve the public conversation.

Job Description

Who We Are:

Twitter's Network Engineering team operates a dynamic, constantly growing and evolving global network that provides reliable, high-performance, and secure connectivity behind one of the few products in the world that touches over 1 billion people.

Our SREs (Site Reliability Engineers) work on improving the availability, scalability, performance, and reliability of Twitter’s production services.

Who You Are:

You are a Systems Engineer / Site Reliability Engineer with hands-on experience in network engineering technologies, configuration, troubleshooting, and traffic management.

What You’ll Do:

As a member of the organization, you will be dedicated to improving the reliability of our end-to-end platform. Your work will integrate directly with Twitter's products. You will investigate sophisticated and difficult operational issues from the software, network, systems, automation, and process perspectives. You will understand the challenges around integrating disparate infrastructures into a new facility, processes, and procedures.

This is an opportunity to work with open-source technologies and the wider SRE community, and to actively participate in the vision of moving away from tasks with high operational cost, such as break/fix work, cluster migrations, new service buildouts, and abuse. Our team contributes to services that can shrink and expand based on demand, self-heal, and roll out automatically.

Responsibilities Include (but are not limited to):

  • Performing deep dives into both systemic and latent reliability issues; partnering with software, network, and systems engineers across the organization to produce and roll out fixes.

  • Tackling issues across the entire stack: hardware, software, network, and applications.

  • Driving standardization efforts across multiple fields and services in conjunction with embedded SREs throughout the organization.

  • Mentoring SREs on standard practices for everything from monitoring to fixing complex code issues.

  • Identifying and leading opportunities to improve automation for the Network Engineering Edge team; scoping and building automation for deployment, management, and visibility of our services.

  • Participating in code reviews for projects primarily written in Java and Scala, built on open-source libraries such as Finagle, and running on both physical and virtualized platforms.

  • Representing the Network Engineering Edge team in design reviews and operational readiness exercises for new and existing services.

Qualifications

Who You Are - You Have:

  • Experience operating a production environment at high scale, with emphasis on availability, throughput, latency, and a healthy customer experience.

  • Deep knowledge of remote network troubleshooting using tools such as tcpdump and arp-scan, and of protocols and technologies such as DHCP, VLANs, DNS, and HTTPS.

  • Demonstrated knowledge of and experience with Unix/Linux systems (specifically RHEL/CentOS), TCP/IP, and HTTP, plus experience supporting multi-tier web application architectures and working with kernels, system libraries, file systems, and system resource monitoring.

  • Experience with cloud computing platforms: AWS, and either Azure or Google Cloud.

  • Good knowledge of Python.

  • Familiarity with CI/CD pipelines and build tools such as Jenkins.

  • A strong team-player mindset, a “can-do” attitude, and the flexibility to jump in wherever needed.

  • Enthusiasm for working on a small engineering team and solving open-ended problems across the stack.

  • Direct customer-facing experience in troubleshooting challenging technical issues.

  • A deep understanding of shell scripting and at least one higher-level language, such as Python.

  • Ability to work well with, and influence, a wide range of personalities at all levels.

  • Ability to prioritize tasks and work independently.

  • Adaptability and the ability to focus on the simplest, most efficient, and most reliable solutions.

  • A track record of successful practical problem solving, excellent written and social communication, and documentation skills.

  • A B.S. in computer science or a similar field, or comparable experience.

Desired Experience:

  • Ability to lead technical teams through design and implementation across an organization.

  • Experience with existing open-source projects such as Squid, Apache, and Apache Mesos.

Desired Additional Qualifications:

  • Experience operating and developing foundational infrastructure services such as CDN, DNS, load balancing, and internet proxy servers.

  • Experience with Ansible, Jenkins, or Terraform.

Additional Information

We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, color, ethnicity, ancestry, national origin, religion, sex, gender, gender identity, gender expression, sexual orientation, age, disability, veteran status, genetic information, marital status, or any legally protected status.

San Francisco applicants: Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
