Lead Infrastructure Reliability Engineering

  • Ashburn, VA, USA
  • Full-time

Company Description

As the world’s leader in digital payments technology, Visa’s mission is to connect the world through the most creative, reliable and secure payment network - enabling individuals, businesses, and economies to thrive. Our advanced global processing network, VisaNet, provides secure and reliable payments around the world, and is capable of handling more than 65,000 transaction messages a second. The company’s dedication to innovation drives the rapid growth of connected commerce on any device, and fuels the dream of a cashless future for everyone, everywhere. As the world moves from analog to digital, Visa is applying our brand, products, people, network and scale to reshape the future of commerce.

At Visa, your individuality fits right in. Working here gives you an opportunity to impact the world, invest in your career growth, and be part of an inclusive and diverse workplace. We are a global team of disruptors, trailblazers, innovators and risk-takers who are helping drive economic growth in even the most remote parts of the world, creatively moving the industry forward, and doing meaningful work that brings financial literacy and digital commerce to millions of unbanked and underserved consumers.

You’re an Individual. We’re the team for you. Together, let’s transform the way the world pays.

Job Description

As a Senior Network Automation Engineer, Infrastructure Reliability Engineering, your primary responsibility will be to lead automation initiatives and maintain the related frameworks. You will work closely with operations, engineering, and monitoring teams. Our customers expect higher availability, more visibility, better performance, root cause analysis, and prediction of failures.

  • Serve as the subject matter expert on operational processes and workflows, and on their supporting automation and tooling.
  • Write and maintain software to solve complex network management and monitoring tasks, including:
      o deploying and auditing configuration of network devices
      o monitoring network health, including metrics collection, visualization, and alerting
      o tracking network utilization over time to assist capacity planning models

Essential Functions

  • Automate network monitoring and alert triage as part of a proactive approach to maintaining a high level of service.
  • Collaboratively design, develop, and execute software-based automation and tooling solutions to drive global operational processes and efficiency.
  • Use Python scripting and NETCONF to automate the network device provisioning process, execute network changes, and collect data for reporting or analysis.
  • Maintain and support the Ansible automation framework and scripts
  • Develop an automated process for onboarding new devices to the monitoring system
  • Develop and maintain tools to reduce downtime in network operations, improve the speed and accuracy of network changes, assist in the diagnosis and remediation of advanced network problems, and drive innovative solutions to reduce failure tensors
  • Automate network deployment and data collection tasks to improve productivity and the validity of the results
  • Model network configuration and operational data in a platform-neutral structure
  •  Develop testing platforms to detect network configuration inconsistencies before changes are made
  •  Build reports to gain intelligence about the network and detect errors early
  • Document automation standards with written procedures, processes, diagrams and other technical documents
  • Integrate network tools with the ServiceNow ticketing system to improve the overall efficiency of IT operations
  • Work with Engineering and Operations teams to evaluate and recommend tools, technologies and processes to ensure the highest quality operational tooling and platforms.
  • Collaborate with stakeholders, functional owners, and subject matter experts to interpret business and operations needs and articulate how to address those in partnership with engineering and programs teams. 


Basic Qualifications:

  • 10 years of work experience with a Bachelor’s Degree or at least 8 years of work experience with an Advanced Degree (e.g. Masters/MBA/JD/MD) or at least 3 years of work experience with a PhD

Preferred Qualifications:

  •  12-15 years of work experience with a Bachelor’s Degree or 8-10 years of experience with an Advanced Degree (e.g. Masters, MBA, JD, MD) or 6+ years of work experience with a PhD
  • Hands-on experience with software development methodologies; experience with Python, Perl, Go, or other scripting languages; exposure to web services (REST, gRPC)
  • Experience with YANG, JSON, XML
  • Excellent communicator and team-builder, capable of establishing trust and securing partnership with all levels of the organization, from engineers to senior executives.
  • Demonstrated technical understanding of the technologies, tools, and processes within the network technology ecosystem.
  • Experience with automation technologies such as Ansible
  • Understanding of networking and routing protocols such as TCP/IP, UDP, MPLS, BGP4, OSPF, and RIP
  • Strong track record of delivering complex initiatives, with attention to detail
  • Excellent analytical skills, with experience rationalizing large, disparate datasets into concrete observations/recommendations, and formalizing into new network solutions
  • Ability to work on complex projects with diverse teams
  • Knowledge and experience with diverse IT architectures and enterprise IT data centers, large-scale transaction processing environments, external hosted services and cloud computing environments. 
  • Must be both a self-starter and team player with the ability to work independently with limited supervision.
  • Must be extremely flexible and able to manage multiple tasks and priorities on very tight deadlines 
  • Prior experience with datacenter monitoring and management
  • Experience with networking technologies such as SNMP, NetFlow, and flow analysis
  • Experience developing service-oriented systems, REST, Python
  • Exposure to Hadoop, Spark, Kafka, Storm, Ganglia, Nagios, OpenTSDB, Elasticsearch, or other distributed compute and monitoring platforms
  • Advanced network technical certifications (e.g., CCIE) preferred


Additional Information

Work Hours: 

  • Incumbent must make themselves available during core business hours.

Travel Requirements:

  • This position requires the incumbent to travel for work 5% of the time.

Physical Requirements:

  • This position will be performed in an office setting.  The position will require the incumbent to sit and stand at a desk, communicate in person and by telephone, frequently operate standard office equipment, such as telephones and computers, reach with hands and arms, and bend or lift up to 25 pounds.

Visa will consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
