Senior Director, Site Reliability Engineer
- Horner St, Amherst, CO 80721, USA
Broadridge, a global fintech leader with over $4 billion in revenue, provides communications, technology, data, and analytics. We help drive business transformation for our clients with solutions for enriching client engagement, navigating risk, optimizing efficiency, and generating revenue growth. Broadridge employs over 10,000 full-time associates globally with a significant presence in North America, Europe, and Asia. Please visit our website at www.broadridge.com to learn more.
We are currently seeking a Senior Director, Site Reliability Engineer (SRE) to lead a team of engineers responsible for implementing new, highly automated DevOps processes to drive low failure rates and increase overall system reliability. This individual will be part of an SRE Practice spanning Capital Markets, Wealth and Asset Management solutions.
Site Reliability Engineering incorporates aspects of software engineering and automation engineering and applies them to IT operations problems. Engineers spend time on DevOps-related work such as issue management and manual intervention. However, since the environment is expected to be highly automated, the team should spend most of its time on development tasks such as scaling and automation.
The Senior Director will be responsible for leading, growing and motivating a set of high-performing, talented engineers. In addition to possessing excellent management skills, the ideal candidate will be a highly skilled systems engineer with knowledge of code and automation. They will also have a strong drive to improve existing systems and processes to advance the way our hardware, software, and network solutions are designed and deployed.
- Implement GTO SRE strategy across a given Portfolio of products and associated SRE teams within a business segment.
- Manage SRE teams so that they:
- Have effective capabilities for ensuring production uptime and stability as well as the observability, reliability, availability, performance, capacity planning and operational support for the products across GTO.
- Have effective processes for continuously improving Service Level Objectives (SLO), mean time to identification (MTTI), mean time to resolution (MTTR), and mean time to failure (MTTF).
- Can effectively engage in incident management and have well-defined procedures for the identification of relationships between processes and events, and their root cause.
- Focus on automation in the context of self-healing, auto-remediation, removing manual toil, orchestration tooling and infrastructure-as-code patterns.
- Ensure that the systems can withstand 'chaos engineering' practices and can fail gracefully when services are degraded.
- Ensure the means exist to quickly recover a degraded service (instrumentation, runbooks, tooling, etc.).
- Ensure adequate instrumentation and alerting exist to spot leading indicators of an impending incident in the system, as well as in systems on which the platform depends.
- Provide leadership in defining and refining engineering processes as the teams grow. Motivate, lead and develop a team of talented engineers.
- Drive SRE education across the team to improve quality and reliability.
- Collaborate directly with Portfolio technology and product stakeholders to understand their strategies and needs and incorporate them into the SRE backlog. Partner with them to ensure effective communication regarding service reliability, performance and superior customer experience.
- Provide best-practice SRE considerations to Portfolio product Architecture and Application Development teams so that stability and reliability are incorporated into new solutions.
- Bachelor's degree
- More than 15 years of relevant working experience, including strong hands-on technical experience
- Strong experience with automation and orchestration of applications and infrastructure components
- Constant improvement approach
- Excellent knowledge of distributed architecture, Cloud, microservices, SOA, IaaS and PaaS as related to design patterns
- Ability to identify potential design issues and present valid solutions/options during the design phases
- Experience in Agile and Test-Driven Development (TDD) methodologies
- Experience in leading SRE teams and/or DevOps functions or similar
- Understanding of what it takes to support applications and their related infrastructure in a production environment (Service Level Agreements)
- Experience growing and building highly effective teams.
- Experience collaborating across organizational boundaries, forming alliances with other members of the Portfolio management leadership team and building bridges that support functional as well as company goals.
- Ability to identify trends and promote solutions that solve challenges efficiently across the organization
- Highly collaborative leader who can build strong relationships at all levels of the technology and business organizations
- Experience with SRE sprint planning and the ability to prioritize tasks to meet sprint commitments
- Experience working through the definition, design, release and run cycle of software products to markets
- Experience with DevOps, ITIL, Cloud Services, IT Infrastructure and Operations, including environment stand-up, server builds, firewalls, security and regulatory compliance.
- Experience with any object-oriented language
- Proficiency working in Unix/Linux environments.
- Experience with IBM MQ, Kafka, Postgres
- Experience with Amazon AWS services such as EC2, EBS, RDS, S3, CloudFormation, DynamoDB, Route 53, IAM, ELB, CloudWatch, Lambda, Kinesis, etc.
- Experience with logging, monitoring and alerting frameworks for hybrid cloud or third-party services using AppDynamics, Splunk, Datadog and CA APM.
- Experience with the Atlassian toolset (JIRA/Confluence) and agile development practices
- Experience with tools such as Jenkins, Ansible, Git, Maven, Nexus, Chef, Docker, Terraform, Kubernetes, Pivotal Cloud Foundry and Concourse
Salary Range $140k-$165k annually
This position is bonus eligible
Broadridge Financial Solutions offers a comprehensive benefits program to help you and your family protect your health and financial security.
Medical – May be eligible to select from three medical plans including a Health Savings Account compatible plan
Dental – May be eligible to select from three plans including certain orthodontia coverage
Vision Insurance – May be eligible to select from a voluntary plan covering a full range of vision care services
Health Savings Accounts – May be eligible to participate in a tax-advantaged savings account used for healthcare expenses
Flexible Spending Accounts – Eligible to participate in tax-advantaged spending accounts for healthcare and/or dependent daycare expenses
Disability Insurance – Eligible for company provided Short-Term and core Long-Term disability insurance coverage with the option to purchase supplemental Long-Term disability insurance coverage, which provides income replacement for illness or injury
Voluntary Insurance – Eligible to purchase certain additional insurance coverage like Critical Illness, Hospital Indemnity, and Group Accident and may be eligible to purchase a Personal Accident Insurance policy
Life Insurance – May be eligible for basic term life coverage with the option to purchase additional supplemental life coverage
Paid Time Off – Vacation, personal leave, sick leave, holidays, jury and bereavement leave
Vacation Flex – Opportunity to purchase additional vacation time
Retirement Saving Programs – Opportunity to participate in the Broadridge 401(k) Plan
Educational Assistance – May be eligible for financial assistance for qualifying courses
Employee Assistance Program – Eligible for short-term confidential counseling services
Broadridge is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, age or any other protected status. "Everyone Benefits from Diversity & Inclusion. Diverse & Inclusive Teams Drive Growth." US applicants: Click here to view the "EEO is the Law" poster. If you are a qualified individual with a disability or a disabled veteran, you may request a reasonable accommodation in the event you are unable or limited in your ability to use or access the Company's career webpage as a result of your disability. You may request a reasonable accommodation(s) by calling 888-237-7769 or by sending an email to [email protected]