Major Incident Manager
- Montreal, QC, Canada
- Employees can work remotely
SG Digital leads the industry in providing online gaming, sports, and iLottery technology solutions, which enable the world’s largest gambling companies to provide games to their players. The role will focus on the online casino business, which offers a platform of 2.5K+ table and slot machines to 180 operators globally.
SG Digital is a combination of several young, former start-up businesses, and entrepreneurialism continues to be at the heart of the business. Employees are given ownership of their own areas of responsibility and encouraged to think big, challenge the norm and drive continuous positive change. The culture is perfect for candidates wanting independence and the ability to make an impact.
You will join the global Major Incident Management team and will own the incidents assigned to you, ensuring we meet the SLAs set for resolution times and communications. You will co-ordinate technical teams to ensure the fastest resolution of an incident, following our Incident Management Procedure.
The Incident Managers provide 24/7 support for our global customer base through a follow-the-sun setup with offices in Montreal, London, Gibraltar and Bangalore. Some on-call and shift work is required. A Manager-On-Duty roster backs this up with out-of-hours support.
You will have an aptitude for troubleshooting and solving problems, whilst working well under pressure.
Incident Management and Problem Management are part of one team, working closely together to improve the services and products SG Digital provides and following ITIL processes.
- The Major Incident Manager is not expected to be a technical expert or to resolve the incident personally, although a broad technical and industry understanding is useful. Experience in System Administration, Technical Project/Program Management, IT Operations or Technical Support is beneficial.
- IT Service Management experience, ideally as a certified Incident Manager
- Strong knowledge of ITIL processes
- Broad experience of IT infrastructure and applications
- Experience working with Change Management, Incident Management and Problem Management, ideally in a regulated environment
- Commercial experience, with an understanding of SLAs and contractual obligations
- Excellent verbal and written communication skills
- Experience with Jira
Managing SG Digital’s response to high priority customer production incidents - These are typically incidents which have resulted in a full or partial service outage for a customer, with associated revenue loss and potential impact on brand image.
Restoring production service as quickly and efficiently as possible within the terms of SLAs and regulatory requirements.
- Coordinate internal or 3rd party resources: guiding and directing, identifying and engaging the necessary resources from our worldwide and on-call pool of technical experts as required.
- Defining a plan of action for complex issues and liaising with colleagues such as technical support, operations, customer success managers, account managers, regulatory compliance team, 3rd parties such as game providers, etc.
- Ensuring internal executives are kept informed of progress so they can field customer executive escalations. Where needed, attend bridge calls with customers to keep them up to date with ongoing incidents.
- Ensuring the incident ticket is updated with all salient information, including the reason for decisions made, approvals for emergency changes, timelines, etc.
- Writing Incident Reports, outlining the incident and what was done to restore the service. Root cause analysis is not required for these reports.
- When there are no high priority incidents, the incident manager works to ensure the efficient resolution and closure of open incidents. This is a collaborative process working with development and operations teams.
- Work closely with Problem Management to identify recurring or high-impact incidents that require further analysis and long-term solutions
- Actively participating in regular Operational Reviews / Continuous Improvement meetings with Product/Development and other teams throughout the company
- When dealing with business customers, the Major Incident Manager communicates technical progress in terms a business customer will understand and in a manner which de-escalates situations
- On a regular basis, review data quality on incidents and identify opportunities for improvements and training
- Maintain training material and provide training to end users
- Assist with KPI reporting