AIOps and Observability Specialist

  • Full-time

Company Description

Procept Associates Professional Services Limited (Procept Africa) is a franchisee of Procept Associates Ltd, Canada. We specialize in consulting, training, and software solutions, delivered through a network of associates and partners using best-practice frameworks in Canada, Nigeria, South Africa, East Africa, Zambia, and now Ghana.

We are seeking an experienced and dedicated AIOps and Observability Specialist to join the IT operations team of our client in the telecommunications industry. As the AIOps and Observability Specialist, you will play a key role in implementing and optimizing AI-driven solutions and observability practices to ensure the reliability, performance, and efficiency of the client's IT systems, applications, and networks.

Job Description

You will be responsible for the following:

  1. Design, implement, and maintain AIOps solutions to monitor and analyze IT systems, applications, and networks.
  2. Deploy machine learning algorithms for anomaly detection, root cause analysis, and incident prediction.
  3. Configure and manage observability tools and platforms to gain real-time visibility into system health and performance.
  4. Develop monitoring dashboards, alerts, and reports to provide comprehensive insights into the IT environment.
  5. Conduct root cause analysis for incidents using data from AIOps and observability tools to identify underlying issues.
  6. Work closely with software engineers to instrument applications with appropriate logging, metrics, and tracing capabilities.
  7. Continuously analyze monitoring data to identify trends, anomalies, and opportunities for optimization.
  8. Implement automation and integrations with continuous integration and continuous delivery (CI/CD) pipelines to enable seamless monitoring throughout the development lifecycle.
  9. Stay updated with industry trends and advancements in AIOps and observability practices, and recommend new tools or methodologies for adoption.

Qualifications

  • First degree in Computer Science or an IT-related field.
  • ITIL v3 or ITIL 4 Foundation certificate is mandatory.
  • 2-3 years of experience working with AIOps tools, machine learning frameworks, and observability technologies.
  • Strong knowledge of IT infrastructure, networks, servers, databases, and cloud environments.
  • Proficiency in Python or Java.
  • Understanding of IT service management and incident management frameworks.
  • Experience in designing monitoring strategies for microservices and cloud-based applications.
  • Excellent analytical and problem-solving skills.