#EG Platform AI Architect
- Contract
- Department: Others
Company Description
NCS is a leading AI Tech Services company. With a 15,000-strong team across Asia Pacific, NCS scales its platforms and capabilities to provide clients with greater agility and AI expertise across a range of industries. Embracing a strong ecosystem of global partners, NCS transforms technology services delivery, combining AI with digital resilience to drive real business impact. NCS is a subsidiary of the Singtel Group.
Job Description
About the AI Centre of Excellence
You will be part of the AI Centre of Excellence (COE), a centralised platform team that builds, operates, and evolves an enterprise-grade GenAI platform consumed by multiple AI projects across the organisation.
The COE operates as a platform product organisation, not a project delivery team. It owns shared capabilities end-to-end — reusable building blocks, common services, agent harnesses, orchestration frameworks, retrieval, governance, and observability. Success is measured by platform adoption across AI projects, reliability, and long-term value, rather than one-off delivery milestones.
The role
In this role, you are the technical and architectural authority for the COE platform. You define the target architecture, design patterns, and technical guardrails for the GenAI platform — including agent orchestration, retrieval, AI gateway, governance, and observability — and you guide engineers to make those architectures practical, scalable, and production-ready.
You focus on platform-wide design, not one-off solutions. Your work shapes how GenAI capabilities are built, exposed, governed, and evolved across the enterprise.
This is the most technically demanding of the four AI architecture roles. It spans applications, infrastructure, and security — with apps as the centre of gravity.
Key responsibilities
- Define and evolve the reference architecture for the COE GenAI platform — covering agents, retrieval, orchestration, harness, AI gateway, model lifecycle, and governance.
- Design multi-tenant platform capabilities that are safely and efficiently consumed by multiple AI projects.
- Establish architectural standards, design patterns, and non-functional requirements (security, isolation, scalability, cost, observability).
- Review and guide technical designs produced by Platform AI Engineers; act as the escalation point for complex platform decisions.
- Partner with the Engagement Solution Architect to convert recurring AI project needs into platform roadmap items.
- Balance innovation with operability — make pragmatic trade-offs between ideal architecture and delivery reality.
- Collaborate across security, data, and application architecture teams to ensure platform decisions hold up at enterprise scale.
Technical environment
Across all roles, you will be working with — or designing for — the following stack. You do not need every item listed; you should be familiar with most and able to learn the rest.
- LLM providers: Commercial LLM APIs and self-hosted open-weight models served via vLLM / llm-d.
- Agent frameworks: LangChain, LangGraph, LlamaIndex.
- Retrieval: Milvus and a graph database, with a custom-built ingestion and retrieval framework on top.
- Orchestration and harness: Custom-built harness using Skills, eval loops, prompt and context engineering, and multi-agent orchestration on a service mesh.
- Runtime: Containerised applications on OpenShift; multi-cloud Kubernetes (AKS, EKS) deployed via GitOps and ArgoCD.
- Guardrails: Custom guardrails plus F5 AI Guardrails, Presidio (PII redaction), and OpenTelemetry plugins.
- MLOps: Lightweight, focused on LLM fine-tuning and embedding model lifecycle. (No deep / classical ML.)
- Observability: OpenTelemetry, Grafana, Prometheus, Loki, Tempo.
How we work
- Platform-product mindset: We treat the COE platform as a product. Adoption, developer experience, and reliability matter more than shipping features that nobody uses.
- Build once, serve everywhere: Capabilities needed by multiple AI projects become COE building blocks. One-off needs stay in the project.
- Governance by default: Compliance, security, and guardrails are inherited from the platform, not bolted on per project.
- Apps lead, infra supports: GenAI is mostly application-layer work with infrastructure underneath. We expect everyone to be apps-strong and infra-literate.
- Telco context: We operate in a telco environment. Telco domain experience is a plus across all roles but is not mandatory.
Qualifications
Required qualifications and experience
- 8+ years in software, platform, or AI / data engineering roles, with a clear progression toward architecture.
- 3+ years of hands-on architecture experience for GenAI, LLM, or agent-based systems in production.
- Proven track record of designing shared platforms, internal products, or developer platforms consumed by multiple teams.
- Strong background in cloud-native, distributed, and event-driven system architectures.
- Demonstrated ability to influence technical direction across multiple teams without direct authority.
Required skills and knowledge
GenAI applications and orchestration (must have, deep)
- Reference architectures for agentic systems, RAG, and multi-agent orchestration.
- Memory and context engineering at platform scale.
- Prompt and context engineering as a first-class platform concern.
- Familiarity with LangChain, LangGraph, LlamaIndex, and harness design (Skills, tool use, sub-agents).
- Eval and guardrail design — building feedback loops into the platform itself.
Multi-tenant platform design (must have, deep)
- Tenant isolation, identity, secrets, and access control patterns.
- AI Gateway design: routing, access control, quotas, observability.
- Designing for adoption — SDKs, building blocks, developer experience.
- Cost governance, model routing, and FinOps for shared GenAI platforms.
Infrastructure and runtime (must have, working depth)
- Cloud-native platforms — Kubernetes / OpenShift — at architectural depth.
- Multi-cloud Kubernetes (AKS, EKS) with GitOps / ArgoCD.
- Model serving with vLLM and GPU-aware deployment.
- Working knowledge of IaC principles (e.g. Terraform or equivalent).
Security and governance (must have, working depth)
- Architecture for guardrails — F5 AI Guardrails, Presidio, custom PII redaction, prompt-injection defences.
- Lightweight MLOps for LLM fine-tuning and embedding model lifecycle.
- Model lifecycle management, model registries, model cards, AI governance.
Observability
- Observability architectures using OpenTelemetry, Grafana, Prometheus, Loki, Tempo.
Key competencies and attributes
- Strong systems thinking and architectural judgment.
- Ability to translate strategy into executable designs.
- Pragmatic — balances ideal architecture against delivery reality.
- Comfortable influencing without authority across multiple senior stakeholders.
- Platform-first, long-term ownership mindset.
Nice to have
- Telco domain experience.
- Open-source contributions to LLM, agent, or platform projects.
- Prior experience building developer platforms or internal PaaS.
Additional Information
We are driven by our AEIOU beliefs—Adventure, Excellence, Integrity, Ownership, and Unity—and we seek individuals who embody these values in both their professional and personal lives. We are committed to our Impact: Valuing our clients, Growing our people, and Creating our future.
Together, we make the extraordinary happen.
Learn more about us at ncs.co and visit our LinkedIn career site.
Scam Alert
We are aware of fraudulent job offers and impersonations of NCS recruiters. Phishing emails using convincing-looking but fake addresses are also commonly used to trick you into thinking that they come from official NCS sources.
Please note that all official communications from NCS Group are sent only from verified corporate email addresses. Always check that the sender’s email address ends with the genuine NCS domain, @ncs.com.sg, and beware of extra letters, symbols, or misspellings. When in doubt, verify the sender’s identity by contacting us at [email protected].