MLOps Engineer (Python Backend + AI/GenAI Experience)
- Full-time
Company Description
Blend is a premier AI services provider, committed to co-creating meaningful impact for its clients through the power of data science, AI, technology, and people. With a mission to fuel bold visions, Blend tackles significant challenges by seamlessly aligning human expertise with artificial intelligence. The company is dedicated to unlocking value and fostering innovation for its clients by harnessing world-class people and data-driven strategy. We believe the combined power of people and AI creates more fulfilling work and projects for our people and clients. For more information, visit www.blend360.com.
Job Description
We are looking for a senior MLOps Engineer with strong Python backend engineering expertise to design, build, and manage scalable ML and AI platforms. The ideal candidate has hands-on experience with AWS SageMaker, ML pipelines, Infrastructure as Code, GenAI/RAG workflows, and containerized deployments.
You will collaborate closely with Data Scientists, ML Engineers, and AI Engineers to build robust pipelines, automate workflows, deploy models at scale, and support end-to-end ML lifecycle in production.
Key Responsibilities
MLOps & ML Pipeline Engineering
- Build, maintain, and optimize ML pipelines in AWS (SageMaker, Lambda, Step Functions, ECR, S3).
- Manage model training, evaluation, versioning, deployment, and monitoring using MLOps best practices.
- Implement CI/CD for ML workflows using GitHub Actions / CodePipeline / GitLab CI.
- Set up and maintain Infrastructure as Code (IaC) using CloudFormation or Terraform.
Backend Engineering (Python)
- Design and build scalable backend services using Python (FastAPI/Flask).
- Build APIs for model inference, feature retrieval, data access, and microservices.
- Develop automation scripts, SDKs, and utilities to streamline ML workflows.
AI/GenAI & RAG Workflows (Good to Have)
- Implement RAG pipelines, vector indexing, and document retrieval workflows.
- Build and deploy multi-agent systems using frameworks like LangChain, CrewAI, or Google ADK.
- Apply prompt engineering strategies for optimizing LLM behavior.
- Integrate LLMs with existing microservices and production data.
Model Deployment & Observability
- Deploy models using Docker + Kubernetes (EKS/ECS) or SageMaker endpoints.
- Implement monitoring for model drift, data drift, usage patterns, latency, and system health.
- Maintain logs, metrics, and alerts using CloudWatch, Prometheus, Grafana, or ELK.
Collaboration & Documentation
- Work directly with data scientists to support experiments, deployments, and re-platforming efforts.
- Document design decisions, architectures, and infrastructure using Confluence, GitHub Wikis, or architectural diagrams.
- Provide guidance and best practices for reproducibility, scalability, and cost optimization.
Qualifications
Must-Have
- 5+ years total experience, with at least 3 years in MLOps/ML Engineering.
- Hands-on experience deploying at least two MLOps projects using AWS SageMaker or equivalent cloud services.
- Strong backend engineering foundation in Python (FastAPI, Flask, Django).
- Deep experience with AWS services: SageMaker, ECR, S3, Lambda, Step Functions, CloudWatch.
- Strong proficiency in Infrastructure as Code: CloudFormation / Terraform.
- Strong understanding of ML lifecycle, model versioning, monitoring, and retraining.
- Experience with Docker, GitHub Actions, Git-based workflows, CI/CD pipelines.
- Experience working with RDBMS/NoSQL, API design, and microservices.
Good to Have
- Experience building RAG pipelines, vector stores (FAISS, Pinecone), or embeddings workflows.
- Experience with agentic systems (LangChain, CrewAI, Google ADK).
- Understanding of data security, privacy, and compliance frameworks.
- Exposure to Databricks, Airflow, or Spark-based pipelines.
- Multi-cloud familiarity (Azure/GCP AI services).
Soft Skills
- Strong communication skills and the ability to collaborate with cross-functional teams.
- Ability to work independently and handle ambiguity.
- Analytical thinker with strong problem-solving skills.
- Ownership mindset with focus on delivery and accountability.
Additional Information
Our Perks and Benefits:
📚 Learning Opportunities:
- Certifications in AWS (we are AWS Partners), Databricks, and Snowflake.
- Access to AI learning paths to stay up to date with the latest technologies.
- Study plans, courses, and additional certifications tailored to your role.
- Access to Udemy Business, offering thousands of courses to boost your technical and soft skills.
- English lessons to support your professional communication.
👩🏫 Mentoring and Development:
- Career development plans and mentorship programs to help shape your path.
🎁 Celebrations & Support:
- Special day rewards to celebrate birthdays, work anniversaries, and other personal milestones.
- Company-provided equipment.
⚖️ Flexible working options to help you strike the right balance.