Lead - Data Platform Engineering
- Full-time
Company Description
Organizations everywhere struggle under the crushing costs and complexities of “solutions” that promise to simplify their lives, create a better experience for their customers and employees, and help them grow. Software is a choice that can make or break a business: it can create better or worse experiences, propel or throttle growth. Too often, business software has become a blocker instead of a way to get work done.
There’s another option: Freshworks, with a fresh vision for how the world works.
At Freshworks, we build uncomplicated service software that delivers exceptional customer and employee experiences. Our enterprise-grade solutions are powerful yet easy to use, and quick to deliver results. Our people-first approach to AI eliminates friction, making employees more effective and organizations more productive. Over 72,000 companies, including Bridgestone, New Balance, Nucor, S&P Global, and Sony Music, trust Freshworks customer experience (CX) and employee experience (EX) software to fuel customer loyalty and service efficiency. And over 4,500 Freshworks employees around the world make this possible.
Fresh vision. Real impact. Come build it with us.
Job Description
Overview:
We are looking for a highly skilled Data Engineer to design, build, and manage scalable data pipelines and systems that power analytics, insights, and business intelligence. This role involves hands-on ownership of our Data Lake, Databricks platform, and real-time data streaming pipelines across MySQL, Kafka, and Spark, ensuring reliable, secure, and performant data flow across the organization.
Key Responsibilities
Own and manage the enterprise Data Lake infrastructure on AWS and Databricks, ensuring reliability, scalability, and governance.
Design, develop, and optimize data ingestion and transformation pipelines from MySQL to Kafka (CDC pipelines) and from Kafka to Databricks using Spark Structured Streaming (a minimal sketch of this leg follows this list).
Design and implement MapReduce jobs to process and transform large-scale datasets efficiently across distributed systems (see the MapReduce-style sketch after this list).
Optimize MapReduce workflows for performance, fault tolerance, and scalability in big data environments.
Build robust batch and near real-time data pipelines capable of handling high-volume, high-velocity data efficiently.
Develop and maintain metadata-driven data processing frameworks, ensuring consistency, lineage, and traceability.
Implement and maintain strong observability and monitoring systems (logging, metrics, alerting) using Prometheus, Grafana, or equivalent tools.
Work closely with Product, Regulatory, and Security teams to ensure compliance, privacy, and data quality across the data lifecycle.
Collaborate with cross-functional teams to build end-to-end data lakehouse solutions integrating multiple systems and data sources.
Apply best practices in code quality, CI/CD automation (Jenkins, GitHub Actions), and infrastructure as code (IaC) for deployment consistency.
Ensure system reliability and scalability through proactive monitoring, performance tuning, and fault-tolerant design.
Stay up to date with the latest technologies in data engineering, streaming, and distributed systems, and drive continuous improvements.
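To ground the pipeline responsibility above, here is a minimal sketch of the Kafka-to-Databricks leg: Spark Structured Streaming consuming Debezium-style CDC events from Kafka and landing them in a Delta table. The broker address, topic name, event schema, and paths are illustrative assumptions, not the team's actual configuration.

```python
# Minimal sketch of the Kafka -> Databricks leg of a CDC pipeline using
# Spark Structured Streaming. Broker, topic, schema, and paths are
# placeholders for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("cdc-ingest").getOrCreate()

# Hypothetical schema for a Debezium-style CDC event payload.
cdc_schema = StructType([
    StructField("op", StringType()),     # c = create, u = update, d = delete
    StructField("ts_ms", LongType()),    # event timestamp
    StructField("after", StringType()),  # row state after the change, as JSON
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
       .option("subscribe", "mysql.orders.cdc")           # placeholder topic
       .option("startingOffsets", "latest")
       .load())

events = (raw
          .select(from_json(col("value").cast("string"), cdc_schema).alias("e"))
          .select("e.op", "e.ts_ms", "e.after"))

# Land parsed events in a Delta table; the checkpoint makes the sink
# restartable and exactly-once.
query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "/tmp/checkpoints/orders_cdc")  # placeholder
         .outputMode("append")
         .start("/tmp/delta/orders_cdc"))                              # placeholder
```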
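And as a flavor of the MapReduce-style processing mentioned above, the sketch below expresses the classic map/shuffle/reduce pattern (a word count) on Spark's RDD API rather than Hadoop MapReduce proper; the S3 paths are placeholders.

```python
# Minimal sketch of the map/shuffle/reduce pattern on Spark's RDD API:
# a word count, the canonical MapReduce example.
from operator import add
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mapreduce-sketch").getOrCreate()
sc = spark.sparkContext

counts = (sc.textFile("s3://example-bucket/logs/*.txt")  # placeholder path
          .flatMap(lambda line: line.split())            # map: one record per word
          .map(lambda word: (word, 1))                   # map: key each word with a count
          .reduceByKey(add))                             # shuffle + reduce: sum per key

# Partition counts are one of the main levers for performance and
# fault tolerance in MapReduce-style jobs.
counts.saveAsTextFile("s3://example-bucket/word-counts")  # placeholder path
```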
Qualifications
Required Skills & Experience
Strong programming expertise in one or more of the following: Scala, Spark, Java, or Python.
6-10 years of relevant experience.
Proven experience working with Kafka (Confluent or Apache) for building event-driven or CDC-based pipelines.
Hands-on experience with distributed data processing frameworks (Apache Spark, Databricks, or Flink) for large-scale data handling.
Solid understanding of Kubernetes for deploying and managing scalable, resilient data workloads (EKS experience preferred).
Practical experience with AWS Cloud Services such as S3, Lambda, EMR, Glue, IAM, and CloudWatch.
Experience designing and managing data lakehouse architectures using Databricks or similar platforms.
Familiarity with metadata-driven frameworks and principles of data governance, lineage, and cataloging.
Experience with CI/CD pipelines (Jenkins, GitHub Actions) for data workflow deployment and automation.
Experience with monitoring and alerting frameworks such as Prometheus, Grafana, or the ELK stack (a minimal instrumentation sketch follows this list).
Strong problem-solving, communication, and collaboration skills.
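As a minimal illustration of the observability expectation above, the sketch below instruments a hypothetical batch loop with the prometheus_client Python library, exposing a counter and a histogram for Prometheus to scrape; metric names and the port are placeholders.

```python
# Minimal sketch of pipeline instrumentation with prometheus_client:
# a counter and a histogram exposed over HTTP for Prometheus to scrape.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

RECORDS_PROCESSED = Counter(
    "pipeline_records_processed_total",   # placeholder metric name
    "Number of records the pipeline has processed",
)
BATCH_SECONDS = Histogram(
    "pipeline_batch_duration_seconds",    # placeholder metric name
    "Wall-clock time spent processing one batch",
)

def process_batch(records):
    # Time each batch; the histogram feeds Grafana latency panels and alerts.
    with BATCH_SECONDS.time():
        for _ in records:
            RECORDS_PROCESSED.inc()
        time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at :8000/metrics (placeholder port)
    while True:
        process_batch(range(100))
```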
Additional Information
At Freshworks, we have fostered an environment that enables everyone to find their true potential, purpose, and passion, welcoming colleagues of all backgrounds, genders, sexual orientations, religions, and ethnicities. We are committed to providing equal opportunity and believe that diversity in the workplace creates a more vibrant, richer environment that advances the goals of our employees, communities, and business. Fresh vision. Real impact. Come build it with us.