Data Engineer
- Full-time
Company Description
Inetum Polska is part of the global Inetum Group and plays a key role in driving the digital transformation of businesses and public institutions. Operating in cities such as Warsaw, Poznan, Katowice, Lublin, Rzeszow, and Lodz, the company offers a wide range of IT services. Inetum Polska actively supports employee development by fully funding training, certifications, and participation in technology conferences. Additionally, the company is involved in local social initiatives, such as charitable projects and promoting an active lifestyle. It prides itself on fostering a diverse and inclusive work environment, ensuring equal opportunities for all.
Globally, Inetum operates in 19 countries and employs over 28,000 professionals. The company focuses on four key areas:
- Consulting (Inetum Consulting): Strategic advisory services that help organizations define and implement innovative solutions.
- Infrastructure and Application Management (Inetum Technologies): Designing and managing IT systems tailored to clients’ individual needs.
- Software Implementation (Inetum Solutions): Deploying partner solutions from industry leaders like Microsoft, SAP, Salesforce, and ServiceNow.
- Custom Software Development (Inetum Software): Creating unique software solutions to meet specific client needs.
Through strategic partnerships with major technology providers, including Microsoft, SAP, Salesforce, and ServiceNow, Inetum delivers advanced technological solutions tailored to customer requirements. In 2023, Inetum reported revenues of €2.5 billion, underscoring its strong position in the digital services market.
Inetum distinguishes itself by offering a comprehensive range of benefits that meets the diverse needs of its employees, combining flexibility, support, and engagement. Here's what makes working at Inetum unique:
Flexible and hybrid work:
- Flexible working hours.
- Hybrid work model, allowing employees to divide their time between home and modern offices in key Polish cities.
Attractive financial benefits:
- A cafeteria system that allows employees to personalize benefits by choosing from a variety of options.
- Generous referral bonuses, offering up to PLN 6,000 for referring specialists.
- Additional revenue sharing opportunities for initiating partnerships with new clients.
Professional development and team support:
- Ongoing guidance from a dedicated Team Manager for each employee.
- Tailored technical mentoring from an assigned technical leader, depending on individual expertise and project needs.
Community and Well-Being:
- Dedicated team-building budget for online and on-site team events.
- Opportunities to participate in charitable initiatives and local sports programs.
- A supportive and inclusive work culture with an emphasis on diversity and mutual respect.
Job Description
Join our team to leverage your data engineering skills in a dynamic environment, ensuring seamless data migration and optimization for advanced AI and ML projects. Apply now to be part of our innovative journey!
Key responsibilities
Data pipeline development:
- Design, develop, and deploy Python-based ETL/ELT pipelines to migrate data from the on-premises MS SQL Server into Databricks,
- Ensure efficient ingestion of historical Parquet datasets into Databricks (a minimal sketch of both steps follows this list).
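To make the scope concrete, here is a minimal PySpark sketch of such a migration step. It assumes a Databricks runtime with the SQL Server JDBC driver available; the hostname, table names, credentials, and paths are illustrative placeholders, not details from this posting.

```python
# Minimal sketch: copy one table from an on-prem SQL Server into a Delta table.
# All connection details and names below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mssql-to-databricks").getOrCreate()

# Read the source table over JDBC, split into parallel partitions for throughput.
source_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://onprem-host:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", "<from-a-secret-scope>")  # never hard-code in real jobs
    .option("partitionColumn", "order_id")
    .option("lowerBound", 1)
    .option("upperBound", 10_000_000)
    .option("numPartitions", 8)
    .load()
)

# Land the data as a Delta table so downstream jobs get ACID reads and time travel.
source_df.write.format("delta").mode("overwrite").saveAsTable("bronze.orders")

# Historical Parquet datasets can be appended to the same table.
history_df = spark.read.parquet("/mnt/landing/orders_history/")
history_df.write.format("delta").mode("append").saveAsTable("bronze.orders")
```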
Data quality & validation:
- Implement validation, reconciliation, and quality assurance checks to ensure accuracy and completeness of migrated data (an illustrative reconciliation check follows this list),
- Handle schema mapping, field transformations, and metadata enrichment to standardize datasets,
- Ensure data governance, quality assurance, and compliance are integral to all migration activities.
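As an illustration of the kind of check involved, the sketch below compares row counts and an order-independent checksum between a staged extract and the migrated Delta table; both table names are hypothetical.

```python
# Illustrative reconciliation check between a source extract and the migrated
# Delta table. Table names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

def table_stats(df):
    """Row count plus an order-independent checksum over all columns."""
    return df.agg(
        F.count(F.lit(1)).alias("rows"),
        F.sum(F.hash(*df.columns).cast("long")).alias("checksum"),
    ).first()

source_stats = table_stats(spark.table("staging.orders_extract"))
target_stats = table_stats(spark.table("bronze.orders"))

assert source_stats["rows"] == target_stats["rows"], "row count mismatch"
assert source_stats["checksum"] == target_stats["checksum"], "content checksum mismatch"
```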
Performance optimization:
- Tune pipelines for speed and efficiency, leveraging Databricks capabilities such as Delta Lake when appropriate,
- Manage resource usage and scheduling for large dataset transfers (a short tuning example follows this list).
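For instance, a routine Delta Lake maintenance pass on a migrated table might look like the following; the table and column names are again placeholders, and whether Z-ordering helps depends on the actual query patterns.

```python
# Hypothetical maintenance pass on a migrated Delta table (names are placeholders).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact the small files left behind by incremental loads and co-locate rows
# on a frequently filtered column to cut scan times.
spark.sql("OPTIMIZE bronze.orders ZORDER BY (customer_id)")

# Drop data files no longer referenced by the table (default retention: 7 days).
spark.sql("VACUUM bronze.orders")
```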
Collaboration:
- Work closely with AI engineers, data scientists, and business stakeholders to define data access patterns required for upcoming AI POCs,
- Partner with infrastructure teams to ensure secure connection between legacy systems and Databricks.
Documentation & governance:
- Maintain technical documentation for all data pipelines,
- Adhere to data governance, compliance, and security best practices throughout the migration process.
Qualifications
Required skills & experience:
- Proven experience in Python for data engineering tasks (PySpark, Pandas, etc.),
- Hands-on experience with Databricks and the Spark ecosystem,
- Solid understanding of ETL/ELT concepts, data modeling, and pipeline orchestration,
- Experience working with Microsoft SQL Server, including direct database connections,
- Practical experience ingesting Parquet data and managing large historical datasets,
- Knowledge of Delta Lake and structured streaming in Databricks is a plus,
- Familiarity with secure data transfer protocols between on-premises environments and cloud platforms,
- Strong problem-solving skills and ability to work independently.
Preferred qualifications:
- Experience with AI/ML data preparation workflows,
- Understanding of data governance and compliance requirements related to customer and contract data,
- Familiarity with orchestration tools such as Databricks Workflows or Airflow (see the sample DAG after this list),
- Experience in setting up Databricks environments from scratch.
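For orientation, a minimal Airflow DAG that triggers a pre-defined Databricks job might look like this; it assumes Airflow 2.4+ with the apache-airflow-providers-databricks package installed, and the job ID and connection name are placeholders.

```python
# Hypothetical Airflow DAG triggering a nightly Databricks migration job.
# Requires apache-airflow-providers-databricks; IDs and names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="mssql_to_databricks_migration",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    run_migration = DatabricksRunNowOperator(
        task_id="run_migration_job",
        databricks_conn_id="databricks_default",  # connection configured in Airflow
        job_id=12345,  # placeholder ID of a Databricks job defined elsewhere
    )
```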