Data Engineer - Synthetic Data Team
Who We Are
Ipsos is one of the world’s largest research companies and currently the only one primarily managed by researchers, ranking as a #1 full-service research organization for four consecutive years.
With over 75 different data-driven solutions and a presence in 90 markets, Ipsos brings together research, implementation, methodological, and subject-matter experts from around the world, combining thematic and technical expertise to deliver top-quality research and insights.
Simply speaking, we help the biggest companies solve some of their biggest problems, serving more than 5000 clients across the globe by providing research, data, and insights on their target markets.
Role Overview :
As a Data Engineer on the Synthetic Data Team at Ipsos, you will play a pivotal role in building and maintaining the infrastructure necessary to support our synthetic data initiatives.
You will collaborate closely with data scientists to design, develop, and optimize data pipelines, ensuring efficient and reliable data processing and storage.
Your expertise in data engineering will be instrumental in enabling the team to generate and validate high-quality synthetic data at scale.
Impact of Role :
Your contributions will be essential to the success of our synthetic data projects. By creating robust and scalable data infrastructure, you will empower data scientists to focus on research and innovation, accelerating our progress in leveraging synthetic data for market research.
Your work will directly impact the efficiency and effectiveness of our data-driven solutions, ultimately benefiting our clients and the organization as a whole.
What you will be doing :
- Develop and maintain robust, scalable, and efficient data pipelines to process and manage large volumes of data for synthetic data generation
- Implement ETL processes to ensure clean, structured, and well-prepared data is available for analysis and model training
- Design and deploy data infrastructure and architectures to support the generation and storage of synthetic data, leveraging cloud technologies and big data tools.
- Optimise data storage solutions, ensuring data security, integrity, and accessibility while managing costs and resources efficiently
- Work closely with data scientists to integrate synthetic data models into production environments, ensuring seamless data flow and accessibility.
- Implement CI/CD practices to automate the deployment of data pipelines and synthetic data models, ensuring reliability and quick iteration.
- Provide technical support and troubleshooting for data infrastructure, ensuring minimal downtime and efficient resolution of issues.
- Explore and implement new data engineering technologies and tools that can enhance the efficiency and capabilities of the synthetic data team.
- Document data pipeline architectures, processes and best practices to maintain a knowledge base
- Develop and enforce standards for data management and governance to ensure data quality, security, and compliance.
You're the right person if
- You have a solid foundation in data engineering, with experience in building and maintaining scalable data pipelines using technologies like Apache Spark, Kafka, SQL, and NoSQL databases
- You are proficient in programming languages such as Python, Java, or Scala, and have experience with ETL frameworks and data workflow orchestration tools
- You have hands-on experience with cloud platforms (e.g., AWS, Google Cloud, Azure) and are skilled in leveraging cloud-based data storage and processing solutions.
- You are familiar with containerisation and orchestration technologies like Docker and Kubernetes, and can deploy and manage data infrastructure in cloud environments.
- You are adept at identifying inefficiencies in data systems and can proactively implement improvements to enhance performance and reliability
- You have a strong commitment to data quality, ensuring that all data processes are accurate, consistent, and reliable.
- You have experience working with synthetic data generation, AI/ML model deployment, or similar projects, and are excited by the unique challenges and opportunities in this area
- You are familiar with privacy-preserving technologies and have an understanding of the ethical considerations related to synthetic data
- You enjoy working in a collaborative environment, partnering with data engineers, analysts, and other team members to integrate and apply synthetic data solutions.
- You are an effective communicator who can clearly articulate complex concepts and findings to both technical and non-technical stakeholders
- You have a passion for pushing the boundaries of data science and a strong desire to revolutionize market research through synthetic data
Even if you don’t meet 100% of the requirements, we encourage everyone who feels they might be a fit for the opportunity to apply.
What’s in it for you :
At Ipsos, you’ll find opportunities for career development, an exceptional benefits package (including generous annual leave / paid time off, healthcare plans, wellness benefits), a flexible workplace policy, and a strong collaborative culture.