Description of activities:
We are looking for a data engineer at BUUT to design, build, and maintain a scalable, general-purpose data lake and data pipelines that enable the L&I (Learning & Insights) team to generate actionable insights. The role ensures data accessibility, quality, and performance for both analytical and reporting needs, reducing dependency on multiple engineers and creating a unified, efficient data infrastructure.
With the following results:
Data Pipeline Development:
- Design and implement at least 2–3 robust data pipelines that support L&I analytics and reporting needs.
- Ensure pipelines are automated, tested, and monitored for reliability.
Enable Analytics & Insights:
- Provide usable datasets and query capabilities for L&I to generate insights.
- Deliver at least one analytics feature or dashboard powered by the new data infrastructure.
Operational Excellence:
- Implement CI/CD workflows for data jobs (GitHub Actions).
- Set up basic monitoring and alerting for pipeline health and data quality.
Data Governance & Quality:
- Define and apply data quality checks (row-level and aggregate).
- Establish data lineage documentation for key pipelines.
Relevant knowledge, skills & competences:
Must-haves:
Tools & Platforms:
- Version Control & CI/CD: GitHub, GitHub Actions
- At least one major cloud platform: AWS, Azure, or Databricks
Languages: Python, SQL
Experience: Minimum 3 years creating and maintaining data pipelines
Core Knowledge:
- Data pipelines (ETL), orchestration, and batch vs. streaming processing
- Data modeling (star schema, dimensional modeling, SCD)
- OLTP vs OLAP concepts
- Testing (unit, integration, E2E)
- Data governance basics (lineage, quality checks)
Nice-to-haves:
Tools & Platforms:
- AWS services: Glue, Athena, DynamoDB, Step Functions
- Languages: PySpark, Golang
Patterns & Techniques:
- Infrastructure as Code (AWS CloudFormation)
- Table formats: Iceberg / Delta / Hudi
- Schema evolution, reprocessing, monitoring
- Medallion architecture
- Distributed data processing
Experience:
- Improving SDLC for data teams (validation, testing automation)
- Generating insights for end-users (e.g., personalization)
Inclusion and diversity
Naturally, this vacancy is open to everyone who recognizes themselves in it. We believe that diverse teams matter to us as a learning organization that wants to stay at the forefront of the world of work. After all, it is the differences between people that drive growth: of colleagues, clients, candidates, and thereby of Randstad Professional. Do you have a unique talent? We would love to meet you.