
Data Engineer

Years of Experience

3+ yrs

Location

Singapore

Skillset Category

– Strong Python programming (modular design, error handling, logging)

– Advanced SQL (joins, window functions, optimization)

– Hands-on experience with Pandas and Kafka for data processing

Mandatory Skills

– Experience with orchestration tools such as Apache Airflow or Prefect (or equivalent)

– Experience with package and dependency management (pip, virtual environments)

Candidates who require work passes need not apply

 

Job Description

Core Mandatory Skills

– Strong Python programming (modular design, error handling, logging)

– Advanced SQL (joins, window functions, optimization)

– Hands-on experience with Pandas and Kafka for data processing

– Experience with orchestration tools: Apache Airflow, Prefect, or equivalent

– Experience with package and dependency management (pip, virtual environments)

Key Competencies Expected

Pipeline Development & Orchestration

– Design, build, and maintain data pipelines using Python, SQL, and orchestration tools

– Develop and manage Directed Acyclic Graphs (DAGs) / flows using orchestration tools such as Apache Airflow and Prefect (a minimal sketch follows this group)

– Ensure pipelines are idempotent, scalable, and fault-tolerant

– Implement logging, monitoring, and alerting for pipeline observability
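For illustration only, a minimal DAG of the kind described above might look like the following sketch, assuming Airflow 2.x; the DAG id, schedule, and task function are hypothetical placeholders, not requirements of this role.

import logging
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

log = logging.getLogger(__name__)

def extract_and_load(**context):
    # Idempotent by design: re-running the same logical date rewrites the same partition
    log.info("Processing partition for logical date %s", context["ds"])

with DAG(
    dag_id="example_daily_pipeline",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # schedule_interval on older Airflow versions
    catchup=False,
):
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)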

Package & Dependency Management

– Install, upgrade, and manage Python packages in controlled environments

– Maintain dependency manifests (e.g. requirements.txt) with version pinning (an illustrative manifest follows this group)

– Resolve dependency conflicts and ensure compatibility across environments (dev, UAT, prod)

– Support deployments in restricted or air-gapped environments where required
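As an illustration of the version pinning mentioned above, a manifest might pin exact versions so that dev, UAT, and prod environments resolve identically; the package names and version numbers below are placeholders, not a prescribed stack.

# requirements.txt (illustrative versions only)
pandas==2.1.4
kafka-python==2.0.2
apache-airflow==2.8.1
prefect==2.14.10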

Security Remediation & Library Fixes

– Analyse vulnerability reports from security scanning tools (e.g., CVE findings)

– Upgrade or replace vulnerable libraries while maintaining pipeline stability

– Fix broken imports, deprecated APIs, and compatibility issues arising from library updates (see the example after this group)

– Collaborate with security teams to ensure compliance with organisational standards
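One common example of the kind of fix described above, assuming a pandas upgrade: DataFrame.append was removed in pandas 2.0, so legacy calls are typically rewritten with pd.concat. The data frames below are placeholders.

import pandas as pd

daily = pd.DataFrame({"order_id": [1, 2], "amount": [10.0, 20.0]})
new_rows = pd.DataFrame({"order_id": [3], "amount": [5.0]})

# Legacy call, removed in pandas 2.0:
#   combined = daily.append(new_rows, ignore_index=True)
# Supported replacement after upgrading the library:
combined = pd.concat([daily, new_rows], ignore_index=True)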

Code Refactoring & Optimization

– Refactor legacy code across:

– Data ingestion APIs

– Data transformation (Pandas/SQL)

– Model training and inference pipelines

– Orchestration workflows

– Improve code modularity, readability, and performance

– Ensure backward compatibility and minimal disruption to production systems

Data Processing & Integration

– Perform data transformation and validation using Pandas and SQL

– Integrate streaming data pipelines using Kafka (producers/consumers); a minimal sketch follows this group

– Ensure schema consistency and data quality across pipeline stages
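A minimal sketch of the Kafka-plus-Pandas integration described above, assuming the kafka-python client, a hypothetical "orders" topic carrying JSON messages, and placeholder column names.

import json
import pandas as pd
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",                                  # hypothetical topic
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= 100:
        df = pd.DataFrame(batch)
        # Basic validation: required columns present, no null keys
        assert {"order_id", "amount"} <= set(df.columns)
        df = df.dropna(subset=["order_id"])
        # ... transform and load df downstream ...
        batch.clear()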

Testing, Deployment & Support

– Implement unit and integration tests for pipelines (an example test follows this group)

– Support workflows for deployment of data pipelines

– Troubleshoot pipeline failures and perform root cause analysis

– Provide production support and continuous improvement of data workflows
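For the testing expectation above, a unit test might exercise a transformation function in isolation and run under pytest; the function and expected values below are hypothetical.

import pandas as pd

def add_total_column(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical transformation under test
    out = df.copy()
    out["total"] = out["quantity"] * out["unit_price"]
    return out

def test_add_total_column():
    df = pd.DataFrame({"quantity": [2, 3], "unit_price": [5.0, 1.0]})
    result = add_total_column(df)
    assert result["total"].tolist() == [10.0, 3.0]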

Streaming & Integration Skills

– Working knowledge of Kafka (topics, partitions, consumers, producers)

– Experience handling schema evolution and message serialization/deserialization

Security & Reliability

– Experience resolving vulnerabilities from security scans

– Understanding of secure coding practices

– Experience working in regulated or high-security environments

 

Qualifications

Experience Requirements

– Preferably at least 2-3 years of experience in data engineering

– Prior experience working with production data pipelines

– Prior experience handling dependency conflicts, library upgrades, and refactoring in live systems

– Ability to work across multiple layers (API / data / orchestration / ML)