Experience: 3+ years
Location: Singapore
Core Mandatory Skills
– Strong Python programming (modular design, error handling, logging)
– Advanced SQL (joins, window functions, optimization)
– Hands-on experience with Pandas and Kafka for data processing
– Experience with orchestration tools: Apache Airflow, Prefect (or equivalent)
– Experience with package and dependency management (pip, virtual environments)
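As an illustration of the "advanced SQL" skills listed above, here is a minimal window-function sketch using Python's built-in sqlite3 module; the table and column names are hypothetical, and RANK() OVER requires SQLite 3.25 or newer.

```python
import sqlite3

# Illustrative sketch only: table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 50), ("alice", 120), ("bob", 80)],
)

# Window function: rank each customer's orders by amount,
# partitioned per customer (requires SQLite >= 3.25).
rows = conn.execute(
    """
    SELECT customer, amount,
           RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
    FROM orders
    ORDER BY customer, rnk
    """
).fetchall()
```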
Key Competencies Expected
Pipeline Development & Orchestration
– Design, build, and maintain data pipelines using Python, SQL, and orchestration tools
– Develop and manage Directed Acyclic Graphs (DAGs) / flows using orchestration tools such as Apache Airflow and Prefect
– Ensure pipelines are idempotent, scalable, and fault-tolerant
– Implement logging, monitoring, and alerting for pipeline observability
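A minimal sketch of the idempotency and logging expectations above, using only the standard library (table and column names are hypothetical; the upsert syntax requires SQLite 3.24+). Replaying the same load leaves the table unchanged, which is what makes a pipeline task safe to retry.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def load_users(conn, rows):
    """Idempotent load: re-running with the same rows leaves the table unchanged."""
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
        rows,
    )
    log.info("upserted %d rows", len(rows))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
load_users(conn, [(1, "alice"), (2, "bob")])
load_users(conn, [(1, "alice"), (2, "bob")])  # replayed run: no duplicates
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```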
Package & Dependency Management
– Install, upgrade, and manage Python packages in controlled environments
– Maintain dependency manifests (e.g., requirements.txt) with version pinning
– Resolve dependency conflicts and ensure compatibility across environments (dev, UAT, prod)
– Support deployments in restricted or air-gapped environments where required
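The pinning and air-gapped workflow above might look like the following sketch (paths and file names are hypothetical; the pip flags shown are standard):

```shell
# Create an isolated environment and pin exact versions
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
pip freeze > requirements.txt        # pin the fully resolved versions

# Air-gapped deployment: fetch wheels on a connected host, install offline
pip download -r requirements.txt -d wheels/
pip install --no-index --find-links wheels/ -r requirements.txt
```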
Security Remediation & Library Fixes
– Analyse vulnerability reports from security scanning tools (e.g., CVE findings)
– Upgrade or replace vulnerable libraries while maintaining pipeline stability
– Fix broken imports, deprecated APIs, and compatibility issues arising from library updates
– Collaborate with security teams to ensure compliance with organisational standards
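Fixing broken imports after a library upgrade often comes down to a guarded import, as in this sketch (a real example: `collections.Mapping` was removed in Python 3.10 in favour of `collections.abc.Mapping`):

```python
# Guarded import: works both before and after the API was moved/removed.
try:
    from collections.abc import Mapping  # present since Python 3.3
except ImportError:  # fallback for very old interpreters
    from collections import Mapping

# Behaviour is unchanged regardless of which import succeeded.
assert isinstance({}, Mapping)
```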
Code Refactoring & Optimization
– Refactor legacy code across:
– Data ingestion APIs
– Data transformation (Pandas/SQL)
– Model training and inference pipelines
– Orchestration workflows
– Improve code modularity, readability, and performance
– Ensure backward compatibility and minimal disruption to production systems
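A small sketch of the refactoring style implied above: extract inline logic into a pure, testable function, and keep a backward-compatible alias so existing callers are not broken (names are hypothetical):

```python
def clean_record(rec):
    """Pure, testable transformation extracted from a legacy inline script."""
    return {"id": int(rec["id"]), "name": rec["name"].strip().lower()}

# Backward-compatible alias: production callers using the old name keep working.
process = clean_record

result = clean_record({"id": "7", "name": "  Alice "})
```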
Data Processing & Integration
– Perform data transformation and validation using Pandas and SQL
– Integrate streaming data pipelines using Kafka (producers/consumers)
– Ensure schema consistency and data quality across pipeline stages
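A hedged Pandas sketch of the transform-and-validate step above (column names and rules are hypothetical): drop rows missing a key, enforce the expected dtype, and quarantine rows that fail a quality check rather than passing them downstream.

```python
import pandas as pd

# Hypothetical batch: one row has a missing key, one violates a rule.
raw = pd.DataFrame({"user_id": ["1", "2", None], "amount": [10.0, -5.0, 3.0]})

# Transform: drop rows missing the key, enforce the expected dtype.
df = raw.dropna(subset=["user_id"]).copy()
df["user_id"] = df["user_id"].astype(int)

# Validate: quarantine rows that violate simple quality rules.
invalid = df[df["amount"] < 0]
df = df[df["amount"] >= 0]
```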
Testing, Deployment & Support
– Implement unit and integration tests for pipelines
– Support deployment workflows for data pipelines
– Troubleshoot pipeline failures and perform root cause analysis
– Provide production support and continuous improvement of data workflows
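The unit-testing expectation above can be sketched with the standard-library unittest module; the transformation under test is hypothetical.

```python
import unittest

def to_cents(amount):
    """Hypothetical pipeline transformation under test: '12.34' -> 1234."""
    return round(float(amount) * 100)

class ToCentsTest(unittest.TestCase):
    def test_fractional(self):
        self.assertEqual(to_cents("12.34"), 1234)

    def test_whole(self):
        self.assertEqual(to_cents("5"), 500)

# Run the suite programmatically (pytest discovery works equally well).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToCentsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```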
Streaming and Integration Skills
– Working knowledge of Kafka (topics, partitions, consumers, producers)
– Experience handling schema evolution and message serialization/deserialization
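Message serialization/deserialization can be sketched without a running broker: these are JSON value (de)serializers of the kind passed to a Kafka client's producer/consumer configuration (the client library itself is not shown).

```python
import json

def serialize(value):
    """Encode a message value to bytes for the wire (e.g. a producer's value_serializer)."""
    return json.dumps(value, sort_keys=True).encode("utf-8")

def deserialize(payload):
    """Decode wire bytes back into a dict (e.g. a consumer's value_deserializer)."""
    return json.loads(payload.decode("utf-8"))

msg = {"event": "order_created", "order_id": 42}
round_trip = deserialize(serialize(msg))
```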
Security & Reliability
– Experience resolving vulnerabilities from security scans
– Understanding of secure coding practices
– Experience working in regulated or high-security environments
Qualifications
Experience Requirements
– Preferably at least 2-3 years of experience in data engineering
– Prior experience working with production data pipelines
– Prior experience handling dependency conflicts, library upgrades, and refactoring in live systems
– Ability to work across multiple layers (API / data / orchestration / ML)