DATA AI ENGINEER

Years of Experience

2 years

Location

Malaysia

Mandatory Skills

  • Strong experience in data engineering, pipeline orchestration, and distributed data processing.
  • Proficiency in Python and SQL.
  • Experience with cloud data platforms, object storage, warehouses, workflow orchestration, and message queues.
  • Familiarity with unstructured text data, NLP workflows, and ML data preparation.
  • Understanding of data modeling, system reliability, monitoring, and performance tuning.

Job Description

We are looking for a Data Engineer to own the data pipelines, storage architecture, and AI-enablement layer for a media monitoring platform. This role will focus on building reliable data foundations for large-scale ingestion, processing, enrichment, and serving of multilingual media content for NLP and machine learning use cases. 

Core Responsibilities 

  • Design and maintain scalable batch and streaming pipelines for news, social media, web, and broadcast data ingestion. 
  • Build ETL/ELT processes to clean, normalize, deduplicate, enrich, and structure unstructured media content. 
  • Prepare datasets, labels, and feature-ready data for AI/ML model training, fine-tuning, and evaluation. 
  • Support NLP workflows such as entity extraction, topic classification, sentiment analysis, clustering, summarization, and alerting.
  • Ensure data quality, schema consistency, lineage, observability, and fault tolerance across the platform. 
  • Optimize storage, compute, and query performance across the data stack. 
  • Implement governance, access control, and auditability for sensitive or regulated content. 
  • Work closely with the full stack developer to align backend data services with product requirements and API consumption patterns.

Preferred Qualifications

  • Background in media monitoring, social listening, content intelligence, or news analytics.
  • Experience with multilingual datasets and text-heavy pipelines.
  • Exposure to LLM-based systems, vector search, or retrieval pipelines.
  • Familiarity with secure deployment environments and data governance practices.