Senior Data Engineer
Location: Vietnam
Industry: Information & Communications Technology (ICT)
Job reference: 16308
Job type: Permanent
Salary: 80,000,000-120,000,000 VND
Consultant email: duong.tran@manpower.com.vn
Date posted: Jul 05, 2024
Key Activities:
Data Infrastructure Design and Maintenance:
- Architect, maintain, and enhance analytical and operational services and infrastructure, including data lakes, databases, data pipelines, and metadata repositories, to ensure accurate and timely delivery of actionable insights.
Collaboration:
- Work closely with data science teams to design and implement data schemas and models, partner with product teams to integrate new data sources, and collaborate with other data engineers to adopt cutting-edge technologies in the data space.
Data Processing:
- Develop and optimize large-scale batch and real-time data processing systems to support the organization's growth and improvement initiatives.
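For a sense of what this looks like in practice, here is a minimal PySpark batch-aggregation sketch; the S3 paths, table, and column names are hypothetical and not taken from this posting.

```python
# A minimal PySpark batch-aggregation sketch; paths and column
# names are hypothetical, not from this posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_event_counts").getOrCreate()

# Read raw events from the data lake (hypothetical path).
events = spark.read.parquet("s3://example-bucket/raw/events/")

# Aggregate event counts per day and event type.
daily_counts = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "event_type")
    .agg(F.count("*").alias("event_count"))
)

# Write the curated output back to the lake, partitioned by date.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_event_counts/"
)
```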
Workflow Management:
- Utilize workflow scheduling and monitoring tools like Apache Airflow and AWS Batch to ensure efficient data processing and management.
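As a rough illustration of the scheduling side of the role, the sketch below shows a minimal Airflow DAG, assuming Airflow 2.x; the DAG id, schedule, and task callables are hypothetical.

```python
# A minimal Airflow DAG sketch, assuming Airflow 2.x; the DAG id,
# schedule, and task callables are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_events():
    print("extracting")  # placeholder: pull raw records from a source system


def load_events():
    print("loading")  # placeholder: write transformed records to the warehouse


with DAG(
    dag_id="daily_events_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_events)
    load = PythonOperator(task_id="load", python_callable=load_events)
    extract >> load  # run extract before load
```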
Quality Assurance:
- Implement robust testing strategies to ensure the reliability and usability of data processing systems.
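One plausible form such a testing strategy takes is unit tests over individual transforms; the sketch below uses pytest conventions with pandas, and the clean_events transform is a hypothetical stand-in.

```python
# A minimal data-quality test sketch using pytest conventions and
# pandas; clean_events is a hypothetical stand-in transform.
import pandas as pd


def clean_events(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical transform: drop rows missing a user_id and
    # de-duplicate on event_id.
    return df.dropna(subset=["user_id"]).drop_duplicates(subset=["event_id"])


def test_clean_events_removes_nulls_and_duplicates():
    raw = pd.DataFrame(
        {
            "event_id": [1, 1, 2, 3],
            "user_id": ["a", "a", None, "b"],
        }
    )
    cleaned = clean_events(raw)
    assert cleaned["user_id"].notna().all()
    assert cleaned["event_id"].is_unique
```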
Continuous Improvement:
- Stay abreast of emerging technologies and best practices in data engineering, and propose and implement optimizations to enhance development efficiency.
Required Skills:
Technical Expertise:
- Proficient in Unix environments, distributed and cloud computing, Python data frameworks (e.g., pandas, PySpark), version control systems (e.g., Git), and workflow scheduling tools (e.g., Apache Airflow).
Database Proficiency:
- Experience with columnar and big data databases like Athena, Redshift, Vertica, and Hive/Hadoop.
Cloud Services:
- Familiarity with AWS services such as Glue, EMR, EC2, S3, and Lambda, or their equivalents on other cloud platforms.
Containerization:
- Experience with container management and orchestration tools like Docker, ECS, and Kubernetes.
CI/CD:
- Knowledge of CI/CD tools such as Jenkins, CircleCI, or AWS CodePipeline.
Nice-to-have Requirements:
Programming Languages:
- Familiarity with JVM languages like Java or Scala.
Database Technologies:
- Experience with RDBMS (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., DynamoDB, Redis).
BI Tools:
- Exposure to enterprise BI tools like Tableau, Looker, or Power BI.
Data Science Environments:
- Understanding of data science environments like Amazon SageMaker or Databricks.
Monitoring and Logging:
- Knowledge of log ingestion and monitoring tools like the ELK stack or Datadog.
Data Privacy and Security:
- Understanding of data privacy and security tools and concepts.
Messaging Systems:
- Familiarity with distributed messaging and event streaming systems like Kafka or RabbitMQ.
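For context, the sketch below shows a minimal producer using the kafka-python client; the broker address, topic name, and payload are hypothetical.

```python
# A minimal Kafka producer sketch using the kafka-python client;
# the broker address, topic name, and payload are hypothetical.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # hypothetical broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish one event to a hypothetical topic, then flush buffered sends.
producer.send("user-events", {"event_id": 1, "event_type": "click"})
producer.flush()
```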