- Develop and operate cross-company data infrastructure.
- Manage data integration and data pipelines.
- Aggregate data from various internal products, services, and CRM tools into a data lake.
- Appropriately distribute the data collected in the data lake to analysis and ML platforms.
- Build and maintain data analysis infrastructure.
- Optimize DWH performance.
- Ensure data quality.
YOUR SKILLS AND EXPERIENCE:
- Educational background:
- Graduated from a top-level university/college in Vietnam
- Majored in Mathematics, Computer Science, or a related field
- Technical skills:
- Experience developing AI algorithms, or an understanding of how they work from the ground up
- Developed an Application Programming Interface (API) as an internet service
- Operated and managed an API service using appropriate software
- Measured performance using metrics software
- Monitored and tuned accuracy using operational data
- Development experience with Python and SQL (BigQuery preferred)
- Development and operational experience using AWS and Google Cloud or similar
- Configuration management: Terraform
- CI/CD: GitHub Actions
- Monitoring and logging: Datadog, Cloud Monitoring, CloudWatch
- Project management: JIRA Cloud, Miro, ...
- Documentation: Kibela, Google Workspace
- Spark: 2-3 years of experience
- Airflow: 1 year (as a user, not an administrator)
- Experience developing systems with cloud services on AWS, GCP, or Azure (AWS preferred)
- Understanding of cloud service components from an architectural perspective (what each can do, how it works, and what to watch out for)
- Designed and integrated a system with appropriate security
- Experience in a Project Manager or Product Manager role
- Led a data engineering team of at least 2 members for at least half a year (a larger team and a longer tenure are a plus)
- Open-minded
- Able to understand what is being asked and why