Director, Data Ops

Location: Hoan Kiem
Industry: Financial Services - Banking, Insurance
Job ID: 14649
Job type: Permanent
Salary: Negotiable
Contact email: thanh.tran@manpower.com.vn
Date posted: October 25, 2023
I. Key Responsibilities
  1. Automation
The Data Operations Director helps build automation tools to manage operations.
The aim is to automate operational functions rather than perform them manually. Such functions include:
  • Continuous integration and continuous delivery
  • Incident response
  • Alerts
  • Monitoring
They are also responsible for ensuring that the underlying infrastructure is running smoothly, and that systems and tools are working as expected.
They also monitor critical applications and services to minimize downtime and ensure their availability.
 
  2. Issue resolution
The DataOps team works closely with developers, especially when issues arise, helping with troubleshooting and providing consultation when alerts are issued.
When a developer runs into a problem, the team investigates and resolves the issue.
Following incident resolution, the engineer revisits the issue and determines the root cause so that it does not happen again.
 
  3. Cross-team collaboration
DataOps works across different teams, mainly operations and development. By building reliable systems and supporting these teams, DataOps gives them more time to focus on building new features and delivering them to customers faster.
 
Common tools and experience needed:
  • Monitoring: tools such as AWS CloudWatch and New Relic
  • Incident management/on-call: tools such as TWS and other alerting tools
  • Project management and issue tracking: tools such as Jira and Trello
  • Infrastructure orchestration: tools such as Terraform and SaltStack
 
  4. Other responsibilities
  • Administer production jobs
  • Understand debugging information
  • “Drain” traffic away from a cluster
  • Roll back a bad software push
  • Block or rate-limit unwanted traffic
  • Bring up additional serving capacity
  • Use the monitoring systems (for alerting and dashboards)
 
II. Qualifications
  • 10+ years of experience, including significant experience in the DevSecOps and SRE space
  • Experience designing and running robust, highly scalable data platforms and data pipelines
  • Experience leading a team of experienced SRE / DevSecOps professionals
  • Extensive experience designing and supporting both streaming and batch ETL pipelines
  • Clear understanding of distributed computing, especially in databases
  • Experience with open-source technologies (Spark, Kafka, Presto, Hive, Cassandra, etc.)
  • Experience working on any of the cloud platforms (GCP, AWS, Azure)
  • Strong communication and presentation skills, including presenting to C-level executives
  • Ability to manage numerous concurrent requests, prioritize them, and deliver