Job Description
This role focuses on the design, development, and optimization of data processing systems and architectures. The candidate will lead the creation of real-time and offline data pipelines, ensuring alignment with business objectives and technical requirements. Key responsibilities include defining data modeling strategies, implementing low-latency, high-performance ETL processes, and establishing data platform engineering standards. The position requires expertise in developing scalable solutions that support data governance, quality assurance, and security protocols. The candidate will also maintain documentation, monitor system operations, and resolve technical issues affecting the efficiency and reliability of data processing.
Key Responsibilities
- Design and build real-time and offline data processing systems with emphasis on performance, stability, and scalability
- Develop data modeling frameworks for structured and unstructured data sources
- Create and maintain ETL processes that ensure data consistency and minimize latency
- Establish technical specifications for data platform engineering, including documentation standards and operational monitoring protocols
- Implement data governance frameworks to ensure compliance with regulatory requirements and data security policies
- Monitor data quality metrics and develop corrective measures for data anomalies
- Collaborate with cross-functional teams to identify data processing needs and optimize system performance
- Conduct root cause analysis for data processing issues and propose technical solutions
- Develop and maintain metadata management systems for data lineage tracking and cataloging
- Ensure the reliability and security of data platforms through continuous improvement and risk mitigation strategies
Job Requirements
- At least 5 years of proven experience designing and implementing data processing systems
- Expertise in building ETL and streaming pipelines with tools such as Apache Spark, Apache Kafka, or Apache Flink
- Strong understanding of data modeling techniques and database optimization strategies
- Proficiency in creating technical documentation and maintaining code repositories
- Knowledge of data governance frameworks and compliance standards (e.g., GDPR, HIPAA)
- Experience with data quality management tools and methodologies
- Ability to develop metadata management solutions for data cataloging and lineage tracking
- Strong problem-solving skills with experience in optimizing data processing workflows
- Proficiency in monitoring system performance and implementing alerting mechanisms
- Excellent communication skills for collaborating with stakeholders and presenting technical solutions
- Preferred: Experience with cloud-based data platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes)
- Preferred: Familiarity with data security protocols and encryption standards
- Preferred: Strong background in data engineering best practices and DevOps methodologies