JOB DESCRIPTION
Role Description:
Play a key role in building next-generation data warehousing systems that power impactful business decisions for large corporations. Design and implement scalable data warehouse architectures, including ETL pipelines and workflow orchestration tools on leading cloud platforms, to enable data science and analytics programs. Display client-centricity and a results-oriented approach, with a learning mindset, attention to detail, and the ability to prioritize one's work. Ugam, a Merkle company, is Great
Key Responsibilities:
Create data architecture that is flexible, scalable, consistent for cross-functional use, and aligned to stakeholder requirements
Leverage cloud services and build cloud-deployable solutions, preferably in a serverless environment
Design, develop, and maintain scalable and resilient ETL/ELT pipelines for handling large volumes of complex data
Deploy a real-time data governance strategy (organize, transform, activate) using workflow orchestration tools, ensuring compliance with data modeling, integrity, and privacy principles
Ensure that the data pipeline infrastructure meets the analysis, reporting, and data science needs of the organization
Collaborate with stakeholders like data analysts, data scientists, and IT infrastructure/DevOps teams
Qualifications/Certifications:
Bachelor's / Master's in Computer Science, Engineering, Statistics, Information Systems or other quantitative fields
Additional Information
What are we looking for?
5 years of industry experience in data engineering, data science, or a related field, with a good understanding of cloud services
Experience working on warehousing systems, with the ability to contribute to implementing end-to-end, loosely coupled/decoupled technology solutions spanning data ingestion and processing, data storage, data access, and integration with business user-centric analytics/BI frameworks
Must have skills:
Hands-on experience with Python and advanced-level SQL or NoSQL (Google Bigtable, MongoDB, or similar)
One or more data warehousing systems (Redshift, BigQuery, Snowflake, or similar), and one or more ETL tools (Talend, SSIS, Informatica PowerCenter, or similar)
At least one cloud environment and related services (AWS, GCP, Azure, Dataproc, or similar), and one or more CI/CD tools (Git, Bitbucket, Jenkins, Docker, or similar)