Data Architect (2 Positions)
Location:
Toronto, Edmonton
Requirements:
Azure, Architecture, Data Lakes, Data Warehousing
Client:
IT Digital Consulting Client
Description:
Primary Technical Skills:
Data architecture experience implementing data lakes, data warehouses, or hybrid data platforms in the cloud
Azure data platform - ADLS (data lake), Synapse, Purview, ADF, etc.
Exposure to Hadoop distributions (CDH and Hortonworks HDP)
Developing Conceptual models and semantic views
Experience with Synapse warehouse models and a singleton data ingestion approach
Implementing data pipelines for retrieval/ingestion/presentation/semantics
Using data ingestion technologies and design patterns, and producing new frameworks/framework variants
Skills in data modelling (both structured and unstructured data), working directly with business stakeholders and data scientists
Data Management using CDH and metadata repository technologies (Collibra/Purview)
Skills in data acquisition (landing, ingestion and metadata) of various data types, including Salesforce, SAP, XML, JSON, flat file systems (ISAM/VSAM) and relational data
Skills in data manipulation: Azure Data Factory and Databricks; eventing environments orchestrated by Kafka
Data presentation via visualisation technologies, e.g. Power BI; exposure to Tableau
Additional Technical Skills:
Data management and analytics
Data discovery processes
Familiar with Scrum and working in an Agile team
Generation of data catalogues/models from the metadata repository and exposing them to various user communities
ELT technologies – Azure Data Factory, Sqoop, Syncsort/ETL
Developing technical specifications corresponding with the lake architecture, based on business and functional analysis
Capturing and enhancing metadata from source systems and management of data drift
Security integration and review of security policies for lake resources; implementing those policies in Sentry or Ranger to provide authorisation controls at the required granularity
Experience in large-scale distributed processing architectures, e.g. enterprise distributed caching and low-latency data-driven platforms
Expertise in fault-tolerant systems in Azure, including clustering and multi-AZ deployments
Working knowledge of installing, configuring and operationalising big data clusters, e.g. on Azure Blob Storage or Azure Data Lake Storage Gen2
Understanding of configuration management and DevOps technologies (e.g. GitLab/Jenkins/Octopus)
Key Responsibilities:
Architect & Design: Lead the architectural design of petabyte-scale data platforms, ensuring scalability, reliability, and performance. Develop and maintain robust data infrastructure to support large-scale, enterprise-grade applications.
Data Pipeline Development: Design, build, and optimize ETL pipelines from scratch, ensuring the efficient movement, transformation, and storage of large data sets.
Technology Leadership: Provide technical leadership in implementing technologies like PySpark, AWS, Databricks, and other cloud-based solutions. Guide the team in selecting the best tools and practices for specific projects.
Collaboration & Strategy: Work closely with cross-functional teams, including stakeholders and directors, to understand and translate business requirements into technical solutions. Influence the strategic direction of data architecture across projects.
Apply Now