Data Architect (2 Positions)
Location:
Toronto, Edmonton
Requirements:
Azure, Architecture, Data Lakes, Data Warehousing
Client:
IT Digital Consulting Client
Description:
Primary Technical Skills:
Data architecture experience implementing data lakes, data warehouses, or hybrid data platforms in the cloud
Azure data platform - ADLS (data lake), Synapse, Purview, ADF, etc.
Exposure to Hadoop distributions (CDH and Hortonworks HDP)
Developing Conceptual models and semantic views
Experience with Synapse warehouse models and a singleton data ingestion approach
Implementing data pipelines for retrieval/ingestion/presentation/semantics
Using data ingestion technologies and design patterns, and producing new frameworks/framework variants
Skills in data modelling (both structured and unstructured data), working directly with business stakeholders and data scientists
Data Management using CDH and metadata repository technologies (Collibra/Purview)
Skills in data acquisition (landing, ingestion and metadata) of various data types, including Salesforce, SAP, XML, JSON, flat file systems (ISAM/VSAM) and relational data
Skills in data manipulation: Azure Data Factory and Databricks; eventing environments orchestrated by Kafka
Data presentation via visualisation technologies, e.g. Power BI; exposure to Tableau
Additional Technical Skills:
Data management and analytics
Data discovery processes
Familiar with Scrum and working in an Agile team
Generation of data catalogues/models from the metadata repository and exposing them to various user communities
ELT technologies – Azure Data Factory, Sqoop, Syncsort/ETL
Developing technical specifications corresponding with the lake architecture, based on business and functional analysis
Capturing and enhancing metadata from source systems and management of data drift
Security integration and review of security policies for lake resources; implementing those policies in Sentry or Ranger to provide authorisation controls at the required granularity
Experience in large-scale distributed processing architectures, e.g. enterprise distributed caching and low-latency data-driven platforms
Expertise in fault-tolerant systems in Azure, including clustering and multi-AZ deployments
Working knowledge of installing, configuring and operationalising big data clusters, e.g. on Azure Blob Storage or Azure Data Lake Storage Gen2
Understanding of configuration management and DevOps technologies (e.g. GitLab/Jenkins/Octopus)
Key Responsibilities:
Architect & Design: Lead the architectural design of petabyte-scale data platforms, ensuring scalability, reliability, and performance. Develop and maintain robust data infrastructure to support large-scale, enterprise-grade applications.
Data Pipeline Development: Design, build, and optimize ETL pipelines from scratch, ensuring the efficient movement, transformation, and storage of large data sets.
Technology Leadership: Provide technical leadership in implementing technologies like PySpark, AWS, Databricks, and other cloud-based solutions. Guide the team in selecting the best tools and practices for specific projects.
Collaboration & Strategy: Work closely with cross-functional teams, including stakeholders and directors, to understand and translate business requirements into technical solutions. Influence the strategic direction of data architecture across projects.
Apply Now