Ajay Shrirang Kharade

Charlotte, NC

Summary

Data-driven, multi-faceted technologist and leader with global experience in data engineering implementations, Big Data, and data & analytics consulting. Proven track record of driving data-driven transformations and implementing successful data strategies aligned with business objectives. Leadership roles have provided the skills to define and execute comprehensive data strategies that enhance business performance. Big Data architect and lead developer, extensively assisting clients with Big Data initiatives, defining technology strategy, and architecting and implementing Big Data solutions. Business development and implementation experience across industries such as banking & financial services, financial regulation, retail, and consumer products. Led on-premises cluster and cloud data platform migrations, with expertise in architectures on Hadoop, AWS, Azure, and GCP. Designed and developed data modernization initiatives focused on cloud and on-premises solutions. Designed and implemented ETL pipelines, data modeling, data mapping, and data governance strategies, including data lineage, security, cataloging, discovery, and quality. Proficient in data integration architecture, with experience in data warehouse, data lake, lakehouse, data mesh, and data federation implementations across diverse data solutions.

Overview

14 years of professional experience
1 certification

Work History

Data Architect

Synechron Inc.
05.2023 - Current
  • Client - Wells Fargo (financial services company); Project - Embedded IT.
  • Developed and delivered business information solutions.
  • Gathered, defined and refined requirements, led project design and oversaw implementation.
  • Established and secured enterprise-wide data analytics structures.
  • Designed data models for complex analysis needs.
  • Managed identification, protection and use of data assets.
  • Designed internal process improvements to automate repetitive tasks, shortening data delivery times.
  • Evaluated and created a unified data solution for the technical data/reporting processes of 500+ assets, within the context of compliance and segregation of duties.
  • Led the relocation of select technical processes into the organization's recommended tech stack, ensuring system optimization.
  • Developed and promoted a unified architecture for asset remediation on the Hadoop data platform.
  • Actively participated in architecture review and approval discussions, ensuring alignment with best practices.
  • Designed a unified architecture to address various data integration needs across the organization.
  • Developed and optimized ETL data pipelines to handle diverse types of data loads efficiently.
  • Established a comprehensive Business Intelligence (BI) reporting architecture to facilitate seamless data flow and analysis.
  • Conducted in-depth interviews with business users to understand functional requirements and translated them into technical solutions.
  • Provided technical leadership and mentorship to a team of data analysts and developers, fostering a culture of data-driven decision-making.
  • Created and maintained documentation related to data architecture, including data dictionaries, metadata, and data lineage, ensuring transparency and ease of understanding in the data ecosystem.
  • Environment: Hadoop, Spark, PySpark, Hive, Unix shell scripting, Python, Power BI, Tableau, Dremio, SQL Server, Oracle, Teradata, SSIS, Autosys, etc.

Data Solution Architect

EPAM Systems
11.2019 - 05.2023
  • Client 1 - PNC Bank; Project - ESG Analytics Solution.
  • Client 2 - Canadian Tire; Project - Data Lake & Enterprise Cost Analysis System.
  • Client 3 - Vanguard; Project - Content Management.
  • Client 4 - Waymo; Project - Data Warehouse / Analytics Platform.
  • Developed comprehensive documentation for solution design specifications, ensuring clear communication between stakeholders at all stages of project lifecycle.
  • Supervised deployments and provided troubleshooting and user support.
  • Managed project planning, resource allocation, scope, schedule, status and documentation.
  • Conducted technical workshops and education sessions for customers and development / testing teams.
  • Worked with customers and prospective customers to develop integrated solutions, leading detailed architectural dialogues to facilitate delivery of a comprehensive solution.
  • Designed and implemented a complex ESG investment solution for the asset management group of a large financial bank.
  • As part of the data governance strategy, implemented a data quality framework for the on-prem Hadoop data lake.
  • Created and mentored development teams on coding-standard best practices across projects, for both on-prem and cloud-based systems.
  • Led data-related projects, collaborating with cross-functional teams to ensure project goals were met within established timelines and budgets.
  • Researched and evaluated new data technologies, tools, and platforms to determine their suitability for the organization's data architecture.
  • Created and maintained documentation related to data architecture, including data dictionaries, metadata, and data lineage.
  • Provided technical leadership and mentorship to data analysts and developers, fostering a culture of data-driven decision-making within the organization.
  • Planned and executed the integration of data from various sources into a unified data architecture, ensuring data quality and consistency.
  • Developed and maintained conceptual, logical, and physical data models to define the structure and relationships of data elements within the organization's data architecture.
  • Established and enforced data governance policies and procedures to ensure data privacy, security, compliance, and accuracy.
  • Worked on a GCP cloud solution, creating the data analytics architecture and implementing ETL data pipelines.
  • Developed a BI solution with dashboards that provide insights to business users.
  • Also addressed non-functional requirements such as data governance, data cataloging, data lineage, and data security in the overall solution.
  • Environment: Python, Azure Data Factory, Azure Databricks, Azure Data Lake Storage Gen2, SQL Server, Power Apps, Azure Functions, Azure Event Hubs, Azure Kubernetes Service (AKS), Cloudera Hadoop, Spark, PySpark, Hive, Impala, Unix shell scripting, Tableau, Amazon S3, EMR, DynamoDB, Glue, Lambda functions, Athena, Kinesis, AWS Redshift, RDS, Aurora, GCS, Google BigQuery, Cloud Dataprep, Cloud Dataproc, Cloud Spanner, Cloud SQL, Compute Engine, Looker, Pub/Sub, Cloud Data Fusion, Data Catalog, Cloud Composer, Cloud Dataflow.

Consultant (Big Data Architect/Lead Data Engineer)

SVAM International Inc.
08.2018 - 03.2019
  • Client - Barclays; Project - Investment Data Engineering Platform.
  • Worked with one of the top investment banks to set up its on-premises asset management data lake platform for engineering, BI reporting, and data science work.
  • As part of the governance strategy, designed and implemented a schema validation tool to detect schema changes before ingestion into the data lake (see the sketch after this section).
  • Analyzed the existing data model and created a flexible model that makes it easy to onboard new information sets.
  • Designed and created the integration architecture and strategy for one-time historical loads and incremental loads in an identical, required Parquet file format, so that downstream processing is straightforward.
  • Responsible for designing and implementing pipelines that collect, store, process, and analyze large volumes of information from various sources, ensuring data is transformed into a usable format and available for analysis and reporting.
  • Created a metadata management strategy to categorize system metadata and maintained it as part of the information catalog, so it is easily accessible within the organization.
  • Environment: Cloudera Hadoop, Spark, PySpark, Hive, Impala, Unix shell scripting, Python, QlikView, etc.
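Illustrative sketch only (not the client's implementation): a minimal pre-ingestion schema check of the kind described above, assuming PySpark and a baseline schema registered as StructType JSON. All paths and names are hypothetical.

```python
# Illustrative sketch of a pre-ingestion schema check (paths and names hypothetical).
# Compares an incoming dataset's schema against a registered baseline and blocks
# ingestion when columns were added, removed, or retyped.
import json

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType

spark = SparkSession.builder.appName("schema-validation").getOrCreate()


def load_expected_schema(path: str) -> StructType:
    """Load the registered baseline schema, stored as StructType JSON."""
    with open(path) as f:
        return StructType.fromJson(json.load(f))


def schema_drift(expected: StructType, actual: StructType) -> dict:
    """Report added, removed, and retyped columns between two schemas."""
    exp = {f.name: f.dataType.simpleString() for f in expected.fields}
    act = {f.name: f.dataType.simpleString() for f in actual.fields}
    return {
        "added": sorted(set(act) - set(exp)),
        "removed": sorted(set(exp) - set(act)),
        "retyped": sorted(c for c in set(exp) & set(act) if exp[c] != act[c]),
    }


expected = load_expected_schema("/schemas/trades.json")        # hypothetical path
incoming = spark.read.parquet("/landing/trades/2019-01-01")    # hypothetical path
drift = schema_drift(expected, incoming.schema)
if any(drift.values()):
    raise ValueError(f"Schema drift detected, blocking ingestion: {drift}")
```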

Manager In Projects (Data Architect)

Cognizant Technology Solutions
08.2016 - 07.2018
  • Client - Fossil (Retail); Project - Baseline - Gross Margin Report.
  • As a data solution architect, designed and implemented on-premises and cloud-based data analytics solutions for the world's largest fashion brand retailer.
  • Architected and implemented complex data analytics used day to day by executive members, providing valuable insights into business operations, customer behavior, and market trends and helping the organization make data-driven decisions and gain a competitive edge in its industry.
  • Provided technical leadership and mentorship to data analysts and developers, fostering a culture of data-driven decision-making within the organization.
  • Worked extensively with the enterprise infrastructure and cloud teams to set up resources on the on-prem Hadoop-based system and in Google Cloud Platform.
  • Actively involved in setting up the processes, policies, and guidelines that govern the management of data assets within the organization.
  • As a data architect in presales, typically worked in a consulting capacity, providing technical expertise and guidance to help clients develop and implement effective data management solutions.
  • Created technical proposals that outline the data management solution, its benefits, and its implementation plan.
  • Environment: Google Cloud Platform, Google Cloud Storage, Google BigQuery, Google Dataprep, Google Dataflow, UNIX shell scripting (RHEL), Google Cloud SDK, Hadoop, Pig, Hive, HDFS, SAP BODS, Azkaban 3.0 (workflow scheduler), Hortonworks Hadoop distribution, Spark Core, Spark SQL, Ambari Hive views, Apache Zeppelin, Amazon S3, EMR, DynamoDB, Glue, Lambda functions, Athena, Kinesis, AWS Secrets Manager, AWS Redshift, RDS, Aurora, etc.

Senior Consultant (Senior Data Engineer)

Capgemini India Pvt. Ltd., HSBC
07.2015 - 08.2016
  • Client - HSBC Bank; Project - Anti-Money Laundering (AML) use case for Alert Similarity Score.
  • Implemented a data solution to determine the similarity of AML transaction alerts; a similarity score is calculated by comparing alert attributes such as transaction amount, transaction type, location, involved parties, and timing (see the first sketch after this section).
  • Collaborated with data scientists, business analysts, and other stakeholders to understand business requirements and develop solutions that meet their needs.
  • Worked extensively on tuning and optimizing data processing and storage systems to improve performance and reduce processing times.
  • Architected and implemented a Sqoop-based data ingestion framework that automates importing data from a database into Hadoop, reducing manual effort and improving data accuracy and efficiency (see the second sketch after this section).
  • Additionally, the framework was customized to meet specific business needs, such as integrating with data validation or data quality tools, and provides greater control over data ingestion operations.
  • Mentored junior consultants, helping them enhance their skills and contribute more effectively to projects.
  • Facilitated workshops with clients to identify pain points, establish goals, and define actionable steps towards achieving desired outcomes.
  • Environment: HDP, Core Java, Hadoop, Pig, Hive, HBase, HDFS, Hue, Eclipse Luna, Ambari, Apache Ranger, Teradata, Oracle, Apache Zeppelin, etc.
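Illustrative sketch only (not the production HSBC solution): a minimal attribute-wise alert similarity score over the attributes named above. The field names, weights, and 7-day timing window are hypothetical.

```python
# Illustrative sketch of an attribute-wise AML alert similarity score.
# Fields, weights, and the 7-day timing window are hypothetical.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Alert:
    amount: float
    txn_type: str
    location: str
    parties: frozenset   # involved-party identifiers
    timestamp: datetime


def similarity(a: Alert, b: Alert) -> float:
    """Weighted similarity in [0, 1] over amount, type, location, parties, timing."""
    w = {"amount": 0.30, "type": 0.20, "location": 0.15, "parties": 0.25, "timing": 0.10}
    # Amount: ratio of the smaller to the larger value.
    hi = max(a.amount, b.amount)
    s_amount = (min(a.amount, b.amount) / hi) if hi else 1.0
    # Type and location: exact match.
    s_type = 1.0 if a.txn_type == b.txn_type else 0.0
    s_location = 1.0 if a.location == b.location else 0.0
    # Parties: Jaccard overlap of the involved-party sets.
    union = a.parties | b.parties
    s_parties = (len(a.parties & b.parties) / len(union)) if union else 1.0
    # Timing: decays linearly to 0 over a 7-day window.
    days_apart = abs((a.timestamp - b.timestamp).total_seconds()) / 86400.0
    s_timing = max(0.0, 1.0 - days_apart / 7.0)
    return (w["amount"] * s_amount + w["type"] * s_type + w["location"] * s_location
            + w["parties"] * s_parties + w["timing"] * s_timing)
```

Under this weighting, two alerts with the same type and location, heavily overlapping parties, close amounts, and nearby timestamps score near 1.0, while fully dissimilar alerts score near 0.0.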
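Likewise, a minimal sketch of a metadata-driven wrapper around `sqoop import`, of the kind the ingestion framework above automates, assuming incremental imports landed as Parquet. The table catalog, paths, and connection details are hypothetical.

```python
# Illustrative sketch of a metadata-driven wrapper around `sqoop import`
# (table catalog, paths, and connection details are hypothetical).
import subprocess

# One entry per source table; in practice this would live in a metadata store.
TABLES = [
    {"name": "TRADES", "check_column": "UPDATED_AT", "last_value": "2019-01-01 00:00:00"},
]


def build_sqoop_cmd(jdbc_url: str, user: str, spec: dict) -> list:
    """Assemble an incremental `sqoop import` into Parquet for one table."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", user,
        "--password-file", "/user/ingest/.db_password",   # hypothetical HDFS path
        "--table", spec["name"],
        "--target-dir", f"/data/raw/{spec['name'].lower()}",
        "--as-parquetfile",                   # land the data as Parquet
        "--incremental", "lastmodified",      # import only rows changed since last run
        "--check-column", spec["check_column"],
        "--last-value", spec["last_value"],
        "-m", "4",                            # parallel mappers
    ]


for spec in TABLES:
    cmd = build_sqoop_cmd("jdbc:oracle:thin:@dbhost:1521/ORCL", "ingest", spec)
    subprocess.run(cmd, check=True)
```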

Data Engineer, Senior Data Engineer, Lead Data Engineer

Sears Holdings
05.2011 - 07.2015
  • Project 1 - Web Intelligence
  • Project 2 - Pricing Dashboard
  • Project 3 - Amazon Real Time Pricing, Future State Emergency Pricing (FSEP)
  • Project 4 - Pricing Hub
  • During my tenure with this organization, played various roles within the field of data engineering, including data engineer, senior data engineer, and lead data engineer.
  • Worked mainly in the retail pricing domain, performing data engineering work that was critical for enabling data-driven decision-making, improving pricing strategies, optimizing inventory management, and enhancing the overall customer experience.
  • Played a critical role in designing, building, and maintaining the data infrastructure that enabled the retail client to derive insights from its data.
  • As a data engineer, I focused on building and maintaining data pipelines, ensuring data quality, and optimizing data processing.
  • As a senior data engineer, I took on additional responsibilities such as mentoring junior team members and collaborating with stakeholders to understand their data needs.
  • As a lead data engineer, I was responsible for leading a team of data engineers, managing project timelines, and driving data strategy for the organization.
  • Designed, built, and maintained data pipelines that collect, transform, and load data from various sources into data warehouses, data lakes, and other data storage systems.
  • Developed and maintained Extract, Transform, and Load (ETL) processes to ensure that data was transformed into a usable format for analysis.
  • Designed and implemented data models to support efficient data processing, querying, and analysis.
  • Kept up with the latest developments in data engineering technologies, tools, and best practices, and explored new approaches to solving data engineering challenges.
  • Environment: Cloudera Hadoop Distribution, Hadoop, HDFS, Pig, Hive, HBase, Sqoop, Flume, Kafka, Core Java, J2EE, Spring MVC, jQuery, JavaScript, MongoDB, Apache Storm, UNIX shell scripting, Core PHP 5, Presto, Teradata, Oracle, SVN, Git, Jenkins, SonarQube, etc.

Technical Associate

Tech Mahindra
09.2010 - 05.2011
  • Client - Telus Canada; Project - NetCracker.
  • Worked together with customer support, analyzing and correcting reported problems in a timely manner.
  • Produced high-quality code developed using sound computer science principles.
  • Designed, built, tested, and maintained scalable and stable off-the-shelf applications and custom-built technology solutions to meet business needs.
  • Acted as subject matter expert for application software developers and engineers.

Education

Bachelor of Engineering - Computer Science and Engineering

Shivaji University
Kolhapur, India
07.2007

Skills

  • ETL development / data pipeline design in the cloud:
  • Microsoft Azure - Azure Blob Storage, Azure Data Factory, Azure Data Explorer, Data Catalog, ADLS Gen2, Azure Databricks, HDInsight, Azure Purview, Power Apps, Azure Functions, Azure Event Hubs, microservices, Azure Synapse Analytics, Power BI, Azure DevOps, Cosmos DB, Azure SQL Managed Instance, Azure SQL Server, etc.
  • Amazon Web Services (AWS) - Amazon S3, EMR, DynamoDB, Glue, Lambda functions, Athena, Kinesis, AWS Secrets Manager, AWS Redshift, RDS, Aurora, Amazon DataZone, etc.
  • Google Cloud Platform (GCP) - GCS, Google BigQuery, Cloud Dataprep, Cloud Dataproc, Cloud Spanner, Cloud SQL, Kubernetes, App Engine, Compute Engine, Pub/Sub, Cloud Data Fusion, Data Catalog, Cloud Composer, Cloud Dataflow, Looker, Google Cloud SDK, IAM, Vertex AI, AutoML
  • Hadoop ecosystem - Cloudera/Hortonworks Hadoop, Pig, Hive, HBase, Sqoop, Oozie, Zookeeper, Azkaban (workflow scheduler), Apache Storm, Flume, Apache Kafka, Apache Tez, Spark, PySpark, Spark SQL (DataFrames, Datasets), Spark Streaming
  • Data modeling - ERwin, draw.io, MS Visio, Lucidchart
  • NoSQL databases - MongoDB, Cassandra
  • Cloud SaaS data technologies - Snowflake, Databricks
  • Business intelligence - Tableau, Power BI, Looker
  • SQL and databases - MySQL, Oracle, PostgreSQL, Teradata, etc.
  • Project management frameworks & tools - Agile (Scrum), Waterfall model, Jira
  • CI/CD tools - GitHub, Azure DevOps, Subversion (SVN), Jenkins
  • Programming languages - Java, Scala, Python, Unix shell script

Certification

  • Microsoft Azure - 1. Microsoft Azure Fundamentals; 2. Microsoft Azure Data Engineer (DP-203)
  • Snowflake - 1. Hands On Essentials: Data Warehouse; 2. Hands On Essentials: Data Applications
  • Databricks - 1. Databricks Lakehouse Fundamentals; 2. Generative AI Fundamentals; 3. Databricks Fundamentals
  • Hadoop - Certified in Big Data technologies: Hadoop, Pig, Hive, MapReduce, Kafka
  • MongoDB - 1. MongoDB for Java Developers; 2. MongoDB for DBAs
  • Google Cloud Platform - 1. Udemy: Google Cloud Professional Data Engineer Certification; 2. Generative AI using Azure OpenAI ChatGPT
