
Adi Paruchuri

Charlotte, USA

Summary

Over 12 years of experience in Big Data Engineering, Data Warehousing, Data Lakes, Data Modeling, and Business Intelligence Engineering, with a strong focus on designing and implementing analytics solutions for enterprise applications. 9+ years of hands-on experience writing ETL and ELT jobs and analyzing data using a wide array of tools in the Hadoop ecosystem and cloud data engineering platforms (AWS and GCP). 10+ years of experience building enterprise BI applications using MicroStrategy, Power BI, and Apache Superset. 8+ years of expertise in designing, modeling, and engineering data-driven products for self-service analytics and reporting.

Overview

12 years of professional experience
1 Certification

Work History

Sr Data Engineer

Arche Group
02.2024 - Current
  • Involved in full Software Development Life Cycle (SDLC) - Business Requirements Analysis, preparation of Technical Design documents, Data Analysis, Logical and Physical database design, Coding, Testing, Implementing, and deploying to business users.
  • Involved in developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
  • Involved in developing Spark code using Scala and Spark-SQL for faster testing and processing of data.
  • Built scalable and robust data pipelines for Business Partners Analytical Platform to automate their reporting dashboard using Spark SQL and PySpark, and also scheduled the pipelines.
  • Developed ETL pipelines in and out of the data warehouse using a combination of Python and Snowflake, writing SQL queries against Snowflake.
  • Created reports using Power BI for SharePoint list items.
  • Developed Tableau reports on business data to examine patterns and trends in the business.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Developed Python-based API (RESTful Web Service) to track revenue and perform revenue analysis.
  • Installed and configured Apache Airflow for workflow management and created workflows in Python.
  • Wrote Python DAGs in Airflow to orchestrate end-to-end data pipelines for multiple applications (an illustrative sketch follows this role).
  • Worked on PySpark APIs for data transformations.
  • Implemented event handlers and error handling in SSIS packages and notified process results to various user communities.
  • Created SSIS packages to extract data from OLTP to OLAP systems and scheduled jobs to call the packages and stored procedures.
  • Implemented Data Quality logic on the Data Sources by binding the Physical Data Element to variables used in the rule logic.
  • Implemented Kerberos for strong authentication to provide data security.
  • Designed and implemented data retention policies in alignment with business requirements and regulatory standards.
  • Implemented data stewardship practices to ensure accuracy, consistency, and reliability of enterprise data assets.
  • Designed and implemented data cleansing pipelines to remove duplicates, correct inconsistencies, and standardize formats across large datasets.
  • Implemented data governance frameworks to ensure accuracy, consistency, and compliance of enterprise data assets.
  • Designed, set up, and maintained AWS services including Amazon RDS, Amazon Redshift, AWS Glue, S3, and Lambda to support enterprise-level data engineering workflows.
  • Configured AWS cloud services for endpoint deployment, serverless data pipelines, and secure data access using IAM roles and policies.
  • Developed infrastructure-as-code (IaC) templates using CloudFormation and Terraform for deploying AWS resources.
  • Developed JSON scripts for deploying data processing pipelines using AWS Glue and Step Functions.
  • Ingested data from various relational and non-relational sources into AWS for processing and analytics.
  • Configured Kafka Connect to receive real-time data from Apache Kafka and store the streamed data to HDFS.
  • Optimized Hive tables using techniques like partitions and bucketing to improve HiveQL query performance.
  • Worked extensively with Sqoop for importing/exporting data between HDFS and relational databases.
  • Utilized Kubernetes and Docker for the runtime environment of the CI/CD system to build, test, and deploy during production.
  • Used tools such as Jira and GitHub to update documentation and code.
  • Worked on SQL queries in dimensional data warehouses and relational data warehouses. Performed Data Analysis and Data Profiling using complex SQL queries on various systems.
  • Developed NoSQL databases using CRUD, indexing, replication, and sharding in MongoDB.
  • Followed agile methodology for the entire project.
  • Actively participated and provided feedback in a constructive and insightful manner during weekly iterative review meetings to track progress and resolve issues.
  • Environment: Spark, Scala, Python, PySpark, MapReduce, ETL, Tableau, Power BI, AWS, Git, Snowflake, Star Schema, Apache Airflow, CI/CD, Terraform, CloudFormation, Kubernetes, Jira, MongoDB, SQL, Agile, Windows.
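
Illustrative sketch (an assumption, not code from the actual project): a minimal Airflow DAG of the kind referenced above, orchestrating a PySpark transformation followed by a Snowflake load. The DAG id, script paths, and schedule are hypothetical placeholders.

    # Minimal sketch of an Airflow DAG that orchestrates a PySpark transformation
    # followed by a Snowflake load. DAG id, paths, and schedule are placeholders.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {"owner": "data-eng", "retries": 2, "retry_delay": timedelta(minutes=5)}

    with DAG(
        dag_id="sales_pipeline",          # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        default_args=default_args,
        catchup=False,
    ) as dag:
        # Run the PySpark transformation via spark-submit.
        transform = BashOperator(
            task_id="transform_sales",
            bash_command="spark-submit --master yarn /opt/jobs/transform_sales.py",
        )

        # Load the transformed output into Snowflake (placeholder loader script).
        load_snowflake = BashOperator(
            task_id="load_to_snowflake",
            bash_command="python /opt/jobs/load_to_snowflake.py",
        )

        transform >> load_snowflake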

Sr Data Engineer

LOWES
12.2019 - 01.2024
  • Involved in various phases of the Data Engineering and BI Engineering Software Development Life Cycle (SDLC), including requirement gathering, modeling, analysis, architecture design, prototyping, development, and testing.
  • Designed and implemented efficient data models and schemas in Hive to enhance data retrieval and query performance.
  • Ingested data from relational databases into HDFS using Sqoop import/export, and created Sqoop jobs, eval statements, and incremental jobs.
  • Ingested data from various databases (DB2, Oracle, Teradata, PostgreSQL, MongoDB, and Hive) and source systems (Adobe and Clickstream) into BigQuery, enabling scalable data storage and efficient querying on Google Cloud Platform (GCP).
  • Built data pipelines in Airflow on GCP for ETL-related jobs using different Airflow operators.
  • Hands-on experience with GCP Dataproc, GCS, Cloud Functions, Cloud SQL, and BigQuery.
  • Used the Cloud Shell SDK in GCP to configure services such as Dataproc, Cloud Storage, and BigQuery.
  • Experienced in building Snowpipe, data sharing, databases, schemas, and table structures; played a key role in migrating Teradata objects into the Snowflake environment.
  • Responsible for estimating the cluster size, monitoring and troubleshooting of the Spark Databricks cluster.
  • Created Databricks notebooks using SQL and Python and automated notebooks using jobs.
  • Migrated data into the RV data pipeline using Databricks, Spark SQL, and Scala.
  • Used Databricks for encrypting data using server-side encryption.
  • Performed data purging and applied changes using Databricks and Spark data analysis.
  • Used Databricks notebooks for interactive analysis utilizing Spark APIs.
  • Worked with Hive's data storage infrastructure, creating tables, distributing data by implementing partitions and buckets, and writing and optimizing HiveQL queries.
  • Developed a comprehensive data quality assurance framework for the batch processing environment, implementing data validation and error handling mechanisms to ensure high data integrity and accuracy.
  • Leveraged GCP services including Dataproc, GCS, Cloud Functions, and Big Query to optimize data processing and analysis, achieving improved system efficiency and cost reductions.
  • Developed data pipelines using Airflow to ingest data from various file-based sources (FTP, SFTP, API, mainframe) in GCP for ETL-related jobs, utilizing various Airflow operators (BashOperator, PythonOperator, DummyOperator, database operators, and Spark operators) to streamline data processing workflows.
  • Leveraged Airflow's scheduling capabilities to automate data pipelines, ensuring timely data updates and delivery.
  • Conducted performance tuning on BigQuery queries to optimize execution times and reduce query costs, resulting in faster data retrieval.
  • Implemented data access controls, encryption, and audit trails using GCP's built-in security features such as Identity and Access Management (IAM), VPC, Audit logging, and Cloud Key Management.
  • Created Hive tables, loading and analyzing data using Hive queries.
  • Applied partitioning, bucketing, and indexing for optimization as part of Hive data modeling.
  • Designed, developed, and maintained data integration programs in a Data Lake and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and experience in using Spark-Shell and Spark Streaming.
  • Experience in working with product teams to create various store level metrics and supporting data pipelines written in GCP's big data stack.
  • Worked with app teams to collect information from Google Analytics 360 and build data marts in BigQuery for analytical reporting for the sales and products teams.
  • Experience with GCP Dataproc, Dataflow, Pub/Sub, GCS, Cloud Functions, BigQuery, Stackdriver, Cloud Logging, IAM, and Data Studio for reporting.
  • Built a program with the Apache Beam Python SDK and executed it on Cloud Dataflow to stream Pub/Sub messages into BigQuery tables (an illustrative sketch follows this role).
  • Experience in deploying streaming Maven-built Cloud Dataflow jobs.
  • Identified production data bugs using Stackdriver logs in GCP.
  • Applied partitioning and clustering for high volume tables on high cardinality fields in BigQuery to make queries more efficient.
  • Used Cloud Functions to support the data migration from BigQuery to the downstream applications.
  • Developed scripts using PySpark to push the data from GCP to the third-party vendors using their API framework.
  • Helped in the deployment and maintenance of Kafka on Kubernetes.
  • Extensive experience managing the Kafka topics.
  • Provide guidance to development team working on PySpark as ETL platform.
  • Performed data extraction, aggregation, and consolidation of Adobe data using PySpark.
  • Redesigned the views in Snowflake to increase performance.
  • Experience in managing logs through Kafka with Logstash.
  • Optimized the PySpark jobs to run on Kubernetes Cluster for faster data processing.
  • Architected logical data models and identified source tables to build MicroStrategy schema objects, including Attributes, Facts, Hierarchies, and Relationships.
  • Designed and developed interactive Power BI dashboards to visualize key performance indicators and business metrics.
  • Implemented DAX formulas to create complex calculations and custom measures for in-depth data analysis.
  • Integrated Power BI with multiple data sources, including SQL databases, Excel, and cloud platforms, for seamless data visualization.
  • Optimized Power BI reports for performance, reducing load times and enhancing user experience.
  • Configured row-level security (RLS) in Power BI to ensure data privacy and role-specific access.
  • Conducted training sessions for end-users to effectively navigate and interpret Power BI reports and dashboards.
  • Created Apache Superset charts, dashboards, datasets, database connections, and custom CSS for styling.
  • Created user groups and users in Apache Superset.
  • Environment: Kafka 3.5.1, Python 3.X, SQL, Linux, MongoDB, PostgreSQL, Oracle, DB2, Teradata, Git, Oozie, Jenkins, Apache Spark, IntelliJ, Hive, Sqoop, GCP Dataproc, GCP GCS, GCP Cloud Functions, GCP BigQuery, MicroStrategy 11, Apache Superset, and Power BI.
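
Illustrative sketch (an assumption, not the actual pipeline): a minimal Apache Beam streaming job of the kind referenced above, reading Pub/Sub messages and appending rows to BigQuery on Cloud Dataflow. The project, subscription, bucket, and table names are hypothetical placeholders.

    # Minimal sketch of a streaming Beam pipeline on Dataflow: read Pub/Sub
    # messages, parse JSON, and append rows to BigQuery. All names are placeholders.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-gcp-project",            # hypothetical project id
        region="us-central1",
        temp_location="gs://my-bucket/tmp",  # hypothetical bucket
        streaming=True,
    )

    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadPubSub" >> beam.io.ReadFromPubSub(
                subscription="projects/my-gcp-project/subscriptions/clickstream-sub")
            | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "WriteBigQuery" >> beam.io.WriteToBigQuery(
                "my-gcp-project:analytics.clickstream_events",  # hypothetical table
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )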

Senior Data Engineer

OCTA Pharma
08.2019 - 12.2019
  • Extensive experience working with the AWS cloud platform (EC2, S3, EMR, Redshift, Athena, Lambda, and Glue).
  • Migrated an existing on-premises Hadoop application to AWS using services such as EC2 and S3 for processing and storage.
  • Created Databricks notebooks using SQL and Python and automated notebooks using jobs.
  • Migrated data into the RV data pipeline using Databricks, Spark SQL, and Scala.
  • Used Databricks for encrypting data using server-side encryption.
  • Performed data purging and applied changes using Databricks and Spark data analysis.
  • Used Databricks notebooks for interactive analysis utilizing Spark APIs.
  • Built Databricks notebooks to streamline and curate data for various business use cases and mounted blob storage on Databricks.
  • Experience in application development using Spark SQL on Databricks for data extraction, transformation, and aggregation.
  • Performed end-to-end Architecture & implementation assessment of various AWS services like Amazon EMR, Redshift and S3.
  • Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB.
  • Used Athena extensively to run queries on data processed by Glue ETL jobs, then used QuickSight to generate reports for business intelligence.
  • Created Lambda functions to run AWS Glue jobs based on AWS S3 events (an illustrative sketch follows this role).
  • Scheduled jobs using Airflow scripts written in Python, adding different tasks to DAGs.
  • Extensive experience in Spark RDD, DataFrame, Spark SQL, Spark Streaming, and PySpark.
  • Developed Spark applications using Python and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
  • Worked with Spark to improve performance and optimize existing MapReduce jobs in Hadoop.
  • Maintained the Hadoop cluster on AWS EMR.
  • Loaded data into S3 buckets using AWS Glue and PySpark; altered data stored in S3 buckets using Elasticsearch and loaded data into Hive external tables.
  • Configured Snowpipe to pull data from S3 buckets into Snowflake tables.
  • Created Hive tables and loaded and analyzed data using Hive queries.
  • Automated recurring scripts and workflow using Apache Airflow and shell scripting to ensure daily execution in production.
  • Hands-on experience setting up workflows using Apache Airflow and the Oozie workflow engine for managing and scheduling Hadoop jobs.
  • Created Databases and tables in Redshift.
  • Environment: Spark SQL, HDFS, Hive, Apache Sqoop, Airflow, Spark, AWS (EC2, IAM, S3, EMR, Redshift, Athena, Lambda, and Glue), Python, SQL, Shell scripting, Linux, MySQL, PostgreSQL, IntelliJ, Oracle.
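
Illustrative sketch (an assumption, not the production code): a minimal Lambda handler of the kind referenced above, starting an AWS Glue job when an object lands in S3. The Glue job name and argument key are hypothetical placeholders.

    # Minimal sketch of a Lambda handler that starts an AWS Glue job when an
    # object lands in S3. The Glue job name and argument key are placeholders.
    import boto3

    glue = boto3.client("glue")

    def lambda_handler(event, context):
        # Pull the bucket and key of the new object from the S3 event record.
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        # Start the Glue ETL job, passing the object location as a job argument.
        response = glue.start_job_run(
            JobName="curate-raw-data",  # hypothetical Glue job name
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        return {"job_run_id": response["JobRunId"]}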

DE/BI Consultant

Bank of America
01.2019 - 07.2019
  • Worked on a Center of Excellence (BI and ETL) team with multiple teams and business users to discuss and finalize requirements for the datasets used in dynamic dashboards, and provided best practices whenever required.
  • Worked with business users to discuss and finalize layouts and designs of dynamic dashboards.
  • Worked with the data modeling team on logical data models and identified source tables to build MicroStrategy schema objects, including Attributes, Facts, Hierarchies, and Relationships.
  • Created various public objects such as Filters, Metrics, Custom Groups, and Consolidations according to the requirements.
  • Converted existing reports and dashboards from Tableau and QlikView to MicroStrategy.
  • Trained business users on MicroStrategy Visual Insights (VI) and MicroStrategy Office.
  • Created Tableau visualizations for Treasury as a POC and trained users on Tableau.
  • Expertise in Tableau analysis.
  • Experience in interactive BI dashboards and Tableau Publisher.
  • Used MicroStrategy cubes to create data visualizations in Tableau.
  • Developed new, complex Informatica PowerCenter mappings to extract and pull data according to the guidelines provided by the business users and populate the data into target systems.
  • Created mappings using Mapping Designer to load data from various sources using transformations such as Source Qualifier, Expression, Lookup (Connected and Unconnected), Aggregator, Update Strategy, Joiner, Filter, and Sorter.
  • Designed Star and Snowflake schema data warehouse models using facts and dimensions.
  • Developed logical and physical data models for ETL applications; provided system and application administration for Informatica PowerCenter and PowerExchange; designed, developed, automated, and supported complex applications to extract, transform, and load data.
  • Built ETL mappings, mapplets, and workflows using Informatica PowerCenter 9.x.
  • Strong knowledge of Informatica ETL and Oracle/DB2 database technologies.
  • Strong analytical skills and SQL proficiency.
  • Strong experience in DWH Technologies (Informatica, Netezza, SQL, Unix, Autosys).
  • Strong understanding of operational data staging environments, data modeling principles, and data warehousing concepts.
  • Scheduled Informatica jobs through AutoSys scheduling tool.
  • Prepared all documents necessary for knowledge transfer such as ETL Strategy, ETL development standards and ETL process.
  • Used SQL tools such as TOAD and SQL Developer to run SQL queries and validate the data (an illustrative sketch follows this role).
  • Environment: MicroStrategy 10.2/10.4/10.11(Architect, Desktop, Command Manager, Object Manager, Narrowcast Server, Enterprise Manager), Tableau 10.3/10.5/2018/2019, Informatica PowerCenter 9.1/8.6, Power Exchange 9.1/8.6, Oracle 11g, PL/SQL, Autosys, Toad, IBM Netezza, SQL Server, Toad Datapoint, SQL developer, JAMS.
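
Illustrative sketch (an assumption, not the actual validation scripts): a minimal Python check of the kind run alongside TOAD/SQL Developer, reconciling row counts between a staging table and its warehouse target after an Informatica load. The connection details, library choice, and table names are hypothetical placeholders.

    # Minimal sketch of a post-load validation: reconcile row counts between a
    # staging table and its warehouse target. Connection and tables are placeholders.
    import cx_Oracle

    conn = cx_Oracle.connect("etl_user", "secret", "dwh-host:1521/ORCLPDB")
    cur = conn.cursor()

    # Hypothetical staging-to-target table pairs loaded by the Informatica workflow.
    checks = {"STG_TRANSACTIONS": "DW_FACT_TRANSACTIONS"}

    for stg, tgt in checks.items():
        cur.execute(f"SELECT COUNT(*) FROM {stg}")
        stg_count = cur.fetchone()[0]
        cur.execute(f"SELECT COUNT(*) FROM {tgt}")
        tgt_count = cur.fetchone()[0]
        status = "OK" if stg_count == tgt_count else "MISMATCH"
        print(f"{stg} -> {tgt}: {stg_count} vs {tgt_count} [{status}]")

    conn.close()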

Data Engineer (Hadoop/Spark/AWS/MicroStrategy)

Spectrum Communication
10.2018 - 01.2019
  • Involved in designing and developing a data warehouse and data lake in Hadoop/Spark/AWS using Spark SQL and AWS Glue.
  • Involved in designing and developing BI applications using MicroStrategy.
  • Served as a Data Engineer driving projects using Spark, SQL, and the AWS cloud environment.
  • Worked on data governance to provide operational structure to previously ungoverned data environments.
  • Involved in ingestion, transformation, manipulation, and computation of data using Kinesis, SQL, AWS Glue, and Spark.
  • Performed data migration from RDBMS to NoSQL databases and documented data deployed in various data systems.
  • Used HiveQL to analyze the partitioned and bucketed data and executed Hive queries on Parquet tables stored in Hive to perform data analysis.
  • Developed Sqoop Jobs to load data from RDBMS to external systems like HDFS and HIVE.
  • Worked on converting dynamic XML data for ingestion into HDFS.
  • Developed complete end-to-end ETL using Lambda functions and AWS Glue DataBrew.
  • Implemented Spark Scripts using Spark SQL to load hive tables for faster processing of data.
  • Developed PySpark-based pipelines using Spark DataFrame operations to load data for ETL/ELT, using EMR for job execution and AWS S3 as the storage layer.
  • Developed streaming pipelines using Apache Kafka with Python.
  • Developed AWS Lambdas using Python and Step Functions to orchestrate data pipelines.
  • Responsible for creating on-demand tables on S3 files using Lambda functions and AWS Glue with Python.
  • Worked on setting up AWS EMR, EC2 clusters and Multi-Node Hadoop Cluster inside developer environment.
  • Created data ingestion modules using AWS Glue to load data into various layers in S3 for analytical use with Athena (an illustrative sketch follows this role).
  • Created and managed bucket policies and lifecycle rules for S3 storage per organizational guidelines.
  • Created fact/dimension and aggregated data outputs from S3 to Redshift for data warehouse use.
  • Used Lambda functions and step functions to trigger Glue Jobs and orchestrate the data pipeline.
  • Good understanding of other AWS services such as S3, EC2, IAM, and RDS; experience with orchestration and data pipelines using AWS Step Functions, Data Pipeline, and Glue.
  • Worked with the data modeling team on logical data models and identified source tables to build MicroStrategy schema objects, including Attributes, Facts, Hierarchies, and Relationships.
  • Created various metrics such as conditional metrics, nested metrics, and level metrics as recommended.
  • Created aggregate tables and logical views in the database using SQL.
  • Created schedules to send email notifications (PDF reports, failure notifications, etc.).
  • Created various public objects such as Filters and Metrics according to the requirements.
  • Headed the conversion of existing reports and dashboards from Tableau and Power BI to MicroStrategy.
  • Connected MicroStrategy to multiple databases (Teradata/HANA/Hadoop).
  • Environment: AWS, Amazon S3, Amazon Redshift, Amazon EC2, Amazon Athena, Spark SQL, HDFS, Hive, Pig, Apache Sqoop, Scala, Python, Shell scripting, Linux, MySQL, Oracle Enterprise DB, PostgreSQL, IntelliJ, Oracle, Subversion, Control-M, Teradata, ETL, Agile Methodologies, MicroStrategy 10.2/10.5 (Architect, Desktop, Command Manager, Object Manager, Enterprise Manager).
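
Illustrative sketch (an assumption, not the production job): a minimal AWS Glue PySpark script of the kind referenced above, reading a cataloged raw table and writing partitioned Parquet into a curated S3 layer for Athena. The database, table, bucket, and partition key names are hypothetical placeholders.

    # Minimal sketch of a Glue PySpark job: read a cataloged raw table and write
    # partitioned Parquet into a curated S3 layer for Athena. Names are placeholders.
    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the raw source registered in the Glue Data Catalog.
    raw = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="orders")  # hypothetical catalog entries

    # Write to the curated layer as partitioned Parquet, queryable through Athena.
    glue_context.write_dynamic_frame.from_options(
        frame=raw,
        connection_type="s3",
        connection_options={"path": "s3://curated-bucket/orders/",
                            "partitionKeys": ["order_date"]},
        format="parquet",
    )

    job.commit()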

Sr Data Engineer (Hadoop/Spark/AWS/MicroStrategy)

Bank of America
09.2017 - 10.2018
  • Experience in designing and developing MVP products in Hadoop/AWS/Spark using Python to transform data and load it into Hive and RDBMS.
  • Responsible for creating on-demand tables on S3 files using Lambda Functions and AWS Glue using Python.
  • Worked on setting up AWS EMR, EC2 clusters and Multi-Node Hadoop Cluster inside developer environment.
  • Developed the PySpark code for AWS Glue jobs on EMR.
  • Implemented reprocessing of failed messages in Kafka using offset IDs.
  • Good understanding of other AWS services such as S3, EC2, IAM, and RDS; experience with orchestration and data pipelines using AWS Step Functions, Data Pipeline, and Glue.
  • Used HiveQL to analyze the partitioned and bucketed data.
  • Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business specification logic.
  • Created Hive Tables for Reporting use.
  • Developed Sqoop Jobs to load data from RDBMS to external systems like HDFS and HIVE.
  • Worked on converting dynamic XML data for ingestion into HDFS.
  • Worked with Snowflake utilities, Spark SQL, Snowpipe, etc.
  • Developed complete end-to-end ETL using Lambda functions and AWS Glue DataBrew.
  • Worked on an ETL pipeline to source data from multiple systems and deliver calculated data from AWS to a Datamart (SQL Server).
  • Implemented Spark scripts using PySpark and Spark SQL to access Hive tables in Spark for faster data processing.
  • Analyzed SQL scripts and designed solutions to implement them using PySpark.
  • Converted row-oriented Hive external tables into columnar, Snappy-compressed Parquet tables with key-value pairs (an illustrative sketch follows this role).
  • Loaded data into Spark RDDs and performed in-memory computation to generate output responses.
  • Used several RDD transformations to filter the data and fed it into Spark SQL.
  • Worked with multiple Business Users to discuss and finalize layouts and designs of Dynamic Dashboards.
  • Worked with the data modeling team on logical data models and identified source tables to build MicroStrategy schema objects, including Attributes, Facts, Hierarchies, and Relationships.
  • Created various public objects such as Filters, Metrics, Custom Groups, and Consolidations according to the requirements.
  • Converted existing reports and dashboards from Tableau and QlikView to MicroStrategy.
  • Developed working D3 mockups for business users to help them visualize the final solution.
  • Trained business users on MicroStrategy Visual Insights (VI) and MicroStrategy Office.
  • Experience in Marks, Publisher and Security in Tableau.
  • Experience with Tableau for data acquisition and data visualizations for clients and customers.
  • Created Tableau visualizations for Treasury as a POC and trained users on Tableau.
  • Expertise in Tableau analysis.
  • Experience in interactive BI dashboards and Tableau Publisher.
  • Used MicroStrategy cubes to create data visualizations in Tableau.
  • Experience in Power BI, Power BI Pro, Power BI Mobile.
  • Expert in creating and developing visually rich Power BI dashboards.
  • Designed and documented the entire Architecture of Power BI.
  • Environment: AWS, Spark SQL, HDFS, Hive, Pig, Apache Sqoop, Scala, Python, Shell scripting, Linux, MySQL, Oracle Enterprise DB, PostgreSQL, IntelliJ, Oracle, Subversion, Control-M, Teradata, ETL, MicroStrategy 10.2/10.4/10.11 (Architect, Desktop, Command Manager, Object Manager, Narrowcast Server, Enterprise Manager), Tableau 10.3/10.5/2018/2019, Agile Methodologies.
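
Illustrative sketch (an assumption, not the production code): a minimal PySpark job converting a row-oriented Hive external table into a Snappy-compressed Parquet table, as referenced above. The database and table names are hypothetical placeholders.

    # Minimal sketch of converting a row-oriented Hive external table into a
    # Snappy-compressed Parquet table via Spark SQL. Table names are placeholders.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("hive-to-parquet")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Read the existing Hive external table through the metastore.
    src = spark.sql("SELECT * FROM staging_db.customer_events")  # hypothetical table

    # Write it back as a partitioned, Snappy-compressed Parquet table.
    (
        src.write
        .mode("overwrite")
        .format("parquet")
        .option("compression", "snappy")
        .partitionBy("event_date")
        .saveAsTable("curated_db.customer_events_parquet")  # hypothetical table
    )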

Data/BI Engineer (Hadoop/ETL/MicroStrategy)

CITI Bank
08.2014 - 09.2017
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Designed and developed complex ETL data pipelines and maintained the data quality to support a rapidly growing business.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data from RDBMS through Sqoop into HDFS for further processing.
  • Built and maintained scalable data pipelines using the Hadoop ecosystem and other open-source components such as Hive.
  • Managed and scheduled jobs on a Hadoop cluster using Oozie.
  • Created tables using Hive and performed queries using HiveQL.
  • Involved in creating Hive tables, loading data, and transforming data on MapR.
  • Extensive working knowledge of partitioned table, UDFs, performance tuning, compression-related properties in Hive.
  • Involved in writing complex Pig scripts and in developing and testing Pig Latin scripts.
  • Extensively used Pig for data cleansing and HIVE queries for the analysts.
  • Responsible for processing unstructured data using Pig and Hive.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Deployed a Hadoop cluster using Hortonworks with Pig, Hive, and HBase.
  • Created Hive tables from JSON data using data serialization frameworks such as Avro.
  • Loaded customer data and event logs from Kafka into HBase using the REST API (an illustrative sketch follows this role).
  • Worked on debugging, performance tuning, and analyzing data using Hadoop components such as Hive and Pig.
  • Worked on creating and automating reports in excel using data imported from Hive via ODBC.
  • Hands-on design and development of an application using Hive UDFs.
  • Participated in designing row keys and schemas for the NoSQL database HBase, with knowledge of other NoSQL databases such as Cassandra.
  • Used HiveQL to analyze partitioned and bucketed data, and executed Hive queries on Parquet tables stored in Hive to perform data analysis meeting the business specification logic.
  • Written Hive queries on the analyzed data for aggregation and reporting.
  • Developed end-to-end data processing pipelines, from receiving data via distributed messaging systems (Kafka) through persistence of data into HBase.
  • Used HiveContext to integrate with the Hive metastore for optimal performance.
  • Developed Sqoop Jobs to load data from RDBMS to external systems like HDFS and HIVE.
  • Developed stored procedures, functions, and views for the Oracle database in PL/SQL.
  • Developed MicroStrategy reports and dashboards using multi-sourcing.
  • Developed MicroStrategy reports in PowerPoint and Excel using the MicroStrategy Office plugin.
  • Created MicroStrategy intelligent cubes to create interactive and cube-based dashboards.
  • Implemented various workarounds such as hierarchy prompts in cube-based datasets, hiding/un-hiding using conditional formatting, and table-driven security datasets with system prompts.
  • Defined best practices for Tableau report development.
  • Converted WebI reports to Tableau dashboards and MicroStrategy advanced visualizations.
  • Worked with data modeling team in creating Semantic Layers for optimization of reports.
  • Converted existing reports and dashboards from Tableau and QlikView to MicroStrategy.
  • Responsible for generating actionable insights from complex data to drive real business results for various application teams and worked in Agile Methodology projects extensively.
  • Environment: HDFS, Hive, Pig, Apache Sqoop, Scala, Shell scripting, Linux, MySQL, Oracle Enterprise DB, Eclipse, Oracle, Git, Oozie, Tableau, Agile Methodologies, MicroStrategy 9.3.1/9.4.1/10.2/10.4/10.7 (Architect, Desktop, Command Manager, Object Manager, Narrowcast Server, Enterprise Manager).
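
Illustrative sketch (an assumption, not the production loader): a minimal Python consumer that reads event logs from Kafka and writes them to HBase through its REST gateway, as referenced above. The topic, table, host, and column names are hypothetical placeholders.

    # Minimal sketch of consuming event logs from Kafka and writing them to HBase
    # through its REST gateway. Topic, table, hosts, and columns are placeholders.
    import base64
    import json

    import requests
    from kafka import KafkaConsumer

    HBASE_REST = "http://hbase-rest-host:8080"  # hypothetical REST gateway
    TABLE = "event_logs"                        # hypothetical HBase table

    def b64(value: str) -> str:
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    consumer = KafkaConsumer(
        "event-logs",                           # hypothetical topic
        bootstrap_servers=["kafka-broker:9092"],
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for message in consumer:
        event = message.value
        row_key = f"{event['customer_id']}-{event['timestamp']}"
        # HBase REST expects base64-encoded row keys, column names, and values.
        payload = {"Row": [{"key": b64(row_key),
                            "Cell": [{"column": b64("d:payload"),
                                      "$": b64(json.dumps(event))}]}]}
        requests.put(
            f"{HBASE_REST}/{TABLE}/{row_key}",
            headers={"Content-Type": "application/json", "Accept": "application/json"},
            data=json.dumps(payload),
            timeout=10,
        )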

MicroStrategy Consultant

Meijer
04.2013 - 08.2014
  • Obtained the requirement specifications from the Business Analysts for Finance Team.
  • Interacted with the business users to build the sample report layouts.
  • Worked with the data modeling team on logical data models and identified source tables to build MicroStrategy schema objects, including Attributes, Facts, Hierarchies, and Relationships.
  • Worked on logical data modeling by creating Semantic Layer views.
  • Worked extensively on creating Metrics and Compound Metrics, Filters, Custom Groups and Consolidations.
  • Used Pass-Through Functions in Attributes, Metrics, and Filters.
  • Implemented intelligent cubes for sourcing datasets to build Dashboards.
  • Created and shared Intelligent Cubes using Cube Services to reduce database load and decrease report execution time.
  • Created Dynamic Enterprise Dashboards and utilized the new features to convert the dashboards into Flash mode.
  • Involved in troubleshooting MicroStrategy Web Reports, optimizing the SQL using the VLDB Properties.
  • Developed auto-prompt filters that give users a choice of different filtering criteria each time they run the filter.
  • Used Intelligent Cube datasets to provide performance for dashboards.
  • Created and Converted dashboards to suit to MicroStrategy Mobile, especially for iPad.
  • Used Object Manager to deploy MicroStrategy objects from the development stage to QA and then to the production environment.
  • Used Enterprise Manager to generate reports to analyze the system performance.
  • Built quick reports/dashboards from internal MSTR data sources like Intelligent Cubes using MicroStrategy Visual Insights.
  • Sliced and diced data using quick filters, targeting graphs, widgets for interactivity using MicroStrategy Visual Insights.
  • Created multi-layered analyses using layers and panels to show KPIs in MicroStrategy Visual Insights.
  • Hands-on experience with Narrowcast, including creation of Services, Information Objects, Subscription Sets, Publications, and Schedules as per requirements.
  • Used Cube Advisor to determine the best practices for supporting dynamic sourcing for existing project.
  • Tested all reports by running queries against the warehouse using TOAD and compared them with the queries generated by the MicroStrategy SQL Engine.
  • Exported data from Teradata using Fast Export utility.
  • Experience in using Teradata utilities such as Fast load, Multiload and Fast Export.
  • Worked with the team on the upgrade process from MicroStrategy 9.2.1 to 9.3.1 and then to 9.4.1.
  • Environment: MicroStrategy 9.2/9.2.1/9.3.1/9.4.1 (Architect, Desktop, Command Manager, Object Manager, Narrowcast Server, Enterprise Manager), Teradata 14.

Education

Bachelor’s - Electronics and Communication Engineering

JNTUH
01.2011

Skills

  • Data engineering expertise in Hadoop and AWS technologies
  • NoSQL Databases: HBase, Cassandra, and MongoDB
  • Graph Databases: Neo4j and Amazon Neptune
  • Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5) and Hortonworks
  • Languages: C, Java, Scala, XML, HTML, AJAX, CSS, SQL, PL/SQL, Pig Latin, HiveQL, Unix, Shell Scripting, Python
  • Operating Systems: UNIX, Red Hat LINUX, Mac OS and Windows Variants
  • Source Code Control and Containers: GitHub and Docker
  • Databases: Teradata, Microsoft SQL Server, MySQL, and DB2
  • DB languages: MySQL, PL/SQL, PostgreSQL & Oracle
  • Build Tools: Jenkins, Maven, ANT, Gradle, Log4j
  • Business intelligence software proficiency
  • Development Tools: Eclipse, IntelliJ, Microsoft SQL Studio, NetBeans
  • Experienced with ETL tools: Talend and Informatica
  • Agile and Scrum methodologies
  • Performance tuning
  • Data warehousing
  • Advanced SQL
  • Data quality assurance
  • Metadata management
  • Business intelligence

Certification

  • GCP DE - https://www.credly.com/badges/8a17badb-bbed-4717-ad8c-f7f24f508fa2/public_url
  • GCP Cloud Developer - https://www.credly.com/badges/0c0a93a3-aa65-48ea-b1a7-540ac7fe4fe/public_url
  • GCP DBE - https://www.credly.com/badges/9491af40-99ff-4edd-9efa-fd5984cd8bd/public_url
