
Divya Sree Velugula

Frisco, TX

Summary

  • Competent, results-oriented professional with 6+ years of experience developing and implementing information solutions across diverse technical backgrounds; currently working as a Senior Big Data Developer at Bank of America
  • 6+ years of exclusive experience in Big Data (Hadoop and Spark) applications, using industry-accepted methodologies and procedures
  • Experienced with Microsoft Azure Cloud: ADLS, ADF, Azure Databricks, and Azure SQL DB
  • Strong experience implementing Big Data application processes on the IBM BigInsights platform and Cloudera distributions using Hadoop 2.x YARN architectures
  • Good exposure to the Hadoop ecosystem: HDFS, MapReduce, Sqoop, Pig, Hive, Spark, Apache NiFi, and Python
  • Very good exposure to extracting and loading data between RDBMS and the Hadoop ecosystem using Sqoop
  • Experience transforming large datasets using Pig scripts
  • Experience loading huge datasets into Hive tables and creating Hive jobs using HQL
  • Experience working with HDFS, MapReduce, Pig, Hive, Sqoop, and MongoDB
  • Good exposure to Spark Core and Spark SQL
  • Diverse background with fast learning skills, creative analytical abilities, and good communication and technical skills
  • Good knowledge of data structures and algorithms
  • Ability to work effectively as a member of a team, make decisions, and prioritize tasks effectively

Overview

8 years of professional experience

Work History

Senior Big Data Developer-Application Programmer V

Bank of America
Addison, TX
11.2021 - Current
  • Responsibilities:
  • Actively involved in all phases of the data life cycle: creating, optimizing, updating, and maintaining data models for various applications and systems
  • Led design reviews of schemas and relevant metadata to ensure consistency, quality, accuracy, and integrity
  • Gathered information and requirements from multiple teams to build and design data schemas and project plans
  • Monitored the Log4j vulnerability throughout the bank by extracting, cleaning, and loading data into well-defined Hive tables for better analysis
  • Worked on retrieving data from Kafka queues and Splunk
  • Processed the streaming data using best optimization techniques in Spark to attain better performance
  • Loaded the processed data to HDFS and built HIVE tables on top of it
  • Analyzed the data to detect anomalies as part of Data Loss Prevention practices and provided alerts when found
  • Developed Python scripts to analyze the Hive data with best performance when executed locally
  • Built Spark projects in IntelliJ and deployed them using Jenkins, which is connected to the Bitbucket repositories
  • Scheduled daily and hourly jobs on crontab and the Oozie scheduler
  • Involved in troubleshooting errors for legacy applications on the server
  • Actively worked on migrating existing functionality and projects from CDP6 to CDP7 (server-to-server migration)
  • Integrated third-party APIs into existing applications for additional functionality
  • Maintained the source control repository to track changes between versions of the codebase
  • Conducted full-lifecycle software development from planning through deployment and maintenance
  • Optimized stored procedures for more efficient query results
  • Assisted in creating technical documentation such as design documents and testing plans
  • Communicated with clients to define program needs and requirements, explaining potential challenges and costs

Computing: Hadoop, Hive, HDFS, Spark, PySpark, Scala, Python, SQL, Oozie, Apache Kafka, Splunk, Unix shell scripting, Crontab, Jenkins, IntelliJ, Bitbucket, Horizon Jira board

Software Engineer

CVS Health
Irving, USA
05.2020 - 10.2021
  • Actively involved in all phases of the data science life cycle, including data extraction, cleaning, processing, and visualization
  • Involved in data processing, data analysis, and building models to derive insights from structured and unstructured data
  • Wrote complex SQL queries to consult, update, and reorganize data
  • Extensive hands-on experience with libraries such as NumPy, Pandas, Matplotlib, Plotly, and scikit-learn
  • Hands-on experience with various Hive optimization techniques to store and process data efficiently
  • Developed scenarios and executed them to automate the monthly PBM business cycle
  • Handled huge amounts of data in the PBM domain to generate monthly reports and provide analytical solutions for legacy systems
  • Met with stakeholders, product teams and customers throughout system development lifecycle.
  • Scheduled ongoing performance quality assurance checks for software applications and automated performance test scripts.
  • Analyzed code and corrected errors to optimize output.
  • Employed integrated development environments (IDEs).
  • Wrote user manuals and other documentation for roll-out in customer training sessions.

Computing: Python, Data Structures, Python Libraries, Hadoop, Hive, Apache Spark, Mainframes, Tableau, SQL Server, Eclipse, DB Source.

Software Engineer

Tech Mahindra
Hyderabad, India
01.2016 - 01.2019

Project 1: GlaxoSmithKline

  • Involved in the analysis, design, and development phases of the Big Data systems
  • Using Cloudera as the distribution platform for Hadoop, created Hive tables on top of datasets in the staging layers
  • Extensively used StreamSets to transfer raw data files into the HDFS foundation layer
  • Developed Python scripts to automate the ETL process and integrate third-party tools
  • Created Impala tables as the target tables for loading data into the integration layer
  • Used the Parquet file format for all target tables in Impala for better performance
  • Involved in building DataFrames and RDDs using Spark SQL and Spark Core
  • Applied various Hive performance optimization techniques
  • Used Spark Core and Spark SQL for data transformations
  • Analyzed solutions and coding fixes for software problems.
  • Provided technical support to customers regarding product usage and troubleshooting issues.
  • Coordinated with project managers to meet development timelines and plan testing.

Project 2: Shell Oil & Gas

  • Responsibilities:
  • Developing data pipelines using Azure Data Factory to ingest data from on-premises systems into Azure Data Lake Storage
  • Implementing data transformation and processing logic using the Azure Databricks service with Spark and Scala
  • Using Azure Databricks notebooks to write the Spark code
  • Loading the transformed datasets into Azure SQL DB using Azure Data Factory
  • Writing stored procedures on the SQL DB stage layer to handle incremental updates

Computing: Python 3, Anaconda, Data Structures, Hadoop, Spark, Scala, Hive, Impala, NiFi, Teradata, SQL Server, Git, Microsoft Azure, Azure Data Lake Storage, Azure Data Factory, Azure Databricks, Azure SQL DB

Education

Master’s - Data Science

University of North Texas

Bachelor of Technology

JNTU

Skills

  • Languages - C, Python, PL/SQL, UNIX Bash/Shell scripting
  • Databases - MS SQL Server 2016, Oracle 11g, MongoDB
  • Web Technologies - HTML5, CSS3, JSON, AJAX, XML, Bootstrap
  • Version Control - Stash, Bitbucket, and GitLab
  • Cloud Platforms - Microsoft Azure: ADLS, ADF, Azure Databricks
  • Operating Systems - Windows, Linux, macOS
  • Debugger IDEs - Eclipse, PyCharm, IntelliJ, Visual Studio Code
  • Big Data Technologies - Hadoop, Spark, Hive, Sqoop, Flume, StreamSets, Kafka, Splunk, PySpark

Languages

English - Professional
Telugu - Professional
Hindi - Limited
