Sumanth Obilineni

Ashburn, VA

Summary

Data Engineer with over 7 years of experience designing and developing ETL pipelines using Databricks. Proven expertise in big data ecosystems, encompassing data aggregation, querying, storage, and analysis. Achieved a 30% reduction in release time through optimized CI/CD pipelines; proficient in Azure cloud services and DevOps practices. Strong analytical skills complemented by a solid understanding of big data technologies such as Spark, Hive, and Kafka for real-time and batch processing.

Overview

8
years of professional experience

Work History

Data Engineer

IQVIA
Philadelphia, PA
03.2024 - Current
  • Company Overview: MedTech | Philadelphia, PA
  • Designed pipeline architecture covering supplier data retrieval from multiple SFTP locations, quality-checking the data, transforming it to a common format, running analysis, and providing insights.
  • Created automated Airflow pipelines to load data from SFTP to Azure Blob Storage and build medallion-architecture tables in Databricks.
  • Responsible for creating a custom SFTP sensor in Airflow to start the pipelines when files are placed in SFTP.
  • Created a QC framework on partner data to check outliers, required columns, and column fill percentages.
  • Extended the QC framework to detect missing data and raise alerts when outliers are found, based on historical data trends.
  • Involved in migrating hospital claims data from HDFS to Azure Databricks.
  • Reduced the time taken to run the jobs by 80% using Spark and cloud practices.
  • Created Airflow pipelines to periodically DistCp history and delta data to Azure Blob Storage.
  • Created email and logging utilities to send alerts and write log messages to Azure Log Analytics.
  • Managed metadata structures needed for building reusable and generic ETL components using ADF, Databricks Jobs.
  • Launched Databricks workspaces using Terraform.
  • Automated tasks to test notebooks, or run Spark jobs on Databricks clusters using Azure DevOps and GitHub Actions.
  • Created Dev, Test, and Prod environments for different stages of development.
  • Each environment can have different configurations, such as cluster sizes, libraries, and jobs.
  • Developed log-monitoring jobs for Spark job performance, alerting teams on failures and exceeded task thresholds.
  • Created pipelines in GitLab for dev, test, and prod environments to create workflows.
  • Environment: Databricks, SparkSQL, PySpark, Python, Azure, SFTP, Airflow, JIRA, GitLab, Maven, Azure DevOps.
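The QC framework described above (required columns, column fill percentages, and outlier detection) can be sketched in a few lines of plain Python. This is an illustrative, stdlib-only sketch, not the production framework; the function name, column names, and thresholds are all assumptions.

```python
from statistics import mean, stdev

def qc_check(rows, required_columns, fill_threshold=0.9, z_threshold=3.0):
    """Run basic QC on a list of row dicts: required columns,
    per-column fill percentage, and z-score outliers on numeric columns.
    Illustrative sketch only; thresholds and messages are made up."""
    issues = []
    # 1. Required-column check: every required column must appear somewhere.
    present = set().union(*(row.keys() for row in rows)) if rows else set()
    for col in required_columns:
        if col not in present:
            issues.append(f"missing required column: {col}")
    # 2. Fill-percentage check: flag sparsely populated columns.
    for col in present:
        filled = sum(1 for row in rows if row.get(col) not in (None, ""))
        if filled / len(rows) < fill_threshold:
            issues.append(f"low fill rate for column: {col}")
    # 3. Outlier check: flag numeric values more than z_threshold
    #    standard deviations from the column mean.
    for col in present:
        values = [row[col] for row in rows
                  if isinstance(row.get(col), (int, float))]
        if len(values) > 2 and stdev(values) > 0:
            mu, sigma = mean(values), stdev(values)
            for v in values:
                if abs(v - mu) / sigma > z_threshold:
                    issues.append(f"outlier in {col}: {v}")
    return issues
```

In practice a check like this would run per partner feed and feed its `issues` list into the alerting utilities rather than returning it directly.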

Data / DevOps Engineer

MetiStream
Ember, VA
09.2022 - 02.2024
  • Developed a solution for the Ember Platform, decreasing the delay in data availability by 80% and boosting data availability by 100%.
  • Developed CloudFormation templates to orchestrate AWS EKS and EMR.
  • Developed scripts to automate and push Docker images to ECR.
  • Created Helm charts to pull and use the Docker images in EKS.
  • Exposed services using AWS Load Balancers with SSL termination.
  • Created CI/CD pipelines using Jenkins, including creating Docker images to use in EKS.
  • Performed impact analysis, performance tuning, and capacity planning for the enterprise data warehouse and its infrastructure as source systems are added and new integration business rules and logic are introduced.
  • Developed APIs to connect to Elasticsearch, and built ES DSL queries for Ember Dashboards.
  • Developed APIs to connect MongoDB to store metadata.
  • Implemented reusable components such as a logging wrapper and generic exception handling.
  • Tested and validated integrations with the cloud (AWS) and big data distributions (Cloudera).
  • Participated in architecture discussions, proposing ideas to enhance Ember functionality.
  • Performed code reviews on PRs and validated test cases.
  • Configured pipelines to alert stakeholders and production operations using Nagios.
  • Environment: Python, Spark SQL, Java, SQL Server, PySpark, AWS, EKS, Cloudera, ElasticSearch, ElasticSearch DSL, Git, Maven, Nagios, and JIRA.
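An Elasticsearch DSL query for a dashboard panel, as mentioned above, is ultimately just a structured JSON body. Below is a minimal sketch of building one as a plain Python dict; the index fields (`status`, `admit_date`, `department`) and the function name are illustrative assumptions, not the actual Ember schema.

```python
def build_dashboard_query(status=None, date_from=None, date_to=None, top_n=10):
    """Build an Elasticsearch DSL query body (as a plain dict) that filters
    documents and aggregates counts per department for a dashboard panel.
    Field names are hypothetical examples."""
    filters = []
    if status:
        filters.append({"term": {"status": status}})
    date_range = {}
    if date_from:
        date_range["gte"] = date_from
    if date_to:
        date_range["lte"] = date_to
    if date_range:
        filters.append({"range": {"admit_date": date_range}})
    return {
        "size": 0,  # aggregation-only: the dashboard needs buckets, not hits
        "query": {"bool": {"filter": filters}},
        "aggs": {
            "by_department": {
                "terms": {"field": "department.keyword", "size": top_n}
            }
        },
    }
```

A body like this would be posted to the cluster's `_search` endpoint (e.g. via the official Python client), and the dashboard would render the `by_department` buckets from the response.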

DevOps Engineer

Fidelity Investments
Raleigh, NC
10.2020 - 08.2022
  • Performed software configuration and release management activities for three different Java applications.
  • Designed and implemented Continuous Integration (CI) processes and tools, with approvals from development and other affected teams.
  • Defined processes to build and deliver software baselines for internal and external customers.
  • Coordinated with Anthill consultants to resolve licensing, technical, and ongoing issues, including Anthill patching, and application-related needs.
  • Collaborated with web administrators to set up automated deployment for SharePoint applications using Anthill and SVN tools.
  • Executed build operations using ANT scripts, modifying them as per project requirements.
  • Created and managed metadata types such as Branch, Label, Trigger, and Hyperlink; supported developers in creating config specs, and managed the merge process for project-specific branches.
  • Took ownership of the release branch, implementing triggers to enforce development policies and invoke operations before or after critical ClearCase events using Perl scripts.
  • Designed release plans in coordination with stakeholders, including project management, development leads, QA teams, and ClearCase administrators.
  • Worked on cross-platform environments (Windows NT and Linux) to ensure a thorough understanding and functionality of ClearCase.
  • Coordinated Change Control Board (CCB) meetings to discuss defects and enhancements, generating detailed reports to resolve issues before subsequent builds and testing.
  • Built version-controlled Java code on ClearCase Unified Change Management (UCM) project-based code streams, utilizing Visual Build Pro (VBP), and ANT scripts for VGS’ partners.
  • Environment: ClearCase, SVN, Shell, ANT, Hudson, JIRA, Linux, Windows, JBoss, Visual Basic 6.0, Visual SourceSafe 6.0, SQL Server, Perl, CruiseControl, Git, Maven.

Data Engineer

KPIT Technologies
Pune, India
01.2019 - 07.2020
  • Designed and developed the real-time matching solution for customer data ingestion.
  • Worked on converting the multiple SQL Server and Oracle stored procedures into Hadoop using Spark SQL, Hive, Scala, and Java.
  • Created a production data lake that can handle transactional processing operations using the Hadoop ecosystem.
  • Developed PySpark and Spark SQL code to process the data in Apache Spark on Amazon EMR to perform the necessary transformations.
  • Validated and cleansed data using Pig statements, with hands-on development of Pig macros.
  • Worked with Hadoop, Big Data Integration, and ETL on performing data extraction, loading, and transformation processes for ERP data.
  • Performed extensive exploratory data analysis using Teradata to improve the quality of the dataset.
  • Experienced with Python libraries such as Pandas and NumPy (one- and two-dimensional arrays).
  • Developed data visualizations in Power BI to display the day-to-day accuracy of the model with newly incoming data.
  • Utilized Jira for project management and Git for version control.
  • Reported and displayed the analysis results in the web browser with HTML and JavaScript.
  • Engaged constructively with project teams, supported the project's goals, and delivered insights for the team and client.
  • Environment: Hadoop, Python, Spark SQL, Hive, Java, SQL Server, PySpark, Tableau, Git, Maven, Power BI, JIRA.
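Converting cursor-style stored procedures to Spark SQL, as described above, usually means replacing row-by-row loops with a single set-based query. The sketch below uses Python's built-in sqlite3 as a stand-in engine purely for illustration; the table, columns, and data are invented, and under Spark the same query text would run via `spark.sql(...)`.

```python
import sqlite3

# A cursor-based stored procedure that loops over orders accumulating a
# per-customer total can be rewritten as one set-based aggregation.
# sqlite3 stands in for Spark SQL here; schema and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10.0), (1, 15.0), (2, 7.5);
""")
totals = dict(conn.execute("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall())
# totals == {1: 25.0, 2: 7.5}
```

The set-based form lets the engine parallelize the aggregation, which is where most of the speedup over procedural cursor loops comes from on Spark.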

SQL Developer

iAppSoft
Hyderabad, India
05.2017 - 12.2018
  • Company Overview: clinical data standardization (SDTM) | Hyderabad, India
  • Standardized and quality-checked clinical trial data (more than 2 lakh tables with differing schemas) using clustering, unification, transformation, and analytical and statistical techniques.
  • Automated this process of analyzing various tables based on the domain (functional area).
  • Gathered requirements for building in-house tools for the curation process.
  • Studied SDTM and other clinical data models, curation processes, and ETL tools to develop new techniques that fit and accelerate the process.
  • Performed performance and quality testing of the standardization process using an in-house application.
  • Analyzed various studies, staged them based on set business rules, and identified new business rules for the same.
  • Profiled all tables available across servers and databases periodically.
  • Compared the previous month's profiles with the current month's to identify unused and redundant databases and tables, freeing up space and forming a structured database.
  • Environment: Oracle, SQL Developer, Windows.
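The month-over-month profile comparison described above reduces to a diff between two snapshots. A minimal sketch, assuming profiles are simple table-to-row-count mappings (the real profiles would carry more metadata):

```python
def diff_profiles(prev, curr):
    """Compare last month's table profiles (table name -> row count) with
    this month's to flag dropped, new, and unchanged tables; unchanged
    tables are candidates for the unused/redundant review.
    Illustrative sketch; real profiles would include more metadata."""
    dropped = sorted(set(prev) - set(curr))
    added = sorted(set(curr) - set(prev))
    unchanged = sorted(t for t in prev.keys() & curr.keys()
                       if prev[t] == curr[t])
    return {"dropped": dropped, "added": added, "unchanged": unchanged}
```

Tables that stay byte-for-byte identical across several monthly runs are the ones worth investigating before reclaiming space.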

Education

Master of Science - Information Systems

Indiana Tech University
Fort Wayne, Indiana, United States

Bachelor of Science - Computer Science and Engineering

PRIST University
Thanjavur, Tamil Nadu, India

Skills

  • PostgreSQL
  • Oracle
  • MySQL
  • MongoDB
  • Java
  • Python
  • SQL
  • Shell Scripting
  • Airflow
  • Dataflow
  • StreamSets
  • Databricks
  • Cloudera
  • Spark
  • Hadoop
  • PySpark
  • SparkSQL
  • DeltaLake
  • Kafka
  • Docker
  • Kubernetes
  • EKS
  • Azure
  • AWS
  • Jenkins
  • Ansible
  • Terraform
  • PyCharm
  • IntelliJ
  • Eclipse
  • SQL Developer
  • Power BI
  • JIRA

Accomplishments

  • Led successful migrations of on-prem workloads to AWS and Azure, improving scalability and performance.
  • Designed and implemented disaster recovery strategies, ensuring minimal downtime during critical incidents.
  • Mentored junior engineers, fostering a culture of knowledge sharing and professional growth.

Timeline

Data Engineer

IQVIA
03.2024 - Current

Data / DevOps Engineer

MetiStream
09.2022 - 02.2024

DevOps Engineer

Fidelity Investments
10.2020 - 08.2022

Data Engineer

KPIT Technologies
01.2019 - 07.2020

SQL Developer

iAppSoft
05.2017 - 12.2018

Master of Science - Information Systems

Indiana Tech University

Bachelor of Science - Computer Science and Engineering

PRIST University