Vybhav Kothareddy

Denton, USA

Summary

Results-driven Data Engineer with expertise in designing, testing, and maintaining data management systems. Demonstrated proficiency in database design and data mining, leveraging machine learning to enhance strategic decision-making. Achievements include optimizing data retrieval processes and significantly improving system efficiency. Committed to utilizing advanced analytical skills to drive business growth.

Overview

5 years of professional experience

Work History

Data Engineer

Horizon IT Inc
05.2023 - Current
  • Designed and implemented scalable data pipelines on Google Cloud Platform (GCP) using tools such as BigQuery, Cloud Dataflow, Cloud Pub/Sub, and Cloud Storage.
  • Developed ETL workflows to process and transform large datasets, ensuring data quality and consistency for analytics and reporting.
  • Built and optimized data models in BigQuery to support business intelligence and machine learning use cases, reducing query time by 40%.
  • Automated data ingestion processes from multiple sources, leveraging GCP services and Python-based scripts, improving data processing efficiency by 30%.
  • Collaborated with cross-functional teams to define data requirements and deliver insights, contributing to key decision-making processes.
  • Applied machine learning models to analyze trends and forecast outcomes, integrating predictive analytics into data pipelines.
  • Implemented data governance practices, including access control, data encryption, and compliance monitoring, ensuring adherence to industry standards.

Teaching Assistant

University of North Texas
Denton, USA
05.2022 - 03.2023
  • Assisted undergraduate students with programming and helped them individually with their projects and labs
  • Assisted instructors in conducting exams and lab experiments and in grading assignments
  • Assisted the instructor in preparing presentation slides for class whenever required
  • Evaluated and revised lesson plans and course content to facilitate and moderate classroom discussion and student-centered learning

Associate Data Engineer

Aptean
07.2021 - 12.2021
  • Selected for a global interdisciplinary team on a highly specialized project to develop a Business Intelligence solution for a Manufacturing Execution System (MES)
  • Independently produced functional designs and tailored them to each customer's needs, generating an additional $150 in profit each time
  • Developed an algorithm that gathers data dynamically from multiple databases (previously limited to one), handling data as large as 700-800 GB and improving maintainability and performance by a factor of three
  • Built interactive visualizations integrated with AI insights, reducing rendering time by 50%
  • Tools and Technologies: Power BI Desktop, Power BI dataflows, PowerShell, SQL Server, MS Excel

Software Developer Intern

Aptean
07.2020 - 07.2021
  • Used Power BI dataflows to design a data warehouse from ERP databases that were previously in dataset format, resulting in a 50% cost reduction and a 20-fold boost in efficiency and yielding $6,000 in income per customer environment
  • Helped the engineering team develop apps that automatically handle dataflow refreshes and deployment, enhancing continuous integration and delivery while reducing manual effort by 70%

Education

Master of Science - Data Science

The University of North Texas
Denton, Texas
05.2023

Bachelor of Technology - Computer Science and Engineering

Lovely Professional University
Punjab, India
07.2021

Skills

  • Data Analysis: Python libraries (NumPy, Pandas, Keras, scikit-learn, TensorFlow, Matplotlib, Seaborn), Machine learning algorithms, Excel Data Analysis, Power BI, Data Mining and Data Visualisation
  • Data Engineering Tools: Apache Spark, Airflow, Hadoop, Hive, Kafka
  • Cloud Platforms: Google Cloud Platform (BigQuery, Dataflow, Pub/Sub), AWS (Redshift, Glue, S3)
  • GIS Skills: Geospatial Data, QGIS, Esri ArcGIS, CARTO, SketchUp, FME, Remote sensing
  • Programming languages: C/C++, SQL, Python, Java, Linux and Unix
  • Web development: HTML, CSS, JavaScript, jQuery, JSON and XML
  • Other skills: IoT, Arduino, ESP8266 and Raspberry Pi, Adobe Photoshop, Unity 2D/3D and Sensors
  • Tools and Frameworks: Power BI, Anaconda, Git, Sencha Ext JS, TensorFlow, Keras, ASP.NET, Unity, Machine learning
  • Languages: English, Telugu, Hindi

Research Publication

Review and further prospects of plant disease detection using Machine Learning

May 2021

Journal URL: https://ijsrcseit.com/CSEIT217324

  • Agriculture is a field that drives the economic growth of a country, but it is lagging behind in adopting new machine learning technologies. These techniques help in getting the maximum yield of crops, and many machine learning methods are applied in agriculture to improve crop yield rates. Performance can be improved further by comparing accuracy across different crops.
  • Deep learning and sensor technologies are being implemented in many farming sectors. These techniques will help solve the problems farmers face in the field and support the sustainable and economic growth of a country.

Projects

End-to-End Data Pipeline for Real-Time E-commerce Analytics

  • Designed and implemented a scalable real-time data pipeline to analyze e-commerce website traffic and customer behavior.
  • Ingested large-scale, real-time data using Apache Kafka from multiple sources, including website logs and APIs.
  • Processed streaming data with Apache Spark (PySpark) for real-time transformations and batch processing.
  • Stored raw and processed data in Google BigQuery and Google Cloud Storage, leveraging efficient schema design for analytics.
  • Built automated ETL workflows with Apache Airflow and Cloud Composer to clean, transform, and load data into a data warehouse.
  • Developed interactive dashboards using Power BI/Tableau to visualize key metrics like daily active users, sales trends, and conversion rates.
  • Implemented monitoring and alerting solutions using GCP Cloud Monitoring and Pub/Sub to ensure pipeline reliability.
  • Optimized pipeline performance to handle high data volume, improving processing time by 30%.
  • Tools & Technologies: Google Cloud Platform (BigQuery, Cloud Composer, Cloud Storage, Pub/Sub), Apache Kafka, Apache Spark, Python, SQL, Power BI/Tableau, Git.

Categorization and Conceptual Representation of Human Psychology in the Action of Decision-Making

  • This study has focused on explaining the ways human cognitive psychology influences decision-making and the ultimate outcomes.
  • Deep learning facilitates decision-making in selecting appropriate treatment plans. Unconscious motives and interpersonal issues are some of the factors that negatively influence the decision-making process and this enhances the risks of making biased decisions.
  • Implemented and fitted machine learning models on the data to analyze and compare the performance of each one.
  • Developing the algorithms from scratch provided a thorough understanding of the inner workings of each ML algorithm.

LPUIH2020: Solar agriculture and crop protection

  • Developed using Arduino programming and IoT analytics, this project demonstrates how a solar energy source can be used in agriculture to maintain soil moisture balance, meet water requirements, and secure the crop area from animals.
  • Applied data mining to sensor data to predict suitable temperature, humidity, and soil moisture for optimal future management of crop growth.
  • Developing this project helped me understand real-world projects and the architecture involved.
  • Tools and Technologies: Arduino, Python, Postman, C#, SQL Server, GitHub.

Accomplishments

  • Achieved 3rd place at the university level in the LPU Smart India Hackathon 2020
  • Received Journal Publication for 'Review and Further Prospects of Plant Disease Detection Using Machine Learning' from IJSRCSEIT
  • Received an 'Elite' certification for completing 'Python for Data Science' from NPTEL
  • Achieved 2nd prize in a photography competition held by Youthvibe in 2018
  • Received an 'Elite' certification for completing 'Programming in Java' from NPTEL

Timeline

Data Engineer

Horizon IT Inc
05.2023 - Current

Teaching Assistant

University of North Texas
05.2022 - 03.2023

Associate Data Engineer

Aptean
07.2021 - 12.2021

Software Developer Intern

Aptean
07.2020 - 07.2021

Master of Science - Data Science

The University of North Texas

Bachelor of Technology - Computer Science and Engineering

Lovely Professional University