Objective: Data Engineer with 2+ years of hands-on experience building scalable data pipelines and processing large datasets using Python, PySpark, MySQL, AWS, and Big Data technologies. Eager to contribute to impactful projects while growing in Big Data systems and LLM-based applications. Looking to leverage my skills to drive data-driven decisions and continuous improvement.
Certificates
My responsibilities included:
Attended an immersive NVIDIA AI workshop hosted by our company.
LinkedIn URL: www.linkedin.com/in/yedida-venkata-kanishka-vardhan-9020a0179
Project: AZ Brain – Oncology & Non-Oncology Data Intelligence
Client: AstraZeneca
Role: Data Engineer
Tech Stack: AWS, PySpark, PostgreSQL, Elasticsearch, Python, SQL
Description: AZ Brain is a global business intelligence platform designed to support AstraZeneca's oncology and non-oncology initiatives. The system empowers regional stakeholders to track drug performance, doctor engagement, treatment statistics, and clinical metrics across cancer and H1 applications.
Built using scalable data engineering pipelines and robust ETL workflows, the platform integrates large datasets from multiple sources and delivers insights through APIs and UI dashboards. It combines data science, engineering, and cloud technologies to ensure accurate and actionable reporting for strategic decisions.
An internal LLM-powered tool, ChatIQ, was integrated to enhance app usability. Developing a GenAI agent that dynamically generates SQL queries on filtered datasets, summarizes drug and HCP insights, suggests contextual follow-up questions, and improves navigation across all AZ Brain applications, ensuring a seamless stakeholder experience.
Project: Reservoir Water Level Forecasting Using Machine Learning
Description: Uneven rainfall and climate change adversely affect reservoir water levels and have caused disasters in the past. Poor water management can lead to significant socio-economic losses. Forecasting a reservoir's inflow and water level from historical data, taking various factors into account, helps authorities prepare in advance for extreme conditions. This project applies time series analysis to predict the reservoir water level and thereby support better water management.
Role: Team member; gathered the dataset and performed data preprocessing.
Technology: Python, Streamlit
Exceptional Data Engineer with experience turning raw data from multiple sources into insights.
Skilled in delivering business insights and supporting LLM-based apps through clean, scalable data solutions.