- Implemented Python scripts to automatically sync data between storage servers based on currently available space and remote location (see the sketch after this list).
- Tested compatibility and functionality of a web application using Selenium with Python (sketch after this list).
- Generated graphs for business decision-making using the Python matplotlib library.
- Drafted and managed business requirement documents through weekly meetings for new and ongoing projects for external and internal customers.
- Worked directly with clients and cross-functional team members to draft Customer Requirement Specifications.
- Updated and tested requirements in application lifecycle management software within a hybrid Agile-Waterfall methodology.
- Recorded technical and functional details, defects, implementations, and status updates in JIRA.
- Tested software releases prior to and after deployment to the production environment for quality assurance.
- Used the Python library Beautiful Soup for web scraping to extract data for building graphs (sketch after this list).
- Wrote Python scripts to parse XML documents and load the data into a database (sketch after this list).
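A minimal sketch of the storage-sync automation in the first bullet above, assuming locally mounted targets and a simple free-space threshold; the paths and threshold are placeholders, not taken from the original project.

```python
# Hypothetical illustration: sync a local directory to whichever mounted
# storage target currently has the most free space. Paths/threshold are invented.
import shutil
from pathlib import Path

TARGETS = [Path("/mnt/storage_a"), Path("/mnt/storage_b")]  # assumed mount points
MIN_FREE_BYTES = 10 * 1024**3  # require at least 10 GiB free

def pick_target(targets):
    """Return the target with the most free space, or None if all are too full."""
    usable = [(shutil.disk_usage(t).free, t) for t in targets if t.exists()]
    usable = [(free, t) for free, t in usable if free >= MIN_FREE_BYTES]
    return max(usable)[1] if usable else None

def sync(source: Path):
    target = pick_target(TARGETS)
    if target is None:
        raise RuntimeError("No storage target has enough free space")
    for item in source.rglob("*"):
        if item.is_file():
            dest = target / item.relative_to(source)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, dest)  # copy file, preserving metadata

if __name__ == "__main__":
    sync(Path("/data/outgoing"))  # assumed source directory
```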
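A minimal Selenium functional check along the lines of the web-application testing bullet; the URL and element locators are hypothetical.

```python
# Sketch of a Selenium smoke test: load a page and verify key elements exist.
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login_page_renders():
    driver = webdriver.Chrome()  # assumes a Chrome driver is available
    try:
        driver.get("https://example.com/login")  # hypothetical URL
        assert "Login" in driver.title
        # Confirm the form elements are present before any interaction.
        driver.find_element(By.ID, "username")
        driver.find_element(By.ID, "password")
        driver.find_element(By.CSS_SELECTOR, "button[type='submit']")
    finally:
        driver.quit()

if __name__ == "__main__":
    test_login_page_renders()
```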
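An illustrative Beautiful Soup scrape feeding a matplotlib chart, as in the scraping and graphing bullets; the page URL and table structure are assumptions.

```python
# Scrape (label, value) pairs from an assumed HTML table and plot them.
import requests
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt

URL = "https://example.com/metrics"  # hypothetical page

def scrape_metrics(url):
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    labels, values = [], []
    for row in soup.select("table#metrics tr"):  # assumed table id
        cells = row.find_all("td")
        if len(cells) == 2:
            labels.append(cells[0].get_text(strip=True))
            values.append(float(cells[1].get_text(strip=True)))
    return labels, values

def plot_metrics(labels, values):
    plt.bar(labels, values)
    plt.ylabel("Value")
    plt.title("Scraped metrics")
    plt.savefig("metrics.png")

if __name__ == "__main__":
    plot_metrics(*scrape_metrics(URL))
```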
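A sketch of the XML-parsing-and-load work, with SQLite standing in for the actual database and an assumed <order> element layout.

```python
# Parse an XML file and bulk-insert its records into a database table.
import sqlite3
import xml.etree.ElementTree as ET

def load_orders(xml_path, db_path="orders.db"):
    tree = ET.parse(xml_path)
    rows = [
        (order.get("id"), order.findtext("customer"), float(order.findtext("total")))
        for order in tree.getroot().iter("order")  # assumed <order> elements
    ]
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (id TEXT, customer TEXT, total REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

if __name__ == "__main__":
    load_orders("orders.xml")  # hypothetical input file
```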
• Facilitated communication between the business and technical teams to identify and resolve issues.
• Actively participated in gathering requirements and analyzing business needs.
• Conducted advanced data analysis and implemented algorithms using tools such as Python, R, Hadoop, and Spark, enabling stakeholders to uncover and quantify critical insights.
• Developed dynamic views and templates in Python and implemented website interfaces using Django's views and template language (see the sketch after this list).
• Utilized Amazon EMR with Apache Spark and Apache Hadoop to process and analyze large-scale datasets, optimizing data workflows for performance and cost-efficiency.
• Automated data pipelines from diverse external sources (web pages, APIs, etc.) to internal data warehouses (SQL Server, AWS) and integrated them with reporting tools like Tableau.
• Managed data warehouses on AWS using services like Amazon Redshift, including optimized schema design, data distribution, and performance tuning.
• Performed mathematical calculations using the Python libraries NumPy, SciPy, and Pandas (sketch after this list).
• Designed and implemented backend data access modules using PL/SQL stored procedures and Oracle databases.
• Developed Hive queries (HQL) for data mapping and validation.
• Built ETL architectures and mapped source-to-target data flows for loading data into data warehouses, gaining proficiency in AWS services like EC2, S3, EBS, and RDS.
• Employed Teradata utilities such as FastExport and MultiLoad (MLOAD) for efficient data migration and ETL tasks, transferring data from OLTP source systems to OLAP target systems.
• Created Spark programs using Scala APIs to benchmark performance against Hive and SQL (sketch after this list).
• Scripted shell commands for data loading and ingestion into HDFS.
• Utilized Sqoop with Unix scripts to import data from SQL Server into HDFS (sketch after this list).
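A minimal sketch of a Django function-based view and URL route, as in the views-and-templates bullet above; the view, template, and route names are placeholders.

```python
# Django view plus URL wiring; the template path and context are invented.
from django.shortcuts import render
from django.urls import path

def report_list(request):
    # In a real project this would query a model; static context for the sketch.
    context = {"reports": ["Q1 revenue", "Q2 revenue"]}
    return render(request, "reports/list.html", context)

urlpatterns = [
    path("reports/", report_list, name="report-list"),
]

# reports/list.html would use the template language, e.g.:
#   {% for report in reports %}<li>{{ report }}</li>{% endfor %}
```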
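An illustrative calculation with NumPy, SciPy, and Pandas matching the math-operations bullet; the data and column names are invented.

```python
# Aggregate with Pandas, then apply NumPy/SciPy operations on the raw values.
import numpy as np
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "sales": [120.0, 95.5, 130.2, 88.7],
})

by_region = df.groupby("region")["sales"].sum()   # Pandas aggregation
log_sales = np.log(df["sales"].to_numpy())        # NumPy elementwise math
zscores = stats.zscore(df["sales"])               # SciPy standardization

print(by_region)
print(log_sales.round(3), zscores.round(3))
```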
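The benchmarking bullet refers to Spark's Scala APIs; for consistency with the other examples, here is a rough PySpark stand-in that times a single aggregation through Spark SQL on a Hive table. The table and query are placeholders.

```python
# Time one Spark SQL aggregation against a Hive-managed table (PySpark sketch).
import time
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hive-vs-spark-benchmark")
    .enableHiveSupport()
    .getOrCreate()
)

query = "SELECT region, SUM(sales) FROM warehouse.orders GROUP BY region"  # assumed table

start = time.time()
result = spark.sql(query).collect()
print(f"Spark SQL returned {len(result)} rows in {time.time() - start:.2f}s")
```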
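A hypothetical wrapper showing the shape of the Sqoop import from SQL Server into HDFS; the JDBC connection string, credentials file, and HDFS paths are all assumed.

```python
# Invoke a Sqoop import from a Python script; every connection detail is a placeholder.
import subprocess

def sqoop_import(table, target_dir):
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:sqlserver://dbhost:1433;databaseName=sales",  # assumed host/db
        "--username", "etl_user",                                        # assumed account
        "--password-file", "/user/etl/.sqoop_pwd",                       # assumed HDFS password file
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", "4",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    sqoop_import("orders", "/data/raw/orders")
```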