Barkam Saiprasad

Montgomery, AL

Summary

  • Data Engineer with over 5 years of experience spanning IT services and financial services.
  • Adept with Hadoop ecosystem technologies, including Apache Hadoop, HDFS, MapReduce, Apache Hive, and Apache Pig, for complex data processing.
  • Proficient with Azure cloud services, including Azure HDInsight and Azure Stream Analytics, building solutions that scale across data-heavy environments.
  • Expert in Python scripting to automate and streamline data processes, improving efficiency and productivity.
  • Skilled with Git for version control, supporting smooth workflows in collaborative, distributed development environments.
  • Versed in SQL-driven projects for database management, handling complex data manipulation and querying tasks.
  • Experienced with AWS services, including S3, Redshift, and EMR, optimizing cloud-based data storage and processing.
  • Capable of integrating MongoDB into data solutions, with an emphasis on performant NoSQL implementations.
  • Specialized in Apache Kafka for real-time data streaming and processing in high-demand environments.
  • Proficient in deploying SSIS and Apache Airflow for robust ETL processes, ensuring data integrity and timely delivery.
  • Hands-on experience with AWS Schema Conversion Tool and Hadoop YARN for data schema standardization and resource management.
  • Currently working with Apache Spark for large-scale data processing and analytics.
  • Using Azure SQL Database, ADLS, and ADF to build advanced data handling and storage solutions in the cloud.
  • Implementing Azure Cosmos DB for globally distributed database services with high availability and elastic scalability.
  • Employing Terraform for infrastructure as code, enabling reproducible, scalable cloud environments.
  • Leveraging Snowflake for cloud data warehousing, enhancing data retrieval and analytics capabilities.
  • Integrating Azure Active Directory for secure access and identity management in complex project environments.
  • Applying PyTorch to financial models for prediction and analysis, supporting more informed decision-making.
  • Focused on automating ETL and data processing tasks to improve operational efficiency and reduce manual errors.
  • Ensuring data security and compliance through stringent controls and regular audits in sensitive data environments.
  • Committed to continuous professional development and to mentoring junior data engineers.

Overview

5 years of professional experience

Work History

Data Engineer

Equifax
12.2023 - Current
  • Developing and maintaining Apache Spark data processing pipelines, utilizing Azure SQL Database and ADLS for data management.
  • Orchestrating data integration and transformation processes efficiently using Azure Data Factory and Azure Stream Analytics.
  • Employing Azure Cosmos DB to handle complex queries and ensure global distribution of financial database services.
  • Managing cloud infrastructure using Terraform, implementing configurations that ensure scalable and secure environments.
  • Utilizing Snowflake for effective data warehousing, enhancing analytics capabilities within the financial sector.
  • Implementing secure data access and identity management using Azure Active Directory, ensuring compliance with financial regulations.
  • Leveraging PyTorch for building predictive models that enhance financial forecasting and risk management.
  • Automating repetitive ETL tasks with Python scripting, enhancing the efficiency and accuracy of data flows.
  • Employing Git for version control, managing codebase changes and collaboration across large-scale financial projects.
  • Utilizing SQL for complex data querying and report generation, supporting critical financial decision-making processes.
  • Engaging in continuous improvement of data processing workflows and technologies to stay ahead in the financial services sector.
  • Leading team initiatives to optimize data systems for scalability and performance, particularly in high-demand financial environments.
  • Focusing on data integrity and accuracy through rigorous validation and testing, ensuring reliable financial reporting.
  • Collaborating with stakeholders to develop and refine data-driven strategies, delivering actionable insights for the financial industry.
  • Championing data security initiatives, implementing robust measures to protect sensitive financial information.
  • Facilitating data migrations to cloud platforms, ensuring seamless transitions with minimal impact on operational continuity.
  • Developing scripts for automation of data collection and processing, significantly reducing manual efforts and potential errors.
  • Conducting training sessions for team members on new technologies and best practices in data engineering within the financial domain.
  • Spearheading cross-functional projects that leverage data to drive innovation and operational efficiency in financial services.
  • Ensuring compliance with regulatory requirements through meticulous data management and security protocols.
  • Leading efforts to enhance data visualization and reporting techniques, enabling more insightful analyses of financial trends.
  • Advocating for the adoption of advanced analytical tools and techniques to improve the speed and accuracy of financial data processing.
  • Environment: Apache Spark, Azure SQL Database, Azure Data Lake Storage (ADLS), Azure Data Factory, Azure Stream Analytics, Azure Cosmos DB, Terraform, Snowflake, Azure Active Directory, PyTorch, Python, Git, SQL.

Database Engineer

Verisk Analytics
08.2021 - 04.2023
  • Engineered and maintained robust ETL pipelines, integrating AWS Redshift and MongoDB for optimized data warehousing and management.
  • Implemented real-time data ingestion and processing using Apache Kafka, complemented by AWS EMR for scalable data operations.
  • Orchestrated complex data workflows using Apache Airflow, enhancing the automation and scheduling of data tasks.
  • Developed data transformations leveraging Python and SQL to support comprehensive business analytics and data reporting.
  • Utilized AWS Schema Conversion Tool for effective data schema management, ensuring compatibility across different platforms.
  • Managed and optimized Hadoop YARN resources to support high-volume data processing tasks, ensuring efficient use of infrastructure.
  • Monitored and enhanced system performance through SQL optimization and database tuning, delivering faster data access and processing.
  • Led initiatives to improve data quality and integrity, implementing rigorous testing and validation protocols.
  • Facilitated the integration of new data sources, expanding the data ecosystem and enhancing analytical capabilities.
  • Provided technical mentorship to team members, elevating the overall expertise and effectiveness of data operations.
  • Conducted detailed evaluations of emerging technologies, proposing adaptations that align with strategic business goals.
  • Delivered comprehensive reports and presentations to stakeholders, demonstrating the impact of data innovations on business processes.
  • Championed security enhancements, employing best practices to safeguard sensitive and critical data assets.
  • Played a key role in cross-departmental collaborations, forging partnerships that enhance the scope and depth of data projects.
  • Focused on streamlining data collection and integration techniques, reducing complexity and enhancing data availability.
  • Developed and maintained comprehensive documentation for data architectures, processes, and user guides, ensuring clarity and accessibility.
  • Contributed to the strategic planning of IT projects, aligning data management strategies with broader business objectives.
  • Led the troubleshooting and resolution of complex data-related issues, maintaining high availability and performance standards.
  • Engaged in professional development opportunities to stay current with industry trends and advancements in data technology.
  • Actively participated in the community of practice for data engineers, sharing insights and best practices.
  • Environment: AWS Redshift, MongoDB, Apache Kafka, AWS EMR, Apache Airflow, Python, SQL, AWS Schema Conversion Tool, Hadoop YARN, SQL optimization, data integration tools.

Hadoop Engineer

Aufait Technologies Pvt. Ltd.
12.2019 - 07.2021
  • Configured and managed Apache Hadoop environments, optimizing data storage and processing capabilities for large-scale applications.
  • Implemented data processing tasks using MapReduce, ensuring efficient handling of extensive datasets within the Hadoop ecosystem.
  • Utilized Apache Hive for advanced data querying and management, supporting complex data analysis and reporting needs.
  • Developed data transformations and analytics using Apache Pig, enhancing the interpretability and utility of raw data sets.
  • Orchestrated data workflow automation with Apache Oozie, improving reliability and efficiency of scheduled data tasks.
  • Leveraged Apache Kafka for building robust real-time data streaming platforms, enhancing data availability and timeliness.
  • Employed Python scripting to automate various data handling tasks, significantly reducing manual intervention and error rates.
  • Utilized Git for robust version control in software development projects, facilitating effective collaboration and code management.
  • Crafted complex SQL scripts for data manipulation and analysis, serving critical business needs and insights.
  • Optimized data storage solutions using HDFS, ensuring high data availability and fault tolerance in distributed environments.
  • Ensured data accuracy and consistency through comprehensive testing and validation, maintaining high standards of data quality.
  • Collaborated closely with team members to identify and address data processing challenges, fostering a culture of continuous improvement.
  • Enhanced data security protocols to protect sensitive information, adhering to best practices and regulatory requirements.
  • Participated in code reviews to ensure adherence to coding standards and practices, contributing to the maintenance of high-quality software.
  • Engaged in the development and enhancement of data processing tools and methodologies, staying at the forefront of technology advancements.
  • Provided technical support and expertise in troubleshooting data-related issues, ensuring swift resolutions and minimal downtime.
  • Engaged in ongoing learning and training to enhance technical skills, particularly in the rapidly evolving field of big data.
  • Contributed to project documentation, ensuring clear and comprehensive information transfer and retention.
  • Environment: Apache Hadoop, MapReduce, Apache Hive, Apache Pig, Apache Oozie, Apache Kafka, Python, Git, SQL, HDFS.

Education

Master of Information System Management - Management Information Systems

Auburn University
Auburn, AL
12.2023

Skills

  • Big Data Processing: Apache Hadoop, HDFS, MapReduce, Apache Hive, Apache Pig
  • Real-time Streaming: Apache Kafka
  • Cloud Services: Azure HDInsight, Azure Stream Analytics, Azure SQL Database, ADLS, ADF, AWS S3, AWS Redshift, AWS EMR
  • Data Warehousing: Snowflake, AWS Redshift
  • Programming Languages: Python, SQL
  • Version Control: Git
  • ETL Tools: SSIS, Apache Airflow
  • Machine Learning: PyTorch
  • NoSQL Databases: MongoDB, Azure Cosmos DB
  • Security & Identity: Azure Active Directory
  • Infrastructure as Code: Terraform
  • Workflow & Resource Management: Apache Oozie, Hadoop YARN

Timeline

Data Engineer

Equifax
12.2023 - Current

Database Engineer

Verisk Analytics
08.2021 - 04.2023

Hadoop Engineer

Aufait Technologies Pvt. Ltd.
12.2019 - 07.2021

Master of Information System Management - Management Information Systems

Auburn University
12.2023