Senior Data Engineer and Analyst with experience in the Automobile, Media, Health Care, Software, and Engineering domains. Expert in building data pipelines and dashboards that turn raw data into insights for opportunity identification and process re-engineering, backed by a clear narrative. Specialized in Data Analytics, Data Engineering, and Data Visualization. Analytically minded professional with a proven ability to solve complex quantitative business challenges. Exceptional verbal and written communication skills, with a track record of conveying insights effectively to both business and technical audiences. Adept at using data to drive strategic decision-making and at delivering impactful presentations.
- 8+ years of IT experience in Big Data engineering: analysis, design, implementation, development, maintenance, and testing of large-scale applications using SQL, Hadoop, Python, Java, and other Big Data technologies.
- Hands-on experience installing, configuring, supporting, and managing Hadoop clusters on the Cloudera and Hortonworks distributions.
- Experience working with Elasticsearch, Logstash, and Kibana.
- Experience with the AWS stack (S3, EMR, EC2, SQS, Glue, Athena, Redshift), including AWS architecture design, cloud migration, DynamoDB, and Lambda-based event processing.
- Designed AWS Glue jobs to convert nested JSON objects to Parquet files and load them into S3.
- Experienced in developing PySpark programs, creating DataFrames, and applying transformations.
- Created Hive, SQL, and HBase tables to load large sets of structured, semi-structured, and unstructured data from UNIX, NoSQL, and a variety of other sources.
- Developed ETL processes to load data from multiple sources into HDFS using Kafka and Sqoop.
- Experience using Apache Kafka to collect, aggregate, and move large volumes of data.
- Designed and implemented Oracle database systems to meet business requirements and performance goals.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems (RDBMS).
- Experience in data analysis using Hive and Impala.
- Experience developing large-scale applications with Hadoop and other Big Data tools.
- In-depth knowledge of Hadoop architecture and its components: HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
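The JSON-to-Parquet Glue work above hinges on flattening nested objects into columnar records. As a minimal illustration (stdlib only; the field names below are hypothetical, and a real Glue job would use DynamicFrames and write Parquet rather than print):

```python
import json

def flatten(record, prefix=""):
    """Recursively flatten a nested JSON object into dot-separated keys,
    the shape a columnar format like Parquet expects per column."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

# Hypothetical nested source record
raw = '{"vin": "1HGCM82633A", "engine": {"type": "V6", "hp": 240}}'
print(flatten(json.loads(raw)))
# {'vin': '1HGCM82633A', 'engine.type': 'V6', 'engine.hp': 240}
```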
- Practical experience in all phases of the data management lifecycle, from analyzing initial business requirements and converting them into data requirements to designing and executing business procedures and data solutions for the manufacturing, healthcare, energy, and e-commerce industries.
- Proven expertise in data governance, master data management, advanced analytics, healthcare data domains, data distribution, data warehousing, and relational data modeling.
- Skilled at working with cross-functional teams to understand business requirements and translate them into scalable PySpark solutions.
- Spearheaded end-to-end ETL processes using the Talend Big Data ETL tool, ensuring seamless data extraction, transformation, and loading.
- Major expertise in SQL, PL/SQL, and Talend development and maintenance for an enterprise-wide ETL solution on UNIX and Windows platforms.
- Experienced in data manipulation with Python for loading and extraction, and in Python libraries such as NumPy, SciPy, and Pandas for data analysis and numerical computation.
- Managed the build and delivery of Docker images through CI/CD pipelines, with Terraform applying the required infrastructure changes as part of deployment.
- Provisioned large-scale environments with Terraform as infrastructure as code (IaC).
- Experience with SQL and NoSQL databases (HBase and Cassandra).
- Performed structural modifications using Hive and analyzed data with visualization/reporting tools (Tableau).
- Experience with the Hadoop ecosystem and with reporting in Tableau, QuickSight, and Power BI.
- Built data-driven solutions for the automobile sector using Django and Flask, two leading Python web frameworks.
- Participated in cross-functional meetings to gather requirements and provide technical guidance on API-related issues.
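A representative slice of the Python extract-and-transform work described above is grouped aggregation over a delimited feed. A minimal stdlib sketch (the patient/cost fields are invented sample data, not from any real pipeline):

```python
import csv
import io
from collections import defaultdict

# Hypothetical healthcare feed: one cost row per claim line.
raw = io.StringIO(
    "patient_id,cost\n"
    "p1,120.50\n"
    "p2,80.00\n"
    "p1,30.25\n"
)

# Transform step: aggregate claim costs per patient.
totals = defaultdict(float)
for row in csv.DictReader(raw):
    totals[row["patient_id"]] += float(row["cost"])

print(dict(totals))  # {'p1': 150.75, 'p2': 80.0}
```

In practice the same group-by-and-sum lands in Pandas (`df.groupby("patient_id")["cost"].sum()`) once the data outgrows a loop.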
- Experienced with Spark, improving the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Hands-on experience with Hadoop/Big Data technologies for the storage, querying, processing, and analysis of data.
- Experienced with Hadoop ecosystem tools such as Hive and Sqoop.
- Seeking a full-time position that offers professional challenges and draws on strong interpersonal, time-management, and problem-solving skills.
- Detail-oriented team player with strong organizational skills; able to handle multiple projects simultaneously with a high degree of accuracy.
- Organized and dependable, successful at managing multiple priorities with a positive attitude; willing to take on added responsibilities to meet team goals.
- Astute Data Engineer with a data-driven, technology-focused approach; communicates clearly with stakeholders, builds consensus around well-founded models, and is talented at writing applications and reformulating models.
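The pair-RDD work mentioned above centers on key-based reduction (Spark's `reduceByKey`). A pure-Python analogue of that pattern, as a sketch only; a real job would call `sc.parallelize(pairs).reduceByKey(fn)` on a cluster:

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter

def reduce_by_key(pairs, fn):
    """Pure-Python analogue of Spark's pair-RDD reduceByKey:
    group (key, value) tuples by key, then fold the values with fn."""
    out = {}
    # groupby requires input sorted by the grouping key
    for key, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        out[key] = reduce(fn, (value for _, value in group))
    return out

# Hypothetical vehicle-type counts from an automobile dataset
pairs = [("sedan", 1), ("suv", 1), ("sedan", 1)]
print(reduce_by_key(pairs, lambda a, b: a + b))  # {'sedan': 2, 'suv': 1}
```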