PRABU R. — Senior AWS Data Engineer from India

India · 3-6 years
Open to offers · New to Platform
Languages
English

About

Prabu R. is an AWS Data Engineer with over five years of experience designing and implementing end-to-end ETL pipelines for clients in the technology services sector. At Unified Points Tech Pvt Ltd, he built scalable data workflows on AWS services such as EMR, Glue, and Redshift, using Lambda, SQS, and CloudWatch to automate and monitor pipelines. His work includes building CDC frameworks, designing event-driven Lambda functions triggered by S3 uploads, and moving data through a multi-stage S3 data lake into Redshift tables for dashboards and analytics. Proficient in Python, SQL, and PySpark, he has led data processing and orchestration with Step Functions and Airflow, tuned Spark and ETL job performance, and kept AWS deployments cost-effective. He has worked in Agile project environments, deployed across multiple environments, and set up monitoring and notifications.

Experience

  • AWS Data Engineer

    UNIFIED POINTS TECH PVT LTD · 2022 — Present
    Built end-to-end data pipelines that handle incremental data and load fact and dimension tables; the Redshift fact table is the foundation for dashboards developed by the analytics team. Implemented a Change Data Capture (CDC) framework with a checkpointing mechanism for daily updates from PostgreSQL to AWS. Set up an event-driven flow in which files uploaded by the CCI team trigger a Lambda function that starts a Step Function through the Python SDK (Boto3). The Step Function orchestrates a set of Glue jobs and notifies stakeholders via SNS when a job fails. Cataloged files from Bronze S3 with the Glue Data Catalog, then cleansed, deduplicated, and type-cast them before writing to Silver S3. Applied business transformations to produce Gold S3 data stored as partitioned Parquet files, maintained Slowly Changing Dimensions (SCD) Type 2, and fanned the data out to Redshift tables for Tableau dashboards. Took part in unit, system integration, and user acceptance testing, and tuned performance to fix long-running queries and use resources efficiently. Scheduled jobs with EventBridge Scheduler and used SQS to decouple application workers, supporting live data ingestion into several data stores. Monitored log files with Amazon CloudWatch and set up alarms that send job status notifications to Slack. Deployed builds across multiple environments, including QA and Production, and wrote Python scripts to create and manage cron jobs. Handled SCD data through incremental loads and worked on cost optimization across AWS resources.
  • Data Engineer

    UNIFIED POINTS TECH PVT LTD · 2020 — 2021
    Extracted JSON files from S3 that were sourced from an Oracle database via Oracle GoldenGate (OGG), and transformed them with PySpark on AWS Glue. Wrote the data to a raw zone in S3 after initial cleansing, then to a refined zone after applying the full business logic and transformations, and finally loaded cleaned copies into Redshift tables for downstream analytics. Worked with various file formats during extraction and ran transformations in Glue Studio, loading the results into S3. Built and managed ETL pipelines with PySpark on EMR clusters, submitting jobs with spark-submit and drawing on knowledge of Spark architecture to debug them. Used Redshift load and unload commands to move data between S3 and client-specific Redshift tables, and used Athena for analytical queries on S3 data. Developed Lambda functions that start Glue jobs when objects are updated in S3, and monitored log files with Amazon CloudWatch. Created SNS topics for notifications as needed and established an end-to-end ETL process spanning Step Functions, Glue, Lambda triggers, and EventBridge. Managed pipelines with Slowly Changing Dimensions (SCD) and handled delta updates.

Education

  • B.E. (Mechanical Engineering)
    Madha Engineering College, Kundrathur, Chennai · 2010 — 2014