Data Engineer Portfolio

Name: Yue Huang
Email: huangyue1752@gmail.com
Tel: +15144300566

Welcome to my data engineering portfolio. Please click the project titles below to access the code in the repository.

Data Engineer Project: Real-Time Shopify Data ETL with Kafka, Azure Event Hub & Databricks

Objective:

Developed an ETL pipeline that streams and processes real-time Shopify order data using Kafka, Azure Event Hub, and Databricks, with scalable processing and transformation in the cloud.

Technical Highlights:

Technologies:


Data Engineer Project: Real-Time Data Sync from Wikimedia API to MySQL using Kafka

Objective:

Developed a real-time streaming solution that fetches data from the Wikimedia API, processes it with Kafka, and writes it to MySQL for storage and downstream use.
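
As a rough illustration of the consumer side of such a flow, the sketch below decodes JSON messages into row tuples and groups them into small batches before they would be written to the database. It is a minimal sketch under stated assumptions, not the project's actual code: the field names (`title`, `user`, `timestamp`), the helper names, and the batch size are hypothetical, and a plain list of byte strings stands in for messages read from a Kafka topic.

```python
import json
from itertools import islice


def to_row(message: bytes):
    """Decode one JSON message into a (title, user, timestamp) tuple.

    The three field names are assumed for illustration only.
    """
    event = json.loads(message.decode("utf-8"))
    return (event.get("title"), event.get("user"), event.get("timestamp"))


def batches(messages, size):
    """Yield lists of up to `size` rows so inserts can be grouped."""
    it = iter(messages)
    while True:
        chunk = [to_row(m) for m in islice(it, size)]
        if not chunk:
            return
        yield chunk


# Hypothetical messages standing in for values read from a Kafka topic.
sample = [
    json.dumps({"title": f"Page {i}", "user": "alice", "timestamp": i}).encode()
    for i in range(5)
]
grouped = list(batches(sample, 2))
```

Each batch could then be passed to a database cursor's `executemany` call with a parameterized INSERT statement, which avoids one round trip per event.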

Technical Highlights:

Technologies:


Full Stack ETL Project

Objective:

Extract COVID data from multiple RapidAPI endpoints, transform and combine the data in Python, load it into SQL Server over an ODBC connection, and connect SQL Server to Power BI to feed the dashboard.
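
The extract, combine, and load steps described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's implementation: the payload shapes, field names, and the `covid_stats` table are hypothetical, and the standard-library `sqlite3` module stands in for the SQL Server ODBC connection so that the sketch runs without a database server.

```python
import sqlite3

# Hypothetical payloads standing in for two API endpoint responses.
CASES = [
    {"country": "Canada", "cases": 100},
    {"country": "France", "cases": 250},
]
DEATHS = [
    {"country": "Canada", "deaths": 3},
    {"country": "France", "deaths": 7},
]


def combine(cases, deaths):
    """Join the two payloads on country; missing deaths default to 0."""
    deaths_by_country = {row["country"]: row["deaths"] for row in deaths}
    return [
        (row["country"], row["cases"], deaths_by_country.get(row["country"], 0))
        for row in cases
    ]


def load(conn, rows):
    """Create the target table if needed and insert the combined rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS covid_stats "
        "(country TEXT PRIMARY KEY, cases INTEGER, deaths INTEGER)"
    )
    # INSERT OR REPLACE is SQLite syntax; SQL Server would need a different upsert.
    conn.executemany("INSERT OR REPLACE INTO covid_stats VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server connection
load(conn, combine(CASES, DEATHS))
```

Over an ODBC connection the same parameterized `executemany` pattern applies, though the table creation and upsert statements would need SQL Server syntax.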

Technical Highlights:

Technologies:


Holman API Data Pipeline Project

Objective:

This project aimed to automate the extraction, transformation, and loading (ETL) process of vehicle-related data from multiple API endpoints into our systems.

Technical Highlights:

Technologies: