apache-sqoop

Here are 5 public repositories matching this topic...

This repository showcases a Medallion Architecture Data Lakehouse designed for both batch and real-time processing of e-commerce and marketing data. It supports comprehensive data analysis, reporting, and monitoring, providing a scalable solution for deriving insights from integrated datasets.

  • Updated Sep 26, 2024
  • Jupyter Notebook

This project simulates a real-world enterprise data migration and modernization strategy. It extracts transactional data from a simulated "On-Premise" environment (hosted on AWS EC2), performs heavy distributed processing on a Hadoop/Spark cluster, and ultimately serves the data through a cloud-native, serverless architecture to optimize costs.

  • Updated Mar 19, 2026
  • Python

This project demonstrates extracting data from a MySQL database, transferring it with Apache Sqoop, and storing it in a Hive data warehouse (the data itself is stored in the Hadoop Distributed File System, HDFS). The data is then analyzed with Hive Query Language (HiveQL), a language close to SQL, and visualized in Power BI.

  • Updated Oct 31, 2023
  • HiveQL
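
As a rough illustration of the MySQL-to-Hive transfer step described in the last entry, the sketch below assembles a Sqoop 1 `sqoop import` command line in Python. It is not taken from any of the listed repositories: the JDBC URL, user name, password file path, and table names are hypothetical placeholders, and the helper `build_sqoop_import` is introduced here only for illustration. The flags themselves (`--connect`, `--username`, `--password-file`, `--table`, `--hive-import`, `--hive-table`, `--num-mappers`) are standard Sqoop 1 import options.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, hive_table, mappers=1):
    """Return the argument list for a Sqoop 1 import of one MySQL table into Hive.

    Hypothetical helper: it only builds the command; running it requires a
    working Sqoop/Hadoop/Hive installation.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source MySQL database
        "--username", username,
        "--password-file", password_file,  # file holding the password
        "--table", table,                  # source table in MySQL
        "--hive-import",                   # load the imported data into Hive
        "--hive-table", hive_table,        # target Hive table
        "--num-mappers", str(mappers),     # number of parallel map tasks
    ]


# All values below are placeholders, not taken from any listed repository.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://localhost:3306/shop",
    username="etl_user",
    password_file="/user/etl/.mysql_password",
    table="orders",
    hive_table="shop.orders",
)
print(shlex.join(cmd))
```

Once the import has run on a real cluster, the resulting Hive table can be queried with HiveQL in the same way as any other Hive table.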
