hadoop-mapreduce

Here are 140 public repositories matching this topic...

vim89 / datapipelines-essentials-python

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

python big-data spark apache-spark hadoop etl xml python3 xml-parsing pyspark data-pipeline datalake hadoop-mapreduce spark-sql etl-framework hadoop-hdfs etl-pipeline etl-components

Updated May 6, 2023
Python

guillaume6pl / mr_pagerank

Computing PageRank with Hadoop MapReduce

python search pagerank hadoop-mapreduce

Updated Apr 24, 2017
Python

MariaDukmak / Hadopy

Easy parallel map-reduce command line tool

map hadoop parallel mapreduce hadoop-mapreduce hadopy

Updated May 15, 2021
Python

HawxChen / CloudComputing

MapReduce, Spark, Hadoop, PostgreSQL, Cluster Management

spark apache-spark hadoop hdfs mapreduce cluster-analysis clustering-evaluation hadoop-mapreduce cluster-computing mapreduce-server hadoop-mapreduce-programming hadoop-mapreduce-aws

Updated Nov 22, 2017
Python

kowaalczyk / spark-minimal-algorithms

A Python implementation of Minimal MapReduce Algorithms for Apache Spark

python spark apache-spark algorithms python3 pyspark hadoop-mapreduce apache-hadoop minimal-algorithms

Updated Jun 22, 2020
Python

venancioromero / Analysis-of-feelings-USA

MapReduce example written in Python to analyze the feelings of the USA

python aws aws-s3 hadoop-mapreduce

Updated Jan 20, 2018
Python

MarwanMashra / Hadoop-MapReduce

Map/Reduce project with Hadoop

python distributed-systems hadoop mapreduce hadoop-mapreduce hadoop-hdfs

Updated Feb 27, 2022
Python

highoncarbs / hadoopwithpy

🐘 ➕ 🐍 Learning Hadoop with Python

python flask hadoop recommender-system hadoop-mapreduce hadoop-streaming

Updated Oct 12, 2017
Python

krishnadey30 / NewsHeadlines

This repository has code that extracts meaningful information from a news headline dataset.

python hadoop hadoop-mapreduce news-dataset mapreduce-python

Updated Apr 28, 2019
Python

MandarGogate / Association-Rule-Mining-Hadoop-Python

A case study on mining association rules between different factors related to deaths of people in the United States

python data-science machine-learning data-mining hadoop mining map-reduce mapreduce association-rules hadoop-mapreduce hadoop-streaming

Updated Jun 24, 2017
Python

sreetamparida / Hiraishin

A REST-based SQL to MapReduce and Spark translator: the service translates a SQL query into MapReduce and Spark jobs, runs them, and returns the result as a JSON object.

sql spark python3 pyspark mapreduce hadoop-mapreduce hadoop-streaming mapreduce-python sqltomapreduce sqltospark

Updated Sep 30, 2020
Python

terodea / CS-BigData

Learn Big Data tools and frameworks by working through examples, POCs, and projects.

java airflow scala kafka big-data spark hive hadoop bigdata hbase python3 map-reduce sqoop case-study hadoop-mapreduce

Updated Jul 29, 2022
Python

abhibalani / emr_lambda

Lambda function to start EMR and run a MapReduce job

aws aws-lambda aws-emr hadoop-mapreduce aws-emr-clusters mapreduce-python

Updated Aug 16, 2019
Python

mac40 / BDC

Big Data Computing

big-data spark university clustering hadoop-mapreduce padua association-analysis

Updated Mar 6, 2020
Python

ajerit / parallel-bfs

Parallel implementation of the Breadth-First Search algorithm in Java MapReduce and PySpark. This implementation finds degrees of separation between Twitter users.