krithivas91/Flight-Data-Analysis-AWS-MapReduce

Description

Hadoop was configured in fully distributed mode on AWS EC2 instances. This project deploys a MapReduce application on AWS that takes 22 years of flight data as input and analyzes it. The Oozie workflow engine runs on top of Hadoop to coordinate the jobs. More information can be found in the project report.

OUTPUT

Airlines

Highest probabilty of airlines on schedule	0.0
HA	0.7494274
AQ	0.62106043
DH	0.6090783

Lowest probabilty of airlines on schedule	0.0
PI	0.32256523
PS	0.4177755
HP	0.45224547

Airports

Airports with Longest taxi-in	0.0
CKB	183.0
LNY	88.01384
MTH	14.65625

Airports with Shortest taxi-in	0.0
BFF	2.0
PVU	2.5
DUT	2.5461006

Airports with Longest taxi-out	0.0
ACK	32.30936
SOP	26.157728
BQN	25.006365

Airports with Shortest taxi-out	0.0
MKK	4.5104165
KSM	6.1116753
VIS out	6.2633705

Cancellation

The most common reason for flights cancellations	0.0
Carrier	317971.0
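As a rough illustration of how the per-airline on-schedule probabilities in the output could be produced, here is a minimal single-machine sketch of a map step and a reduce step in Python. It is not the project's actual Hadoop job. The field names `UniqueCarrier` and `ArrDelay`, the 10-minute on-time threshold, and the carrier codes in the sample data are all assumptions made for this example; the project report should be consulted for the real definitions.

```python
from collections import defaultdict

# Assumption: a flight counts as "on schedule" if it arrives at most
# this many minutes late. The project may use a different rule.
ON_TIME_THRESHOLD_MIN = 10


def map_on_time(record):
    """Map step: emit (carrier, 1) for an on-schedule flight, else (carrier, 0).

    `record` is a dict with the hypothetical keys "UniqueCarrier" and "ArrDelay".
    Rows whose delay is missing or non-numeric (e.g. "NA") are skipped.
    """
    carrier = record.get("UniqueCarrier")
    try:
        delay = float(record.get("ArrDelay"))
    except (TypeError, ValueError):
        return
    if carrier:
        yield carrier, 1 if delay <= ON_TIME_THRESHOLD_MIN else 0


def reduce_on_time(pairs):
    """Reduce step: per carrier, divide on-schedule flights by total flights."""
    totals = defaultdict(lambda: [0, 0])  # carrier -> [on_time, total]
    for carrier, flag in pairs:
        totals[carrier][0] += flag
        totals[carrier][1] += 1
    return {c: on_time / total for c, (on_time, total) in totals.items()}


def top_and_bottom(probabilities, k=3):
    """Return the k highest and k lowest carriers as (carrier, probability) lists."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Made-up sample rows; "XA", "XB", "XC" are placeholder carrier codes.
sample = [
    {"UniqueCarrier": "XA", "ArrDelay": "5"},
    {"UniqueCarrier": "XA", "ArrDelay": "30"},
    {"UniqueCarrier": "XB", "ArrDelay": "-2"},
    {"UniqueCarrier": "XB", "ArrDelay": "NA"},
    {"UniqueCarrier": "XC", "ArrDelay": "45"},
]
pairs = [kv for row in sample for kv in map_on_time(row)]
probabilities = reduce_on_time(pairs)
highest, lowest = top_and_bottom(probabilities, k=1)
```

In a real Hadoop job the grouping by carrier would be handled by the shuffle phase between mappers and reducers; the dictionary here only stands in for that step.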

About

Analytics on 22 years of flight data using AWS and MapReduce
