Skynet: a distributed application for American flight statistics
Every day about 20,000 flights cross the skies of the United States. A huge amount of data about each of these flights is available online, including information on delays, cancellations and their causes. From this data it is possible to compute many different statistics, such as mean delay, probability of cancellation and most probable delay cause, for a single airport, route or airline. For this reason we decided to create Skynet. Skynet is an application that gathers data from the U.S. Department of Transportation website through a built-in web scraper. The data is then analyzed, processed and distributed across multiple databases built with MongoDB and Neo4j, which are needed to achieve a good tradeoff between performance, reliability and availability.
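As a rough illustration of the kind of statistics mentioned above, the sketch below groups a few flight records by airport or airline and computes the mean delay, the cancellation probability and the most frequent delay cause. It is only a minimal in-memory example, not Skynet's actual implementation: the record fields (`origin`, `airline`, `delay_min`, `cancelled`, `cause`), the sample values and the `stats_by` helper are all hypothetical, and the real application works on data stored in MongoDB and Neo4j.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample records; field names and values are illustrative only.
flights = [
    {"origin": "JFK", "airline": "AA", "delay_min": 10, "cancelled": False, "cause": "weather"},
    {"origin": "JFK", "airline": "AA", "delay_min": 30, "cancelled": False, "cause": "carrier"},
    {"origin": "JFK", "airline": "DL", "delay_min": 20, "cancelled": False, "cause": "weather"},
    {"origin": "JFK", "airline": "DL", "delay_min": None, "cancelled": True, "cause": None},
    {"origin": "LAX", "airline": "DL", "delay_min": 20, "cancelled": False, "cause": "carrier"},
]


def stats_by(records, key):
    """Group records by `key` (e.g. 'origin' or 'airline') and summarize each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    result = {}
    for value, rows in groups.items():
        # Delay statistics only make sense for flights that actually departed.
        flown = [r for r in rows if not r["cancelled"]]
        causes = Counter(r["cause"] for r in flown if r["cause"] is not None)
        result[value] = {
            "mean_delay": mean(r["delay_min"] for r in flown) if flown else None,
            "cancel_prob": sum(1 for r in rows if r["cancelled"]) / len(rows),
            "top_cause": causes.most_common(1)[0][0] if causes else None,
        }
    return result


by_airport = stats_by(flights, "origin")
by_airline = stats_by(flights, "airline")
```

The same helper works for any grouping field, so statistics per route would only require adding a route field to each record.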
Documentation Source code