A Comparative Analysis of Cloud based Recommendation System on Mapreduce and Spark

dc.contributor.author: Ghimire, Sarala
dc.date.accessioned: 2022-01-12T06:12:35Z
dc.date.available: 2022-01-12T06:12:35Z
dc.date.issued: 2017-11
dc.description: Today, Big Data is a hot topic in both industry and academia. Data processing needs are changing as data volume gradually increases and as the growing number of sources leads to a diversity of structures.
dc.description.abstract: Today, Big Data is a hot topic in both industry and academia. Data processing needs are changing as data volume gradually increases and as the growing number of sources leads to a diversity of structures. Although the relational database management system (RDBMS) remains the primary technology for managing structured data and has proven itself for more than 40 years, it has reached its limits because of the massive growth in the volume and variety of data. Many researchers and organizations now focus on the MapReduce and Spark frameworks, which have been highly successful at processing and analyzing large volumes of data across clusters. This study evaluates the performance of MapReduce, RDBMS, and Spark using several comparison measures. The comparison and analysis involve three steps: (a) a recommendation system is developed with all three approaches, (b) the system is run on various network configurations and data sizes, and (c) the output is analyzed and compared on the basis of computation time, memory consumption, and CPU usage. In addition, the observed results from all three approaches, under their respective node and network configurations, are statistically validated using the Friedman rank test and the Holm post-hoc test. Overall, the observations show that Spark is about 2.5 and 5 times faster than MapReduce, and 10 and 20 times faster than RDBMS. These speedups are attributed to the efficiency of the alternating least squares (ALS) algorithm and to reduced CPU and disk overheads due to RDD caching in Spark.
dc.identifier.uri: https://elibrary.tucl.edu.np/handle/20.500.14540/7292
dc.language.iso: en
dc.publisher: Pulchowk Campus
dc.subject: Cloud Computing
dc.subject: MapReduce
dc.subject: Multi-node cluster
dc.subject: Hadoop
dc.title: A Comparative Analysis of Cloud based Recommendation System on Mapreduce and Spark
dc.type: Thesis
local.academic.level: Masters
local.affiliatedinstitute.title: Pulchowk Campus
local.institute.title: Institute of Engineering
Files
Original bundle
Name: Sarala Ghimire.pdf
Size: 4.64 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission