A Comparative Study of Naive Bayesian Spam Filtering Using Word Distribution and Trigrams

dc.contributor.author: Dangol, Pabitra
dc.date.accessioned: 2023-02-05T05:09:03Z
dc.date.available: 2023-02-05T05:09:03Z
dc.date.issued: 2011
dc.description.abstract: A comparative study of Naive Bayesian spam filters is carried out on the basis of tokenization. The study focuses on the reliability and accuracy of the spam filter under word-based tokenization versus trigram-based tokenization. Both filters are implemented using the same classifier and trainer. The results show that word-based spam filtering is better when the number of pre-categorized emails available for training is limited and when the resources available for the classification process are limited as well. Given sufficient resources and emails, the results suggest that trigram-based spam filtering is better due to its higher reliability and accuracy. (en_US)
dc.identifier.uri: https://hdl.handle.net/20.500.14540/14846
dc.language.iso: en_US (en_US)
dc.publisher: Department of Computer Science and I.T. (en_US)
dc.subject: Naive Bayesian Spam (en_US)
dc.subject: Word Distribution (en_US)
dc.title: A Comparative Study of Naive Bayesian Spam Filtering Using Word Distribution and Trigrams (en_US)
dc.type: Thesis (en_US)
local.academic.level: Masters (en_US)
local.institute.title: Central Department of Computer Science and Information Technology (en_US)
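
The comparison described in the abstract, two tokenizers feeding one shared trainer and classifier, can be illustrated with a minimal sketch. This is not the thesis implementation: the record does not say whether the trigrams are character-level or word-level, so character trigrams are assumed here, and the names word_tokens, trigram_tokens and NaiveBayes, the add-one smoothing, and the toy training messages are all hypothetical.

```python
import math
import re
from collections import Counter


def word_tokens(text):
    """Word-based tokenization: lowercase runs of letters, digits, apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def trigram_tokens(text):
    """Trigram tokenization (ASSUMED character-level): 3-character sliding window."""
    s = " ".join(text.lower().split())  # collapse runs of whitespace
    return [s[i:i + 3] for i in range(len(s) - 2)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed choice).

    The tokenizer is injected, so both filters share the same trainer
    and classifier and differ only in how a message is split into tokens.
    """

    def __init__(self, tokenize):
        self.tokenize = tokenize
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = Counter()

    def train(self, text, label):
        self.docs[label] += 1
        self.counts[label].update(self.tokenize(text))

    def classify(self, text):
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = sum(self.docs.values())
        tokens = self.tokenize(text)
        scores = {}
        for label, counts in self.counts.items():
            total = sum(counts.values())
            # Log prior, smoothed so an unseen class never yields log(0).
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in tokens:
                # The extra +1 in the denominator reserves mass for unseen tokens.
                score += math.log((counts[tok] + 1) / (total + len(vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, made-up training data purely for illustration.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]

word_filter = NaiveBayes(word_tokens)
trigram_filter = NaiveBayes(trigram_tokens)
for message, label in TRAINING:
    word_filter.train(message, label)
    trigram_filter.train(message, label)
```

Calling word_filter.classify(...) and trigram_filter.classify(...) on the same message then compares the two tokenizations directly. The trigram filter stores many more distinct tokens per message, which is consistent with the abstract's remark that it needs more training emails and resources.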

Files

Original bundle

Name: Cover page.pdf
Size: 45.96 KB
Format: Adobe Portable Document Format

Name: Chapter.pdf
Size: 235.66 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission