Browsing by Subject "Machine learning"
Analysis of Queries Routing in Super-Super-Peer Based P2P Architecture Using NBTree: The Hybrid Algorithm
(Department of Computer Science and Information Technology, 2014) Dewan, Abhishek
The Internet is converging toward dynamic, very large, fully distributed peer-to-peer (P2P) overlay networks containing millions of nodes, typically used for information distribution and file sharing, as the number of computers connected to the Internet increases rapidly. As a result, a challenging problem in unstructured P2P systems is how to locate peers that are relevant to a given query with minimum query processing and answering time. Connected peers can leave the overlay network at any time, and new peers can join at any time. To address this, we propose an unstructured P2P system in which peers are organized around super-peers, which are in turn connected to a super-super-peer according to their semantic domains. The system uses the NBTree hybrid algorithm to identify the super-peer whose peers hold data relevant to a given query.
Keywords: Decision Tree, Machine Learning, NBTree, P2P, P2P Queries Answering, P2P Queries Routing, Super-Super-Peer, Weka

Nepal Stock Exchange Market Prediction using Support Vector Regression and Back-Propagation Neural Network
(Department of Computer Science and Information Technology, 2017) Pun, Top Bahadur
Acquiring knowledge by analyzing large volumes of data from various perspectives and summarizing it into useful information has become essential in recent years. Stock market prediction is an interesting and challenging research area in data science and artificial intelligence. In this research work, Nepal Stock Exchange data is used to predict the stock price for the next day. The data sets were collected from the Nepal Stock Exchange and preprocessed in order to obtain accurate results. Data from the year 2016 is used for this work.
Data belonging to promoter shares and unwanted features are eliminated from the data considered. The overall stock data is further divided by investment sector. The data sets are normalized before the machine learning methods are applied, to improve performance; two methods, Min-Max and Z-score normalization, are used. Support Vector Machine and Artificial Neural Network models are applied to predict stock prices. The performance of the two models is measured using mean squared error, mean absolute error, root mean squared error, and the coefficient of determination. Both models performed better on Min-Max normalized data. SVR performed better than BPNN at predicting stock prices across the different investment sectors.

Nepali Document Clustering using DBSCAN and OPTICS Algorithm
(Department of Computer Science & Information Technology, 2018) Maharjan, Prabin
Automated document clustering is the process of grouping documents into a small set of meaningful collections based on the similarity between them. This research evaluates two density-based clustering algorithms, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS), on a Nepali dataset using four performance metrics: Homogeneity, Completeness, V-Measure, and Silhouette Coefficient. Feature extraction is done using a combination of Term Frequency-Inverse Document Frequency (TFIDF) and Latent Semantic Indexing (LSI). The results on these metrics show that the clustering produced by DBSCAN is slightly better than that of OPTICS.
Processing time is also lower for DBSCAN.

Nepali Document Clustering using K-Means, Mini-Batch K-Means, and DBSCAN
(Department of Computer Science and Information Technology, 2018) Maharjan, Aman
Automated document clustering is the process of grouping documents into a small set of meaningful and coherent collections. This research evaluates the K-Means, Mini-Batch K-Means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithms in the context of Nepali documents using four performance measures: Homogeneity, Completeness, V-Measure, and Silhouette Coefficient. Feature extraction is done using Term Frequency-Inverse Document Frequency (TFIDF) alone and a combination of TFIDF and Latent Semantic Indexing (LSI). The empirical results show that Mini-Batch K-Means performs better when using TFIDF only, and K-Means performs better when using TFIDF + LSI. In time-constrained environments, Mini-Batch K-Means clusters faster than the other two algorithms.

Support vector machines based part of speech tagging for Nepal text
(Department of Computer Science and Information Technology, 2012) Shahi, Tej Bahadur
Optimal part-of-speech tagging is of great importance in various fields of natural language processing, such as machine translation, information extraction, word sense disambiguation, and speech recognition. Due to the nature of the Nepali language, the tagset used, and the size of the corpus (training data), building an accurate part-of-speech tagger is a challenging problem. This study aims to build an analytical machine learning model from which the attainable accuracy can be determined. To this end, a support vector machine based part-of-speech tagger has been developed and tested on various inputs to verify its accuracy. The SVM tagger constructs a feature vector for each word in the input and classifies the word into one of two classes (One vs Rest).
The performance analysis covers different components such as known words, unknown words, and the size of the training data. The present study is limited to a certain set of features and uses a small dictionary, which affects the tagger's performance. The tagger is observed to learn well from a small set of training data, and its rate of learning increases as the training set grows.
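The one-vs-rest scheme mentioned in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the feature set in `word_features`, the toy English corpus and tags, and the training settings are all hypothetical, and a small linear SVM trained by sub-gradient descent on the hinge loss stands in for whatever SVM package the author actually used.

```python
def word_features(words, i):
    """Hypothetical per-word features (not taken from the thesis)."""
    w = words[i]
    return {
        "bias": 1.0,
        "word=" + w.lower(): 1.0,
        "suffix2=" + w[-2:].lower(): 1.0,
        "prev=" + (words[i - 1].lower() if i > 0 else "<s>"): 1.0,
        "next=" + (words[i + 1].lower() if i + 1 < len(words) else "</s>"): 1.0,
    }


def score(weights, feats):
    """Linear decision value w . x for a sparse feature dict."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_binary_svm(samples, epochs=50, lr=0.1, reg=0.01):
    """Linear SVM via sub-gradient descent on the L2-regularized hinge loss.

    samples: list of (feature dict, label) pairs with label in {+1, -1}.
    """
    weights = {}
    for _ in range(epochs):
        for feats, y in samples:
            margin = y * score(weights, feats)
            # Regularization step: shrink every weight slightly.
            for name in weights:
                weights[name] *= 1.0 - lr * reg
            # Hinge-loss step: only samples inside the margin move the weights.
            if margin < 1.0:
                for name, value in feats.items():
                    weights[name] = weights.get(name, 0.0) + lr * y * value
    return weights


def train_one_vs_rest(tagged_sentences, **svm_args):
    """Train one binary SVM per tag: that tag (+1) against all other tags (-1)."""
    samples = [
        (word_features(words, i), tag)
        for words, tags in tagged_sentences
        for i, tag in enumerate(tags)
    ]
    all_tags = sorted({tag for _, tag in samples})
    return {
        tag: train_binary_svm(
            [(feats, 1 if t == tag else -1) for feats, t in samples], **svm_args
        )
        for tag in all_tags
    }


def tag_sentence(models, words):
    """Assign each word the tag whose binary classifier scores it highest."""
    tagged = []
    for i in range(len(words)):
        feats = word_features(words, i)
        tagged.append(max(models, key=lambda tag: score(models[tag], feats)))
    return tagged


# Toy English corpus, purely illustrative (the thesis works on Nepali text).
TOY_CORPUS = [
    (["The", "dog", "runs"], ["DET", "NOUN", "VERB"]),
    (["A", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
    (["The", "cat", "runs"], ["DET", "NOUN", "VERB"]),
]

models = train_one_vs_rest(TOY_CORPUS)
print(tag_sentence(models, ["The", "dog", "runs"]))
```

Each tag gets its own binary classifier, which matches the "one of two classes" wording in the abstract; the final tag is the one whose classifier is most confident. The handling of known versus unknown words discussed in the abstract is not modeled here.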