Computer Science & Information Technology

Recent Submissions

Now showing 1 - 20 of 143
  • Item
    A comparative evaluation of buffer replacement algorithms LIRS-WSR and CCF-LRU for flash memory based systems
    (Department of Computer Science and Information Technology, 2017) Yadav, Mahesh Kumar
    Available with full text
  • Item
    Performance Analysis of Stream Cipher RC4 Variants: VMPC & SPRITZ
    (Department of Computer Science and Information Technology, 2018) Sharma, Santosh
    Stream ciphers are among the most powerful tools in symmetric cryptography. They encrypt bit-wise or byte-wise in a simple way, XORing a keystream with the message (plaintext), and are roughly 5 to 10 times faster than block ciphers such as AES and TDES. In a stream cipher, generating a well-randomized keystream is the most important step. Stream ciphers are commonly used in GSM mobile communication, hard disk encryption, multimedia encryption, fast software encryption, standard web applications, network protocols, and similar settings. In this thesis, the RC4 variants VMPC and SPRITZ are studied and their performance is analyzed; both are implemented in Java using NetBeans 8.0.2 with all other parameters held constant. The empirical results show that VMPC performs better as the message size grows, while both variants perform poorly on small messages. SPRITZ is considerably slower because of its sponge-based construction, in which additional functions provide an extra layer of security over RC4, although its performance also improves as the message size increases. Overall, VMPC performs better across message sizes, its advantage grows with larger messages, and it is found to be the better algorithm for large messages on the target architecture.
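    For illustration only (not taken from the thesis): a minimal Java sketch of the XOR-keystream idea described above, using plain RC4 as the keystream generator; VMPC and SPRITZ replace this generator with their own state updates. The key, message, and class name are assumptions.

      import java.util.Arrays;

      public class StreamCipherDemo {
          // Key-scheduling algorithm (KSA): permute the 256-entry state using the key.
          static int[] ksa(byte[] key) {
              int[] s = new int[256];
              for (int i = 0; i < 256; i++) s[i] = i;
              int j = 0;
              for (int i = 0; i < 256; i++) {
                  j = (j + s[i] + (key[i % key.length] & 0xFF)) & 0xFF;
                  int t = s[i]; s[i] = s[j]; s[j] = t;
              }
              return s;
          }

          // PRGA: generate keystream bytes and XOR them with the message.
          static byte[] xorWithKeystream(byte[] msg, byte[] key) {
              int[] s = ksa(key);
              byte[] out = new byte[msg.length];
              int i = 0, j = 0;
              for (int k = 0; k < msg.length; k++) {
                  i = (i + 1) & 0xFF;
                  j = (j + s[i]) & 0xFF;
                  int t = s[i]; s[i] = s[j]; s[j] = t;
                  out[k] = (byte) (msg[k] ^ s[(s[i] + s[j]) & 0xFF]);
              }
              return out;
          }

          public static void main(String[] args) {
              byte[] key = "toy-key".getBytes();
              byte[] ct = xorWithKeystream("plain text".getBytes(), key);
              System.out.println(Arrays.toString(ct));
              // Decryption is the same XOR with the same keystream.
              System.out.println(new String(xorWithKeystream(ct, key)));
          }
      }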
  • Item
    Semantic text clustering using enhanced vector space model using Nepali language
    (Department of Computer Science and Information Technology, 2012) Sitaula, Chiranjibi
    The vector space model is a popular method for clustering in text mining research, mainly because of its low computational overhead and simplicity. The classical vector space model, however, cannot be used for semantic analysis because it clusters documents on a purely syntactic basis. This work studies how individual keywords contribute to text clustering when documents are clustered at the sentence level. An enhanced method is proposed that outperforms the classical vector space model through the use of a fuzzy set approach: the classical Vector Space Model is enriched with fuzzy sets to form the Enhanced Vector Space Model for text clustering. To achieve semantic text clustering, the fuzzy set component plays a crucial role alongside the classical Vector Space Model.
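    For illustration only: a minimal Java sketch of the classical vector space model the abstract starts from, with term-frequency vectors and cosine similarity. The fuzzy enhancement proposed in the thesis is not reproduced here, and the example texts are assumptions.

      import java.util.*;

      public class VsmSketch {
          static Map<String, Integer> termFreq(String text) {
              Map<String, Integer> tf = new HashMap<>();
              for (String w : text.toLowerCase().split("\\s+"))
                  tf.merge(w, 1, Integer::sum);
              return tf;
          }

          // Cosine similarity between two sparse term-frequency vectors.
          static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
              double dot = 0, na = 0, nb = 0;
              for (Map.Entry<String, Integer> e : a.entrySet()) {
                  dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
                  na += e.getValue() * e.getValue();
              }
              for (int v : b.values()) nb += v * v;
              return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
          }

          public static void main(String[] args) {
              // Documents with similar term distributions score close to 1.0.
              System.out.println(cosine(termFreq("the cat sat on the mat"),
                                        termFreq("the cat sat on the sofa")));
          }
      }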
  • Item
    Off-line Nepali Handwritten Character Recognition Using MLP and RBF Neural Networks
    (Department of Computer Science & Information Technology, 2012) Pant, Ashok Kumar
    An off-line Nepali handwriting recognition system based on neural networks is described in this research work. Recognizing off-line handwriting with a high classification rate requires a good set of features to describe each image. Two important categories of features, geometric and statistical, are used to extract information from character images: directional features are extracted from the geometry of the skeletonized character image, and statistical features are extracted from its pixel distribution. The research is primarily concerned with isolated handwritten character recognition for the Nepali language. Multilayer Perceptron (MLP) and Radial Basis Function (RBF) classifiers are used for classification. The principal contributions presented here are the preprocessing, the feature extraction, and the MLP and RBF classifiers. Another important contribution is the creation of a benchmark dataset of off-line Nepali handwriting, comprising three datasets for Nepali handwritten numerals, vowels, and consonants. The numeral dataset contains 288 samples for each of the 10 numeral classes, the vowel dataset contains 221 samples for each of the 12 vowel classes, and the consonant dataset contains 205 samples for each of the 36 consonant classes. The strength of this research lies in efficient feature extraction and a comprehensive classification scheme, yielding recognition accuracies of 94.44% on the numeral dataset, 86.04% on the vowel dataset, and 80.25% on the consonant dataset. Keywords: Off-line handwriting recognition, Image processing, Neural networks, Multilayer perceptron, Radial basis function, Preprocessing, Feature extraction, Nepali handwritten datasets
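    For illustration only: a minimal Java sketch of the forward pass of a one-hidden-layer MLP with sigmoid units, the classifier family named above. The layer sizes and random weights are placeholders, not the trained recognizer.

      import java.util.Random;

      public class MlpForward {
          static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

          // hidden = sigmoid(W1 * x + b1), output = sigmoid(W2 * hidden + b2)
          static double[] forward(double[] x, double[][] w1, double[] b1,
                                  double[][] w2, double[] b2) {
              double[] h = new double[w1.length];
              for (int i = 0; i < w1.length; i++) {
                  double s = b1[i];
                  for (int j = 0; j < x.length; j++) s += w1[i][j] * x[j];
                  h[i] = sigmoid(s);
              }
              double[] o = new double[w2.length];
              for (int i = 0; i < w2.length; i++) {
                  double s = b2[i];
                  for (int j = 0; j < h.length; j++) s += w2[i][j] * h[j];
                  o[i] = sigmoid(s);
              }
              return o;
          }

          public static void main(String[] args) {
              Random rnd = new Random(1);
              int in = 4, hid = 3, out = 2;      // tiny illustrative sizes
              double[][] w1 = new double[hid][in], w2 = new double[out][hid];
              double[] b1 = new double[hid], b2 = new double[out], x = new double[in];
              for (double[] row : w1) for (int j = 0; j < in; j++) row[j] = rnd.nextGaussian();
              for (double[] row : w2) for (int j = 0; j < hid; j++) row[j] = rnd.nextGaussian();
              for (int j = 0; j < in; j++) x[j] = rnd.nextDouble();
              System.out.println(java.util.Arrays.toString(forward(x, w1, b1, w2, b2)));
          }
      }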
  • Item
    Learning object model for object oriented paradigm
    (Department of Computer Science and Information Technology, 2010) Khatiwada, Suresh
    Although object orientation is widely acknowledged as a core area of computing, it is still considered difficult to study because its concepts and principles are frequently misunderstood. The principles themselves may not be very difficult to grasp, but the deep understanding needed to produce effective object-oriented solutions to problems is hard to achieve. This thesis applies Learning Object Technology to build a web-based environment for teaching and learning object-oriented programming. It describes an interactive teaching-learning system that helps students understand basic concepts and principles of object-oriented programming related to classes and instances. The differences in the basic concepts of C++ and Java are analyzed and suitable learning techniques are suggested. Four learning objects covering the full content are developed so that they can be used, reused, or referenced while learning object-oriented programming. The proposed system combines educational material that has been transformed into reusable learning objects built from the basic concepts of object-oriented programming.
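    For illustration only: a tiny Java class/instance example of the kind a learning object on these basics might present; the class and field names are illustrative.

      public class Account {
          private double balance;                 // state held by each instance

          public Account(double opening) { this.balance = opening; }

          public void deposit(double amount) { balance += amount; }

          public double getBalance() { return balance; }

          public static void main(String[] args) {
              Account a = new Account(100.0);     // two distinct instances of one class
              Account b = new Account(50.0);
              a.deposit(25.0);
              System.out.println(a.getBalance() + " " + b.getBalance()); // 125.0 50.0
          }
      }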
  • Item
    Minimizing the evacuation time of traffic management system using simulated annealing algorithm
    (Department of Computer Science and Information Technology, 2012) Subedi, Bal Krishna
    Route configuration is an important task in evacuation planning for any disaster. The evacuation route planning problem is to find an optimal route through a graph of the obstacle environment from a specified start location to a desired goal destination while satisfying certain optimization criteria, and the Emergency Evacuation Route Planning (EERP) problem requires the best path configuration. Recently, a genetic algorithm based approach was introduced to configure the optimal route for the EERP problem. However, it does not cope well with a growing number of cities or places, with changes in the locations of the source, goal, and congestion points, or with the handling of heuristic information, and its performance consequently deteriorates significantly; this motivates the present work. The simulated annealing based approach to finding the optimal emergency evacuation route is an optimization method similar in principle to the genetic algorithm, but our investigation and simulations indicate that it is simpler and better suited to the EERP problem, and its performance is also shown to be better than that of the genetic algorithm based approach. The first step of route configuration for the EERP problem is to find an initial feasible route; a commonly used method is to randomly pick some vertices of the graph of cities or places. This work proposes a heuristic method to find a feasible initial route efficiently and incorporates it into the proposed simulated annealing approach, which then takes less evacuation time to reach an optimal route configuration for the EERP problem.
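    For illustration only: a minimal Java skeleton of simulated annealing as used above, with a placeholder route cost and a simple swap neighbourhood; the cost function, cooling schedule, and node numbering are assumptions, not the thesis's model.

      import java.util.*;

      public class SaRouteSketch {
          // Placeholder cost: a synthetic travel time between consecutive nodes.
          static double cost(List<Integer> route) {
              double c = 0;
              for (int i = 1; i < route.size(); i++)
                  c += Math.abs(route.get(i) - route.get(i - 1)) + 1;
              return c;
          }

          public static void main(String[] args) {
              Random rnd = new Random(7);
              List<Integer> current = new ArrayList<>(Arrays.asList(0, 4, 2, 7, 5, 9));
              double t = 100.0, cooling = 0.95, curCost = cost(current);
              while (t > 0.01) {
                  // Neighbour: swap two intermediate nodes (start and goal stay fixed).
                  List<Integer> next = new ArrayList<>(current);
                  int i = 1 + rnd.nextInt(next.size() - 2);
                  int j = 1 + rnd.nextInt(next.size() - 2);
                  Collections.swap(next, i, j);
                  double nextCost = cost(next);
                  // Accept better routes always, worse routes with probability exp(-delta/T).
                  if (nextCost < curCost
                          || rnd.nextDouble() < Math.exp((curCost - nextCost) / t)) {
                      current = next;
                      curCost = nextCost;
                  }
                  t *= cooling;              // geometric cooling schedule
              }
              System.out.println(current + " cost=" + curCost);
          }
      }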
  • Item
    Creation of parallel corpus from comparable corpus (for English-Nepali language pair)
    (Department of Computer Science and Information Technology, 2011) Pant, Hari Prashad
    A Statistical Machine Translation system is a great need for multilingual countries such as Nepal. One of the major bottlenecks in developing Statistical Machine Translation systems for different language pairs, however, is the lack of bilingual parallel data for training such systems. Parallel data contains more or less exact translations of source language sentences into the target language; this is the parallel corpus used to train a Statistical Machine Translation system. Such parallel corpora are available for relatively few language pairs, for few domains, and in limited sizes, and constructing useful parallel data manually for different language pairs and domains, in sufficiently large size and of good quality, is costly in both human effort and money. While parallel corpora are a scarce resource, comparable corpora are a rich, diverse resource readily available in several domains and language pairs. These corpora consist of sets of documents in two different languages which are not exact translations of each other but contain related and similar information on the same topic. Such texts can be found on the Web in large quantities; good examples are online news agencies like CNN and BBC. In this dissertation, a method is proposed that exploits this diverse resource, comparable corpora, to extract parallel data from them in an automated manner. The proposed method first tokenizes the documents at the paragraph level, then obtains candidate target sentences for each source sentence using a sentence-length based method, and finally selects the best match among the candidate sentences using a bilingual dictionary. It has been observed that the quality and the number of words in the bilingual dictionary improve the accuracy of the model for creating a parallel corpus from comparable corpora.
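    For illustration only: a minimal Java sketch of the sentence-length filtering step described above, keeping target sentences whose word-length ratio to the source falls inside a band; the band of 0.4 and the example sentences are assumptions, and the dictionary-based matching step is omitted.

      import java.util.*;

      public class LengthFilterSketch {
          static List<String> candidates(String source, List<String> targets, double band) {
              int srcLen = source.split("\\s+").length;
              List<String> kept = new ArrayList<>();
              for (String t : targets) {
                  double ratio = (double) t.split("\\s+").length / srcLen;
                  if (ratio >= 1.0 - band && ratio <= 1.0 + band) kept.add(t);
              }
              return kept;
          }

          public static void main(String[] args) {
              List<String> targets = Arrays.asList(
                  "छोटो वाक्य",                                     // much shorter: filtered out
                  "यो वाक्य लम्बाइमा स्रोत वाक्यसँग मिल्दोजुल्दो छ");  // comparable length: kept
              System.out.println(candidates("this sentence is roughly comparable in length",
                                            targets, 0.4));
          }
      }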
  • Item
    Analysis of authorization framework and its implementation
    (Department of Computer Science and Information Technology, 2011) Bhandari, Pushpendra Singh
    As more resources are made available over the internet and intranets, it is important to ensure that resources are accessed only by appropriate users. A large-scale service-oriented computing environment, in which thousands of computers, storage systems, networks, scientific instruments, and other devices are distributed over wide area networks, presents unique security problems that are not addressed by traditional client-server and distributed computing environments; thus, authorization is required. An authorization implementation enables users and organizations to have secure, protected, and private access to remote services, and it has been found that designing authentication and authorization early eliminates a high percentage of application vulnerabilities. This thesis report focuses on the need for authorization, its requirements, and how access to protected resources by unauthenticated users in a distributed, web-based system is controlled using the controls and mechanisms provided by various authorization techniques and tools. The thesis focuses on Shibboleth, a widely used automated authentication and authorization tool designed to exchange authentication and authorization information across realms. Finally, an implementation is shown demonstrating how authorization can be used in an organization to ensure secure access to protected resources based on different access controls.
  • Item
    Ranking unstructured documents in IR (A comparative study of vector space model and latent semantic indexing model)
    (Department of Computer Science and Information Technology, 2012) Rana, Jamir
    For thousands of years, people have realized the importance of archiving and finding information. With the advent of computers, it became possible to store large amounts of information, and finding useful information in such collections became a necessity. The field of Information Retrieval (IR) was born in the 1950s out of this necessity and has matured considerably over the last fifty years; several IR systems are used on an everyday basis by a wide variety of users. The goal of information retrieval is to provide users with the documents that satisfy their information need. Various retrieval models have been implemented, such as the Boolean Model, the Vector Space Model, and the Probabilistic Model; among these, the Vector Space Model (VSM) and the Latent Semantic Indexing (LSI) model are promising models still in use today. The main concern of this study is to rank documents and to determine whether the LSI model overcomes the problems the VSM has with synonymy and polysemy while ranking documents. The implemented features of these models are presented: representing documents and queries as vectors in R^|V|, the term-document matrix, term weighting, cosine similarity, SVD decomposition, dimensionality reduction, and its effect on LSI results. Precision and recall have been implemented to measure the effectiveness of the system. Conclusions are drawn and recommendations are provided for future improvement.
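    For illustration only: a minimal Java sketch of the VSM side of the comparison, building a term-document matrix with tf-idf weights and ranking documents by cosine similarity to a query. The toy documents are assumptions; the LSI side would additionally factor this matrix with an SVD and truncate it to k dimensions, which is omitted here.

      import java.util.*;

      public class TfIdfRanking {
          public static void main(String[] args) {
              String[] docs = {
                  "gold silver truck",
                  "shipment of gold damaged in a fire",
                  "delivery of silver arrived in a silver truck"
              };
              String query = "gold silver truck";

              // Vocabulary and raw term frequencies per document.
              List<String> vocab = new ArrayList<>();
              List<Map<String, Integer>> tf = new ArrayList<>();
              for (String d : docs) {
                  Map<String, Integer> m = new HashMap<>();
                  for (String w : d.split("\\s+")) {
                      m.merge(w, 1, Integer::sum);
                      if (!vocab.contains(w)) vocab.add(w);
                  }
                  tf.add(m);
              }
              // idf(t) = log(N / df(t)); document weight = tf * idf.
              double[] idf = new double[vocab.size()];
              for (int t = 0; t < vocab.size(); t++) {
                  int df = 0;
                  for (Map<String, Integer> m : tf) if (m.containsKey(vocab.get(t))) df++;
                  idf[t] = Math.log((double) docs.length / df);
              }
              double[] q = new double[vocab.size()];
              for (String w : query.split("\\s+")) {
                  int t = vocab.indexOf(w);
                  if (t >= 0) q[t] += idf[t];
              }
              // Rank each document by cosine similarity to the query vector.
              for (int d = 0; d < docs.length; d++) {
                  double dot = 0, nd = 0, nq = 0;
                  for (int t = 0; t < vocab.size(); t++) {
                      double w = tf.get(d).getOrDefault(vocab.get(t), 0) * idf[t];
                      dot += w * q[t]; nd += w * w; nq += q[t] * q[t];
                  }
                  double score = (nd == 0 || nq == 0) ? 0 : dot / Math.sqrt(nd * nq);
                  System.out.printf("doc %d: %.3f%n", d, score);
              }
          }
      }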
  • Item
    Performance analysis of hash message digests SHA-2 and SHA-3 finalists
    (Department of Computer Science & Information Technology, 2012) Dahal, Ram Krishna
    Cryptographic hash functions are considered the workhorses of cryptography. NIST published the first Secure Hash Standard, SHA-0, in 1993 as a Federal Information Processing Standard publication (FIPS PUB); two years later it was replaced by SHA-1 to improve the original design, and the SHA-2 family was added in subsequent revisions of the FIPS. Many widely used cryptographic hash functions are under attack today. To maintain the required level of security, NIST is in the process of selecting a new cryptographic hash function through a public competition. The winning algorithm must not only offer strong security but also exhibit good performance in practice. In this work, the analysis focuses on the performance of the SHA-3 finalists along with the current standard, SHA-2. Java implementations of the five finalists were produced as specified in their submission proposals. The results of the empirical performance comparison show that two SHA-3 finalists, Skein and BLAKE, perform best, with performance nearly the same as that of SHA-2. There is a wide gap in the performance of the candidates: the best performer is 3-4 times faster than the slowest. The results show that, considering performance alone and assuming all candidates are equally secure, Skein or BLAKE could serve as an alternative to SHA-2.
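    For illustration only: a minimal Java timing harness in the spirit of the comparison above, using digests built into the JDK via java.security.MessageDigest. "SHA3-256" is available only on newer JDKs (9+), and the SHA-3 finalists themselves would need their reference Java implementations; the message size and algorithm list are assumptions.

      import java.security.MessageDigest;
      import java.util.Random;

      public class DigestTiming {
          public static void main(String[] args) throws Exception {
              byte[] message = new byte[16 * 1024 * 1024];   // 16 MB test message
              new Random(1).nextBytes(message);
              for (String alg : new String[] { "SHA-256", "SHA-512", "SHA3-256" }) {
                  MessageDigest md = MessageDigest.getInstance(alg);
                  md.digest(message);                         // warm-up run
                  long start = System.nanoTime();
                  md.digest(message);
                  long ms = (System.nanoTime() - start) / 1_000_000;
                  System.out.println(alg + ": " + ms + " ms");
              }
          }
      }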
  • Item
    Optimal coloring of a plane using unit distance graphs
    (Department of Computer Science and Information Technology, 2012) Bist, Harendra Raj
    Generally, plane coloring is the assignment of colors to the points of a plane so that points at unit distance from each other receive different colors. A plane can be represented by a graph and hence colored using graph coloring, so the chromatic number of the plane is equivalent to the chromatic number of its corresponding graph; unit distance graphs can be used to find it. A principal application of unit distance graphs is unit distance wireless (UDW) networks: in cellular networks the geographic region is partitioned into hexagonal cells, a unit distance graph can represent these cells, and coloring the hexagonal regions corresponds to distributing frequencies among the cells. Different heuristic-based algorithms, namely contraction-based RLF, DSATUR, and IDO-based graph coloring algorithms, are used in this study to color unit distance graphs, with simple unit distance graphs such as wheel graphs, cycle graphs, and grid graphs serving as test cases. The optimality of plane coloring is nowadays no longer confined to using the minimal number of colors; coloring time also matters, since frequencies must be assigned to geographic regions in cellular networks and data must be assigned to registers during program execution in minimum time, and these problems can be solved through graph coloring. In this context, this study analyzes the optimal coloring of planes using heuristic-based approaches in order to suggest an effective coloring paradigm.
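    For illustration only: a minimal Java sketch of a DSATUR-style greedy colouring over an adjacency matrix; the wheel, cycle, and grid test graphs used in the study would be supplied as input, and the 4-cycle below is just an example.

      import java.util.*;

      public class DsaturSketch {
          static int[] color(boolean[][] adj) {
              int n = adj.length;
              int[] color = new int[n];
              Arrays.fill(color, -1);
              for (int step = 0; step < n; step++) {
                  // Pick the uncoloured vertex with the most distinct neighbour colours
                  // (saturation degree), breaking ties by degree.
                  int best = -1, bestSat = -1, bestDeg = -1;
                  for (int v = 0; v < n; v++) {
                      if (color[v] != -1) continue;
                      Set<Integer> sat = new HashSet<>();
                      int deg = 0;
                      for (int u = 0; u < n; u++) {
                          if (!adj[v][u]) continue;
                          deg++;
                          if (color[u] != -1) sat.add(color[u]);
                      }
                      if (sat.size() > bestSat || (sat.size() == bestSat && deg > bestDeg)) {
                          best = v; bestSat = sat.size(); bestDeg = deg;
                      }
                  }
                  // Assign the smallest colour not used by any coloured neighbour.
                  boolean[] used = new boolean[n + 1];
                  for (int u = 0; u < n; u++)
                      if (adj[best][u] && color[u] != -1) used[color[u]] = true;
                  int c = 0;
                  while (used[c]) c++;
                  color[best] = c;
              }
              return color;
          }

          public static void main(String[] args) {
              // A 4-cycle needs only two colours.
              boolean[][] c4 = new boolean[4][4];
              int[][] edges = { {0, 1}, {1, 2}, {2, 3}, {3, 0} };
              for (int[] e : edges) { c4[e[0]][e[1]] = true; c4[e[1]][e[0]] = true; }
              System.out.println(Arrays.toString(color(c4)));  // e.g. [0, 1, 0, 1]
          }
      }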
  • Item
    Analysis of Adhoc on demand distance vector (AODV) and dynamic source (DSR) routing algorithms in mobile Adhoc networks
    (Department of Computer Science and Information Technology, 2011) Pokhrel, Vivek
    Ad hoc networking is a concept in computer communications in which users wanting to communicate with each other form a temporary network without any form of centralized administration. Each node participating in the network acts both as a host and as a router and must therefore be willing to forward packets for other nodes; for this purpose a routing protocol is needed. An ad hoc network has certain characteristics that impose new demands on the routing protocol, the most important being the dynamic topology that results from node mobility. Nodes can change position quite frequently, so the routing protocol must adapt quickly to topology changes. The nodes in an ad hoc network are often laptops and personal digital assistants with very limited resources such as CPU capacity, storage capacity, battery power, and bandwidth, so the routing protocol should minimize control traffic such as periodic update messages; instead, it should be reactive, calculating routes only upon receiving a specific request. The Internet Engineering Task Force has a working group named Mobile Ad-hoc Networks (MANET) that is working on routing specifications for ad hoc networks. This thesis evaluates some of the protocols put forward by the working group by means of simulation using the Network Simulator (NS-2) from Berkeley. The simulations show that a dedicated ad hoc routing protocol is certainly needed when mobility increases; more conventional routing protocols such as DSDV suffer a dramatic decrease in performance when mobility is high. The two protocols studied in this work, DSR and AODV, perform very well when mobility is high. However, a routing protocol that depends entirely on messages at the IP level will not perform well; some support from the lower layer, for instance link failure detection or neighbor discovery, is necessary for high performance. The size of the network and the offered traffic load affect protocols based on source routing, like DSR, to some extent: a large network with many mobile nodes and a high offered load increases the overhead for DSR quite drastically, and in these situations a hop-by-hop routing protocol like AODV is more desirable. Keywords: MANET, DSR, AODV, NS-2 network simulator
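    For illustration only: a tiny Java sketch of the routing-state difference the conclusion draws on. A DSR-style packet carries the whole source route in its header, while an AODV-style node keeps only a next-hop entry per destination; the topology and node names are assumptions.

      import java.util.*;

      public class RoutingStateSketch {
          public static void main(String[] args) {
              // DSR style: the full route travels inside every data packet header.
              List<String> sourceRoute = Arrays.asList("A", "B", "C", "D");
              System.out.println("DSR packet header route: " + sourceRoute);

              // AODV style: each node stores only the next hop toward the destination,
              // learned from route request/reply flooding.
              Map<String, String> nextHopAtA = new HashMap<>();
              nextHopAtA.put("D", "B");            // at node A, forward packets for D to B
              Map<String, String> nextHopAtB = new HashMap<>();
              nextHopAtB.put("D", "C");            // at node B, forward packets for D to C
              System.out.println("AODV next hop at A for D: " + nextHopAtA.get("D"));
              System.out.println("AODV next hop at B for D: " + nextHopAtB.get("D"));
          }
      }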
  • Item
    Improved dynamic programming approach for the response time
    (Department of Computer Science and Information Technology, 2011) Pant, Shiv Raj
    Fair sequences are useful in a variety of manufacturing and computer systems. The concept of a fair sequence has emerged independently from scheduling problems in diverse environments, principally manufacturing, hard real-time systems, operating systems, and networks, and there has been growing interest in scheduling problems where a fair sequence is needed. In many applications, jobs, clients, or products need to be scheduled so that they receive their necessary resources at a constant interval, being neither too early nor too late. The notion of variation in response time has recently appeared in the literature and is the subject of considerable research; the problem of variation in response time is known as the Response Time Variability Problem (RTVP). This dissertation covers recent research on the response time variability problem. The RTVP is very hard to solve optimally and has been proved to be NP-hard. Our concern in this dissertation is to find an optimal sequence of jobs with the objective of minimizing the response time variability. Various heuristic solutions exist in the literature for this objective; one of them is the dynamic programming approach, on which this dissertation focuses. The dynamic programming approach is a complete enumeration scheme that reduces the amount of computation by dividing the problem into a series of subproblems and solving them until the solution of the original problem is found. On its own it is not a practical solution because of its exponential time and space complexity. The main objective of this dissertation is therefore to improve the dynamic programming approach to the RTVP and obtain an efficient solution. The approach is improved in practice by applying heuristic methods: the whole state space need not be searched if some states can be shown not to lead to an optimal solution, so heuristics are applied to prune the non-optimal states. Since the problem is NP-hard, the exponential complexity cannot be reduced to polynomial complexity in theory, but in practice heuristic methods can modify the algorithm so that it solves larger instances of the problem.
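    For illustration only: a minimal Java sketch computing the response time variability of a given sequence, following the definition commonly used in the RTVP literature (sum over products of the squared deviations of the cyclic gaps between consecutive copies from the ideal gap D/d_i); the example sequences are assumptions. The optimization itself, which the dissertation addresses with pruned dynamic programming, is not shown.

      import java.util.*;

      public class RtvSketch {
          static double rtv(int[] seq, int numProducts) {
              int D = seq.length;
              Map<Integer, List<Integer>> positions = new HashMap<>();
              for (int p = 0; p < numProducts; p++) positions.put(p, new ArrayList<>());
              for (int t = 0; t < D; t++) positions.get(seq[t]).add(t);
              double total = 0;
              for (int p = 0; p < numProducts; p++) {
                  List<Integer> pos = positions.get(p);
                  int d = pos.size();
                  if (d < 2) continue;                 // a single copy contributes no variation
                  double ideal = (double) D / d;
                  for (int k = 0; k < d; k++) {
                      int next = pos.get((k + 1) % d);
                      int gap = (next - pos.get(k) + D) % D;   // cyclic distance to next copy
                      total += (gap - ideal) * (gap - ideal);
                  }
              }
              return total;
          }

          public static void main(String[] args) {
              // Products 0, 1, 2 with demands 2, 1, 1 over D = 4 positions.
              System.out.println(rtv(new int[] {0, 1, 0, 2}, 3));  // evenly spread copies: 0.0
              System.out.println(rtv(new int[] {0, 0, 1, 2}, 3));  // bunched copies: 2.0
          }
      }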
  • Item
    Support vector machines based part of speech tagging for Nepal text
    (Department of Computer Science and Information Technology, 2012) Shahi, Tej Bahadur
    Optimal part-of-speech tagging is of great importance in various fields of natural language processing, such as machine translation, information extraction, word sense disambiguation, and speech recognition. Due to the nature of the Nepali language, the tagset used, and the size of the corpus (training data), building an accurate part-of-speech tagger is a challenging issue. This study builds an analytical machine learning model from which the attainable accuracy can be determined. To this end, a support vector machine based part-of-speech tagger has been developed and tested on various inputs to verify its accuracy. The SVM tagger constructs a feature vector for each word in the input and classifies it by making a binary, one-vs-rest decision for each candidate tag. The performance analysis covers different components such as known words, unknown words, and the size of the training data. The present support vector machine based part-of-speech tagger is limited to a certain set of features and uses a small dictionary, which affects its performance. Observation of the tagger's learning behaviour shows that it learns well from a small training set and that its accuracy increases as the training size grows.
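    For illustration only: a minimal Java sketch of the one-vs-rest decision step, where each tag has a linear scoring function over binary context features and the highest-scoring tag wins. The tags, features, and weights are illustrative placeholders standing in for per-tag SVM models, not learned values from the thesis.

      public class OneVsRestSketch {
          public static void main(String[] args) {
              String[] tags = { "NN", "VB", "JJ" };
              String[] features = { "prev=DT", "suffix=ing", "is_capital" };

              // weights[tag][feature], assumed to come from per-tag training.
              double[][] weights = {
                  { 1.2, -0.4, 0.3 },   // NN
                  { -0.8, 1.5, -0.2 },  // VB
                  { 0.1, -0.3, 0.2 }    // JJ
              };
              // Feature vector for one word position: previous tag DT, capitalized.
              double[] x = { 1, 0, 1 };

              int best = 0; double bestScore = Double.NEGATIVE_INFINITY;
              for (int t = 0; t < tags.length; t++) {
                  double score = 0;
                  for (int f = 0; f < features.length; f++) score += weights[t][f] * x[f];
                  System.out.println(tags[t] + ": " + score);
                  if (score > bestScore) { bestScore = score; best = t; }
              }
              System.out.println("predicted tag: " + tags[best]);
          }
      }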
  • Item
    A secure cryptographic algorithm improving security over known plain text attack based on Hill Cipher
    (Department of Computer Science and Information Technology, 2011) Sah, Khagendra Prasad
    The secrecy of sensitive information against unauthorized access or deceitful changes has been of prime concern throughout the centuries. With the introduction of computers, securing data and information (stored on a shared system or transmitted in a distributed system) to maintain confidentiality, proper access control, integrity, and availability has become a major issue, and cryptology is the way to address it. The Hill cipher is one of the most famous symmetric cryptosystems that can be used to protect information from unauthorized access; it is a multi-letter, polygraphic substitution cipher. Although the Hill cipher is resistant to letter-frequency analysis and strong against the ciphertext-only attack, it succumbs to the known-plaintext attack. Another disadvantage of the Hill cipher is the non-invertible key matrix: not all matrices possess inverses modulo 26, which reduces the number of usable keys and increases the feasibility of a brute-force attack. In the proposed scheme, a variant of the Hill cipher is introduced that makes it more secure while retaining its efficiency. The scheme uses the Cipher Block Chaining (CBC) mode of operation and a bitwise XOR operation, which add diffusion and confusion and make the scheme strong against the known-plaintext attack.
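    For illustration only: a minimal Java sketch of classical Hill-cipher encryption, C = K * P mod 26, with a 2x2 key matrix; the variant proposed in the thesis additionally chains blocks CBC-style with an XOR step, which is omitted here, and the key and plaintext are illustrative.

      public class HillSketch {
          static final int[][] KEY = { { 3, 3 }, { 2, 5 } };   // invertible mod 26 (det = 9)

          static String encrypt(String plain) {                // plain: A-Z, even length
              StringBuilder out = new StringBuilder();
              for (int i = 0; i < plain.length(); i += 2) {
                  int p0 = plain.charAt(i) - 'A', p1 = plain.charAt(i + 1) - 'A';
                  int c0 = (KEY[0][0] * p0 + KEY[0][1] * p1) % 26;
                  int c1 = (KEY[1][0] * p0 + KEY[1][1] * p1) % 26;
                  out.append((char) ('A' + c0)).append((char) ('A' + c1));
              }
              return out.toString();
          }

          public static void main(String[] args) {
              System.out.println(encrypt("HELP"));   // example plaintext
          }
      }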
  • Item
    Predicting sentence using N-gram language model for Nepali language
    (Department of Computer Science and Information Technology, 2012) K.C., Ananda
    Sentence completion is a real-time, ubiquitous feature aimed at predicting a succeeding word sequence, that is, an appropriate completion of a given initial text fragment. Sentence completion enables a user to retrieve desired information with little knowledge of the exact keywords and with minimal typing effort. Within the statistical approach, this work uses an N-gram method to predict the remaining part of a sentence in the Nepali language, with the Viterbi algorithm for decoding. Analysis of the results shows that the trigram prediction model is more accurate than the bigram prediction model. To obtain the best results, this work recommends using a large corpus with sufficient repetition of words.
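    For illustration only: a minimal Java sketch of the bigram side of the prediction idea, counting word pairs in a tiny corpus and proposing the most frequent follower of the last typed word. The toy Nepali sentences are assumptions; the thesis's model additionally uses trigrams and Viterbi decoding over full sentences.

      import java.util.*;

      public class BigramPredictor {
          public static void main(String[] args) {
              String[] corpus = {
                  "म घर जान्छु", "म स्कुल जान्छु", "म घर आउँछु"   // tiny illustrative corpus
              };
              Map<String, Map<String, Integer>> bigrams = new HashMap<>();
              for (String sentence : corpus) {
                  String[] w = sentence.split("\\s+");
                  for (int i = 0; i + 1 < w.length; i++)
                      bigrams.computeIfAbsent(w[i], k -> new HashMap<>())
                             .merge(w[i + 1], 1, Integer::sum);
              }
              // Predict the most likely word to follow "म".
              String last = "म";
              bigrams.getOrDefault(last, Collections.emptyMap()).entrySet().stream()
                     .max(Map.Entry.comparingByValue())
                     .ifPresent(e -> System.out.println(last + " -> " + e.getKey()));
          }
      }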
  • Item
    Comparison of association rule mining algorithms- Apriori and FP growth
    (Department of Computer Science and Information Technology, 2011) Bhatt, Krishan Dev
    Data mining is part of a process called KDD, knowledge discovery in databases. This process consists of steps that are performed before data mining itself, such as data selection, data cleaning, pre-processing, and data transformation. Association rule techniques are used in data mining when the goal is to detect relationships or associations between specific values of categorical variables in large data sets; there may be thousands or millions of records from which rules have to be extracted. Frequent pattern mining is a very important task in data mining, and the approaches applied to generate frequent itemsets generally adopt candidate generation and pruning techniques to satisfy the desired objectives. This dissertation shows how the different approaches achieve frequent pattern mining, examines the complexity each requires, and compares the Apriori and FP-Growth algorithms; such mining is helpful in building support systems for many computing applications. It has been observed that at higher support and confidence values FP-Growth extracts better association rules than Apriori, while at lower support and confidence values Apriori appears better than FP-Growth.
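    For illustration only: a minimal Java sketch of the Apriori idea named above, counting frequent single items and then only those candidate pairs whose members are individually frequent (the pruning step); the transactions and minimum support are assumptions. FP-Growth instead builds an FP-tree and mines it without candidate generation, which is not shown.

      import java.util.*;

      public class AprioriSketch {
          public static void main(String[] args) {
              List<Set<String>> transactions = Arrays.asList(
                  new HashSet<>(Arrays.asList("bread", "milk")),
                  new HashSet<>(Arrays.asList("bread", "diaper", "beer")),
                  new HashSet<>(Arrays.asList("milk", "diaper", "beer")),
                  new HashSet<>(Arrays.asList("bread", "milk", "diaper", "beer")));
              int minSupport = 2;

              // Pass 1: frequent 1-itemsets.
              Map<String, Integer> count1 = new HashMap<>();
              for (Set<String> t : transactions)
                  for (String item : t) count1.merge(item, 1, Integer::sum);
              List<String> frequent1 = new ArrayList<>();
              for (Map.Entry<String, Integer> e : count1.entrySet())
                  if (e.getValue() >= minSupport) frequent1.add(e.getKey());

              // Pass 2: candidate pairs built only from frequent single items (pruning).
              for (int i = 0; i < frequent1.size(); i++)
                  for (int j = i + 1; j < frequent1.size(); j++) {
                      String a = frequent1.get(i), b = frequent1.get(j);
                      int support = 0;
                      for (Set<String> t : transactions)
                          if (t.contains(a) && t.contains(b)) support++;
                      if (support >= minSupport)
                          System.out.println("{" + a + ", " + b + "} support=" + support);
                  }
          }
      }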
  • Item
    Comparative analysis of particle swarm optimization varying the inertia factor
    (Department of Computer Science and Information Technology, 2013) Aryal, Sandeep
    Finding a sub-optimal solution to a difficult problem is sometimes better than finding the optimal one, as it reduces cost in terms of time and feasibility; approximation algorithms serve exactly this purpose. Among the different optimization techniques for different optimization problems, approximation algorithms help in finding results that approximate the optimum. In this dissertation, an implementation of Particle Swarm Optimization, an approximation algorithm, is provided and its parameters are varied. The impact of these variations is studied on three standard benchmark functions, namely the Parabola, Rosenbrock, and Griewank functions, and analyzed statistically. The main focus of this work, however, is the variation of the inertia factor in the algorithm: this factor is varied over values that follow arithmetic, geometric, and harmonic progressions. The resulting effects of these variations on the benchmark functions are presented together with a statistical analysis of the results. The work then suggests an approach to selecting the progression when varying the inertia factor over arithmetic, geometric, and harmonic sequences in the simplest form of the Particle Swarm Optimization algorithm. Keywords: Approximation Algorithms, Swarm Intelligence, Particle Swarm Optimization, Inertia Weight, Mathematical Progressions
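    For illustration only: a minimal Java sketch of basic PSO on the Parabola (sphere) benchmark, showing where the inertia factor w enters the velocity update. The fixed w = 0.7 and other parameter values are assumptions; the dissertation instead varies w over arithmetic, geometric, and harmonic progressions.

      import java.util.Random;

      public class PsoSketch {
          public static void main(String[] args) {
              Random rnd = new Random(3);
              int particles = 20, dims = 2, iterations = 200;
              double w = 0.7, c1 = 1.5, c2 = 1.5;      // inertia and acceleration factors
              double[][] x = new double[particles][dims], v = new double[particles][dims];
              double[][] pbest = new double[particles][dims];
              double[] pbestVal = new double[particles];
              double[] gbest = new double[dims];
              double gbestVal = Double.MAX_VALUE;

              for (int i = 0; i < particles; i++) {
                  for (int d = 0; d < dims; d++) x[i][d] = rnd.nextDouble() * 10 - 5;
                  pbest[i] = x[i].clone();
                  pbestVal[i] = sphere(x[i]);
                  if (pbestVal[i] < gbestVal) { gbestVal = pbestVal[i]; gbest = x[i].clone(); }
              }
              for (int it = 0; it < iterations; it++) {
                  for (int i = 0; i < particles; i++) {
                      for (int d = 0; d < dims; d++) {
                          // v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
                          v[i][d] = w * v[i][d]
                                  + c1 * rnd.nextDouble() * (pbest[i][d] - x[i][d])
                                  + c2 * rnd.nextDouble() * (gbest[d] - x[i][d]);
                          x[i][d] += v[i][d];
                      }
                      double f = sphere(x[i]);
                      if (f < pbestVal[i]) { pbestVal[i] = f; pbest[i] = x[i].clone(); }
                      if (f < gbestVal)    { gbestVal = f;    gbest = x[i].clone(); }
                  }
              }
              System.out.println("best value: " + gbestVal);
          }

          static double sphere(double[] x) {           // f(x) = sum of x_d^2, minimum 0
              double s = 0;
              for (double xd : x) s += xd * xd;
              return s;
          }
      }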
  • Item
    Comparison of back propagation algorithm and SVM on SLA based masquerader detection in cloud
    (Department of Computer Science and Information Technology, 2013) Ghimire, Dadhi Ram
    Cloud computing is a prospering technology that most organizations are considering for adoption as a cost-effective strategy for managing IT. However, organizations still consider the technology to carry many unresolved business risks, including security, privacy, and legal and regulatory risks. As an initiative to address such risks, organizations can develop and implement a Service Level Agreement (SLA) to establish common expectations and goals between the cloud provider and the customer, and they can rely on the SLA to address security concerns. However, many SLAs tend to focus on cloud computing performance while neglecting information security issues. This study builds a masquerade detection system for cloud computing based on a proposed SLA. The new SLA contains additional security constraints beyond those found in a traditional SLA, such as the length of the temporal sequence, the weight of each activity, and the threshold weight of the temporal sequence. The performance analysis compares the backpropagation algorithm with SVM. The observed detection rate and false alarm rate show that the system can detect masqueraders well from a small training set with a small false alarm rate. Keywords: Cloud Computing, Service Level Agreement, Masquerader, Backpropagation Algorithm, Support Vector Machine, Temporal Sequence
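    For illustration only: a tiny Java sketch of the SLA-style constraints mentioned above, scoring a temporal sequence of user activities by per-activity weights against a threshold. The activity names, weights, and threshold are illustrative placeholders; the thesis feeds such sequences to backpropagation and SVM classifiers rather than to a fixed rule.

      import java.util.*;

      public class SequenceScoreSketch {
          public static void main(String[] args) {
              Map<String, Double> activityWeight = new HashMap<>();
              activityWeight.put("login", 0.1);
              activityWeight.put("read_file", 0.2);
              activityWeight.put("copy_bulk_data", 0.9);
              activityWeight.put("change_permissions", 0.8);

              String[] temporalSequence = { "login", "read_file", "copy_bulk_data",
                                            "change_permissions", "read_file" };
              double threshold = 1.5;                    // threshold weight for the sequence

              double score = 0;
              for (String a : temporalSequence) score += activityWeight.getOrDefault(a, 0.5);
              System.out.println(score > threshold
                  ? "flag as possible masquerader (score " + score + ")"
                  : "normal (score " + score + ")");
          }
      }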
  • Item
    A comparative evaluation of buffer replacement algorithms LIRS-WSR and AD-LRU for flash memory based systems
    (Department of Computer Science and Information Technology, 2014) Singh Mahara, Dabbal
    Flash memory has asymmetric I/O latencies for read, write, and erase operations, and it uses out-of-place updates. Buffering policies for flash-based systems therefore have to consider these properties to improve overall performance. Existing buffer replacement algorithms such as LRU, LIRS, and ARC do not deal with the differing I/O latencies of flash memory, so they have been revised to make them suitable as buffering policies for flash-based systems. Among the various flash-aware buffer replacement algorithms, LIRS-WSR and AD-LRU are two recent policies suited to flash-based systems. LIRS-WSR enhances LIRS by reordering the writes of not-cold-dirty pages from the buffer cache to flash storage, aiming to reduce the number of write/erase operations while preventing serious degradation of the buffer hit ratio. AD-LRU likewise focuses on improving the overall performance of flash-based systems by reducing the number of write/erase operations while retaining a high buffer hit ratio. We evaluate these two approaches, which share the objective of improving buffering policy for flash-based systems, using trace-driven simulation. When the workload has high reference locality, AD-LRU performs significantly better than LIRS-WSR in terms of both hit rate and write count: AD-LRU achieves a hit rate up to 22% higher and reduces the write count by up to 40% compared with LIRS-WSR, owing to its good adaptive technique for handling changes in reference patterns. For uniformly distributed workloads, the differences in hit rate and write count between AD-LRU and LIRS-WSR are comparatively small; even in its worst case, AD-LRU outperforms LIRS-WSR, with a hit rate up to 5% higher and a write count up to 3% lower. Keywords: Flash memory, Buffer Replacement Algorithm, LIRS, LIRS-WSR, AD-LRU, Hit Rate, Write Count
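    For illustration only: a minimal Java sketch of a plain LRU buffer that tracks the two metrics compared above, hit rate and write (flush) count for dirty pages evicted to flash. The reference string and buffer size are assumptions; flash-aware policies such as AD-LRU and LIRS-WSR refine which page is chosen as the victim.

      import java.util.*;

      public class LruBufferSketch {
          public static void main(String[] args) {
              int capacity = 3;
              LinkedHashMap<Integer, Boolean> buffer =           // page id -> dirty flag
                  new LinkedHashMap<>(capacity, 0.75f, true);    // access-order LRU
              int hits = 0, writes = 0;
              // Reference string: {page id, 1 if the access is a write}.
              int[][] refs = { {1,0}, {2,1}, {3,0}, {1,0}, {4,1}, {2,0}, {1,1} };
              for (int[] r : refs) {
                  int page = r[0];
                  boolean write = r[1] == 1;
                  if (buffer.containsKey(page)) {
                      hits++;
                      buffer.put(page, buffer.get(page) || write);
                  } else {
                      if (buffer.size() == capacity) {
                          // Evict the least recently used page; a dirty victim costs a flash write.
                          Integer victim = buffer.keySet().iterator().next();
                          if (buffer.remove(victim)) writes++;
                      }
                      buffer.put(page, write);
                  }
              }
              System.out.println("hit rate = " + (double) hits / refs.length
                               + ", write count = " + writes);
          }
      }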