ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010

A Novel Classification via Clustering Method for Anomaly Based Network Intrusion Detection System

Mrutyunjaya Panda (1) and Manas Ranjan Patra (2)
(1) Department of ECE, Gandhi Institute of Engineering and Technology, Gunupur, Orissa-765022, India. Email: mrutyunjaya.2007@rediffmail.com
(2) Department of Computer Science, Berhampur University, Orissa-760007, India. Email: mrpatra12@gmail.com

Abstract- Intrusion detection in the internet is an active area of research. Intruders can be classified into two types, namely external intruders, who are unauthorized users of the computers they attack, and internal intruders, who have permission to access the system but with some restrictions. The aim of this paper is to present a methodology to recognize attacks during the normal activities in a system. A novel classification via sequential information bottleneck (sIB) clustering algorithm has been proposed to build an efficient anomaly based network intrusion detection model. We have compared our proposed method with other clustering algorithms like X-Means, Farthest First, Filtered Clusterer, DBSCAN, K-Means, and EM (Expectation-Maximization) clustering in order to assess the suitability of our proposed algorithm. A subset of the KDDCup 1999 intrusion detection benchmark dataset has been used for the experiments. Results show that the proposed method is efficient in terms of detection accuracy and low false positive rate in comparison to the other existing methods.

Index Terms: Intrusion Detection, Meta Classifier, Unsupervised clustering, Sequential Information Bottleneck clustering.
I. INTRODUCTION

Identifying unauthorized use, misuse and attacks on information systems is defined as intrusion detection [1, 2]. The most popular way to detect intrusions is to analyze the audit data generated by operating systems and networks. Since almost all activities are logged on a system, it is possible that a manual inspection of these logs would allow intrusions to be detected. It is also important to analyze the audit data even after an attack has occurred: it helps in determining the extent of the damage, in tracing back the attack, and in recording attack patterns for future prevention of similar attacks. An intrusion detection system (IDS) can be used to analyze audit data for such insights. This makes an IDS a valuable real-time detection and prevention tool as well as a forensic analysis tool.

Traditionally, intrusion detection techniques fall into two categories: signature detection and anomaly detection. Signature detection, or misuse detection, searches for well known patterns of attacks and intrusions by scanning for pre-classified signatures in network packets. Anomaly detection, on the other hand, can detect new intrusions while misuse detection may not. The idea behind anomaly detection is that if we can establish a normal activity profile for a system, in theory we can flag all system states deviating from the established profile as intrusion attempts. However, if the set of intrusive activities is not identical to the set of known anomalous activities, the situation becomes more interesting, as new possibilities arise. Anomalous activities that are not intrusive but are flagged as intrusive are referred to as false positives. Actual intrusive activities that go undetected are called false negatives; this is a serious problem, far more serious than false positives. Anomaly detection is computationally expensive because of the overhead of keeping track of, and possibly updating, several system profiles. The recently proposed system Learning Rules for Anomaly Detection (LERAD) discovers relationships among attributes in order to model application protocols [3, 4].

In fact, intrusion detection can be regarded as a classification problem, namely identifying normal and other types of intrusive behavior. Hence, the key point is to choose an effective classification approach to build accurate intrusion detection models. A number of machine learning algorithms have been applied to intrusion detection to learn intrusion rules or build normal usage patterns. In this paper, we investigate and evaluate the performance of classifiers built over different clustering algorithms, namely Sequential Information Bottleneck (sIB), Farthest First Traversal (FFT), K-Means, DBSCAN and EM. The motivation behind classification via clustering is to improve the accuracy of the intrusion detection system in comparison to the individual clustering algorithms.

The rest of the paper is organized as follows. A literature review is presented in Section 2, followed by a short theoretical background on the clustering algorithms used in this research in Section 3. Section 4 gives a brief overview of the KDDCup 1999 dataset used. The proposed methodology is described in Section 5, followed by the experimental setup in Section 6. Section 7 presents the results and discussion. Finally, the paper concludes with future research directions in Section 8.
II. RELATED RESEARCH

In [5], the authors designed an intrusion response (IR) system cooperating with an IDS using mobile agents distributed throughout the network, based on stigmergic properties. In [6], the authors introduced a self-organized ant colony based intrusion detection system (ANTIDS) to detect intrusions and compared its performance with linear genetic programming (LGP) [7], support vector machines (SVM) [8] and decision trees (DT) [9]. Other works have made use of Multiple Adaptive Regression Splines (MARS) [10]. In [11], the authors compared various data mining algorithms for detecting network intrusions. The Naïve Bayes algorithm has been used to build a network intrusion detection model in [12]. In [13], the authors proposed a Bayesian belief network (BBN) with genetic and simulated annealing local search in order to build an efficient network intrusion detection model. Various ensemble algorithms for intrusion detection are compared in [14]. Modeling an intrusion detection system using hybrid intelligent systems is proposed in [15]; there, DT and SVM are combined into a hierarchical hybrid intelligent system model (DT_SVM) together with an ensemble approach combining the base classifiers. In [16], the authors propose a support vector learning approach to classify network requests. The authors in [17] used K-Means and DBSCAN to demonstrate how cluster analysis can effectively identify groups of similar traffic using only transport layer statistics. A hierarchical Gaussian Mixture Model (HGMM), a novel type of Gaussian mixture which detects network based attacks as anomalies using statistical classification, is proposed in [18]. In [19], the authors use automated feature weighting for network anomaly detection and conclude that their proposed method not only increases the detection rate but also reduces the false alarm rate.

We believe that an unsupervised clustering approach offers some advantages over supervised learning approaches. One of the main benefits is that new applications can be identified by examining the connections that are grouped to form a new cluster. A supervised approach cannot discover new applications and can only classify network traffic for which it has labeled training data. This prompted us to use an unsupervised clustering approach in building an efficient classifier for detecting anomaly based network intrusions.

III. CLUSTERING ALGORITHMS

In our experiments, several clustering algorithms are selected for performance comparison. They are discussed as follows.
A. K-Means

We first describe the K-Means algorithm for producing a clustering of the input points into K clusters. It partitions the data points into K subsets such that all points in a given subset "belong" to some centre. The algorithm keeps track of the centroids of the subsets and proceeds in iterations. Before the first iteration the centroids are initialized to random values. The algorithm terminates when the centroid locations stay fixed during an iteration. During each iteration the following is performed:

• For each point x, find the centroid which is closest to x and associate x with this centroid.
• Re-estimate the centroid locations by taking, for each centroid, the center of mass of the points associated with it.

The K-Means algorithm is known to converge to a local minimum of the distortion measure (that is, the average squared distance from points to their cluster centroids). It is also very simple and can easily be applied to many practical problems. It works very well for compact and hyper-spherical clusters. The time complexity of K-Means is O(NKd). Since K and d are usually much smaller than N, K-Means can be used to cluster large data sets.
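The two iteration steps above can be summarised in a short sketch. The following Python fragment is only a minimal illustration of K-Means (assignment followed by centroid re-estimation); it is not the implementation used in our experiments, and the toy data and parameters are hypothetical.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Minimal K-Means: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k randomly chosen points.
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: centre of mass of the points in each cluster.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):   # centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids

# Toy usage with random 2-D data (hypothetical, not the KDD features):
data = np.random.default_rng(1).normal(size=(200, 2))
labels, centres = kmeans(data, k=2)
```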
B. Expectation Maximization (EM)

We follow the procedure in [20]. We initiate the process with an initial estimate of the mean and standard deviation. The EM algorithm then searches for a maximum likelihood hypothesis through the following iterative scheme.

• Initialization Step: initialize the hypothesis

  $\theta^{0} = (\mu_1^{0}, \mu_2^{0}, \ldots, \mu_k^{0})$    (1)

  where k is the current number of Gaussians, $\sigma$ is the standard deviation, $\theta^{0}$ is the estimate at the 0th iteration, and $\mu$ is the mean.

• Expectation Step: estimate the expected values of the hidden variables $z_{ij}$ (mean and standard deviation) using the current hypothesis $\theta^{t} = (\mu_1^{t}, \mu_2^{t}, \ldots, \mu_k^{t})$:

  $E(z_{ik}) = \dfrac{\exp\!\left(-\frac{(x_i - \mu_k^{t})^2}{2\sigma^2}\right)}{\sum_{j=1}^{k} \exp\!\left(-\frac{(x_i - \mu_j^{t})^2}{2\sigma^2}\right)}$    (2)

  where t is the iteration number, $E(z_{ij})$ is the expected value of the hidden variables (namely mean and standard deviation), k is the dimension, and $\sigma$ is the standard deviation.

• Maximization Step: provides a new estimate of the parameters,

  $\mu_k^{t+1} = \dfrac{\sum_{i=1}^{n} E(z_{ik})\, x_i}{\sum_{i=1}^{n} E(z_{ik})}$    (3)

• Convergence Step: if $\|\theta^{t+1} - \theta^{t}\| < e$, stop (finish the iteration); otherwise, go to step 2.

The hidden variables are the parameters of the model. In this case, we use a mixture of Gaussians; hence the hidden variables are the mean and standard deviation of each Gaussian distribution. We start with an initial estimate of those parameters and iteratively run the algorithm to find their maximum likelihood (ML) estimates. The reason we use EM is to fit the data better, so that clusters are compact and far from other clusters, since we initially estimate the parameters and iterate to find the ML for those parameters.
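As a companion to Equations (1)-(3), the sketch below runs the same expectation and maximization updates for a one-dimensional mixture of k Gaussians with a shared, fixed standard deviation, mirroring the simplified formulation above. It is only an illustration under those assumptions, not the exact procedure of [20], and the toy data are hypothetical.

```python
import numpy as np

def em_gaussian_means(x, k, sigma=1.0, tol=1e-6, max_iter=200, seed=0):
    """EM for the means of k one-dimensional Gaussians with a shared,
    fixed standard deviation sigma (the simplified setting of Eqs. 1-3)."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False)          # Eq. (1): initial hypothesis
    for _ in range(max_iter):
        # E-step (Eq. 2): responsibility of component j for point x_i.
        resp = np.exp(-((x[:, None] - mu[None, :]) ** 2) / (2 * sigma ** 2))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step (Eq. 3): weighted mean of the points for each component.
        new_mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
        if np.abs(new_mu - mu).max() < tol:            # convergence test
            return new_mu
        mu = new_mu
    return mu

# Toy usage: two well-separated Gaussian samples (hypothetical data).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])
print(em_gaussian_means(x, k=2))
```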
C. DBSCAN

Density-based methods rest on a simple assumption: clusters are dense regions in the data space that are separated by regions of lower density. Their general idea is to keep growing a given cluster as long as the density in the neighborhood exceeds some threshold. In other words, for each data point within a given cluster, the neighborhood of a given radius has to contain at least a minimum number of points. These methods are good at filtering out outliers and at discovering clusters of arbitrary shapes. Density based spatial clustering of applications with noise (DBSCAN) is an example of a density based method [21].

DBSCAN, targeting low-dimensional spatial data, is the major representative of this category. Two input parameters, $\varepsilon$ and MinPts, are used to define:

1) an $\varepsilon$-neighborhood $N_{\varepsilon}(x) = \{\, y \in X \mid d(x, y) \le \varepsilon \,\}$ of the point x;
2) a core object (a point with a neighborhood consisting of more than MinPts points);
3) the notion of a point y being density-reachable from a core object x (a finite sequence of core objects between x and y exists such that each next one belongs to an $\varepsilon$-neighborhood of its predecessor);
4) the density-connectivity of two points x, y (they should be density-reachable from a common core object).

More details about DBSCAN can be found in [22].
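The two parameters $\varepsilon$ and MinPts are all a DBSCAN implementation needs. As a hedged illustration (the paper does not state which implementation was used in the experiments), the fragment below applies scikit-learn's DBSCAN to a toy feature matrix; eps and min_samples correspond to $\varepsilon$ and MinPts, and points labelled -1 are the noise/outlier points the method filters out.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical toy feature matrix; in the paper the rows would be
# preprocessed KDD'99 connection records instead.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (100, 2)),     # dense cluster 1
               rng.normal(4, 0.3, (100, 2)),     # dense cluster 2
               rng.uniform(-2, 6, (10, 2))])     # scattered noise

# eps is the neighbourhood radius (epsilon), min_samples is MinPts.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# Label -1 marks noise points; other labels are density-connected clusters.
print("clusters found:", len(set(labels) - {-1}), "noise points:", (labels == -1).sum())
```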
D. Farthest First

The farthest first traversal k-center algorithm (FFT) is a fast, greedy algorithm that minimizes the maximum cluster radius [23]. It is also regarded as an efficient algorithm with a guaranteed approximation quality. The pseudo code for the farthest first traversal algorithm is as follows:

• Pick any $z \in S$ and set $T = \{z\}$.
• While $|T| < k$:
  • $z = \arg\max_{x \in S} \rho(x, T)$
  • $T = T \cup \{z\}$

Here, $\rho(x, T)$ is the distance from point x to the closest point in the set T. The algorithm builds the solution T one point at a time: it starts with an arbitrary point and then iteratively adds the point farthest from the ones chosen so far, i.e. the point x of S maximizing $\rho(x, T)$. Farthest first traversal takes time $O(k|S|)$, which is fairly efficient, and its solution is always close to optimal in the sense that if T is the solution returned by the farthest first traversal and $T^{*}$ is the optimal solution, then $\mathrm{cost}(T) \le 2\,\mathrm{cost}(T^{*})$.
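The pseudo code above translates almost line for line into the following sketch. It is a generic illustration of the greedy k-center heuristic with Euclidean distance assumed, not the exact Farthest First variant used in our experiments, and the toy data are hypothetical.

```python
import numpy as np

def farthest_first(points, k, seed=0):
    """Greedy k-center: start from an arbitrary point, then repeatedly add
    the point whose distance to the current set T is largest."""
    rng = np.random.default_rng(seed)
    centers = [points[rng.integers(len(points))]]        # pick any z in S, T = {z}
    # rho(x, T): distance from each point to the closest chosen centre so far.
    rho = np.linalg.norm(points - centers[0], axis=1)
    while len(centers) < k:                               # while |T| < k
        z = points[rho.argmax()]                          # z = arg max rho(x, T)
        centers.append(z)                                 # T = T union {z}
        rho = np.minimum(rho, np.linalg.norm(points - z, axis=1))
    return np.array(centers)

# Toy usage on random 2-D data (hypothetical):
pts = np.random.default_rng(1).uniform(size=(300, 2))
print(farthest_first(pts, k=4))
```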
E. Filtered Clusterer

The Filtered Clusterer is a meta-clusterer which offers the possibility of applying filters directly before the clusterer is learnt. The structure of the filter is based exclusively on the training data, and test instances are processed by the filter without changing their structure [24].

F. X-Means

X-Means is an algorithm that quickly estimates the number of clusters K. It goes into action after each run of K-Means, making local decisions about which subset of the current centroids should split themselves in order to better fit the data. The splitting decision is made by computing the Bayesian Information Criterion (BIC). X-Means has been evaluated against the more traditional method of estimating the number of clusters by guessing K, and it consistently produced better clustering on both synthetic and real-life data with respect to BIC. It also runs much faster, which prompted us to select this algorithm for our intrusion detection model; it has not been considered so far for intrusion detection. A detailed description of the X-Means operations can be seen in [25].

IV. INTRUSION DETECTION DATASET

The KDD Cup 1999 intrusion detection contest data is used in our experiments. This data was prepared by the DARPA intrusion detection evaluation program at MIT Lincoln Laboratory. Lincoln Labs acquired nine weeks of raw TCP dump data, which was processed into about 5 million connection records. The data set contains 24 attack types. These attacks fall into four main categories: Denial of Service (DoS), Probing, User to Root (U2R) and Remote to Local (R2L).

Besides the four different types of attacks mentioned above, the normal class needs to be detected. The data set for our experiments contained 1000 connection records, a subset of the 10% KDD Cup'99 intrusion detection benchmark dataset, randomly generated from the MIT data set. The random generation draws from each class a number of records proportional to its size, except that the smallest class is completely included. All the intrusion detection models are trained and tested with the same data set. Although the data set contains five different classes, we perform 2-class classification (either normal or attack), as our proposed method could not handle multiple class labels for building an efficient intrusion detection model.
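The subset just described can be reproduced, at least in spirit, with a few lines of code. The sketch below assumes the 10% KDD Cup'99 file has already been loaded into a pandas DataFrame with a 'label' column (the file name, column names and label spelling are hypothetical); it draws a class-proportional sample of roughly 1000 records, keeps the smallest class in full, and collapses the five original classes into the two classes (normal vs. attack) used in our experiments.

```python
import pandas as pd

def make_subset(df, total=1000, label_col="label", seed=42):
    """Class-proportional sample of about `total` records, with the
    smallest class included completely, then relabelled as 2-class."""
    counts = df[label_col].value_counts()
    smallest = counts.idxmin()
    parts = []
    for cls, n_cls in counts.items():
        if cls == smallest:
            parts.append(df[df[label_col] == cls])           # keep smallest class whole
        else:
            n_take = max(1, round(total * n_cls / len(df)))   # proportional share
            parts.append(df[df[label_col] == cls].sample(n_take, random_state=seed))
    subset = pd.concat(parts).sample(frac=1, random_state=seed)  # shuffle
    # Two-class target: 'normal' stays, everything else becomes 'attack'.
    subset["target"] = (subset[label_col] == "normal").map({True: "normal", False: "attack"})
    return subset

# Hypothetical usage (the actual file path and format are not specified in the paper):
# df = pd.read_csv("kddcup_10_percent.csv")
# subset = make_subset(df)
```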
V. PROPOSED METHODOLOGY

The proposed methodology for classification via clustering using the Sequential Information Bottleneck (sIB) principle is shown in Fig. 1 below.

[Figure 1. Proposed methodology of classification via sIB clustering: Data Samples → Feature Selection/Extraction → Clustering Algorithm (sIB) → Clusters → Cluster Validation → Design/Selection → Classification → Results Interpretation → Knowledge.]

A. Meta Classifier

Meta classifiers, which come under decision committee learning, have demonstrated spectacular success in reducing the classification error of learned classifiers. These techniques develop a classifier in the form of a committee of subsidiary classifiers. The committee members are applied to a classification task and their individual outputs are combined to create a single classification from the committee as a whole. These are recent methods for improving the predictive power of classifier learning systems.

In our proposed methodology, we design a simple meta-classifier that uses a clusterer for classification. For clustering algorithms that use a fixed number of clusters, like Simple K-Means, the user has to make sure that the number of clusters to generate is the same as the number of class labels in the dataset in order to obtain a useful model. However, in the case of the sIB clustering algorithm, we need not specify the number of clusters to generate a model, as it fixes the number of clusters automatically.

B. Proposed Sequential Information Bottleneck (sIB) Clustering

The steps of the Sequential Information Bottleneck (sIB) clustering are as follows.

• Find the global minimum of the objective for a given number of clusters (W).
• Initialize with a (possibly random) partition into W clusters.
• sIB draws a sample x at random, treats it as a singleton cluster and merges it into the new cluster $c_{new}$ such that

  $c_{new} = \arg\min_{c \in C} JS(x, c)$    (4)

  where $JS(\cdot,\cdot)$ is the Jensen-Shannon distance.
• At each step the objective function improves or remains unchanged.

More details about this algorithm can be found in [26].
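A faithful sIB implementation keeps track of the full joint distribution over records and features; the fragment below only illustrates the sequential "draw, treat as singleton, re-merge" loop of Equation (4), representing each record and each cluster by a normalised feature distribution and using SciPy's Jensen-Shannon distance. It is a simplified sketch, not the exact algorithm of [26] nor the implementation used in our experiments.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def sib_like(P, k, n_sweeps=10, seed=0):
    """Sequential reassignment in the spirit of sIB (Eq. 4): each row of P is a
    normalised distribution; a row is drawn, treated as a singleton and merged
    into the cluster whose mean distribution is closest in Jensen-Shannon distance."""
    rng = np.random.default_rng(seed)
    n = len(P)
    assign = rng.integers(k, size=n)                       # random initial partition
    for _ in range(n_sweeps):
        changed = False
        for i in rng.permutation(n):
            # Remove point i and recompute cluster centroids (mean distributions).
            others = np.delete(np.arange(n), i)
            centroids = []
            for c in range(k):
                members = others[assign[others] == c]
                centroids.append(P[members].mean(axis=0) if len(members) else P[i])
            # Merge the singleton {x_i} into the closest cluster (Eq. 4).
            best = int(np.argmin([jensenshannon(P[i], c) for c in centroids]))
            changed |= (best != assign[i])
            assign[i] = best
        if not changed:
            break
    return assign

# Toy usage: rows are normalised histograms over 5 hypothetical features.
X = np.random.default_rng(1).random((50, 5))
P = X / X.sum(axis=1, keepdims=True)
print(sib_like(P, k=2))
```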
VI. EXPERIMENTAL SETUP

All our experiments were performed on a Pentium 4, 2.8 GHz CPU with 512 MB RAM. The full data set is used for training in order to build an intrusion detection model. Then, 10-fold cross validation is used to test the efficacy of the model built during the training phase.

A. Cross Validation Methods

Cross-validation (CV) tests exist in a number of variants, but the general idea is to divide the training data into a number of partitions or folds. The classifier is evaluated by its classification accuracy over one partition after having learned from the others. This procedure is repeated until all partitions have been used for evaluation. Some of the most common types are 10-fold, n-fold and bootstrap CV [27]. The difference between these types of CV lies in the way the data is partitioned. Leave-one-out is similar to n-fold CV, where n stands for the number of instances in the data set. Leave-one-out or n-fold CV is performed by leaving one instance out for testing and training on the other instances; this procedure is repeated until every instance has been left out once. It has been argued that the design of 10-fold cross-validation introduces a source of correlation, since an instance used for training in one trial is later used for testing in another [28].
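A compact way to express this evaluation protocol is sketched below: clusters are learnt on each training fold, every cluster is mapped to the majority class of its training members (a toy stand-in for the classification-via-clustering meta-classifier described in Section V-A), and the mapping is scored on the held-out fold. The clusterer, feature matrix and labels are hypothetical placeholders, not the sIB model or the KDD subset itself.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score
from sklearn.cluster import KMeans

def cv_classification_via_clustering(X, y, n_splits=10, seed=0):
    """10-fold cross-validation of a toy 'classification via clustering' model:
    clusters are learnt on the training fold and each cluster is labelled with
    the majority class of the training points it contains."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        km = KMeans(n_clusters=len(np.unique(y)), n_init=10,
                    random_state=seed).fit(X[train_idx])
        # Map every cluster to the majority class among its training members.
        mapping = {c: np.bincount(y[train_idx][km.labels_ == c]).argmax()
                   for c in range(km.n_clusters)}
        y_pred = np.array([mapping[c] for c in km.predict(X[test_idx])])
        scores.append(accuracy_score(y[test_idx], y_pred))
    return float(np.mean(scores))

# Toy usage with synthetic two-class data (hypothetical stand-in for the KDD subset).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(3, 1, (100, 4))])
y = np.array([0] * 100 + [1] * 100)
print(cv_classification_via_clustering(X, y))
```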
VII. RESULTS AND DISCUSSION

Here, we present the behavior of the different unsupervised approaches in building an efficient anomaly based network intrusion detection model. In Table 1, we present a comparative study of our proposed sIB clustering method with respect to the other existing approaches.

Table 1. Comparative study of existing clustering and classification algorithms

Algorithm                    | Average Detection Rate (%) | False Positive Rate (%)
-----------------------------|----------------------------|------------------------
K-Means [29]                 | 46.986                     | 0.875
KM-VQ [19]                   | 48.4                       | 10
FES-KM-VQ [19]               | 60.1                       | 10
X-Means (ours)               | 78                         | 36
KDDCup Winner [29]           | 48.57                      | 0.225
GMIX [30]                    | 53.725                     | 29.715
SOM [30]                     | 47                         | 25.685
Nearest Cluster [31]         | 47.875                     | 0.2
Incremental Clustering [32]  | 44.57 to 84.78             | 15.99 to 76.83
Cluster [32]                 | 66                         | 2
K-NN [32]                    | 23                         | 6
FR-1 [33]                    | 74.82                      | 73.07
FR-2 [33]                    | 79.68                      | 85.66
Pure SVM [34]                | 57.6                       | 35.5
SVM + Rocchio bundling [34]  | 51.6                       | 44.2
SVM + DGSOT [34]             | 69.8                       | 37.8
sIB (Proposed)               | 85.5                       | 34
ANTIDS-a [6]                 | 69.9                       | ----

Table 2. Comparison of our proposed classification via clustering methods

Classification via Clustering Algorithm | Average Detection Rate (%) | Average False Positive Rate (%) | Unclassified Instances (%)
----------------------------------------|----------------------------|---------------------------------|---------------------------
K-Means                                 | 80.1                       | 35                              | NIL
X-Means                                 | 88.3                       | 26.8                            | 12.6
Filtered Clusterer + K-Means            | 80.1                       | 19.8                            | NIL
Farthest First (FFT)                    | 86.6                       | 89.6                            | 0.8
DBSCAN                                  | 86.7                       | 100                             | 18.66
EM                                      | 89.1                       | 52.3                            | 2.2
sIB (Proposed)                          | 86.3                       | 34                              | NIL

From the above comparison, it is clear that our proposed method is efficient in detecting anomalous activities, with a high detection rate and a relatively low false positive rate. It is also quite evident from Table 1 and Table 2 that the meta classifier using sIB clustering for classification enhances the intrusion detection rate while maintaining the same false positive rate as the unsupervised sIB clustering method.

It can be seen from Table 2 that our proposed method detects intrusions with the highest accuracy in comparison to the K-Means and Filtered Clusterer with K-Means classification via clustering methods. It can also be observed that the proposed methodology with X-Means, FFT, DBSCAN and EM clustering provides a higher average detection rate than our sIB clustering, but these methods are not able to classify all the instances successfully. At the same time, their false positive rates, except in the case of X-Means, are higher than that of sIB clustering. These comparisons show that our proposed classification via sIB clustering is more suitable for building an efficient anomaly based network intrusion detection model.
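For completeness, the rates reported in Tables 1 and 2 can be computed from a 2-class confusion matrix as sketched below; the counts in the example are made up for illustration and are not taken from the tables.

```python
def detection_and_false_positive_rate(tp, fn, fp, tn):
    """Detection rate = attacks correctly flagged / all attacks;
    false positive rate = normal records flagged as attacks / all normal records."""
    detection_rate = 100.0 * tp / (tp + fn)
    false_positive_rate = 100.0 * fp / (fp + tn)
    return detection_rate, false_positive_rate

# Hypothetical confusion-matrix counts for the 2-class (attack vs. normal) task:
print(detection_and_false_positive_rate(tp=420, fn=80, fp=150, tn=350))
```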
VIII. CONCLUSION AND FUTURE SCOPE

In this paper, we have illustrated the use of a classification via clustering methodology based on the sIB clustering algorithm for modeling intrusion detection systems. The proposed approach provides better detection accuracy with a comparatively low false positive rate in comparison to other existing unsupervised clustering algorithms. This makes the approach suitable for building an efficient anomaly based network intrusion detection model. As is evident from the results, none of the algorithms provides the best detection with a zero false positive rate. Therefore, in our future research we shall investigate other data mining techniques with a view to bringing the detection accuracy as close as possible to 100% while maintaining a low false positive rate.

REFERENCES

[1] D. Denning, "An intrusion-detection model", IEEE Transactions on Software Engineering, SE-13(2), pp. 222-232, 1987.
[2] S. Kumar and E. H. Spafford, "An application of pattern matching in intrusion detection", Technical report CSD-TR-94-013, Purdue University, 1994.
[3] M. Mahoney and P. K. Chan, "An analysis of the 1999 DARPA/Lincoln Laboratory evaluation data for network anomaly detection", in 6th Intl. Symposium on RAID, 2003, pp. 220-237.
[4] P. K. Chan, M. Mahoney and M. Arshad, "Learning rules and clusters for anomaly detection in network traffic", in Managing Cyber Threats: Issues, Approaches and Challenges, V. Kumar, J. Srivastava and A. Lazarevic (Eds.), Kluwer, 2004.
[5] N. Foukia, S. Hassas, S. Fenet and J. Hulaas, "An intrusion response scheme: tracking the source using the stigmergy paradigm", in Proc. of the Security of Mobile Multi-Agent Systems Workshop (SEMAS-2002), Italy, July 16, 2002.
[6] V. Ramos and A. Abraham, "ANTIDS: self-organised ant-based clustering model for intrusion detection system", WSTST, 2005, pp. 103-112. http://www.softcomputing.net/WSTST-ra.pdf
[7] M. Brameier and W. Banzhaf, "A comparison of linear genetic programming and neural networks in medical data mining", IEEE Transactions on Evolutionary Computation, 5(1), pp. 17-26, 2001.
[8] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, 1995.
[9] J. Deneubourg et al., "The dynamics of collective sorting: robot-like ants and ant-like robots", in 1st Conf. on Simulation of Adaptive Behaviour: From Animals to Animats, Cambridge, MA, MIT Press, pp. 356-365, 1991.
[10] S. Mukkamala, A. H. Sung, A. Abraham and V. Ramos, "Intrusion detection systems using adaptive regression splines", in ICEIS-04, 6th Int. Conf. on Enterprise Information Systems, Kluwer Academic Press, Porto, 14-17 April 2004.
[11] M. Panda and M. R. Patra, "A comparative study of data mining algorithms for network intrusion detection", in Proc. of ICETET, India, 2008, pp. 504-507. IEEE Xplore.
[12] M. Panda and M. R. Patra, "Network intrusion detection using Naïve Bayes", International Journal of Computer Science and Network Security, Vol. 7, No. 12, 2007, pp. 258-263.
[13] M. Panda and M. R. Patra, "Bayesian belief network using genetic local search for network intrusion", International Journal of Secure Digital Information Age, Vol. 1, Issue 1, June 2009. In press.
[14] M. Panda and M. R. Patra, "Network intrusion detection using boosting support vector classifiers", in 2009 IEEE Intl. Advance Computing Conference, Patiala, Punjab, pp. 926-931. IEEE Press, USA.
[15] S. Peddabachigari, A. Abraham, C. Grosan and J. Thomas, "Modelling intrusion detection system using hybrid intelligent systems", Journal of Network and Computer Applications, Elsevier, 30(1), pp. 114-132, 2007.
[16] J. Mill and A. Inoue, "Support vector classifiers and network intrusion detection", in Proc. of the 2004 IEEE International Conference on Fuzzy Systems, WA, USA, 2004, Vol. 1, pp. 407-410.
[17] J. Erman, M. Arlitt and A. Mahanti, "Traffic classification using clustering algorithms", in SIGCOMM-06 Workshops, Sept. 11-15, 2006, Pisa, Italy, pp. 281-286. ACM Press.
[18] M. Bahrololum and M. Khaleghi, "Anomaly intrusion detection system using hierarchical Gaussian mixture model", International Journal of Computer Science and Network Security, Vol. 8, No. 8, pp. 264-271, August 2008.
[19] D. Tran, W. Ma and D. Sharma, "Automated feature weighting for network anomaly detection", International Journal of Computer Science and Network Security, Vol. 8, No. 2, pp. 173-178, Feb. 2008.
[20] The Expectation Maximization Algorithm. http://www.cs.unr.edu/~bebis/mathmethods/EM/lecture.pdf, Sept. 25, 2004.
[21] M. Ester, H.-P. Kriegel, J. Sander and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise", in Proc. of KDD-96, pp. 226-231, 1996.
[22] P. Berkhin, "Survey of clustering data mining techniques", Technical report, Accrue Software, San Jose, CA, USA, pp. 1-56, 2002. http://www.ee.ucr.edu/~barth/EE242/clustering_survey.pdf
[23] A. William, "Clustering algorithms for categorical data", 2006. http://www.cse.yorku.ca/~billa/MULIC/Bill_Andreopoulous_PhD_thesis_web.pdf
[24] A. William, "Clustering algorithms for categorical data", September 2006. http://www.cse.yorku.ca/~billa/MULIC/Bill_Andreopoulos_PhD_thesis_web.pdf
[25] D. Pelleg and A. Moore, "X-means: extending K-means with efficient estimation of the number of clusters", in Int. Conf. on Machine Learning (ICML), pp. 727-734, San Francisco, 2000.
[26] N. Slonim, N. Friedman and N. Tishby, "Unsupervised document classification using sequential information maximization", in SIGIR, Tampere, Finland, pp. 129-136, ACM Press, 2002.
[27] N. Lavesson and P. Davidson, "Multi-dimensional measures function for classifier performance", in 2nd IEEE Intl. Conf. on Intelligent Systems, June 2004, pp. 508-513.
[28] M. A. Maloof, "On machine learning, ROC analysis and statistical tests of significance", in 16th Intl. Conf. on Pattern Recognition, IEEE, Vol. 2, 2002, pp. 204-207.
[29] I. Levin, "KDD-99 classifier learning contest: LLSoft's results overview", SIGKDD Explorations, ACM SIGKDD, Vol. 1, No. 2, pp. 67-75, 2000.
[30] V. Venkatchalam and S. Selvan, "Performance comparison of intrusion detection system classifiers using various feature reduction techniques", International Journal of Simulation, Vol. 9, No. 1, 2008.
[31] M. K. Sabhnani and G. Serpen, "Application of machine learning algorithms to KDD intrusion detection dataset within misuse detection context", in Proc. of the Intl. Conf. on Machine Learning: Models, Technologies and Applications (MLMTA 2003), Las Vegas, NV, 2003, pp. 209-215.
[32] T. Hassan, M. Hashaem and A. Fahmy, "Unsupervised anomaly detection using an incremental clustering algorithm", International Journal of Intelligent Computing and Information Sciences, Vol. 5, No. 1, pp. 253-268, 2005.