ANOMALY BASED NETWORK INTRUSION DETECTION
BY YATINDRA S HASHI
ICT INNOVATION, TU BERLIN
MATRICULATION: 386845
For the course: SE Autonomous Security
Guided by: Karsten Bsufka, Prof. Dr. H.C. Sahin Albayrak
OUTLINE
1. Introduction
   1. Misuse-based (signature)
   2. Anomaly-based
2. Machine Learning Based NIDS
3. Fuzzy Logic Based A-NIDS (in detail)
4. Genetic Algorithm (GA)
5. Fuzzy IDS for Mobile Ad hoc Network
6. GA for MANET
7. Conclusion
8. Reference
1. Introduction
• Bob has a datacenter that hosts:
  • Website data
  • Data of different companies
  • Private data
• Hacker Eve may attempt:
  ➢ Unauthorized access
  ➢ Denial of Service (DoS)
  ➢ Remote to Local (R2L)
  ➢ User to Root (U2R)
  ➢ Probe
Introduction.. [1]
• A NIDS (network intrusion detection system) is a tool used to detect security attacks
• It is based on the analysis of network traffic
❖ Depending on the type of analysis, an IDS is classified as:
  ❖ Misuse-based: signature based (defined patterns in data)
  ❖ Anomaly-based: deviation from normal behavior
    ✓ Able to detect new attacks
    ✓ Profiles of normal activity are customized for each system
    ✓ But it has a high potential for false alarms
2. Machine Learning based A-NIDS
❖ Consists of two phases
  Training phase
  ➢ Classify the training data into subsets
  ➢ Learn a model using the training data
  ➢ Define normal behavior
  Testing phase
  ➢ Apply the learned model to the test data
  ➢ Test data is classified as normal or abnormal
❖ Pros
  ➢ Flexibility and adaptability
❖ Cons
  ➢ High dependency on the assumption of what normal behavior is
  ➢ High resource consumption
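The two phases can be illustrated with a deliberately small sketch. It assumes a single numeric traffic feature and a mean/standard-deviation profile of normal behavior; the feature, the sample values and the threshold of 3 standard deviations are illustrative assumptions, not taken from [1].

```python
from statistics import mean, stdev

def train_profile(normal_values):
    """Training phase: learn a simple profile of normal behavior."""
    return {"mean": mean(normal_values), "std": stdev(normal_values)}

def is_abnormal(profile, value, threshold=3.0):
    """Testing phase: flag values that deviate strongly from the profile."""
    z = abs(value - profile["mean"]) / profile["std"]
    return z > threshold

# Hypothetical training data: connections per second during normal operation.
normal_traffic = [10, 12, 11, 9, 10, 13, 11, 10]
profile = train_profile(normal_traffic)

# Hypothetical test data: one ordinary value and one burst.
labels = ["abnormal" if is_abnormal(profile, v) else "normal" for v in [11, 50]]
```

The sketch also shows the stated weakness: everything depends on the training data really being normal, since an attack hidden in `normal_traffic` would widen the profile.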
3. Fuzzy Logic Based A-NIDS [2]
• Fuzzy logic: degree of truth rather than Boolean True/False (1/0)
• The fuzzy logic process includes:
  ➢ Fuzzify the input values using fuzzy membership functions
  ➢ Apply the rules in the rule base to compute the output
  ➢ De-fuzzify the fuzzy output into a crisp value
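The idea of a degree of truth can be sketched with a triangular membership function. The triangular shape and the breakpoints of the "medium" set below are assumptions chosen for illustration; [2] may use different shapes.

```python
def triangular(x, a, b, c):
    """Degree of membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy set "medium" on an input normalized to 0..1.
degree = triangular(0.4, 0.25, 0.5, 0.75)   # partially "medium"
peak = triangular(0.5, 0.25, 0.5, 0.75)     # fully "medium"
outside = triangular(0.9, 0.25, 0.5, 0.75)  # not "medium" at all
```

Instead of answering "is 0.4 medium?" with True or False, the function returns a value between 0 and 1.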
3.1 KDD CUP DATASET
• Derived from the DARPA 1998 network dataset (TCP dump)
• 4.9 million single connection vectors
• Each vector consists of 41 features and is labeled as Normal or Attack
• Contains 4 main types of attack (38 attacks overall)
Fig: KDD cup dataset Features example. [2]
3.2 Steps Involved in the Proposed A-NIDS [2]
1. Classification of training data
2. Strategy for generation of fuzzy rules
3. Fuzzy decision module
4. Finding an appropriate classification for a test input
Fig: Overall steps of proposed intrusion detection in [2]
3.2.1 Classification of Training Data
• Only the attributes of the KDD dataset that are continuous in nature are used
• Hence, 34 attributes are taken from the input dataset
• Dataset D is divided into 5 subsets according to class label [2]
• The class labels cover:
  ◦ 4 major attack classes: Denial of Service, Remote to Local, U2R and Probe
  ◦ 1 normal class
• These subsets are used to develop the fuzzy rules automatically
Fig: Overall steps of proposed intrusion detection in [2]
3.2.2 Strategy For Generation of Fuzzy Rules
• A mining method is used to identify a better set of rules, because the input data is huge
• Definite rules obtained from single-length frequent items are used for learning the fuzzy system
• It has 4 steps
1. Mining of Single Length Frequent Items [2]:
  ➢ Frequent items (attributes) are discovered for the Attack and Normal classes
  ➢ The 1-length items of each attribute are found by counting the frequency of the continuous values present in that attribute
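Mining 1-length frequent items for one attribute can be sketched as a frequency count followed by a support filter. Treating the minimum support as an absolute count, and the sample values themselves, are assumptions made for this illustration.

```python
from collections import Counter

def frequent_items(values, min_support):
    """Return the values of one attribute whose frequency exceeds min_support."""
    counts = Counter(values)
    return {v: n for v, n in counts.items() if n > min_support}

# Hypothetical values of one attribute across the records of one class.
attribute_values = [0.0, 0.0, 0.5, 0.5, 0.5, 1.0, 0.0, 0.7]
items = frequent_items(attribute_values, min_support=2)
```

Running the same function over each of the 34 attributes, once per class, would give one vector of frequent items per attribute and class.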
Continue..
2. Identification of suitable attributes for rule generation [2]
  ➢ The deviation method is used
  ➢ The mined 1-length frequent items of each attribute are stored in a vector, so that 34 vectors are obtained for each class [2]
  ➢ Here i = 1 refers to Normal and i = 2 refers to Attack
  ➢ Each vector contains the frequent items whose frequency > minimum support [2]
Continue..
  ➢ For each attribute, the deviation range of the frequent items is identified as a {max, min} range [2]
  ➢ A one-to-one comparison is performed between the corresponding vectors of the two classes to identify effective attributes
  ➢ An attribute whose {max, min} range is not identical for both classes is chosen as an effective attribute [2]
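The deviation method can be sketched as follows, assuming each class is represented by a list of per-attribute vectors of frequent items. The function names and the sample vectors are hypothetical.

```python
def deviation_range(frequent_values):
    """Deviation range of one attribute, written as (max, min) like the slides."""
    return (max(frequent_values), min(frequent_values))

def effective_attributes(normal_vectors, attack_vectors):
    """Indices of attributes whose (max, min) range differs between the two classes."""
    return [
        i
        for i, (n, a) in enumerate(zip(normal_vectors, attack_vectors))
        if deviation_range(n) != deviation_range(a)
    ]

# Hypothetical frequent items for three attributes per class.
normal_vectors = [[1, 3, 5], [0, 1], [2, 4]]
attack_vectors = [[2, 6, 8], [0, 1], [2, 3, 4]]
selected = effective_attributes(normal_vectors, attack_vectors)
```

Only the first attribute is kept here: the other two have identical ranges in both classes and so cannot separate normal from attack data.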
Continue…
3. Rule Generation
  ◦ Intersection points are identified by comparing the deviation ranges of an effective attribute for normal and attack data
  ◦ With these 2 intersection points, definite and indefinite rules are generated
Ex. Normal data for attribute1 = {1, 5}
    Attack data for attribute1 = {2, 8}
Rules can be:
  1. IF attribute1 is greater than 5, THEN the data is Attack
  2. IF attribute1 is between 2 and 5, THEN the data can be Normal or Attack
  3. IF attribute1 is below 2, THEN the data is Normal
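The example can be reproduced with a short sketch. It assumes the two ranges are given as (min, max) pairs that overlap only partially and have distinct end points; the rule representation as (condition, classes) tuples is a choice made for this illustration.

```python
def generate_rules(name, normal_range, attack_range):
    """Build (condition, classes) rules from two partially overlapping (min, max) ranges."""
    (n_lo, n_hi), (a_lo, a_hi) = normal_range, attack_range
    lo, hi = max(n_lo, a_lo), min(n_hi, a_hi)  # the two intersection points
    below = "Normal" if n_lo < a_lo else "Attack"  # class that owns the lower end
    above = "Attack" if a_hi > n_hi else "Normal"  # class that owns the upper end
    return [
        (f"{name} > {hi}", [above]),                       # definite rule
        (f"{lo} <= {name} <= {hi}", ["Normal", "Attack"]),  # indefinite rule
        (f"{name} < {lo}", [below]),                       # definite rule
    ]

rules = generate_rules("attribute1", (1, 5), (2, 8))
```

A rule with one class in its THEN part is definite; the middle rule, where the ranges overlap, is indefinite.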
Continue..
4. Rule Filtering
  ➢ For a compact and interpretable classification system, a productive fuzzy learning system should meet the following criteria:
    ➢ The number of fuzzy rules should be small
    ➢ The IF part of each fuzzy rule should be short
  ➢ The indefinite rules are filtered out and only the definite rules, which have exactly one class in the THEN part, are selected
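Rule filtering can be sketched as a one-line selection, assuming each rule is stored as an IF condition plus the list of classes in its THEN part. This representation is hypothetical and only serves the illustration.

```python
def filter_definite(rules):
    """Keep only rules whose THEN part names exactly one class."""
    return [(cond, classes[0]) for cond, classes in rules if len(classes) == 1]

# Hypothetical rules in the form (IF condition, list of possible classes).
rules = [
    ("attribute1 > 5", ["Attack"]),
    ("2 <= attribute1 <= 5", ["Normal", "Attack"]),
    ("attribute1 < 2", ["Normal"]),
]
definite_rules = filter_definite(rules)
```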
Continue..
5. Generating Fuzzy Rules
  ➢ Fuzzy rules are generated from the definite rules
  ➢ In a definite rule the IF part is a numerical variable and the THEN part is a class label, but a fuzzy rule contains only linguistic variables
  ➢ Therefore the numerical value of the IF part is fuzzified
  ➢ Ex. IF attribute1 is H, THEN the data is Attack
  ➢ These fuzzy rules are used to learn an effective fuzzy system
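One possible way to fuzzify the IF part is sketched below. It assumes the attribute is normalized to 0..1, uses four evenly spaced triangular sets named VL, L, M and H, and picks the label with the highest membership degree for the rule's numerical value. The set shapes and this mapping are assumptions for illustration, not the exact procedure of [2].

```python
def triangular(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed triangular sets on a normalized 0..1 attribute (not taken from [2]).
SETS = {
    "VL": (-1 / 3, 0.0, 1 / 3),
    "L": (0.0, 1 / 3, 2 / 3),
    "M": (1 / 3, 2 / 3, 1.0),
    "H": (2 / 3, 1.0, 4 / 3),
}

def to_linguistic(x):
    """Return the label whose membership degree at x is highest."""
    return max(SETS, key=lambda label: triangular(x, *SETS[label]))

def fuzzify_rule(attribute, value, class_label):
    """Replace the numerical value of a definite rule with a linguistic label."""
    return f"IF {attribute} is {to_linguistic(value)}, THEN the data is {class_label}"

fuzzy_rule = fuzzify_rule("attribute1", 0.9, "Attack")
```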
Fig: Overall steps of proposed intrusion detection in [2]
3.2.3 Fuzzy Decision Module
• Finds the suitable class label (attack or normal) of the test data using a fuzzy logic system
Fig: Fuzzy decision Module [2]
Continue..
  ➢ Each input fuzzy set in the fuzzy system includes 4 membership functions (VL, L, M and H)
  ➢ The output fuzzy set contains 2 membership functions (L and H)
  ➢ The Mamdani FIS includes:
    ➢ Fuzzifier
    ➢ Rule base
    ➢ Inference engine
    ➢ Membership functions and defuzzifier
3.2.4 Finding Appropriate Classification of Test Data
  ➢ The test data is sent to the fuzzifier, which converts the numerical data into linguistic variables using the membership functions
  ➢ The output is fed to the inference engine, which in turn compares it with the rule base
  ➢ The rule base contains the fuzzy rules derived from the definite rules
  ➢ The output of the inference engine is a linguistic value from {Low, High}, which is converted to a crisp value of 0 or 1
  ➢ “0” denotes that the data is completely normal and “1” denotes that the data is an attack
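The classification path can be sketched as a small Mamdani-style inference in plain Python. The triangular sets, the four-rule rule base, the sampled centroid defuzzifier and the 0.5 cut-off that turns the centroid into 0 or 1 are all assumptions for illustration; [2] reports its results from a MATLAB implementation whose details are not reproduced here.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed input sets (VL, L, M, H) and output sets (L, H) on a 0..1 scale.
INPUT_SETS = {
    "VL": (-1 / 3, 0.0, 1 / 3),
    "L": (0.0, 1 / 3, 2 / 3),
    "M": (1 / 3, 2 / 3, 1.0),
    "H": (2 / 3, 1.0, 4 / 3),
}
OUTPUT_SETS = {"L": (-1.0, 0.0, 1.0), "H": (0.0, 1.0, 2.0)}

# Hypothetical rule base: (attribute index, input label, output label).
RULES = [(0, "H", "H"), (0, "VL", "L"), (1, "M", "H"), (1, "L", "L")]

def classify(record, steps=101):
    """Return 1 (attack) or 0 (normal) for a record of normalized attributes."""
    # Fuzzifier + inference engine: firing strength of each output label.
    strength = {"L": 0.0, "H": 0.0}
    for idx, in_label, out_label in RULES:
        mu = triangular(record[idx], *INPUT_SETS[in_label])
        strength[out_label] = max(strength[out_label], mu)
    # Defuzzifier: centroid of the clipped (min) and aggregated (max) output sets.
    num = den = 0.0
    for k in range(steps):
        y = k / (steps - 1)
        mu = max(min(strength[l], triangular(y, *OUTPUT_SETS[l])) for l in OUTPUT_SETS)
        num += y * mu
        den += mu
    if den == 0.0:
        return 0  # no rule fired; treated as normal (assumption)
    return 1 if num / den >= 0.5 else 0

attack_like = classify([0.95, 0.1])
normal_like = classify([0.05, 0.3])
```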
3.3 Experimentation
  ➢ The experiment was done on 10% of the available KDD Cup dataset
  ➢ The paper reports results computed with MATLAB
  ➢ The evaluation metrics are defined in terms of TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative [2]
  ➢ Achieved 90% accuracy for all types of attacks
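For reference, the standard definition of accuracy in terms of these four counts can be written as below. The counts in the example are hypothetical and are not the figures reported in [2].

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all records that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical confusion-matrix counts, chosen only to illustrate the formula.
value = accuracy(tp=45, tn=45, fp=5, fn=5)
```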
4. Genetic Algorithm [4]
  ➢ Based on the laws of selection and evolution
  ➢ The problem is encoded as chromosomes
  ➢ The dataset to be searched is called the population (a collection of chromosomes)
  ➢ Members of the dataset are individually encoded using bits, characters or integers, which form a chromosome
Continue…
  ➢ An evaluation function (fitness function) tests the suitability of each chromosome
  ➢ For more accuracy, crossover (recombination) and mutation are performed
  ➢ This breeds and evolves chromosomes with high fitness values
  ➢ The fittest chromosome is selected once the optimization criterion is met
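The loop of evaluation, selection, crossover and mutation can be sketched with a toy problem. The bit-string encoding, the fitness function (number of 1 bits), the population size and the mutation rate are all placeholders for illustration; they are not the encoding or fitness used in [4].

```python
import random

def fitness(chromosome):
    """Toy fitness: number of 1 bits (a stand-in for a real evaluation function)."""
    return sum(chromosome)

def crossover(a, b, rng):
    """Single-point recombination of two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(chromosome, rng, rate=0.05):
    """Flip each bit with a small probability."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def run_ga(pop_size=20, length=16, generations=50, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    initial_best = max(map(fitness, population))
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == length:  # optimization criterion met
            break
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return initial_best, max(population, key=fitness)

initial_best, best = run_ga()
```

Because the fitter half is carried over unchanged, the best fitness in the population never decreases from one generation to the next.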
Fig: Genetic Algorithm Flow [4]
5. Fuzzy IDS for Mobile Ad hoc Network [3]
  ➢ Communication via wireless links makes mobile ad hoc networks more vulnerable
  ➢ IDS for packet dropping attacks carried out through a malicious node
  ➢ Fuzzy-based parameter extraction
    ◦ Data packet forwarded ratio
    ◦ Average data packet dropped rate
  ➢ Result
    ➢ Calculated for different speeds of the mobile nodes
    ➢ Number of malicious nodes fixed at 3
    ➢ Reported as true positive rate and false positive rate
6. GA based Intrusion Detection in MANET [4]
  ➢ IDS for packet dropping attacks carried out through a malicious node
  ➢ The nodes of the MANET form the initial population
  ➢ Members of the initial population are encoded and called chromosomes
  ➢ Each chromosome is evaluated based on:
    ➢ Packet Drop (PD)
    ➢ Request Forwarding Rate (RFR)
    ➢ Request Receive Rate (RRR)
  ➢ Experiment result
    ➢ Run with different numbers of malicious nodes (up to 10)
    ➢ Reported as the number of detected nodes
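Only the evaluation step is sketched here. The slides name the three parameters but give no formula, so the scoring function, the encoding of a node as a small record and the 0.5 threshold below are purely hypothetical and should not be read as the fitness function of [4].

```python
from dataclasses import dataclass

@dataclass
class NodeChromosome:
    node_id: int
    packet_drop: float              # PD, assumed here to be a fraction in 0..1
    request_forwarding_rate: float  # RFR
    request_receive_rate: float     # RRR

def fitness(c):
    """Hypothetical score: high when a node forwards what it receives and drops little."""
    if c.request_receive_rate == 0:
        return 0.0
    forwarded_share = c.request_forwarding_rate / c.request_receive_rate
    return forwarded_share * (1.0 - c.packet_drop)

def suspicious_nodes(population, threshold=0.5):
    """Node ids whose hypothetical fitness falls below the threshold."""
    return [c.node_id for c in population if fitness(c) < threshold]

# Hypothetical measurements for two nodes.
population = [
    NodeChromosome(1, packet_drop=0.05, request_forwarding_rate=9, request_receive_rate=10),
    NodeChromosome(2, packet_drop=0.80, request_forwarding_rate=2, request_receive_rate=10),
]
detected = suspicious_nodes(population)
```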
7. Conclusion
  ➢ An anomaly-based intrusion detection system detects new attacks more efficiently when it uses machine learning algorithms
  ➢ A fuzzy logic based NIDS is made effective at detecting intrusions in computer networks by its fuzzy rule learning strategy
  ➢ The Genetic Algorithm (GA) is another technique for anomaly detection; it is based on selection and evolution, and uses mutation to find the optimum individual (chromosome) according to a fitness function
  ➢ Fuzzy and GA based IDS can both be used effectively for intrusion detection in Mobile Ad hoc Networks
  ➢ Within the scope of the papers reviewed, the fuzzy and GA approaches cannot be compared directly, because they deal with different experiments and types of dataset
  ➢ The fuzzy system gives good results based on its fuzzy rules, whereas the GA gives its best result based on the fitness value
Reference
1. P. Garcia-Teodoro, J. Diaz-Verdejo, G. Macia-Fernandez, E. Vazquez, “Anomaly-based network intrusion detection: Techniques, systems and challenges”, Computers & Security, Volume 28, Issues 1–2, February–March 2009, Pages 18–28.
2. R. Shanmugavadivu, N. Nagarajan, “Network Intrusion Detection system using fuzzy logic”, Indian Journal of Computer Science and Engineering (IJCSE), ISSN: 0976-5166, Vol. 2, No. 1, pp. 101–111, 2011.
3. Alka Chaudhary, V. N. Tiwari, Anil Kumar, “Design an anomaly based fuzzy intrusion detection system for packet dropping attack in Mobile Ad Hoc Network”, Advance Computing Conference (IACC), 2014 IEEE International.
4. K. S. Sujatha, Vydeki Dharmar, R. S. Bhuvaneswaran, “Design of Genetic Algorithm based IDS for MANET”, 2012 International Conference on Recent Trends in Information Technology, IEEE, 19–21 April 2012.

Network Anomaly detection based on fuzzy logic and Genetic Algorithm
