STEGANOGRAPHIC APPROACH TO ENSURE DATA STORAGE SECURITY IN CLOUD COMPUTING USING HUFFMAN CODING
CREATED BY: HASIMSHAH . R . S
CONTENTS
1. INTRODUCTION
2. RELATED WORK
3. DESIGN OF THE SYSTEM
4. ALGORITHMS USED IN THE SYSTEM
5. SECURITY ANALYSIS AND PERFORMANCE EVALUATION
6. CONCLUSION
7. REFERENCES
INTRODUCTION
Cloud computing
The cloud computing model allows access to information and computer resources from anywhere that a network connection is available. Cloud computing provides a shared pool of resources, including data storage space, networks, computer processing power, and specialized corporate and user applications. Cloud computing is a practical way to realize direct cost benefits, and it has the potential to transform a data center from a capital-intensive setup into a variably priced environment. The idea of cloud computing is based on the very fundamental principle of “reusability of IT capabilities”.
Data security
• Traditional cryptographic technology cannot be implemented.
• The cloud is not a third-party warehouse.
• Data are stored in multiple physical locations in a random manner.
Steganographic Approach Using Huffman Coding
• Ensures explicit dynamic data support and security of data while the data are in cloud storage.
INTRODUCTION(CONT) The Huffman Tree constructs an optimal prefix code called a Huffman code. Let’s say, there are six characters A,B,C,D,E and F as shown in Fig a . 6
INTRODUCTION(CONT) 7 Now for a given code 0 100 100 1101 we can decode them to get back the original code by traversing the Huffman tree.
CLOUD COMPUTING ARCHITECTURE AND SECURITY ISSUES
DEPLOYMENT MODELS
• Private cloud
• Community cloud
• Public cloud
• Hybrid cloud
SERVICE DELIVERY MODELS
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)
SECURITY ISSUES
• Phishing
• Data loss
• Botnet (a collection of machines that are run remotely): offers attackers a more reliable infrastructure at a relatively low price.
Problem statement: The main problem is the loss of control over data stored in the cloud.
[Figure: Schematic System Architecture for Cloud]
RELATED WORK cong wang et al.use homomorphic token with distributed verification of erasure-coded data. but it is failed to achieve public verifiability and storage correctness. shantanu pal et al. ensures to find location of adversary or the attacking party from its target. it may try to attack them, if adversary knows the location of the other vms. this may harm the other vms in between. ateniese et al.proposed the “provable data possession” (pdp) model to ensure possession of file in untrusted storages. This scheme used public key based homomorphic tags to audit the data file and it is providing public verifiability. 11
DESIGN OF THE SYSTEM
Data are stored inside images (steganography).
Processes by which users store or retrieve their data:
[Figure: Computational model to Store Data]
[Figure: Computational model to Retrieve Data]
• The Human Visual System (HVS) has very low sensitivity to small changes in an image.
• Variable-length encoding does not help the attacker to recognize characters.
• The attacker has no idea about the frequency of the characters and so cannot generate the Huffman code.
• Ultimately, we have a secured system.
Image database
• Images are stored in CSP-1.
• A set of images is sent to CSP-3 when a user wants to store data in the cloud.
File database
• The file holds the addresses of the images.
Embedding data into images
• Counts the total number of characters.
• Finds the frequency of each character for the Huffman code.
• Applies steganography to both the character frequencies and the codified data.
ALGORITHMS USED IN THE SYSTEM
ALGORITHM 1: HF-codification()
procedure
    Read file FText, which is to be saved in the cloud
    Compute CN, the total number of characters, from FText
    Find the frequency of occurrence of each character in FText and store the frequencies in some chronological order
    Store the frequencies in a new file FFreq
    FN = Freq-Codification()
    Call Huffman-Tree()
    Create a file FCode
    Open FN and go to its EOF, where the original characters of FText are written, each replaced by its Huffman code from FCode
    Calculate the total bit count BCount in FN
    Delete FText, FFreq and FCode
    Call Steganography() to perform steganography on FN
end procedure
FILE CODIFICATION
The frequency file is read digit by digit, and each digit is codified into a 4-bit binary pattern.
Algorithm 2: Freq-Codification()
procedure
    Open FFreq and a new file FN
    while (read characters from FFreq until EOF) do
        if (character is a newline character)
            Append 1111 at the end of FN
        else
            Convert the digit to its 4-bit binary form
            Append those 4 bits at the end of FN
        end if
    end while
    Append 11111111 at the end of FN
    Return FN
end procedure
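A minimal sketch of Freq-Codification in Python, assuming the frequency file holds one decimal frequency per line. The function name `freq_codification` and the sample contents are illustrative and not part of the original system.

```python
def freq_codification(freq_text):
    """Encode each decimal digit as 4 bits, each newline as 1111,
    and finish with the 8-bit terminator 11111111."""
    bits = []
    for ch in freq_text:
        if ch == "\n":
            bits.append("1111")  # line separator between two frequencies
        else:
            bits.append(format(int(ch), "04b"))  # digit 0..9 as 4 bits
    bits.append("11111111")  # end-of-frequencies marker
    return "".join(bits)


# Hypothetical frequency file: the two frequencies 12 and 5.
encoded = freq_codification("12\n5")
print(encoded)
```

Digits only need the patterns 0000 to 1001, which leaves 1111 free to act as the separator and as half of the terminator.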
Hiding Data within Images
Steganography() handles the prerequisites: it loads an image, stores the file name and image index, and finally calls the MdfImg operation, which maps data from the file to images.
ALGORITHM 3: STEGANOGRAPHY()
procedure
    Load Image_Index = ImageSearch(Image_Database)
    Store (FName, BCount, Image_Index)
    MdfImg(Image_Database[Image_Index])
end procedure
SEARCHING OF VALID IMAGE
The algorithm searches for an image that can be used to store the data. It returns the address of a valid image if one is available in the image database.
ALGORITHM 4: IMAGESEARCH(IMAGE_DATABASE)
procedure
    Open Image_Database
    for i = 1 to n do
        if (Image_Database(i).valid == 1)
            return i
        end if
    end for
end procedure
MAPPING DATA FROM A FILE TO IMAGE
This algorithm performs the actual steganographic operation by storing data in images.
Algorithm 5: MdfImg(Image_Database[Image_Index])
procedure
    Read Image_Database[Image_Index]
    Compute Pixel Count for Image_Database[Image_Index]
    Open FN
    while (read characters until EOF) do
        if (Pixel Count < BCount) …
            Replace the last bit of each consecutive pixel of Image_Database[Image_Index] with each character read from FN
        else
            Load Image_Index1 = ImageSearch(Image_Database)
            Image_Database[Image_Index1].valid = Image_Index1
            Image_Database[Image_Index1].valid = 0
        end if
    end while
end procedure
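The bit-replacement step can be sketched as below. This is a simplified illustration, not the authors' implementation: the image is assumed to be a flat list of 8-bit pixel values, `embed_bits` is a hypothetical name, and the sketch raises an error where the original algorithm would search for another image.

```python
def embed_bits(pixels, bits):
    """Return a copy of pixels in which the least significant bit of the
    i-th pixel is replaced by the i-th character ('0' or '1') of bits."""
    if len(bits) > len(pixels):
        # Simplification: the source algorithm looks for another image here.
        raise ValueError("image too small for the data")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | int(bit)  # clear the last bit, then set it
    return stego


pixels = [200, 201, 54, 55, 128, 17, 90, 91]  # hypothetical grey values
stego = embed_bits(pixels, "1100")
print(stego)
```

Each pixel value changes by at most 1, which is why the modification is hard to notice by eye.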
RETRIEVING DATA FROM IMAGE
The following algorithm retrieves the data from the images kept in cloud storage.
Algorithm 6: RetrieveData()
procedure
    Read File_Database
    for F_Database(i), i = 1 to m do
        if (F exists)
            I = the address of the image
        end if
    end for
    Open Image_Database
    Read Image_Database[I]
    Open a file FTemp and a file FFreq
    while (until we get 11111111 in Image_Database[I]) do
        Read 4 bits at a time from 4 consecutive pixels
        Convert them into decimal form
        sum = sum + 4
        if (decimal number is within 0 to 9)
            Write that digit in FFreq
        else
            Write a newline character in FFreq
        end if
    end while
    sum = sum - BCount
    Call Huffman-Tree() based on the frequency counts present in FFreq and create the Huffman tree
    while (sum <= BCount) do
        Read bits from Image_Database[I]
        Start traversing the Huffman tree from the root
        When a leaf node is reached, a character is obtained
        Append that character to FTemp
        Increment sum by the number of bits collected from Image_Database[I]
    end while
    Show FTemp to the user; after the user closes FTemp, delete it from the system
end procedure
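The first loop of the retrieval (reading 4 bits at a time until 11111111 appears) might look like this in Python. It assumes the image is a flat list of pixel values whose least significant bits carry the data; `extract_bits` and `split_header` are hypothetical names.

```python
def extract_bits(pixels):
    """Collect the least significant bit of every pixel as a string of 0s and 1s."""
    return "".join(str(p & 1) for p in pixels)


def split_header(bits):
    """Read 4-bit groups until the terminator 11111111 is met.
    Returns the recovered frequency text and the number of bits consumed."""
    out = []
    pos = 0
    while bits[pos:pos + 8] != "11111111":
        if pos + 8 > len(bits):
            raise ValueError("terminator not found")
        value = int(bits[pos:pos + 4], 2)
        # 0..9 is a digit; anything else is treated as a newline, as in the algorithm.
        out.append(str(value) if value <= 9 else "\n")
        pos += 4
    return "".join(out), pos + 8


# Hypothetical stego image: frequencies 12 and 5, followed by 4 payload bits.
header = "000100101111010111111111"
payload = "0100"
pixels = [100 + int(b) for b in header + payload]  # 100 is even, so the LSB equals b

all_bits = extract_bits(pixels)
freq_text, used = split_header(all_bits)
remaining = all_bits[used:]
print(repr(freq_text), used, remaining)
```

The remaining bits are the Huffman-coded text, which is decoded with the tree rebuilt from the recovered frequencies.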
CONSTRUCTION OF HUFFMAN TREE
A priority queue, Q, with frequency as the key, is used to generate the Huffman tree.
Algorithm 7a: Huffman-Tree(X)
procedure
    N = |X|
    Q = X
    for i = 1 to N-1 do
        Z = Allocate_node()
        Z.left = Extract_min(Q)
        Z.right = Extract_min(Q)
        Frequency(Z) = Frequency(Z.left) + Frequency(Z.right)
        Insert(Q, Z)
    end for
end procedure
Algorithm 7b: Allocate_node()
procedure
    Create a node for storing characters and their frequency from available free memory space
    Return the allocated node
end procedure
Algorithm 7c: Extract_min(Q)
procedure
    Remove and return the node with minimum frequency from the priority queue Q
end procedure
Algorithm 7d: Insert(Q, Z)
procedure
    Insert the node Z in the priority queue Q
end procedure
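The priority-queue construction can be sketched with Python's standard `heapq` module standing in for Q. The frequencies below are assumed for illustration, and `huffman_codes` is a hypothetical name; unlike Algorithm 7a as written, the sketch also takes the last node left in the queue as the root and reads the codes off the tree.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a Huffman tree from {character: frequency} and return {character: code}."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(tie), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    for _ in range(len(freqs) - 1):
        f1, _t1, left = heapq.heappop(heap)   # Extract_min(Q)
        f2, _t2, right = heapq.heappop(heap)  # Extract_min(Q)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))  # Insert(Q, Z)
    _f, _t, root = heap[0]  # the single remaining node is the root

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a character
            codes[node] = prefix or "0"

    walk(root, "")
    return codes


# Assumed frequencies for six characters (illustrative only).
codes = huffman_codes({"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5})
print(codes)
```

Frequent characters end up near the root and get short codes, while rare ones get longer codes.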
Security Analysis and Performance Evaluation
• Huffman coding is a variable-length coding scheme.
• The frequency of each character is stored in some chronological order.
• Variable-length encoding does not help the attacker to recognize the characters.
• The bits can be decoded only with the Huffman tree.
• The frequency file contains only frequencies.
• Changing the chronological order makes the characters difficult to track.
• Ultimately, we have a secured system.
• SECURITY STRENGTH AGAINST CSP-1
CSP-1 only stores some files. CSP-1 does not contain the retrieving algorithm, so the images containing data are safe.
• SECURITY STRENGTH AGAINST CSP-2
The retrieving and hiding mechanisms are stored in CSP-2. Knowing only the algorithm will not help the attacker.
• SECURITY STRENGTH AGAINST CSP-3
CSP-3 is responsible for computation. All files are deleted after the above operations.
CONCLUSION We applied steganographic approach to ensure data storage security in cloud computing using Huffman Coding (SAHC). Through detailed security and performance analysis this approach gives high security of data when it is on rest in the data center of any Cloud Service Provider (CSP). This proposed architecture will be able to provide customer satisfaction to a great level and it will attract more clients in the field of cloud computing for industrial as well as future research firms. 32
REFERENCES
[1] Peter Mell, Timothy Grance, “The NIST Definition of Cloud Computing”, Jan. 2011. http://docs.ismgcorp.com/files/external/Draft-SP-800-145_clouddefinition.pdf
[2] Amazon.com, “Amazon Web Services (AWS)”, online at http://aws.amazon.com, 2008.
[3] Cong Wang, Qian Wang, Kui Ren, and Wenjing Lou, “Ensuring Data Storage Security in Cloud Computing”, 17th International Workshop on Quality of Service, USA, pp. 1-9, 2009, ISBN: 978-42443875-4.
[4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to Algorithms, Third Edition, Prentice Hall of India, 2010.
[5] B.P Rimal, Choi Eunmi, I. Lumb, “A Taxonomy and Survey of Cloud Computing Systems”, Intl. Joint Conference on INC, IMS and IDC, pp. 44-51, Seoul, Aug. 2009. DOI: 10.1109/NCM.2009.218.