DEV Community

CodeWithML
CodeWithML

Posted on • Edited on

String Compression

Problem statement : Write a function to Compress a given string. Only return the compressed string if it saves space.

Difficulty level: Easy


Test Cases

  • 'AAABAACCDDDD' ---> 'A3BA2C2D4'
  • 'BAAACCDDDD' ---> 'BA3C2D4'
  • 'AABBCC' --> 'AABBCC'
  • 'BBBAACCCCDD' --> 'B3A2C4D2'

Algorithm

  • Setup a pointer at first character of string, initialise an empty string and a counter.
  • Iterate through the string and check if character is equal to the first character of the string.
    • if both are equal increment counter.
  • Else,

    • add the first character and it's count to result string.
      • if count is greater than 1 then only add count to the result string.
      • Else, add an empty string to the result string.
    • now, the first character will be the current character.
    • reset the counter value to 1.
  • Add the last character and it's count to result string.

  • Return the resultant compressed string if it's length is less than the given string, else return the given string.

Time and Space Complexity

  • Time complexity: O(n)
  • Space complexity: O(n)

Code

class CompressString(object): def compress(self, input): if input is None or not input: return input result = '' count = 0 prev_char = input[0] for char in input: if char == prev_char: count += 1 else: result += prev_char + (str(count) if count > 1 else '') prev_char = char count = 1 result += prev_char + (str(count) if count > 1 else '') return result if len(result) < len(input) else input 

More precise compression

class CompressString(object): def compress(self, input): if input is None or not input: return input result = '' count = 0 prev_char = input[0] for char in input: if char == prev_char: count += 1 else: if count > 2: result += prev_char + (str(count)) prev_char = char count = 1 elif count == 1: result += prev_char count = 1 prev_char = char else: result += prev_char result += prev_char count = 1 prev_char = char result += prev_char + (str(count) if count > 2 else prev_char) return result if len(result) < len(input) else input 

Test

import unittest from compressString import CompressString class TestCompress(unittest.TestCase): def testCompress(self, func): self.assertEqual(func(None), None) self.assertEqual(func(''), '') self.assertEqual(func('AABBCC'), 'AABBCC') self.assertEqual(func('AAABCCDDDDE'), 'A3BC2D4E') self.assertEqual(func('BAAACCDDDD'), 'BA3C2D4') self.assertEqual(func('AAABAACCDDDD'), 'A3BA2C2D4') print('Success: testCompress') def main(): test = TestCompress() compress_string = CompressString() test.testCompress(compress_string.compress) if __name__ == '__main__': main() 

Original article

Github solution

Happy Coding !

Top comments (3)

Collapse
 
grodzickir profile image
Ryszard Grodzicki

Simple improvement proposal: you could remove the last check by replacing chars only if there are more than 2 in a row.
Replacing 'CC' with 'C2' never saves space. Replacing more than 2 characters always saves space. So you would either return the same input string or return compressed string that's shorter than original.

Collapse
 
codewithml profile image
CodeWithML

I could do something like if count is more than 2 than add the character with count, if count is 1, than only add the character , and lastly if count is exactly 2, then add the character 2 times, so it would remain same as original string.

Collapse
 
codewithml profile image
CodeWithML

I didn't know that, looks really neat. Thanks for the tip. 🌟