Problem statement : Write a function to Compress a given string. Only return the compressed string if it saves space.
Difficulty level: Easy
Test Cases
- 'AAABAACCDDDD' ---> 'A3BA2C2D4'
- 'BAAACCDDDD' ---> 'BA3C2D4'
- 'AABBCC' --> 'AABBCC'
- 'BBBAACCCCDD' --> 'B3A2C4D2'
Algorithm
- Setup a pointer at first character of string, initialise an empty string and a counter.
- Iterate through the string and check if character is equal to the first character of the string.
- if both are equal increment counter.
-
Else,
- add the first character and it's count to result string.
- if count is greater than 1 then only add count to the result string.
- Else, add an empty string to the result string.
- now, the first character will be the current character.
- reset the counter value to 1.
- add the first character and it's count to result string.
Add the last character and it's count to result string.
Return the resultant compressed string if it's length is less than the given string, else return the given string.
Time and Space Complexity
- Time complexity: O(n)
- Space complexity: O(n)
Code
class CompressString(object): def compress(self, input): if input is None or not input: return input result = '' count = 0 prev_char = input[0] for char in input: if char == prev_char: count += 1 else: result += prev_char + (str(count) if count > 1 else '') prev_char = char count = 1 result += prev_char + (str(count) if count > 1 else '') return result if len(result) < len(input) else input
More precise compression
class CompressString(object): def compress(self, input): if input is None or not input: return input result = '' count = 0 prev_char = input[0] for char in input: if char == prev_char: count += 1 else: if count > 2: result += prev_char + (str(count)) prev_char = char count = 1 elif count == 1: result += prev_char count = 1 prev_char = char else: result += prev_char result += prev_char count = 1 prev_char = char result += prev_char + (str(count) if count > 2 else prev_char) return result if len(result) < len(input) else input
Test
import unittest from compressString import CompressString class TestCompress(unittest.TestCase): def testCompress(self, func): self.assertEqual(func(None), None) self.assertEqual(func(''), '') self.assertEqual(func('AABBCC'), 'AABBCC') self.assertEqual(func('AAABCCDDDDE'), 'A3BC2D4E') self.assertEqual(func('BAAACCDDDD'), 'BA3C2D4') self.assertEqual(func('AAABAACCDDDD'), 'A3BA2C2D4') print('Success: testCompress') def main(): test = TestCompress() compress_string = CompressString() test.testCompress(compress_string.compress) if __name__ == '__main__': main()
Top comments (3)
Simple improvement proposal: you could remove the last check by replacing chars only if there are more than 2 in a row.
Replacing 'CC' with 'C2' never saves space. Replacing more than 2 characters always saves space. So you would either return the same input string or return compressed string that's shorter than original.
I could do something like if count is more than 2 than add the character with count, if count is 1, than only add the character , and lastly if count is exactly 2, then add the character 2 times, so it would remain same as original string.
I didn't know that, looks really neat. Thanks for the tip. 🌟