Skip to content

Conversation

@mika314
Copy link
Contributor

@mika314 mika314 commented Jul 17, 2018

  • Make implementation shorter and easier to understand
  • Make namespace hierarchy flat
  • Combine Leaf and Branch in one class Node for the tree
  • Make codebook explicit
  • Use std::string for string_to_encode
  • Remove noise (delete function with_new_bit() for instance)
  • Use unordered map for weights instead of vector
  • Instead inserting to the orphan array, sort it on every iteration (debatable)
  • Remove structure for cacheing bits for character, calculate it every time
  • Remove error handling
  • Change the encoding interface: encode() and decode() instead huffman::encode_string() and encoded.decoded()
@Gathros Gathros added the Implementation Edit This provides an edit to an algorithm implementation. (Code and maybe md files are edited.) label Jul 17, 2018
@Butt4cak3
Copy link
Contributor

Could you please provide some explanation for why you think this rework was necessary and why your version is better than the current?

@mika314
Copy link
Contributor Author

mika314 commented Jul 18, 2018

Updated commit message and message in PR to emphasize why rework is needed.

Copy link
Contributor

@zsparal zsparal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the changes but I think the rest of the C++ code samples try to follow the standard library's style, which would mean using snake_case for types and variables

@mika314
Copy link
Contributor Author

mika314 commented Jul 20, 2018

@Gustorn no problem, I'll rename Node and Codebook to lower case.

- Make implementation shorter and easier to understand - Make namespace hierarchy flat - Combine Leaf and Branch in one class Node for the tree - Make codebook explicit - Use std::string for string_to_encode - Remove noise (delete function with_new_bit() for instance) - Use unordered map for weights instead of vector - Instead inserting to the orphan array, sort it on every iteration (debatable) - Remove structure for cacheing bits for character, calculate it every time - Remove error handling - Change the encoding interface: encode() and decode() instead huffman::encode_string() and encoded.decoded()
@mika314
Copy link
Contributor Author

mika314 commented Jul 20, 2018

@Gustorn changes with snake_case are pushed

std::string decode(Iter first, Iter last) const;
// common fields
node* parent = nullptr;
enum class branch { left, right };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think storing which branch we're on is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? It used here:

compressed_char.push_back(node->branch != node::branch::left);

https://github.com/antonte/algorithm-archive/blob/71c0d4295d4a892c91e1da64fdb7feaa9ea56dd3/contents/huffman_encoding/code/c%2B%2B/huffman.cpp#L83

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're storing a parent pointer anyway, it's trivial to write a function that tests whether the current node is a left or a right child. This implementation of the Huffman algorithm is inefficient enough as it is, no need to make it worse by bloating the node struct

@leios
Copy link
Member

leios commented Oct 16, 2018

Hey guys. What's the status on this PR? It looks like the review was completed; however, it was not yet merged into the AAA. Can I dismiss the review @Gustorn and merge?

@Gathros
Copy link
Contributor

Gathros commented Jan 4, 2019

Another stale pull request.

@Gathros Gathros closed this Jan 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Implementation Edit This provides an edit to an algorithm implementation. (Code and maybe md files are edited.)

5 participants