@helinwang
Contributor

The pserver checkpoint previously failed because the MD5 checksum was
calculated incorrectly. The checksum is now CRC32.

var cptr (*C.uchar)
if len(c) > 0 {
cptr = (*C.uchar)(&c[0])
}
Contributor

Should the else branch log an error?

Contributor Author

Thanks! Done.

h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
crc32 := crc32.ChecksumIEEE(content)
Contributor

I'm curious why MD5 would cause the error.

Contributor

The same question...

Contributor Author
@helinwang, Oct 26, 2017

@typhoonzero @Yancey1989 @dzhwinter

I think the problem was that we used the md5 package incorrectly: it generated a very long string, which caused the etcd write to fail with "message too large". To reproduce the error:

    package main

    import (
        "crypto/md5"
        "encoding/hex"
        "fmt"
    )

    func main() {
        h := md5.New()
        md5 := hex.EncodeToString(h.Sum([]byte("hello this is some string")))
        // Output: 68656c6c6f207468697320697320736f6d6520737472696e67d41d8cd98f00b204e9800998ecf8427e
        fmt.Println(md5)
    }

The output is not a typical 32-character MD5 string, but a much longer one.

I think the correct way to get the MD5 string is:

    package main

    import (
        "crypto/md5"
        "fmt"
    )

    func main() {
        data := []byte("These pretzels are making me thirsty.")
        sum := fmt.Sprintf("%x", md5.Sum(data))
        // Output: b0804ec967f48520697662a204f5fe72
        fmt.Println(sum)
    }
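For readers wondering why the first snippet produces a long string: in Go's `hash.Hash` interface, `Sum(b)` appends the digest of whatever has been written to the hash so far onto `b`; it does not hash `b`. With nothing written, the result is the input bytes followed by the MD5 of empty input. The sketch below contrasts that misuse with two ways that should give the usual 32-character string. The function names are made up for illustration and are not from the PR.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// wrongMD5 reproduces the misuse: nothing is written to the hash, so
// Sum(content) appends the digest of the empty input to content.
func wrongMD5(content []byte) string {
	h := md5.New()
	return hex.EncodeToString(h.Sum(content))
}

// streamMD5 feeds content to the hash with Write, then requests the digest.
func streamMD5(content []byte) string {
	h := md5.New()
	h.Write(content) // Write on a hash.Hash never returns an error
	return hex.EncodeToString(h.Sum(nil))
}

// oneShotMD5 uses the package-level helper, which returns a [16]byte.
func oneShotMD5(content []byte) string {
	sum := md5.Sum(content)
	return hex.EncodeToString(sum[:])
}

func main() {
	content := []byte("hello this is some string")
	fmt.Println(len(wrongMD5(content)))  // grows with len(content)
	fmt.Println(len(streamMD5(content))) // 32 hex characters
	fmt.Println(streamMD5(content) == oneShotMD5(content))
}
```

Because the misused form grows with the size of the checkpoint content, a large checkpoint would plausibly push the stored metadata past etcd's message size limit, which matches the error described above.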

The reason for switching to CRC32 is that it is faster and better suited to a checksum, whereas MD5 is slower and better suited to defending against cracking.

h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
crc32 := crc32.ChecksumIEEE(content)
Contributor

One more question: why do we change MD5 to CRC? Is speed the concern?

}
}

if err != nil {
Contributor

Maybe we should add some logging here. Sorry, this is outside the code changed in this PR.

Contributor Author

The caller will handle it. Here is one caller:

    err := s.checkpoint()
    if err != nil {
        log.Error("checkpoint error", log.Ctx{"error": err})
    }

I think that, in general, the outermost caller should handle the error (either log it or do something else), because it has the most information. If every function logs the error, the log entries will be duplicated.

Contributor

I got it, thanks :)

h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
crc32 := crc32.ChecksumIEEE(content)
Contributor

The same question...

Contributor
@typhoonzero left a comment

LGTM!

@helinwang merged commit b1cbdf0 into PaddlePaddle:develop on Oct 26, 2017
@helinwang deleted the checkpoint branch on October 26, 2017, 15:51