Skip to content

Conversation

@dkz2
Copy link
Contributor

@dkz2 dkz2 commented Jun 7, 2023

Updated our Encoder and Decoder to serialize/deserialize a get MemcachedResponse. This pr closes #6.

Motivation:

This marks the beginning for being able to handle get operations, as well as the foundation set up for building a soon to come Memcached Actor.

Modifications:

  • Flag v for get command added.
  • Implemented serialization of get commands in our RequestEncoder.
  • Implemented extensions on ByteBuffer to allow us to read/write MemcachedFlags.
  • Implemented extension on ByteBuffer to allow us to read an integer from ASCII characters.
  • Condensed and Updated our ResponseDecoder decoding states.
  • Added Unit Test.

Result:
We can now successfully encode and decode a get MemcachedResponse that can be sent to the Memcached server

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start! Left some comments


struct GetCommand {
let key: String
var flags: [MemcachedFlag]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to use something more specialised than an array here. We probably should create a MemcachedFlags struct instead that holds all of the flags inline. We know what flags exists so we don't need the dynamism that array offers us.

out.writeBytes(command.key.utf8)

// write flags if there are any
for flag in command.flags {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we should create MemcachedFlags struct and create a separate extension method on ByteBuffer that offers a mutating func write(flags: MemcachedFlags)


var returnCode: ReturnCode
var dataLength: UInt64?
var flags: [MemcachedFlag]?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here let's not use an array and lets add a mutating func readMemcachedFlags() -> MemcachedFlags to ByteBuffer

self.nextStep = .dataLengthOrFlag(returnCode)
return .continueDecodeLoop

case .dataLengthOrFlag(let returnCode):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should separate this into two steps one dataLength and one flag and we can decide what the next step is in the case above when we decoded the return code.

if returnCode == .VA {
// TODO: Implement decoding of data length
fatalError("Decoding for VA return code is not yet implemented")
guard let dataLength = buffer.readInteger(as: UInt64.self) else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Is the data length encoded as a UInt64 or rather written out as text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are correct, it is written out as text, so I will make that change as well!

return .returnDecodedResponse(response)

case .decodeValue(let returnCode, let dataLength, let flags):
guard buffer.readableBytes >= dataLength else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check here for dataLength + carriageReturn + newLine. To make sure we are not decoding the data until we also have the returns

Comment on lines 175 to 178
var data: ByteBuffer = .init()
for _ in 0..<dataLength {
let byte = buffer.readInteger(as: UInt8.self)
data.writeInteger(byte!)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a method to read a slice of a buffer which we should use here readSlice(length:)

nextByte == UInt8.carriageReturn,
let nextNextByte = buffer.readInteger(as: UInt8.self),
nextNextByte == UInt8.newline else {
return .waitForMoreBytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we change the logic here to guard that we have these bytes readable as well we should probably fatalError here because it should never happen

///
/// - parameters:
/// - integer: The MemcachedFlag to serialize.
mutating func write(flags: MemcachedFlags) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming NIT:

Suggested change
mutating func write(flags: MemcachedFlags) {
mutating func writeMemcachedFlags(_ flags: MemcachedFlags) {
/// If a flag is set, its corresponding byte value and a whitespace character is written into the ByteBuffer.
///
/// - parameters:
/// - integer: The MemcachedFlag to serialize.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doc is outdated

//===----------------------------------------------------------------------===//

struct MemcachedFlags {
var v: Bool = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give this a good name of what it does and add a doc comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also probably be optional since all flags are optional.

Comment on lines 48 to 58
extension MemcachedFlags: Equatable {
static func == (lhs: MemcachedFlags, rhs: MemcachedFlags) -> Bool {
return lhs.v == rhs.v
}
}

extension MemcachedFlags: Hashable {
func hash(into hasher: inout Hasher) {
hasher.combine(v)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to implement those. The compiler will synthesise them for you if you write

extension MemcachedFlags: Hashable {} 
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great catch, didnt think about that!

Comment on lines 37 to 45
var bytes: Set<UInt8> {
var result = Set<UInt8>()
for (keyPath, byte) in Self.flagToByte {
if self[keyPath: keyPath] {
result.insert(byte)
}
}
return result
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't allocate the intermediate Set here. We can do all of this in the writeMemcachedFlags method. Just go over each field of the struct, check if it is not-nil and write it out.

Comment on lines 25 to 26
extension UInt8 {
static var size: Int { MemoryLayout<UInt8>.size }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct no longer needed!

Comment on lines 28 to 30
static let flagToByte: [KeyPath<MemcachedFlags, Bool?>: UInt8] = [
\MemcachedFlags.shouldReturnValue: 0x76,
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is going to work for all flags. Some of the flags carry actual value e.g. a numeric TTL. I think we really just have to spell this out in the write method. Like just go field by field of the struct and check if it is optional or not.

}
}

extension MemcachedFlags: Equatable {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this since Hashable implies Equatable

Suggested change
extension MemcachedFlags: Equatable {}
///
/// - returns: An instance of `MemcachedFlags` containing the flags read from the buffer.
mutating func readMemcachedFlags() -> MemcachedFlags {
var flagBytes: Set<UInt8> = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocating this set is unnecessary. I would rather init an "empty" MemcachedFlags in the beginning and then read byte by byte. For now we drop everything that is not a v flag.

/// - returns: An instance of `MemcachedFlags` containing the flags read from the buffer.
mutating func readMemcachedFlags() -> MemcachedFlags {
var flagBytes: Set<UInt8> = []
while let nextByte = self.readInteger(as: UInt8.self), nextByte != UInt8.whitespace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. The next byte can be a whitespace since flags are separated by whitespaces. Furthermore, the next byte could be a newline + carriage return. We probably have to do this differently and handle it inside the decoder itself because we need to understand the state of the whole response, e.g. it depends what kind of response we are decoding if there is anything following the flags or not.

Comment on lines 75 to 78
/// Decode the flag
case flag(MemcachedResponse.ReturnCode, UInt64?)
/// Decode the next flag
case decodeNextFlag(MemcachedResponse.ReturnCode, UInt64?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need two different states here?


let returnCode = MemcachedResponse.ReturnCode(bytes)
self.nextStep = .dataLengthOrFlag(returnCode)
self.nextStep = .dataLength(returnCode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the return code here and decide if we transition to dataLength or flags

Comment on lines 150 to 165
case .flag(let returnCode, let dataLength):
let flags = buffer.readMemcachedFlags()

guard buffer.readableBytes >= 2,
let nextByte = buffer.readInteger(as: UInt8.self),
nextByte == UInt8.carriageReturn,
let nextNextByte = buffer.readInteger(as: UInt8.self),
nextNextByte == UInt8.newline else {
preconditionFailure("Expected to find CRLF at end of response")
}

self.nextStep = .decodeValue(returnCode, dataLength!, flags)
return .continueDecodeLoop

case .decodeNextFlag(let returnCode, let dataLength):
while let nextByte = buffer.readInteger(as: UInt8.self), nextByte != UInt8.whitespace {
// for now consume the byte and do nothing with it.
// TODO: Implement decoding of flags
let flags = buffer.readMemcachedFlags()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can combine these two cases

@dkz2 dkz2 marked this pull request as ready for review June 30, 2023 12:43
Comment on lines 66 to 83
mutating func readMemcachedFlags() -> MemcachedFlags {
var flags = MemcachedFlags()
while let nextByte = self.readInteger(as: UInt8.self) {
switch nextByte {
case UInt8.v:
flags.shouldReturnValue = true
case UInt8.whitespace:
continue
case UInt8.carriageReturn:
guard let _ = self.readInteger(as: UInt8.self), self.readableBytes > 0 else {
break
}
default:
preconditionFailure("Unrecognized flag.")
}
}
return flags
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see a couple of problems here:

  1. We have to verify after the carriage return that we get a newline
  2. More importantly it might be that our buffer doesn't contain the full flags yet. We have to support incremental parsing of flags. After thinking about this a bit, we should probably move the flag reading out of this method and into the state machine. We should be trying to read an individual flag + its token and stay in the decodeFlag state until we are done. We should also write a test for this where we only have a buffer that contains let's say the flag but not the carriage return and newline and send that as a second buffer into the decode method.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added a comment further down that basically says we should try to not do 2 yet since it makes everything a bit more complicated. However can you please add a test where you feed a partial buffer in that doesn't have everything i.e. only a \r and not the \n afterwards and then pass it in after and assert the right things happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be handled in a future PR!


init() {}

init(flagBytes: Set<UInt8>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this?

/// Decode the data length
case dataLength(MemcachedResponse.ReturnCode)
/// Decode the flags
case flags(MemcachedResponse.ReturnCode, UInt64?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's rename this to decodeFlag and always try to decode a single flag at a time. We just have to update the flags struct and store it as a state here.

Suggested change
case flags(MemcachedResponse.ReturnCode, UInt64?)
case flags(MemcachedResponse.ReturnCode, UInt64?)
/// Decode the flags
case flags(MemcachedResponse.ReturnCode, UInt64?)
/// Decode the Value
case decodeValue(MemcachedResponse.ReturnCode, UInt64, MemcachedFlags?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need an optional for flags here we can just always have it since all its field are optional

Comment on lines 123 to 125
while let currentByte = buffer.getInteger(at: buffer.readerIndex, as: UInt8.self), currentByte == UInt8.whitespace {
buffer.moveReaderIndex(forwardBy: 1)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can there be more than 1 whitespace? Shouldn't we rather just read the next byte and make sure it is a whitespace. Then afterwards we check where the next whitespace is by just getInteger forward. Once we have confirmed that we have a whitespace in the buffer we try to reach the integer from ASCII.

Comment on lines 131 to 138
// Skip over the whitespace, newline or carriage return
while let currentByte = buffer.getInteger(at: buffer.readerIndex, as: UInt8.self),
[UInt8.whitespace, UInt8.newline, UInt8.carriageReturn].contains(currentByte) {
buffer.moveReaderIndex(forwardBy: 1)
}

let isNextByteWhitespace = buffer.getInteger(at: buffer.readerIndex) == UInt8.whitespace
self.nextStep = isNextByteWhitespace ? .flags(returnCode, dataLength) : .decodeValue(returnCode, dataLength, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't do this here because we actually might have a short buffer. I would recommend that after reading the ASCII integer to transition to the next state which is decodeFlags. In that state we have to handle this whitespace/carriageReturn&newline logic.

if nextByte == UInt8.whitespace {
self.nextStep = .decodeNextFlag(returnCode, nil)
case .flags(let returnCode, let dataLength):
let flags = buffer.readMemcachedFlags()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is where it gets a bit tricky and I would like us to take a shortcut for now. The problem is that we can have a partial buffer where we don't have the carriageReturn&newline yet and we would need to support reading partial flags. This also includes reading the partial tokens of flags which might be quite big. In a best world, we would go flag by flag and move ourselves forward. However, implementing this requires a bit of logic. What I would do for now is trying to scan the buffer forward to see if we have a \r\n in the buffer. If we do we are going to try and decode if not we are going to return needMoreData and let's create an issue for us to revisit this in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, we will create a new issue to handle the partial buffer implementation.

try self.testDecodeResponse(buffer: &buffer, expectedReturnCode: .NF)
}

func testDecodeValueResponse() throws {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we need a couple of more tests here that are trying to pass partial responses and assert that we return the right things.

@dkz2 dkz2 requested a review from FranzBusch July 6, 2023 06:12
Comment on lines 75 to 76
// We were expecting a newline after the carriage return, but didn't get it.
return flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can fatalError here for now. This shouldn't happen

Comment on lines 111 to 116
guard buffer.readableBytes >= 2,
let lastTwoBytes = buffer.getBytes(at: buffer.writerIndex - 2, length: 2),
lastTwoBytes[0] == UInt8.carriageReturn,
lastTwoBytes[1] == UInt8.newline else {
return .waitForMoreBytes
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That check isn't correct. Instead of checking the last bytes we have to walk the buffer to find the first occurrence of \r\n since there might be more than one response in the buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified this to find the first occurrence!

Comment on lines 81 to 82
// If it wasn't a newline, it could be the start of the data block.
return flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to fatalError if it is not a newline anything else is unexpected.

Comment on lines 85 to 86
// Encountered a character we weren't expecting. This could be the start of the data block.
return flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not parsing the v flag anymore here. Additionally let's fatalError here as well for now since we don't support the other flags and the comment is wrong. It cannot be the start of the data block

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Let's get this merged so that we can focus on the actual Connection actor

@FranzBusch FranzBusch merged commit ed88f85 into swift-server:main Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants