Skip to content
This repository was archived by the owner on May 25, 2023. It is now read-only.

Conversation

@svroonland
Copy link

@svroonland svroonland commented Jan 19, 2018

First of all, thanks to Lightbend for publishing and committing to maintain this excellent library.

One of my annoyances of the java API for kafka streams is having to pass explicit Serdes to lots of methods, in the 1.0.0 api in the form of Serialized or Produced. Although most methods have overloads without a serializer, those may result in runtime errors if no appropriate serializer is configured.

In scala we can make nice use of implicits for passing the Serdes. In this PR I propose some additions to the scala api that take implicit Serdes instead of explicit Produced and Serialized. Unfortunately to get it to compile, I had to remove the overloads without those parameters, but as the test code shows, the change in client code is minor compared to the advantages.

}

def through(topic: String): KStreamS[K, V] = inner.through(topic)
def through(topic: String)(implicit keySerde: Serde[K], valueSerde: Serde[V]): KStreamS[K, V] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind this version of the API is to allow the user to use the default serializers, deserializers and the producer's DefaultPartitioner. Here's the Javadoc ..

Materialize this stream to a topic and creates a new KStream from the topic using default serializers and deserializers and producer's DefaultPartitioner.

By asking the user to supply implicit parameters for them makes the contract contradict the original API intent. And also the API variant with an explicit Produced parameter is also present here.

If you look at the API design scheme of Kafka Streams 1.0, clearly they have 2 variants - the first that takes these arguments from the default setups and the second one that allows the user to explicitly pass them through Produced, Serialized, Consumed abstractions. In the version that u suggest, even though the user may have the defaults set for the serializers, she will have to pass them again as implicits.

Copy link
Author

@svroonland svroonland Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that typeclass implicits are both a better solution and more common in scala land (see e.g. play-json Format or play ConfigLoader and ContentWritable). The configuration file solution should be available, but be secondary and not-recommended in my opinion as they carry the risk of runtime errors. Maybe we can expose them under alternative method names?

By asking the user to supply implicit parameters for them makes the contract contradict the original API intent.

You could argue that the default Serde's from the config are a java mechanism to prevent having to explicitly pass them all over the place, for lack of implicit parameters in java. Replacing this mechanism with scala implicits, I'd say that the proposed change nicely 'translates' this intent to Scala. With the added benefit that we avoid runtime errors when the default serializer is not able to handle the type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typeclass implicits are both a better solution and more common in scala land

fully agree but here by introducing implicits we are losing a big use case - defining default (de-)serializers upfront and then using the APIs like through or to with just the topic specified as argument. And this is a very well established idiom when programming with kafka-streams. These users will have to define implicits for them and ensure they are in scope. And this they need to do in every function that use these APIs.

Copy link

@maver1ck maver1ck Feb 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicit conversion should be provided by this package.
Then user instead of setting defaults will be doing import sth.implicits._
It's not a big deal.
Please look how it's done in Apache Spark.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seglo
Copy link

seglo commented Jan 28, 2018

Providing implicit Serdes to wrapper implementations that are meant to use the default key/value Serdes provided at stream configuration time would hide kstreams functionality from the end user. It would require the user to keep an instantiated Serdes in scope for each call. I agree with @debasishg on this point.

I think we can all agree that implicit parameters are useful and often used in Scala-based marshalling libraries, but I think coercing the signature of facades that make use of default Serdes is the wrong way to go about it.

One advantage of this library that would be nice to maintain is that it can serve as a drop in replacement for many of the user-facing kstreams API's. Developers familiar with kstreams can intuitively use it, without worrying about how usages of underlying implementations would change.

Perhaps we could find a way to reduce the Serdes boilerplate that maintains the integrity of the default Serdes implementation. Perhaps providing an overload that takes implicit Produced, Serialized, Consumed parameters?

From your PR description:

In scala we can make nice use of implicits for passing the Serdes. In this PR I propose some additions to the scala api that take implicit Serdes instead of explicit Produced and Serialized. Unfortunately to get it to compile, I had to remove the overloads without those parameters, but as the test code shows, the change in client code is minor compared to the advantages.

I don't understand why you would have to remove the overloads that use the default Serdes.

@debasishg
Copy link
Contributor

Closing the PR for the time being. Will reopen if need be ..

@debasishg debasishg closed this Feb 3, 2018
@debasishg debasishg mentioned this pull request Feb 8, 2018
@svroonland
Copy link
Author

@seglo Thanks for your input.

The reason the overloads taking the default Serdes would have to be removed, is that unless I'm missing something here, the compiler will complain about ambiguous reference to an overloaded method:

def a(a: A, b:B) = ... def a(a: A, b:B)(implicit c: C) = ... a(someA, someB) // which of the two overloads is it referring to? 
@seglo
Copy link

seglo commented Feb 9, 2018

@svroonland Yes, you're right. There's another discussion thread happening that's honing in on an approach for supplying a default serdes if it's available via an implicit param. We can take this discussion there.

#46

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

4 participants