Guide to using Protocol Buffers with Kotlin
Data serialization is a critical aspect of any software application. It involves converting data structures or objects into a format that can be stored or transmitted over a network. Efficient data serialization can have a significant impact on the performance of your application, especially when dealing with large volumes of data. This guide explores how to achieve efficient data serialization using Kotlin and Protocol Buffers.
Introduction to Protocol Buffers
Protocol Buffers is a language-agnostic data serialization format developed by Google. It allows you to define a message schema using a simple, human-readable language and then generates code for various programming languages to serialize and deserialize the message.
Protocol Buffers, what are they?
Protocol Buffers, also known as protobuf, is a method of serializing structured data. It is designed to produce smaller payloads and to encode and decode faster than XML or JSON.
It consists of a schema language for defining message types and a set of code generators for different programming languages. Messages are defined in that simple, human-readable language, and the protoc compiler then turns the definitions into code for encoding and decoding the messages in the desired programming language.
Message Schema
A Protocol Buffers message is defined by a message schema, which specifies the fields of the message and their types. The schema is written in the Protocol Buffers language.
syntax = "proto3";

message Person {
  string name = 1;
  int32 age = 2;
}
In the example above, we define a message schema for a person. The message has two fields, name and age, which are of type string and int32, respectively. The numbers after the field names are field numbers, which are used to identify fields in the encoded message.
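To make the role of field numbers concrete, here is a minimal sketch that hand-encodes the Person message using only the Kotlin standard library. The helper names (writeVarint, writeTag, encodePerson) are hypothetical and exist only for illustration; in a real project the generated code does this work. The sketch assumes a non-negative age and, unlike the real encoder, does not omit fields that hold their default value.

```kotlin
import java.io.ByteArrayOutputStream

// Wire types from the Protocol Buffers encoding rules.
const val WIRE_VARINT = 0            // int32, int64, bool, enum
const val WIRE_LENGTH_DELIMITED = 2  // string, bytes, nested messages

// Writes a non-negative value as a base-128 varint: 7 bits per byte, lowest group first.
fun ByteArrayOutputStream.writeVarint(value: Int) {
    var v = value
    while (v >= 0x80) {
        write((v and 0x7F) or 0x80)  // set the continuation bit
        v = v ushr 7
    }
    write(v)
}

// A field's tag combines its field number with its wire type.
fun ByteArrayOutputStream.writeTag(fieldNumber: Int, wireType: Int) =
    writeVarint((fieldNumber shl 3) or wireType)

// Hand-encodes Person { string name = 1; int32 age = 2; }.
fun encodePerson(name: String, age: Int): ByteArray {
    val out = ByteArrayOutputStream()
    val nameBytes = name.toByteArray(Charsets.UTF_8)
    out.writeTag(1, WIRE_LENGTH_DELIMITED)
    out.writeVarint(nameBytes.size)
    out.write(nameBytes)
    out.writeTag(2, WIRE_VARINT)
    out.writeVarint(age)
    return out.toByteArray()
}

fun main() {
    val bytes = encodePerson("Alice", 30)
    println(bytes.joinToString(" ") { "%02x".format(it) })
    // prints: 0a 05 41 6c 69 63 65 10 1e
}
```

Note that the field names never appear in the output: only the field numbers 1 and 2 do (inside the tag bytes 0a and 10), which is why field numbers must stay stable once a schema is in use.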
Serialization and Deserialization with Protocol Buffers
Once you have defined a message schema using Protocol Buffers, you can use the generated code to serialize and deserialize messages. The generated classes provide methods for encoding a message into binary format and decoding a message from binary format. The snippets in this section use the generated Java API, which is also callable from Kotlin.
// Schema
message Person {
  string name = 1;
  int32 age = 2;
}

// Java
Person person = Person.newBuilder()
    .setName("Alice")
    .setAge(30)
    .build();
byte[] bytes = person.toByteArray();
In the example above, we build a Person message from the Person schema we defined earlier. We use the Person.newBuilder() method to create a builder, set the name and age fields on it, and call build() to obtain the Person object. Finally, we use the toByteArray() method to encode the Person object into binary format.
To decode the message from binary format, we use the parseFrom() method:
byte[] bytes = ... // binary data
Person person = Person.parseFrom(bytes);
In the example above, we use the parseFrom() method to decode the Person object from binary format. The parseFrom() method takes a byte[] as input and returns a Person object; if the bytes are not a valid encoding, it throws an InvalidProtocolBufferException.
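As a rough illustration of what decoding involves, the following sketch reads the binary form of a Person message by hand, using only the Kotlin standard library. The function names (readVarint, decodePerson) are hypothetical, the sketch handles only the two wire types that Person needs, and it is not how the generated parseFrom() is implemented internally.

```kotlin
// Reads a base-128 varint starting at `pos`; returns the value and the next position.
fun readVarint(bytes: ByteArray, pos: Int): Pair<Int, Int> {
    var result = 0
    var shift = 0
    var i = pos
    while (true) {
        val b = bytes[i++].toInt() and 0xFF
        result = result or ((b and 0x7F) shl shift)
        if ((b and 0x80) == 0) return result to i
        shift += 7
    }
}

// Decodes Person { string name = 1; int32 age = 2; } into a (name, age) pair.
fun decodePerson(bytes: ByteArray): Pair<String, Int> {
    var name = ""
    var age = 0
    var pos = 0
    while (pos < bytes.size) {
        val (tag, afterTag) = readVarint(bytes, pos)
        val fieldNumber = tag ushr 3   // upper bits: which field this is
        val wireType = tag and 0x07    // lower three bits: how the value is laid out
        pos = afterTag
        when (wireType) {
            0 -> { // varint
                val (value, next) = readVarint(bytes, pos)
                if (fieldNumber == 2) age = value
                pos = next
            }
            2 -> { // length-delimited
                val (length, start) = readVarint(bytes, pos)
                if (fieldNumber == 1) name = bytes.decodeToString(start, start + length)
                pos = start + length
            }
            else -> error("wire type $wireType is not handled in this sketch")
        }
    }
    return name to age
}

fun main() {
    // Binary form of Person { name = "Alice", age = 30 }.
    val bytes = byteArrayOf(0x0a, 0x05, 0x41, 0x6c, 0x69, 0x63, 0x65, 0x10, 0x1e)
    println(decodePerson(bytes)) // prints: (Alice, 30)
}
```

Fields whose number the reader does not recognize are consumed and ignored here, which mirrors the idea behind forward compatibility: older readers can skip fields added by newer writers.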
Integrating Kotlin and Protocol Buffers
In this section, we will explain how to generate Kotlin code from Protocol Buffers schemas and how to use the generated Kotlin code to work with Protocol Buffers messages.
Create the project folder
First, create a new folder for your project. We will call it kotlin-protocol-buffer-sample.
mkdir kotlin-protocol-buffer-sample
cd kotlin-protocol-buffer-sample
Install Gradle
Next, install Gradle. We will use Gradle to build our project.
# macOS
brew install gradle
For other operating systems, please follow the official Gradle installation documentation.
Create a Gradle project
$ gradle init
Select type of project to generate:
1: basic
2: application
3: library
4: Gradle plugin
Enter selection (default: basic) [1..4] 2
Select implementation language:
1: C++
2: Groovy
3: Java
4: Kotlin
5: Scala
6: Swift
Enter selection (default: Java) [1..6] 4
Select build script DSL:
1: Groovy
2: Kotlin
Enter selection (default: Groovy) [1..2] 2
Project name (default: kotlin-protocol-buffer-sample):
Source package (default: kotlin.protocol.buffer.sample):
The build.gradle.kts configuration file
import com.google.protobuf.gradle.id

plugins {
    kotlin("jvm") version "1.8.0"
    id("com.google.protobuf") version "0.9.2"
    application
}

group = "com.devlach"
version = "1.0-SNAPSHOT"

repositories {
    mavenCentral()
    google()
}

val grpcVersion = "1.54.0"
val grpcKotlinVersion = "1.3.0"
val protobufVersion = "3.22.2"
val annotationApiVersion = "6.0.53"

dependencies {
    testImplementation(kotlin("test"))
    implementation(kotlin("stdlib"))
    implementation("io.grpc:grpc-kotlin-stub:$grpcKotlinVersion")
    implementation("io.grpc:grpc-protobuf:$grpcVersion")
    implementation("com.google.protobuf:protobuf-kotlin:$protobufVersion")
    runtimeOnly("io.grpc:grpc-netty-shaded:$grpcVersion")
    if (JavaVersion.current().isJava9Compatible) {
        compileOnly("org.apache.tomcat:annotations-api:$annotationApiVersion") // necessary for Java 9+
    }
}

protobuf {
    protoc {
        artifact = "com.google.protobuf:protoc:$protobufVersion"
    }
    plugins {
        create("grpc") {
            artifact = "io.grpc:protoc-gen-grpc-java:$grpcVersion"
        }
        create("grpckt") {
            artifact = "io.grpc:protoc-gen-grpc-kotlin:$grpcKotlinVersion:jdk8@jar"
        }
    }
    generateProtoTasks {
        all().forEach {
            it.plugins {
                create("grpc")
                create("grpckt")
            }
            it.builtins {
                create("kotlin")
            }
        }
    }
}

tasks.test {
    useJUnitPlatform()
}

kotlin {
    jvmToolchain(17) // Depends on your JDK version
}

application {
    mainClass.set("ServerKt")
}
This build.gradle.kts file defines the build configuration for a Kotlin-based project that uses the gRPC framework and Protocol Buffers for inter-service communication.
The plugins block specifies the plugins used in the project, including kotlin-jvm, com.google.protobuf, and application. The group and version properties define the Maven coordinates for the project.
The repositories block lists the repositories where the dependencies will be fetched from, including mavenCentral() and google().
The dependencies block defines the project’s dependencies, which include Kotlin test and standard libraries, as well as gRPC-related libraries such as grpc-kotlin-stub, grpc-protobuf, and protobuf-kotlin.
The protobuf block configures the Protocol Buffers code generation for the project. The protoc block specifies the artifact, and therefore the version, of the Protocol Buffers compiler to use, while the plugins block defines the gRPC plugins used to generate client and server code in both Java and Kotlin. The generateProtoTasks block applies those plugins, together with the built-in kotlin generator, to all .proto files in the project.
The tasks.test block configures the test task to use the JUnit platform for testing.
The kotlin block configures the Kotlin toolchain, setting the jvmToolchain to version 17.
The application block sets the main class for the application to ServerKt.
The Protocol definition
We are going to create a Protocol Buffers definition for a simple gRPC service called OrderService, which provides a single method called GetOrders.
First, create a new folder called kotlin-protocol-buffer-sample/src/main/proto and place the following definition in a .proto file inside it.
syntax = "proto3";

// Inferred from the com.devlach.proto imports used by the server and client code in this guide.
option java_multiple_files = true;
option java_package = "com.devlach.proto";

service OrderService {
  rpc GetOrders (OrderRequest) returns (OrderResponse) {}
}

message OrderRequest {
  int32 totalOrders = 1;
  int32 totalItems = 2;
}

message OrderResponse {
  repeated Order orders = 1;
}

message Order {
  int32 id = 1;
  string customerName = 2;
  repeated Item items = 3;
  double totalCost = 4; // sum of the item prices, so it must be a double
}

message Item {
  string name = 1;
  int32 quantity = 2;
  double price = 3;
}
The GetOrders method takes an OrderRequest message as input and returns an OrderResponse message as output.
The OrderRequest message has two integer fields: totalOrders and totalItems.
The OrderResponse message has a repeated field of Order messages called orders.
The Order message has four fields: id is an integer field, customerName is a string field, items is a repeated field of Item messages, and totalCost is a double field that holds the sum of the item prices.
The Item message has three fields: name is a string field, quantity is an integer field, and price is a double field.
Overall, this Protocol Buffers definition specifies a simple service for retrieving a list of orders, where each order has an ID, a customer name, a list of items, and a total cost, with each item in the list consisting of a name, quantity, and price.
Generating Server and Client Code
Server implementation
import com.devlach.proto.*
import io.grpc.ServerBuilder
import kotlin.random.Random

class Server(val port: Int) {
    val server = ServerBuilder.forPort(port)
        .addService(OrderService())
        .build()

    fun start() {
        server.start()
        println("Server started, listening on $port")
        // Stop the gRPC server when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(Thread {
            println("Received Shutdown Request")
            this.stop()
            println("Successfully stopped the server")
        })
    }

    fun stop() {
        server.shutdown()
    }

    fun blockUntilShutdown() {
        server.awaitTermination()
    }

    private class OrderService : OrderServiceGrpcKt.OrderServiceCoroutineImplBase() {
        override suspend fun getOrders(request: OrderRequest): OrderResponse {
            val ordersResp = mutableListOf<Order>()
            for (i in 1..request.totalOrders) {
                // order { } and item { } are the generated Kotlin DSL builders.
                val order = order {
                    id = i
                    customerName = "Customer $i"
                    for (j in 1..request.totalItems) {
                        items += item {
                            name = "Item $j"
                            price = Random.nextDouble(1.0, 100.0)
                        }
                    }
                    totalCost = items.sumOf { it.price }
                }
                ordersResp += order
            }
            return orderResponse {
                orders += ordersResp
            }
        }
    }
}

fun main() {
    // Use the PORT environment variable if set, otherwise default to 50051.
    val server = Server(System.getenv("PORT")?.toInt() ?: 50051)
    server.start()
    server.blockUntilShutdown()
}
This is a gRPC server that listens on a given port, provides an implementation of the OrderService defined in the Protocol Buffers file, and responds to requests to the GetOrders RPC method.
The Server class takes an integer port argument in its constructor and creates a new gRPC server by calling ServerBuilder.forPort(port). The server is then configured with an instance of the OrderService class and built using the build() method.
The start method starts the server and logs a message to indicate that it is listening on the specified port. It also registers a JVM shutdown hook that stops the server when the process is shut down.
The stop method shuts down the server by calling server.shutdown().
The blockUntilShutdown method blocks the current thread until the server is terminated.
The OrderService class extends the generated OrderServiceGrpcKt.OrderServiceCoroutineImplBase class and overrides its getOrders() method to implement the GetOrders RPC. The getOrders() method takes an OrderRequest message as input, generates a list of Order messages based on the request, and returns an OrderResponse message containing that list.
The main function creates a new instance of the Server class, passing in the port number as an argument. It then calls the start() method on the server and blocks until the server is terminated by calling the blockUntilShutdown() method.
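The order-generation logic inside getOrders() can also be exercised without starting a gRPC server. The sketch below reproduces it with plain Kotlin data classes; ItemModel, OrderModel, and generateOrders are hypothetical stand-ins introduced here for illustration and are not part of the generated code or the original project.

```kotlin
import kotlin.random.Random

// Hypothetical plain-Kotlin stand-ins for the generated Item and Order messages.
data class ItemModel(val name: String, val price: Double)

data class OrderModel(val id: Int, val customerName: String, val items: List<ItemModel>) {
    // Mirrors totalCost = items.sumOf { it.price } in the server handler.
    val totalCost: Double get() = items.sumOf { it.price }
}

// Same loop structure as the handler: totalOrders orders, each with totalItems items.
fun generateOrders(
    totalOrders: Int,
    totalItems: Int,
    random: Random = Random.Default,
): List<OrderModel> =
    (1..totalOrders).map { i ->
        OrderModel(
            id = i,
            customerName = "Customer $i",
            items = (1..totalItems).map { j ->
                ItemModel(name = "Item $j", price = random.nextDouble(1.0, 100.0))
            },
        )
    }

fun main() {
    // A seeded Random keeps the generated prices reproducible between runs.
    val orders = generateOrders(totalOrders = 2, totalItems = 5, random = Random(42))
    println(orders.size)               // prints: 2
    println(orders.first().items.size) // prints: 5
    println(orders.first().totalCost)  // sum of five prices, each in [1.0, 100.0)
}
```

Because each price is a Double, the total is a Double as well, which is why totalCost needs a floating-point type in the schema.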
Client implementation
import com.devlach.proto.Order
import com.devlach.proto.OrderServiceGrpc
import com.devlach.proto.orderRequest
import io.grpc.ManagedChannel
import io.grpc.ManagedChannelBuilder
import java.util.concurrent.TimeUnit
import kotlin.system.measureTimeMillis

class OrderClient(val channel: ManagedChannel) {
    private val stub = OrderServiceGrpc.newBlockingStub(channel)

    fun getOrders(totalOrdersToGenerate: Int = 5, totalItemsPerOrder: Int = 10): List<Order> {
        val request = orderRequest {
            totalOrders = totalOrdersToGenerate
            totalItems = totalItemsPerOrder
        }
        val response = stub.getOrders(request)
        return response.ordersList
    }

    fun close() {
        channel.shutdown().awaitTermination(5, TimeUnit.SECONDS)
    }
}

fun main() {
    val channel = ManagedChannelBuilder.forTarget("localhost:50051")
        .usePlaintext()
        .build()
    val client = OrderClient(channel)
    val timeMilliseconds = measureTimeMillis {
        println("Orders result: ${client.getOrders(
            totalOrdersToGenerate = 2,
            totalItemsPerOrder = 5
        )}")
    }
    println("Time taken: $timeMilliseconds ms")
    client.close()
}
After the server is started, we can use the client to send requests to the server and receive responses.
Orders result: [id: 1
customerName: "Customer 1"
items {
name: "Item 1"
price: 12.287644232722725
}
items {
name: "Item 2"
price: 90.36353951474923
}
items {
name: "Item 3"
price: 39.046442165072186
}
items {
name: "Item 4"
price: 47.07877242640991
}
items {
name: "Item 5"
price: 83.4756203248169
}
totalCost: 272.25201866377097
, id: 2
customerName: "Customer 2"
items {
name: "Item 1"
price: 31.341644363457085
}
items {
name: "Item 2"
price: 31.315596943150464
}
items {
name: "Item 3"
price: 75.48704154439892
}
items {
name: "Item 4"
price: 22.246544420244142
}
items {
name: "Item 5"
price: 27.7028320427861
}
totalCost: 188.0936593140367
]
Time taken: 117 ms
The code above implements a gRPC client that calls the OrderService on the server, using the stub generated from the OrderService proto definition.
The OrderClient class takes a ManagedChannel object in its constructor and creates a blocking stub for the OrderService using the newBlockingStub() method. The getOrders() function takes two parameters, totalOrdersToGenerate and totalItemsPerOrder, which are used to create an OrderRequest object that is sent to the server through the stub's getOrders() method.
The main() function creates a ManagedChannel using the forTarget() method, sets plaintext as transport security, and builds the channel. It then creates an instance of OrderClient using the created ManagedChannel. The measureTimeMillis function is used to time how long it takes to get the orders from the server. Finally, the OrderClient channel is closed using the close() function.
Overall, the OrderClient class provides a simple interface for making requests to the server and receiving responses. The main() function shows an example usage of the client by creating a channel and an instance of the client, making a request to the server, and timing the response.
Benchmarking
If we look at the previous example, the execution time was 117 milliseconds. How about doing some comparative analysis by changing the values of totalOrdersToGenerate and totalItemsPerOrder?
Set 5 orders with 10 items each
// 5 orders with 10 items each
println("Orders result size: ${client.getOrders(
totalOrdersToGenerate = 5,
totalItemsPerOrder = 10
).size}")
Orders result size: 5
Time taken: 101 ms
Set 10000 orders with 15 items each
// print the size of the result of 10000 orders with 15 items each
println("Orders result size: ${client.getOrders(
totalOrdersToGenerate = 10000,
totalItemsPerOrder = 15
).size}")
Orders result size: 10000
Time taken: 142 ms
The first example generates 5 orders with 10 items each, while the second example generates 10000 orders with 15 items each. Both examples print the size of the result, which should be equal to the number of orders generated.
The second example took longer than the first (142 ms versus 101 ms), as expected for a much larger response. The increase is far from proportional to the 2000-fold increase in the number of orders, which suggests that fixed costs such as establishing the connection on the first call and JVM warm-up account for most of the measured time in both runs.
In these runs, the response time of the gRPC call stayed between 100 and 150 ms even when the response grew to 10000 orders. Each figure comes from a single measurement of the first call on a new channel, so the numbers indicate a trend rather than a precise per-request cost.
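To separate one-time costs from the steady-state cost of a call, the measurement can be repeated after a few warm-up runs. The following is a minimal sketch of such a timing helper using only the Kotlin standard library; benchmark and TimingSummary are hypothetical names introduced here, and the default counts are arbitrary assumptions rather than recommended values.

```kotlin
import kotlin.system.measureNanoTime

// Summary of repeated timings, in milliseconds.
data class TimingSummary(val minMs: Double, val medianMs: Double, val maxMs: Double)

// Runs `block` untimed for `warmups` rounds, then times `iterations` further runs.
fun benchmark(warmups: Int = 5, iterations: Int = 20, block: () -> Unit): TimingSummary {
    require(iterations > 0) { "iterations must be positive" }
    repeat(warmups) { block() }
    val samplesMs = List(iterations) { measureNanoTime { block() } / 1_000_000.0 }.sorted()
    // For an even number of samples this picks the upper of the two middle values.
    return TimingSummary(samplesMs.first(), samplesMs[samplesMs.size / 2], samplesMs.last())
}

fun main() {
    // Stand-in workload. With the client from this guide, the block would instead be
    // client.getOrders(totalOrdersToGenerate = 10000, totalItemsPerOrder = 15).
    val summary = benchmark { (1..100_000).sumOf { it.toLong() } }
    println(summary)
}
```

Reporting the median alongside the minimum and maximum makes it easier to see whether a single slow run, such as the first call on a new channel, is skewing the result.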
Comparing with JSON and GraphQL
Technology | Pros | Cons | Use cases |
---|---|---|---|
Protocol Buffers | Faster and smaller than JSON, type safety, efficient binary serialization and deserialization, backward and forward compatibility | Requires a defined schema, not human-readable, not self-describing, not suitable for all use cases | Service-to-service communication such as gRPC APIs. It is a popular choice for microservices architectures, where performance and efficiency are crucial. |
GraphQL | Flexible data querying and retrieval, type safety, self-describing, supports client-driven requests and responses, allows selective data retrieval, supports multiple data sources | Requires a defined schema, not suitable for all use cases, adds complexity to the stack | Client-server applications that require flexible and efficient data retrieval and manipulation, with complex data models and multiple data sources |
JSON | Human-readable, widely supported, self-describing, simple and familiar syntax, suitable for small to medium data models and applications | Slower and larger than Protocol Buffers, limited type safety, less suited to complex data models, not efficient for large data transfers | Web applications, RESTful APIs, small to medium-sized applications with simple data models |
In summary, Protocol Buffers provide a highly efficient and compact binary serialization format with language-agnostic support, while GraphQL offers a strong type system and reduced over-fetching of data. JSON, on the other hand, is widely supported and easy to use, making it suitable for web APIs that require quick iteration and debugging. Choosing the right technology depends on the specific needs and requirements of the project.
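As a small, concrete illustration of the size difference, the sketch below compares the Protocol Buffers and JSON encodings of the Person message from the beginning of this guide. The function names are hypothetical, and the Protocol Buffers bytes are written out by hand rather than produced by generated code.

```kotlin
// Binary Protocol Buffers form of Person { name = "Alice", age = 30 }:
// 0a = field 1 (length-delimited), 05 = length, then "Alice" in UTF-8,
// 10 = field 2 (varint), 1e = 30.
fun protobufPersonBytes(): ByteArray =
    byteArrayOf(0x0a, 0x05) + "Alice".toByteArray(Charsets.UTF_8) + byteArrayOf(0x10, 0x1e)

// The same data as compact JSON, encoded as UTF-8.
fun jsonPersonBytes(): ByteArray =
    """{"name":"Alice","age":30}""".toByteArray(Charsets.UTF_8)

fun main() {
    println("protobuf: ${protobufPersonBytes().size} bytes") // prints: protobuf: 9 bytes
    println("json: ${jsonPersonBytes().size} bytes")         // prints: json: 25 bytes
}
```

Most of the difference comes from JSON repeating the field names and punctuation in every message, whereas the binary form carries only field numbers and values. The exact ratio depends on the data, so this single message should not be read as a general benchmark.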
Conclusion
In conclusion, Kotlin is a versatile programming language that has become popular due to its interoperability, safety features, and simplicity. Protocol Buffers are a data serialization format that provides an efficient and flexible way to exchange data between different systems. Integrating Kotlin and Protocol Buffers allows developers to leverage the benefits of both technologies and build fast and efficient systems.
In addition, the comparative analysis of Protocol Buffers, GraphQL, and JSON shows that each technology has its own strengths and weaknesses. Protocol Buffers are the most efficient in terms of serialization and deserialization speed, while GraphQL is the most flexible and provides powerful querying capabilities. JSON, on the other hand, is the most widely used and has excellent compatibility with a wide range of systems.
Overall, the choice of data serialization technology depends on the specific use case and the requirements of the system. By understanding the strengths and weaknesses of each technology, developers can make informed decisions and build systems that are fast, efficient, and flexible.
All the code snippets mentioned in the article can be found on GitHub.