Setting up a quick dev environment for Kafka, CSR and SDC

Few days ago Pat Patterson published an excellent article on DZone about Evolving Avro Schemas With Apache Kafka and StreamSets Data Collector. I recommend reading this interesting article. I followed this tutorial and today I want to share the details on how I quickly setup the environment for this purpose, just in case you should be interested on doing the same. I did it on a Linux Red Hat Server 7 (but the steps are the same for any other Linux distro) and using only images available in the Docker Hub.
First start a Zookeeper node (which is required by Kafka):

sudo docker run --name some-zookeeper --restart always -d zookeeper

and then a Kafka broker, linking the container to that for Zookeeper:

sudo docker run -d --name kafka --link zookeeper:zookeeper ches/kafka

Then start the Confluent Schema Registry (linking it to Zookeeper and Kafka):

sudo docker run -d --name schema-registry -p 8081:8081 --link zookeeper:zookeeper --link kafka:kafka confluent/schema-registry

and the REST proxy for it:

sudo docker run -d --name rest-proxy -p 8082:8082 --link zookeeper:zookeeper --link kafka:kafka --link schema-registry:schema-registry confluent/rest-proxy

Start an instance of the Streamsets Data Collector:

sudo docker run --restart on-failure -p 18630:18630 -d --name streamsets-dc streamsets/datacollector

Finally you can do an optional step in order to make more user friendly (compared to using the CSR APIs) the registration/update of Avro schema in CSR: start the OS CSR UI provided by Landoop:

sudo docker run -d --name schema-registry-ui -p 8000:8000 -e "SCHEMAREGISTRY_URL=http://<csr_host>:8081" -e "PROXY=true" landoop/schema-registry-ui

connecting it to your CSR instance.
You can create a topic in Kafka executing the following commands:

export ZOOKEEPER_IP=$(sudo docker inspect --format '{{ .NetworkSettings.IPAddress }}' zookeeper)
sudo docker run --rm ches/kafka kafka-topics.sh --create --zookeeper $ZOOKEEPER_IP:2181 --replication-factor 1 --partitions 1 --topic csrtest

The environment is ready to play with and to be used to follow Pat's tutorial.

Googlielmo's blog

Search This Blog

Setting up a quick dev environment for Kafka, CSR and SDC

Labels

Comments

Post a Comment

Popular posts from this blog

Load testing MongoDB using JMeter

Turning Python Scripts into Working Web Apps Quickly with Streamlit

Evaluating Pinpoint APM (Part 1)