
Voxxed Days Milan 2019 review

I finally found a few minutes to share my impressions after attending the first Voxxed Days event in Italy, which took place in Milan on April 13th, 2019.



I was one of the speakers there: my talk was about Deep Learning on Apache Spark with DeepLearning4J (a follow-up of some topics from my book). There were 3 sessions running in parallel. The level of the talks was really high, and it was hard for me and any other participant to choose which one to follow in a given time slot. The good news is that all of the sessions have been recorded, and yesterday the first videos (those from the main session) were published on YouTube. Once they are all online, I suggest you watch as many of the videos as you can, but here are some suggestions among the talks I had a chance to attend in person at the event. I kept my comments to a minimum to avoid spoilers ;)

Opening keynote by Mario Fusco: he was the main organizer of the event, and in the opening keynote he presented the agenda. He recently wrote a book for Manning; in the late afternoon he signed and gave away some copies of it for free and was available for attendees' questions.

Keynote by Holly Cummins, The importance of fun in the workplace: the title says it all, and the content was brilliant. Highly recommended.

Boosting your applications with distributed caches/datagrids by Katia Aresti: I really enjoyed the talk even though I am definitely not a fan of the Harry Potter saga (all of Katia's examples referred to characters and/or situations from those books). But if someone mentions reactive microservices and Vert.x, I can bear the Harry Potter stuff too :)))

Performance tuning Twitter services with Graal and Machine Learning by Chris Thalinger: just in case you're among those people who still don't believe that Machine Learning can help you, from a DevOps perspective, improve the tuning and performance of your applications/services. A real-world use case from Twitter.

Concurrent Garbage Collectors: ZGC & Shenandoah by Simone Bordet: a detailed overview of the new Java 11 and 12 garbage collectors. Simone goes very deep into this topic. If you, like me, haven't had a chance yet to play with the latest two major Java releases, you will find this talk very informative.
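If you want to give the two collectors a quick try while following the talk, a minimal sketch is below. It assumes a Linux x64 JDK build that actually ships these collectors (ZGC is experimental in JDK 11, Shenandoah in JDK 12 builds that include it), and app.jar is just a placeholder for your own application:

# assumption: app.jar is a placeholder, replace it with your application
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -jar app.jar           # ZGC on JDK 11
java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -jar app.jar  # Shenandoah on JDK 12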

Interaction Protocols: It's all about good manners by Martin Thompson: an interesting history of distributed systems protocols and their quality attributes. More philosophical than technical, but absolutely enjoyable.

It wasn't only the talks that were fantastic. I really enjoyed the networking with the organizers, other speakers and participants, and I was also positively impressed by the very high level of the questions raised by attendees during the Q&A sessions. Definitely an ultra-positive experience overall.
