Googlielmo's blog

Posts

Showing posts from September, 2016

HUG Ireland October Meetup: Machine Learning and the Serving Layer. Successful Big Data Architecture

Another interesting Hadoop User Group (HUG) Ireland Meetup takes place next week (Monday, October 3rd, 2016) in Dublin at the Bank of Ireland premises in Grand Canal Square: http://www.meetup.com/it-IT/hadoop-user-group-ireland/events/234240469/?eventId=234240469&chapter_analytics_code=UA-55809013-2 If you are in the Dublin area on Monday and are interested in Machine Learning, please attend this event to learn more and to network with other Big Data professionals. Hope to meet you there!

Adding Maven behavior to non-Maven projects

Non-Maven projects built through Jenkins cannot be automatically uploaded to a build artifact repository (like Apache Archiva or JFrog Artifactory) at the end of a successful build. To get this level of automation when you don't want to (or can't) convert a standard Java project into a Maven project, but still need a uniform build management process, simply add a pom.xml file to the root of your project and commit it to your source code repository along with all the other project files. You can create a basic template following the official documentation and then add or update the specific sections as described in this post. This way developers don't need to change anything else in the current project structure, and they don't need Maven on their local development machines or a Maven plugin in the IDE they use. Here's a description of all the sections of the pom.xml file. The values to overwrite are highlighted in bold. Ma...
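As a rough illustration of the idea (not the pom.xml from the original post, whose section-by-section description is cut off in this excerpt), a minimal file for a plain Java project might look like the sketch below. The coordinates, the repository ids and URLs, and the assumption that sources live in a top-level src directory are all hypothetical placeholders to be replaced with your own values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates: replace with your own -->
  <groupId>com.example</groupId>
  <artifactId>legacy-app</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <build>
    <!-- Assumption: the project keeps its sources in src/
         instead of the Maven default src/main/java -->
    <sourceDirectory>src</sourceDirectory>
  </build>

  <!-- Where "mvn deploy" uploads the built artifact.
       Ids and URLs are placeholders for your Archiva or Artifactory instance. -->
  <distributionManagement>
    <repository>
      <id>internal-releases</id>
      <url>https://repo.example.com/releases</url>
    </repository>
    <snapshotRepository>
      <id>internal-snapshots</id>
      <url>https://repo.example.com/snapshots</url>
    </snapshotRepository>
  </distributionManagement>
</project>
```

With a file like this in place, a Jenkins job can run mvn deploy against the project. The repository ids have to match server entries in the Maven settings.xml used by the build machine, since that is where the upload credentials are looked up.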

StreamSets Data Collector 1.6.0.0 has been released!

Release 1.6.0.0 of the StreamSets Data Collector came out on September 1st. It brings a large number of new features. Here are some of the most interesting: JDBC Lookup processor: it performs lookups in a database table through a JDBC connection, and you can then use the returned values to enrich records. JDBC Tee processor: it writes data to a database table through a JDBC connection, and you can then pass the generated database column values to fields. Support for reading data from paginated web pages through the HTTP origin. Support for Apache Kafka 0.10 and Elasticsearch 2.3.5. Enterprise security in the MongoDB origin and destination, including SSL and login credentials. Whole File data format: it moves entire files from an origin system (Amazon S3 or Directory) to a destination system (Amazon S3, HDFS, Local File System or MapR FS), so you can transfer any type of file. And many more. ...