
Evaluating Pinpoint APM (Part 2)

This second post of the Pinpoint series covers the configuration of the HBase database where the monitoring data are written by the collector and from which they are read by the web UI.
I did the first evaluation of Pinpoint on a MS Windows machine, so here I am going to cover some specific installation details for this OS family. For initial evaluation purposes a standalone HBase server (which runs all daemons within a single JVM) is enough.

Database installation

Here I am referring to the latest stable release of HBase (1.2.4) available at the time of writing. This release supports both Java 7 and Java 8; I am using Java 8 here. Cygwin is not used for this installation.
Start by downloading the tarball with the HBase binaries and unpacking its content.
Rename the hbase-1.2.4 directory to hbase.
Set the JAVA_HOME variable to the JRE to use (if you haven't already done so on the installation machine).
Edit the %HBASE_HOME%\conf\hbase-site.xml configuration file in order to set the directories in the local filesystem where HBase and ZooKeeper write data:
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>file:///C:/Users/hbaseuser/hbase</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/C:/Users/hbaseuser/zookeeper</value>
      </property>
    </configuration>

There is no need to create those directories in advance: HBase will create them at the first start.
Download the Winutils executable from its GitHub repository and save it in a subfolder named bin of a local directory. Then edit the %HBASE_HOME%\conf\hbase-env.cmd file, setting the HADOOP_HOME environment variable to the Winutils home directory (the parent of that bin subfolder), as in the example below:
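Before the first start it can help to double check what HBase will actually read from the configuration file. The snippet below is only a minimal sketch in Python (not part of HBase or Pinpoint): the function name read_hbase_site and the SAMPLE constant are my own, and the sample text simply repeats the configuration shown above.

```python
import xml.etree.ElementTree as ET


def read_hbase_site(xml_text):
    """Return the <property> entries of an hbase-site.xml document as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Strip whitespace so stray spaces or newlines in the file don't matter
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample: same content as the configuration file of this post
SAMPLE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///C:/Users/hbaseuser/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/C:/Users/hbaseuser/zookeeper</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in read_hbase_site(SAMPLE).items():
        print(f"{name} = {value}")
```

To check a real file, read its content first (for example with open(path, encoding="utf-8").read()) and pass the text to the function.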
    set HADOOP_HOME=C:\DevelopmentTools\WinUtils
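A common mistake at this step is pointing HADOOP_HOME at the bin subfolder itself rather than at its parent. The following is a small sanity check sketched in Python, assuming the executable is saved as winutils.exe; the function names are hypothetical and the default path is just the one used in the example above.

```python
import os


def winutils_path(hadoop_home):
    """Build the location where winutils.exe is expected under HADOOP_HOME."""
    return os.path.join(hadoop_home, "bin", "winutils.exe")


def has_winutils(hadoop_home):
    """Return True if winutils.exe exists in the bin subfolder of hadoop_home."""
    return os.path.isfile(winutils_path(hadoop_home))


if __name__ == "__main__":
    # Falls back to the example path of this post if HADOOP_HOME is not set
    home = os.environ.get("HADOOP_HOME", r"C:\DevelopmentTools\WinUtils")
    print(f"{winutils_path(home)} found: {has_winutils(home)}")
```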
HBase needs ZooKeeper to run. You can have HBase start its own ZooKeeper instance simply by uncommenting the following line in the %HBASE_HOME%\conf\hbase-env.cmd file:
    rem set HBASE_MANAGES_ZK=true
Now you're ready to start HBase. From a command prompt execute the following command:
    %HBASE_HOME%\bin\start-hbase.cmd
To test that the database is running fine you can connect to its web UI available at the following URL:
    http://localhost:16010
or start an HBase shell session through the following command:
    %HBASE_HOME%\bin\hbase shell
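If you prefer a scripted check over opening the browser, the web UI test can be sketched in a few lines of Python. This is just an illustration using the standard library: the function name hbase_ui_is_up is my own, and it only verifies that something answers over HTTP at the given URL, not that HBase is healthy.

```python
import http.client
import urllib.error
import urllib.request


def hbase_ui_is_up(url="http://localhost:16010", timeout=5):
    """Return True if an HTTP server answers successfully at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status or malformed response
        return False


if __name__ == "__main__":
    print("HBase web UI reachable:", hbase_ui_is_up())
```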
   

Configuration for Pinpoint

Now that the HBase database is running you can create the schema for Pinpoint. You need to specify in the init script that you're going to use an existing HBase instance. So you need to edit the %PINPOINT_HOME%\quickstart\bin\init-hbase.cmd file setting the QUICKSTART_HBASE_PATH with your external HBase home path like in the example below:
    set QUICKSTART_HBASE_PATH=C:\DevelopmentTools\hbase
and then commenting out the line
    set QUICKSTART_HBASE_PATH=%QUICKSTART_BASE%\hbase\hbase
Save the changes and execute the script. The execution will last some minutes (depending on your machine resources), so be patient and grab a coffee or do some stretching exercises while waiting for it to be completed.
At the end the following tables should have been created in the database:
  •     AgentEvent
  •     AgentInfo
  •     AgentLifeCycle
  •     AgentStat
  •     AgentStatV2
  •     ApiMetaData
  •     ApplicationIndex
  •     ApplicationMapStatisticsCallee_Ver2
  •     ApplicationMapStatisticsCaller_Ver2
  •     ApplicationMapStatisticsSelf_Ver2
  •     ApplicationTraceIndex
  •     HostApplicationMap_Ver2
  •     SqlMetaData_Ver2
  •     StringMetaData
  •     TraceV2
  •     Traces
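To compare this list against what is actually in the database, you could copy the table names reported by the list command of the HBase shell and pass them to a small helper like the one below. It is only a sketch in Python: the names EXPECTED_TABLES and missing_tables are hypothetical, and the expected set is simply the list above.

```python
# Tables that the Pinpoint init script is expected to create (from the list above)
EXPECTED_TABLES = {
    "AgentEvent",
    "AgentInfo",
    "AgentLifeCycle",
    "AgentStat",
    "AgentStatV2",
    "ApiMetaData",
    "ApplicationIndex",
    "ApplicationMapStatisticsCallee_Ver2",
    "ApplicationMapStatisticsCaller_Ver2",
    "ApplicationMapStatisticsSelf_Ver2",
    "ApplicationTraceIndex",
    "HostApplicationMap_Ver2",
    "SqlMetaData_Ver2",
    "StringMetaData",
    "TraceV2",
    "Traces",
}


def missing_tables(found):
    """Return the expected table names that are absent from `found`, sorted."""
    return sorted(EXPECTED_TABLES - set(found))


if __name__ == "__main__":
    # Hypothetical input: pretend that only two tables were created
    print(missing_tables(["AgentEvent", "Traces"]))
```

An empty result means that all the expected tables are there.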

What's next

In the next post of this series we are going to learn how to start the collector and the web UI, test Pinpoint using the demo web application which is part of the quickstart bundle, and understand how to set up the agent to profile standalone and web Java applications.
