
Some potential errors building Hadoop on Windows (and their root causes and solutions).

In my last post I described the process to successfully build Hadoop on a Windows machine. Before arriving at a stable and reproducible procedure I faced some errors whose messages don't highlight the real problem. Browsing the web I didn't find much information about them, so I am going to share the solutions I found. In this list I am going to skip the errors caused simply by missing one or more steps of the overall build procedure (a missing JDK or Maven installation, missing entries in the PATH variable, missing environment variables, etc.), as those can be easily fixed.
I am still referring to release 2.7.1 of Hadoop, so most of the proposed solutions have been tested for this release only.

Building with an incompatible version of the .NET Framework.
Error message:
LINK : fatal error LNK1123: failure during conversion to COFF: file invalid or corrupt [C:\OpenSourcePrograms\Hadoop\hadoop-2.7.1-src\hadoop-common-project\hadoop-common\src\main\winutils\winutils.vcxproj]
Done Building Project "C:\OpenSourcePrograms\Hadoop\hadoop-2.7.1-src\hadoop-common-project\hadoop-common\src\main\winutils\winutils.vcxproj" (default targets) -- FAILED.


Root cause: You are trying to build using .NET Framework release 4.5. Hadoop 2.7.1 is incompatible with that .NET release.

Solution: Remove .NET 4.5 from the build machine and use release 4.0 instead (it comes with the Windows SDK 7.1).
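If you are not sure which .NET Framework releases are installed on the build machine, a quick check from a command prompt is to query the registry (this is the standard registry location on a default Windows installation; adjust it if your setup differs):

reg query "HKLM\SOFTWARE\Microsoft\NET Framework Setup\NDP" /s /v Version

Each installed release is listed under its own subkey: if a 4.5.x version shows up, uninstall it from Programs and Features before launching the build again.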


Building with the latest version of Protocol Buffers.
Error message:
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3:34.957s
[INFO] Finished at: Tue Aug 04 10:25:16 BST 2015
[INFO] Final Memory: 63M/106M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.7.1:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 2.6.1', expected version is '2.5.0' -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command


Root cause:
Hadoop Common 2.7.1 is incompatible with the latest version (2.6.1) of Google Protocol Buffers.

Solution:
Download and use release 2.5.0 (Hadoop Common 2.7.1 builds successfully with this release only).
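Once protoc 2.5.0 has been downloaded and unpacked, make sure its executable is the one resolved through the PATH before starting the build. A minimal check from the command prompt (C:\protoc-2.5.0-win32 below is just an example location, use the folder where you unpacked the 2.5.0 binary):

set PATH=C:\protoc-2.5.0-win32;%PATH%
protoc --version

The second command should print libprotoc 2.5.0; if it still reports 2.6.1, another copy of protoc.exe comes first in the PATH.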

Windows SDK not installed successfully or wrong path for the MSBuild.exe to use.
Error message:
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 45.046s
[INFO] Finished at: Tue Aug 04 11:34:29 BST 2015
[INFO] Final Memory: 66M/240M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.1:exec (compile-ms-winutils) on project hadoop-common: Command execution failed. Process exited with an error: 1 (Exit value: 1) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.1:exec (compile-ms-winutils) on project hadoop-common: Command execution failed.
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:216)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:317)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:152)
        at org.apache.maven.cli.MavenCli.execute(MavenCli.java:555)
        at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:214)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:158)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
        at java.lang.reflect.Method.invoke(Method.java:619)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Caused by: org.apache.maven.plugin.MojoExecutionException: Command execution failed.
        at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:303)
        at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:106)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
        ... 19 more
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
        at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:402)
        at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:164)
        at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:750)
        at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:292)
        ... 21 more
[ERROR]
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command

[ERROR]   mvn <goals> -rf :hadoop-common


Root cause:
This error happens when:
A) the Windows SDK installation didn't complete successfully, or
B) the MSBuild.exe found first in the system PATH variable is not the correct one to use (you could have more than one MSBuild.exe on a Windows machine).

Solution:
A) Repeat the Windows SDK 7.1 installation process and make sure it completes successfully.
B) Locate the proper MSBuild.exe to use and update the system PATH variable with the correct path.
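To see which MSBuild.exe copies are visible through the PATH and which one would be picked up first, the where command helps. The framework directory below is the usual location of the .NET 4.0 build tools on a 32-bit installation (use the Framework64 folder on a 64-bit one); adjust it to your machine:

where MSBuild.exe
set PATH=C:\Windows\Microsoft.NET\Framework\v4.0.30319;%PATH%
MSBuild /version

where lists every MSBuild.exe found through the PATH in resolution order, while MSBuild /version confirms the version actually in use.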

Error compiling with the IBM JDK.
Error message:
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 19 source files to C:\OpenSourcePrograms\Hadoop\hadoop-2.7.1-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-registry\target\test-classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /C:/OpenSourcePrograms/Hadoop/hadoop-2.7.1-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/secure/TestSecureLogins.java:[23,36] package com.sun.security.auth.module does not exist
[ERROR] /C:/OpenSourcePrograms/Hadoop/hadoop-2.7.1-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/secure/TestSecureLogins.java:[138,11] cannot find symbol
  symbol:   class Krb5LoginModule
  location: class org.apache.hadoop.registry.secure.TestSecureLogins
[ERROR] /C:/OpenSourcePrograms/Hadoop/hadoop-2.7.1-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/secure/TestSecureLogins.java:[138,49] cannot find symbol
  symbol:   class Krb5LoginModule
  location: class org.apache.hadoop.registry.secure.TestSecureLogins
[INFO] 3 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------


Root cause:
This error happens only when the JDK used to perform the compilation is the IBM one. The Hadoop source code itself is fully compatible with any JDK 7: no issues have been found using the IBM JDK or the OpenJDK so far. But one of the JUnit test cases (TestSecureLogins, contained in the Hadoop YARN project test package org.apache.hadoop.registry.secure) has a dependency on the Oracle implementation of the Krb5LoginModule class (com.sun.security.auth.module.Krb5LoginModule), which is not present in the IBM JDK. Even if you build Hadoop through Maven with the -DskipTests option, the unit tests are compiled anyway and the error above occurs.
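A quick way to confirm which JDK Maven is going to use (and so whether you are affected) is to check what Maven itself reports:

mvn -version

The output includes the Java version, the vendor and the Java home directory; if it points to an IBM JDK, the workaround below applies.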

Solution:
Open the %HADOOP_HOME%/hadoop-2.7.1-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/secure/TestSecureLogins.java source file with any text editor and replace the import of the Oracle Krb5LoginModule implementation
import com.sun.security.auth.module.Krb5LoginModule;
with the import of the IBM Krb5LoginModule implementation:
import com.ibm.security.auth.module.Krb5LoginModule;
Save the change and restart the build.
