RHadoop (https://github.com/RevolutionAnalytics/RHadoop/wiki) is a collection of five R packages (rhdfs, rmr2, rhbase, ravro, plyrmr) that allow users to manage and analyze data with Hadoop. When you run any MapReduce job, even one as simple as this
from.dfs(mapreduce(to.dfs(1:100)))
through RHadoop on a Linux server, you may hit this exception:
2015-10-20 08:39:41,722 ERROR [main] org.apache.hadoop.streaming.PipeMapRed: configuration exception
java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1059)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209)
at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
at java.lang.reflect.Method.invoke(Method.java:620)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
at java.lang.reflect.Method.invoke(Method.java:620)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(AccessController.java:452)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.<init>(UNIXProcess.java:188)
at java.lang.ProcessImpl.start(ProcessImpl.java:164)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1040)
... 24 more
At first glance this looks like a permission problem on the Rscript executable for the user running Hadoop (the file cannot be missing, because it is part of the R installation and the MapReduce job was launched from an R console). But the error also occurs when that user has r-x permissions on the Rscript file. The root cause is the following: RHadoop/Hadoop tries to execute Rscript from the /usr/bin/ location (Hadoop Streaming starts it by name, so it has to be in a directory on the task's search path), when it is really in the $R_HOME/bin/ directory. The solution is simple. As root, create a symbolic link to the Rscript file this way:
ln -s $R_HOME/bin/Rscript /usr/bin/Rscript
This solution works on Red Hat and CentOS, but I suppose it should work on any Linux distro.
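For anyone who wants to script the fix, here is a minimal sketch of the same idea as a small shell function. The name link_rscript and its two parameters are my own illustration, not part of RHadoop or Hadoop; it assumes the first argument is the directory that really contains Rscript and the second is a directory the Hadoop task searches, such as /usr/bin.

```shell
#!/bin/sh
# Hypothetical helper (not part of RHadoop): make Rscript reachable from a
# directory that the Hadoop task searches, by creating a symbolic link.
#   $1 = directory that really contains Rscript (e.g. "$R_HOME/bin")
#   $2 = directory on the task's search path    (e.g. /usr/bin)
link_rscript() {
    r_bin_dir=$1
    target_dir=$2

    # Refuse to link to something that is not an executable Rscript.
    if [ ! -x "$r_bin_dir/Rscript" ]; then
        echo "Rscript not found in $r_bin_dir" >&2
        return 1
    fi

    # Leave an existing file or link alone (-L also catches a dangling link).
    if [ -e "$target_dir/Rscript" ] || [ -L "$target_dir/Rscript" ]; then
        echo "already present: $target_dir/Rscript"
        return 0
    fi

    ln -s "$r_bin_dir/Rscript" "$target_dir/Rscript" &&
        echo "linked: $target_dir/Rscript"
}
```

As root it could be called as link_rscript "$(R RHOME)/bin" /usr/bin, where R RHOME prints the R home directory, which is handy if $R_HOME is not set in root's shell. Since the stack trace comes from a YARN task container, the link probably has to exist on every node that runs tasks, not only on the machine where the R console is open.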