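Background
The project needs to manipulate Hive metadata through several custom components, so we store the Hive metadata in remote mode and run a backend service as a gateway that controls it.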
Symptom
When connecting to the Hive metastore from Windows, the client throws a NullPointerException for no apparent reason, which was very puzzling.
Analysis
Reading the code shows that, once connected, the client fetches the group information of the local user (in the open method of org.apache.hadoop.hive.metastore.HiveMetaStoreClient):
if (isConnected && !useSasl && conf.getBoolVar(ConfVars.METASTORE_EXECUTE_SET_UGI)){
  // Call set_ugi, only in unsecure mode.
  try {
    UserGroupInformation ugi = Utils.getUGI();
    client.set_ugi(ugi.getUserName(), Arrays.asList(ugi.getGroupNames()));
  } catch (LoginException e) {
    LOG.warn("Failed to do login. set_ugi() is not successful, " +
        "Continuing without it.", e);
  } catch (IOException e) {
    LOG.warn("Failed to find ugi of client set_ugi() is not successful, " +
        "Continuing without it.", e);
  } catch (TException e) {
    LOG.warn("set_ugi() not successful, Likely cause: new client talking to old server. "
        + "Continuing without it.", e);
  }
}
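ugi.getGroupNames() runs a local command. On Windows it uses a tool called winutils, but a machine used only for client development normally does not have these binaries installed, so the code path fails: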
/**
 * a Unix command to get a given user's groups list.
 * If the OS is not WINDOWS, the command will get the user's primary group
 * first and finally get the groups list which includes the primary group.
 * i.e. the user's primary group will be included twice.
 */
public static String[] getGroupsForUserCommand(final String user) {
  //'groups username' command return is non-consistent across different unixes
  return (WINDOWS)? new String[] { WINUTILS, "groups", "-F", "\"" + user + "\""}
                  : new String [] {"bash", "-c", "id -gn " + user
                                   + "&& id -Gn " + user};
}
WINUTILS is initialized in the following function, which returns null if the binary cannot be found in the path:
/** a Windows utility to emulate Unix commands */
public static final String WINUTILS = getWinUtilsPath();

public static final String getWinUtilsPath() {
  String winUtilsPath = null;

  try {
    if (WINDOWS) {
      winUtilsPath = getQualifiedBinPath("winutils.exe");
    }
  } catch (IOException ioe) {
    LOG.error("Failed to locate the winutils binary in the hadoop binary path",
        ioe);
  }

  return winUtilsPath;
}
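Since getWinUtilsPath() only logs the IOException and returns null, the missing binary goes unnoticed until the command array is actually used. As a rough sketch of a fail-fast alternative (WinUtilsGuard, requireWinUtils and groupsCommand are hypothetical names for illustration, not part of Hadoop), one could validate the path when the command is built:

```java
class WinUtilsGuard {
    /**
     * Hypothetical helper: rejects a missing winutils path with a
     * descriptive message instead of passing null along.
     */
    static String requireWinUtils(String winUtilsPath) {
        if (winUtilsPath == null) {
            throw new IllegalStateException(
                "winutils.exe was not found; group lookup is unavailable on this machine");
        }
        return winUtilsPath;
    }

    /** Builds the same Windows command shape as getGroupsForUserCommand. */
    static String[] groupsCommand(String winUtilsPath, String user) {
        return new String[] {
            requireWinUtils(winUtilsPath), "groups", "-F", "\"" + user + "\""
        };
    }

    public static void main(String[] args) {
        try {
            groupsCommand(null, "someuser"); // null stands in for an unresolved WINUTILS
        } catch (IllegalStateException e) {
            System.out.println("Rejected early: " + e.getMessage());
        }
    }
}
```

With a check like this the error message names the real cause (the missing binary) rather than surfacing as a bare NullPointerException later.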
The start method in java.lang.ProcessBuilder contains the following check:
public Process start() throws IOException {
    // Must convert to array first -- a malicious user-supplied
    // list might try to circumvent the security check.
    String[] cmdarray = command.toArray(new String[command.size()]);
    cmdarray = cmdarray.clone();

    for (String arg : cmdarray)
        if (arg == null)
            throw new NullPointerException();
    // Throws IndexOutOfBoundsException if command is empty
    String prog = cmdarray[0];
Because the first element of cmdarray is null, a NullPointerException is thrown immediately.
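The behavior is easy to reproduce outside Hadoop. The following is a minimal sketch using only the JDK; the class name NullCommandDemo and the local variable winutils are made up for illustration, with the null value standing in for Shell.WINUTILS on a machine without winutils.exe:

```java
import java.io.IOException;

class NullCommandDemo {
    /** Returns true if ProcessBuilder.start() rejects the command with an NPE. */
    static boolean startThrowsNpe(String... command) {
        try {
            // The constructor accepts null elements; the check happens in start().
            new ProcessBuilder(command).start();
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            // The program could not be launched for some other reason.
            return false;
        }
    }

    public static void main(String[] args) {
        String winutils = null; // assumption: winutils.exe could not be located
        System.out.println(startThrowsNpe(winutils, "groups", "-F", "\"someuser\""));
    }
}
```

The null check in start() runs before any process is launched, so the exception appears on any platform, not only on Windows.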
Null checks in toString()
In addition, the ShellCommandExecutor class in org.apache.hadoop.util.Shell has a problem: its toString method does not handle the case where an element is null, for example:
/**
 * Returns the commands of this instance.
 * Arguments with spaces in are presented with quotes round; other
 * arguments are presented raw
 *
 * @return a string representation of the object.
 */
@Override
public String toString() {
  StringBuilder builder = new StringBuilder();
  String[] args = getExecString();
  for (String s : args) {
    if (s.indexOf(' ') >= 0) {
      builder.append('"').append(s).append('"');
    } else {
      builder.append(s);
    }
    builder.append(' ');
  }
  return builder.toString();
}
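That is, if any element of the command args is null, this toString also throws a NullPointerException, because it calls a method on the object (s.indexOf) without checking it first. I seem to recall reading about this kind of problem in Effective Java. Normally it is not triggered, but when a debugger is attached, the debugger calls toString on the objects in the current scope. So every time I stepped into the relevant code, a NullPointerException popped up out of nowhere, which took quite a while to figure out.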
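A null-tolerant version of this formatting logic might look like the sketch below. It is a standalone illustration, not the Hadoop implementation: CommandFormatter and format are hypothetical names, and printing the literal text null for a missing argument is just one possible choice.

```java
class CommandFormatter {
    /**
     * Formats a command line for display. Arguments containing spaces are
     * quoted; null arguments are rendered as the text "null" instead of
     * causing a NullPointerException.
     */
    static String format(String[] args) {
        StringBuilder builder = new StringBuilder();
        for (String s : args) {
            if (s == null) {
                builder.append("null"); // placeholder choice, an assumption of this sketch
            } else if (s.indexOf(' ') >= 0) {
                builder.append('"').append(s).append('"');
            } else {
                builder.append(s);
            }
            builder.append(' ');
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(new String[] { null, "groups", "-F", "some user" }));
    }
}
```

Because toString is called implicitly by debuggers and log statements, keeping it free of exceptions makes the surrounding code much easier to inspect.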