Applications like Impala require block locations to be non-empty for data loading to work. Currently, Impala cannot read through the Alluxio Hadoop client when data is not preloaded:
For files in an HDFS under file system, Alluxio returns the HDFS block location hostnames from BlockLocation::getNames(), but Impala requires the two-part name:port format indicated in the method's Javadoc.
For files in a non-HDFS or remote HDFS under file system, Alluxio returns no BlockLocation, which makes Impala skip reading the file altogether.
This change updates the client to return full two-part locations for UFS locations when an Alluxio worker is co-located. If no worker is co-located with the UFS block location, the client falls back to returning all worker hosts so that applications can pick any one of them to read from.
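
The selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual client code: the class BlockLocationSketch, the method resolveNames, the host names, and the port value are all hypothetical and chosen only for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the location-selection behavior; not Alluxio's real API.
public class BlockLocationSketch {

    /**
     * Returns two-part "host:port" names for a block.
     *
     * @param ufsHosts    hostnames reported by the under file system (may be empty)
     * @param workerPorts Alluxio worker hostname mapped to its data port (assumed input)
     */
    public static List<String> resolveNames(List<String> ufsHosts,
                                            Map<String, Integer> workerPorts) {
        List<String> names = new ArrayList<>();
        // Prefer UFS locations that have a co-located Alluxio worker.
        for (String host : ufsHosts) {
            Integer port = workerPorts.get(host);
            if (port != null) {
                names.add(host + ":" + port);
            }
        }
        // Fallback: no co-located worker, so expose every worker as a candidate.
        if (names.isEmpty()) {
            for (Map.Entry<String, Integer> worker : workerPorts.entrySet()) {
                names.add(worker.getKey() + ":" + worker.getValue());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Illustrative worker list; the port value is an assumption for the example.
        Map<String, Integer> workers = new LinkedHashMap<>();
        workers.put("node1", 29999);
        workers.put("node2", 29999);

        // HDFS UFS with one co-located worker.
        System.out.println(resolveNames(Arrays.asList("node2", "node3"), workers));
        // Non-HDFS UFS: no block locations reported, so all workers are returned.
        System.out.println(resolveNames(Collections.<String>emptyList(), workers));
    }
}
```

With this shape, the returned list is empty only when there are no workers at all, which is the property Impala relies on to avoid skipping the file.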