With an Alluxio system managing over 50M files/directories, the master can consume more than 100GB of RAM (the exact number varies with configuration/workload). After looking at the heap dump, we found that InodeDirectory consumes quite a bit of memory; furthermore, the mChildren hashmap allocates 8K slots of 8 bytes each, which consumes 64KB of memory even for a directory without any children.
This should be investigated to see whether mChildren needs to be allocated and kept all the time, and whether there is a more efficient mechanism.
Yes, that is exactly what we saw. Is there a way to shrink the slots?
It will require a code change to fix this. ConcurrentHashMap doesn't support shrinking, so we need to do it ourselves by creating a new, smaller map and copying the entries into it. The resizing could be triggered when the map size drops 2 or 3 powers of 2 below its capacity, e.g. if the capacity has expanded to 2^10, we resize once the size is small enough to fit within capacity 2^7.
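A minimal sketch of how that shrink could work, assuming we track the capacity estimate ourselves since ConcurrentHashMap does not expose its internal table size; the class name, thresholds, and single-writer assumption are all made up for illustration, not actual Alluxio code:

```java
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch only: a child map that replaces itself with a smaller
 * ConcurrentHashMap when its size falls well below its last known capacity.
 * Assumes writes are already serialized by the caller (e.g. under the inode
 * write lock), since swapping the map reference is not itself atomic.
 */
public class ShrinkableChildMap {
  // Shrink once the entries fit in a capacity 3 powers of two smaller.
  private static final int SHRINK_GAP = 3;
  private static final int MIN_CAPACITY = 16;

  // Our own capacity estimate, updated as the map grows.
  private int mCapacity = MIN_CAPACITY;
  private volatile ConcurrentHashMap<String, Object> mChildren =
      new ConcurrentHashMap<>(mCapacity);

  public void put(String name, Object inode) {
    mChildren.put(name, inode);
    // Keep the capacity estimate roughly in sync with growth.
    while (mChildren.size() > mCapacity * 3 / 4) {
      mCapacity <<= 1;
    }
  }

  public Object remove(String name) {
    Object removed = mChildren.remove(name);
    maybeShrink();
    return removed;
  }

  public Object get(String name) {
    return mChildren.get(name);
  }

  private void maybeShrink() {
    int target = Math.max(MIN_CAPACITY, mCapacity >> SHRINK_GAP);
    if (mCapacity > MIN_CAPACITY && mChildren.size() <= target * 3 / 4) {
      // ConcurrentHashMap cannot shrink in place: copy into a smaller map.
      ConcurrentHashMap<String, Object> smaller = new ConcurrentHashMap<>(target);
      smaller.putAll(mChildren);
      mChildren = smaller;
      mCapacity = target;
    }
  }
}
```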
I think we could use a list to store child nodes, as HDFS does.
Right now we look up a path like /a/b/c by looking up “a” in the child list of “/”, then looking up “b” in the child list of “/a”, then looking up “c” in the child list of “/a/b”. These lookups are constant time because we use a map to store the children, where the key is the name of the child.
It looks like HDFS stores the children in a sorted list and uses binary search for lookups. The HDFS approach is more memory efficient, and also more computationally efficient for small directories.
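For illustration, a minimal sketch of that sorted-list layout (not the prototype mentioned below); the Inode stand-in and class names are hypothetical:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/**
 * Illustrative sketch of the HDFS-style layout: children kept in a list
 * sorted by name and looked up with binary search. An empty directory only
 * pays for an empty ArrayList rather than a pre-sized hash table.
 */
public class SortedChildList {
  /** Stand-in for the real inode class. */
  public static class Inode {
    final String mName;
    final SortedChildList mChildren = new SortedChildList();
    Inode(String name) { mName = name; }
  }

  private static final Comparator<Inode> BY_NAME =
      (a, b) -> a.mName.compareTo(b.mName);

  private final List<Inode> mChildren = new ArrayList<>();

  /** O(log n) lookup by child name. */
  public Inode get(String name) {
    int i = Collections.binarySearch(mChildren, new Inode(name), BY_NAME);
    return i >= 0 ? mChildren.get(i) : null;
  }

  /** O(n) insert: binary search for the position, then shift elements. */
  public void add(Inode child) {
    int i = Collections.binarySearch(mChildren, child, BY_NAME);
    if (i < 0) {
      mChildren.add(-i - 1, child);
    }
  }
}
```

With this layout, resolving /a/b/c walks get("a"), then get("b"), then get("c"), each step O(log n) in the directory size instead of O(1), in exchange for much lower per-directory memory overhead when directories are small.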
I implemented a prototype: