InodeDirectory consumes too much memory

Description

With an Alluxio system managing over 50M files/directories, the master can consume more than 100 GB of RAM (the exact number varies with configuration and workload). After looking at a heap dump, we found that InodeDirectory consumes quite a bit of memory. Further, the mChildren hash map allocates 8K slots of 8 bytes each, which consumes 64 KB of memory even for a directory without any children.

This should be investigated to see whether mChildren needs to be allocated and kept at all times, and whether there is a more efficient mechanism.

Environment

None

Activity

Chao Guang Li
September 11, 2018, 9:55 AM

Yes, that is exactly what we saw. Is there a way to shrink the slots?

Andrew Audibert
September 11, 2018, 5:58 PM

It will require a code change to fix this. ConcurrentHashMap doesn’t support shrinking, so we need to do it ourselves by creating a new, smaller map and copying the entries into it. The resizing could be triggered when the map size drops 2 or 3 powers of 2 below its capacity, e.g. if the capacity has expanded to 2^10, we resize once the size is small enough to fit within capacity 2^7.
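
A minimal sketch of the copy-to-a-smaller-map idea, for illustration only. ShrinkingChildMap, MIN_CAPACITY, SHRINK_SHIFT and the capacity estimate are hypothetical names and assumptions, not existing Alluxio code. Since ConcurrentHashMap does not expose its table length, the sketch tracks its own power-of-two estimate, and it serializes writers with synchronized, which a real fix might want to avoid.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical wrapper for illustration; not Alluxio's actual InodeDirectory code. */
class ShrinkingChildMap<K, V> {
  private static final int MIN_CAPACITY = 16;
  // Assumed threshold: shrink once the size fits in a capacity 2^3 times smaller.
  private static final int SHRINK_SHIFT = 3;

  // volatile so unsynchronized readers see the replacement map after a shrink.
  private volatile Map<K, V> mMap = new ConcurrentHashMap<>(MIN_CAPACITY);
  // ConcurrentHashMap does not expose its table length, so keep our own
  // power-of-two estimate that grows with the largest size seen.
  private int mCapacityEstimate = MIN_CAPACITY;

  public synchronized V put(K key, V value) {
    V old = mMap.put(key, value);
    while (mMap.size() > mCapacityEstimate) {
      mCapacityEstimate <<= 1;
    }
    return old;
  }

  public synchronized V remove(K key) {
    V old = mMap.remove(key);
    int target = Math.max(MIN_CAPACITY, mCapacityEstimate >> SHRINK_SHIFT);
    if (target < mCapacityEstimate && mMap.size() <= target) {
      // Copy the surviving entries into a new, smaller map and swap it in.
      Map<K, V> smaller = new ConcurrentHashMap<>(target);
      smaller.putAll(mMap);
      mMap = smaller;
      mCapacityEstimate = target;
    }
    return old;
  }

  public V get(K key) {
    return mMap.get(key);
  }

  public int size() {
    return mMap.size();
  }

  public synchronized int capacityEstimate() {
    return mCapacityEstimate;
  }
}
```

With these assumed constants, a map that grew to an estimated capacity of 2^10 is rebuilt once its size drops to 2^7 entries, which matches the example in the comment above.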

ligq
November 17, 2018, 1:16 PM
Andrew Audibert
November 18, 2018, 10:56 PM

Right now we look up a path like /a/b/c by looking up “a” in the child list of “/”, then looking up “b” in the child list of “/a”, then looking up “c” in the child list of “/a/b”. These lookups are constant time because we use a map to store the children, where the key is the name of the child.
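
The lookup described above can be sketched roughly as follows. Node, addChild and PathLookup are hypothetical names used only for this illustration; the real inode classes carry far more state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a directory inode; illustration only. */
class Node {
  final String mName;
  // Children keyed by name, so each per-component lookup is expected constant time.
  final Map<String, Node> mChildren = new ConcurrentHashMap<>();

  Node(String name) {
    mName = name;
  }

  Node addChild(String name) {
    return mChildren.computeIfAbsent(name, Node::new);
  }
}

class PathLookup {
  /** Resolves a path such as /a/b/c one component at a time; returns null if any component is missing. */
  static Node lookup(Node root, String path) {
    Node cur = root;
    for (String component : path.split("/")) {
      if (component.isEmpty()) {
        continue; // skip the empty string produced by the leading "/"
      }
      cur = cur.mChildren.get(component);
      if (cur == null) {
        return null;
      }
    }
    return cur;
  }
}
```

Each path component costs one map lookup, so the total cost grows with the depth of the path, not with the number of children per directory.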

It looks like HDFS stores the children in a sorted list and uses binary search for lookups. The HDFS approach is more memory efficient, and more computationally efficient for small directories.
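
A rough sketch of the sorted-list approach, to show the trade-off. SortedChildList is a hypothetical class that stores only child names; it is not the HDFS implementation, which stores inode objects. Lookups are O(log n) and inserts are O(n) because of element shifting, but there is no hash table to pre-allocate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical child list kept sorted by name; illustration only. */
class SortedChildList {
  private final List<String> mNames = new ArrayList<>();

  /** Inserts name at its sorted position; returns false if it is already present. */
  boolean add(String name) {
    int idx = Collections.binarySearch(mNames, name);
    if (idx >= 0) {
      return false;
    }
    // binarySearch returns (-(insertion point) - 1) when the key is absent.
    mNames.add(-idx - 1, name);
    return true;
  }

  boolean contains(String name) {
    return Collections.binarySearch(mNames, name) >= 0;
  }

  boolean remove(String name) {
    int idx = Collections.binarySearch(mNames, name);
    if (idx < 0) {
      return false;
    }
    mNames.remove(idx);
    return true;
  }

  int size() {
    return mNames.size();
  }

  String get(int index) {
    return mNames.get(index);
  }
}
```

For a directory with only a handful of children, a binary search over a short array can be competitive with hashing the name, which is the small-directory case mentioned above.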

ligq
November 19, 2018, 1:36 AM

Assignee

Unassigned

Reporter

Chao Guang Li

Labels

None

Components

Fix versions

Affects versions

Priority

Critical