Blocks cannot be freed, but they take up space on workers

Description

The system can get into a state where workers hold blocks which cannot be released by calling delete or free. This happens if the block master and filesystem master get out of sync, such that the block master is tracking a block which doesn't exist in any file's block list. This can happen in a couple of scenarios:

1) While performing a recursive delete on an out-of-sync directory (one containing files which are not tracked by Alluxio), files are deleted one by one from the UFS and Alluxio. If any file fails to be deleted from the UFS, e.g. due to a transient network failure, the blocks for the files that were already deleted successfully will not be deleted.
2) When using the recently introduced UFS sync feature, if a file is deleted directly from the UFS and the deletion is synced to Alluxio, the file metadata will be erased, but the block metadata will remain.
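
The out-of-sync state described above can be illustrated with a small toy model. This is only a sketch and not Alluxio code: the function name find_orphaned_blocks and both data structures are hypothetical stand-ins for the block master's set of tracked blocks and the filesystem master's per-file block lists. A block is treated as orphaned when the block master tracks it but no file's block list references it.

```python
from typing import Dict, List, Set


def find_orphaned_blocks(
    tracked_blocks: Set[int],
    file_block_lists: Dict[str, List[int]],
) -> Set[int]:
    """Return block ids that are tracked but not referenced by any file.

    tracked_blocks: hypothetical stand-in for the block master's view.
    file_block_lists: hypothetical stand-in for the filesystem master's
        view, mapping each file path to its list of block ids.
    """
    # Collect every block id that at least one file still references.
    referenced = {
        block_id
        for block_ids in file_block_lists.values()
        for block_id in block_ids
    }
    # Anything tracked but unreferenced cannot be reached through a file
    # path, so a delete or free on a path would never touch it.
    return tracked_blocks - referenced


if __name__ == "__main__":
    # The metadata for the file that owned blocks 3 and 4 is gone,
    # but the block master still tracks those blocks.
    tracked = {1, 2, 3, 4}
    files = {"/dir/a": [1, 2]}
    print(sorted(find_orphaned_blocks(tracked, files)))  # [3, 4]
```

In this model, both scenarios end in the same place: the entry for a file disappears from file_block_lists while its block ids stay in tracked_blocks.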

Environment

None

Status

Assignee

Andrew Audibert

Reporter

Andrew Audibert

Labels

None

Components

Fix versions

Affects versions

1.7.0

Priority

Critical