Improve client metrics heartbeat

Description

Currently, every FileSystemContext sends its own metrics heartbeat to the master. We should aggregate these heartbeats into a single heartbeat per client process to reduce load on both the master and the clients.
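A rough sketch of what the aggregation could look like, assuming each context keeps its own counters and a single process-wide task merges them into one payload per tick. All names here (AggregatedMetricsHeartbeat, ContextMetrics, MasterSink, and the metric names in the usage note) are hypothetical and are not existing Alluxio classes; the real change would need to hook into MetricsSystem and the existing heartbeat executor.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one heartbeat per process that merges metrics from many contexts. */
public class AggregatedMetricsHeartbeat {

  /** Stand-in for whatever RPC client actually talks to the master (assumption). */
  public interface MasterSink {
    void heartbeat(Map<String, Long> metrics);
  }

  /** Stand-in for the counters owned by a single FileSystemContext (assumption). */
  public static final class ContextMetrics {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    public void inc(String name, long delta) {
      counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    /** Returns the counts accumulated since the last drain and resets them to zero. */
    Map<String, Long> drain() {
      Map<String, Long> snapshot = new HashMap<>();
      for (Map.Entry<String, AtomicLong> e : counters.entrySet()) {
        long value = e.getValue().getAndSet(0);
        if (value != 0) {
          snapshot.put(e.getKey(), value);
        }
      }
      return snapshot;
    }
  }

  private final List<ContextMetrics> contexts = new CopyOnWriteArrayList<>();
  private final MasterSink sink;

  public AggregatedMetricsHeartbeat(MasterSink sink) {
    this.sink = sink;
  }

  /** Called once per context instead of starting a per-context heartbeat thread. */
  public ContextMetrics register() {
    ContextMetrics metrics = new ContextMetrics();
    contexts.add(metrics);
    return metrics;
  }

  /** One heartbeat tick: merge every context's counters and send a single RPC. */
  public void tick() {
    Map<String, Long> merged = new HashMap<>();
    for (ContextMetrics context : contexts) {
      context.drain().forEach((name, value) -> merged.merge(name, value, Long::sum));
    }
    sink.heartbeat(merged);
  }
}
```

With two registered contexts that record 10 and 5 under the same hypothetical metric name, one call to tick() delivers a single payload with the value 15 rather than two separate heartbeats. In practice tick() would be driven by a scheduled executor at the configured heartbeat interval.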

Environment

None

Activity

Calvin Jia
December 3, 2018, 7:29 PM

Would this problem be solved after we switch to gRPC (since we will have multiplexing on the same connection)?

Also, I think the MetricsSystem cannot be a singleton if we want to have a modular client where you could create instances of Alluxio clients that talk to different Alluxio masters.

Andrew Audibert
December 3, 2018, 8:37 PM

The problem will be reduced if gRPC reuses the same connection for the heartbeats, but the extra client heartbeats still put unnecessary stress on the client and master.

If we split MetricsSystem to be client-level instead of global, this ticket isn’t as important, though batching metrics heartbeats could still help reduce stress on the master.

Calvin Jia
December 4, 2018, 9:36 PM

That makes sense. I think the main overhead currently is the number of connections.

Assignee

Unassigned

Reporter

Andrew Audibert

Labels

None

Components

Affects versions

Priority

Major