HBASE-30135: Improve CacheAwareLoadBalancer to simulate low cache ratio regions as cached in candidate servers with enough cache space#8221

Open
wchevreuil wants to merge 4 commits into
apache:masterfrom
wchevreuil:HBASE-30135

@wchevreuil
Contributor

No description provided.

…io regions as cached in candidate servers with enough cache space

Change-Id: I4212296994ebd0411982596796e833b80c537a38
Change-Id: Ia1e58a8437356f124595119987d0a08c8ed5222f
Change-Id: Ibba74c440715c5673993b23f30ed5852cc854a54
Contributor

@petersomogyi left a comment

LGTM. I have a comment on simulatedRatio field access.

private float potentialCacheRatioAfterMove;
private float minFreeCacheSpaceFactor;

private BigDecimal simulatedRatio = new BigDecimal(0);

This simulatedRatio field is accessed from multiple places. Can it cause a race condition?

  • CacheAwareCandidateGenerator.generate()
  • CacheAwareSkewnessCandidateGenerator.pickRandomRegions()
  • CacheAwareCostFunction.regionMoved()

Contributor Author

It's not thread safe, for sure, but as of now the balancer is a single-threaded operation (multiple balancers never run in parallel).

Unfortunately, we need to keep this global variable in order to roll back the cache ratio increment if the plan isn't accepted.

In StochasticLoadBalancer.balanceTable we do:

  1. Pick a generator and cost function at random;
  2. Call generate() on the generator, which resets that variable to 0 and returns a move plan;
  3. Simulate the plan, which calls regionMoved(). If the chosen cost function was the CacheAwareCostFunction, we calculate the impact of that plan on the overall cache ratio. This impact is what we save in the simulatedRatio field;
  4. Calculate the total cost with this plan. If it's lower than the lowest cost so far, keep it. Otherwise, roll it back by using the same cost function but inverting the to/from servers order. If it's the CacheAwareCostFunction, regionMoved() is called again and enters the branch that reverts the value from the cacheRatio;
  5. Iterate again; the variable is reset in step 2.

Without this, we would need to always recalculate all individual regions/servers ratios on every new plan simulation (every iteration of the StochasticLoadBalancer loop), which is costly.
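
To illustrate the loop described above, here is a minimal, self-contained sketch of the simulate-then-roll-back pattern. It is not the code from this PR: the class SimulateAndRollbackSketch, the CacheRatioCost type, the hasFreeCache flag, the pending flag, and the rule "a server with free cache counts the region as fully cached" are all hypothetical stand-ins chosen for illustration. Only the overall flow (reset on generate, save the delta on regionMoved, revert on the inverted move) follows the description in the comment.

```java
import java.util.Random;

public class SimulateAndRollbackSketch {

  /** Hypothetical cost function that maintains an aggregate cache ratio incrementally. */
  static final class CacheRatioCost {
    private final double[][] cachedRatio; // cachedRatio[region][server], each in [0, 1]
    private final boolean[] hasFreeCache; // assumption: server could fully cache one more region
    private final int[] host;             // host[region] = server currently holding the region
    private final double[] effective;     // ratio currently counted for each region
    private double total;                 // running sum of effective[]
    private double simulatedDelta;        // plays the role of the simulatedRatio field
    private boolean pending;              // true while the last simulated move can be reverted

    CacheRatioCost(double[][] cachedRatio, boolean[] hasFreeCache, int[] host) {
      this.cachedRatio = cachedRatio;
      this.hasFreeCache = hasFreeCache;
      this.host = host.clone();
      this.effective = new double[host.length];
      for (int r = 0; r < host.length; r++) {
        effective[r] = cachedRatio[r][host[r]];
        total += effective[r];
      }
    }

    /** Step 2: a new candidate is generated, so forget the previous delta. */
    void resetSimulation() {
      simulatedDelta = 0.0;
      pending = false;
    }

    /** Steps 3 and 4: the first call simulates the move, an inverted second call reverts it. */
    void regionMoved(int region, int from, int to) {
      if (pending) {
        // Rollback branch: undo exactly what the simulated move added.
        total -= simulatedDelta;
        effective[region] -= simulatedDelta;
        pending = false;
      } else {
        // Forward branch: treat the region as fully cached if the target has room.
        double target = hasFreeCache[to] ? 1.0 : cachedRatio[region][to];
        simulatedDelta = target - effective[region];
        total += simulatedDelta;
        effective[region] = target;
        pending = true;
      }
      host[region] = to;
    }

    double cost() { return 1.0 - total / host.length; }
    int hostOf(int region) { return host[region]; }
    int regionCount() { return host.length; }
  }

  /** Simplified single-threaded search loop: keep a move only if it lowers the cost. */
  static double balance(CacheRatioCost cost, int servers, int steps, long seed) {
    Random rnd = new Random(seed);
    double best = cost.cost();
    for (int i = 0; i < steps; i++) {
      cost.resetSimulation();
      int region = rnd.nextInt(cost.regionCount());
      int from = cost.hostOf(region);
      int to = rnd.nextInt(servers);
      if (to == from) continue;
      cost.regionMoved(region, from, to);
      double candidate = cost.cost();
      if (candidate < best) {
        best = candidate;                   // accept the plan
      } else {
        cost.regionMoved(region, to, from); // reject: inverted call reverts the delta
      }
    }
    return best;
  }

  public static void main(String[] args) {
    CacheRatioCost cost = new CacheRatioCost(
        new double[][] {{0.2, 0.0}, {1.0, 0.0}}, new boolean[] {false, true}, new int[] {0, 0});
    System.out.println("initial cost = " + cost.cost());
    System.out.println("best cost    = " + balance(cost, 2, 50, 42L));
  }
}
```

The point of the saved delta in this sketch is that a rejected move costs one subtraction to undo, instead of a recomputation over every region and server. The trade-off is the one raised in the review: the saved state is only safe while a single thread drives the loop.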

Change-Id: I15ef3637b52b027cc2ebb5d1d0511dd6f9edf47a