HBASE-30161 Add paginated, single-RPC RegionLocator.getRegionLocations(startKey, limit) API for bulk meta-cache warmup #8236
base: branch-2
Changes from all commits
e62bee5
3120495
3feadd7
dc6f860
17814c3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,6 +21,7 @@ | |
| import java.io.IOException; | ||
| import java.util.List; | ||
| import java.util.stream.Collectors; | ||
| import org.apache.hadoop.hbase.HConstants; | ||
| import org.apache.hadoop.hbase.HRegionLocation; | ||
| import org.apache.hadoop.hbase.TableName; | ||
| import org.apache.hadoop.hbase.util.Pair; | ||
|
|
@@ -130,6 +131,54 @@ default List<HRegionLocation> getRegionLocations(byte[] row) throws IOException | |
| */ | ||
| List<HRegionLocation> getAllRegionLocations() throws IOException; | ||
|
|
||
| /** | ||
| * Bulk lookup of region locations from {@code hbase:meta} in a single RPC, starting at | ||
| * {@code startKey} (region start-key boundary, inclusive) and returning at most {@code limit} | ||
| * regions in start-key order. | ||
| * <p/> | ||
| * The returned list includes all replicas of each region (matching | ||
| * {@link #getAllRegionLocations()}), and the result is also written to the connection's region | ||
| * location cache. | ||
| * <p/> | ||
| * Ordering: regions are returned in ascending region start-key order (the natural order of | ||
| * {@code hbase:meta} rows for a single table). Within each region, replicas are returned in | ||
| * ascending replica-id order (replica 0, then 1, then 2, ...). Split parents and offline regions | ||
| * are filtered out, which may cause a page to contain fewer than {@code limit} regions but never | ||
|
Contributor
Where is this filtering happening? I didn't see any test coverage either. The existing methods don't do any such filtering, correct? So is this even needed?
Contributor
Author
Filtering is already implemented and happens inside
Contributor
Author
I am reusing the existing filtering logic, so no new test coverage is needed. |
||
| * disturbs ordering of the survivors. | ||
| * <p/> | ||
| * To page through all regions of a table, call repeatedly passing | ||
| * {@code last.getRegion().getEndKey()} as the next {@code startKey}, where {@code last} is the | ||
| * final element of the previous response. All replicas of a region share the same | ||
| * {@link RegionInfo}, so the last entry's end key is the correct cursor regardless of which | ||
| * replica it is. Pass {@code null} for the first call. Stop paging when the returned list is | ||
| * empty or when the last region's end key is {@link HConstants#EMPTY_END_ROW} (zero-length) - | ||
| * that signals the end of the table; passing it back in would re-scan from the beginning since by | ||
| * convention an empty start key means "from the first region". | ||
| * <p/> | ||
| * Unlike {@link #getAllRegionLocations()}, this method performs at most one RPC against | ||
| * {@code hbase:meta} per invocation, so its latency is bounded by {@code limit} rather than table | ||
| * size. Suitable for callers that wrap meta lookups in a lock with a fixed timeout, e.g. for bulk | ||
| * region-cache warmup. | ||
| * <p/> | ||
| * This method is optional. Implementations that cannot support paginated lookups should throw | ||
| * {@link UnsupportedOperationException} (the default behavior); callers should fall back to | ||
| * {@link #getAllRegionLocations()} in that case. | ||
| * @param startKey region start-key to begin scanning from (inclusive); {@code null} or empty | ||
| * starts from the first region | ||
| * @param limit maximum number of regions to return; if {@code <= 0}, falls back to | ||
| * {@code hbase.meta.scanner.caching} | ||
| * @return up to {@code limit} {@link HRegionLocation}s in start-key order, possibly empty when no | ||
| * more regions exist | ||
| * @throws IOException if a remote or network exception occurs | ||
| * @throws UnsupportedOperationException if this implementation does not support paginated lookups | ||
| */ | ||
| default List<HRegionLocation> getRegionLocationsPage(byte[] startKey, int limit) | ||
| throws IOException { | ||
| throw new UnsupportedOperationException( | ||
| "getRegionLocationsPage(byte[], int) is not supported by this RegionLocator;" | ||
| + " fall back to getAllRegionLocations()"); | ||
| } | ||
|
|
||
| /** | ||
| * Gets the starting row key for every region in the currently open table. | ||
| * <p> | ||
|
|
||
I wonder why Spotless didn't take care of the spaces.

I think this spacing is valid since we are listing method parameters, which is why Spotless didn't flag it. Viewed in an editor, it is aligned correctly.
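
For readers following the paging contract described in the Javadoc above, here is a minimal, self-contained sketch of the caller-side loop. It does not use HBase classes: `Region`, `PagedLocator`, `InMemoryLocator`, `sampleTable`, and `warmUp` are hypothetical stand-ins invented for illustration, and the in-memory locator only approximates the documented behavior (start-key order, at most `limit` entries per call, zero-length end key on the last region). The sketch assumes a positive page size and ignores replicas. It shows the three rules from the Javadoc: pass `null` first, use the last entry's end key as the next cursor, and stop on an empty page or an empty end key. It also shows the suggested fallback when paging is unsupported.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class RegionPagingSketch {

  /** Hypothetical stand-in for a region; the real API returns HRegionLocation. */
  static final class Region {
    final byte[] startKey;
    final byte[] endKey; // zero-length marks the last region of the table

    Region(String startKey, String endKey) {
      this.startKey = startKey.getBytes(StandardCharsets.UTF_8);
      this.endKey = endKey.getBytes(StandardCharsets.UTF_8);
    }
  }

  /** Hypothetical stand-in for the two RegionLocator methods discussed in the PR. */
  interface PagedLocator {
    List<Region> getAllRegionLocations();

    // Mirrors the PR's default: paging is optional and unsupported unless overridden.
    default List<Region> getRegionLocationsPage(byte[] startKey, int limit) {
      throw new UnsupportedOperationException("paging not supported");
    }
  }

  /** In-memory locator over a start-key-sorted list; counts page calls. */
  static final class InMemoryLocator implements PagedLocator {
    private final List<Region> regions;
    int pageCalls = 0;

    InMemoryLocator(List<Region> regions) {
      this.regions = regions;
    }

    @Override
    public List<Region> getAllRegionLocations() {
      return new ArrayList<>(regions);
    }

    @Override
    public List<Region> getRegionLocationsPage(byte[] startKey, int limit) {
      pageCalls++;
      boolean fromStart = startKey == null || startKey.length == 0;
      List<Region> page = new ArrayList<>();
      for (Region r : regions) {
        if (page.size() >= limit) {
          break;
        }
        // startKey is an inclusive region start-key boundary.
        if (fromStart || compare(r.startKey, startKey) >= 0) {
          page.add(r);
        }
      }
      return page;
    }
  }

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /** Four contiguous regions: [""..b), [b..d), [d..f), [f..""). */
  static List<Region> sampleTable() {
    List<Region> regions = new ArrayList<>();
    regions.add(new Region("", "b"));
    regions.add(new Region("b", "d"));
    regions.add(new Region("d", "f"));
    regions.add(new Region("f", ""));
    return regions;
  }

  /** Pages through every region; falls back to the unpaged call if paging is unsupported. */
  static List<Region> warmUp(PagedLocator locator, int pageSize) {
    List<Region> all = new ArrayList<>();
    byte[] cursor = null; // null means "start from the first region"
    try {
      while (true) {
        List<Region> page = locator.getRegionLocationsPage(cursor, pageSize);
        if (page.isEmpty()) {
          break; // no more regions
        }
        all.addAll(page);
        byte[] endKey = page.get(page.size() - 1).endKey;
        if (endKey.length == 0) {
          break; // empty end key: passing it back would restart from the first region
        }
        cursor = endKey;
      }
      return all;
    } catch (UnsupportedOperationException e) {
      return locator.getAllRegionLocations();
    }
  }
}
```

The explicit check for a zero-length end key matters: without it, the loop would pass an empty cursor back in and scan the table again from the first region.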
