Supply the max_schema/data_region_group_num param to modify schema when create or alter database#17988
Supply the max_schema/data_region_group_num param to modify schema when create or alter database#17988zerolbsony wants to merge 3 commits into
Conversation
…en create or alter database
a70914c to
e3cd74a
Compare
…al connection when the connection is closed
CRZbulabula
left a comment
There was a problem hiding this comment.
A few non-blocking cleanups. One of them (the orphaned min-based ALTER validation) is on pre-existing lines that aren't part of this diff, so I'm noting it here rather than inline:
Orphaned min-based ALTER validation — ClusterSchemaManager.java:242 and :256 (alterDatabase): the isSetMinSchemaRegionGroupNum() / isSetMinDataRegionGroupNum() validation blocks (the "could only be increased" checks) are now unreachable from the SQL path, because after this PR no SQL statement ever sets minSchemaRegionGroupNum/minDataRegionGroupNum — the keyword now sets max* instead. These branches are now only hit by internal/Thrift callers that still populate the min fields. This isn't wrong, but it's easy to mistake for the active validation. Consider adding a comment clarifying it only applies to non-SQL callers, or confirm nothing else still sends min* via ALTER.
The inline comments below cover the remaining items.
| } | ||
|
|
||
| private static TSStatus validateMaxRegionGroupNum( | ||
| final String database, |
There was a problem hiding this comment.
Unused parameter. The database parameter of validateMaxRegionGroupNum is never referenced in the method body (it reads policy/default/field-name purely from consensusGroupType and CONF). Likewise validateMaxRegionGroupNumOnCreation only forwards databaseSchema.getName() into it. Either drop the parameter, or use it in the error message for clearer diagnostics.
| ? databaseSchema.getMaxSchemaRegionGroupNum() | ||
| : databaseSchema.getMaxDataRegionGroupNum(), | ||
| true); | ||
| return status.getCode() == TSStatusCode.SUCCESS_STATUS.getStatusCode() ? previousError : status; |
There was a problem hiding this comment.
Error-aggregation pattern is replace, not accumulate. validateMaxRegionGroupNumOnCreation returns previousError on success and status on failure, and the two call sites chain it as errorResp = validateMaxRegionGroupNumOnCreation(schema, SchemaRegion, errorResp) then again for DataRegion. If the SchemaRegion validation already produced a non-null error and the DataRegion validation also fails, the SchemaRegion error is silently overwritten (DataRegion wins). With current inputs only one error can occur so it's harmless today, but the pattern reads as if it accumulates when it actually replaces. The alter path already does the clearer if (status != OK) return status; per dimension — applying the same shape here on create would avoid the subtlety.
| } | ||
| } | ||
| connectionToUse = null; | ||
| statement = null; |
There was a problem hiding this comment.
Unrelated drive-by change. This statement = null; (and the PR's head commit "The old statement shall be reclaimed in the same manner as the original connection") is a statement/connection-lifecycle fix that is unrelated to the MAX_*_REGION_GROUP_NUM feature this PR is about. It looks like a legitimate resource-cleanup fix, but please either split it into its own PR or call it out in the PR description so reviewers and git blame aren't confused about why it's here.
| "select database, max_schema_region_group_num, max_data_region_group_num " | ||
| + "from information_schema.databases where database = 'test_create'"), | ||
| "database,max_schema_region_group_num,max_data_region_group_num,", | ||
| Collections.singleton("test_create,3,4,")); |
There was a problem hiding this comment.
To verify the demand number is set correctly, u should not only test the metadata (the number obtained via information_schema or show databases), but also write some data and check if the region groups are created successfully.
Not sure if we have these CIs currently, you should check and append when necessary.
|
|
||
| if (databaseSchema.isSetMinSchemaRegionGroupNum()) { | ||
| // Validate alter SchemaRegionGroupNum | ||
| final int minSchemaRegionGroupNum = | ||
| getMinRegionGroupNum(databaseSchema.getName(), TConsensusGroupType.SchemaRegion); | ||
| if (databaseSchema.getMinSchemaRegionGroupNum() <= minSchemaRegionGroupNum) { | ||
| result = new TSStatus(TSStatusCode.DATABASE_CONFIG_ERROR.getStatusCode()); | ||
| result.setMessage( | ||
| String.format( | ||
| "Failed to alter database. The SchemaRegionGroupNum could only be increased. " | ||
| + "Current SchemaRegionGroupNum: %d, Alter SchemaRegionGroupNum: %d", | ||
| minSchemaRegionGroupNum, databaseSchema.getMinSchemaRegionGroupNum())); | ||
| return result; | ||
| } | ||
| } | ||
| if (databaseSchema.isSetMinDataRegionGroupNum()) { | ||
| // Validate alter DataRegionGroupNum | ||
| final int minDataRegionGroupNum = | ||
| getMinRegionGroupNum(databaseSchema.getName(), TConsensusGroupType.DataRegion); | ||
| if (databaseSchema.getMinDataRegionGroupNum() <= minDataRegionGroupNum) { | ||
| result = new TSStatus(TSStatusCode.DATABASE_CONFIG_ERROR.getStatusCode()); | ||
| result.setMessage( | ||
| String.format( | ||
| "Failed to alter database. The DataRegionGroupNum could only be increased. " | ||
| + "Current DataRegionGroupNum: %d, Alter DataRegionGroupNum: %d", | ||
| minDataRegionGroupNum, databaseSchema.getMinDataRegionGroupNum())); | ||
| return result; | ||
| } | ||
| } |
There was a problem hiding this comment.
Both min region group num params are deprecated, you can delete them.
| } catch (final DatabaseNotExistsException e) { | ||
| // ignore the pre deleted database | ||
| continue; | ||
| int maxSchemaRegionGroupNum = databaseSchema.getMaxSchemaRegionGroupNum(); |
There was a problem hiding this comment.
Current implementation of this for-loop can be further optimized:
- the adjustment logic for both schema and data region groups share lots of codes, can be extracted to some common functions.
- the
calcMaxRegionGroupNumcan be invoked at most twice.
| // TODO: Support alter other fields | ||
| if (alterSchema.isSetMinSchemaRegionGroupNum()) { | ||
| currentSchema.setMinSchemaRegionGroupNum(alterSchema.getMinSchemaRegionGroupNum()); | ||
| currentSchema.setMaxSchemaRegionGroupNum( | ||
| Math.max( | ||
| currentSchema.getMinSchemaRegionGroupNum(), | ||
| currentSchema.getMaxSchemaRegionGroupNum())); | ||
| LOGGER.info( | ||
| ConfigNodeMessages.ADJUSTREGIONGROUPNUM_THE_MINIMUM_NUMBER_OF_SCHEMAREGIONGROUPS_FOR, | ||
| currentSchema.getName(), | ||
| currentSchema.getMinSchemaRegionGroupNum()); | ||
| LOGGER.info( | ||
| ConfigNodeMessages.ADJUSTREGIONGROUPNUM_THE_MAXIMUM_NUMBER_OF_SCHEMAREGIONGROUPS_FOR, | ||
| currentSchema.getName(), | ||
| currentSchema.getMaxSchemaRegionGroupNum()); | ||
| } | ||
| if (alterSchema.isSetMinDataRegionGroupNum()) { | ||
| currentSchema.setMinDataRegionGroupNum(alterSchema.getMinDataRegionGroupNum()); | ||
| currentSchema.setMaxDataRegionGroupNum( | ||
| Math.max( | ||
| currentSchema.getMinDataRegionGroupNum(), | ||
| currentSchema.getMaxDataRegionGroupNum())); | ||
| LOGGER.info( | ||
| ConfigNodeMessages.ADJUSTREGIONGROUPNUM_THE_MINIMUM_NUMBER_OF_DATAREGIONGROUPS_FOR, | ||
| currentSchema.getName(), | ||
| currentSchema.getMinDataRegionGroupNum()); | ||
| LOGGER.info( | ||
| ConfigNodeMessages.ADJUSTREGIONGROUPNUM_THE_MAXIMUM_NUMBER_OF_DATAREGIONGROUPS_FOR, | ||
| currentSchema.getName(), | ||
| currentSchema.getMaxDataRegionGroupNum()); |
There was a problem hiding this comment.
Similar as above, these are useless codes now. You can delete them.
There was a problem hiding this comment.
Pull request overview
This PR updates database create/alter semantics to support explicitly setting maximum region-group quotas via new MAX_SCHEMA_REGION_GROUP_NUM and MAX_DATA_REGION_GROUP_NUM properties, and wires the values through parsing, planning, ConfigNode validation/persistence, and integration tests.
Changes:
- Replace the deprecated
*_REGION_GROUP_NUMSQL attributes withMAX_*_REGION_GROUP_NUMin the ANTLR grammar and parser/statement plumbing. - Add ConfigNode-side validation and persistence for altering max region-group quotas, including policy gating (CUSTOM vs AUTO).
- Update and add integration tests to cover creation, alteration, mixed policy behavior, and deprecated-syntax rejection.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| iotdb-core/metrics/interface/src/main/java/org/apache/iotdb/metrics/reporter/iotdb/IoTDBSessionReporter.java | Stops creating the internal metrics DB with deprecated region-group properties. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/statement/metadata/DatabaseSchemaStatement.java | Renames statement fields to max*RegionGroupNum and updates string output to include them. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/parser/ASTVisitor.java | Parses new MAX_*_REGION_GROUP_NUM attributes into the statement object. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/config/TableConfigTaskVisitor.java | Maps relational DB properties to max*RegionGroupNum in TDatabaseSchema. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/config/metadata/relational/AbstractDatabaseTask.java | Renames supported property keys to max_schema_region_group_num / max_data_region_group_num. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/config/metadata/DatabaseSchemaTask.java | Propagates max region-group values into TDatabaseSchema for tree-model create/alter. |
| iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/persistence/schema/ClusterSchemaInfo.java | Persists altered max region-group values into stored database schemas. |
| iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/schema/ClusterSchemaManager.java | Adds validation for setting max region-group quotas and adjusts auto-tuning behavior when policies are CUSTOM. |
| iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/partition/PartitionManager.java | Uses maxRegionGroupNum as the target when extending region groups under CUSTOM policy. |
| iotdb-core/antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/SqlLexer.g4 | Introduces lexer tokens for MAX_SCHEMA_REGION_GROUP_NUM / MAX_DATA_REGION_GROUP_NUM. |
| iotdb-core/antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/IoTDBSqlParser.g4 | Updates database attribute grammar to accept only the new MAX keys. |
| iotdb-core/antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/IdentifierParser.g4 | Adds MAX region-group keys to the reserved keywords list. |
| integration-test/src/test/java/org/apache/iotdb/relational/it/schema/IoTDBDatabaseMixedRegionGroupPolicyIT.java | New IT verifying AUTO policy still auto-adjusts when the other policy is CUSTOM. |
| integration-test/src/test/java/org/apache/iotdb/relational/it/schema/IoTDBDatabaseMaxRegionGroupNumIT.java | New IT covering create/alter with MAX keys and rejecting deprecated keys. |
| integration-test/src/test/java/org/apache/iotdb/relational/it/schema/IoTDBDatabaseIT.java | Updates existing ITs and adds coverage for rejecting max keys under AUTO policy. |
| integration-test/src/test/java/org/apache/iotdb/pipe/it/dual/treemodel/auto/enhanced/IoTDBPipeIdempotentIT.java | Updates pipe IT setup/payloads to use MAX keys and CUSTOM policies. |
| integration-test/src/test/java/org/apache/iotdb/pipe/it/dual/tablemodel/manual/enhanced/IoTDBPipeMetaIT.java | Removes use of deprecated alter-database properties in test SQL. |
| integration-test/src/test/java/org/apache/iotdb/db/it/utils/TestUtils.java | Resets statement state on retry after SQL failures. |
| integration-test/src/test/java/org/apache/iotdb/confignode/it/database/IoTDBDatabaseRegionControlIT.java | Updates SQL to use MAX keys and adds a test ensuring deprecated SQL is rejected. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| final boolean isSchemaRegion = TConsensusGroupType.SchemaRegion.equals(consensusGroupType); | ||
| final String fieldName = isSchemaRegion ? "MaxSchemaRegionGroupNum" : "MaxDataRegionGroupNum"; | ||
|
|
||
| final int allocatedRegionGroupCount; | ||
| try { | ||
| allocatedRegionGroupCount = | ||
| getPartitionManager().getRegionGroupCount(database, consensusGroupType); | ||
| } catch (final DatabaseNotExistsException e) { | ||
| return new TSStatus(TSStatusCode.DATABASE_NOT_EXIST.getStatusCode()) | ||
| .setMessage(e.getMessage()); | ||
| } | ||
| if (maxRegionGroupNum < allocatedRegionGroupCount) { | ||
| return new TSStatus(TSStatusCode.DATABASE_CONFIG_ERROR.getStatusCode()) | ||
| .setMessage( | ||
| String.format( | ||
| "%s should be greater than or equal to allocated %sRegionGroupNum: %d.", | ||
| fieldName, isSchemaRegion ? "Schema" : "Data", allocatedRegionGroupCount)); | ||
| } | ||
|
|
||
| return StatusUtils.OK; |
| + dataReplicationFactor | ||
| + ", timePartitionInterval=" | ||
| + timePartitionInterval | ||
| + ", schemaRegionGroupNum=" | ||
| + schemaRegionGroupNum | ||
| + ", dataRegionGroupNum=" | ||
| + dataRegionGroupNum | ||
| + ", maxSchemaRegionGroupNum=" | ||
| + maxSchemaRegionGroupNum | ||
| + ", maxDataRegionGroupNum=" | ||
| + maxDataRegionGroupNum |
| connectionToUse = null; | ||
| statement = null; |
Adjust the maximum quotas for database schemas based on the MAX_DATA_REGION_GROUP_NUM and MAX_SCHEMA_REGION_GROUP_NUM parameters:
ALTER DATABASE root.sg WITH MAX_DATA_REGION_GROUP_NUM = 4, MAX_SCHEMA_REGION_GROUP_NUM = 3;