Class RelMdUtil

java.lang.Object
org.apache.calcite.rel.metadata.RelMdUtil

public class RelMdUtil extends Object
RelMdUtil provides utility methods used by the metadata provider methods.
  • Field Details

    • ARTIFICIAL_SELECTIVITY_FUNC

      public static final SqlFunction ARTIFICIAL_SELECTIVITY_FUNC
  • Method Details

    • makeSemiJoinSelectivityRexNode

      public static RexNode makeSemiJoinSelectivityRexNode(RelMetadataQuery mq, Join rel)
      Creates a RexNode that stores a selectivity value corresponding to the selectivity of a semijoin. This can be added to a filter to simulate the effect of the semijoin during costing, but should never appear in a real plan since it has no physical implementation.
      Parameters:
      rel - the semijoin of interest
      Returns:
      the constructed RexNode
    • getSelectivityValue

      public static double getSelectivityValue(RexNode artificialSelectivityFuncNode)
      Returns the selectivity value stored in a call.
      Parameters:
      artificialSelectivityFuncNode - Call containing the selectivity value
      Returns:
      selectivity value
    • computeSemiJoinSelectivity

      public static double computeSemiJoinSelectivity(RelMetadataQuery mq, RelNode factRel, RelNode dimRel, Join rel)
      Computes the selectivity of a semijoin filter if it is applied on a fact table. The computation is based on the selectivity of the dimension table/columns and the number of distinct values in the fact table columns.
      Parameters:
      factRel - fact table participating in the semijoin
      dimRel - dimension table participating in the semijoin
      rel - semijoin rel
      Returns:
      calculated selectivity
    • computeSemiJoinSelectivity

      public static double computeSemiJoinSelectivity(RelMetadataQuery mq, RelNode factRel, RelNode dimRel, List<Integer> factKeyList, List<Integer> dimKeyList)
      Computes the selectivity of a semijoin filter if it is applied on a fact table. The computation is based on the selectivity of the dimension table/columns and the number of distinct values in the fact table columns.
      Parameters:
      factRel - fact table participating in the semijoin
      dimRel - dimension table participating in the semijoin
      factKeyList - LHS keys used in the filter
      dimKeyList - RHS keys used in the filter
      Returns:
      calculated selectivity
    • areColumnsDefinitelyUnique

      public static boolean areColumnsDefinitelyUnique(RelMetadataQuery mq, RelNode rel, ImmutableBitSet colMask)
      Returns true if the columns represented in a bit mask are definitely known to form a unique column set.
      Parameters:
      rel - the relational expression that the column mask corresponds to
      colMask - bit mask containing columns that will be tested for uniqueness
      Returns:
      true if bit mask represents a unique column set; false if not (or if no metadata is available)
    • areColumnsUnique

      public static @Nullable Boolean areColumnsUnique(RelMetadataQuery mq, RelNode rel, List<RexInputRef> columnRefs)
    • areColumnsDefinitelyUnique

      public static boolean areColumnsDefinitelyUnique(RelMetadataQuery mq, RelNode rel, List<RexInputRef> columnRefs)
    • areColumnsDefinitelyUniqueWhenNullsFiltered

      public static boolean areColumnsDefinitelyUniqueWhenNullsFiltered(RelMetadataQuery mq, RelNode rel, ImmutableBitSet colMask)
      Returns true if the columns represented in a bit mask are definitely known to form a unique column set, when nulls have been filtered from the columns.
      Parameters:
      rel - the relational expression that the column mask corresponds to
      colMask - bit mask containing columns that will be tested for uniqueness
      Returns:
      true if bit mask represents a unique column set; false if not (or if no metadata is available)
    • areColumnsUniqueWhenNullsFiltered

      public static @Nullable Boolean areColumnsUniqueWhenNullsFiltered(RelMetadataQuery mq, RelNode rel, List<RexInputRef> columnRefs)
    • areColumnsDefinitelyUniqueWhenNullsFiltered

      public static boolean areColumnsDefinitelyUniqueWhenNullsFiltered(RelMetadataQuery mq, RelNode rel, List<RexInputRef> columnRefs)
    • setLeftRightBitmaps

      public static void setLeftRightBitmaps(ImmutableBitSet groupKey, ImmutableBitSet.Builder leftMask, ImmutableBitSet.Builder rightMask, int nFieldsOnLeft)
      Separates a bit-mask representing a join into masks representing the left and right inputs into the join.
      Parameters:
      groupKey - original bit-mask
      leftMask - left bit-mask to be set
      rightMask - right bit-mask to be set
      nFieldsOnLeft - number of fields in the left input
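      The split can be sketched with java.util.BitSet standing in for ImmutableBitSet and its builder. This is a standalone illustration, not Calcite's code: the class and method names are hypothetical, and it assumes that bits belonging to the right input are rebased by subtracting nFieldsOnLeft.

```java
import java.util.BitSet;

/** Hypothetical stand-in for the join bit-mask split; not part of Calcite. */
final class JoinMaskSplitSketch {
  /**
   * Copies each set bit of groupKey into leftMask or rightMask.
   * Assumption: right-side bits are shifted down by nFieldsOnLeft so that
   * they index into the right input's own field list.
   */
  static void split(BitSet groupKey, BitSet leftMask, BitSet rightMask,
      int nFieldsOnLeft) {
    for (int bit = groupKey.nextSetBit(0); bit >= 0;
        bit = groupKey.nextSetBit(bit + 1)) {
      if (bit < nFieldsOnLeft) {
        leftMask.set(bit);                  // field comes from the left input
      } else {
        rightMask.set(bit - nFieldsOnLeft); // field comes from the right input
      }
    }
  }
}
```

      Under that assumption, a key on join fields 1, 3 and 4 with three fields on the left becomes field 1 on the left and fields 0 and 1 on the right.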
    • numDistinctVals

      public static @PolyNull Double numDistinctVals(@PolyNull Double domainSize, @PolyNull Double numSelected)
      Returns the expected number of distinct values when numSelected values are selected from a domain that contains domainSize distinct values.

      Note that in the case where domainSize == numSelected, it's not true that the return value should be domainSize. If you pick 100 random values between 1 and 100, you'll most likely end up with fewer than 100 distinct values, because you'll pick some values more than once.

      The implementation computes an unbiased estimate of the number of distinct values obtained by making numSelected selections (with replacement) from a universe set of domainSize values.

      Parameters:
      domainSize - Size of the universe set
      numSelected - The number of selections
      Returns:
      the expected number of distinct values, or null if either argument is null
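      The estimate described above can be illustrated with the textbook expectation for sampling with replacement, N * (1 - (1 - 1/N)^n). The following is a minimal sketch under that assumption, with hypothetical names; it is not necessarily the exact computation Calcite performs and it omits the null handling of the real method.

```java
/** Hypothetical helper illustrating the expectation; not Calcite's code. */
final class DistinctValuesSketch {
  /**
   * Expected number of distinct values seen after numSelected draws, with
   * replacement, from domainSize equally likely values:
   * N * (1 - (1 - 1/N)^n).
   */
  static double expectedDistinct(double domainSize, double numSelected) {
    if (domainSize <= 0 || numSelected <= 0) {
      return 0.0; // nothing to draw from, or nothing drawn
    }
    // Probability that one particular value is never drawn.
    double neverDrawn = Math.pow(1.0 - 1.0 / domainSize, numSelected);
    return domainSize * (1.0 - neverDrawn);
  }
}
```

      For domainSize = numSelected = 100 this formula gives about 63 distinct values, which matches the observation that repeated picks leave fewer than 100.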
    • capInfinity

      public static double capInfinity(Double d)
      Caps a double value at Double.MAX_VALUE if it is currently infinite.
      Parameters:
      d - the Double object
      Returns:
      the double value if it's not infinity; else Double.MAX_VALUE
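      A minimal standalone sketch of this capping, with a hypothetical class name. Treating negative infinity the same way as positive infinity is an assumption of the sketch, since the description only says "infinity".

```java
/** Hypothetical illustration of capping an infinite value; not Calcite's code. */
final class CapInfinitySketch {
  static double cap(Double d) {
    // Double.isInfinite() is true for both positive and negative infinity;
    // this sketch maps either one to Double.MAX_VALUE (an assumption).
    return d.isInfinite() ? Double.MAX_VALUE : d;
  }
}
```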
    • guessSelectivity

      public static double guessSelectivity(@Nullable RexNode predicate)
      Returns default estimates for selectivities, in the absence of stats.
      Parameters:
      predicate - predicate for which selectivity will be computed; null means true, so gives selectivity of 1.0
      Returns:
      estimated selectivity
    • guessSelectivity

      public static double guessSelectivity(@Nullable RexNode predicate, boolean artificialOnly)
      Returns default estimates for selectivities, in the absence of stats.
      Parameters:
      predicate - predicate for which selectivity will be computed; null means true, so gives selectivity of 1.0
      artificialOnly - return only the selectivity contribution from artificial nodes
      Returns:
      estimated selectivity
    • unionPreds

      public static @Nullable RexNode unionPreds(RexBuilder rexBuilder, @Nullable RexNode pred1, @Nullable RexNode pred2)
      AND's two predicates together, either of which may be null, removing redundant filters.
      Parameters:
      rexBuilder - rexBuilder used to construct AND'd RexNode
      pred1 - first predicate
      pred2 - second predicate
      Returns:
      the AND'd predicate, or the other predicate if one of them is null
    • minusPreds

      public static @Nullable RexNode minusPreds(RexBuilder rexBuilder, @Nullable RexNode pred1, @Nullable RexNode pred2)
      Takes the difference between two predicates, removing from the first any predicates also in the second.
      Parameters:
      rexBuilder - rexBuilder used to construct AND'd RexNode
      pred1 - first predicate
      pred2 - second predicate
      Returns:
      MINUS'd predicate list
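      The behaviour of unionPreds and minusPreds can be pictured by modelling a predicate as a list of conjuncts. The sketch below is a hypothetical model that uses strings in place of RexNode and does not call Calcite; how Calcite represents an empty difference is not modelled here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model: a predicate is a list of conjuncts; null means none. */
final class PredicateSetSketch {
  /** ANDs two conjunct lists, keeping each distinct conjunct only once. */
  static List<String> union(List<String> pred1, List<String> pred2) {
    if (pred1 == null) {
      return pred2;
    }
    if (pred2 == null) {
      return pred1;
    }
    // LinkedHashSet drops duplicates while preserving first-seen order.
    Set<String> all = new LinkedHashSet<>(pred1);
    all.addAll(pred2);
    return new ArrayList<>(all);
  }

  /** Removes from pred1 every conjunct that also occurs in pred2. */
  static List<String> minus(List<String> pred1, List<String> pred2) {
    if (pred1 == null || pred2 == null) {
      return pred1;
    }
    List<String> remaining = new ArrayList<>(pred1);
    remaining.removeAll(pred2);
    return remaining;
  }
}
```

      In this model, the union of [x > 1, y = 2] and [y = 2, z < 3] is [x > 1, y = 2, z < 3], and the difference of the same two lists is [x > 1].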
    • setAggChildKeys

      public static void setAggChildKeys(ImmutableBitSet groupKey, Aggregate aggRel, ImmutableBitSet.Builder childKey)
      Takes a bitmap representing a set of input references and extracts the ones that reference the group by columns in an aggregate.
      Parameters:
      groupKey - the original bitmap
      aggRel - the aggregate
      childKey - sets bits from groupKey corresponding to group by columns
    • splitCols

      public static void splitCols(List<RexNode> projExprs, ImmutableBitSet groupKey, ImmutableBitSet.Builder baseCols, ImmutableBitSet.Builder projCols)
      Forms two bitmaps by splitting the columns in a bitmap according to whether or not the column references the child input or is an expression.
      Parameters:
      projExprs - Project expressions
      groupKey - Bitmap whose columns will be split
      baseCols - Bitmap representing columns from the child input
      projCols - Bitmap representing non-child columns
    • cardOfProjExpr

      public static @Nullable Double cardOfProjExpr(RelMetadataQuery mq, Project rel, RexNode expr)
      Computes the cardinality of a particular expression from the projection list.
      Parameters:
      rel - RelNode corresponding to the project
      expr - projection expression
      Returns:
      cardinality
    • getJoinPopulationSize

      public static @Nullable Double getJoinPopulationSize(RelMetadataQuery mq, RelNode join_, ImmutableBitSet groupKey)
      Computes the population size for a set of keys returned from a join.
      Parameters:
      join_ - Join relational operator
      groupKey - Keys to compute the population for
      Returns:
      computed population size
    • addEpsilon

      public static double addEpsilon(double d)
      Adds an epsilon to the value passed in.
    • getSemiJoinDistinctRowCount

      public static @Nullable Double getSemiJoinDistinctRowCount(Join semiJoinRel, RelMetadataQuery mq, ImmutableBitSet groupKey, @Nullable RexNode predicate)
      Computes the number of distinct rows for a set of keys returned from a semi-join.
      Parameters:
      semiJoinRel - RelNode representing the semi-join
      mq - metadata query
      groupKey - keys that the distinct row count will be computed for
      predicate - join predicate
      Returns:
      number of distinct rows
    • getJoinDistinctRowCount

      public static @Nullable Double getJoinDistinctRowCount(RelMetadataQuery mq, RelNode joinRel, JoinRelType joinType, ImmutableBitSet groupKey, @Nullable RexNode predicate, boolean useMaxNdv)
      Computes the number of distinct rows for a set of keys returned from a join. Also known as NDV (number of distinct values).
      Parameters:
      joinRel - RelNode representing the join
      joinType - type of join
      groupKey - keys that the distinct row count will be computed for
      predicate - join predicate
      useMaxNdv - If true use formula max(left NDV, right NDV), otherwise use left NDV * right NDV.
      Returns:
      number of distinct rows
    • getUnionAllRowCount

      public static double getUnionAllRowCount(RelMetadataQuery mq, Union rel)
      Returns an estimate of the number of rows returned by a Union (before duplicates are eliminated).
    • getMinusRowCount

      public static double getMinusRowCount(RelMetadataQuery mq, Minus minus)
      Returns an estimate of the number of rows returned by a Minus.
    • getJoinRowCount

      public static @Nullable Double getJoinRowCount(RelMetadataQuery mq, Join join, RexNode condition)
      Returns an estimate of the number of rows returned by a Join.
    • estimateFilteredRows

      public static double estimateFilteredRows(RelNode child, RexProgram program, RelMetadataQuery mq)
    • estimateFilteredRows

      public static double estimateFilteredRows(RelNode child, @Nullable RexNode condition, RelMetadataQuery mq)
    • linear

      public static double linear(int x, int minX, int maxX, double minY, double maxY)
      Returns a point on a line.

      The result is always a value between minY and maxY, even if x is not between minX and maxX.

      Examples:

      • linear(0, 0, 10, 100, 200) returns 100 because 0 is minX
      • linear(5, 0, 10, 100, 200) returns 150 because 5 is mid-way between minX and maxX
      • linear(6, 0, 10, 100, 200) returns 160
      • linear(10, 0, 10, 100, 200) returns 200 because 10 is maxX
      • linear(-2, 0, 10, 100, 200) returns 100 because -2 is less than minX and is therefore treated as minX
      • linear(12, 0, 10, 100, 200) returns 200 because 12 is greater than maxX and is therefore treated as maxX
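      The clamped interpolation described above can be sketched as follows. This is a standalone illustration with a hypothetical class name, not Calcite's source; the check that minX is less than maxX is an assumption added to avoid division by zero.

```java
/** Hypothetical illustration of clamped linear interpolation. */
final class LinearSketch {
  static double linear(int x, int minX, int maxX, double minY, double maxY) {
    if (minX >= maxX) {
      // Assumption of this sketch: the x range must be non-empty.
      throw new IllegalArgumentException("minX must be less than maxX");
    }
    if (x < minX) {
      return minY; // left of the range: treated as minX
    }
    if (x > maxX) {
      return maxY; // right of the range: treated as maxX
    }
    // Fraction of the way from minX to maxX, mapped onto [minY, maxY].
    double fraction = (double) (x - minX) / (double) (maxX - minX);
    return minY + fraction * (maxY - minY);
  }
}
```

      Any x outside [minX, maxX] is clamped to the nearer endpoint, so the result never leaves the interval [minY, maxY].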
    • checkInputForCollationAndLimit

      public static boolean checkInputForCollationAndLimit(RelMetadataQuery mq, RelNode input, RelCollation collation, @Nullable RexNode offset, @Nullable RexNode fetch)
      Returns whether a relational expression is already sorted and has fewer rows than the sum of offset and limit.

      If this is the case, it is safe to push down a Sort with limit and optional offset.

    • validatePercentage

      public static @PolyNull Double validatePercentage(@PolyNull Double result)
      Validates whether a value represents a percentage number (that is, a value in the interval [0.0, 1.0]) and returns the value.

      Returns null if and only if result is null.

      Throws if result is not null, not in range 0 to 1, and assertions are enabled.

    • validateResult

      public static @PolyNull Double validateResult(@PolyNull Double result)
      Validates a result and corrects it if necessary.

      Never let the result go below 1, as it will result in incorrect calculations if the row-count is used as the denominator in a division expression. Also, cap the value at the max double value to avoid calculations using infinity.

      Returns null if and only if result is null.

      Throws if result is not null, is negative, and assertions are enabled.

      Returns:
      the corrected value from the result
      Throws:
      AssertionError - if the result is negative
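      The contracts of validatePercentage and validateResult, as described above, can be sketched without Calcite. The class name is hypothetical and the code only mirrors the documented behaviour: null passes through, range checks fire only when assertions are enabled, and a row-count-like result is kept between 1 and Double.MAX_VALUE.

```java
/** Hypothetical illustration of the two validation contracts. */
final class ValidateSketch {
  /** Returns the value unchanged; asserts that it lies in [0.0, 1.0]. */
  static Double validatePercentage(Double result) {
    if (result == null) {
      return null;
    }
    assert result >= 0.0 && result <= 1.0 : "not in [0, 1]: " + result;
    return result;
  }

  /** Raises values below 1 to 1 and caps positive infinity. */
  static Double validateResult(Double result) {
    if (result == null) {
      return null;
    }
    assert result >= 0.0 : "negative result: " + result;
    double capped =
        result == Double.POSITIVE_INFINITY ? Double.MAX_VALUE : result;
    // A value below 1 would distort any division that uses it as denominator.
    return Math.max(1.0, capped);
  }
}
```

      Because the range checks use Java assert statements, they do nothing unless the JVM is started with assertions enabled (for example with the -ea flag).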
    • clearCache

      public static boolean clearCache(RelNode rel)
      Removes cached metadata values for specified RelNode.
      Parameters:
      rel - RelNode whose cached metadata should be removed
      Returns:
      true if cache for the provided RelNode was not empty