Class RelMdUtil


  • public class RelMdUtil
    extends Object
    RelMdUtil provides utility methods used by the metadata provider methods.
    • Field Detail

      • ARTIFICIAL_SELECTIVITY_FUNC

        public static final SqlFunction ARTIFICIAL_SELECTIVITY_FUNC
    • Method Detail

      • makeSemiJoinSelectivityRexNode

        public static RexNode makeSemiJoinSelectivityRexNode​(RelMetadataQuery mq,
                                                             Join rel)
        Creates a RexNode that stores a selectivity value corresponding to the selectivity of a semijoin. This can be added to a filter to simulate the effect of the semijoin during costing, but should never appear in a real plan since it has no physical implementation.
        Parameters:
        rel - the semijoin of interest
        Returns:
        constructed rexnode
      • getSelectivityValue

        public static double getSelectivityValue​(RexNode artificialSelectivityFuncNode)
        Returns the selectivity value stored in a call.
        Parameters:
        artificialSelectivityFuncNode - Call containing the selectivity value
        Returns:
        selectivity value
      • computeSemiJoinSelectivity

        public static double computeSemiJoinSelectivity​(RelMetadataQuery mq,
                                                        RelNode factRel,
                                                        RelNode dimRel,
                                                        Join rel)
        Computes the selectivity of a semijoin filter if it is applied on a fact table. The computation is based on the selectivity of the dimension table/columns and the number of distinct values in the fact table columns.
        Parameters:
        factRel - fact table participating in the semijoin
        dimRel - dimension table participating in the semijoin
        rel - semijoin rel
        Returns:
        calculated selectivity
      • computeSemiJoinSelectivity

        public static double computeSemiJoinSelectivity​(RelMetadataQuery mq,
                                                        RelNode factRel,
                                                        RelNode dimRel,
                                                        List<Integer> factKeyList,
                                                        List<Integer> dimKeyList)
        Computes the selectivity of a semijoin filter if it is applied on a fact table. The computation is based on the selectivity of the dimension table/columns and the number of distinct values in the fact table columns.
        Parameters:
        factRel - fact table participating in the semijoin
        dimRel - dimension table participating in the semijoin
        factKeyList - LHS keys used in the filter
        dimKeyList - RHS keys used in the filter
        Returns:
        calculated selectivity
      • areColumnsDefinitelyUnique

        public static boolean areColumnsDefinitelyUnique​(RelMetadataQuery mq,
                                                         RelNode rel,
                                                         ImmutableBitSet colMask)
        Returns true if the columns represented in a bit mask are definitely known to form a unique column set.
        Parameters:
        rel - the relational expression that the column mask corresponds to
        colMask - bit mask containing columns that will be tested for uniqueness
        Returns:
        true if bit mask represents a unique column set; false if not (or if no metadata is available)
      • areColumnsDefinitelyUniqueWhenNullsFiltered

        public static boolean areColumnsDefinitelyUniqueWhenNullsFiltered​(RelMetadataQuery mq,
                                                                          RelNode rel,
                                                                          ImmutableBitSet colMask)
        Returns true if the columns represented in a bit mask are definitely known to form a unique column set, when nulls have been filtered from the columns.
        Parameters:
        rel - the relational expression that the column mask corresponds to
        colMask - bit mask containing columns that will be tested for uniqueness
        Returns:
        true if bit mask represents a unique column set; false if not (or if no metadata is available)
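        The "definitely" in both method names reflects a three-valued answer collapsed to a boolean: an unknown result counts as not unique. A minimal sketch of that collapse, assuming the underlying metadata lookup yields a nullable Boolean (the class and method names here are hypothetical, not part of RelMdUtil):

```java
final class DefinitelyUniqueSketch {
    /**
     * Collapses a three-valued uniqueness answer to a boolean.
     * null stands for "no metadata available" in this sketch.
     */
    static boolean definitely(Boolean unique) {
        return unique != null && unique;
    }

    public static void main(String[] args) {
        System.out.println(definitely(Boolean.TRUE));  // known to be unique
        System.out.println(definitely(Boolean.FALSE)); // known not to be unique
        System.out.println(definitely(null));          // unknown, treated as not unique
    }
}
```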
      • setLeftRightBitmaps

        public static void setLeftRightBitmaps​(ImmutableBitSet groupKey,
                                               ImmutableBitSet.Builder leftMask,
                                               ImmutableBitSet.Builder rightMask,
                                               int nFieldsOnLeft)
        Separates a bit-mask representing a join into masks representing the left and right inputs into the join.
        Parameters:
        groupKey - original bit-mask
        leftMask - left bit-mask to be set
        rightMask - right bit-mask to be set
        nFieldsOnLeft - number of fields in the left input
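        The split can be pictured with plain java.util.BitSet objects standing in for ImmutableBitSet and its builders. This is a sketch, not the library code; it assumes that bits belonging to the right input are renumbered relative to that input by subtracting nFieldsOnLeft.

```java
import java.util.BitSet;

final class SplitJoinMaskSketch {
    /**
     * Copies each set bit of joinMask into leftMask or rightMask.
     * Bits below nFieldsOnLeft belong to the left input; the remaining
     * bits are shifted down so they index the right input's own fields.
     */
    static void split(BitSet joinMask, BitSet leftMask, BitSet rightMask, int nFieldsOnLeft) {
        for (int bit = joinMask.nextSetBit(0); bit >= 0; bit = joinMask.nextSetBit(bit + 1)) {
            if (bit < nFieldsOnLeft) {
                leftMask.set(bit);
            } else {
                rightMask.set(bit - nFieldsOnLeft);
            }
        }
    }

    public static void main(String[] args) {
        BitSet joinMask = new BitSet();
        joinMask.set(1); // field 1 of the left input
        joinMask.set(3); // field 0 of the right input
        joinMask.set(4); // field 1 of the right input
        BitSet left = new BitSet();
        BitSet right = new BitSet();
        split(joinMask, left, right, 3);
        System.out.println(left + " " + right); // prints "{1} {0, 1}"
    }
}
```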
      • numDistinctVals

        public static Double numDistinctVals​(Double domainSize,
                                             Double numSelected)
        Returns the expected number of distinct values obtained when numSelected values are selected from a domain containing domainSize distinct values.

        Note that when domainSize == numSelected, the return value is not domainSize. If you pick 100 random values between 1 and 100, you will most likely end up with fewer than 100 distinct values, because some values will be picked more than once. The implementation computes an unbiased estimate of the number of distinct values produced by making numSelected selections (with replacement) from the universe set.

        Parameters:
        domainSize - size of the universe set.
        numSelected - the number of selections.
        Returns:
        the expected number of distinct values.
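        The textbook expectation for this situation illustrates why the result falls short of domainSize. The sketch below uses the exact formula N * (1 - (1 - 1/N)^n) for n draws with replacement from N values; it is an illustration of the idea and not necessarily the formula or approximation that numDistinctVals itself uses. The class and method names are hypothetical.

```java
final class DistinctValuesSketch {
    /** Expected number of distinct values after numSelected draws with replacement. */
    static double expectedDistinct(double domainSize, double numSelected) {
        if (domainSize <= 0.0 || numSelected <= 0.0) {
            return 0.0;
        }
        // A given value is missed by one draw with probability (1 - 1/N),
        // so it is missed by all n draws with probability (1 - 1/N)^n.
        double probMissed = Math.pow(1.0 - 1.0 / domainSize, numSelected);
        return domainSize * (1.0 - probMissed);
    }

    public static void main(String[] args) {
        // 100 draws from 100 values: roughly 63 distinct values expected, not 100.
        System.out.println(expectedDistinct(100.0, 100.0));
    }
}
```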
      • capInfinity

        public static double capInfinity​(Double d)
        Caps a double value at Double.MAX_VALUE if it is currently infinite.
        Parameters:
        d - the Double object
        Returns:
        the double value if it's not infinity; else Double.MAX_VALUE
      • guessSelectivity

        public static double guessSelectivity​(RexNode predicate)
        Returns default estimates for selectivities, in the absence of stats.
        Parameters:
        predicate - predicate for which selectivity will be computed; null means true, so gives a selectivity of 1.0
        Returns:
        estimated selectivity
      • guessSelectivity

        public static double guessSelectivity​(RexNode predicate,
                                              boolean artificialOnly)
        Returns default estimates for selectivities, in the absence of stats.
        Parameters:
        predicate - predicate for which selectivity will be computed; null means true, so gives a selectivity of 1.0
        artificialOnly - return only the selectivity contribution from artificial nodes
        Returns:
        estimated selectivity
      • unionPreds

        public static RexNode unionPreds​(RexBuilder rexBuilder,
                                         RexNode pred1,
                                         RexNode pred2)
        AND's two predicates together, either of which may be null, removing redundant filters.
        Parameters:
        rexBuilder - rexBuilder used to construct AND'd RexNode
        pred1 - first predicate
        pred2 - second predicate
        Returns:
        AND'd predicate, or the individual predicate if the other one is null
      • minusPreds

        public static RexNode minusPreds​(RexBuilder rexBuilder,
                                         RexNode pred1,
                                         RexNode pred2)
        Takes the difference between two predicates, removing from the first any predicates also in the second.
        Parameters:
        rexBuilder - rexBuilder used to construct AND'd RexNode
        pred1 - first predicate
        pred2 - second predicate
        Returns:
        MINUS'd predicate list
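        The behavior of unionPreds and minusPreds can be pictured as set operations over the conjuncts of each predicate. The sketch below models a predicate as a list of conjunct strings, which is a stand-in chosen for illustration; the real methods operate on RexNode trees, and the class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PredicateSetSketch {
    /** ANDs two conjunct lists, keeping first-seen order and dropping duplicates. */
    static List<String> union(List<String> pred1, List<String> pred2) {
        Set<String> merged = new LinkedHashSet<>();
        if (pred1 != null) {
            merged.addAll(pred1);
        }
        if (pred2 != null) {
            merged.addAll(pred2);
        }
        return new ArrayList<>(merged);
    }

    /** Keeps the conjuncts of pred1 that do not also appear in pred2. */
    static List<String> minus(List<String> pred1, List<String> pred2) {
        List<String> remaining = new ArrayList<>();
        if (pred1 == null) {
            return remaining;
        }
        for (String conjunct : pred1) {
            if (pred2 == null || !pred2.contains(conjunct)) {
                remaining.add(conjunct);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<String> p1 = Arrays.asList("a > 1", "b = 2");
        List<String> p2 = Arrays.asList("b = 2", "c < 3");
        System.out.println(union(p1, p2)); // prints "[a > 1, b = 2, c < 3]"
        System.out.println(minus(p1, p2)); // prints "[a > 1]"
    }
}
```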
      • setAggChildKeys

        public static void setAggChildKeys​(ImmutableBitSet groupKey,
                                           Aggregate aggRel,
                                           ImmutableBitSet.Builder childKey)
        Takes a bitmap representing a set of input references and extracts the ones that reference the group by columns in an aggregate.
        Parameters:
        groupKey - the original bitmap
        aggRel - the aggregate
        childKey - sets bits from groupKey corresponding to group by columns
      • splitCols

        public static void splitCols​(List<RexNode> projExprs,
                                     ImmutableBitSet groupKey,
                                     ImmutableBitSet.Builder baseCols,
                                     ImmutableBitSet.Builder projCols)
        Forms two bitmaps by splitting the columns in a bitmap according to whether each column references the child input directly or is an expression.
        Parameters:
        projExprs - Project expressions
        groupKey - Bitmap whose columns will be split
        baseCols - Bitmap representing columns from the child input
        projCols - Bitmap representing non-child columns
      • cardOfProjExpr

        public static Double cardOfProjExpr​(RelMetadataQuery mq,
                                            Project rel,
                                            RexNode expr)
        Computes the cardinality of a particular expression from the projection list.
        Parameters:
        rel - RelNode corresponding to the project
        expr - projection expression
        Returns:
        cardinality
      • getJoinPopulationSize

        public static Double getJoinPopulationSize​(RelMetadataQuery mq,
                                                   RelNode join_,
                                                   ImmutableBitSet groupKey)
        Computes the population size for a set of keys returned from a join.
        Parameters:
        join_ - Join relational operator
        groupKey - Keys to compute the population for
        Returns:
        computed population size
      • addEpsilon

        public static double addEpsilon​(double d)
        Adds an epsilon to the value passed in.
      • getSemiJoinDistinctRowCount

        public static Double getSemiJoinDistinctRowCount​(Join semiJoinRel,
                                                         RelMetadataQuery mq,
                                                         ImmutableBitSet groupKey,
                                                         RexNode predicate)
        Computes the number of distinct rows for a set of keys returned from a semi-join.
        Parameters:
        semiJoinRel - RelNode representing the semi-join
        mq - metadata query
        groupKey - keys that the distinct row count will be computed for
        predicate - join predicate
        Returns:
        number of distinct rows
      • getJoinDistinctRowCount

        public static Double getJoinDistinctRowCount​(RelMetadataQuery mq,
                                                     RelNode joinRel,
                                                     JoinRelType joinType,
                                                     ImmutableBitSet groupKey,
                                                     RexNode predicate,
                                                     boolean useMaxNdv)
        Computes the number of distinct rows for a set of keys returned from a join. Also known as NDV (number of distinct values).
        Parameters:
        joinRel - RelNode representing the join
        joinType - type of join
        groupKey - keys that the distinct row count will be computed for
        predicate - join predicate
        useMaxNdv - If true use formula max(left NDV, right NDV), otherwise use left NDV * right NDV.
        Returns:
        number of distinct rows
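        The useMaxNdv flag chooses between two ways of combining the per-input NDVs. The sketch below shows only that combination step as stated in the parameter description; the real method also takes the join type and predicate into account, and the class and method names here are hypothetical.

```java
final class JoinNdvSketch {
    /** Combines left and right NDV estimates according to the useMaxNdv flag. */
    static double combineNdv(double leftNdv, double rightNdv, boolean useMaxNdv) {
        return useMaxNdv
            ? Math.max(leftNdv, rightNdv) // max(left NDV, right NDV)
            : leftNdv * rightNdv;         // left NDV * right NDV
    }

    public static void main(String[] args) {
        System.out.println(combineNdv(20.0, 50.0, true));  // prints "50.0"
        System.out.println(combineNdv(20.0, 50.0, false)); // prints "1000.0"
    }
}
```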
      • getUnionAllRowCount

        public static double getUnionAllRowCount​(RelMetadataQuery mq,
                                                 Union rel)
        Returns an estimate of the number of rows returned by a Union (before duplicates are eliminated).
      • getMinusRowCount

        public static double getMinusRowCount​(RelMetadataQuery mq,
                                              Minus minus)
        Returns an estimate of the number of rows returned by a Minus.
      • linear

        public static double linear​(int x,
                                    int minX,
                                    int maxX,
                                    double minY,
                                    double maxY)
        Returns a point on a line.

        The result is always a value between minY and maxY, even if x is not between minX and maxX.

        Examples:

        • linear(0, 0, 10, 100, 200) returns 100 because 0 is minX
        • linear(5, 0, 10, 100, 200) returns 150 because 5 is mid-way between minX and maxX
        • linear(6, 0, 10, 100, 200) returns 160
        • linear(10, 0, 10, 100, 200) returns 200 because 10 is maxX
        • linear(-2, 0, 10, 100, 200) returns 100 because -2 is less than minX and is therefore treated as minX
        • linear(12, 0, 10, 100, 200) returns 200 because 12 is greater than maxX and is therefore treated as maxX
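        A clamped linear interpolation with this contract can be sketched as follows. This is a minimal illustration of the described behavior, assuming minX < maxX, and is not a copy of the library implementation; the class name is hypothetical.

```java
final class LinearSketch {
    /** Returns the point on the line through (minX, minY) and (maxX, maxY), clamping x to [minX, maxX]. */
    static double linear(int x, int minX, int maxX, double minY, double maxY) {
        if (minX >= maxX) {
            throw new IllegalArgumentException("minX must be less than maxX");
        }
        if (x <= minX) {
            return minY; // x at or below the range is treated as minX
        }
        if (x >= maxX) {
            return maxY; // x at or above the range is treated as maxX
        }
        // Fraction of the way from minX to maxX, applied to the Y range.
        double fraction = (double) (x - minX) / (double) (maxX - minX);
        return minY + fraction * (maxY - minY);
    }

    public static void main(String[] args) {
        System.out.println(linear(5, 0, 10, 100, 200));  // mid-way between minY and maxY
        System.out.println(linear(-2, 0, 10, 100, 200)); // clamped to minY
        System.out.println(linear(12, 0, 10, 100, 200)); // clamped to maxY
    }
}
```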
      • checkInputForCollationAndLimit

        public static boolean checkInputForCollationAndLimit​(RelMetadataQuery mq,
                                                             RelNode input,
                                                             RelCollation collation,
                                                             RexNode offset,
                                                             RexNode fetch)
        Returns whether a relational expression is already sorted and has fewer rows than the sum of offset and limit.

        If this is the case, it is safe to push down a Sort with limit and optional offset.

      • validatePercentage

        public static Double validatePercentage​(Double result)
        Validates that the result represents a percentage, i.e. that its value lies in the interval [0.0, 1.0].
        Returns:
        the result, if it is a valid percentage
        Throws:
        AssertionError - if the validation fails
      • validateResult

        public static Double validateResult​(Double result)
        Validates a row-count result and corrects it where necessary.

        Never let the result go below 1, as it will result in incorrect calculations if the row-count is used as the denominator in a division expression. Also, cap the value at the max double value to avoid calculations using infinity.

        Returns:
        the corrected value from the result
        Throws:
        AssertionError - if the result is negative
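        The correction rules described above (floor at 1, cap at the maximum double, reject negatives) can be sketched on a primitive double. This is an illustrative stand-in, not the library code; the class and method names are hypothetical, and rejecting NaN is an added assumption of this sketch.

```java
final class ValidateResultSketch {
    /** Clamps a row-count estimate into [1.0, Double.MAX_VALUE]; rejects negative or NaN input. */
    static double clampRowCount(double result) {
        if (Double.isNaN(result) || result < 0.0) {
            throw new AssertionError("row count must be non-negative: " + result);
        }
        if (Double.isInfinite(result)) {
            return Double.MAX_VALUE; // avoid later arithmetic on infinity
        }
        return Math.max(result, 1.0); // keep the value safe to use as a denominator
    }

    public static void main(String[] args) {
        System.out.println(clampRowCount(0.25));                     // raised to 1.0
        System.out.println(clampRowCount(42.0));                     // unchanged
        System.out.println(clampRowCount(Double.POSITIVE_INFINITY)); // capped at Double.MAX_VALUE
    }
}
```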
      • clearCache

        public static boolean clearCache​(RelNode rel)
        Removes cached metadata values for specified RelNode.
        Parameters:
        rel - RelNode whose cached metadata should be removed
        Returns:
        true if cache for the provided RelNode was not empty