Class TableCharacteristic

java.lang.Object
org.apache.calcite.sql.TableCharacteristic

public class TableCharacteristic extends Object
A table-valued input parameter of a table function is classified by three characteristics.

The first characteristic is semantics. The table has either row semantics or set semantics.

Row semantics means that the result of the table function is decided on a row-by-row basis.

Example of a table function with row semantics input table parameter: We often need to read a CSV file, generally, the first line of the file contains a list of column names, and subsequent lines of the file contain data. The data in general can be treated as a large VARCHAR. However, some of the fields may be numeric or datetime.

A table function named CSVReader is designed to read a file of comma-separated values and interpret this file as a table. The function has three parameters:

  • The first parameter, File, is the name of a file on the query author's system. This file must contain the comma-separated values that are to be converted to a table. The first line of the file contains the names of the resulting columns. Succeeding lines contain the data. Each line after the first will result in one row of output, with column names as determined by the first line of the input.
  • Floats is a descriptor area, which should provide a list of the column names that are to be interpreted numerically. These columns will be output with the data type FLOAT.
  • Dates is a descriptor area, which provides a list of the column names that are to be interpreted as datetimes. These columns will be output with the data type DATE.

How to use this table function in query?

For a csv file which contents are:

 docno,name,due_date,principle,interest
 123,Mary,01/01/2014,234.56,345.67
 234,Edgar,01/01/2014,654.32,543.21
 

The query author may write a query such as the following:


 SELECT *
 FROM TABLE (
   CSVreader (
   'abc.csv',
   DESCRIPTOR ("principle", "interest")
   DESCRIPTOR ("due_date")))
 
Results of the query
docno name due_date principle interest
123 Mary 01/01/2014 234.56 345.67
234 Edgar 01/01/2014 654.32 543.21

Set semantics means that the outcome of the function depends on how the data is partitioned. Set semantics is useful to implement user-defined analytics like aggregation or window functions. They operate on an entire table or a logical partition of it.

Example of a table function with set semantics input table parameter: TopN takes an input table that has been sorted on a numeric column. It copies the first n rows through to the output table. Any additional rows are summarized in a single output row in which the sort column has been summed and all other columns are null. TopN function has two parameters:

  • The first parameter, Input, is the input table. This table has set semantics, meaning that the result depends on the set of data (since the last row is a summary row). In addition, the table is marked as PRUNE WHEN EMPTY, meaning that the result is necessarily empty if the input is empty. The query author must order this input table on a single numeric column (syntax below).
  • The second parameter, Howmany, specifies how many input rows that the user wants to be copied into the output table; all rows after this will contribute to the final summary row in the output.

How to use this table function in query?

Original records of table orders
region product sales
East A 1234.56
East B 987.65
East C 876.54
East D 765.43
East E 654.32
West E 2345.67
West D 2001.33
West C 1357.99
West B 975.35
West A 864,22

The query author may write a query such as the following:


 SELECT *
 FROM TABLE(
   Topn(
     TABLE orders PARTITION BY region ORDER BY sales desc,
     3))
 

The result will be:

Original records of table orders
region product sales
East A 1234.56
East B 987.65
East C 876.54
West E 2345.67
West D 2001.33
West C 1357.99

The second characteristic of input table parameter only applies to input table with set semantics. It specifies whether the table function can generate a result row even if the input table is empty.

The third characteristic is whether the input table supports pass-through columns or not.

  • Field Details

    • semantics

      public final TableCharacteristic.Semantics semantics
      Input table has either row semantics or set semantics.
    • pruneIfEmpty

      public final boolean pruneIfEmpty
      If the value is true, meaning that the DBMS can prune virtual processors from the query plan if the input table is empty. If the value is false, meaning that the DBMS must actually instantiate a virtual processor (or more than one virtual processor in the presence of other input tables).
    • passColumnsThrough

      public final boolean passColumnsThrough
      If the value is true, for each input row, the table function makes the entire input row available in the output, qualified by a range variable associated with the input table. Otherwise the value is false.
  • Method Details