Arrow adapter

Note: The Arrow adapter is an experimental feature; expect changes to its public API and usage.

Overview

Calcite’s adapter for Apache Arrow lets you read and process data in Arrow format using SQL.

It can read files in Arrow’s Feather format (which generally have a .arrow suffix) in the same way that the File Adapter can read .csv files.

A simple example

Let’s start with a simple example. First, we need a model definition, as follows.

{
  "version": "1.0",
  "defaultSchema": "ARROW",
  "schemas": [
    {
      "name": "ARROW",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.arrow.ArrowSchemaFactory",
      "operand": {
        "directory": "arrow"
      }
    }
  ]
}

The model file is stored as arrow/src/test/resources/arrow-model.json, so you can connect via sqlline as follows:

$ ./sqlline
sqlline> !connect jdbc:calcite:model=arrow/src/test/resources/arrow-model.json admin admin
sqlline> select * from arrow.test;
+----------+----------+------------+
| fieldOne | fieldTwo | fieldThree |
+----------+----------+------------+
|        1 | abc      |        1.2 |
|        2 | def      |        3.4 |
|        3 | xyz      |        5.6 |
|        4 | abcd     |       1.22 |
|        5 | defg     |       3.45 |
|        6 | xyza     |       5.67 |
+----------+----------+------------+
6 rows selected

The arrow directory named by the model’s "directory" operand contains a file called test.arrow, so that file shows up as a table called test.
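
The same query can also be issued from Java through Calcite’s JDBC driver. The following is a minimal sketch, assuming that calcite-core and the Arrow adapter module are on the classpath and that the program runs from the root of the Calcite source tree so the relative model path resolves. The class name ArrowQueryExample and the helper jdbcUrl are hypothetical names chosen for this illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class ArrowQueryExample {
  /** Builds a Calcite JDBC URL that points at a model file (hypothetical helper). */
  public static String jdbcUrl(String modelPath) {
    return "jdbc:calcite:model=" + modelPath;
  }

  public static void main(String[] args) throws SQLException {
    // Same model file and credentials as the sqlline session above.
    String url = jdbcUrl("arrow/src/test/resources/arrow-model.json");
    try (Connection connection = DriverManager.getConnection(url, "admin", "admin");
         Statement statement = connection.createStatement();
         ResultSet resultSet = statement.executeQuery("select * from arrow.test")) {
      ResultSetMetaData metaData = resultSet.getMetaData();
      int columnCount = metaData.getColumnCount();
      // Print each row as "column=value" pairs separated by " | ".
      while (resultSet.next()) {
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= columnCount; i++) {
          if (i > 1) {
            row.append(" | ");
          }
          row.append(metaData.getColumnLabel(i))
              .append('=')
              .append(resultSet.getObject(i));
        }
        System.out.println(row);
      }
    }
  }
}
```

Reading the column labels from ResultSetMetaData keeps the sketch independent of the table’s schema, so it should work unchanged for any other .arrow file placed in the directory.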