Iceberg Integration

Users can integrate with the Iceberg table format through table functions.

iceberg Table Function

Provides a read-only table-like interface to Apache Iceberg tables in Amazon S3, Azure, HDFS, or local storage.

Syntax

icebergS3(url [, NOSIGN | access_key_id, secret_access_key, [session_token]] [,format] [,compression_method])
icebergS3(named_collection[, option=value [,..]])

icebergAzure(connection_string|storage_account_url, container_name, blobpath [, account_name, account_key] [, format] [, compression_method])
icebergAzure(named_collection[, option=value [,..]])

icebergHDFS(path_to_table [, format] [, compression_method])
icebergHDFS(named_collection[, option=value [,..]])

icebergLocal(path_to_table [, format] [, compression_method])
icebergLocal(named_collection[, option=value [,..]])

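Arguments

The arguments are described in the same way as those of the s3, azureBlobStorage, hdfs and file table functions, respectively. format is the format of the data files in the Iceberg table.

Returned Value

A table with the specified structure for reading data from the specified Iceberg table.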

Example

SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test')
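The other variants in the syntax section follow the same pattern. As a rough sketch (the bucket URL and the local path are placeholders, not real tables), a public bucket could be read without credentials by passing NOSIGN, and a table on the local filesystem could be read with icebergLocal:

```sql
-- Placeholder public bucket: NOSIGN means requests are not signed, so no keys are passed.
SELECT count()
FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', NOSIGN);

-- Placeholder local path pointing at the root directory of an Iceberg table.
SELECT *
FROM icebergLocal('/path/to/test_table')
LIMIT 10;
```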

Note

ClickHouse currently supports reading v1 and v2 of the Iceberg format via the icebergS3, icebergAzure, icebergHDFS and icebergLocal table functions and the IcebergS3, IcebergAzure, IcebergHDFS and IcebergLocal table engines.

Defining a Named Collection

Here is an example of configuring a named collection for storing the URL and credentials:

<clickhouse>
    <named_collections>
        <iceberg_conf>
            <url>http://test.s3.amazonaws.com/clickhouse-bucket/</url>
            <access_key_id>test</access_key_id>
            <secret_access_key>test</secret_access_key>
            <format>auto</format>
            <structure>auto</structure>
        </iceberg_conf>
    </named_collections>
</clickhouse>

SELECT * FROM icebergS3(iceberg_conf, filename = 'test_table')
DESCRIBE icebergS3(iceberg_conf, filename = 'test_table')
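A named collection can also be created with SQL instead of the XML configuration file. The following is a minimal sketch, assuming the current user is allowed to manage named collections; the name iceberg_conf_sql is hypothetical, and the values simply mirror the XML example:

```sql
-- Hypothetical collection name; URL and keys are the same placeholders as in the XML example.
CREATE NAMED COLLECTION iceberg_conf_sql AS
    url = 'http://test.s3.amazonaws.com/clickhouse-bucket/',
    access_key_id = 'test',
    secret_access_key = 'test';

-- The collection is then used exactly like one defined in the configuration file.
SELECT * FROM icebergS3(iceberg_conf_sql, filename = 'test_table');
```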

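Schema Evolution

ClickHouse can currently read Iceberg tables whose schema has changed over time. Reading is supported for tables in which columns have been added or removed, or in which the column order has changed. A column can also be changed from required to nullable. In addition, the following type promotions are supported for simple types:

int -> long
float -> double
decimal(P, S) -> decimal(P', S), where P' > P.

Currently, it is not possible to change nested structures or the types of elements within arrays and maps.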

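Partition Pruning

ClickHouse supports partition pruning during SELECT queries on Iceberg tables, which helps optimize query performance by skipping irrelevant data files. Currently it works only with identity transforms and time-based transforms (hour, day, month, year). To enable partition pruning, set use_iceberg_partition_pruning = 1.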
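As an illustration, the setting can be applied to a single query. This sketch assumes the table is partitioned on a date column; event_date is a hypothetical column name, not part of the example table described here:

```sql
-- event_date is a hypothetical column, assumed to be the partition source
-- (identity or day transform). Files whose partition values cannot match
-- the filter are candidates for being skipped.
SELECT count()
FROM icebergS3(iceberg_conf, filename = 'test_table')
WHERE event_date >= '2024-01-01'
SETTINGS use_iceberg_partition_pruning = 1;
```

The same setting can instead be enabled for the whole session with SET use_iceberg_partition_pruning = 1.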

Aliases

The table function iceberg is now an alias for icebergS3.
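Because of the alias, the two queries below should be interchangeable. This is a small sketch that reuses the iceberg_conf named collection defined earlier:

```sql
-- Both calls go through the same S3-backed implementation.
SELECT count() FROM iceberg(iceberg_conf, filename = 'test_table');
SELECT count() FROM icebergS3(iceberg_conf, filename = 'test_table');
```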
