PgSQL - Kernel Features - What About Bringing DuckDB In?

yzshover yanzongshuaiDBA 2024-03-04
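DuckDB is a high-performance analytical database system with a vectorized execution engine built on a push-based pipeline. Is there a way to bring such a good database directly into PgSQL, so that PgSQL can benefit from its columnar storage, vectorized execution engine, and other strengths? The Hydra team has open-sourced an extension, pg_quack, that adds DuckDB to PgSQL as a table access method, giving PgSQL a new storage engine and a new execution engine.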

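1. Hooking in the executor and the table access method

1) A configuration option, quack.data_dir, is added to specify the directory where DuckDB tables are stored

2) quack_init_tableam initializes the hooks for the DuckDB table access method

3) quack_init_hooks initializes the hooks for the DuckDB executor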

2. Hooking into the executor

There are two hooks: one for queries such as DML, and one for operations such as DDL.
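PostgreSQL exposes hooks of this kind as global function pointers (for example ExecutorRun_hook and ProcessUtility_hook), and an extension usually saves the previous value before installing its own function. The standalone model below is only a sketch of that save-and-chain pattern. It uses no PostgreSQL headers, and every name in it (query_hook, install_quack_hook, is_quack_query, the "quack:" prefix used as the routing rule) is hypothetical and not taken from pg_quack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook type: returns the name of the engine that ran the query. */
typedef const char *(*query_hook_type)(const char *query);

/* Global hook pointer; NULL until an extension installs itself. */
static query_hook_type query_hook = NULL;

/* Previous hook value, saved so the extension can chain to it. */
static query_hook_type prev_query_hook = NULL;

/* Default path, used when no hook claims the query. */
static const char *standard_run(const char *query)
{
    (void) query;
    return "postgres";
}

/* Routing rule assumed for this sketch only: a "quack:" prefix. */
static int is_quack_query(const char *query)
{
    return strncmp(query, "quack:", 6) == 0;
}

/* The extension's hook: handle its own queries, pass the rest along. */
static const char *quack_run(const char *query)
{
    if (is_quack_query(query))
        return "duckdb";
    if (prev_query_hook != NULL)
        return prev_query_hook(query);
    return standard_run(query);
}

/* Called once at load time: save the old hook, then install ours. */
void install_quack_hook(void)
{
    prev_query_hook = query_hook;
    query_hook = quack_run;
}

/* Core entry point: call the hook if one is installed. */
const char *run_query(const char *query)
{
    assert(query != NULL);
    if (query_hook != NULL)
        return query_hook(query);
    return standard_run(query);
}
```

Before install_quack_hook() is called, run_query always takes the default path. Afterwards only queries matching the assumed routing rule are diverted. The same pattern would be applied once per hook, so once for the DML path and once for the DDL path.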

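3. The DML hook

The work is done through the DuckDB API. Because DuckDB is an embedded database, linking its library and including its header files is all that is needed to use it.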

quack_executor_run->
    duckdb_database db = quack_open_database(MyDatabaseId, false);
    duckdb_connection connection = quack_open_connection(db);
    duckdb_query(connection, queryDesc->sourceText, &result);
    slot = MakeTupleTableSlot(queryDesc->tupDesc, &TTSOpsHeapTuple);
    row_count = duckdb_row_count(&result);       // number of rows
    column_count = duckdb_column_count(&result); // number of columns
    for (idx_t row = 0; row < row_count; row++) {
        ExecClearTuple(slot);
        for (idx_t col = 0; col < column_count; col++) {
            if (duckdb_value_is_null(&result, col, row))
                slot->tts_isnull[col] = true;
            else {
                slot->tts_isnull[col] = false;
                quack_read_result(slot, &result, col, row);
            }
        }
        ExecStoreVirtualTuple(slot);
        dest->receiveSlot(slot, dest); // send one row to the client
        // free all columns of this row
    }
    ...
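The row-by-row loop above can be modelled without PostgreSQL or DuckDB. The following is a minimal sketch under stated assumptions: mock_result stands in for the DuckDB result, mock_slot for the tuple slot, and the receive callback for dest->receiveSlot. All of these types and function names are hypothetical, and every cell is an int for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COLS 8

/* Hypothetical stand-in for a materialized query result. */
typedef struct {
    size_t row_count;
    size_t column_count;
    const int *values;    /* row-major, row_count * column_count cells */
    const bool *is_null;  /* same layout as values */
} mock_result;

/* Hypothetical stand-in for a tuple slot: one row plus its null flags. */
typedef struct {
    int values[MAX_COLS];
    bool isnull[MAX_COLS];
    size_t ncols;
} mock_slot;

/* Receiver callback, invoked once per row. */
typedef void (*receive_fn)(const mock_slot *slot, void *state);

/* Copy each row into the slot and hand it to the receiver.
 * Returns the number of rows sent, or -1 if the result is too wide. */
long send_rows(const mock_result *res, receive_fn receive, void *state)
{
    mock_slot slot;

    assert(res != NULL && receive != NULL);
    if (res->column_count > MAX_COLS)
        return -1;

    slot.ncols = res->column_count;
    for (size_t row = 0; row < res->row_count; row++) {
        for (size_t col = 0; col < res->column_count; col++) {
            size_t cell = row * res->column_count + col;

            slot.isnull[col] = res->is_null[cell];
            /* A NULL cell carries no value; store 0 as a placeholder. */
            slot.values[col] = res->is_null[cell] ? 0 : res->values[cell];
        }
        receive(&slot, state); /* one row goes to the consumer */
    }
    return (long) res->row_count;
}

/* Example receiver: counts rows and NULLs and sums the non-NULL values. */
typedef struct {
    size_t rows;
    size_t nulls;
    long sum;
} row_stats;

void count_row(const mock_slot *slot, void *state)
{
    row_stats *stats = state;

    stats->rows++;
    for (size_t col = 0; col < slot->ncols; col++) {
        if (slot->isnull[col])
            stats->nulls++;
        else
            stats->sum += slot->values[col];
    }
}
```

The point of the model is that the whole result exists before the first row is forwarded, and each row is rebuilt in the same slot before being passed on.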

4. The DDL hook

quack_process_utility->quack_execute_query: executed through the DuckDB API
    // database path: quack_data_dir/"databaseOid".duckdb
    db = quack_open_database(MyDatabaseId, true); // create/open a new database
    duckdb_connect(db, &connection);              // connect to that database
    duckdb_query(connection, query, NULL);        // execute
    duckdb_disconnect(&connection);               // disconnect
This path relies mainly on the DuckDB API.
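The per-database path mentioned in the comment above (the data directory, then the database OID, then a .duckdb suffix) could be assembled as in the sketch below. The helper name build_database_path and its error convention are assumptions made for illustration; pg_quack's own helper may look different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<data_dir>/<database_oid>.duckdb" into buf.
 * Returns 0 on success, -1 if data_dir is empty or buf is too small. */
int build_database_path(char *buf, size_t buf_size,
                        const char *data_dir, unsigned int database_oid)
{
    int written;

    assert(buf != NULL && data_dir != NULL);
    if (strlen(data_dir) == 0)
        return -1;

    written = snprintf(buf, buf_size, "%s/%u.duckdb", data_dir, database_oid);
    if (written < 0 || (size_t) written >= buf_size)
        return -1; /* formatting error or truncated output */
    return 0;
}
```

With a data directory of /tmp/quack and a database OID of 16384, the buffer would hold /tmp/quack/16384.duckdb, so each PgSQL database maps to its own DuckDB file.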

5. Hooks for the DuckDB table access method

The table access method is defined as follows:

static const TableAmRoutine quack_am_methods = {
    .type = T_TableAmRoutine,

    /* slot callbacks */
    .slot_callbacks = quack_slot_callbacks,

    /* sequential scans */
    .scan_begin = quack_begin_scan,
    .scan_end = quack_end_scan,
    .scan_rescan = quack_rescan,
    .scan_getnextslot = quack_getnextslot,

    /* parallel scans */
    .parallelscan_estimate = quack_parallelscan_estimate,
    .parallelscan_initialize = quack_parallelscan_initialize,
    .parallelscan_reinitialize = quack_parallelscan_reinitialize,

    /* index fetches */
    .index_fetch_begin = quack_index_fetch_begin,
    .index_fetch_reset = quack_index_fetch_reset,
    .index_fetch_end = quack_index_fetch_end,
    .index_fetch_tuple = quack_index_fetch_tuple,

    /* non-modifying operations on individual tuples */
    .tuple_fetch_row_version = quack_fetch_row_version,
    .tuple_tid_valid = quack_tuple_tid_valid,
    .tuple_get_latest_tid = quack_get_latest_tid,
    .tuple_satisfies_snapshot = quack_tuple_satisfies_snapshot,
    .index_delete_tuples = quack_index_delete_tuples,

    /* tuple modification */
    .tuple_insert = quack_tuple_insert,
    .tuple_insert_speculative = quack_tuple_insert_speculative,
    .tuple_complete_speculative = quack_tuple_complete_speculative,
    .multi_insert = quack_multi_insert,
    .tuple_delete = quack_tuple_delete,
    .tuple_update = quack_tuple_update,
    .tuple_lock = quack_tuple_lock,
    .finish_bulk_insert = quack_finish_bulk_insert,

    /* DDL-related operations */
    .relation_set_new_filenode = quack_relation_set_new_filenode,
    .relation_nontransactional_truncate = quack_relation_nontransactional_truncate,
    .relation_copy_data = quack_relation_copy_data,
    .relation_copy_for_cluster = quack_relation_copy_for_cluster,
    .relation_vacuum = quack_vacuum_rel,
    .scan_analyze_next_block = quack_scan_analyze_next_block,
    .scan_analyze_next_tuple = quack_scan_analyze_next_tuple,
    .index_build_range_scan = quack_index_build_range_scan,
    .index_validate_scan = quack_index_validate_scan,

    /* miscellaneous */
    .relation_size = quack_relation_size,
    .relation_needs_toast_table = quack_relation_needs_toast_table,

    /* planner */
    .relation_estimate_size = quack_estimate_rel_size,

    /* executor: bitmap scans are not provided, sample scans are */
    .scan_bitmap_next_block = NULL,
    .scan_bitmap_next_tuple = NULL,
    .scan_sample_next_block = quack_scan_sample_next_block,
    .scan_sample_next_tuple = quack_scan_sample_next_tuple};
This is how DuckDB is added to PgSQL as a table access method.
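A table access method is essentially a table of callbacks that the core reaches through a handler function returning a pointer to the routine struct. The sketch below shows that dispatch idea with a heavily reduced, hypothetical struct. mock_table_am, mock_table, and the functions around them are invented for illustration and carry only two of the many callbacks listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_CAPACITY 16

/* Hypothetical storage: a tiny fixed-size table of ints. */
typedef struct {
    int rows[MOCK_CAPACITY];
    size_t nrows;
} mock_table;

/* Hypothetical, heavily reduced callback table. */
typedef struct {
    bool (*tuple_insert)(mock_table *table, int value);
    bool (*scan_getnextslot)(const mock_table *table, size_t *pos, int *out);
} mock_table_am;

/* Append one value; fails when the table is full. */
static bool mock_tuple_insert(mock_table *table, int value)
{
    if (table->nrows >= MOCK_CAPACITY)
        return false;
    table->rows[table->nrows++] = value;
    return true;
}

/* Fetch the next value; returns false at the end of the table. */
static bool mock_scan_getnextslot(const mock_table *table, size_t *pos, int *out)
{
    if (*pos >= table->nrows)
        return false;
    *out = table->rows[(*pos)++];
    return true;
}

static const mock_table_am mock_am_methods = {
    .tuple_insert = mock_tuple_insert,
    .scan_getnextslot = mock_scan_getnextslot,
};

/* Handler: the only entry point the caller needs to know about. */
const mock_table_am *mock_am_handler(void)
{
    return &mock_am_methods;
}

/* Generic caller that sees the storage only through the callbacks. */
long sum_table(const mock_table_am *am, const mock_table *table)
{
    size_t pos = 0;
    int value;
    long sum = 0;

    assert(am != NULL && table != NULL);
    while (am->scan_getnextslot(table, &pos, &value))
        sum += value;
    return sum;
}
```

Because sum_table touches the data only through the callback table, swapping in a different set of callbacks changes the storage engine without changing the caller. That is the property that lets a DuckDB-backed implementation sit behind the same interface.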

References

https://github.com/hydradatabase/pg_quack

https://github.com/duckdb/duckdb/