PostgreSQL DBA(60) - 列式存储zedstore
本节简单介绍了Greenplum开源的列式存储zedstore的安装和使用.
安装
从Github上下载源码,与普通PG一样,编译安装即可
[root@localhost postgres-zedstore]# ./configure --enable-debug --with-python --with-perl --with-tcl --with-gssapi --with-pam --with-openssl --with-pam --without-ldap --with-libxml --with-libxslt --enable-dtrace --enable-depend --enable-cassert --with-systemd CFLAGS="-O0 -DOPTIMIZER_DEBUG -g3 -gdwarf-2" --prefix=/appdb/zedstorechecking build system type... x86_64-pc-linux-gnu...[root@localhost postgres-zedstore]# make -j4...[root@localhost postgres-zedstore]# make install...PostgreSQL installation complete.
Heap vs ZedStore
创建用户,初始化数据库
[zedstore@localhost ~]$ initdb -E utf8 -D /data/zedstore/testdbThe files belonging to this database system will be owned by user "zedstore".This user must also own the server process.The database cluster will be initialized with locale "en_US.UTF-8".The default text search configuration will be set to "english".Data page checksums are disabled.creating directory /data/zedstore/testdb ... okcreating subdirectories ... okselecting dynamic shared memory implementation ... posixselecting default max_connections ... 100selecting default shared_buffers ... 128MBselecting default timezone ... PRCcreating configuration files ... okrunning bootstrap script ... okperforming post-bootstrap initialization ... oksyncing data to disk ... okinitdb: warning: enabling "trust" authentication for local connectionsYou can change this by editing pg_hba.conf or using the option -A, or--auth-local and --auth-host, the next time you run initdb.Success. You can now start the database server using: pg_ctl -D /data/zedstore/testdb -l logfile start
下面来对比一下head am和zedstore的性能差异
PG
testdb=# create table t_olap(id int,c1 int,c2 varchar(20));CREATE TABLE testdb=# insert into t_olap select x,x,'c2'||x from generate_series(1,5000000) as x;INSERT 0 5000000testdb=#
执行查询
testdb=# explain analyze select avg(id),sum(c1),max(c2) from t_olap; QUERY PLAN ------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=69209.40..69209.40 rows=1 width=72) (actual time=925.540..925.540 rows=1 loops=1) -> Gather (cost=69209.17..69209.38 rows=2 width=72) (actual time=925.284..932.688 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=68209.17..68209.18 rows=1 width=72) (actual time=911.539..911.539 rows=1 loops=3) -> Parallel Seq Scan on t_olap (cost=0.00..52584.24 rows=2083324 width=17) (actual time=0.037..240.287 rows=1666667 loops=3) Planning Time: 22.703 ms Execution Time: 933.020 ms(8 rows)
执行时间为933ms
ZedStore
[zedstore@localhost testdb]$ psql -d testdbpsql (13devel)Type "help" for help.testdb=# \dA+ List of access methods Name | Type | Handler | Description ----------+-------+--------------------------+---------------------------------------- brin | Index | brinhandler | block range index (BRIN) access method btree | Index | bthandler | b-tree index access method gin | Index | ginhandler | GIN index access method gist | Index | gisthandler | GiST index access method hash | Index | hashhandler | hash index access method heap | Table | heap_tableam_handler | heap table access method spgist | Index | spghandler | SP-GiST index access method zedstore | Table | zedstore_tableam_handler | zedstore table access method(8 rows)testdb=# create table t_olap(id int,c1 int,c2 varchar(20)) using zedstore;CREATE TABLEtestdb=# \d+ t_olap Table "public.t_olap" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+-----------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | c1 | integer | | | | plain | | c2 | character varying(20) | | | | extended | | Access method: zedstoretestdb=# insert into t_olap select x,x,'c2'||x from generate_series(1,5000000) as x;INSERT 0 5000000
执行查询
testdb=# explain analyze select avg(id),sum(c1),max(c2) from t_olap; QUERY PLAN ------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=31425.10..31425.11 rows=1 width=72) (actual time=1707.755..1707.756 rows=1 loops=1) -> Gather (cost=31424.87..31425.08 rows=2 width=72) (actual time=1659.121..1710.512 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=30424.87..30424.88 rows=1 width=72) (actual time=1647.216..1647.217 rows=1 loops=3) -> Parallel Seq Scan on t_olap (cost=0.00..24130.07 rows=839307 width=17) (actual time=0.418..1124.465 rows=1666667 loops=3) Planning Time: 1.907 ms Execution Time: 1753.191 ms(8 rows)
执行时间为1753ms,列式存储似乎没有发挥作用?待续.
参考资料
PostgreSQL 基于access method api的列存zedstore
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。