sqlrs is an in-process sql query engine modeled off duckdb in Rust
- The goal of this project is to build a embedded in-process sql query engine for OLAP workloads.
- It leverages the power of Rust ecosystem and Apache Arrow.
- It achieved columnar-vectorized execution engine.
- It will support pipeline parallelism execution in the future.
🚧 The project is actively developing a new planner V2 that inspired by DuckDB, and will replace the planner v1 in the future.🚧
currently, the following SQL statements are supported, execute the commands into interactive mode to test them:
make run
: run sqlrs in planner_v1make run_v2
: run sqlrs in planner_v2
-- supported in Roadmap 0.1 (planner_v1) select first_name from employee where last_name = 'Hopkins'; -- supported in Roadmap 0.2 (planner_v1) select sum(salary+1), count(salary), max(salary) from employee where id > 1; select state, count(state), sum(salary) from employee group by state; -- load csv table \load csv department ./tests/csv/department.csv \load csv employee ./tests/csv/employee.csv -- show tables \dt -- supported in Roadmap 0.3 (planner_v1) select id from employee order by id desc offset 2 limit 1; select * from employee left join state on employee.state=state.state_code and state.state_name!='California State'; -- supported in Roadmap 0.4 (planner_v1) -- explain plan tree \explain select a from t1; -- Heuristic Optimizer that includes rules such as: Column pruning, Predicates pushdown, Limit pushdown etc. -- supported in Roadmap 0.5 (planner_v1) -- distinct select distinct state from employee; select count(distinct(b)) from t2; -- alias select a as c1 from t1 order by c1 desc limit 1; select t.a from t1 t where t.b > 1 order by t.a desc limit 1; -- uncorrelated scalar subquery select t.* from (select * from t1 where a > 1) t where t.b > 7; select a, (select max(b) from t1) max_b from t1; -- supported in Roadmap 0.6 (planner_v2) -- create and insert table in memory create table t1(v1 int, v2 int, v3 int); create table t2 as select * from read_csv('t2.csv'); insert into t1 values (0, 4, 1), (1, 5, 2); select * from t1; -- select only expressions select 1, 2.3, '😇', true, null; -- pragma commands show tables; describe t1; -- previous SQL statements select v1+1 as a from t1 where a >= 2; select v1 from t1 limit 2 offset 1; -- table functions select * from sqlrs_tables(); select * from sqlrs_columns(); select * from read_csv('t1.csv'); select * from read_csv('t1.csv', header=>true, delim=>','); select * from 't1.csv'; -- copy copy t1 from 't1.csv' ( DELIMITER '|', HEADER false); -- date and interval select date '1998-12-01' - interval '1' month; select interval '1' year + date '1998-12-01';
High level description:
- Roadmap 0.1: Build a basic SQL query on CSV storage
- Roadmap 0.2: Support aggregation operators, e2e testing framework and interactive mode
- Roadmap 0.3: Support limit, order, and join operators
- Roadmap 0.4: Introduce a Heuristic Optimizer and common optimization rules
- Roadmap 0.5: Support distinct, alias, and uncorrelated scalar subquery
- Roadmap 0.6: New planner_v2 highly inspired by DuckDB
Please see Roadmap for more information of implementation steps