Simple SQL Parser

Overview

A parser for SQL in Haskell. Also includes a pretty printer which formats SQL.

This is the documentation for version 0.8.0. Documentation for other versions is available here: http://jakewheat.github.io/simple-sql-parser/.

Status: usable for parsing a substantial amount of SQL. Adding support for new SQL is easy. Expect a little bit of churn on the AST types when support for new SQL features is added.

This version is tested with GHC 9.10.1, 9.8.2, 9.6.6.

Examples

Parse a SQL statement:

ghci> :set -XOverloadedStrings
ghci> import Language.SQL.SimpleSQL.Parse
ghci> import qualified Data.Text as T
ghci> either (T.unpack . prettyError) show $ parseStatement ansi2011 "" Nothing "select a + b * c"
"SelectStatement (Select {qeSetQuantifier = SQDefault, qeSelectList = [(BinOp (Iden [Name Nothing \"a\"]) [Name Nothing \"+\"] (BinOp (Iden [Name Nothing \"b\"]) [Name Nothing \"*\"] (Iden [Name Nothing \"c\"])),Nothing)], qeFrom = [], qeWhere = Nothing, qeGroupBy = [], qeHaving = Nothing, qeOrderBy = [], qeOffset = Nothing, qeFetchFirst = Nothing})"

The result printed readably:

[ SelectStatement
    Select
      { qeSetQuantifier = SQDefault
      , qeSelectList =
          [ ( BinOp
                (Iden [ Name Nothing "a" ])
                [ Name Nothing "+" ]
                (BinOp
                   (Iden [ Name Nothing "b" ])
                   [ Name Nothing "*" ]
                   (Iden [ Name Nothing "c" ]))
            , Nothing
            )
          ]
      , qeFrom = []
      , qeWhere = Nothing
      , qeGroupBy = []
      , qeHaving = Nothing
      , qeOrderBy = []
      , qeOffset = Nothing
      , qeFetchFirst = Nothing
      }
]

Formatting SQL, TPC-H query 21:

select
        s_name,
        count(*) as numwait
from
        supplier,
        lineitem l1,
        orders,
        nation
where
        s_suppkey = l1.l_suppkey
        and o_orderkey = l1.l_orderkey
        and o_orderstatus = 'F'
        and l1.l_receiptdate > l1.l_commitdate
        and exists (
                select
                        *
                from
                        lineitem l2
                where
                        l2.l_orderkey = l1.l_orderkey
                        and l2.l_suppkey <> l1.l_suppkey
        )
        and not exists (
                select
                        *
                from
                        lineitem l3
                where
                        l3.l_orderkey = l1.l_orderkey
                        and l3.l_suppkey <> l1.l_suppkey
                        and l3.l_receiptdate > l3.l_commitdate
        )
        and s_nationkey = n_nationkey
        and n_name = 'INDIA'
group by
        s_name
order by
        numwait desc,
        s_name
fetch first 100 rows only;

Output from the simple-sql-parser pretty printer:

select s_name, count(*) as numwait
from supplier,
     lineitem as l1,
     orders,
     nation
where s_suppkey = l1.l_suppkey
      and o_orderkey = l1.l_orderkey
      and o_orderstatus = 'F'
      and l1.l_receiptdate > l1.l_commitdate
      and exists (select *
                  from lineitem as l2
                  where l2.l_orderkey = l1.l_orderkey
                        and l2.l_suppkey <> l1.l_suppkey)
      and not exists (select *
                      from lineitem as l3
                      where l3.l_orderkey = l1.l_orderkey
                            and l3.l_suppkey <> l1.l_suppkey
                            and l3.l_receiptdate > l3.l_commitdate)
      and s_nationkey = n_nationkey
      and n_name = 'INDIA'
group by s_name
order by numwait desc, s_name
fetch first 100 rows only;
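
The formatting shown above can be driven from Haskell by parsing a statement and handing the result to the pretty printer. The following is a minimal sketch, not a definitive recipe. It assumes that prettyStatement in Language.SQL.SimpleSQL.Pretty takes a dialect and a statement and returns Text, and that parseStatement takes its arguments in the order used in the ghci session earlier. formatSql is a name made up for this example. Check the haddock for the version you have installed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Text (Text)
import qualified Data.Text.IO as T

import Language.SQL.SimpleSQL.Dialect (ansi2011)
import Language.SQL.SimpleSQL.Parse (parseStatement, prettyError)
-- Assumption: prettyStatement :: Dialect -> Statement -> Text
import Language.SQL.SimpleSQL.Pretty (prettyStatement)

-- | Hypothetical helper: parse a single statement using the ANSI 2011
-- dialect and render it back to SQL text.
-- A parse failure is returned as Left with the formatted error message.
formatSql :: Text -> Either Text Text
formatSql src =
    case parseStatement ansi2011 "" Nothing src of
        Left err   -> Left (prettyError err)
        Right stmt -> Right (prettyStatement ansi2011 stmt)

main :: IO ()
main =
    -- Print either the error message or the formatted SQL.
    either T.putStrLn T.putStrLn $ formatSql "select a + b * c from t"
```

Returning Either keeps the parse error available to the caller instead of printing it inside the helper, which makes the function easier to reuse in tests.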

Supported SQL overview

See the supported SQL page (supported_sql.html) for details on the supported SQL.

All the test cases are also rendered on a single page (test_cases.html), so you can get an idea of what is supported and what various pieces of SQL parse to.

Installation

This package is on Hackage; use it in the usual way. You can install the SimpleSQLParserTool demo exe using:

cabal install -fparserexe simple-sql-parser

Reporting bugs

Please report bugs here: https://github.com/JakeWheat/simple-sql-parser/issues

A good bug report (or feature request) should have an example of the SQL which is failing. You can expect bugs to get fixed.

Feature requests are welcome, but be aware that there is no-one generally available to work on these, so you should either make a pull request, or find someone willing to implement the features and make a pull request.

Bug reports of confusing or poor parse errors are also encouraged.

There is a related tutorial on implementing a SQL parser here: http://jakewheat.github.io/intro_to_parsing/ (TODO: this is out of date, in the process of being updated)

Modifying the library

Get the latest development version:

git clone https://github.com/JakeWheat/simple-sql-parser.git
cd simple-sql-parser
cabal build

You can run the tests using cabal:

cabal test

Or use the makefile target:

make test

To skip some of the slow lexer tests, which you usually only need to run before each commit, use:

make fast-test

When you add support for new syntax: add some tests. If you modify or fix something, and it doesn’t have tests, add some. If the syntax isn’t in ANSI SQL, guard it behind a dialect flag. If you add support for something from a new dialect, add that dialect.
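
To illustrate what guarding syntax behind a dialect flag can look like, here is a small self-contained sketch. It deliberately does not use the library: ToyDialect, tdAllowLimit, toyAnsi, toyMysql and parseLimit are all hypothetical names invented for this example, and the library's real Dialect type and parser code are organised differently. The idea it shows is only that non-ANSI syntax is rejected unless the dialect switches it on, and that both cases are worth testing.

```haskell
module Main (main) where

import Data.Char (isDigit)

-- Hypothetical stand-in for a dialect record.
-- This is NOT the library's Dialect type.
data ToyDialect = ToyDialect
    { tdName       :: String
    , tdAllowLimit :: Bool  -- is the non-ANSI "limit n" clause enabled?
    }

-- The ANSI-like dialect leaves the non-standard syntax switched off.
toyAnsi :: ToyDialect
toyAnsi = ToyDialect { tdName = "ansi", tdAllowLimit = False }

-- A dialect that opts in to the extra syntax.
toyMysql :: ToyDialect
toyMysql = toyAnsi { tdName = "mysql", tdAllowLimit = True }

-- Parse a "limit <digits>" clause, but only when the dialect flag is set.
parseLimit :: ToyDialect -> String -> Either String Int
parseLimit d input
    | not (tdAllowLimit d) =
        Left ("limit clause is not enabled for dialect: " ++ tdName d)
    | otherwise =
        case words input of
            ["limit", n] | all isDigit n -> Right (read n)
            _ -> Left "expected: limit <number>"

main :: IO ()
main = do
    -- Same input, two dialects: accepted by one, rejected by the other.
    print (parseLimit toyMysql "limit 10")
    print (parseLimit toyAnsi "limit 10")
```

A test for a new flag would typically check both directions: the syntax parses when the flag is on, and fails with a sensible error when it is off.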

Check that all the tests still pass, then send a pull request on GitHub.

Links

The simple-sql-parser is a lot less simple than it used to be. If you only need to parse much simpler SQL than this, or want to start with a simpler parser and modify it slightly, you could also look at the basic query parser in the intro_to_parsing project. The code is here: https://github.com/JakeWheat/intro_to_parsing/blob/master/SimpleSQLQueryParser0.lhs (TODO: this is out of date, in the process of being updated).