DEV Community

Tomasz Wegrzanowski
Tomasz Wegrzanowski

Posted on

100 Languages Speedrun: Episode 39: SQLite

SQLite is likely the world's most popular database. It can be embedded in apps, and so many apps use it.

For example if you use Google Chrome, then you can open your history with sqlite3 and see which URL you visited most often:

$ sqlite3 ~/'Library/Application Support/Google/Chrome/Default/History' SQLite version 3.32.3 2020-06-18 14:16:19 Enter ".help" for usage hints. sqlite> .headers on sqlite> select url, title from urls order by visit_count desc limit 1; url|title https://twitter.com/|Home / Twitter 
Enter fullscreen mode Exit fullscreen mode

At least if Google Chrome is not currently running. If it's running SQLite will tell you it's locked instead, as by design it only lets one application use each database file at a time, so copy that file elsewhere to view or modify it.

Anyway, this episode is not about using SQLite for its intended purpose. We're going to have some fun.

SQL

Contrary to very common misconception, SQL is not a language, no more than "Lisp" or "Shell" are a language. Every database has its own "SQL" and they're vastly incompatible for even the simplest tasks. Over years there's been some standards, and some attempts at bringing them closer together, but it's really all failed, and every database has its own "SQL" language.

If you try to run SQL for SQLite on PostgreSQL or MySQL or whichever other system, you'd have about as much luck as trying to run Clojure Lisp code in Racket Lisp, or PowerShell shell code on ZSH shell.

Hello, World!

SQLite wasn't really designed to run SQL scripts without database from #!, so we need to write a small header to trick it into our special mode. Here's Hello, World!:

#!/bin/bash tail -n +5 "$0" | sqlite3 exit $? select "Hello, World!"; 
Enter fullscreen mode Exit fullscreen mode

Which we can run with command line as expected:

$ ./hello.sql Hello, World! 
Enter fullscreen mode Exit fullscreen mode

The "$0" in shell means means current file name - and we need to put it in "" in case path contains spaces, as that breaks shell scripts. tail -n +5 means print everything from a file from line 5 onwards. | sqlite3 means start SQLite and pipe that stuff into its input.

There are of course many other ways to run it. The more reasonable way would be to simply save that SQL to a fire and do sqlite3 <file.sql:

$ cat hello2.sql select "Hello, World!"; $ sqlite3 <hello2.sql Hello, World! 
Enter fullscreen mode Exit fullscreen mode

Or echo it from within shell - this is not really recommended for anything nontrivial, as we need to deal with shell quoting and evaluation issues:

$ echo 'select "Hello, World!";' | sqlite3 Hello, World! $ sqlite3 <<<'select "Hello, World!";' Hello, World! 
Enter fullscreen mode Exit fullscreen mode

Going on, I won't be listing the tail header in the examples, just the SQL part.

Loops

The first problem we run into is looping. We have no database, and we'd like to print some integers, and that's surprisingly hard in any shared SQL.

Some databases save us here, for example this works in PostgreSQL returning all numbers from 1 to 10:

select * from generate_series(1, 10); 
Enter fullscreen mode Exit fullscreen mode

In SQLite we can list all the numbers, but that's of course not reasonable beyond just a few:

select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9 union select 10; 
Enter fullscreen mode Exit fullscreen mode

The best way to do so seems to be "recursive Common Table Expressions" (recursive CTEs). SQL doesn't allow recursion in general, but it's possible to do so in some limited cases.

with recursive generate_series(value) as ( select 1 union all select value+1 from generate_series where value + 1 <= 10 ) select value from generate_series; 
Enter fullscreen mode Exit fullscreen mode

FizzBuzz

Once we have the hard problem of looping solved, FizzBuzz itself is very easy:

#!/bin/bash tail -n +5 "$0" | sqlite3 exit $? with recursive generate_series(value) as ( select 1 union all select value+1 from generate_series where value + 1 <= 100 ) select case when value % 15 = 0 then 'FizzBuzz' when value % 5 = 0 then 'Buzz' when value % 3 = 0 then 'Fizz' else value end from generate_series; 
Enter fullscreen mode Exit fullscreen mode

Fibonacci

We can use a recursive CTE again, with all the extra variables we want. We just only select the ones we care about afterwards (a), and ignore all the rest (b, i):

#!/bin/bash tail -n +5 "$0" | sqlite3 exit $? with recursive fib(a, b, i) as ( select 1, 1, 1 union all select b, a + b, i + 1 from fib where i + 1 <= 100 ) select a from fib; 
Enter fullscreen mode Exit fullscreen mode

SQLite uses floats so at some point it will lose precision:

$ ./fib.sql 1 1 2 3 5 8 13 21 34 55 .. 5.16807088548583e+19 8.36211434898484e+19 1.35301852344707e+20 2.18922995834555e+20 3.54224848179262e+20 
Enter fullscreen mode Exit fullscreen mode

CSV

SQL can import and export to CSV, but let's pretend it doesn't, and we need to process some CSV data.

I'll actually use temporary tables for this one, so I can do this step by step - but it all works as a monster expression.

I asked GitHub Copilot to generate some data of people and their favorite programming languages:

create table csv(data); insert into csv values('first name,last name,favorite language Alice,Smith,Ruby Bob,Smith,Python Charlie,Brown,JavaScript Daniel,Clark,PHP Ed,Jones,C Frank,Smith,HTML Gary,Johnson,CSS Heather,Williams,JavaScript Ivan,Smith,Ruby Jack,Jones,JavaScript John,Brown,PHP Karen,Clark,JavaScript Larry,Jones,HTML Mike,Williams,CSS Nathan,Smith,Ruby Oscar,Brown,JavaScript Peter,Clark,PHP Quinn,Jones,JavaScript Ralph,Smith,HTML Sally,Williams,CSS Tom,Smith,Ruby Will,Jones,JavaScript Xavier,Brown,PHP Yvonne,Clark,JavaScript Zachary,Jones,HTML '); 
Enter fullscreen mode Exit fullscreen mode

Let's say we want to get a result where each line is programming language, and then list of everyone who likes it.

First, let's split it into lines:

create table lines(lineno, line); insert into lines with recursive split(lineno, line, str) as ( select 0, null, data from csv union all select lineno + 1, substr(str, 0, instr(str, char(10))), substr(str, instr(str, char(10))+1) from split where str != '' ) select lineno, line from split where line is not null; 
Enter fullscreen mode Exit fullscreen mode

This code recursively turns "A\nB\nC\n" into "A" and "B\nC\n" - until result is empty. Then it throws out those strings.

SQLite doesn't have any way to escape \n, so we need to do char(10). It's also important that this input ends with \n, otherwise the expression will loop forever. We could add some more checks to deal with it, but let's just deal with the happy path.

Data looks like this (first line is SQL header which you can get with .headers on, SQL output is separated by | by default):

lineno|line 1|first name,last name,favorite language 2|Alice,Smith,Ruby 3|Bob,Smith,Python 4|Charlie,Brown,JavaScript 5|Daniel,Clark,PHP ... 
Enter fullscreen mode Exit fullscreen mode

Now let's do this again for each line. As lines don't end with , we need to (line||',') for our loop to work (|| is string concatenation, not logical or):

create table cells(lineno, colno, cell); insert into cells with recursive split(lineno, colno, cell, str) as ( select lineno, 0, null, (line||',') from lines union all select lineno, colno + 1, substr(str, 0, instr(str, ',')), substr(str, instr(str, ',')+1) from split where str != '' ) select lineno, colno, cell from split where cell is not null; 
Enter fullscreen mode Exit fullscreen mode

Data looks like this:

lineno|colno|cell 1|1|first name 1|2|last name 1|3|favorite language 2|1|Alice 2|2|Smith 2|3|Ruby 3|1|Bob 3|2|Smith ... 
Enter fullscreen mode Exit fullscreen mode

Now we could work with those column numbers, but I'd much prefer to transform it so we can see keys and values.

create table cellvals(lineno, k, v); insert into cellvals select c.lineno, h.cell, c.cell from cells h inner join cells c on c.colno = h.colno and h.lineno = 1 and c.lineno != 1; 
Enter fullscreen mode Exit fullscreen mode

Data looks like this now:

lineno|k|v 2|first name|Alice 2|last name|Smith 2|favorite language|Ruby 3|first name|Bob 3|last name|Smith 3|favorite language|Python ... 
Enter fullscreen mode Exit fullscreen mode

Now let's turn this into something SQL friendly:

create table preferences(full_name, language); insert into preferences select fn.v||' '||ln.v, fl.v from cellvals fn inner join cellvals ln on fn.k='first name' and ln.k='last name' and fn.lineno = ln.lineno inner join cellvals fl on fl.k='favorite language' and fn.lineno = fl.lineno; 
Enter fullscreen mode Exit fullscreen mode

Data looks like this now:

full_name|language Alice Smith|Ruby Bob Smith|Python Charlie Brown|JavaScript ... 
Enter fullscreen mode Exit fullscreen mode

And finally we run the query we wanted:

select language || ',' || group_concat(full_name, ',') from preferences group by language; 
Enter fullscreen mode Exit fullscreen mode

After all this:

$ ./csv.sql C,Ed Jones CSS,Gary Johnson,Mike Williams,Sally Williams HTML,Frank Smith,Larry Jones,Ralph Smith,Zachary Jones JavaScript,Charlie Brown,Heather Williams,Jack Jones,Karen Clark,Oscar Brown,Quinn Jones,Will Jones,Yvonne Clark PHP,Daniel Clark,John Brown,Peter Clark,Xavier Brown Python,Bob Smith Ruby,Alice Smith,Ivan Smith,Nathan Smith,Tom Smith 
Enter fullscreen mode Exit fullscreen mode

The same as monster expression

Of course it would be more fun to do this without any temporary tables:

with csv(data) as ( select 'first name,last name,favorite language Alice,Smith,Ruby Bob,Smith,Python Charlie,Brown,JavaScript Daniel,Clark,PHP Ed,Jones,C Frank,Smith,HTML Gary,Johnson,CSS Heather,Williams,JavaScript Ivan,Smith,Ruby Jack,Jones,JavaScript John,Brown,PHP Karen,Clark,JavaScript Larry,Jones,HTML Mike,Williams,CSS Nathan,Smith,Ruby Oscar,Brown,JavaScript Peter,Clark,PHP Quinn,Jones,JavaScript Ralph,Smith,HTML Sally,Williams,CSS Tom,Smith,Ruby Will,Jones,JavaScript Xavier,Brown,PHP Yvonne,Clark,JavaScript Zachary,Jones,HTML ' ), lines(lineno, line) as ( with recursive split(lineno, line, str) as ( select 0, null, data from csv union all select lineno + 1, substr(str, 0, instr(str, char(10))), substr(str, instr(str, char(10))+1) from split where str != '' ) select lineno, line from split where line is not null ), cells(lineno, colno, cell) as ( with recursive split(lineno, colno, cell, str) as ( select lineno, 0, null, (line||',') from lines union all select lineno, colno + 1, substr(str, 0, instr(str, ',')), substr(str, instr(str, ',')+1) from split where str != '' ) select lineno, colno, cell from split where cell is not null ), cellvals(lineno, k, v) as ( select c.lineno, h.cell, c.cell from cells h inner join cells c on c.colno = h.colno and h.lineno = 1 and c.lineno != 1 ), preferences(full_name, language) as ( select fn.v||' '||ln.v, fl.v from cellvals fn inner join cellvals ln on fn.k='first name' and ln.k='last name' and fn.lineno = ln.lineno inner join cellvals fl on fl.k='favorite language' and fn.lineno = fl.lineno ) select language || ',' || group_concat(full_name, ',') from preferences group by language; 
Enter fullscreen mode Exit fullscreen mode

Should you use SQLite?

Double Yes!

It's totally great for its intended purpose of having tiny databases embedded in your apps. It's far less hassle than setting up an SQL or NoSQL database server, is expressive enough, performant enough, and everyone knows basic SQL (even if of a different kind) so learning curve is very low.

As for doing programming in SQL, also yes. Unlike let's say CSS where "writing games in pure CSS" is a skill nearly orthogonal to regular use of making websites look pretty, with SQL all those silly exercises can help you when you write real queries.

Code

All code examples for the series will be in this repository.

Code for the SQLite episode is available here.

Top comments (0)