Commit f5665e1

committed
Claudio doc contribs
1 parent 3c85221 commit f5665e1

9 files changed, +37 -23 lines changed

doc/www/lumo-architecture.md

Lines changed: 8 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -195,25 +195,25 @@ Single-level store concepts are well-explained in [Howard Chu's 2013 MDB Paper](
195195
> Store". The basic idea is to treat all of computer memory as a single address
196196
> space. Pages of storage may reside in primary storage (RAM) or in secondary
197197
> storage (disk) but the actual location is unimportant to the application. If
198-
> a referenced page is currently in primary storagethe application can use it
198+
> a referenced page is currently in primary storage the application can use it
199199
> immediately, if not a page fault occurs and the operating system brings the
200200
> page into primary storage. The concept was introduced in 1964 in the Multics
201201
> operating system but was generally abandoned by the early 1990s as data
202202
> volumes surpassed the capacity of 32 bit address spaces. (We last knew of it
203-
> in the ApolloDOMAIN operating system, though many other Multics-influenced
203+
> in the Apollo DOMAIN operating system, though many other Multics-influenced
204204
> designs carried it on.) With the ubiquity of 64 bit processors today this
205205
> concept can again be put to good use. (Given a virtual address space limit of
206-
> 63 bits that puts the upper bound of database size at 8exabytes. Commonly
207-
> available processors today only implement 48 bit address spaces,limiting us
206+
> 63 bits that puts the upper bound of database size at 8 exabytes. Commonly
207+
> available processors today only implement 48 bit address spaces, limiting us
208208
> to 47 bits or 128 terabytes.) Another operating system requirement for this
209209
> approach to be viable is a Unified BufferCache. While most POSIX-based
210210
> operating systems have supported an mmap() system call for many years, their
211-
> initial implementations kept memory managed by the VM subsystemseparate from
211+
> initial implementations kept memory managed by the VM subsystem separate from
212212
> memory managed by the filesystem cache. This was not only wasteful
213-
> (again,keeping data cached in two places at once) but also led to coherency
213+
> (again, keeping data cached in two places at once) but also led to coherency
214214
> problems - data modified through a memory map was not visible using
215-
> filesystem read() calls, or data modifiedthrough a filesystem write() was not
216-
> visible in the memory map. Most modern operatingsystems now have filesystem
215+
> filesystem read() calls, or data modified through a filesystem write() was not
216+
> visible in the memory map. Most modern operating systems now have filesystem
217217
> and VM paging unified, so this should not be a concern in most deployments.
218218
219219

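The address-space figures in the quoted passage can be checked with a few lines of arithmetic. This is only a sketch: the helper name `addressable_bytes` is hypothetical, and it assumes the quote's "exabytes" and "terabytes" are binary units (2^60 and 2^40 bytes).

```python
# Assumption: binary units, i.e. 1 TiB = 2**40 bytes and 1 EiB = 2**60 bytes.
TIB = 2**40
EIB = 2**60


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses in an address space of this width."""
    return 2**address_bits


# 63 usable bits of virtual address space.
print(addressable_bytes(63) // EIB, "EiB")    # 8 EiB
# 48-bit hardware, of which 47 bits are usable by the application.
print(addressable_bytes(47) // TIB, "TiB")    # 128 TiB
# A 32-bit address space, the limit that data volumes outgrew.
print(addressable_bytes(32) // 2**30, "GiB")  # 4 GiB
```

The 4 GiB ceiling of a 32-bit address space is what made the single-level store impractical once databases grew beyond it.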
doc/www/lumo-benchmarking.md

Lines changed: 2 additions & 2 deletions
@@ -37,7 +37,7 @@ field of SQL databases, and certainly in the SQLite landscape.
3737
Benchmarking is a big part of LumoSQL, to determine if changes are an
3838
improvement. The trouble is that SQLite and other top databases are not really
3939
benchmarked in a realistic and consistent way, despite SQL server benchmarking
40-
using tools like TPCC from [tpc.org](https://tpc.org) being an obsessive
40+
using tools like TPC-C from [tpc.org](http://tpc.org) being an obsessive
4141
industry in itself, and many testing tools released with SQLite, Postgresql,
4242
MariaDB etc. But in practical terms there is no way of comparing the most-used
4343
databases with each other, or even of being sure that the tests that do exist
@@ -223,7 +223,7 @@ groups, those using:
223223
deployments, who are likely to a wider range of the supported SQL features
224224

225225
The embedded style of statement is typically used within the application process
226-
space,the code written by these developers is often tightly coupled with the
226+
space, the code written by these developers is often tightly coupled with the
227227
SQLite library. The online style of SQL statement is typically more loosely
228228
coupled with the database implementation and these developers may switch to
229229
execute similar SQL statements on different databases. Further this second style

doc/www/lumo-corruption-detection-and-magic.md

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ SQLite needs row-level integrity checking even more than the online databases be
8787
* it is easy to backup an SQLite database partway through a transaction, meaning that the restore will be corrupted
8888
* SQLite does not have robust locking mechanisms available for access by multiple processes at once, since it relies on lockfiles and Posix advisory locking
8989
* SQLite provides the [VFS API Interface](https://www.sqlite.org/vfs.html) which users can easily misuse to ignore locking via the sql3_*v2 APIs
90-
* the on-disk file format is seemingly often corrupted regardless of use case. Better evidence on this is needed but authors of SQLite data file recovery software (see listing in [SQLite Relevant Knowledgebase](./lumo-relevant-knowledebase)) indicates high demand for their servies. Informal shows of hands at conferences indicates that SQLite users expect corruption.
90+
* the on-disk file format is seemingly often corrupted regardless of use case. Better evidence on this is needed but authors of SQLite data file recovery software (see listing in [SQLite Relevant Knowledgebase](./lumo-relevant-knowledebase)) indicates high demand for their services. Informal shows of hands at conferences indicates that SQLite users expect corruption.
9191

9292
sqlite.org has a much more detailed, but still incomplete, summary of [How to Corrupt an SQLite Database](https://www.sqlite.org/howtocorrupt.html).
9393

doc/www/lumo-landscape.md

Lines changed: 4 additions & 4 deletions
@@ -85,7 +85,7 @@ profile and attracts technical review.
8585

8686
**Sqlite.org is entirely focussed on its existing scope and traditional user
8787
needs** and [SQLite imposes strict limits](https://sqlite.org/about.html) for
88-
example “Think of SQLite not as a replacement for but as a replacement for
88+
example “Think of SQLite not as a replacement for Oracle but as a replacement for
8989
fopen()” which eliminates many of the possibilities LumoSQL is now exploring
9090
that go beyond any version of fopen(). Many things that SQLite can used
9191
for are [declared out of scope](https://sqlite.org/whentouse.html) by the
@@ -98,7 +98,7 @@ project.
9898
**Sqlite has a very strict and reliable view on maintaining backwards
9999
compatibility both binary and API (except when it comes to encryption, see
100100
further down.)** The Sqlite foundation aims to keep SQLite3 interfaces and
101-
formats stable until the year 2050 years, according to Richard Hipp in the
101+
formats stable until the year 2050, according to Richard Hipp in the
102102
podcast interview, as once requested by an airframe construction company
103103
(Airbus). Whatever happens in years to come SQLite has definitely delivered on
104104
this to date. This means that there are many things SQLite cannot do which
@@ -156,7 +156,7 @@ serious problems with it too:
156156
SQLite binary format has almost zero internal integrity checking.
157157
LumoSQL aims to add options to address this problem.
158158

159-
**Sqlite is less open source than it appears**. The existance of so many SQLite
159+
**Sqlite is less open source than it appears**. The existence of so many SQLite
160160
spin-offs is evidence that SQLite code is highly available. However there are
161161
several aspects of SQLite that mean it cannot be considered open source, in
162162
ways that are increasingly important in the 21st century:
@@ -235,7 +235,7 @@ SQLite Downstreams
235235

236236
There is still a lot for LumoSQL to explore because there is just so much code, but
237237
as of March 2020 we are confident code could be assembled from here and there
238-
and there on the internet to demonstrate the following features:
238+
on the internet to demonstrate the following features:
239239

240240
- SQLite with Berkely bdb backend
241241
- SQLite with LevelDB backend

doc/www/lumo-legal-aspects.md

Lines changed: 1 addition & 1 deletion
@@ -155,7 +155,7 @@ has been released as "Public Domain"
155155

156156
SQLite is not available with encryption. There are two common ways of adding encryption to SQLite, both of which have legal implications:
157157

158-
1. Purchasing the [SQLite Encryption Extension](https://www.hwaci.com/sw/sqlite/see.html)(SEE) from Richard Hipp's company Hwaci. The SEE is proprietary software, and cannot be use with open source applications.
158+
1. Purchasing the [SQLite Encryption Extension](https://www.hwaci.com/sw/sqlite/see.html)(SEE) from Richard Hipp's company Hwaci. The SEE is proprietary software, and cannot be used with open source applications.
159159
2. [SQLcipher](https://www.zetetic.net/sqlcipher/) which has a open core model. The BSD-licensed open source version requires users to publish copyright notices, and the more capable commercial editions are available on similar terms to SEE, and therefore cannot be used with open source applications.
160160

161161
There are many other ways of adding encryption to SQLite, some of which are listed in the [Knowledgebase Relevant to LumoSQL](./lumo-relevant-knowledgebase.md).

doc/www/lumo-not-forking.md

Lines changed: 17 additions & 3 deletions
@@ -88,7 +88,7 @@ numbers in order are:
8888

8989
We may extend this definition to deal with version numbering schemes
9090
used by normal software, however it will never work correctly with the
91-
version numbers used by INTERCAL compilers.
91+
version numbers used by [INTERCAL](https://en.wikipedia.org/wiki/INTERCAL) compilers.
9292

9393
The `subtree` key indicates a directory inside the sources to use instead
9494
of the top level.
@@ -104,7 +104,8 @@ keys need to be present:
104104
either a single string which is prefixed to the version number, or two
105105
strings separated by space, the first one is prefixed and the second appended.
106106
- optionally, `user` and `password` can be specified to obtain access to the
107-
repository.
107+
repository (this is currently not implemented, all repositories must be
108+
accessible without authentication).
108109

109110
A software version can be identified by a generic git commit ID, or by a
110111
version string similar to the one described for the `compare` key, if the
@@ -147,7 +148,9 @@ a modification is only necessary up to a particular version, because
147148
for example that modification has been accepted by upstream and is
148149
no longer necessary. Another use of this key is to identify versions
149150
in which substantial upstream changes make it difficult to specify a
150-
modification which works for every possible version.
151+
modification which works for every possible version. Specifying this
152+
keyword is essentially equivalent to put the whole `.mod` file in
153+
a conditional.
151154
- `method`; the method used to specify the modification; currently, the
152155
value can be either `patch`, indicating that the final part of the file is
153156
in a format suitable for passing as standard input to the "patch" program;
@@ -162,6 +165,10 @@ currently no other keys for the `replace` method, and the following for
162165
the `patch` method:
163166

164167
- `options`: options to pass to the "patch" program (default: "-Nsp1")
168+
- `list`: extra options to the "patch" program to list what it would do
169+
instead of actually doing it (this is used internally to figure out
170+
what changes; the default currently assumes the "patch" program provided
171+
by most Linux distributions)
165172

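How the `options` and `list` keys might combine into a "patch" command line can be sketched as follows. This is an illustration, not the not-forking tool's actual code: the function name `build_patch_command` is hypothetical, and `--dry-run` (a real GNU patch flag) is only an assumed stand-in for the `list` default, which the documentation does not spell out.

```python
import shlex
from typing import List


def build_patch_command(options: str = "-Nsp1",
                        list_options: str = "--dry-run",
                        list_only: bool = False) -> List[str]:
    """Build an argv list for the "patch" program.

    options      -- value of the `options` key (documented default "-Nsp1")
    list_options -- value of the `list` key; "--dry-run" is an ASSUMED default
    list_only    -- if True, ask patch to report instead of modifying files
    """
    argv = ["patch"] + shlex.split(options)
    if list_only:
        # Extra options are appended only when we want a report of what
        # would change rather than the change itself.
        argv += shlex.split(list_options)
    return argv


print(build_patch_command())                # ['patch', '-Nsp1']
print(build_patch_command(list_only=True))  # ['patch', '-Nsp1', '--dry-run']
```

The resulting list could be passed to `subprocess.run` with the patch text on standard input, matching the description of the `patch` method above.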
166173
# Example Configuration directory <a name="example"></a>
167174

@@ -282,3 +289,10 @@ not been modified since; in this case, delete the output directory
282289
completely, or rename it to something else, and run the program again.
283290
There is currently no option to override this safety feature.
284291

292+
We plan to add logging to the notforking tool, in which all messages are
293+
written to a log file (under control of configuration), while the subset
294+
of messages selected by the verbosity setting will go to standard output;
295+
this will allow us to increase the amount of information provided and make
296+
it available if there is a processing error; however in the current version
297+
this is just planned, and not yet implemented.
298+

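The planned logging split (everything to a log file, a verbosity-selected subset to standard output) could look roughly like the sketch below. It uses Python's standard `logging` module purely for illustration and says nothing about how the not-forking tool will implement it; the function name `configure_logging` and the logger name `notfork-sketch` are hypothetical.

```python
import logging
import sys


def configure_logging(log_path: str,
                      verbosity: int = logging.WARNING) -> logging.Logger:
    """Send every message to log_path and only >= verbosity to stdout."""
    logger = logging.getLogger("notfork-sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)  # let every message reach the handlers
    logger.handlers.clear()         # avoid duplicate handlers on repeat calls
    logger.propagate = False        # keep messages away from the root logger

    # The log file receives all messages, regardless of verbosity.
    file_handler = logging.FileHandler(log_path)
    file_handler.setLevel(logging.DEBUG)
    logger.addHandler(file_handler)

    # Standard output only receives the subset selected by verbosity.
    console = logging.StreamHandler(sys.stdout)
    console.setLevel(verbosity)
    logger.addHandler(console)
    return logger
```

With `log = configure_logging("notfork.log")`, a call to `log.debug(...)` is written only to the file, while `log.warning(...)` appears both in the file and on standard output, so the full detail remains available after a processing error.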
doc/www/lumo-project-aims.md

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ nearly all 30 million lines of the Linux kernel can be exclude giving just 200k
100100
lines. Runtime modularity will be controlled through the same user interfaces
101101
as the rest of LumoSQL.
102102

103-
* LumoSQL will ensure that new code can all be active at one, eg
103+
* LumoSQL will ensure that new code can all be active at once, eg
104104
multiple backends or frontends for conversion between/upgrading from one
105105
format or protocol to another. This is crucial to provide continuity and
106106
supported upgrade paths for users, for example, users who want to become

doc/www/lumo-relevant-codebases.md

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,7 @@ What is a Relevant Codebase?
3030

3131
There are three dimensions to codebases relevant to LumoSQL:
3232

33-
1. Code that is a derivitive of SQLite code adding a feature or improvement, and
33+
1. Code that is a derivative of SQLite code adding a feature or improvement, and
3434
2. Code that has nothing to do with SQLite but implements an interesting database feature we want to use in LumoSQL
3535
3. Code that supports the development of LumoSQL such as testing, benchmarking or analysing relevant codebases
3636

@@ -126,7 +126,7 @@ The on-disk file format is important to many SQLite use cases, and introspection
126126

127127
# List of Relevant Benchmarking and Test Knowledge
128128

129-
Benchmarking is a big part of LumoSQL, to determine if changes are an improvement. The trouble is that SQLite and other top databases are not really benchmarked in realistic and consistent way, despite SQL server benchmarking using tools like TCP being an obsessive industry in itself, and there being myriad of testing tools released with SQLite, Postgresql, MariaDB etc. But in practical terms there is no way of comparing the most-used databases with each other, or even of being sure that the tests that do exist are in any way realistic, or even of simply reproducing results that other people have found. LumoSQL covers so many codebases and use cases that better SQL benchmarking is a project requirement. Benchmarking and testing overlap, which is addressed in the code and docs.
129+
Benchmarking is a big part of LumoSQL, to determine if changes are an improvement. The trouble is that SQLite and other top databases are not really benchmarked in realistic and consistent way, despite SQL server benchmarking using tools like TPC being an obsessive industry in itself, and there being myriad of testing tools released with SQLite, Postgresql, MariaDB etc. But in practical terms there is no way of comparing the most-used databases with each other, or even of being sure that the tests that do exist are in any way realistic, or even of simply reproducing results that other people have found. LumoSQL covers so many codebases and use cases that better SQL benchmarking is a project requirement. Benchmarking and testing overlap, which is addressed in the code and docs.
130130

131131
The well-described [testing of SQLite](https://sqlite.org/testing.html) involves some open code, some closed code, and many ad hoc processes. Clearly the SQLite team have an internal culture of testing that has benefitted the world. However that is very different to reproducible testing, which is in turn very different to reproducible benchmarking, and that is even without considering whether the benchmarking is a reasonable approximation of actual use cases.
132132

doc/www/lumo-relevant-knowledgebase.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ Analyser, DB Browser for SQLite, Magnet AXIOM and Oxygen Forensic Detective.)
7878

7979
# List of Relevant Benchmarking and Test Knowledge
8080

81-
Benchmarking is a big part of LumoSQL, to determine if changes are an improvement. The trouble is that SQLite and other top databases are not really benchmarked in realistic and consistent way, despite SQL server benchmarking using tools like TCP being an obsessive industry in itself, and there being myriad of testing tools released with SQLite, Postgresql, MariaDB etc. But in practical terms there is no way of comparing the most-used databases with each other, or even of being sure that the tests that do exist are in any way realistic, or even of simply reproducing results that other people have found. LumoSQL covers so many codebases and use cases that better SQL benchmarking is a project requirement. Benchmarking and testing overlap, which is addressed in the code and docs.
81+
Benchmarking is a big part of LumoSQL, to determine if changes are an improvement. The trouble is that SQLite and other top databases are not really benchmarked in realistic and consistent way, despite SQL server benchmarking using tools like TPC being an obsessive industry in itself, and there being myriad of testing tools released with SQLite, Postgresql, MariaDB etc. But in practical terms there is no way of comparing the most-used databases with each other, or even of being sure that the tests that do exist are in any way realistic, or even of simply reproducing results that other people have found. LumoSQL covers so many codebases and use cases that better SQL benchmarking is a project requirement. Benchmarking and testing overlap, which is addressed in the code and docs.
8282

8383
The well-described [testing of SQLite](https://sqlite.org/testing.html) involves some open code, some closed code, and many ad hoc processes. Clearly the SQLite team have an internal culture of testing that has benefitted the world. However that is very different to reproducible testing, which is in turn very different to reproducible benchmarking, and that is even without considering whether the benchmarking is a reasonable approximation of actual use cases.
8484
