Safe Haskell | None |
---|---|
Language | Haskell2010 |
Data.ArithEncode
Description
ArithEncode is a library that provides tools for defining arithmetic encodings for arbitrary datatypes. The library is designed so that multiple encoding schemes can be defined for a given datatype, and a given encoding need not encode all possible instances of the datatype.
An Encoding
is an object which is passed as the first argument to seven different functions. The primary function of an Encoding
is manifest in the encode
and decode
functions, which define an isomorphism between the datatype and the natural numbers (or a bounded set thereof), represented using Integer
s. The encode
and decode
functions have the following properties:
decode enc (encode enc v) == v
for all valuesv
in the domainencode enc v == encode enc w
only ifw == v
decode enc n == decode enc m
only ifn == m
The inDomain
function indicates whether or not a given value is in the domain of the encoding. Passing a value v
where inDomain enc v == False
into any other function may result in an IllegalArgument
exception. (For performance reasons, encodings are not strictly required to throw IllegalArgument
, but the result should not be considered valid if they do not throw the exception).
This library provides a large collection of combinators for constructing more complex Encoding
s out of simpler ones. The provided combinators should be appropriate for constructing Encoding
s for most datatypes.
As an example, the following definition creates an Encoding
for the Tree Integer
type:
tree :: Encoding (Tree Integer) tree = let ... nodeEncoding nodeenc = wrap unmakeNode makeNode (pair interval (seq nodeenc)) in recursive nodeEncoding
In this example, the makeNode
and unmakeNode
functios are simply "glue"; their definitions are
makeNode (label, children) = Node { rootLabel = label, subForest = children } unmakeNode Node { rootLabel = label, subForest = children } = Just (label, children)
The resulting Encoding
maps any Tree Integer
to a unique Integer
value.
Encoding
s have a number of practical uses. First, all Encoding
s in this library satisfy a completeness property, which guarantees that they map each value to a finite natural number (or in the case of constructions on Encoding
s, they preserve completeness). Hence, they can be used as an enumeration procedure for their underlying datatype.
Second, as Encoding
s define an isomorphism to the natural numbers, they provide an efficient binary encode/decode procedure in theory. In practice, the techniques used to guarantee the completeness property may result in long encodings of some types (particularly sequences). Also, knowledge of the distribution of the domain is necessary in order to achieve the most succinct possible encoding.