
I am working on a computational project for a supervisor who works in classical and modern deformation theory; my base framework is the SageMath project. This question is inspired by a serious computational inefficiency, if I may call it that (perhaps it is unavoidable!), in the algorithm for generating Lyndon words. For the sake of creating "free" nilpotent differential graded Lie algebras (and then reducing them modulo whichever relations are necessary), we are lucky in that we can explicitly construct a basis and then reduce the problem to linear algebra, but we are very unlucky that the algorithm is so slow.

I have a pretty decent CPU and an abundance of RAM, and yet (in a Sage runtime terminal) calling LieAlgebra(QQ, 3, step=10), which means "create a free Lie algebra on three generators and insist that the Lie bracket is nilpotent of index 10", causes the terminal to hang for about 5 minutes until the process self-terminates. I believe this is because the source code generates all Lyndon words on three letters of length $1$, then $2$, then $3$, then $4$, and so on, separately and from scratch, and once we get to length $10$ it simply takes too long. The underlying method is some Cython code, lyndon_word_iterator(n, k), with n the number of letters and k the length of the word, which makes no reference to previous computations. Not being able to handle Lie algebras on a mere three generators is a serious issue.
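Schematically, the pattern I am describing looks like the following in a Sage session (an illustrative sketch only, not the actual Cython internals; LyndonWords is the public interface sitting above the iterator):

    # Each length k triggers a fresh, independent enumeration; nothing
    # computed for length k-1 is reused when moving on to length k.
    words = []
    for k in range(1, 11):
        words.extend(LyndonWords(3, k))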

Q: I would guess it would be much faster to follow how $b[w]$ is inductively defined and ask: given a Lyndon word $v$, enumerate all Lyndon words $u$ such that $uv$ is also a Lyndon word. This would allow an enumeration of all Lyndon words of length below a certain threshold to reuse previous calculations and be less wasteful and resource-intensive: are any (good) such algorithms known in the literature? The naive algorithm, sketched below, would be to take a list of all words $u$ (of a given length) which are lexicographically smaller than $v$ and then run a test to make sure $uv$ really is Lyndon; I am sure someone will have found a better algorithm than that.
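For reference, a minimal sketch of that naive approach in plain Python (is_lyndon and lyndon_left_factors are hypothetical names of my own, not Sage API):

    def is_lyndon(w):
        # A nonempty word is Lyndon iff it is lexicographically strictly
        # smaller than every proper right factor (i.e. proper suffix).
        return all(w < w[i:] for i in range(1, len(w)))

    def lyndon_left_factors(v, candidates):
        # Filter a precomputed list of candidate words u for those with
        # u < v such that the concatenation uv is again Lyndon.
        return [u for u in candidates if u < v and is_lyndon(u + v)]

As an aside: when $u$ and $v$ are both Lyndon, a classical lemma says that $u<v$ already implies $uv$ is Lyndon, so the final test only does real work when $u$ ranges over arbitrary words.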

The fact that Sage's Lyndon word iterator is non-inductive, and is called separately for each length, is surely the main fault.


For a brief review of my terminology:

Fix a finite alphabet $A$ of ordered letters, say $x_1<x_2<x_3<\cdots<x_n$. A Lyndon word is a nonempty element of the free monoid $A^\ast$ which is lexicographically strictly smaller than all of its proper right factors. Given a Lyndon word $w$ of length at least two, one may write $w=uv$ where $v$ is the longest proper right factor which is itself a Lyndon word (a classical lemma then guarantees that $u$ is Lyndon as well), and in the free Lie algebra $L$ generated by $A$ over some background field $k$ we define the bracketing $b[w]$ inductively by $b[w]=[b[u],b[v]]$, starting from $b[x_i]=x_i$.

I am also interested in the case of (differential) graded Lie algebras, where it is known that in characteristic $\neq 2,3$ we should define 'super-Lyndon' words to be all Lyndon words together with all squares $ww$ where $w$ is a Lyndon word of odd grading (the grading of $w=x_{n_1}x_{n_2}\cdots$ being $\sum_j|x_{n_j}|$, where the $x_\bullet$ have been assigned some arbitrary integer gradings). One extends the standard bracketing by putting $b[ww]=[b[w],b[w]]$ in this case.
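To make the inductive definition concrete, here is a minimal sketch of the standard factorization and bracketing in plain Python (helper names are my own; the nested pair (b[u], b[v]) stands for the bracket $[b[u],b[v]]$):

    def is_lyndon(w):
        # Strictly smaller than every proper right factor, as above.
        return all(w < w[i:] for i in range(1, len(w)))

    def standard_bracketing(w):
        # b[w] for a Lyndon word w (as a string), returned as a nested pair.
        if len(w) == 1:
            return w  # b[x_i] = x_i
        for i in range(1, len(w)):  # scan right factors, longest first
            if is_lyndon(w[i:]):
                # w = uv with v the longest proper Lyndon right factor
                return (standard_bracketing(w[:i]), standard_bracketing(w[i:]))

For example, standard_bracketing("aab") returns ("a", ("a", "b")), i.e. $[x_1,[x_1,x_2]]$.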

Comments:

  • The Lyndon word iterator is probably not to blame: list(LyndonWords(3, 10)) returns its result in less than 1 second. I think FreeNilpotentLieAlgebra is doing something slow in its __init__ method. Incidentally, I find it weird that a free Lie algebra wants its generators named while a free nilpotent Lie algebra doesn't... (Commented Apr 12 at 14:40)
  • In general, a rule of thumb is that if you are dealing with a CombinatorialFreeModule (i.e., a free module with a basis consisting of combinatorial objects), the combinatorics is rarely the bottleneck; the algebra is. (Commented Apr 12 at 14:41)
  • I've submitted a pull request which improves the speed somewhat with caching. Hopefully it will be accepted, but if not, it is there if you want to compile a local fork of Sage. (Commented May 30 at 20:03)

1 Answer


To answer your direct question: Wikipedia outlines an algorithm from Duval, Jean-Pierre (1988), "Génération d'une section des classes de conjugaison et arbre des mots de Lyndon de longueur bornée", Theoretical Computer Science (in French), 60 (3): 255–283, doi:10.1016/0304-3975(88)90113-2, MR 0979464.

This generates all Lyndon words of length at most $n$ in an interleaved (lexicographic) order, and can be implemented in amortised constant time per word.
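For reference, a short sketch of that generator in plain Python (not Sage's implementation): it yields every Lyndon word over a $k$-letter alphabet of length at most $n$, in lexicographic order.

    def lyndon_words_up_to(k, n):
        # Duval's algorithm: all Lyndon words over {0, ..., k-1} of length
        # at most n, in lexicographic order, amortised O(1) per word.
        w = [-1]
        while w:
            w[-1] += 1                   # increment the final letter
            yield tuple(w)               # w is now the next Lyndon word
            m = len(w)
            while len(w) < n:            # extend w periodically to length n
                w.append(w[-m])
            while w and w[-1] == k - 1:  # strip trailing maximal letters
                w.pop()

For instance, list(lyndon_words_up_to(2, 3)) gives (0,), (0, 0, 1), (0, 1), (0, 1, 1), (1,).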

But, as Darij Grinberg has commented, the Lyndon word generation is not the bottleneck. A simple %prun LieAlgebra(QQ, 3, step=9) shows that most of the time is spent in modules_with_basis.py, in particular in the methods support and leading_support, and the number of invocations of these methods appears to be quadratic in the size of the basis.

My guess was that there's some pre-computation of tables going on, and I think this is the culprit:

https://github.com/sagemath/sage/blob/develop/src/sage/algebras/lie_algebras/nilpotent_lie_algebra.py starting at line 395:

    # extract structural coefficients from the free Lie algebra
    s_coeff = {}
    for dx in range(1, s + 1):
        # Brackets are only computed when deg(X) + deg(Y) <= s
        # We also require deg(Y) >= deg(X) by the ordering
        for dy in range(dx, s + 1 - dx):
            if dx == dy:
                for i, val in enumerate(basis_by_deg[dx]):
                    X_ind, X = val
                    for Y_ind, Y in basis_by_deg[dy][i + 1:]:
                        Z = L[X, Y]
                        if not Z.is_zero():
                            # NOTE: for every nonzero bracket Z, this comprehension
                            # walks the entire degree-(dx + dy) basis and calls
                            # leading_support() once per element, consistent with
                            # the profile above.
                            s_coeff[(X_ind, Y_ind)] = {
                                W_ind: Z[W.leading_support()]
                                for W_ind, W in basis_by_deg[dx + dy]}
            else:
                for X_ind, X in basis_by_deg[dx]:
                    for Y_ind, Y in basis_by_deg[dy]:
                        Z = L[X, Y]
                        if not Z.is_zero():
                            s_coeff[(X_ind, Y_ind)] = {
                                W_ind: Z[W.leading_support()]
                                for W_ind, W in basis_by_deg[dx + dy]}
