I am wondering about the possibility of efficiently (here: in Ptime) representing binary decision trees (BDT) by some other data structure in a way that characterizes their equivalence.
More precisely: a BDT is a tree with internal nodes labelled by boolean variables and leaves labelled by 0 or 1. A BDT represents a boolean function in the obvious way. Two BDT A,B are equivalent (A∼B) when they represent the same function (this equivalence is can be decided in Ptime).
A Ptime representation of BDTs is a function $f$ that inputs a BDT and turns it into another data structure/mathematical object, such that:
- $f$ is computable in Ptime
- $A\sim B$ if and only if $f(A)=f(B)$
Additionally, we may require that we have a way to reconstruct a BDT from a representation:
- there is a function $g$, also computable in Ptime, such that $g(f(A))\sim A$
The question is
Can a Ptime representation of BDT exist?
For instance reduced ordered binary decision diagrams (OBDD) validate 2 and 3, but not 1 because with the wrong variable ordering the output might be of exponential size.
I looked up the literature but did not find a complete answer to this question (see below).
An element of answer I have: if we moreover assume that the size of the objects produced by $f$ and $g$ is linear in the size of the input, we get that $g(f(A))$ produces a constant factor approximation of the smallest BDT equivalent to $A$, which is impossible unless P=NP.
Can we say anything more general, is it something already known?
This question was asked on cstheory.stackexchange:
Canonical representation of Binary Decision Tree in Ptime?
but the answer it got there is not satisfactory: the answer relies on a claim in another post which is not well justified. I asked the author about this but got no satisfactory answer: (answer 3 and comments)
Lev Reyzin's answer to 'Algorithm for optimizing decision trees'