I had a very specific problem.
The PDFs were of a question-and-answer type.
A web page posed questions and the user provided answers.
When the questions were done, a PDF was generated.
When I decode one, I pick up the question-answer pairs.
I am ignoring titles and various other content that doesn't concern me.
So far all the PDFs of this type have decoded well.
I am making no claim to have created a generalised PDF translation tool.
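
As a rough illustration of the pairing step described above, here is a minimal sketch in Python. It is not the actual decoder: the function name `extract_qa_pairs` is hypothetical, and it assumes the PDF text has already been extracted into a list of lines, that a question is any line ending in "?", and that its answer is the next non-blank line. Real documents of this type may well need different rules.

```python
def extract_qa_pairs(lines):
    """Pair each question line with the line that follows it.

    Assumptions (illustrative, not from the original post):
    - a question is any non-blank line ending in '?'
    - its answer is the next non-blank line
    - every other line (titles, footers, etc.) is ignored
    """
    pairs = []
    question = None  # the question currently waiting for an answer
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith("?"):
            # Start a new pair; a previous unanswered question is dropped.
            question = line
        elif question is not None:
            pairs.append((question, line))
            question = None
        # Any other line is a title or unrelated text and is skipped.
    return pairs


if __name__ == "__main__":
    sample = [
        "Survey Results",
        "",
        "What is your name?",
        "Alice",
        "Page 1",
        "Do you agree?",
        "Yes",
    ]
    for q, a in extract_qa_pairs(sample):
        print(q, "->", a)
```

Anything that is neither a question nor the line directly after one is simply dropped, which mirrors the "ignore titles and other stuff" approach rather than trying to handle every kind of PDF.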