PDF processing and analysis with open-source tools: a comparison