Tuesday, 28 September 2010

Full review of publication metadata extraction techniques

We have completed a review of existing methods for extracting metadata from publications, specifically PDFs. These techniques automatically derive the title, authors, and other metadata from an academic publication in PDF form. The review should prove a useful reference for others looking into this area.

You can view our report online here; this builds upon Dr. Amyas Phillip's earlier work [PDF] in this area.

Many thanks to Richard Easty for the bulk of the research in this area, and to Verity Allan for refining the final text.
