Apiary Demonstration at WebWise 2010

(as originally published at http://www.imls.gov/news/2010/020310.shtm)
"The Apiary Project: Framework and Workflow for Extraction and Parsing of Herbarium Specimen Data: A Standards-Based Approach to Tool Integration and Metadata
(http://www.apiaryproject.org/content/about-apiary-project)
William Moen, University of North Texas
Millions of specimens in museums and herbaria worldwide need to be digitized to be accessible to scientists. The Apiary Project combines human and machine processes to facilitate the transformation of herbarium label data into machine-processable parsed data. The workflow and framework integrate a variety of existing technologies and apply standards such as the recently approved Darwin Core metadata standard. Participants will access a Web-based application with interfaces focusing on four primary phases: layout analysis, text extraction, text parsing, and quality control. The technology platform is composed entirely of open source components; upon completion, the workflow and framework will be released as an open source project."