Uniform Title: MapSearch: a protocol and prototype application to find maps
Name: Gelernter, Judith (author); Lesk, Michael (chair); O'Connor, Daniel (internal member); Radford, Marie (internal member); Goodchild, Michael (outside member); Rutgers University; Graduate School - New Brunswick
Subject: Communication, Information and Library Studies; Maps--Computer network resources
Description: Even geographers need ways to find what they need among the thousands of maps buried in map libraries and in journal articles. It is not enough to provide search by region and keyword. Studies of queries show that people often look for maps showing a certain location at a certain time period or with a certain subject theme. Finding such maps is difficult for several reasons. Maps in physical and digital collections are often organized by region only. Multi-dimensional manual indexing is time-consuming, so many maps are not indexed. Further, maps in non-geographical publications are rarely indexed, making them essentially invisible.
To address these problems, this dissertation research automatically indexes maps in published documents so that they become visible to searchers. The MapSearch prototype aggregates journal components to allow finer-grained searching of article content. MapSearch allows search by region, time, or theme as well as by keyword (http://scilsresx.rutgers.edu/~gelern/maps/).
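As an illustration of what search by region, time, or theme in addition to keyword could look like, here is a minimal Python sketch. It is not the MapSearch implementation: the `MapRecord` fields, the `search` function, and the sample records are all hypothetical, and matching is by exact facet label, with no ontology support.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MapRecord:
    """Hypothetical record for one map harvested from an article."""
    caption: str
    region: str
    time: str
    theme: str


def search(records: List[MapRecord],
           keyword: Optional[str] = None,
           region: Optional[str] = None,
           time: Optional[str] = None,
           theme: Optional[str] = None) -> List[MapRecord]:
    """Keep the records that satisfy every criterion that was supplied."""
    results = []
    for record in records:
        # Keyword match is a case-insensitive substring test on the caption.
        if keyword is not None and keyword.lower() not in record.caption.lower():
            continue
        # Facet matches are exact comparisons on the assigned label.
        if region is not None and record.region != region:
            continue
        if time is not None and record.time != time:
            continue
        if theme is not None and record.theme != theme:
            continue
        results.append(record)
    return results


# Invented sample data, for illustration only.
records = [
    MapRecord("Rainfall in the Sahel", "Africa", "1900s", "climate"),
    MapRecord("Rail network of France", "Europe", "1800s", "transport"),
    MapRecord("Drought zones of East Africa", "Africa", "2000s", "climate"),
]

africa_climate = search(records, region="Africa", theme="climate")
rail = search(records, keyword="rail")
```

Criteria that are left as `None` are ignored, so the same function covers keyword-only search, single-facet search, and any combination of the two.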
Automatic classification of maps is a multi-step process. A sample of 150 maps, together with the text describing each map (which becomes its metadata), was copied from a random assortment of journal articles. Experience extracting metadata manually made it possible to write instructions for mining the data automatically; experience with manual classification made it possible to write algorithms that classify maps by region, time and theme automatically. That classification is supported by ontologies for region, time and theme that were generated or adapted for the purpose and that allow what has been called intelligent search, or smart search. The 150-map training set was loaded into the MapSearch engine repeatedly, and each time the automatically assigned classification was compared with the manually assigned classification. Analysis of the computer's misclassifications suggested whether the ontology or the classification algorithm should be modified to improve classification accuracy. After repeated trials and analyses to improve the algorithms and ontologies, MapSearch was evaluated on a test set of 55 previously unseen maps. Automated classification of the test set was compared with the manual classification, on the assumption that the manual process provides the most accurate classification obtainable. Results showed an accuracy, defined as the correspondence between manual and automated classification, of 75% for region, 69% for time, and 84% for theme.
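The accuracy measure described above, the correspondence between manual and automated classification computed separately for each facet, can be sketched in Python as follows. This is an assumption about how such a score might be computed, not the dissertation's code: the `facet_accuracy` function, the facet labels, and the four toy maps are invented for illustration and do not reproduce the reported results.

```python
from typing import Dict, List

FACETS = ("region", "time", "theme")


def facet_accuracy(manual: List[Dict[str, str]],
                   automatic: List[Dict[str, str]]) -> Dict[str, float]:
    """Per facet, the fraction of maps whose automatic label equals the manual label."""
    if len(manual) != len(automatic):
        raise ValueError("both lists must describe the same maps in the same order")
    if not manual:
        raise ValueError("at least one map is required")
    return {
        facet: sum(m[facet] == a[facet] for m, a in zip(manual, automatic)) / len(manual)
        for facet in FACETS
    }


# Toy data (hypothetical): the manual labels are treated as the reference.
manual = [
    {"region": "Europe", "time": "1800s", "theme": "political"},
    {"region": "Asia", "time": "1900s", "theme": "geology"},
    {"region": "Africa", "time": "2000s", "theme": "climate"},
    {"region": "Europe", "time": "1900s", "theme": "transport"},
]
automatic = [
    {"region": "Europe", "time": "1800s", "theme": "political"},
    {"region": "Asia", "time": "2000s", "theme": "geology"},
    {"region": "Europe", "time": "2000s", "theme": "climate"},
    {"region": "Europe", "time": "1800s", "theme": "transport"},
]

scores = facet_accuracy(manual, automatic)
```

Scoring each facet separately shows which facet is misclassified most often, which is the kind of signal the training loop described above would use when deciding whether to revise an ontology or an algorithm.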
The dissertation contributes: (1) a protocol to harvest metadata from maps in published articles that could be adapted to aggregate other sorts of journal article components such as charts, diagrams, cartoons or photographs, (2) a method for ontology-supported metadata processing to allow for improved result relevance that could be applied to other sorts of data, (3) algorithms to classify maps into region, time and theme facets that could be adapted to classify other document types, and (4) a proof-of-concept MapSearch system that could be expanded with heterogeneous map types.
Note: Includes bibliographical references (p. 117-125).
Collection: Graduate School - New Brunswick Electronic Theses and Dissertations
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.