John Lamp

John Lamp was awarded the 2012 PhD Medal.

Information Categorisation: an Emergent Approach

The explosion of information and of naïve users on the Internet has highlighted problems of effective access to information. One response is to classify information into categories based on its nature. Existing information classifications are typically developed by committees or imposed by organisations and have proved difficult to maintain. This investigation developed a two-phase method to systematically determine and analyse information categories in a specific domain as perceived by domain experts. The initial phase, the Term Extraction Phase, applied the librarianship approach of literary warrant, guided by Ingarden’s Ontology of Literature, to research papers from a specific domain to discover what is studied in the domain. The approach is significant in that it draws upon rigorous and philosophically compatible bodies of work in two areas: first, work addressing the nature, existence, and categorisation of literary expression found in research papers; second, qualitative research methods addressing how meaningful terms can be analysed in text and related to each other. We have found that such a guiding ontological theory can be used to seed coding families, giving rise to a viable method for generating categorisations for further research. We have also found that the key guiding unit of analysis operationalising Ingarden’s approach is the “reported research activity” and that the process is practical, although labour intensive.

The second phase, the Term Categorisation Phase, used the librarianship approach of consensus to have domain experts form categories from the terms generated in the first phase. Examining those categories using pairwise comparisons allowed the identification of similar categories based on the common categorisation of terms in the coding family. The pairwise comparisons were undertaken manually, but the development of an automated tool to perform these comparisons would enhance this aspect of the phase. Boisot’s Social Learning Cycle (SLC) was used as a model with which to explain category variations. The single performance of the Term Categorisation Phase undertaken in this investigation demonstrated the value of the SLC for explaining the variations between domain experts, and showed the potential for explaining category changes over time using the SLC and repeated performances of the Term Categorisation Phase.
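
The pairwise comparison step described above could, in principle, be automated along the following lines. This is only an illustrative sketch and not the procedure used in the investigation, where the comparisons were made manually. It assumes each expert's category can be represented as a set of terms, and it uses the Jaccard index (shared terms divided by all terms in either category) as one possible similarity measure. The function names, the 0.5 threshold, and the example data are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of terms common to both categories, out of all terms in either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_categories(categories, threshold=0.5):
    """Compare every pair of categories and keep pairs at or above threshold.

    `categories` maps a hypothetical label such as "expert/category" to the
    set of terms that the expert placed in that category.
    """
    pairs = []
    items = sorted(categories.items())  # stable, name-ordered pairing
    for (name_a, terms_a), (name_b, terms_b) in combinations(items, 2):
        score = jaccard(terms_a, terms_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Most similar pairs first
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Invented example data, for illustration only
categories = {
    "expert1/systems development": {"prototyping", "requirements analysis", "testing"},
    "expert2/building systems": {"prototyping", "requirements analysis", "outsourcing"},
    "expert2/management": {"outsourcing", "strategy"},
}

for name_a, name_b, score in similar_categories(categories):
    print(f"{name_a} ~ {name_b}: {score:.2f}")
```

A set-overlap measure is chosen here because the text identifies similar categories by their common categorisation of terms; other measures, or a different threshold, may suit a given domain better.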

This investigation makes a number of contributions. The investigation demonstrated that the two librarianship approaches of literary warrant and consensus are not necessarily mutually exclusive and that both have much to offer at different stages of the categorisation process. A method was devised which provides a more rigorous and systematic approach to analysing and categorising text. The method consists of two phases which are loosely coupled and could be used independently. A very significant aspect is the ability to view categorisation as a dynamic process. That enables the examination of categorisation and classification schemes and the identification of areas within those schemes which require attention. The method is not a tool to develop a complete classification scheme, but seeks to contribute insights on how to progress the development of mature schemes.

 

John Lamp’s award-winning thesis