The ‘Interdisciplinary Research Discourse’ project held its second seminar at the University of Birmingham on 29th June. At the seminar, the results of various linguistic analyses, including Multi-Dimensional Analysis, Topic Modelling, and Phraseological Profiling, were presented. The aim of these analyses was to investigate the discourse of the interdisciplinary journal Global Environmental Change, along with 10 other journals dealing with environmental and related issues in the physical and social sciences. The researchers also interviewed and surveyed writers, reviewers and editors of the journal, and presented those results at the seminar as well. The two-year project was funded by the ESRC and supported by Elsevier publishers, who provided a corpus of the journal articles and assisted with conducting the survey and citation analysis.
This week we completed the compilation of the full corpus (data set), which we will use in our study of interdisciplinary discourse. The corpus consists of the aforementioned GEC journal as well as 5 other interdisciplinary and 5 monodisciplinary journals.
Our partners Elsevier have provided us with the journals and contacted their editors to ensure they are happy to cooperate with us on the project. The journals included are: Agriculture, Ecosystems & Environment (AEE), Biosystems (B), Computers, Environment and Urban Systems (CEUS), Environmental Pollution (EP), Global Environmental Change (GEC), Journal of Rural Studies (JRS), Advances in Water Resources (AWR), Journal of Strategic Information Systems (JSIS), Plant Science (PS), Resource and Energy Economics (REE), and Transportation Research Part D: Transport and Environment (TRTE).
We would like to express our gratitude to Sarah Huggett and everyone else at Elsevier who has been involved in this process, as well as to all the journal editors who agreed to participate in our research.
The details about the corpus are published in the following blog post.
The GEC corpus is finally complete! The corpus consists of 569 original research articles published in Global Environmental Change from 1990 to 2010. This amounts to 3.7 million word tokens. Although we are very happy with this achievement, GEC is only 1 of the 11 journals we are comparing in our analysis. The other 10 will comprise 5 discipline-specific and 5 other interdisciplinary journals. At the moment, the team at Elsevier are working hard on identifying these 10 journals, from which we’ll compile the full corpus. We expect our final corpus to contain between 30 and 40 million word tokens, which for a corpus linguistic analysis is a massive amount of data.