Here are some brief notes from the Arts and Humanities breakout group at the JISC MRD02 Launch Workshop earlier this month.
We began with brief introductions around the table.
Definitions of ‘research data’:
Chris Awre from History DMP, Hull, kicked off the discussion by asking: in the arts and humanities, how do you define what research data is? The term ‘data’ means different things, and covers different activities, in different departments. Also, a lot of what could be called data is secondary rather than primary, i.e. it consists of gathered facts rather than new facts. Should we attempt to set a definition?
Marie-Therese Gramstadt from the Kaptur project outlined how they are trying to find a way of talking about research data that doesn’t use that particular term. Instead they talk about looking at the materialisation of research. They come across lots of paper-based research – do we concentrate just on the digital? Or do we include hard copy as well as digital data when we talk about managing research data?
Simon Price, data.bris project, reported that in the Bristol theatre collection, a lot of the collection is physical artefacts, including scans and photos and objects, and getting them into a citable or preservable form is a challenge.
Anastasia Sakellariadi, REWARD project, noted that a first step may be to find out what terms people in the institute actually use. You could ascertain that through, e.g., a survey of researchers in your department, scope that first, and then use the resulting definition. This may be a more useful way to engage with researchers.
Brian Hole, also from REWARD, noted that the REF now specifically talks about data, so that term is being used more now. In REF terms we’re being compared with STEM [science, technology, engineering and mathematics] subjects, which in some cases have much longer-established practice in data management and sharing.
Motivations for effective research data management:
Brian remarked that if we want researchers to plan to publish the data at the end of a project, we need to talk about it at the start. That makes researchers think about it from the beginning. It’s good to provide an aspirational model. Keep the researchers’ eyes on that goal from the beginning.
Anastasia added that if we are to achieve this paradigm shift, we have to encourage people to do research for themselves but also to think, ‘I’m doing this work for this project but for other people too.’
Laura Molloy, University of Glasgow, noted that in the JISC MRD01 projects Incremental and DaMSSI, the team found repeated evidence that terminology in awareness-raising efforts, skills training, policy and guidance must be very carefully considered. Language specific to information management or digital curation, or legal language, immediately presents a barrier to many researchers and diminishes their engagement with potentially useful material. Also, most researchers have other priorities, so to gain their cooperation and increase their motivation, the benefits of good research data management must be clear.
Subject-specific differences in re-use:
The group noted differences in data re-use across disciplines within the arts and humanities. Archaeology is strong on data re-use. The case for the continuation of the Archaeology Data Service was relatively easy to make because of the unrepeatable nature of some archaeology work.
If there is a strong tradition of re-use in a discipline, the case is easier to make for good research data management. If the discipline does not currently widely re-use data, we need to find ways to make re-use more attractive.
Simon Price noted that there are often problems putting video content digitised at Bristol online because the rights holders can’t be tracked down; this is a common barrier to re-use of this type of data.
Data centres and sources of advice:
The group noted the current lack of advice for research data management across the arts and humanities disciplines, particularly with the closure of the Arts and Humanities Data Service (AHDS).
Chris queried the value of depositing data in a discipline-specific data centre and/or an institutional repository (IR). Julian Richards of the ADS noted that the value added by deposit in a discipline-specific data centre includes visibility, data mining and aggregation of datasets, amongst other advantages.
Simon Price noted that institutions don’t necessarily need to keep data on-site. The key elements are a citable point and the metadata, and any data centre will do as long as it’s a trusted centre.
Julian added that we need to harmonise metadata so that a researcher depositing data only needs to create the metadata once. APIs are also needed to make access statistics visible, as this encourages use of data centres.
There then followed a discussion of work on DOIs to distinguish parts of datasets as opposed to the entire dataset.
Brian commented that during his work on the LIFE project, they discovered that in digital preservation, data probably needs to be migrated far less often than the team originally thought.
There’s a lot of data that can be lost when migrating from one format to another. Julian noted that this is one of the arguments for discipline-specific repositories. Staff with discipline knowledge are going to be more likely to be aware of these risks, and how to check that significant characteristics, properties and metadata of the file haven’t been lost.
Training for researchers:
When considering training to help researchers address research data management issues, presenting such training in a ‘digital humanities’ environment runs the risk of ‘preaching to the converted’ – a digital preservation / research data management event will definitely do so. The group concluded that perhaps training would be better delivered in a subject-specific environment (particularly one more specific than ‘arts and humanities’ as this is far too broad an area to be useful).
If you were present at the group, please supply corrections and additions to laura.molloy AT glasgow.ac.uk – thank you. Otherwise, please enter comments below!