Author Archives: lmhatii

Oh, the humanities! A discussion about research data management for the Arts and Humanities disciplines

Here are some brief notes from the Arts and Humanities breakout group at the JISC MRD02 Launch Workshop earlier this month.

We began with brief introductions around the table.

Definitions of ‘research data’:

Chris Awre from History DMP, Hull, kicked off the discussion by asking: in the arts and humanities, how do you define what research data is?  The term ‘data’ covers different things and different activities in different departments.  Also, much of what could be called data is secondary rather than primary, i.e. gathered facts rather than new facts.  Should we attempt to set a definition?
Marie-Therese Gramstadt from the Kaptur project outlined how they are trying to find a way of talking about research data that doesn’t use that particular term.  Instead they talk about looking at the materialisation of research.  They come across lots of paper-based research – do we concentrate just on the digital?  Or do we include hard copy as well as digital data when we talk about managing research data?
Simon Price, data.bris project, reported that in the Bristol theatre collection, a lot of the collection is physical artefacts, including scans and photos and objects, and getting them into a citable or preservable form is a challenge.
Anastasia Sakellariadi, REWARD project, noted that first steps may include finding out what people in the institute actually use.  You could ascertain that through, e.g., a survey of researchers in your department, scope that first, and then work from the resulting definition.  This may be a more useful way to engage with researchers.
Brian Hole, also from REWARD, noted that the REF now specifically talks about data, so that term is being used more now.  In REF terms we’re being compared with STEM [science, technology, engineering and mathematics] subjects, which in some cases have much longer-established practice in data management and sharing.

Motivations for effective research data management:

Brian remarked that if we want researchers to plan to publish the data at the end of a project, we need to talk about it at the start.  That makes researchers think about it from the beginning.  It’s good to provide an aspirational model.  Keep the researchers’ eyes on that goal from the beginning.
Anastasia added that if we are to achieve this paradigm shift, we have to encourage people to do research for themselves but also to think, ‘I’m doing this work for this project but for other people too.’
Laura Molloy, University of Glasgow, noted that in the JISC MRD01 projects Incremental and DaMSSI, the team found repeated evidence that terminology in awareness-raising efforts, skills training, policy and guidance must be very carefully considered.  The use of information management- or digital curation-specific language, or legal language, immediately presents a barrier to many researchers and diminishes their engagement with potentially useful material.  Also, most researchers have other issues as their priority, so to gain their cooperation and increase their motivation, the benefits of good research data management must be made clear.

Subject-specific differences in re-use:

The group noted differences within arts and humanities across disciplines re. data re-use.  Archaeology is strong on data re-use.  The case for the continuation of the Archaeology Data Service was relatively easy to make because of the unrepeatable nature of some archaeology work.
If there is a strong tradition of re-use in a discipline, the case is easier to make for good research data management.  If the discipline does not currently widely re-use data, we need to find ways to make re-use more attractive.
Simon Price noted that there are often problems putting video content that has been digitised at Bristol online, because the rights holders can’t be tracked down – this is a common barrier to re-use of this type of data.

Data centres and sources of advice:

The group noted the current lack of advice for research data management across the arts and humanities disciplines, particularly with the closure of the Arts and Humanities Data Service (AHDS).
Chris queried the value of depositing data in a discipline-specific data centre and/or an institutional repository (IR).  Julian Richards of the ADS noted that the value added by deposit in a discipline-specific data centre includes visibility, data mining and aggregation of datasets, amongst other advantages.
Simon Price noted that institutions don’t necessarily need to keep data on-site.  The key elements are a citable point and the metadata, and any data centre will do as long as it’s a trusted centre.
Julian added that we need to harmonise metadata so that a researcher depositing data only needs to create the metadata once.  APIs are also needed so that access statistics can be seen, as this encourages use of data centres.

There then followed a discussion of work on DOIs to distinguish parts of datasets as opposed to the entire dataset.

Brian commented that during his work on the LIFE project, they discovered that in digital preservation, data probably needs to be migrated far less often than the team originally thought.
There’s a lot of data that can be lost when migrating from one format to another.  Julian noted that this is one of the arguments for discipline-specific repositories.  Staff with discipline knowledge are going to be more likely to be aware of these risks, and how to check that significant characteristics, properties and metadata of the file haven’t been lost.

Training for researchers:

When considering training to help researchers address research data management issues, presenting such training in a ‘digital humanities’ environment runs the risk of ‘preaching to the converted’ – a digital preservation / research data management event will definitely do so.  The group concluded that perhaps training would be better delivered in a subject-specific environment (particularly one more specific than ‘arts and humanities’ as this is far too broad an area to be useful).

If you were present at the group, please supply corrections and additions to laura.molloy AT glasgow.ac.uk – thank you.  Otherwise, please enter comments below!

JISC Managing Research Data programme – mk. 02 is go!

In case it’s useful, here’s a quick report on the JISC MRD02 programme manager’s opening remarks.  Simon Hodson’s introductory talk contained quite a lot of useful information for those working on the new programme and “fellow travellers”.

Simon opened the launch event by welcoming the new projects joining the JISC Managing Research Data (MRD) programme in its second funded iteration, plus some additional researchers from the last programme who were there to share experiences, and “fellow travellers”, i.e. interested other parties with experience to share or a particular interest in the work of the new iteration of the programme.

The current challenges include tackling management of the well-acknowledged data deluge: this is about huge quantities of data but the problems of managing this are not just limited to storage.  There are opportunities here: ways to improve and develop data re-use, run meta studies and engage with interdisciplinary grand challenges.  There is increasing awareness of research data as an asset and recognition of the fact that data has re-use value.  Simon stressed the importance of building on what’s already been done to ensure our work on research data management continues to make real progress.

Simon then described the new programme.  There are twenty-seven new projects funded through the 07-11 call for MRD02, across three main strands.

Strand A consists of the infrastructure projects – these include work on systems and storage, and also policy, support and guidance.  Nine projects from Strand A will be piloting new infrastructure, four will build on existing pilots, three will develop discipline-specific infrastructures and one will develop infrastructure with a focus on metadata.

Key deliverables for Strand A projects include:

  • Requirements analysis;
  • Implementation plan;
  • Description of a research data management system including lifecycle management and preservation;
  • Description of the human support infrastructure, i.e. the guidance and support that will be provided;
  • Institutional research data management policy;
  • Evidence of benefits of interventions made by each project, and cost information where available;
  • Business plan for sustaining the pilot infrastructure or service.

There are ten more projects which are about planning in one way or another – these constitute Strands B and C.  These projects will help researchers, research groups and departments to meet funder requirements.  They will also explore discipline-specific challenges associated with making and executing DMPs.  There are eight 6-month projects developing DMPs and the infrastructure to implement these plans, and two 12-month projects enhancing DCC’s DMP Online tool.

Strand B planning projects will deliver:

  • Requirements analysis and description of information / data development;
  • DMP and supporting system infrastructure with appropriate guidance and support materials;
  • No business case is required, but they should contribute to the programme objectives of gathering evidence for making the case for data planning.

Strand C planning projects will deliver:

  • Requirements analysis and description of data architecture;
  • Adaptation of systems, and enhancement and adaptation (with DCC) of the DMP Online tool, including guidance and user support, and feedback to DCC.

There will be a further funding call in January 2012 focusing on training and research data publications. The publications call will encourage bids to work in partnership with researchers, educational boards, scholarly societies, data centres and publishers, to encourage use and citation of data.  The research data management training call will seek bids for the development of training programmes for specific disciplines, for support roles (e.g. librarian, research liaison staff), and partnerships with professional bodies.  Key outputs sought include recommendations for future funding.

Simon mentioned several upcoming events which will be relevant for JISC MRD02 project staff.

  • There will be a workshop for RDM planning projects in or around March 2012 with a focus on demonstrating project outputs.
  • The British Library is being funded by the MRD02 programme to run a series of five workshops about DataCite, targeted at JISC MRD02 projects and open to other interested parties.
  • There will also be a workshop for infrastructure projects and fellow travellers in either July or September 2012 – the date for this is to be decided.
  • Finally, there will be the JISCMRD conference in March 2013, which will be a large international event for programme staff and an international audience to share findings, deliver demonstrations and plan future work.

Simon then introduced the importance of evidence gathering in the new programme.  This will be an explicit activity in this iteration of the programme with three part-time members of staff assigned to it.   It is important for the programme to have evidence of the benefits of the interventions provided by the projects and to be able to provide this evidence to the projects’ host institutions and to the wider community.  This will be supported by the programme as much as possible, and efforts by the projects to list and explain the benefits of their work will be helpful towards the writing of their business cases.

Projects will be specifically expected to identify likely benefits from their projects, what evidence they can produce and any possible metrics.  This information should be blogged, both for the project’s own reference and in order to share it across the programme and also publicly.  The three evidence gatherers – Laura Molloy, Meik Poschen and Jonathan Tedds – will be identifying themes as they emerge from project blogs.  They will work with the projects to encourage blog posting, tweet and re-tweet relevant material to promote project activity, post their own blog material responding to issues raised by project reporting and, in this way, compile and stimulate an evidence base.  The evidence gatherers will be particularly interested in commonalities across the programme, including any themes arising around approaches, discipline focus, technical platforms, identifiers and metadata.

The webpage for the JISC MRD02 programme is located at

http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx

This page lists the URLs of project websites.

Simon has tracked the initial commonalities of the projects on the following publicly-available spreadsheet:

https://docs.google.com/spreadsheet/ccc?key=0AoaHWqA_UJNhdEhaaaRT0lpTTJjUDBvUVpEMDlmYlc2RlE#gid=0

and the project blogs are listed on his own blog at

https://researchdata.jiscinvolve.org/wp/2011/11/29/jisc-managing-research-data-programme-2011-13-new-rdm-planning-projects-2

There is an RSS feed of JISC MRD02 project blogs at http://bit.ly/JISCMRD02-Blogs.  This feed was compiled by Jez Cope.  If you are a project on the programme, please do make sure your blog is listed here.

And a Twitter list for tweeters connected to the programme has been set up by Brian Kelly at https://twitter.com/#%21/briankelly/jiscmrd.

Thanks to Jez and Brian for these useful tools!

Laura Molloy

E: laura.molloy@glasgow.ac.uk

Twitter: @LM_HATII

The Three Monkeys: New Evidence-Gatherer Roles for the JISC Managing Research Data programme

The JISC Managing Research Data (‘JISC MRD’) programme is now in its second iteration, MRD02.  In this second funded phase, JISC MRD02 is showing no signs of slowing down, arguably reflecting the growing attention now being focused on research data management.  As JISC is compelled to gather and disseminate evidence of the impact of its work and the projects it funds, programme manager Simon Hodson recruited three evidence gatherers to help him with this aspect of programme management.

Like the three monkeys of the fable, Meik, Jonny and I all have slightly different ways of looking at life (and we do have a bad habit of sitting together in a row at meetings), but we’re all survivors of MRD01 projects, we all care deeply about research data management and we’re all concerned about how to increase the articulation of research data management principles and best practice between different audiences, ideally for the mutual benefit of all.

I’ll let Meik (Twitter: @MeikPoschen) and Jonny (Twitter: @jtedds) introduce themselves, but I thought it might be useful to briefly describe the kind of things I’ve already worked on in this area, how this informs my particular perspective on research data management in the context of the current JISC MRD programme, and what I hope to do under the auspices of the evidence gatherer work.

I was a minor member of the team of the late, lamented Arts and Humanities Data Service, specifically the AHDS Performing Arts data centre.  AHDS-PA was funded by the AHRC and the JISC, and we gathered the research outputs of AHRC-funded performance-related work for preservation purposes but also to encourage the use and re-use of these resources in research and teaching.  It was during the life of this work, before the cessation of AHRC funding in March 2008, that I started to learn about this field of digital preservation and the archival principles behind much of it.

After AHDS-PA, I began work on the EC-funded FP6 project Planets (http://www.planets-project.eu/), which was a four-year effort developing tools and services for digital preservation.  My main activity was working with a small team in the UK, and teams of local organisers in five different European countries, to develop and deliver training events in the project’s final year.  We produced one event each in Copenhagen, Bern, London, Rome and Sofia.  The first day of each event was devoted to outreach, i.e. awareness-raising for senior managers and budget holders, and days two and three delivered hands-on training for technical staff with Planets tools.  I also developed a series of basic online training materials including a series of videos, an annotated reading list and some summaries of the outreach day of the live training events, written for a technical audience in collaboration with IBM.  The results of Planets are now sustained by the Open Planets Foundation.

After Planets, the first JISC MRD programme funded a project at Glasgow and Cambridge universities to look at the existing data management practices of researchers at these two institutions, and then to build on these findings to deliver tailored training and guidance to improve how research data is managed throughout its lifecycle.  This work was called JISC Incremental and its outputs are available here http://www.glasgow.ac.uk/datamanagement (Glasgow) and here http://www.lib.cam.ac.uk/dataman (Cambridge).  There was a blog maintained by the team throughout the project for a slightly less formal account of proceedings, available here: http://incrementalproject.wordpress.com/.

As well as introducing me to the joys of punting and Fitzbillies teashop in Cambridge, my work on Incremental updated my knowledge about research data management, reaffirmed my belief in the value of user requirements-led resource development, and confirmed my suspicion that no matter how great tools and software for any kind of digital curation are, they won’t be used unless you can translate and articulate the benefits of using them to the people you want to be the users.  (For more discussion on the importance of audience-appropriate language in guidance and training, see my post on the Incremental blog at: http://incrementalproject.wordpress.com/2010/07/14/vocabularyjargonterminology-synonyms-and-specialist-language/.)

Most researchers we spoke to don’t see research data management as something either important or particularly relevant to them, and certainly not something upon which they want to expend money, time or mental energy.  Luckily for those of us who are concerned with this area, however, the shifting of funder requirements to more explicit demands for demonstrable research data management planning provides at least some motivation for starting conversations with researchers.  (There are, however, disciplines with well-established and capable research data management practices and traditions, and I hope to unpack that issue in later blog posts.)

I was also involved in a later, shorter project called DaMSSI, the Data Management Skills Support Initiative, which was funded as part of the MRD01 Research Data Management Training Materials strand.  DaMSSI is described and documented here: http://www.dcc.ac.uk/training/data-management-courses-and-training/skills-frameworks.

As well as the specific findings and outputs of the Incremental and DaMSSI work, I also learned a lot about the JISC MRD programme, its protocols, culture and key people, and hopefully this will serve as a useful background for work on the second iteration of the programme.

Comments on anything we write in this blog are always going to be warmly welcomed, and you are also encouraged to give feedback in your own (linked) blog posts, or via Twitter – I can be found on Twitter @LM_HATII.