Category Archives: Meeting reports

OR2012: Research Data Management and Infrastructure: institutional perspectives

Research data management can make a significant contribution to an institution’s research performance but needs solid user requirements research, an understanding of the researcher working space and a collaborative approach between researchers and support staff for infrastructure to be adopted, understood and sustained in the institution.  That was the message from this session on 11 July in Edinburgh at Open Repositories 2012 on research data management and infrastructure, from the perspectives of three particular institutions.

Unmanaged to managed

First we heard from Natasha Simons from Australia’s Griffith University.  Natasha made a clear connection between the university’s position in the top 10 research universities of Australia, and the existence of their Research Hub, which was developed with funding from the Australian National Data Service (ANDS).  The Hub stores data and relationships between the data, exports to ANDS, and provides Griffith researchers with their own profiles, which help them find others across the institution with similar research interests for collaboration and supervision.

Natasha outlined some challenges the Griffith team have met and are currently facing, but ultimately reported that they are successfully transforming institutional data in line with ANDS aims from unmanaged to managed; disconnected to connected; invisible to visible; and single-use to reusable.

Resourcing for RDM

Another institution which connects RDM with its prestigious position in the research league tables is Oxford.  Sally Rumsey of the University’s Bodleian Library took us through their vision for their institutional research data management infrastructure, encompassing current work on the Oxford DMP Online and the DaMaRO project; data creation and local management (DataStage, ViDASS); archival storage and curation (DataBank, software store); and data discovery and dissemination (document repository, Oxford DataFinder and Colwiz).

Sally argued that data management doesn’t stop at digital objects:

“Paper in filing cabinets, specimens in jars: all could exist as data.”

She also reminded us that although emerging funder requirements, and particularly this year’s EPSRC roadmap requirement, were doing much to focus minds on RDM, there is also the challenge of unfunded research, a major component of research activity at Oxford.  This needs requirements and funding for management, too.

Sally was asked whether researchers were going to end up paying for RDM infrastructure.  She argued that there needs to be a budget line in research bids to cover these costs.  This prompted me to think about the fact that we talk about getting researchers trained from the start of their research activity, but to bring about the kind of awareness that will lead to researchers knowing to cost in data management in their bid, we need to engage with them before they start even writing the bid.  This is an argument for engagement at PhD level at the latest, and for a much wider and more consistent provision of RDM training in universities in order to bring about this kind of change in culture.  Clearly we also need simple, accessible costing tools to help non-specialists quantify explicit costs for data management and preservation, for inclusion in funding bids.

Adopt, adapt, develop

Anthony Beitz, manager of Australia’s Monash University eResearch Centre, also has nascent culture change in mind.  He described the availability of research data as having the potential to change research work:

“We’re going to see things we’ve never seen before.”

Anthony’s description of how the eResearch team works at Monash is based on a clear understanding of the characteristics of the research space and how that differs from the way in which IT services staff work.

  • Researchers: focused on outcomes.  They work in an interpretive mode, using iterative processes.  The approach may be open-ended and thrives on ambiguity.  Requirements and goals may change over time.  May require an ICT capability for only a short period of time – don’t tend to care what happens to it after the end of the project.  Resourceful, driven, and loyal to their discipline more than the institution.
  • IT services: broad service base.  Supporting administration, education and research.  Continuity of IT services is a priority.  Excel at selecting and deploying supporting institutional enterprise solutions.  IT works in analytical mode as opposed to the research space, which is in interpretive mode.

The volume of data is growing exponentially, but funding to manage it is certainly not.  In this context, a clear articulation of need between the researcher space and the IT services space is crucial.  Anthony argues that researchers need to participate actively in the deployment of an institution’s RDM infrastructure.  The media currently used are not good for reliability, security or sharing, but no single institutional RDM platform will fit all researchers’ needs.  RDM solutions must be a good cultural fit, as researchers have stronger synergies with colleagues beyond the institution and are more likely to use solutions within their disciplines.  Anthony suggests that IT services should adopt existing solutions being used within disciplines, where possible, as building a new one breaks the collaboration cycle for researchers with colleagues from other institutions, asserting, “going into development should be a last resort.”

In this way, much of the RDM activity at Monash seems to be explicitly responding to current researcher behaviours.  Adoption of emerging solutions is encouraged by promoting a sense of ownership by the researchers; by delivering value early and often; and by supporting researchers in raising awareness of an RDM platform in their research community.  If users don’t feel they own a resource, they’ll look to the developers to sustain funding.  If they feel ownership, they’ll look for funding for it themselves, so buy-in is not only good for adoption but also for sustainability.

DARTS3, The Third Discover Academic Research Training & Support Conference. Dartington Hall, Devon: 28 – 29 June 2012

Whilst storms swept much of the rest of the country, the sleepy peace of bucolic Devonshire was barely disturbed by the arrival of several dozen librarians (plus a couple of ‘fellow travellers’) to dreamy Dartington.

Anna Dickinson from HEFCE’s REF team (which has only five people!) kicked off the first day with a very informative overview of the 2014 REF expectations, process, staff selection, timescales, the test submission system, the assessment of the research environment and how the panels work, with particular advice on areas where research support staff may be involved.

Judith Stewart of UWE and Gareth Cole of Exeter, in separate presentations, both described the work and findings of their current JISC MRD-funded research data management projects (UWE’s project, ‘Managing Research Data’ is at http://www1.uwe.ac.uk/library/usingthelibrary/servicesforresearchers/datamanagement/managingresearchdata.aspx; the Open Exeter project is at http://blogs.exeter.ac.uk/openexeterrdm/).

Each also positioned library staff members as key to improved research data management across the university, as part of partnership working with other relevant research support professionals.  Both presenters also reminded us that library staff members are well-placed to instigate research data management activity if this is not already an activity within an institution: whilst the research data management challenge may require new skills, librarians are already skilled in information management, bibliometrics, and other relevant areas of expertise, and are experienced in working across the institution, free from inter-faculty or inter-discipline politics.  These skills equip them well to work towards supporting researchers with better management of research data.

Miggie Pickton of Northampton pushed this relationship between library staff and research activity further, arguing there are strong benefits for library staff to wade into research activity for themselves.  Drawing a division between ‘academic’ and ‘practitioner’ research, Miggie encouraged library staff to consider either but particularly argued the case for the value of ‘practitioner’ research, which she defined as taking a pragmatic approach to a current problem or need, as opposed to curiosity-driven work intended to make REF impact.

Through a very interactive session, Miggie encouraged the audience to identify the benefits of library staff undertaking research for the individual librarian, the institution, and the library profession as a whole, and provided some examples of suitable topics for investigation.  Inspiring!

Jennifer Coombs (N’ham) and Elizabeth Martin (De Montfort) described their experiences of creating, alongside colleagues from Loughborough and Coventry, a collaborative online tutorial to teach researchers about research promotion (www.emrsg.org.uk).

Jez Cope of the Research360 project at Bath (http://blogs.bath.ac.uk/research360/) shared the benefits for researchers of several social media applications.  Despite the earlier assertions of doubt about Twitter by the event chair, Jez managed to get a few more delegates onto the service and interacting with other delegates as well as more remote followers of the event hashtag.

As always, it was apparent that institutions vary widely in their cultures, sizes and experience with RDM, but we learned a great deal about what librarians are already doing to support researchers, some new tools and techniques that might be useful for their work in this area, and some powerful arguments for expansion into the research data management and research practice areas.

Delegates to this event may find it interesting to explore the research data management training materials made by five projects of the first MRD programme, available at http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx (follow the link for each project at the bottom of the page).  These materials are freely available for use and reuse, and will be supplemented by a further four projects in the second MRD programme, starting this summer, some of which will be delivering training materials specifically for research support professionals including library staff.

Here’s hoping there will be a DARTS4!

 

Discuss, Debate, Disseminate – PhD and Early Career Researcher data management workshop, University of Exeter, 22 June 2012

Jill and Hannah of the Open Exeter project have not been holding back with their user requirements research – not content with attracting hundreds of responses to their survey of Exeter postgraduates, they’re also augmenting this with their own research as well as running events like Friday’s, in an admirably thorough approach to gathering information on what postgraduate students and early career researchers at their institution need, how they work and where the gaps are in the current infrastructure provision.

Twenty enthusiastic participants turned up on 22 June, happily from across the sciences and humanities, and contributed with gusto to group discussion, intensive one-to-one conversations and a panel session.  The project has recruited six PhD students – Stuart from Engineering; Philip from Law; Ruth from Film Studies; Lee from Sport Sciences and Duncan from Archaeology, plus one more currently studying abroad – to help bridge the gap between project staff and their PhD peers.  These six are working intensively with the project team to sort out common PhD-level data management issues and activities in the context of their own work, which allows them to not only improve their own practice but also to share their experiences and tips with other PhD students and ECRs in their own disciplines at Exeter.  (You can see more about this at http://blogs.exeter.ac.uk/openexeterrdm/)

One of the most interesting aspects of working on this programme, for me, is understanding the nuts and bolts of research data management in a specific disciplinary context, in a particular institution.  In other words, the same context in which each researcher is working.  Although funders are increasingly calling the shots with requirements and expectations for research data management, the individual researcher still has to find a way to put these requirements into practice with the infrastructure they have to hand.  That means it’s all very well for the EPSRC or AHRC or whoever to require you to do something, and you may even understand why and want to do it, but who do you ask in IT to help?  Why isn’t it OK to just put data on Dropbox?  What to do with data after you finish your PhD or project?  And what is metadata anyway?

Despite the generally-held view by researchers that their RDM requirements are unique to their discipline, these questions – and others like them – are actually fairly consistent across institutions when researchers are sharing concerns in an open and relaxed environment.  And this was one of the achievements of today’s event: by keeping things friendly, low-key and informal, the team got some very useful information about what PhDs and ECRs are currently doing with RDM, the challenges they’re encountering and what Exeter needs to provide to support well-planned and sustainable RDM.

Some additional detail from the event:

–       Jill offered a working definition of ‘data’ for the purposes of the workshop: “What we mean by data is all inclusive.  It could be code, recordings, images, artworks, artefacts, notebooks – whatever you feel is information that has gone into the creation of your research outputs.”  This definitely seemed to aid discussion and meant we didn’t spend time in semantic debate about the nature of the term.

–       Types of data used by participants:
o       Paper, i.e. printouts of experiment
o       Word documents
o       Excel spreadsheets
o       Interview transcripts
o       Audio files (recordings of interviews)
o       Mapping data
o       PDFs
o       Raw data in CSV form
o       Post-processed data in text files
o       Graphs
o       Tables for literature review
o       Search data for systematic review
o       Interviews and surveys: audio files, word transcripts
o       Photographs
o       Photocopies of documents from the archives
o       NVivo files
o       STATA files

–       Common RDM challenges included: the best way to back up; use of central university storage; number of passwords; complexity of working online (which can make free cloud services more attractive); lack of support with queries or uncertainty about who to contact; selection and disposal; and uncertainty over who owns the data.

–       Sources of help identified during the event: subject librarians; departmental IT officers; Open Exeter staff (during the life of the project); and existing online resources such as guides from the Digital Curation Centre (http://www.dcc.ac.uk) and the Incremental project (http://www.gla.ac.uk/datamanagement and http://www.lib.cam.ac.uk/preservation/incremental/).

The future of the past: closing workshop for the Data Management Planning projects

It always provokes mixed feelings to attend a closing event marking the end of a project or raft of projects.  On the one hand, it’s melancholy to say goodbye to people, or to know that there will be no more interesting outputs coming from a particular project.  On the other, there is (hopefully) the sense of achievement that comes with having finished a piece of work.  Having something finished, ready to show, then getting ready for the next activity, preparing for the future.  It was useful and thought-provoking to see the findings and outputs of the ‘strand B’ or data management planning projects of the MRD02 programme at the Meeting Challenges in Research Data Planning workshop in London on 23 March.  This event marked the closing of these projects, and gave them an opportunity to share what they’d been doing.  Data management planning by definition is about considering the future, and there was a sense of energy and enthusiasm from the projects on the day which suggested we could easily have met for longer and talked more.  And yet, some elements of the discussion made me think about the past.

Back in MRD01 (2009-11), there were a few projects such as Oxford’s Sudamih and Glasgow-Cambridge’s Incremental project which performed institution-specific scoping work about what researchers need to improve both their understanding and practice of RDM.  As one of the Incremental team, I felt at the time that, to be honest, a lot of it seemed to be stating the blooming obvious, but we recognised the value of gathering original data on these issues in order (1) to check that our suspicions were correct; and (2) to wave in front of those making decisions about whether and how to fund RDM infrastructure.

You can read the full report of Sudamih here and Incremental here, but the main ideas we found evidence for were things like: researchers are almost always more interested in doing their research than spending time on data management, so engagement relies on guidance being short and situated in one obvious, easy-to-navigate place; there are lots of guidance resources at institutions already but they’re scattered and not well advertised; lots of researchers in the arts and humanities don’t consider their material as ‘data’ and so the terminology of RDM doesn’t engage them or may actively alienate them; researchers may be party to multiple data expectations from their institution and / or their funder, but a lot of them are not aware of that fact, never mind what these are and where to find them in writing.  Also, different disciplines have different data sharing conventions and protocols, which affect researcher behaviour; some researchers can be quite willing to practise good data management, but they need to know who to call or email about it at their own place; guidance written by digital curation specialists is great and fine, but often needs translating into non-specialist language, and there are lots of researchers who are just not going to engage with a policy document.  All that kind of thing.  Readers of this blog will possibly be amazed that such fundamental ideas are not more widely understood out there in the wider research community, but that in itself probably just confirms the knowledge gap between RDM people and the general researcher population.

So back at the event on 23 March, we heard from, amongst others, Richard Plant of the DMSPpsych project explaining the importance of local guidance for the institution’s researchers, and Norman Gray of MaRDI-Gross explaining the influence of the data sharing culture in big science on its researchers (although I never did get around to asking him if the project did indeed reach ‘the broad sunlit uplands of magnificently-managed big-science data’, as promised in the project blog).

History DMP from Hull charmed with an appearance by one of their tame researchers, who came along to give a brief account of his experience with the project.  He was happy not being familiar with RDM terminology or principles or, as he put it,

‘This process has been very straightforward for me.  I don’t understand the technical elements but I don’t need to.’

The benefits of easier remote access to and confidence in the security of his data storage were the pay-off for him, and left everyone feeling optimistic.

Reward at UCL/Ubiquity Press did many interesting things whilst aiming to lower the barriers to good RDM and shared a deluge of findings echoing those of Incremental / Sudamih, including the value of drawing together institutional RDM-related resources to provide a single point of access; the effect of discipline-specific protocols on researcher behaviour (specifically data sharing); the value of clarifying benefits of good RDM to motivate researchers; the lack of current awareness about IPR, licensing and data protection; the reluctance to discard data; the need for training about RDM and particularly long term preservation of data; and many other points.

So what occurred to me on 23 March was that it felt good to hear several of the MRD02 strand B projects reiterating our findings from their own experiences at their own institutions.  It reminded me of Heather Piwowar’s notion of ‘broad shoulders’.  It wasn’t that they were agreeing with us – I’m more than happy for my research to be challenged constructively.  It was that what we’d done in MRD01 seemed to be useful to some extent, allowing the MRD02 projects to extend and refine user requirements in RDM, and share what they found, which benefits us all.

Revisited: Meeting (Disciplinary) Challenges in Research Data Management Planning

The JISCMRD Workshop on ‘Meeting (Disciplinary) Challenges in Research Data Management Planning’ (March 23, 2012, London) saw the projects in this strand present their interim outputs; the development of DMPonline (now in v3.0), disciplinary templates and further institutional approaches rounded off the event.

The discussion circled around a number of issues and questions, some covered, some yet to be fully answered, as Steve Hitchcock points out in his excellent blog piece (e.g. What is a DMP’s scope, defined by whom? Where to best host a DMP? To what extent and how to (pre-)populate DMP records?).

Overall it is fair to say that a lot of good progress has been made on the DMP front – but challenges remain, especially as the implementation of funder requirements, data management policies and therefore DMPs has gained speed at institutional level:

  • For researchers/research groups “changing RDM culture is (going to be) hard work” as pointed out by Simon Dixon (SMDMRD project), representative of the overall discussion. Sticks AND carrots are needed (in a positive way: show benefits!).
  • Along with disciplinary working practices and cultures the requirements from DMPs in use are further evolving – not bound by project schedules and implementation time lines.
  • Furthermore, time is always a constraint for filling out DMPs; we have to try to mitigate the duplication of effort for data already stored electronically.
  • Good practice is not at all easy to implement, so training and documentation have to be part of it all.
  • In the end, DMP tools not only need to mature in general, but the DMP as such has to be a dynamic thing (vs. a static snapshot only) in a running project before it will be put to rest in an archive at the end of the research lifecycle.

Meik Poschen  <meik.poschen@manchester.ac.uk>
Twitter:  @MeikPoschen

Chatham House at Weetwood Hall: emerging themes from the JISCMRD02 institutional RDM policy workshop

Earlier this week, my co-facilitator and I had four wide-ranging and thought-provoking discussions across two days with the JISCMRD02 projects who attended the programme workshop on institutional research data management policy development and implementation at Weetwood Hall in Leeds.  The discussions were conducted under the Chatham House Rule, in the hope that projects and interested Fellow Travellers would feel able to share their challenges, successes, questions and institutional quirks openly, and I’d like to thank the participants for their time and energy in doing so!

It has been indicated to me that some preliminary notes of themes arising from our discussion would be useful, in advance of more detailed reporting.  I’d like to share some of the main themes that emerged from our group, with the provisos that:

  • these only represent one of the several discussion groups – main themes from the others may vary (and you can read Bill Worthington’s useful account from another group here); and
  • these are presented here for interest and discussion – please don’t interpret any of them as the official position of or advice from the MRD02 programme, the DCC or JISC – they’re simply ideas that bubbled up from our group conversations and were contributed by twelve individuals representing ten very diverse institutions, as well as the thoughts of our facilitator.

That said, we hope the lessons they’ve learned from their work so far in RDM policy development will be useful to others travelling the same path.

Themes and observations:

– At this point (March 2012), institutions are still all at different stages with their research data management policies.  However, in so far as they’re funded by the major funding councils, research councils and associated bodies, institutions are all subject to a common set of requirements, mandates and expectations from those funders, in addition to UK and EU legislation. In other words, the responsibility to have these expectations and requirements clarified and complied with is already there. It’s now up to institutions to decide their approach to an appropriate and realistic response.

– The idea of having an institutional research data management policy in place at your institution can be reassuring.  However, having a policy in place without any real buy-in from staff can be more harmful over time – by breeding complacency – than having no policy yet in place. So it’s best to take a little longer and get it right than rush through a policy in which researchers, research support staff or senior management have no investment or of which they have little awareness.

– A useful approach may be to craft an aspirational, high-level document which outlines principles as opposed to specific attributions of responsible persons, workflows, budgets and so on.  This high-level statement is often more easily understood by senior management and so can be the most effective way to get the policy through university senior committees and into institutional regulations.  This high-level policy should then be accompanied by, and executed by way of, working documents which translate the principles into specific tasks allocated to specific roles.  It should be anticipated that the high-level policy will not need frequent changes; it should allow enough room for, for example, new funder requirements, whereas the working documents should be regularly updated and seen as much more volatile documents.  This is, however, only one type of approach to institutional RDM policy development.  See also the JISCMRD02 Open Exeter project’s blog on the value of aspirational policy here.

– Policy and infrastructure need to evolve in correlation.  Some policies have been well-written but have foundered at the point of senior approval because they have specified responsibilities and workflows which the institution didn’t yet have the infrastructure to deliver.  At the same time, a well-organised policy can help to make the case to senior management for the investment in the necessary infrastructure.  This is another argument in favour of the high-level principles-based approach to the main policy, which can then be used to justify moving towards a more detailed position over time, via the working documents, whilst avoiding the danger of being rejected because of the lack of infrastructure.  It’s also an argument in favour of carrying out some surveying of the current state of infrastructure at your institution – including the ‘soft’ infrastructure elements of training provision, current skills levels in relevant staff groups, staff awareness of the requirements under which they’re currently working, etc.

– Consider the other policies – both internal and external – with which your new research data management policy should work in concert.  It’s obviously better to identify and iron out any potential wrinkles between these before you start plugging the new policy to senior management.  Examples of internal documents may include institutional policies on digital preservation, IT equipment use, open data, response to Freedom of Information requests, data protection, research ethics, intellectual property and academic integrity.  External documents to consider may include the Data Protection Act, Freedom of Information legislation, INSPIRE regulations, environmental data legislation, expectations and requirements of your funding council, expectations and requirements of your research funders, the Research Integrity Office’s research code of conduct, the RCUK code of research practice and relevant legislation relating to use of government data, intellectual property and copyright.

– Retain awareness of the different roles and legislation for research data and administrative data.  Whilst anyone drafting a research data management policy would benefit from knowledge of how the institution handles administrative data, and there may be some crossover in relevant legislation (particularly UK and EU legislation for some aspects of both), it’s important to remember these two categories of data have different purposes, different stakeholders, and attract different expectations by funders, and so should be dealt with by discrete policies, clearly pitched to the relevant audience for each.

– Try to avoid taking the view that researchers will automatically resist implementation of a research data management policy.  Some may be suspicious of it, some will be enthusiastic – and the difference is often down to the approach used.  In institutions where the development and implementation of such a policy is presented as a way to help researchers (e.g. ‘We’ll look after it so you don’t have to’, promotion of the benefits to the researcher, etc.), as opposed to being a new rule or requirement imposed by the central administration, researchers have generally responded enthusiastically.

– Whilst recent research (e.g. the JISC/RIN/DCC DaMSSI project) found that researchers respond well to data management training when it is presented as just one of many aspects of excellence in research practice, there is a tension between embedding RDM training as just another part of routine business and highlighting it sufficiently to attract attendance at training and to ensure researchers pay attention to good RDM practice.  Motivation can be helped by underlining the benefits of good RDM practice to the researcher’s career and profile, their enhanced ability to find their own work in the future, increased impact and a more efficient way of working.

Do any of these points chime with your experience?  Or contradict it?  Let us know in the comments!

Developing Research Data Management Policy

This is Jonathan Tedds (@jtedds): Senior Research Liaison Manager for IT Services; researcher in astronomy and research data management at the University of Leicester. By way of a first blog post proper here in JISCMRD Towers I want to introduce the increasingly higher profile area of Research Data Management (RDM) policy and why it’s rapidly moving from desirable to essential.

Following the agreement by the RCUK umbrella body of research funders on common data principles for making research data reusable – data as a public good – and similar moves by larger charitable trusts such as Wellcome, funders have then batted the ball back to institutions and said deal with it! The EPSRC in particular requires that institutions in receipt of grant funding establish a clear roadmap to align their policies and processes with EPSRC’s expectations by 1st May 2012, and are fully compliant with these expectations by 1st May 2015 – yes, you did read that correctly, that’s a roadmap by this May! Sarah Jones of the Digital Curation Centre (DCC) has just blogged about this following a refreshed look at this area during the very well attended recent DCC Roadshow at Loughborough in February 2012.

Of course there are many other reasons why any institution that is serious about research should be investing in the support of RDM, and Angus Whyte and I recently co-authored a DCC Briefing on making the case for research data management which sets the national and international context as well as describing the experiences in the last 3 years at the University of Leicester. As a consequence institutions (and more specifically those held accountable for supporting researchers) are now realising, if they didn’t already, that they need to plan for research data management infrastructure on the ground across the entire research data lifecycle. Crucially they will also need high level policy at the institutional level to make this a reality. So how to go about it?

Well there are a few institutions that already have policies in place including Edinburgh, Oxford, Northampton and Hertfordshire. The DCC maintains a list of these with links to relevant institutional data policies. Of course this in itself is a grey area as your institution may well already have a code of practice which covers at least some of this ground. But does the policy (or the code!) always connect to the practice on the ground? Bill Worthington, who leads the Research Data Toolkit (Herts) JISCMRD project, has recently blogged on their work in this area.

At Leicester we have been building up to an institutional level policy to fit alongside an existing code of practice, adopting a rather ground-up approach: building on exemplars such as the JISCMRD Halogen interdisciplinary database hosting project and the current BRISSkit UMF project I lead for cross NHS-University biomedical research, alongside high profile central investment in high performance computing (HPC). I facilitate a Research Computing Management Group across the University which takes a strategic view of these issues and will inform our own institutional level policy working party.

A recent email exchange on the JISCMRD mailing list showed a strong interest from the many new (and established) institutes involved in getting together to discuss a number of issues around developing and implementing RDM policies. Following an online poll it was decided to host a lunch-to-lunch meeting, supported by the Programme and assisted by the DCC, to take this forward at the University of Leeds on March 12-13th 2012. Based on the poll we are expecting up to 50 participants. I’ll link to further details as they are finalised and made available. Themes raised to date include:

  • How are projects/institutions developing policies? Covering considerations of general principles, guidelines from funders and other bodies, specific considerations for the institution in question.
  • How are people getting approval for policies? A chance to share – e.g. off the record or by the Chatham House Rule – some of the challenges which may be faced.
  • How are people planning to support the implementation of the policies? How do projects/institutions intend to support transition from policy to practice?  Policy, infrastructure and guidance.  Interplay of top-down and bottom-up elements?  How to build mention and requirements of subject specific and/or institutional services into institutional policies.
  • How do technical solutions affect policy decisions? How much will policy be driven by what is technically available to an institution as a (suite of) data management solutions?
  • How are we going to assess and critique the success of RDM systems and policies?

Finally, there are of course difficulties in all of this focus on the institutional level. As a researcher myself (astronomy) I argue that a researcher or research group is likely to have much more in common, regarding their requirements to manage their data, with a similar researcher or group in the same discipline but residing in any other institution (including international) than with another researcher/group even in the same building. So we are asking a lot of institutions to meet this full range of requirements across all of their research areas. Researchers rather tend to look to their disciplinary learned societies or evaluation panels established by funders to provide coordinated responses. To be sure, the institutions have a strong role to play and shoulder a strong measure of responsibility but they are by no means the whole answer to the problem, as I blogged in Research Fortnight (February 2011).

Oh, the humanities! A discussion about research data management for the Arts and Humanities disciplines

Here are some brief notes from the Arts and Humanities breakout group at the JISC MRD02 Launch Workshop earlier this month.

We began with brief introductions around the table.

Definitions of ‘research data’:

Chris Awre from History DMP, Hull, kicked off the discussion by asking: in the arts and humanities, how do you define what research data is?  The term ‘data’ means different things and different activities in different departments.  Also, a lot of what could be called data is secondary rather than primary, i.e. a lot is not new facts, but gathered facts.  Should we attempt to set a definition?
Marie-Therese Gramstadt from the Kaptur project outlined how they are trying to find a way of talking about research data that doesn’t use that particular term.  Instead they talk about looking at the materialisation of research.  They come across lots of paper-based research – do we concentrate just on the digital?  Or do we include hard copy as well as digital data when we talk about managing research data?
Simon Price, data.bris project, reported that in the Bristol theatre collection, a lot of the collection is physical artefacts, including scans and photos and objects, and getting them into a citable or preservable form is a challenge.
Anastasia Sakellariadi, REWARD project, noted that first steps may include finding out from the institute what people use.  You could ascertain that, e.g. through a survey of researchers in your department, and scope that first, then use that definition.  This may be a more useful way to engage with researchers.
Brian Hole, also from REWARD, noted that the REF now specifically talks about data, so that term is being used more now.  In REF terms we’re being compared with STEM [science, technology, engineering and mathematics] subjects who in some cases have much longer-established practice in data management and sharing.

Motivations for effective research data management:

Brian remarked that if we want researchers to plan to publish the data at the end of a project, we need to talk about it at the start.  That makes researchers think about it from the beginning.  It’s good to provide an aspirational model.  Keep the researchers’ eyes on that goal from the beginning.
Anastasia added that if we are to achieve this paradigm shift, we have to encourage people to do research for themselves but also to think, ‘I’m doing this work for this project but for other people too.’
Laura Molloy, University of Glasgow, noted that in the JISC MRD01 projects Incremental and DaMSSI, the team found repeated evidence that terminology in awareness-raising efforts, skills training, policy and guidance must be very carefully considered as the use of information management- or digital curation-specific language, or legal language, immediately presents a barrier to many researchers and diminishes their engagement with potentially useful material.  Also, most researchers have other issues as their priority so to get their cooperation and increase motivation, benefits of good research data management must be clear.

Subject-specific differences in re-use:

The group noted differences within arts and humanities across disciplines re. data re-use.  Archaeology is strong on data re-use.  The case for the continuation of the Archaeology Data Service was relatively easy to make because of the unrepeatable nature of some archaeology work.
If there is a strong tradition of re-use in a discipline, the case is easier to make for good research data management.  If the discipline does not currently widely re-use data, we need to find ways to make re-use more attractive.
Simon Price noted that there are often problems putting video content online that has been digitised at Bristol because they can’t track down the rights holders – this is a common barrier to re-use of this type of data.

Data centres and sources of advice:

The group noted the current lack of advice for research data management across the arts and humanities disciplines, particularly with the closure of the Arts and Humanities Data Service (AHDS).
Chris queried the value of depositing data in a discipline-specific data centre, and/or an IR.  Julian Richards of the ADS noted that the value added by deposit in a discipline-specific data centre includes visibility, data mining and aggregation of datasets amongst other advantages.
Simon Price noted that institutions don’t necessarily need to keep data on-site.  The key elements are a citable point and the metadata, and any data centre will do as long as it’s a trusted centre.
Julian added that we need to harmonise metadata to make sure that when a researcher deposits data, they only need to create the metadata once, and APIs are needed to see access stats as this encourages use of data centres.

There then followed a discussion of work on DOIs to distinguish parts of datasets as opposed to the entire dataset.

Brian commented that during his work on the LIFE project, they discovered that in digital preservation, data probably needs to be migrated far less often than the team originally thought.
There’s a lot of data that can be lost when migrating from one format to another.  Julian noted that this is one of the arguments for discipline-specific repositories.  Staff with discipline knowledge are going to be more likely to be aware of these risks, and how to check that significant characteristics, properties and metadata of the file haven’t been lost.

Training for researchers:

When considering training to help researchers address research data management issues, presenting such training in a ‘digital humanities’ environment runs the risk of ‘preaching to the converted’ – a digital preservation / research data management event will definitely do so.  The group concluded that perhaps training would be better delivered in a subject-specific environment (particularly one more specific than ‘arts and humanities’ as this is far too broad an area to be useful).

If you were present at the group, please supply corrections and additions to laura.molloy AT glasgow.ac.uk – thank you.  Otherwise, please enter comments below!

JISC Managing Research Data programme – mk. 02 is go!

In case it’s useful, here’s a quick report on the JISC MRD02 programme manager’s opening remarks.  Simon Hodson’s introductory talk contained quite a lot of useful information for those working on the new programme and “fellow travellers”.

Simon opened the launch event by welcoming the new projects joining the JISC Managing Research Data (MRD) programme in its second funded iteration, plus some additional researchers from the last programme who were there to share experiences, and “fellow travellers”, i.e. interested other parties with experience to share or a particular interest in the work of the new iteration of the programme.

The current challenges include tackling management of the well-acknowledged data deluge: this is about huge quantities of data but the problems of managing this are not just limited to storage.  There are opportunities here: ways to improve and develop data re-use, run meta studies and engage with interdisciplinary grand challenges.  There is increasing awareness of research data as an asset and recognition of the fact that data has re-use value.  Simon stressed the importance of building on what’s already been done to ensure our work on research data management continues to make real progress.

Simon then described the new programme.  There are twenty-seven new projects funded through the 07-11 call for MRD02, across three main strands.

Strand A consists of the infrastructure projects – these include work on systems and storage, and also policy, support and guidance.  Nine projects from Strand A will be piloting new infrastructure, four will build on existing pilots, three will develop discipline-specific infrastructures and one will develop infrastructure with a focus on metadata.

Key deliverables for Strand A projects include:

  • Requirements analysis;
  • Implementation plan;
  • Description of a research data management system including lifecycle management and preservation;
  • Description of the human support infrastructure, i.e. the guidance and support that will be provided;
  • Institutional research data management policy;
  • Evidence of benefits of interventions made by each project and cost information where available;
  • Business plan for sustaining the pilot infrastructure or service.

There are ten more projects which are about planning in one way or another – these constitute Strands B and C.  These projects will help researchers, research groups and departments to meet funder requirements.  They will also explore discipline-specific challenges associated with making and executing DMPs.  There are eight 6-month projects developing DMPs and the infrastructure to implement these plans, and two 12-month projects enhancing DCC’s DMP Online tool.

Strand B planning projects will deliver:

  • Requirements analysis and description of information / data development;
  • DMP and supporting system infrastructure with appropriate guidance and support materials;
  • No business case is required, but they should contribute to the programme objectives of gathering evidence for making the case for data planning.

Strand C planning projects will deliver:

  • Requirements analysis and description of data architecture;
  • Enhancement and adaptation, with DCC, of the DMP Online tool, including guidance and user support, and feedback to DCC.

There will be a further funding call in January 2012 focusing on training and research data publications. The publications call will encourage bids to work in partnership with researchers, educational boards, scholarly societies, data centres and publishers, to encourage use and citation of data.  The research data management training call will seek bids for the development of training programmes for specific disciplines, for support roles (e.g. librarian, research liaison staff), and partnerships with professional bodies.  Key outputs sought include recommendations for future funding.

Simon mentioned several upcoming events which will be relevant for JISC MRD02 project staff.

  • There will be a workshop for RDM planning projects in or around March 2012 with a focus on demonstrating project outputs.
  • The British Library is being funded by the MRD02 programme to run a series of five workshops about DataCite, targeted at JISC MRD02 projects and open to other interested parties.
  • There will also be a workshop for infrastructure projects and fellow travellers in either July or September 2012 – the date for this is to be decided.
  • Finally, there will be the JISCMRD conference in March 2013, which will be a large international event for programme staff and an international audience to share findings, deliver demonstrations and plan future work.

Simon then introduced the importance of evidence gathering in the new programme.  This will be an explicit activity in this iteration of the programme with three part-time members of staff assigned to it.   It is important for the programme to have evidence of the benefits of the interventions provided by the projects and to be able to provide this evidence to the projects’ host institutions and to the wider community.  This will be supported by the programme as much as possible, and efforts by the projects to list and explain the benefits of their work will be helpful towards the writing of their business cases.

Projects will be specifically expected to identify likely benefits from their projects, what evidence they can produce and any possible metrics. This information should be blogged, both for the project’s own reference and in order to share it across the programme and also publicly.  The three evidence gatherers – Laura Molloy, Meik Poschen and Jonathan Tedds – will be identifying themes as they emerge from project blogs, and will be working with the projects to encourage blog posting, engage in tweeting and re-tweeting relevant material to promote project activity, posting their own blog material responding to issues raised by project reporting and generally, in this way, compiling and stimulating an evidence base.  The evidence gatherers will be particularly interested in commonalities across the programme including any themes arising around approaches, discipline focus, technical platforms, identifiers and metadata.

The webpage for the JISC MRD02 programme is located at

http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx

This page lists the URLs of project websites.

Simon has tracked the initial commonalities of the projects on the following publicly-available spreadsheet:

https://docs.google.com/spreadsheet/ccc?key=0AoaHWqA_UJNhdEhaaaRT0lpTTJjUDBvUVpEMDlmYlc2RlE#gid=0

and the project blogs are listed on his own blog at

https://researchdata.jiscinvolve.org/wp/2011/11/29/jisc-managing-research-data-programme-2011-13-new-rdm-planning-projects-2

There is an RSS feed of JISC MRD02 project blogs at http://bit.ly/JISCMRD02-Blogs.  This feed was compiled by Jez Cope.  If you are a project on the programme, please do make sure your blog is listed here.

And a Twitter list for tweeters connected to the programme has been set up by Brian Kelly at https://twitter.com/#%21/briankelly/jiscmrd.

Thanks to Jez and Brian for these useful tools!

Laura Molloy

E: laura.molloy@glasgow.ac.uk

Twitter: @LM_HATII