Author Archives: lmhatii

Research Data Management programme Training Strand kick-off workshop, London, 26 October

This one-day event provided an overview of the JISC MRD programme training strand, its aims and context; a description of the DaMSSI-ABC support initiative for the training strand and the various pieces of work it hopes to complete, particularly in terms of making outputs easier to find and use; and recognition of the fact that the activities of the four small training materials projects of the JISC Digital Preservation programme correspond with those of the RDMTrain02 projects.

The four RDMTrain02 projects each talked about their approach, activities, challenges and progress, giving us an idea of the subject areas or staff groups they are specifically addressing with the RDM training materials they develop:

  • RDMrose, Sheffield (Andrew Cox): ‘information professionals’ (which I understand to be, in this context, academic librarians)
  • Research Data Management Training for the whole project lifecycle in Physics & Astronomy research (RDMTPA), Hertfordshire (Joanna Goodger): PG students and ECRs in the physical sciences
  • Sound Data Management Training (SoDaMaT), Queen Mary University of London (Steve Welburn): postgraduate research students, researchers and academics working in the area of digital music and audio research
  • TraD: Training for Data Management at UEL (Gurdish Sandhu and Stephen Grace): PGR students in psychology and in computer science.

The afternoon session consisted of an introduction to a set of description and evaluation criteria which have been developed by the Research Information Network through its Research Information and Digital Literacies coalition.  These criteria are in an advanced draft form and participants were asked to read them and give feedback.  They are intended to help with 1. specifying what the training resource or event is meant to do and who it is for, and 2. assessing the success of the training against those specifications.  As such, the document is potentially a very useful tool for reminding those developing training of the measures they can take and the factors they should consider in order to create a genuinely useful training resource, whilst also providing a framework for review and impact.

Some participants were perhaps not entirely clear on the potential benefits of the criteria, and profited from a chance to discuss the document with members of the DaMSSI-ABC team.  Those who had a clear grasp of the aim and structure of the document – usually by replacing ‘information literacy’ with ‘research data management’ for ease of use in their particular context – agreed it looked very useful and provided a structure that may clarify what they’re trying to do.

Detailed feedback and questions on the criteria were sought, and will still be received gratefully by Stéphane Goldstein at stephane.goldstein AT researchinfonet.org.

Discussion was a good opportunity for projects to ask questions and share experiences.   Points included:

  • Culture change in institution can’t be expected to happen during short project lifespan.  But projects can be a catalyst to inspire change and start the process.
  • Important to remember that changing culture in one area or institution can influence other players, e.g. researcher practice and requirements can influence the behaviour of publishers if messages are clear enough.
  • Support – including admin – staff are an important population in institutions: in universities, they are over 50% of staff.  They also have to manage data and information.  Datasafe (Bristol) has been considering their needs as well as those of researchers.
  • Simplification of models can sometimes help engagement.  As JISC’s Neil Grindley pointed out, many initiatives have simplified models such as the DCC lifecycle model into four main areas; e.g. the four digital preservation projects have collaborated on a leaflet which reduces DC activities to: start early, explain, store, share.  This will henceforth be known as the Grindley Theory of Four Things.
  • Short (5 – 10 min) resources lend themselves to easier re-use and can more easily be slipped into training at the institution that isn’t about RDM.  This means we can raise awareness more widely than just preaching to the converted.  For example, it would make sense to include RDM in induction training, or training for researchers in bidding for funding.
  • Terminology is still an issue: ‘digital preservation’ and even ‘data’ are problematic in some training contexts.
  • People in institutions are already doing training in disparate ways in areas connected to RDM.  It’s important to find out if this is happening in your institution, if they are aware of your project and if you’re giving consistent messages across the institution.
  • Even simple measures can be valuable when you’re trying to quantify the benefits of improved RDM.  Sometimes a quantity is useful, sometimes a story.
  • Need for generic as well as discipline-specific training and resources.
  • Need to work across campus and involve all relevant areas such as research office, library, IT services (both local and central computing services), staff development services, legal office.
  • Librarian role is valuable for various reasons, but an important one is the ability to use links across campus.
  • Whilst researchers often appear to have higher loyalty to their discipline than their institution, and researchers are a mobile population, a discipline by its nature doesn’t often have agreed rules, representatives, funded infrastructure or membership.  So knowledge can be passed through informal networks, but there is little in the way of actually engaging with ‘a discipline’ as a whole.  It’s still institutions who are providing the infrastructure, policy framework and the training.  DaMSSI-ABC is keen to work with professional bodies where these exist to try and address this situation.
  • This strand of projects, as well as fellow travellers, e.g. www.le.ac.uk/researchdata, are happy to build on prior work, e.g. JISC Incremental www.glasgow.ac.uk/datamanagement, UKDA, Sudamih, in the ‘four things’ approach to building online guidance.
  • Is there a role for organisations such as UKCGE, HEA?

Links:

JISC MRD training strand (RDMTrain02): http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata/research-data-management-training.aspx

DaMSSI-ABC: http://www.researchinfonet.org/infolit/damssi-abc/

RIN Research Information and Digital Literacies Coalition: http://www.researchinfonet.org/infolit/ridls/

RIN Criteria for Describing and Assessing Training: http://www.researchinfonet.org/infolit/ridls/strand2/

DMP session at JISC MRD and DCC IE workshop, Nottingham

Wednesday’s session on data management planning (session 2A) at the JISC Managing Research Data programme progress / DCC institutional engagements event was addressed by

  • Rachel Proudfoot of the RoaDMaP project at Leeds and her researcher colleague Professor Richard Hall of the SpineFX project;
  • Meik Poschen of the MiSS project at Manchester (as well as of programme evidence gatherer fame!);
  • the UKDA’s Veerle Van den Eynden of the RD@Essex project.

Speakers each gave lively updates on the work of their projects, their engagement with research data management and, particularly, data management planning in each of their contexts.

Challenges and lessons learned

Rachel Proudfoot reported that at Leeds, every research application is now to go through an RDM risk assessment process.  I wondered if that meant a large majority of researchers would have participated in the writing of a data management plan of one sort or another.  However, RoaDMaP research tells us that 44% of researchers surveyed said they’d done so.  This is an encouraging figure, but the RoaDMaP team are keen to improve matters.

RoaDMaP has been using the DCC’s DMPonline tool in their work with researchers: Rachel reported that not all funders are equally well served by it yet, but she has been feeding suggestions to the DCC and hopes to be able to recommend it to researchers in the future.

Rachel is not alone in dealing with mixed practice across faculties and subject areas in a large, complex institution.  Veerle Van den Eynden described how RD@Essex is also engaging with a range of disciplines to learn about and build on their differing practices.

Rachel underlined the need for a consistent message across committees and policy.  Veerle agreed that the university needs to send a strong, consistent message about its stance and expectations around RDM to all researchers.  To be achieved, of course, this needs to be supported by technical infrastructure and the cohesive interaction of university systems, a challenge which, as Meik Poschen reported, is being tackled at Manchester too.  As Meik noted, the integration of systems is not only a more efficient and possibly cost-effective way of gathering and keeping information about research at the institution, but it can also minimise the frustration of researchers with administrative procedures by removing the need to supply the same information several times as part of the bidding process.

Another challenge identified is the provision and sustainability of support for RDM activities including the development of data management plans.  Some projects are able to provide this at the moment due to the relatively low levels of awareness and concomitant low levels of demand.  But projects today aired concerns about scalability, particularly once policies become more robust, awareness rises and demand increases.  All three projects are reaching out to their various audiences with online guidance resources to provide on-demand help and supplement in-person guidance provision.

Richard Hall, a spine researcher at Leeds, is clear that members of the research team should be a priority in the development of a data management plan as they will be the best people to give a realistic account of the scale and type of data anticipated, and also any changes in technology that are likely to occur during project lifespan.  His example brought it home: a few years ago, scanners could produce a scan of a vertebra in a day or two: now whole spines can be scanned in a few hours.  Increasing the speed and capacity of scanning not only means that more scans are produced during the project lifespan: also, as it’s so quick and easy to produce larger and more complex scans, researchers are likely to produce and keep more and larger scans than they would have a few years ago.  Meik also outlined the challenges posed to RDM by rapid change in research technology.

Other lessons learned by Richard in this area are that DMPs must be part of research activity from the earliest stage possible, and that a requirements specification needs to be developed at that time.  A project risk assessment is also useful to identify challenges.  These will all need resourcing – not only financially but also in terms of time and attention: data management planning for even the simplest data needs thought and researcher engagement.  (Unsurprisingly, financial resourcing for RDM was also highlighted as another challenge by the other two institutions.)

But of course more roles than the researcher alone must be engaged: all projects acknowledged the various roles involved in good RDM practice across the institution, and Meik was particularly clear about the need to clearly assign both responsibilities and accountability for various stages of RDM.  MiSS is developing training for library, research office and business managers at Manchester to raise awareness across the campus.

What has worked or is working?

In RoaDMaP’s view, the DCC’s DMPonline works quite well for some funders.  Examples of a DMP created by the tool can be reassuring for researchers, who often find that by contrast, talking about it in the abstract can be disconcerting.

Rachel is convinced that to get researchers on board with guidance, services and tools, it’s crucial to put lots of feedback mechanisms in place for timely and detailed user information.  This not only helps to improve the product, but also gets over the message to researchers or other users that their experience is important to the process, an idea echoed by Richard Hall.

Richard is pleased that working on data management plans with a research team doesn’t just yield the plan itself: his experience is that the process also helps to bring about cultural change as the relevant issues are examined and decisions reached.  Other advantages to the activity are that it helps to instil a culture of cooperation throughout the research team even where there are national boundaries, and that the additional governance structures ultimately enhance research.

What can the MRD programme or the DCC do to help? 

The MRD programme has done much already to bring RDM questions into focus, and put in place pathfinder projects as well as supporting development in institutions at a more advanced stage of supporting RDM.  Many projects will be hopeful of further JISC MRD programme investment to sustain and extend the work on which they are currently engaged.

Many suggestions emerged in the question period for future DCC activity, including:

  • Promotion of the benefits of writing DMPs alongside the risks and costs of not participating
  • Work with Je-S / RCUK to streamline the process and improve consistency
  • Help to coordinate policy production across engaged institutions
  • Guidance about roles and responsibilities

How do you model costs?  Have you assigned responsibilities and / or accountability for various RDM functions at your institution?  And is there anything you’d like the MRD programme, the JISC more widely or the DCC to do, either now or during future work, to meet RDM challenges?  Tell us in the comments.

Two JISC MRD programme events this week!

A particularly busy week this week for the JISC MRD programme and its projects!

On Wednesday 24 and Thursday 25 October, we have the ‘Components of Institutional Research Data Services’ meeting which involves both the JISC Managing Research Data Programme 2011-13 and the DCC Institutional Engagements team.  It’s from 12.00 on Wednesday to 16.00 on Thursday and is at the NCL Conference Centre, Nottingham.

Those attending, and those who are just interested in what we’ll be up to, can see the programme here.

Suggested Twitter hashtags, should you be that way inclined, are #JISCMRD and #UKDCC.

And then on Friday 26 October, we’re continuing to keep ourselves busy with the kick-off event for the new strand of research data management training materials projects!  This event is in London, at Venuesetc. Paddington and I’ll be there as one of the DaMSSI-ABC team along with the rest of the team and our four new(ish!) training projects.  The programme is available here.  Good hashtags for this event include #JISCMRD, #DaMSSI, #researchinfonet and #UKDCC.

We’ll be hearing from the projects and also asking for their take on some draft criteria for describing and assessing training interventions, which is a really interesting and accessible document to help thinking about planning and assessing training.  The criteria have been developed by the Research Information and Digital Literacies Coalition (RIDLs), supported by the Research Information Network.  There is more information about the criteria here: http://www.researchinfonet.org/infolit/ridls/strand2/.

DaMSSI-ABC will be a support and synthesis initiative for the training projects.  It’s explained in more detail here: http://www.researchinfonet.org/infolit/damssi-abc/ and also has its own blog at https://damssiabc.jiscinvolve.org/wp/.  Please feel free to use the blog to post comments, questions and feedback before, during or after the event.

Hope to see you this week!

 

Evidence Gathering: The Field Guide

Evidence?  Of what?

We have great lives as Evidence Gatherers – really, we do. Swanning around to meetings, reading interesting blogposts from MRD02 projects, being nosy about what the programme’s projects are doing and writing about stuff that engages us.  But there is a more serious side to the role.  The clue’s in the name, really: we’re here primarily to Gather Evidence.  But evidence of what, and why?

Well, like everyone else in the research sector, JISC is under considerable obligation to provide clear and compelling evidence of the value of its activities.  Everyone on an MRD02 project knows that what the programme’s projects are doing is going to really change things for the better in research data management – whether that’s at our institution, our discipline or more broadly across the sector – but how do we prove this?  We all know there’s less money going around to fund the sort of research we want to do in RDM, so how do we make the case in clear and irrefutable terms that our work brings benefits?  Real, measurable, trackable benefits?  Hence the decision to undertake a structured approach to gathering and presenting the evidence of the benefits of MRD projects.

Anyone from an MRD01 infrastructure project will remember the requirement for a benefits case study near the end of project activity.  (These were brought together in a handy summary document.)  But we couldn’t help thinking, ‘If only we’d been able to plan for writing this case study earlier.  Then we could’ve put some pre-activity benchmarks in place to show how great we were.’  So this time around, projects were introduced to the benefits work at the programme’s kick-off meeting, and then wrote a blog post early in the project to outline the benefits they were expecting to realise.

The Field Guide to our approach

One of the main things we’ve noticed when reading these blog posts is that there is often a bit of confusion around what constitutes an output; a benefit; and a piece of evidence.  For our purposes, here is the Field Guide to the MRD Evidence Gathering Approach:

  • An output is something that the project is going to make, produce, put in place or that it otherwise aims to deliver.  These will be specified in your project plan.
  • Benefits can be identified by asking, ‘What does this help us (the institution / researchers) to do better?’
  • Evidence consists of specific, clear metrics (quantitative measures) and specific, clear qualitative evidence such as narratives and short case studies, all of which support or prove the benefit.

So for the Evidence Gathering work, we need to establish a list of benefits for each project, and each benefit in turn needs to be supported by evidence.

An example:

  • Output: Production and approval of a data policy is an output (and a great one!  Go you!)
  • Benefits: How does this output help the institution / researchers to do RDM and/or research better?  Well, having this policy to refer to can contribute to i) easier compliance with funder policy, ii) improved availability of RDM infrastructure, and iii) improved ability of the institution to plan for future requirements.  These are three benefits.
  • Evidence: So appropriate evidence can be the tracking of quantitative measures, e.g. an increased number of references to the data policy within research proposals,  against an existing benchmark.  A case study with a researcher showing how the policy helped them with success in bidding would be inspiring and could show compliance with funder policy to good effect.  Evidence of increasing reference to the data policy, over time, along with an increased number of datasets held securely and in a context that makes them available for re-use would be compelling.  Interviews with key staff from planning office or research office (as appropriate) about use of RDM roadmap/policy, or a narrative detailing how the policy is being used to improve the institution’s RDM infrastructure could also be used.
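For projects that like to keep this sort of thing in a structured form, the output / benefit / evidence relationship above could be sketched roughly as follows.  This is only an illustration, not a programme tool: the class names, the `change` and `unsupported_benefits` helpers and, above all, the figures are hypothetical placeholders rather than real project data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    """One piece of evidence: a metric (quantitative) or a narrative (qualitative)."""
    description: str
    kind: str                         # "metric" or "narrative"
    baseline: Optional[float] = None  # benchmark taken before the activity
    current: Optional[float] = None   # latest measurement

    def change(self) -> Optional[float]:
        """Difference from the benchmark, or None when there is nothing to compare."""
        if self.kind != "metric" or self.baseline is None or self.current is None:
            return None
        return self.current - self.baseline


@dataclass
class Benefit:
    """What the output helps the institution / researchers to do better."""
    statement: str
    evidence: List[Evidence] = field(default_factory=list)


@dataclass
class Output:
    """Something the project makes, produces or puts in place."""
    name: str
    benefits: List[Benefit] = field(default_factory=list)

    def unsupported_benefits(self) -> List[str]:
        """Benefits that do not yet have any evidence attached."""
        return [b.statement for b in self.benefits if not b.evidence]


def build_example() -> Output:
    # The numbers below are invented placeholders, not survey results.
    references = Evidence(
        description="References to the data policy in research proposals",
        kind="metric",
        baseline=4,
        current=11,
    )
    case_study = Evidence(
        description="Case study with a researcher on success in bidding",
        kind="narrative",
    )
    return Output(
        name="Data policy produced and approved",
        benefits=[
            Benefit("Easier compliance with funder policy", [references, case_study]),
            Benefit("Improved availability of RDM infrastructure"),
            Benefit("Improved ability to plan for future requirements"),
        ],
    )


if __name__ == "__main__":
    policy = build_example()
    for statement in policy.unsupported_benefits():
        print("Still needs evidence:", statement)
```

Listing the benefits that have no evidence yet is the useful part here: it mirrors the point that each benefit in turn needs to be supported by evidence, and that a benchmark has to exist before any change can be shown.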

Tailored solutions

At programme events, project staff will probably have noticed how diverse the programme is.  The types and sizes of institutions, the aims of projects and the approaches to RDM in all these circumstances make for interesting meetings and energetic debate.  However, it also means that we don’t propose a one-size-fits-all approach to the Evidence Gathering work, so much of our time is currently spent crafting a tailor-made list of sensible and appropriate pieces of evidence for each project. These are to be delivered in an Evidence Report along with Final Reports, but projects should also find the material very helpful when putting together their sustainability business cases.

The goal is to have clear, understandable and compelling evidence for each project which contributes to an evidence base for the programme as a whole.  This will show the difference made for the better – how we as a programme have improved matters, changed the game and moved RDM onwards in the UK HE sector.

OR2012: Research Data Management and Infrastructure: institutional perspectives

Research data management can make a significant contribution to an institution’s research performance but needs solid user requirements research, an understanding of the researcher working space and a collaborative approach between researchers and support staff for infrastructure to be adopted, understood and sustained in the institution.  That was the message from this session on 11 July in Edinburgh at Open Repositories 2012 on research data management and infrastructure, from the perspectives of three particular institutions.

Unmanaged to managed

First we heard from Natasha Simons from Australia’s Griffith University.  Natasha made a clear connection between the university’s position in the top 10 research universities of Australia, and the existence of their Research Hub, which was developed with funding from the Australian National Data Service.  The Hub stores data and relationships between the data, exports to ANDS, and provides Griffith researchers with their own profiles, which support better collaboration across the institution by allowing researchers to find others with similar research interests for joint work and supervision.

Natasha outlined some challenges the Griffith team have met and are currently facing, but ultimately reported that they are successfully transforming institutional data in line with ANDS aims from unmanaged to managed; disconnected to connected; invisible to visible; and single-use to reusable.

Resourcing for RDM

Another institution which connects RDM with its prestigious position in the research league tables is Oxford; Sally Rumsey of the University’s Bodleian library took us through their vision for their institutional research data management infrastructure, encompassing current work on the Oxford DMP Online and the DaMaRO project; data creation and local management (DataStage, ViDASS); archival storage and curation (DataBank, software store); and data discovery and dissemination (document repository, Oxford DataFinder and Colwiz).

Sally argued that data management doesn’t stop at digital objects:

“Paper in filing cabinets, specimens in jars: all could exist as data.”

She also reminded us that although emerging funder requirements, and particularly this year’s EPSRC roadmap requirement, were doing much to focus minds on RDM, there is also the challenge of unfunded research, a major component of research activity at Oxford.  This needs requirements and funding for management, too.

Sally was asked whether researchers were going to end up paying for RDM infrastructure.  She argued that there needs to be a budget line in research bids to cover these costs.  This prompted me to think about the fact that we talk about getting researchers trained from the start of their research activity, but to bring about the kind of awareness that will lead to researchers knowing to cost in data management in their bid, we need to engage with them before they start even writing the bid.  This is an argument for engagement at PhD level at the latest, and for a much wider and more consistent provision of RDM training in universities in order to bring about this kind of change in culture.  Clearly we also need simple, accessible costing tools to help non-specialists quantify explicit costs for data management and preservation, for inclusion in funding bids.

Adopt, adapt, develop

Anthony Beitz, manager of Australia’s Monash University eResearch Centre, also has nascent culture change in mind.  He described the availability of research data as having the potential to change research work:

“We’re going to see things we’ve never seen before.”

Anthony’s description of how the eResearch team works at Monash is based on a clear understanding of the characteristics of the research space and how that differs from the way in which IT services staff work.

  • Researchers: focused on outcomes.  They work in an interpretive mode, using iterative processes.  The approach may be open-ended and thrives on ambiguity.  Requirements and goals may change over time.  May require an ICT capability for only a short period of time – don’t tend to care what happens to it after the end of the project.  Resourceful, driven, and loyal to their discipline more than the institution.
  • IT services: broad service base.  Supporting administration, education and research.  Continuity of IT services is a priority.  Excel at selecting, deploying and supporting institutional enterprise solutions.  IT works in analytical mode as opposed to the research space, which is in interpretive mode.

The volume of data is growing exponentially, but funding to manage it is certainly not.  In this context, a clear articulation of need between the researcher space and the IT services space is crucial.  Anthony argues that researchers need to participate actively in the deployment of an institution’s RDM infrastructure.  Media currently used is not good for reliability, security or sharing, but no single institutional RDM platform will fit all researchers’ needs.  RDM solutions must be a good cultural fit as researchers have stronger synergies with colleagues beyond the institution and are more likely to use solutions within their disciplines.  Anthony suggests that IT services should adopt existing solutions being used within disciplines, where possible, as building a new one breaks the collaboration cycle for researchers with colleagues from other institutions, asserting, “going into development should be a last resort.”

In this way, much of the RDM activity at Monash seems to be explicitly responding to current researcher behaviours.  Adoption of emerging solutions is encouraged by promoting a sense of ownership by the researchers; by delivering value early and often; and by supporting researchers in raising awareness of an RDM platform to their research community.  If users don’t feel they own a resource, they’ll look to the developers to sustain funding.  If they feel ownership, they’ll look for funding for it themselves, so buy-in is not only good for adoption but also for sustainability.

DARTS3, The Third Discover Academic Research Training & Support Conference. Dartington Hall, Devon: 28 – 29 June 2012

Whilst storms swept much of the rest of the country, the sleepy peace of bucolic Devonshire was barely disturbed by the arrival of several dozen librarians (plus a couple of ‘fellow travellers’) to dreamy Dartington.

Anna Dickinson from HEFCE’s REF team (which has only five people!) kicked off the first day with a very informative overview of the 2014 REF expectations, process, staff selection, timescales, the test submission system, the assessment of the research environment and how the panels work, with particular advice on areas where research support staff may be involved.

Judith Stewart of UWE and Gareth Cole of Exeter, in separate presentations, both described the work and findings of their current JISC MRD-funded research data management projects (UWE’s project, ‘Managing Research Data’ is at http://www1.uwe.ac.uk/library/usingthelibrary/servicesforresearchers/datamanagement/managingresearchdata.aspx; the Open Exeter project is at http://blogs.exeter.ac.uk/openexeterrdm/).

Each also positioned library staff members as key to improved research data management across the university, as part of partnership working with other relevant research support professionals.  Both presenters also reminded us that library staff members are well-placed to instigate research data management activity if this is not already an activity within an institution: whilst the research data management challenge may require new skills, librarians are already skilled in information management, bibliometrics, and other relevant areas of expertise, and are experienced in working across the institution, free from inter-faculty or inter-discipline politics.  These skills equip them well to work towards supporting researchers with better management of research data.

Miggie Pickton of Northampton pushed this relationship between library staff and research activity further, arguing there are strong benefits for library staff to wade into research activity for themselves.  Drawing a division between ‘academic’ and ‘practitioner’ research, Miggie encouraged library staff to consider either but particularly argued the case for the value of ‘practitioner’ research, which she defined as taking a pragmatic approach to a current problem or need, as opposed to curiosity-driven work intended to make REF impact.

Through a very interactive session, Miggie encouraged the audience to identify the benefits of library staff undertaking research for the individual librarian, the institution, and the library profession as a whole, and provided some examples of suitable topics for investigation.  Inspiring!

Jennifer Coombs (N’ham) and Elizabeth Martin (De Montfort) described their experiences of creating, alongside colleagues from Loughborough and Coventry, a collaborative online tutorial to teach researchers about research promotion (www.emrsg.org.uk).

Jez Cope of the Research360 project at Bath (http://blogs.bath.ac.uk/research360/) shared the benefits for researchers of several social media applications.  Despite the earlier assertions of doubt about Twitter by the event chair, Jez managed to get a few more delegates onto the service and interacting with other delegates as well as more remote followers of the event hashtag.

As always, it was apparent that institutions vary widely in their cultures, sizes and experience with RDM, but we learned a great deal about what librarians are already doing to support researchers, some new tools and techniques that might be useful for their work in this area, and some powerful arguments for expansion into the research data management and research practice areas.

Delegates to this event may find it interesting to explore the research data management training materials made by five projects of the first MRD programme, available at http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx (follow the link for each project at the bottom of the page).  These materials are freely available for use and reuse, and will be supplemented by a further four projects in the second MRD programme, starting this summer, some of which will be delivering training materials specifically for research support professionals including library staff.

Here’s hoping there will be a DARTS4!

 

Discuss, Debate, Disseminate – PhD and Early Career Researcher data management workshop, University of Exeter, 22 June 2012

Jill and Hannah of the Open Exeter project have not been holding back with their user requirements research – not content with attracting hundreds of responses to their survey of Exeter postgraduates, they’re also augmenting this with their own research as well as running events like Friday’s, in an admirably thorough approach to gathering information on what postgraduate students and early career researchers at their institution need, how they work and where the gaps are in the current infrastructure provision.

Twenty enthusiastic participants turned up on 22 June, happily from across the sciences and humanities, and contributed with gusto to group discussion, intensive one-to-one conversations and a panel session.  The project has recruited six PhD students – Stuart from Engineering; Philip from Law; Ruth from Film Studies; Lee from Sport Sciences and Duncan from Archaeology, plus one more currently studying abroad – to help bridge the gap between project staff and their PhD peers.  These six are working intensively with the project team to sort out common PhD-level data management issues and activities in the context of their own work, which allows them to not only improve their own practice but also to share their experiences and tips with other PhD students and ECRs in their own disciplines at Exeter.  (You can see more about this at http://blogs.exeter.ac.uk/openexeterrdm/)

One of the most interesting aspects of working on this programme, for me, is understanding the nuts and bolts of research data management in a specific disciplinary context, in a particular institution.  In other words, the same context in which each researcher is working.  Although funders are increasingly calling the shots with requirements and expectations for research data management, the individual researcher still has to find a way to put these requirements into practice with the infrastructure they have to hand.  That means it’s all very well for the EPSRC or AHRC or whoever to require you to do something, and you may even understand why and want to do it, but who do you ask in IT to help?  Why isn’t it OK to just put data on Dropbox?  What to do with data after you finish your PhD or project?  And what is metadata anyway?

Despite the generally-held view by researchers that their RDM requirements are unique to their discipline, these questions – and others like them – are actually fairly consistent across institutions when researchers are sharing concerns in an open and relaxed environment.  And this was one of the achievements of today’s event: by keeping things friendly, low-key and informal, the team got some very useful information about what PhDs and ECRs are currently doing with RDM, the challenges they’re encountering and what Exeter needs to provide to support well-planned and sustainable RDM.

Some additional detail from the event:

–       Jill offered a working definition of ‘data’ for the purposes of the workshop: “What we mean by data is all inclusive.  It could be code, recordings, images, artworks, artefacts, notebooks – whatever you feel is information that has gone into the creation of your research outputs.”  This definitely seemed to aid discussion and meant we didn’t spend time in semantic debate about the nature of the term.

–       Types of data used by participants:
o       Paper, i.e. printouts of experiment
o       Word documents
o       Excel spreadsheets
o       Interview transcripts
o       Audio files (recordings of interviews)
o       Mapping data
o       PDFs
o       Raw data in CSV form
o       Post-processed data in text files
o       Graphs
o       Tables for literature review
o       Search data for systematic review
o       Interviews and surveys: audio files, word transcripts
o       Photographs
o       Photocopies of documents from the archives
o       NVivo files
o       STATA files

–       Common RDM challenges included: the best way to back up, use of central university storage, number of passwords, complexity of working online (which can make free cloud services more attractive), lack of support with queries or uncertainty about who to contact, selection and disposal, and uncertainty over who owns the data.

–       Sources of help identified during the event: subject librarians; departmental IT officers; during the life of the project, Open Exeter staff; and existing online resources such as guides from the Digital Curation Centre (http://www.dcc.ac.uk) and the Incremental project (http://www.gla.ac.uk/datamanagement and http://www.lib.cam.ac.uk/preservation/incremental/).

Digital curation tools: what works for you?

I’m undertaking a piece of work with Monica Duke and Magdalena Getler of the DCC, and we need your help!  We’re looking at which DCC-developed digital curation tools are used by the MRD02 projects.  This is a happy case of our interests overlapping in a Venn diagram-type way: I’m interested in which digital curation tools, DCC or not, are used (or considered but rejected) by the projects.  Monica and Magda are interested in the use of DCC tools by the MRD02 projects as well as by other people.

There is a list of the DCC tools developed to date at http://www.dcc.ac.uk/resources/tools-and-applications, and there is a freshly-revised catalogue of digital curation tools developed by people other than the DCC at http://www.dcc.ac.uk/resources/external/tools-services (although please note this latter link is currently still in development – it should be finalised by the week commencing 30 April 2012).

We plan to look at the project plans, blogs and so on to see where digital curation tools are mentioned.  After this initial perusal, the plan is currently for DCC to send out a brief survey to projects where we don’t already have a full picture from their blogging (and this may also be a way of helping to get the new RDMTrain02 projects involved), asking for information on their use of DCC tools.

If you’re on one of the projects and keen to contribute, it would be immensely helpful to me if you could let me know which tools for digital curation (DCC-developed or not) you have considered using.  If you’re going ahead with use of them, please let me know what you think of them, and if you’ve decided against use of a particular tool, please let me know why.  I welcome this feedback by email to laura.molloy AT glasgow.ac.uk, or in the comments below.  Thanks!

The future of the past: closing workshop for the Data Management Planning projects

It always provokes mixed feelings to attend a closing event marking the end of a project or raft of projects.  On the one hand, it’s melancholy to say goodbye to people, or to know that there will be no more interesting outputs coming from a particular project.  On the other, there is (hopefully) the sense of achievement that comes with having finished a piece of work.  Having something finished, ready to show, then getting ready for the next activity, preparing for the future.  It was useful and thought-provoking to see the findings and outputs of the ‘strand B’ or data management planning projects of the MRD02 programme at the Meeting Challenges in Research Data Planning workshop in London on 23 March.  This event marked the closing of these projects, and gave them an opportunity to share what they’d been doing.  Data management planning by definition is about considering the future, and there was a sense of energy and enthusiasm from the projects on the day which suggested we could easily have met for longer and talked more.  And yet, some elements of the discussion made me think about the past.

Back in MRD01 (2009-11), there were a few projects such as Oxford’s Sudamih and Glasgow-Cambridge’s Incremental project which performed institution-specific scoping work about what researchers need to improve both their understanding and practice of RDM.  As one of the Incremental team, I felt at the time that, to be honest, a lot of it seemed to be stating the blooming obvious, but we recognised the value of gathering original data on these issues in order (1) to check that our suspicions were correct; and (2) to wave in front of those making decisions about whether and how to fund RDM infrastructure.

You can read the full report of Sudamih here and Incremental here, but the main ideas we found evidence for were things like: researchers are almost always more interested in doing their research than spending time on data management, so engagement relies on guidance being short and situated in one obvious, easy-to-navigate place; there are lots of guidance resources at institutions already but they’re scattered and not well advertised; lots of researchers in the arts and humanities don’t consider their material as ‘data’ and so the terminology of RDM doesn’t engage them or may actively alienate them; researchers may be party to multiple data expectations from their institution and / or their funder, but a lot of them are not aware of that fact, never mind what these are and where to find them in writing.  Also, different disciplines have different data sharing conventions and protocols, which affect researcher behaviour; some researchers can be quite willing to practise good data management, but they need to know who to call or email about it at their own place; guidance written by digital curation specialists is great and fine, but often needs translating into non-specialist language, and there are lots of researchers who are just not going to engage with a policy document.  All that kind of thing.  Readers of this blog will possibly be amazed that such fundamental ideas are not more widely understood out there in the wider research community, but that in itself probably just confirms the knowledge gap between RDM people and the general researcher population.

So back at the event on 23 March, we heard from, amongst others, Richard Plant of the DMSPpsych project explaining the importance of local guidance for the institution’s researchers, and Norman Gray of MaRDI-Gross explaining the influence of the data sharing culture in big science on its researchers (although I never did get around to asking him if the project did indeed reach ‘the broad sunlit uplands of magnificently-managed big-science data’, as promised in the project blog).

History DMP from Hull charmed with an appearance by one of their tame researchers, who came along to give a brief account of his experience with the project.  He was happy not being familiar with RDM terminology or principles or, as he put it,

‘This process has been very straightforward for me.  I don’t understand the technical elements but I don’t need to.’

The benefits of easier remote access to and confidence in the security of his data storage were the pay-off for him, and left everyone feeling optimistic.

Reward at UCL/Ubiquity Press did many interesting things whilst aiming to lower the barriers to good RDM and shared a deluge of findings echoing those of Incremental / Sudamih, including the value of drawing together institutional RDM-related resources to provide a single point of access; the effect of discipline-specific protocols on researcher behaviour (specifically data sharing); the value of clarifying benefits of good RDM to motivate researchers; the lack of current awareness about IPR, licensing and data protection; the reluctance to discard data; the need for training about RDM and particularly long term preservation of data; and many other points.

So what occurred to me on 23 March was that it felt good to hear several of the MRD02 strand B projects reiterating our findings from their own experiences at their own institutions.  It reminded me of Heather Piwowar’s notion of ‘broad shoulders’.  It wasn’t that they were agreeing with us – I’m more than happy for my research to be challenged constructively.  It was that what we’d done in MRD01 seemed to be useful to some extent, allowing the MRD02 projects to extend and refine user requirements in RDM, and share what they found, which benefits us all.

Chatham House at Weetwood Hall: emerging themes from the JISCMRD02 institutional RDM policy workshop

Earlier this week, my co-facilitator and I had four wide-ranging and thought-provoking discussions across two days with the JISCMRD02 projects who attended the programme workshop on institutional research data management policy development and implementation at Weetwood Hall in Leeds.  The discussions were conducted under the Chatham House rule, in the hope that projects and interested Fellow Travellers would feel able to share their challenges, successes, questions and institutional quirks openly, and I’d like to thank the participants for their time and energy in doing so!

It has been indicated to me that some preliminary notes of themes arising from our discussion would be useful, in advance of more detailed reporting.  I’d like to share some of the main themes that emerged from our group, with the provisos that:

  • these only represent one of the several discussion groups – main themes from the others may vary (and you can read Bill Worthington’s useful account from another group here); and
  • these are presented here for interest and discussion – please don’t interpret any of them as the official position of or advice from the MRD02 programme, the DCC or JISC – they’re simply ideas that bubbled up from our group conversations and were contributed by twelve individuals representing ten very diverse institutions, as well as the thoughts of our facilitator.

That said, we hope the lessons they’ve learned from their work so far in RDM policy development will be useful to others travelling the same path.

Themes and observations:

– At this point (March 2012), institutions are still all at different stages with their research data management policies.  However, insofar as they’re funded by the major funding councils, research councils and associated bodies, institutions are all subject to a common set of requirements, mandates and expectations from those funders, in addition to UK and EU legislation.  In other words, the responsibility to have these expectations and requirements clarified and complied with is already there.  It’s now up to institutions to decide their approach to an appropriate and realistic response.

– The idea of having an institutional research data management policy in place at your institution can be reassuring.  However, having a policy in place without any real buy-in from staff can be more harmful over time – by breeding complacency – than having no policy yet in place. So it’s best to take a little longer and get it right than rush through a policy in which researchers, research support staff or senior management have no investment or of which they have little awareness.

– A useful approach may be to craft an aspirational, high-level document which outlines principles as opposed to specific attributions of responsible persons, workflows, budgets and so on.  This high-level statement is often more easily understood by senior management and so can be the most effective way to get the policy through university senior committees and into institutional regulations.  This high-level policy should then be accompanied by, and executed by way of, working documents which translate the principles into specific tasks allocated to specific roles.  It should be anticipated that the high-level policy will not need frequent changes; it should allow enough room for, for example, new funder requirements, whereas the working documents should be regularly updated and seen as much more volatile documents.  This is, however, only one type of approach to institutional RDM policy development.  See also the JISCMRD02 Open Exeter project’s blog on the value of aspirational policy here.

– Policy and infrastructure need to evolve in step with each other.  Some policies have been well-written but have foundered at the point of senior approval because they have specified responsibilities and workflows which the institution didn’t yet have the infrastructure to deliver.  At the same time, a well-organised policy can help to make the case to senior management for the investment in the necessary infrastructure.  This is another argument in favour of the high-level principles-based approach to the main policy, which can then be used to justify moving towards a more detailed position over time, via the working documents, whilst avoiding the danger of being rejected because of the lack of infrastructure.  It’s also an argument in favour of carrying out some surveying of the current state of infrastructure at your institution – including the ‘soft’ infrastructure elements of training provision, current skills levels in relevant staff groups, staff awareness of the requirements under which they’re currently working, etc.

– Consider the other policies – both internal and external – with which your new research data management policy should work in concert.  It’s obviously better to identify and iron out any potential wrinkles between these before you start plugging the new policy to senior management.  Examples of internal documents may include institutional policies on digital preservation, IT equipment use, open data, response to Freedom of Information requests, data protection, research ethics, intellectual property and academic integrity.  External documents to consider may include the Data Protection Act, Freedom of Information legislation, INSPIRE regulations, environmental data legislation, expectations and requirements of your funding council, expectations and requirements of your research funders, the Research Integrity Office’s research code of conduct, the RCUK code of research practice and relevant legislation relating to use of government data, intellectual property and copyright.

– Retain awareness of the different roles and legislation for research data and administrative data.  Whilst anyone drafting a research data management policy would benefit from knowledge of how the institution handles administrative data, and there may be some crossover in relevant legislation (particularly UK and EU legislation for some aspects of both), it’s important to remember these two categories of data have different purposes, different stakeholders, and attract different expectations from funders, and so should be dealt with by discrete policies, clearly pitched to the relevant audience for each.

– Try to avoid taking the view that researchers will automatically resist implementation of a research data management policy.  Some may be suspicious of it, some will be enthusiastic – and the difference is often down to the approach used.  In institutions where the development and implementation of such a policy is presented as a way to help researchers (e.g. ‘We’ll look after it so you don’t have to’, promotion of the benefits to the researcher, etc.), as opposed to being a new rule or requirement imposed by the central administration, researchers have generally responded enthusiastically.

– Whilst recent research (e.g. the JISC/RIN/DCC DaMSSI project) found that researchers respond well to data management training when it is presented as just one of many aspects of excellence in research practice, there is a tension between embedding RDM training as just another part of routine business and highlighting it sufficiently to attract attendance at training and to ensure researchers pay attention to good RDM practice.  Motivation can be helped by underlining the benefits of good RDM practice to the researcher’s career and profile, their enhanced ability to find their own work in the future, increased impact and a more efficient way of working.

Do any of these points chime with your experience?  Or contradict it?  Let us know in the comments!