Much RDM activity has been stimulated by requirements and expectations emerging from the main UK research funders, as usefully described in the DCC’s funder policy table. But do funders understand what they are really asking researchers and institutions to do with their data, and how much sustainable research data management activities actually cost? The April 2013 DCC Research Data Management Forum was a free and timely opportunity for Jisc MRD projects, DCC Institutional Engagement partners and other interested people to directly quiz representatives of some of the main UK research funders. Graham Pryor of the DCC has written a blogpost over at the DCC website, which lays out the day’s discussion and provides funders’ responses to queries about RDM costs.
For me, some of the main take-home messages included:
- More cooperation and standardisation across research funder guidance to bidders, policy and guidance to peer review panel members would be sensible and useful for the sector as a whole. That is to say, harmonisation of language, approach and policy would benefit bidding researchers and their institutions but would also help funders work in a more effective and interoperable way, which has to be advantageous to them too.
- Collaborative measures by HEIs and researchers should be considered. Who else in your area would be a good partner? Not just for a research bid, but for shared services such as storage? Can you achieve an economy of scale by partnering with another institution in your geographical or research area?
- Use of existing tools and services should be considered as a priority: HEIs doing their own development should be a last resort. Anthony Beitz, amongst others, has argued this persuasively before. Initiatives like the DCC can help with suggestions and descriptions of tools.
- We need to move forward with pragmatic measures for what researchers need right now, whilst not losing sight of modelling longer-term sustainable strategies.
So far, so sensible. But a take-home worry for me was the importance placed again and again by funders on the key role of the peer review panel. We don’t know how AHRC or ESRC deal with this because neither of them was present, but the funders who were there rely on their peer review panels to make decisions about ‘the science’ (for which I mentally substitute ‘the research’) and also, in the case of most of the funders present, the data management plan or statement.
Given that we who work solely on research data management, digital curation and digital preservation as our fields of interest are still in the process of working this stuff out, how do we know whether the peer review panel members have sufficient and appropriate knowledge of these fields to responsibly discharge their duty when judging the RDM plans of other researchers? One funder explained that they expect a DMP to be in place at the point of bidding but that these are not peer reviewed “because peer reviewers are unlikely to have the knowledge required.” What of the other funders? It strikes me that knowing the limits of panel expertise – the ‘known unknowns’ – is by far the most responsible approach to any type of peer review process.
In addition to quality or level of knowledge, I’m also interested in consistency of standards applied. One funder openly admitted on the day that he is aware that there is a troubling amount of variability in the approach to both the creation and assessment of data management plans in bids. Other comments indicated that some bids have their data management plans or statements specifically reviewed and some don’t.
Peer review panels are largely composed of senior researchers (by which I mean time in the field as opposed to age). The Jisc MRD programme, like many other initiatives, often focuses training and awareness-raising efforts on early career researchers and postgraduate students, with the idea being that they will take their good practice up with them through their academic careers. But what do we do until then? Even if we can rely on current ECRs and PGs to be well and consistently trained, we’re still in a situation where all bidding for the next twenty-odd years is being reviewed by senior researchers who have not been specifically targeted by RDM training and awareness-raising efforts.
A solution? As fellow evidence gatherer Jonathan Tedds suggested in the discussion, we can learn from areas such as bidding for telescope time in astronomy, where peer review necessarily includes someone who is specifically there to provide their technical knowledge. For consistency, research funders seem to need the presence of or input from an appropriate external body. So this seems to be an area where the DCC and research funders can work together, for example, to produce consistent and approachable, up-to-date guidance for peer review panel members, and to ensure someone who specialises in digital curation as applied to research data management is included in their peer review panels.
Just my suggestions. Your comments are, as always, welcome below.
e: laura.molloy AT glasgow.ac.uk
The Jisc MRD programme has been working with the Digital Curation Centre throughout the life of the programme and this relationship is fruitful in several directions. One of the most gratifying, however, is seeing the outputs of the programme being used and re-used in DCC training events.
Sarah Jones and Marieke Guy from the DCC were in Northampton this week, working alongside Miggie Pickton to offer training to librarians in research data management. They used the excellent UK Data Archive ‘Managing and Sharing Data’ guide as well as outputs from Jisc MRD projects including ADMIRe, MANTRA, Research360, RDMRose, RoaDMaP and TraD.
Please see the training event slides here: http://www.dcc.ac.uk/webfm_send/1243
and the supporting workbook here: http://www.dcc.ac.uk/webfm_send/1245
These are of course Creative Commons-licensed to allow use and re-use as specified.
Sarah has also blogged on the DCC website about the day, with some tips for what works well, at http://www.dcc.ac.uk/news/rdm-training-librarians.
Have you re-used Jisc MRD project materials (even your own) or any other training materials in your own events? What worked and what didn’t? Let us know in the comments!
e: laura.molloy AT glasgow.ac.uk
For a full and useful summary of the RDM Train session (also known as session 3B) at the Jisc MRD ‘Achievements..’ workshop on 25-26 March, please see this blogpost from Kellie Snow over at the DaMSSI-ABC blog:
Here at the JiscMRD Achievements, Challenges and Recommendations workshop, Joy Davidson (HATII and the DCC) chaired session 1B on research data management support and guidance. Jez Cope (Research360 at Bath), Rachel Proudfoot (RoaDMaP at Leeds), Hannah Lloyd-Jones of Open Exeter and Anne Spalding (stepping into Leigh Garrett’s shoes for the KAPTUR project at UCA) all shared their experiences of developing tailored advice and guidance for their host institutions and / or target disciplines.
Jez described very clearly how the Research360 project went about the formulation and production of their resource, finding very similar challenges and solutions to those noted by e.g. the Incremental project in MRD01, including the usefulness of some fundamental but often overlooked details such as placing the resource as high in the university website architecture as possible (theirs is at http://www.bath.ac.uk/research/data) which helps to ensure the resource is not seen as partisan to one discipline or service over others; and listing in website A-Z directories under something meaningful and findable to users (in their case ‘R’ for ‘research’ and ‘D’ for ‘data’ as opposed to their project acronym).
Usability also extends into the layout on the homepage, where content can be accessed via a menu of RDM topics (for those with a bit of RDM knowledge) or by project phase for those with less RDM knowledge.
Jez noted that much of his role has been to work as a translator between technical and non-technical people. Rachel Proudfoot is also bringing together different staff groups: RoaDMaP work draws on a working group containing key contacts from varied services and areas of the university including the university training service, IT services, the library and faculties. Rachel’s experience is that this approach not only provides an essential mix of expertise to inform your outputs, but also gives you access to new channels for administration and promotion of training events and awareness-raising efforts. Rachel was pragmatic about re-purposing existing training resources already created at Leeds, e.g. made for one discipline and re-used for another. Whilst Jez was clear that getting material from other people at the institution always takes longer than even the most generous estimate, in Rachel’s experience reusing one’s own materials can be tricky too.
The Open Exeter project has been remarkable for their use of a group of PGR students from varied disciplines as active participants in project work where, for a fee (and an iPad!) they have functioned as the face of the project at university events and across their peer group. The group members have also supplied responses and feedback to various project outputs and so helped to make sure guidance and events are relevant and meaningful to this group of researchers, and produced a ‘survival guide’ for distribution at induction which helps to make the case for RDM to newly-arrived PGRs. In this way, they have made the work of the project a lot more visible through peer-to-peer and student-to-supervisor (!) education about RDM at Exeter. They also contributed better understanding of the needs of active researchers in a way that was more practical in terms of time and cost than trying to work with more senior researchers. The students in turn have new knowledge of and skills in RDM, have received specialised help from the university and external experts and have a new element to add to their academic CV. This fruitful relationship has contributed much to Open Exeter’s online guidance resources: due to the varied disciplines represented by the PGRs, their case studies and other contributions are truly central to the webpage at http://as.exeter.ac.uk/library/resources/openaccess/openexeter/.
Another fruitful relationship was described by Anne Spalding in the last presentation in the session, a description of the KAPTUR project. KAPTUR has a fairly unusual challenge of involving four creative arts-focused academic institutions on a common quest to understand and manage research data in the visual arts. Anne noted that this is a discipline-area with particular challenges around the definition of what constitutes research data – an ongoing area of work for the project. She also noted that project work, as with other projects such as Open Exeter’s DAF survey, was built upon the findings of surveys of researchers to understand current data-related practice. As with the other projects of this group, a range of areas of the institution were involved; in this case libraries, training services and others were asked to feed into policy formation and UCA had their data policy passed by senior management in February 2013. Anne was clear that this policy will operate as a framework for further RDM infrastructure development work.
When discussing areas for future work, Joy and Rachel both agreed on the need for us to now consider how we extend capacity for RDM training in the institution. There are relatively few with the skills and the confidence to train others in RDM: we need to train more trainers and extend the network of expertise at the institution, particularly in cases where the Jisc MRD project is not assured of continuation funding from their host HEI. A useful idea at Leeds was inviting the DCC to attend – not to provide a training session but to critique the session presented by the project: this is an effective way to instil confidence and skill in RDM training at the institution, and can be extended by thoughtful deployment of the openly-available training and guidance resources already produced by the MRD programme.
Here are some of my thoughts from this session:
- The more you can find out about your audience beforehand, the better tailored (= more meaningful = more effective) your training can be, so get those pre-event questionnaires out and completed!
- Re-use of existing resources is possible and can be successful but may still need some effort and time to do well. So whilst it’s worthwhile using the expertise of others, and always looks good to demonstrate awareness of the relevant resources that already exist, don’t do it simply as a short cut or a time-saver.
- Training cohorts of new researchers is all well and good, but we now need to start planning to train more senior academics. They are the ones that allow RAs, postdocs and students to go off to training (or not); they are providing training recommendations to the students they supervise; they are the ones sitting on funder selection boards and ethics panels. They need to be up to date on RDM, at least in their own discipline areas, and to be aware of what they don’t know.
e: laura.molloy AT glasgow.ac.uk
Jisc Managing Research Data Programme Workshop: Achievements, Challenges and Recommendations, 25-26 March 2013, Aston Business School
The Jisc MRD Achievements, Challenges and Recommendations workshop is about recognising the achievements – both in scale and quality – of the projects of the second Jisc Managing Research Data programme (2011-13). The programme’s large infrastructure projects will complete during spring – summer 2013 and so at this point we are starting to see real delivery from many of them. At the same time, there is still space for sharing good practice and recommending approaches for meeting challenges as well as for areas which need additional work.
The event programme is available at http://bit.ly/MRD-Aston2013-Programme, and we’ll be posting summaries of sessions and highlights here on the EG blog.
The Goportis Conference 2013 on ‘Non-Textual Information: Strategy and Innovation Beyond Text’ took place on 18-19 March 2013 in Hannover. A programme with abstracts and speaker biographies is available at http://www.nontextualinformation2013.de/index.php/programme. This gives an idea of the number and variety of speakers: some of my highlights are outlined here, and you can get more narrative on Twitter by searching for the event’s tag, #goportis13.
The event began with a keynote by Martin Hofmann-Apitius of the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), who passionately argued for better access to scientific data for the good of science, particularly for public health. This need is demonstrated by the rapidly-increasing occurrence of Alzheimer’s disease in the west – to tackle such huge challenges, we urgently need to be able to undertake text mining and data mining to produce useful, computable chemical information for science to advance. He particularly identified the current publishing model in research as ‘problematic’, describing the existing business model of many publishers as something that ‘interferes with the advancement of science’. Martin is convinced that the days of the static publication are numbered, and in the future scientific communication will be done by knowledge models and other non-static means.
Jan Brase of DataCite then gave an overview of DataCite’s work in promoting citability of data. Jan believes that libraries should open their catalogues to any kind of information. The catalogue has classically been a window onto the holdings, but now the library doesn’t have to hold all the records it presents. In the future, Brase predicted, libraries will function more as a portal in a net of trusted providers, drawing on their long heritage in bringing scientific information to the public, their track record of persisting longer than projects and other departments of the institution, and their reputation as very trustworthy organisations. Yet more love for the libraries! Now all we need to do is fund them to take on these new responsibilities and acquire the concomitant skills…
Todd Carpenter of the American National Information Standards Organization (NISO) stirred up some debate by suggesting that perhaps we need to be more discriminating in selecting different metadata structures for different things. He referred to the use of DOIs ‘for just about everything: books, articles, data, content negotiation and licensing. We don’t apply an ISBN to people. We don’t use taxpayer numbers for addresses. Are we pushing the DOI beyond its limits? Should we call what Datacite is doing with DOIs something different than, for example, what the CrossRef community is doing?’
Todd also outlined a project undertaken by NISO and the National Federation of Abstracting and Information Services (NFAIS) which looks at the current publication of supplementary materials and how publishers are dealing with these. What is critical / supplemental / ancillary to understanding? Todd made a straightforward but fundamental point that the form of content does not designate whether something is integral to understanding, e.g. just because one part of the research publication is a paper and one is a video, that doesn’t necessarily mean that the paper should be regarded as the main publication and the video as supplementary material – it could quite easily be the other way around. This chimed with one of the trends of today’s event, namely that publishing needs to change from the static paper to more flexible, interactive and repurposable models.
Jill Cousins, Europeana Foundation / The European Library moved focus slightly more onto humanities and arts research with her update on work at Europeana, the access provided to 27m resources (which are not held by Europeana itself; rather, they provide the metadata to enhance findability) and the challenges of getting the metadata for 27m objects across Europe to be available under CC0 licensing! Jill was also keen to discuss new initiative Europeana Research which will soon be available at http://pro.europeana.eu.
Creative Commons licensing is important to the work of the Jisc MRD projects, particularly those making training resources for use and re-use. It was useful to hear Puneet Kishor of Creative Commons reporting on the new licence suite, 4.0, and the differences between this and the previous suite of licences, including licensing of European sui generis database rights (SGDR). The new 4.0 licences are to be launched in the second quarter of 2013 – right now, though, you can contribute your thoughts at wiki.creativecommons.org/4.0 – and Puneet particularly wants to hear from scientists.
Brian McMahon of the International Union of Crystallography is, I think, quite chuffed that crystallographers are generally considered to be really, really good at research data management – not least by Richard Kidd of the Royal Society of Chemistry – but feels it is still important for those in the field to keep their skills current and to keep contributing to the advancement of science. This was another talk presenting ways to extend the functionality and interactivity of the scientific publication, as Brian outlined the publishing work by IUCR and ways of modelling crystal structures as non-textual information in publications.
I had the last talk of the event, presenting the work of the Jisc Managing Research Data programme. It’s a real challenge trying to communicate the mass and the variety of the activities that the JiscMRD projects are tackling, and to delineate the difference between the programme-level work and that of the individual projects, but I did my best. I described the landscape and drivers which stimulate programme activity, the structure of the programme, some lessons learned from phase 1 which have been applied to phase 2, and the fearlessness of projects in tackling tricky aims such as improving institution-specific awareness, devising and delivering discipline-specific training, analysing and enhancing current RDM infrastructure provision, implementing or extending data repository provision, attempting to cost data loss, and generally sorting out the world. I also described various key resources provided by MRD projects and the Digital Curation Centre. I then had the pleasure of Goportis’s Klaus Tochtermann describing UK RDM activity as ‘the most advanced in Europe’ – so I think we’re doing something right!
You’ll see from the Twitter feed (#goportis13) that there were many more talks which discussed particular applications of non-textual information in a range of disciplines – far more information than can sit comfortably in a blog post, so please have a look. The slide decks will be available from the conference website in the near future – I’ll tweet when I’m aware of this having happened.
Do you agree we need new publishing paradigms? How could your discipline benefit from non-textual research communication? Want to know more about any of the projects mentioned above? Let us know in the comments!
E: laura.molloy AT glasgow.ac.uk
A very happy new year to all on the MRD programme and all ‘fellow travellers’! 2013 has started with a shot of energy provided by IDCC 2013, which took place in the deliciously-named Mövenpick hotel in Amsterdam last week (14 – 17 Jan).
A lot of the Twitter stream (#IDCC13) agreed that there was a huge amount of information and opinion to download. This frenetic pace was encouraged by the practice papers taking place in slots that allowed only ten minutes to talk! A great opportunity to really work on honing those high-level messages, then.
It was very encouraging to see representatives of so many Jisc MRD projects there, and I hope those who were in the ‘National perspectives in research data management’ track found the talks Simon Hodson and I did on the programme as a whole and on the evidence-gathering activity to be useful. One slight disappointment was having the “National perspectives” track running at the same time as the “Institutional research data management” track: the MRD programme connects institutional approaches and happens to work across the UK, so whilst we weren’t entirely out of place in the “National” track, we probably missed out on some relevant audience. No matter: if you missed either talk and are interested in seeing the slides, the presentation about the MRD programme as a whole is here; and the talk on the evidence gathering activity is here. Your feedback or questions are of course welcome.
One of the things the MRD programme has been – and I hope continues to be – very good at is making stuff available to other people. In his IDCC preview blog post, Kevin Ashley said,
“Overall, I would like everyone to come away aware of the potential for reuse of the work that others are doing and the potential for collaboration. Whether it is software tools, training materials, methodologies or analyses, many of the talks describe things that others can use to deal with data curation issues in their own research group, institution or national setting.”
This is what we as a programme, along with other organisations and activities, do. Various pieces of work across the MRD programme with the DCC Cardio tool have inspired other projects and areas of the programme; the same applies to those who have tailored the DCC DMPonline tool, and we encourage all such innovations to be made available to provide examples and ideas for others. In addition, however, the MRD programme has a strand (both in MRD01 and in the current iteration of the programme) specifically involved in creating training materials for research data management, aimed at particular audiences. These are really valuable resources and have been created to be used and re-used in an open and flexible way.
I was asked so many times throughout the event where these materials can be found, that I thought it was worth listing them here. The links given lead directly to teaching resources; background information on the projects can be found here: http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx
- CAiRO: For postgraduate and early career researchers. Performance and live arts. http://www.projectcairo.org/module/unit1-0.html
- DataTrain: For postgraduate students. Archaeology, http://archaeologydataservice.ac.uk/learning/DataTrain. Social anthropology, http://www.lib.cam.ac.uk/dataman/datatrain/datatrainintro.html
- DATUM for Health: For postgraduate research students. Health studies. http://www.northumbria.ac.uk/sd/academic/ceis/re/isrc/themes/rmarea/datum/health/materials/?view=Standard
- Research Data MANTRA: For postgraduate and early career researchers. Geosciences, social science and clinical psychology. http://datalib.edina.ac.uk/mantra/
(Unfortunately the website for the DMTpsych project at University of York is no longer online. As the project has not deposited its resources into Jorum either, I can’t supply a link.)
There are more training resources in production at the moment: you can read more about them here: http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata/research-data-management-training.aspx
We as a programme can’t solve the issue of duplication of effort in digital curation by ourselves, but by maximising the use of these materials, and finding new applications for them, we are definitely doing our bit.
Have you used any of these resources? Want to know more? Let us know in the comments!
The ‘Triage and Handover’ session (session 3B) of the JISC Managing Research Data programme progress and DCC institutional engagements workshop (24 – 25 October 2012) differed in structure from the other sessions: less about project experiences and more about sharing expertise from people working specifically in this area and generating discussion for projects attending in response.
For the sessions, we note-takers were tasked with establishing: a) what is working? b) challenges and lessons learned, and c) what the MRD programme or the DCC can do to help. Whilst the structure of this session didn’t lend itself so well to this task as some other sessions, I hope this summary will supply the salient points.
Angus Whyte (DCC) began this session by acknowledging the difficulties of the area. Because there is no way of knowing which digital objects will be useful in the future, there is no one foolproof way to decide which data should be retained for handover at project end to institutional data management services, and which can be disposed of.
‘Triage’ here is used in the business sense rather than the medical sense: it is meant to imply the existence of a process of decision-making which can determine resource allocation. ‘Selection’ suggests an either/or decision, which is useful to consider, but Angus makes the point that for institutions the greater need is to define a range of decisions. One of these will be disposal. Others might range from showcasing high-value data online to keeping low-value data on tape back-up.
As a co-author of the DCC ‘How to’ guide on appraisal and selection of data for curation, Angus has spent some time considering various models that are used by data centres and archives to guide their decision-making. He described the basic records management approach to this:
1. Define a policy, i.e. criteria and range of decisions
2. Archive management applies criteria: select the significant, dispose of the rest
However, he argues, there are a few complications for this model when it comes to dealing with research data, i.e.:
- Research processes may be more complex (need more explanation) than administrative processes
- Data purpose may change
- Needs more effort to make re-usable
- Complex relationships and rich contexts
- Originators should be engaged but may not have capacity to be
- Others may need to be involved too
- More than keep / dispose choice – need to prioritise attention and effort to make data fit for re-use.
So, for research data:
- First, characterise. What is this data? What are the relationships within it and what are the significant aspects of the context in which it was created?
- Appraisal criteria should establish: who has the duty of care? How accessible is the data? What is its re-use value, and what costs are involved?
- Categorise the responses to these criteria or questions i.e. combinations of high or low ratings. These are your triage levels; levels of effort and cost attached to making data accessible and discoverable, balanced against the likely range of reuse cases and benefits
- An important factor will be whether there are other natural homes for the data and, if so, whether there are benefits from retaining a copy with the institution.
- A tiered approach to data value could in theory map to a tiered approach to resource costs, e.g. for discoverability, access management, storage performance, preservation actions.
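To make the idea of combining high or low ratings into triage levels a little more concrete, here is a minimal sketch in Python. It is only an illustration of the approach Angus described, not a DCC tool or an agreed scheme: the field names, the rules and the level names are all assumptions of mine, loosely based on the examples mentioned above (disposal, showcasing online, tape back-up).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appraisal:
    """High/low ratings against the appraisal questions (hypothetical field names)."""
    duty_of_care: bool      # True if the institution holds the duty of care
    reuse_value_high: bool  # True if likely re-use value is rated high
    cost_high: bool         # True if making the data re-usable is rated costly
    has_other_home: bool    # True if a data centre or similar already holds it


def triage_level(a: Appraisal) -> str:
    """Map one combination of ratings to an illustrative triage level."""
    # No duty of care and little re-use value: candidate for disposal.
    if not a.duty_of_care and not a.reuse_value_high:
        return "dispose"
    # Valuable, but held elsewhere and not our responsibility: point to that copy.
    if a.has_other_home and not a.duty_of_care:
        return "link to external copy"
    # High value and cheap to make accessible: give it the most visibility.
    if a.reuse_value_high and not a.cost_high:
        return "showcase online"
    # High value but expensive: keep, and prioritise effort when resources allow.
    if a.reuse_value_high and a.cost_high:
        return "curate when resourced"
    # Low value but the institution must retain it: cheapest storage tier.
    return "tape back-up"


if __name__ == "__main__":
    example = Appraisal(duty_of_care=True, reuse_value_high=True,
                        cost_high=False, has_other_home=False)
    print(triage_level(example))
```

The point of writing it down like this is that the rules become explicit and can be argued over: an institution would substitute its own criteria, ratings and tiers.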
Clearly, some effort is required here. This may make senior management, as well as the researchers themselves, say, ‘why not just keep it all?’ Well, in the arguments for selection, costs are a significant issue. There has been an exponential growth in digital storage required in the last few years: this includes lots of types of digital content including research data, but of course other types of digital material can also be useful in the research process.
David Rosenthal estimated in his frequently-mentioned blogpost of 14 May 2012 how much it would cost to ‘keep everything forever in the cloud’. He speculated that, based on current cost trajectories, keeping 2018’s data in S3 (Amazon’s cloud storage service) will ‘consume more than the entire GWP [Gross World Product] for the year’. Whilst the DC/DP/RDM community may argue around the specifics of Rosenthal’s position here, his argument does help to demonstrate that whilst storage costs – never mind those for curation – have long been invisible to researchers, they are real, and clarity here can help us to price curation (including storage) realistically and responsibly.
Selection presumes description. You can’t value what you don’t know about. Angus argued researchers can’t afford not to spend effort on minimal metadata description and organisation, because costs of retention will be much higher if they don’t. Description makes data affordable – is citation potential a concrete enough reward?
To summarise, we must identify what datasets are created and where they are, and differentiate priorities.
Marie-Therese Gramstadt then outlined the activity of the JISC MRD KAPTUR project relating to selection and retention. KAPTUR is aware of previous JISC MRD work in training. One of the main questions addressed by KAPTUR is how to select and appraise research data. In their approach, they have referred to the DCC paper on this topic, and held an event earlier this year to further explore the issues. The event discussed the following aspects of research data in the creative arts and how to select it for management:
- Value and context, including scientific and historical value;
- Value creation;
- Ethics and legal issues;
- Enabling use and reuse;
- Enabling long-term access.
(More information on this KAPTUR event is available here http://kapturmrd01.eventbrite.co.uk/, which includes the presentations.)
Veerle Van den Eynden of the UKDA then presented a data centre view of the issue, as opposed to an institution-level view. She described the current process that applies to deposit in the ESRC-funded UK Data Service, including the data review form, the work of the acquisitions committee which evaluates applications for deposit, and the acceptance criteria they apply.
The acquisitions committee will give one of three decisions about a dataset offered:
- accept data into main ESDS collection for curation and longer-term preservation (processing determined: either A, B or C);
- accept data into self-archive system, the ESRC data store, for short-term management and access; or,
- unable to accept data.
This is a useful reminder that selection for management (including preservation) need not be a binary matter of yes / no but can consist of a range of possible management solutions.
Acceptance criteria include:
- Within scope
- Long-term value and re-use potential
- Data requested (by ESDS advisory committee, users)
- Data from ESRC-funded research
- Viable for preservation (acceptable file format, well documented)
Common reasons for non-acceptance:
- Value of data in publications
- Legal obstacles (copyright, IPR)
- Ethical constraints (consent, anonymisation)
- Depositor wishes unnecessarily stringent access conditions
Usually about 5-10% of data offered currently falls into these categories of non-acceptance.
There are currently some draft categories for the data collections accepted by UKDS.
- Data collections selected for long term curation
- Data collections selected for ‘short term’ management
- Data collections selected for ‘delivery’ only
- Data collections selected for ‘discovery’ only.
The Data Service has a Collections Development Policy currently in draft. This addresses factors such as:
- Scientific or historical value
- Replication data and resources (materials required for replicating research)
Even if other projects and services don’t have the same levels of experience and capacity as the UK Data Service, these aspects of Data Service policy and structure provide an example of a functional approach to ‘triage’ and selection of research data.
Veerle also mentioned the repository engagement project, which supports institutional data management / repository managers in their local role as ESRC data curators. Through this, they aim to provide IR staff with guidance and training in appraising social science research data and in other good practice. This is helpful in the current environment, where funders increasingly expect institutions to take more responsibility for archiving data. You can see Veerle’s presentation here.
Marie-Therese then briefly showed material from Sam Peplar of NERC who was unable to attend at short notice. This described the development of the NERC data value checklist which aims to make selection better, more consistent and more objective. It emerged from consultancy in the research sector and has been modified in response to user feedback.
NERC funding requires an outline DMP at proposal stage with a detailed DMP when funding is agreed. The data value checklist is intended to be useful when preparing this full DMP but, Sam’s material cautioned, the checklist should not be expected to give an authoritative or definitive answer on whether the data should be retained. Rather, it supplies questions on which to reflect around aspects of the data such as storage, access, formats, origin, conditions, etc. Sam is clear that there are no neat solutions for selecting data; objective rules are not possible. He is also clear that scientists are not generally prepared to do the selection alone – this is an area of RDM which requires support.
The group feedback included various pertinent questions, and concluded that whilst there is no one methodology for discerning the future value of data, it is currently important for institutions to understand where they fit into the current landscape in terms of their responsibility to assist researchers in responsible selection and deposit of data. Veerle confirmed that funders expect data to go to the IR where available, and a data centre if not. In either case, it is massively helpful if acceptance criteria are public: this can help researchers and research support staff to discern the most appropriate data for selection.
What are your main challenges in selecting and disposing of research data? What could the JISC MRD programme or the DCC do to help? Tell us in the comments.
The ‘Components of Institutional Research Data Services’ event on 24 October 2012 brought together the ongoing JISC MRD infrastructure projects as well as the institutions with which the Digital Curation Centre is running an ‘institutional engagement’.
The ‘Institutional policies, strategies, roadmaps’ session (session 1A) reflected this nicely, with two speakers from MRD projects ‘Admire’ and ‘Research360’, and two from DCC IEs, St Andrews and Edinburgh.
What is working?
Tom Parsons from Nottingham’s Admire project described further connections across this set of institutions, acknowledging the 2011 aspirational Edinburgh data policy (more on this later) as the inspiration for theirs at Nottingham, and underlining the importance of being aware of the requirements not only of major funders at your institution but also the institutional policies which exist: these need to be found, understood, and worked with to give a coherent message to researchers and support staff about RDM. This can be done, as he noted, by reflecting these existing messages in your data policy but also by strengthening the data management aspects of these existing policies, and so making the most of any credibility they already have with university staff.
At Bath, RCUK funders are also important influences on progress. Cathy Pink from Research360 has established that the biggest funder of research work at her institution is the EPSRC, and so Research360’s roadmap, which responds specifically to the EPSRC’s expectations and was published earlier this year, is important at her university. Bath has looked to the Monash University work to guide its direction in policy formation, particularly to inform strategic planning for RDM and to make a clear connection between work at the university to advance RDM and the university’s existing strategic aims: an intelligent way to garner senior management buy-in.
Cathy noted that the DAF and Cardio tools from DCC were both useful in ascertaining the existing situation at Bath: these measures are important to take both in order to identify priorities for action, and also in order to be able to demonstrate the improvements (dare I say impact?) brought about by your work in policy formulation and / or training and guidance provision.
To be taken seriously at the institution and to promote awareness and buy-in, Cathy urged institutions to incorporate feedback from a wide range of relevant parties at the university: research support office, the library, IT support and the training support office where available. This promotes a coherent approach from all these stakeholders as well as a mutually well-informed position on what each of these areas can contribute to successful RDM.
Birgit Plietzch from St Andrews also found DAF and Cardio relevant for ascertaining the current data management situation at her institution but felt the processes could be usefully merged. Birgit’s team again started by finding out who was funding research at the university (400+ funders!) and then increasing their understanding of these funders’ RDM requirements to create a solid base for policy work. Again, the Monash University work in this area was useful at her institution, and when the EPSRC roadmap work was completed, as with Bath, it helped to demonstrate the relevance of RDM to diverse areas of institutional activity.
Edinburgh’s Stewart Lewis, too, described the value of creating relationships not only with senior management champions for RDM but also between the university mission statement or strategic aims, and RDM policy. Stewart acknowledged that the aspirational policy published by Edinburgh in 2011 is a useful way to both instigate and lead on improved RDM at the university, but that action is also crucial. The aspirational mode of policy gives a stable, high-level statement which is then enacted through supporting, and more volatile, documents. So whilst action is devolved from the top-level document, it is still intrinsically important if culture change is to happen. To this end, they have created various levels of implementation groupings to carry through specific actions. Infrastructure specified by their policy work includes a minimum storage amount and training provision.
In accordance with the Grindley Theory of Four Things (see the – fittingly – 4th bullet point of http://mrdevidence.jiscinvolve.org/wp/2012/11/05/research-data-management-programme-training-strand-kick-off-workshop-london-26-october/), Edinburgh is concentrating on four high level areas: planning, infrastructure, stewardship and, lastly, support across these three. These areas were chosen in order to meaningfully move forward the RDM work at Edinburgh whilst still making sense to the researcher population.
Challenges and lessons learned
Tom shared some findings gathered by Admire from their survey of the institution’s researcher population, which show that around 230 projects are currently funded and so storage requirements are substantial. Most of these projects are funded by RCUK funders, and so the expectations for a well-organised approach to RDM are also pretty substantial. When c. 92% of researchers surveyed at the institution report having had no RDM training, we can understand the need for (and scale of) Admire’s work!
Cathy echoed Tom’s point: don’t attempt to simply lift one institution’s work and hope to apply it to yours. The tailoring required is significant if a set of policies is going to work in your own context.
The first attempt at the RDM policy for Bath was rejected by the senior management group. Inspirationally, Cathy recognised this as a great opportunity to refine their work and improve the policy using the feedback received. It also helped clarify their ambitions for the policy and resolved the team to do better than ‘just good enough’: being tempered, of course, by the support infrastructure that could be realistically delivered by the institution – a similar situation to that at Nottingham.
Cathy emphasised the point that good quality consultation across the institution is time-consuming but well worthwhile if you aim to build genuinely useful and effective policy or other resources.
Birgit also faced challenges in getting a wider acceptance of some promising RDM policy work. The institutional environment, including a recent reshuffle of IT provision, had hindered the smooth progress of their IE, and senior management, once again, needed compelling evidence to understand the benefits of improved RDM for the institution.
Birgit also found that academics were overextended and found it difficult to make the time to participate in the research that her team needed to undertake to develop policy in this area, but when they realised the relevance they were keen to be involved in the process and to access RDM training. The notion of the aspirational (as opposed to the highly-specified) mode of RDM policy is popular with researchers at her institution.
Next steps for Stewart and the team at Edinburgh include attaching costs, both in terms of person-time and financial, to the actions specified under their EPSRC roadmap, which will be published soon. The team will also soon run focus groups using the DCC’s DMP Online tool, run a pilot of Datashare, establish what is needed by researchers in addition to storage, and run training for liaison librarians; these activities, however, need resources: the next challenge to meet.
Discussion picked up the balance between universities offering trustworthy storage appropriate for research data and the motivation of researchers to bid for these resources elsewhere: researchers bidding for this type of funding not only helps the university to concentrate resources in other useful areas but also helps to give a clear message to funders that if they want improved RDM, they have to be prepared to contribute financially towards it.
Costing was a popular topic: Graham Pryor (DCC) was interested that no speaker said they’ve attached costs. Sometimes explicitly identifying costs means this work becomes unacceptable to senior management on financial grounds. Paul Stainthorpe at Lincoln agreed that you can spend lots of time on policy, but it won’t be accepted unless there’s a business case. Other institutions agreed, but added that senior management want some illustrative narrative in addition to the hard figures, to tell them why this really matters.
Birgit added that there is also the problem of unfunded research, particularly in the arts. Her team has been receiving an increasing number of enquiries relating to this area, and it’s an area also being considered by Newcastle’s Iridium project, who have looked at research information management systems and discovered they only track funded work, leaving unfunded research as ‘a grey area’, even though it may be generating high impact publications. At UAL, a partner in the KAPTUR project, many researchers do a lot of work outside the institution that is not funded by it, and so for the purposes of the project, they’re being explicit about managing funded work.
UAL has recently launched their RDM policy as a result of their KAPTUR work and stakeholders are happy with it in principle, but the challenge now is how to implement it: John Murtagh noted that engagement and understanding mean work must continue beyond the policy launch. I mentioned the importance of training here as an element which has to be developed at the institution alongside policy and technical infrastructure. This was agreed by Wendy White of Southampton: policy needs to be an ongoing dialogue and the challenge is to integrate these elements.
What could the MRD programme or the DCC do to help?
- DCC: advise on whether funders are going to move the goalposts, and how realistic the risks are of this happening;
- DCC: advise on what public funding can be used to support RDM policy work;
- help with costing work;
- DCC: mediation between universities and the research councils, clarifying requirements and sharing universities’ experiences, etc.
- DCC: providing briefings on current issues, e.g. PVC valued briefings re. open access.
This one-day event provided an overview of the JISC MRD programme training strand, its aims and context; a description of the DaMSSI-ABC support initiative for the training strand and the various pieces of work it hopes to complete, particularly in terms of making outputs easier to find and use; and recognition of the fact that the activities of the four small training materials projects of the JISC Digital Preservation programme correspond with those of the RDMTrain02 projects.
The four RDMTrain02 projects each talked about their approach, activities, challenges and progress, giving us an idea of the subject areas or staff groups they are specifically addressing with the RDM training materials they develop:
- RDMrose, Sheffield (Andrew Cox): ‘information professionals’ (which I understand to be, in this context, academic librarians)
- Research Data Management Training for the whole project lifecycle in Physics & Astronomy research (RDMTPA), Hertfordshire (Joanna Goodger): PG students and ECRs in the physical sciences
- Sound Data Management Training (SoDaMaT), Queen Mary University of London (Steve Welburn): postgraduate research students, researchers and academics working in the area of digital music and audio research
- TraD: Training for Data Management at UEL (Gurdish Sandhu and Stephen Grace): PGR students in psychology and in computer science.
The afternoon session consisted of an introduction to a set of description and evaluation criteria which have been developed by the Research Information Network through its Research Information and Digital Literacies coalition. These criteria are in an advanced draft form and participants were asked to read and feed back on them. They are intended to help with 1. specifying what the training resource or event is meant to do and who it is for, and 2. assessing the success of the training against those specifications. As such, they are potentially a very useful tool for reminding those developing training of the measures they can take and the factors they should consider in order to create a genuinely useful training resource, whilst also providing a framework for review and impact.
Some participants were perhaps not entirely clear on the potential benefits of the criteria, and profited from a chance to discuss the document with members of the DaMSSI-ABC team. Those who had a clear grasp of the aim and structure of the document – usually by replacing ‘information literacy’ with ‘research data management’ for ease of use in their particular context – agreed it looked very useful and provided a structure that may clarify what they’re trying to do.
Detailed feedback and questions on the criteria were sought, and will still be received gratefully by Stéphane Goldstein at stephane.goldstein AT researchinfonet.org.
Discussion was a good opportunity for projects to ask questions and share experiences. Points included:
- Culture change in institution can’t be expected to happen during short project lifespan. But projects can be a catalyst to inspire change and start the process.
- Important to remember that changing culture in one area or institution can influence other players, e.g. researcher practice and requirements can influence the behaviour of publishers if messages are clear enough.
- Support – including admin – staff are an important population in institutions: in universities, they are over 50% of staff. They also have to manage data and information. Datasafe (Bristol) has been considering their needs as well as those of researchers.
- Simplification of models can sometimes help engagement. As JISC’s Neil Grindley pointed out, many initiatives have simplified models such as the DCC lifecycle model into four main areas; e.g. the four digital preservation projects have collaborated on a leaflet which reduces DC activities to: start early, explain, store, share. This will henceforth be known as the Grindley Theory of Four Things.
- Short (5 – 10 min) resources lend themselves to easier re-use and can more easily be slipped into training at the institution that isn’t about RDM. This means we can raise awareness more widely than just preaching to the converted. For example, it would make sense to include RDM in induction training, or training for researchers in bidding for funding.
- Terminology is still an issue: ‘digital preservation’ and even ‘data’ are problematic in some training contexts.
- People in institutions are already doing training in disparate ways in areas connected to RDM. It’s important to find out if this is happening in your institution, if they are aware of your project and if you’re giving consistent messages across the institution.
- Even simple measures can be valuable when you’re trying to quantify the benefits of improved RDM. Sometimes a quantity is useful, sometimes a story.
- Need for generic as well as discipline-specific training and resources.
- Need to work across campus and involve all relevant areas such as research office, library, IT services (both local and central computing services), staff development services, legal office.
- Librarian role is valuable for various reasons, but an important one is the ability to use links across campus.
- Whilst researchers often appear to have higher loyalty to their discipline than their institution, and researchers are a mobile population, a discipline by its nature doesn’t often have agreed rules, representatives, funded infrastructure or membership. So knowledge can be passed through informal networks, but there is little in the way of actually engaging with ‘a discipline’ as a whole. It’s still institutions who are providing the infrastructure, policy framework and the training. DaMSSI-ABC is keen to work with professional bodies where these exist to try and address this situation.
- This strand of projects as well as fellow travellers, e.g. www.le.ac.uk/researchdata happy to build on prior work, e.g. JISC Incremental www.glasgow.ac.uk/datamanagement, UKDA, Sudamih, in the ‘four things’ approach to building online guidance.
- Is there a role for organisations such as UKCGE, HEA?
JISC MRD training strand (RDMTrain02): http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata/research-data-management-training.aspx
RIN Research Information and Digital Literacies Coalition: http://www.researchinfonet.org/infolit/ridls/
RIN Criteria for Describing and Assessing Training: http://www.researchinfonet.org/infolit/ridls/strand2/