What are the requirements an RDMI has to cover?

The idea for this post came from a discussion Tom Parsons (ADMIRe project, University of Nottingham) started on the JISCMRD mailing list about the ‘Requirements for an RDM system’. As we all know, there are many challenges involved in figuring out what requirements a Research Data Management Infrastructure (RDMI) has to cover. What does an RDMI have to deliver in the end, what functional specifications have to be defined, and on what scale?

Requirements have to be gathered on different levels from various researchers, research groups and other stakeholders. Approaches, case studies and outcomes depend not only on the projects’ remits (e.g. pilot projects vs. second-phase projects implementing an institution-wide service), but also on resources, buy-in from stakeholders and the varying needs of the target audiences. The wider the field to play on, the harder it gets to balance discipline-specific and generic needs and to decide which parts of the research lifecycle can or have to be covered.

At the same time, the institutional landscape has to be taken into account, especially the IT landscape and the systems that already exist. Tom pointed to the danger of “straying into the territory of a research administration system when we are thinking about an overall RDM system”. Interconnecting with existing systems across the institution is what the MiSS project, for example, is doing at the moment: it integrates existing infrastructure while adhering to the existing IT framework at the University of Manchester. This means the project initiation and accounting systems will be connected to the MiSS RDM core system (as will eScholar as the dissemination platform), and DMPs, for example, are automatically transferred into the active data stage. The temporary downside of this approach (creating an RDMI across the whole lifecycle, but starting with only a thin service layer) is that certain specific needs can only be addressed over time. Then again, such a new infrastructure needs to be able to evolve in use anyway.

To round off this post, here are some thoughts mentioned in the discussion:

Chris Morris pointed to a “concept of operations for research data management for a specific domain, molecular biology: http://pims.structuralbiology.eu/docs/beforePiMS.html” and also “got one general suggestion, which is not to start with use cases about deposition, but with retrieval. There is no scholarly value in a write-only archive.”

Joss Winn provided some example use cases: https://github.com/lncd/Orbital-Core/wiki/Case-Studies. He remarks that, looking at them now, they are very high level, and provides another link with “more detail from our project tracker, although it may not be as easy to follow: https://www.pivotaltracker.com/projects/366731. Basically, other than the usual deposit, describe, retrieve functionality, our two pilot research groups are looking to use the RDM platform for data analysis, such that it is a working tool from the start of the research project, rather than a system for depositing data at the end of the project.”

Steve Welburn addresses another important question, namely what technical framework to choose and how they came to choose DSpace: http://rdm.c4dm.eecs.qmul.ac.uk/platform_choice

Thanks to everyone mentioned for their links and thoughts!

Meik Poschen  <meik.poschen@manchester.ac.uk>
Twitter:  @MeikPoschen