Author Archives: jtedds

Research data + research records management = research management?

Interesting meeting at Leicester this week with our Information Assurance Services. Andrew Burnham and I have developed a roadmap to implement research data management policy, as required by EPSRC and other funders, and are currently leading the development of guidance and support for researchers in time for the new academic year. We are doing this in collaboration with colleagues in IT Services, the Academic Practice Unit, the Library and the Research Support Office on behalf of the Research Computing Management Group, chaired by our PVC Research & Enterprise Kevin Schurer, which feeds recommendations up to the University Research Committee. We are feeding in the many relevant external information and guidance resources produced through the JISCMRD programme, UKDA, DCC and related bodies.

I’ve lost some of you, haven’t I?! To add to the confusion, there is also the university code of practice for researchers.

However, the question has arisen as to when and how we distinguish the university records management policy from this research data management policy. We are of course referring here to differences in language between different parts of central services in a university – never mind the disciplinary differences across the university academic community.

Information Assurance Services (IAS) currently have a records management policy tabled for university approval. It might be described as corporate based and is not immediately identified with research. And yet if sensitive data were to be lost in a researcher’s lab notes the matter would probably reach IAS before any other body in the university (they also handle FoI requests in the first instance). IAS identify the lab notes as “research records” not “research data” so is a “research records management policy” then required?

From a research data viewpoint we might think of research records as being about the process including funder specific and personal information about the researchers rather than the research itself. Indeed we made exactly this distinction when following up on an external research data audit to all research staff: “Do you use, reuse or generate sensitive (including commercial in confidence) research data?” Researchers tend to assume that departmental and university administrators will be looking after the “research records”.

So let’s clarify how IAS see it. They incorporate information compliance and security (including FoI and environmental concerns), risk management and business continuity. The question then arises: is there clear legal ownership of research data? Many researchers are somewhat surprised to find out their hard work is actually “owned” by their institution for these purposes. This becomes particularly relevant when researchers move institution, as of course they often do.

So I found myself wondering aloud: do we need a “research management policy” which then refers to the “records management policy” guidance and the rather more prescriptive “research data management policy”?

Oxford digital infrastructure to support research workshop

The University of Oxford have impressively attempted to marshal the diverse projects ranging across disparate areas of expertise in research data management at the university. I attended a DaMaRo workshop today to review the digital infrastructure required to meet the challenges of the multidisciplinary and institutional research landscape as it pertains to Oxford.

First and foremost, this is no mean feat in a university as diverse and dispersed as Oxford, and Paul Jeffreys and colleagues are to be congratulated for the work to date. It’s hard enough attempting to join things up in a smaller, albeit research-intensive, university such as Leicester, and the road is long and at times tortuous. Never mind potentially at odds with established university structures and careers…

I particularly liked the iterative approach taken during the workshop: present key challenges to the various stakeholders present; provide an opportunity to reflect; then vote with your feet (ok, post-it notes in traffic light colours) on which areas should be prioritised. At the very least this is useful even if we may argue over which stakeholders are present or not. In this case the range was quite good but inevitably you don’t get so many active researchers (at least in terms of publishing research papers) at this kind of meeting.

In assessing the potential research services it was pointed out where a charging model was required, if not funded by the institution or externally. Turns out here at Oxford the most popular choice was the proposed DataFinder service (hence no weblink yet!) to act as a registry of data resources in the university which could be linked to wider external search. I remember during the UK Research Data Service pathfinder project that there was a clearly identified need for a service of this kind. Jean Sykes of LSE, who helped steer the UKRDS through choppy waters, was present and told me she is about to retire in a couple of months. Well done Jean and I note that UKRDS launched many an interesting and varied flower now blossoming in the bright lights of ‘data as a public good’ – an itch was more than scratched.

I also note in passing that it was one of the clear achievements of the e-science International Virtual Observatory Alliance movement, developed for astronomical research between 2000 and 2010, that it became possible to search datasets, tools and resources in general via the use of community-agreed metadata standards. It takes medium to long term investment but it can be done. Don’t try it at home and don’t try to measure it by short term research impact measures alone… even the Hubble Space Telescope required a decade plus before it was possible to clearly demonstrate that the number of journal papers resulting from secondary reuse of data overtook those from the originally proposed work. Watch it climb ever upwards after that though…

Back to the workshop: we identified key challenges around Helpdesk type functionality to support research data services and who and how to charge when – in the absence of institutional funding. I should highlight some of the initiatives gaining traction here at Oxford but it was also pointed out that in house services must always be designed to work with appropriate external services. Whether in-house or external, such tools must be interoperable with research information management systems where possible.

Neil Jefferies described the DataBank service for archiving, available from Spring 2013, which provides an open ended commitment to preservation. The archiving is immutable (can’t be altered once deposited) but versioned so that it is possible to step back to an earlier version. Meanwhile Sally Rumsey described a proposed Databank Archiving & Manuscript Submission Combined (DAMASC) model for linking data & publications. Interestingly there is a serious attempt to work with a university spin off company providing the web 2.0 Colwiz collaboration platform which should link to appropriate Oxford services where applicable. It was noted that to be attractive to researchers a friendly user interface is always welcome. Launch date September 2012 and the service will be free to anyone by the way, in or out of Oxford.
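For the more technically minded, here is a minimal sketch of what “immutable but versioned” could look like in practice. To be clear, this is not how DataBank is implemented: the VersionedArchive class, its deposit and get methods and the file names are all hypothetical, chosen only to illustrate the idea that a new deposit never overwrites an old one.

```python
from __future__ import annotations

from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional


@dataclass(frozen=True)
class Version:
    """One deposited snapshot; frozen so its fields cannot be reassigned."""
    number: int
    files: Mapping[str, bytes]  # file name -> content, exposed read-only


class VersionedArchive:
    """Hypothetical append-only store: deposits add versions, never edit them."""

    def __init__(self) -> None:
        self._versions: list[Version] = []

    def deposit(self, files: dict[str, bytes]) -> Version:
        # Copy the caller's dict and wrap it read-only, so later changes
        # on the caller's side cannot leak into the archived snapshot.
        snapshot = Version(
            number=len(self._versions) + 1,
            files=MappingProxyType(dict(files)),
        )
        self._versions.append(snapshot)
        return snapshot

    def get(self, number: Optional[int] = None) -> Version:
        """Return the latest version, or step back to an earlier one."""
        if not self._versions:
            raise LookupError("nothing has been deposited yet")
        if number is None:
            return self._versions[-1]
        if not 1 <= number <= len(self._versions):
            raise LookupError(f"no such version: {number}")
        return self._versions[number - 1]


archive = VersionedArchive()
archive.deposit({"spectra.csv": b"wavelength,flux\n500,1.0\n"})
archive.deposit({"spectra.csv": b"wavelength,flux\n500,1.1\n"})

latest = archive.get()     # the corrected deposit
original = archive.get(1)  # the first deposit is still retrievable
```

The point of the design is that a correction is a new deposit rather than an edit, so anything that cited version 1 can still get exactly what it cited.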

Meanwhile, for research work in progress the DataStage project offers secure storage at the research group level while allowing the addition of simple metadata as the data is stored, making that step up to reusability all the easier down the line. It’s about building good research data management practice into normal research workflows and, of course, making data reusable.

Andrew Richards described the family of supercomputing services at Oxford. Large volumes of at risk storage are available for use on-the-fly but not backed up. You’d soon run into major issues trying to store large amounts of this kind of dataset longer term. There is also very little emphasis on metadata in the supercomputing context other than where supplied voluntarily by researchers. I raised the issue of sustainability of the software & associated parameters in this context where a researcher may need to be able to regenerate the data if required.

James Wilson of OUCS described the Oxford Research Database Service (ORDS) which will launch around November 2012 and again be run on a cost recovery basis. The service is targeted at hosting the smaller databases used by the vast majority of researchers who don’t have in-house support or appropriate disciplinary services available to them. It has been designed to be hosted in a cloud environment over the JANET network in the same way as biomedical research database specific applications will be provided by Leicester’s BRISSkit project.

Last but not least, Sian Dodd showed the Oxford Research Data Management website which includes contact points for a range of research data lifecycle queries. It is so important to the often isolated researcher that there is a single place to go to find out more information and be pointed to the tools needed for the job at hand. Institutions in turn need to be able to link data management planning tools to in-house resources & costing information. To that end, the joint Oxford and Cambridge X5 project (named after the bus between the two) will go live in February 2013 and provide a tool to enable research costing, pricing & approval.

Synthesis of first JISCMRD programme benefits

Useful presentations summarising the benefits identified in the first JISCMRD programme 2009-11 from individual projects/institutes and as synthesised by Neil Beagrie on behalf of the programme can be accessed from the JISC national conference 2011 site.
There is also a more general online overview of the outputs of the first JISCMRD programme now available.

Developing Research Data Management Policy

This is Jonathan Tedds (@jtedds): Senior Research Liaison Manager for IT Services; researcher in astronomy and research data management at the University of Leicester. By way of a first blog post proper here in JISCMRD Towers I want to introduce the increasingly high-profile area of Research Data Management (RDM) policy and why it’s rapidly moving from desirable to essential.

Following the agreement by the RCUK umbrella body of research funders on common data principles for making research data reusable – data as a public good – and similar moves by larger charitable trusts such as Wellcome, funders have then batted the ball back to institutions and said deal with it! The EPSRC in particular requires that institutions in receipt of grant funding establish a clear roadmap to align their policies and processes with EPSRC’s expectations by 1st May 2012, and are fully compliant with these expectations by 1st May 2015 – yes, you did read that correctly, that’s a roadmap by this May! Sarah Jones of the Digital Curation Centre (DCC) has just blogged about this following a refreshed look at this area during the very well attended recent DCC Roadshow at Loughborough in February 2012.

Of course there are many other reasons why any institution that is serious about research should be investing in the support of RDM, and Angus Whyte and I recently co-authored a DCC Briefing on making the case for research data management which sets out the national and international context as well as describing the experiences of the last 3 years at the University of Leicester. As a consequence institutions (and more specifically those held accountable for supporting researchers) are now realising, if they didn’t already, that they need to plan for research data management infrastructure on the ground across the entire research data lifecycle. Crucially they will also need high level policy at the institutional level to make this a reality. So how to go about it?

Well there are a few institutions that already have policies in place including Edinburgh, Oxford, Northampton and Hertfordshire. The DCC maintains a list of these with links to relevant institutional data policies. Of course this in itself is a grey area as your institution may well already have a code of practice which covers at least some of this ground. But does the policy (or the code!) always connect to the practice on the ground? Bill Worthington, who leads the Research Data Toolkit (Herts) JISCMRD project, has recently blogged on their work in this area.

At Leicester we have been building up to an institutional level policy to fit alongside an existing code of practice adopting a rather ground up approach; building on exemplars such as the JISCMRD Halogen interdisciplinary database hosting project and the current BRISSkit UMF project I lead for cross NHS-University biomedical research alongside high profile central investment in high performance computing (HPC). I facilitate a Research Computing Management Group across the University which takes a strategic view of these issues and will inform our own institutional level policy working party.

A recent email exchange on the JISCMRD mailing list showed a strong interest from the many new (and established) institutes involved in getting together to discuss a number of issues around developing and implementing RDM policies. Following an online poll it was decided to host a lunch-to-lunch meeting, supported by the Programme and assisted by the DCC, to take this forward at the University of Leeds on March 12-13th 2012. Based on the poll we are expecting up to 50 participants. I’ll link to further details as they are finalised and made available. Themes raised to date include:

  • How are projects/institutions developing policies? Covering considerations of general principles, guidelines from funders and other bodies, specific considerations for the institution in question.
  • How are people getting approval for policies? A chance to share – e.g. off the record or by the Chatham House Rule – some of the challenges which may be faced.
  • How are people planning to support the implementation of the policies? How do projects/institutions intend to support transition from policy to practice? Policy, infrastructure and guidance. Interplay of top-down and bottom-up elements? How to build mention and requirements of subject specific and/or institutional services into institutional policies.
  • How do technical solutions affect policy decisions? How much will policy be driven by what is technically available to an institution as a (suite of) data management solutions?
  • How are we going to assess and critique the success of RDM systems and policies?

Finally, there are of course difficulties in all of this focus on the institutional level. As a researcher myself (astronomy) I argue that a researcher or research group is likely to have much more in common regarding their requirements to manage their data with a similar researcher or group in the same discipline but residing in any other institution (including international) compared to another researcher/group even in the same building. So we are asking a lot for institutions to meet this full range of requirements across all of their research areas. Researchers rather tend to look to their disciplinary learned societies or evaluation panels established by funders to provide coordinated responses. To be sure, the institutions have a strong role to play and shoulder a strong measure of responsibility but they are by no means the whole answer to the problem as I blogged in Research Fortnight (February 2011).