Category Archives: Tools

Goportis Conference 2013: Non-Textual Information: Strategy and Innovation Beyond Text

The Goportis Conference 2013 on ‘Non-Textual Information: Strategy and Innovation Beyond Text’ took place on 18-19 March 2013 in Hannover.  A programme with abstracts and speaker biographies is available at http://www.nontextualinformation2013.de/index.php/programme.  This gives an idea of the number and variety of speakers: some of my highlights are outlined here, and you can get more narrative on Twitter by searching for the event’s tag, #goportis13.

The event began with a keynote by Martin Hofmann-Apitius of the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), who passionately argued for better access to scientific data for the good of science, particularly for public health.  This need is demonstrated by the rapidly increasing occurrence of Alzheimer's disease in the West: to tackle such huge challenges, we urgently need to be able to undertake text mining and data mining to produce useful, computable chemical information for science to advance.  He particularly identified the current publishing model in research as 'problematic', describing the existing business model of many publishers as something that 'interferes with the advancement of science'.  Martin is convinced that the days of the static publication are numbered, and that in future scientific communication will be done through knowledge models and other non-static means.

Jan Brase of DataCite then gave an overview of DataCite's work in promoting the citability of data.  Jan believes that libraries should open their catalogues to any kind of information.  The catalogue has classically been a window onto the holdings, but now a library doesn't have to hold all the records it presents.  In the future, Brase predicted, libraries will function more as portals in a net of trusted providers, drawing on their long heritage of bringing scientific information to the public, their track record of persisting longer than projects and other departments of the institution, and their reputation as very trustworthy organisations.  Yet more love for the libraries!  Now all we need to do is fund them to take on these new responsibilities and acquire the concomitant skills…

Todd Carpenter of the US National Information Standards Organization (NISO) stirred up some debate by suggesting that perhaps we need to be more discriminating in selecting different metadata structures for different things.  He referred to the use of DOIs 'for just about everything: books, articles, data, content negotiation and licensing.  We don't apply an ISBN to people.  We don't use taxpayer numbers for addresses.  Are we pushing the DOI beyond its limits?  Should we call what DataCite is doing with DOIs something different from, for example, what the CrossRef community is doing?'

Todd also outlined a project undertaken by NISO and the National Federation of Abstracting and Information Services (NFAIS) which looks at the current publication of supplementary materials and how publishers are dealing with these.  What is critical, supplemental or ancillary to understanding?  Todd made a straightforward but fundamental point: the form of content does not determine whether something is integral to understanding.  Just because one part of a research publication is a paper and another is a video, that doesn't necessarily mean that the paper should be regarded as the main publication and the video as supplementary material – it could quite easily be the other way around.  This chimed with one of the trends of the event, namely that publishing needs to change from the static paper to more flexible, interactive and repurposable models.

Jill Cousins of the Europeana Foundation / The European Library moved the focus slightly more onto humanities and arts research with her update on work at Europeana: the access provided to 27m resources (which are not held by Europeana itself; rather, it provides the metadata to enhance findability) and the challenge of making the metadata for 27m objects across Europe available under CC0 licensing!  Jill was also keen to discuss the new Europeana Research initiative, which will soon be available at http://pro.europeana.eu.

Creative Commons licensing is important to the work of the Jisc MRD projects, particularly those making training resources for use and re-use.  It was useful to hear Puneet Kishor of Creative Commons reporting on the new licence suite, 4.0, and the differences between it and the previous suite, including the licensing of European sui generis database rights (SGDR).  The new 4.0 licences are to be launched in the second quarter of 2013 – right now, though, you can contribute your thoughts at wiki.creativecommons.org/4.0 – and Puneet particularly wants to hear from scientists.

Brian McMahon of the International Union of Crystallography (IUCr) is, I think, quite chuffed that crystallographers are generally considered to be really, really good at research data management – not least by Richard Kidd of the Royal Society of Chemistry – but feels it is still important for those in the field to keep their skills current and to keep contributing to the advancement of science.  This was another talk presenting ways to extend the functionality and interactivity of the scientific publication, as Brian outlined the publishing work of the IUCr and ways of modelling crystal structures as non-textual information in publications.

I had the last talk of the event, presenting the work of the Jisc Managing Research Data programme.  It’s a real challenge trying to communicate the mass and the variety of the activities that the JiscMRD projects are tackling, and to delineate the difference between the programme-level work and that of the individual projects, but I did my best.  I described the landscape and drivers which stimulate programme activity, the structure of the programme, some lessons learned from phase 1 which have been applied to phase 2, and the fearlessness of projects in tackling tricky aims such as improving institution-specific awareness, devising and delivering discipline-specific training, analysing and enhancing current RDM infrastructure provision, implementing or extending data repository provision, attempting to cost data loss, and generally sorting out the world.  I also described various key resources provided by MRD projects and the Digital Curation Centre.  I then had the pleasure of Goportis’s Klaus Tochtermann describing UK RDM activity as ‘the most advanced in Europe’ – so I think we’re doing something right!

You’ll see from the Twitter feed (#goportis13) that there were many more talks which discussed particular applications of non-textual information in a range of disciplines – far more information than can sit comfortably in a blog post, so please have a look.  The slide decks will be available from the conference website in the near future – I’ll tweet when I’m aware of this having happened.

Do you agree we need new publishing paradigms?  How could your discipline benefit from non-textual research communication?  Want to know more about any of the projects mentioned above?  Let us know in the comments!

Laura

E: laura.molloy AT glasgow.ac.uk

 


New year, new IDCC

A very happy new year to all on the MRD programme and all ‘fellow travellers’!  2013 has started with a shot of energy provided by IDCC 2013, which took place in the deliciously-named Mövenpick hotel in Amsterdam last week (14 – 17 Jan).

A lot of the Twitter stream (#IDCC13) agreed that there was a huge amount of information and opinion to absorb.  This frenetic pace was encouraged by the practice-paper slots, which allowed only ten minutes per talk!  A great opportunity to really work on honing those high-level messages, then.

It was very encouraging to see representatives of so many Jisc MRD projects there, and I hope those who were in the ‘National perspectives in research data management’ track found the talks Simon Hodson and I did on the programme as a whole and on the evidence-gathering activity to be useful.  One slight disappointment was having the “National perspectives” track running at the same time as the “Institutional research data management” track: the MRD programme connects institutional approaches and happens to work across the UK, so whilst we weren’t entirely out of place in the “National” track, we probably missed out on some relevant audience.  No matter: if you missed either talk and are interested in seeing the slides, the presentation about the MRD programme as a whole is here; and the talk on the evidence gathering activity is here.  Your feedback or questions are of course welcome.

One of the things the MRD programme has been – and I hope continues to be – very good at is making stuff available to other people.  In his IDCC preview blog post, Kevin Ashley said,

“Overall, I would like everyone to come away aware of the potential for reuse of the work that others are doing and the potential for collaboration. Whether it is software tools, training materials, methodologies or analyses, many of the talks describe things that others can use to deal with data curation issues in their own research group, institution or national setting.”

This is what we as a programme, along with other organisations and activities, do.  Various pieces of work across the MRD programme with the DCC CARDIO tool have inspired other projects and areas of the programme; the same applies to those who have tailored the DCC DMPonline tool, and we encourage all such innovations to be made available to provide examples and ideas for others.  In addition, the MRD programme has a strand (both in MRD01 and in the current iteration of the programme) specifically devoted to creating training materials for research data management, aimed at particular audiences.  These are really valuable resources and have been created to be used and re-used in an open and flexible way.

I was asked so many times throughout the event where these materials can be found that I thought it was worth listing them here.  The links given lead directly to teaching resources; background information on the projects can be found here: http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx

(Unfortunately the website for the DMTpsych project at University of York is no longer online.  As the project has not deposited its resources into Jorum either, I can’t supply a link.)

There are more training resources in production at the moment: you can read more about them here:  http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata/research-data-management-training.aspx

We as a programme can’t solve the issue of duplication of effort in digital curation by ourselves, but by maximising the use of these materials, and finding new applications for them, we are definitely doing our bit.

Have you used any of these resources?  Want to know more?  Let us know in the comments!


Oxford digital infrastructure to support research workshop

The University of Oxford has impressively attempted to marshal the diverse projects ranging across disparate areas of expertise in research data management at the university.  I attended a DaMaRo workshop today to review the digital infrastructure required to meet the challenges of the multidisciplinary and institutional research landscape as it pertains to Oxford.

First and foremost, this is no mean feat in a university as diverse and dispersed as Oxford, and Paul Jeffreys and colleagues are to be congratulated for the work to date.  It's hard enough attempting to join up a smaller, albeit research-intensive, university such as Leicester, and the road is long and at times tortuous – never mind potentially at odds with established university structures and careers…

I particularly liked the iterative approach taken during the workshop: present key challenges to the various stakeholders; provide an opportunity to reflect; then vote with your feet (OK, with post-it notes in traffic-light colours) on which areas should be prioritised.  At the very least this is useful, even if we may argue over which stakeholders are present or not.  In this case the range was quite good, but inevitably you don't get many active researchers (at least in terms of publishing research papers) at this kind of meeting.

In assessing the potential research services, it was pointed out where a charging model would be required if a service were not funded by the institution or externally.  It turns out that at Oxford the most popular choice was the proposed DataFinder service (hence no weblink yet!), which would act as a registry of data resources in the university and could be linked to wider external search.  I remember that during the UK Research Data Service (UKRDS) pathfinder project there was a clearly identified need for a service of this kind.  Jean Sykes of LSE, who helped steer the UKRDS through choppy waters, was present and told me she is about to retire in a couple of months.  Well done Jean – and I note that UKRDS launched many an interesting and varied flower now blossoming in the bright light of 'data as a public good'; an itch was more than scratched.

I also note in passing that it was one of the clear achievements of the e-science International Virtual Observatory Alliance movement, developed for astronomical research between 2000 and 2010, that it became possible to search datasets, tools and resources in general via community-agreed metadata standards.  It takes medium- to long-term investment, but it can be done.  Don't try it at home, and don't try to measure it by short-term research impact measures alone… even the Hubble Space Telescope required a decade or more before it was possible to demonstrate clearly that the number of journal papers resulting from secondary reuse of data had overtaken those from the originally proposed work.  Watch it climb ever upwards after that, though…

Back to the workshop: we identified key challenges around Helpdesk-type functionality to support research data services, and around whom, how and when to charge in the absence of institutional funding.  I should highlight some of the initiatives gaining traction here at Oxford, but it was also pointed out that in-house services must always be designed to work with appropriate external services.  Whether in-house or external, such tools must be interoperable with research information management systems where possible.

Neil Jefferies described the DataBank service for archiving, available from Spring 2013, which provides an open-ended commitment to preservation.  The archive is immutable (deposits can't be altered once made) but versioned, so that it is possible to step back to an earlier version.  Meanwhile, Sally Rumsey described a proposed Databank Archiving & Manuscript Submission Combined (DAMASC) model for linking data and publications.  Interestingly, there is a serious attempt to work with a university spin-off company providing the web 2.0 Colwiz collaboration platform, which should link to appropriate Oxford services where applicable; it was noted that a friendly user interface is always welcome if a service is to be attractive to researchers.  The platform's launch date is September 2012, and the service will be free to anyone, by the way, in or out of Oxford.
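The talk didn't go into DataBank's internals, but the 'immutable yet versioned' idea is worth pinning down: nothing is ever overwritten; each deposit simply appends a new version record, and every earlier version remains retrievable.  A minimal sketch of that pattern (all names here are illustrative, not DataBank's actual API) might look like this:

```python
import hashlib
import time

class VersionedArchive:
    """Toy sketch of an immutable, versioned archive: deposits are
    never altered in place; each change appends a new version record,
    and any earlier version can still be retrieved."""

    def __init__(self):
        self._blobs = {}      # content hash -> bytes (write-once)
        self._versions = {}   # object id -> list of version records

    def deposit(self, obj_id, data):
        # Store the bytes under their content hash; an existing blob
        # is never overwritten, which is what makes the store immutable.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        record = {
            "version": len(self._versions.get(obj_id, [])) + 1,
            "sha256": digest,
            "deposited": time.time(),
        }
        self._versions.setdefault(obj_id, []).append(record)
        return record["version"]

    def retrieve(self, obj_id, version=None):
        # Default to the latest version; an explicit version number
        # steps back through the history.
        history = self._versions[obj_id]
        record = history[-1] if version is None else history[version - 1]
        return self._blobs[record["sha256"]]

archive = VersionedArchive()
archive.deposit("dataset-42", b"first results")
archive.deposit("dataset-42", b"corrected results")
assert archive.retrieve("dataset-42") == b"corrected results"
assert archive.retrieve("dataset-42", version=1) == b"first results"
```

The content-hash indirection also means that two identical deposits share one stored blob, a design choice many preservation systems make to save space.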

Meanwhile, for research work in progress the DataStage project offers secure storage at the research group level while allowing the addition of simple metadata as the data is stored, making that step up to reusability all the easier down the line. It’s about building good research data management practice into normal research workflows and, of course, making data reusable.

Andrew Richards described the family of supercomputing services at Oxford.  Large volumes of 'at-risk' storage are available for use on the fly, but are not backed up – you'd soon run into major issues trying to store large amounts of this kind of data for the longer term.  There is also very little emphasis on metadata in the supercomputing context, other than where it is supplied voluntarily by researchers.  I raised the issue of sustainability of the software and associated parameters in this context, where a researcher may need to be able to regenerate the data if required.

James Wilson of OUCS described the Oxford Research Database Service (ORDS), which will launch around November 2012 and again be run on a cost-recovery basis.  The service is targeted at hosting the smaller databases used by the vast majority of researchers who don't have in-house support or appropriate disciplinary services available to them.  It has been designed to be hosted in a cloud environment over the JANET network, in the same way as biomedical-research database applications will be provided by Leicester's BRISSkit project.

Last but not least, Sian Dodd showed the Oxford Research Data Management website, which includes contact points for a range of research data lifecycle queries.  It is very important for the often isolated researcher to have a single place to go to find out more information and be pointed to the tools needed for the job at hand.  Institutions in turn need to be able to link data management planning tools to in-house resources and costing information.  To that end, the joint Oxford and Cambridge X5 project (named after the bus between the two) will go live in February 2013 and provide a tool to enable research costing, pricing and approval.


OR2012: Research Data Management and Infrastructure: institutional perspectives

Research data management can make a significant contribution to an institution’s research performance but needs solid user requirements research, an understanding of the researcher working space and a collaborative approach between researchers and support staff for infrastructure to be adopted, understood and sustained in the institution.  That was the message from this session on 11 July in Edinburgh at Open Repositories 2012 on research data management and infrastructure, from the perspectives of three particular institutions.

Unmanaged to managed

First we heard from Natasha Simons of Australia's Griffith University.  Natasha made a clear connection between the university's position in the top 10 research universities of Australia and the existence of its Research Hub, which was developed with funding from the Australian National Data Service (ANDS).  The Hub stores data and the relationships between data, exports to ANDS, and provides Griffith researchers with their own profiles, which enable better collaboration across the institution by allowing researchers to find others with similar research interests for collaboration and supervision.

Natasha outlined some challenges the Griffith team have met and are currently facing, but ultimately reported that they are successfully transforming institutional data in line with ANDS aims from unmanaged to managed; disconnected to connected; invisible to visible; and single-use to reusable.

Resourcing for RDM

Another institution which connects RDM with its prestigious position in the research league tables is Oxford; Sally Rumsey of the university's Bodleian Library took us through its vision for institutional research data management infrastructure, encompassing current work on Oxford DMP Online and the DaMaRO project; data creation and local management (DataStage, ViDaSS); archival storage and curation (DataBank, a software store); and data discovery and dissemination (a document repository, Oxford DataFinder and Colwiz).

Sally argued that data management doesn't stop at digital objects:

“Paper in filing cabinets, specimens in jars: all could exist as data.”

She also reminded us that although emerging funder requirements – particularly this year's EPSRC roadmap requirement – were doing much to focus minds on RDM, there is also the challenge of unfunded research, a major component of research activity at Oxford.  This needs data management requirements and funding, too.

Sally was asked whether researchers were going to end up paying for RDM infrastructure.  She argued that there needs to be a budget line in research bids to cover these costs.  This prompted me to reflect that we talk about getting researchers trained from the start of their research activity, but to bring about the kind of awareness that will lead researchers to cost data management into their bids, we need to engage with them before they even start writing the bid.  This is an argument for engagement at PhD level at the latest, and for much wider and more consistent provision of RDM training in universities, in order to bring about this kind of change in culture.  Clearly we also need simple, accessible costing tools to help non-specialists quantify explicit costs for data management and preservation, for inclusion in funding bids.

Adopt, adapt, develop

Anthony Beitz, manager of Australia’s Monash University eResearch Centre, also has nascent culture change in mind.  He described the availability of research data as having the potential to change research work:

“We’re going to see things we’ve never seen before.”

Anthony’s description of how the eResearch team works at Monash is based on a clear understanding of the characteristics of the research space and how that differs from the way in which IT services staff work.

  • Researchers: focused on outcomes.  They work in an interpretive mode, using iterative processes.  The approach may be open-ended and thrives on ambiguity.  Requirements and goals may change over time.  May require an ICT capability for only a short period of time – don’t tend to care what happens to it after the end of the project.  Resourceful, driven, and loyal to their discipline more than the institution.
  • IT services: broad service base.  Supporting administration, education and research.  Continuity of IT services is a priority.  Excel at selecting and deploying supporting institutional enterprise solutions.  IT works in analytical mode as opposed to the research space, which is in interpretive mode.

The volume of data is growing exponentially, but funding to manage it is certainly not.  In this context, a clear articulation of need between the researcher space and the IT services space is crucial.  Anthony argues that researchers need to participate actively in the deployment of an institution's RDM infrastructure.  The media researchers currently use are not good for reliability, security or sharing, but no single institutional RDM platform will fit all researchers' needs.  RDM solutions must be a good cultural fit, as researchers have stronger synergies with colleagues beyond the institution and are more likely to use solutions from within their disciplines.  Anthony suggests that IT services should adopt existing solutions already used within disciplines where possible, as building a new one breaks the collaboration cycle between researchers and colleagues at other institutions – as he put it, 'going into development should be a last resort.'

In this way, much of the RDM activity at Monash seems to be explicitly responding to current researcher behaviours.  Adoption of emerging solutions is encouraged by promoting a sense of ownership among the researchers; by delivering value early and often; and by supporting researchers in raising awareness of an RDM platform within their research community.  If users don't feel they own a resource, they'll look to the developers to sustain funding.  If they feel ownership, they'll look for funding for it themselves, so buy-in is good not only for adoption but also for sustainability.


Digital curation tools: what works for you?

I’m undertaking a piece of work with Monica Duke and Magdalena Getler of the DCC, and we need your help!  We’re looking at which DCC-developed digital curation tools are used by the MRD02 projects.  This is a happy case of our interests overlapping in a Venn diagram-type way: I’m interested in which digital curation tools, DCC or not, are used (or considered but rejected) by the projects.  Monica and Magda are interested in the use of DCC tools by the MRD02 projects as well as by other people.

There is a list of the DCC tools developed to date at http://www.dcc.ac.uk/resources/tools-and-applications, and there is a freshly-revised catalogue of digital curation tools developed by people other than the DCC at http://www.dcc.ac.uk/resources/external/tools-services (although please note this latter link is currently still in development – it should be finalised by the week commencing 30 April 2012).

We plan to look at the project plans, blogs and so on to see where digital curation tools are mentioned.  After this initial perusal, the plan is currently for DCC to send out a brief survey to projects where we don’t already have a full picture from their blogging (and this may also be a way of helping to get the new RDMTrain02 projects involved), asking for information on their use of DCC tools.

If you’re on one of the projects and keen to contribute, it would be immensely helpful to me if you could let me know which tools for digital curation (DCC-developed or not) you have considered using.  If you’re going ahead with use of them, please let me know what you think of them, and if you’ve decided against use of a particular tool, please let me know why.  I welcome this feedback by email to laura.molloy AT glasgow.ac.uk, or in the comments below.  Thanks!
