Planning research data management
Excerpt from "Guidelines on Digital Research Data at TU Darmstadt"
"Research data are all digital data that are created by transformation from an analogue medium or in the course of experiments, measurements, simulations, computer program development, studies of primary sources, surveys or inquiries or are their result. Associated with them are also the metadata, documentation, and software necessary to understand them. Research data exists in a variety of digital formats and stages of processing and aggregation in every scientific discipline." (Definition 1)
"The planning and execution of a research project also include research data management. This concerns, for example, the type of data generated and used in the course of the project, information on the required accuracy and scope of the data and metadata to be collected, measures to maintain the integrity and authenticity of the data, as well as information on confidentiality, retention and planned publication, including clarification of intellectual property rights and rights of use. These aspects should be systematically prepared and recorded in a suitable form (data management plan). Subject-specific characteristics and standards must be taken into account and the data management plan must be adapted in line with the current progress of the project." (Guideline 1)
Define objectives and tasks
As research data management (RDM) touches on all areas of your scientific activities, implementing new workflows, changing existing ones, or introducing RDM concepts for new projects can seem like a huge undertaking at first. Therefore, divide the work of planning or improving your research data management into smaller tasks and put quick wins first. Even early and minor adjustments in one area (e.g. unifying file naming conventions) can already yield large benefits for your data management and thus for your research in general.
Similarly, you should aim to establish a solid RDM foundation on which you can later build by introducing more advanced tools and workflows. More disruptive and work-intensive objectives will also become easier and more manageable this way. With such a small-step approach, later steps will become clearer once earlier ones are established.
In any case, whenever you decide on an action, include everyone affected in the planning process, for example by organizing an RDM planning workshop.
Over time, you will gain more experience with RDM and identify what is working in your team and what can still be improved. Thus, just as you re-evaluate your workflows, re-evaluate your aims regularly and adapt your objectives to the current circumstances.
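A quick win such as a unified file naming convention can be checked automatically. The following is a minimal sketch, assuming a hypothetical convention of the form YYYY-MM-DD_project_description_vNN.ext. The pattern and the function names are illustrative assumptions, not a TU Darmstadt standard, and should be replaced by whatever your group agrees on.

```python
import re

# Hypothetical convention (an assumption for illustration):
#   YYYY-MM-DD_project_description_vNN.ext
# Lowercase letters, digits and hyphens only, no spaces.
NAME_PATTERN = re.compile(
    r"\d{4}-\d{2}-\d{2}_[a-z0-9]+_[a-z0-9-]+_v\d{2}\.[a-z0-9]+"
)


def follows_convention(filename):
    """Return True if the whole file name matches the assumed convention."""
    return NAME_PATTERN.fullmatch(filename) is not None


def nonconforming(filenames):
    """Return the file names that do not match the assumed convention."""
    return [name for name in filenames if not follows_convention(name)]
```

Calling nonconforming on a list of file names from a shared folder gives a short to-do list of files to rename, which can be a useful starting point for a discussion in the group.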
Analyze the status quo
Planning RDM always starts with taking a close look at the current state. Common questions at this stage include:
- What RDM experiences do you and other researchers in the group/project have?
- Do you have experience with technical solutions (also from somewhere else, such as earlier positions)? Could those be adapted to your current setting?
- Do you already have strategies that work particularly well and that could be more broadly applied?
- Do you have common routines/conventions/best practices?
- Do you actively/routinely talk about data management, e.g. in your working group seminar?
- How much data do you produce?
- Which data formats do you use?
- Have you recently faced any incidents such as data loss?
- Where do you store your data? Is storage centralized or is the data spread across multiple systems?
- How do you exchange data among each other or with external partners?
- Do you already use technical solutions such as electronic laboratory notebooks or automated data transfer workflows?
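Questions such as how much data you produce and which formats you use can be answered with a small inventory script. The sketch below is one possible approach, assuming your data sits in a directory tree that can be read from a single machine. The function name inventory is hypothetical.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Count files and sum their sizes in bytes per file extension below root.

    Returns two Counters keyed by lowercase extension:
    (number of files, total size in bytes).
    """
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix are grouped under a placeholder key.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes
```

Running this on each storage location in turn also gives a first indication of whether data is centralized or spread across multiple systems.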
Typical RDM implementation tasks
Please keep in mind that other researchers and groups around you face the same tasks of implementing and optimizing research data management. Therefore, it is a good idea to connect early to be able to learn from each other's solutions and ideas. To this end, TUdata is currently setting up a network of research data management officers as detailed under Networking.
The list below outlines typical tasks in no particular order. Depending on your assessment, certain tasks will vary in urgency and difficulty of implementation. For many of the tasks, you will find additional information elsewhere in the recommendations.
- Create a data management plan
  - Data management plans, described in more detail below, are a simple but highly efficient tool that will help you with many other tasks as well.
- Assign responsibilities
  - This includes a research data management officer to connect to the TU Darmstadt RDM network
- Organize research data management training
- Implement quality checks and processes to ensure high data quality
- Inform yourself about and adopt discipline-specific standards
  - Follow the developments within your relevant NFDI consortia for current and future standards
- Integrate central services and tools provided by TU Darmstadt
- Clarify legal aspects such as handling personal information or data ownership
- Clarify if and how you can ensure that your data is FAIR and open, for example by publishing your data
  - FAIR stands for Findable, Accessible, Interoperable and Reusable and is a guiding principle in RDM
- Switch from proprietary to open file formats
- Find technical solutions and automate tasks (typically a task at a later stage after workflows have been established)
  - Automation of RDM tasks reduces workload and risk of manual errors
  - Many tools are open source and even available as services from public infrastructure providers
- Actively participate in larger consortia such as the NFDI
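As an illustration of how an automated check can support data integrity and reduce manual errors, the following minimal sketch records SHA-256 checksums for all files in a directory and later reports files that were added, removed or modified. It uses only the Python standard library. The function names and the idea of keeping the manifest as a dictionary are assumptions for this example, not a prescribed TU Darmstadt workflow.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file below root (relative POSIX path) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Return names added, removed or modified since the manifest was built."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

A manifest built right after data acquisition can be stored next to the data, for example as a JSON file, and compared again after each transfer or backup.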
Additional questions and tasks for larger initiatives, such as cluster projects
Large projects like collaborative research centres face additional challenges when implementing RDM solutions, as they have to coordinate the activities of various subprojects and share data between them. Additionally, they generally have to handle a broad variety of data formats. These challenges may require addressing several additional tasks, as outlined below. We highly recommend contacting TUdata for consultation on the RDM section in your funding proposal.
- Formulate a general data policy for the whole research centre to have common guidance (example: SFB-TRR 211 Data policy; FAIRsharing.org is building up a database of project data policies)
- Create a detailed data management plan for each subproject
- Create practical guidelines for the daily work of researchers, for example based on these recommendations
- Create a data management hub with dedicated research data management personnel
- Organize data exchange between subprojects based on these points:
  - How much data will you exchange between subprojects?
  - Which data formats will you use in the different subprojects, and can those be unified?
- Make self-developed solutions sustainable and reusable
Create a data management plan
A data management plan (DMP) is a document in which you write down in a structured way what will happen to your research data as it passes through the research data life cycle. It is thus a tool that supports the planning of your research workflows by helping you anticipate tasks, opportunities and challenges related to your research data. Data management plans can and should be created at different levels, e.g. one for a larger collaborative project and one for each individual subproject to allow for project-specific details.
With TUdmo, TUdata provides a tool that will lead you through DMP creation in an interview-style approach for a selection of templates. You can create snapshots and always return to the questionnaire to implement changes. The created DMP can be exported to a variety of formats.
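To make the idea of a structured document more concrete, the following minimal sketch models a DMP as a small data structure that can be serialized to JSON. The field names are assumptions loosely derived from the aspects listed in Guideline 1 (type of data, confidentiality, retention, planned publication). They do not reproduce a TUdmo template, a TUdmo export format, or any funder's required format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetPlan:
    """Planned handling of one dataset (all field names are illustrative)."""
    name: str
    data_type: str
    formats: list
    contains_personal_data: bool
    retention_years: int
    planned_publication: str


@dataclass
class DataManagementPlan:
    """A project-level or subproject-level plan with its datasets."""
    project: str
    responsible_person: str
    last_reviewed: str
    datasets: list = field(default_factory=list)

    def to_json(self):
        """Serialize the plan, including nested datasets, to JSON text."""
        return json.dumps(asdict(self), indent=2)
```

Keeping such a record under version control is one way to adapt the plan in line with the progress of the project, since each revision remains traceable.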
Funders may impose their own requirements on the exact format and content of a DMP. One important example is the European Union's Horizon Europe program, where a formal DMP that conforms to the provided template is required for grant applications.
- Deutsche Forschungsgemeinschaft: Checklist for handling research data
- Horizon Europe: ERC DMP template
- CESSDA: DMP expert guide (social sciences)