Legal issues
Excerpt from "Guidelines on Digital Research Data at TU Darmstadt"
"In particular, all project participants are obliged to ensure compliance with good scientific practice and long-term archiving, as well as to implement the relevant requirements of the research funders and partners. In doing so, they take ethical, data protection and copyright aspects into account." (Guideline 2)
Note
This collection is for informational purposes only and is not intended to be a legal document. TUdata cannot provide legal advice, so we refer to recommendations or bodies that can assist you with your questions.
Legal issues, in particular copyright and privacy regulations, are important aspects of your research, especially if you plan to share your research with collaborators or to publish it. In accordance with the DFG's code of conduct for good scientific practice, we advise addressing legal questions from the very beginning of your project (e.g. within your DMP), as it can be complicated to obtain publication or exploitation rights afterward. We recommend agreeing on how research data may be used as early as possible. If you plan to publish your research data, see the Archival and Publication section for a flowchart that summarizes the following points and helps you decide whether anything stands in the way of publication.
Identify rights holders and other stakeholders
Copyright ("Urheberrecht")
You need to determine whether your research data is protected by copyright at all before you deal with other related issues. Unfortunately, there is no generally valid answer to this. While simple sensor readings are generally not protected, more elaborate research outputs, for example texts and images, are. Databases are a special case: here, the database itself is covered by its own provisions in copyright law.
If you want to publish data or make data openly available, you need to ensure that you have the rights and consent to do so. In general, you should consider the following aspects regarding rights holders before you publish any of your research data:
- Third party rights: If several parties have rights to research data, try to obtain the consent of all stakeholders to publication. Possible situations where this may apply are:
  - Use of copyrighted materials, e.g. images and text, in your research. Here, you must obtain clearance from the original rights holder.
  - The project was conducted as part of a collaboration, which means that several researchers may hold rights to the data.
  - The research was subject to contracts, e.g. as part of an employment where the employer might have rights.
- Non-disclosure agreements: If commercial interests are linked to your research, non-disclosure agreements or other contractual obligations may be in place that prohibit publishing research data.
- Patenting: If you plan to use your results in a patent, you should refrain from publishing research data before the patenting process is completed, otherwise the novelty required for a patent might be lost.
Some of these issues, such as patenting, may only generate a temporary embargo on publication. You should therefore reconsider your decisions after the appropriate time has passed.
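The points above can be captured as a simple checklist. The following is a minimal sketch, not an authoritative decision procedure and not legal advice; the class `PublicationCheck`, its field names and the function `publication_blockers` are hypothetical and only mirror the points listed in this section.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PublicationCheck:
    """Hypothetical checklist mirroring the rights-holder points above."""

    third_party_consent_missing: bool = False  # copyrighted material, collaborators, employer
    nda_in_place: bool = False  # non-disclosure or other contractual obligation
    patent_pending: bool = False  # patenting process not yet completed


def publication_blockers(check: PublicationCheck) -> list[str]:
    """Return the reasons that currently stand in the way of publication."""
    blockers = []
    if check.third_party_consent_missing:
        blockers.append("obtain the consent of all rights holders")
    if check.nda_in_place:
        blockers.append("review non-disclosure or other contractual obligations")
    if check.patent_pending:
        blockers.append("wait until the patenting process is completed (temporary embargo)")
    return blockers


if __name__ == "__main__":
    check = PublicationCheck(patent_pending=True)
    for reason in publication_blockers(check):
        print("Open issue:", reason)
```

An empty result only means that none of the listed points applies; it does not replace an individual assessment.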
Data privacy
If your project generates research data that contains personal information or otherwise relates to an identifiable individual, there are a number of things to consider. As a rule of thumb, your data contains personal information if it contains names, addresses or gender of individuals, or other information that relates to a person, such as phone numbers, student IDs, licence plates etc. In the European Union, personal data is protected by the General Data Protection Regulation (GDPR). The Council on Social and Economic Data (RatSWD) published a guide in German and English on data protection in research. The NFDI consortium BERD@NFDI released an interactive tool that helps you decide whether the regulations of the GDPR apply to your research.
One way of dealing with personal data is to de-identify individuals by pseudonymization or anonymization. The Research Data Centre Education (FDZ Bildung) at the DIPF (German Institute for International Educational Research) provides instructions on anonymization of qualitative and quantitative data. In addition, the OpenAIRE project provides Amnesia, a tool that supports researchers in anonymizing their research data. Another option for unstructured text data is QualiAnon by the qualitative data center Qualiservice.
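To illustrate what pseudonymization can look like in practice, here is a minimal Python sketch that replaces a direct identifier with a keyed hash (HMAC-SHA256) from the standard library. It is not one of the tools mentioned above, and the function name `pseudonymize`, the field `student_id` and the example key are assumptions made for illustration. Note that pseudonymized data generally still counts as personal data, because whoever holds the key can re-identify individuals.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256.

    The same identifier and key always yield the same pseudonym, so records
    can still be linked without exposing the original value.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability (assumption)


if __name__ == "__main__":
    # Hypothetical key; in practice keep it secret and separate from the data.
    key = b"replace-with-a-randomly-generated-key"
    records = [{"student_id": "1234567", "grade": 1.7}]
    for record in records:
        record["student_id"] = pseudonymize(record["student_id"], key)
    print(records)
```

Replacing direct identifiers this way does not remove indirect identifiers such as rare combinations of attributes, so it is only one step towards de-identification.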
Sometimes, anonymization would make your research data unusable, especially in qualitative social research. If you process personal data in your research that cannot be anonymized, you have to obtain legal permission and the consent of the research subjects and other persons concerned. They must be informed
- about how research data will be stored, preserved and used in the long term,
- about how confidentiality will be maintained, e.g. by anonymizing the data and managing data access, and
- about the right of withdrawal and the right to deletion of their data.
This information should be part of a written consent form. You can use the template for a declaration of consent provided by the ethics commission of TU Darmstadt as a starting point for implementing these steps. BERD@NFDI provides a tool with information about key aspects of obtaining lawful consent.
If your data contains personal information, you have to make sure that the information is protected during your analysis. This means that tools such as Hessenbox might not be suitable for storing and sharing data within your project. Using web-based tools for analysis might also be problematic.
If you plan to publish your data, you have to anonymize the personal information prior to publication. However, complete de-identification might render your data useless to future research. In that case, social research data can be published with access barriers. For example, GESIS, the Leibniz Institute for the Social Sciences, provides access management for sharing quantitative research data. QualidataNet connects research data centers that offer the same for qualitative data.
Choose a fitting license
If you assign a license to your research data, you avoid ambiguities, and others can reuse your results under well-defined and binding terms. We suggest using free licenses to create open and fair access to your data. To make frequently occurring conditions easy to handle, there are standardised licenses you can use. The most commonly used licenses are:
- Creative Commons (CC) for texts, images and data. For research data, we especially suggest open licenses such as CC0 or CC-BY, since the more restrictive variants are criticized for hindering open reuse,
- GNU General Public License (GPL) or BSD-3 for software,
- Open Data Commons (ODC), which is designed for databases.
For your research software or code, use dedicated software licenses such as the ones mentioned above instead of Creative Commons. Additionally, if you used third-party software within your own software, you have to check the respective licenses. They determine which licenses you can choose from, because your choice must be compatible with theirs.
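The recommendations in this section can be summarized as a small lookup. The sketch below is illustrative only: the function `suggest_licenses` and the category names are hypothetical, and the concrete license versions (given as SPDX identifiers) are assumptions, since the text above does not prescribe specific versions.

```python
from __future__ import annotations

# Suggested licenses per type of research output, following the list above.
# Values are SPDX license identifiers; the chosen versions are assumptions.
SUGGESTED_LICENSES = {
    "data": ["CC0-1.0", "CC-BY-4.0"],
    "software": ["GPL-3.0-or-later", "BSD-3-Clause"],
    "database": ["PDDL-1.0", "ODC-By-1.0", "ODbL-1.0"],
}


def suggest_licenses(output_type: str) -> list[str]:
    """Return candidate licenses for a type of research output."""
    key = output_type.strip().lower()
    if key not in SUGGESTED_LICENSES:
        known = ", ".join(sorted(SUGGESTED_LICENSES))
        raise ValueError(f"unknown output type {output_type!r}; expected one of: {known}")
    return list(SUGGESTED_LICENSES[key])  # copy, so callers cannot alter the table


if __name__ == "__main__":
    for kind in ("data", "software", "database"):
        print(kind, "->", ", ".join(suggest_licenses(kind)))
```

The lookup does not check compatibility with third-party licenses; that still has to be verified separately.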
If you are unsure which license is a good choice, there are several tools that will help you choose a suitable license:
If you need to combine licenses you can check for compatibility using these websites: