Do's and Don'ts in Research Data Management

Christian Hillen

I have a background in history. As an archivist I have been working with historical (research) data - both analogue and digital - for the last 25 years. Currently I am a consultant with DKZ.2R at the RRZK (Regionales Rechenzentrum der Universität zu Köln).

Research Data Management Do’s and Don’ts - Step up your RDM skills!

1. Structuring and naming your folders
There is an easy way to make your data findable for you and your team: establish a folder structure which makes sense for you and your working group as well as naming conventions for your folders.

Don’t:

Paul and Suzie
»Guideline
>application
»version2_final
»v.3
»review
»3rd.version
>JD
»qn
»0-1

Instead do:

000_int_orga
»01_application
»02_review
120_questionnaires
»01_qualitative
»02_quantitative
130_data
»01_qualitative
»02_quantitative

Also Do:

000_int_orga
100_planning
»01_application
»02_review
120_qualitative
»01_guideline
»02_data
130_quantitative
»01_questionnaire
»02_data
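
A layout like this can be created once by a script and then reused for every new project. The following is a minimal sketch in Python, assuming a layout modelled on the example above; the dictionary `STRUCTURE` and the function `create_structure` are illustrative names, not part of any existing tool.

```python
from pathlib import Path

# Folder layout modelled on the example above; adapt the names to your project.
STRUCTURE = {
    "000_int_orga": [],
    "100_planning": ["01_application", "02_review"],
    "120_qualitative": ["01_guideline", "02_data"],
    "130_quantitative": ["01_questionnaire", "02_data"],
}


def create_structure(root, structure=STRUCTURE):
    """Create the folders under `root` and return the created paths, sorted."""
    root = Path(root)
    created = []
    for top, subfolders in structure.items():
        top_path = root / top
        # exist_ok=True makes the script safe to run more than once.
        top_path.mkdir(parents=True, exist_ok=True)
        created.append(top_path)
        for sub in subfolders:
            sub_path = top_path / sub
            sub_path.mkdir(exist_ok=True)
            created.append(sub_path)
    return sorted(created)
```

Calling `create_structure("my_project")` builds the folders below `my_project`. Because the numeric prefixes sort correctly, the folders appear in the intended order in any file browser.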

Want to learn more about organizing your data?
Take part in our Data Challenge on November 7th in Cologne to learn more about metadata and data structuring (sign up here!), or visit University of Cologne EduLabs for more information on how to structure your data in useful ways.

2. Storing your data
Where you store your data matters, not only for making them accessible to the (right) people but also for making them findable. If you store them on a USB stick, no other member of your working group will be able to access or find the data; they won’t even know the data exist.

Don’t:

measuring device (local, remote)
laptop (local, remote)
Dropbox (local, remote)
flash drive (archive)
external H(ard)D(isk)D(rive) (archive)

Do this instead:

S(olid)S(tate)D(rive) (local, remote)
H(ard)D(isk)D(rive) (local, remote)
N(etwork)A(ttached)S(torage) (local, remote)
Sciebo (local, remote)
DataStorageNRW (archive)
Repositories (archive)

Want to learn more about storing your data? Visit University of Cologne EduLabs or the UDE Speichermatrix.

3. Naming your data
Naming your data in an understandable and consistent manner makes it much easier for you and your team to find the data you are looking for. Therefore you should take some time to develop naming conventions.

Don’t:

Really_long_file_names_because_windows_is not_able_to_process_more_than_255_characters_and_that_includes_the_name_of_the_folders
Using abbreviations that are not generally understood in your community
Using special characters like * % [ ] > / : ä ö ü ß space

Instead do:

Readme file documenting conventions
Use inverted date format for sorting (YYYYMMDD)
If necessary add hour, minute and second
Initial numbers for sorting (01_title)
Use interoperable set of characters
A good filename could be: 20250901_sample01_H2O_v2_original.tiff.
The readme should explain the structure of your naming convention: [SamplingDate][SampleID][SampleType][VersionNumber][description]
Abbreviations should be explained as well.
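
A naming convention is easier to keep when file names are assembled by a small helper rather than typed by hand. The sketch below, in Python, builds a name from the fields of the convention described above and rejects names with problematic characters. The function name `build_filename`, the underscore separator, and the exact set of allowed characters are assumptions for illustration; adjust them to the convention in your own readme.

```python
import re
from datetime import date

# Letters, digits, underscore and hyphen, followed by a dot and the extension.
# This "safe" set is an assumption; the text above only lists characters to avoid.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def build_filename(sampling_date, sample_id, sample_type, version, description, extension):
    """Assemble a name of the form
    [SamplingDate]_[SampleID]_[SampleType]_[VersionNumber]_[description].[extension]."""
    parts = [
        sampling_date.strftime("%Y%m%d"),  # inverted date sorts chronologically
        sample_id,
        sample_type,
        f"v{version}",
        description,
    ]
    name = "_".join(parts) + "." + extension
    if not SAFE_NAME.fullmatch(name):
        raise ValueError(f"file name contains characters outside the safe set: {name!r}")
    return name
```

For example, `build_filename(date(2025, 9, 1), "sample01", "H2O", 2, "original", "tiff")` returns `20250901_sample01_H2O_v2_original.tiff`, while a sample ID containing a space or an umlaut raises a `ValueError`.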

Want to learn more about naming your data in a way that helps you stay organised?
Take part in our Data Challenge on November 7th in Cologne and learn more about Metadata, Data structuring, and file naming (sign up here!).

4. Interoperability
You can enhance the use and reuse of your data by making them interoperable.

Don’t:

Encrypting your data (if not necessary for legal reasons)
Compressing data (e.g. in a ZIP file) or using compressed file formats (e.g. JPEG)
Using proprietary software

Instead do:

Use open standards
Add lots of metadata
Document your processes of gathering, processing, naming and storing your data
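
One simple way to keep metadata with the data is a "sidecar" file in an open, plain-text format stored next to each data file. The following Python sketch writes such a file as JSON. The function name `write_sidecar` and the metadata fields in the usage note are hypothetical; they do not follow a particular metadata standard, so check which schema is common in your community.

```python
import json
from pathlib import Path


def write_sidecar(data_file, metadata):
    """Write `metadata` as UTF-8 JSON next to `data_file` and return the sidecar path."""
    data_file = Path(data_file)
    # The sidecar keeps the full data file name and adds ".json",
    # so data file and metadata sort next to each other.
    sidecar = data_file.with_name(data_file.name + ".json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False, sort_keys=True),
        encoding="utf-8",
    )
    return sidecar
```

A call such as `write_sidecar("20250901_sample01_H2O_v2_original.tiff", {"creator": "J. Doe", "date_collected": "2025-09-01", "processing_steps": ["raw export"]})` creates `20250901_sample01_H2O_v2_original.tiff.json` in the same folder. JSON is used here because it is an openly specified text format that can be read without proprietary software.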

5. Write a D(ata)M(anagement)P(lan)
DMPs are required by funding institutions, but they are also useful for you, your team and your collaborators because they raise awareness of the importance of the whole Data Life Cycle: Which data are gathered, how many, when and how? How are they processed, stored, archived and reused?

Don’t:

Starting with the DMP two days before handing in your grant application
Underestimating costs for processing and storing data.
Underestimating costs for curating data (human resources)

Instead do:

Start early on so you have time to consider all the different stages of your data in the life cycle.
Think about potential costs in human resources, soft- and hardware as well as storage.

Want to learn more about DMPs? Useful resources are offered by, among others, the University of Cologne, the University of Duisburg-Essen, and the Heinrich Heine University.
