Understanding Open Data

Everything you need to know about making your research data open and FAIR

Here at F1000Research, we’re big advocates for open data. We believe that sharing research data can accelerate the pace of discovery, provide credit and recognition for authors, and even improve public trust in research (but more on that later).

We know the 21st-century researcher has lots to think about – not least securing grant funding, conducting high quality research, and maximizing impact after publication. We’re asking you to add one more thing to this list: open data. Far from being another hoop to jump through, sharing your research data can bring a whole host of benefits to every stage of your research journey.

On this page, we’ll walk you through the what, why, and how of data sharing, shining a light on how open data can help you and your research community. Keep reading for information and resources designed to answer your key questions, including:

What is open data?
How to make your data FAIR
Why choose open data? What are the benefits?
Data collection tips and tricks
How to prepare your data when submitting to F1000Research

Working in STEM?

We've curated a library of content tailored to your research area, so you can learn more about the open data landscape across the sciences, technology, engineering, and medicine.

EXPLORE STEM RESOURCES

What is Open Data?

In a nutshell: open data is data that is available for everyone to access, use, and share.

For researchers, this refers to any information or materials that have been collected or created as part of your research project – such as survey results, gene sequences, software, code, neuro-images, even audio files. In research, open data practices are also known as ‘data sharing’.

There are some cases where data sharing is not appropriate for legal, ethical, data protection, or confidentiality reasons. We recommend researchers strive to make their data as open as possible, and as closed as necessary. This means restricting access to your data only where openly sharing it is not possible.

What is FAIR Data?

The FAIR Guiding Principles were published in Scientific Data in 2016, offering a new framework for research data management, designed to maximize the reuse of research data and support open data practices.

FAIR data is Findable, Accessible, Interoperable, and Reusable. FAIR data goes beyond open data, aiming to make the data itself more useful and user-friendly, rather than simply 'open'. At F1000Research, we endorse the FAIR guidelines as part of our Open Data Policy.

Download our FAIR Data Guide

Learn more about how to make your data Findable, Accessible, Interoperable, and Reusable, with our quick guide for researchers.

Download now

Why Choose Open Data?

When you choose open data, you help others to replicate your study and validate your results. As such, open data is a fundamental requirement for reproducibility and transparency. These are two things we’re big fans of at F1000Research, because they have an impact not just on individual researchers, but on the research ecosystem as a whole, and on wider society.

When the data underlying academic research is made open, it makes it easier to question, share, replicate, validate, confirm, and build upon the evidence which underpins the results.

How to Share your Research Data

Open data can’t be an afterthought. It’s essential to know at the outset of your research project if you’ll be making your data open, so that you can plan accordingly.

Data Collection Tips & Tricks

Create a detailed Data Management Plan (DMP) at the start of your project and keep this updated throughout. Your DMP is a living document that will change and grow over the course of your research lifecycle.

A good DMP has benefits beyond simply supporting open data. It will help you find, organize and understand your data better throughout the research process, improve efficiency by reducing unnecessary duplication (e.g. re-collecting data), and even provide continuity in the event of staff turnover.

When it comes to data collection and analysis, there may be discipline-specific (and repository-specific!) guidelines you need to comply with to ensure your data is FAIR. Make sure you’ve done your research, and have a clear understanding of best practice in your field. If you need support with this step, reach out to your institution’s Data Steward for guidance.

Do standard data sharing policies work for humanities editors?

In her article, Rebecca Grant, Head of Data at F1000, questions how we can adapt current policies to reflect the working practices of humanities scholars.

Read now

Download the guide

Get your copy of our Tips & Tricks now for practical, bitesize guidance on managing data throughout the research process.

Download Now

Repositories

Choosing the right repository for your research can be tricky. This handy guide asks you three simple questions to help determine the best repository for you.

Read now

Spreadsheets

Stuck on how to keep your spreadsheets interoperable and reusable? Read our do’s and don’ts to ensure your spreadsheet data follows best practice.

Read Now

Open Data on F1000Research

Submitting to F1000Research? (Great choice, by the way). Before you submit your article, make sure your research data complies with the progressive Open Data Policy we advocate for, and that you’ve prepared your data according to our stringent Data Guidelines.

About our Open Data Policy

All articles published on F1000Research that report original results should include a Data Availability Statement: this is a short section of text providing citations to repositories that host the data underlying your results, together with details of any software used to process results.

Failure to provide your research data openly is likely to result in your submission being rejected, although there are a few exceptions:

Ethics and security: where data access must be restricted for ethical or security reasons
Data protection: where human data cannot be de-identified, so data cannot be shared in order to protect patient/participant privacy
Large data: where data is too large to be feasibly hosted by a recommended repository
Third party data: where data has been obtained from a third party, and restrictions apply to the availability of the dataset

In all cases where the data cannot be shared openly, authors should provide detailed instructions for readers on how to apply for access to the data. These instructions should be included in the Data Availability Statement for the article.

Read our Open Data Policy for full details, and get in touch with our Editorial team if you cannot share your dataset for one of the reasons listed above.

Extended data

F1000Research authors must make all their data, including extended data, openly available. Extended data are additional materials that support the key claims made in your article, but are not absolutely required to follow the study design and analysis. Examples include questionnaires, images, or tables, which some journals may refer to as 'Supplementary Materials'. If there is any code required for processing or replication, this should be included within your extended data. For submission to F1000Research, this data needs to be uploaded to an approved online repository, alongside any data underlying your results.

Creating research software?

You’ve come to the right place. We’re pretty much trailblazers when it comes to software, as one of the first publishers asking for it to be made open alongside the rest of your data back in 2015. Even today, not all publishing platforms or journals require your software to be made openly available, but we think differently. At F1000Research, we know that open software is just as important as open data when it comes to ensuring reproducibility.

So, what exactly are our requirements?

For submission to F1000Research, any novel software should be written in an open source programming language, with the source code hosted on a recognized Version Control System (VCS) platform like GitHub. We also ask for an archived version at the time of submission, deposited in a structured repository like Zenodo. Your source code must be assigned an open license, ideally an OSI-approved license.

Include software in your Data Availability Statement under a ‘Software Availability’ heading; here, you should list the repository and license under which the software can be used.

How to Write a Data Availability Statement

Not sure exactly what to include in your Statement, or how it should be formatted?

Don't worry - we've pulled together a quick guide which walks you through every step of the process. Download the guide now to find out:

What is a Data Availability Statement?
What kinds of data need to be covered by the Statement?
How to cite repository-hosted data
When and how to reference research software
How to reference third party data
Examples of Data Availability Statements on F1000Research

If you have questions about how to write your Data Availability Statement, you can always get in touch with our Editorial team, who will be happy to help!

4 Steps to Open Data

1. Prepare your data for sharing

This step is the most time consuming, but also the most important. Firstly, consider how to make your data as open as possible, and as closed as necessary. Are there any ethical or security issues around sharing your data? Do you need to anonymize your dataset to protect patient or participant privacy? If you’re unsure, reach out to the F1000Research Editorial team for advice.
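If you do need to protect participant privacy, one common first step is pseudonymization: dropping direct identifiers and replacing participant IDs with keyed hashes. The Python sketch below shows the idea under stated assumptions. The column names (`participant_id`, `name`, `email`) and the sample rows are made up for illustration. Bear in mind that pseudonymization alone is not full anonymization, since indirect identifiers can still allow re-identification, so treat this as a starting point rather than a guarantee.

```python
import csv
import hashlib
import hmac
import io

# Hypothetical column names; adjust these to your own dataset.
DIRECT_IDENTIFIERS = {"name", "email"}
ID_COLUMN = "participant_id"


def pseudonymize(rows, secret_key):
    """Drop direct identifiers and replace the ID column with a keyed hash."""
    cleaned = []
    for row in rows:
        # Keep every column except the direct identifiers.
        out = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        # A keyed hash (HMAC) gives a stable pseudonym that cannot be
        # recomputed by someone who does not hold the secret key.
        digest = hmac.new(secret_key, row[ID_COLUMN].encode("utf-8"), hashlib.sha256)
        out[ID_COLUMN] = digest.hexdigest()[:12]
        cleaned.append(out)
    return cleaned


# Made-up sample data standing in for a real CSV file.
raw = io.StringIO(
    "participant_id,name,email,age_band,score\n"
    "P001,Ada Example,ada@example.org,30-39,7\n"
    "P002,Bo Example,bo@example.org,40-49,5\n"
)
rows = list(csv.DictReader(raw))
shared = pseudonymize(rows, secret_key=b"keep-this-key-out-of-the-repository")
print(sorted(shared[0].keys()))
```

The secret key should stay out of anything you deposit. Without it, the pseudonyms cannot be regenerated from the original IDs.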

Are there subject-specific data standards relevant to your research? If so, make sure your data meets these standards, and that you label your files according to discipline-specific best practice. If your dataset includes spreadsheets with large tables, follow our simple Do’s and Don’ts to maximize its accessibility and reusability.
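As a rough illustration of what a spreadsheet check can look like, here is a small Python sketch that scans a table exported as CSV for a few common problems: blank column headers, duplicate headers, and rows with the wrong number of cells. These three checks are our own assumptions about sensible defaults, not a reproduction of the Do’s and Don’ts guide, and the function name `check_table` is hypothetical.

```python
import csv
import io


def check_table(csv_text):
    """Return a list of human-readable problems found in a CSV table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, body = rows[0], rows[1:]
    problems = []
    names = [h.strip() for h in header]
    # Every column should have a name so others can interpret it.
    if any(name == "" for name in names):
        problems.append("blank column header")
    # Duplicate names make columns ambiguous when the file is reused.
    if len(set(names)) != len(names):
        problems.append("duplicate column header")
    # Each data row should have exactly one cell per column.
    for number, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(f"row {number} has {len(row)} cells, expected {len(header)}")
    return problems


good = "sample_id,temperature_c,reading\nS1,21.5,0.42\nS2,22.0,0.40\n"
bad = "sample_id,,reading\nS1,21.5,0.42\nS2,22.0\n"
print(check_table(good))
print(check_table(bad))
```

Running a check like this before you deposit your files is a quick way to catch layout problems that would otherwise trip up anyone trying to reuse the data.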

Finally, ensure details of any software that is required to view your datasets are included – if you’ve coded the software yourself, the code should be made openly available too.

2. Select a repository

Your datasets should be deposited in a stable and recognized open repository, under a CC0 license. Your community might have a recognized repository, and some data types (such as genetic sequences or protein structures) have specific data banks they should be deposited in. Struggling to decide which repository is right for your research? Our Data Guidelines include a comprehensive list of F1000Research-approved repositories, or download our handy guide.

3. Add a Data Availability Statement to your article

On F1000Research, all articles must include a Data Availability Statement, even when there is no data associated with the article. This statement helps your reviewers and readers find and access the data underlying your results. Not sure how? Read our guidance on how to write this.
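To make the idea concrete, the Python sketch below assembles a plain-text statement from a few structured details about each dataset. Everything here is illustrative: the `Dataset` fields, the wording of the output, and the placeholder DOI are assumptions, not the official F1000Research template, so follow the published guidance for the exact format.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    repository: str  # where the data is hosted
    title: str       # title of the deposited dataset
    doi: str         # persistent identifier assigned by the repository
    license: str     # license the data is shared under


def data_availability_statement(datasets):
    """Build a plain-text statement; the wording is illustrative only."""
    if not datasets:
        # A statement is still expected when there is no associated data.
        return "Data availability\n\nNo data are associated with this article."
    lines = ["Data availability", ""]
    for d in datasets:
        lines.append(f"{d.repository}: {d.title}. https://doi.org/{d.doi}")
        lines.append(f"Data are available under the terms of the {d.license} license.")
    return "\n".join(lines)


# Placeholder values, not a real deposit.
example = Dataset("Zenodo", "Example survey responses", "10.5072/placeholder", "CC0")
statement = data_availability_statement([example])
print(statement)
```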

4. Link your datasets to your article

Once your article is published, update your repository project with the DOI for your article. Linking your data and your article in this way means they are reciprocally connected, ensuring you receive credit for your work.
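Here is a small Python sketch of what that reciprocal link might look like as a metadata record. The field names (`dataset`, `related_article`, `relation`) are hypothetical and the DOIs are placeholders; the relation label borrows DataCite's `IsSupplementTo` relation type. Your repository will have its own form or fields for this, so treat the record as a picture of the idea rather than something to upload as-is.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_url(doi):
    """Normalize a DOI string to its https://doi.org/ resolver form."""
    doi = doi.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"https://doi.org/{doi}"


def link_record(dataset_doi, article_doi):
    """Describe the dataset as a supplement to the published article."""
    return {
        "dataset": doi_url(dataset_doi),
        "related_article": doi_url(article_doi),
        "relation": "IsSupplementTo",
    }


# Placeholder DOIs, not real identifiers.
record = link_record("doi:10.5072/dataset-placeholder", "10.5072/article-placeholder")
print(record)
```

Writing every DOI in the same resolver form means readers and indexing services can follow the link in either direction without guessing at the format.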

So that's it - four simple steps to open data on F1000Research! Ready to publish? Submit your research now

What is Code Ocean?

We spoke to Code Ocean about how ‘compute capsules’ can support computational reproducibility.

Read Now

Discover our blog

Read articles and updates from the F1000Research team, including case studies from our authors.

Browse Now
