Do you have a plan in place to store the information generated during your research project?
Think about what would happen if you were to lose some or all of your data. Could your project recover from such a setback?
How much valuable time would you lose?
How much money would it cost you?
Would you be liable for the loss of time and/or data?
Proper storage of research data pays dividends throughout your project.
(from Scott Summers' presentation on data security and storage, UK Data Service, 2016)
Things to consider when weighing your storage options:
How much data will your project generate? This is something to consider during the planning phase, because storage costs should be factored into the overall data management plan.
Who will need access to the data during the project's active phase? Collaborative research means additional challenges to storage and access.
Will the project involve confidential or sensitive information? If so, you'll need to take extra precautions to avoid accidental disclosure.
Common Storage Options
Centralized network drive
(e.g., U Sask DATASTORE)
Hosted and managed by a researcher's school, institution, or department.
Data accessible whenever needed
Physically more secure
Backed up regularly, reducing potential for data loss
Desktop or laptop
Convenient for short-term storage and processing
More susceptible to hardware failure
Easily stolen or lost
Commercial or cloud-based
Convenient; accessible from anywhere
Not as secure as networked storage
May not be backed up regularly
External storage media
(portable hard drives, flash drives, and optical discs)
Easily lost, stolen or damaged
Note: In some cases, well-secured and encrypted local drives that are not connected to a network, and are backed up rigorously, are appropriate for storing very sensitive data.
Protecting your research data "requires paying attention to physical security, network security and security of computer systems and files to prevent unauthorized access or unwanted changes to data, disclosure or destruction of data."* The more sensitive the data, the more stringent security measures need to be.
Physical Security
protects your data from disasters (e.g., flooding or fire)
prevents unauthorized access to computers or storage facilities where data and documents related to your project are kept
Encryption - encoding data in a way that makes it readable only to someone who has an access code, key, or password.
protects sensitive or confidential information
makes data transmission from one site to another more secure
restricts access to authorized persons only
The UK Data Service has detailed information about encryption techniques and tools, and has tutorial videos demonstrating the most commonly used encryption software.
Access Control - who needs access to your data and how do you manage that?
restrict physical access to computers and storage media to members of the research team
employ password protection on all computers used during your research
encrypt files and provide access keys only to the research team
*Source: Louise Corti, et al., 2014. Managing and Sharing Research Data: A Guide to Good Practice, Los Angeles: Sage.
Backing up your data
Backing up data means making copies of files frequently, either for short-term storage during a project's active phase or for long-term storage during its static phase. Data files can be lost due to hardware or software failure, they can be accidentally altered or deleted, or they can become corrupted, rendering them unreadable or error-ridden. To avoid problems resulting from data loss, researchers should ensure that their data is properly backed up. A well-thought-out backup strategy should be an integral part of the overall data management plan. (adapted from UK Data Service)
Why back up your data?
reduce the risk of data loss, especially if that data cannot be reproduced
save time and money
recover research data with minimal disruption if something does go wrong
limit your liability
Things to consider when backing up your data
If you're using networked storage, discuss your requirements with the administrators. Some of the questions you need to ask are:
How frequently are their drives backed up?
How long do they store backed up copies?
Do they perform complete or partial backups?
Do they validate the backups to ensure the integrity of the data?
How do they recover files in the event of a problem?
Incremental or partial backups copy only the files or data that have changed since the previous backup and are performed on a regular basis.
Complete backups, on the other hand, duplicate your project's entire data collection.
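The difference between the two approaches can be illustrated with a short Python sketch. This is only a minimal illustration, not a replacement for a managed backup service: the function names (complete_backup, incremental_backup) are hypothetical, and it assumes that a file's modification time is a reliable sign that it has changed since the previous backup.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def complete_backup(source: Path, dest: Path) -> List[Path]:
    """Copy every file under source into dest, keeping the folder layout."""
    return _copy_files(source, dest, since=None)


def incremental_backup(source: Path, dest: Path, since: float) -> List[Path]:
    """Copy only files modified after `since` (a timestamp in seconds)."""
    return _copy_files(source, dest, since=since)


def _copy_files(source: Path, dest: Path, since: Optional[float]) -> List[Path]:
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        # Assumption: an unchanged modification time means unchanged content.
        if since is not None and path.stat().st_mtime <= since:
            continue
        target = dest / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves modification times
        copied.append(path.relative_to(source))
    return copied
```

In practice, the `since` value would be the time of the previous backup, recorded somewhere outside the data folder. A complete backup is still needed periodically, because restoring from incremental copies alone depends on every earlier copy being intact.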
Be sure to have multiple copies of backup and archive files, in several locations, in case of software or hardware failure, theft or tampering, or natural disasters.
Protecting non-digital or textual data: ideally all non-digital data should be digitized. Items that cannot be digitized need to be managed in a way that keeps them secure and permits access on request.
File formats: use open or standardized formats rather than proprietary formats for both short-term and long-term storage of data.
Organization: establish and adhere to a protocol for naming and organizing backup copies to ensure that files are easy to locate and identify.
There are also many third-party backup utilities available for all platforms, some open-source, some commercial.
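A naming protocol for backup copies is easier to follow when the names are generated rather than typed by hand. The Python sketch below shows one possible convention (project, description, ISO date, two-digit version). The pattern itself and the function name backup_name are illustrative assumptions, not a standard; adapt them to whatever protocol your team agrees on.

```python
import re
from datetime import date


def backup_name(project: str, description: str, version: int,
                day: date, ext: str) -> str:
    """Build a file name of the form project_description_YYYY-MM-DD_vNN.ext."""

    def slug(text: str) -> str:
        # Lower-case the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{day.isoformat()}_v{version:02d}.{ext.lstrip('.')}")


print(backup_name("Soil Study", "Field notes", 2, date(2024, 3, 15), ".csv"))
# prints: soil-study_field-notes_2024-03-15_v02.csv
```

Putting the date in year-month-day order means that an alphabetical file listing is also a chronological one.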
Best Practices for Data Storage
1. As a part of your overall data management plan, design a detailed data storage, security and backup policy for your project, and review it from time to time during the project's active phase.
2. Adhere to the 3-2-1 principle:
Keep 3 copies of research data
Use 2 different storage media
Store 1 copy off-site
3. Back up data files regularly. Check backed-up files manually and verify them (using checksums, etc.) to ensure the integrity of the data.
4. Use portable media -- USB drives, portable hard drives, CDs or DVDs -- only for working copies of research data, not for master copies and never for sensitive or personal data. Encrypt these devices to protect the contents in the event of loss or theft.
5. Ensure data integrity by refreshing storage media, that is, by periodically copying data to new media. Magnetic and optical storage media can degrade with time.
6. Employ open or standard file formats for data storage to ensure that files will be readable in the future.
7. Create meaningful file names (including version information) to aid in organizing and locating files and folders.
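Best practice 3 mentions verifying backed-up files with checksums. As a rough sketch of what that can look like, the Python example below compares SHA-256 checksums of each original file and its backup copy. The function names (sha256_of, verify_backup) are hypothetical, and the sketch assumes that the backup folder mirrors the layout of the original folder.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original_dir: Path, backup_dir: Path) -> List[str]:
    """List the files (relative paths) that are missing or differ in the backup."""
    problems = []
    for path in sorted(original_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(original_dir)
        copy = backup_dir / rel
        if not copy.is_file() or sha256_of(path) != sha256_of(copy):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every original file has an identical copy in the backup. The check does not report extra files that exist only in the backup folder.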