Storage options

Many options are available for storing research data. This section describes some of the options available at the University of Washington, as well as some other options that may be useful.

Options comparison

The following table compares the different options for storing research data:

| Option | Suitability | Storage cost |
| --- | --- | --- |
| UW Psych Dept | General use at UW Psych | Free up to 2 TiB; 10 TiB free on request; more available for purchase (no recurring costs) |
| U Drive | Interactive use at UW | Free up to 50 GiB, then $123/TiB/month ($0.12/GiB) |
| UW LOLO | Long-term storage, archival | $3.45/TiB/month |
| UW SFS | Interactive use and collaboration at UW | $256/TiB/month |
| UW OneDrive | Interactive use and collaboration at UW | Free, 5 TiB limit |
| UW Google Drive | Interactive use and collaboration at UW | Free, 100 GiB limit |
| REDCap | Research data | |
| ResearchWorks | Research data publication at UW | |
| Amazon S3 | Programming, sharing, archival | ~$25/month for 1 TiB hot data, ~$5/month for cold (S3 Glacier Instant Retrieval) |
| Azure Blobs | Programming, sharing, archival | ~$25/month for 1 TiB hot data, ~$5/month for cold; ~$30/TiB for cold egress |
| Dropbox | Interactive use and collaboration | Subscription ($12/month for 2 TiB) |
| OSF | Research data, collaboration, publishing | Free |
| GitHub | Code, software containers, collaboration | Free for most uses |

UW support: 💲 discount 5% (Amazon S3); 💲 discount ~10% (Azure Blobs)

Off-campus availability VPN VPN VPN VPN

Automatic backups

Limited Limited
Versioning

Platforms

Encryption in transit

Encryption at rest

FERPA

HIPAA

On request

Local filesystem

Sharing

UW only UW only UW only

UW only

Available on Klone w/rclone w/rclone

w/rclone w/rclone w/rclone w/rclone w/rclone

UW-supported options

UW provides several different storage options for faculty, staff, and students. Additionally, there are storage options that are not directly supported by UW but are available to UW users through UW’s enterprise agreements with the service providers, sometimes at a discount.

The following sections describe some of the most common options. UW IT maintains a page that provides an overview of UW’s online storage options, as well as a comparison of file service options. The UW IT Service Catalog has a more complete list of UW IT services.

The UW Libraries Research Data Services also maintains a page with information about storage options.

Departmental storage

The Department of Psychology has a server that can be used for storing research data. This server is located in Guthrie Hall and is managed by the department’s IT staff. The server is not intended for sharing data with collaborators outside of the University of Washington. Contact the department’s IT staff for more information.

Features and options

  • 2 TiB of storage per lab (up to 10 TiB available on request; for more, contact dougkalk@uw.edu)
  • 1 Gigabit Ethernet connection to campus network
  • Accessible on Klone via rclone’s smb backend
  • Accessible off-campus via VPN
  • Weekly backups (twice a week on request) to a backup server
  • Monthly backups to network-attached storage (NAS)
  • Storage is not encrypted
  • Shared access typically by sharing a single username and password per lab
  • Alternative configurations are possible on request (e.g., read-only accounts, restricted access to specific directories)
  • Mountable as a drive on Windows, Mac, and Linux systems
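
The rclone access listed above could be scripted roughly as follows. This is a minimal sketch, not a documented procedure: the remote name `psychdrive` and both paths are hypothetical placeholders, and it assumes an rclone remote using the smb backend has already been set up with `rclone config`.

```python
import shlex
import subprocess


def build_rclone_copy(source, dest, dry_run=True):
    """Return the argument list for an `rclone copy` invocation."""
    cmd = ["rclone", "copy", source, dest, "--progress"]
    if dry_run:
        # --dry-run reports what would be transferred without copying anything.
        cmd.append("--dry-run")
    return cmd


def run_rclone_copy(source, dest, dry_run=True):
    """Run the copy and raise CalledProcessError if rclone exits with an error."""
    return subprocess.run(build_rclone_copy(source, dest, dry_run), check=True)


# Hypothetical remote name and paths, for illustration only.
example_cmd = build_rclone_copy("psychdrive:mylab/raw", "/path/to/scratch/raw")
print(shlex.join(example_cmd))
```

Defaulting to a dry run makes it possible to inspect what would be copied before any data moves; passing `dry_run=False` performs the actual transfer.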

The department’s IT staff can provide more information about the server’s configuration and options. Contact dougkalk@uw.edu for more information.

Note

Transfer speed to the server is limited by the speed of the network connection and its disk drives. The server is connected to the campus network via a 1 Gigabit Ethernet connection. The maximum write speed is around 70-100 MiB/s.
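
To get a feel for what these speeds mean in practice, the arithmetic can be sketched as below. The function name is illustrative, and the estimate assumes a sustained transfer rate with no other bottlenecks, which real transfers rarely achieve.

```python
def transfer_hours(size_gib, speed_mib_per_s):
    """Estimated hours to move size_gib GiB at a sustained speed in MiB/s."""
    size_mib = size_gib * 1024  # 1 GiB = 1024 MiB
    return size_mib / speed_mib_per_s / 3600


# Theoretical ceiling of a 1 Gb/s link, ignoring protocol overhead:
# 10**9 bits/s / 8 = 125,000,000 bytes/s, about 119 MiB/s.
link_ceiling_mib_s = 1e9 / 8 / 2**20

for speed in (70, 100):
    print(f"1 TiB at {speed} MiB/s: about {transfer_hours(1024, speed):.1f} hours")
```

At the quoted write speeds, copying 1 TiB takes roughly three to four hours, so large datasets are best transferred well ahead of when they are needed.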

Pricing

Free for departmental labs up to 10 TiB. More may be available on request.

Eligibility

  • Department faculty and staff

UW LOLO

The UW LOLO Data Archive provides long-term tape-backed archival storage for users at the university. It is only intended for use by UW faculty, staff, and affiliated organizations. It is suitable for data that is not actively being worked on. The Hyak documentation has additional information about the LOLO Data Archive.

Features and options

  • All files immediately and directly accessible via SSH protocol
  • Support for up to 1,000 files per TB of data stored
  • Fast uploads and downloads for large (≥100 GB) files
  • 10 Gb/s network connection to campus, the internet, and Hyak
  • Two copies of all files are preserved, each in a separate data center, each in a different seismic zone
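
Because the archive supports only about 1,000 files per TB and favors large files, directories containing many small files are commonly bundled into a single archive before upload. The following is a minimal sketch of that idea using Python's standard `tarfile` module; the function name and the example directory name are hypothetical, and it is not an official LOLO procedure.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack source_dir into one uncompressed tar archive and return the file count."""
    source = Path(source_dir)
    n_files = sum(1 for p in source.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w") as tar:
        # arcname keeps paths in the archive relative to the directory name.
        tar.add(source, arcname=source.name)
    return n_files
```

For example, `bundle_directory("sub-01", "sub-01.tar")` would turn a directory of many files into one file that counts once against the limit. The archive should be written outside the directory being bundled.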

Pricing

The LOLO Data Archive is priced at $3.45/TiB/month, with a minimum purchase of 1 TiB. The LOLO Data Archive is billed to a UW Workday worktag.

Eligibility

  • UW faculty, staff, and affiliated organizations
  • UW Workday worktag required

UW Shared File Service

The UW Shared File Service provides a network file system that can be mounted on Linux and Windows systems. It is only intended for use by UW faculty, staff, and students. It is suitable for data that is actively being worked on.

Features and options

  • CIFS/SMB access from UW campus subnets. (Can also be accessed via VPN Service, which provides remote systems with a UW campus IP address.)
  • SFTP access from any internet location.
  • User self-service file restores via “snapshots”
  • Disaster recovery backups in two geographically separate regions (Seattle and Spokane). Per-file restores from tape are not available.
  • Easy allocations of additional space as demand requires.
  • Access controlled via UW NetIDs and UW Groups
  • Permissions are limited to Read/Write for a single group per folder; full granular ACLs are not currently supported.

Pricing

$0.25/GiB/month

Eligibility

UW faculty; UW staff; UW researchers; UW clinicians; UW academic units; UW administrative units; UW Medical Center; Harborview Medical Center; requires a valid UW budget number

Compliance

This service is considered FERPA compliant by the UW Registrar’s office (due to the process used for data release), but has had no other formal third party data security or privacy compliance audits.

UW U Drive

The UW U Drive provides a network file system that can be mounted as a drive on Windows, Mac, and Linux systems on campus or by VPN.

Features and options

  • CIFS/SMB access from UW campus subnets. (Can also be accessed via VPN Service, which provides remote systems with a UW campus IP address.)

Warning

SFTP access to U Drive will no longer be available after March 20, 2024.

Pricing

Free up to 50 GiB. $0.12/GiB/month for additional storage.
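
The tiered pricing can be worked through with a short calculation. This sketch assumes that only usage above the free 50 GiB is billed and that billing is proportional per GiB; actual billing increments may differ.

```python
FREE_GIB = 50          # free allocation stated above
RATE_PER_GIB = 0.12    # USD per GiB per month beyond the free allocation


def u_drive_monthly_cost(size_gib):
    """Monthly cost in USD, assuming only usage above 50 GiB is billed."""
    billable_gib = max(0, size_gib - FREE_GIB)
    return billable_gib * RATE_PER_GIB


for size in (50, 200, 1024):
    print(f"{size} GiB: ${u_drive_monthly_cost(size):.2f}/month")
```

Under these assumptions, 200 GiB would cost about $18 per month, since 150 GiB is billable.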

Eligibility

UW faculty; UW staff; UW students; UW students in residence halls; UW researchers

UW OneDrive for Business

UW provides access to the Microsoft OneDrive for Business cloud file syncing service as part of its Office 365 subscription.

Features and options

  • 5 TiB of storage
  • 250 GiB maximum file size
  • HIPAA and FERPA compliance
  • Sharing with UW users
  • Desktop, browser, and mobile access
  • SharePoint Online integration
  • Accessible on Klone via rclone

Warning

OneDrive has technical limitations that may cause problems when using it with some applications (e.g., FreeSurfer). Caution is advised when working with files stored in OneDrive. The support page has more information about OneDrive’s limitations.

Pricing

Free for eligible UW users.

Eligibility

UW faculty; UW staff; UW students; UW researchers; UW clinicians; For Shared UW NetIDs and Sponsored & affiliate UW NetIDs, accounts can be provisioned by UW employees

UW Google Drive

UW Google Drive provides Google Drive file storage and sync for users at UW (this is not the same as the Google Drive service provided by Google).

Features and options

  • 100 GiB of storage
  • FERPA compliance
  • Accessible on Klone via rclone

Warning

UW Google Drive is not HIPAA compliant.

ResearchWorks Archive

ResearchWorks Archive is the University of Washington’s digital repository (also known as “institutional repository”) for disseminating scholarly work. More information about ResearchWorks can be found on the Scholarly Publishing Services page.

Link: ResearchWorks Archive

REDCap

REDCap is a web-based application for building and managing online surveys and databases. According to the REDCap website:

Research Electronic Data Capture (REDCap) is a rapidly evolving web tool developed by researchers for researchers in the translational domain.

REDCap features a high degree of customizability for your forms and advanced user right control. It also features free, unlimited survey functionality, a sophisticated export module with support for all the popular statistical programs, and supports HIPAA compliance.

Cloud storage

Amazon Web Services

Amazon Web Services (AWS) is a cloud computing platform that provides several storage services, including Amazon S3, Amazon EBS, Amazon EFS, and Amazon S3 Glacier. Amazon S3 (object storage) is comparable to Azure Blobs. Amazon EBS (block storage attached to virtual machines) is comparable to Azure managed disks. Amazon EFS (shared file storage) is comparable to Azure Files. Amazon S3 Glacier is comparable to Azure Archive Storage.

Features and options

  • Data Egress Waiver, which effectively eliminates the standard charges for moving data out of AWS
  • HIPAA-eligible accounts, which require a special request and approval and are subject to important operational considerations to meet the compliance requirements of HIPAA, the UW BAA, and UW Medicine Compliance Policies
  • Accessible on Klone via rclone

Pricing

The approximate cost of storing 1 TiB of data per month is:

  • ~$25 for “hot” data accessed frequently (S3 Standard)
  • ~$5 for “cold” data accessed infrequently (S3 Glacier Instant Retrieval)

The Data Egress Waiver eliminates the standard charges for moving data out of the AWS Service.

UW IT has a subscription service for AWS that provides a discount of 5% and a waiver for data egress charges. The subscription is covered by UW’s HIPAA BAA and other enterprise contracts. A UW Workday worktag is required to use the service. For more information, see the UW IT AWS page.

For detailed pricing information, see the AWS pricing calculator.
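
For a rough first estimate before turning to the calculator, the approximate per-TiB figures above can be combined as in the sketch below. It covers storage charges only, ignoring request and retrieval fees, and it assumes that a discount, if any, applies uniformly to those charges; neither assumption is confirmed by the figures given here.

```python
HOT_USD_PER_TIB = 25.0   # approximate S3 Standard figure quoted above
COLD_USD_PER_TIB = 5.0   # approximate S3 Glacier Instant Retrieval figure quoted above


def s3_monthly_estimate(hot_tib, cold_tib, discount=0.0):
    """Rough monthly storage cost in USD; discount is a fraction such as 0.05."""
    base = hot_tib * HOT_USD_PER_TIB + cold_tib * COLD_USD_PER_TIB
    return base * (1 - discount)


print(f"${s3_monthly_estimate(1, 4):.2f}/month without discount")
print(f"${s3_monthly_estimate(1, 4, discount=0.05):.2f}/month with a 5% discount")
```

Keeping only actively used data in the hot tier has a much larger effect on the estimate than the discount does.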

Eligibility

UW faculty; UW staff; UW researchers; UW clinicians; UW academic units; UW administrative units; UW affiliated organizations; UW Medical Center; Harborview Medical Center; Any group with an approved UW blanket PO. NOTE: UW students are not eligible for this service, except when working within the scope of UW employment, e.g., as an RA, TA or GSA. Known Prerequisites: This service requires a valid UW budget number.

Azure

Microsoft’s Azure cloud platform includes a number of storage options, including Azure Blobs and Azure Files.

Azure Blobs is comparable to Amazon S3. It is designed to store large amounts of unstructured data but is not designed to be used as a file system.

Features and options

  • Covered by UW’s HIPAA BAA and other enterprise contracts
  • Accessible on Klone via rclone

Pricing

The approximate cost of storing 1 TiB of data per month is:

  • ~$25 for “hot” data accessed frequently (Azure Blobs Hot Tier)
  • ~$5 for “cold” data accessed infrequently (Azure Blobs Cold Tier)

UW IT has a subscription service for Azure that provides a discount of roughly 10% on Azure usage charges and allows charges to be paid with a UW Workday worktag. A UW Workday worktag is required to use the service. For more information, see the UW IT Azure Subscription Service.

For detailed pricing information, see the Azure pricing calculator.

Eligibility

UW faculty; UW staff; UW researchers; UW clinicians; UW academic units; UW administrative units; UW affiliated organizations; UW Medical Center; Harborview Medical Center; Any group with an approved UW blanket PO. NOTE: UW students are not eligible for this service, except when working within the scope of UW employment, e.g., as an RA, TA or GSA. Known Prerequisites: This service requires a valid UW budget number.

UW-IT’s Azure Subscription service allows UW units that wish to create and manage their own Azure subscription to receive a discount (~10%) on Azure usage charges and to pay those with a UW Workday worktag.

Other options

Google Cloud Storage

Google Cloud Storage is a cloud object storage service that is part of the Google Cloud Platform. It resembles the offerings by AWS and Azure. It is distinct from UW Google Drive and Google Drive. Google Cloud Storage can be accessed on Klone via rclone.

Tip

Google Cloud also offers block storage, file storage, and archival storage services. See the Google Cloud documentation for more information.

Open Science Framework

Open Science Framework (OSF) is a web-based platform for managing research projects. OSF can be used to store research data and their metadata. It is best suited for collaborating with other researchers and sharing research data for papers. Storage add-ons are available to connect services such as Amazon S3, Dropbox, GitHub, and OneDrive to OSF.

Dropbox

Dropbox is a subscription-based cloud file sync service. The University of Washington does not support Dropbox. Dropbox is not suited for storing sensitive data. However, Dropbox can be useful for storing non-sensitive data that is actively being worked on, and it provides convenient versioning and recovery options. Dropbox can be accessed on Klone via rclone.

GitHub

GitHub is a web-based hosting service for version control using Git. Files up to 100 MiB each can be stored in a GitHub repository. The total size of a repository should be under 5 GiB.
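
Before committing data to a repository, it can help to check for files that exceed the per-file limit. The helper below is a minimal sketch with a hypothetical function name; it simply walks a directory tree and reports files larger than 100 MiB.

```python
from pathlib import Path

GITHUB_FILE_LIMIT = 100 * 2**20  # 100 MiB, the per-file limit stated above


def oversized_files(root, limit_bytes=GITHUB_FILE_LIMIT):
    """Return (path, size) pairs for files under root larger than limit_bytes."""
    found = []
    for path in Path(root).rglob("*"):
        # Skip Git's own object store; only working-tree files matter here.
        if ".git" in path.parts:
            continue
        if path.is_file() and path.stat().st_size > limit_bytes:
            found.append((path, path.stat().st_size))
    # Largest files first.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling `oversized_files(".")` from the top of a working copy returns the files that would need another storage route, largest first.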

GitHub repositories can be public or private. Private repositories are accessible only to collaborators who have GitHub accounts and have been granted access to the repository. GitHub repositories can be accessed from the GitHub website or the GitHub Desktop application.

GitHub Releases

GitHub Releases can be used to store files up to 2 GiB in size each.

GitHub Packages

GitHub Packages can be used to create container images that can be uploaded to the GitHub Container Registry. For public repositories, there are no charges or storage limits. For private repositories, storage limits and charges apply. A GitHub account is required to access GitHub Packages.

Although GitHub Packages is not designed to store research data, it can be used to store research data in the form of a container containing an archive of research data (similar to a ZIP file). The data cannot be viewed or modified while stored in GitHub Packages, but can be downloaded and extracted from the container. Any changes to the data require creating a new image and uploading it to GitHub Packages.

Public packages can be accessed by anyone and should not be used for sensitive data. However, it may be possible to encrypt data before storing it in the image or to deploy an encrypted container.

Large File Storage

GitHub supports Git Large File Storage (Git LFS), an extension to Git that can be used to store large files in a GitHub repository.

According to GitHub’s documentation, the maximum size of a file tracked with Git LFS depends on the plan:

| Plan | Maximum file size |
| --- | --- |
| GitHub Free | 2 GiB |
| GitHub Pro | 2 GiB |
| GitHub Team | 4 GiB |
| GitHub Enterprise Cloud | 5 GiB |