Having trouble finding or accessing data on AnVIL? The Data Access category is the place to ask questions and find solutions if you need help locating or gaining access to a file or dataset.
- Check out our FAQs below.
- If you don’t find an answer for your problem, post a new topic, and the AnVIL team will do our best to help you out!
For issues connecting data with specific analysis tools, please look at the Help category.
Resources
You can learn more about accessing data on AnVIL through:
FAQs and common problems:
- I can’t find the data I need
- I’m having trouble with dbGaP
- How do I import data?
- I can’t view or export data in a Workspace
I can’t find the data I need
To work with data on AnVIL, you first need to know whether the data is:
- Hosted by AnVIL
- Uploaded by another AnVIL user
- Stored outside of AnVIL, in which case you will need to import it yourself.
Note that many datasets hosted on AnVIL are controlled-access, so you will need to obtain appropriate permissions before you can work with them.
1. Is the data hosted by AnVIL?
If you expect to find your data-of-interest hosted by AnVIL (or just want to see what’s on AnVIL):
- Explore available datasets using the AnVIL Data Explorer
- Learn how to work with these data by reading the AnVIL Data Explorer Guide
If you are a member of a consortium that participates in data sharing through AnVIL, contact your consortium’s leadership to request access to your consortium’s data or inquire about data you expect to see on AnVIL.
Note that data appears on AnVIL only after a data submitter has prepared and submitted it. Some data from a particular study or consortium may therefore be missing from AnVIL or still being processed. If a dataset is not available on AnVIL, you can still import it yourself.
2. Has the data been uploaded by another AnVIL user?
If a colleague or collaborator has uploaded your data-of-interest to their own AnVIL Workspace:
- They will need to grant you permission to access the Workspace(s)
- You will need to find the Workspace(s) where the data are stored by logging into Terra and looking through your Workspaces. From there you can work with the data directly or import them into another Workspace, depending on your team’s practices.
I’m having trouble with dbGaP
If you are having trouble accessing controlled-access data, here are a few things you can try:
1. Is the data hosted by AnVIL?
First, confirm that the data is on AnVIL by checking the AnVIL Data Explorer. Not all dbGaP-protected datasets are available on AnVIL. For a dataset to be made available as part of AnVIL’s data collection, a data submitter must coordinate with the AnVIL team to format the data correctly and provide appropriate metadata. If the data is not hosted by AnVIL, you can still import it yourself.
2. Do you have permission to access the data?
Make sure you have obtained permission through dbGaP if needed. For more info see:
Trainees can be added to their PI’s dbGaP project as a Downloader.
3. Does AnVIL know about your permission?
Follow these instructions to link your AnVIL/Terra account to your NIH/dbGaP account.
Make sure that you are logged in with the correct account.
Note that:
- dbGaP permissions take a little while to propagate to AnVIL:
  - ~6 hours if you already had access with dbGaP
  - ~30 hours if you were just granted access to a dbGaP study
- Links expire after a few weeks, and you’ll need to renew your link to access data.
If you are still having trouble, try unlinking and then relinking the account. This can sometimes resolve the problem.
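Before unlinking and relinking, it can help to check whether enough time has passed for the permission to propagate. The sketch below turns the approximate delays noted above into an earliest expected time. The function name and the choice of starting point (the moment you linked your account or were granted access) are assumptions made for illustration, and the delays are rough estimates rather than guarantees.

```python
from datetime import datetime, timedelta

# Approximate propagation delays quoted above; treat both as rough estimates.
EXISTING_ACCESS_DELAY = timedelta(hours=6)
NEW_ACCESS_DELAY = timedelta(hours=30)


def earliest_expected_access(start: datetime, newly_granted: bool) -> datetime:
    """Return the earliest time the permission is likely to be visible on AnVIL.

    `start` is assumed to be when you linked your account (existing access)
    or when dbGaP granted access to the study (new access).
    """
    delay = NEW_ACCESS_DELAY if newly_granted else EXISTING_ACCESS_DELAY
    return start + delay


if __name__ == "__main__":
    granted_at = datetime(2024, 1, 8, 9, 0)  # hypothetical approval time
    print(earliest_expected_access(granted_at, newly_granted=True))
```

With these numbers, access granted at 09:00 on one day would not be expected on AnVIL before 15:00 the following day.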
4. Has your team (or consortium) granted you permission to access the data?
If you are working with other AnVIL users and they have already set up AnVIL Workspaces with the data, make sure that:
- You have permission to access the Workspace(s)
- You have been added to the appropriate Authorization Domain if needed
If you are a member of a consortium that participates in data sharing through AnVIL, contact your consortium’s leadership to be granted the appropriate permissions.
How do I import data?
If your data-of-interest is not yet on AnVIL, you can upload it yourself or import it from an external cloud-storage platform. Here are some resources on importing data:
- How Terra manages data
- Moving data to and from a Google Bucket (command-line)
- Uploading data from your local computer (GUI)
- Workflow for importing dbGaP data
Additional resources can be found here: Bringing your own data
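For the command-line route, moving data to and from a Google bucket typically comes down to a `gsutil cp` call. The sketch below only builds and prints such a command so you can review it before running it yourself. The helper name, the file name, and the bucket name are hypothetical, and it assumes `gsutil` is installed and authenticated wherever you run the printed command.

```python
import shlex
from typing import List


def gsutil_copy_command(source: str, destination: str, recursive: bool = False) -> List[str]:
    """Build a `gsutil cp` command; either path may be local or a gs:// URL."""
    command = ["gsutil", "cp"]
    if recursive:
        command.append("-r")  # copy a whole directory tree
    command += [source, destination]
    return command


if __name__ == "__main__":
    # Hypothetical local file and bucket, for illustration only.
    upload = gsutil_copy_command("samples.tsv", "gs://my-workspace-bucket/data/samples.tsv")
    print(shlex.join(upload))
```

Swapping the two arguments gives the download direction, from the bucket to a local path.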
I can’t view or export data in a Workspace
1. Is the Workspace using Requester Pays?
Some Workspaces have enabled “Requester Pays”, so that the Workspace owner isn’t charged every time someone accesses the data. If this is the case, you will need to specify a billing project to pay for egress charges. Learn more here: Using Requester Pays workspaces/buckets
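When copying from a Requester Pays bucket on the command line, the project to bill is passed with `gsutil`'s top-level `-u` option. The sketch below builds such a command without running it. The helper name, project ID, and bucket path are hypothetical, and it assumes you use the Google Cloud project ID associated with a billing project you are allowed to charge.

```python
import shlex
from typing import List


def requester_pays_copy_command(billing_project: str, source: str, destination: str) -> List[str]:
    """Build a `gsutil cp` command that bills access charges to `billing_project`."""
    if not billing_project:
        raise ValueError("a billing project is required for Requester Pays buckets")
    # `-u` names the project that pays for the request.
    return ["gsutil", "-u", billing_project, "cp", source, destination]


if __name__ == "__main__":
    # Hypothetical project ID and bucket path, for illustration only.
    command = requester_pays_copy_command(
        "my-billing-project", "gs://shared-bucket/data/samples.tsv", "."
    )
    print(shlex.join(command))
```

Raising an error on an empty project is a design choice in this sketch: without a project to bill, the copy would be rejected by the bucket anyway, so failing early gives a clearer message.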