dbGaP access - sequencing data

Hello

We are seeking access to sequencing data from a set of iPSCs provided by California’s Stem Cell Agency (CIRM), which are available through your database.

Recently, we had our request for data access approved by dbGaP’s DAC. Unfortunately, as we were not aware that the data wasn’t managed by dbGaP, we did not request access to use cloud computing.

How do we proceed from here? We have not used AnVIL before, so I don’t know whether we need to revise our data request in order to access the data.

Otherwise, can we download the data directly without the use of cloud computing?

Thank you in advance.

Best wishes,
Anne Kirstine

Hi @AKKB,

Thanks for your question.

If you’d like to use data locally, you will still need to egress (download) data from AnVIL. Whether you choose to (1) analyze data on AnVIL or (2) egress, you’ll need to:

  • Ensure you can log in to AnVIL
  • Link your eRA Commons ID in your AnVIL user settings [instructions here]
  • Ensure you can view the data on AnVIL

Please let us know if you are encountering issues on any of the steps above.

Ava

Hi Ava,

Thank you for your quick response.

We are only interested in looking at a limited set of genes to see whether they harbour disease-associated SNPs. We are trying to select, from a set of previously sequenced iPSCs, the lines best suited to a disease model for AMD (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs002032.v2.p1).

Do you know if we can simply see the data on AnVIL without having requested permission to use cloud computing on our dbGaP access request? Also, do you know if we can see the genotypes on AnVIL without doing any analysis – and without the need for cloud computing?

In the case of requesting the use of cloud computing, we need to fill out the following. Perhaps you could help here as well?

Cloud Use Statement
State the name of the cloud service provider and/or third-party IT system, their security standard, and how they will be used to carry out the work described in your Research Use Statement. Also, if applicable, describe the role of any collaborators. Please limit your statement to 2000 characters.

Thank you very much in advance.

Best wishes,

Anne Kirstine

Hi @AKKB,

It is possible to view/browse data on AnVIL without cloud computing costs, as long as you are logged in and have permission to view the data. If the Workspace containing the data already has genotypes quantified, you could view these as well. However, it depends on what has already been done in the Workspace.

We are locating resources on our end that can be used for the Cloud Use Statement.

Thanks!
Ava

Hi Ava,

Okay, thank you very much!

Can you say anything about the costs of using AnVIL for looking at these genotypes? I know it’s hard to predict, but could you give a general estimate? Also, which platforms/tools would you recommend we use?

For the dbGaP access request, can you briefly specify AnVIL’s security standard (please see below)?

State the name of the cloud service provider and/or third-party IT system, their security standard, and how they will be used to carry out the work described in your Research Use Statement. Also, if applicable, describe the role of any collaborators. Please limit your statement to 2000 characters.

Thanks in advance.

Best wishes,

Anne Kirstine

Hi @AKKB,

The costs will generally depend on the number of samples and what tools you are using (e.g., Jupyter notebooks, Galaxy, Workflows). Interactive sessions are easier to estimate because the tool shows the cost per hour. For example, the base configuration Jupyter notebook is $0.06 per hour. If you plan to run workflows, we recommend starting with a few samples to get a cost estimate.
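As a rough illustration of the hourly estimate above, here is a minimal Python sketch. The only figure taken from this thread is the $0.06 hourly rate for the base Jupyter configuration; the function name and the 20-hour session length are hypothetical, chosen just for the example. It multiplies rate by hours and nothing more, so it ignores any storage, egress, or workflow charges.

```python
from decimal import Decimal


def estimate_interactive_cost(hourly_rate_usd: str, hours: str) -> Decimal:
    """Return hourly rate multiplied by hours, in USD, rounded to cents."""
    cost = Decimal(hourly_rate_usd) * Decimal(hours)
    return cost.quantize(Decimal("0.01"))


# Rate quoted in this thread for the base configuration Jupyter notebook.
BASE_JUPYTER_RATE_USD = "0.06"

# Hypothetical session length, an assumption for illustration only.
example_hours = "20"

print(estimate_interactive_cost(BASE_JUPYTER_RATE_USD, example_hours))
```

Decimal is used instead of float so that small currency amounts are not affected by binary rounding. Replace the rate with whatever the tool displays for your chosen configuration.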

You might be able to find more precise information in the Terra support docs here: https://support.terra.bio/hc/en-us/sections/360006459511-Managing-Cloud-costs

You brought to our attention that our team should provide a general-use Cloud Use Statement for AnVIL. We’re currently working on this. In the meantime, you can find security information here: Platform and Data Security - AnVIL Portal

Thanks!
Ava