AnVIL Office Hours 27JAN2022 @ 11 am ET

The AnVIL Outreach Working Group is hosting virtual AnVIL Office Hours on Thursday, January 27, 2022, from 11:00 to 11:50 am ET. These Office Hours are an opportunity to get your questions about working on AnVIL answered live, whether you’re trying to set up a billing account, launch Galaxy or RStudio, or find methods and featured workspaces. Members of the AnVIL team will be available to help users, including PIs, analysts, and data submitters, get unstuck, troubleshoot issues, and discover online resources that provide further information.

Please post your questions in this thread ahead of the session!

Register here to receive the meeting link: https://forms.gle/9i5XrtycS6Z2bzmf8.

Where should I post reference data so that others can clone my Terra workspace?

I started by cloning a workspace. I ran a WDL task that added new columns to the existing data models. The values are GCP bucket URLs.

I now plan to extend the original workspace by adding a new pipeline. For the new pipeline, I plan to create a second data model to store the results. I will also need to add a lot of reference data. You can think of this reference data as if it were a BAM file. I plan to create a second ‘samples’ data model. I will probably call it ‘reference’ if there are no restrictions on naming. This model will need to contain metadata and URLs to the reference files stored somewhere in the GCP universe. Where should I put these files so that others will be able to clone my workspace?
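
To make the planned ‘reference’ data model concrete, here is a minimal sketch of how such a table could be written as a tab-separated load file. It assumes the usual Terra convention that the first header cell is `entity:<table>_id`; the column names `description` and `file_url`, the row values, and the `gs://example-bucket/...` path are all hypothetical placeholders, not real data.

```python
import csv
import io


def build_reference_tsv(rows, table_name="reference"):
    """Render rows as a tab-separated load file for a data table.

    Assumption: the first header cell follows the entity:<table>_id pattern.
    The other column names are hypothetical and chosen for illustration.
    """
    id_column = f"entity:{table_name}_id"
    columns = [id_column, "description", "file_url"]

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Each row carries an id, free-text metadata, and a bucket URL.
        writer.writerow([row["id"], row["description"], row["file_url"]])
    return buffer.getvalue()


# Hypothetical example row; the bucket path is a placeholder.
rows = [
    {
        "id": "ref_001",
        "description": "hypothetical reference file",
        "file_url": "gs://example-bucket/reference/ref_001.dat",
    },
]

tsv_text = build_reference_tsv(rows)
print(tsv_text)
```

The table itself only stores URLs, so anyone who clones the workspace gets the rows but still needs read access to whichever bucket the files live in.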

Kind regards

Andy

I will be unable to attend. Is it possible to get this question answered via email or on this forum?

Kind regards

Andy

From Terra Support 3:38 pm EST

Hi @aedavids, a few clarification questions:

  • Who needs to have access to the workspace (e.g., the public, or known collaborators)?
  • Are the data controlled access (e.g., do they have a GTEx Authorization Domain)?

Once I complete my research, I would like to publish the results and also make it possible for others to reproduce my work. The original BAM files require AUTH_ANVIL_AnVIL_GTEx_V8_hg38 and GTEx-dbGaP-Authorized.

So one goal might be to “allow someone to re-run everything from soup to nuts”.

As part of my research, I am creating a large reference data set. One goal might be to “publish the new reference data”. It does not contain any personally identifiable data. The original sample participant IDs are not retained. Maybe this is published in a new workspace with a reference back to the “soup to nuts” workspace? This workspace might only have the data models.

Kind regards

Andy

From Terra Support 1:35 pm EST