Q&A
Q: Is Terra public? It’d be great for the workspace to be explorable by folks following along with the video.
A: The workflow that creates the staging workspace is public. It uses an AnVIL flag, but it is generic enough to be usable outside of AnVIL.
Making the workspace publicly discoverable in Terra would be great, unless it involves controlled-access data, in which case we wouldn’t do that.
The intention is to make a featured workspace that would be essentially the same as this workspace.
–
Q: So uploading data from other workspaces managed by our institution in Terra is possible, for example from our workspace bucket, correct?
A: Yes. In general, AnVIL covers storage costs for data submitters: it makes a copy of the data in AnVIL-funded data buckets to manage the storage costs. This process is set up to reference files that exist anywhere in Google Cloud Storage.
We also see that people want to do version increments, which can be supported. The process can reference any files in Google Cloud Storage that it has access to.
Q: Interested in interoperability among different platforms on Terra: AnVIL, All of Us, and institution-supported workspaces. Not sure exactly whether the institution has a separately layered version of AnVIL. Is it possible to transfer a dataset currently in their workspace within the institution-supported environment? (Answer: yes.) What about the datasets from All of Us?
A: Data submission means constructing a dataset that will be shared, and that dataset can be composed of a number of other datasets. For example, you can build a dataset in AnVIL that references files in the 1000 Genomes Project by pointing to those files rather than creating another copy. TDR is designed to support this.
From the standpoint of a researcher who wants to analyze data from both AnVIL and All of Us: All of Us datasets cannot leave the All of Us environment (there are security restrictions on this). AnVIL datasets can be downloaded out of AnVIL. For data from other platforms, however, you can pull it into a single workspace in Terra and do the analysis there, or you can download the dataset and host it in the workspace. Tabular data can be referenced from other platforms.
Q: On uploading and ingesting into TDR: you alluded that AnVIL will cover storage costs for AnVIL-hosted data, though TDR can index things that are not AnVIL-hosted. Is there an approval process for what can be indexed into TDR, or is this open to anyone?
A: AnVIL still leverages TDR’s file reference capabilities. The files were not copied from the staging workspace into TDR; they are still referenced in the cloud storage bucket. TDR can reference any files in GCS as long as it has access. Users can use TDR to reference and index files in GCS buckets without any Terra/AnVIL involvement. For AnVIL, referencing files that live outside of AnVIL-hosted storage would be supported. When users reference those files, they get the URL to them. Open indexing is available.
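To illustrate the "reference, don't copy" idea in the answer above, here is a minimal Python sketch. It is not the TDR API. The FileReference class, the helper functions, and the bucket names are all hypothetical and exist only for this example. It shows a manifest that records where files live in GCS (bucket plus object path) without moving any bytes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FileReference:
    """Hypothetical record pointing at an object in GCS (no bytes are copied)."""
    bucket: str
    object_path: str

    @property
    def uri(self) -> str:
        # Rebuild the gs:// URL a user would receive for this file.
        return f"gs://{self.bucket}/{self.object_path}"


def parse_gcs_uri(uri: str) -> FileReference:
    """Split a gs:// URI into bucket and object path, rejecting anything else."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {uri!r}")
    bucket, _, object_path = uri[len(prefix):].partition("/")
    if not bucket or not object_path:
        raise ValueError(f"GCS URI needs a bucket and an object path: {uri!r}")
    return FileReference(bucket=bucket, object_path=object_path)


def build_reference_manifest(uris: Iterable[str]) -> List[FileReference]:
    """Build a list of references; the files stay in their original buckets."""
    return [parse_gcs_uri(u) for u in uris]


# Example with made-up bucket names: one file in a staging bucket,
# one in a bucket hosted outside the platform.
manifest = build_reference_manifest([
    "gs://example-staging-bucket/sample1/reads.cram",
    "gs://example-external-bucket/cohort/variants.vcf.gz",
])
for ref in manifest:
    print(ref.bucket, ref.object_path, ref.uri)
```

Whether a real service can actually read the referenced objects still depends on it having access to those buckets, as noted in the answer.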
–
Q: Two questions about TDR Documentation: 1) Is TDR ready to be advertised at a place like Submitting Data - AnVIL Portal? 2) Is https://support.terra.bio/hc/en-us/sections/4407099323675 the best place to point people to about TDR? From an analyst standpoint, https://support.terra.bio/hc/en-us/articles/10092372304155 looks very handy but a little buried.
A: TDR is not an AnVIL-specific product, nor is it gated in any way. It’s more of a data custodian-type tool, and is used extensively for AnVIL and other applications. The intent is for it to be more user-friendly, but it’s not there yet.
Q: Is this Terra support document the best place to point more data submitters to?
Q: From a data analyst standpoint, to get data out of TDR, is the Data Explorer the best way to interface with TDR?
A: The Data Explorer is meant to be the primary one; all the data are available in DUOS as well. This allows people to take the dataset and move it into a Terra workspace. Once you have access to the snapshot, you can export from TDR as well. But the primary front door is the Data Explorer.
–
Q: About accessing All of Us data from AnVIL, or an AnVIL dataset in All of Us: I have a pet account I pass to my All of Us workspace that is available in my AnVIL workspace.
A: We would recommend reaching out to the All of Us data help desk. All of your pet service accounts should have access to AnVIL data from All of Us as well.
Q: This is fairly different in my experience. The AnVIL team should be aware of this concern from the researcher community. All of Us has a very strict policy to not allow export. If Terra is secured, why not? There is confusion because these platforms look similar. The team should be aware of the lack of interoperability between AnVIL and All of Us.
A: We often get this question of how to combine AnVIL and All of Us datasets. We received this guidance, but from your experience it does not seem to work.
Q: It seems like AnVIL data should easily be able to be brought into All of Us.
A: This specific request would be excellent to raise to the All of Us help desk, and we’d love to work with you to help make this path clearer for all researchers.