AnVIL Demos: Learn how to use data across multiple AnVIL workspaces on June 22

Upcoming AnVIL Demos

:pencil2: Sign up: https://forms.gle/7CcaLE9AM7FrYqpP7

What are AnVIL Demos?

AnVIL Demos is a monthly virtual meeting where we highlight what you can do on the NHGRI Analysis, Visualization, and Informatics Lab-space (AnVIL), a cloud-based computing platform for genomic data science! Each session starts with a 30-minute demonstration on the platform, followed by open time for Q&A and user support.

The demos cover a range of topics, from a capability of the platform to a scientific analysis powered by AnVIL. If you’re interested in showcasing how you use AnVIL at a future AnVIL Demos session, reach out to Natalie Kucher (nkucher3@jhu.edu). After each demo, we’ll open the floor to questions about the demo and about AnVIL in general.

Watch past demos on our YouTube playlist!

June Demo: How to use data across multiple AnVIL workspaces

When: June 22, 2023, at 11:00 AM ET on Zoom

11:00 AM - 11:30 AM ET – Demo on AnVIL

In this demo, Frederick Tan will highlight a key benefit of using a cloud-based platform for data storage and compute: you can access and analyze the data right in the cloud, without having to move it around or download it! He will demonstrate how to bring together data that are stored in different workspaces to analyze them together.

11:30 AM - 12:00 PM ET – Q&A

We’ll open up the floor to questions about the demo presented, and will have AnVIL and Terra support on call to answer any questions about AnVIL you might have!

:pencil2: Sign up for more demos: https://forms.gle/7CcaLE9AM7FrYqpP7

Future Demos

July 20, 2023 – Interactive Genomic Data Science with Bioconductor

August 24, 2023 – Galaxy on AnVIL

September 21, 2023 – Epigenetics in AnVIL

Resources

Upcoming Events

Sign up to hear about future AnVIL Demos and announcements at lists.anvilproject.org and learn about upcoming events at AnVIL Community Events!

Q: Regarding WGS and pipelines available, how can I find which datasets have variant calls?

A: Not every dataset includes variant calls: some projects have produced them, while others have not. Where they exist, they appear in the workspace data tables as a row containing the variant call file. CCDG is an example of a controlled-access dataset that does have SNP and indel calls, but whether it has SNV calls is uncertain. Making datasets easier to discover is an ongoing effort. The QC pipeline presented is available for exome or genome data.
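As a rough illustration of checking a data table for variant call files, here is a minimal Python sketch. It assumes the workspace data table has been exported as a tab-separated file whose first column holds the row ID. The function name `find_variant_call_files`, the list of file suffixes, and the example table contents are all hypothetical and are not part of AnVIL or Terra.

```python
import csv
import io

# Assumed file suffixes that suggest a variant call file (not an official list).
VARIANT_SUFFIXES = (".vcf", ".vcf.gz", ".vcf.bgz", ".bcf")


def find_variant_call_files(tsv_text):
    """Return (row_id, column, path) for each cell that looks like a variant call file."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if not reader.fieldnames:
        return []  # empty export: nothing to scan
    id_column = reader.fieldnames[0]  # assumption: first column is the row ID
    hits = []
    for row in reader:
        for column, value in row.items():
            # Skip the ID column, empty cells, and malformed extra fields.
            if column == id_column or not isinstance(value, str) or not value:
                continue
            if value.lower().endswith(VARIANT_SUFFIXES):
                hits.append((row[id_column], column, value))
    return hits


# Hypothetical table contents, for illustration only.
EXAMPLE_TSV = (
    "entity:sample_id\tcram\tvcf\n"
    "sample_1\tgs://example-bucket/sample_1.cram\tgs://example-bucket/sample_1.vcf.gz\n"
    "sample_2\tgs://example-bucket/sample_2.cram\t\n"
)

for row_id, column, path in find_variant_call_files(EXAMPLE_TSV):
    print(f"{row_id}: {column} -> {path}")
```

A table with no matching cells yields an empty list. That result suggests the dataset does not list variant calls in that table, but it does not prove that none exist elsewhere in the workspace.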

Q: What kind of workflows are most popular for people who are just getting started?

A: Presentations that showcase workflows usually highlight the GATK pipelines and tutorials.

Q: Are All of Us data in AnVIL too?

A: All of Us policies do not allow data to leave the All of Us platform, even though those datasets are hosted in the cloud. Datasets that are available across platforms are those in the NIH Cloud Platform Interoperability Effort (NCPI), which includes the NHGRI AnVIL, Gabriella Miller Kids First, NCI Cancer Research Data Commons, NHLBI BioData Catalyst, and the National Center for Biotechnology Information (NCBI). Find more of those datasets here: NIH Cloud Platform Interoperability Effort.