AnVIL Demo: Getting Started with your Analyses and Running FastQC with Galaxy in AnVIL on January 15, 2025

January Demo: Getting Started with your Analyses and Running FastQC with Galaxy in AnVIL

January 15, 2025 at 10:00 AM ET

11:00 AM - 11:30 AM ET – Demo on AnVIL

In this demo, Javier Carpinteyro-Ponce will demonstrate “Getting Started with your Analyses in AnVIL”, and Kate Isaac will demonstrate “Running FastQC with Galaxy in AnVIL.”

11:30 AM - 12:00 PM ET – Q&A

We’ll open up the floor to answer questions about the demo or any general questions about AnVIL you might have!

:pencil2: Sign up: https://forms.gle/7CcaLE9AM7FrYqpP7

What are AnVIL Demos?

AnVIL Demos is a monthly virtual meeting where we highlight what you can do on the NHGRI Analysis, Visualization, and Informatics Lab-space (AnVIL), a cloud-based computing platform for genomic data science! Each session starts with a 30-minute demonstration on the platform, followed by open time for Q&A and user support.

The demos will highlight a range of topics, from a capability of the platform to a scientific analysis powered by AnVIL. If you’re interested in showcasing how you use AnVIL at a future AnVIL Demos session, reach out to Natalie Kucher (nkucher3@jhu.edu). After the demo, we’ll open up the floor for questions about the demo and about AnVIL in general.

Watch our past Demos on our YouTube playlist!

:pencil2: Sign up for more demos: https://forms.gle/7CcaLE9AM7FrYqpP7

Resources

Upcoming Events

Sign up to hear about future AnVIL Demos and announcements at bit.ly/anvil-mailing-list and learn about upcoming events at https://anvilproject.org/events!

  • Q: I would like to build a custom Docker image for Jupyter notebook environments. How should I approach this task?
  • A: Terra provides base images that you can extend to fit your specific use case. You can learn more in the DataBiosphere/terra-docker repository on GitHub.
  • Q: When developing a WDL workflow, should I use one Docker image containing all dependencies for every task in the pipeline?
  • A: Sometimes it’s easier to focus on one step at a time, confirming that the outputs of each step in the pipeline are generated and stored as expected. Using a separate Docker image for each step can simplify development, testing, and debugging.
  • Q: Sometimes I want to run all samples in a data table through a WDL workflow. When selecting workflow inputs from the workspace data table in the UI, I can only select one “page” worth of records at a time. This seems tedious. Is it possibly a bug?
  • A: Great question! You can select all samples from the workspace data table at once when launching a workflow from the Workflows tab. In the pop-up form for selecting data, the table’s header has a checkbox. Selecting the down-arrow there opens a menu that lets you select ‘ALL’ records from the table.
  • Q: I would like to upload a large dataset from an HPC system to AnVIL. Will this be covered today? Any suggestions on how to get started?
  • A: This is a great idea for a future demo! In the meantime, here are some useful resources to get started: https://support.terra.bio/hc/en-us/articles/4409101169051-How-to-move-data-to-from-a-Google-bucket
  • Q: I’m comfortable with Galaxy, but some tools are made available through WDL. How can I choose between Galaxy and WDL?
  • A: It really depends! Galaxy is great if you don’t want to mess with code. However, if you (1) are using a less common tool, (2) need to do a lot of customizing, or (3) have a ton of files, you might want to learn WDL.
  • Q: How can I get started learning WDL?
  • A: The openwdl/learn-wdl repository on GitHub (educational materials for learning WDL) is a great place to jump in.
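
For the question above about moving a large dataset from an HPC system into AnVIL, here is a minimal sketch of one possible approach. It is not an official AnVIL procedure. It assumes that `gsutil` is installed and authenticated on the HPC login node and that you have write access to the workspace bucket. The bucket name, local path, and the helper names `build_upload_command` and `upload` are placeholders made up for illustration. The script only prints the command unless you turn off the dry run.

```python
import shlex
import subprocess


def build_upload_command(local_dir: str, bucket: str, prefix: str = "") -> list[str]:
    """Return the gsutil argument list for a parallel, recursive copy."""
    # Normalise the destination so it always looks like gs://bucket[/prefix].
    bucket_name = bucket.removeprefix("gs://").strip("/")
    destination = f"gs://{bucket_name}"
    if prefix.strip("/"):
        destination += "/" + prefix.strip("/")
    # "-m" asks gsutil to run transfers in parallel; "cp -r" copies a directory tree.
    return ["gsutil", "-m", "cp", "-r", local_dir, destination]


def upload(local_dir: str, bucket: str, prefix: str = "", dry_run: bool = True) -> str:
    """Print the command and run it only when dry_run is False."""
    command = build_upload_command(local_dir, bucket, prefix)
    printable = shlex.join(command)
    print(printable)
    if not dry_run:
        # check=True raises CalledProcessError if gsutil exits with an error.
        subprocess.run(command, check=True)
    return printable


if __name__ == "__main__":
    # Placeholder values: replace with your own directory and workspace bucket.
    upload("/scratch/my_project/fastq", "gs://fc-your-workspace-bucket", "raw/fastq")
```

Running the script as written performs a dry run, so you can check the source and destination before transferring anything. Pass `dry_run=False` to start the actual copy.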