AnVIL Demo: Getting Started with your Analyses and Running FastQC with Galaxy in AnVIL on January 15, 2025

January Demo: Getting Started with your Analyses and Running FastQC with Galaxy in AnVIL

January 15, 2025 at 10:00 AM ET

11:00 AM - 11:30 AM ET – Demo on AnVIL

In this demo, Javier Carpinteyro-Ponce will demonstrate “Getting Started with your Analyses in AnVIL”, and Kate Isaac will demonstrate “Running FastQC with Galaxy in AnVIL.”

11:30 AM - 12:00 PM ET – Q&A

We’ll open up the floor to answer questions about the demo or any general questions about AnVIL you might have!

:pencil: Sign up: https://forms.gle/7CcaLE9AM7FrYqpP7

What are AnVIL Demos?

AnVIL Demos is a monthly virtual meeting where we highlight what you can do on the NHGRI Analysis, Visualization, and Informatics Lab-space (AnVIL), a cloud-based computing platform for genomic data science! Each session starts with a 30-minute demonstration on the platform, followed by open time for Q&A and user support.

The demos highlight a range of topics, from a single capability of the platform to a full scientific analysis powered by AnVIL. If you’re interested in showcasing how you use AnVIL at a future AnVIL Demos session, reach out to Natalie Kucher (nkucher3@jhu.edu). After the demo, we’ll open up the floor for questions about the demo and any general questions you might have about AnVIL.

:play_or_pause_button: Watch our past Demos on our YouTube playlist!

:pencil: Sign up for more demos: https://forms.gle/7CcaLE9AM7FrYqpP7

Resources

Upcoming Events

Sign up to hear about future AnVIL Demos and announcements at bit.ly/anvil-mailing-list and learn about upcoming events at https://anvilproject.org/events!

  • Q: I would like to build a custom Docker image for Jupyter notebook environments. How should I approach this task?
  • A: Terra provides base images that can be extended to fit your specific use case. You can learn more in the DataBiosphere/terra-docker repository on GitHub.
  • Q: When developing a WDL workflow, should I use one Docker image with all associated dependencies for all tasks in the pipeline?
  • A: It is often easier to focus on one step at a time, ensuring that the outputs of each step in the pipeline are generated and stored as expected. Using a dedicated Docker image for each step can simplify development, testing, and debugging.
  • Q: Sometimes I want to run all samples in a data table through a WDL workflow. When selecting workflow inputs from the workspace data table in the UI, I can only select one “page” worth of records at a time. This seems tedious. Is this possibly a bug?
  • A: Great question! All samples from the workspace data table can be selected at once when launching a workflow from the Workflows tab. In the pop-up form for selecting data, there is a checkbox in the table’s header. Selecting the down-arrow next to it opens a menu that lets you select ‘ALL’ records from the table.
  • Q: I would like to upload a large dataset from an HPC to AnVIL. Will this be covered today? Any suggestions on how to get started?
  • A: This is a great idea for a future demo! In the meantime, here are some useful resources to get started: https://support.terra.bio/hc/en-us/articles/4409101169051-How-to-move-data-to-from-a-Google-bucket
  • Q: I’m comfortable with Galaxy, but some tools are made available through WDL. How can I choose between Galaxy and WDL?
  • A: It really depends! Galaxy is great if you don’t want to mess with code. However, if (1) you are using a tool that is less common, (2) you need to do a lot of customizing, or (3) you have a ton of files, you might want to learn WDL.
  • Q: How can I get started learning WDL?
  • A: The openwdl/learn-wdl repository on GitHub (educational materials for learning WDL) is a great place to jump in.
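
As a rough illustration for the question above about moving a large dataset from an HPC to AnVIL, here is a minimal Python sketch that only assembles a `gsutil` copy command for a workspace bucket. It is not an official AnVIL or Terra recipe: the helper name `build_upload_command`, the local path, and the bucket name are all hypothetical, and it assumes `gsutil` is installed and authenticated on the HPC system. The linked Terra article remains the reference for the actual procedure.

```python
import shlex
from typing import List


def build_upload_command(local_dir: str, bucket: str, prefix: str = "") -> List[str]:
    """Assemble a gsutil command that copies a local directory to a bucket.

    Hypothetical helper for illustration only.
    -m     lets gsutil copy files in parallel, which helps with many files
    cp -r  copies the directory recursively
    """
    # Accept the bucket with or without the gs:// scheme.
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    destination = "gs://" + bucket.strip("/")
    if prefix:
        destination += "/" + prefix.strip("/")
    return ["gsutil", "-m", "cp", "-r", local_dir, destination]


if __name__ == "__main__":
    # Placeholder values: replace with your own directory and workspace bucket.
    command = build_upload_command(
        "/scratch/my_project/fastq", "my-workspace-bucket", "uploads"
    )
    # Print the command for review instead of running it.
    print(shlex.join(command))
    # To actually run it (assumption: gsutil is on PATH and authenticated):
    # import subprocess; subprocess.run(command, check=True)
```

Printing the command first makes it easy to check the destination path before starting a long transfer.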