Using NIH Controlled-access Data on AnVIL Under the Updated NIH GDS Policy

This post is based on common questions we receive at help@lists.anvilproject.org. It is shared here so the wider community can benefit. We’ll update this post as new guidance or resources become available.

Last updated August 19, 2025

Under the updated NIH Genomic Data Sharing (GDS) Policy, which took effect on January 25, 2025, researchers creating new Data Use Agreements (DUAs) or renewing existing ones must ensure that NIH controlled-access data are stored and analyzed on systems that comply with NIST SP 800-171 (or an equivalent standard).

AnVIL, powered by Terra, already meets these security requirements. Below are some frequently asked questions from researchers considering AnVIL as their analysis environment:

Is every environment in AnVIL NIST SP 800-171 compliant?
Yes. AnVIL’s cloud environment is designed to meet NIST SP 800-171 (or equivalent) standards.
More information: AnVIL Platform and Data Security.

If not every environment in AnVIL is NIST SP 800-171 compliant, how do I create one that is?
All environments launched within the AnVIL/Terra boundary are already compliant; no additional setup is required.

What should I do if the dbGaP study I need is not available in the AnVIL Data Explorer or DUOS?
Datasets listed at explore.anvilproject.org and duos.org/datalibrary/AnVIL are those hosted directly by AnVIL. If a dataset you need is not available within AnVIL, you can still “bring your own data” into AnVIL. This includes both data you generated yourself and dbGaP-hosted studies.
Learn more: Can users upload non-AnVIL data to an AnVIL Workspace?

Does AnVIL or Terra support workflow managers such as Snakemake or Nextflow?
Today the answer is no. AnVIL’s workflow engine is based on the Workflow Description Language (WDL) and uses Google Cloud Batch for execution. Workflows written in Snakemake or Nextflow must either be converted to WDL for use in AnVIL or run outside the platform.
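To give a rough sense of what converting to WDL involves, the sketch below renders a minimal WDL 1.0 task that wraps a single shell command, roughly the counterpart of one Snakemake rule or one Nextflow process. It is an illustration only, not an AnVIL-provided converter: the helper render_wdl_task, the task name count_lines, the input name reads, and the ubuntu:22.04 image are all hypothetical choices. A real conversion would also need a workflow block that calls the task, plus runtime settings suited to your analysis.

```python
def render_wdl_task(name, input_name, shell_command, output_file, docker_image):
    """Render a minimal WDL 1.0 task that wraps one shell command.

    Hypothetical helper for illustration; all argument values are assumptions.
    """
    lines = [
        "version 1.0",
        "",
        f"task {name} {{",
        "  input {",
        f"    File {input_name}",
        "  }",
        "  command <<<",
        f"    {shell_command}",
        "  >>>",
        "  output {",
        f'    File result = "{output_file}"',
        "  }",
        "  runtime {",
        f'    docker: "{docker_image}"',
        "  }",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Example: a rule that counts lines in one input file.
# Inside a WDL heredoc command, ~{reads} is replaced by the localized input path.
wdl_text = render_wdl_task(
    name="count_lines",
    input_name="reads",
    shell_command="wc -l ~{reads} > line_count.txt",
    output_file="line_count.txt",
    docker_image="ubuntu:22.04",
)
print(wdl_text)
```

The mapping is mostly mechanical for simple steps: the rule's input becomes a File input, its shell block becomes the command section, and its output file becomes a declared output. Steps that rely on wildcards, channels, or Conda directives generally need more manual rework.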

Does AnVIL or Terra support the Conda or Mamba package management systems?
Conda and compatible managers such as Mamba or Micromamba are not installed by default in Terra’s standard Cloud Environment images. However, you can install Conda with the standard Miniconda installer in interactive environments such as Jupyter notebook and RStudio sessions (see Managing Conda on AnVIL).
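As a minimal sketch of that installation, the Python below could be run from a notebook cell. It assumes a Linux x86_64 environment with curl and bash available. The install prefix under your home directory and the temporary installer path are assumptions, as are the helper names miniconda_install_commands and install_miniconda. By default it only prints the commands (a dry run); the Managing Conda on AnVIL guide remains the authoritative procedure.

```python
import shlex
import subprocess
from pathlib import Path

# Standard Miniconda installer for Linux x86_64, published by Anaconda.
INSTALLER_URL = "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh"


def miniconda_install_commands(prefix, installer="/tmp/miniconda.sh"):
    """Return the two commands: download the installer, then run it.

    -b runs the installer in batch (non-interactive) mode; -p sets the prefix.
    """
    return [
        ["curl", "-fsSL", "-o", str(installer), INSTALLER_URL],
        ["bash", str(installer), "-b", "-p", str(prefix)],
    ]


def install_miniconda(prefix, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = miniconda_install_commands(prefix)
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands


# Dry run: shows what would be executed without downloading anything.
# The prefix below is an assumed location, not an AnVIL requirement.
planned = install_miniconda(Path.home() / "miniconda3")
```

Calling install_miniconda(Path.home() / "miniconda3", dry_run=False) would perform the installation. Afterwards, the conda executable would be at bin/conda under the chosen prefix.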

Are there restrictions on pulling de-identified summarized data to a non-compliant environment?
The AnVIL team is working with NIH to obtain additional guidance in this area. While summarized data (such as gene count matrices) may appear de-identified, whether they can be exported outside of a compliant environment depends on the specific terms of your DUA and on NIH policy. At present, we recommend confirming with NIH before moving any derived results out of AnVIL.