Push docker containers on google container registry or artifact registry

truckload · November 22, 2024, 11:57pm

Hi.
I am trying to use google container registry to store an image of software needed for my analysis. I will use the image to configure the application environment using the option “Container image”.
I had followed the tutorial on publishing a Docker container image to Google Container Registry (GCR). However, with the transition from GCR to Google Artifact Registry, I can no longer find instructions for using GCR.

When I try to use Artifact Registry, the image path has a format like [us-location]-docker.pkg.dev/[project_id]/[imageid:tag], which is not accepted by the Terra environment configuration option “Container image”.
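
For reference, a minimal sketch of how an Artifact Registry Docker image path is usually laid out, assuming the documented form LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG. This form has a repository segment between the project and the image, which the format quoted above does not show. The function name and all sample values are hypothetical, and nothing here confirms which paths Terra accepts.

```python
def gar_image_path(location, project, repository, image, tag="latest"):
    """Build an Artifact Registry Docker image path from its parts.

    Assumes the layout LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG.
    """
    fields = [
        ("location", location),
        ("project", project),
        ("repository", repository),
        ("image", image),
        ("tag", tag),
    ]
    for name, value in fields:
        if not value:
            raise ValueError(f"{name} must not be empty")
    return f"{location}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(gar_image_path("us-central1", "my-project", "my-repo", "my-image", "0.1"))
```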

I would appreciate some instructions for using Artifact Registry, or for setting up GCR so that I can publish the image there. Would Terra accept an image from Docker Hub? Or should I just use the startup script to install the software whenever I start the VM?
Thanks.

truckload · November 23, 2024, 4:28pm

In addition to my previous question, I would like to know:
I have read that a custom Docker image can be used either in an interactive notebook or in a workflow.
To use it in an interactive notebook, the image has to extend the base image (ref: https://support.terra.bio/hc/en-us/articles/360037143432-Docker-tutorial-Custom-cloud-environments-for-Jupyter-Notebooks).
To use it in a workflow, does the custom image have to meet certain requirements (for example, extending the base image) or not? Should I keep it lightweight? (The Terra base image is quite large…)

Javier-CP · December 2, 2024, 7:24pm

Hi @truckload,

We unfortunately do not have specific documentation addressing the GCR/GAR transition.

If you are comfortable building Docker images, one option is to pull from Docker Hub, since WDL can use [almost] any Docker image.

Also, many new AnVIL users do some interactive development using Cloud Environments. One reason for doing this is to make sure people have access to the files and software they need.

Javier

camancuso · March 14, 2025, 9:33pm

@Javier-CP I was wondering if there was any update on Terra being able to find images in Artifact Registry. Is there a place in Google Cloud where we can put container images so that Terra will find them? I have some on Docker Hub, but it takes very long to pull from there (>15) and I’ll be using AnVIL for workshop modules. I imagine having the images on a Google system would be faster. Thanks!

Javier-CP · March 17, 2025, 8:06pm

Hi @camancuso,

If your Docker Hub image is public, you can try Google’s official mirror: Pull cached Docker Hub images | Artifact Registry documentation | Google Cloud
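
A minimal sketch of how a public Docker Hub image name might map onto the mirror, assuming the mirror host is mirror.gcr.io and that official images sit under the library/ namespace as they do on Docker Hub. The helper name and sample image names are hypothetical, and the mirror only serves images it has cached, so a pull can still fall back to Docker Hub.

```python
MIRROR_HOST = "mirror.gcr.io"  # assumption: host of Google's Docker Hub mirror


def mirror_reference(docker_hub_image):
    """Rewrite a Docker Hub image name to the equivalent mirror reference."""
    name = docker_hub_image
    # Strip an explicit Docker Hub host if one was given.
    for prefix in ("docker.io/", "index.docker.io/"):
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    # Official images such as "ubuntu" live under the "library/" namespace.
    if "/" not in name:
        name = "library/" + name
    return f"{MIRROR_HOST}/{name}"


if __name__ == "__main__":
    # Hypothetical image names, for illustration only.
    for image in ("ubuntu:22.04", "myuser/myimage:1.0"):
        print(image, "->", mirror_reference(image))
```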

Pull times might be faster this way.

camancuso · March 18, 2025, 1:38pm

Thanks @Javier-CP! I will try to learn about this in the next week or so and see how it works.

Javier-CP · May 12, 2025, 6:30pm

@camancuso have you been able to try Google’s official mirror?

Another alternative is to pay a fee to use containers on GAR:

camancuso · May 13, 2025, 11:56am

I didn’t try the google cache yet. The problem with that is if the images end up being cleared from cache, then they take too long to be reloaded to be useful for the workshop.

In my first post you can see that I tried using GAR. Do you have a working example of using a public GAR image on Terra/AnVIL?

truckload · May 13, 2025, 12:54pm

Just to chime in with some user experience here. Do you know why it took so long to pull the image from Docker Hub? How large is your Docker image? Sometimes you can reduce the size of the image when you build it.

To use it on GCR, if you have access to Docker Hub, you can 1) pull the image from Docker Hub to your local computer, 2) tag it with a GCR address (something like gcr.io/…), and 3) push the tagged image to GCR. It is quite easy to move it to GCR this way. I have not used Artifact Registry.
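
The three steps above can be sketched as a small helper that builds the corresponding docker CLI commands (docker pull, docker tag, docker push). This is only an illustration: the function name, the source image, and the target address are hypothetical, and the sketch prints the commands rather than running them.

```python
import shlex


def retag_commands(source_image, target_image):
    """Return the docker commands for pull, tag, and push as argv lists."""
    return [
        ["docker", "pull", source_image],               # 1) pull from Docker Hub
        ["docker", "tag", source_image, target_image],  # 2) tag with the registry address
        ["docker", "push", target_image],               # 3) push the tagged image
    ]


if __name__ == "__main__":
    # Hypothetical image names, for illustration only.
    commands = retag_commands("myuser/myimage:1.0", "gcr.io/my-project/myimage:1.0")
    for argv in commands:
        print(shlex.join(argv))
```

To actually run the steps, each list could be passed to subprocess.run(argv, check=True) on a machine where Docker is installed and already authenticated to the target registry (for example via gcloud auth configure-docker). The same three commands should apply to any registry address you are allowed to push to.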

Even if you use GCR, I am afraid that a very large image would probably run into the same pull-time issue.

camancuso · May 13, 2025, 1:06pm

@truckload thanks for that information. The images are huge, but they are the base Terra images and I don’t know enough about Docker to build my own that would work on Terra. I have a workshop to run, so I was going to add a layer with the needed extra packages to these base images, which means they end up being basically the same size as the base images. The omics packages used in the workshop are ever changing, so startup scripts that have to run as the environments are spun up have some drawbacks.

I was under the impression that GCR was deprecated and that GAR is now the active way of adding images to GCP. I’m using a new account for this, and GCP did not seem to let me push anything to GCR because I don’t have an active GCR resource; when I try to create one, it automatically redirects me to GAR. Was your GCR set up in the last few months, since it became deprecated? For instance, all the Terra base images live on GCR, so things can be hosted there; I just don’t know whether new users can put things up there. But I have very little experience with this.

truckload · May 13, 2025, 1:29pm

To my understanding, you would need the base image to replicate the Terra computing environment. However, just to run an analysis, you don’t need to create a new Docker image with the Terra base plus your “personal” software packages.

For analysis, you would need to do one of the following:

1) Run in interactive mode in RStudio or a Jupyter notebook. To use R or Python packages that do not come with the Terra base image, just install them on your Terra VM “on the fly”. As long as you don’t delete the persistent disk (your VM files are usually stored there), those packages will not be lost, so you don’t need to reinstall them the next time you resume your environment. Even if you did lose them, reinstalling would not be a pain.
2) Run a WDL workflow that pulls a Docker image (the image only needs to contain the packages your script requires).
3) Run dsub using a Docker image (again, the image only contains your own packages).
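
As a minimal sketch of the “on the fly” option for Python in a notebook: check which modules are missing and build a pip command for the running interpreter. The helper names and package names are hypothetical. Note that an import name can differ from the name used to install it, and whether installed packages survive a restart depends on where they land on the VM.

```python
import importlib.util
import sys


def missing_packages(module_names):
    """Return the top-level modules from module_names that cannot be found."""
    return [m for m in module_names if importlib.util.find_spec(m) is None]


def pip_install_command(packages):
    """Build a pip install command for the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    # Hypothetical module names, for illustration only.
    needed = missing_packages(["json", "some_omics_package"])
    if needed:
        print("Would run:", " ".join(pip_install_command(needed)))
```

To perform the install, the command list could be passed to subprocess.check_call(...) from a notebook cell.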

I did something similar before. I wanted to keep my personal packages bundled with a Terra base image in a single Docker image, so that I would not need to install them whenever I started a new VM in Terra. It turned out to be overkill. The Terra base image I built from the GitHub source was very large, and it never deployed properly when I tried to start a Terra computing environment with it.

camancuso · May 14, 2025, 2:44am

Thanks for that info! I’ll see if I can apply it to my situation setting up different environments for an education workshop.
