google cloud platform - Make gemini-1.5-flash-002 accessible for my GCloud Run project - Stack Overflow

Date: 2025-04-26

I am trying a basic script to summarize text:

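import vertexai
from vertexai.generative_models import GenerativeModel

def generate(self, text_to_summarize):
    # Point the SDK at the project and region to use for Vertex AI calls.
    vertexai.init(project="<PROJECT_ID>", location="MY_REGION")
    model = GenerativeModel(
        "gemini-1.5-flash-002",
        system_instruction=[my_prompt],  # my_prompt is defined elsewhere
    )
    # Stream the summary so partial output is printed as it arrives.
    responses = model.generate_content(
        [text_to_summarize],
        stream=True,
    )

    for response in responses:
        print(response.text, end="")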

This works as intended locally, using "gemini-1.5-flash-002".

To run it in Cloud Run, I built the script into a Docker container and deployed it to Cloud Run.

Calling the endpoint then fails with this error:

"PermissionDenied(\"Permission 'aiplatform.endpoints.predict' denied on resource '//aiplatform.googleapis.com/projects/<PROJECT-ID>/locations/<REGION>/publishers/google/models/gemini-1.5-flash-002' (or it may not exist).\")"

I have double-checked permissions with gcloud projects get-iam-policy <PROJECT-ID> and see:

bindings:
- members:
  - serviceAccount:service-<CODE>@gcp-sa-vertex-op.iam.gserviceaccount.com
  role: roles/aiplatform.onlinePredictionServiceAgent
- members:
  - serviceAccount:service-<CODE>@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:<CODE>[email protected]
  - user:<MY-EMAIL>
  role: roles/aiplatform.user
...
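
The same check can be scripted rather than read by eye. This is only a minimal sketch: it assumes the policy was exported as JSON (for example with gcloud projects get-iam-policy <PROJECT-ID> --format=json), the function name members_with_role is made up for illustration, and the sample policy uses placeholder members rather than my real ones. It only looks at project-level bindings and ignores IAM conditions.

```python
import json


def members_with_role(policy: dict, role: str) -> list:
    """Return every member bound to `role` in a project-level IAM policy."""
    members = []
    for binding in policy.get("bindings", []):
        if binding.get("role") == role:
            members.extend(binding.get("members", []))
    return members


# Hypothetical policy shaped like the gcloud JSON output; all members are placeholders.
sample_policy = json.loads("""
{
  "bindings": [
    {
      "role": "roles/aiplatform.serviceAgent",
      "members": ["serviceAccount:agent@example.iam.gserviceaccount.com"]
    },
    {
      "role": "roles/aiplatform.user",
      "members": [
        "serviceAccount:runtime@example.iam.gserviceaccount.com",
        "user:me@example.com"
      ]
    }
  ]
}
""")

aiplatform_users = members_with_role(sample_policy, "roles/aiplatform.user")
print(aiplatform_users)
```

With a real export, the policy would be loaded with json.load from the saved file instead of the inline sample.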

I checked the documentation, and aiplatform.endpoints.predict is included in roles/aiplatform.user, so the permission should be granted.

This has led me to conclude that the model does not exist. I thought Cloud Run would automatically use the Gemini Flash model, as it does locally. I have run
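
The binding only matters if the container actually runs as one of the listed members, which I have not verified. A minimal sketch of how that could be checked from inside the container, assuming the standard Google Cloud metadata server is reachable there; the helper names build_metadata_request and runtime_service_account are my own, and the function returns None anywhere the metadata server is not available (for example on a local machine).

```python
import urllib.error
import urllib.request

# Metadata server path that reports the email of the service account in use.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def build_metadata_request(url: str = METADATA_URL) -> urllib.request.Request:
    """Build the request; the metadata server requires the Metadata-Flavor header."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def runtime_service_account(timeout: float = 2.0):
    """Return the runtime service account email, or None if it cannot be read."""
    try:
        with urllib.request.urlopen(build_metadata_request(), timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except (urllib.error.URLError, OSError):
        # Not running on Google Cloud, or the metadata server did not answer.
        return None
```

Logging the result of runtime_service_account() at startup would show whether that email matches a member of roles/aiplatform.user in the policy above.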

gcloud ai models list --region=<REGION>

and there are no models.

Even trying to deploy that model to my endpoint fails. The code to deploy is:

gcloud ai endpoints deploy-model <MY-ENDPOINT-ID> \
   --model=gemini-1.5-flash-002 \
   --region=<REGION> \
   --display-name="flash-deployment" \
   --machine-type="n1-standard-4"

and this fails with

(gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects/<PROJECT-ID>/locations/<REGION>/models/gemini-1.5-flash-002' exists.
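
Comparing the two error messages, the resource names have different shapes: the first one contains publishers/google/models/..., while the deploy command looks under projects/.../locations/.../models/.... A small sketch to make that difference explicit; the function classify_resource and the labels it returns are my own naming, not anything from the Vertex AI SDK.

```python
import re

# The two resource-name shapes that appear in the error messages.
PUBLISHER_MODEL = re.compile(
    r"projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/publishers/(?P<publisher>[^/]+)/models/(?P<model>[^/]+)$"
)
PROJECT_MODEL = re.compile(
    r"projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/models/(?P<model>[^/]+)$"
)


def classify_resource(name: str):
    """Return (kind, parts) for a Vertex AI model resource name."""
    match = PUBLISHER_MODEL.search(name)
    if match:
        return "publisher model", match.groupdict()
    match = PROJECT_MODEL.search(name)
    if match:
        return "project model", match.groupdict()
    return "unknown", {}


from_predict_error = (
    "//aiplatform.googleapis.com/projects/<PROJECT-ID>/locations/<REGION>"
    "/publishers/google/models/gemini-1.5-flash-002"
)
from_deploy_error = (
    "projects/<PROJECT-ID>/locations/<REGION>/models/gemini-1.5-flash-002"
)

print(classify_resource(from_predict_error)[0])
print(classify_resource(from_deploy_error)[0])
```

So the two commands are not looking for the same resource, even though the model ID at the end is identical.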

I think I have to register the model somewhere, but when I open the model registry and try to "Create" one, it asks me for training data and so on. I do not want to train a new model, just use the flash pretrained one.

Does anyone know how this can be achieved?