How to Deploy a Gemini-powered chat app on Cloud Run

1. Introduction

Overview

In this codelab, you'll see how you can create a basic chat bot written in node by using the Vertex AI Gemini API and the Vertex AI client library. This app uses an express session store backed by Google Cloud Firestore.

What you'll learn

  • How to use htmx, tailwindcss, and express.js to build a Cloud Run service
  • How to use the Vertex AI client libraries to authenticate to Google APIs
  • How to create a chatbot to interact with the Gemini model
  • How to deploy to a cloud run service without a docker file
  • How to use an express session store backed by Google Cloud Firestore

2. Setup and Requirements

Prerequisites

Activate Cloud Shell

  1. From the Cloud Console, click Activate Cloud Shell d1264ca30785e435.png.

cb81e7c8e34bc8d.png

If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If you were presented with an intermediate screen, click Continue.

d95252b003979716.png

It should only take a few moments to provision and connect to Cloud Shell.

7833d5e1c5d18f54.png

This virtual machine is loaded with all the development tools needed. It offers a persistent 5 GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with a browser.

Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.

  1. Run the following command in Cloud Shell to confirm that you are authenticated:
gcloud auth list

Command output

 Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
  1. Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If it is not, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

3. Enable APIs and Set Environment Variables

Enable APIs

Before you can start using this codelab, there are several APIs you will need to enable. This codelab requires using the following APIs. You can enable those APIs by running the following command:

gcloud services enable run.googleapis.com \
    cloudbuild.googleapis.com \
    aiplatform.googleapis.com \
    secretmanager.googleapis.com

Setup environment variables

You can set environment variables that will be used throughout this codelab.

PROJECT_ID=<YOUR_PROJECT_ID>
REGION=<YOUR_REGION, e.g. us-central1>
SERVICE=chat-with-gemini
SERVICE_ACCOUNT="vertex-ai-caller"
SERVICE_ACCOUNT_ADDRESS=$SERVICE_ACCOUNT@$PROJECT_ID.iam.gserviceaccount.com
SECRET_ID="SESSION_SECRET"

4. Create and configure a Firebase project

  1. In the Firebase console, click Add project.
  2. Enter <YOUR_PROJECT_ID> to add Firebase to one of your existing Google Cloud projects
  3. If prompted, review and accept the Firebase terms.
  4. Click Continue.
  5. Click Confirm Plan to confirm the Firebase billing plan.
  6. It is optional to Enable Google Analytics for this codelab.
  7. Click Add Firebase.
  8. When the project has been created, click Continue.
  9. From the Build menu, click Firestore database.
  10. Click Create database.
  11. Choose your region from the Location drop-down, then click Next.
  12. Use the default Start in production mode, then click Create.

5. Create a service account

This service account will be used by Cloud Run to call the Vertex AI Gemini API. This service account will also have permissions to read and write to Firestore and read secrets from Secret Manager.

First, create the service account by running this command:

gcloud iam service-accounts create $SERVICE_ACCOUNT \
  --display-name="Cloud Run to access Vertex AI APIs"

Second, grant the Vertex AI User role to the service account.

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role=roles/aiplatform.user

Now, create a secret in Secret Manager. The Cloud Run service will access this secret as an environment variables, which is resolved at instance startup time. You can learn more about secrets and Cloud Run.

gcloud secrets create $SECRET_ID --replication-policy="automatic"
printf "keyboard-cat" | gcloud secrets versions add $SECRET_ID --data-file=-

And grant the service account access to the express session secret in Secret Manager.

gcloud secrets add-iam-policy-binding $SECRET_ID \
    --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
    --role='roles/secretmanager.secretAccessor'

Lastly, grant the service account read and write access to Firestore.

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role=roles/datastore.user

6. Create the Cloud Run service

First, create a directory for the source code and cd into that directory.

mkdir chat-with-gemini && cd chat-with-gemini

Then, create a package.json file with the following content:

{
  "name": "chat-with-gemini",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "scripts": {
    "start": "node app.js",
    "nodemon": "nodemon app.js",
    "cssdev": "npx tailwindcss -i ./input.css -o ./public/output.css --watch",
    "tailwind": "npx tailwindcss -i ./input.css -o ./public/output.css",
    "dev": "npm run tailwind && npm run nodemon"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@google-cloud/connect-firestore": "^3.0.0",
    "@google-cloud/firestore": "^7.5.0",
    "@google-cloud/vertexai": "^0.4.0",
    "axios": "^1.6.8",
    "express": "^4.18.2",
    "express-session": "^1.18.0",
    "express-ws": "^5.0.2",
    "htmx.org": "^1.9.10"
  },
  "devDependencies": {
    "nodemon": "^3.1.0",
    "tailwindcss": "^3.4.1"
  }
}

Next, create an app.js source file with the content below. This file contains the entry point for the service and contains the main logic for the app.

const express = require("express");
const app = express();
app.use(express.urlencoded({ extended: true }));
app.use(express.json());
const path = require("path");

const fs = require("fs");
const util = require("util");
const { spinnerSvg } = require("./spinnerSvg.js");

// cloud run retrieves secret at instance startup time
const secret = process.env.SESSION_SECRET;

const { Firestore } = require("@google-cloud/firestore");
const { FirestoreStore } = require("@google-cloud/connect-firestore");
var session = require("express-session");
app.set("trust proxy", 1); // trust first proxy
app.use(
    session({
        store: new FirestoreStore({
            dataset: new Firestore(),
            kind: "express-sessions"
        }),
        secret: secret,
        /* set secure to false for local dev session history testing */
        /* see more at https://expressjs.com/en/resources/middleware/session.html */
        cookie: { secure: true },
        resave: false,
        saveUninitialized: true
    })
);

const expressWs = require("express-ws")(app);

app.use(express.static("public"));

// Vertex AI Section
const { VertexAI } = require("@google-cloud/vertexai");

// instance of Vertex model
let generativeModel;

// on startup
const port = parseInt(process.env.PORT) || 8080;
app.listen(port, async () => {
    console.log(`demo1: listening on port ${port}`);

    // get project and location from metadata service
    const metadataService = require("./metadataService.js");

    const project = await metadataService.getProjectId();
    const location = await metadataService.getRegion();

    // Vertex client library instance
    const vertex_ai = new VertexAI({
        project: project,
        location: location
    });

    // Instantiate models
    generativeModel = vertex_ai.getGenerativeModel({
        model: "gemini-1.0-pro-001"
    });
});

app.ws("/sendMessage", async function (ws, req) {
    if (!req.session.chathistory || req.session.chathistory.length == 0) {
        req.session.chathistory = [];
    }

    let chatWithModel = generativeModel.startChat({
        history: req.session.chathistory
    });

    ws.on("message", async function (message) {

        console.log("req.sessionID: ", req.sessionID);
        // get session id

        let questionToAsk = JSON.parse(message).message;
        console.log("WebSocket message: " + questionToAsk);

        ws.send(`<div hx-swap-oob="beforeend:#toupdate"><div
                        id="questionToAsk"
                        class="text-black m-2 text-right border p-2 rounded-lg ml-24">
                        ${questionToAsk}
                    </div></div>`);

        // to simulate a natural pause in conversation
        await sleep(500);

        // get timestamp for div to replace
        const now = "fromGemini" + Date.now();

        ws.send(`<div hx-swap-oob="beforeend:#toupdate"><div
                        id=${now}
                        class=" text-blue-400 m-2 text-left border p-2 rounded-lg mr-24">
                        ${spinnerSvg} 
                    </div></div>`);

        const results = await chatWithModel.sendMessage(questionToAsk);
        const answer =
            results.response.candidates[0].content.parts[0].text;

        ws.send(`<div
                        id=${now}
                        hx-swap-oob="true"
                        hx-swap="outerHTML"
                        class="text-blue-400 m-2 text-left border p-2 rounded-lg mr-24">
                        ${answer}
                    </div>`);

                    // save to current chat history
        let userHistory = {
            role: "user",
            parts: [{ text: questionToAsk }]
        };
        let modelHistory = {
            role: "model",
            parts: [{ text: answer }]
        };

        req.session.chathistory.push(userHistory);
        req.session.chathistory.push(modelHistory);

        // console.log(
        //     "newly saved chat history: ",
        //     util.inspect(req.session.chathistory, {
        //         showHidden: false,
        //         depth: null,
        //         colors: true
        //     })
        // );
        req.session.save();
    });

    ws.on("close", () => {
        console.log("WebSocket was closed");
    });
});

function sleep(ms) {
    return new Promise((resolve) => {
        setTimeout(resolve, ms);
    });
}

// gracefully close the web sockets
process.on("SIGTERM", () => {
    server.close();
});

Create the tailwind.config.js file for tailwindCSS.

/** @type {import('tailwindcss').Config} */
module.exports = {
    content: ["./**/*.{html,js}"],
    theme: {
        extend: {}
    },
    plugins: []
};

Create the metadataService.js file for getting the project id and region for the deployed Cloud Run service. These values will be used to instantiate an instance of the Vertex AI client libraries.

const your_project_id = "YOUR_PROJECT_ID";
const your_region = "YOUR_REGION";

const axios = require("axios");

module.exports = {
    getProjectId: async () => {
        let project = "";
        try {
            // Fetch the token to make a GCF to GCF call
            const response = await axios.get(
                "http://metadata.google.internal/computeMetadata/v1/project/project-id",
                {
                    headers: {
                        "Metadata-Flavor": "Google"
                    }
                }
            );

            if (response.data == "") {
                // running locally on Cloud Shell
                project = your_project_id;
            } else {
                // running on Clodu Run. Use project id from metadata service
                project = response.data;
            }
        } catch (ex) {
            // running locally on local terminal
            project = your_project_id;
        }

        return project;
    },

    getRegion: async () => {
        let region = "";
        try {
            // Fetch the token to make a GCF to GCF call
            const response = await axios.get(
                "http://metadata.google.internal/computeMetadata/v1/instance/region",
                {
                    headers: {
                        "Metadata-Flavor": "Google"
                    }
                }
            );

            if (response.data == "") {
                // running locally on Cloud Shell
                region = your_region;
            } else {
                // running on Clodu Run. Use region from metadata service
                let regionFull = response.data;
                const index = regionFull.lastIndexOf("/");
                region = regionFull.substring(index + 1);
            }
        } catch (ex) {
            // running locally on local terminal
            region = your_region;
        }
        return region;
    }
};

Create a file called spinnerSvg.js

module.exports.spinnerSvg = `<svg class="animate-spin -ml-1 mr-3 h-5 w-5 text-blue-500"
                    xmlns="http://www.w3.org/2000/svg"
                    fill="none"
                    viewBox="0 0 24 24"
                >
                    <circle
                        class="opacity-25"
                        cx="12"
                        cy="12"
                        r="10"
                        stroke="currentColor"
                        stroke-width="4"
                    ></circle>
                    <path
                        class="opacity-75"
                        fill="currentColor"
                        d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"
                    ></path></svg>`;

Lastly, create an input.css file for tailwindCSS.

@tailwind base;
@tailwind components;
@tailwind utilities;

Now, create a new public directory.

mkdir public
cd public

And within that public directory, create the index.html file for the front end, which will use htmx.

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta
            name="viewport"
            content="width=device-width, initial-scale=1.0"
        />
        <script
            src="https://unpkg.com/htmx.org@1.9.10"
            integrity="sha384-D1Kt99CQMDuVetoL1lrYwg5t+9QdHe7NLX/SoJYkXDFfX37iInKRy5xLSi8nO7UC"
            crossorigin="anonymous"
        ></script>

        <link href="./output.css" rel="stylesheet" />
        <script src="https://unpkg.com/htmx.org/dist/ext/ws.js"></script>

        <title>Demo 1</title>
    </head>
    <body>
        <div id="herewego" text-center>
            <!-- <div id="replaceme2" hx-swap-oob="true">Hello world</div> -->
            <div
                class="container mx-auto mt-8 text-center max-w-screen-lg"
            >
                <div
                    class="overflow-y-scroll bg-white p-2 border h-[500px] space-y-4 rounded-lg m-auto"
                >
                    <div id="toupdate"></div>
                </div>
                <form
                    hx-trigger="submit, keyup[keyCode==13] from:body"
                    hx-ext="ws"
                    ws-connect="/sendMessage"
                    ws-send=""
                    hx-on="htmx:wsAfterSend: document.getElementById('message').value = ''"
                >
                    <div class="mb-6 mt-6 flex gap-4">
                        <textarea
                            rows="2"
                            type="text"
                            id="message"
                            name="message"
                            class="block grow rounded-lg border p-6 resize-none"
                            required
                        >
Is C# a programming language or a musical note?</textarea
                        >
                        <button
                            type="submit"
                            class="bg-blue-500 text-white px-4 py-2 rounded-lg text-center text-sm font-medium"
                        >
                            Send
                        </button>
                    </div>
                </form>
            </div>
        </div>
    </body>
</html>

7. Run the service locally

First, make sure you are in the root directory chat-with-gemini for your codelab.

cd .. && pwd

Next, install dependencies by running the following command:

npm install

Using ADC when running locally

If you are running in Cloud Shell, you are already running on a Google Compute Engine virtual machine. Your credentials associated with this virtual machine (as shown by running gcloud auth list) will automatically be used by Application Default Credentials, so it is not necessary to use the gcloud auth application-default login command. You can skip down to the section Create a local session secret

However, if you are running on your local terminal (i.e. not in Cloud Shell), you will need to use Application Default Credentials to authenticate to Google APIs. You can either 1) login using your credentials (provided you have both Vertex AI User and Datastore User roles) or 2) you can login by impersonating the service account used in this codelab.

Option 1) Using your credentials for ADC

If you want to use your credentials, you can first run gcloud auth list to verify how you are authenticated in gcloud. Next, you may need to grant your identity the Vertex AI User role. If your identity has the Owner role, you already have this Vertex AI user role. If not, you can run this command to grant your identity Vertex AI user role and the Datastore User role.

USER=<YOUR_PRINCIPAL_EMAIL>

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member user:$USER \
  --role=roles/aiplatform.user

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member user:$USER \
  --role=roles/datastore.user

Then run the following command

gcloud auth application-default login

Option 2) Impersonating a Service Account for ADC

If you want to use the service account created in this codelab, your user account will need to have the Service Account Token Creator role. You can obtain this role by running the following command:

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member user:$USER \
  --role=roles/iam.serviceAccountTokenCreator

Next, you'll run the following command to use ADC with the service account

gcloud auth application-default login --impersonate-service-account=$SERVICE_ACCOUNT_ADDRESS

Create a local session secret

Now, create a local session secret for local development.

export SESSION_SECRET=local-secret

Run the app locally

Lastly, you can start the app by running the following script. This script will also generate the output.css file from tailwindCSS.

npm run dev

You can preview the website by opening the Web Preview button and selecting Preview Port 8080

web preview - preview on port 8080 button

8. Deploy the service

First, run this command to start the deployment and specify the service account to be used. If a service account is not specified, the default compute service account is used.

gcloud run deploy $SERVICE \
 --service-account $SERVICE_ACCOUNT_ADDRESS \
 --source . \
  --region $REGION \
  --allow-unauthenticated \
  --set-secrets="SESSION_SECRET=$(echo $SECRET_ID):1"

If your are prompted that "Deploying from source requires an Artifact Registry Docker repository to store built containers. A repository named [cloud-run-source-deploy] in region [us-central1] will be created.", hit ‘y' to accept and continue.

9. Test the service

Once deployed, open the service URL in your web browser. Then ask Gemini a question, e.g. "I practice guitar but I'm also a software engineer. When I see "C#", should I think of it as a programming language or a musical note? Which one should I pick?"

10. Congratulations!

Congratulations for completing the codelab!

We recommend reviewing the documentation Cloud Run and Vertex AI Gemini APIs.

What we've covered

  • How to use htmx, tailwindcss, and express.js to build a Cloud Run service
  • How to use the Vertex AI client libraries to authenticate to Google APIs
  • How to create a chat bot to interact with the Gemini model
  • How to deploy to a cloud run service without a docker file
  • How to use an express session store backed by Google Cloud Firestore

11. Clean up

To avoid inadvertent charges, (for example, if the Cloud Run services are inadvertently invoked more times than your monthly Cloud Run invokement allocation in the free tier), you can either delete the Cloud Run or delete the project you created in Step 2.

To delete the Cloud Run service, go to the Cloud Run Cloud Console at https://console.cloud.google.com/run and delete the chat-with-gemini service. You may also want to delete the vertex-ai-caller service account or revoke the Vertex AI User role, to avoid any inadvertent calls to Gemini.

If you choose to delete the entire project, you can go to https://console.cloud.google.com/cloud-resource-manager, select the project you created in Step 2, and choose Delete. If you delete the project, you'll need to change projects in your Cloud SDK. You can view the list of all available projects by running gcloud projects list.