Stable Diffusion GCP Cloud Function

September 9, 2022

I’m working on a project that has, as one of its features, the ability for a user to dynamically create portraits and landscapes of any subject they wish. I needed a way to create a simple image from an app, store it in the cloud, and display it in the app at some later point. It was a great opportunity to learn how to use Cloud Functions and Stability.ai’s Stable Diffusion SDK.

This is what you’ll learn from this post:

How to set up and write a Cloud Function on GCP
How to use the Stable Diffusion SDK
How to store a generated image on Cloud Storage
Prompt ideas for specific use cases

Cloud Functions on GCP

To start, we need to set up Cloud Functions and write a new function that handles a POST with a prompt and the height and width of the image we want to create. This post assumes you know enough about GCP to get to the console and create an account.

A Cloud Function is just a function that runs in response to an HTTP request or an event (we are only going to focus on HTTP requests). You can write them in a number of languages; I chose the Python runtime. To create a function, all you need is a main.py file and a requirements.txt file. You can write the code in the console itself, but I got burned by writing a function and losing it when I navigated away, so I suggest you create a directory and use an editor like VSCode to create those files.

main.py

As you can see from this very simple function, you get a request object and have to return a response. The functions_framework package has code to handle other scenarios, but a simple HTTP function is easy.

requirements.txt

functions-framework==3.*

Like any other requirements.txt file in Python, you list the dependencies that need to be installed.

Deploying the Function

From within the directory you created for main.py and requirements.txt you can run

gcloud functions deploy hello-world-func --allow-unauthenticated --trigger-http --runtime=python38 --gen2 --entry-point=hello_world

The --entry-point argument is the name of the function in main.py that you want to run.

Stability.ai Stable Diffusion SDK

Next we need to figure out how to use the Python client for the Stable Diffusion SDK. First, you need a developer key, which you can get by signing up at DreamStudio. You’ll need to create an API key and copy it somewhere.

The Python client is pretty simple: you create a Stability client with your key and then use the generate function to generate an image for a given prompt. It returns an iterable of responses whose artifacts contain the actual images. Finally, you create a PIL image by wrapping the artifact’s binary data in an in-memory byte stream and opening it as a PIL Image.

Cloud Storage

Next we want to take that image, store it in a Cloud Storage bucket, and return the URL it’s stored at so we can retrieve the image in our app.

First you have to create a storage bucket. For simplicity, we are going to make it public so that anyone with the URL can retrieve the image.

After you do that, the following function is enough to take a PIL image and upload it to a bucket.

Putting it all together

Here is the code of my cloud function.
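As a rough sketch of how the pieces fit together, the handler below parses the POSTed JSON, generates an image, uploads it and returns the URL. To keep it self-contained, the image generator and the uploader are passed in as arguments. The names handle_generate and parse_payload, the error responses and the 512-pixel defaults are assumptions made for illustration.

```python
def parse_payload(payload):
    """Validate the POSTed JSON and return (prompt, width, height)."""
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    prompt = str(payload.get("prompt") or "").strip()
    if not prompt:
        raise ValueError("'prompt' is required")
    # Assumption: default to 512x512 when no size is given.
    width = int(payload.get("width", 512))
    height = int(payload.get("height", 512))
    if width <= 0 or height <= 0:
        raise ValueError("'width' and 'height' must be positive")
    return prompt, width, height


def handle_generate(request, generate_image, upload_image):
    """Core of the HTTP function: prompt in, public image URL out."""
    try:
        prompt, width, height = parse_payload(request.get_json(silent=True))
    except (TypeError, ValueError) as err:
        # Flask turns a (dict, status) tuple into a JSON response.
        return {"error": str(err)}, 400

    image = generate_image(prompt, width, height)
    if image is None:
        return {"error": "no image was generated"}, 502
    return {"url": upload_image(image)}
```

In the deployed main.py, the entry point would be a small function decorated with @functions_framework.http that calls handle_generate with the real generator and uploader.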

Once you deploy the function, you can POST a JSON string like this to invoke the generator.

{"prompt": "Serena Williams as Superwoman"}

The function will return a response like the following, with a URL you can use to display the image.

{"url": "https://storage.googleapis.com/launcher_gen_images//tmp/tmp00wy8yaq.jpg"}
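From the app side, calling the function is a plain HTTP POST. Here is a sketch using only the Python standard library; the function URL is whatever gcloud reports after deployment, and the helper names and the timeout value are hypothetical.

```python
import json
import urllib.request


def build_request(function_url, prompt):
    """Build the POST request that carries the prompt as a JSON body."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_image_url(function_url, prompt, timeout=120):
    """Call the deployed function and return the URL of the stored image."""
    # Image generation can take a while, hence the generous timeout.
    request = build_request(function_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["url"]
```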

Prompts

I’ve found that in most cases raw prompts don’t produce great images. In my use case there is an opportunity to guide the user a bit on what type of image they want to produce. They can choose between a portrait and a landscape. The app “pre-applies” a style by appending the following prompt to their subject:

Portraits:

[Subject] + , realistic portrait, symmetrical, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, cinematic lighting, art by artgerm and greg rutkowski and alphonse mucha.

Landscape:

[Subject] + , au naturel, blender render, Cinematic, Dramatic, peaceful, Good, in a symbolic and meaningful style, insanely detailed and intricate, hyper realistic.
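In code, pre-applying a style amounts to string concatenation. A small sketch, where the STYLE_SUFFIXES table and the build_prompt helper are hypothetical names and the suffixes are copied from the two templates above:

```python
STYLE_SUFFIXES = {
    "portrait": (
        ", realistic portrait, symmetrical, highly detailed, digital painting, "
        "artstation, concept art, smooth, sharp focus, illustration, "
        "cinematic lighting, art by artgerm and greg rutkowski and alphonse mucha"
    ),
    "landscape": (
        ", au naturel, blender render, Cinematic, Dramatic, peaceful, Good, "
        "in a symbolic and meaningful style, insanely detailed and intricate, "
        "hyper realistic"
    ),
}


def build_prompt(subject, style):
    """Append the style suffix for 'portrait' or 'landscape' to the subject."""
    if style not in STYLE_SUFFIXES:
        raise ValueError(f"unknown style: {style!r}")
    return subject.strip() + STYLE_SUFFIXES[style]
```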

By doing it this way, your users don’t have to be prompt engineers and can still create some amazing images.

That’s it. Enjoy!