How to build your own meme generator with machine learning

By BNH NEWS, updated Jul 28, 2023

Generating captions

I use two different implementations of GPT to generate the captions. The first is the latest GPT-3 Da Vinci model from OpenAI, which does an excellent job, but you have to be enrolled in their beta program to use it. The second is the open-source GPT-Neo model from EleutherAI. GPT-Neo is a lot smaller, but it’s free to use.

GPT-3 Da Vinci

OpenAI’s GPT-3 Da Vinci is currently the largest AI model for Natural Language Processing. I am using their latest “zero-shot” style of prompting with their new Da Vinci Instruct model. Instead of providing examples of what you want the model to do, you can simply tell it what to do.

Here is the prompt that creates a caption for the apple pie picture.

Create a funny caption for a new meme about apple pie. The background picture is Simple and easy apple pie served with vanilla ice cream, on a gingham tablecloth in Lysekil, Sweden.
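A prompt like this can be assembled from two pieces of text, the meme theme and the image description. The following is a minimal sketch of that step; the function name build_zero_shot_prompt and its arguments are hypothetical and are not taken from the original code.

```python
def build_zero_shot_prompt(theme: str, description: str) -> str:
    """Assemble an instruction-style prompt from a theme and an image description.

    Hypothetical helper: the wording mirrors the prompt shown in the article.
    """
    return (f"Create a funny caption for a new meme about {theme}. "
            f"The background picture is {description}")


prompt = build_zero_shot_prompt(
    "apple pie",
    "Simple and easy apple pie served with vanilla ice cream, "
    "on a gingham tablecloth in Lysekil, Sweden.")
print(prompt)
```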

I pass the prompt into the call to OpenAI along with some additional parameters. Here’s the Python code.

import openai

# Request one completion for the prompt shown above
response = openai.Completion.create(
    engine="davinci-instruct-beta",
    prompt=prompt,
    max_tokens=64,  # upper bound on the length of the response
    temperature=0.7,
    top_p=0.5,
    frequency_penalty=0.5,
    presence_penalty=0.5,
    best_of=1)

The max_tokens parameter sets the maximum length of the response, in tokens. The temperature and top_p parameters are similar in that they both control the amount of variety in the response. The frequency_penalty and presence_penalty parameters are also similar in that they both discourage repetition: the first penalizes a token according to how often it has already appeared, and the second penalizes any token that has appeared at all, which nudges the model toward new topics. If you want to know more about what all these parameters do, check out my article from last month, here.
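To make the effect of temperature and top_p more concrete, here is a simplified sketch of how these two settings are commonly applied to a model’s output scores before a token is sampled. It is an illustration on a made-up three-token vocabulary, not OpenAI’s implementation, and the function names are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p.

    Returns a dict of token index -> renormalized probability.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
probs = apply_temperature(logits, temperature=0.7)
candidates = top_p_filter(probs, top_p=0.5)
print(probs, candidates)
```

With a low top_p, only the few most likely tokens remain as candidates, so the output varies less. With top_p set to 1.0, every token stays in play and temperature alone controls the variety.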

Before I show examples of the output from GPT-3, here is the legal disclaimer that OpenAI suggests that I show, which is all true.

The author generated the following text in part with GPT-3, OpenAI’s large-scale language-generation model. Upon generating draft language, the author reviewed and revised the language to their own liking and takes ultimate responsibility for the content of this publication.

Running the code 10 times will yield the following results, at a total cost of $0.03. Note that I formatted the text to be in uppercase.

1: THIS IS THE PERFECT WAY TO END A DAY OF APPLE PICKING
2: NO, IT’S NOT THAT EASY
3: I’LL TAKE THE ONE WITH THE VANILLA ICE CREAM, PLEASE
4: APPLE PIE IS THE BEST!
5: THIS APPLE PIE IS SO GOOD, I CAN’T EVEN!
6: YOU’RE NOT THE ONLY ONE WHO LOVES APPLE PIE
7: IF YOU CAN’T FIND THE RECIPE, JUST GOOGLE IT
8: THE PIE IS GOOD, BUT IT’S NOT AS GOOD AS MY MOM’S
9: I’LL HAVE A SLICE OF THAT APPLE PIE, PLEASE
10: WE’RE GOING TO NEED A BIGGER PIE
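The uppercase formatting mentioned above can be sketched as a small post-processing helper. This is an assumption about how such cleanup might look, not the original code; format_caption is a hypothetical name, and keeping only the first non-empty line and stripping surrounding quotation marks are choices made for this sketch.

```python
def format_caption(raw_text: str) -> str:
    """Keep the first non-empty line, drop surrounding quotes, and uppercase it."""
    lines = [line.strip() for line in raw_text.splitlines() if line.strip()]
    if not lines:
        return ""
    return lines[0].strip('"“”').strip().upper()


# With the legacy completion call shown earlier, the raw text would be
# read as response["choices"][0]["text"] before being passed in.
print(format_caption('\n\n"We\'re going to need a bigger pie"\n'))
```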

OK, these are pretty good. One thing I learned is that GPT-3 Da Vinci can be funny! For example, caption number 2 seems to refer to the “easy as pie” idiom.

Note that GPT-3, like all AI models trained on a large corpus of text, will reflect societal biases. Occasionally the system will produce text that may be inappropriate or offensive. OpenAI has a feature to label generated text with one of three warning levels: 0 – the text is safe, 1 – this text is sensitive, or 2 – this text is unsafe. My code will show a warning for any of the generated captions that are flagged as sensitive or unsafe.
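Turning the three warning levels into a message for the user could look something like the sketch below. It only covers the mapping from a label to a warning and assumes the label arrives as the string "0", "1", or "2". The warning wording, the name caption_warning, and the choice to treat unexpected labels as unsafe are all assumptions made for illustration.

```python
# Hypothetical mapping from content label to the message shown to the user
WARNING_LEVELS = {
    "0": None,  # safe: no warning
    "1": "Warning: this caption may be sensitive.",
    "2": "Warning: this caption may be unsafe.",
}


def caption_warning(label: str):
    """Return a warning message for a content label, or None if the text is safe."""
    if label not in WARNING_LEVELS:
        # Assumption: treat anything unexpected as unsafe rather than letting it pass.
        return WARNING_LEVELS["2"]
    return WARNING_LEVELS[label]


for label in ("0", "1", "2"):
    print(label, caption_warning(label))
```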

GPT-Neo

GPT-Neo is a transformer model created primarily by developers known as sdtblck and leogao2 on GitHub. The project is an implementation of “GPT-2 and GPT-3-style models using the mesh-tensorflow library.” So far, the system is about the size of GPT-3 Ada, OpenAI’s smallest model. But GPT-Neo is available for free. I used the Hugging Face Transformers interface to access GPT-Neo from my Python code.

Since GPT-Neo doesn’t have “instruct” versions of its pre-trained models, I had to write a “few-shot” prompt that uses examples to get the system to generate captions for memes. Here’s the prompt I wrote using Disaster Girl and Grumpy Cat memes with example captions.

Create a funny caption for a meme.

Theme: disaster girl
Image description: A picture of a girl looking at us as her house burns down
Caption: There was a spider. It’s gone now.

Theme: grumpy cat
Image description: A face of a cat who looks unhappy
Caption: I don’t like Mondays.

Theme: apple pie.
Image description: Simple and easy apple pie served with vanilla ice cream, on a gingham tablecloth in Lysekil, Sweden.
Caption:
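A few-shot prompt with this layout can be built from a list of examples plus the new theme and description. The sketch below shows one way to do that; build_few_shot_prompt and the EXAMPLES list are hypothetical names, and the example data is copied from the prompt above.

```python
# Example memes taken from the prompt above: (theme, image description, caption)
EXAMPLES = [
    ("disaster girl",
     "A picture of a girl looking at us as her house burns down",
     "There was a spider. It’s gone now."),
    ("grumpy cat",
     "A face of a cat who looks unhappy",
     "I don’t like Mondays."),
]


def build_few_shot_prompt(examples, theme, description):
    """Lay out the solved examples, then the new meme with an empty caption slot."""
    parts = ["Create a funny caption for a meme.", ""]
    for ex_theme, ex_description, ex_caption in examples:
        parts += [f"Theme: {ex_theme}",
                  f"Image description: {ex_description}",
                  f"Caption: {ex_caption}",
                  ""]
    # The prompt ends on "Caption:" so the model continues with the new caption
    parts += [f"Theme: {theme}",
              f"Image description: {description}",
              "Caption:"]
    return "\n".join(parts)


few_shot_prompt = build_few_shot_prompt(
    EXAMPLES,
    "apple pie",
    "Simple and easy apple pie served with vanilla ice cream, "
    "on a gingham tablecloth in Lysekil, Sweden.")
print(few_shot_prompt)
```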

After setting the temperature parameter to 0.7 and the top_p to 1.0, I pass the prompt into GPT-Neo to generate new captions. Here’s the code to generate a caption.

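from transformers import pipeline, AutoTokenizer

# The tokenizer supplies the end-of-sequence token id used for padding
gpt_neo_tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-2.7B')
generator = pipeline('text-generation',
    device=0,  # run on the first GPU
    model='EleutherAI/gpt-neo-2.7B')
results = generator(prompt,
    do_sample=True,
    min_length=50,
    max_length=150,
    temperature=0.7,
    top_p=1.0,
    pad_token_id=gpt_neo_tokenizer.eos_token_id)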
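By default, the text-generation pipeline returns the prompt followed by the model’s continuation, and the continuation may run on into additional made-up examples. The sketch below shows one way to pull out just the new caption. extract_caption is a hypothetical helper, and taking the first non-empty line of the continuation is an assumption about what counts as the caption.

```python
def extract_caption(generated_text: str, prompt: str) -> str:
    """Return the first non-empty line that follows the prompt, in uppercase."""
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    for line in continuation.splitlines():
        line = line.strip()
        if line:
            return line.upper()
    return ""


# Stand-in for results[0]["generated_text"] from the pipeline call
demo_prompt = "Theme: apple pie\nCaption:"
demo_output = demo_prompt + " I love apple pie\n\nTheme: pizza\nImage description: A pizza"
print(extract_caption(demo_output, demo_prompt))
```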

Here are the sample results.

1: I LOVE APPLE PIE
2: I CAN’T. I’M NOT ALLOWED
3: I LOVE THE SIMPLICITY OF AN APPLE PIE
4: APPLE PIE. THE ONLY THING BETTER THAN THIS IS A HOT BATH
5: I’M A PIE. YOU’RE A PIE
6: I LOVE PIE, AND THIS IS A GOOD ONE
7: I LOVE APPLES, BUT I’M NOT VERY GOOD AT BAKING
8: THE PIE IS DELICIOUS, BUT THE ICE CREAM IS NOT
9: I LOVE APPLE PIE. IT’S THE BEST
10: THE BEST FOOD IS WHEN YOU CAN TASTE THE DIFFERENCE BETWEEN THE FOOD AND THE TABLECLOTH

Hmmm. These are not as good as the GPT-3 captions. Most of them are quite simple and not very funny. Number 10 is just plain absurd. But number 4 seems to be OK. Let’s use this as our caption.

The final step is to compose the meme by writing the caption into the background image.
