2024-07-28 AI Birthday Present - Keith Moon

# Using AI to Create a Unique Birthday Present for a Family Member

Today is my Dad's 70th birthday, and he is not an easy man to buy a present for. Unless someone picks up a new hobby or interest every year (and he doesn't), it's easy to end up buying them the same things every birthday. This is a big birthday milestone, and I wanted to come up with something new to give him, so I trained an AI image model on his likeness and then made a bunch of pictures of him in increasingly unlikely situations. It was surprisingly easy and cheap, with impressive results!

## High-level Building Blocks

I used an open source image model called Stable Diffusion XL (SDXL), because its open nature makes it possible to add fine-tuning that gets it to produce a specific face. You do this by providing a bunch of images of the person you want to fine-tune for; these are used to create a LoRA. The LoRA file defines the additional fine-tuning on top of the base SDXL model. With the LoRA applied to the SDXL model, you can provide an image prompt, and it will produce images with a face that resembles the training data provided.

This can produce some amazing results, but it's not a 100% hit rate; you will likely have to sift through some rubbish images to find ones that have a good likeness. Also, the generated images will still suffer from many of the problems seen in other AI image generation, like malformed body parts and too many fingers.

## Specific Tools

Here's the easiest way I found to accomplish the above:

### Gather your photos

To train the model, you will need to gather a bunch of photos of the person; I'd recommend about 30. I tried with 15 photos and the likeness in the images produced was not great. Both Google Photos and Apple's Photos app have a feature that will locate photos that are all of the same person, which can be helpful for finding all the photos you have of a family member.
![[google-photos-dave-moon.png]]

Try to pick photos where the person is looking towards the camera, rather than side-on pictures. If there are other people in the photo, you will need to crop the picture so that only the relevant subject is visible, otherwise the AI model will get confused over who it is supposed to be fine-tuning for. Once you have your 30 or so photos, zip them into a zip file.

### Create your LoRA

You can use a service called Replicate to spin up AI models on demand for very little money. Visit this link to create your LoRA from a set of images of a person: [zylim0702/sdxl-lora-customize-training – Run with an API on Replicate](https://replicate.com/zylim0702/sdxl-lora-customize-training)

To use this, you will need to sign up for an account and set a payment method, but you can set a spending cap. I produced 3 different LoRAs and hundreds of pictures, and it cost less than $5 in total.

The form asks for a zip file of your training photos, so upload the photos you collected above. The rest of the options in the form can be left on their default values. Hit "Boot + Run". Replicate will spin up a version of the model and run your images through the LoRA training; this might take 7-10 minutes. When it's finished, you will have an output that includes a `trained_model.tar`.

![[lora-training-output.png]]

I suggest downloading this file and storing it so you can reuse it in the future; however, we can use the Replicate-hosted version directly for the next step.

### Create your customised images

You will need the URL of the `trained_model.tar` that we just produced. You can get this by right-clicking on the button and choosing "copy link address" or something similar. Next, visit [zylim0702/sdxl-lora-customize-model – Run with an API on Replicate](https://replicate.com/zylim0702/sdxl-lora-customize-model), where we will use our LoRA to create customised images. In the text box labelled "Lora_url", enter the URL we just copied from the previous page.

We can now write our image prompt. When we created the LoRA, we defined a token that represents the thing we were fine-tuning for, in this case our family member. If you left everything on the defaults, this token is "TOK". When we write our prompt, we want to include the token when describing what our family member should be doing. For example, one of the first prompts I tried was "A photo of TOK, dressed in a spacesuit, standing on the moon".

![[davemoon-spaceman.jpg]]

As I mentioned, not all pictures will be perfect, and it might be worth playing around with some settings to see if it helps. It can be helpful to change "num_outputs" from 1 to 4, to give you a few options with each generation. It will take a minute or two to produce the images, especially when it has to load up the model from a "cold boot".

It's best to stick to scenarios that involve just one person. I tried to generate a picture of my Dad meeting the Queen, and none of the pictures looked even remotely like him.

Play around, and have fun! Then maybe get the best pictures printed out and put into a photo album. If they are anything like my Dad, they will love the present :)

![[davemoon-body-builder.jpg]]
![[davemoon-cowboy.jpg]]
![[davemoon-crocodile-dundee.jpg]]
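
Everything above works through the Replicate web forms, but two of the fiddlier steps (zipping the photos and writing out a batch of prompts) are easy to script. Here is a minimal Python sketch, not something the walkthrough depends on. The function names `zip_photos` and `build_inputs` are made up for illustration, the example URL is a placeholder, and the dictionary keys simply mirror the form labels mentioned above ("Lora_url", "num_outputs") plus an assumed "prompt" field; the exact field names the Replicate API expects may differ, so check the model page before sending anything.

```python
import json
import zipfile
from pathlib import Path

# File types treated as training photos (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def zip_photos(photo_dir, zip_path):
    """Zip every image directly inside photo_dir; return how many were added."""
    # Collect the list first so the archive itself is never picked up.
    photos = sorted(
        p for p in Path(photo_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for photo in photos:
            # Store files flat, without their parent folders.
            archive.write(photo, arcname=photo.name)
    return len(photos)


def build_inputs(lora_url, scenarios, token="TOK", num_outputs=4):
    """Build one form-style input per scenario, each prompt containing the token."""
    return [
        {
            "Lora_url": lora_url,  # key mirrors the form label; may differ in the API
            "prompt": f"A photo of {token}, {scenario}",  # "prompt" key is assumed
            "num_outputs": num_outputs,
        }
        for scenario in scenarios
    ]


if __name__ == "__main__":
    # Placeholder URL: use the link copied from your own training run.
    inputs = build_inputs(
        "https://example.com/trained_model.tar",
        ["dressed in a spacesuit, standing on the moon"],
    )
    print(json.dumps(inputs, indent=2))
```

Calling `zip_photos("photos", "photos.zip")` on a folder of cropped pictures gives you the file to upload for training, and the count it returns is a quick check that you have roughly 30 photos. The printed inputs can be pasted into the form one at a time, or used as a starting point if you later move to Replicate's API.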