Automated Personalization of Children's Storybook Images: A Case Study
Images are vital in children's stories because visual narratives support cognitive, language, and emotional development. Moreover, personalizing storybooks according to children's preferences has great potential to boost engagement during reading; however, such personalization is currently manual and costly. This paper presents a case study on automating the personalization of children's storybook images using recent advances in latent diffusion models, vision models, and small language models. We develop and evaluate three lightweight pipelines designed to personalize storybook images according to specific editing requests. A qualitative evaluation was performed using a grid search over parameters to observe trends in the parameter combinations used during image generation. Our main findings show that Pipeline 1 does not yield satisfactory personalization results, Pipeline 2 performs well at changing colors and toggling image backgrounds, and Pipeline 3 is adequate for modifications that require adding, modifying, or transforming elements in the image. For further research, we recommend a thorough evaluation of the pipelines on larger datasets with more types of editing requests.
