MUSIA-2025

Multilingual Story Illustration: Bridging Cultures through AI Artistry

Datasets and Baseline Model

Example Illustration

“I want to be big,” says Little Monkey. “I want to be strong.” A wise woman hears him. “Take this magic wand,” she says, “and all your wishes can come true.”

Illustration 1

A giraffe comes by. He stretches his long neck. He eats the sweet leaves at the top of the trees. “I want a long neck,” says Little Monkey. “POP!” His neck grows long, just like the giraffe’s. Little Monkey is happy. An elephant comes down to the river. He fills his trunk with water. He blows it all over himself. “I want to do that too!” says Little Monkey. “BANG!” Just like that, he grows a trunk. He is very happy. “This is fun!” he says. Next, Little Monkey sees a zebra. “I want stripes like those,” he says. “WHIZZ!” Little Monkey has stripes all over his body, just like the zebra. He is very, very happy.

Illustration 2

He goes to the river to try out his new trunk. He looks down. He sees himself in the water. “Mother!” he cries. “Help! A monster!” “That’s not a monster,” says his mother. “That’s you.” “You want a giraffe’s neck, an elephant’s trunk and stripes like a zebra. Don’t you remember?” Little Monkey cries and cries. “I look AWFUL!” he says. “I want to be myself again.” There is a POP, a BANG and a WHIZZ. Little Monkey is himself again. He jumps for joy. He throws the magic wand into the river. He never wants to be anyone else again.

Illustration 3

Dataset

To obtain the datasets, please send a scanned copy of the duly filled-in Data-access form to irel.iitbhu@gmail.com. Please mention "MUSIA FIRE-2025" in the subject line. In the email, also include the team name and the name, affiliation, and email address of each participant.

Language    Training Data    Testing Data
English     Link             Update Soon
Hindi       Link             Update Soon

Dataset Details

The dataset is provided in two languages: English and Hindi. The training data includes a Stories folder that contains the narrative texts (stories). Each story file is named according to the following convention:

  • eng_story_XXXX for English stories
  • hin_story_XXXX for Hindi stories

where XXXX is a zero-padded identifier (e.g., eng_story_0001, hin_story_0001). Corresponding image files are stored in the Images folder. For each story, the associated images follow the naming pattern:

  • eng_story_XXXX_01, eng_story_XXXX_02, ..., for English stories
  • hin_story_XXXX_01, hin_story_XXXX_02, ..., for Hindi stories

Image files are provided in .jpg, .jpeg, or .png formats. The training set includes 360 English stories and 185 Hindi stories.
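The naming convention above makes it possible to pair each story with its illustrations by filename alone. The following is a minimal Python sketch of that pairing, not an official loader. The function names are hypothetical, and the pattern assumes a four-digit story identifier as in the examples (eng_story_0001); the actual archives may differ.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches names such as "eng_story_0001_02.png".
# Assumption: four-digit story ID, as in the examples on this page.
IMAGE_NAME = re.compile(
    r"^(eng|hin)_story_(\d{4})_(\d+)\.(jpg|jpeg|png)$", re.IGNORECASE
)


def group_images_by_story(filenames):
    """Map each story ID (e.g. 'eng_story_0001') to its image names, in index order."""
    grouped = defaultdict(list)
    for name in filenames:
        match = IMAGE_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the convention
        lang = match.group(1).lower()
        story = match.group(2)
        index = int(match.group(3))
        grouped[f"{lang}_story_{story}"].append((index, name))
    # Sort by the numeric image index so _01 comes before _02, and so on.
    return {sid: [n for _, n in sorted(items)] for sid, items in grouped.items()}


def list_image_names(images_dir):
    """Return the file names found directly inside the Images folder."""
    return [p.name for p in Path(images_dir).iterdir() if p.is_file()]
```

For example, group_images_by_story(list_image_names("Images")) would return a dictionary keyed by story ID, which can then be matched against the files in the Stories folder.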


The testing set comprises 40 English stories and 30 Hindi stories. It consists of a Stories folder for each language that contains the test narratives/stories, and a mapping file for each language that specifies the number of images to be generated for each story. Participants are expected to:

  • Generate the specified number of images per story, in .jpg, .jpeg, or .png format
  • Save the generated images using the naming convention: hin_story_XXXX_01, hin_story_XXXX_02, ..., for Hindi and eng_story_XXXX_01, eng_story_XXXX_02, ..., for English
  • Create a separate folder for each language containing the images for that language, and package both folders into a single zip file for submission.
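The submission steps above can be checked and packaged with a short script. The sketch below is illustrative only and rests on several assumptions: the format of the mapping file is not described on this page, so the code takes an already-parsed dictionary from story ID to image count; the folder names "English" and "Hindi" inside the zip are hypothetical; and all function names are made up for this example.

```python
import zipfile
from pathlib import Path

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")
# Hypothetical folder names; the task page does not prescribe them.
LANGUAGE_FOLDERS = {"eng": "English", "hin": "Hindi"}


def expected_stems(story_id, num_images):
    """Expected image names (without extension) for one story."""
    return [f"{story_id}_{i:02d}" for i in range(1, num_images + 1)]


def find_missing(image_counts, available_names):
    """Return expected image stems that have no matching file.

    image_counts is assumed to be parsed from the mapping file,
    e.g. {"eng_story_0001": 3}.
    """
    present = {
        Path(name).stem
        for name in available_names
        if Path(name).suffix.lower() in IMAGE_EXTENSIONS
    }
    return [
        stem
        for story_id, count in image_counts.items()
        for stem in expected_stems(story_id, count)
        if stem not in present
    ]


def package_submission(generated_dir, zip_path):
    """Write all generated images into one zip, in a folder per language."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(generated_dir).iterdir()):
            folder = LANGUAGE_FOLDERS.get(path.name[:3])  # "eng" or "hin" prefix
            if path.is_file() and folder and path.suffix.lower() in IMAGE_EXTENSIONS:
                archive.write(path, arcname=f"{folder}/{path.name}")
```

Running find_missing before package_submission gives a quick check that every story has the number of images requested in the mapping file.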

Baseline Models

Baseline models will be provided soon.

Contact us

For any queries, write to us at irel.iitbhu@gmail.com.