| title: MultiAgent System For Screenplay Creation | |
| emoji: 🏆 | |
| colorFrom: yellow | |
| colorTo: blue | |
| sdk: gradio | |
| sdk_version: 5.32.1 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| tag: "agent-demo-track" | |
| # TODO NAME OF THE AGENT | |
| ## Agent capabilities | |
| TODO: BETTER INTRO | |
| The aim of our agent is to support authors in their creative process for scenarios and storyboards. | |
| ### Agent Flow | |
|  | |
| **A** | |
| Starting the agent | |
| **B** | |
| The agent receives as input a text file containing the script, | |
| either in plain text format or in structured formats (e.g. PDF, DOCX), | |
| which it then converts into plain text for processing. | |
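A minimal sketch of this conversion step, under stated assumptions. The function names (`load_script_text`, `docx_to_text`, `pdf_to_text`) are hypothetical and not taken from `app.py`. DOCX is read with the standard library only, while PDF extraction assumes the third-party `pypdf` package is installed.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used inside a .docx archive.
_W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path: Path) -> str:
    """Extract paragraph text from a DOCX file using only the standard library."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{_W_NS}p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        paragraphs.append("".join(t.text or "" for t in para.iter(f"{_W_NS}t")))
    return "\n".join(paragraphs)


def pdf_to_text(path: Path) -> str:
    """Extract text page by page (assumes the third-party pypdf package)."""
    from pypdf import PdfReader  # imported lazily so TXT/DOCX work without it

    reader = PdfReader(str(path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def load_script_text(path: str) -> str:
    """Hypothetical entry point: return the script as plain text."""
    file = Path(path)
    suffix = file.suffix.lower()
    if suffix == ".docx":
        return docx_to_text(file)
    if suffix == ".pdf":
        return pdf_to_text(file)
    if suffix in {".txt", ".md", ""}:
        return file.read_text(encoding="utf-8")
    raise ValueError(f"Unsupported script format: {suffix}")
```

Dispatching on the file suffix keeps each format handler independent, so further formats can be added without touching the existing ones.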
| **C** | |
| The agent extracts a summary of the overall content of the scenario, | |
| identifying the main narrative lines and the time frame. | |
| This helps build a big-picture view of the draft for the next steps. | |
| **D** | |
| The agent will identify the main entities (characters, locations, events) and key themes in the script. | |
| It will also generate a short abstract (~5 sentences) | |
| with enough details to understand the overall plot and tone. | |
| **E** | |
| The agent checks whether the input text matches a known or published script. | |
| If it does, | |
| it checks the license and rights status to determine whether the text may be reworked. | |
| If any limitations apply, the agent warns the user about the restrictions. | |
| **F** | |
| The agent will perform an analysis of the main points of the script: | |
| - Characters: extract and catalog the names of the characters, | |
| classifying them by role (protagonist, antagonist, secondary characters), | |
| gender and age/physical description. | |
| - Locations: Detect the places where the scenes take place | |
| (interiors, exteriors, historical periods, geographical location) and catalogue them. | |
| - Plot points: Isolate key plot points | |
| - Vibes (Look and Feel): Understand the style (dramatic, comic, thriller, horror) | |
| and the overall sensation (suspense, irony, melancholy). | |
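One possible way to hold the result of this analysis is a small set of dataclasses. This is an illustrative sketch only: the class and field names (`ScriptAnalysis`, `Character`, `Location`, `Role`) are assumptions and do not come from the project code, and the sample values are invented placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Role(Enum):
    PROTAGONIST = "protagonist"
    ANTAGONIST = "antagonist"
    SECONDARY = "secondary"


@dataclass
class Character:
    name: str
    role: Role
    gender: Optional[str] = None
    description: Optional[str] = None  # age / physical description


@dataclass
class Location:
    name: str
    setting: str  # "interior" or "exterior"
    period: Optional[str] = None      # historical period
    geography: Optional[str] = None   # geographical location


@dataclass
class ScriptAnalysis:
    characters: List[Character] = field(default_factory=list)
    locations: List[Location] = field(default_factory=list)
    plot_points: List[str] = field(default_factory=list)
    style: Optional[str] = None                    # e.g. "thriller"
    mood: List[str] = field(default_factory=list)  # e.g. ["suspense"]

    def characters_by_role(self, role: Role) -> List[Character]:
        """Return the cataloged characters that have the given role."""
        return [c for c in self.characters if c.role is role]


# Placeholder data, not extracted from any real script.
analysis = ScriptAnalysis(
    characters=[
        Character("Ada", Role.PROTAGONIST, gender="female"),
        Character("Victor", Role.ANTAGONIST),
    ],
    locations=[Location("Kitchen", "interior")],
    plot_points=["Ada finds the letter"],
    style="thriller",
    mood=["suspense"],
)
print([c.name for c in analysis.characters_by_role(Role.PROTAGONIST)])
```

Keeping the four categories in one typed container makes the summary easy to pass to the later goal-definition and media-generation steps.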
| **G** | |
| Define the agent goal. | |
| Having achieved a comprehensive summary, the agent will ask for the final goal: | |
| - Remake / Rewrite | |
| - Change of medium (movie, tv series, ...) | |
| - Other purposes (Workshop, Interactive presentation, Didactic analysis, ...) | |
| **H** | |
| Structural proposal. | |
| Consistent with the chosen goal, | |
| the agent splits the narrative structure into acts and scenes, | |
| linking each one back to the corresponding passage of the source text. | |
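As a rough illustration of scene splitting with a pointer back to the source, the sketch below cuts a script at conventional screenplay scene headings (`INT.` / `EXT.`) and records the line where each scene starts. It is a rule-based assumption, not the agent's actual method: the names `Scene` and `split_into_scenes` are hypothetical, and grouping scenes into acts is left out.

```python
import re
from dataclasses import dataclass
from typing import List

# Scene headings ("sluglines") conventionally start with INT. or EXT.
SLUGLINE = re.compile(r"^(?:INT\./EXT\.|INT\.|EXT\.)[ \t]+.+$", re.MULTILINE)


@dataclass
class Scene:
    heading: str
    start_line: int  # 1-based line number in the source text
    body: str


def split_into_scenes(script: str) -> List[Scene]:
    """Split a script at each scene heading, keeping a reference to its line."""
    matches = list(SLUGLINE.finditer(script))
    scenes = []
    for i, match in enumerate(matches):
        # A scene runs until the next heading, or to the end of the script.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(script)
        start_line = script.count("\n", 0, match.start()) + 1
        body = script[match.end():end].strip()
        scenes.append(Scene(match.group(0).strip(), start_line, body))
    return scenes


sample = (
    "FADE IN:\n\n"
    "INT. KITCHEN - DAY\n\n"
    "Ada pours coffee.\n\n"
    "EXT. STREET - NIGHT\n\n"
    "Rain falls.\n"
)
for scene in split_into_scenes(sample):
    print(scene.start_line, scene.heading)
```

Text that precedes the first heading (here `FADE IN:`) is ignored by this sketch; scripts that do not follow the slugline convention would need a different segmentation strategy.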
| **I** | |
| Media generation. | |
| This phase consists of a series of steps focused on creating additional content | |
| to support the textual part of the script: | |
| - Concept art | |
| - Storyboard for narrative key points | |
| - Images for plot points | |
| **TODO: add sound and bias analysis?** | |
| **J** | |
| Final deliverable | |
| ### Main Techniques | |
| - Transformer-based NLP architectures (BERT, GPT-4) to produce a coherent text synthesis | |
| - Named Entity Recognition (NER) and context analysis, to identify human characters and their roles | |
| - Semantic analysis of textual descriptions, toponym extraction, creation of an internal scene map | |
| - Detection of text patterns (turning-point expressions such as “Suddenly”, “In the meantime”) | |
| and their classification using a Story Understanding model | |
| - Tone analysis and Sentiment analysis for understanding vibes | |
| - Image generation models (Stable Diffusion, DALL·E 3), with prompts generated by the model | |
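The text-pattern detection mentioned in the list can be approximated with a plain regular expression before any model is involved. The sketch below is an assumption about how such a pre-filter might look: `find_transitions` is a hypothetical name, the phrase list contains the two expressions named above plus "meanwhile" (added here for illustration), and the classification by a Story Understanding model is not covered.

```python
import re
from typing import List, Tuple

# "suddenly" and "in the meantime" come from the list above;
# "meanwhile" is an added assumption.
TRANSITION_PHRASES = ("suddenly", "in the meantime", "meanwhile")


def find_transitions(
    text: str, phrases: Tuple[str, ...] = TRANSITION_PHRASES
) -> List[Tuple[int, str, str]]:
    """Return (sentence index, matched phrase, sentence) for each hit."""
    # Try longer phrases first so multi-word expressions win over shorter ones.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for index, sentence in enumerate(sentences):
        for match in pattern.finditer(sentence):
            hits.append((index, match.group(1).lower(), sentence))
    return hits


sample_text = (
    "Ada reads the letter. Suddenly, the lights go out. "
    "In the meantime, Victor waits outside."
)
for index, phrase, sentence in find_transitions(sample_text):
    print(index, phrase, "->", sentence)
```

The sentences flagged this way are only candidates for plot turning points; deciding whether each one is a real turning point is the part a learned model would handle.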
| ### Code overview | |
| ### Use cases | |
| ### Contributors: | |
| - Code implementation by luke9705 and DDPM. | |
| - Ideation and testing by OrianIce and Loren1214. | |
| ### Sources | |
| - Russell, S., & Norvig, P. (2021). *Artificial Intelligence: A Modern Approach* (4th ed.). Pearson. | |
| - Cambria, E., & White, B. (2014). *Jumping NLP Curves: A Review of Natural Language Processing Research*. IEEE Computational Intelligence Magazine, 9(2), 48–57. | |
| - Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., & Chen, M. (2022). *Hierarchical Text-Conditional Image Generation with CLIP Latents*. arXiv preprint arXiv:2204.06125. | |