SciPostGen: Bridging the Gap between Scientific Papers and Poster Layouts

CVPR 2026 (Findings)

Shun Inadumi¹,²,*, Shohei Tanaka¹, Tosho Hirasawa¹, Atsushi Hashimoto¹, Koichiro Yoshino⁴,³,², Yoshitaka Ushiku¹

¹OMRON SINIC X Corporation; ²Nara Institute of Science and Technology; ³Guardian Robot Project, RIKEN; ⁴Institute of Science Tokyo

* Work done during an internship at OMRON SINIC X Corporation.


TL;DR: SciPostGen is a large-scale dataset of paper–poster pairs. It enables applications such as retrieving layouts aligned with a given paper and using them to guide poster layout generation.

Abstract

As the number of scientific papers continues to grow, there is a demand for approaches that can effectively convey research findings, with posters serving as a key medium for presenting paper contents. Poster layouts determine how effectively research is communicated and understood, highlighting their growing importance. In particular, a gap remains in understanding how papers correspond to the layouts that present them, which calls for datasets with paired annotations at scale. To bridge this gap, we introduce SciPostGen, a large-scale dataset for understanding and generating poster layouts from scientific papers. Our analyses based on SciPostGen show that paper structures are associated with the number of layout elements in posters. Based on this insight, we explore a framework, Retrieval-Augmented Poster Layout Generation, which retrieves layouts consistent with a given paper and uses them as guidance for layout generation. We conducted experiments under two conditions: with and without layout constraints typically specified by poster creators. The results show that the retriever estimates layouts aligned with paper structures, and our framework generates layouts that also satisfy given constraints.

SciPostGen Dataset

Overview

We constructed SciPostGen, a large-scale dataset consisting of 18,097 pairs of scientific papers and their corresponding posters. SciPostGen includes pairs collected from major computer science conferences and covers both landscape and portrait poster formats. To avoid copyright issues, we release only download scripts for original paper PDFs and poster images, together with corresponding annotation files.

Example of Annotations

Statistics

Analysis

Our analyses based on SciPostGen show that paper structures are associated with the number of layout elements in their corresponding posters. In particular, the amount of text and the number of figures and tables in a paper are moderately related to the number of text and figure elements in the layout.
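
As an illustration of the kind of relationship described above, the following minimal sketch computes a rank correlation between a paper-side count and a poster-side count. The statistic used in the actual analysis is not stated on this page; Spearman's rank correlation is assumed here as one common choice, and the counts are made-up toy values, not SciPostGen data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy counts (NOT taken from the dataset):
# figures and tables per paper vs. figure elements per poster layout.
paper_figures = [3, 5, 2, 8, 6]
poster_figures = [2, 4, 3, 7, 5]

rho = spearman(paper_figures, poster_figures)
print(f"Spearman rho = {rho:.2f}")
```

The same function could be applied to the amount of text in a paper versus the number of text elements in its poster layout, given counts extracted from the annotation files.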

Retrieval-Augmented Poster Layout Generation Framework

Overview

Building on our analyses, we explore a framework, Retrieval-Augmented Poster Layout Generation, which retrieves multiple layouts consistent with a given paper to accommodate the diversity observed in poster layouts. The layout retriever searches for layouts aligned with paper structures by learning relationships between papers and layouts through contrastive learning. Following recent work on layout generation, we employ a large language model as the layout generator to flexibly integrate diverse information, including retrieved results and paper structures.

Retrieval-Augmented Poster Layout Generation framework
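
The retrieval step can be sketched roughly as follows. This is a toy illustration of contrastive learning between paper and layout embeddings, not the authors' implementation: the symmetric InfoNCE loss, the cosine similarity, the temperature value, and all function names are assumptions, and the embeddings are hand-written vectors standing in for the outputs of (unspecified) paper and layout encoders.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_matrix(paper_embs, layout_embs):
    """Cosine similarity between every paper and every layout embedding."""
    papers = [normalize(p) for p in paper_embs]
    layouts = [normalize(l) for l in layout_embs]
    return [[sum(a * b for a, b in zip(p, l)) for l in layouts] for p in papers]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)


def symmetric_info_nce(paper_embs, layout_embs, temperature=0.1):
    """Assumed loss: pulls paired paper/layout embeddings together in both directions."""
    sims = similarity_matrix(paper_embs, layout_embs)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))


def retrieve_top_k(paper_emb, layout_embs, k=2):
    """Return indices of the k layouts most similar to one paper embedding."""
    sims = similarity_matrix([paper_emb], layout_embs)[0]
    return sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)[:k]


# Hypothetical 2-D embeddings; row i of `papers` is paired with row i of `aligned`.
papers = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = symmetric_info_nce(papers, aligned)
loss_swapped = symmetric_info_nce(papers, swapped)

# Retrieve multiple candidate layouts for one paper, as the framework does.
candidates = [[0.1, 0.9], [0.9, 0.1], [0.7, 0.7]]
top_layouts = retrieve_top_k([1.0, 0.0], candidates, k=2)
```

In this toy setting the loss is lower when paired embeddings point in the same direction, which is the property a trained retriever relies on. The retrieved layout indices would then be passed, together with the paper structure, to the large language model that generates the final layout.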

Key Findings:

The layout retriever can estimate layouts aligned with paper structures.
Our framework can produce layouts that are both aligned with paper structures and faithful to layout constraints.

Citation

@inproceedings{Inadumi_2026_SciPostGen,
    title={{SciPostGen}: Bridging the Gap between Scientific Papers and Poster Layouts},
    author={Shun Inadumi and Shohei Tanaka and Tosho Hirasawa and Atsushi Hashimoto and Koichiro Yoshino and Yoshitaka Ushiku},
    booktitle={Findings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2026}
}

Relevant Projects

BMVC 2024

SciPostLayout: A Dataset for Layout Analysis and Layout Generation of Scientific Posters

The SciPostLayout dataset consists of 7,855 scientific posters with manual layout annotations for layout analysis and generation. It also contains 100 scientific papers paired with their posters.

CVPR 2026 (Findings)

SciPostLayoutTree: A Dataset for Structural Analysis of Scientific Posters

SciPostLayoutTree is a dataset of approximately 8,000 posters annotated with reading order and parent-child relations.