Description
This study addresses a longstanding challenge: providing contextual analysis of historical images stored in digital visual archives and making the contextual information in those archives easier to retrieve. Contextual analysis is essential for disciplines such as history and art history because it situates artworks and historical sources within historical narratives, which in turn improves understanding of the artistic or political expression in cultural products. To address this challenge, a novel approach is proposed that uses computer vision to trace the circulation and dissemination of historical photographs in their original contexts. The method first uses YOLOv7 to crop printed images from pictorial magazines, then trains machine learning models on the cropped printed images together with a separate large dataset of original historical photographs, and finally compares images across the two datasets by similarity. Because the two subsets differ markedly in image quality, an ensemble of three models (Vision Transformer, EfficientNetV2, and Swin Transformer) was developed to make the similarity matching more reliable. With this system, contexts in which historical photographs circulated were identified, and new insights were gained into the editing strategies of propaganda magazines in East Asia during WWII. These outcomes provide supporting evidence for previous research in history and art history and demonstrate the potential of computer vision for uncovering new information from digital visual archives. Our model achieves 77.8% top-15 retrieval accuracy on our evaluation dataset.
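The retrieval step and the top-k metric described above can be illustrated with a minimal sketch. The description does not state how the three models' outputs are combined or which similarity measure is used, so the choices here (cosine similarity, averaged with equal weight across models) are assumptions made only for illustration. All function names, identifiers, and the toy two-dimensional embeddings are hypothetical; real embeddings would come from the trained Vision Transformer, EfficientNetV2, and Swin Transformer models.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ensemble_similarity(query_embs, candidate_embs):
    """Average the per-model cosine similarities.

    Each argument holds one embedding per model, in the same model order.
    Equal-weight averaging is an assumption, not the documented method.
    """
    sims = [cosine_similarity(q, c) for q, c in zip(query_embs, candidate_embs)]
    return sum(sims) / len(sims)


def rank_candidates(query_embs, gallery):
    """Return gallery ids sorted from most to least similar to the query."""
    scored = [(ensemble_similarity(query_embs, embs), photo_id)
              for photo_id, embs in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [photo_id for _, photo_id in scored]


def top_k_accuracy(queries, gallery, k=15):
    """Fraction of queries whose true match appears among the top k results."""
    hits = sum(1 for embs, true_id in queries
               if true_id in rank_candidates(embs, gallery)[:k])
    return hits / len(queries)


# Toy gallery of "original photographs": three per-model embeddings each.
gallery = {
    "photo_a": [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],
    "photo_b": [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
    "photo_c": [[-1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]],
}

# Toy queries standing in for cropped printed images, with their true match.
# The third query is deliberately embedded far from its true match.
queries = [
    ([[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]], "photo_a"),
    ([[0.1, 0.9], [0.2, 0.8], [0.0, 1.0]], "photo_b"),
    ([[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]], "photo_c"),
]

top1 = top_k_accuracy(queries, gallery, k=1)
top3 = top_k_accuracy(queries, gallery, k=3)
print(f"top-1 accuracy: {top1:.3f}, top-3 accuracy: {top3:.3f}")
```

In this toy setup the third query is a miss at k=1 but a hit at k=3, which is the behavior a top-15 figure captures: a match counts as retrieved if it appears anywhere in the first 15 ranked candidates, not only in first place.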