Behind the Scenes: Evaluating Computer Vision Embedding Techniques for Discovering Similar Photo Backgrounds

Dodson, Terryl Dwayne

Files

Dodson_TD_T_2023.pdf (1.4 MB)

Date

2023-07-11

Authors

Dodson, Terryl Dwayne

Publisher

Virginia Tech

Abstract

Historical photographs can generate significant cultural and economic value, but their subjects often go unidentified. Analyzed correctly, visual clues in these photographs can open up new directions for identifying unknown subjects. For example, many 19th-century photographs contain painted backdrops that can be mapped to a specific photographer or location, but this research process is often manual, time-consuming, and unsuccessful. AI-based computer vision algorithms could aid researchers by automatically identifying painted backdrops or photographers, or by clustering photos with similar backdrops. However, it is unknown which computer vision algorithms are feasible for painted backdrop identification or which techniques work better than others. We present three studies evaluating four types of image embeddings (Inception, CLIP, MAE, and pHash) across a variety of metrics and techniques. We find that a workflow using CLIP embeddings combined with a background classifier and simulated user feedback performs best. We also discuss implications for human-AI collaboration in visual analysis and new possibilities for digital humanities scholarship.
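
For readers unfamiliar with embedding-based retrieval, the idea of finding photos with similar backdrops can be sketched as ranking a gallery by vector similarity. The following is a minimal illustration, not the thesis's implementation: the function names, the photo identifiers, and the three-dimensional vectors are hypothetical, and cosine similarity is assumed as the comparison metric because the abstract does not name one. Models such as Inception, CLIP, and MAE produce real-valued vectors that can be compared this way, whereas perceptual hashes such as pHash are typically compared by Hamming distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, gallery):
    """Return gallery photo ids sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), pid) for pid, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pid for _, pid in scored]


# Hypothetical toy embeddings; real models emit hundreds of dimensions.
gallery = {
    "photo_a": [0.9, 0.1, 0.0],
    "photo_b": [0.0, 1.0, 0.0],
    "photo_c": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
ranking = rank_by_similarity(query, gallery)
print(ranking)
```

In a real workflow the vectors would come from an embedding model applied to each photograph (or to its background region), and the ranked list would be what a researcher reviews for candidate matches.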

Keywords

Convolutional Neural Networks (CNN), Computer Vision (CV), Photography, History, Cultural Heritage, American Civil War

Persistent link

http://hdl.handle.net/10919/115739

Collections

Masters Theses
