Personal Media Workflow: Scanning and Tagging

During COVID, I began to take seriously the idea of digitizing my family’s physical media. The effort has remained ongoing and encompasses home video, 4x6 prints, slides, cabinet cards, and a wealth of genealogical research material gathered by relatives passed.

I wrote about digitizing home video, a process that yielded a piece of lost media that’s been mythical within my family for as long as I can remember. And I’ve nearly finished that part of the job. After a few years of job changes, losses, and a move, I reorganized and found 6 more VHS-C tapes left to capture. In the meantime, I’ve had a few concurrent workflows going to process static pictures of a few varieties.

When my grandmother passed away following a short battle with cancer, I did a rapid scan of the photos in her personal photo journals. I found those just days before her passing and managed to get most of them into the slideshow presentation I assembled for her wake. It took me most of the following year to even look at any of the material I’d been working on before. In many ways, she was the last surviving link to the members of our family who stewarded this material before I took it on. It was a bonding activity for us and just the idea of returning to it made me feel very sad and alone.

At the end of the year, I finished digitizing two boxes of slides my grandfather took in the 60s, a fascinating look at the early years of my grandparents’ budding relationship, marriage, and the birth of my father. There were some cool, moderately historically significant shots in there as well. As I scanned, I’d text Grandpa photos and he’d reply with some context. In this way, I’ve been able to bond with him too. We’re both still pretty raw, but I think this helps remind us both of Grandma.

The slides took a while. And getting the names and dates assigned to them took longer still. My workflow for that unit of work changed many times, eventually leading to a small Flask application that let me capture responses from my grandfather and commit them to a SQLite database. I’d like to write more about that later, but I have since found a better way. One that I’ve applied to a cache of 4x6s from my own childhood and would like to share here.

What do you want out of this exercise?

When you get invested in digitizing your physical photographs, I think it’s important to think about what you really want. Do you want a complete archive of everything you’ve got or just a few select shots? Where do you want to store these and how accessible should they be? Do you plan to share these with members of your family? What’s the level of quality you’re satisfied with?

For me, I wanted it all. Every shot. Ideally in the best quality possible. I wanted it backed up in several locations and completely accessible via my iCloud Photo Library. To share, I want to rotate hard drives with my immediate family, which would also act as further “off-site” backups.

With this in mind, I broke the work out into steps. Segmenting the work proved critical to getting the photos processed at a pace I considered productive. I have found that trying to do too many actions to a given photo before moving to the next leads directly to burnout.

What follows is how I broke down my scanning projects. It assumes you have the equipment to get the job done.

Sort it out, won’t you?

I’ve got two boxes of immediate family photos. These cover the earliest years of my life, most of them taken by my parents. One is a box full of film negatives. They’re ordered to a degree, but not with the level of specificity I want to reach my goal. The other box is full of 4x6 photo prints—extras from my mother’s scrapbooking days. They have names and dates written on the back of them.

The ideal medium to scan is a negative. Provided that it’s an unblemished negative, you can often get a much better image out of it than the 1 Hour Photo lab would have printed for you in 1994. I have the ability to scan these, and have in the past. I’m not, however, completely satisfied with my equipment’s ability to handle those that won’t lie perfectly flat in the transparency bracket adapter that was provided with the scanner. So until I can get some textured glass to keep things flat and avoid Newton’s Rings, I’m setting these to the side.

The added benefit of doing the 4x6s first, as mentioned above, is the context my mom wrote on the back. There will certainly be duplicated work once I get to the negatives, but by then I’ll have a pretty good timeline to follow.

That decided, I began sorting out the envelopes by date. These were thankfully batched by year. Within the envelopes, I sorted the photos by date as best I could, grouping like photos whenever possible. This isn’t totally necessary, but it does simplify a later step of my process.

Together we scan

I started scanning with the oldest photos first. With my Epson V600, I can do three 4x6 images at a time. I use VueScan on my Mac to interface with the hardware. This software is powerful, but simple to use. It will automatically detect multiple pieces of media and even attempt to correct the skew if they aren’t sitting at perfectly right angles. [1]

For 4x6 photo prints, I set my preview DPI to 300. This is an okay balance for me in terms of speed and quality. The preview scan is really just important for letting VueScan locate and select the photos on the flatbed and for me to adjust the color balance I think is most appropriate for the image. Typically that’s Auto Levels, sometimes neutral with some slight black and white point adjustments.


  1. They’re never sitting at perfectly right angles. ↩︎

For the scan DPI, I use 600. For archiving, experts will say that 300 is adequate. Because I only want to do this once, storage is not at a premium, and I may want to enlarge some of these for reprinting, I double it. [1] I set the “restore fading” filter to auto. I’ve had some issues with that feature on print media, but for photos I’m pretty happy with the success rate. For these photos, it often doesn’t need to do anything. I waffle on whether to use the sharpen tool. I think it does a good job at the lightest setting, but I determined it’s probably better to do that manually later on if I feel it’s necessary.
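To put rough numbers on that choice, here is a small sketch of the arithmetic. The function names are made up for illustration, and the 300 DPI print target is an assumption (a common rule of thumb for photo prints), not something VueScan dictates.

```shell
# Pixel dimensions of a print scanned at a given DPI.
# usage: scan_pixels WIDTH_IN HEIGHT_IN SCAN_DPI
scan_pixels() {
    echo "$(( $1 * $3 ))x$(( $2 * $3 ))"
}

# Largest print (in whole inches) a scan supports at a target print DPI.
# usage: max_print_inches WIDTH_IN HEIGHT_IN SCAN_DPI PRINT_DPI
max_print_inches() {
    echo "$(( $1 * $3 / $4 ))x$(( $2 * $3 / $4 ))"
}

scan_pixels 4 6 300            # 1200x1800
scan_pixels 4 6 600            # 2400x3600
max_print_inches 4 6 600 300   # 8x12
```

Doubling the scan DPI quadruples the pixel count, which is what leaves room for an 8x12 reprint from a 4x6 original.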

So long as the files are named with incrementing numbers (scan0001+.jpg being my preference), I throw on a movie I don’t mind occasionally looking away from and go for it. To make the next step run smoothly, I do my best to lay my images on the bed in the scanning order every time. I maintain that order when I pick them up to make room for the next ones. The reasons for that will become clear soon.

My last note on the scanning process itself is ABC: always be cleaning. I keep two tools handy at all times: a clean microfiber cloth and an air blower. I try not to touch the glass much when picking up photos, but if it happens the cloth is good for removing smudges on the glass. Occasionally it’s useful for removing old Scotch tape adhesive from a photo as well. [2] The blower is used liberally on the white scanner backing, glass, and the front of every image. The less dust the better!

EXIF through the photo cache

Prompted by David Nelson on Mastodon, I made a breakthrough on the part of my workflow that entails context. When doing the slides, my best idea was to capture the raw anecdotes from my grandfather and put them into a database until I could figure out how to embed it. There were too many of these images to even imagine doing that at some later date. If it was going to get done, now was the time.

Enter NeededApps and their app, MetaImage. I evaluated a bunch of EXIF tools for the Mac. None of them are great, including theirs. But theirs is the best I’ve found.

For all of these photos, I’m looking to add three things: date, description, and location. Because my stack of scanned images is in the same order, grouped by date and event, adding this data in batches is fairly straightforward. When I don’t know the exact date, I default to the first of the month. I don’t typically change the time attributes at this stage. [3]
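I do this step in a GUI app, but the first-of-the-month rule is easy to express on the command line too. The sketch below is only an illustration: `exif_date` is a hypothetical helper, and the noon placeholder time is an assumption borrowed from the sample exiftool output in this post.

```shell
# Turn a full or partial date into the "YYYY:MM:DD HH:MM:SS" form exiftool uses.
# usage: exif_date YYYY-MM-DD | exif_date YYYY-MM
exif_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
            d="$1" ;;
        [0-9][0-9][0-9][0-9]-[0-9][0-9])
            d="$1-01" ;;    # day unknown: default to the first of the month
        *)
            echo "exif_date: expected YYYY-MM or YYYY-MM-DD, got '$1'" >&2
            return 1 ;;
    esac
    # Swap dashes for colons and append a placeholder time of noon (assumption).
    echo "$(printf '%s' "$d" | sed 's/-/:/g') 12:00:00"
}

exif_date 1997-01-05   # 1997:01:05 12:00:00
exif_date 1997-01      # 1997:01:01 12:00:00

# Possible use with exiftool (file name is a placeholder):
#   exiftool -overwrite_original "-EXIF:CreateDate=$(exif_date 1997-01)" scan0001.jpg
```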

The names of the date and description fields I’m updating are as follows, as seen from the output of exiftool:

% exiftool -G1 -a -s path/to/image.jpg

[IFD0]      ImageDescription   : Jacob Tender (3 years old)
[IPTC]      Caption-Abstract   : Jacob Tender (3 years old)
[XMP-dc]    Subject            : Jacob Tender (3 years old)
[ExifIFD]   CreateDate         : 1997:01:05 12:00:00
[IPTC]      DateCreated        : 1997:01:05
[XMP-xmp]   CreateDate         : 1997:01:05 12:00:00

When I first started writing these, I was only using EXIF. That works pretty well for dates, but I found that Apple Photos does not read the EXIF ImageDescription tag. It reads the IPTC Caption-Abstract field instead. So I ended up using the same tool to copy the values across fields.

List the values for IPTC and EXIF image description tags, if they exist:

exiftool -IPTC:Caption-Abstract -EXIF:ImageDescription /path/to/image.jpg

Copy the value from EXIF’s description to IPTC’s caption tag:

exiftool -overwrite_original "-IPTC:Caption-Abstract<EXIF:ImageDescription" /path/to/image.jpg 

Do the same as above, but to every file within a directory:

cd /path/to/
exiftool -overwrite_original "-IPTC:Caption-Abstract<EXIF:ImageDescription" *.jpg

Now I write to both fields (plus XMP) within MetaImage. This makes the images a lot more portable going forward.

MetaImage’s location finding tool is pretty solid. I set up a few actions within the app that let me quickly apply common addresses. I like to scroll through the entire batch and tag everything at a given place at one time, save, then remove these from the list and tag the next location.

What’s in a rename?

With photos brimming with metadata, I like to rename them using a naming convention I’ve carried from project to project:

YYYY-MM-DD_media_#_Description

Media, in this context, indicates the physical format of the image. For me, this is either a print or a transparency.

I find this is adequate for organizing and navigating scans in my archive through Finder or the terminal.

Examples:

1997-01-05_print_1_Jacob Tender (3 years)
1997-01-05_print_2_Jacob Tender (3 years)
1997-01-20_print_1_Jacob Tender playing outside
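For anyone who would rather script the convention than use an app, here is a minimal sketch of how such a name could be assembled. `archive_name` is a hypothetical helper. The default `jpg` extension and the replacement of `/` and `:` in descriptions are my own assumptions, not part of the convention above.

```shell
# Build a file name following YYYY-MM-DD_media_#_Description.
# usage: archive_name DATE MEDIA NUMBER DESCRIPTION [EXTENSION]
archive_name() {
    # Replace characters that are awkward in file names (assumption of this sketch).
    desc=$(printf '%s' "$4" | sed 's|[/:]|-|g')
    # Extension defaults to jpg when the fifth argument is omitted.
    printf '%s_%s_%s_%s.%s\n' "$1" "$2" "$3" "$desc" "${5:-jpg}"
}

archive_name 1997-01-05 print 1 "Jacob Tender (3 years)"
# 1997-01-05_print_1_Jacob Tender (3 years).jpg
```

Leading with the ISO-style date means a plain alphabetical sort in Finder or `ls` is also a chronological one.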

I used to cobble this together using Renamer, but found there to be limitations in accessing EXIF data and numbering. Fortunately, NeededApps had a tool for that too. You can bundle MetaImage and MetaRename on the Mac App Store for a good deal. [4]

I've included some of my configuration settings here.


  1. I actually started at 800, but stepped it back slightly to compare and found 600 was more than adequate for my purposes. ↩︎

  2. Really just when the glue is on the subject. Easier to smudge it off here than try to fix it in Photoshop later. ↩︎

  3. I have occasionally changed these to group photos better for browsing once in Apple Photos. ↩︎

  4. You get their video metadata tool as well. ↩︎

File away

My fully tagged files first get dragged into Apple Photos. Here I can tell pretty quick if I missed a date on an image because it’ll show up at the end of my camera roll. That’s a quick fix. Otherwise, I can now browse my scanned photos by location, date, and—after Photos does its thing in the background—by faces.

The files themselves get sorted out by year and dumped onto my NAS for long-term storage and additional cloud backup. It’s this copy that I’ll eventually distribute around to everyone once the job is done.
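Because every file name starts with the date, the by-year sort can be scripted. This is a sketch under assumptions rather than my exact process: `sort_by_year` is a hypothetical helper, and the paths in the usage comment are placeholders.

```shell
# Move files named YYYY-MM-DD_... into per-year folders under a destination.
# usage: sort_by_year SOURCE_DIR DEST_DIR
sort_by_year() {
    src=$1
    dest=$2
    for f in "$src"/[0-9][0-9][0-9][0-9]-*; do
        [ -f "$f" ] || continue        # skip if the glob matched nothing
        name=$(basename "$f")
        year=${name%%-*}               # text before the first dash
        mkdir -p "$dest/$year"
        mv -n "$f" "$dest/$year/"      # -n: never overwrite an existing file
    done
}

# Example (placeholder paths):
#   sort_by_year "$HOME/Scans/tagged" "/Volumes/archive/photos"
```

Files that do not start with a four-digit year are left where they are, so anything that was missed during renaming stays easy to spot.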

It never ends

It’s really an amazing thing, having these old shots surfacing now in my Apple Photos memories. It’s been fun sharing them with family and friends. But the job isn’t over. Over the last week, I’ve done 400+ with probably another ~600 to go. I keep chipping away at it during lunch breaks, evenings, and chill weekend days. The only real issue is that I’m running out of English-language movies in my queue to put on while I work… [1]


  1. Over the last few weeks spent here at my desk, I watched the entirety of Joel Haver’s 12 movies in 12 months project. It was a thematically solid pairing at times. Would recommend. ↩︎