More PSA to Lightroom details

August 4, 2006 | 4 Comments

I posted a link to my results to the Adobe Lightroom Beta forums and someone asked for additional details, and I realize I skimped on the technical stuff a little too much. Here are the details on the steps as described in the previous post. I got a lot of information out of people doing a similar thing: transferring PSA data into iMatch.

Caution: WordPress may have mangled some of the code below by manipulating quotes, backslashes, etc. I really wish it didn’t do that. You can download the files here.

Step 2: Use PSATool to dump PSA catalog to text file

Get the PSATool here. I used the following command line to generate a PSAdump.txt file:
psatool /if-print /tp /fp /uf /ss /ed My-Catalog.psa >c:\PSA-export.txt

The /ed tells it to dump the file path for the ‘_edited’ file instead of the original, feel free to leave this off if you prefer.

Step 3: Build keyword list from psa dumpfile

This is where I wrote the first script. The format of the dump file looks like this:
\Nas0\media\Pictures\1999-01-01 thru 1999-12-31\DCP_1033.JPG;14;photo;online;1999-05-31 12:00:13;"";"Imports"\"Impor
t (3/11/2003 12:16:31 PM)"

So everything after the 6th field is semicolon-delimited tags, with the hierarchy being backslash delimited. The idea is to generate a file that looks like this:

Places
        Florida
                Disney
        Virginia
                Hume
                Tallow Tree
                Apartment

Where each tab indicates a level of nesting. Here’s the perl script I used to convert the PSAdump.txt into a keyword list:

#!/usr/bin/perl

use strict;
use Data::Dumper;
$Data::Dumper::Quotekeys = 0;
$Data::Dumper::Indent = 1;

my $data = {};

while(<STDIN>) {
        chomp; 
        my @fields = split(/;/, $_, 7);
        next unless ($fields[6]);
        $_ = $fields[6];
        tr/"\n\r//d;
        my @tags = split(/;/);
        foreach my $tag (@tags) { 
                next if ($tag =~ /Adobe Atmosphere/);
                my $p = $data;
                foreach my $name (split(/\\/, $tag)) { 
                        if (!defined $p->{$name}) { $p->{$name} = {}; } 
                        $p = $p->{$name} 
                } 
        }
}

my $s =  Dumper($data);
$s =~ s/^\S+.*$//mg; # Get rid of $VAR line
$s =~ s/\s*=>.*$//mg; # Get rid of hash structure data
$s =~ s/^\s*\}.?\s*$//mg; # Get rid of hash structure data
$s =~ s/^(\s*)'/$1/mg; # Get rid of quotes on keys
$s =~ s/'\s*$//mg; # Get rid of quotes on keys
$s =~ s/\\'/\'/g; # Unescape quotes
while ( $s =~ s/^(\t*) {2}/$1\t/mg ) { } # Transform spaces to tabs
$s =~ s/^\t//mg; # Drop 1 indent level
$s =~ s/^\s*$//mg; # Get rid of space-only lines
$s =~ tr/\n//s; # Collapse empty lines
$s =~ s/^\n+//s; # Get rid of empty lines at the beginning
print $s;

I then invoke the script thus:


keywords.pl < PSAdump.txt > keywords.txt

Step 4: Import keyword list into Lightroom

Start up Lightroom, go to Library-> Import Keywords. Point it at your file and it should populate the keywords with the tags from the file.

Step 5: Update the SQLite database to add the mappings

This is the juicy step. In the database there is a table called AgLibraryImageTag. It is what’s known as a “link table”. It defines a many-to-many relationship between images and tags (i.e., a tag is linked to many images, an image is linked to many tags). In it’s simplest form, the table would be rows of two values: image, tag. Except we don’t use the image name or the tag name, instead we use the image id and the tag id. So we have to get the image id from the images table, and the tag id from the tags table. Then add the row to the link table. That’s what the following code does:

#!/usr/bin/perl

use strict;
use DBI;
use Data::Dumper;
use Data::GUID;
use YAML;

# This is your library file, typically located in My Documents\My Pictures\Lightroom
# The name is something longer than "Library.aglib" but I shortened mine.
my $dbfile = "Library.aglib"; 
# This is the file you generated with PSAtool.
my $psafile = "PSADump.txt";

my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","");
my %tag_cache;

my $localid = $dbh->selectrow_array("select value from Adobe_variablesTable where  name = 'Adobe_entityIDCounter'"); 
print "Initial ID: $localid\n";

open(PSA, "< $psafile");
while (<PSA>) {
        tr/\n\r//d;
        my @fields = split(/;/, $_, 7);
        next unless ($fields[6]); # Only process images with tags

        my $file = $fields[0];
        my $tags = $fields[6];

        next if ($tags eq 'Folders');
        print STDERR "Processing file: $file [$tags]\n";

        my $fileid = resolve_file("\\$file");
        if (!$fileid) {
                print STDERR "Couldn't resolve file to fileid [$file]\n";
                next;
        }

        my @tags = split(/;/, $tags);
        foreach my $tag (@tags) {
                my $tagid = resolve_tag($tag);

                if (!$tagid) {
                        print STDERR "Couldn't resolve tag to tagid [$tag] (file: $file)\n";
                        next;
                }

                associate($fileid, $tagid);
        }
}
close(PSA);


print "Final ID: $localid\n";
$dbh->do("update Adobe_variablesTable set value = ? where  name = 'Adobe_entityIDCounter'", undef, $localid); 
$dbh->do("update AgLibraryImageSearchIndex set dirty = 1");

exit;


sub resolve_tag {
        my $tag = shift;
#       print STDERR "  Processing tag: $tag\n";
        return undef unless ($tag);
        if (exists $tag_cache{$tag}) {
                return $tag_cache{$tag};
        }

        my @parts = split(/\\/, $tag);
        my $local = pop(@parts);
        my $prefix = join('\\', @parts);
        $local =~ tr/"//d;

        my $parent = resolve_tag($prefix);

        my $id;
        if ($parent) {
                $id = $dbh->selectrow_array('SELECT id_local FROM AgLibraryTag where name = ? and parent = ?',
                        undef, $local, $parent);
        } else {
                $id = $dbh->selectrow_array('SELECT id_local FROM AgLibraryTag where name = ? and parent = (select id_local from AgLibraryTag where parent is null and kindName = ?)',
                        undef, $local, 'AgKeywordTagKind');
        }


#       print STDERR "Resolving $tag ($parent -> $local) [$id]\n";

        $tag_cache{$tag} = $id;
        return $id;
}

sub resolve_file {
        my $file = shift;
        my $id = $dbh->selectrow_array('SELECT image from Adobe_imageFiles where lower(absolutePath) = lower(?)', undef, $file);
        return $id;
}

sub associate {
        my ($fileid, $tagid) = @_;
        if ($fileid and $tagid) {
                my $id = $dbh->selectrow_array('select id_local from AgLibraryTagImage where image = ? and tag = ?', undef, $fileid, $tagid);
                if ($id) {
                        print STDERR "Association exists ($id) between fileid [$fileid] and tagid [$tagid]\n";
                        return $id;

                }

                print STDERR "Creating association: $fileid -> $tagid\n";

                $dbh->do("INSERT INTO AgLIbraryTagImage (id_local, id_global, image, tag, positionInCollection, tagKind) values (?, ?, ?, ?, ?, ?)",
                        undef, ++$localid, Data::GUID->new()->as_string, $fileid, $tagid, 'z', 'AgKeywordTagKind');
        } else {
                print STDERR "Did not get either fileid [$fileid] or tagid [$tagid] so not associating\n";
        }
}

I think the code is fairly safe explanatory. For each line in the PSADump file, we resolve the imageid in the database (you imported the images in Step 1), we resolve the Tags (you imported the tags in step 4), and we add a row to AgLibraryImageTag to associate the two. The two tricky steps are updating the entityIDCounter in Adobe_variablesTable and setting the ImageSearchIndex to be dirty so that it rebuilds the search keywords for the tags you just added.

There you have it. I’m not sure I can add much more detail to that. Hopefully that helps someone along in their quest to transfer their data.


4 Comments

  1. hi, i am trying your last step and can’t get the script to compile. there are a lot of packages it seems to need from CPAN, but all i did was install Perl’s default package from perl.com. what ELSE did you install besides the default PERL package??

    i have tried download individual packages from cpan.org but after 7 of these, i figured i better ask, because you must have run some install to install all the necessary packages.

    please help me! i’m so close…….

    Comment by shane — March 14, 2007 #


  2. You’re right that it probably leads to a lot of dependencies and there’s also an implicit dependency on DBD::SQLite . I install everything from Debian packages so it handles the dependencies. If you’re doing this on a Windows machine with ActiveState Perl than you can use their package manager (I’d be fairly positive you can get all of them except maybe DBD::SQLite via the ActiveState package manager).

    Coincidentally I’ve been revisiting this issue in the last couple days because I’m migrating from Windows to MacOS and now I need to migrate my Lightroom DB from PC to Mac which I’ll post about once I’ve finished.

    In the process of doing this I happened to find someone who did something very similar (except moving the data to Gallery2 instead of Lightroom) directly on his Windows box using ActiveState and fewer steps. The project is called Elements2Gallery2.

    I checked the ActiveState site and according to this ActiveState page the DBD-SQLite builds for Windows properly so that should make your life easier.

    Comment by pablo — March 14, 2007 #


  3. I just realized that the final version of Lightroom (v1.0) supports importing from Photoshop Elements and Photoshop Album so you shouldn’t need this at all anymore.

    Comment by pablo — March 14, 2007 #


  4. I would like to see a continuation of the topic

    Comment by Maximus — December 20, 2007 #

Sorry, the comment form is closed at this time.