PostgreSQL tsearch2 migration from 8.0 to 8.3 - how I did it successfully
Recently I was in the process of moving my site to a much better hosting situation (more on that later). During the move I decided to upgrade from PostgreSQL-8.0 to PostgreSQL-8.3 as I was pretty far behind and I prefer to stay current. This sort of upgrade isn’t a big deal, and I’ve done it many times. So I did my usual process:
- Install the desired version of PostgreSQL (in this case 8.3)
- Scp my last backup (taken a few minutes after I lock down the site) to the new host
- Run a script that essentially creates a new db, and restores the backup to it
That’s when things went horribly wrong. After a few searches I quickly learned that moving from 8.0 to 8.3 is a bit tricky when you have tsearch2 stuff in there. It turned out to be really easy to upgrade, read on if you want to know how I did it.
The main problem turned out to be the movement of Tsearch2 into the core. Previously it was a contributed project and thus you had to “install” it into any database that needed to use it. The installation was really just a bunch of sql statements that created functions inside your database. So the problem when you restore the db is that the functions installed conflict with the functions that now exist in the core (aka in PostgreSQL itself).
So here’s how I migrated my database successfully:
- Lock my site so I know nothing can write to the database
- Create a backup (not to be used, but in case I typo a subsequent step)
- Drop all of the tsearch2 objects from my database (I’ll provide the commands I used)
pg_dumpto dump the database per normal
- Create a fresh database on 8.3
- Restore the database per normal
So basically by doing those steps I was easily able to get my database ported over to PostgreSQL-8.3. Before I move on to how my app had to change to accommodate things, here’s the sql script I used to purge Tsearch2 from my db:
The next thing I needed to do was update my code so that it would reference the new core Tsearch2 objects. This turned out to be super easy. Here are a few code diffs of the lines I had to change (in my Python code):
- coalesce = "COALESCE(RANK(idx_tags, TO_TSQUERY('%s')), 0)" + coalesce = "COALESCE(TS_RANK(TO_TSVECTOR(tags), TO_TSQUERY('%s')), 0)" - filter = "AND idx_tags @@ TO_TSQUERY('%s')" + filter = "AND TO_TSVECTOR(tags) @@ TO_TSQUERY('%s')"
I had to rinse and repeat those two changes a few times, and I was all set. Here’s the page I used as a reference to get me going. Currently I’m running without either an index or a dedicated tsvector column and things are running great. I’m soon going to be adding a GIN index to gain back the speed I had before (and potentially more).
If you have any questions or anything please speak up - I know plenty of people are having challenges getting onto [the awesome] 8.3.