Just posting this here and might help the others:
I got this error when we had an outage a while ago, and tried to restart the slon replication service as the replication is NOT really replicating the data from MASTER to SLAVE:
开发者_如何学GoFATAL localListenThread: "select "_MY_DATABASE_copy".cleanupNodelock(); insert into "_MY_DATABASE_copy".sl_nodelock values ( 1, 0, "pg_catalog".pg_backend_pid()); " - ERROR: duplicate key value violates unique constraint "sl_nodelock-pkey"
DEBUG2 slon_abort() from pid=xxxx
DEBUG1 slon: shutdown requested
This, in our environment, is caused by outages (planned or unplanned) in which Postgres goes down before the slon daemons. Oftentimes, restarting the slons doesn't resolve the error. What does resolve it is:
- stop all slons which reference the failed/restarted node (if you
can, just stop all slons via
slon_kill
; much simpler). - go into the
sl_nodelock
table on the master database server, and remove all rows with an nl_nodeid corresponding to the failed/restarted node. - start any stopped slons.
I also got this error, and in my case, a third fix applied. We accidentally had a 'slon' process running on the slave-to-be. Our solution was to shut down that process, and then restart the 'slon' processes on the master.
Ok so this is how I fix it, but this doesn't guarantee it will work for everyone:
- Check the slon process.
~# ps aux | grep slon
Then kill all process running the slon replicatin service that will look something like this.
~# ps aux | grep slon root 8321 0.0 0.0 6528 1568 pts/2 S+ 10:35 0:00 slon MY_SCHEMA dbname=MY_DATABASE user=postgres host=169.1.1.1 password=password
Normally it has two process running in the background, and this is normal, so we have to kill those process.
~# kill -9 8321
Then run your slon replicatin service script again.
精彩评论