Replication stopped Consumer failed to replay change redhat-ds/fedora-ds/389

By thomas, 20 May, 2011

One of our ldap seconaries was failing to stay in sync with the main server. We kept getting "Consumer failed to replay change" in the error log. The uniqueid and CSN were always the same, so at first I thought it was specific to the record that was being propogated. After a little looking, I found the following post: http://lists.fedoraproject.org/pipermail/389-users/2009-October/010278.html

In our case it was indeed the passwordRetryCount that was not being accepted at the consumer. I used cl-dump to look at the changelog, a new tool to me, very useful, I just used cl-dump -w[passwd] -o/tmp/changelog.dump. I then did a grep to find the parts that were failing using the uniqueid and CSN.

Rich Megginson mentions that you can set passwordIsGlobalPolicy to allow passwordRetryCount to be sent in the update. I elected to for this solution rather than modifying the sync agreement. I think it makes sense for the retry count to increment globally. I then found this in the documentation:

http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/8.1/html/Administration_Guide/Managing_Replication-Replicating-Password-Attributes.html

/usr/lib/mozldap/ldapmodify -D "cn=directory manager" -w secret -p 389 -h consumer1.example.com dn: cn=config changetype: modify replace: passwordIsGlobalPolicy passwordIsGlobalPolicy: on

The only thing to note is that some documentation has the last line there as passwordIsGlobalPolicy: 1 which is probably an older way of changing the setting.

After making that change:

19/May/2011:16:48:05 -0400] NSMMReplicationPlugin - agmt="cn=ldap" (ldap:389): Consumer failed to replay change (uniqueid cccccccc-dddddddd-eeeeeeee-ffffffff, CSN 4dd474c2000004720000): DSA is unwilling to perform. Will retry later. [19/May/2011:16:48:08 -0400] agmt="cn=ldap" (ldap:389) - session end: state=0 load=1 sent=3 skipped=0 [19/May/2011:16:48:08 -0400] NSMMReplicationPlugin - agmt="cn=ldap" (ldap:389): Successfully released consumer [19/May/2011:16:48:08 -0400] NSMMReplicationPlugin - agmt="cn=ldap" (ldap:389): Beginning linger on the connection [19/May/2011:16:48:08 -0400] NSMMReplicationPlugin - agmt="cn=ldap" (ldap:389): State: sending_updates -> start_backoff [19/May/2011:16:48:11 -0400] NSMMReplicationPlugin - agmt="cn=ldap" (ldap:389): State: start_backoff -> backoff [19/May/2011:16:48:11 -0400] NSMMReplicationPlugin - agmt="cn=ldap" (ldap:389): Cancelling linger on the connection

And all is happy again.