Postgres: Deadlock resolution
Within an application, usage patterns dictate how likely it is that concurrent updates to the same data will occur. Such updates can put the same resources in demand from more than one source.
# ======> AR relationships <======
has_one :address, as: :addressable, dependent: :destroy
has_one :contact, as: :contactable, dependent: :destroy
has_one :commission_split, dependent: :destroy
belongs_to :company, touch: true
belongs_to :user, touch: true
belongs_to :office_manager, touch: true
has_many :approvals, as: :approvable, dependent: :destroy
has_many :assigned_users, as: :assignable, dependent: :destroy
has_many :assets, as: :attachable, dependent: :destroy
Consider the scenario where an object A instance is waiting on a specific record mapped to object B, while at the same time the object B record is waiting on object A. A deadlock. The immediate need is to resolve this situation so the users do not see a stream of 500 errors.
First stop: identify the PostgreSQL process. From the manual:
$ ps auxww | grep ^postgres
postgres 960 0.0 1.1 6104 1480 pts/1 SN 13:17 0:00 postmaster -i
postgres 963 0.0 1.1 7084 1472 pts/1 SN 13:17 0:00 postgres: stats buffer process
postgres 965 0.0 1.1 6152 1512 pts/1 SN 13:17 0:00 postgres: stats collector process
postgres 998 0.0 2.3 6532 2992 pts/1 SN 13:18 0:00 postgres: tgl runbug 127.0.0.1 idle
postgres 1003 0.0 2.4 6532 3128 pts/1 SN 13:19 0:00 postgres: tgl regression [local] SELECT waiting
postgres 1016 0.1 2.4 6532 3080 pts/1 SN 13:19 0:00 postgres: tgl regression [local] idle in transaction
You can see the infrastructure processes associated with Postgres. Subtract those and the waiting process stands out.
All that remains is to cancel the waiting backend by its process id:
SELECT pg_cancel_backend(1003);
Now that the immediate concern is mitigated, it is time to change our code so this does not happen again, or at least does not happen with the same results. If there is not a high likelihood this will happen again, we would use Optimistic Locking. If there is a high likelihood of recurrence, we would use Pessimistic Locking.
Optimistic Locking is implemented by Rails, whereas Pessimistic Locking is implemented at the database level.
The Optimistic part means that we can service as many users as possible, since we allow reading to all and only apply a locking mechanism when we attempt an update.
The first step is to create a migration to modify your model to have a designated lock column:
class AddLockVersionToOrder < ActiveRecord::Migration
  def up
    if ActiveRecord::Base.connection.table_exists? 'orders'
      unless ActiveRecord::Base.connection.column_exists? :orders, :lock_version
        # default: 0 so existing rows start with a valid version
        add_column :orders, :lock_version, :integer, default: 0
      end
    end
  end

  def down
    if ActiveRecord::Base.connection.table_exists? 'orders'
      if ActiveRecord::Base.connection.column_exists? :orders, :lock_version
        remove_column :orders, :lock_version
      end
    end
  end
end
Now an ActiveRecord::StaleObjectError will be raised if an attempt is made to modify the same object at the same time. This works because internally lock_version gets incremented each time the record is saved. If, when the update is attempted, the Order object's lock_version does not match the current row's lock_version, the error gets raised.
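Under the hood this is a compare-and-set: the UPDATE only matches a row whose lock_version still equals the version read earlier. A minimal plain-Ruby sketch of that check, where the hash and method names are illustrative stand-ins rather than ActiveRecord internals:

```ruby
class StaleObjectError < StandardError; end

# In-memory stand-in for one row of the orders table.
DB_ROW = { id: 1, price: 50_000.00, lock_version: 0 }

def update_order(loaded_lock_version, new_price)
  # Rails effectively issues:
  #   UPDATE orders SET price = ?, lock_version = lock_version + 1
  #   WHERE id = ? AND lock_version = ?
  if DB_ROW[:lock_version] == loaded_lock_version
    DB_ROW[:price] = new_price
    DB_ROW[:lock_version] += 1
    true
  else
    # Zero rows matched: someone else saved first.
    raise StaleObjectError, "Attempted to update a stale object"
  end
end

# Two "users" read the same row; both see lock_version 0.
version_seen_by_a = DB_ROW[:lock_version]
version_seen_by_b = DB_ROW[:lock_version]

update_order(version_seen_by_a, 100_000.00)    # first save wins
begin
  update_order(version_seen_by_b, 90_000.00)   # second save is stale
rescue StaleObjectError => e
  puts e.message
end
```

The first writer bumps the version to 1, so the second writer's WHERE clause matches nothing and the stale error is raised instead of silently overwriting the first update.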
This means it is the application's responsibility to handle this error:
def set_order_price
  self.order.price = 100000.00
  self.order.save!
rescue ActiveRecord::StaleObjectError
  logger.error "Changes to this record: #{self.order.inspect} were made since you started editing. Please try again."
  raise "General Error Occurred"
end
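Instead of surfacing the error, another common policy is reload-and-retry. A runnable sketch of that pattern, using a hand-rolled FakeOrder and StaleObjectError so it works without ActiveRecord; in a real app you would rescue ActiveRecord::StaleObjectError on a real model:

```ruby
class StaleObjectError < StandardError; end

# Stand-in for an ActiveRecord model whose first save! fails,
# as if another user's update won the race.
class FakeOrder
  attr_accessor :price

  def initialize
    @price = 0
    @stale_saves_remaining = 1
  end

  def save!
    if @stale_saves_remaining > 0
      @stale_saves_remaining -= 1
      raise StaleObjectError
    end
    true
  end

  def reload
    self   # a real reload would refresh attributes, including lock_version
  end
end

def set_order_price(order, retries = 3)
  order.price = 100_000.00
  order.save!
rescue StaleObjectError
  raise "General Error Occurred" if (retries -= 1) <= 0
  order.reload   # pick up the other user's changes and the new lock_version
  retry
end

order = FakeOrder.new
set_order_price(order)
puts order.price   # => 100000.0
```

The retry budget bounds how long a hotly contested record can keep a request spinning before the generic error is raised after all.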
Pessimistic Locking kicks in at the database level when invoking either lock! or with_lock, which wraps the block inside a transaction. The lock is released when that transaction commits, so a save! inside the block is protected until the block exits.
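Under pessimistic locking, SELECT ... FOR UPDATE behaves like a per-row lock held until commit: a second writer blocks until the first transaction finishes. A minimal in-process analogy using a Ruby Mutex (the names and timings are illustrative, not Rails API):

```ruby
ROW_LOCK = Mutex.new            # stands in for the FOR UPDATE row lock
row = { price: 0 }
order_of_events = []

writer = Thread.new do
  ROW_LOCK.synchronize do       # like: order.with_lock do ... end
    order_of_events << :first_update_starts
    sleep 0.1                   # simulate work inside the transaction
    row[:price] = 100_000.00
    order_of_events << :first_update_commits
  end                           # lock released at "commit"
end

sleep 0.05                      # ensure the writer grabs the lock first
second_writer = Thread.new do
  ROW_LOCK.synchronize do       # second with_lock blocks until the commit
    order_of_events << :second_update_starts
    row[:price] += 1
  end
end

[writer, second_writer].each(&:join)
p order_of_events
```

The second update can only start after the first commits, which is exactly the serialization that prevents the lost-update race, at the cost of making the second user wait.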