Sunday, August 5, 2018

Concurrency bug in a real banking system

In this era, when online and mobile banking have become a daily affair for many of us (especially in cities), a technical bug in the backend software used by banks can have an impact whose severity is difficult to assess until an incident actually occurs. While the primary onus for the correctness of the software is on the verification engineer, in the era of multi-processing and multi-threading the design structure and the coding style matter a lot. Many of the behaviours that arise from the complexity of current systems are non-deterministic and non-repeatable, and cannot be verified with guarantees. Thus, the designer is equally responsible for a bug that leaks into the production system.

Recently, I hit upon one such bug in a banking transaction of my own, a bug that we have all studied as a fundamental problem in concurrent systems. The textbook examples usually use a banking transaction as the scenario, and hitting the bug in reality reminded me of them. To explain the situation: I had INR 7316.88 in my account. I transferred INR 3000.00, which attracted a service charge and tax of INR 59.00 (50 + 9). This triggered 2 transactions on the same account - deducting 3000.00 and deducting 59.00 as service charge. They could have been run in sequence - 3000.00 deducted first and then 59.00. But somehow, the banking system's code seems to have treated them as 2 parallel threads and invoked them at the same time. Now what happens - 3000.00 is deducted from 7316.88, leading to a balance of 4316.88. The 59.00 should then be deducted from 4316.88, but because this ran in parallel with the 3000.00 deduction, it had already read the original amount of 7316.88 and deducted 59.00 from that, giving 7257.88. The 59.00 transaction completed later than the 3000.00 one, so its result overwrote the earlier one and I ended up with a total balance of 7257.88 instead of the expected 4257.88. The snapshot below shows this incident (listed in reverse order of date - the last entry is the oldest transaction).
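The arithmetic of the interleaving described above can be replayed deterministically. The following is a minimal Python sketch, not the bank's actual code: the function names `lost_update` and `serial_update` are hypothetical, and it assumes that both debits read the opening balance before either one wrote its result back.

```python
from decimal import Decimal


def lost_update(opening, debits):
    """Replay the suspected race: every debit reads the same stale balance."""
    # Assumption: all debits take their snapshot before any of them writes.
    snapshots = [opening for _ in debits]
    balance = opening
    for snapshot, amount in zip(snapshots, debits):
        # Each write is computed from the stale snapshot, so it overwrites
        # the previous write instead of building on it.
        balance = snapshot - amount
    return balance


def serial_update(opening, debits):
    """Apply the debits one after another, each on the latest balance."""
    balance = opening
    for amount in debits:
        balance -= amount
    return balance


opening = Decimal("7316.88")
debits = [Decimal("3000.00"), Decimal("59.00")]  # transfer, then charge + tax

racy_balance = lost_update(opening, debits)
expected_balance = serial_update(opening, debits)

print(racy_balance)      # 7257.88, the balance I actually saw
print(expected_balance)  # 4257.88, the balance I should have seen
```

The difference between the two results is exactly the 3000.00 transfer, which is the update that got lost.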

This is a clear concurrency bug which could have been avoided by using proper locks in the software code (for those who understand the terminology). What this incident brings to the forefront is that bugs in such simple situations still exist in production banking systems, which many of us rely on and use regularly. This one ended up being beneficial for me and hence I am not worried. But what if the reverse had happened and the final balance had turned out to be lower than expected? I would be running from one bank counter to another, and the staff would blame it on the software without being able to help. It might get resolved, but it would take time. And what if people do not notice it at all? Then it is a matter of chance whether it ever gets fixed.
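As a rough illustration of the locking fix mentioned above, here is a minimal Python sketch. The `Account` class and its `debit` method are hypothetical and say nothing about how the bank's software is actually structured. The point is only that the read of the balance and the write of the new balance happen under one lock, so the second debit always sees the result of the first.

```python
import threading
from decimal import Decimal


class Account:
    """Hypothetical in-memory account, used only to illustrate the lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def debit(self, amount):
        # Read-modify-write is done while holding the lock, so two debits
        # can never both start from the same stale balance.
        with self._lock:
            current = self.balance
            self.balance = current - amount


account = Account(Decimal("7316.88"))

# The transfer and the service charge still run as two parallel threads.
threads = [
    threading.Thread(target=account.debit, args=(Decimal(amount),))
    for amount in ("3000.00", "59.00")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final_balance = account.balance
print(final_balance)  # 4257.88, whichever thread happens to run first
```

A real banking backend would more likely rely on database transactions or row-level locks than on an in-process lock, but the idea is the same: the two deductions must be serialized on the account.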

I was later able to withdraw 7000.00 from the account, which leaves no way for the software to quietly correct the problem some days later. I am waiting to see what happens and how the system deals with this. I have no intent to keep this amount and I will return it once the bank identifies the issue.

I would expect that the application developers understand the severity of such incidents and do their due diligence in ensuring the quality of their software. I would also hope that the bank staff and the customers do not trust these systems blindly.

Happy online banking to all!!

1 comment:

  1. This looks like they are doing dirty reads for a transaction. In such a case this can repeat often. Did you report this to the bank? I hope this is not SBI :)
