Factors to consider for a great Transaction Categorization Engine

August 15, 2020 in Financial Services

Financial Services Providers (FSPs), both traditional institutions and fintechs, are increasingly introducing personal finance management (PFM) features into their mobile banking solutions. Some of these features monitor and report day-to-day spending; others provide tools that help customers stay in line with their financial goals, or charts that detail spending or investment by category. These features bring value to both customers and FSPs, but they are only possible when actionable insights can be generated from customers’ transaction data through categorisation, and they are only useful if that categorisation is accurate. The categorisation engine therefore needs to be built for accuracy.

Transaction categorisation identifies the purpose and context of a bank transaction. This can include using algorithms to identify keywords within transaction descriptions and assigning categories to those keywords according to a set of rules. Categorisation goes beyond keeping customers’ bank transactions neat and tidy: it forms the basis for understanding account activity and catering to the unique needs of the customers behind the transactions. With categorisation, every customer transaction tells a story of a specific opportunity for both the customer and the FSP.
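To make the keyword-and-rules idea concrete, here is a minimal sketch in Python. The keywords, category names and the `categorise` function are hypothetical illustrations, not a description of any production engine, which would hold far more rules and likely combine them with other signals.

```python
import re

# Hypothetical keyword-to-category rules, for illustration only.
KEYWORD_RULES = {
    "uber": "Transport",
    "fuel": "Transport",
    "supermarket": "Groceries",
    "netflix": "Entertainment",
}

def categorise(narration):
    """Return the category of the first matching keyword, or None if unrecognised."""
    # Split the free-text narration into lowercase alphabetic tokens.
    tokens = re.findall(r"[a-z]+", narration.lower())
    for token in tokens:
        if token in KEYWORD_RULES:
            return KEYWORD_RULES[token]
    return None  # no rule matched: the transaction stays uncategorised
```

A narration such as "POS PURCHASE UBER TRIP 1234" would map to Transport under these assumed rules, while one with no known keyword returns `None` and counts as unrecognised.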

For FSPs intent on deploying transaction categorisation to generate actionable insights and automate decisioning, it is imperative to understand the critical components of a transaction categorisation engine. Having built one ourselves from scratch, here are what we consider the key factors when choosing or building one:

  1. Categorisation Rate

This refers to the ratio of transactions an engine can recognise to the total number of transactions in the set. While it may appear to be the most important factor, its significance can be overstated, because an engine can recognise a transaction and still assign it the wrong category. A strong engine will achieve a categorisation rate of somewhere between 90% and 95%.
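As a rough illustration, the categorisation rate can be computed as recognised transactions divided by total transactions. The sketch below assumes, purely for illustration, that an unrecognised transaction is represented by `None`; the function name and sample labels are hypothetical.

```python
def categorisation_rate(assigned):
    """Share of transactions that received any category (None means unrecognised)."""
    if not assigned:
        return 0.0
    recognised = sum(1 for category in assigned if category is not None)
    return recognised / len(assigned)

# Hypothetical engine output for ten transactions, two of them unrecognised.
labels = ["Transport", None, "Groceries", "Groceries", None,
          "Bills", "Transport", "Bills", "Bills", "Groceries"]
rate = categorisation_rate(labels)  # 8 of 10 recognised
```

Note that this figure says nothing about whether the eight assigned categories are right.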

  2. Accuracy Rate

This is the share of transactions that the engine categorises correctly. So whilst the categorisation rate speaks to the percentage of transactions that are recognised, the accuracy rate measures how many of them are assigned the right category. Accuracy rates are critical to understanding the quality of the output a categorisation engine produces.
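One way to measure this, sketched below, is to compare the engine's output with a hand-labelled reference set. This sketch assumes accuracy is measured only over the transactions the engine recognised (those not `None`); other definitions divide by all transactions instead. The function name and sample data are hypothetical.

```python
def accuracy_rate(predicted, expected):
    """Share of recognised transactions whose category matches the reference label."""
    # Keep only transactions the engine actually categorised (assumption: None = unrecognised).
    pairs = [(p, e) for p, e in zip(predicted, expected) if p is not None]
    if not pairs:
        return 0.0
    correct = sum(1 for p, e in pairs if p == e)
    return correct / len(pairs)

# Hypothetical engine output versus hand-labelled reference categories.
predicted = ["Transport", None, "Groceries", "Bills"]
expected = ["Transport", "Bills", "Dining", "Bills"]
accuracy = accuracy_rate(predicted, expected)  # 2 of the 3 recognised are correct
```

Here the engine recognises three of four transactions (a 75% categorisation rate) but only two of those three are right, which is why the two rates need to be read together.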

  3. Categorisation Speed

Categorisation speed refers to how fast an engine can process sets of data. Depending on their computing capability, some engines can achieve speeds of 4,000-6,000 transactions per second. High categorisation speed is necessary where large amounts of data are involved or where categorisation results are required in real time.
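Throughput in transactions per second can be estimated by timing a batch run. The sketch below uses Python's standard `time.perf_counter`; `measure_throughput` and `toy_categorise` are hypothetical stand-ins, and the figure obtained depends entirely on the hardware and the engine being timed.

```python
import time

def measure_throughput(categorise_fn, narrations):
    """Return transactions processed per second for one pass over the batch."""
    start = time.perf_counter()
    for narration in narrations:
        categorise_fn(narration)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small batches.
    return len(narrations) / elapsed if elapsed > 0 else float("inf")

def toy_categorise(narration):
    """Stand-in for a real engine: a single hypothetical keyword rule."""
    return "Transport" if "uber" in narration.lower() else None

sample = ["POS UBER TRIP", "ATM WITHDRAWAL"] * 5000  # 10,000 synthetic narrations
tps = measure_throughput(toy_categorise, sample)
```

A single-pass timing like this is only indicative; repeating the run and measuring on production-like data gives a more reliable number.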

  4. Category Spread

This indicates how many categories of transactions an engine can recognise from a given set of transaction data. An engine with a small number of categories has a lower spread and, usually, a higher categorisation rate: the fewer categories the engine has to choose from, the less likely it is to select a wrong one. Conversely, a larger number of categories can reduce the engine’s success rate. However, a larger number of categories also provides better context for generating actionable insights, so it is critical that the engine is optimised for accuracy even with many categories. Spain-based multinational financial services provider BBVA, for example, mentions on its website that its data engine has 15 major categories and 72 subcategories.

  5. Local Adaptability

For a categorisation engine to remain relevant, it must factor in the nuances of the local environment in which it operates. Local keywords, unique categories and point-of-sale scenarios provide insights that improve the quality of categorisation. This means that similar transaction keywords in two different scenarios (for example, a different time of day or transaction value) might not be categorised identically. This is one of the reasons why FSPs are advised to collaborate with fintech partners who have categorisation capabilities tuned to the local environment.
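The idea that the same keyword can map to different categories depending on context can be sketched as follows. The `Transaction` type, the keywords, the hour window and the amount threshold are all hypothetical; a real engine would derive such rules from local transaction data.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    narration: str
    amount: float
    hour: int  # hour of day, 0-23, in local time

def categorise_with_context(txn):
    """Apply keyword rules whose outcome depends on time of day or value."""
    text = txn.narration.lower()
    if "restaurant" in text:
        # Hypothetical rule: the same keyword is split by time of day.
        return "Lunch" if 11 <= txn.hour < 16 else "Dining out"
    if "filling station" in text:
        # Hypothetical rule: small purchases are treated as shop items, not fuel.
        return "Fuel" if txn.amount >= 20 else "Snacks"
    return None
```

Under these assumed rules, a restaurant payment at 13:00 is categorised as Lunch while the same narration at 20:00 becomes Dining out.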

  6. Data Size

A categorisation engine needs a large volume of data to perform well. In principle, an engine that has 1,000 data subjects at an average of, say, 30 transactions per month will eventually have a lower accuracy rate than an engine that has far more data subjects at the same average number of transactions. Essentially, the more the engine is used, the better it gets, as it learns from more people. This is the classic response to the popular build-versus-buy question for transaction categorisation. FSPs are usually advised to collaborate with fintech platforms, because an internally built engine is limited to the customers of that particular FSP and so lacks the data size of an independent fintech categorisation engine that supports a variety of banks.


In this period of the pandemic, FSPs are having to rethink their growth models. As they struggle to build enterprise value in a very challenging economic context, growth can come from providing superior value to customers. Transaction categorisation is a strong option, as it lets FSPs build actionable insights that truly support customers in a commercially viable way.

KliQr has built big data computing capabilities that can categorise large sets of data to generate actionable insights that support decision making. You can speak to us for more information on how we can support your growth ambitions.
