Today, businesses and developers use the Mono Transaction API (Transaction Metadata) to retrieve real-time transaction data from thousands of customers' financial accounts connected to their mobile or web apps. They use this data for everything from building investment advisory platforms to understanding customers' spending patterns and improving the experience on finance management tools.
However, because this data is returned uncategorized, there has been no way to programmatically extract the description of each transaction for classification or to determine what category a transaction falls under, for example, whether it was a bank charge, a loan repayment, or an online purchase from an e-commerce merchant.
Using machine learning, we have improved Transaction Metadata and built two new features that give businesses better categorization of, and insights into, customers' transactions: the Transaction Classifier and the Merchant Extractor.
What is the Transaction Classifier?
With the Transaction Classifier, you can get a detailed categorization of your customers' transactions and better understand their spending patterns. Transactions are classified into categories like Transfers, Airtime, Loan repayment, Bank charges, VAT, etc., so you can recognize the purpose of each credit and debit made on a customer's financial account. For example, you can determine whether the NGN 3000 debited from their account was an ATM withdrawal or a payment for groceries.
While every transaction statement shows details like the transaction amount, the transaction type (debit or credit), and the narration, the Transaction Classifier adds the category and subcategory of each transaction.
Here's an example:
Data collected without the Mono Transaction Classifier
Data collected with the Transaction Classifier
What is the Merchant Extractor?
The Merchant Extractor allows you to retrieve even more details from customers' transactions. It identifies which merchant the customer transacted with and then categorizes that merchant by industry.
For example, if a customer buys airtime online using their bank app, the Merchant Extractor will return the name of the telco they purchased airtime from as the merchant paid, and telecommunications as the merchant category.
More than just understanding your users' spending pattern, you can also determine which merchants they patronize often and how much they spend on a specific merchant. Rich data like this will help lending companies automate the computation of users' spending behaviour and assess their creditworthiness more accurately.
Here's an example:
Data collected without the Mono Merchant Extractor
Data collected with the Mono Merchant Extractor
How we built the Mono Transaction Classifier and Merchant Extractor
To ensure that we delivered high-quality data, and businesses could make sense of the data at a glance without a secondary engineering process, we started out with:
Data labelling: To make sure we had an adequate dataset to build on, we curated a large volume of transaction data and engaged a group of volunteers, mostly data scientists, to help us label it accurately.
Data cleaning and preprocessing: In this step, we removed elements like punctuation and asterisks from the dataset and converted the transaction narrations to lower case.
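To make the cleaning step concrete, here is a minimal sketch using only Python's standard library. The function name `clean_narration` and the sample narration are made up for illustration, and the exact rules we apply in production may differ.

```python
import re


def clean_narration(narration: str) -> str:
    """Lowercase a narration and strip punctuation such as asterisks."""
    lowered = narration.lower()
    # Replace anything that is not a lowercase ASCII letter, a digit,
    # or whitespace with a space.
    no_punct = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # Collapse the runs of whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", no_punct).strip()


# Hypothetical raw narration, not taken from real data.
print(clean_narration("POS/WEB PMT **ACME STORES** LAGOS, NG"))
# prints: pos web pmt acme stores lagos ng
```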
Feature engineering: Before training our model, we had to convert the narrations to numeric vectors, which is the input format machine learning algorithms expect. We tried several vectorization approaches and got good performance with the CountVectorizer.
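As a rough illustration of this step, the sketch below runs scikit-learn's `CountVectorizer` with its default settings on a handful of already-cleaned narrations. The narrations are invented for the example, and the settings we actually use are not shown here.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical, already-cleaned narrations used only for illustration.
narrations = [
    "atm withdrawal lagos",
    "airtime purchase web",
    "loan repayment acme lenders",
    "atm withdrawal abuja",
]

vectorizer = CountVectorizer()
# Learn the vocabulary and build a sparse document-term count matrix.
matrix = vectorizer.fit_transform(narrations)
dense = matrix.toarray()

# vocabulary_ maps each token to its column index in the matrix.
atm_column = vectorizer.vocabulary_["atm"]
print(matrix.shape)                  # (narrations, distinct tokens)
print(dense[:, atm_column].tolist())  # how often "atm" occurs per narration
```

Each row is one narration and each column is one token, so narrations that share words end up with similar vectors.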
Model building: The simplest way to categorize transactions would be keyword matching or a hand-written set of rules. To classify transactions more accurately than that, we used ensemble algorithms (bagging and boosting) for the Transaction Classifier and spaCy's custom named entity recognition (NER) for the Merchant Extractor.
Model evaluation: At this stage, we evaluated our models using AUC score, RMSE, recall, precision, and F1 score. We achieved a weighted F1 score above 0.94 and a log-loss of 0.2081.
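For readers unfamiliar with these two metrics, here is a small pure-Python sketch of how a weighted F1 score and a log-loss can be computed. The labels and probabilities are toy values invented for the example. It is not our evaluation code, and in practice a library such as scikit-learn would normally be used.

```python
import math
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * count
    return total / len(y_true)


def log_loss(y_true, probabilities, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    losses = []
    for label, probs in zip(y_true, probabilities):
        # Clip so that a probability of exactly 0 or 1 stays finite.
        p = min(max(probs.get(label, 0.0), eps), 1 - eps)
        losses.append(-math.log(p))
    return sum(losses) / len(losses)


# Toy labels and predicted probabilities, for illustration only.
y_true = ["airtime", "airtime", "transfer", "bank_charges"]
y_pred = ["airtime", "transfer", "transfer", "bank_charges"]
y_prob = [
    {"airtime": 0.8, "transfer": 0.2},
    {"airtime": 0.4, "transfer": 0.6},
    {"airtime": 0.1, "transfer": 0.9},
    {"bank_charges": 0.7, "transfer": 0.3},
]

print(round(weighted_f1(y_true, y_pred), 2))  # prints 0.75
print(round(log_loss(y_true, y_prob), 4))
```

A higher weighted F1 is better, while a lower log-loss means the model assigns more probability to the correct category.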
Behaviour testing: Even after model evaluation, we didn't want to depend only on traditional, generic metrics. We also care about the external behaviour of our models so that we can detect and prevent breaches and model drift. Machine learning models are like software: they need rigorous testing, and they degrade over time.
For behaviour testing, we used the Directional Expectation (DIR) test to check whether two different transaction narrations from the same category change the model's predictions.
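The check described above can be sketched as follows. Everything here is hypothetical: `toy_classifier` is a keyword-based stand-in for the trained model, and the narration pairs are invented. The point is only the shape of the test, which feeds the model two differently worded narrations from the same category and flags any pair where the prediction changes.

```python
def toy_classifier(narration: str) -> str:
    """Hypothetical stand-in for the trained model: simple keyword rules."""
    text = narration.lower()
    if "airtime" in text or "recharge" in text:
        return "airtime"
    if "loan" in text:
        return "loan_repayment"
    return "other"


def same_category_check(predict, narration_pairs):
    """Return the pairs whose two narrations receive different predictions."""
    failures = []
    for first, second in narration_pairs:
        if predict(first) != predict(second):
            failures.append((first, second))
    return failures


# Each pair holds two narrations that a human would put in the same category.
pairs = [
    ("airtime purchase web 0801", "mobile recharge ussd 0802"),
    ("loan repayment acme lenders", "acme lenders loan instalment"),
]

failures = same_category_check(toy_classifier, pairs)
print(f"{len(failures)} of {len(pairs)} pairs changed the prediction")
```

Any pair that lands in `failures` is a candidate for relabelling or for additional training data.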
Deployment: After training our model, we needed to take it to the cloud so that you can use it. We served the model through a Flask API, but we were also concerned about the scalability and latency of predictions and whether the service could handle multiple requests. So, we containerized the Flask API with Docker and deployed it on a virtual machine (VM) with load balancing.
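To give a sense of what serving a model behind Flask looks like, here is a minimal sketch. The `/classify` route, the JSON field names, and `predict_category` are assumptions made for this example and do not describe the actual Mono API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_category(narration: str) -> str:
    """Hypothetical placeholder for the trained classifier."""
    return "airtime" if "airtime" in narration.lower() else "other"


@app.route("/classify", methods=["POST"])
def classify():
    # silent=True returns None instead of raising on a missing or bad JSON body.
    payload = request.get_json(silent=True) or {}
    narration = payload.get("narration")
    if not isinstance(narration, str) or not narration.strip():
        return jsonify({"error": "narration is required"}), 400
    return jsonify({
        "narration": narration,
        "category": predict_category(narration),
    })
```

Inside a Docker container, an app like this would typically be run by a production WSGI server rather than Flask's built-in development server, with the load balancer distributing requests across containers.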
Monitoring setup: This was the most interesting part of the whole process for us because, beyond building the solution, we monitor it regularly to detect any drift in our model and inference data. We are also keen on improving our model's performance.
What use cases can you build with these features?
Lending companies need to understand customers' transactions over a certain period to assess their finances in more detail and underwrite better. With the Mono Transaction Classifier and Merchant Extractor, lenders can build credit scoring and loan default prediction models that help them make better, data-driven lending decisions.
These features label each transaction by category and help lenders answer questions like: Where does the user spend most of their money? What is the user likely to pay for? Which category or merchant does the user spend most of their income on? Are they repaying loans from other lenders? So, instead of going through the hassle of classifying these transactions themselves, lenders can now automate the process and achieve more accurate transaction and merchant categorization.
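As a sketch of how a lender might answer these questions once transactions are categorized, the example below totals a customer's debits per category and per merchant. The records and field names (`type`, `amount`, `category`, `merchant`) are invented for illustration and are not the actual response format.

```python
from collections import defaultdict

# Hypothetical classified transactions; field names are illustrative only.
transactions = [
    {"type": "debit", "amount": 3000, "category": "groceries", "merchant": "Acme Stores"},
    {"type": "debit", "amount": 1000, "category": "airtime", "merchant": "Example Telco"},
    {"type": "debit", "amount": 15000, "category": "loan_repayment", "merchant": "Sample Lender"},
    {"type": "credit", "amount": 120000, "category": "salary", "merchant": None},
    {"type": "debit", "amount": 2000, "category": "groceries", "merchant": "Acme Stores"},
]


def spend_by(records, key):
    """Sum debit amounts grouped by the given field, skipping missing values."""
    totals = defaultdict(int)
    for txn in records:
        if txn["type"] == "debit" and txn[key] is not None:
            totals[txn[key]] += txn["amount"]
    return dict(totals)


by_category = spend_by(transactions, "category")
by_merchant = spend_by(transactions, "merchant")

# Where does most of the money go, and is the user repaying other loans?
top_category = max(by_category, key=by_category.get)
repays_loans = by_category.get("loan_repayment", 0) > 0

print(by_category)
print(by_merchant)
print(top_category, repays_loans)
```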
e-Commerce and Buy Now Pay Later (BNPL)
Let’s say a phone company offers BNPL services that allow its customers to buy a phone and pay in monthly instalments. Before granting an instalment plan, the company wants to confirm that the customer has sufficient income to cover the instalments, check whether they have defaulted on loans or are repaying too many loans, and keep the experience seamless enough to prevent drop-off.
With Mono's transaction and merchant classification features, e-Commerce brands that offer a Buy Now Pay Later model, like the phone company, can deduce whether a customer is likely to keep up with the monthly instalment plan.
Finance management tools that help users aggregate their bank accounts and manage their finances seamlessly depend on well-categorized transaction data to deliver better financial experiences. With this intelligent transaction categorization, they can accurately determine how much a user spends on groceries, airtime, or transport every month and which merchants the user pays for these services. This lets users easily track what they are spending their money on without manually labelling their transactions. These tools can also recommend personalized, data-driven finance management tips to users based on their transaction data.
Want to get started with smart transaction categorization with Mono?
To request access to this feature and build better transaction and merchant categorization for your product, please reach out to our team via firstname.lastname@example.org.