What is Railz's financial and accounting data normalization?

November 23, 2021

“Life is not complex. We are complex. Life is simple, and the simple thing is the right thing.” — Oscar Wilde

What is financial and accounting data normalization? We're here to take the complication out of normalization

When we speak about data normalization at Railz, we know it doesn’t sound like the most exciting topic. Internally, we get ourselves into quite a stir about the normalization engine we’ve built, but we want to do a better job at defining what we mean by data normalization and why financial data normalization is important to the success of your financial institution. The world is built on data, and the financial industry is no different. As per Forbes, the amount of data created and consumed in the world increased by almost 5000% from 2010 to 2020. Whether you’re a financial technology startup or a long-held financial services provider, you rely on your small- and medium-sized businesses’ (SMBs’) financial data in order to offer the best services and value to your commercial customers.

As your financial institution is well aware, there are over 30,000,000 businesses in the United States, and SMBs are the backbone of the economy. The only way you can successfully use the financial data from 30,000,000 businesses is to have that data arrive clean, standardized, and correctly mapped, which is what we refer to as normalization. We handle financial data normalization so you don’t have to. Below, we’ll explore exactly what our normalization engine accomplishes, how it works, and how it impacts your financial institution.

Turning accounting data into something digestible for your financial institution

Simplicity is becoming increasingly elusive and, ironically, a formidable problem to solve in our ever-evolving and interconnected world. Railz is in pursuit of building the largest financial data network to support the future of finance. In order to accomplish that, we need to make accounting data simple and effortless to work with. We’re making data easily digestible by providing our users a single “source of truth.” With Railz, you have a single data archetype across the broad spectrum of accounting service providers (ASPs) we integrate with, such as QuickBooks, FreshBooks, Sage Intacct, Xero, Oracle NetSuite, Plaid, and others.

Gathering all accounting and financial data on SMBs through one API and normalizing it results in an easy-to-use, cohesive Accounting Data-as-a-Service™ framework that empowers financial institutions to make the most of their SMB data. Normalizing data for your financial institution also considerably reduces development effort when building and managing accounting integrations for your financial products and services.

Why accounting and financial data needs to be normalized

Accounting and financial data is exceptionally messy by default, full of nuances and idiosyncrasies. Navigating the intricate layers of accounting data without an army of specialists can be enormously challenging for small- and medium-sized businesses and, more importantly, for your financial institution as it provides services and products to those SMBs.

There are a number of hidden layers that are being masked by Railz’s industry-leading accounting and financial data standardization. Every now and then, our data team loves to geek out about all the sexy nuts and bolts involved in our data normalization engine. To better understand how we designed our normalization pipeline we need to provide you a little more context. For the first time, Railz is excited to give you a little peek under the hood of our financial data normalization engine.

Example of why Railz normalizes your accounting and financial data

In order to create standardized financial statements across today’s ASP platforms, we need to consider some of the ways different accounting platforms can package similar accounting information.

A few basic examples show how differently accounting service providers package similar information. Compare two of the most popular ASPs that Railz integrates with, Xero and QuickBooks Online:

  • Xero has two balance sheet classifications, (i) Assets and (ii) Liabilities and Equity
  • QuickBooks Online uses (i) Assets, (ii) Liabilities and (iii) Equity

Accounts and sub-accounts are far more granular under QuickBooks Online. QuickBooks can run five levels deep for sub-accounts, whereas Xero maxes out at just one level. QuickBooks Online provides a PayType for Bill Payments, whereas Xero does not. QuickBooks provides detail on 19 different transaction types, whereas Xero supports 4. Because each of these ASPs defines its account categories and line-item configurations in its own way, reconciling accounting data across the two platforms is a real challenge.

How Railz normalizes your accounting and financial data: the moving parts behind our normalization engine

Railz uses the Chart of Accounts as a standardization reference to efficiently normalize across all financial statements with a 99.9% level of accuracy - this is important because it ensures integrity of the accounting data. For your financial institution, having this level of accuracy on your customer data empowers your organization to offer accurate services and products to your SMBs. This level of accuracy is unique to the Railz normalization engine and it presents you with the best data to make these decisions in the fastest and most efficient possible manner.

The trickiest part of normalization is actually the process by which all other financial statements are standardized against the prepared Chart of Accounts. We have to correlate all variations of the wild types of account names against the normalized Chart of Accounts.

There are generally two ways of implementing this type of approach. The first is to map ASP-specific fields to a standardized grouping of classes using a tuned combination of text preprocessing and a general Natural Language Processing (NLP) algorithm. Examples of the techniques we implement are bag of words and stemming/lemmatization for text preprocessing, and SVM and Random Forest for classification. A decision tree type of approach, such as Random Forest, is effective when training over vast amounts of data: predicting at a 99% level of confidence would require a minimum sample size of 9068 unique, non-overlapping line items. A dictionary or text corpus method like lemmatization is effective in very specific types of applications, such as film ratings or sentiment analysis. Its fundamental dependence on grouping by inflected forms of language greatly weakens its classification ability when contextual information is lacking in the dictionary lookup.

For instance, “Loans From Shareholders” and “Loans To Shareholders” would be given an identical score under a sentiment analysis tool like VADER. Even more problematic in this example, removing stop words (e.g. “to”) and stemming reduces both names to the same compound phrase, “Loan Shareholder”, even though one is a current liability and the other a current asset.
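The collision described above is easy to reproduce. The following is a minimal sketch, not our production pipeline: the stop-word list and the `naive_stem` helper are simplified, hypothetical stand-ins for what a full NLP toolkit would provide.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for illustration);
# real NLP toolkits ship much longer lists that also include "to" and "from".
STOP_WORDS = {"to", "from", "of", "the", "a", "an", "and"}


def naive_stem(token: str) -> str:
    """Strip a trailing plural 's': a crude stand-in for a real stemmer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(account_name: str) -> str:
    """Lowercase, tokenize, drop stop words, and stem what remains."""
    tokens = re.findall(r"[a-z]+", account_name.lower())
    kept = [naive_stem(t) for t in tokens if t not in STOP_WORDS]
    return " ".join(kept)


liability = preprocess("Loans From Shareholders")
asset = preprocess("Loans To Shareholders")

print(liability)           # loan shareholder
print(asset)               # loan shareholder
print(liability == asset)  # True: the direction of the loan is lost
```

Once “from” and “to” are discarded, nothing in the remaining text distinguishes the liability from the asset, so any classifier fed only this output has to guess.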

A second type of implementation, and the one that we elected to adopt at Railz, is to cluster all possible combinations of raw ASP categories by the largest subset of unique class types, Railz subGroup, then map these subGroup values to our standard super sections (Section, subSection and Group). We know in advance that each ASP has a finite grouping space. QuickBooks Online, for example, has only 15 FS types and 274 sub-types. We use a combination of these two fields to create an initial mapping to our own normalized category, “subGroup”. Railz does not modify any user-configurable data, such as account line-items, only the overarching classifications. These are the broader groupings that we call “Section”, “subSection” and “Group”:

Railz Normalization engine at work: text matching method

Once we have completed this outer relationship between the ASP and the Railz internal subGroup, the mapping to the super sections (Group, subSection, Section) can be easily calculated. This approach greatly reduces the mapping complexity of the accounting data. No other input or interaction is required from our users, which further ensures the accuracy of the data normalization. By contrast, with crowdsourced model training you’re paying for an incomplete service that requires your own resources to operate, costing you the time you wanted to save in the first place. This is not “User Control” but rather “User Offloading.” Thus, through a carefully tuned text matching method, you get the benefits of our normalization engine.
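To make the two-stage idea concrete, here is a minimal sketch of a lookup that first maps a raw ASP type and sub-type pair to a subGroup, then derives the super sections from that subGroup. Every table entry, key name, and label below is a hypothetical placeholder chosen for illustration; it is not the actual Railz mapping or the exact values returned by any ASP.

```python
from typing import NamedTuple


class SuperSections(NamedTuple):
    section: str
    sub_section: str
    group: str


# Stage 1 (illustrative entries only): (ASP, type, sub-type) -> subGroup.
# Each ASP has a finite grouping space, so this table can be enumerated up front.
RAW_TO_SUBGROUP = {
    ("quickbooks", "Other Current Asset", "LoansToStockholders"): "loansToShareholders",
    ("quickbooks", "Other Current Liability", "LoansFromStockholders"): "loansFromShareholders",
    ("xero", "Assets", "Current Asset"): "otherCurrentAssets",
}

# Stage 2 (illustrative entries only): subGroup -> (Section, subSection, Group).
SUBGROUP_TO_SUPER = {
    "loansToShareholders": SuperSections("Assets", "Current Assets", "Other Current Assets"),
    "loansFromShareholders": SuperSections("Liabilities", "Current Liabilities", "Other Current Liabilities"),
    "otherCurrentAssets": SuperSections("Assets", "Current Assets", "Other Current Assets"),
}


def normalize_account(asp: str, fs_type: str, sub_type: str, account_name: str) -> dict:
    """Attach normalized classifications; the user's account name is left untouched."""
    key = (asp, fs_type, sub_type)
    if key not in RAW_TO_SUBGROUP:
        raise KeyError(f"No subGroup mapping for {key}")
    sub_group = RAW_TO_SUBGROUP[key]
    supers = SUBGROUP_TO_SUPER[sub_group]
    return {
        "accountName": account_name,  # user-configurable data passes through as-is
        "subGroup": sub_group,
        "group": supers.group,
        "subSection": supers.sub_section,
        "section": supers.section,
    }


row = normalize_account(
    "quickbooks", "Other Current Liability", "LoansFromStockholders", "Loan from J. Doe"
)
print(row["section"], "/", row["subSection"], "/", row["group"])
```

Because the classification comes from the ASP's own type fields rather than from the free-text account name, “Loans From Shareholders” and “Loans To Shareholders” land in different sections without any text parsing.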

How data normalization can positively impact your financial institution

Having access to normalized financial and accounting data empowers your financial institution to build the financial products of the future. The use cases for normalized financial data span industries from insurance to wealth management to credit and lending to neobanks.

Use Case: How Railz Impacts Corl’s Non-Dilutive Startup Funding Platform

Corl, a non-dilutive startup and scale-up funding platform, uses Railz to speed up the underwriting process when making lending decisions. Corl is a revenue-based financier and Railz allows Corl to sync up with their customers’ online accounting software and populate financial data in an efficient manner. Corl needs to get access to normalized financial data of their small- and medium-sized business customers to provide non-dilutive capital.

Use Case: How Railz Powers whatifi’s Financial Forecasts

Startup whatifi, a financial scenario planning platform, helps finance professionals visually plot a company's various "what if" financial futures, all in real time. Using a mind-mapping-like interface, they enable lightning-fast decision-making and financial modeling. whatifi's customers include Chief Financial Officers (CFOs), Fractional CFOs, and individual business owners. whatifi relies on Railz and our normalization engine to automate and standardize the ongoing ingestion of a company's financial data and build a baseline for each "what if" scenario. Railz ensures that the beginning of every forecast is grounded in the most accurate and current financial data possible.

Railz's financial and accounting data normalization is necessary for your SMBs

If the financial and accounting data we provide you were not normalized, it would be pretty overwhelming. Your financial institution wouldn’t be able to build the products and services you want to offer your small- and medium-sized businesses to help them succeed in securing loans, credit, wealth and portfolio management - the list goes on. Using the Railz Accounting Data-as-a-Service™ API and normalization engine is like having our data science team on yours. If you have questions about our normalization engine, feel free to join our Slack Community to reach out to our data science team.

Director of Data Science

Pasha Zavari is the Director of Data Science at Railz. We're building the largest financial data network to support the future of finance. He has more than 10 years of industry success in developing and deploying advanced analytics, from Government to Fortune 500. In his free time, you can find Pasha playing toddler basketball with his son, singing barbershop or just jamming out on the guitar.