NABET, NABET 2017 Conference

Big Data Analytics in Tax Fraud Detection
Bernice Marie Purcell

Last modified: 2018-01-14

Abstract


Big data is the term applied to datasets that exceed the capacity of traditional database technology. Such datasets are collected in all professional fields, including taxation. One use of big data analysis, or analytics, in the taxation field is the discovery of tax fraud.

Big data is characterized by four terms: volume, velocity, variety, and veracity. These characteristics mean that big data occupies large amounts of storage space and is gathered from diverse sources, stored in diverse formats, and updated at different intervals. The specific processing used in tax fraud analysis of big data is data mining, a process now often referred to as analytics. Data mining itself is one step in a larger process that practitioners call knowledge discovery in databases (KDD). Two key groups of data mining tasks employed in fraud discovery are predictive tasks and descriptive tasks. Predictive tasks make a prediction for each observation, whereas descriptive tasks describe the data examined as a whole.
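The distinction between descriptive and predictive tasks can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the deduction-to-income ratios, the fraud labels, and the function names `describe`, `fit_centroids`, and `predict` are hypothetical, and the nearest-mean rule stands in for whatever predictive technique an analyst would actually choose.

```python
from statistics import mean

# Hypothetical labeled history: (deduction-to-income ratio, was_fraud).
history = [(0.05, False), (0.08, False), (0.10, False), (0.45, True), (0.50, True)]

def describe(ratios):
    """Descriptive task: summarize the examined data as a whole."""
    return {
        "count": len(ratios),
        "mean": mean(ratios),
        "min": min(ratios),
        "max": max(ratios),
    }

def fit_centroids(labeled):
    """Learn the mean ratio of each class from the labeled history."""
    return {
        label: mean(ratio for ratio, y in labeled if y == label)
        for label in (False, True)
    }

def predict(ratio, centroids):
    """Predictive task: give one observation the label of the nearest class mean."""
    return min(centroids, key=lambda label: abs(ratio - centroids[label]))

summary = describe([ratio for ratio, _ in history])
centroids = fit_centroids(history)

# Hypothetical new returns to score, one prediction per observation.
new_returns = [0.07, 0.40]
predictions = [predict(ratio, centroids) for ratio in new_returns]
```

The descriptive step yields a single summary for the whole dataset, while the predictive step yields one label for each new observation.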

Various agencies impose numerous taxes on society, all of which are subject to fraud. Fraud exists in many forms, and the Internal Revenue Code defines fraud in several places. Investigators use both direct and indirect methods to detect fraud. Direct methods include matching reported data to information returns received by the Internal Revenue Service. Indirect methods include analytical procedures, review of documents, observation, and informants. Analytics can greatly enhance these traditional methods of finding fraud.
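The direct method of matching reported data to information returns can be sketched as a simple comparison of totals. This is a minimal illustration under stated assumptions, not the procedure any agency actually uses: the function name `find_underreporting`, the record layout, the whole-dollar amounts, and the tolerance are all hypothetical.

```python
from collections import defaultdict

def find_underreporting(reported_income, information_returns, tolerance=0):
    """Return {taxpayer_id: shortfall} where third-party totals exceed reported income.

    reported_income: dict of taxpayer_id -> income reported on the return (hypothetical layout).
    information_returns: iterable of (taxpayer_id, amount) filed by third parties.
    tolerance: shortfall, in whole dollars, that is ignored.
    """
    # Sum all third-party amounts for each taxpayer.
    third_party_totals = defaultdict(int)
    for taxpayer_id, amount in information_returns:
        third_party_totals[taxpayer_id] += amount

    # Flag taxpayers whose reported income falls short of the third-party total.
    # A taxpayer with information returns but no filed return counts as reporting 0.
    flagged = {}
    for taxpayer_id, total in third_party_totals.items():
        shortfall = total - reported_income.get(taxpayer_id, 0)
        if shortfall > tolerance:
            flagged[taxpayer_id] = shortfall
    return flagged

# Hypothetical data in whole dollars.
reported = {"A": 50000, "B": 30000}
returns = [("A", 50000), ("B", 30000), ("B", 12000), ("C", 8000)]
flagged = find_underreporting(reported, returns, tolerance=100)
```

In this example taxpayer B is flagged for a second information return that was not reflected in reported income, and taxpayer C is flagged because an information return exists with no matching reported income.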

 


Keywords


Big data, analytics, taxation, tax fraud, fraud analytics