- Built with Python3 and Flask
- Data from the World Bank Data Catalog
- Visualizations with Plotly.js
- Interpolation: a way of constructing new data points between points you already know. Here we used a simple average, but interpolation can get very complex depending on your data and what you are trying to fill in.
- Choropleth: a geographic map in which areas are colored according to some statistical measure. A heat map is similar, in that areas are colored according to some statistical measure, but it is not necessarily geographic.
We were looking through the World Bank Data Catalog (which we highly recommend: it is very easy to navigate and contains so much data!) and found a listing of the financial inclusion indicators. We downloaded the data on all countries as a CSV from the G20 Financial Inclusion Indicators and extracted the number of ATMs per 100K adults in each country for the years 2011-2017. We thought this could be an interesting indicator of countries becoming more inclusive and offering more banking services, with ATMs as the proxy, or it could tell a different story if cash is becoming less prominent and mobile banking is on the rise.
We did need to do a little data cleaning, as some countries were missing data for some or all years. Where we had more than two points for a country, we did a simple interpolation to fill in missing years, though we did not extrapolate into the future.
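A minimal sketch of that gap-filling step, not our exact code. It assumes each country's data is a dict mapping year to a value or `None`, and it uses linear interpolation between the nearest known years, which reduces to the simple average when a single year is missing. The function name `fill_gaps` and the `min_known` parameter are hypothetical; the default of 3 reflects the "more than two points" rule above.

```python
from typing import Dict, Optional


def fill_gaps(series: Dict[int, Optional[float]],
              min_known: int = 3) -> Dict[int, Optional[float]]:
    """Fill interior gaps by linear interpolation; never extrapolate.

    `series` maps year -> value, with None marking a missing year.
    Hypothetical helper, written for illustration only.
    """
    known = sorted(year for year, value in series.items() if value is not None)
    filled = dict(series)  # copy so the input is left unchanged

    # Too few known points: leave the country as it is.
    if len(known) < min_known:
        return filled

    # Walk consecutive pairs of known years and fill the years between them.
    # Years before the first or after the last known year are never touched,
    # so nothing is extrapolated.
    for left, right in zip(known, known[1:]):
        for year in range(left + 1, right):
            if year in filled:
                weight = (year - left) / (right - left)
                filled[year] = series[left] + weight * (series[right] - series[left])
    return filled
```

For example, a single missing year between values of 10.0 and 20.0 is filled with 15.0, while a missing final year stays `None`.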
Plotly.js is a really easy tool for doing choropleths, and without too much effort it gives a nicely colored map. However, to use the colorscale, we had to split the data into two groups, countries we had data for and countries with missing data, in order to give the NaN values a grey color instead of the default white.
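One way to sketch that split is to build two choropleth traces on the Python side and hand them to Plotly.js as JSON. This is an illustration under assumptions, not our exact code: it assumes countries are keyed by ISO-3 code, and the function name `build_traces`, the grey value, and the `Viridis` colorscale are all placeholder choices.

```python
import json
import math
from typing import Dict, List, Optional

GREY = "rgb(200,200,200)"  # assumed shade for countries without data


def build_traces(values: Dict[str, Optional[float]]) -> List[dict]:
    """Return two Plotly choropleth traces: one colored by value, one flat grey.

    `values` maps an ISO-3 country code to a number, or to None/NaN if missing.
    Hypothetical helper, written for illustration only.
    """
    have = {code: v for code, v in values.items()
            if v is not None and not math.isnan(v)}
    missing = [code for code in values if code not in have]

    data_trace = {
        "type": "choropleth",
        "locationmode": "ISO-3",
        "locations": list(have.keys()),
        "z": list(have.values()),
        "colorscale": "Viridis",
        "colorbar": {"title": {"text": "ATMs per 100K adults"}},
    }
    # Second trace: every missing country gets the same z and a one-color
    # scale, so it renders grey and does not add a second colorbar.
    missing_trace = {
        "type": "choropleth",
        "locationmode": "ISO-3",
        "locations": missing,
        "z": [0] * len(missing),
        "zmin": 0,
        "zmax": 1,
        "colorscale": [[0, GREY], [1, GREY]],
        "showscale": False,
        "hoverinfo": "location",
    }
    return [data_trace, missing_trace]
```

The returned list can be serialized with `json.dumps` and passed as the data argument to `Plotly.newPlot` in the page.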
Plotly.js also made it easy to capture a map click and open a new window that gives specifics for that country; in this case, a chart over time of the actual number of ATMs per 100K adults.
How Does This Help Find Financial Crimes?
Choropleths are a great way to understand what is happening in different geographies, and time series charts are an excellent way to detect possible shifts in behavior. Looking only at the 200+ raw lines of the CSV data file would make it nearly impossible to spot trends worth exploring further.