A Very Simple Bias Algorithm Using Real Data

While researching examples of biased algorithms, I was not able to find simple ones that did not require in-depth knowledge of other topics like coding or statistics, so I thought it would be fun to build and analyze a very simple biased algorithm. The goal is to understand how bias works in an algorithm.

I've picked the cities of Lewiston (ID) and Clarkston (WA) for this example. These two cities share the Lewis-Clark Valley at the confluence of the Snake and Clearwater rivers. Now, let's assume we work for a real estate investment firm looking into building new housing developments. Let's assume another algorithm has already segmented the real estate market in the LC Valley into four zones based on population and other parameters: Red, Yellow, Blue, and Green. Finally, let's assume we are tasked with creating an algorithm that decides where the company should invest next. The four parameters we are going to use for our very simple algorithm are schools, supermarkets, hospitals/clinics, and restaurants. Here is a map I built using our fictitious zones and some real data from the Lewis-Clark Valley.


LC Valley Hospitals, Restaurants, Schools, and Supermarkets

We first count how many of each parameter we have in each zone. Our company has historically built models that give an equal positive value to schools, supermarkets, and hospitals, and a negative value to restaurants. Restaurants had historically been considered a sign of commercial areas by our customers, so our sales department had consistently flagged them as a negative value when selecting areas. Using this historical logic, we build Algorithm 1. The results tell us that we should build in the Green zone and avoid the Blue zone at all costs. Algorithm 1 is a great tool, and everyone in our company is happy with it because it gives us the expected result. We are actually planning on using it to analyze real estate across the country.
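To make the scoring concrete, here is a minimal sketch of Algorithm 1 in Python. The per-zone counts below are hypothetical placeholders, not the actual numbers from the map, so the point is the weighting scheme rather than the exact figures.

```python
# Hypothetical amenity counts per zone -- placeholder values, not the real map data.
zones = {
    "Red":    {"schools": 1, "supermarkets": 1, "hospitals": 0, "restaurants": 2},
    "Yellow": {"schools": 2, "supermarkets": 1, "hospitals": 1, "restaurants": 3},
    "Blue":   {"schools": 4, "supermarkets": 3, "hospitals": 3, "restaurants": 14},
    "Green":  {"schools": 5, "supermarkets": 3, "hospitals": 2, "restaurants": 4},
}

# Algorithm 1: schools, supermarkets, and hospitals each add one point,
# while every restaurant subtracts a full point.
WEIGHTS_V1 = {"schools": 1, "supermarkets": 1, "hospitals": 1, "restaurants": -1}

def score(counts, weights):
    """Weighted sum of amenity counts for a single zone."""
    return sum(weights[name] * count for name, count in counts.items())

ranking = sorted(zones, key=lambda z: score(zones[z], WEIGHTS_V1), reverse=True)
print(ranking)  # ['Green', 'Yellow', 'Red', 'Blue'] with these placeholder counts
```

With these placeholder counts, the weighted sum puts the Green zone first and the Blue zone last, the same pattern described above.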

One day we decide to visit the LC Valley and we learn something very interesting. The number of residents in the Blue zone, the downtown area of Lewiston, is decreasing every year, while the number of residents in the Green zone keeps increasing. We think this is due to too many commercial buildings, like restaurants, making people not want to live there, but our research finds that the real cause is the lack of new housing investment by companies like ours. It seems that restaurants are not the problem; our perception of them is. So we go back to our company and ask everyone how they actually feel about restaurants. We find that giving restaurants a negative value equal in magnitude to the other parameters is inconsistent with what our customers see as ideal, yet our algorithm does not reflect these aspirational goals; it just solidifies the status quo. Now we go back to our algorithm and create Algorithm 2. In Algorithm 2, schools, supermarkets, and hospitals keep their equal positive value, but each restaurant now counts for only half a negative point. We still do not want too many restaurants when picking zones for new housing, but we agree that areas with plenty of existing resources like hospitals, schools, and supermarkets should not be punished with a full negative point for each restaurant in that zone. The results of Algorithm 2 still point at the Green zone as our first choice, yet now the Blue zone is the second area where we want to invest. Something similar happens with the Red versus the Yellow zone in Clarkston.
Algorithm 1 vs. Algorithm 2 results for the Red, Yellow, Blue, and Green zones
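For comparison, here is the same sketch with the halved restaurant penalty of Algorithm 2, again using hypothetical placeholder counts rather than the real map data. A single weight change is enough to move the Blue zone from last place to second.

```python
# Same hypothetical counts as before; only the restaurant weight changes.
zones = {
    "Red":    {"schools": 1, "supermarkets": 1, "hospitals": 0, "restaurants": 2},
    "Yellow": {"schools": 2, "supermarkets": 1, "hospitals": 1, "restaurants": 3},
    "Blue":   {"schools": 4, "supermarkets": 3, "hospitals": 3, "restaurants": 14},
    "Green":  {"schools": 5, "supermarkets": 3, "hospitals": 2, "restaurants": 4},
}

WEIGHTS_V1 = {"schools": 1, "supermarkets": 1, "hospitals": 1, "restaurants": -1.0}  # full penalty per restaurant
WEIGHTS_V2 = {"schools": 1, "supermarkets": 1, "hospitals": 1, "restaurants": -0.5}  # halved penalty per restaurant

def rank(weights):
    """Order the zones from best to worst under the given weights."""
    totals = {zone: sum(weights[k] * v for k, v in counts.items())
              for zone, counts in zones.items()}
    return sorted(totals, key=totals.get, reverse=True)

print("Algorithm 1:", rank(WEIGHTS_V1))  # ['Green', 'Yellow', 'Red', 'Blue']
print("Algorithm 2:", rank(WEIGHTS_V2))  # ['Green', 'Blue', 'Yellow', 'Red']
```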

Reevaluating our algorithm against not only the results we expected but also the results we aspire to was critical to auditing and improving this very simple algorithm. If this were a real algorithm, it could affect the lives of many people in the LC Valley for generations to come. That is why auditing algorithms for bias is critical before implementation.
