Methods to correct it
One key assumption was made to correct the data set: that the error in the latitudes and longitudes is a shift by a constant value. In other words, we assume that the errors are systematic rather than random.
We further assume that the distance of each apartment from its corresponding city center is consistent and correct, so that the error lies in the position of the settlements (cities) as a whole. We then determine:
D_i = x_i - x_bar_j
where x_i is the coordinate (either latitude or longitude) of apartment i in the data set, x_bar_j is the average coordinate of the city j that apartment i is in (i.e. the city center), and D_i is the coordinate distance of apartment i from the city center x_bar_j.
Given our assumption about the nature of the error, this means that if we replace x_bar_j with the repaired center x_bar_j*, keeping D_i fixed, we can recover the repaired location of each apartment.
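As a rough illustration of this step, the sketch below computes each city's mean coordinate, takes each apartment's offset from it, and adds that offset to a repaired center, i.e. x_i* = x_bar_j* + D_i. It assumes the sign convention D_i = x_i - x_bar_j; the function name `repair_coordinates` and the data layout (tuples of city, latitude, longitude) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def repair_coordinates(apartments, repaired_centers):
    """Shift every apartment so that its city's mean lands on the repaired center.

    apartments:       list of (city, lat, lon) tuples (hypothetical layout)
    repaired_centers: dict mapping city -> (lat, lon) of the repaired center
    """
    # Accumulate per-city sums to obtain the (erroneous) city centers x_bar_j.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for city, lat, lon in apartments:
        acc = sums[city]
        acc[0] += lat
        acc[1] += lon
        acc[2] += 1

    repaired = []
    for city, lat, lon in apartments:
        sum_lat, sum_lon, count = sums[city]
        mean_lat, mean_lon = sum_lat / count, sum_lon / count
        # D_i = x_i - x_bar_j, assumed to be correct despite the shift.
        d_lat, d_lon = lat - mean_lat, lon - mean_lon
        # x_i* = x_bar_j* + D_i
        center_lat, center_lon = repaired_centers[city]
        repaired.append((city, center_lat + d_lat, center_lon + d_lon))
    return repaired
```

Because only the city mean is replaced, the relative positions of apartments within a city are unchanged, which is what the constant-shift assumption requires.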
To find the repaired city centers x_bar_j*, we used the Google Geocoder API. The code for this is shown below:
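A minimal sketch of such a lookup, assuming the Geocoding API's JSON web endpoint and an API key supplied by the caller. The helper names `build_geocode_url`, `parse_geocode_response`, and `geocode_city` are illustrative and not taken from the project.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(city, api_key):
    """Build the request URL for a city name (api_key is supplied by the caller)."""
    query = urllib.parse.urlencode({"address": city, "key": api_key})
    return f"{GEOCODE_URL}?{query}"


def parse_geocode_response(payload):
    """Return (lat, lng) of the first result, or None if the lookup failed."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode_city(city, api_key):
    """Query the API for one city and return its repaired center, or None."""
    url = build_geocode_url(city, api_key)
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_geocode_response(json.load(response))
```

Keeping the URL construction and the response parsing separate from the network call makes both easy to check without contacting the service.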
The full implementation can be found here.