Friday, June 5, 2015

Inverse Geographic Mapping: Experimental Setup

This is another post in a series exploring an attempt to map from a set of distance features back to an x, y coordinate system. If you haven't already, you may want to start with the introductory post in the series and work your way through. This post assumes knowledge of what was presented in the two previous posts.

Experimental Setup

Because the data I have from Kaggle does not include any x, y coordinates, it may be difficult to tell whether my approach is effective. In practice, there are things that can be done with the Kaggle data that might provide some indication of correctness, but it will be simpler to create my own data sets for initial development and testing. Below I describe the steps for setting up my practice data.

The basic setup:

The first step is to randomly generate a set of reference and sample points. Sample points are generated with x and y position values drawn from a normal distribution with a mean of 0 and a standard deviation of 1000, and reference points with position values drawn from a normal distribution with a mean of 0 and a standard deviation of 2000. I vary the number of practice points from run to run, but for demonstration, here is a graph of 10 sample points with one of each of the three reference point types.
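As a rough illustration, the generation step described above could be sketched in Python as follows. This is not necessarily the generator used for this post: the function name `generate_points`, the fixed seed, and the dictionary layout are assumptions made for the example, and the reference type names are taken from the "horizontal distance to..." field names.

```python
import random

def generate_points(count, std_dev, rng):
    """Draw `count` (x, y) points, each coordinate from Normal(0, std_dev)."""
    return [(rng.gauss(0.0, std_dev), rng.gauss(0.0, std_dev))
            for _ in range(count)]

# A fixed seed is an assumption here, used only to make runs repeatable.
rng = random.Random(42)

# 10 sample points with a standard deviation of 1000.
sample_points = generate_points(10, 1000.0, rng)

# One reference point per type, with a standard deviation of 2000.
reference_names = ["Fire_Points", "Hydrology", "Roadways"]
reference_points = dict(zip(reference_names,
                            generate_points(len(reference_names), 2000.0, rng)))
```

Using the larger spread for the reference points means they tend to fall farther from the origin than the sample points do.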

The distance between each sample point and each reference point is calculated to get the "horizontal distance to..." fields:

Id  Horizontal_Distance_To_Fire_Points  Horizontal_Distance_To_Hydrology  Horizontal_Distance_To_Roadways
0                          3095.843540                       2182.240619                      3553.646848
1                          4156.900497                       1200.977046                      4765.299586
2                           643.074660                       4513.132292                      4881.010825
3                          1820.176689                       4114.332954                      5843.656280
4                          2504.993651                       3740.113119                      6036.146318
(The distance fields for the first 5 sample points in the generated set.)
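A minimal sketch of the distance calculation, assuming plain Euclidean distance in the x, y plane and one reference point per type. The helper `distance_row` and the small hand-picked coordinates are made up for the example; they are not the values behind the table above.

```python
import math

def distance_row(sample, reference_points):
    """Build the 'Horizontal_Distance_To_...' fields for one sample point."""
    sx, sy = sample
    return {
        "Horizontal_Distance_To_" + name: math.hypot(sx - rx, sy - ry)
        for name, (rx, ry) in reference_points.items()
    }

# Hypothetical coordinates chosen so the distances are easy to check by hand.
reference_points = {
    "Fire_Points": (3.0, 4.0),
    "Hydrology": (0.0, 10.0),
    "Roadways": (-6.0, 8.0),
}
sample_points = [(0.0, 0.0), (3.0, 4.0)]

# One row per sample point, with an Id column like the table above.
rows = [{"Id": i, **distance_row(point, reference_points)}
        for i, point in enumerate(sample_points)]
```

For the sample at the origin, the three distances come out as 5, 10, and 10, which is easy to confirm with the Pythagorean theorem.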

The practice data is now in the same form as the Kaggle data. I can run it through my process, take the x and y coordinates that are output, and compare them to the original positions of the practice data. There is no way for the process to determine the correct orientation or absolute position, but if it works properly it should recover the original locations up to a rotation, a reflection, and a translation.
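One possible way to make that comparison concrete is sketched below. It is an assumption about how the check could be done, not the method used in this series: both point sets are centered to remove the translation, the best-fit 2D rotation is computed in closed form, and a mirrored copy is also tried to allow for a reflection. The function names and the test coordinates are hypothetical.

```python
import math

def center(points):
    """Shift points so their centroid is at the origin (removes translation)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]

def best_rotation_error(a, b):
    """RMS distance after rotating centered set a as close as possible to b."""
    c = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    s = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(s, c)  # least-squares optimal rotation angle in 2D
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    total = 0.0
    for (ax, ay), (bx, by) in zip(a, b):
        rx, ry = ax * cos_t - ay * sin_t, ax * sin_t + ay * cos_t
        total += (rx - bx) ** 2 + (ry - by) ** 2
    return math.sqrt(total / len(a))

def alignment_error(recovered, original):
    """Smallest RMS error over rotations, with and without a reflection."""
    a, b = center(recovered), center(original)
    mirrored = [(x, -y) for x, y in a]
    return min(best_rotation_error(a, b), best_rotation_error(mirrored, b))

# Hypothetical check: reflect, rotate, and translate a known point set.
original = [(0.0, 0.0), (1000.0, 0.0), (0.0, 500.0), (-300.0, 800.0)]
angle, shift = 0.7, (250.0, -40.0)
recovered = [(-x * math.cos(angle) - y * math.sin(angle) + shift[0],
              -x * math.sin(angle) + y * math.cos(angle) + shift[1])
             for x, y in original]
error = alignment_error(recovered, original)  # should be close to zero
```

An error near zero means the two sets agree up to rotation, reflection, and translation; a large error means the recovered shape itself differs from the original.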

Further Steps:

The setup above is about as simple as I can make it and is a good starting place. As I continue to develop the process, I will also need to consider how it handles situations where there is more than one of each reference point type. For example, what happens when there are 3 water points? Does it matter if they are far apart, close together, or in a line? What happens near the boundaries, where a sample point is nearly the same distance from 2 or more of the water points? It is straightforward to extend my practice set generator to include additional reference points, and I will likely include an example of that when I get to exploring the process at that level.
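To make the multiple-reference-point case concrete, here is a small sketch. It assumes that each distance field records the distance to the nearest reference point of that type, and it also reports how much farther away the second-nearest point is, as one possible way to flag the boundary cases mentioned above. The helper `nearest_with_margin` and the coordinates are hypothetical.

```python
import math

def nearest_with_margin(sample, references):
    """Return (distance to nearest reference, gap to the second nearest)."""
    sx, sy = sample
    ordered = sorted(math.hypot(sx - rx, sy - ry) for rx, ry in references)
    nearest = ordered[0]
    # With a single reference point there is no ambiguity, so the gap is infinite.
    margin = ordered[1] - nearest if len(ordered) > 1 else math.inf
    return nearest, margin

# Three hypothetical water points.
water_points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

# A sample clearly closest to the first water point.
clear_case = nearest_with_margin((1.0, 1.0), water_points)

# A sample equidistant from the first two water points (a boundary case).
boundary_case = nearest_with_margin((5.0, 2.0), water_points)
```

A margin near zero marks a sample whose nearest water point could switch with a small change in position, which is where the inverse mapping seems most likely to be ambiguous.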

I may need to extend the practice set generation even further as I iterate between getting the process running on the practice data and seeing how it performs on the actual data. The next step for the blog, though, is to start looking at the basics of how my current approach works with just the simple practice setup.
