Hello,
I have a problem with an algorithm I am working with. It clusters and matches a set of points extracted from one image to another set extracted from another image. I am trying to weight each point's match based on which cluster it matches to.
For example, if I have two clusters of points in each image and I match the points, I can then build a histogram of how the points match between the clusters.
If all the points from cluster 1 in image 1 match to cluster 1 in image 2, I want to weight these matches as more likely to be correct, since they all agree on the same cluster. If one point matched to the other cluster, I would like to weight it as a likely mismatch. If the matches are spread evenly over the two clusters, I would like there to be no positive or negative weighting, as there is a 50/50 chance of either being right.
An example of the histogram data is as follows, where the rows are the clusters in image 1 and the columns are the clusters in image 2:
__1__2_
1| 22 5
2| 44 1
This shows that 22 points from cluster 1 in image 1 match to cluster 1 in image 2, and that 5 points from the same cluster match to cluster 2 in image 2.
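To make the table concrete, here is a minimal Python sketch of the row percentages I have in mind. NumPy and the names `counts` and `row_percentages` are only assumptions for illustration, not part of my actual code.

```python
import numpy as np

# Example histogram from above:
# rows = clusters in image 1, columns = clusters in image 2.
counts = np.array([[22, 5],
                   [44, 1]], dtype=float)

def row_percentages(counts):
    """Fraction of each image 1 cluster's matches that land in each image 2 cluster."""
    row_totals = counts.sum(axis=1, keepdims=True)
    # Assumes every row has at least one match; an empty row would divide by zero.
    return counts / row_totals

print(row_percentages(counts))
```

Each row of the result sums to 1, so the first row is 22/27 and 5/27, and the second row is 44/45 and 1/45.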
The problem is as follows:
1) I want to generate percentages from tables such as this to weight the responses and give higher values to those which match to the same cluster.
2) I want an equal distribution to generate a neutral response, i.e. a 50/50 split across a row should contribute nothing to the weighting (e.g. by subtracting the mean).
3) The main problem is that, when taking a percentage, a row with very few matches (e.g. a single match) results in a 100 percent weighting. Is there an elegant way of treating rows with a higher number of matches as more reliable without just using a threshold?
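The three points above can be sketched roughly as follows. This is only an illustration under my own assumptions: `cluster_weights` and `alpha` are hypothetical names, the rows `[1, 0]` and `[10, 10]` are made up to show the edge cases, and the pseudo-count (additive smoothing) is just one possible way to damp small counts, not necessarily the right one.

```python
import numpy as np

def cluster_weights(counts, alpha=0.0):
    """Row-wise match fractions minus the uniform baseline 1/k.

    alpha is a hypothetical pseudo-count added to every cell
    (additive smoothing); alpha=0 gives the plain percentages.
    """
    counts = np.asarray(counts, dtype=float)
    k = counts.shape[1]                      # number of clusters in image 2
    smoothed = counts + alpha
    # With alpha=0 this assumes no row is entirely zero.
    fractions = smoothed / smoothed.sum(axis=1, keepdims=True)
    return fractions - 1.0 / k               # an even split maps to 0

# First two rows are the example table; the last two are illustrative only.
table = [[22, 5], [44, 1], [1, 0], [10, 10]]

print(cluster_weights(table))            # the single-match row gets the maximum weight
print(cluster_weights(table, alpha=2))   # the single-match row shrinks towards 0
```

Without smoothing, the single-match row gets the same extreme weight as a fully consistent large cluster would. With a pseudo-count, rows with few matches are pulled towards the neutral value while rows with many matches barely move, and the evenly split row stays at zero either way.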
Thanks