# Thread: Trying to find least difference among numbers in O(n) time

1. Hi everybody. (This is my first post on the forum btw, just registered)

Anyway, I've been developing a very fast sorting algorithm. The array to which the newly sorted elements are assigned creates some extra empty elements due to the nature of the sort algorithm. In order to optimize the amount of memory allocated for this sort array, I have derived a formula for guaranteeing an optimal array size that minimizes the amount of extra empty elements generated, but the problem is that the formula assumes that I know what the least difference among all elements of the array being sorted is.

For example: if a set is 68, 7, 1, 4, 100 then the least difference is 3 because 4-1 = 3 and this is the smallest distance.

In it's most obvious implementation finding the least difference among a set of numbers seems to be an O(n^2) operation, since every element in the array would be subtracted from every other element in the array to find the distance and then assigned based on whether it is the current smallest distance (kind of like finding a min).

However, in order for my sort algorithm to be optimally effective I need this calculation to happen in O(n) time or better rather than O(n^2).

Does anyone know a clever way to calculate this value in O(n) or better time?

Alternatively, it would suffice almost just as well if the value returned was just approximately equal to the least difference among the set, provided that approximation is less than or equal to the actual real least difference.

As another alternative, perhaps simply being able to make an very good guess as to what it might be would suffice. Anyone have any good ideas for a good way to guess the value?

If this task proves impossible, then it seems I will simply have to parametrize this value as something set in advance by the user, kind of like a granularity specification. It would be okay that way, but it would be better if I could find the optimal granularity automatically for the user.

Any thoughts?

Thanks for your time and for reading.  2.

3. You can't do anything with n elements faster than O(n). Period.

What is the reason for development of a new sort algorithm?  4. Sorry for taking so long to respond (been busy with school). Thank for your time and for replying (and thank you to anyone else who viewed it as well).

Indeed, I agree that you can't do anything with n elements faster than O(n). However, that's not really what I asked. If you read my post carefully you'll notice that I implied that I was looking for an O(n) efficiency method. Granted, I perhaps shouldn't have tacked on the "or faster" part because it may have only added confusion. The reasoning was that I might as well say "O(n) or faster" because anything faster than O(n) would also certainly be good and worth hearing about. Also, notice that I also mentioned that an very good guess of what the min distance might suffice, thereby implying that a guess could perhaps be made in faster than O(n) time. For example, one very stupid way to guess at the min distance would just be to take the first two items and calculate the distance, which would be O(1) time, despite the fact that it would be a extraordinarily bad guess. Get my point?

Apologies for any bad wording on my part though. The "or faster" is far fetched for a true deterministic find of the minimal distance, but not so for methods of guessing.

As for the reason why I'd develop it, why not? I'm a fairly new programmer trying to develop good skills, and I also had an unusual concept for how to do it, so I wanted to explore it.

Also, after posting this thread I actually eventually did figure out a fast way to make a good guess at what the minimum distance is (although I couldn't find a faster than n^2 way to deterministically find the real min distance).

Here's how I came up with an algorithm for guessing the least distance:

1) Consider the fact that wherever the minimum distance is it is most likely to be located somewhere close to where the most elements in the set are (the densest spot).

2) Consider also the fact that the median of a set of data is generally the best representative of the most typical value of a set (assuming that set is sufficiently large, say... greater than 30 perhaps)

3) Consider that wherever the median may be it will most likely be within the most heavily occupied bar of a distribution graph of the set of data.

4) Therefore, allocate memory for a set of categories representing ranges of values in the data set. Then, pass through the array once and count the number of times each range matches a data item. This will form a distribution graph of the data set.

5) Pass through the distribution searching for the category with the greatest count value and remember which range it belonged to. This is fairly likely to be the location of the median if the data is normally distributed (curved like a bell curve or poison or w/e else). If the data is uniform then that means that since the data items are generally uniformly distributed then the distances between them are also probably fairly uniformly distributed, hence just about any distance might make a decent guess. etc...

6) Pick an element in this category and calculate the distance between it and other values in order to find the least of those distances.

This gives you a decent guess as to what the minimal distance might be, and does so in O(n) time if implemented correctly.

In any case though, I've actually moved on to a different approach to the algorithm, but I figured I'd post this for you guys in case any of you are interested.

Thank you for your time and for reading.  5. There are good standard sort algorithms which take O(nlogn) (where log is base 2) time to execute. Does your algorithm improve on this?  Bookmarks
 Posting Permissions
 You may not post new threads You may not post replies You may not post attachments You may not edit your posts   BB code is On Smilies are On [IMG] code is On [VIDEO] code is On HTML code is Off Trackbacks are Off Pingbacks are Off Refbacks are On Terms of Use Agreement