
Thread: 'centrifugal sorting' problem?

  1. #1 'centrifugal sorting' problem? 
    Forum Freshman
    Join Date
    May 2009
    Posts
    33
    Hi, me again.
    I'll add this one and then shut up for a while.

    A problem I came across at work (it's for a medicinal chemistry-related project); our informaticians are working on it, but I'd like to get some more opinions.

    Say you have a matrix whose elements are either 'blank' or 'experimental data', where the latter is a positive floating point number. There is at least one number in each row and column, i.e. there are no completely blank rows or columns.
    The only operations you're allowed are swapping rows and swapping columns.
    What you want to achieve is a distribution of the data such that the smaller numbers are closer to the top left corner of the matrix (I'll call it 'origin'), and the bigger numbers get thrown far from that.
    Of course, depending on what the values are, you might get the odd big number close to the 'origin' and the odd small number far from it, but the aim would be to minimize that. So the data should be distributed in 2D around the origin according to their increasing values.

    My first attempt at solving it was based on a 'centrifugal' concept: I calculated the product of each data cell's value and its distance from the origin, and I tried to identify the pair of row-permutation and column-permutation matrices that, applied to my matrix, maximized the sum of those products.
    But they told me the computational cost of this was too high, and that it didn't handle the blank cells properly: they counted as zero, whereas they should not be included in the computation at all.

    Now we're evaluating other methods based on weighted row-wise and column-wise averaging, but I'm not sure we're getting any closer to a meaningful solution.

    What do you people think? Does it make any sense at all?
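    A minimal sketch of that 'centrifugal' score, assuming blanks are stored as NaN and distance means Euclidean distance from (0,0) (both are assumptions; the post doesn't fix either choice):

        import numpy as np

        def centrifugal_score(M):
            # Sum of value * distance-from-origin over the known (non-NaN) cells;
            # blank cells are skipped entirely rather than counted as zero.
            rows, cols = np.indices(M.shape)
            known = ~np.isnan(M)
            dist = np.hypot(rows[known], cols[known])  # Euclidean distance from (0, 0)
            return float(np.sum(M[known] * dist))

        # Toy check: swapping the two rows pushes the larger values further out,
        # which increases the score.
        M = np.array([[5.0, np.nan, 7.0],
                      [1.0, 2.0,    np.nan]])
        print(centrifugal_score(M))        # ~17.8
        print(centrifugal_score(M[::-1]))  # ~22.7

    Evaluating this score for one ordering is cheap; the expensive part is searching over the row and column permutations that maximize it.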



  3. #2  
    Forum Freshman
    Join Date
    May 2008
    Posts
    9
    It makes sense for the most part. But I do have a question: are the zero cells meant to sit around the origin, or are they pushed to the outer lower corner?



  4. #3  
    Forum Freshman
    Join Date
    May 2009
    Posts
    33
    Blank cells correspond to data we haven't measured yet, so they could be large or small values; we don't know.
    We would like to keep them out of the equation. In other words, ideally they shouldn't be distributed in any preferential way in the matrix. Only the cells whose value is known should be taken into account.

    And the other big problem is how to minimise the number of computations needed to achieve the optimal distribution, because otherwise we might as well compute all m!*n! permutations and try them out. But this would be pure madness.

    So my real question here was: what is the function of the values in each row and column that I can compute once, use to sort the rows and columns, and then repeat iteratively until a 'minimum' is reached?
    I guess convergent iterative methods are the best alternative to explicit computation.
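    One hypothetical instance of that 'compute a key per row and column, sort, repeat' idea, again with blanks stored as NaN: use the mean of the known values in each row and column as the sort key and alternate until nothing moves. This is only a sketch of the general scheme, not the method actually being evaluated:

        import numpy as np

        def iterative_sort(M, max_iter=100):
            # Alternately sort rows, then columns, by the mean of their known
            # (non-NaN) values, ascending, until the ordering stops changing.
            M = M.copy()
            for _ in range(max_iter):
                row_order = np.argsort(np.nanmean(M, axis=1), kind="stable")  # blanks ignored
                M = M[row_order]
                col_order = np.argsort(np.nanmean(M, axis=0), kind="stable")  # blanks ignored
                M = M[:, col_order]
                if (np.array_equal(row_order, np.arange(M.shape[0])) and
                        np.array_equal(col_order, np.arange(M.shape[1]))):
                    break                                   # fixed point: nothing moved
            return M

    Each pass is just two sorts, which is the appeal over trying all m!*n! permutations; whether the fixed point it reaches is anywhere near the best 'centrifugal' arrangement is exactly the open question here.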

  5. #4  
    Suspended
    Join Date
    Oct 2008
    Posts
    1,849
    How are you measuring distance? By "block-walking", by some kind of Pythagorean calculation involving square roots, or by some other measure?

    That is, is the address (0,5) closer to, further from, or the same distance from (0,0) as the address (2,3)?
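    For what it's worth, the two usual metrics disagree on exactly this example: under "block-walking" (Manhattan) distance, (0,5) and (2,3) are equally far from (0,0), while under Euclidean distance (2,3) is closer. A quick illustrative check:

        from math import hypot

        for r, c in [(0, 5), (2, 3)]:
            print((r, c), "block-walking:", abs(r) + abs(c), "euclidean:", round(hypot(r, c), 3))
        # (0, 5) block-walking: 5 euclidean: 5.0
        # (2, 3) block-walking: 5 euclidean: 3.606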

  6. #5  
    Forum Freshman
    Join Date
    May 2009
    Posts
    33
    If R is the row number and C is the column number, the distance I used was:

