Opened 7 years ago

Closed 7 years ago

#814 closed enhancement (fixed)

IGP: avoid calculating distance between every pair

Reported by: Peter
Owned by: Peter
Priority: minor
Milestone: yat 0.13
Component: classifier
Version: 0.12.1
Keywords:
Cc:

Description

IGP currently calculates the distance between every pair of samples:

for each sample i
  for each sample j
    d = distance(sample_i, sample_j)

which means that n*(n-1) distances are calculated, but due to symmetry half of them could be avoided. The only reasons I can see against this are that 1) it would break code if someone uses IGP with a distance that is not symmetric, and 2) it could cause some memory blow-up, since all distances need to be stored (in a matrix). We do, however, write in the weighted stats section that a Distance is symmetric, so I think we can make the change without breaking any contract. The memory needed to store an NxN matrix should only be a problem if N, the number of samples, is huge.
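
To illustrate the idea, here is a minimal C++ sketch that evaluates the distance only for j > i and mirrors the result into a full matrix. It is not the actual IGP code in yat: `EuclideanDistance` and `pairwise_distances` are hypothetical names chosen for this example, samples are plain `std::vector<double>`, and the functor is assumed to be symmetric.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a symmetric Distance functor (Euclidean here).
struct EuclideanDistance {
  double operator()(const std::vector<double>& a,
                    const std::vector<double>& b) const {
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
      const double diff = a[k] - b[k];
      sum += diff * diff;
    }
    return std::sqrt(sum);
  }
};

// Fill an n x n matrix by calling the distance only for j > i and
// mirroring each value; the diagonal is left at 0.
// Assumes distance(a, b) == distance(b, a).
template <typename Distance>
std::vector<std::vector<double>>
pairwise_distances(const std::vector<std::vector<double>>& samples,
                   const Distance& distance) {
  const std::size_t n = samples.size();
  std::vector<std::vector<double>> d(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = i + 1; j < n; ++j) {
      d[i][j] = d[j][i] = distance(samples[i], samples[j]);
    }
  }
  return d;
}
```

With this loop the distance is evaluated n*(n-1)/2 times instead of n*(n-1), at the cost of keeping the NxN matrix in memory.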

Change History (2)

comment:1 Changed 7 years ago by Peter

Milestone: changed from yat 0.x to yat 0.13
Owner: changed from Markus Ringnér to Peter
Status: changed from new to assigned

comment:2 Changed 7 years ago by Peter

Resolution: set to fixed
Status: changed from assigned to closed

(In [3320]) speedup. assume Distance is symmetric. closes #814
