Opened 11 years ago

Closed 8 years ago

#387 closed task (fixed)

svndigest:ignore but only ignore first revision

Reported by: Peter Johansson Owned by: Peter Johansson
Priority: major Milestone: svndigest 0.10
Component: core Version: trunk
Keywords: Cc:

Description

related to ticket:204

Sometimes when you copy a file from elsewhere, you would like this to be reflected in statistics, i.e., you don't want the lines to be counted. However, if someone later on make modifications of the file, you would like to count those lines.

Perhaps there could be an svn property for this case. This is essentially a stop-on-copy so the specification must be finalized in conjunction with ticket:204.

Change History (20)

comment:1 Changed 10 years ago by Peter Johansson

Milestone: svndigest 0.x+svndigest 0.10

I think we should use the properties we already have: svndigest:ignore and svncopyright:ignore.

svndigest:ignore="value" should following meaning

  • if value is int, we ignore changesets less and equal to value
  • else if value is no we dont ignore the file
  • else (including empty string) we ignore all changesets of file

The second case, "no", would be the same as property not set; however, it may be useful when property is set in svn repo and we can ovveride that setting via the config file.

comment:2 Changed 10 years ago by Peter Johansson

That information has to be stored in cache file somehow. Either declare the cache file invalid when this int is changed or take care of it in a clever way so cache file can be used but correctly.

comment:3 Changed 10 years ago by Peter Johansson

I think this should have higher priority than the stop-on-copy ticket (#204) because it is much easier to implement and with some tweaking users can accomplish stop-on-copy with it.

comment:4 Changed 10 years ago by Peter Johansson

Owner: changed from Jari Häkkinen to Peter Johansson
Status: newassigned

comment:5 Changed 9 years ago by Peter Johansson

The current behavior is that svncopyright listens to both svndigest:ignore and svncopyright:ignore. I wanna break this behavior so that svncopyright only listens to svncopyright:ignore. It is a design flaw to not makes them independent in the first place because it makes it impossible to have files/dirs ignored by svndigest and not ignored by svncopyright. This flaw becomes more prominent when this ticket has been implemented because imagine the following use case:

create foo.cc
commit foo.cc
@rev 1
..
edit foo.cc
commit foo.cc
@rev 2
..
svn copy foo.cc bar.cc
commit bar.cc
@rev 3
..
edit bar.cc
commit bar.cc
@rev 4

Here it's likely one wanna svndigest:ignore bar.cc in rev 1 and 2 whereas one wanna keep authors of rev 1 and 2 in the list of copyright holders of file bar.cc. If someone copy your file and base his code on it you'd expect to keep the copyright of that new file. At least users should have the choice to get this behavior. Only way to accomplish that now is to set the properties in config files and keep separate config files for svndigest and snvcopyright.

I'm never happy to break behavior and realize while writing this that perhaps we can solve it by introducing a third ignore property that only affects svndigest, in other words, we have three ignore properties:

  1. svndigest:ignore
  2. svncopyright:ignore
  3. new ignore property

svndigest ignores files if they have property 1) or 3) and svncopyright ignores files if they have 1) or 2) set. What would a good name be? Something with svndigest? svndigest-only:ignore?

comment:6 in reply to:  1 ; Changed 9 years ago by Peter Johansson

Replying to peter:

I think we should use the properties we already have: svndigest:ignore and svncopyright:ignore.

svndigest:ignore="value" should following meaning

  • if value is int, we ignore changesets less and equal to value
  • else if value is no we dont ignore the file
  • else (including empty string) we ignore all changesets of file

The second case, "no", would be the same as property not set; however, it may be useful when property is set in svn repo and we can ovveride that setting via the config file.

I wanna try that value should be a range instead, for example, 0-20 meaning ignore revs from 0 through 20. Empty on lhs of dash is interpreted as 0 and empty rhs is interpreted as infinity. Empty value would be same as "0-". A range set in props can be skipped by setting svndigest:ignore=0-0 in config file (We could translate "no" to 0-0 for convenience). Several ranges can be given comma separated, for example, 0-20,30-30. Need to work out whether it makes sense and if it is obvious how the stats should be calculated when revs are ignored. If that is problematic my plan B is to start with single ranges hat are bound to zero that for example 0-rev.

comment:7 in reply to:  5 Changed 9 years ago by Peter Johansson

Replying to peter:

I'm never happy to break behavior and realize while writing this that perhaps we can solve it by introducing a third ignore property that only affects svndigest, in other words, we have three ignore properties:

  1. svndigest:ignore
  2. svncopyright:ignore
  3. new ignore property

svndigest ignores files if they have property 1) or 3) and svncopyright ignores files if they have 1) or 2) set. What would a good name be? Something with svndigest? svndigest-only:ignore?

Makes more sense to decouple svndigest- and svncopyright-cache (ticket:385) before implementing this.

comment:8 in reply to:  6 Changed 9 years ago by Peter Johansson

Replying to peter:

Need to work out whether it makes sense and if it is obvious how the stats should be calculated when revs are ignored. If that is problematic my plan B is to start with single ranges hat are bound to zero that for example 0-rev.

For svncopyright it seems easy to implement the more flexible approach supporting ranges. Same thing for AddStats as svncopyright in principle is based on that statistics. They both work as looping over all commits and for each commit check what lines were added and what LineType, and for example for svncopyright add author to set of authors for that particular year. If we want to ignore certain revisions that is easy as we can just jump over those revisions in the loop.

For BlameStats and ClassicStats? it is more complicated. If we start with ClassicStats?. That stats is collected by looking at blame output for the most recent revision. If that for example looks like:

10 vlissides header
1 gamma first line
1 gamma second line
4 helm another line
4 helm 
1 gamma more text

And as described in the Manual, gamma gains two lines at rev 1, helm gains two lines at rev 4, and vlissides gains one line at rev 10. What does it mean that we ignore rev 4? Well, clearly helm should not gain any lines at rev 4 (because it is ignored). What if helm also removed a bunch of lines in revision 4? And what if those lines we reintroduced in the next revision? You could, for example, have the same file at revision 1:

header
first line
second line
third line
fourth line
more text

Who is now owner of the header line? Is it gamma or vlissides? And should lines third line and fourth line be counted for gamma? They were removed in revision 4 and if we are gonna ignore revision 4 one could argue that we should count them to gamma's stats. I'm not sure here but my gut feeling is that it'd make most sense to give header to gamma but not the lines that were deleted. That's means that diff revision 10 and revision 1, in which case header was not touched but some files were removed and we make the stats reflect this. The bottom line is that this is impossible to implement without make the stats based on diff rather than blame so it would require ticket:66 which is currently scheduled for svndigest 1.0 (not soonish). As BlameStats is essentially ClassicStats? performed on all revs, it suffers for the same reasons.

So my suggestion is that for svndigest only support ignores of type 64, which implies revision from 0 though 64 are ignored. For svncopyright it will be nice to support the more flexible style as 3-5 etc. svncopyright should also support singlets (64) which is interpreted as 0-64. And then we open a new ticket for the future that svndigest should support ignore ranges and the ticket will depend on ticket:66.

comment:9 Changed 9 years ago by Peter Johansson

(In [1438]) refs #387. extending svncopyrigt property in SVNproperty to hold which revs to ignore rather than just yes/no

comment:10 Changed 9 years ago by Peter Johansson

(In [1448]) refs #387. extend test with a case of invalid config file

comment:11 Changed 9 years ago by Peter Johansson

(In [1449]) refs #387. svncopyright no longer listens to property svndigest:ignore. New revision (71) in test repo where svncopyright:ignore was added to dir_to_be ignored.

comment:12 Changed 9 years ago by Peter Johansson

(In [1450]) refs #387. add function Node::property

comment:13 Changed 9 years ago by Peter Johansson

(In [1451]) add ignore_revs variable to CopyrightStats? interface, refs #387

comment:14 Changed 9 years ago by Peter Johansson

Working on this a question popped up regarding how these ignore revs should propagate from node to sub-node. Say that you have foo' and foo/bar' and foo has svncopyright:ignore=5-99 and foo/bar has svncopyright:ignore=50-200. Which revs should actually now be ignored by foo/bar? Obvious choices seems 50-200 (because that is what foo/bar's prop says, but perhaps even more natural would be to take the union of the two sets and ignore revs 5-200. That is inline with current behavior that if we set ignore property on a directory also the subnodes are completely ignored. If anyone this taking the union is a bad idea, speak up now - or ...

comment:15 in reply to:  14 Changed 9 years ago by Jari Häkkinen

Replying to peter: I think it should be the union 5-200. The ignore property on directory should have precedence. So I suppose I could have been quiet.

comment:16 Changed 9 years ago by Peter Johansson

(In [1453]) refs #387. implement core of svncopyright:ignore

comment:17 Changed 9 years ago by Peter Johansson

(In [1456]) pass svncopyright:ignore values from directory to sub-nodes. refs #387

comment:18 Changed 9 years ago by Peter Johansson

Another question: With new behavior it is possible that svncopyright works on the file (not completely ignored) but all copyright worthy revisions are set to be ignored. How should svncopyright act in that situation? Currently an assert is triggered which obviously is not OK. Should svncopyright set the Copyright to empty string or should it leave the file alone? I tend to lean to the latter now when writing this. That means that svncopyright:ignore= and svncopyright:ignore=0-999999 behave the same way. Another reason I don't like if svncopyright sets Copyright to empty string is that copyright changes would not converge after one run, instead if there is a second Copyright (C) string it will be removed in a second run, and if there is a third Copyright (C) string it will be removed in the third run et cetera.

comment:19 Changed 9 years ago by Peter Johansson

(In [1457]) if all copyright worthy revs are set to be ignored, do not touch the file. refs #387

comment:20 Changed 8 years ago by Peter Johansson

Resolution: fixed
Status: assignedclosed

(In [1478]) closes #387

Note: See TracTickets for help on using tickets.