Data to improve our code review queue

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Data to improve our code review queue

Quim Gil-2
Please have a look to some graphs visualizing interesting data from our
code review queues in Gerrit, focusing in the key Wikimedia software
projects.[1]

http://korma.wmflabs.org/browser/gerrit_review_queue.html

The queue of open chagesets keeps growing. We have open changesets
submitted in every month since March 2012. However, since last December we
must be doing something right, because the median times to update and
resolve submissions are decreasing.

Looking at http://korma.wmflabs.org/browser/scr.html , one reason for this
improvement might be that the volume of new changesets has also decreased
during the same period. Maybe newer patches get faster reviews? Any ideas?
We need to dig further.

We have created a "hall of shame" (add you preferred smiley here) to bring
under the light the repositories with the open changesets that haven't seen
any activity for a longer period. The principle is simple: you don't want
to see one of your repos appearing in the top 10.

Many of the _leading_ repos have a couple of open changesets only, and our
hope is that by showing up there, the maintainers will act on them quickly
(e.g. OpenStackManager, fluoride, commons, UserMerge, TorBlock, Vipscaler,
luasandbox...). This will leave the fight for the pole position to the
projects that actually have a real problem dealing with patches received
(Donationinterface, GuidedTour, UploadWizard...)

Who knows, perhaps we should organize "patch days", in the same way that we
have organized bug days in the past (which we want to recover now). We also
want to look at ways promote the oldest inactive requests. For instance,
what about directing new volunteers there, asking them to submit their code
revisions. For a patch that has been waiting in silence for over a year,
any feedback will be better than no feedback.

One last detail. Our initial motivation to look at the age of open
changesets by affiliation was to check whether submissions from WMF
employees and independent developers were treated equally. Interestingly,
there are no big differences between these groups. However, there are big
differences between the median age of open WMDE changesets (16.5 days) and
open Wikia changesets (almost 283 days). All this according to our
estimation of the origin of patches (domain of the submitter's email +
affiliation submitted by the developers that filled our survey.[2]

Your feedback about these metrics is welcome. Please reply here or file
Bugzilla reports directly to Analytics > Tech Community metrics

https://bugzilla.wikimedia.org/buglist.cgi?component=Tech%20community%20metrics&list_id=297881&product=Analytics&resolution=---

(Short link just in case: http://bit.ly/1q0itsl )

[1] https://wikitech.wikimedia.org/wiki/Key_Wikimedia_software_projects

[2]
https://docs.google.com/forms/d/1RFUa2zBAOolw78W-ozJPoYlR2lYbrAOYvOZYgjaAYQg/viewform

--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Brian Wolff
> Who knows, perhaps we should organize "patch days", in the same way that
we
> have organized bug days in the past (which we want to recover now).

that would be cool.

>We also
> want to look at ways promote the oldest inactive requests. For instance,
> what about directing new volunteers there, asking them to submit their
code
> revisions. For a patch that has been waiting in silence for over a year,
> any feedback will be better than no feedback.

You sure about that? I would imagine that having no one look at your code
for months, then having someone who doesnt have the authority to approve it
nit pick it a little, followed by another couple months of waiting, to be
more frustrating then no feedback at all.

>
> One last detail. Our initial motivation to look at the age of open
> changesets by affiliation was to check whether submissions from WMF
> employees and independent developers were treated equally. Interestingly,
> there are no big differences between these groups.

Interesting, im kind of surprised by this.

I wonder if there is a difference between newbie volunteers and long term
volunteers.

--
Bawolff
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Brad Jorsch (Anomie)
In reply to this post by Quim Gil-2
On Thu, Apr 3, 2014 at 7:10 PM, Quim Gil <[hidden email]> wrote:

> The queue of open chagesets keeps growing.


Does "open" include changesets that were submitted, got feedback, and then
the submitter never bothered to follow up?


--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Jeroen De Dauw-2
In reply to this post by Quim Gil-2
Hey,

> http://korma.wmflabs.org/browser/gerrit_review_queue.html

Is this just for core, or for all repos?

For Wikia, it sort of seems like they just created one commit that did not
get merged. The graph seems quite skewed. At least for WMDE contributors I
would find it strange that our typical commit takes over two weeks to get
merged if our most contributed to repos where held into account.

> http://korma.wmflabs.org/browser/scr.html

So here I noticed my name in the list on the right and clicked it.
http://korma.wmflabs.org/browser/people.html?id=127&name=jeroendedauw

These graphs seem very wrong. It is only showing commits starting in 2012,
rather than 2009. At the same time it lists emails from 2006, while I only
joined the list in 2009. It also has "jeroen_dedauw" at the top, which has
never been my commit name. So what's going on there?

Cheers

--
Jeroen De Dauw - http://www.bn2vs.com
Software craftsmanship advocate
Evil software architect at Wikimedia Germany
~=[,,_,,]:3
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Happy Melon-2
In reply to this post by Brian Wolff
On 4 April 2014 00:38, Brian Wolff <[hidden email]> wrote:

>
> > what about directing new volunteers there, asking them to submit their
> code
> > revisions. For a patch that has been waiting in silence for over a year,
> > any feedback will be better than no feedback.
>
> You sure about that? I would imagine that having no one look at your code
> for months, then having someone who doesn't have the authority to approve
> it
> nit pick it a little, followed by another couple months of waiting, to be
> more frustrating then no feedback at all.
>

This is also completely the wrong way to go about open-source development.
The work priorities of volunteers are the thing that you, as manager of
paid staff, *can't* control, as opposed to the work priorities of paid
staff, which you very much can.  If reviewing these old patches was in any
way interesting/exciting/fulfilling, volunteers would probably already have
*made* some contributions.  Being occasionally tasked with
uninteresting/unexciting/unfulfilling tasks that Just Need Doing is one of
the things that paid developers *get *paid *for*.

--HM
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Quim Gil-2
In reply to this post by Brad Jorsch (Anomie)
On Friday, April 4, 2014, Brad Jorsch (Anomie) <[hidden email]>
wrote:

> On Thu, Apr 3, 2014 at 7:10 PM, Quim Gil <[hidden email]<javascript:;>>
> wrote:
>
> > The queue of open chagesets keeps growing.
>
>
> Does "open" include changesets that were submitted, got feedback, and then
> the submitter never bothered to follow up?
>

Yes, shouldn't we?

Open is open. What would be the best practice with these changesets? Just
leave them open like now, abandon them, mark them with some tag...?


--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Brad Jorsch (Anomie)
On Fri, Apr 4, 2014 at 11:48 AM, Quim Gil <[hidden email]> wrote:

> On Friday, April 4, 2014, Brad Jorsch (Anomie) <[hidden email]>
> wrote:
>
> > On Thu, Apr 3, 2014 at 7:10 PM, Quim Gil <[hidden email]
> <javascript:;>>
> > wrote:
> >
> > > The queue of open chagesets keeps growing.
> >
> >
> > Does "open" include changesets that were submitted, got feedback, and
> then
> > the submitter never bothered to follow up?
> >
>
> Yes, shouldn't we?
>

It seems a little premature to me to worry about "our number of open
changesets keeps growing!" and "we have changesets that have been open for
a really long time!" when we don't know how many are due to this factor
that we have no control over (beyond marking them abandoned as stale,
counting on the submitter to unabandon if they ever come back).

Maybe abandoning old changesets with -1 reviews as stale with an
appropriate message is something that should be done. I don't know.


--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Brian Wolff
In reply to this post by Quim Gil-2
On Apr 4, 2014 12:48 PM, "Quim Gil" <[hidden email]> wrote:
>
> On Friday, April 4, 2014, Brad Jorsch (Anomie) <[hidden email]>
> wrote:
>
> > On Thu, Apr 3, 2014 at 7:10 PM, Quim Gil <[hidden email]
<javascript:;>>
> > wrote:
> >
> > > The queue of open chagesets keeps growing.
> >
> >
> > Does "open" include changesets that were submitted, got feedback, and
then
> > the submitter never bothered to follow up?
> >
>
> Yes, shouldn't we?
>
> Open is open. What would be the best practice with these changesets? Just
> leave them open like now, abandon them, mark them with some tag...?
>

Best practise is for author to abandon them if they are not interested any
more. However very few people actually do that. If you want to exclude
them, -1 is a good criteria (-1 patchsets are not waiting on review, so
even if there is still interest, its not in queue).

Some [wip] patches are probably also outliers in the data set and should be
excluded.

If you want to know median time to review, i would suggest only counting
merged patches, since open patches have an unknown time to review (not sure
if this is what you are already doing).

-bawolff
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Marc-Andre
In reply to this post by Quim Gil-2
On 04/04/2014 11:48 AM, Quim Gil wrote:
> Open is open. What would be the best practice with these changesets? Just
> leave them open like now, abandon them, mark them with some tag...?

I can think of two broad categories we'd be interested in: "waiting for
review" and "waiting for submitter action".  While crude, the simple
metric "has gotten comments since last submit" will give a good idea of
how many in which.

-- Marc


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Jon Robson
In reply to this post by Brad Jorsch (Anomie)
In MobileFrontend I abandon patchsets which have had no activity in a month
for this very reason. People can always reopen them.
On 4 Apr 2014 08:54, "Brad Jorsch (Anomie)" <[hidden email]> wrote:

> On Fri, Apr 4, 2014 at 11:48 AM, Quim Gil <[hidden email]> wrote:
>
> > On Friday, April 4, 2014, Brad Jorsch (Anomie) <[hidden email]>
> > wrote:
> >
> > > On Thu, Apr 3, 2014 at 7:10 PM, Quim Gil <[hidden email]
> > <javascript:;>>
> > > wrote:
> > >
> > > > The queue of open chagesets keeps growing.
> > >
> > >
> > > Does "open" include changesets that were submitted, got feedback, and
> > then
> > > the submitter never bothered to follow up?
> > >
> >
> > Yes, shouldn't we?
> >
>
> It seems a little premature to me to worry about "our number of open
> changesets keeps growing!" and "we have changesets that have been open for
> a really long time!" when we don't know how many are due to this factor
> that we have no control over (beyond marking them abandoned as stale,
> counting on the submitter to unabandon if they ever come back).
>
> Maybe abandoning old changesets with -1 reviews as stale with an
> appropriate message is something that should be done. I don't know.
>
>
> --
> Brad Jorsch (Anomie)
> Software Engineer
> Wikimedia Foundation
> _______________________________________________
> Wikitech-l mailing list
> [hidden email]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Quim Gil-2
In reply to this post by Jeroen De Dauw-2
On Friday, April 4, 2014, Jeroen De Dauw <[hidden email]> wrote:

> Hey,
>
> > http://korma.wmflabs.org/browser/gerrit_review_queue.html
>
> Is this just for core, or for all repos?
>

This is for key Wikimedia software projects.

https://wikitech.wikimedia.org/wiki/Key_Wikimedia_software_projects


For Wikia, it sort of seems like they just created one commit that did not
> get merged. The graph seems quite skewed. At least for WMDE contributors I

would find it strange that our typical commit takes over two weeks to get
> merged if our most contributed to repos where held into account.
>

Good points; bug report created.

Bug 63530 - Identifying open changesets submitted by each organization
https://bugzilla.wikimedia.org/show_bug.cgi?id=63530



> > http://korma.wmflabs.org/browser/scr.html
>
> So here I noticed my name in the list on the right and clicked it.
> http://korma.wmflabs.org/browser/people.html?id=127&name=jeroendedauw
>
> These graphs seem very wrong. It is only showing commits starting in 2012,
> rather than 2009. At the same time it lists emails from 2006, while I only
> joined the list in 2009. It also has "jeroen_dedauw" at the top, which has
> never been my commit name. So what's going on there?
>

About commits, maybe you were using a different email address before 2012,
currently not assigned to your identity in our metrics?

About mailing lists, thank you for confirming
https://bugzilla.wikimedia.org/show_bug.cgi?id=60076

I will leave to Alvaro the explanation about the "jeroen_dedauw" part.

The underlying problem is that we need a system to allow users take care of
their own identity. There is a GSoC project idea to implement this feature,
and several students have presented proposals. The selection process is
ongoing, we will know more about this on April 21. In the meantime,

https://bugzilla.wikimedia.org/show_bug.cgi?id=58585


--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Alex Monk
On 4 April 2014 17:52, Quim Gil <[hidden email]> wrote:

> On Friday, April 4, 2014, Jeroen De Dauw <[hidden email]> wrote:>
> > http://korma.wmflabs.org/browser/scr.html
> >
> > So here I noticed my name in the list on the right and clicked it.
> > http://korma.wmflabs.org/browser/people.html?id=127&name=jeroendedauw
> >
> > These graphs seem very wrong. It is only showing commits starting in
> 2012,
> > rather than 2009. At the same time it lists emails from 2006, while I
> only
> > joined the list in 2009. It also has "jeroen_dedauw" at the top, which
> has
> > never been my commit name. So what's going on there?
> >
>
> About commits, maybe you were using a different email address before 2012,
> currently not assigned to your identity in our metrics?
>

All the old pre-Git commits (Git migration was 2012-03-21 IIRC) will have
different email addresses ([hidden email]).
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: Data to improve our code review queue

Quim Gil-2
In reply to this post by Brian Wolff
On Friday, April 4, 2014, Brian Wolff <[hidden email]> wrote:

>
> Best practise is for author to abandon them if they are not interested any
> more. However very few people actually do that. If you want to exclude
> them, -1 is a good criteria (-1 patchsets are not waiting on review, so
> even if there is still interest, its not in queue).
>

Makes sense. Let's start excluding open changesets with a last update older
than 30 days and -1.

https://bugzilla.wikimedia.org/show_bug.cgi?id=63533


Some [wip] patches are probably also outliers in the data set and should be
> excluded.
>

Can you give examples of this type of changesets? Are there many of these?

For what is worth, in Bugzilla metrics we are able to ignore reports using
the "tracking" keyword.


If you want to know median time to review, i would suggest only counting
> merged patches, since open patches have an unknown time to review (not sure
> if this is what you are already doing).
>

So far the focus has been put into highlighting the areas that need
attention, as opposed of the performance measured by the work completed.

For what is worth, there is a "review time in days" graph at
http://korma.wmflabs.org/browser/scr.html, but I haven't looked at it, and
I'm not sure what is being counted. Perhaps average time instead of median,
and therefore getting distorted by patches that get reviewed between team
members in a few hours?

In any case, the disparity of age between resolved and unresolved
changesets might show that there is a clear stream of fast work, while
there is also a pocket of open changes languishing. Let's look at the
numbers again when the rulo of ignoring old -1s is applied.


--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l