Re: Updates to ORES service & BREAKING CHANGE on April 7th

Aaron Halfaker-2
FYI, the new models (BREAKING CHANGE) are now deployed.

On Sun, Apr 3, 2016 at 5:38 AM, Aaron Halfaker <[hidden email]>
wrote:

> Hey folks, we have a couple of announcements for you today. First is that
> ORES has a large set of new functionality that you might like to take
> advantage of. We'll also want to talk about a *BREAKING CHANGE on April
> 7th.*
>
> Don't know what ORES is?  See
> http://blog.wikimedia.org/2015/11/30/artificial-intelligence-x-ray-specs/
>
> *New functionality*
>
> *Scoring UI*
> Sometimes you just want to score a few revisions in ORES and remembering
> the URL structure is hard. So, we've built a simple scoring user interface
> <https://ores.wmflabs.org/ui/> that will allow you to more easily score a
> set of edits.
>
> *New API version*
> We've been consistently getting requests to include more information in
> ORES' responses. In order to make space for this new information, we needed
> to change the structure of responses. But we wanted to do this without
> breaking the tools that are already using ORES. So, we've developed a
> versioning scheme that will allow you to take advantage of new
> functionality when you are ready. The same old API will continue to be
> available at https://ores.wmflabs.org/scores/, but we've added two
> additional paths on top of this.
>
>    - https://ores.wmflabs.org/v1/scores/ is a mirror of the old scoring
>    API which will henceforth be referred to as "v1"
>    - https://ores.wmflabs.org/v2/scores/ implements a new response format
>    that is consistent between all sub-paths and adds some new functionality
>
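A minimal sketch of how a client might pin itself to one of the paths listed above. The base URL and the path layout come from this message; the helper name `scores_url` and its parameters are illustrative assumptions, not part of ORES.

```python
# Base URL as given in the announcement.
ORES_BASE = "https://ores.wmflabs.org/"


def scores_url(version=None, context=None, model=None, rev_id=None):
    """Build a scoring URL (hypothetical helper).

    version=None yields the unversioned legacy path; "v1" or "v2" yields
    the versioned paths described in the announcement.
    """
    parts = []
    if version is not None:
        parts.append(version)
    parts.append("scores")
    # Append path segments in order and stop at the first one not supplied.
    for segment in (context, model, rev_id):
        if segment is None:
            break
        parts.append(str(segment))
    return ORES_BASE + "/".join(parts) + "/"


if __name__ == "__main__":
    print(scores_url())        # legacy path
    print(scores_url("v1"))    # mirror of the legacy API
    print(scores_url("v2", "enwiki", "wp10", 34567892))
```

Keeping the version in one place means a tool can move from "v1" to "v2" by changing a single argument once its response parsing is ready.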
> *Swagger documentation*
> Curious about the new functionality available in "v2" or maybe what the
> change was from "v1"? We've implemented a structured description of both
> versions of the scoring API using Swagger -- which is becoming a de facto
> standard for this sort of thing. Visit https://ores.wmflabs.org/v1/ or
> https://ores.wmflabs.org/v2/ to see the Swagger user interface.
> Visit https://ores.wmflabs.org/v1/spec/ or
> https://ores.wmflabs.org/v2/spec/ to get the specification in a
> machine-readable format.
>
> *Feature values & injection*
> Have you wondered what ORES uses to make its predictions? You can now ask
> ORES to show you the list of "feature" statistics it uses to score
> revisions. For example,
> https://ores.wmflabs.org/v2/scores/enwiki/wp10/34567892/?features will
> return the score with a mapping of feature values used by the "wp10"
> article quality model in English Wikipedia to score oldid=34567892
> <https://en.wikipedia.org/wiki/Special:Diff/34567892>. You can also
> "inject" features into the scoring process to see how that affects the
> prediction. E.g.,
> https://ores.wmflabs.org/v2/scores/enwiki/wp10/34567892?features&feature.wikitext.revision.chars=10000
>
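The two example URLs above can be generated rather than typed by hand. This is only a sketch: the `?features` flag and the `feature.<name>=<value>` form are taken from the examples in this message, while the helper name `feature_url` is made up for illustration. It uses the trailing-slash form of the path from the first example.

```python
from urllib.parse import quote


def feature_url(context, model, rev_id, injected=None):
    """Build a v2 URL that asks for feature values (hypothetical helper).

    `injected` maps feature names to the values to inject, e.g.
    {"wikitext.revision.chars": 10000}.
    """
    url = f"https://ores.wmflabs.org/v2/scores/{context}/{model}/{rev_id}/?features"
    for name, value in (injected or {}).items():
        # Each injected feature becomes a "feature.<name>=<value>" parameter.
        url += f"&feature.{quote(name)}={quote(str(value))}"
    return url


if __name__ == "__main__":
    print(feature_url("enwiki", "wp10", 34567892))
    print(feature_url("enwiki", "wp10", 34567892,
                      {"wikitext.revision.chars": 10000}))
```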
> *Breaking change -- new models*
> We've been experimenting with new learning algorithms to make ORES work
> better and we've found that we get better results with gradient boosting
> <https://en.wikipedia.org/wiki/Gradient_boosting> and random forest
> <https://en.wikipedia.org/wiki/Random_forest> strategies than we do with
> the current linear svc
> <https://en.wikipedia.org/wiki/Support_vector_machine> models. We'd like
> to get these new, better models deployed as soon as possible, but with the
> new algorithm comes a change in the range of probabilities returned by the
> model. So, when we deploy this change, any tool that uses hard-coded
> thresholds on ORES' prediction probabilities will suddenly start behaving
> strangely. Regretfully, we haven't found a way around this problem, so
> we're announcing the change now and we plan to deploy this *BREAKING
> CHANGE on April 7th*. Please subscribe to the AI mailing list
> <https://lists.wikimedia.org/mailman/listinfo/ai> or watch our project
> page m:ORES <https://meta.wikimedia.org/wiki/ORES> to catch
> announcements of future changes and new functionality.
>
> In order to make sure we don't end up in the same situation the next time
> we want to change an algorithm, we've included a suite of evaluation
> statistics with each model. The filter_rate_at_recall(0.9),
> filter_rate_at_recall(0.75), and recall_at_fpr(0.1) thresholds represent
> three critical thresholds (should review, needs review, and definitely
> damaging -- respectively) that can be used to automatically configure your
> wiki tool.  You can find out these thresholds for your model of choice by
> adding the ?model_info parameter to requests.  So, once the breaking change
> lands, we strongly recommend basing your thresholds on these statistics.
> We'll be working to submit patches to tools that use ORES in the
> next week to implement this flexibility.  Hopefully, all you'll need to do
> is work with us on those.
>
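A rough sketch of what "basing your thresholds on these statistics" could look like in a tool. The thread does not show the layout of the ?model_info output, so the code assumes the three cutoffs have already been read into a plain dict; the function name `triage` and the numbers below are placeholders, not real ORES statistics.

```python
def triage(probability, thresholds):
    """Return the strictest label whose cutoff the probability reaches.

    `thresholds` maps a label to a probability cutoff. Cutoffs are checked
    from highest to lowest, so no ordering between labels is assumed.
    Returns None when the probability is below every cutoff.
    """
    ranked = sorted(thresholds.items(), key=lambda item: item[1], reverse=True)
    for label, cutoff in ranked:
        if probability >= cutoff:
            return label
    return None


if __name__ == "__main__":
    # Placeholder cutoffs for illustration only; a real tool would refresh
    # these from the model's evaluation statistics instead of hard-coding.
    cutoffs = {
        "should review": 0.2,
        "needs review": 0.5,
        "definitely damaging": 0.9,
    }
    for p in (0.1, 0.3, 0.6, 0.95):
        print(p, triage(p, cutoffs))
```

Because the cutoffs are data rather than constants in the code, a model update only requires refreshing the dict.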
> -halfak & The Revision Scoring team
> <https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service>
>
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Moritz Schubotz-2
Hi Aaron,

Can you say a few words about the stability of the API?
We are working on a scoring model for user contributions, rather than
revisions, using Apache Flink.
http://imwa.gehaxelt.in:9090/pdfs/expose.pdf
However, it would be nice to end up with a somewhat compatible API.

Best
Moritz

On Thu, Apr 7, 2016 at 10:55 AM, Aaron Halfaker <[hidden email]>
wrote:

> FYI, the new models (BREAKING CHANGE) are now deployed.

--
Kind regards
Moritz Schubotz

  Phone (office):  +49 30 314 22784
  Phone (private): +49 30 488 27330
  E-Mail: [hidden email]
  Web: http://www.physikerwelt.de
  Skype: Schubi87
  ICQ: 200302764
  Msn: [hidden email]
Aaron Halfaker-2
Hi Moritz,

There are two types of stability you should be aware of: API behavior and
model scores.

You should expect that the versioned API behavior will remain stable.  So,
if we choose to make a change to the request or response style, that will
appear under the path "v3/" and so forth.  If you write code against
the v2/ API (you shouldn't be writing new code against the v1/ API, but you
*can* expect it to be stable), you should expect that it will continue to
work as expected.  You can see the Swagger specs for the APIs at these
endpoints: https://ores.wmflabs.org/v1/spec/ or
https://ores.wmflabs.org/v2/spec/
You should expect that the API behavior described there will not change.

But we may still need to update the models in the future and that would
likely change the range of scores slightly.  We include versions of the
models in the basic API response so that you can cache and invalidate
scores that you get from the API.  We're still working out the right way to
report evaluation metrics to you so that you'll be able to dynamically
adjust any thresholds you set in your own application.  FWIW, I do not
foresee us changing our modeling strategy substantially in the short- or
mid-term.  It took us ~3 months of work to prepare for the breaking change
that was announced in this thread.
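As a sketch of the cache-and-invalidate idea above: a client can store the model version next to each cached score and drop the entry when the version reported by the API differs. How the version is read out of the response is not shown in this thread, so the class below takes it as a plain argument; `ScoreCache` and its method names are hypothetical.

```python
class ScoreCache:
    """In-memory score cache keyed by model and revision (illustrative)."""

    def __init__(self):
        # (model, rev_id) -> (model_version, score)
        self._entries = {}

    def put(self, model, rev_id, model_version, score):
        self._entries[(model, rev_id)] = (model_version, score)

    def get(self, model, rev_id, current_version):
        """Return the cached score, or None if missing or stale."""
        entry = self._entries.get((model, rev_id))
        if entry is None:
            return None
        cached_version, score = entry
        if cached_version != current_version:
            # The model was updated since this score was cached: evict it.
            del self._entries[(model, rev_id)]
            return None
        return score


if __name__ == "__main__":
    cache = ScoreCache()
    # Version strings and the score payload are made-up placeholders.
    cache.put("wp10", 34567892, "0.1.0", {"prediction": "B"})
    print(cache.get("wp10", 34567892, "0.1.0"))  # cached score
    print(cache.get("wp10", 34567892, "0.2.0"))  # None, entry evicted
```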

In the end, we're interested in learning about your needs and concerns so
that we can adjust our process and make changes accordingly.  So if you
have concerns with any of the above please let us know.

-Aaron

On Sat, Apr 30, 2016 at 5:50 PM, Moritz Schubotz <[hidden email]>
wrote:

> Hi Aaron,
>
> Can you say a few words about the stability of the API?
> We are working on a scoring model for user contributions, rather than
> revisions, using Apache Flink.
> http://imwa.gehaxelt.in:9090/pdfs/expose.pdf
> However, it would be nice to end up with a somewhat compatible API.
>
> Best
> Moritz