recent changes stream


recent changes stream

Ori Livneh
Hi,

Gerrit change Id819246a9 proposes an implementation for a recent changes
stream broadcast via socket.io, an abstraction layer over WebSockets that
also provides long polling as a fallback for older browsers. Comment on
<https://gerrit.wikimedia.org/r/#/c/131040/> or the mailing list.

Thanks,
Ori
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: recent changes stream

Tyler Romeo
Just wondering, but has any performance testing been done on different
socket.io implementations? IIRC, Python is pretty good, so I definitely
approve, but I'm wondering if there are other implementations that are more
performant (specifically, servers that have better parallelism and no GIL).

For example, Erlang with Cowboy is supposed to be a good socket.io
implementation, and it is truly parallel, but I've never worked with it, so
I cannot say for sure.


*-- *
*Tyler Romeo*
Stevens Institute of Technology, Class of 2016
Major in Computer Science


On Sun, May 4, 2014 at 10:23 PM, Ori Livneh <[hidden email]> wrote:

> [...]

Re: recent changes stream

Jeremy Baron
In reply to this post by Ori Livneh
On May 4, 2014 10:24 PM, "Ori Livneh" <[hidden email]> wrote:
> an implementation for a recent changes
> stream broadcast via socket.io, an abstraction layer over WebSockets that
> also provides long polling as a fallback for older browsers.

I see this is using redis. FWIW I was initially wondering if this would
leverage the log aggregation infra that analytics is building. Currently we
use UDP
for the IRC feed which is vulnerable to silent loss in case of packet loss
or other network disruption. I believe analytics will be using an
eventually consistent system? Or we could just watch the UDP stream for
sequence gaps and fill in the gaps with data from the RC table in the
corresponding DB. (likewise we could watch for gaps in redis and backfill
into redis from DB as needed)
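
The gap-filling idea could look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes every event carries a monotonically increasing integer `id` (per wiki), and `fetch_missing` is a hypothetical callback standing in for a query against the RC table. Neither detail is confirmed by the proposed patch.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional


def fill_gaps(
    events: Iterable[Dict],
    fetch_missing: Callable[[int, int], Iterable[Dict]],
) -> Iterator[Dict]:
    """Yield events in id order, backfilling any gap in the 'id' sequence.

    fetch_missing(first, last) is a hypothetical callback that returns the
    events with first <= id <= last from a reliable store (e.g. the RC table).
    """
    last_id: Optional[int] = None
    for event in events:
        current = event["id"]
        if last_id is not None and current <= last_id:
            # Duplicate or late arrival: this id was already delivered.
            continue
        if last_id is not None and current > last_id + 1:
            # Gap detected: ids last_id+1 .. current-1 were lost in transit.
            yield from fetch_missing(last_id + 1, current - 1)
        yield event
        last_id = current
```

The same loop would apply whether the lossy source is the UDP stream or redis; only the `events` iterable changes.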

How could this work overlap with adding pubsubhubbub support to existing
web RC feeds? (i.e. atom/rss. or for that matter even individual page
history feeds or related changes feeds)

The only pubsubhubbub bugs I see atm are
https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=38970%2C30245

-Jeremy

Re: recent changes stream

Ori Livneh
In reply to this post by Tyler Romeo
On Sun, May 4, 2014 at 9:09 PM, Tyler Romeo <[hidden email]> wrote:

> Just wondering, but has any performance testing been done on different
> socket.io implementations? IIRC, Python is pretty good, so I definitely
> approve, but I'm wondering if there are other implementations that are more
> performant (specifically, servers that have better parallelism and no GIL).
>

You still get the parallelism here, it just happens outside the language,
by having Nginx load-balance across multiple application instances. The
Puppet class, Upstart job definitions, and supporting shell scripts were
all designed to manage a process group of rcstream instances.
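
As a rough illustration of parallelism that lives outside the language, the sketch below maps each client onto one of several independent single-process instances. The port numbers and the hash-by-client-address policy are assumptions made for the example; the actual Nginx configuration may balance connections differently.

```python
import zlib
from typing import List


def make_backends(count: int, base_port: int = 10080) -> List[str]:
    """List the addresses of `count` instances on consecutive (hypothetical) ports."""
    return [f"127.0.0.1:{base_port + i}" for i in range(count)]


def pick_backend(client_ip: str, backends: List[str]) -> str:
    """Pick a backend deterministically, so one client always reaches one instance."""
    index = zlib.crc32(client_ip.encode("utf-8")) % len(backends)
    return backends[index]
```

Each instance is an ordinary single-threaded process with its own interpreter, so the GIL of one instance never blocks another.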

Re: recent changes stream

Daniel Kinzler
In reply to this post by Jeremy Baron
On 05.05.2014 07:20, Jeremy Baron wrote:
> On May 4, 2014 10:24 PM, "Ori Livneh" <[hidden email]> wrote:
>> an implementation for a recent changes
>> stream broadcast via socket.io, an abstraction layer over WebSockets that
>> also provides long polling as a fallback for older browsers.

[...]

> How could this work overlap with adding pubsubhubbub support to existing
> web RC feeds? (i.e. atom/rss. or for that matter even individual page
> history feeds or related changes feeds)
>
> The only pubsubhubbub bugs I see atm are
> https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=38970%2C30245

There is a Pubsubhubbub implementation in the pipeline, see
<https://git.wikimedia.org/summary/mediawiki%2Fextensions%2FPubSubHubbub>. It's
pretty simple and painless. We plan to have this deployed experimentally for
wikidata soon, but there is no reason not to roll it out globally.

This implementation uses the job queue - which in production means redis, but
it's pretty generic.

As to an RC *stream*: Pubsubhubbub is not really suitable for this, since it
requires the subscriber to run a public web server. It's really a
server-to-server protocol. I'm not too sure about web sockets for this either,
because the intended recipient is usually not a web browser. But if it works,
I'd be happy anyway, the UDP+IRC solution sucks.
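
To show why a Pubsubhubbub subscriber needs a publicly reachable web server, here is a minimal sketch of the callback endpoint a subscriber would have to run. It relies on the verification step of the PubSubHubbub specification, where the hub sends a GET request with `hub.mode` and `hub.challenge` and expects the challenge echoed back. The handler name and the in-memory `received` list are hypothetical, and this is not taken from the extension mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class SubscriberCallback(BaseHTTPRequestHandler):
    """Callback URL that a hub contacts; it must be reachable from the hub."""

    def do_GET(self):
        # Verification of intent: echo hub.challenge to confirm the subscription.
        query = parse_qs(urlparse(self.path).query)
        mode = query.get("hub.mode", [""])[0]
        challenge = query.get("hub.challenge", [""])[0]
        if mode in ("subscribe", "unsubscribe") and challenge:
            status, body = 200, challenge.encode("utf-8")
        else:
            status, body = 404, b"verification failed"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Content distribution: the hub pushes the updated feed to us.
        length = int(self.headers.get("Content-Length", 0))
        self.server.received.append(self.rfile.read(length))
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 0) -> HTTPServer:
    """Create the callback server; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), SubscriberCallback)
    server.received = []  # hypothetical in-memory store of pushed payloads
    return server
```

A client behind NAT or a firewall cannot expose such an endpoint, which is the practical obstacle for desktop tools and bots.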

Some years ago, I started to implement an XMPP based RC stream, see
<https://www.mediawiki.org/wiki/Extension:XMLRC>. Have a look and steal some
ideas :)

-- daniel




Re: recent changes stream

Petr Bena
Given the current specifications, I can only support this change as
long as the current IRC feed is preserved, as IRC is IMHO, as evil as
it looks, more suitable for this than WebSockets.

I am not saying that IRC is suitable for this, and I know that people
really wanted to get rid of it or replace it with something better,
but I just can't see how this is better.

On Mon, May 5, 2014 at 10:37 AM, Daniel Kinzler <[hidden email]> wrote:

> [...]

Re: recent changes stream

Erik Bernhardson
I think we need to be clearer about what the goal is here; as is, I think we
are all taking our personal idea of what we want to do with a feed and
applying that to this implementation. Personally, I have been working on an
external watchlist service that I would love to hook up to a feed, but
without any guarantee of receiving every single event, my particular use
case is better off continuously scanning the XML feeds of 800 wikis. I'm
certain other people are thinking of completely different things as well.

Erik B.


On Mon, May 5, 2014 at 2:29 AM, Petr Bena <[hidden email]> wrote:

> [...]

Re: recent changes stream

Victor Vasiliev
In reply to this post by Petr Bena
On 05/05/2014 05:29 AM, Petr Bena wrote:
> I am not saying that IRC is suitable for this and I know that people
> really wanted to get rid of it or replace it with something better,
> but I just can't see how is this better.
>

Most programming languages have an implementation of WebSockets, and, well,
those that don't will eventually have one. I heard C++ has plenty of them,
since most browsers are written in C++. Almost any reasonable programming
language will have an implementation of JSON; some of them even have it
in their standard library.

(If that's really an issue, and the language you are writing in is not INTERCAL
or Perl, I can probably even write a client for you.)

I don't see how awkwardly screen-scraping colored lines of text from the IRC
feed, which works only as long as you don't exceed the IRC message size limit,
is better than a well-defined, standardized exchange format.
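
A minimal sketch of what the screen-scraping side involves: stripping mIRC-style formatting codes before any parsing can start, and checking whether a line even fits in one IRC message (512 bytes including the trailing CRLF, per RFC 1459). The sample line used with it is hypothetical and not the exact layout of the production feed.

```python
import re

# Maximum length of one IRC message in bytes, including the trailing CRLF.
IRC_MAX_LINE = 512

# \x03 starts a color code (optionally "fg" or "fg,bg"); the other bytes
# toggle bold, reset, reverse, italics and underline.
_FORMATTING = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?|[\x02\x0f\x16\x1d\x1f]")


def strip_irc_formatting(line: str) -> str:
    """Remove color and style control codes from a line of IRC text."""
    return _FORMATTING.sub("", line)


def fits_in_irc_message(prefix: str, text: str) -> bool:
    """Return True if prefix + text + CRLF fits within a single IRC message."""
    return len((prefix + text).encode("utf-8")) + 2 <= IRC_MAX_LINE
```

Anything longer than the limit has to be truncated by the sender, so a scraper can silently lose the end of a long title or edit summary, whereas a JSON payload has no such ceiling.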

    -- Victor.


Re: recent changes stream

Petr Bena
In reply to this post by Erik Bernhardson
I said this once in a gerrit comment and I will say it here as well:
most people have a different opinion on what is "good" for them as an RC
stream. We should not go for anything specific, but rather for a very
abstract solution that could be multiplexed into multiple RC feed
providers using a number of popular formats (including this IRC format,
just for backward compatibility). So in the end, users would be able
to pick what format and protocol they want, just as they can
with api.php.

An ideal RC stream would be so flexible that it could match any possible use case.
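
One way to picture such a multiplexed design is a small registry of formatters fed by a single abstract event, sketched below. The format names, the flat event shape, and the `publish` helper are all hypothetical; this illustrates the idea only and is not how MediaWiki structures its feed code.

```python
import json
import xml.etree.ElementTree as ET
from typing import Callable, Dict


def to_json(event: dict) -> str:
    """Serialize the event as a JSON object."""
    return json.dumps(event, sort_keys=True)


def to_xml(event: dict) -> str:
    """Serialize a flat event as attributes of a single <rc> element."""
    root = ET.Element("rc")
    for key in sorted(event):
        root.set(key, str(event[key]))
    return ET.tostring(root, encoding="unicode")


def to_text_line(event: dict) -> str:
    """Serialize the event as one human-readable line (IRC-like, uncolored)."""
    return "[[{title}]] * {user} * {comment}".format(**event)


# Registry of output formats; adding a format means adding one entry here.
FORMATTERS: Dict[str, Callable[[dict], str]] = {
    "json": to_json,
    "xml": to_xml,
    "text": to_text_line,
}


def publish(event: dict, sinks: Dict[str, Callable[[str], None]]) -> None:
    """Send one event to every sink, formatted the way that sink asked for."""
    for format_name, send in sinks.items():
        send(FORMATTERS[format_name](event))
```

Each sink could be a different transport (a WebSocket broadcast, a UDP socket for the IRC relay, and so on), which keeps the format choice independent of the protocol choice.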

On Mon, May 5, 2014 at 6:45 PM, Erik Bernhardson
<[hidden email]> wrote:

> [...]

Re: recent changes stream

Martijn Hoekstra
On Mon, May 5, 2014 at 7:36 PM, Petr Bena <[hidden email]> wrote:

> I said this once in a gerrit comment and I will say it here as well:
> most people have a different opinion on what is "good" for them as an RC
> stream. We should not go for anything specific, but rather for a very
> abstract solution that could be multiplexed into multiple RC feed
> providers using a number of popular formats (including this IRC format
> just for backward compatibility). So in the end, users would be able
> to pick what format and protocol they want, just as they can do that
> with api.php
>
> Ideal RC stream would be so flexible that it could match any possible use
> case.
>


Standard solutions seem to be things like
RabbitMQ/ZeroMQ/ActiveMQ/lolbiztalk.

WebSockets seems like something that should work, but it is something of a
square peg/round hole thing. It needs an HTTP connection to upgrade from,
which is just unnatural for this problem, and it is intended to be consumed by
client-side JavaScript in browsers communicating with a server.
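
For reference, the HTTP upgrade step amounts to the exchange sketched below: the client sends an ordinary GET request with a random key, and the server proves it speaks WebSocket by returning a hash of that key, as defined in RFC 6455. The host and path in the usage note are placeholders; socket.io adds its own handshake on top, which is not shown.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
_WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + _WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_request(host: str, path: str, client_key: str) -> str:
    """Build the HTTP/1.1 request a client sends to start the upgrade."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {client_key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)
```

For example, `upgrade_request("rcstream.wmflabs.org", "/rc", key)` yields the request text for a placeholder path `/rc`. After the server answers `101 Switching Protocols`, the same TCP connection carries framed messages in both directions, so the HTTP part is a one-time cost per connection.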





Re: recent changes stream

Alex Monk
In reply to this post by Petr Bena
I've fiddled around with this a bit:

On my Ubuntu system, I simply got a library: sudo pip install socketio-client
Then made this Python script:

---
from socketIO_client import SocketIO, BaseNamespace


class RecentChangeNamespace(BaseNamespace):
    def on_connect(self):
        # Subscribe to changes for English Wikipedia once connected.
        self.emit('subscribe', ['enwiki'])

    def on_change(self, data):
        # Print each change event as it arrives.
        print(data)


# This host seems to be sending the same example entries at the moment;
# obviously we should get an actual RC stream from it or a production host
# later.
socketIO = SocketIO('rcstream.wmflabs.org')
socketIO.define(RecentChangeNamespace, '/rc')
socketIO.wait()
---
And it prints entries to console like this (Python just prints out objects
in a way that kind of looks like JSON):
{u'comment': u'', u'wiki': u'enwiki', u'type': 0, u'title': u'Main Page',
u'timestamp': 1398993841, u'server_script_path': u'/w', u'namespace': 0,
u'server_url': u'http://en.wikipedia.beta.wmflabs.org', u'length': {u'new':
585, u'old': 580}, u'user': u'Alice', u'patrolled': True, u'bot': False,
u'id': 9, u'minor': True, u'revision': {u'new': 9, u'old': 8}}
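
Because each entry arrives as a plain dictionary, working with it needs no parsing at all. The sketch below turns an entry shaped like the sample above into a one-line summary with a diff link. It assumes the entry is an edit that carries `length` and `revision` (other event types may not), and it uses MediaWiki's `index.php?diff=...&oldid=...` URL form; the function name is made up for the example.

```python
from urllib.parse import urlencode


def summarize_change(event: dict) -> str:
    """Return a one-line summary of an edit event shaped like the sample above."""
    # Size change in bytes between the old and new revision.
    delta = event["length"]["new"] - event["length"]["old"]
    query = urlencode({
        "diff": event["revision"]["new"],
        "oldid": event["revision"]["old"],
    })
    diff_url = event["server_url"] + event["server_script_path"] + "/index.php?" + query
    return "{} by {} ({:+d} bytes) {}".format(
        event["title"], event["user"], delta, diff_url
    )


# Fields copied from the sample entry printed by the script.
sample = {
    "title": "Main Page",
    "user": "Alice",
    "server_url": "http://en.wikipedia.beta.wmflabs.org",
    "server_script_path": "/w",
    "length": {"new": 585, "old": 580},
    "revision": {"new": 9, "old": 8},
}
print(summarize_change(sample))
# Main Page by Alice (+5 bytes) http://en.wikipedia.beta.wmflabs.org/w/index.php?diff=9&oldid=8
```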

It's so much nicer. Sorry Petr, I think this is much more suitable than the
mess that is connecting to IRC and parsing everything. I definitely want to
see this replace the IRC feed entirely (with the obvious reasonably long
deprecation period).


Alex


On 5 May 2014 10:29, Petr Bena <[hidden email]> wrote:

> [...]

Re: recent changes stream

Sumana Harihareswara
In reply to this post by Ori Livneh
This was merged on May 9th, so I have moved the RfC
https://www.mediawiki.org/wiki/Requests_for_comment/Publishing_the_RecentChanges_feed
to the archive - is that right?

Can I understand better where we're using PubSubHubbub vs WebSockets vs
rcstream?

Sumana Harihareswara
Senior Technical Writer
Wikimedia Foundation


On Sun, May 4, 2014 at 10:23 PM, Ori Livneh <[hidden email]> wrote:

> [...]

Re: recent changes stream

Petr Bena
In reply to this post by Alex Monk
It may surprise you, but there are languages other than Python, and in
these, things like this aren't that simple...

On Sun, May 11, 2014 at 1:14 AM, Alex Monk <[hidden email]> wrote:

> [...]

Re: recent changes stream

Antoine Musso
In reply to this post by Petr Bena
On 05/05/2014 11:29, Petr Bena wrote:
> Given the current specifications I can only support this change as
> long as current IRC feed is preserved as IRC is IMHO, as much as evil
> it looks, more suitable for this than WebSockets.
>
> I am not saying that IRC is suitable for this and I know that people
> really wanted to get rid of it or replace it with something better,
> but I just can't see how is this better.

IRC will be phased out entirely, and the related code in MediaWiki will
be removed if it has no other use.

I am not sure what the communication / migration plan for RCStream is,
but you should really start looking at it right now.


--
Antoine "hashar" Musso



Re: recent changes stream

Sumana Harihareswara-2
In reply to this post by Sumana Harihareswara-2
I was wrong - the code is merged but we need to have more architectural
discussion before we deploy, so I moved it out of the RfC archive.

Sumana Harihareswara
Senior Technical Writer
Wikimedia Foundation


On Fri, Jul 11, 2014 at 1:18 PM, Sumana Harihareswara <[hidden email]> wrote:

> This was merged on May 9th, so I have moved the RfC
> https://www.mediawiki.org/wiki/Requests_for_comment/Publishing_the_RecentChanges_feed
> to the archive - is that right?
>
> Can I understand better where we're using PubSubHubbub vs WebSockets vs
> rcstream?
>
> Sumana Harihareswara
> Senior Technical Writer
> Wikimedia Foundation

Re: recent changes stream

Petr Bena
In reply to this post by Antoine Musso-3
Right now it seems worse to me on several points:

* it requires more traffic (slower and inefficient)
* it requires some extra libraries that enlarge the dependency tree
* the protocol itself is more complicated and less flexible than IRC,
which can be programmed in a few minutes

What's the point of replacing one technology with a worse one and
forcing everyone to use it?


Re: recent changes stream

Petr Bena
BTW do you have any usage data or feedback from people who switched from
IRC to this one? On what basis did you decide to phase it out? I bet my
shoes that when you turn off the IRC feed, half of Wikipedia will break
and the other half will start blaming all the devs here on wikitech-l
for breaking everything.

You really should do some decent research first. I have no problem
with deploying new technologies, but phasing out old working ones that
everyone uses is definitely not an improvement to me. Just go to the
en.wikipedia channel on the IRC feed and have a look at the number of
users. They will all be broken once you shut it off.


Re: recent changes stream

Petr Bena
Also https://bugzilla.wikimedia.org/show_bug.cgi?id=67888 needs to be
fixed so that people have some chance to test whether their tools work
with WebSockets at least a few months before the switch.
