[WikiEN-l] An obscene example of remote loading

Re: [WikiEN-l] An obscene example of remote loading

Roberto Alfonso
I believe reference.com and about.com are, along with answers.com,
the biggest reusers in profile and size.

RB

On 1/28/07, Andrew Gray <[hidden email]> wrote:

> On 28/01/07, William Pietri <[hidden email]> wrote:
> > Rob Church wrote:
> > > Legitimate mirrors and people who want to reuse our content are free,
> > > and encouraged, to download a database dump and process it for their
> > > needs.
> >
> > Say, are there examples of people who do this well, contributing back to
> > Wikipedia or to the general public? The examples I've seen are all a bit
> > disappointing, but perhaps that's just because the outrageous ones
> > generate more attention.
>
> answers.com is perhaps our most high-profile reuser, but we have an
> agreement with them for a live feed. I'm not offhand aware of a
> particularly shining example of a database-dump site, partly because
> we tend to outstrip them quite fast (and because enwiki dumps were
> iffy for quite a while, meaning most of them are long-stagnant)
>
> There are certainly some decent offline projects using dumps, though,
> in one form or another.
>
> --
> - Andrew Gray
>   [hidden email]
>
> _______________________________________________
> WikiEN-l mailing list
> [hidden email]
> To unsubscribe from this mailing list, visit:
> http://lists.wikimedia.org/mailman/listinfo/wikien-l
>

_______________________________________________
WikiEN-l mailing list
[hidden email]
To unsubscribe from this mailing list, visit:
http://lists.wikimedia.org/mailman/listinfo/wikien-l

Re: [WikiEN-l] An obscene example of remote loading

David Gerard-2
On 28/01/07, Roberto Alfonso <[hidden email]> wrote:

> I believe reference.com and about.com are, along with answers.com,
> the biggest reusers in profile and size.


Does Yahoo still take a live feed?


- d.


Re: [WikiEN-l] An obscene example of remote loading

Jimmy Wales
In reply to this post by William Pietri
William Pietri wrote:
> Say, are there examples of people who do this well, contributing back to
> Wikipedia or to the general public? The examples I've seen are all a bit
> disappointing, but perhaps that's just because the outrageous ones
> generate more attention.

Answers.com is an excellent example.  They license Wikipedia content and
also license content from many other more traditional sources, and offer
it up to people who search on their site.  They have always been a
strong supporter of Wikimedia and have been traditionally the #1 sponsor
of the annual Wikimania conference.

So they benefit from our work, and they give back to the community as well.

--Jimbo


Re: [WikiEN-l] An obscene example of remote loading

Michael Noda
In reply to this post by Andrew Gray
On 1/28/07, Andrew Gray <[hidden email]> wrote:
> I'm not offhand aware of a
> particularly shining example of a database-dump site, partly because
> we tend to outstrip them quite fast (and because enwiki dumps were
> iffy for quite a while, meaning most of them are long-stagnant)


I'm unfamiliar with the technical aspects of creating a database dump,
but is it the sort of thing that would be made better and faster by
throwing more computing resources at it?

I expect the answer to this question is "yes, but the Foundation is
$500,000 short, and those hypothetical servers went on the budget
chopping block on January 16th."  :-(  <rhetorical> When's the next
fundraiser? </rhetorical>

> There are certainly some decent offline projects using dumps, though,
> in one form or another.

Like, say, Google Earth, which I expect would be ecstatic if it could
get more frequent dumps.


Re: [WikiEN-l] An obscene example of remote loading

William Pietri
In reply to this post by Jimmy Wales
Thanks all for the helpful answers.

Jimmy Wales wrote:

> William Pietri wrote:
>  
>> Say, are there examples of people who do this well, contributing back to
>> Wikipedia or to the general public? [...]
>>    
>
> Answers.com is an excellent example.  They license Wikipedia content and
> also license content from many other more traditional sources, and offer
> it up to people who search on their site.  They have always been a
> strong supporter of Wikimedia and have been traditionally the #1 sponsor
> of the annual Wikimania conference.
>
> So they benefit from our work, and they give back to the community as well.
>  

Interesting. Are there examples of organizations who give back in other
ways?

I ask with some ulterior motive. I and some pals are looking at doing a
commercial startup that would involve a substantial amount of open
content. That content would be narrower but deeper than Wikipedia, by
which I mean it would cover a much smaller set of topics, but would
include a fair bit of material that Wikipedia currently deletes for lack
of notability.

As a startup, major cash donations are unlikely, at least for a few
years. But where our material overlaps with Wikipedia, we wanted to find
ways to collaborate. I think the only item currently in our product plan
is a tool to compare related articles, so that editors of either site
can easily diff and merge parts they like from the other. Ideally, we'd
open-source that code so that it could be used to compare and sync
between other open-content sites as well.

Do folks here have other ideas that would be mutually beneficial?


Thanks,

William



Re: [WikiEN-l] An obscene example of remote loading

Firefoxman
In reply to this post by David Gerard-2
Yahoo has its own special file made for it. See
http://download.wikimedia.org/enwiki/20070124/
under "Extracted page abstracts for Yahoo": 1.3 GB!

On 1/28/07, David Gerard <[hidden email]> wrote:

>
> On 28/01/07, Roberto Alfonso <[hidden email]> wrote:
>
> > I believe reference.com and about.com are, along with answers.com,
> > the biggest reusers in profile and size.
>
>
> Does Yahoo still take a live feed?
>
>
> - d.
>



--
-Firefoxman

Re: [WikiEN-l] An obscene example of remote loading

Andrew Gray
In reply to this post by Michael Noda
On 29/01/07, Michael Noda <[hidden email]> wrote:
> I'm unfamiliar with the technical aspects of creating a database dump,
> but is it the sort of thing that would be made better and faster by
> throwing more computing resources at it?
>
> I expect the answer to this question is "yes, but the Foundation is
> $500,000 short, and those hypothetical servers went on the budget
> chopping block on January 16th."  :-(  <rhetorical> When's the next
> fundraiser? </rhetorical>

You'd have to speak to Tim or Brion, but IIRC the problem was simply
that the method of generating dumps had worked fine in the past, and
just collapsed under the sheer *scale* of enwiki - note that de, fr,
etc, all were being done fine. It seems to have been fixed now; there
was a dump released late last year, but it was the first one for quite
a while.

> > There are certainly some decent offline projects using dumps, though,
> > in one form or another.
>
> Like, say, Google Earth, which I expect would be ecstatic if it could
> get more frequent dumps.

GE has dealt directly with the Foundation at some point, and from the
reports I've seen seems to update faster than the usual dump schedule
- there was a brief flurry of "misplaced" articles reported just after
they released it. They may be doing something clever with periodic
crawling of selected pages - as long as it's not "live mirroring", and
they run a local cache, we're okay with that.

--
- Andrew Gray
  [hidden email]


Re: [WikiEN-l] An obscene example of remote loading

Death Phoenix
In reply to this post by Jimmy Wales
I first became interested in contributing to Wikipedia as a direct result of
looking at answers.com articles.

On 1/29/07, Jimmy Wales <[hidden email]> wrote:

>
> William Pietri wrote:
> > Say, are there examples of people who do this well, contributing back to
> > Wikipedia or to the general public? The examples I've seen are all a bit
> > disappointing, but perhaps that's just because the outrageous ones
> > generate more attention.
>
> Answers.com is an excellent example.  They license Wikipedia content and
> also license content from many other more traditional sources, and offer
> it up to people who search on their site.  They have always been a
> strong supporter of Wikimedia and have been traditionally the #1 sponsor
> of the annual Wikimania conference.
>
> So they benefit from our work, and they give back to the community as
> well.
>
> --Jimbo
>

Re: [WikiEN-l] An obscene example of remote loading

Death Phoenix
Or was it some other reuser? My memory is quite spotty these days.

On 1/30/07, Deathphoenix <[hidden email]> wrote:

>
> I first became interested in contributing to Wikipedia as a direct result
> of looking at answers.com articles.
>
> On 1/29/07, Jimmy Wales <[hidden email]> wrote:
> >
> > William Pietri wrote:
> > > Say, are there examples of people who do this well, contributing back to
> > > Wikipedia or to the general public? The examples I've seen are all a bit
> > > disappointing, but perhaps that's just because the outrageous ones
> > > generate more attention.
> >
> > Answers.com is an excellent example.  They license Wikipedia content and
> > also license content from many other more traditional sources, and offer
> > it up to people who search on their site.  They have always been a
> > strong supporter of Wikimedia and have been traditionally the #1 sponsor
> > of the annual Wikimania conference.
> >
> > So they benefit from our work, and they give back to the community as
> > well.
> >
> > --Jimbo
> >