Wikimedia and Environment

Wikimedia and Environment

Teofilo
You have probably heard about CO2 and the conference being held these
days in Copenhagen (1).

You have probably heard about the goal of carbon neutrality at the
Wikimania conference in Gdansk in July 2010 (2).

You may want to discuss the basic and perhaps naive wishes I have
written down on the strategy wiki about paper consumption (3).

Do we have any idea of the energy consumption involved in accessing a
Wikipedia article online? Some people say that a few minutes' search
on a search engine costs as much energy as boiling water for a cup of
tea: is that story true in the case of Wikipedia (4)?

How about moving the servers (5) from Florida to a cold place
(Alaska, Canada, Finland, Russia), so that they can be used to heat
offices or homes? It might not be unrealistic: one can read such
things as "the solution was to provide nearby homes with our waste
heat" (6).

(1) http://en.wikipedia.org/wiki/United_Nations_Climate_Change_Conference_2009
(2) http://meta.wikimedia.org/wiki/Wikimania_2010/Bids/Gda%C5%84sk#Environmental_issues
(3) http://strategy.wikimedia.org/wiki/Proposal:Environmental_policy_for_paper_products
(4) http://technology.timesonline.co.uk/tol/news/tech_and_web/article5489134.ece
(5) http://meta.wikimedia.org/wiki/Wikimedia_servers
(6) http://www.greenercomputing.com/news/2009/12/08/giant-data-center-heat-london-homes


Re: Wikimedia and Environment

Mike DuPont
On Sat, Dec 12, 2009 at 5:32 PM, Teofilo <[hidden email]> wrote:
> Do we have any idea of the energy consumption involved in accessing a
> Wikipedia article online? Some people say that a few minutes' search
> on a search engine costs as much energy as boiling water for a cup of
> tea: is that story true in the case of Wikipedia (4)?

My 2 cents: this PHP is cooking more cups of tea than an optimized
program written in C.


Re: Wikimedia and Environment

geni
In reply to this post by Teofilo
2009/12/12 Teofilo <[hidden email]>:

> How about moving the servers (5) from Florida to a cold place
> (Alaska, Canada, Finland, Russia), so that they can be used to heat
> offices or homes? It might not be unrealistic: one can read such
> things as "the solution was to provide nearby homes with our waste
> heat" (6).
>

Alaska has seriously expensive construction costs, and the other
places listed have unacceptable legal systems.


--
geni


Re: Wikimedia and Environment

William Pietri
In reply to this post by Teofilo
On 12/12/2009 08:32 AM, Teofilo wrote:
> Do we have any idea of the energy consumption involved in accessing a
> Wikipedia article online? Some people say that a few minutes' search
> on a search engine costs as much energy as boiling water for a cup of
> tea: is that story true in the case of Wikipedia (4)?

I don't have time to do the math right now, but I believe this could be
estimated from publicly available data. You'd take the pageview numbers:

http://stats.wikimedia.org/wikimedia/squids/SquidReportRequests.htm

You'd look up our various servers:

http://wikitech.wikimedia.org/view/Main_Page

And then make some reasonable guesses as to actual power consumption.
(Sysadmins often measure this, so some Googling would turn up good
approximations.) Divide one number by the other and you have a
reasonably good estimate of power usage per pageview.

You could take that a step further by looking up the power mix where
the server farms are located and estimating CO2 output.
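For instance, here is a rough sketch of that arithmetic in Python.
Every input below is an illustrative guess, not a measured figure; you
would substitute real pageview counts from the squid reports and a
real server inventory from wikitech:

SERVERS = 400                  # assumed server count - check wikitech
WATTS_PER_SERVER = 250.0       # assumed average draw per machine
PAGEVIEWS_PER_MONTH = 10e9     # assumed monthly views - check squid stats
GRID_CO2_G_PER_KWH = 600.0     # assumed carbon intensity of the grid

HOURS_PER_MONTH = 30 * 24
kwh_per_month = SERVERS * WATTS_PER_SERVER * HOURS_PER_MONTH / 1000.0

# Divide one number by the other: energy and CO2 per pageview.
joules_per_view = kwh_per_month * 3.6e6 / PAGEVIEWS_PER_MONTH
co2_g_per_view = kwh_per_month * GRID_CO2_G_PER_KWH / PAGEVIEWS_PER_MONTH

print("kWh per month: %.0f" % kwh_per_month)
print("J per pageview: %.1f" % joules_per_view)
print("gCO2 per pageview: %.4f" % co2_g_per_view)

# For the tea comparison: heating 0.25 l of water by 80 K takes about
# 0.25 * 4186 * 80 = ~84,000 J, so with these guesses one cup of tea
# equals a few thousand pageviews.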

If anybody tries to do this and gets stuck, drop me a line.

William


Re: Wikimedia and Environment

Geoffrey Plourde
In reply to this post by Teofilo
The only reason the servers and Internet access produce CO2 emissions is the defective and antiquated energy production systems we use across the world. As we move towards more efficient and "cleaner" means of energy production, the carbon footprint should decrease.


Moving servers to Scandinavia would be interesting, but logistically unsound. I agree that it would be an effective reuse of energy, but I am concerned about the access problems of relocating assets to one region. That said, placing new servers in Scandinavia on a grid where the waste heat can be reused is not a bad idea, and would be something for the chapters there to look at.

With regard to Florida, if the servers are in an office building, one way to decrease costs might be to reconfigure the environmental systems to use the energy from the servers to heat or cool the building. Wikimedia would then be able to recoup part of the utility bills from surrounding tenants.

However, engineering input would be most beneficial in evaluating these interesting proposals.

Geoffrey




Re: Wikimedia and Environment

David Gerard-2
2009/12/12 Geoffrey Plourde <[hidden email]>:

> The only reason the servers and Internet access produce CO2 emissions is the defective and antiquated energy production systems we use across the world. As we move towards more efficient and "cleaner" means of energy production, the carbon footprint should decrease.
> Moving servers to Scandinavia would be interesting, but logistically unsound. I agree that it would be an effective reuse of energy, but I am concerned about the access problems of relocating assets to one region. That said, placing new servers in Scandinavia on a grid where the waste heat can be reused is not a bad idea, and would be something for the chapters there to look at.


Iceland! Geothermal energy!


- d.


Re: Wikimedia and Environment

masti-2
On 12.12.2009 at 22:36, David Gerard wrote:
> 2009/12/12 Geoffrey Plourde<[hidden email]>:
>
>> The only reason the servers and Internet access produce CO2 emissions is the defective and antiquated energy production systems we use across the world. As we move towards more efficient and "cleaner" means of energy production, the carbon footprint should decrease.
>> Moving servers to Scandinavia would be interesting, but logistically unsound. I agree that it would be an effective reuse of energy, but I am concerned about the access problems of relocating assets to one region. That said, placing new servers in Scandinavia on a grid where the waste heat can be reused is not a bad idea, and would be something for the chapters there to look at.
>
>
> Iceland! Geothermal energy!


But we need to cool our servers, not heat them :)

masti
DataCenter Manager :)


Re: Wikimedia and Environment

David Gerard-2
2009/12/12 masti <[hidden email]>:
> On 12.12.2009 at 22:36, David Gerard wrote:

>> Iceland! Geothermal energy!

> But we need to cool our servers, not heat them :)


I think they've got some of that there too ;-)


- d.


Re: Wikimedia and Environment

Benjamin Lees
In reply to this post by Teofilo
On Sat, Dec 12, 2009 at 11:32 AM, Teofilo <[hidden email]> wrote:

> How about moving the servers (5) from Florida to a cold place
> (Alaska, Canada, Finland, Russia), so that they can be used to heat
> offices or homes? It might not be unrealistic: one can read such
> things as "the solution was to provide nearby homes with our waste
> heat" (6).
>
I imagine the average Wikimedia user is more concerned with whether
their requests are optimized to be fast than with whether they're
optimized to be environmentally friendly. Or, to add a coat of
greenwash: remember that power consumption will be greater if you have
more latency.

If the WMF had $130 million lying around, I would rather they used it to
actually serve their mission.

I think Domas hit the nail on the head in May:
http://lists.wikimedia.org/pipermail/foundation-l/2009-May/051656.html

Re: Wikimedia and Environment

Jussi-Ville Heiskanen
In reply to this post by Teofilo
Teofilo wrote:

>
> Do we have any idea of the energy consumption involved in accessing a
> Wikipedia article online? Some people say that a few minutes' search
> on a search engine costs as much energy as boiling water for a cup of
> tea: is that story true in the case of Wikipedia (4)?
>
> How about moving the servers (5) from Florida to a cold place
> (Alaska, Canada, Finland, Russia), so that they can be used to heat
> offices or homes? It might not be unrealistic: one can read such
> things as "the solution was to provide nearby homes with our waste
> heat" (6).
>
Heh. That brings some old memories right front and center.

I used to be a hang-around member of a hacker collective here in
Finland (Intsu -> The Hole -> Cute, as its designation evolved).

During The Hole phase we had a rented basement space, and we turned
down all the heating in it after receiving a donated old mainframe. I
think you can guess why.


Yours,

Jussi-Ville Heiskanen



Re: Wikimedia and Environment

Nikola Smolenski-2
In reply to this post by Mike DuPont
On Saturday, 12 December 2009, at 17:41:44, [hidden email] wrote:
> On Sat, Dec 12, 2009 at 5:32 PM, Teofilo <[hidden email]> wrote:
> > Do we have any idea of the energy consumption involved in accessing a
> > Wikipedia article online? Some people say that a few minutes' search
> > on a search engine costs as much energy as boiling water for a cup of
> > tea: is that story true in the case of Wikipedia (4)?
>
> My 2 cents: this PHP is cooking more cups of tea than an optimized
> program written in C.

But think of all the coffee developers would have to brew while
coding and optimizing in C!


Re: Wikimedia and Environment

Andre Engels
In reply to this post by Teofilo
On Sat, Dec 12, 2009 at 5:32 PM, Teofilo <[hidden email]> wrote:

> How about moving the servers (5) from Florida to a cold place
> (Alaska, Canada, Finland, Russia), so that they can be used to heat
> offices or homes? It might not be unrealistic: one can read such
> things as "the solution was to provide nearby homes with our waste
> heat" (6).

I don't think that's a practical solution. It's not the cooling that
makes computers cost so much energy - rather the opposite: they use a
lot of energy, and because energy cannot be created or destroyed, that
energy has to go out some way - and that way is heat.

--
André Engels, [hidden email]


Re: Wikimedia and Environment

Mike DuPont
In reply to this post by Nikola Smolenski-2
On Sun, Dec 13, 2009 at 10:30 AM, Nikola Smolenski <[hidden email]> wrote:

> On Saturday, 12 December 2009, at 17:41:44, [hidden email] wrote:
>> On Sat, Dec 12, 2009 at 5:32 PM, Teofilo <[hidden email]> wrote:
>> > Do we have any idea of the energy consumption involved in accessing a
>> > Wikipedia article online? Some people say that a few minutes' search
>> > on a search engine costs as much energy as boiling water for a cup of
>> > tea: is that story true in the case of Wikipedia (4)?
>>
>> My 2 cents: this PHP is cooking more cups of tea than an optimized
>> program written in C.
>
> But think of all the coffee developers would have to brew while
> coding and optimizing in C!

But that is a one-off expense. That is why we programmers can earn a
living: we can work on many projects. Besides, we drink coffee while
playing UrbanTerror as well.

1. PHP is very hard to optimize.
2. MediaWiki has a pretty nonstandard wikitext syntax. The best parser
I have seen is the Python implementation of the wikibook parser. But
given that each plugin can change the syntax as it pleases, parsing
will only get more complex.
3. Even Python is easier to optimize than PHP.
4. The other question is: does it make sense to have such a
centralized client-server architecture? We have been talking about
using a distributed VCS for MediaWiki.
5. Even if MediaWiki were fully distributed, it would still cost CPU,
but that cost would be spread out. Each edit that has to be copied
causes work to be done - in a distributed system, even more work in
total.
6. I have been wondering who the beneficiary of all these millions
spent on bandwidth is - where does that money go, anyway? What about
building a Wikipedia network and having the people who want access pay,
instead of us paying to give it away? With those millions you can buy
a lot of routers and cables.
7. Now, back to optimization. Let's say you were able to optimize the
program: we would identify the major CPU burners and optimize them
away. That does not solve the problem, because I would think the PHP
program is only a small part of the entire issue. The wasteful way the
data flows is the cause of the waste, not the program itself. Even if
the program were much more efficient at moving around data that is not
needed, the data would still not be needed.

In an optimal world, this would eventually lead to updates not being
distributed at all. Not all changes have to be centralized. Let's say
there is one editor who pulls the changes from others and makes a
public version: only they would need to have all the data for that one
topic. I think you could optimize Wikipedia along the lines of data
travelling only to the people who need it (editors versus viewers).
You would first build a way to route edits into special-interest
groups, creating smaller virtual subnetworks of the editors' machines
working together in a direct peer-to-peer network.

So if you have 10 people collaborating on a topic, only the results
of that work would be checked into the central server; the
decentralized communication would involve fewer parties and reduce the
resources used.
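As a toy illustration (not a benchmark, and with made-up numbers),
here is what that scheme does to the central server's share of the
traffic, in Python. Note that the total number of transfers does not
drop; it just moves off-center:

def central_only(groups, edits_per_group):
    # Today: every edit travels to and from the central server.
    return groups * edits_per_group

def grouped(groups, edits_per_group):
    # Proposed: edits circulate peer-to-peer inside each topic group,
    # and each group pushes one merged revision to the central server.
    peer_traffic = groups * edits_per_group
    central_traffic = groups
    return peer_traffic, central_traffic

groups, edits = 1000, 100   # hypothetical topic groups, edits per group
print("central-only, central load:", central_only(groups, edits))
peer, central = grouped(groups, edits)
print("grouped, peer traffic:", peer)
print("grouped, central load:", central)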

See also:
http://strategy.wikimedia.org/wiki/Proposal:A_MediaWiki_Parser_in_C


mike


Re: Wikimedia and Environment

Domas Mituzas
Hi!!!

> 1. PHP is very hard to optimize.

No, PHP is much easier to optimize (read: performance-oriented refactoring).

> 3. Even Python is easier to optimize than PHP.

Python's main design idea is readability. What is readable is easier to refactor too, right? :)

> 4. The other question is: does it make sense to have such a
> centralized client-server architecture? We have been talking about
> using a distributed VCS for MediaWiki.

Lunatics without any idea of stuff being done inside the engine talk about distribution. Let them!

> 5. Even if MediaWiki were fully distributed, it would still cost CPU,
> but that cost would be spread out. Each edit that has to be copied
> causes work to be done - in a distributed system, even more work in
> total.

Indeed, distribution raises costs.

> 6. I have been wondering who the beneficiary of all these millions
> spent on bandwidth is - where does that money go, anyway? What about
> building a Wikipedia network and having the people who want access pay,
> instead of us paying to give it away? With those millions you can buy
> a lot of routers and cables.

LOL. There's quite some competition in the network department, and it became an economy of scale (or of serving YouTube) long ago.

> 7. Now, back to optimization. Let's say you were able to optimize the
> program: we would identify the major CPU burners and optimize them
> away. That does not solve the problem, because I would think the PHP
> program is only a small part of the entire issue. The wasteful way the
> data flows is the cause of the waste, not the program itself. Even if
> the program were much more efficient at moving around data that is not
> needed, the data would still not be needed.

We could have a new kind of Wikipedia: one where we serve blank pages, and people imagine the content. We've done that with moderate success quite often.

> So if you have 10 people collaborating on a topic, only the results
> of that work would be checked into the central server; the
> decentralized communication would involve fewer parties and reduce the
> resources used.

Except that you still need a tracker to handle all that and to resolve conflicts; there are still no good methods of resolving conflicts among a small number of untrusted entities.

> See also:
> http://strategy.wikimedia.org/wiki/Proposal:A_MediaWiki_Parser_in_C

How much would that save?

Domas

Re: Wikimedia and Environment

Mike DuPont
Let me sum this up. The basic optimization is this: you don't need to
transfer every new revision of an article to all users at all times.
The central server could just say: this is the last revision released
by the editors responsible for this article; there are 100 edits in
progress, and you can get involved by going to this page over here
(hosted on a server someplace else). There is no need to transfer
those 100 edits to all the users on the web, and they are not
interesting to everyone.
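A minimal sketch of such a lookup, with entirely hypothetical names
and numbers, might look like this in Python:

# The central server keeps only the released revision and a pointer to
# wherever the work in progress is hosted (all values hypothetical).
ARTICLES = {
    "Copenhagen": {
        "released_revision": 123456,
        "edits_in_progress": 100,
        "workgroup": "http://workgroup.example.org/Copenhagen",
    },
}

def lookup(title):
    entry = ARTICLES[title]
    return {
        "read_this": entry["released_revision"],   # served to everyone
        "pending": entry["edits_in_progress"],     # advertised, not sent
        "get_involved_at": entry["workgroup"],     # hosted someplace else
    }

print(lookup("Copenhagen"))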


On Sun, Dec 13, 2009 at 12:10 PM, Domas Mituzas <[hidden email]> wrote:
>> 4. The other question is: does it make sense to have such a
>> centralized client-server architecture? We have been talking about
>> using a distributed VCS for MediaWiki.
>
> Lunatics without any idea of stuff being done inside the engine talk about distribution. Let them!

I hope you are serious here.
Let's take a look at what the engine does: it allows editing of text,
it renders the text, and it serves the text. The original wiki from
Ward Cunningham is a Perl script of the most basic form. There is not
much magic involved. Of course you need search tools, version
histories, and such, and there are places to optimize all of those
processes.

It is not lunacy; it is a fact that such work can be done, and is
done, without a central server in many places.

Just look, for example, at how people edit code in an open-source
software project using git: it is distributed, and it works.

There are already git-based wikis available.
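To make the idea concrete, here is a minimal sketch of a git-backed
page store in Python. It is just plain git plumbing; a real git wiki
would still need rendering, access control, and merge handling on top:

import os
import subprocess
import tempfile

repo = tempfile.mkdtemp()
subprocess.check_call(["git", "init", "-q"], cwd=repo)
subprocess.check_call(["git", "config", "user.name", "Wiki"], cwd=repo)
subprocess.check_call(
    ["git", "config", "user.email", "wiki@example.org"], cwd=repo)

def save_page(title, text, message):
    # One page revision == one commit.
    filename = title + ".wiki"
    with open(os.path.join(repo, filename), "w") as f:
        f.write(text)
    subprocess.check_call(["git", "add", filename], cwd=repo)
    subprocess.check_call(["git", "commit", "-q", "-m", message], cwd=repo)

def page_history(title):
    # The page history is simply the commit log for that file.
    out = subprocess.check_output(
        ["git", "log", "--oneline", "--", title + ".wiki"], cwd=repo)
    return out.decode().splitlines()

save_page("Sandbox", "Hello, wiki.\n", "create page")
save_page("Sandbox", "Hello, distributed wiki.\n", "reword greeting")
print(page_history("Sandbox"))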
There are also peer-to-peer networks, such as Tor or Freenet, that
could be used.

You could split the editing of Wikipedia articles across a network of
git servers around the globe, with the rendering and distribution of
the resulting data remaining the job of the WMF.

Resolving conflicts is pretty simple in git: everyone has a copy and
can do what they want with it. If you like the version from someone
else, you pull it.

As for Wikipedia having only one viewpoint - the NPOV reflected by
the current revision at any one point in time - that version would be
the one pushed from its editors' repositories. You could imagine one
senior editor for each topic who maintains their own repository of
pages and pulls in versions from many people.

>> 7. Now, back to optimization. Let's say you were able to optimize the
>> program: we would identify the major CPU burners and optimize them
>> away. That does not solve the problem, because I would think the PHP
>> program is only a small part of the entire issue. The wasteful way the
>> data flows is the cause of the waste, not the program itself. Even if
>> the program were much more efficient at moving around data that is not
>> needed, the data would still not be needed.
>
> We could have a new kind of Wikipedia: one where we serve blank pages, and people imagine the content. We've done that with moderate success quite often.

Please, let's be serious here!
I am talking about the fact that not all people need all the
centralized services at all times.

>
>> So if you have 10 people collaborating on a topic, only the results
>> of that work would be checked into the central server; the
>> decentralized communication would involve fewer parties and reduce the
>> resources used.
>
> Except that you still need a tracker to handle all that and to resolve
> conflicts; there are still no good methods of resolving conflicts among
> a small number of untrusted entities.

A tracker that manages which server is used by which group of editors
can be pretty efficient; essentially it is a form of DNS. A tracker
need only show you the current repositories registered for a certain
topic.

Resolving conflicts is important, but you only need so many people for that.

The entire community does not get involved in all the conflicts.
There are only a certain number of people deeply involved in any one
section of Wikipedia at any given time.

Imagine that you had, let's say, 1,000 conference rooms for
discussion and collaboration spread around the world, and the results
from those rooms were fed back into Wikipedia. These rooms, or
servers, would process the edits and conflicts for any given set of
pages.

My idea is that you don't need a huge server to resolve conflicts.
Many pages don't have many conflicts; certain areas need constant
arbitration, of course. You could even split the groups by viewpoint,
so that the arbitration team only deals with the output of two teams
(pro and contra).

Even on a highly contested page, the number of editors involved is
not unlimited.

In retrospect you would be able to identify which groups of editors
are collaborating (enhancing each other's work) and which are
conflicting (overwriting each other). If you split them into different
rooms when they should be collaborating, and so reduce the conflicts,
you will win a lot.

Even on the German Wikipedia, most edits do not show up immediately;
someone checks the changes first. That also means that those edits,
before they are committed, do not need to go to a single data center.

People interested in getting all the available versions would need to
be able to find them, but for that kind of thing people would be
prepared to wait a bit longer while the data is collected from many
servers. You should be able to pull just the versions you want, at
the depth you want. That selection of versions and depth would be a
large optimization in itself.

So there are different ways to reduce the load on a single server and
create pockets of processing for different topics. The only really
important thing is that people who are working on the same topic are
working on the same server or have a path of communication.

To sum it up: if conflicts are the major problem in Wikipedia - the
major cost in terms of review and coordination - then you should
rethink the workflow to push the processing time back to the editor
causing the conflict.

Right now revisions are stored whole, not as deltas. If you store
only the new information, you need less storage. Transferring only the
changes across the net, rather than full revisions, would be one big
optimization for Wikipedia.
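As a sketch of the size difference, using only the Python standard
library (a toy article, and unified diffs rather than the binary
deltas a real system would use):

import difflib

old = ["line %d of the article\n" % i for i in range(500)]
new = list(old)
new[250] = "line 250, reworded by an editor\n"   # a one-line edit

delta = list(difflib.unified_diff(old, new, lineterm="\n"))

full_bytes = sum(len(line) for line in new)
delta_bytes = sum(len(line) for line in delta)
print("full revision:", full_bytes, "bytes")
print("delta only:", delta_bytes, "bytes")

# Note: applying a unified diff on the receiving side needs a patch
# tool; the point here is only the transfer-size ratio.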

Of course, even a new section could be a conflict if the new text is
garbage or in need of editing. If you want to replace a single word or
a sentence, that would, let's say, create a conflict branch in one of
the external conference rooms, which would host the page until the
work there is finished. The main server would just hold a pointer to
the workgroup, and the load would be pushed away. That also means any
local server would be able to process the data and host the branch
until it is pushed back to the main server.

OK, well, I think this is enough for now. I do ask you to remain
serious, so that we can have a serious discussion on the topic of
optimization.

thanks,
mike


Re: Wikimedia and Environment

Teofilo
In reply to this post by Geoffrey Plourde
2009/12/12, Geoffrey Plourde <[hidden email]>:
> With regard to Florida, if the servers are in an office building,
> one way to decrease costs might be to reconfigure the environmental
> systems to use the energy from the servers to heat or cool the
> building. Wikimedia would then be able to recoup part of the utility
> bills from surrounding tenants.

I am not sure the laws of thermodynamics (1) would allow that heat to
be used to cool a building. You would need a cold sink, such as a
river, to convert the heat back into electricity. And it might be more
cost-efficient to circulate the river water directly through the
building, in which case your extra heat still goes unused.
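A back-of-the-envelope Carnot calculation, with guessed temperatures,
shows why converting that low-grade heat back to electricity is
unattractive:

# Carnot limit for a heat engine running between server exhaust air
# and river water; both temperatures are illustrative assumptions.
T_HOT = 273.15 + 45.0    # ~45 C exhaust air, in kelvin
T_COLD = 273.15 + 10.0   # ~10 C river water, in kelvin

efficiency = 1.0 - T_COLD / T_HOT
print("ideal efficiency: %.1f%%" % (100.0 * efficiency))
# Roughly 11% at the theoretical limit, before any real losses -
# which is why using the warm air directly for space heating beats
# turning it back into electricity.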

This is why I think it is harder to find solutions in a hot place
like Florida than in a cold country (as long as you don't question the
very existence of heated homes in cold countries, leaving aside the
possibility of moving people and their homes from cold to warm
countries).

(1) http://en.wikipedia.org/wiki/Laws_of_thermodynamics#Second_law


Re: Wikimedia and Environment

Teofilo
In reply to this post by Andre Engels
2009/12/13, Andre Engels <[hidden email]>:
> I don't think that's a practical solution. It's not the cooling that
> makes computers cost so much energy - rather the opposite: they use a
> lot of energy, and because energy cannot be created or destroyed, that
> energy has to go out some way - and that way is heat.

In cold countries, energy can have two lives: a first life making
calculations in a computer, or transforming matter (ore into metal,
trees into books), and a second life heating homes.

But the best is to use no energy at all: see the OLPC project in
Afghanistan (a computer with pedals, like the sewing machines of our
great-great-great-grandmothers) (1).

(1) http://www.olpcnews.com/countries/afghanistan/updates_from_olpc_afghanistan_1.html


Re: Wikimedia and Environment

David Gerard-2
2009/12/13 Teofilo <[hidden email]>:

> But the best is to use no energy at all: see the OLPC project in
> Afghanistan (a computer with pedals, like the sewing machines of our
> great-great-great-grandmothers) (1).
> (1) http://www.olpcnews.com/countries/afghanistan/updates_from_olpc_afghanistan_1.html


That's the answer! Distributed serving by each volunteer's pedal power!


- d.


Re: Wikimedia and Environment

Domas Mituzas
In reply to this post by Mike DuPont
Dude, I need that strong stuff you're having.

> Let me sum this up. The basic optimization is this: you don't need to
> transfer every new revision of an article to all users at all times.

There's not much difference between transferring every revision and just some 'good' revisions.

> The central server could just say: this is the last revision released
> by the editors responsible for this article; there are 100 edits in
> progress, and you can get involved by going to this page over here
> (hosted on a server someplace else).

Editing is a minuscule part of our workload.

> There is no need to transfer
> those 100 edits to all the users on the web, and they are not
> interesting to everyone.

Well, with flagged revisions we may not transfer them, and with a pure wiki we may. The point is, someone has to do the transferring.

> Let's take a look at what the engine does: it allows editing of text,

That includes conflict resolution, cross-indexing, history tracking, abuse filtering, full text indexing, etc.

> It renders the text.

It means building the output out of many individual assets (templates, anyone?), embedding media, transforming based on user options, etc.

> It serves the text.

And not only text - it serves complex aggregate views like 'last related changes', 'watchlist', 'contributions by new users', etc.

> The original wiki from
> Ward Cunningham is a Perl script of the most basic form.

That is probably one of the reasons why we're not using Ward Cunningham's wiki anymore and have something else, called MediaWiki.

> There is not much magic
> involved.

Not much use at a multi-million-article wiki with hundreds of millions of revisions.

> Of course you need search tools, version histories and such.
> There are places for optimizing all of those processes.

And we've done that with MediaWiki ;-)

> It is not lunacy; it is a fact that such work can be done, and is
> done, without a central server in many places.

Name me a single website with a backend distributed over the Internet.

> Just look, for example, at how people edit code in an open-source
> software project using git: it is distributed, and it works.

Git is limited and expensive for way too many of our operations. Also, you have to have a whole copy of the git repository; it has no on-demand remote pulls, nor any caching layer attached to it.
I appreciate your willingness to clone Wikipedia.

It works if you want expensive accesses, of course. We're talking about serving a website here, not the case so nicely depicted at http://xkcd.com/303/

> There are already git-based wikis available.

Has anyone tried putting Wikipedia's content on them and simulating our workload? :)
I understand that git's semantics are usable for Wikipedia's basic revision storage, but its data would still have to be replicated to other types of storage that allow various cross-indexing and cross-reporting.

How well does git handle parallelism internally? How can it be parallelized over multiple machines? Etc. ;-) It lacks engineering. The basic stuff is nice, but it isn't what we need.

> There are also peer-to-peer networks, such as Tor or Freenet, that
> could be used.

How? These are just transports.

> You could split the editing of Wikipedia articles across a network of
> git servers around the globe, with the rendering and distribution of
> the resulting data remaining the job of the WMF.

And how would that save any money? By adding much more complexity to most of our processes while leaving the major cost item untouched?

> Resolving conflicts is pretty simple in git: everyone has a copy and
> can do what they want with it. If you like the version from someone
> else, you pull it.

Whose revision does Wikimedia merge?

> As for Wikipedia having only one viewpoint - the NPOV reflected by
> the current revision at any one point in time - that version would be
> the one pushed from its editors' repositories. You could imagine one
> senior editor for each topic who maintains their own repository of
> pages and pulls in versions from many people.

Go to Citizendium, k, thx.

> Please, let's be serious here!
> I am talking about the fact that not all people need all the
> centralized services at all times.

You have an absolute misunderstanding of what our technology platform is doing. You're wasting your time, you're wasting my time, and you're wasting the time of everyone who has to read your emails or mine.

> A tracker that manages which server is used by which group of editors
> can be pretty efficient; essentially it is a form of DNS. A tracker
> need only show you the current repositories registered for a certain
> topic.

Seriously, I need that stuff you're on. Have you ever been involved in building anything remotely similar?

> The entire community does not get involved in all the conflicts.
> There are only a certain number of people deeply involved in any one
> section of Wikipedia at any given time.

Have you ever edited Wikipedia? :) Do you understand the editorial process there?

> Imagine that you had, let's say, 1,000 conference rooms for
> discussion and collaboration spread around the world, and the results
> from those rooms were fed back into Wikipedia. These rooms, or
> servers, would process the edits and conflicts for any given set of
> pages.

How is that more efficient?

> My idea is that you don't need a huge server to resolve conflicts.
> Many pages don't have many conflicts; certain areas need constant
> arbitration, of course. You could even split the groups by viewpoint,
> so that the arbitration team only deals with the output of two teams
> (pro and contra).

NEED YOUR STUFFFFFF.

> In retrospect you would be able to identify which groups of editors
> are collaborating (enhancing each other's work) and which are
> conflicting (overwriting each other). If you split them into different
> rooms when they should be collaborating, and so reduce the conflicts,
> you will win a lot.

You'll get the Nobel Prize in Literature if you continue like this!
Infinite monkeys, when managed properly... ;-)

> Even on the German Wikipedia, most edits do not show up immediately;
> someone checks the changes first. That also means that those edits,
> before they are committed, do not need to go to a single data center.

Again, you don't win efficiency. You win 'something' - like bragging rights in your local p2p-wanking-circle.
This part of the editorial process is minuscule in terms of workload.

> You should be able to pull just the versions you want, at
> the depth you want. That selection of versions and depth would be a
> large optimization in itself.

Except that that is not where our costs lie.

> So there are different ways to reduce the load on a single server and
> create pockets of processing for different topics. The only really
> important thing is that people who are working on the same topic are
> working on the same server or have a path of communication.

YOU SHOULD MENTION JABBER!!!111oneoneeleven

> To sum it up: if conflicts are the major problem in Wikipedia - the
> major cost in terms of review and coordination - then you should
> rethink the workflow to push the processing time back to the editor
> causing the conflict.

Semi-atomic resolution of conflicts is what allows fast collaboration to happen.
You fail to understand that.

> Right now revisions are stored whole, not as deltas. If you store
> only the new information, you need less storage. Transferring only the
> changes across the net, rather than full revisions, would be one big
> optimization for Wikipedia.

??????

> OK, well, I think this is enough for now. I do ask you to remain
> serious, so that we can have a serious discussion on the topic of
> optimization.

I am serious. You fail at everything.

You fail to understand the implications of online operation (privacy, security, etc.).
You fail to understand our content.
You fail to understand our costs.
You fail to understand our archival and cross-indexing needs.
You fail to understand our editorial process efficiency.
You fail to understand that distribution increases overall costs.

You fail to understand pretty much everything.


I admire your enthusiasm for 'scaling a basic wiki'. We're not running a basic wiki; we're way beyond that. I have no idea how I can have a serious discussion with someone so out of touch with reality.
You suggest a high-complexity engineering project that would bring nearly no wins over anything. At this point you should erase your email client; that would be much more efficient.

I deliberately keep this topic on foundation-l, because I'm sure it is not worth the time of people on wikitech-l@ ;-)

Domas



Re: Wikimedia and Environment

Domas Mituzas
In reply to this post by Teofilo
Hi!

> In cold countries, energy can have two lives: a first life making
> calculations in a computer, or transforming matter (ore into metal,
> trees into books), and a second life heating homes.

One would need to build out data centers with a fairly static energy output (e.g. deploy 10 MW at once and don't grow) for that. Not our business.

> But the best is to use no energy at all: see the OLPC project in
> Afghanistan (a computer with pedals, like the sewing machines of our
> great-great-great-grandmothers) (1).

Do you realize that, in terms of carbon footprint, that is much, much less efficient? Look at the title of the thread.

Domas