Long-term archiving of Wikimedia content


Long-term archiving of Wikimedia content

metasj
I'm splitting off a separate thread about long-term archiving.  The
original thread is important enough not to derail it.

This is a big topic, and also one that has been addressed in many
different bodies of planning and literature.  The Long Now Foundation
has considered a 10,000-year library project, and their Rosetta
Project tests a technique for 5,000-year preservation of texts.
Sadly, an earlier forum devoted to these ideas has been taken offline,
robots.txt'ed out of the internet archive, and I can't find a copy...
[ a long now apparently doesn't require archival public discussion? :)
]

Kevin Kelly on long-term backups:
  http://blog.longnow.org/2008/08/20/very-long-term-backup/
The original y2k event:
  http://www.longnow.org/projects/past-events/10klibrary/

Related research into long-term archival engineering has turned up
good ideas: laser micro-etching into nickel provides an excellent
price/size/weight point per archived page, and requires only the
[re]creation of decent, bootstrappable optics to recover lost
knowledge.

You could create and distribute etched-plate copies of the 10B words
of all Wikimedia text [and thumbnails?] on perhaps 100 thin nickel
sheets, for roughly $100k / 50kg / 0.01 m^3 (incl padding).  If this
laser etching process were scaled up, it would drop significantly in
price.

SJ


On Mon, May 4, 2009 at 6:41 PM, Thomas Dalton <[hidden email]> wrote:

> 2009/5/4 Nikola Smolenski <[hidden email]>:
>> It seems to me that you are joking, but I was seriously thinking about
>> cooperating with the Long Now project on long term preservation of Wikipedia.
>
> No joke, I think the long term preservation of knowledge is a very worthy cause.
>
>> Printing Wikipedia on acid-free paper every year or at least decade in several
>> copies dispersed on several continents should ensure that the contents last
>> for several centuries at least. It wouldn't be prohibitively expensive either
>> and it could gather some media attention (= sponsors).
>
> Acid-free paper won't last for several centuries without decent
> storage, and we're talking about a small library worth of paper. (See
> http://en.wikipedia.org/wiki/Wikipedia:Size_in_volumes - and that's
> just the English Wikipedia. Include other languages and other projects
> and you have a very sizeable amount of content.) That kind of storage
> isn't particularly cheap. Air tight containers in a cave might work
> pretty well though - caves have very stable temperature, and the air
> tight containers would control humidity - and the caves already exist
> so no need to spend money constructing somewhere.
>
>> For the really long term, we could cooperate with some brickworks, where a
>> brick printer would be introduced into the brick-producing process, so that
>> Wikipedia (and other important works) would be printed on every brick
>> produced. We know that Sumerian tablets have lasted for thousands of years,
>> so these bricks would surely last that long too.
>>
>> And for even longer, do the same with bottle manufacturers.
>
> Yeah, bits and pieces would survive a long time, but you wouldn't get
> any significant portion of the projects saved that way. If you got it
> written on bricks that were being used to build a building you have
> good reason to believe will be around a long time, then it might work,
> but you would need a lot of bricks.
>
> According to the page I linked to above, the English Wikipedia has
> 7,484,527,350 characters. Let's assume an 8pt font (any smaller and it
> becomes difficult to write or read easily) on a standard brick (which
> Wikipedia tells me is, in the UK, 215mm by 65mm), that's about 18
> lines of text and maybe 17 words per line. That's about 300 words per
> brick (I'm assuming only one face will be written on). That works out
> at 25 million bricks. That's well over 1000 typical houses just for
> one copy of one project. Since the vast majority of these bricks
> aren't going to survive you are going to want massive redundancy. I
> don't think it is practical.
>
> Engraving on bottles isn't going to work - the bottles will
> (hopefully!) get recycled.
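
A quick check of the brick arithmetic quoted above, as a sketch only. The character count and the 18 x 17 layout come from the message; the 6 characters per word (including the trailing space) is an assumption of this sketch, not a figure from the thread. Dividing the character count by 300 reproduces the 25 million figure, which suggests the 300 was applied per character; counted per word, the total comes out closer to 4 million bricks.

```python
# Figures quoted in the message above.
CHARACTERS = 7_484_527_350   # English Wikipedia, per Wikipedia:Size_in_volumes
LINES_PER_BRICK = 18
WORDS_PER_LINE = 17
ROUNDED_WORDS_PER_BRICK = 300  # the rounded figure used in the message

# Assumption (not from the thread): average word length plus one space.
CHARS_PER_WORD = 6


def bricks_needed(characters: int, chars_per_brick: int) -> int:
    """Ceiling division: bricks required to hold `characters` characters."""
    return -(-characters // chars_per_brick)


words_per_brick = LINES_PER_BRICK * WORDS_PER_LINE

# Treating 300 as characters per brick reproduces the posted estimate.
as_posted = bricks_needed(CHARACTERS, ROUNDED_WORDS_PER_BRICK)

# Treating 300 as words per brick, under the assumed word length.
by_words = bricks_needed(CHARACTERS, ROUNDED_WORDS_PER_BRICK * CHARS_PER_WORD)

print(f"words per brick (18 x 17): {words_per_brick}")
print(f"bricks at 300 characters each: {as_posted:,}")
print(f"bricks at 300 words each: {by_words:,}")
```

Both figures are for a single copy of one project, before any redundancy is added.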

_______________________________________________
foundation-l mailing list
[hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l

Re: Long-term archiving of Wikimedia content

geni
2009/5/5 Samuel Klein <[hidden email]>:

> You could create and distribute etched-plate copies of the 10B words
> of all Wikimedia text [and thumbnails?] on perhaps 100 thin nickel
> sheets, for roughly $100k / 50kg / 0.01 m^3 (incl padding).  If this
> laser etching process were scaled up, it would drop significantly in
> price.
>
> SJ

High purity nickel would appear to run into the intrinsic value issue.

The value of including thumbnails is complicated. On one hand it
solves the translation issue, since nearly 3 million well-illustrated
articles are unlikely to present a significant translation challenge to
any moderately advanced civilization. On the other hand they take up
more space than pure text.



--
geni


Re: Long-term archiving of Wikimedia content

metasj
Thumbnails wouldn't take up proportionally more space in etching than they
do on screen, so an extra 10-20% overall.  They would probably make
the process a bit more expensive, but still on this scale.  An
illustrated encyclopedia may well be worth twice as much.

Let's see what the Rosetta folks have to say.  I can think of a lot
of people, not least those who have one of the early Rosetta disks,
who would love an archival etched copy of Wikipedia + Commons thumbs,
which might cover some of the early costs of trying this out.

Håkon: perhaps PrinceXML would be useful for making an etch-specific layout?

SJ


Re: Long-term archiving of Wikimedia content

Brian J Mingus
My opinion, inspired by technology and the power of community, is that we
don't need to worry about that problem right now. We could recreate all the
content in short order were all the datacenters simultaneously struck by
asteroids, and more feasible long-term storage solutions will present
themselves in the next few decades. Anything we do right now is just going
to get replaced.


Re: Long-term archiving of Wikimedia content

Tim Starling-2
In reply to this post by metasj
Samuel Klein wrote:
> They wouldn't take up proportionally more space in etching than they
> do on screen.  So an extra 10-20% overall.  They would probably make
> the process a bit more expensive, but still to this scale.  an
> illustrated encyclo may well be worth twice as much.
>
> Let's see what the Rosetta folks have to say.   I can think of a lot
> of people, not least those who have one of the early Rosetta disks,
> who would love an  archival etched copy of Wikipedia + Commons thumbs,
> which might cover some of the early costs of trying this out.

I can tell you what the Rosetta folks would say: they would say that
they paid $125k to Norsam for 5 prototype discs, and that we are free
to do the same. Norsam have developed this technology at great cost
and expect a commercial return, regardless of who's paying them.

<http://www.internetnews.com/storage/article.php/3771051/Storage+That+Really+Lasts.htm>

Personally I think it would be a waste of general funds, since I don't
expect we'll see the end of civilisation any time in the next year or
two. Maybe if there was a directed grant, it would be appropriate. Or
we could have a small investment fund aimed at paying for such an
archive in 20 years or so, when the process will be cheaper.

By the way, it's FIB etching, not laser etching, and the discs are
nickel-coated silicon, not plain nickel.

-- Tim Starling



Re: Long-term archiving of Wikimedia content

Brian J Mingus
Wouldn't the most cost-effective solution be to first fund research in
compression so fewer bits have to be etched out?
In that case these guys are already on the job: http://prize.hutter1.net/


Re: Long-term archiving of Wikimedia content

Pharos-3
In reply to this post by Tim Starling-2
If we or anyone were to go this route, wouldn't microfiche in a sealed
plastic container be a lot cheaper and more practical to mass-produce?

See the section on preservation through moisture-tight containers:

http://graphics.kodak.com/docimaging/uploadedFiles/D-31.pdf

My personal plan for saving civilization is through intrinsically
worthless plastic jewelry, kind of like this idea:

http://www.google.com/patents?id=yLk3AAAAEBAJ&dq=4249330

Make these cheap pendants colorful, make them collectible, let people
string dozens on a necklace, and soon you'd have thousands of copies
of books floating through society that can never be lost.

Wow, I can't believe they let us post this stuff on foundation-l :)

Thanks,
Pharos


Re: Long-term archiving of Wikimedia content

Tim Starling-2
In reply to this post by Brian J Mingus
Brian wrote:
> Wouldn't the most cost-effective solution be to first fund research in
> compression so fewer bits have to be etched out?
> In that case these guys are already on the job: http://prize.hutter1.net/

The obvious reply to that is that the Rosetta project aims to make an
archive readable with 17th century technology, which digital
information compressed with advanced algorithms is not.

They try to make an issue out of the obsolescence of digital
technology, which I think is overwrought. Just because I don't have a
slot in my computer where I can insert a 1970s era magnetic tape
doesn't mean it's unreadable. I don't have a 750x optical microscope
lying around either. Both media are readable using extant technology.

There have been some problems with restoration of data where the
decoding software has been lost. But the popular, well-documented
digital formats of the past are as readable as ever: I have a program
on my computer called groff which is largely backwards-compatible with
runoff, one of the earliest digital typesetting formats, dating back
to the 1960s.

There is still a great deal of extant text dating to ancient times,
despite the fact that copying was fantastically expensive, and that
everything was written on flammable materials in a time when flame was
the only artificial light source. Maybe the future will be more like
Orson Scott Card's Homecoming series than the dark ages: a future with
such a weight of carefully recorded and preserved history that
studying it, even in overview, becomes the work of a lifetime.

Anyone who claims to know what the far future will be like is a
charlatan. But I think it would be foolish to assume that it will be
anything like the past.

-- Tim Starling



Re: Long-term archiving of Wikimedia content

Brian J Mingus
If I'm limited to 17th century technology then I guess my other solution is
out too.
Compressor: Drop Wikipedia into a black hole
Decompressor: Read Wikipedia out of the Hawking radiation

Ahh well.


Re: Long-term archiving of Wikimedia content

metasj
In reply to this post by Tim Starling-2
On Tue, May 5, 2009 at 12:02 AM, Tim Starling <[hidden email]> wrote:
>
> I can tell you what the Rosetta folks would say: they would say that
> they paid $125k to Norsam for 5 prototype discs, and that we are free
> to do the same. Norsam have developed this technology at great cost
> and expect a commercial return, regardless of who's paying them.

The $25k per disc quote includes the research and development costs
for the entire Rosetta project.  That's what they were asking would-be
sponsors to pay for a final disc (which includes a nickel half, a
titanium half with sexy black matte coating and a different etch
process, and the encasing double-hemisphere of lenses...)  The
business end of the etching seems to be 20 cents per page to make a
master, with lower costs for copies.


> By the way, it's FIB etching, not laser etching, and the discs are
> nickel-coated silicon, not plain nickel.

Thanks for the correction.  You're right, it's focused ion beam
etching (and re: my earlier musing, etching images seems to cost the
same as etching text).  I'm not sure about the plating - I believe the
Norsam Rosetta process etches a silicon master, and makes solid nickel
plates with that...  compare the article you linked to with the
description here  (which also has a nice photo of 13,500 pages etched
at 10 pg/mm^2):
   https://secure.longnow.org/members/news-fall-02008.php#rosetta
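
The numbers quoted so far can be put side by side. A rough sketch, using only figures from this thread and the linked Long Now page; the variable names are made up for illustration, and treating the 13,500-page photo as one master is an assumption.

```python
# Figures quoted in this thread.
PROTOTYPE_TOTAL_USD = 125_000  # paid to Norsam for the prototype discs
PROTOTYPE_DISCS = 5
CENTS_PER_PAGE = 20            # quoted cost of etching one page on a master
PAGES_PER_DISC = 13_500        # pages in the Long Now photo (assumed: one master)
PAGES_PER_MM2 = 10             # quoted etch density

# Price per prototype disc, R&D and casing included.
per_disc_usd = PROTOTYPE_TOTAL_USD // PROTOTYPE_DISCS

# Raw etching cost of one master at the per-page rate.
master_cost_usd = PAGES_PER_DISC * CENTS_PER_PAGE / 100

# Etched area needed for that many pages at the quoted density.
etched_area_mm2 = PAGES_PER_DISC / PAGES_PER_MM2

print(f"per prototype disc: ${per_disc_usd:,}")
print(f"etching one master: ${master_cost_usd:,.0f}")
print(f"etched area: {etched_area_mm2:,.0f} mm^2")
```

On these numbers the etching itself is roughly a tenth of the per-disc price, which would fit the per-disc quote carrying the project's R&D and the casing.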

> <http://www.internetnews.com/storage/article.php/3771051/Storage+That+Really+Lasts.htm>
>
> Personally I think it would be a waste of general funds, since I don't
> expect we'll see the end of civilisation any time in the next year or two.
> Maybe if there was a directed grant, it would be appropriate. Or
> we could have a small investment fund aimed at paying for such an
> archive in 20 years or so, when the process will be cheaper.

There's certainly no need to use current resources.  People who are
producing or investigating good archival technology, and those curious
to own early copies, may contribute to the cause.  And one might try
to produce 100k pages a year, using the best processes available each
year, cycling through topic categories.

A major value I see in having this sort of discussion is as a work of
practical art, to expand circles of thought.  This could be part of a
long-term plan for highlighting the lasting value and meaning of the
projects.  An investment over time with small contributions from many
people would be appropriate.


Pharos writes:
> If we or anyone were to go this route, wouldn't microfiche in a sealed
> plastic container be a lot cheaper and more practical to mass-produce?

Are you imagining digital-to-microfiche printing?

SJ


Re: Long-term archiving of Wikimedia content

Anthony-73
In reply to this post by Tim Starling-2
On Tue, May 5, 2009 at 12:02 AM, Tim Starling <[hidden email]> wrote:

> Personally I think it would be a waste of general funds, since I don't
> expect we'll see the end of civilisation any time in the next year or
> two.


Umm, if civilization ends, we won't be around to see it, and the Wikimedia
Foundation mission will have been fulfilled ("every single human being can
freely share in the sum of all knowledge").

No one is seriously suggesting using WMF funds for this nonsense, are they?

Re: Long-term archiving of Wikimedia content

Thomas Dalton
2009/5/5 Anthony <[hidden email]>:

> On Tue, May 5, 2009 at 12:02 AM, Tim Starling <[hidden email]> wrote:
>
>> Personally I think it would be a waste of general funds, since I don't
>> expect we'll see the end of civilisation any time in the next year or
>> two.
>
>
> Umm, if civilization ends, we won't be around to see it, and the Wikimedia
> Foundation mission will have been fulfilled ("every single human being can
> freely share in the sum of all knowledge").
>
> No one is seriously suggesting using WMF funds for this nonsense, are they?

Certainly not large amounts of funds any time soon. If it could be
done for $5k, I'd recommend doing it with WMF funds. I might also
recommend doing it with WMF funds in 20 years when we have a massive
endowment paying all our running costs and get billions of dollars
donated a year. But right now we cannot justify the expense that would
almost certainly be required.


Re: Long-term archiving of Wikimedia content

Aryeh Gregor
On Tue, May 5, 2009 at 9:17 AM, Thomas Dalton <[hidden email]> wrote:
> Certainly not large amounts of funds any time soon. If it could be
> done for $5k, I'd recommend doing it with WMF funds.

I'm pretty sure buying another server or offering a slightly higher
salary on the next job offering or just leaving the money to
accumulate interest would do considerably more to advance Wikimedia's
mission than this.

> I might also
> recommend doing it with WMF funds in 20 years when we have a massive
> endowment paying all our running costs and get billions of dollars
> donated a year.

I think there are still going to be much more useful things to spend
thousands of dollars on.  The utility of this project is virtually
zero from any perspective.

Of course, since all of Wikimedia's data is freely available, anyone
else who'd like to store it in some durable form for any sum of money
is absolutely free to do so.  Or they could give Wikimedia a directed
grant.  But it would be a waste of Wikimedia's money.


Re: Long-term archiving of Wikimedia content

Thomas Dalton
2009/5/5 Aryeh Gregor <[hidden email]>:
> The utility of this project is virtually
> zero from any perspective.

I disagree. The short term utility is obviously zero, but the long
term utility could be massive. The contents of Wikimedia projects
could play a vital role in rebuilding civilisation - I call that
useful.


Re: Long-term archiving of Wikimedia content

David Gerard-2
In reply to this post by Aryeh Gregor
2009/5/5 Aryeh Gregor <[hidden email]>:

> Of course, since all of Wikimedia's data is freely available, anyone
> else who'd like to store it in some durable form for any sum of money
> is absolutely free to do so.  Or they could give Wikimedia a directed
> grant.  But it would be a waste of Wikimedia's money.


The best way is to make archives readily available so there are *lots*
of copies.

So first we need good dumps for people to make lots of copies of ...

There are people like the Internet Archive as well. We should be
making sure they have a copy of every database dump in their
collection.


- d.


Re: Long-term archiving of Wikimedia content

Aryeh Gregor
In reply to this post by Thomas Dalton
On Tue, May 5, 2009 at 10:12 AM, Thomas Dalton <[hidden email]> wrote:
> I disagree. The short term utility is obviously zero, but the long
> term utility could be massive. The contents of Wikimedia projects
> could play a vital role in rebuilding civilisation - I call that
> useful.

Assuming civilization collapses to begin with.  And assuming the
collapse is so complete that there are literally no computers left,
even in the hands of the most powerful (who would likely lead
rebuilding efforts).  And assuming that civilization doesn't recover
in a short enough period of time that most records will be intact
anyway.  And assuming that people can actually find the handful of
disks or whatever that are probably locked up in a vault somewhere.
And assuming they still have microscopes, but not computers.  And
assuming they bother to actually look at the disks in a microscope,
instead of melting them down as scrap metal or using them as doorstops
or just dumping them in a landfill.  And assuming that vast quantities
of trivia interspersed with incomplete scraps of poorly-explained
science would in fact be useful for rebuilding civilization.  (Have
you ever tried to learn anything practical from Wikipedia?  Textbooks
would at least be useful.)

Yeah, I'd say virtually zero utility.  But if some weirdos want to
waste money on it, that's their business.  They can also prepare for
the end of the world in 2012 as predicted by the Mayans, if they like.


Re: Long-term archiving of Wikimedia content

Chad
In reply to this post by Thomas Dalton
On Tue, May 5, 2009 at 10:12 AM, Thomas Dalton <[hidden email]> wrote:

> 2009/5/5 Aryeh Gregor <[hidden email]>:
>> The utility of this project is virtually
>> zero from any perspective.
>
> I disagree. The short term utility is obviously zero, but the long
> term utility could be massive. The contents of Wikimedia projects
> could play a vital role in rebuilding civilisation - I call that
> useful.

That is quite possibly the funniest thing I've read in months.

-Chad


Re: Long-term archiving of Wikimedia content

Thomas Dalton
In reply to this post by Aryeh Gregor
2009/5/5 Aryeh Gregor <[hidden email]>:

> On Tue, May 5, 2009 at 10:12 AM, Thomas Dalton <[hidden email]> wrote:
>> I disagree. The short term utility is obviously zero, but the long
>> term utility could be massive. The contents of Wikimedia projects
>> could play a vital role in rebuilding civilisation - I call that
>> useful.
>
> Assuming civilization collapses to begin with.  And assuming the
> collapse is so complete that there are literally no computers left,
> even in the hands of the most powerful (who would likely lead
> rebuilding efforts).  And assuming that civilization doesn't recover
> in a short enough period of time that most records will be intact
> anyway.  And assuming that people can actually find the handful of
> disks or whatever that are probably locked up in a vault somewhere.
> And assuming they still have microscopes, but not computers.  And
> assuming they bother to actually look at the disks in a microscope,
> instead of melting them down as scrap metal or using them as doorstops
> or just dumping them in a landfill.  And assuming that vast quantities
> of trivia interspersed with incomplete scraps of poorly-explained
> science would in fact be useful for rebuilding civilization.  (Have
> you ever tried to learn anything practical from Wikipedia?  Textbooks
> would at least be useful.)
>
> Yeah, I'd say virtually zero utility.  But if some weirdos want to
> waste money on it, that's their business.  They can also prepare for
> the end of the world in 2012 as predicted by the Mayans, if they like.

I haven't proposed any archive method that would require a microscope
and I don't recommend one. Civilisations have a tendency to collapse;
it has happened numerous times before, and it is far from unheard of
for a dark age to follow in which a lot of knowledge is lost.

But rebuilding civilisation is probably not the most likely use such
archives would be put to (it's just the most exciting, so the one I
mentioned). The historical and cultural value 1000 years from now of
knowing what people 1000 years ago knew and thought would be immense.
While useful, functional knowledge would probably have been preserved,
the presentation of that knowledge and all the non-functional
knowledge (our pop culture articles would probably be of most
interest) would probably be lost. Think of the archive as more of a
time capsule than a how-to guide for rebuilding civilisation - the
latter is more fun (and of greater utility if it happens), but the
former is the far more likely utility.


Re: Long-term archiving of Wikimedia content

Aryeh Gregor
On Tue, May 5, 2009 at 10:32 AM, Thomas Dalton <[hidden email]> wrote:
> But rebuilding civilisation is probably not the most likely use such
> archives would be put to (it's just the most exciting, so the one I
> mentioned). The historical and cultural value 1000 years from now of
> knowing what people 1000 years ago knew and thought would be immense.

But if you don't postulate a catastrophic event that we can't plan
for, like civilization ending due to an overnight thermonuclear war,
then we don't need to plan in advance.  If Wikipedia ceases to exist
at some point, and if at that point it looks like it would be useful
to preserve its contents, we could preserve it more durably at that
point.  We don't need to preserve it now.  In fact, it would be
counterproductive: better to preserve it at the last possible moment,
when it will contain as much data as possible.


Re: Long-term archiving of Wikimedia content

Thomas Dalton
2009/5/5 Aryeh Gregor <[hidden email]>:

> On Tue, May 5, 2009 at 10:32 AM, Thomas Dalton <[hidden email]> wrote:
>> But rebuilding civilisation is probably not the most likely use such
>> archives would be put to (it's just the most exciting, so the one I
>> mentioned). The historical and cultural value 1000 years from now of
>> knowing what people 1000 years ago knew and thought would be immense.
>
> But if you don't postulate a catastrophic event that we can't plan
> for, like civilization ending due to an overnight thermonuclear war,
> then we don't need to plan in advance.  If Wikipedia ceases to exist
> at some point, and if at that point it looks like it would be useful
> to preserve its contents, we could preserve it more durably at that
> point.  We don't need to preserve it now.  In fact, it would be
> counterproductive: better to preserve it at the last possible moment,
> when it will contain as much data as possible.

You make a good point, but that point applies just as well to any
other time capsule plan and people still consider them worthwhile.
It's not about having as much information as possible; it's about
preserving the information as it is now. Obviously, Wikipedia has
revision histories going back years and, as long as they are
maintained, I suppose you could just make your own "archive" when you
wanted to read it. So, I guess all we really need to do is ensure we
have reliable full history dumps backed up. This kind of plan would
only be needed if we stopped storing histories (which, before anyone
says otherwise, we are not required to do by either GFDL or CC-BY-SA).

However, most information isn't lost because of disaster; it is lost
because people don't think they need it any more and delete/destroy
it. Can we trust whoever is around in the future to continue to
preserve the history dumps they've backed up?
