Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

David Gerard-2
http://blogs.law.harvard.edu/infolaw/2009/06/19/using-wikisource-as-an-alternative-open-access-repository-for-legal-scholarship/

Interesting. How well does this fit with what Wikisource does?


- d.

_______________________________________________
foundation-l mailing list
[hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

metasj
There is a wealth of work done all the time by primary source
researchers and publishers, which could be improved on by having
wikisource entries, translations, &c.

Related question: how well would large numbers of public
domain texts, with page scans and the best available OCR [and
translations of same], fit with what Wikisource does now?  This is
clearly a wiki project that needs to happen: OCR even at its best
misses rare meaning-bearing words.  If not Wikisource, where should
this work take place?

SJ

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Brian J Mingus
This has reminded me to complain about Google Books. Google has the world's
best OCR (by virtue of having the largest OCR'able dataset) and also has a
mission to scan in all the public domain books they can get their hands on.
They recently updated their interface to, as they put it, "make it easier to
find our plain text versions of public domain books. If a book is available
in full view, you can click the 'Plain text' button in the toolbar."
Unfortunately the only way I've found to download the full text of a public
domain book from Google is to flip through the book a page at a time,
copying the text to your clipboard.
There are roughly 2-3 million public domain books in Google Books.


Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Platonides
Brian wrote:
> Unfortunately the only way I've found to download the full text of a public
> domain book from Google is to flip through the book a page at a time,
> copying the text to your clipboard.
> There are roughly 2-3 million public domain books in Google Books.

That's easy to fix :)


Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Brian J Mingus
Not likely. I've been banned from Google's regular search at least a dozen
times during semi-frenetic search sprees in which I was identified as a bot.
There is no doubt that if you try to automate it you will be quickly shot
down.

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Anthony-73
Easier than scanning, though :)

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Falcorian-2
In reply to this post by Brian J Mingus
So the bot just has to run at human speeds so it does not get banned; it
still won't get tired or make unpredictable mistakes. And you can run it
from different IPs to parallelize.

--Falcorian

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Brian J Mingus
That is against the law. It violates Google's ToS.

I'm mostly complaining that Google is being Very Evil. There is nothing we
can do about it except complain to them. Which I don't know how to do - they
apparently believe that the plain text versions of their books are akin to
their intellectual property and are unwilling to give them away.

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Geoffrey Plourde
For some reason, I am reminded of a Supreme Court case about the information in telephone directories. Maybe because of the insanity of trying to put public domain material under copyright.




Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Geoffrey Plourde
In reply to this post by David Gerard-2
For Supreme Court cases, would it be possible to have a bot pull the audio decisions from Oyez, and convert them into text?




Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Parker Higgins
In reply to this post by Geoffrey Plourde
Except Google isn't asserting any kind of copyright control over these
books; they're just not making it convenient to download them in your
preferred format.  Maybe not The Right Thing, but not as boneheaded as suing
a party who reprints public domain material, as was the case in Feist v.
Rural (the Supreme Court case you mention).

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Platonides
In reply to this post by Brian J Mingus
Brian wrote:
> That is against the law. It violates Google's ToS.
>
> I'm mostly complaining that Google is being Very Evil. There is nothing we
> can do about it except complain to them. Which I don't know how to do - they
> apparently believe that the plain text versions of their books are akin to
> their intellectual property and are unwilling to give them away.

Where does it forbid them?
The most related part is section 5.
I understand that doing queries at bot rate may be against #5.3 but I
don't see anything against this.
Unlike searches, the book OCR result will be cached, so this shouldn't
inconvenience them (and they don't place ads there!).

I'd wikify the html instead of just moving to plain text, though.


Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Anthony-73
In reply to this post by Brian J Mingus
Wow, what's Wikipedia's policy about using a bot to scrape everything?

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Brian J Mingus
In reply to this post by Platonides
On Sat, Jun 20, 2009 at 1:29 PM, Platonides <[hidden email]> wrote:

> Where does it forbid them?


5.3 You agree not to access (or attempt to access) any of the Services by
any means other than through the interface that is provided by Google,
unless you have been specifically allowed to do so in a separate agreement
with Google. You specifically agree not to access (or attempt to access) any
of the Services through any automated means (including use of scripts or web
crawlers) and shall ensure that you comply with the instructions set out in
any robots.txt file present on the Services.

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Ray Saintonge
In reply to this post by Brian J Mingus
Brian wrote:
> That is against the law. It violates Google's ToS.
>
> I'm mostly complaining that Google is being Very Evil. There is nothing we
> can do about it except complain to them. Which I don't know how to do - they
> apparently believe that the plain text versions of their books are akin to
> their intellectual property and are unwilling to give them away.
>
>  
How is violating Google's ToS against the law?  Sites put all sorts of
meaningless garbage into these documents, and users mostly ignore them.

Of course Google's evil; it's about time that people noticed that.  They
use their deep pockets as a way to bully other sites ... with a smile.
Fortunately the U.S. does not have database protection laws like the
E.U.  Ideally, every PD item they host should also be hosted on an
alternative site, but that's a massive undertaking, ... and they know
it.  Nothing requires them to be nice to the competition, such as by
making it easy to copy their material.

Ec

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Geoffrey Plourde
If a bot has a meaningful effect on server load (i.e. page requests), it falls under the category of malicious software, which is highly illegal.




Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Ray Saintonge
In reply to this post by Anthony-73
Anthony wrote:
> Wow, what's Wikipedia's policy about using a bot to scrape everything?
>  

I don't know about any policy, but I think it should still be
discouraged.  For me this has less to do with predation on other sites
than with our inability to keep up with the volume of data that would be
produced.  Proofreading and wikifying are labour-intensive processes.  
It is very easy for the technically minded to bring the scan and OCR of
a 500-page book under our roof, but without the manpower to add
value, these processes are scarcely better than data dumps.

Ec

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Ray Saintonge
In reply to this post by Geoffrey Plourde
Geoffrey Plourde wrote:
> If a bot has a meaningful effect on server load (i.e. page requests), it falls under the category of malicious software, which is highly illegal.
>  
Malicious software or overloading servers goes well beyond ignoring a
ToS.  Why should downloading whole books from Google have any greater
effect on server load than downloading a whole book of similar length
from Internet Archive?

Ec


Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Anthony-73
In reply to this post by Ray Saintonge
Evil I tell you.  Evil!

Re: Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Stephen Bain
In reply to this post by Parker Higgins
On Sun, Jun 21, 2009 at 5:27 AM, Parker Higgins<[hidden email]> wrote:
> Except google isn't asserting any kind of copyright control over these
> books, they're just not making it convenient to download them in your
> preferred format.  Maybe not The Right Thing, but not as boneheaded as suing
> a party who reprints public domain material, as was the case in Feist v.
> Rural (the supreme court case you mention.)

They want people to use their service. Fair enough, given that the
scanning and OCRing happened on their dime.

--
Stephen Bain
[hidden email]
