Time to redirect to https by default?

Time to redirect to https by default?

David Gerard-2
Lots of monitoring going into place:

https://en.wikipedia.org/wiki/Wikipedia:List_of_articles_censored_in_Saudi_Arabia
http://www.bbc.co.uk/news/uk-politics-17576745

What are the current technical barriers to redirection to https by default?


- d.

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Time to redirect to https by default?

Petr Bena
I see no point in doing that. Https doesn't support caching well and
is generally slower. There is no use for readers for that.

On Sun, Apr 1, 2012 at 12:06 PM, David Gerard <[hidden email]> wrote:

> Lots of monitoring going into place:
>
> https://en.wikipedia.org/wiki/Wikipedia:List_of_articles_censored_in_Saudi_Arabia
> http://www.bbc.co.uk/news/uk-politics-17576745
>
> What are the current technical barriers to redirection to https by default?
>
>
> - d.

Re: Time to redirect to https by default?

David Gerard-2
On 1 April 2012 11:55, Petr Bena <[hidden email]> wrote:

> I see no point in doing that. Https doesn't support caching well and
> is generally slower. There is no use for readers for that.


The use is that the requests themselves are encrypted, so that the
only thing logged is that they went to Wikimedia. You did read the
linked articles, right?


- d.

Re: Time to redirect to https by default?

Svip
On 1 April 2012 13:01, David Gerard <[hidden email]> wrote:

> On 1 April 2012 11:55, Petr Bena <[hidden email]> wrote:
>
>> I see no point in doing that. Https doesn't support caching well and
>> is generally slower. There is no use for readers for that.
>
> The use is that the requests themselves are encrypted, so that the
> only thing logged is that they went to Wikimedia. You did read the
> linked articles, right?

Obviously, I cannot confirm whether Mr Bena read the linked articles
or not, but he did provide an answer regarding the technical
restrictions.

Wikimedia already spends an incredible amount of time caching its
content, because *so many* users use Wikipedia and its sister projects
daily.

And since most of the content is fairly static, caching makes a lot of sense.

However, HTTPS does not support caching (at least not well), which
means each page would suddenly have to be generated for *each* request.
It's true that MediaWiki itself supports caching, but its own caching
is nowhere near as fast as a caching server like Varnish (although I
believe a less powerful caching server is used on Wikimedia's
servers).

The trade-off is that the service would be slower for everyone or we
would need more servers.  And I am not sure Wikimedia has that kind of
money.

Those are the *technical* limitations to defaulting to HTTPS.

Re: Time to redirect to https by default?

Svip
In reply to this post by David Gerard-2
On 1 April 2012 12:06, David Gerard <[hidden email]> wrote:

> http://www.bbc.co.uk/news/uk-politics-17576745

Also, this article was written on 1 April and is far beyond any
monitoring scheme ever suggested in the Western World.  And I am sure
we would have heard about it being mentioned up until this point, if
it was real.

So I would take that article with a grain of salt.  Particularly the
statement about 'real time'.  That's not even feasible.

Re: Time to redirect to https by default?

David Gerard-2
On 1 April 2012 12:23, Svip <[hidden email]> wrote:
> On 1 April 2012 12:06, David Gerard <[hidden email]> wrote:

>> http://www.bbc.co.uk/news/uk-politics-17576745

> Also, this article was written on 1 April and is far beyond any
> monitoring scheme ever suggested in the Western World.  And I am sure
> we would have heard about it being mentioned up until this point, if
> it was real.


It would be nice, but if it's a prank then (a) lots of other
newspapers are in on it (b) ORG flagged the programme described
several weeks in advance:

http://wiki.openrightsgroup.org/wiki/Communications_Capabilities_Development_Programme
http://www.openrightsgroup.org/issues/ccdp

So no, it's in no way a joke. This is absolutely real.


> So I would take that article with a grain of salt.  Particularly the
> statement about 'real time'.  That's not even feasible.


That a desired monitoring regime would require a violation of physics
has *never* stopped a legislative push for such.


- d.

Re: Time to redirect to https by default?

Petr Bena
I said there is little benefit for most users. Of course there
would be some who could find it useful, but that's no reason to
redirect all users. I use Wikipedia a lot, and I don't care if someone
sees which pages I open. If someone does care, they should switch to
HTTPS themselves.

On Sun, Apr 1, 2012 at 1:59 PM, David Gerard <[hidden email]> wrote:

> On 1 April 2012 12:23, Svip <[hidden email]> wrote:
>> On 1 April 2012 12:06, David Gerard <[hidden email]> wrote:
>
>>> http://www.bbc.co.uk/news/uk-politics-17576745
>
>> Also, this article was written on 1 April and is far beyond any
>> monitoring scheme ever suggested in the Western World.  And I am sure
>> we would have heard about it being mentioned up until this point, if
>> it was real.
>
>
> It would be nice, but if it's a prank then (a) lots of other
> newspapers are in on it (b) ORG flagged the programme described
> several weeks in advance:
>
> http://wiki.openrightsgroup.org/wiki/Communications_Capabilities_Development_Programme
> http://www.openrightsgroup.org/issues/ccdp
>
> So no, it's in no way a joke. This is absolutely real.
>
>
>> So I would take that article with a grain of salt.  Particularly the
>> statement about 'real time'.  That's not even feasible.
>
>
> That a desired monitoring regime would require a violation of physics
> has *never* stopped a legislative push for such.
>
>
> - d.

Re: Time to redirect to https by default?

Svip
In reply to this post by David Gerard-2
On 1 April 2012 13:59, David Gerard <[hidden email]> wrote:

> On 1 April 2012 12:23, Svip <[hidden email]> wrote:
>
>> On 1 April 2012 12:06, David Gerard <[hidden email]> wrote:
>>
>>> http://www.bbc.co.uk/news/uk-politics-17576745
>>
>> Also, this article was written on 1 April and is far beyond any
>> monitoring scheme ever suggested in the Western World.  And I am sure
>> we would have heard about it being mentioned up until this point, if
>> it was real.
>
> It would be nice, but if it's a prank then (a) lots of other
> newspapers are in on it (b) ORG flagged the programme described
> several weeks in advance:
>
> http://wiki.openrightsgroup.org/wiki/Communications_Capabilities_Development_Programme
> http://www.openrightsgroup.org/issues/ccdp
>
> So no, it's in no way a joke. This is absolutely real.

Still *kind of* a joke.

>> So I would take that article with a grain of salt.  Particularly the
>> statement about 'real time'.  That's not even feasible.
>
> That a desired monitoring regime would require a violation of physics
> has *never* stopped a legislative push for such.

But it has always stopped it from being implemented or executed in
practice.  While the development is terrifying, it is also important
to note the lack of actual consequences it will have.  Other than
being a huge embarrassment.

But I was always under the impression that the UK didn't really care
about free speech and privacy.

correct way to import SQL dumps into MySQL database in terms of character encoding

Piotr Jagielski
Hello,

I'm trying to import the categorylinks.sql dump into my MySQL database. I'm
able to import it and query for articles in specific categories as long
as the category name contains only English-language characters. I don't get
any results if I try to query for a non-English category name. My
understanding is that the dump is in UTF-8 format, so I tried the following:

create the database using the following command:
CREATE DATABASE wiki CHARACTER SET utf8 COLLATE utf8_general_ci;

import the dump using the following command:
mysql --user root --password=root wiki < C:\Path\plwiki-20111227-categorylinks.sql --default-character-set=utf8

set my data source URL to the following in my Java code:
jdbc:mysql://localhost/plwiki?useUnicode=true&characterEncoding=UTF-8

It still doesn't work. What am I missing? Are there any instructions on
how to correctly import the dump anywhere?

Thanks,
Piotr



Re: correct way to import SQL dumps into MySQL database in terms of character encoding

Svip
On 1 April 2012 16:04, Piotr Jagielski <[hidden email]> wrote:

> mysql --user root --password=root wiki <
> C:\Path\plwiki-20111227-categorylinks.sql --default-character-set=utf8

It's -p, not --password=root and it will prompt you for the password.

Re: correct way to import SQL dumps into MySQL database in terms of character encoding

Piotr Jagielski
These options should be equivalent. It does load the data using the
below command. It just incorrectly handles non-English characters.

Regards,
Piotr

On 2012-04-01 16:31, Svip wrote:

> On 1 April 2012 16:04, Piotr Jagielski<[hidden email]>  wrote:
>
>> mysql --user root --password=root wiki<
>> C:\Path\plwiki-20111227-categorylinks.sql --default-character-set=utf8
> It's -p, not --password=root and it will prompt you for the password.

Re: Time to redirect to https by default?

Platonides
In reply to this post by Svip
On 1 April 2012 14:53, Svip wrote:

> On 1 April 2012 13:59, David Gerard <[hidden email]> wrote:
>> On 1 April 2012 12:23, Svip <[hidden email]> wrote:
>>> So I would take that article with a grain of salt.  Particularly the
>>> statement about 'real time'.  That's not even feasible.
>>
>> That a desired monitoring regime would require a violation of physics
>> has *never* stopped a legislative push for such.
>
> But it has always stopped it from being implemented or executed in
> practice.  While the development is terrifying, it is also important
> to note the lack of actual consequences it will have.  Other than
> being a huge embarrassment.

I don't see why it *couldn't* be implemented.
Note that the real-time statement is no different from how they can
snoop on your phone calls in real time.
Sure, the storage requirements would be crazy, but I don't see specific
details on what is to be stored, so it may well be implementable given
enough funding.


Re: correct way to import SQL dumps into MySQL database in terms of character encoding

Platonides
In reply to this post by Piotr Jagielski
On 01/04/12 17:05, Piotr Jagielski wrote:
> These options should be equivalent. It does load the data using the
> below command. It just incorrectly handles non-English characters.
>
> Regards,
> Piotr

Do you have $wgDBmysql5 set in your LocalSettings.php?



Re: correct way to import SQL dumps into MySQL database in terms of character encoding

Piotr Jagielski
I don't have MediaWiki installed. I'm just trying to import the dump
into a standalone database so I can do some batch processing on the data.

Regards,
Piotr

On 2012-04-01 17:30, Platonides wrote:

> On 01/04/12 17:05, Piotr Jagielski wrote:
>> These options should be equivalent. It does load the data using the
>> below command. It just incorrectly handles non-English characters.
>>
>> Regards,
>> Piotr
> Do you have $wgDBmysql5 set in your LocalSettings.php?

Re: Time to redirect to https by default?

Bináris
In reply to this post by David Gerard-2
2012/4/1 David Gerard <[hidden email]>

> http://www.bbc.co.uk/news/uk-politics-17576745

This one may be an April 1 joke, let's wait one day. :-)

--
Bináris

Re: Time to redirect to https by default?

David Gerard-2
On 1 April 2012 17:00, Bináris <[hidden email]> wrote:
> 2012/4/1 David Gerard <[hidden email]>

>> http://www.bbc.co.uk/news/uk-politics-17576745

> This one may be an April 1 joke, let's wait one day. :-)


No, it really isn't, sadly.


- d.

Re: Time to redirect to https by default?

Antoine Musso-3
In reply to this post by Petr Bena
Le 01/04/12 12:55, Petr Bena wrote:
> I see no point in doing that. Https doesn't support caching well and
> is generally slower. There is no use for readers for that.

HTTPS has nothing to do with caching; it just transports information
between the client and the server, and it is they that actually handle
caching.

HTTPS supports caching as well as HTTP does, since they are exactly the
same protocol, the first just being encrypted.

You are right, though, in the sense that most web browsers will BY
DEFAULT not save a copy of content received through HTTPS.  The reason
is that HTTPS is/was usually used to serve private content.  Caching
can be explicitly enabled by marking the response as public: send
"Cache-Control: public" and that should work.
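
As a rough illustration of the header mentioned above, here is a minimal
Python sketch. It is not Wikimedia's actual setup: the handler class, the
cache_headers helper, the page body and the 300-second lifetime are all
assumptions made up for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def cache_headers(max_age_seconds):
    """Build headers that explicitly allow browsers and shared caches to store a response."""
    return {"Cache-Control": "public, max-age=%d" % max_age_seconds}


class CacheablePageHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: serves one static page and marks it as publicly cacheable."""

    def do_GET(self):
        body = b"<html><body>Mostly static article text</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        # The caching policy travels in the response headers, whatever the transport.
        for name, value in cache_headers(300).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve plain HTTP on localhost; TLS would normally be terminated in front of this."""
    HTTPServer(("127.0.0.1", port), CacheablePageHandler).serve_forever()
```

Calling serve() starts the sketch on localhost. The header is identical
whether the response is delivered over HTTP or HTTPS; only the transport
differs.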


I do agree there is probably no use for readers to have HTTPS enabled.
If the purpose is to bypass country-level firewalls such as China's (or,
I think, Thailand's), they will just intercept the HTTPS connection from
the server on their hardware, decipher it for analysis and re-sign the
content with their own certificate before sending it on to clients.

That is exactly what you do in a big company when you want to make sure
(as an example) that your employees do not use the chat function in
Facebook.

The only things HTTPS is going to prevent are having your password
stolen when logging in and getting your session cookie hijacked by
someone sniffing the local network.  The WMF has already moved its
private wikis to HTTPS just for that :-]

cheers,

--
Antoine "hashar" Musso


Re: correct way to import SQL dumps into MySQL database in terms of character encoding

Marcin Cieslak-3
In reply to this post by Piotr Jagielski
Piotr Jagielski <[hidden email]> wrote:
> Hello,
>
> set my data source URL to the following in my Java code:
> jdbc:mysql://localhost/plwiki?useUnicode=true&characterEncoding=UTF-8

Please note you have "plwiki" here and you imported into "wiki".
Assuming your .my.cnf is not making things difficult I ran a small
Jython script to test:

$ jython
Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011, 23:12:06)
[OpenJDK 64-Bit Server VM (Sun Microsystems Inc.)] on java1.6.0
Type "help", "copyright", "credits" or "license" for more information.
>>> from com.ziclix.python.sql import zxJDBC
>>> d, u, p, v = "jdbc:mysql://localhost/wiki", "root", None, "org.gjt.mm.mysql.Driver"
>>> db = zxJDBC.connect(d, u, p, v, CHARSET="utf8")
>>> c=db.cursor()
>>> c.execute("select cl_from, cl_to from categorylinks where cl_from=61 limit 10")
>>> c.fetchone()
(61, array('b', [65, 110, 100, 111, 114, 97]))
>>> (a,b) = c.fetchone()
>>> print b
array('b', [67, 122, -59, -126, 111, 110, 107, 111, 119, 105, 101, 95, 79, 114, 103, 97, 110, 105, 122, 97, 99, 106, 105, 95, 78, 97, 114, 111, 100, -61, -77, 119, 95, 90, 106, 101, 100, 110, 111, 99, 122, 111, 110, 121, 99, 104])
>>> for x in b:
...     try:
...         print chr(x),
...     except ValueError:
...         print "%02x" % x,
...
C z -3b -7e o n k o w i e _ O r g a n i z a c j i _ N a r o d -3d -4d w _ Z j e d n o c z o n y c h

array('b', [ ... ]) in Jython means that the SQL driver returns an array of bytes.

It seems to me that the array of bytes contains raw UTF-8, so you need to decode it into
the proper Unicode that Java uses in strings.

I think this behaviour is described in

http://bugs.mysql.com/bug.php?id=25528

Probably you need to play with getBytes() on a result object
to get what you want.
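
To make the decoding step concrete, here is a small sketch in plain
Python 3 (not Jython, and not the JDBC driver itself). It takes the signed
byte values printed in the session above and decodes them as UTF-8. The
helper name decode_signed_utf8 is made up for this example.

```python
# Signed byte values exactly as printed in the Jython session above.
signed_bytes = [67, 122, -59, -126, 111, 110, 107, 111, 119, 105, 101, 95, 79, 114,
                103, 97, 110, 105, 122, 97, 99, 106, 105, 95, 78, 97, 114, 111, 100,
                -61, -77, 119, 95, 90, 106, 101, 100, 110, 111, 99, 122, 111, 110,
                121, 99, 104]


def decode_signed_utf8(values):
    """Map Java-style signed bytes (-128..127) to 0..255 and decode them as UTF-8."""
    raw = bytes(v & 0xFF for v in values)
    return raw.decode("utf-8")


# category is now the text "Członkowie_Organizacji_Narodów_Zjednoczonych"
category = decode_signed_utf8(signed_bytes)
```

So the stored bytes are valid UTF-8. On the Java side the equivalent step
would presumably be to build a String from the byte array with the UTF-8
charset named explicitly.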

//Saper


Re: Time to redirect to https by default?

Platonides
In reply to this post by Antoine Musso-3
On 01/04/12 18:43, Antoine Musso wrote:
> Le 01/04/12 12:55, Petr Bena wrote:
>> I see no point in doing that. Https doesn't support caching well and
>> is generally slower. There is no use for readers for that.
>
> HTTPS has nothing to do with caching, it just transports information
> between the client and the server so they can actually handle caching.
>
> HTTPS supports caching as well as HTTP since they are exactly the same
> protocol, the first just being encrypted.

There would be a small difference if you're behind a caching proxy, but
that's unlikely to make a difference to pretty much everyone.


> I do agree there is probably no use for readers to have HTTPS enabled.
> If the purpose is to bypass country firewalls such as in China (or I
> think Thailand), they will just intercept the HTTPS connection from the
> server on their hardware, decipher it for analysis and re-sign the
> content with their own certificate before sending it back to clients.

Note that such an approach would yield a certificate which, if stored
during the attack and later published, is proof of their evil-doing.
Any CA willingly doing that (even if "forced by the government") would
(should) be immediately removed from the browsers' certificate bundles.

(I believe such interposition has been done in the past, though)
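
As a sketch of how a swapped certificate could be noticed and kept as
evidence, the following Python compares the fingerprint of the certificate
a server presents with one recorded earlier on a trusted network. The
function names and the idea of keeping a known-good fingerprint are
assumptions for illustration. A mismatch is only a hint, since sites also
replace their certificates legitimately.

```python
import hashlib
import ssl


def fingerprint(der_bytes):
    """Return the SHA-256 fingerprint of a DER-encoded certificate as lowercase hex."""
    return hashlib.sha256(der_bytes).hexdigest()


def fetch_certificate_der(host, port=443):
    """Fetch the certificate a server presents (without validating it) and convert it to DER."""
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)


def looks_intercepted(observed_der, known_good_fingerprint):
    """True when the observed certificate differs from the one recorded on a trusted network."""
    return fingerprint(observed_der) != known_good_fingerprint.lower()
```

One would call fetch_certificate_der() for the host in question from the
suspect network, store the bytes it returns, and pass them to
looks_intercepted() together with the fingerprint noted down earlier.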


> That is exactly what you do in a big company when you want to make sure
> (as an example) that your employees do not use the chat function in Facebook.

A company can install its own CA certificate on its own computers, and
have a policy of "we will sniff everything" (note that if the employee
is not duly informed of that, the wiretapping could well be
illegal).
I wonder how they handle self-signed certificates.


Re: correct way to import SQL dumps into MySQL database in terms of character encoding

Piotr Jagielski
In reply to this post by Marcin Cieslak-3
Sorry, I made a mistake in the e-mail. I had the database set to the
same name in both places.

My problem is actually the opposite: I don't get any results when I
use a UTF-8 string as input in the query. But I verified that I don't
get correct results when using the query you provided either. The link
to the MySQL bug report might help in resolving the problem, so
thanks for providing it.

Piotr

On 2012-04-01 19:50, Marcin Cieslak wrote:

>>> Piotr Jagielski<[hidden email]>  wrote:
>> Hello,
>>
>> set my data source URL to the following in my Java code:
>> jdbc:mysql://localhost/plwiki?useUnicode=true&characterEncoding=UTF-8
> Please note you have "plwiki" here and you imported into "wiki".