Wikipedia & YSlow


Wikipedia & YSlow

howard chen
Hello all,

As you might already know, YSlow is a tool to check website
performance. I just ran a test against:
http://en.wikipedia.org/wiki/Main_Page

The result is quite surprising: Grade F (47). Of course a low mark does
not always mean a site is bad, but there is some room for improvement, e.g.


1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)

2. Enable GZip compression (e.g.
http://en.wikipedia.org/skins-1.5/monobook/main.css?179)

3. Add expire header (e.g.
http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)

4. Don't put CSS outside the <head />

etc.


By doing this, it should save some money on bandwidth, as well as
provide a better user experience.

Howard


Re: Wikipedia & YSlow

Gregory Maxwell
On Sun, Oct 5, 2008 at 12:15 PM, howard chen <[hidden email]> wrote:
> Results is quite surprising, Grade F (47). Of course lower mark does
> not always means bad, but there are some room for improvement, e.g.
[snip]

> 1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)

Probably pointless. It's small enough already that the load time is
going to be latency bound for any user not sitting inside a Wikimedia
data center. For the ones that are above the latency-bound window (of
roughly 8k), gzipping should get them back under it.

> 2. Enable GZip compression (e.g.
> http://en.wikipedia.org/skins-1.5/monobook/main.css?179)

The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
small enough that gzipping would not be a significant win (see above)
but I don't recall the reason the CSS/JS are not. Is there a
client compatibility issue here?

> 3. Add expire header (e.g.
> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)

Hm. There are Expires headers on the skin-provided images, but not on the
ones from upload.  It does correctly respond with 304 Not Modified, but a
not-modified round trip is often as time-consuming as sending the image. Firefox
doesn't send If-Modified-Since for these objects every time in any case.

The caching headers for the OggHandler play button are a bit odd and
are causing the object to be refreshed on every load for me.


In any case, from the second page onwards pages typically display in
<100ms for me, and the cold cache (first page) load time for me looks
like it's about 230ms, which is also not bad.  The grade 'f' is hardly
deserved.


Re: Wikipedia & YSlow

howard chen
Hello,

On Mon, Oct 6, 2008 at 1:00 AM, Gregory Maxwell <[hidden email]> wrote:
>
>> 1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
>
> Probably pointless. It's small enough already that the load time is
> going to be latency bound for any user not sitting inside a Wikimedia
> data center. On ones which are above the latency bound window (of
> roughly 8k), gzipping should get them back under it.
>


Given the traffic of Wikipedia, every bit should count.
There is no reason to send inline programming comments to normal users anyway.



>> 2. Enable GZip compression (e.g.
>> http://en.wikipedia.org/skins-1.5/monobook/main.css?179)
>
> The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
> small enough that gzipping would not be a significant win (see above)
> but I don't recall the reason the the CSS/JS are not. Is there a
> client compatibility issue here?


Gzipping CSS/JS `should` not cause any compatibility issues for most
browsers; Yahoo! is doing it anyway.


>> 3. Add expire header (e.g.
>> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
>
> Hm. There are expire headers on the skin provided images, but not ones
> from upload.  It does correctly respond with 304 not modified, but a
> not-modified is often as time consuming as sending the image. Firefox
> doesn't IMS these objects every time in any case.

Have a simple policy to generate a unique URI for each resource, and
set the expiry as far in the future as possible.


Howard


Re: Wikipedia & YSlow

Max Semenik
In reply to this post by Gregory Maxwell
On 05.10.2008, 21:00 Gregory wrote:

>> 1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)

> Probably pointless. It's small enough already that the load time is
> going to be latency bound for any user not sitting inside a Wikimedia
> data center. On ones which are above the latency bound window (of
> roughly 8k), gzipping should get them back under it.

mwsuggest.js loses 10 kb that way, wikibits.js - 11k.

> The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
> small enough that gzipping would not be a significant win (see above)
> but I don't recall the reason the the CSS/JS are not. Is there a
> client compatibility issue here?

For a logged-in user with Monobook it's 33 KB vs. 106 KB - not that
insignificant.

> In any case, from the second page onwards pages typically display in
> <100ms for me, and the cold cache (first page) load time for me looks
> like it's about 230ms, which is also not bad.  The grade 'f' is hardly
> deserved.

Not everyone lives in the US and enjoys fast Internet.

--
Best regards,
  Max Semenik ([[User:MaxSem]])



Re: Wikipedia & YSlow

Gregory Maxwell
On Sun, Oct 5, 2008 at 2:29 PM, Max Semenik <[hidden email]> wrote:

> On 05.10.2008, 21:00 Gregory wrote:
>
>>> 1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
>
>> Probably pointless. It's small enough already that the load time is
>> going to be latency bound for any user not sitting inside a Wikimedia
>> data center. On ones which are above the latency bound window (of
>> roughly 8k), gzipping should get them back under it.
>
> mwsuggest.js loses 10 kb that way, wikibits.js - 11k.

The gzipped copy can't lose 11k, because it's not even that large when
gzipped (it's 9146 bytes gzipped).

Compare the gzipped sizes. Post-gzipping, the savings from whitespace
removal and friends are much smaller. Yet it makes the JS unreadable
and makes debugging a pain.

> For a logged in user with monobook it's 33k vs. 106 kb - not that
> insignificant.

Logged-in is a mess of uncachability anyway. You're worried about a
once-per-session loaded object for logged-in users?

>> In any case, from the second page onwards pages typically display in
>> <100ms for me, and the cold cache (first page) load time for me looks
>> like it's about 230ms, which is also not bad.  The grade 'f' is hardly
>> deserved.
>
> Not everyone lives in the US and enjoys fast Internet.

You're missing my point. For small objects *latency* overwhelms the
loading time, even if you're on a slow connection, because TCP never
gets a chance to open the window up.  The further you are away from
the Wikimedia datacenters the more significant that effect is.

Much of the poorly connected world suffers very high latencies due to
congestion induced queueing delay or service via satellite in addition
to being far from Wikimedia. (and besides, the US itself lags much of
the world in terms of throughput).

If it takes 75ms to get to the nearest Wikimedia datacenter and back
then a new HTTP GET cannot finish in less than 150ms (one round trip
for the TCP handshake, another for the request and response).  If you want to
improve performance you need to focus on shaving *round trips* rather
than bytes. Byte reduction only saves you round trips if you're able
to reduce the number of TCP windows' worth of data; it's quantized and
the lowest threshold is about 8 kbytes.
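
A back-of-envelope sketch of that arithmetic (a deliberately crude model: it
ignores DNS and bandwidth entirely and just counts round trips, assuming the
~8 kbyte initial window mentioned above and doubling per round trip):

<?php
// Rough lower bound on fetching one object over a fresh connection:
// one round trip for the TCP handshake, one for the HTTP request/response,
// plus one more each time the data left to send exceeds the current
// congestion window (starting near 8 KB and doubling under slow start).
function roughFetchMs( $rttMs, $bytes, $initialWindow = 8192 ) {
	$rtts = 2; // handshake + GET
	$window = $initialWindow;
	$remaining = $bytes - $window;
	while ( $remaining > 0 ) {
		$rtts += 1;
		$window *= 2;
		$remaining -= $window;
	}
	return $rtts * $rttMs;
}

echo roughFetchMs( 75, 5000 ) . "\n";   // small object: 150 ms, pure latency
echo roughFetchMs( 75, 30000 ) . "\n";  // ~30 KB object: two extra round trips, 300 ms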

Removing round trips helps everyone, while shaving bytes only helps
people who are low-delay and very low-bandwidth, an increasingly
uncommon configuration. Also, getting JS out of the critical path
helps everyone. The reader does not care how long a once-per-session
object takes to load when it doesn't block rendering, and the site
already does really well at this.


Re: Wikipedia & YSlow

Platonides
In reply to this post by Gregory Maxwell
Gregory Maxwell wrote:
> In any case, from the second page onwards pages typically display in
> <100ms for me, and the cold cache (first page) load time for me looks
> like it's about 230ms, which is also not bad.  The grade 'f' is hardly
> deserved.

That's because it's an uploaded image. It is cached in the squids, but
not outside. So people will need to check if it's modified, but it can
be modified at any time.


Howard Chen wrote:
> Have a simple policy to generate unique URI for each resources, and
> expire as far as possible.
That's what is being used for site JS and CSS (the appended query string),
and that's why they can have the expire header.

We could provide a per-image unique URI with long caching, based on
some id or simply on the image hash. The tradeoff is that you can then
set a large expiry time, but you need to purge all the pages that include
the image when it's reuploaded, whereas with the current system that
would only be needed when it's deleted (or uploaded for the first time).

Maybe the HTML caches aren't even purged when the image is deleted on
Commons, given that there is no single table for that (the CheckUsage
problem). Can anyone confirm? But it degrades gracefully, something a
hash-based path wouldn't do.



Re: Wikipedia & YSlow

Aryeh Gregor
In reply to this post by Gregory Maxwell
On Sun, Oct 5, 2008 at 1:00 PM, Gregory Maxwell <[hidden email]> wrote:
> The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
> small enough that gzipping would not be a significant win (see above)
> but I don't recall the reason the the CSS/JS are not. Is there a
> client compatibility issue here?

Some of the CSS/JS *is* gzipped, so there had better not be a client
compatibility issue.  Styles/scripts served from index.php are
gzipped, it's only statically-served files that aren't.  I'm guessing
this would just require a line or two changed in Apache config.

>> 3. Add expire header (e.g.
>> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
>
> Hm. There are expire headers on the skin provided images, but not ones
> from upload.  It does correctly respond with 304 not modified, but a
> not-modified is often as time consuming as sending the image.

Looking at the URL, this doesn't seem to identify the version at all,
so we can't safely send an Expires header.  I'm guessing we use a
non-versioned URL here to avoid purging parser/Squid caches when a new
image is uploaded.  This is probably a losing proposition, except
maybe for very widely used images (but those shouldn't change often?).
 But it would be a pain to change.

> In any case, from the second page onwards pages typically display in
> <100ms for me, and the cold cache (first page) load time for me looks
> like it's about 230ms, which is also not bad.  The grade 'f' is hardly
> deserved.

I've found that YSlow is kind of flaky in its grades.  Some of its
advice is fairly brainless.  But there's definitely room for
improvement on our part -- gzipping stuff at the very least!

On Sun, Oct 5, 2008 at 3:52 PM, Gregory Maxwell <[hidden email]> wrote:
> Compare the gzipped sizes. Post gzipping the savings from whitespace
> removal and friends is much smaller. Yet it makes the JS unreadable
> and makes debugging into a pain.

What I was thinking (not that the idea is original to me :P) is that
we could have all JS and all CSS sent from some PHP file, call it
compact.php.  So you would have just one script tag and one style tag,
like

<script type="text/javascript" src="/w/compact.php?type=js&..."></script>
<link rel="stylesheet" type="text/css" href="/w/compact.php?type=css&..." />

That could concatenate the appropriate files, add on
dynamically-generated stuff, gzip everything, and send it in one
request, with appropriate Expires headers and so on.  This would
dramatically cut round-trips (15 total external CSS/JS files
logged-out for me right now, 24 logged-in!).
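
As a rough illustration, a minimal sketch of what such a compact.php could
look like (compact.php, its parameters, and the file list are hypothetical,
taken from the suggestion above rather than from any existing MediaWiki
entry point):

<?php
// Hypothetical compact.php: concatenate the requested CSS or JS files,
// gzip the result when the client supports it, and send far-future
// caching headers so the bundle is fetched once per version.
$type  = ( isset( $_GET['type'] ) && $_GET['type'] === 'css' ) ? 'css' : 'js';
$files = ( $type === 'css' )
	? array( 'skins/monobook/main.css' )                            // assumed list
	: array( 'skins/common/wikibits.js', 'skins/common/ajax.js' );  // assumed list

header( 'Content-Type: ' . ( $type === 'css' ? 'text/css' : 'text/javascript' ) );
header( 'Expires: ' . gmdate( 'D, d M Y H:i:s', time() + 30 * 86400 ) . ' GMT' );
header( 'Cache-Control: public, max-age=2592000' );

$out = '';
foreach ( $files as $f ) {
	$out .= file_get_contents( $f ) . "\n";
}

$acceptsGzip = isset( $_SERVER['HTTP_ACCEPT_ENCODING'] )
	&& strpos( $_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip' ) !== false;
if ( $acceptsGzip ) {
	header( 'Content-Encoding: gzip' );
	header( 'Vary: Accept-Encoding' );
	$out = gzencode( $out, 9 );
}
echo $out;

A real version would also have to validate the requested file list, append
the dynamically generated bits, and could honour a &minify=0 switch like the
one suggested below.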

Round-trips are serialized in modern browsers a lot more than they
should be -- Firefox 3 will stop all other requests while it's
requesting any JS file for some crazy reason (is there a bug open for
that?), and IE is apparently similar.  It's even worse with older
browsers, which are unreasonably cautious about how many parallel
requests to send to the same domain.

Once you're already serving all the JS together from a script, you
could minify it with no real extra cost, and prepend a helpful comment
like /* Add &minify=0 to the URL for a human-readable version */.
Minification tends to save a few percent of the original filesize on
top of gzipping (which can be like 20% of the file size after
gzipping), which can make a difference for big scripts.  Here are a
couple of examples to illustrate the point (from the O'Reilly book
"High Performance Web Sites", by Steve Souders of Yahoo!):

http://stevesouders.com/hpws/js-large-normal.php
http://stevesouders.com/hpws/js-large-minify.php

I see a saving of a few hundred milliseconds according to those pages
-- and yes, this is after gzipping.

Since this is you, though, I wait to be corrected.  :)

> If it takes 75ms to get to the nearest Wikimedia datacenter and back
> then a new HTTP get can not finish in less than 150ms.  If you want to
> improve performance you need to focus on shaving *round trips* rather
> than bytes. Byte reduction only saves you round trips if you're able
> to reduce the number of TCP windows worth of data, it's quantized and
> the lowest threshold is about 8kbytes.

The example above has a script that's 76 KB gzipped, but only 29 KB
minified plus gzipped.  On the other hand, the script is three times
as large as the scripts we serve to an anon viewing the main page (377
KB vs. 126 KB ungzipped).


Re: Wikipedia & YSlow

Daniel Friesen
I really wish people would stop spreading the crap about the /benefits/
of minification while only giving half the information.
Sure, minification does reduce some size in comparison to a full file.
And yes, minification+gzipping does make things insanely small. But that
is blatantly disregarding something. It's not the minification
that makes min+gz so small, it's the gzipping; in fact, once you gzip,
trying to minify becomes nearly pointless.

Here's the table for wikibits.js, and wikipedia's gen.js for anons
(basically monobook.js).

wikibits.js    non-gz    gzipped
full           27.1kb    8.9kb
minified       16.7kb    5.0kb

wp's gen.js    non-gz    gzipped
full           29.2kb    7.9kb
minified       16.8kb    4.5kb


Minification alone only reduces a file by about 40%, whereas gzipping
alone reduces a file by about 70%.
When it comes down to it, once you gzip, minification can barely even
save you 10% of a file's size. And honestly, that measly 10% is not
worth how badly it screws up the readability of the code.

As for client compatibility: there are some older browsers that don't
support gzipping properly (notably IE6). We serve gzipped data from the
PHP scripts, but we only do that after detecting whether the browser
supports gzipping or not. So we're not serving gzipped stuff to old
browsers like IE6 that have broken handling of gzip.
The difference with the static stuff is quite simply that it's not as
easy to make a webserver detect gzip compatibility as it is to make a
PHP script do it.
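
As a rough sketch of the kind of check a PHP script can do before gzipping
(an illustration, not MediaWiki's actual code; the broken-IE6 heuristic and
the file path are assumptions):

<?php
// Only gzip for clients that both advertise gzip support and are not a
// pre-SP2 IE6, which is known to mishandle gzipped responses in some cases.
function clientAcceptsGzip() {
	if ( !isset( $_SERVER['HTTP_ACCEPT_ENCODING'] )
		|| strpos( $_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip' ) === false ) {
		return false;
	}
	$ua = isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '';
	if ( strpos( $ua, 'MSIE 6' ) !== false && strpos( $ua, 'SV1' ) === false ) {
		return false; // pre-SP2 IE6
	}
	return true;
}

$scriptText = file_get_contents( 'skins/common/wikibits.js' ); // assumed source
header( 'Content-Type: text/javascript' );
if ( clientAcceptsGzip() ) {
	header( 'Content-Encoding: gzip' );
	header( 'Vary: Accept-Encoding' );
	echo gzencode( $scriptText );
} else {
	echo $scriptText;
}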

The limitation on fetching data in browsers isn't crazy; the standard is
to restrict to only 2 open HTTP connections to a single hostname.
Gzipping and reducing the number of script tags we use are the only
useful things that can be done to speed up viewing.

~Daniel Friesen (Dantman, Nadir-Seen-Fire)
~Profile/Portfolio: http://nadir-seen-fire.com
-The Nadir-Point Group (http://nadir-point.com)
--It's Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)

Aryeh Gregor wrote:

> On Sun, Oct 5, 2008 at 1:00 PM, Gregory Maxwell <[hidden email]> wrote:
>  
>> The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
>> small enough that gzipping would not be a significant win (see above)
>> but I don't recall the reason the the CSS/JS are not. Is there a
>> client compatibility issue here?
>>    
>
> Some of the CSS/JS *is* gzipped, so there had better not be a client
> compatibility issue.  Styles/scripts served from index.php are
> gzipped, it's only statically-served files that aren't.  I'm guessing
> this would just require a line or two changed in Apache config.
>
>  
>>> 3. Add expire header (e.g.
>>> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
>>>      
>> Hm. There are expire headers on the skin provided images, but not ones
>> from upload.  It does correctly respond with 304 not modified, but a
>> not-modified is often as time consuming as sending the image.
>>    
>
> Looking at the URL, this doesn't seem to identify the version at all,
> so we can't safely send an Expires header.  I'm guessing we use a
> non-versioned URL here to avoid purging parser/Squid caches when a new
> image is uploaded.  This is probably a losing proposition, except
> maybe for very widely used images (but those shouldn't change often?).
>  But it would be a pain to change.
>
>  
>> In any case, from the second page onwards pages typically display in
>> <100ms for me, and the cold cache (first page) load time for me looks
>> like it's about 230ms, which is also not bad.  The grade 'f' is hardly
>> deserved.
>>    
>
> I've found that YSlow is kind of flaky in its grades.  Some of its
> advice is fairly brainless.  But there's definitely room for
> improvement on our part -- gzipping stuff at the very least!
>
> On Sun, Oct 5, 2008 at 3:52 PM, Gregory Maxwell <[hidden email]> wrote:
>  
>> Compare the gzipped sizes. Post gzipping the savings from whitespace
>> removal and friends is much smaller. Yet it makes the JS unreadable
>> and makes debugging into a pain.
>>    
>
> What I was thinking (not that the idea is original to me :P) is that
> we could have all JS and all CSS sent from some PHP file, call it
> compact.php.  So you would have just one script tag and one style tag,
> like
>
> <script type="text/javascript" src="/w/compact.php?type=js&..."></script>
> <link rel="stylesheet" type="text/css" href="/w/compact.php?type=css&..." />
>
> That could concatenate the appropriate files, add on
> dynamically-generated stuff, gzip everything, and send it in one
> request, with appropriate Expires headers and so on.  This would
> dramatically cut round-trips (15 total external CSS/JS files
> logged-out for me right now, 24 logged-in!).
>
> Round-trips are serialized in modern browsers a lot more than they
> should be -- Firefox 3 will stop all other requests while it's
> requesting any JS file for some crazy reason (is there a bug open for
> that?), and IE is apparently similar.  It's even worse with older
> browsers, which are unreasonably cautious about how many parallel
> requests to send to the same domain.
>
> Once you're already serving all the JS together from a script, you
> could minify it with no real extra cost, and prepend a helpful comment
> like /* Add &minify=0 to the URL for a human-readable version */.
> Minification tends to save a few percent of the original filesize on
> top of gzipping (which can be like 20% of the file size after
> gzipping), which can make a difference for big scripts.  Here are a
> couple of examples to illustrate the point (from the O'Reilly book
> "High Performance Web Sites", by Steve Souders of Yahoo!):
>
> http://stevesouders.com/hpws/js-large-normal.php
> http://stevesouders.com/hpws/js-large-minify.php
>
> I see a saving of a few hundred milliseconds according to those pages
> -- and yes, this is after gzipping.
>
> Since this is you, though, I wait to be corrected.  :)
>
>  
>> If it takes 75ms to get to the nearest Wikimedia datacenter and back
>> then a new HTTP get can not finish in less than 150ms.  If you want to
>> improve performance you need to focus on shaving *round trips* rather
>> than bytes. Byte reduction only saves you round trips if you're able
>> to reduce the number of TCP windows worth of data, it's quantized and
>> the lowest threshold is about 8kbytes.
>>    
>
> The example above has a script that's 76 KB gzipped, but only 29 KB
> minified plus gzipped.  On the other hand, the script is three times
> as large as the scripts we serve to an anon viewing the main page (377
> KB vs. 126 KB ungzipped).
>  

Re: Wikipedia & YSlow

Aryeh Gregor
On Sun, Oct 5, 2008 at 6:35 PM, Daniel Friesen <[hidden email]> wrote:
> I really wish people would stop spreading the crap about the /benefits/
> of minification, while only giving half the information.
> Sure, minification does reduce some size in comparison to a full file.
> And yes, minification+gzipping does make things insanely small. But that
> there is blatantly disregarding something. It's not the minification
> that makes min+gz so small, it's the gzipping, in fact once you gzip
> trying to minify becomes nearly pointless.

It can cut off a significant amount of extra size on top of gzipping,
as my last post indicated, at least in some cases.  It's not "nearly
pointless".

> Here's the table for wikibits.js, and wikipedia's gen.js for anons
> (basically monobook.js).
> wikibits.js
>        non-gz
>        gzipped
> full
>        27.1kb
>        8.9kb
> minified
>        16.7kb
>        5.0kb
>
> wp's gen.js
>        non-gz
>        gzipped
> full
>        29.2kb
>        7.9kb
> minified
>        16.8kb
>        4.5kb

In other words, minification reduces the total size of those two files
from 16.9 KB to 9.5 KB, after gzipping.  That's more than 7 KB less.
That's already not pointless, and it's probably only going to become
less and less pointless over time as we use more and more scripts
(which I'm guessing will happen).

> And honestly, that measly 10% is not
> worth how badly it screws up the readability of the code.

How about my suggestion to begin the code with a comment "/* Append
&minify=0 to the URL for a human-readable version */"?

> As for client compatibility. There are some older browsers that don't
> support gzipping properly (notably ie6). We serve gzipped data from the
> php scripts, but we only do that after detecting if the browser supports
> gzipping or not. So we're not serving gzipped stuff to old browsers like
> ie6 that have broken handling of gzip.
> The difference with the static stuff is quite simply because it's not as
> easy to make a webserver detect gzip compatibility as it is to make a
> php script do it.

It should be reasonably easy to do a User-Agent check in Apache
config, shouldn't it?

> The limitation in grabbing data in browsers isn't crazy, the standard is
> to restrict to only 2 open http connections for a single hostname.

Yes, which ended up being crazy as the web evolved.  :)  That's fixed
in recent browsers, though, with heavier parallelization of most
files' loading.  But Firefox <= 3 and IE <= 7 will still stop loading
everything else when loading scripts.  IE8 and recent WebKit
thankfully no longer do this:

http://blogs.msdn.com/kristoffer/archive/2006/12/22/loading-javascript-files-in-parallel.aspx
http://webkit.org/blog/166/optimizing-page-loading-in-web-browser/


Re: Wikipedia & YSlow

Tim Starling
In reply to this post by howard chen
howard chen wrote:

> Hello all,
>
> As you might already know, YSlow is a tool to check website
> performance, I just run a test against:
> http://en.wikipedia.org/wiki/Main_Page
>
> Results is quite surprising, Grade F (47). Of course lower mark does
> not always means bad, but there are some room for improvement, e.g.
>
>
> 1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)

We've discussed this already. It's not happening.

> 2. Enable GZip compression (e.g.
> http://en.wikipedia.org/skins-1.5/monobook/main.css?179)

Yes this is possible. But there are two ways of doing it and Brion thinks
it should be done the hard way ;)

> 3. Add expire header (e.g.
> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)

You know this is a wiki, right?

> 4. Don't put CSS out of the <head />

You mean style attributes? That's an editorial issue.

> By do this, it should save some money on bandwidth, as well as to
> provide a better user experience.

Are you saying we're slow?

-- Tim Starling



Re: Wikipedia & YSlow

Aryeh Gregor
On Sun, Oct 5, 2008 at 8:16 PM, Tim Starling <[hidden email]> wrote:
>> 3. Add expire header (e.g.
>> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
>
> You know this is a wiki, right?

Clearly we'd need to use a versioned URL to do it, at least for
widely-used images.

> Are you saying we're slow?

I do regularly get pages taking ten to thirty seconds to load, but I
don't think it has much to do with this sort of front-end
optimization.


Re: Wikipedia & YSlow

Brion Vibber
In reply to this post by Gregory Maxwell

Gregory Maxwell wrote:

> On Sun, Oct 5, 2008 at 12:15 PM, howard chen <[hidden email]> wrote:
>> Results is quite surprising, Grade F (47). Of course lower mark does
>> not always means bad, but there are some room for improvement, e.g.
> [snip]
>
>> 1. Minify JS (e.g. http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
>
> Probably pointless. It's small enough already that the load time is
> going to be latency bound for any user not sitting inside a Wikimedia
> data center. On ones which are above the latency bound window (of
> roughly 8k), gzipping should get them back under it.

Minification can actually decrease sizes significantly even with
gzipping. Particularly for low-bandwidth and mobile use this could be a
serious plus.

The big downside of minification, of course, is that it makes it harder
to read and debug the code.

>> 2. Enable GZip compression (e.g.
>> http://en.wikipedia.org/skins-1.5/monobook/main.css?179)
>
> The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
> small enough that gzipping would not be a significant win (see above)
> but I don't recall the reason the the CSS/JS are not. Is there a
> client compatibility issue here?

CSS/JS generated via MediaWiki are gzipped. Those loaded from raw files
are not, as the servers aren't currently configured to do that.

>> 3. Add expire header (e.g.
>> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
>
> Hm. There are expire headers on the skin provided images, but not ones
> from upload.  It does correctly respond with 304 not modified, but a
> not-modified is often as time consuming as sending the image. Firefox
> doesn't IMS these objects every time in any case.

The primary holdup for serious expires headers on file uploads is not
having unique per-version URLs. With a far-future expires header, things
get horribly confusing when a file has been replaced, but everyone still
sees the old cached version.


Anyway, these are all known issues.

Possible remedies for CSS/JS files:
* Configure Apache to compress them on the fly (probably easy)
* Pre-minify them and have Apache compress them on the fly (not very hard)
* Run them through MediaWiki to compress them (slightly harder)
* Run them through MediaWiki to compress them *and* minify them *and*
merge multiple files together to reduce number of requests (funk-ay!)

Possible remedies for better caching of image URLs:
* Stick a version number on the URL in a query string (probably easy --
grab the timestamp from the image metadata and toss it on the URL; a
sketch follows below)
* Store files with unique filenames per version (harder since it
requires migrating files around, but something I'd love us to do)
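
For the query-string remedy, a minimal sketch (the helper name and the source
of the timestamp are assumptions for illustration):

<?php
// Append the upload timestamp as a version token, so the URL changes when
// the file is re-uploaded and a far-future Expires header becomes safe.
function versionedImageUrl( $baseUrl, $uploadTimestamp ) {
	return $baseUrl . '?' . urlencode( $uploadTimestamp );
}

// e.g. versionedImageUrl(
//     'http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png',
//     '20081005123456' )
// -> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png?20081005123456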

-- brion


Re: Wikipedia & YSlow

Jared Williams
 

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Brion Vibber
> Sent: 06 October 2008 17:56
> To: Wikimedia developers
> Subject: Re: [Wikitech-l] Wikipedia & YSlow
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Gregory Maxwell wrote:
> > On Sun, Oct 5, 2008 at 12:15 PM, howard chen
> <[hidden email]> wrote:
> >> Results is quite surprising, Grade F (47). Of course lower
> mark does
> >> not always means bad, but there are some room for improvement, e.g.
> > [snip]
> >
> >> 1. Minify JS (e.g.
> >> http://en.wikipedia.org/skins-1.5/common/ajax.js?179)
> >
> > Probably pointless. It's small enough already that the load time is
> > going to be latency bound for any user not sitting inside a
> Wikimedia
> > data center. On ones which are above the latency bound window (of
> > roughly 8k), gzipping should get them back under it.
>
> Minification can actually decrease sizes significantly even
> with gzipping. Particularly for low-bandwidth and mobile use
> this could be a serious plus.
>
> The big downside of minification, of course, is that it makes
> it harder to read and debug the code.
>
> >> 2. Enable GZip compression (e.g.
> >> http://en.wikipedia.org/skins-1.5/monobook/main.css?179)
> >
> > The page text is gzipped.  CSS/JS are not. Many of the CSS/JS are
> > small enough that gzipping would not be a significant win
> (see above)
> > but I don't recall the reason the the CSS/JS are not. Is there a
> > client compatibility issue here?
>
> CSS/JS generated via MediaWiki are gzipped. Those loaded from
> raw files are not, as the servers aren't currently configured
> to do that.
>
> >> 3. Add expire header (e.g.
> >>
> http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png)
> >
> > Hm. There are expire headers on the skin provided images,
> but not ones
> > from upload.  It does correctly respond with 304 not
> modified, but a
> > not-modified is often as time consuming as sending the
> image. Firefox
> > doesn't IMS these objects every time in any case.
>
> The primary holdup for serious expires headers on file
> uploads is not having unique per-version URLs. With a
> far-future expires header, things get horribly confusing when
> a file has been replaced, but everyone still sees the old
> cached version.
>
>
> Anyway, these are all known issues.
>
> Possible remedies for CSS/JS files:
> * Configue Apache to compress them on the fly (probably easy)
> * Pre-minify them and have Apache compress them on the fly
> (not very hard)
> * Run them through MediaWiki to compress them (slightly harder)
> * Run them through MediaWiki to compress them *and* minify
> them *and* merge multiple files together to reduce number of
> requests (funk-ay!)
>
> Possible remedies for image URLs better caching:
> * Stick a version number on the URL in a query string
> (probably easy -- grab the timestamp from the image metadata
> and toss it on the url?)

> * Store files with unique filenames per version (harder since
> it requires migrating files around, but something I'd love us to do)

Wouldn't rollbacks waste space?
I would've thought content addressing would be used to store all the images,
with the content address in the URL?

Unless there is versioned metadata associated with images that would affect
how it's sent to the client.

Jared



Re: Wikipedia & YSlow

Brion Vibber

Jared Williams wrote:
>> * Store files with unique filenames per version (harder since
>> it requires migrating files around, but something I'd love us to do)
>
> Wouldn't rollbacks waste space?

Not if we follow the 2006 restructuring plan:
http://www.mediawiki.org/wiki/FileStore

Storing the same file version multiple times would not require any
additional filesystem space, just the extra DB row w/ the versioning info.
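
For illustration, a content-addressed layout of the kind that makes duplicate
versions free looks roughly like this (a sketch assuming SHA-1 addressing,
not necessarily the exact FileStore scheme):

<?php
// Derive the on-disk location from the file's own hash: two identical
// uploads map to the same path, so re-storing an old version costs nothing
// beyond the database row that records it.
function contentAddressedPath( $storageRoot, $localFile ) {
	$hash = sha1_file( $localFile );
	return $storageRoot . '/' . substr( $hash, 0, 1 ) . '/'
		. substr( $hash, 0, 2 ) . '/' . $hash;
}

// e.g. contentAddressedPath( '/mnt/filestore', '/tmp/Commons-logo-31px.png' )
// -> /mnt/filestore/a/ab/ab54d2... (same path whenever the bytes are identical)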

-- brion


Re: Wikipedia & YSlow

Juliano F. Ravasi
In reply to this post by Brion Vibber-3
Brion Vibber wrote:
> The big downside of minification, of course, is that it makes it harder
> to read and debug the code.

Debugging is needed by what, 0.01% of all users? I think that the
&minify=0 suggestion for debugging is a very good solution. But Tim
said that isn't going to happen...

> CSS/JS generated via MediaWiki are gzipped. Those loaded from raw files
> are not, as the servers aren't currently configured to do that.

Talking about gzipping, something a little off-topic: Currently, the
parsed output in the object cache is gzipped, and MediaWiki has to unzip
it, insert it into the Monobook skin, then gzip again (in PHP zlib
output compression). Did you consider taking a shortcut, and sending the
compressed parser output directly from the cache to the client,
compressing only the surroundings created by the skin?

> * Stick a version number on the URL in a query string (probably easy --
> grab the timestamp from the image metadata and toss it on the url?)

I think it is better to have a fragment (like, the first 32-bits) of the
SHA1, so that rollbacks preserve the already-cached versions.
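
A sketch of that idea (hypothetical helper; it assumes the file's SHA-1 is
available, e.g. from the image table or via sha1_file(), and the paths are
made up for illustration):

<?php
// Use the first 32 bits (8 hex digits) of the file's SHA-1 as the version
// token: reverting to an earlier version reproduces the earlier URL, so
// clients hit the copy they already have cached.
function imageVersionToken( $sha1Hex ) {
	return substr( $sha1Hex, 0, 8 );
}

$baseUrl   = 'http://upload.wikimedia.org/wikipedia/en/9/9d/Commons-logo-31px.png';
$localPath = '/mnt/upload/wikipedia/en/9/9d/Commons-logo-31px.png'; // assumed path
echo $baseUrl . '?' . imageVersionToken( sha1_file( $localPath ) );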


--
Juliano F. Ravasi ยทยท http://juliano.info/
5105 46CC B2B7 F0CD 5F47 E740 72CA 54F4 DF37 9E96

"A candle loses nothing by lighting another candle." -- Erin Majors

* NOTE: Don't try to reach me through this address, use "contact@" instead.


Re: Wikipedia & YSlow

Brion Vibber

Juliano F. Ravasi wrote:
> Talking about gzipping, something a little off-topic: Currently, the
> parsed output in the object cache is gzipped, and MediaWiki has to unzip
> it, insert it into the Monobook skin, then gzip again (in PHP zlib
> output compression). Did you consider taking a shortcut, and sending the
> compressed parser output directly from the cache to the client,
> compressing only the surroundings created by the skin?

That sounds like some scary gzip voodoo. :) gzip is pretty cheap
(especially compared to all the surrounding network time); playing games
like this is much more likely to break, assuming it even works at all.

>> * Stick a version number on the URL in a query string (probably easy --
>> grab the timestamp from the image metadata and toss it on the url?)
>
> I think it is better to have a fragment (like, the first 32-bits) of the
> SHA1, so that rollbacks preserve the already-cached versions.

Doable, but wouldn't be a huge % of bandwidth probably.

-- brion


Re: Wikipedia & YSlow

Daniel Friesen
In reply to this post by Aryeh Gregor
No, the table got messed up since I was using HTML mode.

wikibits.js non-gz gzipped
full 27.1kb 8.9kb
minified 16.7kb 5.0kb

wp's gen.js non-gz gzipped
full 29.2kb 7.9kb
minified 16.8kb 4.5kb

When not gzipped, minification cuts something from 27kb to 16kb, but when
it's already gzipped down to 8kb it only reduces it to 5kb...

Wikia does something similar to that idea... They have an allinone.js,
and you use &allinone=0 to disable it. But honestly, there are cases
where you use links, or something, or a POST request or something else
that can only be done once, you get a JS error, and it's a pain to find
out what's going on. Not when it only saves around 4kb.

Minification being pointless when gzipped is actually logical to
understand if you know their principles. Both gzipping and minification
follow the same principle: they take sequences that are repeated, create
an optimized tree, and then store smaller sequences that refer to the
data in that tree.
Once something has been optimized like that, it's almost impossible to
get it any smaller because you've already removed the repeated sequences
of data.
Basically, trying to minify then gzip is like trying to gzip twice; it
can't technically give you much more.

Oh wait, scratch that... ^_^ I'm thinking of JS packing...
JS minification doesn't do anything except kill things like
whitespace... That obviously can't save too much.

~Daniel Friesen (Dantman, Nadir-Seen-Fire)
~Profile/Portfolio: http://nadir-seen-fire.com
-The Nadir-Point Group (http://nadir-point.com)
--It's Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)



Aryeh Gregor wrote:

> On Sun, Oct 5, 2008 at 6:35 PM, Daniel Friesen <[hidden email]> wrote:
>  
>> I really wish people would stop spreading the crap about the /benefits/
>> of minification, while only giving half the information.
>> Sure, minification does reduce some size in comparison to a full file.
>> And yes, minification+gzipping does make things insanely small. But that
>> there is blatantly disregarding something. It's not the minification
>> that makes min+gz so small, it's the gzipping, in fact once you gzip
>> trying to minify becomes nearly pointless.
>>    
>
> It can cut off a significant amount of extra size on top of gzipping,
> as my last post indicated, at least in some cases.  It's not "nearly
> pointless".
>
>  
>> Here's the table for wikibits.js, and wikipedia's gen.js for anons
>> (basically monobook.js).
>> wikibits.js
>>        non-gz
>>        gzipped
>> full
>>        27.1kb
>>        8.9kb
>> minified
>>        16.7kb
>>        5.0kb
>>
>> wp's gen.js
>>        non-gz
>>        gzipped
>> full
>>        29.2kb
>>        7.9kb
>> minified
>>        16.8kb
>>        4.5kb
>>    
>
> In other words, minification reduces the total size of those two files
> from 16.9 KB to 9.5 KB, after gzipping.  That's more than 7 KB less.
> That's already not pointless, and it's probably only going to become
> less and less pointless over time as we use more and more scripts
> (which I'm guessing will happen).
>
>  
>> And honestly, that measly 10% is not
>> worth how badly it screws up the readability of the code.
>>    
>
> How about my suggestion to begin the code with a comment "/* Append
> &minify=0 to the URL for a human-readable version */"?
>
>  
>> As for client compatibility. There are some older browsers that don't
>> support gzipping properly (notably ie6). We serve gzipped data from the
>> php scripts, but we only do that after detecting if the browser supports
>> gzipping or not. So we're not serving gzipped stuff to old browsers like
>> ie6 that have broken handling of gzip.
>> The difference with the static stuff is quite simply because it's not as
>> easy to make a webserver detect gzip compatibility as it is to make a
>> php script do it.
>>    
>
> It should be reasonably easy to do a User-Agent check in Apache
> config, shouldn't it?
>
>  
>> The limitation in grabbing data in browsers isn't crazy, the standard is
>> to restrict to only 2 open http connections for a single hostname.
>>    
>
> Yes, which ended up being crazy as the web evolved.  :)  That's fixed
> in recent browsers, though, with heavier parallelization of most
> files' loading.  But Firefox <= 3 and IE <= 7 will still stop loading
> everything else when loading scripts.  IE8 and recent WebKit
> thankfully no longer do this:
>
> http://blogs.msdn.com/kristoffer/archive/2006/12/22/loading-javascript-files-in-parallel.aspx
> http://webkit.org/blog/166/optimizing-page-loading-in-web-browser/
>  



Re: Wikipedia & YSlow

howard chen
On Tue, Oct 7, 2008 at 11:18 AM, Daniel Friesen <[hidden email]> wrote:
> Basically trying to minify then gzip, is like trying to gzip twice, it
> can't technically give you much more.
>

Nope, nope, nope: it not only cuts whitespace, but also removes comments, etc.

There is no reason to send programming comments to users; at
least 99.99999% of users don't need them, and if you do need them, you
probably know how to get them. Programming comments are completely
unnecessary for rendering a page.

Considering that Wikipedia is one of the most popular sites in the
world, every bit should count. It will save on your bandwidth
investment.

Howard


Re: Wikipedia & YSlow

Chad
On Tue, Oct 7, 2008 at 7:20 AM, howard chen <[hidden email]> wrote:

> On Tue, Oct 7, 2008 at 11:18 AM, Daniel Friesen <[hidden email]>
> wrote:
> > Basically trying to minify then gzip, is like trying to gzip twice, it
> > can't technically give you much more.
> >
>
> nope nope nope, not only cut space, but also removing comments etc..
>
> there are no reason to send the programming comments to users, at
> least 99.99999% of user don't need it, if you need it, probably you
> know how to get it, programming comments are completely not needed to
> render a page.
>
> Consider a page like wikipedia as one of the most popular site in the
> world, every bit should count. It will save your bandwidth
> investments.
>
> Howard
>

I would support minification if we had a minify=0 parameter that
we could specify to unminify when needed (like someone suggested
above). Sometimes you just need to be able to read it.

-Chad

Problem with edittoken

javi bueno

Hi! I'm trying to get the edittoken using a POST call with the following code:

$URL = "es.wikipedia.org/w/api.php";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://$URL");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$poststring["action"] = "query";
$poststring["prop"] = "info|revisions";
$poststring["intoken"] = "edit";
$poststring["titles"] = "Portada";
$data = curl_exec($ch);

... but I always receive the following response:
edittoken="+\"

Does anybody know what's happening?

Thanks.




