WebRequest and PHP bug 31892 fixed 6 years ago

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

WebRequest and PHP bug 31892 fixed 6 years ago

vitalif
Hello!

WebRequest::getPathInfo() still depends on PHP bug 31892 fixed 6 years
ago. I.e. WebRequest uses REQUEST_URI instead of "mangled" PATH_INFO
which is not "mangled" since PHP 5.2.4. Yeah, Apache still replaces
multiple /// with single /, but afaik it's done for REQUEST_URI as well
as PATH_INFO.
Maybe that part of the code should be removed?

Also I don't understand the need for PathRouter - my IMHO is that it's
just an unnecessary sophistication. As I understand EVERYTHING worked
without it and there is no feature in MediaWiki which depends on a
router. Am I correct?

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: WebRequest and PHP bug 31892 fixed 6 years ago

K. Peachey-2
On Wed, Mar 13, 2013 at 8:22 PM, <[hidden email]> wrote:
> Also I don't understand the need for PathRouter - my IMHO is that it's
> just an unnecessary sophistication. As I understand EVERYTHING worked
> without it and there is no feature in MediaWiki which depends on a router.
> Am I correct?

I doubt Daniel would have introduced it if it was "un-necessary" or
"pointless", I believe from memory it was to improve the handling of
paths over a wide range of set-ups and environments (where sometimes
it would fail). You would need to git blame the file and find the
revision where it was introduced to confirm if that is truly the case
(or if i'm mistaking it for other code)

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: WebRequest and PHP bug 31892 fixed 6 years ago

vitalif
> I doubt Daniel would have introduced it if it was "un-necessary" or
> "pointless", I believe from memory it was to improve the handling of
> paths over a wide range of set-ups and environments (where sometimes
> it would fail). You would need to git blame the file and find the
> revision where it was introduced to confirm if that is truly the case
> (or if i'm mistaking it for other code)

I've looked at the annotations and what I've seen is that PathRouter
only fixes https://bugzilla.wikimedia.org/show_bug.cgi?id=32621 by using
path weights. Actually, I've started looking at the routing code after
hitting this same bug with img_auth.php action path. But as I
understand, it could be fixed much simpler just by reordering two parts
of existing code and examining $wgArticlePath after $wgActionPaths :)
And a single extension using the PathRouter is
http://www.mediawiki.org/wiki/Extension:NamespacePaths ...

Of course I support new features, there are some features that I myself
would want to be in MW core :-)

And I'm sure my point of view may be incorrect :-) but MW trunk (i.e.
master) slightly frigtens me when compared to previous versions - the
codebase seems to grow and grow and grow having more and more and more
different helpers... And it becomes more and more complex with no
simplification effort... (or maybe I'm just not aware of it)

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: WebRequest and PHP bug 31892 fixed 6 years ago

Daniel Friesen-2
On Wed, 13 Mar 2013 11:40:08 -0700, <[hidden email]> wrote:

>> I doubt Daniel would have introduced it if it was "un-necessary" or
>> "pointless", I believe from memory it was to improve the handling of
>> paths over a wide range of set-ups and environments (where sometimes
>> it would fail). You would need to git blame the file and find the
>> revision where it was introduced to confirm if that is truly the case
>> (or if i'm mistaking it for other code)
>
> I've looked at the annotations and what I've seen is that PathRouter  
> only fixes https://bugzilla.wikimedia.org/show_bug.cgi?id=32621 by using  
> path weights. Actually, I've started looking at the routing code after  
> hitting this same bug with img_auth.php action path. But as I  
> understand, it could be fixed much simpler just by reordering two parts  
> of existing code and examining $wgArticlePath after $wgActionPaths :)
> And a single extension using the PathRouter is  
> http://www.mediawiki.org/wiki/Extension:NamespacePaths ...
>
> Of course I support new features, there are some features that I myself  
> would want to be in MW core :-)
>
> And I'm sure my point of view may be incorrect :-) but MW trunk (i.e.  
> master) slightly frigtens me when compared to previous versions - the  
> codebase seems to grow and grow and grow having more and more and more  
> different helpers... And it becomes more and more complex with no  
> simplification effort... (or maybe I'm just not aware of it)

fixing bug 32621 is a todo. The first attempt failed and some tweaks are  
needed to use the PathRouter to fix that bug.

PathRouter allows for the use of custom paths to expand. NamespacePaths is  
an example of one thing you can do (say giving Help: pages a /help/ path)  
but you could also apply that to special pages, etc... whatever. It's also  
the precursor to MW being able to handle 404s natively. The plan is in the  
future you'll just be able to throw everything that's not a file right at  
index.php and pretty urls, 404 pages, 404 thumbnail handlers, etc... will  
all just work natively without any special configuration.

And by 404, I don't mean standard 404 pages like this:
http://wiki.commonjs.org/404
I mean nice in-site 404 pages that actually help visitors find what they  
were looking for:
http://www.dragonballencyclopedia.com/404

Not sure how PATH_INFO being unmangled fixes anything. There are other  
servers where PATH_INFO won't easily be outputted. REQUEST_URI handling  
works better in every case. And ?title=$1 in rewrite rules are evil.  
Determining what urls run what code has always been the job of the  
application in every good language, not the webserver. And we can do it  
using REQUEST_URI much more reliably than some webservers.
Anyways, I wish I could just get rid of the PATH_INFO code. I have yet to  
hear of someone actually using it now that practically every webserver  
there is outputs REQUEST_URI meaning the PATH_INFO code is never reached.

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: WebRequest and PHP bug 31892 fixed 6 years ago

vitalif
> fixing bug 32621 is a todo. The first attempt failed and some tweaks
> are  needed to use the PathRouter to fix that bug.
>
> PathRouter allows for the use of custom paths to expand.
> NamespacePaths is  an example of one thing you can do (say giving
> Help: pages a /help/ path)  but you could also apply that to special
> pages, etc... whatever. It's also  the precursor to MW being able to
> handle 404s natively. The plan is in the  future you'll just be able
> to throw everything that's not a file right at  index.php and pretty
> urls, 404 pages, 404 thumbnail handlers, etc... will  all just work
> natively without any special configuration.
>
> And by 404, I don't mean standard 404 pages like this:
> http://wiki.commonjs.org/404
> I mean nice in-site 404 pages that actually help visitors find what
> they  were looking for:
> http://www.dragonballencyclopedia.com/404
>
> Not sure how PATH_INFO being unmangled fixes anything. There are
> other  servers where PATH_INFO won't easily be outputted. REQUEST_URI
> handling  works better in every case. And ?title=$1 in rewrite rules
> are evil.  Determining what urls run what code has always been the
> job
> of the  application in every good language, not the webserver. And we
> can do it  using REQUEST_URI much more reliably than some webservers.
> Anyways, I wish I could just get rid of the PATH_INFO code. I have
> yet to  hear of someone actually using it now that practically every
> webserver  there is outputs REQUEST_URI meaning the PATH_INFO code is
> never reached.

Thanks for answering!
But wasn't all that possible with just using something like
$wgActionPaths?
Unmangled PATH_INFO allows for a single rewrite rule like (.*) ->
index.php/$1 to and you won't need to strip base from URIs (yet of
course it's not hard)
And you say PATH_INFO is unavailable on some configurations - can you
please clarify what are these configurations?


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: WebRequest and PHP bug 31892 fixed 6 years ago

vitalif
In reply to this post by Daniel Friesen-2
And what is the point of making "pretty urls" in case of MediaWiki?
I think they're already pretty much pretty in MediaWiki :)
/edit/$1 is slightly prettier than ?action=edit, but as I understand it
doesn't affect anything, even like SEO.
And I don't think /help/$1 is any better than /Help:$1 at all...

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Reply | Threaded
Open this post in threaded view
|

Re: WebRequest and PHP bug 31892 fixed 6 years ago

Daniel Friesen-2
In reply to this post by vitalif
On Thu, 14 Mar 2013 23:37:40 -0700, <[hidden email]> wrote:

> Thanks for answering!
> But wasn't all that possible with just using something like  
> $wgActionPaths?

$wgActionPaths only support actions. And actions are a very old, legacy,  
and frankly dying out thing. Practically every feature is now done with  
special pages which were previously not eligible for special url handling.


> And what is the point of making "pretty urls" in case of MediaWiki?
> I think they're already pretty much pretty in MediaWiki :)
> /edit/$1 is slightly prettier than ?action=edit, but as I understand it  
> doesn't affect anything, even like SEO.
> And I don't think /help/$1 is any better than /Help:$1 at all...

You may not think /help/Foo is nicer than /wiki/Help:Foo but there is  
undoubtedly some wiki that would like to have something like /user/Foo,  
/forum/..., or say /product/Foo. PathRouter also supports things more  
complex than something simple like what that one extension does. The  
ShortUrl extension actually even uses PathRouter to support /s/xyz style  
short urls to pages on the wiki (very useful on foreign wikis where urls  
usually look something like  
"/wiki/%E3%82%A6%E3%82%A3%E3%82%AD%E3%83%9A%E3%83%87%E3%82%A3%E3%82%A2").
A good deal of it is about having the control to implement whatever fits  
the situation you are in best.


> Unmangled PATH_INFO allows for a single rewrite rule like (.*) ->  
> index.php/$1 to and you won't need to strip base from URIs (yet of  
> course it's not hard)

index.php/$1 is a bad way to do rewrites. It is only fully supported by  
Apache+mod_php. It doesn't work on other webservers. And it also breaks in  
some of the cgi/fcgi Apache setups.
While on the other hand piping everything to index.php and using  
REQUEST_URI to determine what to do works on every webserver, in every  
setup (ok if your host didn't let you any where near any sort of config,  
404, or rewrite control it wouldn't work, but neither would /wiki/$1). It  
also gives us better control over the url handling and makes the  
possibility of configuring short/pretty urls by GUI/installer in the  
future much more feasible.


> And you say PATH_INFO is unavailable on some configurations - can you  
> please clarify what are these configurations?

I never said that PATH_INFO was unavailable. I said it's unused. At this  
stage I have yet to see a webserver that does not output REQUEST_URI. Our  
code is written such that PATH_INFO is only used when REQUEST_URI is not  
given. That means that for all of those webservers the branch of code we  
have handling PATH_INFO is never actually run. We could probably throw a  
fatal error into that section of code and have absolutely no one notice.  
Heck, that code could legitimately bit rot, be broken for years, and it  
never be realized. Hence I'd like to make it disappear.


--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l