non-obvious uses of <nowiki> in your language

Re: non-obvious uses of <nowiki> in your language

John Erling Blad
In my opinion we should first try to process the whole linked phrase by
inflection (affix) rules, and only if that fails, that is, no link target can
be found, should the regexps for prefix and linktrails be applied. If
applying prefix or linktrails creates a word that can be inflected, and it
links to the same target, then move the strings into the linked phrase. If
the link uses the pipe-form, then move the strings into the second part of
the link, aka the link text.

Links using the pipe-form should not have the link target inflected. This
is important, as this is the natural escape route if inflection gives the
wrong target for whatever reason.

Inflected links should go to the target with the smallest difference. This
is a non-trivial problem. We often link _phrases_ and those could be
processed by several rules, each with some kind of weight rules. An edit
distance would probably not be sufficient.

Perhaps most important: VisualEditor should not insert <nowiki/>. If
users need this escape route, then let them add it themselves in
WikitextEditor.


On Fri, Oct 5, 2018 at 6:17 PM Amir E. Aharoni <[hidden email]>
wrote:

> On Fri, 5 Oct 2018 at 16:59, Dan Garry <[hidden email]> wrote:
> >
> > On Thu, 4 Oct 2018 at 23:29, John Erling Blad <[hidden email]> wrote:
> >
> > > Usually it comes from user errors while using VE. This kind of error is
> > > quite common, and I asked (several years ago) whether it could be fixed
> > > in VE, but was told "no".
> > >
> >
> > I'd really appreciate it if you could give me more information on this.
>
> This is very frequent. I know that in the Hebrew Wikipedia it happens up to
> 20 times a day (I actually counted this for many months), and this is never
> intentional or desirable. Never, ever. 100% of cases. The same must be true
> for many other languages, but probably not for all. In wikis bigger than
> the Hebrew Wikipedia it probably happens much more often than 20 times a
> day.
>
> It is possibly the most frequent reason for automatic insertion of <nowiki>
> tags (although this may be different by language).
>
> How does it happen? Several ways:
> * People add a word ending to an existing link. English has very few word
> endings (-s, -ing, -ed, -able, and not much more), but many other languages
> have more.
> * People highlight only a part of a word when they add a link, even though
> they should have highlighted the whole word.
> * In particular, people highlight the part of the word without an ending.
> For example, "Dogs" is written, and people highlight "Dog".
> * People sometimes actually want to write two separate words and forget to
> write a space. (This may sound silly, but I saw this happening very often.)
> * People write a compound word and link a part of the word. Sometimes it's
> intentional, although as we can see in other emails in this thread not
> everybody agrees about the desirability of this. This works very
> differently in different languages. German has a lot of them, English has
> far fewer, Hebrew has almost zero.
>
> It's worth running proper user testing.
>
> > Here's how the linking feature works right now for adding links to words
> > which presently have no links:
> >
> >    - If you put your cursor inside a word without highlighting anything,
> >    and add a link, the link is added to the entire word.
> >    - If you highlight some text, and add a link, the link is added to the
> >    highlighted text.
>
> I know this, and I like how it works, but the fact is that there are many
> other users who don't know this. Simply searching wikitext for
> "]]<nowiki/>" will show how often this happens.
>
> > How would you propose this feature be changed?
>
> One possibility is to not add <nowiki/> after a link. I proposed it, but it
> was declined: https://phabricator.wikimedia.org/T141689 . The declining
> comment links to T128060, which you mentioned in your email, and it's still
> not resolved.
>
> Other than stopping it entirely, I cannot think of many other
> possibilities. Maybe we could show a warning, although I suspect that many
> users will ignore it or find it unnecessarily intrusive. I'm not a real
> designer, and it's possible that a real designer can come up with something
> better.
>
> Another thing we could consider is to link the whole word *by default*, and
> to add another function that separates a link from the trail. I'd further
> suggest the separation be done internally not by "<nowiki/>", but by some
> other syntax that looks more semantic, for example "{{#sep}}" (this should
> be a magic word and not a template!). My educated guess is that separating
> the word from the link is much less frequent than wanting to link the whole
> word. Part of my motivation for starting this thread was to understand how
> this works in different languages.
> _______________________________________________
> Wikitech-l mailing list
> [hidden email]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
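
A rough way to try the search suggested in the quoted message, as a minimal Python sketch. The pattern and the function name `count_nowiki_after_link` are illustrative assumptions, not an existing tool; the pattern also accepts the `<nowiki />` spelling with a space, which is used elsewhere in this thread.

```python
import re

# Assumption: an empty nowiki tag directly after "]]", with optional
# whitespace before the slash.
NOWIKI_AFTER_LINK = re.compile(r"\]\]<nowiki\s*/>")


def count_nowiki_after_link(wikitext: str) -> int:
    """Count links that are immediately followed by an empty nowiki tag."""
    return len(NOWIKI_AFTER_LINK.findall(wikitext))


sample = "[[Dog]]<nowiki/>s, [[Schnee]]<nowiki />reichtum and [[Cat]]s"
print(count_nowiki_after_link(sample))  # 2
```

The count says nothing about whether an occurrence was intentional; it only locates candidates for review.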

Re: non-obvious uses of <nowiki> in your language

Chad
In reply to this post by Daniel Kinzler-3
I'm personally a fan of <!>.

I came across it years ago--it's a null comment. Can't find the reference
at the moment though.

-Chad

On Thu, Oct 4, 2018, 2:25 PM Daniel Kinzler <[hidden email]> wrote:

> Am 04.10.2018 um 18:58 schrieb Thiemo Kreuz:
> > The syntax "[[Schnee]]<nowiki />reichtum" is quite common in the
> > German community. There are not many other ways to achieve the same:
> > <span /> or &shy; can be used instead.[1] The latter is often the
> > better alternative, but an auto-replacement is not possible. For
> > example, "[[Bund]]<nowiki />estag" must become "[[Bund]]es&shy;tag".
>
> We could introduce new syntax for this, such as &nope; or even &nowiki;.
>
> Or how about {{}} for "this is a syntactic element, but it does nothing"?
> But if that is mixed in with template expansion, it won't work if it
> expands to nothing, since template expansion happens before link parsing,
> right? For better or worse...
>
> --
> Daniel Kinzler
> Principal Software Engineer, MediaWiki Platform
> Wikimedia Foundation
>

Re: non-obvious uses of <nowiki> in your language

Chad
Found it :)

https://www.w3.org/MarkUp/SGML/sgml-lex/sgml-lex

Search for "empty comment declaration" :)

-Chad

On Fri, Oct 5, 2018, 11:50 PM Chad <[hidden email]> wrote:

> I'm personally a fan of <!>.
>
> I came across it years ago--it's a null comment. Can't find the reference
> at the moment though.
>
> -Chad

Re: non-obvious uses of <nowiki> in your language

bawolff
Alas, no longer valid in XML or HTML5. (HTML5 will still parse it as an
empty comment, but with an "incorrectly-opened-comment" error.)

--
Brian


On Sat, Oct 6, 2018 at 6:57 AM Chad <[hidden email]> wrote:

>
> Found it :)
>
> https://www.w3.org/MarkUp/SGML/sgml-lex/sgml-lex
>
> Search for "empty comment declaration" :)
>
> -Chad

Re: non-obvious uses of <nowiki> in your language

Amir E. Aharoni
... And, more importantly, its form doesn't say "separate the trail from
the link". Just like <nowiki>, it only *happened* to do it (I tried on
Wikipedia, and it doesn't do it now).

The point I'm trying to make in this thread is that <nowiki> happens to do
certain things other than showing wiki syntax without parsing, and is used
for them as if it's *intended* for it, but this is a hack. If a certain
functionality is needed, such as separating the trail from the link, then
it's worth considering creating a piece of syntax for it.

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore


On Sun, 7 Oct 2018 at 19:08, bawolff <[hidden email]> wrote:

> Alas, no longer valid in XML or HTML5. (HTML5 will still parse it as an
> empty comment, but with an "incorrectly-opened-comment" error.)
>
> --
> Brian

Re: non-obvious uses of <nowiki> in your language

C. Scott Ananian
The relevant Parsoid feature request for having VE use linktrails is
https://phabricator.wikimedia.org/T50463, since in general Parsoid just
generates `[[Book|books]]` when VE gives it `<a href="./Book">books</a>`.

If VE gives Parsoid `<a href="./Book">book</a>s` it will assume that's what
the author actually meant, and will generate `[[Book]]<nowiki/>s` using a
very general mechanism used for a number of other syntax conflicts (like if
you actually want to start a line with the literal character `*`).
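
The two cases above can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Parsoid's actual code: the function `serialize_link`, its parameters, and the English-style trail pattern (lowercase ASCII letters) are all hypothetical.

```python
import re

# Assumption: an English-style link trail is a run of lowercase ASCII
# letters directly after the closing "]]".
LINKTRAIL = re.compile(r"[a-z]+")


def _lcfirst(s: str) -> str:
    """Lowercase the first character; titles ignore first-letter case."""
    return s[:1].lower() + s[1:]


def serialize_link(target: str, text: str, following: str) -> str:
    """Serialize one link and the text right after it to wikitext."""
    if _lcfirst(text) == _lcfirst(target):
        link = f"[[{text}]]"
    else:
        link = f"[[{target}|{text}]]"
    # Text that would otherwise be absorbed as a link trail is kept
    # outside the link by an empty <nowiki/>.
    if LINKTRAIL.match(following):
        link += "<nowiki/>"
    return link + following


print(serialize_link("Book", "books", " are"))  # [[Book|books]] are
print(serialize_link("Book", "Book", "s"))      # [[Book]]<nowiki/>s
```

The sketch never emits the shorter `[[Book]]s` form for linked text "books"; producing that form is what the feature request asks for.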

I don't think the answer is to invent new syntax for linktrail separation
-- we have quite enough different ways of escaping and/or token-breaking
already, as partially enumerated in this thread.
The only one I would be happy to facilitate would be `-{}-` since it is
already an odd parser corner case -- it is parsed by the wikitext
preprocessor but then spit back out as literal text by the second parsing
phase unless LanguageConverter is enabled for the specific page language.
It would simplify the parse if the LanguageConverter constructs were
"always on" instead of being en/disabled on a page-by-page basis.
  --scott

On Sun, Oct 7, 2018 at 12:23 PM Amir E. Aharoni <
[hidden email]> wrote:

> ... And, more importantly, its form doesn't say "separate the trail from
> the link". Just like <nowiki>, it only *happened* to do it (I tried on
> Wikipedia, and it doesn't do it now).
>
> The point I'm trying to make in this thread is that <nowiki> happens to do
> certain things other than showing wiki syntax without parsing, and is used
> for them as if it's *intended* for it, but this is a hack. If a certain
> functionality is needed, such as separating the trail from the link, then
> it's worth considering creating a piece of syntax for it.



--
(http://cscott.net)

Re: non-obvious uses of <nowiki> in your language

MGChecker
In reply to this post by John Erling Blad
Hi,

> Links using the pipe-form should not have the link target inflected. This is important, as this is the natural escape route if inflection gives wrong target for whatever reason.

This is what I think is particularly odd about linktrails: Why do links like [[Examples|Example]]s have a linktrail? I wouldn't expect it and I don't think anyone would; on the contrary, I still remember discovering this really weird behavior years ago.

I know parser changes are difficult, but adding linktrails only to links without | seems like the easiest and expected solution for this whole problem to me, even if it isn't the most elegant one.

Regards,
MGChecker

-----Original Message-----
From: Wikitech-l [mailto:[hidden email]] On Behalf Of John Erling Blad
Sent: Friday, 5 October 2018 20:48
To: Wikimedia developers
Subject: Re: [Wikitech-l] non-obvious uses of <nowiki> in your language

In my opinion we should first try to process the whole linked phrase by inflection (affix) rules, and only if that fails, that is, no link target can be found, should the regexps for prefix and linktrails be applied. If applying prefix or linktrails creates a word that can be inflected, and it links to the same target, then move the strings into the linked phrase. If the link uses the pipe-form, then move the strings into the second part of the link, aka the link text.

Links using the pipe-form should not have the link target inflected. This is important, as this is the natural escape route if inflection gives the wrong target for whatever reason.

Inflected links should go to the target with the smallest difference. This is a non-trivial problem. We often link _phrases_ and those could be processed by several rules, each with some kind of weight rules. An edit distance would probably not be sufficient.

Perhaps most important: VisualEditor should not insert <nowiki/>. If users need this escape route, then let them add it themselves in WikitextEditor.



Re: non-obvious uses of <nowiki> in your language

MZMcBride-2
MGChecker wrote:
>> Links using the pipe-form should not have the link target inflected.
>>This is important, as this is the natural escape route if inflection
>>gives wrong target for whatever reason.
>
>This is what I think is particularly odd about linktrails: Why do links
>like [[Examples|Example]]s have a linktrail? I wouldn't expect it and I
>don't think anyone would, on the contrary I still remember discovering
>this really weird behavior years ago.

I'm not sure I understand. I would expect a link trail with
"[[Examples|Example]]s" since there is a link trail with "[[Example]]s".
I'm not sure why anyone would associate link trail behavior with the
presence or lack of a pipe character. The defining characteristic of link
trails is text being adjacent to "]]", as far as I know.

Is the particular case you mention common? It seems like it would be much
more common for a user to simply write "[[Examples]]" currently to achieve
the same output.

MZMcBride




Re: non-obvious uses of <nowiki> in your language

MGChecker
> MZMcBride wrote:
> I'm not sure I understand. I would expect a link trail with "[[Examples|Example]]s" since there is a link trail with "[[Example]]s".
> I'm not sure why anyone would associate link trail behavior with the presence or lack of a pipe character. The defining characteristic of link trails is text being adjacent to "]]", as far as I know.

Yeah, currently there is a link trail with "[[Example]]s", but I consider this neither intuitive nor helpful. If I specify target and link text separately, why would I want a link trail? I could write it as part of the target instead. I think for most people, writing something like [[Examples|Example]]s is the first thing they try in order to avoid link trails. In my opinion, link trails don't make anything easier when target and link text are specified separately. To be clear: I propose to change the current parser behavior to avoid unwanted link trails.

> Is the particular case you mention common? It seems like it would be much more common for a user to simply write "[[Examples]]" currently to achieve the same output.

Since the case I mentioned shouldn't be common and is clearly more complicated than needed, I think a behavior change wouldn't have much impact.

Regards,
MGChecker



Re: non-obvious uses of <nowiki> in your language

Trey Jones
I'm not sure how much impact it would have on existing link specifications
to make the change, but I think MGChecker has a good solution. The
"[[target|linktext]]extra" format allows you to specify exactly what part
of the text should have a link, while "[[target]]extra" would be understood
as a shortcut to "[[target|targetextra]]". This solves the linktrails
problem without introducing any extra tags or using nowiki in weird ways.

Looking at some examples in this thread:

   - [[Schnee]]<nowiki />reichtum would be [[Schnee|Schnee]]reichtum
   - [[Gesetz]]e and [[Finger]]s are fine
   - [[Heimat]]losigkeit is fine
   - [[absorpsjon]]s[[Spektrallinje|linjene]] might work as intended, but
   if the middle "s" isn't supposed to be linked then
   [[Absorpsjon|absorpsjon]]s[[Spektrallinje|linjene]] would do the trick
   - ma[[Øssur Havgrímsson|ge]]<nowiki/>e[[Øssur Havgrímsson|evner og]] is
   still something of a mystery, but
   ma[[Øssur Havgrímsson|ge]]e[[Øssur Havgrímsson|evner og]] would probably
   do what is intended
   - [[Alexander Kielland]]<nowiki/>s would be
   [[Alexander Kielland|Alexander Kielland]]s
   - [[De forente nasjoner|FN]]s would be fine
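
To make the difference concrete, here is a minimal sketch of the two behaviors side by side. It is not the MediaWiki parser: the function name render, the <a:target>...</a> output notation, and the ASCII-only trail pattern are all assumptions made for illustration, and <nowiki/> handling is left out.

```python
import re

# Hypothetical, simplified link pattern: [[target]] or [[target|text]],
# followed by an optional ASCII a-z trail (an English-style assumption).
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]([a-z]*)")


def render(wikitext, proposed):
    """Rewrite each link as <a:target>linked text</a> plus any unlinked trail.

    proposed=False: the trail is always pulled into the link (current behavior).
    proposed=True:  the trail is pulled in only for links without a pipe.
    """
    def repl(m):
        target, text, trail = m.group(1), m.group(2), m.group(3)
        piped = text is not None
        if not piped:
            text = target
        if piped and proposed:
            # A piped link states its link text exactly; the trail stays outside.
            return f"<a:{target}>{text}</a>{trail}"
        return f"<a:{target}>{text}{trail}</a>"

    return LINK.sub(repl, wikitext)


print(render("[[Schnee|Schnee]]reichtum", proposed=False))
print(render("[[Schnee|Schnee]]reichtum", proposed=True))
print(render("[[Gesetz]]e", proposed=True))
```

Under these assumptions the first call links all of "Schneereichtum", the second links only "Schnee", and the third still links "Gesetze", since an unpiped link keeps its trail in both modes.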

This isn't really about <nowiki> anymore (sorry, Amir!), but I think it could
solve the linktrails syntax issue. The problem, as I alluded to earlier, is
what changing the syntax would do to existing links. It would be possible,
though, to automatically convert existing "[[target|linktext]]extra" to
"[[target|linktextextra]]" if target and linktext are different, or to
"[[target]]extra" if target and linktext are the same (possibly modulo
whatever minor differences are allowed, like upper/lowercase, though there
are rare instances of articles that differ only by upper/lowercase).
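
A rough sketch of what such a conversion pass could look like follows. The names migrate and same_title are hypothetical, the trail is assumed to be ASCII a-z only, and "same modulo minor differences" is assumed to mean "equal apart from the case of the first letter", which a real migration would have to check per wiki.

```python
import re

# Hypothetical pattern: a piped link immediately followed by a non-empty
# ASCII trail. Unpiped links need no conversion and are not matched.
PIPED_WITH_TRAIL = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]]*)\]\]([a-z]+)")


def same_title(target, text):
    # Assumption: titles count as equal if they differ only in the case
    # of the first letter.
    return target[:1].lower() + target[1:] == text[:1].lower() + text[1:]


def migrate(wikitext):
    """Rewrite piped links so they keep their current rendering under the
    proposed rule (piped links no longer absorb a trail)."""
    def repl(m):
        target, text, trail = m.groups()
        if same_title(target, text):
            # Target and text agree: the unpiped shortcut keeps the trail
            # linked. The link text is reused to preserve displayed case.
            return f"[[{text}]]{trail}"
        # Otherwise move the trail inside the explicit link text.
        return f"[[{target}|{text}{trail}]]"

    return PIPED_WITH_TRAIL.sub(repl, wikitext)


print(migrate("[[De forente nasjoner|FN]]s"))   # [[De forente nasjoner|FNs]]
print(migrate("[[Schnee|Schnee]]reichtum"))     # [[Schnee]]reichtum
```

Links without a pipe, such as [[Gesetz]]e, pass through unchanged, because their rendering would not change under the proposal.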

Are there any linktrails settings other than off and on? We'd want to
make sure any changes didn't do weird things to Chinese or other spaceless
languages.

—Trey

Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation

On Sat, Oct 13, 2018 at 9:29 PM, MGChecker <[hidden email]> wrote:


Re: non-obvious uses of <nowiki> in your language

Bartosz Dziewoński
On 2018-10-15 16:34, Trey Jones wrote:
> I'm not sure how much impact it would have on existing link specifications
> to make the change, but I think MGChecker has a good solution. The
> "[[target|linktext]]extra" format allows you to specify exactly what part
> of the text should have a link, while "[[target]]extra" would be understood
> as a shortcut to "[[target|targetextra]]". This solves the linktrails
> problem without introducing any extra tags or using nowiki in weird ways.

Sounds like a cute small syntax improvement! :)


> Are there any linktrails settings other than off and on? We'd want to
> make sure any changes didn't do weird things to Chinese or other spaceless
> languages.

There are two things to consider:

* Linktrails are language-specific. For example, in English, only ASCII
a-z are handled in linktrails, while Polish also allows accented letters
ęóąśłżźćńĘÓĄŚŁŻŹĆŃ. Chinese actually effectively disables linktrails
(disallows everything). This is defined using $linkTrail variables in
files like MessagesEn.php etc.

* There is also something called "linkprefix", used by e.g. Arabic
(MessagesAr.php uses $linkPrefixExtension = true). I am not sure how
this feature works, but it probably complicates everything a bit.
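
As an illustration of the first point, here is a small sketch of per-language trail matching. The patterns are modelled on the description above and are assumptions, not copies of the actual $linkTrail values; the table LINK_TRAILS and the function split_trail are hypothetical names.

```python
import re

# Assumed per-language trail patterns: group 1 is the trail that joins the
# link, group 2 is the remaining text. The real definitions live in
# MediaWiki's Messages*.php files and may differ.
LINK_TRAILS = {
    "en": re.compile(r"^([a-z]+)(.*)$", re.DOTALL),
    "pl": re.compile(r"^([a-zęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+)(.*)$", re.DOTALL),
    # An empty first group means nothing is ever treated as a trail.
    "zh": re.compile(r"^()(.*)$", re.DOTALL),
}


def split_trail(lang, text_after_link):
    """Split the text following "]]" into (trail, rest) for one language."""
    m = LINK_TRAILS[lang].match(text_after_link)
    if m is None:
        # No trail character directly after the link.
        return "", text_after_link
    return m.group(1), m.group(2)


print(split_trail("en", "s are fine"))  # ('s', ' are fine')
print(split_trail("pl", "ów rest"))     # ('ów', ' rest')
print(split_trail("en", "ów rest"))     # ('', 'ów rest')
print(split_trail("zh", "abc"))         # ('', 'abc')
```

The same text after a link can therefore produce a different trail depending on the wiki's content language, which is why any syntax change would need checking against each language's pattern.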

--
Bartosz Dziewoński


Re: non-obvious uses of <nowiki> in your language

Trey Jones
Thanks for the technical details, Bartosz!

One would hope (but should confirm) that link prefixes are treated with the
same basic logic as link postfixes/trails, so assuming pre- and post-link
trails are enabled, "pre[[target]]post" is all linked, but
"pre[[target|linktext]]post" is only linked on "linktext", and intermediate
cases can be spelled out as "[[target|pre+target]]post" or
"pre[[target|target+post]]".

Overall, it sounds like reasonable default shortcut behavior that can
easily be overridden with a fully-specified link.
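
For what it's worth, the rule as I understand it fits in a few lines. This is only a sketch: split_link is a hypothetical name, and both prefix and trail are assumed to be ASCII a-z, which would not hold for Arabic or other languages that actually use link prefixes.

```python
import re

# Hypothetical pattern: optional prefix, a link with optional pipe,
# optional trail. Prefix and trail are assumed to be ASCII a-z only.
LINK = re.compile(r"([a-z]*)\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]([a-z]*)")


def split_link(wikitext):
    """Return (unlinked_before, linked, unlinked_after) for the first link,
    following the proposed rule; None if there is no link."""
    m = LINK.search(wikitext)
    if m is None:
        return None
    pre, target, text, post = m.groups()
    if text is None:
        # Shortcut form: prefix and trail both join the link.
        return "", pre + target + post, ""
    # Fully specified form: only the given link text is linked.
    return pre, text, post


print(split_link("pre[[target]]post"))           # ('', 'pretargetpost', '')
print(split_link("pre[[target|linktext]]post"))  # ('pre', 'linktext', 'post')
print(split_link("[[target|pretarget]]post"))    # ('', 'pretarget', 'post')
```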

> Sounds like a cute small syntax improvement! :)


Exactly!

On Mon, Oct 15, 2018 at 12:01 PM, Bartosz Dziewoński <[hidden email]>
wrote:
