LanguageConverter and searching with the incorrect keyboard layout


Amir E. Aharoni
A problem familiar to people who use non-Latin
alphabets on computers is that they sometimes forget to switch the
keyboard layout and type a whole word or even a sentence of gibberish
before they notice. For example, people who use a Cyrillic keyboard
may search Google for "цшлшзувшф" when they actually meant to search
for "wikipedia", and vice versa: "dbrbgtlbz" when they meant
"википедия" (that's "wikipedia" in Russian). The Google search engine
has been aware of this for a few years now and often automatically searches
for the desired term in a DWIM ("do what I mean") manner.

Wikipedia's own search engine is not aware of it yet. A user in the
Hebrew Wikipedia had this idea: Maybe LanguageConverter can be used
for it? Common keyboard layouts can be mapped to each other, just as the
two Serbian alphabets are mapped to each other today, and
Special:Search is already aware of LanguageConverter.
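
As an illustration of the layout-to-layout mapping, here is a minimal
Python sketch. It assumes a standard US QWERTY layout and the standard
Russian ЙЦУКЕН layout; the names QWERTY, JCUKEN and fix_layout are
made up for this example and are not part of LanguageConverter.

```python
# Keys listed in the same physical order on both layouts (assumed:
# standard US QWERTY and standard Russian layout, lowercase only).
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`"
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбюё"

# One translation table per direction.
LATIN_TO_CYRILLIC = str.maketrans(QWERTY, JCUKEN)
CYRILLIC_TO_LATIN = str.maketrans(JCUKEN, QWERTY)


def fix_layout(text, table):
    """Re-type `text` as if the other keyboard layout had been active."""
    return text.lower().translate(table)


print(fix_layout("цшлшзувшф", CYRILLIC_TO_LATIN))  # wikipedia
print(fix_layout("dbrbgtlbz", LATIN_TO_CYRILLIC))  # википедия
```

Characters outside the table (spaces, digits) pass through unchanged,
so whole phrases can be converted in one call.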

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
"We're living in pieces,
 I want to live in peace." - T. Moore

_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: LanguageConverter and searching with the incorrect keyboard layout

cngxzl
strtr() is sufficient. The current LC is based on it.
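
For readers who don't know PHP: with an array of replacement pairs,
strtr() tries the longest keys first and never rescans text it has
already replaced. A rough Python analogue is sketched below. The
function name and the tiny Serbian Latin-to-Cyrillic table are
illustrative assumptions, not LanguageConverter's actual code or data.

```python
import re


def strtr(text, pairs):
    """Rough analogue of PHP's strtr($str, $replace_pairs)."""
    keys = [k for k in pairs if k]  # assumption: empty keys are skipped
    if not keys:
        return text
    # Longest keys first, so the alternation prefers the longest match.
    keys.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # A single left-to-right pass: replaced text is never rescanned.
    return pattern.sub(lambda m: pairs[m.group(0)], text)


# Illustrative subset of a Latin-to-Cyrillic table with one digraph.
table = {"lj": "љ", "l": "л", "j": "ј", "u": "у", "b": "б", "a": "а", "v": "в"}
print(strtr("ljubav", table))  # љубав
```

Longest-first matching is what keeps the digraph "lj" from being
converted as two separate letters.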

2011/6/10 Amir E. Aharoni <[hidden email]>

> Wikipedia's own search engine is not aware of it yet. A user in the
> Hebrew Wikipedia had this idea: Maybe LanguageConverter can be used
> for it? Common keyboard layouts can be mapped to each other, like the
> two Serbian alphabets are mapped to each other today, and
> Special:Search is already aware of LanguageConverter.

Re: LanguageConverter and searching with the incorrect keyboard layout

Robert Stojnic-2
In reply to this post by Amir E. Aharoni

Google is not aware of this either. It works for certain queries like
wikipedia (probably because many people misspell it in Cyrillic for
fun), but try a more general query (e.g. университз оф оџфорд), and it
won't return any results.

r.

On 10/06/11 06:46, Amir E. Aharoni wrote:

> A problem familiar to people who use non-Latin
> alphabets on computers is that they sometimes forget to switch the
> keyboard layout and type a whole word or even a sentence of gibberish
> before they notice. For example, people who use a Cyrillic keyboard
> may search Google for "цшлшзувшф" when they actually meant to search
> for "wikipedia", and vice versa: "dbrbgtlbz" when they meant
> "википедия" (that's "wikipedia" in Russian). The Google search engine
> has been aware of this for a few years now and often automatically searches
> for the desired term in a DWIM ("do what I mean") manner.

Re: LanguageConverter and searching with the incorrect keyboard layout

Max Semenik
On Fri, Jun 10, 2011 at 2:16 PM, Robert Stojnic <[hidden email]> wrote:

>
> Google is not aware of this either. It works for certain queries like
> wikipedia (probably because many people misspell it in Cyrillic for
> fun), but try a more general query (e.g. университз оф оџфорд), and it
> won't return any results.
>

He was referring to search terms typed in the wrong keyboard layout, not
transliteration, i.e. "гтшмукышешуы ща щчащкв" instead of what you wrote for
"universities of oxford". Google doesn't really handle this query either, by
the way. But yes, I have personally felt the need for such a feature and was
thinking of implementing it myself. Ideally, this should be a
backend-independent extension that hooks into Special:Search directly.
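
One way to read "backend-independent" is a thin wrapper that retries
the query on the other layout only when the original finds nothing.
The Python sketch below shows that shape. The backend is any callable
returning a list of results; all names here are hypothetical, and this
is not MediaWiki's hook API.

```python
# Assumed layouts: standard US QWERTY and standard Russian, lowercase.
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,.`"
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбюё"
TO_LATIN = str.maketrans(JCUKEN, QWERTY)
TO_CYRILLIC = str.maketrans(QWERTY, JCUKEN)


def layout_variants(query):
    """Return the query re-typed on the other layout, if that changes it."""
    lowered = query.lower()
    variants = []
    for table in (TO_LATIN, TO_CYRILLIC):
        candidate = lowered.translate(table)
        if candidate != lowered and candidate not in variants:
            variants.append(candidate)
    return variants


def search_with_layout_fallback(query, backend):
    """Try the query as typed; on zero results, retry each layout variant."""
    results = backend(query)
    if results:
        return query, results
    for variant in layout_variants(query):
        results = backend(variant)
        if results:
            return variant, results
    return query, []


# Toy backend standing in for a real search engine.
index = {"universities of oxford": ["University of Oxford"]}
print(search_with_layout_fallback("гтшмукышешуы ща щчащкв",
                                  lambda q: index.get(q, [])))
```

Returning the query that actually matched lets the caller show a
"showing results for ..." notice instead of silently rewriting the
search.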

Re: LanguageConverter and searching with the incorrect keyboard layout

Robert Stojnic-2

My query wasn't a transliteration, but typed with a Serbian Cyrillic
layout.

r.

On 10/06/11 11:58, Max Semenik wrote:

> He was referring to search terms typed in the wrong keyboard layout, not
> transliteration, i.e. "гтшмукышешуы ща щчащкв" instead of what you wrote for
> "universities of oxford". Google doesn't really handle this query either, by
> the way. But yes, I have personally felt the need for such a feature and was
> thinking of implementing it myself. Ideally, this should be a
> backend-independent extension that hooks into Special:Search directly.

Re: LanguageConverter and searching with the incorrect keyboard layout

Max Semenik
On Fri, Jun 10, 2011 at 3:07 PM, Robert Stojnic <[hidden email]> wrote:

>
> My query wasn't a transliteration, but typed with a Serbian Cyrillic
> layout.
>

So it's phonetically equivalent to QWERTY? Neat.

--
Best regards,
Max Semenik ([[User:MaxSem]])

Re: LanguageConverter and searching with the incorrect keyboard layout

Aryeh Gregor
In reply to this post by Robert Stojnic-2
On Fri, Jun 10, 2011 at 6:16 AM, Robert Stojnic <[hidden email]> wrote:
> Google is not aware of this either. It works for certain queries like
> wikipedia (probably because many people misspell it in Cyrillic for
> fun), but try a more general query (e.g. университз оф оџфорд), and it
> won't return any results.

It seems to work consistently if you type English using a Hebrew
layout, even for rather uncommon search terms:

http://www.google.com/search?q=%D7%A8%D7%9D%D7%A0%D7%A7%D7%A8%D7%90+%D7%93%D7%90%D7%9D%D7%97%D7%9E%D7%9F%D7%91

And the reverse:

http://www.google.com/search?q=tnhr+tvrubh

Maybe it just doesn't know about Serbian Cyrillic keyboard layouts,
but does know about Hebrew keyboard layouts.
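
To see what the two URLs above actually search for, the Python sketch
below decodes each query and re-types it on the other layout. It
assumes the standard Hebrew layout and maps letter keys only
(punctuation keys are left out, so it is a partial table). The
constant and function names are made up for this example.

```python
from urllib.parse import parse_qs, urlsplit

# Letter keys only, in matching order (assumed: standard Hebrew layout).
LATIN_KEYS = "ertyuiopasdfghjklzxcvbnm"
HEBREW_KEYS = "קראטוןםפשדגכעיחלךזסבהנמצ"
HEBREW_TO_LATIN = str.maketrans(HEBREW_KEYS, LATIN_KEYS)
LATIN_TO_HEBREW = str.maketrans(LATIN_KEYS, HEBREW_KEYS)


def query_from_url(url):
    """Extract and percent-decode the q= parameter of a search URL."""
    return parse_qs(urlsplit(url).query)["q"][0]


hebrew_typed = query_from_url(
    "http://www.google.com/search?q=%D7%A8%D7%9D%D7%A0%D7%A7%D7%A8%D7%90"
    "+%D7%93%D7%90%D7%9D%D7%97%D7%9E%D7%9F%D7%91"
)
latin_typed = query_from_url("http://www.google.com/search?q=tnhr+tvrubh")

print(hebrew_typed.translate(HEBREW_TO_LATIN))  # robert stojnic
print(latin_typed.translate(LATIN_TO_HEBREW))   # אמיר אהרוני
```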


Re: LanguageConverter and searching with the incorrect keyboard layout

Amir E. Aharoni
In reply to this post by cngxzl
OK, but if I understand correctly, LanguageConverter is already
integrated with the search engine.

2011/6/10 Philip Tzou <[hidden email]>:

> strtr() is sufficient. The current LC is based on it.

Re: LanguageConverter and searching with the incorrect keyboard layout

Mark A. Hershberger
In reply to this post by Aryeh Gregor
Aryeh Gregor <[hidden email]> writes:

> Maybe it just doesn't know about Serbian Cyrillic keyboard layouts,
> but does know about Hebrew keyboard layouts.

My understanding of how this works (probably wrong) is that Google
looks at what searches people perform back-to-back. So perhaps there
are more people using Hebrew keyboard layouts than Serbian Cyrillic
layouts?

Mark.
