[Wikimedia-l] adding visual search for Wikipedia

[Wikimedia-l] adding visual search for Wikipedia

Tomás O'Hara
Hi, here's a link to a proposal I have for adding visual search to
Wikipedia:

https://meta.wikimedia.org/wiki/Use_visual_search_frontend_for_Wikipedia
<https://meta.wikimedia.org/wiki/Use_visual_search_frontend_for_Wikipedia#Proposed_by>

(This was created with user ID: tomasohara
<https://en.wiktionary.org/wiki/User:Tomasohara>.) I can move it to a
better location if desired, as it is not a "sister project" proper. The
Proposals for new projects
<https://meta.wikimedia.org/wiki/Proposals_for_new_projects> page doesn't
offer suggestions for alternative postings, so I left it there for now.

Below is a copy of the project overview. See the link above for details on
how this can be applied to foreign language wikipedias. Note that most can
be supported right "out of the box" except for the text categorization used
to select images for documents without images. A Wikipedia-specific way to
do this might be possible (e.g., based on the hierarchy of pages).

Best,
Tom

---------

It would be good for Wikipedia to use a general-purpose visual search front
end. Note that a big incentive for this is that users will be drawn to
Wikipedia to use this type of search rather than Google Search or Bing.
This would be beneficial because these search engines often show Wikipedia
content for popular entities like sports stars or tourist attractions,
which cuts down on Wikipedia traffic.

You will be able to use the visual search frontend I developed without
charge for the duration of my patent in the works (i.e., license-free). Here
is a link to an example with Wikipedia search on the left and my Scrappy
Search on the right:

http://www.scrappycito.com/wikipedia-vs-scrappy-search-small-dog-breeds-en-wiki-site.png


Two other examples illustrate some added benefits of this visual search
with respect to Wikipedia. First, disambiguation becomes based on images
and keywords rather than just snippets of text. See the following:

http://www.scrappycito.com/wikipedia-vs-scrappy-search-bob-jones-en-wiki-site.png

In addition, links to other pages for the same entity become much more
engaging:

http://www.scrappycito.com/wikipedia-vs-scrappy-search-taylor-swift-en-wiki-site.png


See http://www.scrappycito.com for the stable version of the system and
http://www.tomasohara.trade:9330 for the work-in-progress version. The
latter has support for handheld devices and also better aesthetics (n.b.,
version used in examples).

I think this will be extremely popular with the Instagram crowd and younger
users in general (e.g., younger than 30). To do similar Wikipedia-specific
searches with the visual search front end, just add site:en.wikipedia.org
to the query, as in the following example:

Lionel Messi  site:en.wikipedia.org

Scrappy Search uses the Google search API, so all of the search operators
<https://support.google.com/websearch/answer/2466433?hl=en> are supported.
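The query restriction described above can be sketched in a few lines of Python. This is only an illustration of appending the site: operator to a query string; the function name wikipedia_query and its lang parameter are hypothetical and are not taken from the Scrappy Search code.

```python
def wikipedia_query(terms, lang="en"):
    """Append a site: operator so results come from one Wikipedia edition.

    Hypothetical helper for illustration; `lang` is the language
    subdomain, e.g. "en" or "simple".
    """
    return f"{terms.strip()} site:{lang}.wikipedia.org"


if __name__ == "__main__":
    print(wikipedia_query("Lionel Messi"))
```

Passing lang="simple" would target the Simple English Wikipedia in the same way.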

The patent for this visual search will be owned by my company ScrappyCito,
LLC. If the company gets acquired, I will require that they honor the
license-free usage of the visual search system by Wikimedia for Wikipedia.
(They will likewise be required to pass along this license-free usage
requirement if they in turn are acquired). You will have access to the
current source code for use in Wikipedia and other approved projects.

I am doing this both for exposure and because I want to help keep Wikipedia
viable (e.g., by enabling higher traffic). This is a great way for users to
browse the encyclopedia, so it can keep users on the Wikipedia domain
longer.

If this sounds interesting, I can develop a prototype for the Simple
English Wikipedia for use on one of my servers. After review, I can help
with the deployment for the regular English Wikipedia on your servers once
approved.

==============================================================
Tom O'Hara, founder ScrappyCito, LLC.              PO Box 6430
[hidden email]                     Austin, TX 78762-6430
737-203-1577                               www.scrappycito.com
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: [hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>

Re: [Wikimedia-l] adding visual search for Wikipedia

Jeremy Lee-Jenkins
Tom O'Hara,

I foresee several problems with your proposal.

a) The Wikimedia Foundation itself spent a very large amount of money
building something essentially the same which was rejected by the community
and abandoned.
 https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)
<https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)>

b) Your proposal is for proprietary software to be added to the core
MediaWiki software of Wikipedia. The Wikimedia Foundation is notorious for
never using third-party proprietary software.

c) The design is still in the stone age when compared to Bing/Google, so
would not necessarily compete well to attract the target demographic.

d) The search bar at https://www.wikipedia.org/ already has images in a
kind of drop-down search suggestions function. This is nice, but it has not
become very popular.

I would actually suggest you go down the route of offering it as an
extension on https://www.mediawiki.org/wiki/Category:Extensions

Warm Regards

Jeremy Lee-Jenkins

On Fri, Jul 6, 2018 at 3:07 PM, Tomás O'Hara <[hidden email]> wrote:

> Hi, here's a link to a proposal I have for adding visual search to
> Wikipedia:
>
> [...]

Re: [Wikimedia-l] adding visual search for Wikipedia

Tomás O'Hara
Hi, thanks for pointing out those issues. The issue of proprietary
software is the only potential showstopper, but that can be addressed
by introducing the notion of "wikimedia-friendly sharing", rather than
"unrestricted sharing". Basically, the visual search engine source
code can be copied by anyone, and the burden will be on ScrappyCito to
track down wikimedia-unfriendly organizations using the software. See
the comments-in-context below for details.

Unfortunately, there are some subjective aspects in the issues raised,
which I think are due to a misunderstanding of the system. In short,
this is a general-purpose visual search front end: a search of "spirit of
wikimedia" on each of Google, Bing, and Scrappy Search shows
which interfaces are truly paleolithic (e.g., pre-2000)! See
        http://www.scrappycito.com/bgs-spirit-of-wikipedia-24jul18.png

Sorry for the long delay. This was my first post to a Wikipedia
forum, and I overlooked the response. I also overlooked the initial
display of my posting, as can be seen by my reposting on the 11th.
Unfortunately, I was then engulfed in an extended finalization for my
patent application.

Again, more details are below, elaborating on the main concerns raised.

Best,
Tom


> Tom O'Hara,
>
> I foresee several problems with your proposal.
>
> a) The Wikimedia Foundation itself spent a very large amount of money
> building something essentially the same which was rejected by the community
> and abandoned.
>  https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)
> <https://en.wikipedia.org/wiki/Knowledge_Engine_(Wikimedia_Foundation)>

That was overly ambitious, basically trying to be a search engine as
well as a structured data integrator. Moreover, it was also highly
controversial due to the lack of transparency at the start. This
alienated many people in the wikimedia community.

Furthermore, the example on the page for the Knowledge Engine shows
that not all search result entries have associated images. For
example, both United Nations Security Council and Cyclone Pam had
images when the example screen shot was taken.

This is basically another example of a stylish visual search that only
shows a subset of the results with images in order to accommodate the
display of related information. See Danny Sullivan's critique of such
visual search engines in his Search Engine Land article.
         Danny Sullivan (2006), "Visual Search The Future? Spare Me The Eye Candy",
         https://searchengineland.com/visual-search-the-future-spare-me-the-eye-candy-14279

> b) Your proposal is for a proprietary software to be added to the core
> Mediawiki software of Wikipedia. The Wikimedia Foundation is notorious for
> never using third-party proprietary software.

Does that include third-party proprietary software for which the
source code is available? That is the only substantial objection
raised. I was surprised that the Wikimedia Foundation imposes such a
stringent requirement for server software. Based on
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Guiding_Principles,
the main reason seems to be to streamline the process of customizing
wikimedia content and supporting code. This is in the spirit of
unrestricted sharing.

However, I think it is reasonable for the Wikimedia Foundation to
adopt a restricted form of sharing with respect to server software,
provided the source code is available. Basically, the wikimedia
software can be downloaded and run by anyone in good standing with
the wikimedia community. The intention will be that the software can
be used by any wikimedia-friendly organization, such as schools and
public agencies which promote open sharing. It would then be up to
ScrappyCito to track down ineligible organizations that are using the
software.

> c) The design is still in the stone age when compared to Bing/Google, so
> would not necessarily compete well to attract the target demographic.

I believe this is due to a misunderstanding. An important constraint
is that all results have an image. This is not the case for Google or
Bing, especially for abstract queries. If you are referring just to
image-only results, I agree the interface is primitive with respect to
image search provided by Google or Bing. However, this is a general
purpose front end, so all results will have associated images. For
example, Google and Bing only show a few images for "boring topic",
and neither shows any for "abstruse topic": Scrappy Search always
shows images.

Moreover, this is particularly designed to take advantage of the high
resolution provided by tablets. I know of no visual search engine front
end that allows users to re-arrange and re-size the results using
gestures common to modern handheld devices. Grade school students will
surely like this feature; see
        http://www.tomasohara.trade:9330/static/night-owl-sample.png
The interface is streamlined on smartphones, but it nonetheless is
quite usable for casual browsing.

> d) The search bar at https://www.wikipedia.org/ already has images in a
> kind of drop-down search suggestions function, this is nice, but has not
> become very popular.

That only works for disambiguation and just shows limited information
in the menu (e.g., title plus characterization [if available]). Images
are not always included even though some of the pages contain images.
For example, searching for "garden party" just shows a generic
document icon. It would be more useful if it could include other
articles as well. That would be on-the-fly search. Organizing it into a
grid would be more space efficient. The end result would be something
like Scrappy Search.
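As a rough sketch of what on-the-fly search with thumbnails laid out in a grid could look like against the public MediaWiki API (this is not the Scrappy Search implementation, which is not shown in this thread): the helper names thumbnail_search_url and to_grid and the default limit, thumbnail size, and column count are assumptions for illustration. The generator=prefixsearch and prop=pageimages modules are existing API modules, the latter provided by the PageImages extension. The code only builds the request URL and makes no network call.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# English Wikipedia endpoint of the MediaWiki Action API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def thumbnail_search_url(prefix, limit=12, thumb_px=160):
    """Build a URL asking for prefix-search matches plus a thumbnail each.

    Hypothetical helper; the default values are illustrative assumptions.
    Pages without a page image simply come back without a thumbnail.
    """
    params = {
        "action": "query",
        "format": "json",
        "generator": "prefixsearch",  # titles starting with `prefix`
        "gpssearch": prefix,
        "gpslimit": limit,
        "prop": "pageimages",         # from the PageImages extension
        "piprop": "thumbnail",
        "pithumbsize": thumb_px,
        "pilimit": limit,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def to_grid(items, columns=4):
    """Split a flat list of results into rows of `columns` entries."""
    return [items[i:i + columns] for i in range(0, len(items), columns)]


if __name__ == "__main__":
    print(thumbnail_search_url("garden party"))
    print(to_grid(["a", "b", "c", "d", "e"], columns=2))
```

The grid helper is deliberately separate from the query builder, so the same layout could be applied to results from any search backend.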

> I would actually suggest you go down the route of offering it as an
> extension on https://www.mediawiki.org/wiki/Category:Extensions

Thanks for pointing out the MediaWiki extensions: I was unaware of
them, so I will do some experimentation. Being able to customize
wikimedia interfaces can be a useful skill.

However, this is not attractive in this case because casual wikimedia
users will not likely download the extension. It also looks a bit
complicated to implement. For instance, parts of the interface would
need to be rewritten to work as part of MediaWiki.

> Warm Regards
>
> Jeremy Lee-Jenkins
>

Best,
Tom

