[Wikimedia-l] adding visual search for Wikipedia

Tomás O'Hara
Hi, here's a link to a proposal I have for adding visual search to Wikipedia:
        via user tomasohara [https://meta.wikimedia.org/wiki/User:Tomasohara]
I can move it into a better location if desired, as it is not a "sister
project" proper. The Proposals for new projects page
[https://meta.wikimedia.org/wiki/Proposals_for_new_projects] doesn't
offer suggestions for alternative postings, so I left it there for

Below is a copy of the project overview. See the link above for
details on how this can be applied to foreign-language Wikipedias.
Note that most can be supported right "out of the box" except for the
text categorization used to select images for documents without
images. A Wikipedia-specific way to do this might be possible (e.g.,
based on the hierarchy of pages). Otherwise, this is something I
intend to do for the top languages on the web. (Perhaps some grant can
be acquired to do most of the rest, requiring about 40 hours per



It would be good for Wikipedia to use a general-purpose visual search
front end. Note that a big incentive for this is that users will be
drawn to Wikipedia to use this type of search rather than Google
Search or Bing. This would be good because these search engines often
show Wikipedia content for popular entities like sports stars or
tourist attractions, which cuts down on Wikipedia traffic.

You will be able to use the visual search front end I developed without
charge for the duration of my patent in the works (i.e., license-free).
Here is a simple example with Wikipedia search on the left and my Scrappy
Search on the right (i.e., white vs. tan backgrounds):

Two other examples illustrate some added benefits of this visual
search with respect to Wikipedia. First, disambiguation becomes based
on images and keywords rather than just snippets of text. See the

In addition, links to other pages for the same entity become much more engaging:

See http://www.scrappycito.com for the stable version of the system
and http://www.tomasohara.trade:9330 for the work-in-progress version.
The latter has support for handheld devices and also better aesthetics
(n.b., this is the version used in the examples).

I think this will be extremely popular with the Instagram crowd and
younger users in general (e.g., younger than 30). To do similar
Wikipedia-specific searches with the visual search front end, just add
site:en.wikipedia.org to the query, as in the following example:
        Lionel Messi  site:en.wikipedia.org
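
For anyone who wants to script this, here is a minimal sketch of how such
a site-restricted query string might be built and URL-encoded. The
function names (restrict_to_site, encode_query) are hypothetical and are
not part of Scrappy Search; the sketch only prepares the query text and
does not call any search API.

```python
from urllib.parse import quote_plus


def restrict_to_site(query: str, site: str = "en.wikipedia.org") -> str:
    """Append a site: operator, unless the query already contains one.

    Hypothetical helper for illustration only.
    """
    if "site:" in query:
        return query
    return f"{query.strip()} site:{site}"


def encode_query(query: str) -> str:
    """URL-encode the query so it can be placed in a query-string parameter."""
    return quote_plus(query)


if __name__ == "__main__":
    query = restrict_to_site("Lionel Messi")
    print(query)                # Lionel Messi site:en.wikipedia.org
    print(encode_query(query))  # Lionel+Messi+site%3Aen.wikipedia.org
```

Leaving a query untouched when it already has a site: operator means a
search aimed at another Wikipedia (e.g., site:de.wikipedia.org) is not
overridden.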

Scrappy Search uses the Google search API, so all of the search
operators are supported. See

The patent for this visual search will be owned by my company
ScrappyCito, LLC. If the company gets acquired, I will require that
they honor the license-free usage of the visual search system by
Wikimedia for Wikipedia. (They will likewise be required to pass along
this license-free usage requirement if they in turn are acquired,
and so on.) You will have access to the current source code for use in
Wikipedia and other approved projects.

I am doing this both for exposure and because I want to help keep
Wikipedia viable (e.g., by enabling higher traffic). This is a great
way for users to browse the encyclopedia, so it can keep users on the
Wikipedia domain longer.

If this sounds interesting, I can develop a prototype for the Simple
English Wikipedia for use on one of my servers. After review, I can
help with the deployment for the regular English Wikipedia on your
servers once approved.

Tom O'Hara, founder
ScrappyCito, LLC.
PO Box 6430
Austin, TX 78762-6430
[hidden email]

Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: [hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>