customizing wiki search


customizing wiki search

Liz Kim
Hi,
We have a wiki directory site that is a simple table of names, phone
numbers, etc.
I was thinking about writing a search script that does a lookup: go through
the table and match on anything in the content.  I can think of two ways to
approach this:
1. Somehow customize the search to search ONLY within this directory
page.
2. Write a script that queries the database directly to do the search.
Any input or suggestions?
Thank you
_______________________________________________
Wikitech-l mailing list
[hidden email]
http://mail.wikipedia.org/mailman/listinfo/wikitech-l

Re: customizing wiki search

Lars Aronsson
Liz Kim wrote:

> We have a wiki directory site which is a simple table with names, phone
> numbers. ect..
> I was thinking about creating a search script to do a look up, go through
> this file and find by anything in the content.  I can think of two ways to
> approach this..
> 1. Somehow customize the search to ONLY search within the page for this
> directory page.
> 2. Write a script that goes into the database to do a search..
> Any inputs/suggestions?

If you use a MediaWiki template to enter the information in the
tables, you get a kind of semantic markup of the data. This can
then be harvested, either from an XML dump, or by modifying the
MediaWiki software to do the harvesting when a page is saved.

Instead of writing in the page:

{|
|-
! Name || Phone No.
|-
| Lars || 47
|-
| Liz  || 32
|}

You can write:

{|
|-
! Name || Phone No.

{{phonebookentry|name=Lars|no=47}}
{{phonebookentry|name=Liz|no=32}}

|}

That kind of markup is a lot easier to harvest and analyze,
because it hints at what the values are supposed to mean. And then
you let the Template:Phonebookentry contain this:

|-
| {{{name}}} || {{{no}}}

One such harvesting attempt for Wikipedia's contents is described
on http://meta.wikimedia.org/wiki/User:LA2/Extraktor
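
A minimal sketch of the harvesting idea in Python, assuming the page's
wikitext has already been fetched (from an XML dump or otherwise) and
that entries use the phonebookentry template exactly as in the example
above. The function names harvest_entries and search_entries are
hypothetical, and the regular expression only handles flat name=value
parameters with no nested templates.

```python
import re

# Matches {{phonebookentry|...}} calls. The template name comes from the
# example in this message; nested templates inside a call are not handled.
ENTRY_RE = re.compile(r"\{\{\s*phonebookentry\s*\|([^{}]*)\}\}", re.IGNORECASE)


def harvest_entries(wikitext):
    """Return one dict of named parameters per phonebookentry call."""
    entries = []
    for match in ENTRY_RE.finditer(wikitext):
        params = {}
        for part in match.group(1).split("|"):
            key, sep, value = part.partition("=")
            if sep:  # ignore unnamed (positional) parameters
                params[key.strip()] = value.strip()
        entries.append(params)
    return entries


def search_entries(entries, term):
    """Case-insensitive substring match against every field value."""
    term = term.lower()
    return [e for e in entries if any(term in v.lower() for v in e.values())]


# The sample page from this message, used as test input.
page_text = """{|
|-
! Name || Phone No.

{{phonebookentry|name=Lars|no=47}}
{{phonebookentry|name=Liz|no=32}}

|}"""

entries = harvest_entries(page_text)
print(search_entries(entries, "liz"))
```

Because each value arrives labelled with its parameter name, the search
could just as easily be restricted to one field (for example only "no").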


--
  Lars Aronsson ([hidden email])
  Aronsson Datateknik - http://aronsson.se

Re: customizing wiki search

Andre Engels
In reply to this post by Liz Kim
2006/9/8, Liz Kim <[hidden email]>:

> Hi,
> We have a wiki directory site which is a simple table with names, phone
> numbers. ect..
> I was thinking about creating a search script to do a look up, go through
> this file and find by anything in the content.  I can think of two ways to
> approach this..
> 1. Somehow customize the search to ONLY search within the page for this
> directory page.
> 2. Write a script that goes into the database to do a search..
> Any inputs/suggestions?
> Thank you

Number 1 could be done by making the directory a separate namespace,
or by putting a template on all those pages and searching for the wanted
text in combination with the text of the template.
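
A rough sketch of the namespace variant in Python, assuming the
directory lives in a custom namespace (custom namespaces are declared
through $wgExtraNamespaces in LocalSettings.php). Special:Search limits
a query to a namespace when the request carries ns<number>=1. The base
URL and the namespace number 100 below are made-up placeholders, and
directory_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def directory_search_url(base_url, term, namespace_id):
    """Build a Special:Search URL limited to a single namespace."""
    query = urlencode({
        "title": "Special:Search",
        "search": term,
        f"ns{namespace_id}": 1,  # ns<number>=1 selects that namespace
    })
    return f"{base_url}?{query}"


# Placeholder wiki address and namespace number, for illustration only.
print(directory_search_url("http://wiki.example.org/index.php", "Lars", 100))
```

A small search form on the directory page could submit the same
parameters, so users never see results from the rest of the wiki.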

--
Andre Engels, [hidden email]
ICQ: 6260644  --  Skype: a_engels

Re: customizing wiki search

Markus Krötzsch
In reply to this post by Lars Aronsson
On Friday 08 September 2006 09:40, Lars Aronsson wrote:

> Liz Kim wrote:
> > We have a wiki directory site which is a simple table with names, phone
> > numbers. ect..
> > I was thinking about creating a search script to do a look up, go through
> > this file and find by anything in the content.  I can think of two ways
> > to approach this..
> > 1. Somehow customize the search to ONLY search within the page for this
> > directory page.
> > 2. Write a script that goes into the database to do a search..
> > Any inputs/suggestions?
>
> If you use a MediaWiki template to enter the information in the
> tables, you get a kind of semantic markup of the data. This can
> then be harvested, either from an XML dump, or by modifying the
> MediaWiki software to do the harvesting when a page is saved.

Or you could use Semantic MediaWiki [1] to enter the data (this can also be
combined with a template, but need not be, so the syntax of the final pages
could be similar to the template approach). The software then does the
extraction for you and provides the data in RDF/XML format. We showed at
Wikimania 2006 how seven lines of PHP suffice to load data from this format,
even on the fly and over the web (some slides for the tutorial are at [2]).
Most other common programming languages have good RDF support as well.

But maybe you would not even need the extraction, since Semantic MediaWiki
already has some built-in search functions (which may or may not be useful
for your setting).

We also use Semantic MediaWiki in our group wiki to store our telephone
numbers. We do it by putting the numbers on the user pages of our members. A
list of all telephone numbers is then created automatically elsewhere in
the wiki, and you can directly search for numbers by person. If you do not
want to have extra articles for everything that has a telephone number, then
Semantic MediaWiki can probably help with only part of the extraction (e.g.
you could get strings of the form "Name: some number" and continue processing
these). At least you avoid parsing the wiki articles yourself.
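
As an illustration of that last step, here is a small Python sketch
that continues processing strings of the form "Name: some number". It
assumes the strings have already been obtained from the wiki and that
the first colon separates the name from the number; the function names
and the sample values are hypothetical.

```python
def parse_directory_lines(lines):
    """Map each name to the list of numbers found for it."""
    directory = {}
    for line in lines:
        name, sep, number = line.partition(":")
        name, number = name.strip(), number.strip()
        if not sep or not name or not number:
            continue  # skip anything that is not "Name: number"
        directory.setdefault(name, []).append(number)
    return directory


def find_by_number(directory, fragment):
    """Return the names that have a number containing the given fragment."""
    return sorted(
        name
        for name, numbers in directory.items()
        if any(fragment in n for n in numbers)
    )


# Made-up sample input, reusing the names and numbers from this thread.
sample = ["Lars: 47", "Liz: 32", "not a directory line"]
directory = parse_directory_lines(sample)
print(directory)
print(find_by_number(directory, "32"))
```

Keeping a list of numbers per name allows for people with more than one
entry without overwriting earlier values.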

Cheers,

Markus

[1] http://ontoworld.org/wiki/Semantic_MediaWiki
[2] http://wikimania2006.wikimedia.org/wiki/Proceedings:MK1


--
Markus Krötzsch
Institute AIFB, University of Karlsruhe, D-76128 Karlsruhe
[hidden email]        phone +49 (0)721 608 7362
www.aifb.uni-karlsruhe.de/WBS/     fax +49 (0)721 693  717
