idea: Wiki-Index

idea: Wiki-Index

András Kardos
An idea. I'll try to be short.

Wikipedia has a lot of information, and it is heavily crosslinked. But it's not
indexed. I mean an index of people, an index of places and an index of things.
And events. And countries. And lakes. And whatever. Each index is a table (in
database terms), with a few required fields. You could then add a page (or a part
of it) to an index (or to several indexes) by specifying these required fields of
the index (probably in the wiki source). The MediaWiki software would create real
database tables based on this information.
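
As a rough illustration of the "index as a table with required fields" idea,
here is a minimal Python sketch. The `Index` class, its method names and the
sample record are assumptions made for illustration only; nothing in this
thread says MediaWiki provides such a structure.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """A named index with a fixed set of required fields (hypothetical model)."""

    name: str
    required_fields: tuple
    records: list = field(default_factory=list)

    def add(self, page_title, **fields):
        """Add a page to the index; reject records lacking a required field."""
        missing = [f for f in self.required_fields if f not in fields]
        if missing:
            raise ValueError(f"{self.name}: missing required fields {missing}")
        record = {"page": page_title, **fields}
        self.records.append(record)
        return record


# An index of people that requires a name and a birth date.
people = Index("people", ("name", "born"))
people.add("Ada Lovelace", name="Ada Lovelace", born="1815-12-10")
```

Adding a record without `born` raises `ValueError`, which is one way the
"required fields" rule could be enforced when a page is saved.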
 
Using this you could look up people who were born or died, or things that
happened, on a given day. Or things that happened in Tokyo, or in 1923, and put
them on a Google Map. Look at Wikipedia as an intelligent "who's who" (searchable
not only by name). Or list books or movies that have wiki pages about them.
The possibilities are quite broad. Look up pages that are in multiple indexes,
"events" and "presidents of the world" for example.
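
The kinds of lookup described here could be sketched with ordinary SQL. The
example below uses Python's built-in `sqlite3` module; the table names
(`idx_events`, `idx_earthquakes`), their columns and the helper functions are
hypothetical, chosen only to show a place-and-year lookup and a "pages in
multiple indexes" query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idx_events (page TEXT, place TEXT, date TEXT);
    CREATE TABLE idx_earthquakes (page TEXT, country TEXT);
""")
conn.executemany(
    "INSERT INTO idx_events VALUES (?, ?, ?)",
    [
        ("1923 Great Kantō earthquake", "Tokyo", "1923-09-01"),
        ("1964 Summer Olympics", "Tokyo", "1964-10-10"),
    ],
)
conn.execute(
    "INSERT INTO idx_earthquakes VALUES (?, ?)",
    ("1923 Great Kantō earthquake", "Japan"),
)


def events_in(place, year):
    """Pages in the events index for a given place and year."""
    rows = conn.execute(
        "SELECT page FROM idx_events "
        "WHERE place = ? AND substr(date, 1, 4) = ? ORDER BY date",
        (place, str(year)),
    )
    return [row[0] for row in rows]


def pages_in_both_indexes():
    """Pages that appear in both the events and the earthquakes index."""
    rows = conn.execute(
        "SELECT page FROM idx_events "
        "INTERSECT SELECT page FROM idx_earthquakes ORDER BY page"
    )
    return [row[0] for row in rows]


print(events_in("Tokyo", 1923))
print(pages_in_both_indexes())
```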
 
"Indexers" would be Wikipedians who index things. Make an index, like
"countries" or "operating systems" or "mysteries". And then collect things into
that index. And specify the attributes (database fields) of that index. There
are pages like this, I know, for database systems for example, but you see this
is a different level. You could create an index of abbreviations for example...

(I don't have much time to discuss it, but if anyone finds it worth working on,
please let me know. Later I might join in. Have a nice day.)

_______________________________________________
MediaWiki-l mailing list
[hidden email]
http://mail.wikipedia.org/mailman/listinfo/mediawiki-l

Re: idea: Wiki-Index

Martin Jambon
That would certainly be an excellent thing to have, but I see two
difficulties:
- readability: manual annotations in the source code make it much more
difficult to read (it's easier to add them once the document is frozen,
which never happens in wikis). It's a bit like hyperlinks.
- quality: I am afraid that automatic indexing would not be better
than a Google search, but if authors have to do it manually, it's hard to
maintain.

So I am not sure that indexing is as useful for web documents as it is for
books, due to the presence of hyperlinks :-)

my 2 cents

Martin

On Wed, 11 Jan 2006, András Kardos wrote:

> An idea. I'll try to be short.
--
Martin Jambon, PhD
http://martin.jambon.free.fr

Visit http://wikiomics.org, the Bioinformatics Howto Wiki

Re: idea: Wiki-Index

Hans Voss
In reply to this post by András Kardos
I like the idea of such indices very much. It would be uber-cool if
you could use SQL syntax to search the content of a wiki.

What I understand is that you want to create a tag/extension/template
construction where the editor of the page can enter the fields to go
into the table (and presumably select the table they should go to).
Something like this (using the template syntax):
{{indexthis|cities|name=Vlaardingen|Country=NL|population=70000|.....etc....}}
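
To make the template idea concrete, here is a minimal Python sketch of how
such a call might be pulled out of page source. The `indexthis` template is
the hypothetical one from the example above, not an existing MediaWiki
feature, and `parse_indexthis` is an invented helper. Field names are
lowercased so that `Country` and `country` end up in the same column, which
is a design assumption. Nested templates and values containing "|" are not
handled.

```python
import re

# Matches {{indexthis|...}} calls that contain no nested braces.
INDEXTHIS = re.compile(r"\{\{indexthis\|([^{}]*)\}\}")


def parse_indexthis(wikitext):
    """Return a list of (index_name, fields) pairs found in the page source."""
    records = []
    for match in INDEXTHIS.finditer(wikitext):
        index_name, *pairs = [part.strip() for part in match.group(1).split("|")]
        fields = {}
        for pair in pairs:
            key, sep, value = pair.partition("=")
            if sep:  # skip arguments that are not of the form key=value
                fields[key.strip().lower()] = value.strip()
        records.append((index_name, fields))
    return records


source = (
    "Vlaardingen is a city in the Netherlands. "
    "{{indexthis|cities|name=Vlaardingen|Country=NL|population=70000}}"
)
print(parse_indexthis(source))
```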

While this may seem a good idea, the first thought that sprang to mind
was that this makes for a very exploitable structure (for
wikispammers).
With a "normal" page the spammers are annoying, but simply revert the
edit and the information is gone from the wiki (from the search, anyway).
But how does this work with indices?
* How can I delete records from the index again? (For example, I made a
page for a city with an index record and later decided to delete the
entire page. How does the record get removed from the index
automagically?)
* How can I change records in the index (correcting
mistakes/typos...)? I cannot foresee a structure where I can uniquely
identify individual records in the index/database other than adding
another tag/template to specifically change a record, which is rather
cumbersome.
* How do we avoid creating multiple records in the index table?

What might (repeat *might*) make more sense is that some clever coders
(like the guys at Google or 'our own' MediaWiki coders) develop
algorithms that yield better search results on wiki pages. (Because of
the sometimes amazing number of cross links on a page I imagine that
pages in a wiki can be searched/linked in more efficient ways than
"just any other" HTML page.)

Just my two cents (OK maybe 5).

Hans Voss.

On 1/11/06, András Kardos <[hidden email]> wrote:

> An idea. I'll try to be short.

--
----
Met vriendelijke groeten / With kind regards
Hans Voss
---------------------------------------
skype: hans.voss
google talk enabled

Re: idea: Wiki-Index

Brion Vibber
In reply to this post by András Kardos
Please do searches for prior code and discussion:

Already implemented:
* manually created lists
* categories

Discussion:
* "semantic wikipedia"
* wikidata
* geographic data
* metadata
etc

-- brion vibber (brion @ pobox.com)



Re: idea: Wiki-Index

Jan Steinman
In reply to this post by András Kardos
> From: András Kardos <[hidden email]>
>
> Wikipedia has a lot of information, and it is heavily crosslinked. But it's
> not indexed. I mean an index of people, an index of places and an index of
> things. And events. And countries. And lakes. And whatever.

Although nothing exactly like what you're describing currently
exists, I think (as Brion points out) similar functionality already
exists.

For one thing, Categories. I think they are poorly managed at
present, and could use some additional support. I say this without
having given it much thought, and without any suggestions, so I know
I'm setting myself up for criticism here! :-)

For another thing, Special:Allpages. This *is* an index already.
(What you're asking for is not formally an *index.*) The concept
could be expanded by hacking a copy of Special:Allpages into a
concordance, where both the link and link text would have meaning, as
well as preceding and following words. That could be done
mechanically, which (IMHO) is a vast advantage over giving the
Wikipedia public yet another tool to master or misuse!
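
One possible reading of the concordance idea, as a minimal Python sketch: for
every `[[target|text]]` link in a page, record the link target, the link
text, and a few words on either side. The `concordance` function, the
`width` parameter and the sample sentence are assumptions for illustration;
this is not how Special:Allpages works.

```python
import re

# [[Target]] or [[Target|display text]]
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def plain(text):
    """Replace each wiki link with its display text."""
    return LINK.sub(lambda m: m.group(2) or m.group(1), text)


def concordance(page, wikitext, width=3):
    """One entry per link: target, link text, and `width` words of context."""
    entries = []
    for m in LINK.finditer(wikitext):
        target = m.group(1).strip()
        entries.append({
            "page": page,
            "target": target,
            "text": (m.group(2) or target).strip(),
            "before": " ".join(plain(wikitext[:m.start()]).split()[-width:]),
            "after": " ".join(plain(wikitext[m.end():]).split()[:width]),
        })
    return entries


sample = (
    "The 1923 earthquake devastated [[Tokyo]] and nearby "
    "[[Yokohama|the port of Yokohama]] within minutes."
)
for entry in concordance("1923 Great Kantō earthquake", sample):
    print(entry)
```

Because the entries are derived from the page source alone, they could be
rebuilt mechanically whenever a page changes, with no extra work for editors.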

> "indexers" would be wikipedians who index things. Make an index, like
> "countries" or "operating systems" or "mysteries". And then collect things
> into that index.

I dated a professional indexer once. At least before Microsoft Word  
made everyone an (untrained, unskilled) indexer, this was a  
profession with its own society and conferences and such. Doing it  
"correctly" is difficult, specialized work. Doing it by rote *should*  
be automatic and mechanical, rather than depend on untrained indexers.


:::: Insanity: doing the same thing over and over and expecting  
different results
:::: Jan Steinman <http://www.Bytesmiths.com/Item/99AU22>



Re: idea: Wiki-Index

András Kardos
In reply to this post by Hans Voss
Hans Voss <hans.voss@...> writes:

> While this may seem a good idea, the first thought that sprang to mind
> was that this makes for a very exploitable structure (for
> wikispammers).
> With a "normal" page the spammers are annoying, but simply revert the
> edit and the information is gone from the wiki (from the search, anyway).
> But how does this work with indices?

Hans understood what I meant.

Some more things: these tables are filled (records added, deleted) just after a
page is updated - when it is parsed. If you add a "record" to a page - in the
inline syntax - it will be added to the appropriate database table at that time
too. If you revert a page to a previous version, or remove a declaration of an
"inline" record, then the database record it corresponds to is deleted too - for
example if it was spam. Maybe database records are only kept for the current
versions of pages, since they can be recreated anytime from the page sources. So
database records refer to pages by their name.
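
The "rebuild on save" behaviour described above could look roughly like the
following Python sketch using the built-in `sqlite3` module. The
`index_records` table, the `sync_page` and `records_for` helpers, and the
idea of storing fields as JSON are all assumptions for illustration. Each
save deletes every record owned by the page and re-inserts whatever the
current source declares, so reverts, corrections and page deletions need no
special handling and a page cannot accumulate duplicates.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE index_records (page TEXT, index_name TEXT, fields TEXT)")


def sync_page(page, records):
    """Replace all index records owned by `page` with those in its current text.

    `records` is a list of (index_name, fields_dict) pairs parsed from the page.
    """
    with conn:  # one transaction: the delete and the inserts apply together
        conn.execute("DELETE FROM index_records WHERE page = ?", (page,))
        conn.executemany(
            "INSERT INTO index_records VALUES (?, ?, ?)",
            [(page, name, json.dumps(fields, sort_keys=True)) for name, fields in records],
        )


def records_for(page):
    """Current index records for a page, in insertion order."""
    rows = conn.execute(
        "SELECT index_name, fields FROM index_records WHERE page = ? ORDER BY rowid",
        (page,),
    )
    return [(name, json.loads(fields)) for name, fields in rows]


# First save declares a record; a later edit corrects a typo in it.
sync_page("Vlaardingen", [("cities", {"name": "Vlaardingen", "country": "NL"})])
sync_page("Vlaardingen", [("cities", {"name": "Vlaardingen", "country": "Netherlands"})])
print(records_for("Vlaardingen"))

# Reverting spam or deleting the page: sync with no records.
sync_page("Vlaardingen", [])
print(records_for("Vlaardingen"))
```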

And I thought this indexing would be done by hand. BTW, I'm studying to be a
"real" indexer (a librarian) at the moment...


Re: idea: Wiki-Index

Sy Ali
In reply to this post by András Kardos
On 1/11/06, András Kardos <[hidden email]> wrote:
> Using this you could look up people who were born or died, or things that
> happened, on a given day. Or things that happened in Tokyo, or in 1923, and put
> them on a Google Map. Look at Wikipedia as an intelligent "who's who" (searchable
> not only by name). Or list books or movies that have wiki pages about them.
> The possibilities are quite broad. Look up pages that are in multiple indexes,
> "events" and "presidents of the world" for example.

There are a couple of hacks and extensions for MediaWiki which can
automate some of this.

One allows you to create a list by combining different
categories.  So you can create a list of all pages in {category x} and
{category y} but not {category z}.
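
The "x and y but not z" selection is plain set arithmetic. A minimal Python
sketch, assuming category membership is already available as sets of page
titles (the `categories` data and the `pages_in` helper are invented for
illustration and do not reflect any particular extension's API):

```python
# Hypothetical category membership: category name -> set of page titles.
categories = {
    "Cities": {"Tokyo", "Paris", "Vlaardingen"},
    "Capitals": {"Tokyo", "Paris"},
    "Europe": {"Paris", "Vlaardingen"},
}


def pages_in(include, exclude=()):
    """Pages in every `include` category and in none of the `exclude` ones.

    `include` must name at least one category.
    """
    result = set.intersection(*(categories[name] for name in include))
    for name in exclude:
        result -= categories[name]
    return sorted(result)


# Cities that are capitals but are not in Europe.
print(pages_in(["Cities", "Capitals"], exclude=["Europe"]))  # ['Tokyo']
```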

Another lets you do similar things, but with backlinks.

I don't think either would ever make it into the MediaWiki code or into
Wikipedia if they are very CPU-intensive to operate.


I don't like the idea of editors needing to do things like this
manually.  I like the idea of having MediaWiki do it.  Martin mentioned
Google... they index Wikipedia very well, and it's a "free" * way to
have good indexing done.


* Not totally free because of advertising on search result pages.

Re: idea: Wiki-Index

András Kardos
Sy Ali <sy1234@...> writes:

> I don't like the idea of editors needing to do things like this
> manually.  I like the idea of having MediaWiki do it.  Martin mentioned
> Google... they index Wikipedia very well, and it's a "free" * way to
> have good indexing done.
>
> * Not totally free because of advertising on search result pages.
>

The whole of Wikipedia is done "manually". This indexing is just a way to make this
knowledge more accessible. Google is fine for page-like information, but not
for database-like information. You can't list people born in France during World War I,
though this information is "hidden" in Wikipedia.
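
With structured records, that particular question becomes a simple filter. A
minimal Python sketch, assuming a hypothetical people index whose records
carry a birth date and a birth country (the `people_index` list, its field
names and the `born_in` helper are illustrative assumptions):

```python
from datetime import date

# Hypothetical records of a "people" index.
people_index = [
    {"page": "Édith Piaf", "born": date(1915, 12, 19), "birth_country": "France"},
    {"page": "François Mitterrand", "born": date(1916, 10, 26), "birth_country": "France"},
    {"page": "Ingrid Bergman", "born": date(1915, 8, 29), "birth_country": "Sweden"},
    {"page": "Charles de Gaulle", "born": date(1890, 11, 22), "birth_country": "France"},
]

WW1_START = date(1914, 7, 28)
WW1_END = date(1918, 11, 11)


def born_in(country, start, end):
    """Pages of people born in `country` between `start` and `end`, inclusive."""
    return [
        record["page"]
        for record in people_index
        if record["birth_country"] == country and start <= record["born"] <= end
    ]


print(born_in("France", WW1_START, WW1_END))
```

A full-text search engine has no reliable way to answer this, because the
date and the country are only present as prose inside each article.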

Some people like to write articles, others (like myself) would do this indexing
stuff better. It's a personality thing...




Re: Re: idea: Wiki-Index

Sy Ali
On 1/14/06, András Kardos <[hidden email]> wrote:
> The whole of Wikipedia is done "manually". This indexing is just a way to make this
> knowledge more accessible. Google is fine for page-like information, but not
> for database-like information. You can't list people born in France during World War I,
> though this information is "hidden" in Wikipedia.

Hmm.. good point.  The "see also" or related topics ideas really do
help the searchability of things, but it's nowhere near perfect, nor
is it accessible.  Plus it doesn't give the "map" or related items the
way a good system would.

> Some people like to write articles, others (like myself) would do this indexing
> stuff better. It's a personality thing...

I for one would love to move all the "see also" links to the top where
they're more accessible.  I hate doing fuzzy searches, finding a
related article and needing to scroll through to the bottom to find
related articles.

For me.. doing indexing and information management would actually be
quite fun.  Breadcrumbs, disambiguation pages and see also links have
made my own site particularly easy to get around.. even if it's mostly
for me.  =)