How to query whole page content?

Josh King
Hi all,

I seem to be stuck on what I feel should be a simple issue. I've created a
search interface for users on my wiki that works as a sort of "advanced
search": they can filter their results by things like author, date
published, etc. through an inline query (each filter is attached to a
particular property and is already set up for the user).

What I would like to do, though, is also add a general-purpose field to my
query along the lines of "page includes", which searches all of the page's
content for text similar to the user's entry, just like the top-right
search bar would in a standard wiki search. Any idea how to do this?

Since inline queries seem to be built on searching by property
relationships, I got hung up. Thanks for any help!

-Josh
------------------------------------------------------------------------------
_______________________________________________
Semediawiki-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/semediawiki-user

Re: How to query whole page content?

Jeremi Plazas
Hi Josh,

I've run into the same problem. I would love to see a solution to this but
haven't found one yet. It seems to be an either/or kind of situation: either
search page content and titles by using the top right search bar, or search
semantic properties with custom query forms, etc. I haven't found a way
to combine the two.

I hope this issue gets addressed; I second that it's an obvious need.

--
Jeremi Plazas
Assistant Director of Research @ Tsadra Foundation
www.tsadra.org // [hidden email]

Re: How to query whole page content?

Josh King
Hi Jeremi,

Glad to see I'm not the only one.
One solution I could imagine is taking advantage of the fact that much of
the content of my pages (even the bodies of text) is stored in various
properties, so I think I could conceivably write a complex query that
searches through all of those properties for that text. I would personally
like to avoid this, though, as it would require me to set my query depth and
size configurations high enough that I bet I'd lose performance (and I'm not
even sure the query itself would really be possible).
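
As a rough illustration of the property-based approach described above, the
sketch below builds an #ask condition that applies SMW's like comparator (~)
with wildcards to several text properties and joins the conditions with OR.
The property names (Abstract, Glossary) and the helper name build_like_query
are hypothetical. I have not verified how the comparator behaves on very long
Text values, so that would need testing on an actual setup.

```python
def build_like_query(term, properties):
    """Build an #ask condition that ORs a wildcard match over several properties.

    `properties` holds hypothetical property names; use the ones on your wiki.
    """
    # Drop characters that have special meaning in wikitext or #ask conditions.
    cleaned = "".join(ch for ch in term if ch not in "[]|{}<>").strip()
    if not cleaned:
        raise ValueError("empty search term")
    # [[Property::~*term*]] is a substring match using the like comparator.
    conditions = ["[[%s::~*%s*]]" % (prop, cleaned) for prop in properties]
    # OR between conditions forms a disjunction in the SMW query language.
    return " OR ".join(conditions)


if __name__ == "__main__":
    print(build_like_query("karma", ["Abstract", "Glossary"]))
```

The resulting string could be used as the condition part of an inline query
or pasted into Special:Ask. Each additional property adds another condition,
which is where the query size and depth limits mentioned above come in.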

Just thought I would mention that in case it's useful to others as well.

-Josh

Re: How to query whole page content?

Yaron Koren-2
Hi,

I'm not a core SMW developer, so I don't know if I can directly help with
this issue, but I am curious about it. As we all know, property-based
querying and text-based searching are very different from one another
conceptually: one is semantic, and the other is syntactic. But on a
practical level, they also seem rather different: text searches are used
for finding one or more individual pages, and are often done just once;
while standard SMW queries are generally used for aggregating data, and are
usually meant for display. So it doesn't seem obvious that it would be
useful to put the two together - though I'm aware that this is a common
request. In both of your cases, my guess is that this would be used to help
with one-time searches, to let users further drill down on search results
in order to avoid having to go through many result pages in order to find
what they're looking for. Is that correct? And if so, how often is that
necessary? Are there really times when this kind of hybrid search would be
much faster than one type of search or the other?

-Yaron

--
WikiWorks · MediaWiki Consulting · http://wikiworks.com

Re: How to query whole page content?

Jeremi Plazas
Hey Yaron,

Thanks for taking the time to put some thought into this.

Here's the situation for us here at Tsadra Foundation. We use the wiki for
archiving books. Any given book page contains a template with lots of
semantic data on that book: author, translator, editor, publication
information, glossary, index, and bibliographic information of all kinds.
In a lot of cases we also have some of the content of the book on the page
itself (plain text, outside of the main template), usually the
introduction, glossary and bibliography, and in some cases the full text of
the book. Our site is used for research by many translators and scholars
who, for the most part, are interested in "hybrid" searches (as you call
them). For example, they'd like to know how a given author (semantic data)
translates a particular Tibetan term in their glossary (syntactic data in
the page's full text). This is just an example, but most of the searches we
are asked to do on their behalf are constructed this way: some set of
semantic data to isolate a book or group of books, combined with syntactic
data from the content of those books. So in short, for our use anyway, some
form of search capability that combines the two would be ideal.

Anyways this is our own specific usage and may not be relevant enough to
the wiki community at large. We work with older scholars and academics who
aren't necessarily very tech savvy, and getting them to use a search box
instead of emailing us is already hard enough. They easily lose interest
in our resources if that search isn't as close as possible to ideal. But I
understand your point: is it worth investing time in combining these
search types if one or the other gets you close enough? For us I think the
answer would be yes. Or at least a way to refine a text search's results
using semantic properties? Maybe? I don't know.

Thanks again very much for taking the time,

--
Jeremi Plazas
Assistant Director of Research @ Tsadra Foundation
www.tsadra.org // [hidden email]

Re: How to query whole page content?

James HK
Hi,

> Anyways this is our own specific usage and may not be relevant enough to
> the wiki community at large. We work with older scholars and academics who

Nice, I wish we had more of such use case descriptions.

Since this also emerged recently on smw.org [0], let me add some
remarks to this topic.

In general, arbitrary text search (matching a string, computation of
string similarity [1]) and attributive search (matching an entity on
certain attributive conditions, computation of semantic relatedness
between concepts) are two distinct approaches, and either may suffice
depending on the objective one tries to achieve.

> in our resources if that search isn't as close as possible to ideal. But I
> understand your point, is it worth investing time into combining these
> search types if one or the other gets you close enough. For us I think the

As to how this can be combined, I'll try to give some examples [2, 3, 4].

[2] uses MW's standard Special:Search interface to match semantic
conditions entered in the search field.

[3] also uses MW's standard Special:Search interface with results
being matched against semantic attributes after they have been fetched
from a SearchEngine (which could be ElasticSearch and independent from
SMW). The reduction of results is done in a post-process to only show
subjects that match certain semantic conditions (in case of [3] it is
the annotated content language).

[2, 3] are semi "combined search" solutions that use MW's
infrastructure (search hooks) to provide a single interface and
integrate search results provided by an external SearchEngine (which
focuses on document or full-text indexing) with matches against some
attributes (category, property etc.).

[4] video (2:40 min) shows how a text-search and result set reduction
can work together when using MW's Special:Search.

Another possibility is to look at the SolrStore extension [5] which
provides a Lucene full-text search while incorporating semantic
annotations.

> answer would be yes. Or at least a way to refine a text-search's results
> using semantic properties? Maybe? I don't know.

[2, 3] show how a Special:Search integration can be achieved, and you
can easily develop such an interface yourself (given that the full-text
search is used as the starting point for the search process and attributes
are matched in a post-process to reduce the set of matches).
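
To make the post-process idea concrete, here is a minimal sketch (not the
code used in [2] or [3]) that works against the web API from outside the
wiki: it builds a MediaWiki full-text search request (action=query with
list=search) and an SMW ask request (action=ask), then keeps only the
full-text hits whose titles also satisfy the semantic condition. The helper
names and the sample titles are made up for illustration, and the JSON
layout handled here is what I would expect from those two API modules, so
check it against your own wiki's responses.

```python
from urllib.parse import urlencode


def search_url(api, text, limit=50):
    """URL for MediaWiki's full-text search (action=query, list=search)."""
    params = {"action": "query", "list": "search", "srsearch": text,
              "srlimit": limit, "format": "json"}
    return api + "?" + urlencode(params)


def ask_url(api, condition, limit=500):
    """URL for SMW's ask API; parameters follow the condition after '|'."""
    params = {"action": "ask", "query": "%s|limit=%d" % (condition, limit),
              "format": "json"}
    return api + "?" + urlencode(params)


def titles_from_search(response):
    """Page titles of full-text hits, in the order the search returned them."""
    return [hit["title"] for hit in response.get("query", {}).get("search", [])]


def titles_from_ask(response):
    """Set of page names that matched the semantic condition."""
    results = response.get("query", {}).get("results", {})
    # Assumption: an empty result may arrive as a list rather than an object.
    return set(results.keys()) if isinstance(results, dict) else set()


def filter_hits(search_response, ask_response):
    """Reduce full-text hits to those that also match the semantic condition."""
    allowed = titles_from_ask(ask_response)
    return [t for t in titles_from_search(search_response) if t in allowed]


if __name__ == "__main__":
    # Hand-written sample responses standing in for real API output.
    search = {"query": {"search": [{"title": "Book A"}, {"title": "Book B"},
                                   {"title": "Book C"}]}}
    ask = {"query": {"results": {"Book C": {}, "Book A": {}}}}
    print(filter_hits(search, ask))
```

The full-text ranking is preserved because the filter walks the search hits
in order and uses the semantic result only as a membership test. Both
requests are capped by their limit parameters, so a large wiki would need
paging on one side or the other.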

PS: In the past we had some minor exchange about an ElasticSearch
integration but since that requires commitment and resources it hasn't
been on anyone's agenda.

[0] https://semantic-mediawiki.org/w/index.php?title=semantic-mediawiki.org:Community_portal&offset=20151019103205&lqt_mustshow=1975#Semantic_search_in_documents_.28indexing.29_1975

[1] Giunchiglia, Fausto, Uladzimir Kharkevich, and Ilya Zaihrayeu.
"Concept search: Semantics enabled syntactic search." (2008).

[2] https://semantic-mediawiki.org/wiki/Help:SMWSearch
[3] https://github.com/SemanticMediaWiki/SemanticInterlanguageLinks/tree/master/src/Search

[4] https://vimeo.com/115871518

[5] https://www.mediawiki.org/wiki/Extension:SolrStore

Cheers

Re: How to query whole page content?

Josh King
In reply to this post by Yaron Koren-2
Hi Yaron,

It's nice to have your thoughts on this, and I certainly appreciate all the
support you've already provided to the wiki community.
An example use case (and indeed my own) would be searching through short
articles that have pre-formatted semantic data (such as where each one was
published, certain categorical data I apply, tags, etc.).
Conceivably a user could search through the semantic data of categories and
tags to find their article, but these are often general, and the user might
also remember a particular phrase he/she saw to search by, or even something
too specific to be appropriate as a tag.

In terms of whether this hybrid search would really be used all that much
and be faster, I think that likely depends on how well the tags represent
the content of articles. Personally I think most users would search for a
phrase or some keywords and then narrow down the results by the semantic
data of its field/category or publishing.

Hope that answers your questions!

Thanks,
Josh

On Mon, Oct 19, 2015 at 10:07 AM, Yaron Koren <[hidden email]> wrote:

> Hi,
>
> I'm not a core SMW developer, so I don't know if I can directly help with
> this issue, but I am curious about it. As we all know, property-based
> querying and text-based searching are very different from one other
> conceptually: one is semantic, and the other is syntactic. But on a
> practical level, they also seem rather different: text searches are used
> for finding one or more individual pages, and are often done just once;
> while standard SMW queries are generally used for aggregating data, and are
> usually meant for display. So it doesn't seem obvious that it would be
> useful to put the two together - though I'm aware that this is a common
> request. In both of your cases, my guess is that this would be used to help
> with one-time searches, to let users further drill down on search results
> in order to avoid having to go through many result pages in order to find
> what they're looking for. Is that correct? And if so, how often is that
> necessary? Are there really times when this kind of hybrid search would be
> much faster than one type of search or the other?
>
> -Yaron
>
> On Fri, Oct 16, 2015 at 7:12 PM, Josh King <[hidden email]> wrote:
>
>> Hi Jeremi,
>>
>> Glad to see I'm not the only one.
>> One solution I could imagine is taking advantage of the fact that much of
>> my pages (even the bodies of text) are included in various properties, so
>> I
>> could conceivably have a complex query that searches through all of those
>> properties for that text I think. I personally would like to avoid this
>> though as it would cause me to need set my query depth and size
>> configurations high enough that I bet I'd lose performance (and I'm not
>> even sure the query would really be possible itself).
>>
>> Just thought I would mention that in case it comes into thinking for
>> others
>> also.
>>
>> -Josh
>>
>> On Fri, Oct 16, 2015 at 5:52 PM, Jeremi Plazas <[hidden email]> wrote:
>>
>> > Hi Josh,
>> >

Re: How to query whole page content?

Josh King
In reply to this post by James HK
Thanks James... I think I need to sit on this for a bit and see if I can
make any of this work for me. I'd love it if I could keep everything "in
house" and not have to rely on external engines for the search, but it's a
wise idea I hadn't yet considered and should think on.

-Josh


Re: How to query whole page content?

Josh King
I think I'm set on trying to find a crafty way to avoid those external
options if possible. Anyone know if I could dump all of my page text into a
sort of master property and query it syntactically that way? ...Maybe I'm
crazy for thinking it, but hey...


Re: How to query whole page content?

Josh King
To anyone still interested in this, and hopefully to save them some time...

Unfortunately, using subproperties as a workaround is not viable
for this. I tried placing key portions of my articles in subproperties
and then putting them under an umbrella property like "Has page content for
search." This works well for a few short properties, but fails on longer
text strings because only the first 40 characters are searchable (*a real
bummer...*), as noted in this old discussion:

http://wikimedia.7.x6.nabble.com/Unable-to-query-over-the-last-words-of-a-long-text-value-td5020618.html#a5020801

The query could also be broken up into a bunch of OR statements that run
through a series of short properties (all fed from one field), but each
property would still be subject to the 40-character limit noted above.
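
As a rough illustration of the OR approach described above, here is a minimal
Python sketch (not from the original post) that builds the condition string
for an inline query from a single search field. The property names are
hypothetical, `~` with `*` wildcards is SMW's pattern-matching comparator, and
the 40-character figure is simply the limit reported above, modeled here as a
plain prefix check rather than anything SMW-specific.

```python
from typing import Iterable

# Assumption taken from the post above: only the first 40 characters of a
# text value are searchable.
SEARCHABLE_PREFIX_CHARS = 40

# Characters that would break the query syntax if passed through unescaped.
FORBIDDEN_CHARS = set("[]|{}")


def like_condition(prop: str, term: str) -> str:
    """Return one condition using the `~` (like) comparator with wildcards."""
    if FORBIDDEN_CHARS & set(term):
        raise ValueError("search term contains query syntax characters")
    return f"[[{prop}::~*{term}*]]"


def build_ask_conditions(properties: Iterable[str], term: str) -> str:
    """Join one like-condition per property with OR (one field feeds them all)."""
    return " OR ".join(like_condition(p, term) for p in properties)


def reachable(value: str, term: str) -> bool:
    """Simplified model of the limit: the term must sit inside the prefix."""
    return term in value[:SEARCHABLE_PREFIX_CHARS]


if __name__ == "__main__":
    # Hypothetical property names, for illustration only.
    props = ["Has intro text", "Has glossary text"]
    print(build_ask_conditions(props, "dharma"))
```

The resulting string would go inside an `{{#ask: ... }}` call. The `reachable`
helper only shows why this does not scale: a term that starts after the
searchable prefix of a value is never matched, however many properties are
OR-ed together.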

I suppose I'll end up trying to use SolrStore eventually, but I can't help
but feel that SMW will eventually need to head in the direction of a hybrid
search anyway. Even if using third-party code to create a hybrid
semantic/syntactic search page is viable, it remains quite *awkward* for
the top-right search bar to still not have the functionality (especially as
users will almost always default to that top-right bar rather than hunting
for a different search link). My opinion, at least.

Hope that update helps someone out there!
-Josh

>>> which is
>>> >>> >> attached to a particular property and is already set up for the
>>> user).
>>> >>> >>
>>> >>> >> What I would like to do though is also add a general-purpose
>>> field to
>>> >>> my
>>> >>> >> query to the extent of "page includes" which searches for similar
>>> text
>>> >>> to
>>> >>> >> the user entry within all of the page's content just like the
>>> >>> >> top-right
>>> >>> >> search bar would in a standard wiki search. Any idea how to do
>>> this?
>>> >>> >>
>>> >>> >> Since inline queries seem to be built on searching by property
>>> >>> >> relationships, I got hung up. Thanks for any help!
>>> >>> >>
>>> >>> >> -Josh
>>> >>> >>
>>> >>> >>
>>> >>>
>>> ------------------------------------------------------------------------------
>>> >>> >> _______________________________________________
>>> >>> >> Semediawiki-user mailing list
>>> >>> >> [hidden email]
>>> >>> >> https://lists.sourceforge.net/lists/listinfo/semediawiki-user
>>> >>> >>
>>> >>> > --
>>> >>> > Jeremi Plazas
>>> >>> > Assistant Director of Research @ Tsadra Foundation
>>> >>> > www.tsadra.org // [hidden email]
>>> >>> >
>>> >>>
>>> >>>
>>> ------------------------------------------------------------------------------
>>> >>> _______________________________________________
>>> >>> Semediawiki-user mailing list
>>> >>> [hidden email]
>>> >>> https://lists.sourceforge.net/lists/listinfo/semediawiki-user
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> WikiWorks · MediaWiki Consulting · http://wikiworks.com
>>> >>
>>> > --
>>> > Jeremi Plazas
>>> > Assistant Director of Research @ Tsadra Foundation
>>> > www.tsadra.org // [hidden email]
>>> >
>>> ------------------------------------------------------------------------------
>>> > _______________________________________________
>>> > Semediawiki-user mailing list
>>> > [hidden email]
>>> > https://lists.sourceforge.net/lists/listinfo/semediawiki-user
>>> >
>>>
>>
>>
>

Re: How to query whole page content?

P. Josepherum
I expect a simple enough approach is to use the MediaWiki API to get a JSON
list of potential matches and then filter those results with arbitrary ask
queries, perhaps using a Lua module or a PHP extension.
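
The idea above could be sketched roughly as follows. This is only a minimal illustration in Python rather than Lua or PHP, so that it can run outside the wiki. The endpoint URL and the function names are made up for the example. It assumes the standard `list=search` module of the MediaWiki API and the `ask` module of Semantic MediaWiki, and it assumes that the keys of the ask results are full page titles that can be compared directly with the search hit titles.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://wiki.example.org/w/api.php"  # hypothetical endpoint


def search_url(text, limit=50):
    """Build the URL for MediaWiki's built-in full-text search (list=search)."""
    params = {"action": "query", "list": "search", "srsearch": text,
              "srlimit": limit, "format": "json"}
    return API_URL + "?" + urlencode(params)


def ask_url(conditions, limit=50):
    """Build the URL for SMW's ask module; `conditions` is an #ask condition string."""
    params = {"action": "ask", "query": f"{conditions}|limit={limit}",
              "format": "json"}
    return API_URL + "?" + urlencode(params)


def hybrid_filter(search_json, ask_json):
    """Keep full-text hits, in search order, that also satisfy the ask query."""
    results = ask_json.get("query", {}).get("results", {})
    # Assumption: results is a dict keyed by page title, and may come back
    # as an empty list when nothing matches.
    semantic_titles = set(results) if isinstance(results, dict) else set()
    hits = search_json.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits if hit["title"] in semantic_titles]


def fetch_json(url):
    """Fetch a URL and decode the JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)
```

Usage would be along the lines of `hybrid_filter(fetch_json(search_url("some term")), fetch_json(ask_url("[[Has author::Jane Doe]]")))`, where the property name is just a placeholder. Both requests are capped by their limits, so a large wiki would probably need paging on one side or the other.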

On Sun, 15 Nov 2015 23:18 Josh King <[hidden email]> wrote:

> To anyone still interested in this, and hopefully to save them some time...
>
> Unfortunately trying to use subproperties as a work-around is not viable
> for this. I attempted placing key portions of my articles in subproperties
> and then putting them under an umbrella property like "Has page content for
> search." This works well for a few short properties, but fails at longer
> text strings because only 40 characters are searchable (*a real bummer...*),
> as noted in this old discussion:
>
> http://wikimedia.7.x6.nabble.com/Unable-to-query-over-the-last-words-of-a-long-text-value-td5020618.html#a5020801
>
> The query could also be broken up into a bunch of OR statements to run
> through a series of short properties as well (being fed from one field),
> but each property would still be limited by the 40 character count as noted
> above.
>
> I suppose I'll end up trying to use SolrStore eventually, but I can't help
> but feel that SMW will eventually need to head in the direction of a hybrid
> search anyways. Even if using third-party code to create a hybrid
> semantic/syntactic search page is viable, it remains quite *awkward* for
> the top-right search bar to still not have the functionality (especially as
> users will almost always default to that top-right bar rather than hunting
> for a different search link anyways). My opinions at least.
>
> Hope that update helps someone out there!
> -Josh
--

Love and waffles,
PJosepherum

Re: How to query whole page content?

Fannon
In reply to this post by Josh King
Hello Josh,

I've briefly thought about adding ElasticSearch as an additional store to
SMW. This could add all semantic properties + the freetext to the ES DB and
make it available for advanced searching, querying and data-analysis.

SMW also lets you fetch all semantic properties of a page:
http://semantic-mediawiki.org/wiki/Ask_API#BrowseBySubject

This would be a bigger project, though.
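
As a very rough sketch of what the first step might look like, the snippet below (Python, purely illustrative) merges a page's free text with the properties returned by the BrowseBySubject module into one document that a full-text engine could index. The endpoint URL and function names are invented for the example, and the response layout (`query.data` entries carrying `property` and `dataitem`) is an assumption about that module's JSON output that should be checked against the SMW version in use. Actually sending the document to ElasticSearch is left out.

```python
from urllib.parse import urlencode

API_URL = "https://wiki.example.org/w/api.php"  # hypothetical endpoint


def browse_url(subject):
    """Build the URL that asks SMW for all properties of one page."""
    params = {"action": "browsebysubject", "subject": subject, "format": "json"}
    return API_URL + "?" + urlencode(params)


def to_search_document(title, page_text, browse_json):
    """Merge free text and semantic properties into one indexable document."""
    properties = {}
    # Assumed layout: {"query": {"data": [{"property": ..., "dataitem": [{"item": ...}]}]}}
    for entry in browse_json.get("query", {}).get("data", []):
        values = [item.get("item") for item in entry.get("dataitem", [])]
        properties[entry.get("property")] = values
    return {"title": title, "text": page_text, "properties": properties}
```

Keeping the text and the properties in the same document is what would let a single query combine a text match with a property filter.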

Best,
Simon

2015-11-16 0:16 GMT+01:00 Josh King <[hidden email]>:

> To anyone interested on this still and hopefully to save them time also...
>
> Unfortunately trying to use subproperties as a work-around is not viable
> for this. I attempted placing key portions of my articles in subproperties
> and then putting them under an umbrella property like "Has page content for
> search." This works well for a few, short properties, but fails at longer
> text strings due to only 40 characters being searchable (*a real
> bummer...*)
> as noted in this old discussion
>
>
> http://wikimedia.7.x6.nabble.com/Unable-to-query-over-the-last-words-of-a-long-text-value-td5020618.html#a5020801
>
> The query could also be broken up into a bunch of OR statements to run
> through a series of short properties as well (being fed from one field),
> but each property would still be limited by the 40 character count as noted
> above.
>
> I suppose I'll end up trying to use SolrStore eventually, but I can't help
> but feel that SMW will eventually need to head in the direction of a hybrid
> search anyways. Even if using third-party code to create a hybrbid
> semantic/syntactic search page is viable, it remains quite *awkward* for
> the top-right search bar to still not have the functionality (especially as
> users will almost always default to that top-right bar rather than hunting
> for a different search link anyways). My opinions at least.
>
> Hope that update helps someone out there!
> -Josh
>
> On Sat, Oct 24, 2015 at 4:59 PM, Josh King <[hidden email]> wrote:
>
> > I think I'm set on trying to find a crafty way to avoid those external
> > options if possible. Anyone know if I could dump all of my page text
> into a
> > sort of master property and query it syntactically that way? ...Maybe I'm
> > crazy for thinking it, but hey...
> >
> > On Tue, Oct 20, 2015 at 8:41 PM, Josh King <[hidden email]> wrote:
> >
> >> Thanks James... I think I need to sit on this for a bit and see if I can
> >> make any of this work for me. I'd love if I could keep everything "in
> >> house" and not have to rely on external engines for the search, but
> it's a
> >> wise idea I hadn't yet considered and should think on.
> >>
> >> -Josh
> >>
> >> On Mon, Oct 19, 2015 at 12:02 PM, James HK <
> [hidden email]>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> > Anyways this is our own specific usage and may not be relevant enough
> >>> to
> >>> > the wiki community at large. We work with older scholars and
> academics
> >>> who
> >>>
> >>> Nice, I wish we had more of such use case descriptions.
> >>>
> >>> Since this also emerged recently on smw.org [0], let me add some
> >>> remarks to this topic.
> >>>
> >>> In general, arbitrary text search (matching a string, computation of
> >>> string similarity [1]) and attributive search (matching an entity on
> >>> certain attributive conditions, computation of semantic relatedness
> >>> between concepts) are two distinct approaches and suffice depending on
> >>> the objective one tries to achieve.
> >>>
> >>> > in our resources if that search isn't as close as possible to ideal.
> >>> But I
> >>> > understand your point, is it worth investing time into combining
> these
> >>> > search types if one or the other gets you close enough. For us I
> think
> >>> the
> >>>
> >>> As to how this can be combined, I try to give some examples [2, 3, 4].
> >>>
> >>> [2] uses MW's standard Special:Search interface to match semantic
> >>> conditions entered in the search field.
> >>>
> >>> [3] also uses MW's standard Special:Search interface with results
> >>> being matched against semantic attributes after they have been fetched
> >>> from a SearchEngine (which could be ElasticSearch and independent from
> >>> SMW). The reduction of results is done in a post-process to only show
> >>> subjects that match certain semantic conditions (in case of [3] it is
> >>> the annotated content language).
> >>>
> >>> [2, 3] are semi "combined search" solutions that use MW's
> >>> infrastructure (search hooks) to provide a single interface and
> >>> integrate search results provided by an external SearchEngine (which
> >>> focuses on document or full-text indexing) with matches against some
> >>> attributes (category, property etc.).
> >>>
> >>> [4] video (2:40 min) shows how a text-search and result set reduction
> >>> can work together when using MW's Special:Search.
> >>>
> >>> Another possibility is to look at the SolrStore extension [5] which
> >>> provides a Lucene full-text search while incorporating semantic
> >>> annotations.
> >>>
> >>> > answer would be yes. Or at least a way to refine a text-search's
> >>> results
> >>> > using semantic properties? Maybe? I don't know.
> >>>
> >>> [2, 3] shows how a Special:Search integration can be achieved and you
> >>> can easily develop such interface yourself (given that the full-text
> >>> search is used as starting point for the search process and attributes
> >>> are matched in a post-process to reduce the set of matches).
> >>>
> >>> PS: In the past we had some minor exchange about an ElasticSearch
> >>> integration but since that requires commitment and resources it hasn't
> >>> been on anyone's agenda.
> >>>
> >>> [0]
> >>>
> https://semantic-mediawiki.org/w/index.php?title=semantic-mediawiki.org:Community_portal&offset=20151019103205&lqt_mustshow=1975#Semantic_search_in_documents_.28indexing.29_1975
> >>>
> >>> [1] Giunchiglia, Fausto, Uladzimir Kharkevich, and Ilya Zaihrayeu.
> >>> "Concept search: Semantics enabled syntactic search." (2008).
> >>>
> >>> [2] https://semantic-mediawiki.org/wiki/Help:SMWSearch
> >>> [3]
> >>>
> https://github.com/SemanticMediaWiki/SemanticInterlanguageLinks/tree/master/src/Search
> >>>
> >>> [4] https://vimeo.com/115871518
> >>>
> >>> [5] https://www.mediawiki.org/wiki/Extension:SolrStore
> >>>
> >>> Cheers
> >>>
> >>> On 10/19/15, Jeremi Plazas <[hidden email]> wrote:
> >>> > Hey Yaron,
> >>> >
> >>> > Thanks for taking the time to put some thought into this.
> >>> >
> >>> > Here's the situation for us here at Tsadra Foundation. We use the
> wiki
> >>> for
> >>> > archiving books. Any given book page contains a template with lots of
> >>> > semantic data on that book, namely author, translator, editor,
> >>> publication
> >>> > information, glossary, index, and bibliographic information of all
> >>> kinds,
> >>> > then in a lot of cases we also have some of the content of the book
> on
> >>> the
> >>> > page itself (plain text, outside of the main template), usually the
> >>> > introduction, glossary and bibliography, and in some cases the full
> >>> text of
> >>> > the book. Our site is used for research by many translators and
> >>> scholars
> >>> > who, for the most part are interested in "hybrid" searches (as you
> call
> >>> > them). Meaning for example, they'd like to know how such an author
> >>> > (semantic data) translates a particular Tibetan term in their
> glossary
> >>> > (syntactic data in the page's full text). This is just an example,
> but
> >>> most
> >>> > of the searches we are requested to do on their behalf is constructed
> >>> as
> >>> > such: a combination of some set of semantic data to isolate a book or
> >>> group
> >>> > of books, and then syntactic data from these books content. So in
> >>> short,
> >>> > for our use anyways, some form of search capability that combines the
> >>> two
> >>> > would be ideal.
> >>> >
> >>> > Anyways this is our own specific usage and may not be relevant enough
> >>> to
> >>> > the wiki community at large. We work with older scholars and
> academics
> >>> who
> >>> > aren't necessarily very tech savvy and making them use a search box
> >>> instead
> >>> > of emailing us is already hard enough. They easily turn-off their
> >>> interest
> >>> > in our resources if that search isn't as close as possible to ideal.
> >>> But I
> >>> > understand your point, is it worth investing time into combining
> these
> >>> > search types if one or the other gets you close enough. For us I
> think
> >>> the
> >>> > answer would be yes. Or at least a way to refine a text-search's
> >>> results
> >>> > using semantic properties? Maybe? I don't know.
> >>> >
> >>> > Thanks again very much for taking the time,
> >>> >
> >>> > On Mon, Oct 19, 2015 at 9:07 AM Yaron Koren <[hidden email]>
> >>> wrote:
> >>> >
> >>> >> Hi,
> >>> >>
> >>> >> I'm not a core SMW developer, so I don't know if I can directly help
> >>> with
> >>> >> this issue, but I am curious about it. As we all know,
> property-based
> >>> >> querying and text-based searching are very different from one other
> >>> >> conceptually: one is semantic, and the other is syntactic. But on a
> >>> >> practical level, they also seem rather different: text searches are
> >>> used
> >>> >> for finding one or more individual pages, and are often done just
> >>> once;
> >>> >> while standard SMW queries are generally used for aggregating data,
> >>> and
> >>> >> are
> >>> >> usually meant for display. So it doesn't seem obvious that it would
> be
> >>> >> useful to put the two together - though I'm aware that this is a
> >>> common
> >>> >> request. In both of your cases, my guess is that this would be used
> to
> >>> >> help
> >>> >> with one-time searches, to let users further drill down on search
> >>> results
> >>> >> in order to avoid having to go through many result pages in order to
> >>> find
> >>> >> what they're looking for. Is that correct? And if so, how often is
> >>> that
> >>> >> necessary? Are there really times when this kind of hybrid search
> >>> would be
> >>> >> much faster than one type of search or the other?
> >>> >>
> >>> >> -Yaron
> >>> >>
> >>> >> On Fri, Oct 16, 2015 at 7:12 PM, Josh King <[hidden email]>
> >>> wrote:
> >>> >>
> >>> >>> Hi Jeremi,
> >>> >>>
> >>> >>> Glad to see I'm not the only one.
> >>> >>> One solution I could imagine is taking advantage of the fact that
> >>> much of
> >>> >>> my pages (even the bodies of text) are included in various
> >>> properties, so
> >>> >>> I
> >>> >>> could conceivably have a complex query that searches through all of
> >>> those
> >>> >>> properties for that text I think. I personally would like to avoid
> >>> this
> >>> >>> though as it would cause me to need set my query depth and size
> >>> >>> configurations high enough that I bet I'd lose performance (and I'm
> >>> not
> >>> >>> even sure the query would really be possible itself).
> >>> >>>
> >>> >>> Just thought I would mention that in case it comes into thinking
> for
> >>> >>> others
> >>> >>> also.
> >>> >>>
> >>> >>> -Josh
> >>> >>>
> >>> >>> On Fri, Oct 16, 2015 at 5:52 PM, Jeremi Plazas <[hidden email]>
> >>> wrote:
> >>> >>>
> >>> >>> > Hi Josh,
> >>> >>> >
> >>> >>> > I've run into the same problem. I would love to see a solution to
> >>> this
> >>> >>> but
> >>> >>> > haven't yet. It seems to be an either/or kind of situation.
> Either
> >>> >>> search
> >>> >>> > page content and titles by using the top right search bar, or
> >>> search
> >>> >>> > semantic properties with custom query forms, etc. But I haven't
> >>> found a
> >>> >>> way
> >>> >>> > to combine the two.
> >>> >>> >
> >>> >>> > Hope this issue is addressed, I second that it's an obvious need.
> >>> >>> >
> >>> >>> > On Fri, Oct 16, 2015 at 4:31 PM Josh King <[hidden email]>
> >>> wrote:
> >>> >>> >
> >>> >>> >> Hi all,
> >>> >>> >>
> >>> >>> >> I seem to be stuck on what I feel should be a simple issue. I've
> >>> >>> >> created a search interface for users on my wiki to allow for a sort
> >>> >>> >> of "advanced search" where they can effectively filter their results
> >>> >>> >> by things like authors, date published, etc through an inline query
> >>> >>> >> (each of which is attached to a particular property and is already
> >>> >>> >> set up for the user).
> >>> >>> >>
> >>> >>> >> What I would like to do though is also add a general-purpose field
> >>> >>> >> to my query to the extent of "page includes" which searches for
> >>> >>> >> similar text to the user entry within all of the page's content just
> >>> >>> >> like the top-right search bar would in a standard wiki search. Any
> >>> >>> >> idea how to do this?
> >>> >>> >>
> >>> >>> >> Since inline queries seem to be built on searching by property
> >>> >>> >> relationships, I got hung up. Thanks for any help!
> >>> >>> >>
> >>> >>> >> -Josh
> >>> >>> >>
> >>> >>> >>
> >>> >>> > --
> >>> >>> > Jeremi Plazas
> >>> >>> > Assistant Director of Research @ Tsadra Foundation
> >>> >>> > www.tsadra.org // [hidden email]
> >>> >>> >
> >>> >>>
> >>> >>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> WikiWorks · MediaWiki Consulting · http://wikiworks.com
> >>> >>
> >>> > --
> >>> > Jeremi Plazas
> >>> > Assistant Director of Research @ Tsadra Foundation
> >>> > www.tsadra.org // [hidden email]
> >>> >
> >>> >
> >>>
> >>
> >>
> >
>
>

Re: How to query whole page content?

Josh King
Hi Simon,

Well, know that there are those of us out there who would love for you to
continue with that thought! :)
A dedicated search that works with SMW would really open up quite a few
doors.

-Josh
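
For anyone wanting to experiment in the meantime, here is a rough Python sketch of the "combined search" idea discussed in this thread: run the normal full-text search first, then keep only the hits that also satisfy a semantic query (the post-process reduction James describes in the quoted message). It assumes the standard MediaWiki `list=search` API module and SMW's `action=ask` module. The wiki URL, the example property and the helper names are hypothetical, and the JSON shapes are the ones those modules normally return, so treat it as an illustration rather than a tested integration.

```python
from urllib.parse import urlencode

API_URL = "https://wiki.example.org/w/api.php"  # hypothetical wiki endpoint


def fulltext_search_url(term, limit=50):
    """Build a MediaWiki full-text search request (list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def semantic_query_url(conditions, limit=500):
    """Build an SMW ask request from raw conditions, e.g. '[[Has author::X]]'."""
    params = {
        "action": "ask",
        "query": conditions + "|limit=" + str(limit),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def reduce_hits(search_json, ask_json):
    """Keep full-text hits (in ranking order) that also match the semantic query."""
    # With no matches, the ask module may return an empty list instead of a dict.
    semantic_titles = set(ask_json.get("query", {}).get("results") or {})
    hits = search_json.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits if hit["title"] in semantic_titles]
```

Fetching the two URLs (with any HTTP client) and passing the decoded JSON to `reduce_hits` gives the page titles that match both the free text and the semantic conditions. Filtering after the fact keeps the text ranking from the search engine, at the cost of a second request per search.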

On Mon, Nov 16, 2015 at 2:11 AM, Simon Heimler <[hidden email]> wrote:

> Hello Josh,
>
> I've briefly thought about adding ElasticSearch as an additional store to
> SMW. This could add all semantic properties + the freetext to the ES DB and
> make it available for advanced searching, querying and data-analysis.
>
> SMW also allows you to fetch all semantic properties of a page:
> http://semantic-mediawiki.org/wiki/Ask_API#BrowseBySubject
>
> This would be a bigger project, though.
>
> Best,
> Simon
>
> 2015-11-16 0:16 GMT+01:00 Josh King <[hidden email]>:
>
>> To anyone still interested in this, and hopefully to save them time as
>> well...
>>
>> Unfortunately, trying to use subproperties as a workaround is not viable
>> for this. I attempted placing key portions of my articles in subproperties
>> and then putting them under an umbrella property like "Has page content
>> for search." This works well for a few short properties, but fails at
>> longer text strings due to only 40 characters being searchable (*a real
>> bummer...*), as noted in this old discussion:
>>
>> http://wikimedia.7.x6.nabble.com/Unable-to-query-over-the-last-words-of-a-long-text-value-td5020618.html#a5020801
>>
>> The query could also be broken up into a bunch of OR statements to run
>> through a series of short properties (all fed from one field), but each
>> property would still be limited by the 40-character count noted above.
>>
>> I suppose I'll end up trying to use SolrStore eventually, but I can't help
>> but feel that SMW will eventually need to head in the direction of a
>> hybrid search anyway. Even if using third-party code to create a hybrid
>> semantic/syntactic search page is viable, it remains quite *awkward* for
>> the top-right search bar to still not have the functionality (especially
>> as users will almost always default to that top-right bar rather than
>> hunting for a different search link). My opinion, at least.
>>
>> Hope that update helps someone out there!
>> -Josh
>>
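
The OR workaround mentioned in the message above can be sketched in a few lines of Python. This only builds the condition string for an inline query from one user-entered term; the property names are hypothetical, and `~` with `*` wildcards is SMW's "like" comparator. Per the message, each property would still only be searchable up to the 40-character limit, so this shows the shape of the query, not a fix for that limit.

```python
def build_or_query(term, properties):
    """Build an #ask condition matching `term` in any of the given properties."""
    # "~" is SMW's like-comparator; "*" acts as a wildcard on both sides.
    clauses = ["[[{}::~*{}*]]".format(prop, term) for prop in properties]
    return " OR ".join(clauses)


# Hypothetical property names; one search field feeds every clause.
condition = build_or_query("rigpa", ["Has intro text", "Has glossary text"])
```

The resulting string can be dropped into `{{#ask: ... }}` in place of a single condition. User input containing characters such as `[`, `]` or `|` would need escaping first, which this sketch does not attempt.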
>> On Sat, Oct 24, 2015 at 4:59 PM, Josh King <[hidden email]> wrote:
>>
>> > I think I'm set on trying to find a crafty way to avoid those external
>> > options if possible. Anyone know if I could dump all of my page text
>> > into a sort of master property and query it syntactically that way?
>> > ...Maybe I'm crazy for thinking it, but hey...
>> >
>> > On Tue, Oct 20, 2015 at 8:41 PM, Josh King <[hidden email]> wrote:
>> >
>> >> Thanks James... I think I need to sit on this for a bit and see if I
>> >> can make any of this work for me. I'd love if I could keep everything
>> >> "in house" and not have to rely on external engines for the search, but
>> >> it's a wise idea I hadn't yet considered and should think on.
>> >>
>> >> -Josh
>> >>
>> >> On Mon, Oct 19, 2015 at 12:02 PM, James HK <[hidden email]> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> > Anyways this is our own specific usage and may not be relevant
>> >>> > enough to the wiki community at large. We work with older scholars
>> >>> > and academics who
>> >>>
>> >>> Nice, I wish we had more of such use case descriptions.
>> >>>
>> >>> Since this also emerged recently on smw.org [0], let me add some
>> >>> remarks to this topic.
>> >>>
>> >>> In general, arbitrary text search (matching a string, computation of
>> >>> string similarity [1]) and attributive search (matching an entity on
>> >>> certain attributive conditions, computation of semantic relatedness
>> >>> between concepts) are two distinct approaches, and which one suffices
>> >>> depends on the objective one tries to achieve.
>> >>>
>> >>> > in our resources if that search isn't as close as possible to ideal.
>> >>> > But I understand your point, is it worth investing time into
>> >>> > combining these search types if one or the other gets you close
>> >>> > enough. For us I think the
>> >>>
>> >>> As to how this can be combined, let me give some examples [2, 3, 4].
>> >>>
>> >>> [2] uses MW's standard Special:Search interface to match semantic
>> >>> conditions entered in the search field.
>> >>>
>> >>> [3] also uses MW's standard Special:Search interface with results
>> >>> being matched against semantic attributes after they have been fetched
>> >>> from a SearchEngine (which could be ElasticSearch and independent from
>> >>> SMW). The reduction of results is done in a post-process to only show
>> >>> subjects that match certain semantic conditions (in case of [3] it is
>> >>> the annotated content language).
>> >>>
>> >>> [2, 3] are semi "combined search" solutions that use MW's
>> >>> infrastructure (search hooks) to provide a single interface and
>> >>> integrate search results provided by an external SearchEngine (which
>> >>> focuses on document or full-text indexing) with matches against some
>> >>> attributes (category, property etc.).
>> >>>
>> >>> The video in [4] (2:40 min) shows how a text search and result set
>> >>> reduction can work together when using MW's Special:Search.
>> >>>
>> >>> Another possibility is to look at the SolrStore extension [5] which
>> >>> provides a Lucene full-text search while incorporating semantic
>> >>> annotations.
>> >>>
>> >>> > answer would be yes. Or at least a way to refine a text-search's
>> >>> > results using semantic properties? Maybe? I don't know.
>> >>>
>> >>> [2, 3] show how a Special:Search integration can be achieved, and you
>> >>> can easily develop such an interface yourself (given that the full-text
>> >>> search is used as the starting point for the search process and
>> >>> attributes are matched in a post-process to reduce the set of matches).
>> >>>
>> >>> PS: In the past we had some minor exchange about an ElasticSearch
>> >>> integration but since that requires commitment and resources it hasn't
>> >>> been on anyone's agenda.
>> >>>
>> >>> [0] https://semantic-mediawiki.org/w/index.php?title=semantic-mediawiki.org:Community_portal&offset=20151019103205&lqt_mustshow=1975#Semantic_search_in_documents_.28indexing.29_1975
>> >>>
>> >>> [1] Giunchiglia, Fausto, Uladzimir Kharkevich, and Ilya Zaihrayeu.
>> >>> "Concept search: Semantics enabled syntactic search." (2008).
>> >>>
>> >>> [2] https://semantic-mediawiki.org/wiki/Help:SMWSearch
>> >>> [3] https://github.com/SemanticMediaWiki/SemanticInterlanguageLinks/tree/master/src/Search
>> >>> [4] https://vimeo.com/115871518
>> >>> [5] https://www.mediawiki.org/wiki/Extension:SolrStore
>> >>>
>> >>> Cheers
>> >>>
>> >>> On 10/19/15, Jeremi Plazas <[hidden email]> wrote:
>> >>> > Hey Yaron,
>> >>> >
>> >>> > Thanks for taking the time to put some thought into this.
>> >>> >
>> >>> > Here's the situation for us here at Tsadra Foundation. We use the
>> >>> > wiki for archiving books. Any given book page contains a template
>> >>> > with lots of semantic data on that book, namely author, translator,
>> >>> > editor, publication information, glossary, index, and bibliographic
>> >>> > information of all kinds. In a lot of cases we also have some of the
>> >>> > content of the book on the page itself (plain text, outside of the
>> >>> > main template), usually the introduction, glossary and bibliography,
>> >>> > and in some cases the full text of the book. Our site is used for
>> >>> > research by many translators and scholars who, for the most part,
>> >>> > are interested in "hybrid" searches (as you call them). Meaning, for
>> >>> > example, they'd like to know how such an author (semantic data)
>> >>> > translates a particular Tibetan term in their glossary (syntactic
>> >>> > data in the page's full text). This is just an example, but most of
>> >>> > the searches we are requested to do on their behalf are constructed
>> >>> > as such: a combination of some set of semantic data to isolate a
>> >>> > book or group of books, and then syntactic data from these books'
>> >>> > content. So in short, for our use anyways, some form of search
>> >>> > capability that combines the two would be ideal.
>> >>> >
>> >>> > Anyways this is our own specific usage and may not be relevant
>> >>> > enough to the wiki community at large. We work with older scholars
>> >>> > and academics who aren't necessarily very tech savvy, and making
>> >>> > them use a search box instead of emailing us is already hard enough.
>> >>> > They easily lose interest in our resources if that search isn't as
>> >>> > close as possible to ideal. But I understand your point: is it worth
>> >>> > investing time into combining these search types if one or the other
>> >>> > gets you close enough? For us I think the answer would be yes. Or at
>> >>> > least a way to refine a text-search's results using semantic
>> >>> > properties? Maybe? I don't know.
>> >>> >
>> >>> > Thanks again very much for taking the time,
>> >>> >
>> >>> > On Mon, Oct 19, 2015 at 9:07 AM Yaron Koren <[hidden email]> wrote:
>> >>> >
>> >>> >> Hi,
>> >>> >>
>> >>> >> I'm not a core SMW developer, so I don't know if I can directly
>> >>> >> help with this issue, but I am curious about it. As we all know,
>> >>> >> property-based querying and text-based searching are very different
>> >>> >> from one another conceptually: one is semantic, and the other is
>> >>> >> syntactic. But on a practical level, they also seem rather
>> >>> >> different: text searches are used for finding one or more
>> >>> >> individual pages, and are often done just once; while standard SMW
>> >>> >> queries are generally used for aggregating data, and are usually
>> >>> >> meant for display. So it doesn't seem obvious that it would be
>> >>> >> useful to put the two together - though I'm aware that this is a
>> >>> >> common request. In both of your cases, my guess is that this would
>> >>> >> be used to help with one-time searches, to let users further drill
>> >>> >> down on search results in order to avoid having to go through many
>> >>> >> result pages in order to find what they're looking for. Is that
>> >>> >> correct? And if so, how often is that necessary? Are there really
>> >>> >> times when this kind of hybrid search would be much faster than one
>> >>> >> type of search or the other?
>> >>> >>
>> >>> >> -Yaron
>> >>> >>
>> >>> >> On Fri, Oct 16, 2015 at 7:12 PM, Josh King <[hidden email]> wrote:
>> >>> >>
>> >>> >>> Hi Jeremi,
>> >>> >>>
>> >>> >>> Glad to see I'm not the only one.
>> >>> >>> One solution I could imagine is taking advantage of the fact that
>> >>> >>> much of my pages (even the bodies of text) are included in various
>> >>> >>> properties, so I could conceivably have a complex query that
>> >>> >>> searches through all of those properties for that text, I think. I
>> >>> >>> personally would like to avoid this though, as it would cause me to
>> >>> >>> need to set my query depth and size configurations high enough that
>> >>> >>> I bet I'd lose performance (and I'm not even sure the query would
>> >>> >>> really be possible itself).
>> >>> >>>
>> >>> >>> Just thought I would mention that in case it comes into thinking
>> >>> >>> for others also.
>> >>> >>>
>> >>> >>> -Josh
>> >>> >>>
>> >>> >>> On Fri, Oct 16, 2015 at 5:52 PM, Jeremi Plazas <[hidden email]> wrote:
>> >>> >>>
>> >>> >>> > Hi Josh,
>> >>> >>> >
>> >>> >>> > I've run into the same problem. I would love to see a solution to
>> >>> >>> > this but haven't yet. It seems to be an either/or kind of
>> >>> >>> > situation. Either search page content and titles by using the top
>> >>> >>> > right search bar, or search semantic properties with custom query
>> >>> >>> > forms, etc. But I haven't found a way to combine the two.
>> >>> >>> >
>> >>> >>> > Hope this issue is addressed, I second that it's an obvious need.
>> >>> >>> >
>> >>> >>> > On Fri, Oct 16, 2015 at 4:31 PM Josh King <[hidden email]> wrote:
>> >>> >>> >
>> >>> >>> >> Hi all,
>> >>> >>> >>
>> >>> >>> >> I seem to be stuck on what I feel should be a simple issue. I've
>> >>> >>> >> created a search interface for users on my wiki to allow for a
>> >>> >>> >> sort of "advanced search" where they can effectively filter their
>> >>> >>> >> results by things like authors, date published, etc through an
>> >>> >>> >> inline query (each of which is attached to a particular property
>> >>> >>> >> and is already set up for the user).
>> >>> >>> >>
>> >>> >>> >> What I would like to do though is also add a general-purpose
>> >>> >>> >> field to my query to the extent of "page includes" which searches
>> >>> >>> >> for similar text to the user entry within all of the page's
>> >>> >>> >> content just like the top-right search bar would in a standard
>> >>> >>> >> wiki search. Any idea how to do this?
>> >>> >>> >>
>> >>> >>> >> Since inline queries seem to be built on searching by property
>> >>> >>> >> relationships, I got hung up. Thanks for any help!
>> >>> >>> >>
>> >>> >>> >> -Josh
>> >>> >>> >>
>> >>> >>> > --
>> >>> >>> > Jeremi Plazas
>> >>> >>> > Assistant Director of Research @ Tsadra Foundation
>> >>> >>> > www.tsadra.org // [hidden email]
>> >>> >>> >
>> >>> >>>
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> WikiWorks · MediaWiki Consulting · http://wikiworks.com
>> >>> >>
>> >>> > --
>> >>> > Jeremi Plazas
>> >>> > Assistant Director of Research @ Tsadra Foundation
>> >>> > www.tsadra.org // [hidden email]
>> >>> >
>> >>> >
>> >>>
>> >>
>> >>
>> >
>>
>>
>
>