Error in Revision Table in lbwiki

Error in Revision Table in lbwiki

Anuradha Uduwage
Hi all, 

I have a script that collects data from multiple Wikipedias, and I have started to notice that the revision table in lbwiki_p has some incorrect data.

Here is an example:
mysql> select rev_id, rev_user, rev_page, rev_deleted, rev_len, rev_timestamp from revision where rev_id = 185751;
+--------+----------+----------+-------------+---------+----------------+
| rev_id | rev_user | rev_page | rev_deleted | rev_len | rev_timestamp  |
+--------+----------+----------+-------------+---------+----------------+
| 185751 |      580 |    83446 |           0 |    NULL | 20061203231418 |
+--------+----------+----------+-------------+---------+----------------+

mysql> select rev_id, rev_page, rev_len from revision where rev_page = 83446 and rev_timestamp < 20061203231418;
+--------+----------+---------+
| rev_id | rev_page | rev_len |
+--------+----------+---------+
| 115478 |    83446 |    NULL |
| 118003 |    83446 |    NULL |
| 118009 |    83446 |    NULL |
| 138010 |    83446 |    NULL |
+--------+----------+---------+

As I understand it, if a record exists, rev_len shouldn't be NULL; if the revision is deleted, rev_deleted should be flagged, but rev_len should remain as it is.

I hope someone can look into this, because people doing analysis might end up with wrong results.

Best;
--
Anuradha Uduwage (Anu)

_______________________________________________
Toolserver-l mailing list ([hidden email])
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette

Re: Error in Revision Table in lbwiki

DaB.-2
Hello,
At Saturday 16 March 2013 01:41:47 DaB. wrote:

> Hi all,
>
> I have a script running collecting data in multiple wikipedia(s), I started
> to notice that revision table in lbwiki_p has some incorrect data.
>
> Here is an example:
> mysql> select rev_id, rev_user, rev_page, rev_deleted, rev_len, rev_timestamp from revision where rev_id = 185751;
> +--------+----------+----------+-------------+---------+----------------+
> | rev_id | rev_user | rev_page | rev_deleted | rev_len | rev_timestamp  |
> +--------+----------+----------+-------------+---------+----------------+
> | 185751 |      580 |    83446 |           0 |    NULL | 20061203231418 |
> +--------+----------+----------+-------------+---------+----------------+
The result is correct.

>
> According to my understanding if a record exist rev_len shouldn't be NULL,
> if the revision deleted then rev_deleted should get flag but rev_length
> should remain as it is.
>
> Hope someone can look into this, because people who are doing analysis
> might end up getting wrong results.

rev_len will remain as it is. The problem is that rev_len was not there
from the very beginning and was never (AFAIK) back-populated, so very old rows
have no length and are NULL.


> Best;
> --
> Anuradha Uduwage (Anu)

Sincerely,
DaB.

--
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885


Re: Error in Revision Table in lbwiki

MZMcBride-2
DaB. wrote:
>rev_len will remain as it is. The problem is that rev_len was not there
>from the very beginning and was never (AFAIK) back-populated, so very old
>rows have no length and are NULL.

Related: <https://bugzilla.wikimedia.org/show_bug.cgi?id=12188>.

MZMcBride


