Discovery Weekly Update for the week starting 2019-03-25

Chris Koerner-2

This is the weekly update from the Search Platform team for the weeks
starting 2019-03-25 and 2019-04-01.

As always, feedback and questions are welcome.

== Discussions ==

=== Search ===
* Elasticsearch upgrade to v6:
** incident [0]
* Trey finished a deep dive into the performance of language
identification for cross-wiki searching [1] (example [2]) and
punctuation-related problems, and discovered things are working pretty
well overall, but the Chinese language model is a bit off.
* Erik noticed that the inlabel / incaption keywords should highlight
the label/caption but did not [3]
* David worked on fixing errors caused by deprecations in Elasticsearch 6:
nested_path and nested_filter are deprecated [4], and
_retry_on_conflict is deprecated as well [5]
* We worked on migrating mjolnir to stdout/syslog/cee logging output [6]
* The team worked on the upgrade to Elasticsearch 6.5.4 for cirrus,
specifically for codfw [7] and for eqiad [8]
* Erik worked on the implementation and testing of glent m0
integration with wmf infrastructure [9]
* David did a lot of work to update the mw-config to use the psi and
omega elastic clusters [10]
* David found that auto_generate_phrase_queries is deprecated and
ineffective [11]
* The team fixed an old bug where we were getting fatal errors -
"cannot perform this operation with arrays" from
CirrusSearch/ElasticaWrite (using JobQueueDB) [12]
* Gehel worked to make spicerack more robust when unfreezing writes to
elasticsearch / cirrus [13] as well as creating a cookbook to reset
frozen write state on elasticsearch / cirrus [14]
* Stas moved WikibaseLexeme search code to WikibaseLexemeCirrusSearch
extension [15]
* We noticed that Elasticsearch indices went read-only, causing a huge lag [16]
* We also saw that search exception handling was printing response
information on the screen [17]
* The team fixed an issue where mwgrep was not working [18]
* We also silenced Elasticsearch 6 deprecation warnings to avoid
logspam [19]
* We needed to create extra Elasticsearch clusters in the beta cluster [20]
* We also needed some alerts so we know if mjolnir starts misbehaving [21]
* We also converted the icinga plugin to py3 [22]
* We needed to start using a local nginx reverse proxy for connection reuse [23]
* The version of curator that we currently use (5.2.0) isn't
compatible with elasticsearch 6, which causes issues in a few cron
jobs on the logstash servers. Version 5.6.0 supports both
elasticsearch 5 and 6, so we updated to it [24]
* We also did some cleanup of the reprepro configuration for
elasticsearch-curator [25]
* Getting a centralized way to inspect the content of the search
profiles might be helpful when investigating search behaviors. In the
same vein as other dump debug APIs (mapping/settings/cirrusdoc) David
suggested that we should add a new simple API to dump the profiles
(cirrus-profiles-dump) [26]
* David also found and fixed a call to a member function toArray() on
a non-object (null) in
vendor/ruflin/elastica/lib/Elastica/Client.php:736 [27]
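
For readers unfamiliar with the two Elasticsearch 6 deprecations mentioned in the list ([4] and [5]), here is a minimal sketch of what the renames look like. It is not the CirrusSearch patch: the helper names and the sample field names (offer.price, offer.color) are made up for illustration. It assumes the documented replacements, namely a "nested" object with "path" and "filter" keys in sort clauses, and "retry_on_conflict" without the leading underscore in bulk action metadata.

```python
from copy import deepcopy


def modernize_nested_sort(sort_clause):
    """Rewrite a {field: options} sort clause to use the "nested" object.

    Hypothetical helper: moves nested_path / nested_filter into
    {"nested": {"path": ..., "filter": ...}}. Clauses without
    nested_path are returned unchanged.
    """
    updated = {}
    for field, options in sort_clause.items():
        if not isinstance(options, dict) or "nested_path" not in options:
            updated[field] = deepcopy(options)
            continue
        options = deepcopy(options)
        nested = {"path": options.pop("nested_path")}
        if "nested_filter" in options:
            nested["filter"] = options.pop("nested_filter")
        options["nested"] = nested
        updated[field] = options
    return updated


def modernize_bulk_action(action):
    """Rename _retry_on_conflict to retry_on_conflict in bulk metadata.

    Hypothetical helper: `action` is one bulk action line such as
    {"update": {"_id": "1", "_retry_on_conflict": 3}}.
    """
    updated = {}
    for op, meta in action.items():
        meta = dict(meta)
        if "_retry_on_conflict" in meta:
            meta["retry_on_conflict"] = meta.pop("_retry_on_conflict")
        updated[op] = meta
    return updated


if __name__ == "__main__":
    old_sort = {
        "offer.price": {
            "order": "asc",
            "nested_path": "offer",
            "nested_filter": {"term": {"offer.color": "blue"}},
        }
    }
    print(modernize_nested_sort(old_sort))
    print(modernize_bulk_action({"update": {"_id": "1", "_retry_on_conflict": 3}}))
```

Both helpers return new dictionaries and leave their input untouched, so they could be applied to a request body just before it is sent.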



Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.

The archive of all past updates can be found on

Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.


Chris Koerner (he/him)
Community Relations Specialist
Wikimedia Foundation

Wikitech-l mailing list