Notes on setting up CirrusSearch for the UESP wiki.
Setup
- Install Java if not already installed on server (yum install java-1.7.0-openjdk.x86_64).
- Download ElasticSearch and uncompress it, or possibly install it via yum (not confirmed).
- If using manual install, copy to /home/uesp/elasticsearch/.
- Add a script.disable_dynamic: false line to config/elasticsearch.yml.
- Set the amount of memory to use in bin/elasticsearch.in.sh with two lines like:
ES_HEAP_SIZE=2g
MAX_LOCKED_MEMORY=unlimited
- Create or copy an init.d script at /etc/init.d/elasticsearch. Start ES as a daemon (/home/uesp/elasticsearch/bin/elasticsearch -d) as the user uesp.
- Add elasticsearch to chkconfig startup.
chkconfig --add elasticsearch
chkconfig --level 345 elasticsearch on
- Install the MediaWiki Elastica Extension.
- Install the MediaWiki CirrusSearch Extension.
- The relevant lines in LocalSettings.php should look like:
require_once( "$IP/extensions/Elastica/Elastica.php" );
require_once( "$IP/extensions/CirrusSearch/CirrusSearch.php" );
# $wgDisableSearchUpdate = true;
$wgCirrusSearchServers = array( '10.7.143.20' );
$wgSearchType = 'CirrusSearch';
- Create the ES index:
php ./extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php
- Force the index to be updated:
php ./extensions/CirrusSearch/maintenance/forceSearchIndex.php --skipLinks --indexOnSkip
php ./extensions/CirrusSearch/maintenance/forceSearchIndex.php --skipParse
- Average index rate for the first step on content3 was ~3 pages/second over 200349 page IDs. The second index step averaged around 40 pages/second.
- Average index rate for the final index on files1 was ~14 pages/second using 4 parallel indexing operations on content1/2/3.
- Test the search function.
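The index rates above translate into wall-clock time roughly as follows. This is only a minimal sketch: index_eta_hours is a hypothetical helper, not part of CirrusSearch, and reusing the 200349 page count for the second pass is an assumption (the notes give that count only for the first step).

```shell
#!/bin/sh
# Hypothetical helper: estimate how many hours an indexing pass takes
# given a page count and an average rate in pages per second.
index_eta_hours() {
    pages=$1   # number of page IDs to index
    rate=$2    # average pages per second
    awk -v p="$pages" -v r="$rate" 'BEGIN { printf "%.1f\n", p / r / 3600 }'
}

index_eta_hours 200349 3    # first pass (--skipLinks --indexOnSkip)
index_eta_hours 200349 40   # second pass (--skipParse); page count assumed the same
```

With these figures the estimate is about 18.6 hours for the first pass and about 1.4 hours for the second.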
Benchmarking
- dev.uesp.net
- 600 MB RAM in use, index size 700 MB.
- From localhost
- Simple benchmark using ApacheBench (keep-alive, 10 concurrent requests, 3 second time limit), like ab -kc 10 -t 3 http://localhost:9200/uesp_net_wiki5_general_first/_search?q=something
- Werewolf4: 902 req/sec (11 ms average, 99% at 55 ms)
- Werewolf: 5560 req/sec (2 ms average, 99% at 13 ms)
- Werewolf7: 1580 req/sec (6 ms average, 99% at 22 ms)
- Vampire: 5050 req/sec (2 ms average, 99% at 13 ms)
- Vampire+Werewolf: 1210 req/sec (8 ms average, 99% at 30 ms)
- From content1
- Vampire: 2140 req/sec (5 ms average, 99% at 10 ms)
- files1.uesp.net
- From content2
- Vampire: 1550 req/sec (6.5 ms average, 99% at 14 ms)
- Werewolf: 1070 req/sec (9.3 ms average, 99% at 17 ms)
- Vampire+Werewolf: 150 req/sec (67 ms average, 99% at 92 ms)
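The average latencies listed above follow from the request rates: ab reports the mean time per request as concurrency divided by throughput. The sketch below reproduces that arithmetic. mean_latency_ms is a hypothetical helper, and it assumes every run used the same concurrency of 10 as the example command, which the notes state only for the localhost runs.

```shell
#!/bin/sh
# Hypothetical helper: mean time per request in milliseconds,
# computed as concurrency / throughput * 1000.
mean_latency_ms() {
    concurrency=$1   # value passed to ab -c
    req_per_sec=$2   # "Requests per second" reported by ab
    awk -v c="$concurrency" -v r="$req_per_sec" 'BEGIN { printf "%.1f\n", c / r * 1000 }'
}

mean_latency_ms 10 902   # dev.uesp.net, Werewolf4
mean_latency_ms 10 150   # files1.uesp.net, Vampire+Werewolf
```

For 902 req/sec this prints 11.1, close to the 11 ms listed. For 150 req/sec it prints 66.7, against the 67 ms listed.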
Search Highlighter
- This details installing the ElasticSearch searchhighlighter plugin so we can use the $wgCirrusSearchUseExperimentalHighlighter = true; feature in MW. Note that setting this to true without the plugin causes the search to crash.
- The installation for v1.7 detailed at https://github.com/wikimedia/search-highlighter doesn't work as the JAR files are no longer available at the original location. Instead use the following command on your ElasticSearch setup:
./bin/plugin --install wikimedia/search-highlighter --url https://download.jar-download.com/cache_jars/org.wikimedia.search.highlighter/experimental-highlighter-elasticsearch-plugin/1.7.0/jar_files.zip
- Restart ElasticSearch.
- To verify the plugin is installed and working, check the ElasticSearch log for a line like:
[plugins ] [Domina] loaded [experimental highlighter], sites []
- You can also check the MW search query by adding &cirrusDumpQuery and looking for "type":"experimental" in the result.
- Set the following in the MW config and test search:
$wgCirrusSearchUseExperimentalHighlighter = true;
$wgCirrusSearchOptimizeIndexForExperimentalHighlighter = true;
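The two verification steps above (the log line and the &cirrusDumpQuery output) could be scripted along these lines. This is a sketch only: both function names are hypothetical, the log file path has to be supplied by the caller, and the patterns are taken from the strings quoted in these notes rather than from any documented format.

```shell
#!/bin/sh
# Hypothetical helper: reads a dumped query (JSON) on stdin and succeeds
# if it requests the experimental highlighter. Allows optional whitespace
# around the colon in case the JSON is pretty-printed.
uses_experimental_highlighter() {
    grep -Eq '"type"[[:space:]]*:[[:space:]]*"experimental"'
}

# Hypothetical helper: succeeds if the given ElasticSearch log file
# contains the plugin load message quoted in the notes.
highlighter_plugin_loaded() {
    grep -Fq 'loaded [experimental highlighter]' "$1"
}
```

A possible use is to pipe the output of a search URL with &cirrusDumpQuery appended (fetched with curl, for instance) into uses_experimental_highlighter, and to call highlighter_plugin_loaded with the path to the ElasticSearch log after a restart.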
Ports
ElasticSearch services running on search1:
- v2.4 -- Port 9202
- v5.3 -- Port 9005
- v5.6 -- Port 9004
- v6.8 -- Port 9006
- v7.10 -- Port 9007
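For scripts that need to reach a particular ElasticSearch version on search1, the table above can be wrapped in a small lookup. es_port is a hypothetical helper. The version-to-port pairs are copied from the list above and would need updating if services are added or moved.

```shell
#!/bin/sh
# Hypothetical helper: print the port for a given ElasticSearch version
# on search1, using the mapping listed in these notes.
es_port() {
    case "$1" in
        2.4)  echo 9202 ;;
        5.3)  echo 9005 ;;
        5.6)  echo 9004 ;;
        6.8)  echo 9006 ;;
        7.10) echo 9007 ;;
        *)    echo "unknown ElasticSearch version: $1" >&2; return 1 ;;
    esac
}

es_port 5.6
```

The result can then be substituted into a URL, for example "http://search1:$(es_port 7.10)/", assuming the host name search1 resolves from wherever the script runs.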