Problem with really BIG solr indices

Thursday 19 March 2009 4:52:13 am - 5 replies

Ali Nebi

Thursday 09 April 2009 6:50:18 am

Hi,

We just ran some tests with eZ Find 2 and hit the same problem: the Solr indexes took up 650 GB, which is really big. The same indexes built with eZ Find 1 and its corresponding Solr version are 9.5 GB.

Why does this happen, and how can we solve it?

Thanks in advance!

Iguana Information Technologies, SL - http://www.iguanait.com

Nicolas Pastorino

Friday 10 April 2009 12:32:55 am

@Xavier:
Any feedback on your issue? Did the proposed solution (disabling the OptimizeOnCommit directive and setting up a daily 'optimize' workflow) work?

@Ali:
This index size is very surprising. Did the indexed content base grow a lot between your use of eZ Find 1.x and eZ Find 2.0? Are external elements indexed too (through the DataImportHandler Solr extension, for instance)? Are websites crawled?

Best regards,

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Ali Nebi

Wednesday 15 April 2009 8:20:04 am

Sorry for my late reply.

We use the same database for the tests, and the data in it does not change. We also don't index any external elements.

We are continuing to test. On another test server the data directory was smaller than on the server where it reached 650 GB, but it is still big: 14 GB with only 40% of the data indexed.

Regards, Ali Nebi!

Iguana Information Technologies, SL - http://www.iguanait.com

Xavier Serna

Thursday 16 April 2009 12:50:04 am

Hi Nico,

Many thanks for your proposed solution; it seems to work fine now with OptimizeOnCommit disabled.
Only one detail: updatesearchindexsolr.php forces an optimize on each commit (every 1000 objects), without respecting the setting in the ini file. I believe this should be fixed, because reindexing all the XML files takes more than 4 hours.

thanks!

--
Xavier Serna
eZ Publish Certified Developer
Departament de Software
Microblau S.L. - http://www.microblau.net
+34 937 466 205
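
The difference Xavier describes, between optimizing on every commit and optimizing once at the end, can be sketched as below. This is only an illustration, not the code of updatesearchindexsolr.php: the function `plan_index_operations` and its parameters are hypothetical, and the only real elements are Solr's `<commit/>` and `<optimize/>` XML update messages.

```python
def plan_index_operations(total_objects, batch_size=1000, optimize_on_commit=False):
    """Return the ordered list of operations for a full reindex.

    Hypothetical helper for illustration only. Each batch is added and
    committed; an optimize is issued either after every commit or once
    at the very end, depending on optimize_on_commit.
    """
    ops = []
    for start in range(0, total_objects, batch_size):
        end = min(start + batch_size, total_objects)
        ops.append(f"add objects {start}..{end - 1}")
        ops.append("<commit/>")
        if optimize_on_commit:
            # Rewrites the index segments after every batch: expensive.
            ops.append("<optimize/>")
    if not optimize_on_commit and total_objects > 0:
        # A single optimize once all batches are committed.
        ops.append("<optimize/>")
    return ops


if __name__ == "__main__":
    for op in plan_index_operations(2500):
        print(op)
```

With 2500 objects and batches of 1000, the plan contains three commits in both modes, but three optimizes when optimizing on commit and only one otherwise.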

Ali Nebi

Monday 01 June 2009 4:56:57 am

Hi,

After some more tests and more time spent on eZ Find 2, we found out why the Solr indices were so big.

First, we needed to set userFork to false. The real problem was explained here by Denitsa M.:

http://ez.no/developer/forum/extensions/ez_find/ezfind2_indexing_speed_incredibly_low_er

When indexing reaches objects that have relationlist attribute(s), it loops between these related objects and the indices grow bigger and bigger. Once we made these attributes non-searchable, indexing a 2 GB database was much faster and the index size was in the hundreds of MB.

Regards, Ali Nebi!

Iguana Information Technologies, SL - http://www.iguanait.com
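
The kind of growth Ali describes can be reproduced with a toy model. This is a sketch under stated assumptions, not eZ Find's actual indexing code: the `objects` dictionary, both functions and the depth limit are hypothetical. It assumes that indexing an object also inlines the text of the objects it relates to, and compares that with visiting each object only once.

```python
def indexed_text(obj_id, objects, max_depth):
    """Naive expansion: inline the text of related objects, recursively.

    Cyclic relations are followed again and again until max_depth is
    exhausted, so the output grows quickly with depth.
    """
    obj = objects[obj_id]
    parts = [obj["text"]]
    if max_depth > 0:
        for related_id in obj["relations"]:
            parts.append(indexed_text(related_id, objects, max_depth - 1))
    return " ".join(parts)


def indexed_text_once(obj_id, objects, seen=None):
    """Same expansion, but each object is visited at most once."""
    seen = set() if seen is None else seen
    if obj_id in seen:
        return ""
    seen.add(obj_id)
    obj = objects[obj_id]
    parts = [obj["text"]]
    for related_id in obj["relations"]:
        text = indexed_text_once(related_id, objects, seen)
        if text:
            parts.append(text)
    return " ".join(parts)


# Three hypothetical objects that all relate to each other.
objects = {
    "a": {"text": "alpha", "relations": ["b", "c"]},
    "b": {"text": "beta", "relations": ["a", "c"]},
    "c": {"text": "gamma", "relations": ["a", "b"]},
}

if __name__ == "__main__":
    print(len(indexed_text("a", objects, 10).split()))  # words with naive expansion
    print(len(indexed_text_once("a", objects).split()))  # words with a visited set
```

In this model, three one-word objects expand to 2047 words at depth 10 with the naive approach and to 3 words when each object is visited once. Making the relation attributes non-searchable, as in the post above, corresponds to skipping the relations entirely.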
