Apache lucene database
5/26/2023

As a result, several Lucene ports, including the limited memory index support in the Lucene contrib module, Lucandra, and HBasene, took a different approach: they override not the directory but Lucene's higher-level classes, IndexReader and IndexWriter, thus bypassing the Directory APIs (Figure 2).

Figure 2: Integrating Lucene with a back end without a file system

Although this approach often requires more work, it leads to significantly more powerful implementations that can fully use the back end's native capabilities. The implementation presented in the article follows this approach.

The overall implementation (Figure 3) is based on a memory-based back end used as an in-memory cache and a mechanism for synchronizing this cache with the HBase back end.

Figure 3: Overall architecture of the HBase-based Lucene implementation

The implementation tries to balance two conflicting requirements. Performance: an in-memory cache can drastically improve performance by minimizing the number of HBase reads needed for search and document retrieval. Scalability: it must be possible to run as many Lucene instances as required to support a growing population of search clients. The latter requires minimizing the cache lifetime so that content stays synchronized with the HBase instance (the single copy of truth). A compromise is achieved through a configurable cache time-to-live parameter, which limits how long cached data is kept in each Lucene instance.

Underlying data model for in memory cache

As mentioned before, the internal Lucene data model is based on two main data sets, the index and the documents, which are implemented as two models: IndexMemoryModel and DocumentMemoryModel.
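
The time-to-live compromise described above can be illustrated with a minimal sketch. This is not the article's implementation: the class name `TtlCache`, the injected `clock`, and the `backend` function (standing in for an HBase read) are all hypothetical names chosen for illustration. The idea is only that each instance serves reads from memory until an entry is older than the configured time to live, and then reloads it from the single copy of truth.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical read-through cache with a configurable time to live (TTL).
 * The backend function is an assumption standing in for an HBase read.
 */
final class TtlCache<K, V> {

    /** A cached value together with the time it was loaded. */
    private static final class Entry<T> {
        final T value;
        final long loadedAtMillis;

        Entry(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;     // injected so expiry can be tested
    private final Function<K, V> backend; // assumed source of truth
    private int backendReads = 0;

    TtlCache(long ttlMillis, LongSupplier clock, Function<K, V> backend) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.backend = backend;
    }

    /** Returns the cached value while it is fresh, otherwise reloads it. */
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now - cached.loadedAtMillis < ttlMillis) {
            return cached.value; // still within its TTL: no backend read
        }
        V fresh = backend.apply(key); // expired or missing: go to the backend
        backendReads++;
        entries.put(key, new Entry<>(fresh, now));
        return fresh;
    }

    /** Number of times the backend was consulted. */
    synchronized int backendReads() {
        return backendReads;
    }
}
```

In a real deployment the clock would typically be `System::currentTimeMillis`. A shorter TTL keeps instances closer to the backend's content at the cost of more backend reads, while a longer TTL favors read performance.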