Percolator is much slower than in ES1, and pre-selecting do not work #16285

garipovazamat · 2024-10-11T10:08:51Z

What is the bug?

We have been trying to migrate from Elasticsearch version 1.7.6 to the latest version (8.15) in our company and discovered that the latest version is much slower. To find the reason for this degradation, I conducted several experiments. During these experiments, I found that some of the claimed improvements likely do not work as expected. I have duplicated this issue from the elasticsearch repository because I found the same problem in OpenSearch. I'm sure that the problem was carried over when OpenSearch was forked.

Experiment details

I created the following index mapping:

{  
    "properties": {  
        "props": {  
            "properties": {  
                "entity_obj": {  
                    "properties": {  
                        "category": {"type": "keyword"},  
                        "id": {"type": "integer"},  
                        "priceTotal": {"type": "integer"},  
                        "totalArea": {"type": "double"}  
                    }  
                },  
                "price": {"type": "long"},  
                "room": {"type": "short", "store": True}  
            }  
        },  
        "query": {"type": "percolator"}  
    }  
}

I filled the index with 10 000 duplicate queries that contain only must, should, term, and range conditions:

{  
    "query": {  
        "bool": {  
            "must": [  
                {"term": {"props.entity_obj.category": "flat1"}},  # first, simple condition
                {  # second, more complicated condition
                    "bool": {  
                        "must": [  
                            {"bool": {  
                                "should": [  
                                    {'term': {"props.entity_obj.category": 'flat2'}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {'term': {"props.entity_obj.category": 'flat2'}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {'term': {"props.entity_obj.category": 'flat2'}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {'term': {"props.entity_obj.category": 'flat2'}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {'term': {"props.entity_obj.category": 'flat2'}},  
                                    {"range": {"props.price": {"gte": 1000}}},  
                                    {'term': {"props.entity_obj.category": 'flat2'}},  # the more such conditions, the longer the percolation
                                    {"range": {"props.price": {"gte": 1000}}},  
                                ]  
                            }},  
                        ]  
                    }  
                }  
            ]  
        }  
    }  
}

As you can see, there are two main conditions inside must: the first is simple, and the second is a bit more complex. Logically, there is no reason to check the second condition if the first one is false. However, my experiments showed that when the first condition is false for a document, adding conditions inside should (the second condition) still increases the percolation time. Therefore, I conclude that the improvements claimed in this article do not work: https://www.elastic.co/blog/elasticsearch-percolator-continues-to-evolve

The article says: "Also the percolator will no longer load the percolator queries as Lucene queries into memory as they are instead read from disk. Pre 5.0 if you had thousands of percolator queries they’d take up megabytes of precious JVM heap space, putting pressure on jvm garbage collecting and if not being careful lead to an infamous jvm out of memory error. Back then loading the percolator queries into memory made sense because all the percolator queries were evaluated all the time so we made executing each one as fast as possible. Now with pre-selecting, only percolator queries that are likely to match. We decided to trade speed for stability, removing the caching to free up memory. The speed loss is more than paid for by skipping most queries in most cases."
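The expectation described above (a false first must clause should make the rest of the query irrelevant) can be illustrated with a toy evaluator. This is only a sketch of the short-circuit logic, not how Lucene or the percolator evaluates queries; the helper names `lookup` and `matches` are invented for this illustration, and only term, range (gte), and bool (must/should) are handled.

```python
from typing import Any


def lookup(document: dict, dotted_path: str) -> list:
    """Return the values stored at a dotted path, always as a list."""
    value: Any = document
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return []
        value = value[key]
    return value if isinstance(value, list) else [value]


def matches(query: dict, document: dict, counter: list) -> bool:
    """Evaluate a toy term/range/bool query; counter[0] counts leaf checks."""
    (kind, body), = query.items()
    if kind == "term":
        counter[0] += 1
        (field, expected), = body.items()
        return expected in lookup(document, field)
    if kind == "range":
        counter[0] += 1
        (field, bounds), = body.items()
        return any(v >= bounds["gte"] for v in lookup(document, field))
    if kind == "bool":
        must = body.get("must", [])
        should = body.get("should", [])
        # all() stops at the first false clause (short-circuit).
        if not all(matches(q, document, counter) for q in must):
            return False
        # Without must clauses, at least one should clause has to match.
        if should and not must:
            return any(matches(q, document, counter) for q in should)
        return True
    raise ValueError(f"unsupported clause: {kind}")


should_clauses = [
    {"term": {"props.entity_obj.category": "flat2"}},
    {"range": {"props.price": {"gte": 1000}}},
] * 6
stored_query = {"bool": {"must": [
    {"term": {"props.entity_obj.category": "flat1"}},
    {"bool": {"must": [{"bool": {"should": should_clauses}}]}},
]}}
document = {"props": {"entity_obj": {"category": ["flat2"]}, "price": 10001}}

leaf_checks = [0]
result = matches(stored_query, document, leaf_checks)
print(result, leaf_checks[0])
```

In this toy model the number of leaf checks for the non-matching document does not depend on how many clauses are placed inside should, which is the behaviour the experiment expected to see in the percolation time.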

I ran the percolation with the following request:

{  
  "constant_score": {  
    "filter": {  
      "percolate": {  
        "field": "query",  
        "document": {  
          "props": {  
            "entity_obj": {  
              "category": ["flat2"],  
              "id": 1,  
              "priceTotal": 10001,  
              "totalArea": 100  
            },  
            "price": 10001,  
            "room": 1  
          }  
        }  
      }  
    }  
  }
}

As a result, percolating one document took ~0.157 seconds.
I ran the same experiment on Elasticsearch 1.7.6 with identical data and got ~0.008 seconds, which is ~20x faster.
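How these timings were taken is not shown here. One way to get comparable numbers is to repeat the same request several times and take the median, roughly as sketched below. `median_seconds` and `slowdown` are hypothetical helpers, and `run_once` stands for whatever function sends the percolate request in your setup.

```python
import statistics
import time
from typing import Callable


def median_seconds(run_once: Callable[[], object], repeats: int = 20) -> float:
    """Call run_once() `repeats` times and return the median wall-clock time."""
    samples = []
    for _ in range(repeats):
        started = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - started)
    return statistics.median(samples)


def slowdown(new_seconds: float, old_seconds: float) -> float:
    """How many times slower the new measurement is than the old one."""
    return new_seconds / old_seconds


# Ratio of the two timings reported above (about 19.6, i.e. roughly 20x).
print(round(slowdown(0.157, 0.008), 1))
```

Using the median rather than a single run reduces the influence of warm-up and one-off pauses on the comparison.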

We also tried percolating with real production data. The only improvement we saw came from adding extra filters alongside the percolate query, based on metadata that we extracted from the original query. For example, we took the query shown above and added metadata (the meta_data.category field):

{  
  "query": {  
    "bool": {  
      "must": [  
        {"term": {"props.entity_obj.category": "flat1"}},  
        {  
          "bool": {  
            "must": [  
              {  
                "bool": {  
                  "should": [  
                    {"term": {"props.entity_obj.category": "flat2"}},  
                    {"range": {"props.price": {"gte": 1000}}}  
                  ]  
                }  
              }  
            ]  
          }  
        }  
      ]  
    }  
  },  
  "meta_data": {  
    "category": "flat1"  
  }  
}
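The metadata extraction step is described only in words. A minimal sketch of one possible approach is shown below, assuming that only term clauses directly inside the top-level bool.must are considered; the function names `extract_required_terms` and `with_meta_data` are made up for this example.

```python
def extract_required_terms(query: dict, field: str) -> list:
    """Collect values of term clauses on `field` in the top-level bool.must."""
    must = query.get("bool", {}).get("must", [])
    if isinstance(must, dict):  # a single clause may be given without a list
        must = [must]
    values = []
    for clause in must:
        term = clause.get("term", {})
        if field in term:
            value = term[field]
            # term accepts {"field": "v"} as well as {"field": {"value": "v"}}
            if isinstance(value, dict):
                value = value.get("value")
            values.append(value)
    return values


def with_meta_data(percolator_doc: dict) -> dict:
    """Return a copy of the stored document with meta_data.category added."""
    categories = extract_required_terms(
        percolator_doc["query"], "props.entity_obj.category"
    )
    enriched = dict(percolator_doc)
    if categories:
        # Any required value is a necessary condition, so the first one is used.
        enriched["meta_data"] = {"category": categories[0]}
    return enriched


stored = {"query": {"bool": {"must": [
    {"term": {"props.entity_obj.category": "flat1"}},
    {"bool": {"must": [{"bool": {"should": [
        {"term": {"props.entity_obj.category": "flat2"}},
        {"range": {"props.price": {"gte": 1000}}},
    ]}}]}},
]}}}
print(with_meta_data(stored)["meta_data"])
```

Term clauses nested inside should are deliberately ignored, because they are not required for a match and would make the extra filter exclude queries that could still match.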

Then I sent the following request:

{    
  "constant_score": {    
    "filter": {    
      "bool": {    
        "must": [  
          {  # additional filter
            "bool": {  
              "should": [  
                {"term": {"meta_data.category": "flat1"}},  
                {"bool": {"must_not":  {"exists": {"field": "meta_data.category"}}}}  # condition for cases, when query has no filter by category
              ]  
            }  
          },  
          {"percolate": {"field": "query", "document": document}}  # document: the document being percolated
        ]    
      }    
    }    
  }    
}
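The request above can be built from the document itself. The sketch below is one way to do that, assuming the document always carries props.entity_obj.category; `build_filtered_percolate` is a hypothetical helper, and it uses a terms clause instead of term so that documents with several categories are covered.

```python
def build_filtered_percolate(document: dict, field: str = "query") -> dict:
    """Build a percolate request pre-filtered by the document's categories."""
    raw = document["props"]["entity_obj"]["category"]
    categories = raw if isinstance(raw, list) else [raw]
    category_filter = {"bool": {"should": [
        # stored queries that require one of the document's categories
        {"terms": {"meta_data.category": categories}},
        # stored queries that have no category requirement at all
        {"bool": {"must_not": {"exists": {"field": "meta_data.category"}}}},
    ]}}
    return {"constant_score": {"filter": {"bool": {"must": [
        category_filter,
        {"percolate": {"field": field, "document": document}},
    ]}}}}


document = {"props": {
    "entity_obj": {"category": ["flat2"], "id": 1,
                   "priceTotal": 10001, "totalArea": 100},
    "price": 10001,
    "room": 1,
}}
request_body = build_filtered_percolate(document)
```

For the example document, stored queries tagged with meta_data.category "flat1" are excluded by the first clause before the percolate clause is considered.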

But this approach has a disadvantage: it becomes more difficult to percolate a large batch. If I need to percolate many documents, I have to split them by the category field, which results in smaller batches. This negates the benefit of percolating many documents in one query. I also tried named percolation (the name field in the percolate query) and built a query with several percolate queries inside (one per category), but this approach had no advantage over separate requests (the percolation time was the same).
In general, extracting metadata and adding filters on it seems like unnecessary work, and it forces us to maintain those extra filters. It seems that the search engine should handle such optimizations itself. I suspect this is what the "pre-selecting" feature is supposed to do.
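The batching problem can be made concrete with a small sketch: documents have to be grouped by category, and each group needs its own request. The helper names below are invented for illustration, and the sketch assumes the `documents` array form of the percolate query for multi-document percolation.

```python
from collections import defaultdict


def group_by_category(documents: list) -> dict:
    """Group documents by their (sorted) tuple of category values."""
    groups = defaultdict(list)
    for doc in documents:
        raw = doc["props"]["entity_obj"]["category"]
        key = tuple(sorted(raw)) if isinstance(raw, list) else (raw,)
        groups[key].append(doc)
    return dict(groups)


def batch_requests(documents: list, field: str = "query") -> list:
    """Build one filtered percolate request per category group."""
    requests = []
    for categories, docs in group_by_category(documents).items():
        category_filter = {"bool": {"should": [
            {"terms": {"meta_data.category": list(categories)}},
            {"bool": {"must_not": {"exists": {"field": "meta_data.category"}}}},
        ]}}
        requests.append({"constant_score": {"filter": {"bool": {"must": [
            category_filter,
            {"percolate": {"field": field, "documents": docs}},
        ]}}}})
    return requests


docs = [
    {"props": {"entity_obj": {"category": ["flat1"]}, "price": 500}},
    {"props": {"entity_obj": {"category": "flat2"}, "price": 10001}},
    {"props": {"entity_obj": {"category": ["flat1"]}, "price": 2000}},
]
requests = batch_requests(docs)
print(len(requests))
```

Three documents with two distinct categories already turn into two requests, so a batch with many distinct categories degrades toward one request per document.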

Python scripts for experiments (python 3.12): scripts

Conclusion

Currently, percolation of queries, even ones with simple filters, is significantly slower than in older versions of Elasticsearch. It seems likely that the latest version lacks the pre-selecting optimization, or that it is not functioning correctly. Alternatively, I might have missed something, and it can be enabled. I would appreciate any help you can provide to resolve this problem.


peternied · 2024-10-11T15:23:01Z

@garipovazamat Moving this issue to the OpenSearch core repo where it would be addressed, thanks for creating this issue.

dblock · 2024-10-11T15:57:34Z

This is very well detailed, thank you @garipovazamat. Since you have a repro, have you tried bisecting this to a change via ./gradlew run?

garipovazamat · 2024-10-14T06:45:49Z

have you tried bisecting this to a change

@dblock Most likely not. What do you mean? What do you suggest bisecting?
I ran everything in a Docker container.

garipovazamat added the bug (Something isn't working) and untriaged labels Oct 11, 2024

peternied transferred this issue from opensearch-project/.github Oct 11, 2024

peternied added the Performance (This is for any performance related enhancements or bugs), Search (Search query, autocomplete ...etc), and Search:Performance labels and removed the untriaged label Oct 11, 2024

dbwiddis mentioned this issue Oct 16, 2024

Fixed inefficient Stream API call chains ending with count() #15386

Merged
