case unsensitive search #16
Comments
Yes, it makes total sense. Unfortunately there is no easy way to implement it. ZIM files, like Wikipedia itself, contain a case-sensitive title index. The only possible implementation I see right now is to emulate case insensitivity by doing multiple searches. For instance, when you type "case sensitivity", Zimpedia should make at least 4 queries: "case sensitivity", "Case sensitivity", "Case Sensitivity", "case Sensitivity". Then the 4 result sets need to be merged, duplicates eliminated, etc. Here is the relevant quote from the Wikimedia article that summarizes the problem:
|
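To make the multiple-query idea above concrete, here is a minimal Python sketch (not Zimpedia's actual code). It assumes a hypothetical `search_titles(prefix)` callable that stands in for whatever case-sensitive title lookup the ZIM reader exposes; the names `case_variants` and `search_case_insensitive` are also invented for illustration.

```python
from itertools import product


def case_variants(query):
    """Yield the query with the first letter of each word lower- or uppercased."""
    options = []
    for word in query.split():
        lower = word[:1].lower() + word[1:]
        upper = word[:1].upper() + word[1:]
        # dict.fromkeys drops the duplicate when both forms are identical
        # (e.g. a word starting with a digit) while keeping order.
        options.append(list(dict.fromkeys([lower, upper])))
    for combo in product(*options):
        yield " ".join(combo)


def search_case_insensitive(query, search_titles):
    """Run one case-sensitive search per variant, then merge and de-duplicate.

    `search_titles` is a hypothetical callable: prefix -> list of titles.
    """
    seen = set()
    merged = []
    for variant in case_variants(query):
        for title in search_titles(variant):
            if title not in seen:
                seen.add(title)
                merged.append(title)
    return merged


# Toy stand-in for a case-sensitive title index.
TITLES = ["Case sensitivity", "Case Sensitivity (computing)", "case study"]


def fake_search(prefix):
    return [t for t in TITLES if t.startswith(prefix)]


print(list(case_variants("case sensitivity")))
print(search_case_insensitive("case sensitivity", fake_search))
```

Note that the number of queries doubles with every word in the input (2 words give 4 queries, 3 words give 8), which is part of why this approach gets expensive quickly.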
@kelson42 That is great news! Zimpedia uses zimlib, so an implementation in zimlib interests me the most. Is there an issue number I could follow to stay informed about the implementation status? |
The question of introducing a fully case-insensitive suggestion system is still open on my side. For now we are basically trying to generalise the full-text (case-insensitive) search. |
I never really noticed this in Kiwix on Android because there the keyboard starts in lowercase (in Kiwix search, not in a general textfield). So maybe there's a simple way to set something like On a slightly related note, but perhaps this should be its own FR, it'd be nice if |
Unfortunately it is not so simple right now. A search result retrieved from libzim is case sensitive, so to achieve what you've suggested, several searches with different variants (élève, elève, éleve, eleve, elevé, etc.) would have to be made. All results would then have to be de-duplicated and merged. It is possible but complicated... As @kelson42 suggested, maybe you should try the full-text search mode (this feature was added in the recent Zimpedia update). It is case insensitive, but unfortunately its results can sometimes be unpredictable. |
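As a rough illustration of why the diacritic case is complicated, the Python sketch below generates the variants of a query obtained by optionally stripping the accent from each accented character. It is only an assumption about how such variants could be produced, not what Zimpedia or libzim does; `strip_marks` and `diacritic_variants` are hypothetical names. It only removes accents, so a form like "elevé" (an accent added where the input has none) is not covered.

```python
import unicodedata
from itertools import product


def strip_marks(ch):
    """Return the character without its combining marks (é -> e)."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Fall back to the original character if nothing is left.
    return unicodedata.normalize("NFC", base) or ch


def diacritic_variants(query):
    """List every combination of keeping or stripping each accent."""
    query = unicodedata.normalize("NFC", query)
    options = [list(dict.fromkeys([ch, strip_marks(ch)])) for ch in query]
    return ["".join(combo) for combo in product(*options)]


print(diacritic_variants("élève"))
```

Each accented character doubles the number of variants, and combining this with case variants multiplies the two counts, so the number of case-sensitive lookups grows quickly for longer queries.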
Thanks for the clarification. I will look into it. |
In 3ab979e I've added the following search procedure:
It works pretty well... but |
Very nice! 👍
Well yeah, diacritics are hard. ;-) I'll have to check what GoldenDict does because you've made me curious. Iirc it performs a variety of clever tricks with diacritics and morphology alike. |
Hi,
I think it would make sense to build in an option for case-insensitive search.