Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RegExps support for Mjolnir's WordList protection #546

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

PC-Admin
Copy link
Contributor

@PC-Admin PC-Admin commented Oct 13, 2024

Well it did not take long for the spammers to side-step my word-list...

So here's a patch that actually allows you to enable regular expression support with Mjolnir's WordList protection! Making it substantially more powerful. 😎

I've tested this patch out and it appears to work flawlessly. Happy to revise it and re-test if needed.


Fixes #547

@PC-Admin
Copy link
Contributor Author

PC-Admin commented Oct 13, 2024

Testing (enabledRegExps=true)

With this feature enabled I tested all of the following with this wordlist, it all worked fine:

      words:
        - (https?:\/\/)?t\.me\/(\+)?[a-zA-Z0-9_-]+
        - cashapp
        - "apple pay"
        - "paypal"
  1. a regular word, lower cases

cashapp

  1. a regular word, capital letters

CASHAPP

  1. a t.me link with https

https://t.me/+32cFzLuOiacxZmMe

  1. a t.me link without https

t.me/+32cFzLuOiacxZmMe

  1. with surrounding text:

Visit t.me/channel for more info! (t.me/channel)

Testing (enabledRegExps=false)

Now to test it with this new feature disabled and the same wordlist, to ensure that all the previous functionality still works.

  1. a regular word, lower cases

cashapp

  1. a regular word, capital letters

CASHAPP

  1. a t.me link with https (it's not triggered here):

https://t.me/+32cFzLuOiacxZmMe

@PC-Admin
Copy link
Contributor Author

PC-Admin commented Oct 13, 2024

It's been stated that:

i think at one point mjolnir fixed a bug where an empty word in the list would match every single message, and that was when the original regex support stopped working
so i’m a little curious, what happens if you use an empty string with your implementation :p

Just tested this out and nope, it seems fine. With this word list:

matrix_bot_mjolnir_configuration_extension_yaml: |
  # Configuration specific to certain toggle-able protections
  protections:
    wordlist:
      enableRegExps: true
      words:
        - (https?:\/\/)?t\.me\/(\+)?[a-zA-Z0-9_-]+
        - ""
        - 
      minutesBeforeTrusting: 10080

The entry that's - "" doesn't seem to impact functionality at all.

While the truly blank entry at the end (when added after) seems to stop the WordList feature from working entirely. (Which is a separate issue this PR isn't meant to address really)

@PC-Admin
Copy link
Contributor Author

This is a fix for #547

Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The code looks good, but there's a couple things we need before we can merge:

  • Sign off
  • Unit/integration tests to prove the behaviour

If you run into issues with writing the tests, let us know in the Mjolnir room on Matrix and we'll try to help out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature - Mjolnir's WordList protection would be a lot more powerful if you could use regular expressions.
2 participants