Feat: add shared codes #18

amindadgar · 2024-11-20T12:37:07Z

Summary by CodeRabbit

Release Notes

New Features
- Introduced new environment variables for the app service: REDIS_HOST, REDIS_PORT, and REDIS_PASSWORD.
- Added a new class ModulesBase for managing platform-related metadata and tokens in the database.
Version Updates
- Updated the version number of the package to 1.4.0.
Testing Enhancements
- Added new integration and unit tests for MongoSingleton, QdrantSingleton, and RedisSingleton classes to ensure robust functionality and error handling.

coderabbitai · 2024-11-20T12:37:15Z

Walkthrough

The changes in this pull request involve updates to several files, including the addition of new environment variables for the Redis configuration in docker-compose.test.yml, an increment in the version number in setup.py, and the introduction of a new class ModulesBase in modules_base.py for MongoDB interactions. Additionally, several new test classes were created to validate the functionality of the ModulesBase, MongoSingleton, QdrantSingleton, and RedisSingleton classes. Type hinting improvements were made in mongo.py, enhancing code clarity.

Changes

File	Change Summary
`docker-compose.test.yml`	Added environment variables: `REDIS_HOST`, `REDIS_PORT`, `REDIS_PASSWORD` to the `app` service and introduced a new `redis` service with health checks.
`setup.py`	Updated version from `1.3.0` to `1.4.0`.
`tc_hivemind_backend/db/modules_base.py`	Added class `ModulesBase` with methods: `query`, `get_platform_community_ids`, `get_token`, `get_platform_metadata`.
`tc_hivemind_backend/db/mongo.py`	Updated type hint for `__instance` in `MongoSingleton` from `"MongoSingleton"
`tc_hivemind_backend/tests/integration/test_modules_base_query_token.py`	Added test class `TestModulesBaseQueryToken` for `get_token` method with several test cases.
`tc_hivemind_backend/tests/integration/test_mongo_singleton.py`	Added test class `TestMongoSingletonIntegration` for integration testing of `MongoSingleton`.
`tc_hivemind_backend/tests/unit/test_mongo_singleton.py`	Added test class `TestMongoSingleton` for unit testing of `MongoSingleton` and `get_mongo_uri`.
`tc_hivemind_backend/tests/unit/test_qdrant_singleton.py`	Added test class `TestQdrantSingleton` for unit testing of `QdrantSingleton`.
`tc_hivemind_backend/tests/unit/test_redis_singleton.py`	Added test class `TestRedisSingleton` for unit testing of `RedisSingleton`.

Possibly related PRs

feat: Added ingestion pipeline and mongo and redis singleton! #17: The main PR adds Redis-related environment variables, while PR feat: Added ingestion pipeline and mongo and redis singleton! #17 introduces a Redis singleton class, indicating a direct relationship in managing Redis configurations.

🐰 "In the code garden, changes bloom,
Redis whispers, 'Make some room!'
New tests hop, and classes play,
Version's up, hip-hip-hooray!
Mongo's secrets, we now can find,
With every change, we're more aligned!" 🐇✨

Warning

Rate limit exceeded

@amindadgar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 14 minutes and 16 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between fc5d5b4 and 71fb776.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (21)

tc_hivemind_backend/tests/integration/test_mongo_singleton.py (1)

7-26: Consider adding more test cases for comprehensive coverage.

The test suite would benefit from additional test cases:

Test connection with invalid credentials

Test connection with invalid host/port

Test connection string configuration validation

Test database and collection access

Would you like me to provide example implementations for these test cases?
tc_hivemind_backend/db/mongo.py (2)
Line range hint 12-29: Consider adding thread safety to the singleton implementation

The current implementation could face race conditions in a multi-threaded environment. Consider using double-checked locking pattern with threading.Lock().

Here's a suggested implementation:
+ import threading

class MongoSingleton:
    __instance: Optional["MongoSingleton"] = None
+   __lock = threading.Lock()

    def __init__(self):
        if MongoSingleton.__instance is not None:
            raise Exception("This class is a singleton!")
        else:
            connection_uri = get_mongo_uri()
            self.client = MongoClient(connection_uri)
            MongoSingleton.__instance = self

    @staticmethod
    def get_instance():
-       if MongoSingleton.__instance is None:
-           MongoSingleton()
+       if not MongoSingleton.__instance:
+           with MongoSingleton.__lock:
+               if not MongoSingleton.__instance:
+                   MongoSingleton()
Also, consider catching specific MongoDB exceptions instead of generic Exception:
-           except Exception as exp:
+           except (ConnectionError, ServerSelectionTimeoutError) as exp:
Line range hint 39-47: Enhance MongoDB connection security and reliability

The connection string could benefit from additional security and reliability options:

SSL/TLS configuration for secure communication

Connection pool settings

Timeout configurations

Retry settings for better reliability

Consider updating the connection string construction:
-    connection = f"mongodb://{user}:{password}@{host}:{port}"
+    connection = (
+        f"mongodb://{user}:{password}@{host}:{port}"
+        "?ssl=true"
+        "&retryWrites=true"
+        "&w=majority"
+        "&maxPoolSize=50"
+        "&connectTimeoutMS=2000"
+    )
tc_hivemind_backend/tests/unit/test_redis_singleton.py (5)
7-9: Consider adding a tearDown method and improving singleton reset mechanism.

While the setUp method correctly resets the singleton, consider:

Adding a tearDown method to ensure clean state after tests

Adding a public reset method in RedisSingleton instead of accessing private attribute
 def setUp(self):
     RedisSingleton._RedisSingleton__instance = None

+def tearDown(self):
+    RedisSingleton._RedisSingleton__instance = None
10-19: Consider splitting singleton pattern test into two methods.

For better test clarity and single responsibility, consider splitting this into:

test_singleton_returns_same_instance

test_singleton_prevents_direct_instantiation
-def test_singleton_pattern(self):
+def test_singleton_returns_same_instance(self):
     instance1 = RedisSingleton.get_instance()
     instance2 = RedisSingleton.get_instance()
     self.assertIs(instance1, instance2)
+
+def test_singleton_prevents_direct_instantiation(self):
     with self.assertRaises(Exception) as context:
         RedisSingleton()
     self.assertEqual(str(context.exception), "This class is a singleton!")
39-64: Add assertions for Redis client state.

While the test verifies logging, consider adding assertions for the client's connected state and availability:
 RedisSingleton.get_instance()

 mock_info_log.assert_called_once_with(
     "Redis Connected Successfully! Ping returned: True"
 )
 mock_error_log.assert_not_called()
+self.assertTrue(mock_client.ping.called)  # Verify ping was attempted
+self.assertIsNotNone(RedisSingleton.get_instance().get_client())  # Verify client is available
65-90: Use specific exception type for connection failure.

Instead of using a generic Exception, consider using a more specific Redis connection exception for better error handling simulation:
-mock_client.ping.side_effect = Exception("Connection failed")
+from redis.exceptions import ConnectionError
+mock_client.ping.side_effect = ConnectionError("Connection failed")
91-102: Consider using a custom exception for credentials loading failure.

Instead of using a generic Exception, consider defining and using a custom exception type for credential loading failures:
class CredentialsLoadError(Exception):
    pass
Then update the test:
-mock_credentials_instance.load_redis.side_effect = Exception(
+mock_credentials_instance.load_redis.side_effect = CredentialsLoadError(
     "Failed to load credentials"
 )
tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (3)
10-19: Consider splitting singleton pattern tests for better clarity.

While the test coverage is good, consider splitting test_singleton_pattern into two separate test methods for better clarity:

test_singleton_instance_reuse

test_singleton_direct_instantiation_prevention
-    def test_singleton_pattern(self):
-        instance1 = QdrantSingleton.get_instance()
-        instance2 = QdrantSingleton.get_instance()
-
-        self.assertIs(instance1, instance2)
-
-        with self.assertRaises(Exception) as context:
-            QdrantSingleton()
-        self.assertEqual(str(context.exception), "This class is a singleton!")
+    def test_singleton_instance_reuse(self):
+        instance1 = QdrantSingleton.get_instance()
+        instance2 = QdrantSingleton.get_instance()
+        self.assertIs(instance1, instance2)
+
+    def test_singleton_direct_instantiation_prevention(self):
+        with self.assertRaises(Exception) as context:
+            QdrantSingleton()
+        self.assertEqual(str(context.exception), "This class is a singleton!")
20-55: Consider using class-level constants for test data.

The test data could be moved to class-level constants to improve maintainability and reusability across test methods.
 class TestQdrantSingleton(unittest.TestCase):
+    TEST_HOST = "test_host"
+    TEST_PORT = 6333
+    TEST_API_KEY = "test_api_key"
+    
+    @classmethod
+    def _get_mock_credentials(cls, with_api_key=False):
+        return {
+            "host": cls.TEST_HOST,
+            "port": cls.TEST_PORT,
+            "api_key": cls.TEST_API_KEY if with_api_key else "",
+        }
Then update the test methods to use these constants:
-        mock_load_credentials.return_value = {
-            "host": "test_host",
-            "port": 6333,
-            "api_key": "",
-        }
+        mock_load_credentials.return_value = self._get_mock_credentials()
102-108: Consider adding more error scenarios.

While the current error handling test is good, consider adding tests for other potential failure scenarios:

Network timeout errors

Authentication failures

Invalid host/port configurations
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (4)
12-12: Consider using a public reset method for testing

Directly accessing the private __instance variable breaks encapsulation. Consider adding a reset_instance class method to MongoSingleton specifically for testing purposes.
# In MongoSingleton class
@classmethod
def reset_instance(cls):
    """Reset the singleton instance (for testing only)"""
    cls.__instance = None
13-20: Consider extracting logging setup to a helper method

The logging setup code could be extracted into a helper method to improve readability and reusability.
def setup_logging_capture(self):
    self.log_records = []
    self.logger = logging.getLogger()
    self.old_level = self.logger.level
    self.logger.setLevel(logging.INFO)
    self.handler = logging.Handler()
    self.handler.emit = lambda record: self.log_records.append(record)
    self.logger.addHandler(self.handler)
99-108: Consider additional client verification

While the basic client instance check is good, consider adding assertions for:

Client configuration properties

Connection parameters

Default database selection

1-108: Consider adding these test scenarios

The test suite is well-structured, but consider adding tests for:

Connection timeout handling

Reconnection attempts

Thread safety of the singleton pattern

Connection pool configuration

Would you like me to provide example implementations for any of these test scenarios?

🧰 Tools

🪛 Ruff

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)
tc_hivemind_backend/tests/integration/test_modules_base_query_token.py (5)
10-14: Add tearDown method to clean up after tests

While the setUp method correctly prepares the test environment, it's a good practice to also implement a tearDown method to ensure proper cleanup after each test, even if the collections are dropped in setUp.
def tearDown(self) -> None:
    self.client["Core"].drop_collection("tokens")
    self.client["Core"].drop_collection("platforms")
15-53: Add docstring to describe test purpose and data setup

The test method would benefit from a clear docstring explaining the test scenario and the relationship between the platform and token documents.
def test_one_token(self):
    """Test successful token retrieval when valid platform and token exist.
    
    Sets up:
    1. A platform document with a specific user
    2. A matching token document for that user
    Verifies that get_token returns the correct token value.
    """
Also, consider moving the test data setup to class-level constants for better maintainability:
class TestModulesBaseQueryToken(TestCase):
    SAMPLE_USER_ID = ObjectId("5d7baf326c8a2e2400000000")
    COMMUNITY_ID = ObjectId("6579c364f1120850414e0dc5")
    PLATFORM_ID = ObjectId("6579c364f1120850414e0dc6")
54-61: Add docstring to clarify error scenario

Add a docstring to explain the expected behavior when attempting to retrieve a token from an empty collection.
def test_empty_tokens_collection(self):
    """Test that attempting to get a token from an empty collection raises ValueError.
    
    Verifies the error handling when no tokens exist in the database.
    """
62-82: Add docstring and reduce token setup duplication

The token document setup is duplicated across multiple tests. Consider extracting it to a helper method.
def _create_token_doc(self, user_id, token_value, token_type):
    """Helper method to create a token document with standard fields."""
    return {
        "token": token_value,
        "user": user_id,
        "type": token_type,
        "expires": datetime.now() + timedelta(days=1),
        "blacklisted": False,
        "createdAt": datetime.now() - timedelta(days=1),
        "updatedAt": datetime.now() - timedelta(days=1),
    }
Also, add a docstring to explain the test scenario:
def test_no_platform(self):
    """Test that attempting to get a token without a corresponding platform raises ValueError.
    
    Sets up a token document but no platform document, verifying that the method
    properly validates platform existence.
    """
83-121: Add docstring and reduce platform setup duplication

The platform document setup is duplicated. Consider extracting it to a helper method along with the previously suggested token helper.
def _create_platform_doc(self, platform_id, user_id, community_id):
    """Helper method to create a platform document with standard fields."""
    return {
        "_id": platform_id,
        "name": "platform_name",
        "metadata": {
            "id": "113445975232201081511",
            "userId": str(user_id),
        },
        "community": community_id,
        "disconnectedAt": None,
        "connectedAt": datetime.now(),
        "createdAt": datetime.now(),
        "updatedAt": datetime.now(),
    }
Add a docstring to explain the test scenario:
def test_no_token(self):
    """Test that attempting to get a token for a platform with mismatched user raises ValueError.
    
    Sets up:
    1. A platform document for user A
    2. A token document for user B
    Verifies that the method properly validates user matching between platform and token.
    """
tc_hivemind_backend/db/modules_base.py (1)
9-9: Type Hint Compatibility with Python Versions Below 3.9

The use of list[dict] as a return type hint requires Python 3.9 or newer. If the project supports earlier Python versions, consider importing List from the typing module.

Modify the import and type hint as follows:
from typing import List, Dict

def query(self, platform: str, **kwargs) -> List[Dict]:
    ...

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2a3a9ff and b4e4f37.

📒 Files selected for processing (9)

docker-compose.test.yml (1 hunks)
setup.py (1 hunks)
tc_hivemind_backend/db/modules_base.py (1 hunks)
tc_hivemind_backend/db/mongo.py (2 hunks)
tc_hivemind_backend/tests/integration/test_modules_base_query_token.py (1 hunks)
tc_hivemind_backend/tests/integration/test_mongo_singleton.py (1 hunks)
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1 hunks)
tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (1 hunks)
tc_hivemind_backend/tests/unit/test_redis_singleton.py (1 hunks)

✅ Files skipped from review due to trivial changes (1)

setup.py

🧰 Additional context used

🪛 Ruff

tc_hivemind_backend/tests/unit/test_mongo_singleton.py

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

🔇 Additional comments (14)

tc_hivemind_backend/tests/integration/test_mongo_singleton.py (1)

1-9: LGTM! Well-structured test class setup.

The imports are appropriate, and the class is well-documented with a clear purpose.

tc_hivemind_backend/db/mongo.py (1)

10-10: LGTM! Improved type hint readability

The change to Optional["MongoSingleton"] is more idiomatic Python and clearer than the union type syntax.

docker-compose.test.yml (1)

24-24: ⚠️ Potential issue

Security concern: Empty Redis password in test environment

While this is a test environment, it's recommended to use a non-empty password for Redis to maintain security practices consistent with production environments and prevent potential security issues during testing.

tc_hivemind_backend/tests/unit/test_redis_singleton.py (2)

20-38: Verify port value type handling in Redis initialization.

The test uses a string port value "6379" in mock data but asserts it as an integer 6379 in Redis initialization. While this works, it's better to explicitly test the type conversion:

 mock_credentials_instance.load_redis.return_value = {
     "host": "test_host",
-    "port": "6379",
+    "port": 6379,  # Use integer to match expected type
     "password": "test_password",
 }

104-105: LGTM!

The main block is correctly implemented.

tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (3)

1-9: LGTM! Well-structured test setup.

The imports are clean and specific, and the setUp method ensures proper test isolation by resetting the singleton instance.

56-101: LGTM! Comprehensive connection testing.

The connection tests are well-structured with:

Proper verification of success/failure scenarios
Comprehensive logging assertions
Effective error simulation

1-108: Overall excellent test implementation!

The test suite is well-structured with comprehensive coverage of the QdrantSingleton functionality. It effectively tests:

Singleton pattern enforcement
Different initialization scenarios
Connection handling
Error cases
Logging behavior

The suggestions provided are for enhancement rather than critical issues.

tc_hivemind_backend/tests/unit/test_mongo_singleton.py (3)

1-7: LGTM! Clean and well-organized imports.

The imports are appropriate and include all necessary components for testing.

28-42: Consider adding edge cases for URI construction

While the happy path is well tested, consider adding test cases for:

Missing credentials
Special characters in username/password that need URL encoding
Different authentication mechanisms

43-64: LGTM! Thorough singleton pattern testing.

The tests effectively verify both the singleton instance creation and prevention of direct instantiation.

tc_hivemind_backend/tests/integration/test_modules_base_query_token.py (1)

1-121: Consider adding tests for token expiration and blacklist status

The current test suite covers basic token retrieval and error cases, but doesn't verify the behavior when:

The token is expired
The token is blacklisted

Consider adding these test cases to ensure complete coverage of the token validation logic.

Would you like me to help generate these additional test cases?

tc_hivemind_backend/db/modules_base.py (2)

131-132: Handle Missing Metadata Fields to Prevent KeyError

In the get_platform_metadata method, accessing platform["metadata"][metadata_name] without checking if the key exists can raise a KeyError if metadata_name is not present in the metadata.

Consider adding error handling to manage the absence of the metadata field:

     metadata_field = platform["metadata"].get(metadata_name)
+    if metadata_field is None:
+        raise ValueError(f"Metadata '{metadata_name}' not found for platform id: {platform_id}")
     return metadata_field

30-36: Potential Issue with Querying Embedded Documents

In the query method, the filter {"options.platforms.name": platform} assumes that options.platforms is a dictionary. If options.platforms is a list of platform dictionaries, this query may not work as intended.

If options.platforms is a list, use the $elemMatch operator to correctly filter the documents:

     cursor = client["Core"]["modules"].find(
         {
-            "options.platforms.name": platform,
+            "options.platforms": {"$elemMatch": {"name": platform}},
             "name": "hivemind",
         },
         projection,
     )

coderabbitai · 2024-11-20T12:41:19Z

tc_hivemind_backend/tests/integration/test_mongo_singleton.py

+    def setUp(self):
+        """Reset the singleton instance before each test"""
+        MongoSingleton._MongoSingleton__instance = None
+


🛠️ Refactor suggestion

Consider adding a reset method to MongoSingleton.

Direct manipulation of the private __instance variable breaks encapsulation. Consider adding a reset_instance() class method to MongoSingleton instead.

# In MongoSingleton class: + @classmethod + def reset_instance(cls) -> None: + """Reset the singleton instance for testing purposes.""" + cls.__instance = None # In test class: def setUp(self): """Reset the singleton instance before each test""" - MongoSingleton._MongoSingleton__instance = None + MongoSingleton.reset_instance()

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-20T12:41:19Z

tc_hivemind_backend/tests/integration/test_mongo_singleton.py

+    def test_real_connection(self):
+        """Integration test for real MongoDB connection"""
+        instance = MongoSingleton.get_instance()
+        client = instance.get_client()
+        self.assertIsInstance(client, MongoClient)
+
+        # Test actual connection
+        try:
+            info = client.server_info()
+            self.assertIsInstance(info, dict)
+            self.assertIn("version", info)
+        except Exception as e:
+            self.fail(f"Failed to connect to MongoDB: {e}")


🛠️ Refactor suggestion

Add connection cleanup and improve error handling.

Several improvements can enhance the test reliability:

Add connection cleanup in tearDown

Specify connection timeout

Catch specific MongoDB exceptions

Verify singleton behavior

+from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError + class TestMongoSingletonIntegration(unittest.TestCase): + def tearDown(self): + """Clean up MongoDB connection after each test""" + instance = MongoSingleton.get_instance() + client = instance.get_client() + client.close() + MongoSingleton.reset_instance() + def test_real_connection(self): """Integration test for real MongoDB connection""" instance = MongoSingleton.get_instance() + # Verify singleton behavior + second_instance = MongoSingleton.get_instance() + self.assertIs(instance, second_instance) + client = instance.get_client() self.assertIsInstance(client, MongoClient) # Test actual connection try: - info = client.server_info() + info = client.server_info(timeout_ms=5000) self.assertIsInstance(info, dict) self.assertIn("version", info) - except Exception as e: + except (ConnectionFailure, ServerSelectionTimeoutError) as e: self.fail(f"Failed to connect to MongoDB: {e}")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def test_real_connection(self):

"""Integration test for real MongoDB connection"""

instance = MongoSingleton.get_instance()

client = instance.get_client()

self.assertIsInstance(client, MongoClient)

# Test actual connection

try:

info = client.server_info()

self.assertIsInstance(info, dict)

self.assertIn("version", info)

except Exception as e:

self.fail(f"Failed to connect to MongoDB: {e}")

from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError

class TestMongoSingletonIntegration(unittest.TestCase):

def tearDown(self):

"""Clean up MongoDB connection after each test"""

instance = MongoSingleton.get_instance()

client = instance.get_client()

client.close()

MongoSingleton.reset_instance()

def test_real_connection(self):

"""Integration test for real MongoDB connection"""

instance = MongoSingleton.get_instance()

# Verify singleton behavior

second_instance = MongoSingleton.get_instance()

self.assertIs(instance, second_instance)

client = instance.get_client()

self.assertIsInstance(client, MongoClient)

# Test actual connection

try:

info = client.server_info(timeout_ms=5000)

self.assertIsInstance(info, dict)

self.assertIn("version", info)

except (ConnectionFailure, ServerSelectionTimeoutError) as e:

self.fail(f"Failed to connect to MongoDB: {e}")

docker-compose.test.yml

tc_hivemind_backend/tests/unit/test_mongo_singleton.py

tc_hivemind_backend/db/modules_base.py

coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)

tc_hivemind_backend/db/modules_base.py (2)

6-8: Add class documentation and consider removing empty init

The class would benefit from a docstring explaining its purpose and responsibilities. Since the __init__ method is empty, it can be removed entirely as Python will automatically provide a default constructor.
 class ModulesBase:
-    def __init__(self) -> None:
-        pass
+    """
+    Base class for managing MongoDB operations related to platform modules,
+    providing functionality for querying module metadata, tokens, and community IDs.
+    """
57-57: Use list comprehension for better readability

Replace the map function with a more Pythonic list comprehension.
-        community_ids = list(map(lambda x: str(x["community"]), modules))
+        community_ids = [str(doc["community"]) for doc in modules]

tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1)

10-26: Consider extracting logging setup into a helper method.

While the logging setup is comprehensive, it could be made more maintainable by extracting it into a helper method. This would improve readability and make it easier to modify logging configuration in the future.

+ def _setup_logging(self):
+     self.log_records = []
+     self.logger = logging.getLogger()
+     self.old_level = self.logger.level
+     self.logger.setLevel(logging.INFO)
+     self.handler = logging.Handler()
+     self.handler.emit = lambda record: self.log_records.append(record)
+     self.logger.addHandler(self.handler)

+ def _cleanup_logging(self):
+     self.logger.removeHandler(self.handler)
+     self.logger.setLevel(self.old_level)
+     self.log_records = []

  def setUp(self):
      """Reset the singleton instance before each test"""
      MongoSingleton._MongoSingleton__instance = None
-     self.log_records = []
-     self.logger = logging.getLogger()
-     self.old_level = self.logger.level
-     self.logger.setLevel(logging.INFO)
-     self.handler = logging.Handler()
-     self.handler.emit = lambda record: self.log_records.append(record)
-     self.logger.addHandler(self.handler)
+     self._setup_logging()

  def tearDown(self):
      """Clean up after each test"""
-     self.logger.removeHandler(self.handler)
-     self.logger.setLevel(self.old_level)
-     self.log_records = []
+     self._cleanup_logging()

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b4e4f37 and fc5d5b4.

📒 Files selected for processing (5)

tc_hivemind_backend/db/modules_base.py (1 hunks)
tc_hivemind_backend/db/mongo.py (1 hunks)
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1 hunks)
tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (1 hunks)
tc_hivemind_backend/tests/unit/test_redis_singleton.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

tc_hivemind_backend/db/mongo.py
tc_hivemind_backend/tests/unit/test_qdrant_singleton.py
tc_hivemind_backend/tests/unit/test_redis_singleton.py

🧰 Additional context used

🪛 Ruff

tc_hivemind_backend/tests/unit/test_mongo_singleton.py

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

🔇 Additional comments (2)

tc_hivemind_backend/tests/unit/test_mongo_singleton.py (2)

1-7: LGTM! Well-organized imports.

The imports are properly organized and include all necessary modules for testing.

28-108: LGTM! Comprehensive test coverage with proper test isolation.

The test suite demonstrates:

Thorough coverage of MongoDB connection scenarios
Proper mocking of external dependencies
Clear test method organization
Good validation of singleton pattern

Skipping comment about unused variables on lines 72 and 90 as it was already addressed in previous reviews.

🧰 Tools

🪛 Ruff

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

coderabbitai · 2024-11-20T12:51:06Z

tc_hivemind_backend/db/modules_base.py

+    def get_token(self, platform_id: ObjectId, token_type: str) -> str:
+        """
+        get a specific type of token for a platform
+        This method is called when we needed a token for modules to extract its data
+
+        Parameters
+        ------------
+        platform_id : ObjectId
+            the platform id that we want their token
+        token_type : str
+            the type of token. i.e. `google_refresh`
+
+        Returns
+        --------
+        token : str
+            the token that was required for module's ETL process
+        """
+        client = MongoSingleton.get_instance().client
+
+        user_id = self.get_platform_metadata(platform_id, "userId")
+        user_id = ObjectId(user_id)
+        token_doc = client["Core"]["tokens"].find_one(
+            {
+                "user": user_id,
+                "type": token_type,
+            },
+            {
+                "token": 1,
+            },
+            sort=[("createdAt", -1)],
+        )
+        if token_doc is None:
+            raise ValueError(
+                f"No Token for the given user {user_id} "
+                "in tokens collection of the Core database!"
+            )
+        token = token_doc["token"]
+        return token


⚠️ Potential issue

Enhance security and validation in token handling

Several security and validation improvements are needed:

Validate token_type parameter

Avoid exposing internal IDs in error messages

Consider implementing token encryption

def get_token(self, platform_id: ObjectId, token_type: str) -> str: + VALID_TOKEN_TYPES = {'google_refresh', 'other_valid_types'} # Add valid types + if not isinstance(token_type, str) or token_type not in VALID_TOKEN_TYPES: + raise ValueError("Invalid token type") + client = MongoSingleton.get_instance().client user_id = self.get_platform_metadata(platform_id, "userId") user_id = ObjectId(user_id) - token_doc = client["Core"]["tokens"].find_one( + try: + token_doc = client["Core"]["tokens"].find_one( { "user": user_id, "type": token_type, @@ -91,9 +95,8 @@ ) if token_doc is None: raise ValueError( - f"No Token for the given user {user_id} " - "in tokens collection of the Core database!" + "Token not found" ) token = token_doc["token"] - return token + return self._decrypt_token(token) # Consider implementing token encryption

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-20T12:51:06Z

tc_hivemind_backend/db/modules_base.py

+    def query(self, platform: str, **kwargs) -> list[dict]:
+        """
+        query the modules database for to get platforms' metadata
+
+        Parameters
+        -----------
+        platform : str
+            the platform to choose
+            it can be `github`, `discourse`, `discord` or etc
+        **kwargs : dict
+            projection : dict[str, int]
+                feature projection on query
+
+        Returns
+        ---------
+        modules_docs : list[dict]
+            all the module documents that have the `platform` within them
+        """
+        client = MongoSingleton.get_instance().client
+        projection = kwargs.get("projection", {})
+
+        cursor = client["Core"]["modules"].find(
+            {
+                "options.platforms.name": platform,
+                "name": "hivemind",
+            },
+            projection,
+        )
+        modules_docs = list(cursor)
+        return modules_docs


🛠️ Refactor suggestion

Add input validation and error handling

The query method should validate the platform parameter and include error handling for MongoDB operations:

Validate platform against allowed values

Add error handling for MongoDB operations

Consider adding index hints for query optimization

def query(self, platform: str, **kwargs) -> list[dict]: + VALID_PLATFORMS = {'github', 'discourse', 'discord'} + if platform not in VALID_PLATFORMS: + raise ValueError(f"Invalid platform. Must be one of: {VALID_PLATFORMS}") + client = MongoSingleton.get_instance().client projection = kwargs.get("projection", {}) - cursor = client["Core"]["modules"].find( - { - "options.platforms.name": platform, - "name": "hivemind", - }, - projection, - ) - modules_docs = list(cursor) - return modules_docs + try: + cursor = client["Core"]["modules"].find( + { + "options.platforms.name": platform, + "name": "hivemind", + }, + projection, + ).hint("options.platforms.name_1") # Assuming index exists + return list(cursor) + except Exception as e: + raise RuntimeError(f"Failed to query modules: {str(e)}") from e

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)

tc_hivemind_backend/db/modules_base.py (2)

6-9: Add class documentation and remove empty init

The class would benefit from a docstring explaining its purpose and responsibilities. Also, the empty __init__ method can be removed as it doesn't provide any functionality.

 class ModulesBase:
-    def __init__(self) -> None:
-        pass
+    """
+    Base class for handling MongoDB operations related to modules, platforms, and tokens.
+    Provides methods for querying module data, retrieving community IDs, tokens, and platform metadata.
+    """

41-60: Improve documentation and add error handling

The docstring mentions "discord platform" specifically, but the method is platform-agnostic.
Consider adding error handling for empty results.

     def get_platform_community_ids(self, platform_name: str) -> list[str]:
         """
         get all community ids that a platform has
 
         Parameters
         ------------
         platform_name : str
             the platform having community id and available for hivemind module
 
         Returns
         --------
         community_ids : list[str]
-            id of communities that has discord platform and hivemind module enabled
+            id of communities that have the specified platform and hivemind module enabled
 
         """
         modules = self.query(platform=platform_name, projection={"community": 1})
+        if not modules:
+            raise ValueError(f"No communities found for platform: {platform_name}")
         community_ids = list(map(lambda x: str(x["community"]), modules))
 
         return community_ids

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fc5d5b4 and 71fb776.

📒 Files selected for processing (3)

docker-compose.test.yml (3 hunks)
tc_hivemind_backend/db/modules_base.py (1 hunks)
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tc_hivemind_backend/tests/unit/test_mongo_singleton.py

🔇 Additional comments (2)

docker-compose.test.yml (1)

22-24: Consider securing Redis with a password in non-development environments.

The Redis password is currently empty. While this might be acceptable for local development, ensure that proper authentication is configured in staging and production environments.

tc_hivemind_backend/db/modules_base.py (1)

10-40: LGTM with previous suggestions

The method implementation is functional, but please refer to the previous review comment about adding input validation, error handling, and index hints.

coderabbitai · 2024-11-20T12:57:15Z

docker-compose.test.yml

+  redis:
+    image: redis:latest
+    ports:
+      - "6379:6379"
+    networks:
+      - python_service_network
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      interval: 10s
+      timeout: 5s
+      retries: 5


🛠️ Refactor suggestion

Pin Redis version to a specific release.

Using the latest tag can lead to unexpected behavior or breaking changes when the image is updated. Consider pinning to a specific version for better stability and reproducibility.

- image: redis:latest + image: redis:7.2.3

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

redis:

image: redis:latest

ports:

- "6379:6379"

networks:

- python_service_network

healthcheck:

test: ["CMD", "redis-cli", "ping"]

interval: 10s

timeout: 5s

retries: 5

redis:

image: redis:7.2.3

ports:

- "6379:6379"

networks:

- python_service_network

healthcheck:

test: ["CMD", "redis-cli", "ping"]

interval: 10s

timeout: 5s

retries: 5

coderabbitai · 2024-11-20T12:57:15Z

tc_hivemind_backend/db/modules_base.py

+            raise ValueError(
+                f"No Token for the given user {user_id} "
+                "in tokens collection of the Core database!"
+            )


⚠️ Potential issue

Remove sensitive information from error message

The error message exposes the internal user ID, which could be a security concern. Consider using a more generic error message.

if token_doc is None: raise ValueError( - f"No Token for the given user {user_id} " - "in tokens collection of the Core database!" + "No valid token found for the specified platform and token type" )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

raise ValueError(

f"No Token for the given user {user_id} "

"in tokens collection of the Core database!"

)

raise ValueError(

"No valid token found for the specified platform and token type"

)

coderabbitai · 2024-11-20T12:57:15Z

tc_hivemind_backend/db/modules_base.py

+    def get_platform_metadata(
+        self, platform_id: ObjectId, metadata_name: str
+    ) -> str | dict | list:
+        """
+        get the userid that belongs to a platform
+
+        Parameters
+        -----------
+        platform_id : bson.ObjectId
+            the platform id we need their owner user id
+        metadata_name : str
+            a specific field of metadata that we want
+
+        Returns
+        ---------
+        metadata_value : Any
+            the values that the metadata belongs to
+        """
+        client = MongoSingleton.get_instance().get_client()
+
+        platform = client["Core"]["platforms"].find_one(
+            {
+                "_id": platform_id,
+                "disconnectedAt": None,
+            },
+            {
+                f"metadata.{metadata_name}": 1,
+            },
+        )
+        if platform is None:
+            raise ValueError(f"No platform available given platform id: {platform_id}")
+
+        metadata_field = platform["metadata"][metadata_name]
+        return metadata_field


🛠️ Refactor suggestion

Improve type hints and add error handling for metadata access

The return type hint could be more specific using Union from typing module

Add error handling for KeyError when accessing metadata field

+from typing import Union + def get_platform_metadata( self, platform_id: ObjectId, metadata_name: str - ) -> str | dict | list: + ) -> Union[str, dict, list]: """ get the userid that belongs to a platform Parameters ----------- platform_id : bson.ObjectId the platform id we need their owner user id metadata_name : str a specific field of metadata that we want Returns --------- metadata_value : Union[str, dict, list] the values that the metadata belongs to Raises ------- ValueError If platform not found or metadata field doesn't exist """ client = MongoSingleton.get_instance().get_client() platform = client["Core"]["platforms"].find_one( { "_id": platform_id, "disconnectedAt": None, }, { f"metadata.{metadata_name}": 1, }, ) if platform is None: raise ValueError(f"No platform available given platform id: {platform_id}") + try: metadata_field = platform["metadata"][metadata_name] + except (KeyError, TypeError): + raise ValueError(f"Metadata field '{metadata_name}' not found") return metadata_field

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def get_platform_metadata(

self, platform_id: ObjectId, metadata_name: str

) -> str | dict | list:

"""

get the userid that belongs to a platform

Parameters

-----------

platform_id : bson.ObjectId

the platform id we need their owner user id

metadata_name : str

a specific field of metadata that we want

Returns

---------

metadata_value : Any

the values that the metadata belongs to

"""

client = MongoSingleton.get_instance().get_client()

platform = client["Core"]["platforms"].find_one(

{

"_id": platform_id,

"disconnectedAt": None,

},

{

f"metadata.{metadata_name}": 1,

},

)

if platform is None:

raise ValueError(f"No platform available given platform id: {platform_id}")

metadata_field = platform["metadata"][metadata_name]

return metadata_field

from typing import Union

def get_platform_metadata(

self, platform_id: ObjectId, metadata_name: str

) -> Union[str, dict, list]:

"""

get the userid that belongs to a platform

Parameters

-----------

platform_id : bson.ObjectId

the platform id we need their owner user id

metadata_name : str

a specific field of metadata that we want

Returns

---------

metadata_value : Union[str, dict, list]

the values that the metadata belongs to

Raises

-------

ValueError

If platform not found or metadata field doesn't exist

"""

client = MongoSingleton.get_instance().get_client()

platform = client["Core"]["platforms"].find_one(

{

"_id": platform_id,

"disconnectedAt": None,

},

{

f"metadata.{metadata_name}": 1,

},

)

if platform is None:

raise ValueError(f"No platform available given platform id: {platform_id}")

try:

metadata_field = platform["metadata"][metadata_name]

except (KeyError, TypeError):

raise ValueError(f"Metadata field '{metadata_name}' not found")

return metadata_field

amindadgar added 2 commits November 20, 2024 16:05

feat: Added more test cases!

3ab6239

feat: Added more shared codes!

b4e4f37

coderabbitai bot reviewed Nov 20, 2024

View reviewed changes

fix: isort linter issues!

fc5d5b4

coderabbitai bot reviewed Nov 20, 2024

View reviewed changes

fix: codeRabbitAI suggestions!

71fb776

amindadgar merged commit 6b03e66 into main Nov 20, 2024
13 checks passed

coderabbitai bot reviewed Nov 20, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat: add shared codes #18

Feat: add shared codes #18

amindadgar commented Nov 20, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 20, 2024 •

edited

Loading

Rate limit exceeded

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Nov 20, 2024

coderabbitai bot Nov 20, 2024

coderabbitai bot left a comment

coderabbitai bot Nov 20, 2024

coderabbitai bot Nov 20, 2024

coderabbitai bot left a comment

coderabbitai bot Nov 20, 2024

coderabbitai bot Nov 20, 2024

coderabbitai bot Nov 20, 2024

Feat: add shared codes #18

Feat: add shared codes #18

Conversation

amindadgar commented Nov 20, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Nov 20, 2024 • edited Loading

Walkthrough

Changes

Possibly related PRs

Rate limit exceeded

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

amindadgar commented Nov 20, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 20, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)