Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: add shared codes #18

Merged
merged 4 commits into from
Nov 20, 2024
Merged

Feat: add shared codes #18

merged 4 commits into from
Nov 20, 2024

Conversation

amindadgar
Copy link
Member

@amindadgar amindadgar commented Nov 20, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced new environment variables for the app service: REDIS_HOST, REDIS_PORT, and REDIS_PASSWORD.
    • Added a new class ModulesBase for managing platform-related metadata and tokens in the database.
  • Version Updates

    • Updated the version number of the package to 1.4.0.
  • Testing Enhancements

    • Added new integration and unit tests for MongoSingleton, QdrantSingleton, and RedisSingleton classes to ensure robust functionality and error handling.

Copy link

coderabbitai bot commented Nov 20, 2024

Walkthrough

The changes in this pull request involve updates to several files, including the addition of new environment variables for the Redis configuration in docker-compose.test.yml, an increment in the version number in setup.py, and the introduction of a new class ModulesBase in modules_base.py for MongoDB interactions. Additionally, several new test classes were created to validate the functionality of the ModulesBase, MongoSingleton, QdrantSingleton, and RedisSingleton classes. Type hinting improvements were made in mongo.py, enhancing code clarity.

Changes

File Change Summary
docker-compose.test.yml Added environment variables: REDIS_HOST, REDIS_PORT, REDIS_PASSWORD to the app service and introduced a new redis service with health checks.
setup.py Updated version from 1.3.0 to 1.4.0.
tc_hivemind_backend/db/modules_base.py Added class ModulesBase with methods: query, get_platform_community_ids, get_token, get_platform_metadata.
tc_hivemind_backend/db/mongo.py Updated type hint for __instance in MongoSingleton from `"MongoSingleton"
tc_hivemind_backend/tests/integration/test_modules_base_query_token.py Added test class TestModulesBaseQueryToken for get_token method with several test cases.
tc_hivemind_backend/tests/integration/test_mongo_singleton.py Added test class TestMongoSingletonIntegration for integration testing of MongoSingleton.
tc_hivemind_backend/tests/unit/test_mongo_singleton.py Added test class TestMongoSingleton for unit testing of MongoSingleton and get_mongo_uri.
tc_hivemind_backend/tests/unit/test_qdrant_singleton.py Added test class TestQdrantSingleton for unit testing of QdrantSingleton.
tc_hivemind_backend/tests/unit/test_redis_singleton.py Added test class TestRedisSingleton for unit testing of RedisSingleton.

Possibly related PRs

🐰 "In the code garden, changes bloom,
Redis whispers, 'Make some room!'
New tests hop, and classes play,
Version's up, hip-hip-hooray!
Mongo's secrets, we now can find,
With every change, we're more aligned!" 🐇✨

Warning

Rate limit exceeded

@amindadgar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 14 minutes and 16 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between fc5d5b4 and 71fb776.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (21)
tc_hivemind_backend/tests/integration/test_mongo_singleton.py (1)

7-26: Consider adding more test cases for comprehensive coverage.

The test suite would benefit from additional test cases:

  1. Test connection with invalid credentials
  2. Test connection with invalid host/port
  3. Test connection string configuration validation
  4. Test database and collection access

Would you like me to provide example implementations for these test cases?

tc_hivemind_backend/db/mongo.py (2)

Line range hint 12-29: Consider adding thread safety to the singleton implementation

The current implementation could face race conditions in a multi-threaded environment. Consider using double-checked locking pattern with threading.Lock().

Here's a suggested implementation:

+ import threading

class MongoSingleton:
    __instance: Optional["MongoSingleton"] = None
+   __lock = threading.Lock()

    def __init__(self):
        if MongoSingleton.__instance is not None:
            raise Exception("This class is a singleton!")
        else:
            connection_uri = get_mongo_uri()
            self.client = MongoClient(connection_uri)
            MongoSingleton.__instance = self

    @staticmethod
    def get_instance():
-       if MongoSingleton.__instance is None:
-           MongoSingleton()
+       if not MongoSingleton.__instance:
+           with MongoSingleton.__lock:
+               if not MongoSingleton.__instance:
+                   MongoSingleton()

Also, consider catching specific MongoDB exceptions instead of generic Exception:

-           except Exception as exp:
+           except (ConnectionError, ServerSelectionTimeoutError) as exp:

Line range hint 39-47: Enhance MongoDB connection security and reliability

The connection string could benefit from additional security and reliability options:

  1. SSL/TLS configuration for secure communication
  2. Connection pool settings
  3. Timeout configurations
  4. Retry settings for better reliability

Consider updating the connection string construction:

-    connection = f"mongodb://{user}:{password}@{host}:{port}"
+    connection = (
+        f"mongodb://{user}:{password}@{host}:{port}"
+        "?ssl=true"
+        "&retryWrites=true"
+        "&w=majority"
+        "&maxPoolSize=50"
+        "&connectTimeoutMS=2000"
+    )
tc_hivemind_backend/tests/unit/test_redis_singleton.py (5)

7-9: Consider adding a tearDown method and improving singleton reset mechanism.

While the setUp method correctly resets the singleton, consider:

  1. Adding a tearDown method to ensure clean state after tests
  2. Adding a public reset method in RedisSingleton instead of accessing private attribute
 def setUp(self):
     RedisSingleton._RedisSingleton__instance = None

+def tearDown(self):
+    RedisSingleton._RedisSingleton__instance = None

10-19: Consider splitting singleton pattern test into two methods.

For better test clarity and single responsibility, consider splitting this into:

  • test_singleton_returns_same_instance
  • test_singleton_prevents_direct_instantiation
-def test_singleton_pattern(self):
+def test_singleton_returns_same_instance(self):
     instance1 = RedisSingleton.get_instance()
     instance2 = RedisSingleton.get_instance()
     self.assertIs(instance1, instance2)
+
+def test_singleton_prevents_direct_instantiation(self):
     with self.assertRaises(Exception) as context:
         RedisSingleton()
     self.assertEqual(str(context.exception), "This class is a singleton!")

39-64: Add assertions for Redis client state.

While the test verifies logging, consider adding assertions for the client's connected state and availability:

 RedisSingleton.get_instance()

 mock_info_log.assert_called_once_with(
     "Redis Connected Successfully! Ping returned: True"
 )
 mock_error_log.assert_not_called()
+self.assertTrue(mock_client.ping.called)  # Verify ping was attempted
+self.assertIsNotNone(RedisSingleton.get_instance().get_client())  # Verify client is available

65-90: Use specific exception type for connection failure.

Instead of using a generic Exception, consider using a more specific Redis connection exception for better error handling simulation:

-mock_client.ping.side_effect = Exception("Connection failed")
+from redis.exceptions import ConnectionError
+mock_client.ping.side_effect = ConnectionError("Connection failed")

91-102: Consider using a custom exception for credentials loading failure.

Instead of using a generic Exception, consider defining and using a custom exception type for credential loading failures:

class CredentialsLoadError(Exception):
    pass

Then update the test:

-mock_credentials_instance.load_redis.side_effect = Exception(
+mock_credentials_instance.load_redis.side_effect = CredentialsLoadError(
     "Failed to load credentials"
 )
tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (3)

10-19: Consider splitting singleton pattern tests for better clarity.

While the test coverage is good, consider splitting test_singleton_pattern into two separate test methods for better clarity:

  1. test_singleton_instance_reuse
  2. test_singleton_direct_instantiation_prevention
-    def test_singleton_pattern(self):
-        instance1 = QdrantSingleton.get_instance()
-        instance2 = QdrantSingleton.get_instance()
-
-        self.assertIs(instance1, instance2)
-
-        with self.assertRaises(Exception) as context:
-            QdrantSingleton()
-        self.assertEqual(str(context.exception), "This class is a singleton!")
+    def test_singleton_instance_reuse(self):
+        instance1 = QdrantSingleton.get_instance()
+        instance2 = QdrantSingleton.get_instance()
+        self.assertIs(instance1, instance2)
+
+    def test_singleton_direct_instantiation_prevention(self):
+        with self.assertRaises(Exception) as context:
+            QdrantSingleton()
+        self.assertEqual(str(context.exception), "This class is a singleton!")

20-55: Consider using class-level constants for test data.

The test data could be moved to class-level constants to improve maintainability and reusability across test methods.

 class TestQdrantSingleton(unittest.TestCase):
+    TEST_HOST = "test_host"
+    TEST_PORT = 6333
+    TEST_API_KEY = "test_api_key"
+    
+    @classmethod
+    def _get_mock_credentials(cls, with_api_key=False):
+        return {
+            "host": cls.TEST_HOST,
+            "port": cls.TEST_PORT,
+            "api_key": cls.TEST_API_KEY if with_api_key else "",
+        }

Then update the test methods to use these constants:

-        mock_load_credentials.return_value = {
-            "host": "test_host",
-            "port": 6333,
-            "api_key": "",
-        }
+        mock_load_credentials.return_value = self._get_mock_credentials()

102-108: Consider adding more error scenarios.

While the current error handling test is good, consider adding tests for other potential failure scenarios:

  • Network timeout errors
  • Authentication failures
  • Invalid host/port configurations
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (4)

12-12: Consider using a public reset method for testing

Directly accessing the private __instance variable breaks encapsulation. Consider adding a reset_instance class method to MongoSingleton specifically for testing purposes.

# In MongoSingleton class
@classmethod
def reset_instance(cls):
    """Reset the singleton instance (for testing only)"""
    cls.__instance = None

13-20: Consider extracting logging setup to a helper method

The logging setup code could be extracted into a helper method to improve readability and reusability.

def setup_logging_capture(self):
    self.log_records = []
    self.logger = logging.getLogger()
    self.old_level = self.logger.level
    self.logger.setLevel(logging.INFO)
    self.handler = logging.Handler()
    self.handler.emit = lambda record: self.log_records.append(record)
    self.logger.addHandler(self.handler)

99-108: Consider additional client verification

While the basic client instance check is good, consider adding assertions for:

  • Client configuration properties
  • Connection parameters
  • Default database selection

1-108: Consider adding these test scenarios

The test suite is well-structured, but consider adding tests for:

  1. Connection timeout handling
  2. Reconnection attempts
  3. Thread safety of the singleton pattern
  4. Connection pool configuration

Would you like me to provide example implementations for any of these test scenarios?

🧰 Tools
🪛 Ruff

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)


90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

tc_hivemind_backend/tests/integration/test_modules_base_query_token.py (5)

10-14: Add tearDown method to clean up after tests

While the setUp method correctly prepares the test environment, it's a good practice to also implement a tearDown method to ensure proper cleanup after each test, even if the collections are dropped in setUp.

def tearDown(self) -> None:
    self.client["Core"].drop_collection("tokens")
    self.client["Core"].drop_collection("platforms")

15-53: Add docstring to describe test purpose and data setup

The test method would benefit from a clear docstring explaining the test scenario and the relationship between the platform and token documents.

def test_one_token(self):
    """Test successful token retrieval when valid platform and token exist.
    
    Sets up:
    1. A platform document with a specific user
    2. A matching token document for that user
    Verifies that get_token returns the correct token value.
    """

Also, consider moving the test data setup to class-level constants for better maintainability:

class TestModulesBaseQueryToken(TestCase):
    SAMPLE_USER_ID = ObjectId("5d7baf326c8a2e2400000000")
    COMMUNITY_ID = ObjectId("6579c364f1120850414e0dc5")
    PLATFORM_ID = ObjectId("6579c364f1120850414e0dc6")

54-61: Add docstring to clarify error scenario

Add a docstring to explain the expected behavior when attempting to retrieve a token from an empty collection.

def test_empty_tokens_collection(self):
    """Test that attempting to get a token from an empty collection raises ValueError.
    
    Verifies the error handling when no tokens exist in the database.
    """

62-82: Add docstring and reduce token setup duplication

The token document setup is duplicated across multiple tests. Consider extracting it to a helper method.

def _create_token_doc(self, user_id, token_value, token_type):
    """Helper method to create a token document with standard fields."""
    return {
        "token": token_value,
        "user": user_id,
        "type": token_type,
        "expires": datetime.now() + timedelta(days=1),
        "blacklisted": False,
        "createdAt": datetime.now() - timedelta(days=1),
        "updatedAt": datetime.now() - timedelta(days=1),
    }

Also, add a docstring to explain the test scenario:

def test_no_platform(self):
    """Test that attempting to get a token without a corresponding platform raises ValueError.
    
    Sets up a token document but no platform document, verifying that the method
    properly validates platform existence.
    """

83-121: Add docstring and reduce platform setup duplication

The platform document setup is duplicated. Consider extracting it to a helper method along with the previously suggested token helper.

def _create_platform_doc(self, platform_id, user_id, community_id):
    """Helper method to create a platform document with standard fields."""
    return {
        "_id": platform_id,
        "name": "platform_name",
        "metadata": {
            "id": "113445975232201081511",
            "userId": str(user_id),
        },
        "community": community_id,
        "disconnectedAt": None,
        "connectedAt": datetime.now(),
        "createdAt": datetime.now(),
        "updatedAt": datetime.now(),
    }

Add a docstring to explain the test scenario:

def test_no_token(self):
    """Test that attempting to get a token for a platform with mismatched user raises ValueError.
    
    Sets up:
    1. A platform document for user A
    2. A token document for user B
    Verifies that the method properly validates user matching between platform and token.
    """
tc_hivemind_backend/db/modules_base.py (1)

9-9: Type Hint Compatibility with Python Versions Below 3.9

The use of list[dict] as a return type hint requires Python 3.9 or newer. If the project supports earlier Python versions, consider importing List from the typing module.

Modify the import and type hint as follows:

from typing import List, Dict

def query(self, platform: str, **kwargs) -> List[Dict]:
    ...
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2a3a9ff and b4e4f37.

📒 Files selected for processing (9)
  • docker-compose.test.yml (1 hunks)
  • setup.py (1 hunks)
  • tc_hivemind_backend/db/modules_base.py (1 hunks)
  • tc_hivemind_backend/db/mongo.py (2 hunks)
  • tc_hivemind_backend/tests/integration/test_modules_base_query_token.py (1 hunks)
  • tc_hivemind_backend/tests/integration/test_mongo_singleton.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_redis_singleton.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • setup.py
🧰 Additional context used
🪛 Ruff
tc_hivemind_backend/tests/unit/test_mongo_singleton.py

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)


90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

🔇 Additional comments (14)
tc_hivemind_backend/tests/integration/test_mongo_singleton.py (1)

1-9: LGTM! Well-structured test class setup.

The imports are appropriate, and the class is well-documented with a clear purpose.

tc_hivemind_backend/db/mongo.py (1)

10-10: LGTM! Improved type hint readability

The change to Optional["MongoSingleton"] is more idiomatic Python and clearer than the union type syntax.

docker-compose.test.yml (1)

24-24: ⚠️ Potential issue

Security concern: Empty Redis password in test environment

While this is a test environment, it's recommended to use a non-empty password for Redis to maintain security practices consistent with production environments and prevent potential security issues during testing.

tc_hivemind_backend/tests/unit/test_redis_singleton.py (2)

20-38: Verify port value type handling in Redis initialization.

The test uses a string port value "6379" in mock data but asserts it as an integer 6379 in Redis initialization. While this works, it's better to explicitly test the type conversion:

 mock_credentials_instance.load_redis.return_value = {
     "host": "test_host",
-    "port": "6379",
+    "port": 6379,  # Use integer to match expected type
     "password": "test_password",
 }

104-105: LGTM!

The main block is correctly implemented.

tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (3)

1-9: LGTM! Well-structured test setup.

The imports are clean and specific, and the setUp method ensures proper test isolation by resetting the singleton instance.


56-101: LGTM! Comprehensive connection testing.

The connection tests are well-structured with:

  • Proper verification of success/failure scenarios
  • Comprehensive logging assertions
  • Effective error simulation

1-108: Overall excellent test implementation!

The test suite is well-structured with comprehensive coverage of the QdrantSingleton functionality. It effectively tests:

  • Singleton pattern enforcement
  • Different initialization scenarios
  • Connection handling
  • Error cases
  • Logging behavior

The suggestions provided are for enhancement rather than critical issues.

tc_hivemind_backend/tests/unit/test_mongo_singleton.py (3)

1-7: LGTM! Clean and well-organized imports.

The imports are appropriate and include all necessary components for testing.


28-42: Consider adding edge cases for URI construction

While the happy path is well tested, consider adding test cases for:

  • Missing credentials
  • Special characters in username/password that need URL encoding
  • Different authentication mechanisms

43-64: LGTM! Thorough singleton pattern testing.

The tests effectively verify both the singleton instance creation and prevention of direct instantiation.

tc_hivemind_backend/tests/integration/test_modules_base_query_token.py (1)

1-121: Consider adding tests for token expiration and blacklist status

The current test suite covers basic token retrieval and error cases, but doesn't verify the behavior when:

  1. The token is expired
  2. The token is blacklisted

Consider adding these test cases to ensure complete coverage of the token validation logic.

Would you like me to help generate these additional test cases?

tc_hivemind_backend/db/modules_base.py (2)

131-132: Handle Missing Metadata Fields to Prevent KeyError

In the get_platform_metadata method, accessing platform["metadata"][metadata_name] without checking if the key exists can raise a KeyError if metadata_name is not present in the metadata.

Consider adding error handling to manage the absence of the metadata field:

     metadata_field = platform["metadata"].get(metadata_name)
+    if metadata_field is None:
+        raise ValueError(f"Metadata '{metadata_name}' not found for platform id: {platform_id}")
     return metadata_field

30-36: Potential Issue with Querying Embedded Documents

In the query method, the filter {"options.platforms.name": platform} assumes that options.platforms is a dictionary. If options.platforms is a list of platform dictionaries, this query may not work as intended.

If options.platforms is a list, use the $elemMatch operator to correctly filter the documents:

     cursor = client["Core"]["modules"].find(
         {
-            "options.platforms.name": platform,
+            "options.platforms": {"$elemMatch": {"name": platform}},
             "name": "hivemind",
         },
         projection,
     )

Comment on lines +10 to +13
def setUp(self):
"""Reset the singleton instance before each test"""
MongoSingleton._MongoSingleton__instance = None

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider adding a reset method to MongoSingleton.

Direct manipulation of the private __instance variable breaks encapsulation. Consider adding a reset_instance() class method to MongoSingleton instead.

# In MongoSingleton class:
+    @classmethod
+    def reset_instance(cls) -> None:
+        """Reset the singleton instance for testing purposes."""
+        cls.__instance = None

# In test class:
     def setUp(self):
         """Reset the singleton instance before each test"""
-        MongoSingleton._MongoSingleton__instance = None
+        MongoSingleton.reset_instance()

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +14 to +26
def test_real_connection(self):
"""Integration test for real MongoDB connection"""
instance = MongoSingleton.get_instance()
client = instance.get_client()
self.assertIsInstance(client, MongoClient)

# Test actual connection
try:
info = client.server_info()
self.assertIsInstance(info, dict)
self.assertIn("version", info)
except Exception as e:
self.fail(f"Failed to connect to MongoDB: {e}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add connection cleanup and improve error handling.

Several improvements can enhance the test reliability:

  1. Add connection cleanup in tearDown
  2. Specify connection timeout
  3. Catch specific MongoDB exceptions
  4. Verify singleton behavior
+from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError
+
 class TestMongoSingletonIntegration(unittest.TestCase):
+    def tearDown(self):
+        """Clean up MongoDB connection after each test"""
+        instance = MongoSingleton.get_instance()
+        client = instance.get_client()
+        client.close()
+        MongoSingleton.reset_instance()
+
     def test_real_connection(self):
         """Integration test for real MongoDB connection"""
         instance = MongoSingleton.get_instance()
+        # Verify singleton behavior
+        second_instance = MongoSingleton.get_instance()
+        self.assertIs(instance, second_instance)
+
         client = instance.get_client()
         self.assertIsInstance(client, MongoClient)

         # Test actual connection
         try:
-            info = client.server_info()
+            info = client.server_info(timeout_ms=5000)
             self.assertIsInstance(info, dict)
             self.assertIn("version", info)
-        except Exception as e:
+        except (ConnectionFailure, ServerSelectionTimeoutError) as e:
             self.fail(f"Failed to connect to MongoDB: {e}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def test_real_connection(self):
"""Integration test for real MongoDB connection"""
instance = MongoSingleton.get_instance()
client = instance.get_client()
self.assertIsInstance(client, MongoClient)
# Test actual connection
try:
info = client.server_info()
self.assertIsInstance(info, dict)
self.assertIn("version", info)
except Exception as e:
self.fail(f"Failed to connect to MongoDB: {e}")
from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError
class TestMongoSingletonIntegration(unittest.TestCase):
def tearDown(self):
"""Clean up MongoDB connection after each test"""
instance = MongoSingleton.get_instance()
client = instance.get_client()
client.close()
MongoSingleton.reset_instance()
def test_real_connection(self):
"""Integration test for real MongoDB connection"""
instance = MongoSingleton.get_instance()
# Verify singleton behavior
second_instance = MongoSingleton.get_instance()
self.assertIs(instance, second_instance)
client = instance.get_client()
self.assertIsInstance(client, MongoClient)
# Test actual connection
try:
info = client.server_info(timeout_ms=5000)
self.assertIsInstance(info, dict)
self.assertIn("version", info)
except (ConnectionFailure, ServerSelectionTimeoutError) as e:
self.fail(f"Failed to connect to MongoDB: {e}")

docker-compose.test.yml Show resolved Hide resolved
tc_hivemind_backend/tests/unit/test_mongo_singleton.py Outdated Show resolved Hide resolved
tc_hivemind_backend/db/modules_base.py Outdated Show resolved Hide resolved
tc_hivemind_backend/db/modules_base.py Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
tc_hivemind_backend/db/modules_base.py (2)

6-8: Add class documentation and consider removing empty init

The class would benefit from a docstring explaining its purpose and responsibilities. Since the __init__ method is empty, it can be removed entirely as Python will automatically provide a default constructor.

 class ModulesBase:
-    def __init__(self) -> None:
-        pass
+    """
+    Base class for managing MongoDB operations related to platform modules,
+    providing functionality for querying module metadata, tokens, and community IDs.
+    """

57-57: Use list comprehension for better readability

Replace the map function with a more Pythonic list comprehension.

-        community_ids = list(map(lambda x: str(x["community"]), modules))
+        community_ids = [str(doc["community"]) for doc in modules]
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1)

10-26: Consider extracting logging setup into a helper method.

While the logging setup is comprehensive, it could be made more maintainable by extracting it into a helper method. This would improve readability and make it easier to modify logging configuration in the future.

+ def _setup_logging(self):
+     self.log_records = []
+     self.logger = logging.getLogger()
+     self.old_level = self.logger.level
+     self.logger.setLevel(logging.INFO)
+     self.handler = logging.Handler()
+     self.handler.emit = lambda record: self.log_records.append(record)
+     self.logger.addHandler(self.handler)

+ def _cleanup_logging(self):
+     self.logger.removeHandler(self.handler)
+     self.logger.setLevel(self.old_level)
+     self.log_records = []

  def setUp(self):
      """Reset the singleton instance before each test"""
      MongoSingleton._MongoSingleton__instance = None
-     self.log_records = []
-     self.logger = logging.getLogger()
-     self.old_level = self.logger.level
-     self.logger.setLevel(logging.INFO)
-     self.handler = logging.Handler()
-     self.handler.emit = lambda record: self.log_records.append(record)
-     self.logger.addHandler(self.handler)
+     self._setup_logging()

  def tearDown(self):
      """Clean up after each test"""
-     self.logger.removeHandler(self.handler)
-     self.logger.setLevel(self.old_level)
-     self.log_records = []
+     self._cleanup_logging()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b4e4f37 and fc5d5b4.

📒 Files selected for processing (5)
  • tc_hivemind_backend/db/modules_base.py (1 hunks)
  • tc_hivemind_backend/db/mongo.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_qdrant_singleton.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_redis_singleton.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tc_hivemind_backend/db/mongo.py
  • tc_hivemind_backend/tests/unit/test_qdrant_singleton.py
  • tc_hivemind_backend/tests/unit/test_redis_singleton.py
🧰 Additional context used
🪛 Ruff
tc_hivemind_backend/tests/unit/test_mongo_singleton.py

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)


90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

🔇 Additional comments (2)
tc_hivemind_backend/tests/unit/test_mongo_singleton.py (2)

1-7: LGTM! Well-organized imports.

The imports are properly organized and include all necessary modules for testing.


28-108: LGTM! Comprehensive test coverage with proper test isolation.

The test suite demonstrates:

  • Thorough coverage of MongoDB connection scenarios
  • Proper mocking of external dependencies
  • Clear test method organization
  • Good validation of singleton pattern

Skipping comment about unused variables on lines 72 and 90 as it was already addressed in previous reviews.

🧰 Tools
🪛 Ruff

72-72: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)


90-90: Local variable instance is assigned to but never used

Remove assignment to unused variable instance

(F841)

Comment on lines 61 to 98
def get_token(self, platform_id: ObjectId, token_type: str) -> str:
"""
get a specific type of token for a platform
This method is called when we needed a token for modules to extract its data

Parameters
------------
platform_id : ObjectId
the platform id that we want their token
token_type : str
the type of token. i.e. `google_refresh`

Returns
--------
token : str
the token that was required for module's ETL process
"""
client = MongoSingleton.get_instance().client

user_id = self.get_platform_metadata(platform_id, "userId")
user_id = ObjectId(user_id)
token_doc = client["Core"]["tokens"].find_one(
{
"user": user_id,
"type": token_type,
},
{
"token": 1,
},
sort=[("createdAt", -1)],
)
if token_doc is None:
raise ValueError(
f"No Token for the given user {user_id} "
"in tokens collection of the Core database!"
)
token = token_doc["token"]
return token
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Enhance security and validation in token handling

Several security and validation improvements are needed:

  1. Validate token_type parameter
  2. Avoid exposing internal IDs in error messages
  3. Consider implementing token encryption
     def get_token(self, platform_id: ObjectId, token_type: str) -> str:
+        VALID_TOKEN_TYPES = {'google_refresh', 'other_valid_types'}  # Add valid types
+        if not isinstance(token_type, str) or token_type not in VALID_TOKEN_TYPES:
+            raise ValueError("Invalid token type")
+
         client = MongoSingleton.get_instance().client
 
         user_id = self.get_platform_metadata(platform_id, "userId")
         user_id = ObjectId(user_id)
-        token_doc = client["Core"]["tokens"].find_one(
+        try:
+            token_doc = client["Core"]["tokens"].find_one(
             {
                 "user": user_id,
                 "type": token_type,
@@ -91,9 +95,8 @@
         )
         if token_doc is None:
             raise ValueError(
-                f"No Token for the given user {user_id} "
-                "in tokens collection of the Core database!"
+                "Token not found"
             )
         token = token_doc["token"]
-        return token
+        return self._decrypt_token(token)  # Consider implementing token encryption

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 10 to 39
def query(self, platform: str, **kwargs) -> list[dict]:
"""
query the modules database for to get platforms' metadata

Parameters
-----------
platform : str
the platform to choose
it can be `github`, `discourse`, `discord` or etc
**kwargs : dict
projection : dict[str, int]
feature projection on query

Returns
---------
modules_docs : list[dict]
all the module documents that have the `platform` within them
"""
client = MongoSingleton.get_instance().client
projection = kwargs.get("projection", {})

cursor = client["Core"]["modules"].find(
{
"options.platforms.name": platform,
"name": "hivemind",
},
projection,
)
modules_docs = list(cursor)
return modules_docs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add input validation and error handling

The query method should validate the platform parameter and include error handling for MongoDB operations:

  1. Validate platform against allowed values
  2. Add error handling for MongoDB operations
  3. Consider adding index hints for query optimization
     def query(self, platform: str, **kwargs) -> list[dict]:
+        VALID_PLATFORMS = {'github', 'discourse', 'discord'}
+        if platform not in VALID_PLATFORMS:
+            raise ValueError(f"Invalid platform. Must be one of: {VALID_PLATFORMS}")
+
         client = MongoSingleton.get_instance().client
         projection = kwargs.get("projection", {})

-        cursor = client["Core"]["modules"].find(
-            {
-                "options.platforms.name": platform,
-                "name": "hivemind",
-            },
-            projection,
-        )
-        modules_docs = list(cursor)
-        return modules_docs
+        try:
+            cursor = client["Core"]["modules"].find(
+                {
+                    "options.platforms.name": platform,
+                    "name": "hivemind",
+                },
+                projection,
+            ).hint("options.platforms.name_1")  # Assuming index exists
+            return list(cursor)
+        except Exception as e:
+            raise RuntimeError(f"Failed to query modules: {str(e)}") from e

Committable suggestion skipped: line range outside the PR's diff.

@amindadgar amindadgar merged commit 6b03e66 into main Nov 20, 2024
13 checks passed
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)
tc_hivemind_backend/db/modules_base.py (2)

6-9: Add class documentation and remove empty init

The class would benefit from a docstring explaining its purpose and responsibilities. Also, the empty __init__ method can be removed as it doesn't provide any functionality.

 class ModulesBase:
-    def __init__(self) -> None:
-        pass
+    """
+    Base class for handling MongoDB operations related to modules, platforms, and tokens.
+    Provides methods for querying module data, retrieving community IDs, tokens, and platform metadata.
+    """

41-60: Improve documentation and add error handling

  1. The docstring mentions "discord platform" specifically, but the method is platform-agnostic.
  2. Consider adding error handling for empty results.
     def get_platform_community_ids(self, platform_name: str) -> list[str]:
         """
         get all community ids that a platform has
 
         Parameters
         ------------
         platform_name : str
             the platform having community id and available for hivemind module
 
         Returns
         --------
         community_ids : list[str]
-            id of communities that has discord platform and hivemind module enabled
+            id of communities that have the specified platform and hivemind module enabled
 
         """
         modules = self.query(platform=platform_name, projection={"community": 1})
+        if not modules:
+            raise ValueError(f"No communities found for platform: {platform_name}")
         community_ids = list(map(lambda x: str(x["community"]), modules))
 
         return community_ids
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fc5d5b4 and 71fb776.

📒 Files selected for processing (3)
  • docker-compose.test.yml (3 hunks)
  • tc_hivemind_backend/db/modules_base.py (1 hunks)
  • tc_hivemind_backend/tests/unit/test_mongo_singleton.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tc_hivemind_backend/tests/unit/test_mongo_singleton.py
🔇 Additional comments (2)
docker-compose.test.yml (1)

22-24: Consider securing Redis with a password in non-development environments.

The Redis password is currently empty. While this might be acceptable for local development, ensure that proper authentication is configured in staging and production environments.

tc_hivemind_backend/db/modules_base.py (1)

10-40: LGTM with previous suggestions

The method implementation is functional, but please refer to the previous review comment about adding input validation, error handling, and index hints.

Comment on lines +77 to +87
redis:
image: redis:latest
ports:
- "6379:6379"
networks:
- python_service_network
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Pin Redis version to a specific release.

Using the latest tag can lead to unexpected behavior or breaking changes when the image is updated. Consider pinning to a specific version for better stability and reproducibility.

-    image: redis:latest
+    image: redis:7.2.3
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
redis:
image: redis:latest
ports:
- "6379:6379"
networks:
- python_service_network
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7.2.3
ports:
- "6379:6379"
networks:
- python_service_network
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5

Comment on lines +93 to +96
raise ValueError(
f"No Token for the given user {user_id} "
"in tokens collection of the Core database!"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove sensitive information from error message

The error message exposes the internal user ID, which could be a security concern. Consider using a more generic error message.

         if token_doc is None:
             raise ValueError(
-                f"No Token for the given user {user_id} "
-                "in tokens collection of the Core database!"
+                "No valid token found for the specified platform and token type"
             )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
raise ValueError(
f"No Token for the given user {user_id} "
"in tokens collection of the Core database!"
)
raise ValueError(
"No valid token found for the specified platform and token type"
)

Comment on lines +100 to +133
def get_platform_metadata(
self, platform_id: ObjectId, metadata_name: str
) -> str | dict | list:
"""
get the userid that belongs to a platform

Parameters
-----------
platform_id : bson.ObjectId
the platform id we need their owner user id
metadata_name : str
a specific field of metadata that we want

Returns
---------
metadata_value : Any
the values that the metadata belongs to
"""
client = MongoSingleton.get_instance().get_client()

platform = client["Core"]["platforms"].find_one(
{
"_id": platform_id,
"disconnectedAt": None,
},
{
f"metadata.{metadata_name}": 1,
},
)
if platform is None:
raise ValueError(f"No platform available given platform id: {platform_id}")

metadata_field = platform["metadata"][metadata_name]
return metadata_field
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve type hints and add error handling for metadata access

  1. The return type hint could be more specific using Union from typing module
  2. Add error handling for KeyError when accessing metadata field
+from typing import Union
+
     def get_platform_metadata(
         self, platform_id: ObjectId, metadata_name: str
-    ) -> str | dict | list:
+    ) -> Union[str, dict, list]:
         """
         get the userid that belongs to a platform
 
         Parameters
         -----------
         platform_id : bson.ObjectId
             the platform id we need their owner user id
         metadata_name : str
             a specific field of metadata that we want
 
         Returns
         ---------
         metadata_value : Union[str, dict, list]
             the values that the metadata belongs to
 
         Raises
         -------
         ValueError
             If platform not found or metadata field doesn't exist
         """
         client = MongoSingleton.get_instance().get_client()
 
         platform = client["Core"]["platforms"].find_one(
             {
                 "_id": platform_id,
                 "disconnectedAt": None,
             },
             {
                 f"metadata.{metadata_name}": 1,
             },
         )
         if platform is None:
             raise ValueError(f"No platform available given platform id: {platform_id}")
 
+        try:
             metadata_field = platform["metadata"][metadata_name]
+        except (KeyError, TypeError):
+            raise ValueError(f"Metadata field '{metadata_name}' not found")
         return metadata_field
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_platform_metadata(
self, platform_id: ObjectId, metadata_name: str
) -> str | dict | list:
"""
get the userid that belongs to a platform
Parameters
-----------
platform_id : bson.ObjectId
the platform id we need their owner user id
metadata_name : str
a specific field of metadata that we want
Returns
---------
metadata_value : Any
the values that the metadata belongs to
"""
client = MongoSingleton.get_instance().get_client()
platform = client["Core"]["platforms"].find_one(
{
"_id": platform_id,
"disconnectedAt": None,
},
{
f"metadata.{metadata_name}": 1,
},
)
if platform is None:
raise ValueError(f"No platform available given platform id: {platform_id}")
metadata_field = platform["metadata"][metadata_name]
return metadata_field
from typing import Union
def get_platform_metadata(
self, platform_id: ObjectId, metadata_name: str
) -> Union[str, dict, list]:
"""
get the userid that belongs to a platform
Parameters
-----------
platform_id : bson.ObjectId
the platform id we need their owner user id
metadata_name : str
a specific field of metadata that we want
Returns
---------
metadata_value : Union[str, dict, list]
the values that the metadata belongs to
Raises
-------
ValueError
If platform not found or metadata field doesn't exist
"""
client = MongoSingleton.get_instance().get_client()
platform = client["Core"]["platforms"].find_one(
{
"_id": platform_id,
"disconnectedAt": None,
},
{
f"metadata.{metadata_name}": 1,
},
)
if platform is None:
raise ValueError(f"No platform available given platform id: {platform_id}")
try:
metadata_field = platform["metadata"][metadata_name]
except (KeyError, TypeError):
raise ValueError(f"Metadata field '{metadata_name}' not found")
return metadata_field

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant