98 hivemind refactor GitHub issue extraction functions to object orientated code #107

JacobCWBillings · 2024-04-04T23:13:37Z

Summary by CodeRabbit

Refactor
- Enhanced the GitHub issues extraction process by introducing a more structured approach with the GithubExtraction and GithubIssueExtraction classes.
- Improved testing methodology for GitHub issues fetching to align with the new extraction classes.
- Restructured imports and method calls for improved code organization and readability.
- Added a new class GitHubIssueID for handling specific attributes related to GitHub issues.
- Updated test cases for fetching GitHub issue IDs and raw issues to accommodate changes in the extraction process.

coderabbitai · 2024-04-04T23:13:48Z

Walkthrough

The update shifts the GitHub ETL process towards object-oriented design, introducing new classes for extracting GitHub data like issues and commits. This restructuring enhances code organization and maintainability, with functions now encapsulated within specific classes for better structure and testability.

Changes

Files	Change Summary
`github_etl.py`, `test_github_etl_fetch_issues.py`	Replaced `fetch_issues` with methods from `GithubExtraction` class; test updates.
`extract/__init__.py`, `extract/issues.py`	Introduced `GithubIssueExtraction` class and added issue processing methods.
`test_github_etl_fetch_raw_issues.py`	Refactored tests to utilize `GithubExtraction` class for fetching raw issues.

Possibly related issues

[Hivemind] Refactor Github commit extraction functions to object orientated code #93: The refactor aligns with the task to transition GitHub commit extraction functions to object-oriented code, showcasing a similar approach that could be beneficial for the requested changes.

🐇💻🌟
In the realm of code and logic deep,
Where functions once did play,
Classes now in harmony leap,
Refactoring the old way.
🌳📚🐾
Through GitHub's fields we tread,
Data structures anew,
Evolving with each line we've read,
Our codebase ever true.
🌈🐇💾

Recent Review Details

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 3278cd5 and 6efc3f7.

Files selected for processing (8)

dags/hivemind_etl_helpers/github_etl.py (2 hunks)
dags/hivemind_etl_helpers/src/db/github/extract/init.py (1 hunks)
dags/hivemind_etl_helpers/src/db/github/extract/github_extraction.py (1 hunks)
dags/hivemind_etl_helpers/src/db/github/extract/issues.py (1 hunks)
dags/hivemind_etl_helpers/src/db/github/schema/init.py (1 hunks)
dags/hivemind_etl_helpers/src/db/github/schema/issue.py (1 hunks)
dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_issue_ids.py (1 hunks)
dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py (2 hunks)

Additional comments not posted (7)

dags/hivemind_etl_helpers/src/db/github/schema/__init__.py (1)

4-4: Ensure that the GitHubIssueID class is correctly utilized wherever GitHubIssue is used, given that it's now part of the exported entities.

dags/hivemind_etl_helpers/src/db/github/extract/__init__.py (1)

4-5: The import changes correctly reflect the transition to an object-oriented approach. Ensure that the removed fetch_issues function is no longer used in the codebase.
dags/hivemind_etl_helpers/src/db/github/extract/github_extraction.py (1)
8-32: The GithubExtraction class is well-structured and encapsulates the issue extraction logic effectively. Ensure that the methods _fetch_raw_issues and fetch_issues are used correctly across the codebase.

Verification successful

The verification process has confirmed that the methods _fetch_raw_issues and fetch_issues from the GithubExtraction class are being used across the codebase, including in integration tests and the main ETL script. This suggests that these methods are integral to the system's functionality and are likely being tested for correctness and reliability.

Usage in Integration Tests: Both methods are used in multiple test scenarios, which helps ensure their reliability and correctness.

Usage in ETL Script: The fetch_issues method is also used in the github_etl.py script, indicating its importance in the actual data extraction process.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify the usage of `_fetch_raw_issues` and `fetch_issues` methods in the codebase.
rg --files-with-matches --type py "GithubExtraction" | xargs rg "_fetch_raw_issues"
rg --files-with-matches --type py "GithubExtraction" | xargs rg "fetch_issues"
Length of output: 1712
dags/hivemind_etl_helpers/src/db/github/schema/issue.py (1)
68-84: The GitHubIssueID class is correctly implemented as a subclass of GitHubIssue, focusing on the id attribute. Ensure that this class is used appropriately wherever issue IDs are specifically required.

Verification successful

The verification process has confirmed that the GitHubIssueID class is correctly and effectively used in the codebase, particularly in scenarios where only the ID of GitHub issues is required. The class is utilized in issues.py to extract and manage lists of issue IDs, which aligns with its intended purpose.

dags/hivemind_etl_helpers/src/db/github/extract/issues.py: The class is used to create and manage lists of GitHub issue IDs, which is appropriate for the context.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify the usage of `GitHubIssueID` in the codebase, especially in contexts where only the ID is needed.
rg --files-with-matches --type py "GitHubIssue" | xargs rg "GitHubIssueID"
Length of output: 947
dags/hivemind_etl_helpers/github_etl.py (1)
64-70: The modification to use GithubExtraction for fetching issues is correctly implemented. Ensure that the old fetch_issues function is fully deprecated and not used elsewhere in the codebase.
Verification successful
---

The examination of the code snippets where `fetch_issues` is referenced reveals the following:

1. **github_etl.py**: The function `fetch_issues` is not directly called; instead, the method `fetch_issues` from the `GithubExtraction` class is used. This aligns with the intended object-oriented approach.

2. **test_github_etl_fetch_issues.py**: The tests are using the `fetch_issues` method from the `GithubExtraction` class, which is appropriate and does not indicate outdated usage.

3. **github_extraction.py and issues.py**: These files define and use the `fetch_issues` method within the context of their respective classes (`GithubExtraction` and a class likely defined in `issues.py`). This usage is consistent with an object-oriented design and does not suggest deprecation is needed.

4. **github_api_helpers/issues.py**: This file contains a standalone function `fetch_issues` that fetches issues for a specific repository page by page. This function appears to be part of a different module (`github_api_helpers`) and serves a different purpose, likely used for direct API interactions rather than the ETL process.

Based on this analysis, the references to `fetch_issues` across the codebase are appropriate and reflect either method calls from class instances or function calls in a contextually different module. There is no indication that the old `fetch_issues` function, if it was a standalone function in the `github_etl.py` context, is still being used inappropriately.

**Conclusion:**
The transition to using the `GithubExtraction` class for fetching issues in the `github_etl.py` file is correctly implemented. The remaining references to `fetch_issues` in the codebase are valid and do not require deprecation. They are either method calls from new class instances or function calls in a separate module context.



</details>

<details>
<summary>Scripts Executed</summary>

The following scripts were executed for the analysis:



---

Script:
```shell
#!/bin/bash
# Search for any remaining references to the old `fetch_issues` function.
rg --files-with-matches --type py "fetch_issues"
Length of output: 332

Script:
#!/bin/bash
# Extract relevant code snippets from files where `fetch_issues` is mentioned to understand the context of its usage.

# Extract from github_etl.py
echo "=== github_etl.py ==="
rg "fetch_issues" dags/hivemind_etl_helpers/github_etl.py --context 5

# Extract from test_github_etl_fetch_issues.py
echo "=== test_github_etl_fetch_issues.py ==="
rg "fetch_issues" dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_issues.py --context 5

# Extract from github_extraction.py
echo "=== github_extraction.py ==="
rg "fetch_issues" dags/hivemind_etl_helpers/src/db/github/extract/github_extraction.py --context 5

# Extract from issues.py
echo "=== issues.py ==="
rg "fetch_issues" dags/hivemind_etl_helpers/src/db/github/extract/issues.py --context 5

# Extract from issues.py in github_api_helpers
echo "=== github_api_helpers/issues.py ==="
rg "fetch_issues" dags/github/github_api_helpers/issues.py --context 5
Length of output: 4128
dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py (1)
18-20: The integration tests are correctly updated to use the _fetch_raw_issues method from the GithubExtraction class. Ensure that these tests cover all relevant cases and that the method is robustly tested.

Verification successful

The verification process confirms that the _fetch_raw_issues method is well-covered by tests in the test_github_etl_fetch_raw_issues.py file. The tests include scenarios such as getting empty results and fetching single issues, which suggests a robust testing approach for this method. Additionally, the method's implementation is referenced in the issues.py file, aligning with the expected usage in the tests.

dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py: Contains multiple test cases for _fetch_raw_issues.

dags/hivemind_etl_helpers/src/db/github/extract/issues.py: Implementation of _fetch_raw_issues.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the `_fetch_raw_issues` method is covered by tests in other parts of the codebase.
rg --files-with-matches --type py "_fetch_raw_issues" | xargs rg "test_"
Length of output: 727
dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_issue_ids.py (1)
18-20: The integration tests are correctly updated to use the fetch_issue_ids method from the GithubExtraction class. Ensure that these tests cover all relevant cases and that the method is robustly tested.

Verification successful

The integration tests for the fetch_issue_ids method in the GithubExtraction class appear to be comprehensive. They cover various scenarios including no data, single issue retrieval, and multiple issues retrieval, which ensures that the method is robustly tested across different conditions. This aligns with the expectations set in the review comment.

File and tests:

dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_issue_ids.py

test_get_empty_results_no_from_date

test_get_empty_results

test_get_single_issue_id_single_repo

test_get_multiple_issue_ids_single_repo

test_get_multiple_issue_ids_single_repo_with_filtering
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the `fetch_issue_ids` method is covered by tests in other parts of the codebase.
rg --files-with-matches --type py "fetch_issue_ids" | xargs rg "test_"
Length of output: 839

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 1

Review Status

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 43ee9f1 and 3278cd5.

Files selected for processing (5)

dags/hivemind_etl_helpers/github_etl.py (2 hunks)
dags/hivemind_etl_helpers/src/db/github/extract/init.py (1 hunks)
dags/hivemind_etl_helpers/src/db/github/extract/issues.py (1 hunks)
dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_issues.py (4 hunks)
dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py (2 hunks)

Additional comments not posted (4)

dags/hivemind_etl_helpers/src/db/github/extract/__init__.py (1)

8-9: Consider adding a comment explaining the purpose or future plans for the GithubExtraction class to improve code clarity.

dags/hivemind_etl_helpers/github_etl.py (1)

64-70: Ensure that the GithubExtraction instance is correctly integrated and utilized across the entire ETL process.

dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py (1)

18-20: Consider adding a comment explaining the necessity of directly accessing the private method __fetch_raw_issues in tests. Evaluate if there's a way to refactor the tests or the class design to avoid breaking encapsulation.

dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_issues.py (1)

18-20: LGTM! The test cases correctly utilize the GithubExtraction class for fetching GitHub issues.

dags/hivemind_etl_helpers/src/db/github/extract/issues.py

amindadgar · 2024-04-08T05:36:22Z

dags/hivemind_etl_helpers/github_etl.py

@@ -2,10 +2,10 @@
 from datetime import datetime

 from dotenv import load_dotenv
-from hivemind_etl_helpers.src.db.github.extract import (
+from hivemind_etl_helpers.src.db.github.extract import (  # fetch_issues,


Please remove the comment for fetch_issues

amindadgar · 2024-04-08T05:38:00Z

dags/hivemind_etl_helpers/src/db/github/extract/issues.py

-    from_date : datetime | None
-        get the issues form a specific date that they were created
-        defualt is `None`, meaning to apply no filtering on data
+    def __fetch_raw_issues(


Please use just one underline for the method name.

amindadgar · 2024-04-08T05:43:40Z

dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py

        self.assertEqual(issues, [])

    def test_get_empty_results(self):
        repository_ids = [123, 124]
-        issues = fetch_raw_issues(
+        issues = self.extractor._GithubIssueExtraction__fetch_raw_issues(


Please consider updating to use the _fetch_raw_issues directly and not via GitHubIssueExtraction

amindadgar · 2024-04-08T05:44:07Z

dags/hivemind_etl_helpers/tests/integration/test_github_etl_fetch_raw_issues.py

@@ -60,7 +63,7 @@ def test_get_single_issue_single_repo(self):
            )

        repository_ids = [123]
-        issues = fetch_raw_issues(
+        issues = self.extractor._GithubIssueExtraction__fetch_raw_issues(


Please consider updating to use the _fetch_raw_issues directly and not via GitHubIssueExtraction

amindadgar

Thanks for your job. Please consider doing updates based on given comments.

amindadgar · 2024-04-10T14:42:15Z

Hey @JacobCWBillings, please have a look over this note of one of the other bounty hunters shared. https://github.com/TogetherCrew/airflow-dags/blob/1e7083ea9f2a2c28bfecc6d3a1f74c4cab742b7b/dags/hivemind_etl_helpers/src/db/github/extract/github_extraction.py#L6C1-L20C43

Can you update your PR using composition rather than inheritance?

JacobCWBillings · 2024-04-12T07:48:25Z

Ciao Mohammad, Thank you for the involvement in the project. I has been nice seeing how many ideas have been contributed to this project, so far. I have been traveling this past week. But today, I will have time to implement the edits you indicate, as well as to push a solution to the second bounty. All the best, -Jacob

…

On Wed, Apr 10, 2024, 16:42 Mohammad Amin Dadgar ***@***.***> wrote: Hey @JacobCWBillings <https://github.com/JacobCWBillings>, please have a look over this note of one of the other bounty hunters shared. https://github.com/TogetherCrew/airflow-dags/blob/1e7083ea9f2a2c28bfecc6d3a1f74c4cab742b7b/dags/hivemind_etl_helpers/src/db/github/extract/github_extraction.py#L6C1-L20C43 Can you update your PR using composition rather than inheritance? — Reply to this email directly, view it on GitHub <#107 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AI2QN2OFBVC2LOXHHEUTEDLY4VFVZAVCNFSM6AAAAABFYEQZ36VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBXG42DQMBYHE> . You are receiving this because you were mentioned.Message ID: ***@***.***>

Bounties/jcwb 00

coderabbitai

Actionable comments posted: 1

coderabbitai · 2024-04-18T17:03:53Z

dags/hivemind_etl_helpers/src/db/github/extract/issues.py

+        """
+
+        query = """MATCH (i:Issue)<-[:CREATED]-(user:GitHubUser)
+            WHERE
+            i.repository_id IN $repoIds
+        """
+        if from_date is not None:
+            query += "AND datetime(i.updated_at) >= datetime($from_date)"
+
+        query += """
+            MATCH (repo:Repository {id: i.repository_id})
+            RETURN
+                user.login as author_name,
+                i.id as id,
+                i.title as title,
+                i.body as text,
+                i.state as state,
+                i.state_reason as state_reason,
+                i.created_at as created_at,
+                i.updated_at as updated_at,
+                i.closed_at as closed_at,
+                i.latestSavedAt as latest_saved_at,
+                i.html_url as url,
+                i.repository_id as repository_id,
+                repo.full_name as repository_name
+            ORDER BY datetime(created_at)
+        """
+
+        def _exec_query(tx, repoIds, from_date):
+            result = tx.run(query, repoIds=repoIds, from_date=from_date)
+            return list(result)
+
+        with self.neo4j_driver.session() as session:
+            raw_records = session.execute_read(
+                _exec_query,
+                repoIds=repository_id,
+                from_date=from_date,
+            )
+
+        return raw_records
+
+    def fetch_issues(
+        self,
+        repository_id: list[int],
+        from_date: datetime | None = None,
+    ) -> list[GitHubIssue]:
+        """
+        fetch issues from data dump in neo4j
+
+        Parameters
+        ------------
+        repository_id : list[int]
+            a list of repository id to fetch their issues
+        from_date : datetime | None
+            get the issues form a specific date that they were created
+            default is `None`, meaning to apply no filtering on data
+
+        Returns
+        --------
+        github_issues : list[GitHubIssue]
+            list of neo4j records as the extracted issues
+        """
+        records = self._fetch_raw_issues(repository_id, from_date)
+
+        github_issues: list[GitHubIssue] = []
+        for record in records:
+            issue = GitHubIssue.from_dict(record)
+            github_issues.append(issue)
+
+        return github_issues
+
+    def fetch_issue_ids(
+        self,
+        repository_id: list[int],
+        from_date: datetime | None = None,
+    ) -> list[GitHubIssueID]:
+        """
+        fetch issues from data dump in neo4j
+
+        Parameters
+        ------------
+        repository_id : list[int]
+            a list of repository id to fetch their issues
+        from_date : datetime | None
+            get the issues form a specific date that they were created
+            default is `None`, meaning to apply no filtering on data
+
+        Returns
+        --------
+        github_issues_ids : list[GitHubIssueID]
+            list of neo4j records as the extracted issue ids
+        """
+        records = self._fetch_raw_issues(repository_id, from_date)
+
+        github_issue_ids: list[GitHubIssueID] = []
+        for record in records:
+            issue = GitHubIssueID.from_dict(record)
+            github_issue_ids.append(issue)
+
+        return github_issue_ids


Tip
Codebase Verification

The methods _fetch_raw_issues, fetch_issues, and fetch_issue_ids from the GithubIssueExtraction class are actively used within the codebase. However, the method _fetch_raw_issues is accessed from outside its own class (GithubIssueExtraction), specifically in github_extraction.py. This is not typical for a method prefixed with an underscore, which conventionally indicates it should be a private method. This might suggest a need to reconsider the method's visibility (i.e., renaming it to remove the underscore if it's intended to be used externally).

issues.py and github_extraction.py both use _fetch_raw_issues.

fetch_issues and fetch_issue_ids are used within their own class and also in github_extraction.py.

Analysis chain

The GithubIssueExtraction class is well-implemented, encapsulating the logic for fetching issues and issue IDs. Ensure that the methods _fetch_raw_issues, fetch_issues, and fetch_issue_ids are used correctly and tested thoroughly.

Scripts Executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Verify the usage of `_fetch_raw_issues`, `fetch_issues`, and `fetch_issue_ids` methods in the codebase. rg --files-with-matches --type py "GithubIssueExtraction" | xargs rg "_fetch_raw_issues" rg --files-with-matches --type py "GithubIssueExtraction" | xargs rg "fetch_issues" rg --files-with-matches --type py "GithubIssueExtraction" | xargs rg "fetch_issue_ids"

Length of output: 1470

amindadgar · 2024-04-24T08:31:32Z

dags/hivemind_etl_helpers/src/db/github/extract/__init__.py

@@ -1,5 +1,6 @@
 # flake8: noqa
 from .comments import fetch_comments
 from .commit import fetch_commits
-from .issues import fetch_issues
+from .github_extraction import GithubExtraction
+from .issues import GithubIssueExtraction


Please consider removing the line 5 (from .issues import GithubIssueExtraction) because this is not always required to be used and we're always using GitHubExtraction

If we wanted to use the GitHubIssueExtraction we can just import from its file and not this module directory.

amindadgar · 2024-04-24T08:33:46Z

dags/hivemind_etl_helpers/src/db/github/extract/github_extraction.py

+        # self.comment_extraction = GitHubCommentExtraction()
+        self.issue_extraction = GithubIssueExtraction()
+
+    def _fetch_raw_issues(


Was there any specific reason we included the _fetch_raw_issues within GitHubExtraction?

I think having the fetch_issues would be enough in our GitHubExtraction

Currently, the _raw methods are still called from GitHubExtraction inside the test cases.

Ah, yes I see your point. To access that in test cases you can call the _fetch_raw_issues directly from GitHubExtraction, you can do the following within test cases.

self.extractor.issue_extraction._fetch_raw_issues

dags/hivemind_etl_helpers/src/db/github/extract/issues.py

dags/hivemind_etl_helpers/src/db/github/schema/__init__.py

amindadgar

Thanks for your updates. I added some comments, please have a look @JacobCWBillings

amindadgar · 2024-04-25T05:18:28Z

dags/hivemind_etl_helpers/src/db/github/extract/issues.py

+        github_issues_ids : list[GitHubIssueID]
+            list of neo4j records as the extracted issue ids
+        """
+        records = self._fetch_raw_issues(repository_id, from_date)


Could you write another function for just fetching the ids? Fetching the whole data and just using their ids is inefficient. You can name the method as _fetch_raw_issue_ids Please let me know if you need any help on the neo4j query.

JacobCWBillings · 2024-04-25T10:32:19Z

That sounds good. Will implement this functionality.

…

On Thu, Apr 25, 2024, 07:18 Mohammad Amin Dadgar ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In dags/hivemind_etl_helpers/src/db/github/extract/issues.py <#107 (comment)> : > + fetch issues from data dump in neo4j + + Parameters + ------------ + repository_id : list[int] + a list of repository id to fetch their issues + from_date : datetime | None + get the issues form a specific date that they were created + default is `None`, meaning to apply no filtering on data + + Returns + -------- + github_issues_ids : list[GitHubIssueID] + list of neo4j records as the extracted issue ids + """ + records = self._fetch_raw_issues(repository_id, from_date) Could you write another function for just fetching the ids? Fetching the whole data and just using their ids is inefficient. You can name the method as _fetch_raw_issue_ids Please let me know if you need any help on the neo4j query. — Reply to this email directly, view it on GitHub <#107 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AI2QN2MKT2UIWJXG3O32ANDY7CG3TAVCNFSM6AAAAABFYEQZ36VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDAMRRGUYTEOBYGY> . You are receiving this because you were mentioned.Message ID: ***@***.***>

JacobCWBillings · 2024-06-14T00:44:15Z

Implemented optimization suggestions for fetch_raw_issue_ids.

There seems to be an issue, now, with <dags.hivemind_etl_helpers.tests.integration.test_discourse_summary>.

Is this a known problem?

JacobCWBillings added 6 commits March 29, 2024 14:00

Functional to OOP refactor for fetch_issues method

be0fb2d

retest functional to OOP refactor, fetch_issues method

fe6884d

test functional to OOP refactor, fetch_issues method

9ee4747

test functional to OOP refactor, fetch_issues method

6b12419

lint code

549699b

lint code

3278cd5

JacobCWBillings linked an issue Apr 4, 2024 that may be closed by this pull request

[Hivemind] Refactor Github issue extraction functions to object orientated code #98

Open

coderabbitai bot reviewed Apr 4, 2024

View reviewed changes

dags/hivemind_etl_helpers/src/db/github/extract/issues.py Outdated Show resolved Hide resolved

amindadgar self-requested a review April 8, 2024 05:33

amindadgar reviewed Apr 8, 2024

View reviewed changes

amindadgar requested changes Apr 8, 2024

View reviewed changes

JacobCWBillings and others added 10 commits April 12, 2024 11:05

Draft commit, Bounty 2

4d2f753

Draft commit, Bounty 2

ef4da20

Draft commit, Bounty 2, bug fix

db159cc

Draft commit, Bounty 2, Linted

4d6e3f6

Bounty 2 commit. Tested. Linted.

1cbbf69

Bounty 2 commit. Tested. Linted.

06fa34e

Bounty 1 (&2) recommended fixes.

01fed3d

Bounty 1 (&2) recommended fixes, + bug fixes

3dc679d

Bounty 1 (&2) recommended fixes, + bug fixes.

fe72a60

Merge pull request #124 from TogetherCrew/bounties/jcwb_00

6efc3f7

Bounties/jcwb 00

coderabbitai bot reviewed Apr 18, 2024

View reviewed changes

amindadgar reviewed Apr 24, 2024

View reviewed changes

dags/hivemind_etl_helpers/src/db/github/extract/issues.py Show resolved Hide resolved

amindadgar reviewed Apr 24, 2024

View reviewed changes

dags/hivemind_etl_helpers/src/db/github/schema/__init__.py Show resolved Hide resolved

amindadgar requested changes Apr 24, 2024

View reviewed changes

amindadgar reviewed Apr 25, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

98 hivemind refactor GitHub issue extraction functions to object orientated code #107

98 hivemind refactor GitHub issue extraction functions to object orientated code #107

JacobCWBillings commented Apr 4, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 4, 2024 •

edited

Loading

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

amindadgar Apr 8, 2024

amindadgar Apr 8, 2024

amindadgar Apr 8, 2024

amindadgar Apr 8, 2024

amindadgar left a comment

amindadgar commented Apr 10, 2024

JacobCWBillings commented Apr 12, 2024 via email

coderabbitai bot left a comment

coderabbitai bot Apr 18, 2024

amindadgar Apr 24, 2024

amindadgar Apr 24, 2024 •

edited

Loading

JacobCWBillings May 5, 2024

amindadgar May 6, 2024

amindadgar left a comment

amindadgar Apr 25, 2024

JacobCWBillings commented Apr 25, 2024 via email

JacobCWBillings commented Jun 14, 2024

98 hivemind refactor GitHub issue extraction functions to object orientated code #107

Are you sure you want to change the base?

98 hivemind refactor GitHub issue extraction functions to object orientated code #107

Conversation

JacobCWBillings commented Apr 4, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Apr 4, 2024 • edited Loading

Walkthrough

Changes

Possibly related issues

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

amindadgar left a comment

Choose a reason for hiding this comment

amindadgar commented Apr 10, 2024

JacobCWBillings commented Apr 12, 2024 via email

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Apr 18, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

amindadgar Apr 24, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

amindadgar left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JacobCWBillings commented Apr 25, 2024 via email

JacobCWBillings commented Jun 14, 2024

JacobCWBillings commented Apr 4, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 4, 2024 •

edited

Loading

CodeRabbit Configration File (`.coderabbit.yaml`)

amindadgar Apr 24, 2024 •

edited

Loading