Sample Bank Statements #34

Open
flywire opened this issue May 30, 2021 · 4 comments
Comments

@flywire
Contributor

flywire commented May 30, 2021

Include Sample Bank Statements for testing

@marlanperumal
Owner

Thanks - these will be useful to add some sample configs for

@flywire
Contributor Author

flywire commented Jun 8, 2021

@flywire
Contributor Author

flywire commented Jun 13, 2021

Mock bank statement template set up in LibreOffice Calc with the measurement unit set to points (Tools > Options > LibreOffice Calc > General > Measurement unit: Point). Coordinates were checked in SumatraPDF, using the m key to show cursor coordinates.

    "layout": {
        "default": {
            "area": [280, 27, 763, 576],
            "columns": [83, 264, 344, 425, 485, 570]
        },
        "first": {
            "area": [480, 27, 763, 576],
            "columns": [83, 264, 344, 425, 485, 570]
        }
    },
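
A quick sanity check of these coordinates can catch a column typed outside the table area. The sketch below is only an illustration: `check_layout` is a hypothetical helper, not part of pdf_statement_reader, and it assumes `area` is ordered [top, left, bottom, right] and that `columns` holds x coordinates in points.

```python
from typing import Dict, List


def check_layout(layout: Dict[str, Dict[str, List[float]]]) -> List[str]:
    """Return a list of problems found in a layout mapping (empty if none)."""
    problems = []
    for name, page in layout.items():
        # Assumed order: [top, left, bottom, right], all in points.
        top, left, bottom, right = page["area"]
        columns = page["columns"]
        if not (top < bottom and left < right):
            problems.append(f"{name}: area is not [top, left, bottom, right]")
        if columns != sorted(set(columns)):
            problems.append(f"{name}: columns are not strictly increasing")
        if any(x < left or x > right for x in columns):
            problems.append(f"{name}: column outside area")
    return problems


layout = {
    "default": {"area": [280, 27, 763, 576], "columns": [83, 264, 344, 425, 485, 570]},
    "first": {"area": [480, 27, 763, 576], "columns": [83, 264, 344, 425, 485, 570]},
}
print(check_layout(layout))  # prints []
```

An empty list means every column boundary sits between the left and right edges of its area.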

[Edit: Updated following files:]

MockTemplate.xlsx
MockTemplate.pdf


Note: For scaling measurements, A4 is 210 mm x 297 mm and there are 72 points per inch (25.4 mm), so A4 is 595.276 x 841.890 points.
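
The scaling arithmetic can be reproduced with a few lines of Python. This is just a sketch of the unit conversion; the function name `mm_to_points` is made up for illustration.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PostScript points (1/72 inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


width_pt = mm_to_points(210)   # A4 width
height_pt = mm_to_points(297)  # A4 height
print(f"A4: {width_pt:.3f} x {height_pt:.3f} points")  # A4: 595.276 x 841.890 points
```

The same function can scale a measurement taken in mm from a printed statement into the points used by the layout config.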

@flywire
Contributor Author

flywire commented Jun 21, 2021

JSON file comments allow a new config to be set up quickly and easily in a text editor.

Draft config follows. The year is given as the first word of the first Transaction.

config\cba\saving.json

{
    "$schema": "https://raw.githubusercontent.com/marlanperumal/pdf_statement_reader/develop/pdf_statement_reader/config/psr_config.schema.json",
    "//": "Describe layout for pages to be scanned",
    "layout": { 
        "//": "Default layout for all pages not otherwise defined",
        "default": {
            "//": "Page coordinates containing table in pts",
            "//": "[top, left, bottom, right]",
            "area": [143, 58, 760, 546],
            "//": "Right x coordinate of each column in table",
            "columns": [93, 344, 402, 460, 546]
        },
        "//": "Layout for first page",
        "first": {
            "area": [385, 58, 760, 546],
            "columns": [93, 344, 402, 460, 546]
        }
    },

    "//": "Statement column names exactly as they appear",
    "columns": {
        "trans_date": "Date",
        "trans_type": "Transaction",
        "debit": "Debit",
        "credit": "Credit",
        "balance": "Balance"
    },

    "//": "csv column output order",
    "order": [
        "trans_date",
        "trans_type",
        "debit",
        "credit",
        "balance"
    ],

    "//": "Specify required cleaning operations",
    "cleaning": {
        "//": "Convert these columns to numeric",
        "numeric": ["debit", "credit", "balance"],
        "//": "Convert these columns to date",
        "date": ["trans_date"],
        "//": "Use this date format to parse any date columns",
        "date_format": "%d %b",
        "//": "Only keep the rows where these columns are populated",
        "dropna": ["balance"]
    }
}
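
JSON has no comment syntax, so the "//" keys above are ordinary string entries. Whether pdf_statement_reader ignores them is not confirmed here; if it does not, they could be stripped before the config is used. A minimal sketch, assuming Python's standard `json` module and a hypothetical `strip_comments` helper:

```python
import json
from typing import Any

COMMENT_KEY = "//"


def strip_comments(node: Any) -> Any:
    """Recursively drop every "//" entry from parsed JSON data."""
    if isinstance(node, dict):
        return {k: strip_comments(v) for k, v in node.items() if k != COMMENT_KEY}
    if isinstance(node, list):
        return [strip_comments(v) for v in node]
    return node


# Shortened copy of the draft config, including a repeated "//" key.
raw = """
{
    "//": "Describe layout for pages to be scanned",
    "layout": {
        "default": {
            "//": "Page coordinates containing table in pts",
            "//": "[top, left, bottom, right]",
            "area": [143, 58, 760, 546],
            "columns": [93, 344, 402, 460, 546]
        }
    }
}
"""
config = strip_comments(json.loads(raw))
print(config)
```

Note that when "//" is repeated inside one object, Python's `json.loads` keeps only the last value, so the comments are best treated as notes for the person editing the file rather than data to read back.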

Edit: Column mistakenly entered at left margin removed.
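
Since the statement rows carry only a day and a month, with the year given separately, the year has to be supplied when parsing. A day plus abbreviated month such as "01 Jul" matches the strptime pattern `%d %b`. The sketch below is an assumption about how that could be handled, not how pdf_statement_reader does it; `parse_statement_date` and the year 2021 are made up for illustration, and `%b` expects English month abbreviations only under an English or C locale.

```python
from datetime import date, datetime


def parse_statement_date(text: str, year: int) -> date:
    """Parse a 'DD Mon' string such as '01 Jul' using a separately supplied year."""
    # Appending the year before parsing also lets 29 Feb validate correctly.
    return datetime.strptime(f"{text} {year}", "%d %b %Y").date()


print(parse_statement_date("01 Jul", 2021))  # 2021-07-01
```

Parsing "%d %b" on its own would default the year to 1900, which is why the year is joined to the text first.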
