Statement Columns as Graphics #51

Open
flywire opened this issue Jun 23, 2021 · 1 comment
flywire commented Jun 23, 2021

The Australian Citibank cheque account statement renders its column headers as graphics rather than text (i.e. they can't be selected like the transactions), so pdf_statement_reader can't detect where the columns start. It makes some attempt at the first two columns, but it would be better if it used CLOSING BALANCE to detect the end of the transactions rather than picking up broken parts of bank notices.
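To illustrate the end-of-transactions idea, here is a minimal sketch that cuts a list of extracted text lines at the first line mentioning the closing balance. It is not how pdf_statement_reader currently works; the names `END_MARKER` and `transaction_lines` are hypothetical, and it assumes the text has already been extracted one line per row.

```python
import re

# Hypothetical marker: assumes the statement prints a "Closing Balance"
# row directly after the last transaction (case may vary).
END_MARKER = re.compile(r"closing\s+balance", re.IGNORECASE)


def transaction_lines(lines):
    """Return the lines up to and including the first closing-balance line.

    If no marker is found, all lines are returned unchanged.
    """
    kept = []
    for line in lines:
        kept.append(line)
        if END_MARKER.search(line):
            break
    return kept
```

Anything after the marker (totals, page footers, bank notices) is simply dropped, so it never reaches the column parser.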

image

The general layout is similar to CBA except dates are dd Mmm yyyy.
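As a rough sketch of handling that date format, the standard library can parse values such as `1 Feb 2021` directly. The helper name `parse_statement_date` is hypothetical, and `%b` depends on the locale, so this assumes English month abbreviations.

```python
from datetime import datetime


def parse_statement_date(text):
    """Parse a 'dd Mmm yyyy' date such as '1 Feb 2021' into a date object.

    Assumes an English locale for the month abbreviation (%b).
    """
    return datetime.strptime(text.strip(), "%d %b %Y").date()
```

`%d` accepts both one-digit and two-digit days, so `1 Feb 2021` and `28 Feb 2021` go through the same format string.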

flywire commented Jun 24, 2021

It looks like handling a statement with no text headers needs a change to config.json and a bit more code. Sample file: citi.cheque.pdf

>>> import tabula
>>> dfs = tabula.read_pdf("citi.cheque.pdf", pages="all", area=(530, 47, 800, 558), options="--columns 105,275,340,444,558", pandas_options={'header': None})
>>> for df in dfs: df.columns = ['Date','Desc','DB','CR','Balance']
...
>>> dfs
[          Date                              Desc       DB        CR   Balance
0   1 Feb 2021                   Opening Balance      NaN       NaN  1,000.00
1   1 Feb 2021       Lorem ipsum dolor sit amet,  -197.60       NaN    802.40
2          NaN  consectetur adipiscing elit, sed      NaN       NaN       NaN
3          NaN      do eiusmod tempor incididunt      NaN       NaN       NaN
4   3 Feb 2021    labore et dolore magna aliqua.      NaN    651.12  1,453.52
5          NaN  Elit pellentesque habitant morbi      NaN       NaN       NaN
6   5 Feb 2021       tristique senectus. Quisque      NaN    666.06  2,119.58
7  28 Feb 2021                   Closing Balance      NaN       NaN  2,119.58
8          NaN                             TOTAL  -197.60  1,317.18       NaN
9          NaN                               NaN   Page 1       NaN       NaN]
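The output above still has descriptions split across rows and amounts as strings. The "bit more code" might look something like the sketch below, which folds rows without a date into the previous transaction and converts amounts to `Decimal`. It is only an illustration: the function names are hypothetical, rows are assumed to be plain tuples with `None` for empty cells (not the NaN values tabula returns), and rows after Closing Balance are assumed to have been dropped already.

```python
from decimal import Decimal


def parse_amount(text):
    """Convert '1,000.00' or '-197.60' to Decimal; None stays None."""
    if text is None:
        return None
    return Decimal(text.replace(",", ""))


def merge_continuations(rows):
    """Build one record per transaction from (date, desc, db, cr, balance) rows.

    A row with a date starts a new transaction; a row without a date is
    treated as a continuation of the previous description.
    """
    transactions = []
    for date, desc, debit, credit, balance in rows:
        if date is not None:
            transactions.append({
                "date": date,
                "desc": desc or "",
                "debit": parse_amount(debit),
                "credit": parse_amount(credit),
                "balance": parse_amount(balance),
            })
        elif transactions and desc:
            # Continuation line: append the text to the previous description.
            previous = transactions[-1]
            previous["desc"] = (previous["desc"] + " " + desc).strip()
    return transactions
```

With rows 0 to 7 of the output above, this would give five records, with rows 1 to 3 joined into a single description for the 1 Feb debit.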
