Commit
fixed get_raw_def method and bumped version in Pipfile.lock (#29)
marlanperumal authored Apr 30, 2021
1 parent 40b9357 commit e1d6ee1
Showing 6 changed files with 251 additions and 233 deletions.
1 change: 1 addition & 0 deletions Pipfile
@@ -8,6 +8,7 @@ ipykernel = "*"
 tabula-py = "*"
 pikepdf = "*"
 click = "*"
+pdf-statement-reader = {editable = true, path = "."}
 
 [dev-packages]
 setuptools = "*"
472 changes: 243 additions & 229 deletions Pipfile.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions README.md
@@ -127,6 +127,8 @@ The configuration file itself is in JSON format. Here's the Absa cheque account
 
 These were the configuration options that were required for the default format. It is envisaged that as more formats are added, the list of options will grow.
 
+A key part of setting up a new configuration is getting the page coordinates for the area and columns. The easiest way to do this is to run the [tabula GUI](https://tabula.technology/), autodetect the page areas, save the settings as a template, then download and inspect the JSON template file. It's not a one-to-one mapping to the psr config, but hopefully it will be a good starting point.
+
 ## CLI API
 
 ### decrypt
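The README paragraph added in this commit suggests starting from a Tabula GUI template. As a rough sketch of that step, the snippet below reads a template and derives one area per page in the `[top, left, bottom, right]` order that tabula-py's `area` argument takes. The template keys (`page`, `x1`, `y1`, `x2`, `y2`) are an assumption about what Tabula exports, `template_to_areas` is a hypothetical helper that is not part of psr, and the sample numbers are made up. The sketch does not attempt to derive column boundaries, so those would still need to be chosen by hand.

```python
import json


def template_to_areas(template_text):
    """Map page number -> [top, left, bottom, right] from a Tabula template.

    Assumes each selection is an object with "page", "x1", "y1", "x2", "y2"
    keys (an assumption about the exported template format).
    """
    areas = {}
    for selection in json.loads(template_text):
        areas[selection["page"]] = [
            round(selection["y1"]),  # top
            round(selection["x1"]),  # left
            round(selection["y2"]),  # bottom
            round(selection["x2"]),  # right
        ]
    return areas


# Made-up sample values, for illustration only.
sample = json.dumps([
    {"page": 1, "extraction_method": "stream",
     "x1": 27.2, "y1": 480.4, "x2": 575.6, "y2": 762.9},
    {"page": 2, "extraction_method": "stream",
     "x1": 27.2, "y1": 280.4, "x2": 575.6, "y2": 762.9},
])
areas = template_to_areas(sample)
print(areas)
```

The rounded values can then be copied into the psr config by hand. Since the README notes the mapping is not one-to-one, treat the result as a starting point and check it against the actual statement.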
2 changes: 2 additions & 0 deletions data/.gitignore
@@ -0,0 +1,2 @@
+*
+!.gitignore
5 changes: 2 additions & 3 deletions pdf_statement_reader/parse.py
@@ -28,9 +28,8 @@ def get_raw_df(filename, num_pages, config):
                 "-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.NoOpLog"
             ]
         )
-        if df is not None:
-            dfs.append(df)
-
+        if df is not None and len(df) > 0:
+            dfs.extend(df)
     statement = pd.concat(dfs, sort=False).reset_index(drop=True)
     return statement
 
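The switch from `append` to `extend` in the `parse.py` change above makes sense if `tabula.read_pdf` returns a list of DataFrames per call rather than a single DataFrame, which is an assumption here, not something the diff states. In that case `append` would nest each page's list inside `dfs`, which `pd.concat` does not accept, while `extend` flattens the tables into one list. A minimal sketch, using a hypothetical `fake_read_pdf` stand-in so that it runs without Java or a PDF:

```python
import pandas as pd


def fake_read_pdf(page):
    """Hypothetical stand-in for tabula.read_pdf.

    Returns a list of DataFrames for the page; page 2 has no table.
    """
    if page == 2:
        return []
    return [pd.DataFrame({"Date": [f"0{page}/04/2021"], "Amount": ["100.00"]})]


def collect(num_pages):
    """Gather every table from every page into one DataFrame."""
    dfs = []
    for i in range(num_pages):
        tables = fake_read_pdf(i + 1)
        # Skip pages where nothing was detected, then flatten the list
        # of tables into dfs instead of nesting it.
        if tables is not None and len(tables) > 0:
            dfs.extend(tables)
    return pd.concat(dfs, sort=False).reset_index(drop=True)


statement = collect(3)
print(statement)
```

The `len(...) > 0` guard mirrors the diff. With `extend` an empty list would add nothing anyway, so the guard mainly documents the intent of skipping empty pages.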
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@
 
 setup(
     name="pdf_statement_reader",
-    version="0.2.1",
+    version="0.2.2",
     description="PDF Statement Reader",
     long_description=long_description,
     long_description_content_type="text/markdown",