pdf2table

pdf2table is a Node.js library that attempts to extract tables from a PDF.

The 'tables' are extracted as an array of rows.

It uses pdf2json to extract the PDF data.

Install

You can install pdf2table using the Node Package Manager (npm):

npm install pdf2table

Simple example

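var fs = require('fs');
var pdf2table = require('pdf2table');

// Read the PDF into a buffer, then hand the buffer to pdf2table.
fs.readFile('./test.pdf', function (err, buffer) {
    if (err) return console.error(err);

    // rows is the array of extracted rows; rowsdebug is the third callback argument (unused here).
    pdf2table.parse(buffer, function (err, rows, rowsdebug) {
        if (err) return console.error(err);

        console.log(rows);
    });
});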
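
Once the rows are available, a common next step is to write them out as CSV. The following is a minimal sketch, not part of pdf2table: rowsToCsv is a hypothetical helper, and it assumes that each row is itself an array of cell strings.

```javascript
// Hypothetical helper, not part of pdf2table.
// Assumption: `rows` is an array of rows, and each row is an array of cell strings.
function rowsToCsv(rows) {
    return rows.map(function (row) {
        return row.map(function (cell) {
            var text = String(cell);
            // Quote cells that contain a comma, a double quote or a line break,
            // and double any embedded double quotes.
            if (/[",\r\n]/.test(text)) {
                return '"' + text.replace(/"/g, '""') + '"';
            }
            return text;
        }).join(',');
    }).join('\n');
}

// Made-up sample data standing in for the rows passed to the parse callback.
var sample = [['Name', 'Amount'], ['Smith, J.', '12.50']];
console.log(rowsToCsv(sample));
```

Inside the parse callback, console.log(rows) could then be replaced by something like fs.writeFile('./test.csv', rowsToCsv(rows), callback).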

Note

This is a simplistic implementation for extracting tables. If your PDF contains content that is not a table, pdf2table will still attempt to shape that content into rows. Improvements and pull requests are welcome.
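
Because non-table content also ends up as rows, it can help to filter the result afterwards. One possible heuristic, sketched below, keeps only the rows whose cell count matches the most common cell count. keepDominantRows is a hypothetical helper, not part of pdf2table, and it assumes that each row is an array of cells; it will not suit documents that contain several tables of different widths.

```javascript
// Hypothetical helper, not part of pdf2table.
// Assumption: `rows` is an array of rows, and each row is an array of cells.
function keepDominantRows(rows) {
    // Count how many rows exist for each row length.
    var counts = {};
    rows.forEach(function (row) {
        counts[row.length] = (counts[row.length] || 0) + 1;
    });

    // Pick the most frequent row length; on a tie the shorter length wins.
    var best = null;
    Object.keys(counts).forEach(function (len) {
        if (best === null || counts[len] > counts[best]) best = len;
    });

    // Object keys are strings, so compare against the stringified length.
    return rows.filter(function (row) {
        return String(row.length) === best;
    });
}

// Made-up sample: a title line, two table rows and one wider stray row.
var mixed = [['Report title'], ['a', 'b'], ['c', 'd'], ['x', 'y', 'z']];
console.log(keepDominantRows(mixed));
```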
