Tabula

Tabula is a free tool for extracting tabular data from PDF files.

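Tabula has one important limitation: it only works on PDFs that contain a real text layer (text-based files, or scans that have already been through OCR). A PDF that is just a scanned image won't work.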

To use it:

  • download Tabula from http://tabula.technology
  • install it on your computer
  • run Tabula and open it in your web browser
  • import your PDF file
  • select the page that contains the table
  • click and drag to draw a selection box around the table
  • preview the extracted data
  • export it to CSV
  • done!
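
Once the table has been exported to CSV, the data usually still needs a little cleaning before it can be analysed. The following is a minimal sketch in Python, using only the standard library, of one way to do that. The sample data, the column names (Country, Amount) and the helper names parse_amount and read_table are made up for illustration; a real Tabula export will have whatever headers the PDF table had.

```python
import csv
import io

# Hypothetical sample standing in for a CSV file exported by Tabula.
# The column names and values are invented for illustration only.
SAMPLE_CSV = """Country,Amount
Alpha,"1,200.5"
Beta,300
Gamma,
"""


def parse_amount(text):
    """Turn a cell such as '1,200.5' into a float; return None for a blank cell."""
    cleaned = text.replace(",", "").strip()
    return float(cleaned) if cleaned else None


def read_table(csv_file):
    """Read an open CSV file object into a list of (country, amount) pairs."""
    reader = csv.DictReader(csv_file)
    return [(row["Country"].strip(), parse_amount(row["Amount"])) for row in reader]


rows = read_table(io.StringIO(SAMPLE_CSV))

# Sum the amounts, skipping rows where the cell was empty.
total = sum(amount for _, amount in rows if amount is not None)
```

To read a real export instead of the sample string, pass an open file to read_table, for example open("tabula-export.csv", newline="") where the file name is whatever you chose when saving, and adjust the column names to match your table.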

Exercise

Using the PDF file from the search engine exercise, extract the amount of aid (table 4) that Australia plans to give to other countries.
