In this post we’ll see a Java program that reads a PDF document using the iText library.
To learn more about the iText library and see more PDF examples, check this post: Generating PDF in Java Using iText Tutorial
Reading PDFs using iText
To read a PDF using iText, follow these steps.
- Create a PdfReader instance and wrap it in a PdfDocument.
- Get the number of pages in the PDF that has to be read.
- Iterate through pages and extract the content of each page using PdfTextExtractor.
PDF used for reading.
Java Program
import java.io.IOException;

import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;

public class ReadPDF {
  public static final String READ_PDF = "F://knpcode//result//List.pdf";

  public static void main(String[] args) {
    try {
      // PdfReader wrapped in a PdfDocument
      PdfReader reader = new PdfReader(READ_PDF);
      PdfDocument pdfDoc = new PdfDocument(reader);
      // get the number of pages in PDF
      int noOfPages = pdfDoc.getNumberOfPages();
      System.out.println("Extracted content of PDF---- ");
      // page numbers start at 1
      for (int i = 1; i <= noOfPages; i++) {
        // Extract content of each page
        String contentOfPage = PdfTextExtractor.getTextFromPage(pdfDoc.getPage(i));
        System.out.println(contentOfPage);
      }
      pdfDoc.close();
    } catch (IOException e) {
      System.out.println("Exception occurred " + e.getMessage());
    }
  }
}
Output
Extracted content of PDF---- 
List with Roman symbols
i. Item1
ii. Item2
iii. Item3
List with English letter symbols
A. Item1
B. Item2
C. Item3
List with Greek letter symbols
α. Item1
β. Item2
γ. Item3
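The program above prints each page as soon as it is extracted. If you would rather have the whole text as a single String, the same 1-based page loop can collect it instead. The sketch below is only an illustration: PageTextCollector and its collect method are hypothetical names, not part of iText, and the per-page extraction is passed in as a function so the example runs without a PDF on disk.

```java
import java.util.function.IntFunction;

// Hypothetical helper, not part of iText: gathers the text of pages 1..pageCount
// into one String instead of printing page by page.
final class PageTextCollector {

    // pageText stands in for the per-page extraction call. With iText it could be
    // i -> PdfTextExtractor.getTextFromPage(pdfDoc.getPage(i))
    static String collect(int pageCount, IntFunction<String> pageText) {
        StringBuilder all = new StringBuilder();
        // Page numbers start at 1, so the loop runs from 1 to pageCount inclusive
        for (int i = 1; i <= pageCount; i++) {
            if (i > 1) {
                all.append("\n"); // separate pages with a newline
            }
            all.append(pageText.apply(i));
        }
        return all.toString();
    }

    public static void main(String[] args) {
        // Fake two-page document used only to exercise the loop
        String[] fakePages = { "i. Item1", "A. Item1" };
        String text = collect(fakePages.length, i -> fakePages[i - 1]);
        System.out.println(text);
    }
}
```

In the real program you would call collect(pdfDoc.getNumberOfPages(), ...) with the iText lambda shown in the comment, before closing the PdfDocument.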
That’s all for the topic Read PDF in Java Using iText. If something is missing or you have something to share about the topic please write a comment.
Hello!
Thank you for the post. To make it more useful, it seems necessary to add a link to the sample PDF document.
More importantly, I’m searching for a non-commercial PDF library that can extract text and images (image info) in the same sequence as they appear in the PDF document. I need to verify that each image group’s description corresponds to the images attached below the description. This is the goal of the test.
Could you advise me on such a library, please?
Thank you in advance!