How to Convert PDF Files to Images using Java

A few days ago I ran into an unusual requirement: I had to open PDF files in my Java Swing application. After some time brainstorming and googling, I decided to convert the PDF pages into JPG images and attach them to a JPanel using a very neat library, pdf-renderer.

Displaying a PDF file is as simple as this:

package com.baculsoft.pdfviewer.main;

import com.sun.pdfview.PDFViewer;
import java.io.File;

/**
 *
 * @author edw
 * @created Dec 28, 2010, 1:23:46 PM
 * @purpose Main Frame to Display PDF files
 */
public class Main {

    private void execute() throws Exception{

        //  load a pdf from a file
        File file = new File("a.pdf");

        // display it on a JFrame
        PDFViewer pdfv = new PDFViewer(true);
        pdfv.openFile(file) ;
        pdfv.setEnabling();
        pdfv.pack();
        pdfv.setVisible(true);

    }

    public static void main(String[] args) {
        Main main = new Main();
        try {
            main.execute();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

It will show a PDF Viewer Frame like this.

But then I wondered: if it can display PDF pages on a JPanel, can it also convert PDF pages to images? The answer is obviously yes. This is how to do it.

package com.baculsoft.pdfviewer.util;

import com.sun.pdfview.PDFFile;
import com.sun.pdfview.PDFPage;
import java.awt.Graphics;
import java.awt.Image;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import javax.imageio.ImageIO;

public class PDFToImageConverter {

    private void convert() throws Exception {
        
        //  load a pdf from a file
        File file = new File("a.pdf");
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        // map the whole file read-only; PDFFile parses the PDF from this buffer

        FileChannel channel = raf.getChannel();
        ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY,
                0, channel.size());
        PDFFile pdffile = new PDFFile(buf);

        //   get number of pages
        int jumlahhalaman = pdffile.getNumPages();

        //  iterate through the number of pages
        for (int i = 1; i <= jumlahhalaman; i++) {
            PDFPage page = pdffile.getPage(i);

            //  create new image
            Rectangle rect = new Rectangle(0, 0,
                    (int) page.getBBox().getWidth(),
                    (int) page.getBBox().getHeight());

            Image img = page.getImage(
                    rect.width, rect.height, //width & height
                    rect, // clip rect
                    null, // null for the ImageObserver
                    true, // fill background with white
                    true // block until drawing is done
                    );

            BufferedImage bufferedImage = new BufferedImage(rect.width, rect.height, BufferedImage.TYPE_INT_RGB);
            Graphics g = bufferedImage.createGraphics();
            g.drawImage(img, 0, 0, null);
            g.dispose();

            File asd = new File("bbb" + i + ".jpg");
            if (asd.exists()) {
                asd.delete();
            }
            ImageIO.write(bufferedImage, "jpg", asd);
        }
    }

    public static void main(String[] args) {
        PDFToImageConverter converter = new PDFToImageConverter();
        try {
            converter.convert();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

Before :

After :

As you can see, it creates one JPG file for each page of your PDF file. You can also convert to PNG or GIF: just change “jpg” to “png” or “gif” on lines 57 and 61 of the listing above (the output file name and the ImageIO.write call).
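
A minimal sketch of the format switch, assuming a hypothetical helper class `ImageFormatWriter` that is not part of the original project. It takes the format once and uses it for both the file extension and the `ImageIO.write` call, so the two cannot drift apart. It also checks the boolean returned by `ImageIO.write`, which is `false` when no writer is found for the requested format.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

class ImageFormatWriter {

    // Writes the image as <baseName>.<format> inside dir and returns the written file.
    static File write(BufferedImage image, File dir, String baseName, String format)
            throws IOException {
        File out = new File(dir, baseName + "." + format);
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(image, format, out)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // a blank white image standing in for a rendered PDF page
        BufferedImage page = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 100, 50);
        g.dispose();

        File dir = new File(System.getProperty("java.io.tmpdir"));
        for (String format : new String[] {"jpg", "png", "gif"}) {
            System.out.println(write(page, dir, "bbb1", format));
        }
    }
}
```

In the converter above, the same idea would replace the hard-coded “jpg” in both the file name and the `ImageIO.write` call.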

This is my NetBeans 6.9 project structure

Cheers. (B)

19 thoughts on “How to Convert PDF Files to Images using Java”

  1. Chinnaswamy

    Hi Edwin,
    You have provided an excellent example for how to convert pdf files to Images in Java. When I tried to explore this, I found a puzzle.

    I tried to read an A3-size PDF file (created on another computer) that I wanted to convert to an image using your code. But the resulting image did not cover the full PDF page, and it was also distorted (the X scale was different from the Y scale).

    The size of the PDFBox Rectangle was 1684 x 1191.

    After brainstorming for some hours, I guessed that
    the cause would be the difference in screen resolution between the computer that created the PDF file and the one that converted it to an image.

    Requesting your advice.

    Chinnaswamy

  2. Chinnaswamy

    Hi All, this is a continuation of my help request posted on 17th June 2012.

    When reading a PDF file created by any CAD software where the scale is the same in both the X and Y directions, Edwin’s code works fine, just by making the height of the rect created for each PDF page the same as the width.

    Thanks and regards,

    Chinnaswamy
    Singapore

    1. Hi Chinnaswamy, glad you got your code to work.
      If you have some spare time, could you please share your workaround on your blog?
      Who knows, perhaps your write-up could help other people 🙂
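
A distorted result like the one described in this thread is what you would see if the width and height passed to `page.getImage()` were scaled by different factors. That is a guess, not a confirmed diagnosis. The sketch below uses a hypothetical helper class, `UniformScale`, to compute a target size with a single scale factor for both axes, so the aspect ratio of the page is preserved.

```java
import java.awt.Dimension;

class UniformScale {

    // Largest size that fits inside maxWidth x maxHeight while keeping
    // the aspect ratio of the page (one scale factor for both axes).
    static Dimension fit(double pageWidth, double pageHeight, int maxWidth, int maxHeight) {
        double scale = Math.min(maxWidth / pageWidth, maxHeight / pageHeight);
        return new Dimension(
                (int) Math.round(pageWidth * scale),
                (int) Math.round(pageHeight * scale));
    }

    public static void main(String[] args) {
        // the 1684 x 1191 rectangle reported above, fitted into a 1000 x 1000 box
        Dimension d = fit(1684, 1191, 1000, 1000);
        System.out.println(d.width + " x " + d.height);
    }
}
```

The resulting width and height would then be used for both the `page.getImage()` call and the `BufferedImage` constructor.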

  3. Hi Edwin,
    A very good solution, though I run into a problem when I try converting a PDF with two pages into two images. It throws this exception:
    java.lang.ArrayIndexOutOfBoundsException: 14
    at com.sun.pdfview.decode.CCITTFaxDecoder.decodeT6(CCITTFaxDecoder.java:1088)
    at com.sun.pdfview.decode.CCITTFaxDecode.decode(CCITTFaxDecode.java:55)
    at com.sun.pdfview.decode.CCITTFaxDecode.decode(CCITTFaxDecode.java:17)
    at com.sun.pdfview.decode.PDFDecoder.decodeStream(PDFDecoder.java:138)
    at com.sun.pdfview.PDFObject.decodeStream(PDFObject.java:347)
    at com.sun.pdfview.PDFObject.getStream(PDFObject.java:261)
    at com.sun.pdfview.PDFObject.getStream(PDFObject.java:255)
    at com.sun.pdfview.PDFObject.getStream(PDFObject.java:298)
    at com.sun.pdfview.PDFImage.getImage(PDFImage.java:306)
    at com.sun.pdfview.PDFRenderer.drawImage(PDFRenderer.java:274)
    at com.sun.pdfview.PDFImageCmd.execute(PDFPage.java:665)
    at com.sun.pdfview.PDFRenderer.iterate(PDFRenderer.java:577)
    at com.sun.pdfview.BaseWatchable.run(BaseWatchable.java:101)
    at com.sun.pdfview.BaseWatchable.execute(BaseWatchable.java:263)
    at com.sun.pdfview.BaseWatchable.go(BaseWatchable.java:197)
    at com.sun.pdfview.PDFPage.getImage(PDFPage.java:239)
    at com.ag.test.PDFToImageConverter.convert(PDFToImageConverter.java:45)
    at com.ag.test.PDFToImageConverter.main(PDFToImageConverter.java:69)

    Any solutions for that?
    PS: I don’t know much about PDFs or images.

    1. Hi ash,
      I’ve converted PDFs with multiple pages before and never ran into that kind of error. What is the content of your PDF?

  4. I used your code and it works perfectly, but some PDF files do not work. The problem is that after creating the PDFFile object, pdffile.getNumPages() returns zero (0), even though the PDF file has one page. I checked the version of the PDF; it is [format-PDF-1.3]. I then found that format-PDF-1.3 does not work, while format-PDF-1.1, format-PDF-1.2 and format-PDF-1.4 do. I updated the pdf-renderer jar to 1.0.5, but the problem is the same. Can anyone help me, please?

    1. Hi saurav das,
      Thank you for your question. Unfortunately, my knowledge of PDF formats is not very deep.
      Perhaps you could find a way to convert your PDF to PDF 1.4 and then re-run your application.
      Sorry I could not do more to help you.
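
To check which version a file declares before converting it, the header can be read directly. This is a minimal sketch, assuming the file starts with the usual `%PDF-x.y` header line; files with extra bytes before the header would return null here. The class name `PdfVersionReader` is hypothetical and the code uses only the JDK, not pdf-renderer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PdfVersionReader {

    private static final Pattern HEADER = Pattern.compile("^%PDF-(\\d+\\.\\d+)");

    // Returns the version from the "%PDF-x.y" header (e.g. "1.3"),
    // or null when the file does not start with such a header.
    static String readVersion(Path pdf) throws IOException {
        byte[] head = new byte[16];
        int n = 0;
        try (InputStream in = Files.newInputStream(pdf)) {
            int r;
            // read up to 16 bytes; a single read() may return fewer
            while (n < head.length && (r = in.read(head, n, head.length - n)) > 0) {
                n += r;
            }
        }
        String header = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        // "a.pdf" is the same sample file name used in the post
        System.out.println("PDF version: " + readVersion(Paths.get("a.pdf")));
    }
}
```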

  5. Hi!!
    Thanks for the example.
    Can you help me get a better-quality image when converting,
    so that I can still read it when I zoom in a little?

    1. Hi Jojo, it’s been a while since I last used pdf-renderer,
      so I can’t tell whether there is a way to increase the PDF-to-image conversion quality.
      But you can try exporting your PDF to PNG instead of JPG; perhaps that would improve your image quality.

    2. Hi Jojo,

      I know this is months late but I had to solve that exact issue myself today. The key to getting a higher resolution image is to use larger dimensions than the image or bounding box might indicate when capturing the bitmap. This scaling applies to both the page.getImage() call and the BufferedImage constructor (because you’ll need a BufferedImage that’s big enough to hold the larger image you’ve captured).

      e.g.

      (Note how I use a float variable called “scalingFactor” below. I’m using a value of 2.0 as a default and it works for me.)

      Rectangle rect = new Rectangle(
        0, // x 
        0, // y
        (int) (page.getBBox().getWidth() / page.getAspectRatio()), 
        (int) page.getBBox().getHeight()
      );
      
      Image img = page.getImage(
        (int) (scalingFactor * rect.width),  // width (multiply first, then cast)
        (int) (scalingFactor * rect.height), // height
        rect, 	// clip rectangle
        null, 	// null for the ImageObserver
        false, 	// fill background with white
        true 	// block until drawing is done
      );
      
      BufferedImage bufferedImage = new BufferedImage(
        (int) (scalingFactor * page.getWidth()),
        (int) (scalingFactor * rect.height),  
        BufferedImage.TYPE_INT_RGB
      );
      // For my image, for some reason I had to shift the contents of the image.
      // You'll need to scale any shifts as well.
      Graphics g = bufferedImage.createGraphics();
      g.drawImage(img,
        (int) (-scalingFactor * (rect.width - page.getWidth())),  
        0,
        null);
      g.dispose();
      

      Apologies if the formatting gets mangled. There’s not much I can do about it if it does.

      Good luck!
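
One way to choose the scaling factor is to start from a target resolution. PDF page dimensions are expressed in points by default, at 72 points per inch, so a factor of dpi / 72 gives the pixel size for a given DPI. The sketch below assumes the bounding box returned by `page.getBBox()` is in points; the class `RenderSize` and its methods are hypothetical helpers, not part of pdf-renderer.

```java
import java.awt.Dimension;

class RenderSize {

    // PDF default user space: 72 units (points) per inch
    static final double POINTS_PER_INCH = 72.0;

    // Scaling factor that maps points to pixels at the requested resolution.
    static double scaleForDpi(double dpi) {
        return dpi / POINTS_PER_INCH;
    }

    // Pixel dimensions of a page given its size in points and a target DPI.
    static Dimension pixelsAt(double widthPt, double heightPt, double dpi) {
        double scale = scaleForDpi(dpi);
        return new Dimension(
                (int) Math.round(widthPt * scale),
                (int) Math.round(heightPt * scale));
    }

    public static void main(String[] args) {
        // a US Letter page (612 x 792 points) at 144 DPI, i.e. a factor of 2.0
        Dimension d = pixelsAt(612, 792, 144);
        System.out.println(d.width + " x " + d.height);
    }
}
```

A factor of 2.0, the default used in the comment above, corresponds to 144 DPI under this assumption.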

  6. Hi !!
    I tried this code in a sample Java class and it works fine,
    but when I run the same code in a servlet I get the following error:
    java.lang.ClassNotFoundException: com.sun.pdfview.PDFFile
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1713)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1558)
    at com.incresol.PDFPOC.PdfToImageConverter.convert(PdfToImageConverter.java:76)
    at com.incresol.PDFPOC.PdfToImageConverter.doGet(PdfToImageConverter.java:45)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
    Please Help Me..

    1. Hi Diya,
      I’ve only worked with pdf-renderer for a short while, so I don’t have deep knowledge of it.
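
The stack trace above shows Tomcat’s `WebappClassLoader` failing to find `com.sun.pdfview.PDFFile`, which usually means the pdf-renderer jar is not visible to the web application. In a standard web application layout, library jars go in `WEB-INF/lib`. A minimal diagnostic sketch, using a hypothetical helper class `ClasspathCheck`, that reports whether a class can be loaded from the current context:

```java
class ClasspathCheck {

    // Returns true when the named class can be found by the current
    // thread's context class loader, without initializing the class.
    static boolean isAvailable(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("PDFFile available: " + isAvailable("com.sun.pdfview.PDFFile"));
    }
}
```

Calling `isAvailable("com.sun.pdfview.PDFFile")` from inside the servlet would show whether the jar was deployed with the application.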

  7. Hi,

    I can’t run this code. Can anyone help me?

    Compile error:

    run:
    E:\Users\dinesh.sekar\Documents\tif_input.pdf
    java.lang.RuntimeException: Uncompilable source code – incompatible types: void cannot be converted to org.pdfbox.pdmodel.PDPageNode
    at pdftoimageconverter.PDFToImageConverter.convertPDFToJPG(PDFToImageConverter.java:66)
    at pdftoimageconverter.PDFToImageConverter.main(PDFToImageConverter.java:36)
    BUILD SUCCESSFUL (total time: 10 seconds)

    ****************************************************************
    How to solve this…

    Please help me….!
