How to Convert PDF Files to Images using Java

Several days ago, I ran into an unusual requirement: I had to open PDF files in my Java Swing application. After some time brainstorming and googling, I decided to convert the PDF pages into JPG images and attach them to a JPanel using a very neat library, pdf-renderer.

To display a PDF file, it's as simple as this:

package com.baculsoft.pdfviewer.main;

import com.sun.pdfview.PDFViewer;
import java.io.File;

/**
 *
 * @author edw
 * @created Dec 28, 2010, 1:23:46 PM
 * @purpose Main Frame to Display PDF files
 */
public class Main {

    private void execute() throws Exception{

        //  load a pdf from a file
        File file = new File("a.pdf");

        // display it on a JFrame
        PDFViewer pdfv = new PDFViewer(true);
        pdfv.openFile(file) ;
        pdfv.setEnabling();
        pdfv.pack();
        pdfv.setVisible(true);

    }

    public static void main(String[] args) {
        Main main = new Main();
        try {
            main.execute();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

It will show a PDF Viewer Frame like this.

But then I was thinking: if it can display PDF pages on a JPanel, can it also convert PDF pages to images? The answer is obviously yes. This is how to do it.

package com.baculsoft.pdfviewer.util;

import com.sun.pdfview.PDFFile;
import com.sun.pdfview.PDFPage;
import java.awt.Graphics;
import java.awt.Image;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import javax.imageio.ImageIO;

public class PDFToImageConverter {

    private void convert() throws Exception {

        //  load a pdf from a file
        File file = new File("a.pdf");
        RandomAccessFile raf = new RandomAccessFile(file, "r");

        //  map the whole file into memory (read-only) and hand it to pdf-renderer
        FileChannel channel = raf.getChannel();
        ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY,
                0, channel.size());
        PDFFile pdffile = new PDFFile(buf);

        //  get number of pages (jumlahhalaman is Indonesian for "number of pages")
        int jumlahhalaman = pdffile.getNumPages();

        //  iterate through the number of pages
        for (int i = 1; i <= jumlahhalaman; i++) {
            PDFPage page = pdffile.getPage(i);

            //  create new image
            Rectangle rect = new Rectangle(0, 0,
                    (int) page.getBBox().getWidth(),
                    (int) page.getBBox().getHeight());

            Image img = page.getImage(
                    rect.width, rect.height, //width & height
                    rect, // clip rect
                    null, // null for the ImageObserver
                    true, // fill background with white
                    true // block until drawing is done
                    );

            BufferedImage bufferedImage = new BufferedImage(rect.width, rect.height, BufferedImage.TYPE_INT_RGB);
            Graphics g = bufferedImage.createGraphics();
            g.drawImage(img, 0, 0, null);
            g.dispose();

            File asd = new File("bbb" + i + ".jpg");
            if (asd.exists()) {
                asd.delete();
            }
            ImageIO.write(bufferedImage, "jpg", asd);
        }

        //  release the file handle once every page has been written
        raf.close();
    }

    public static void main(String[] args) {
        PDFToImageConverter converter = new PDFToImageConverter();
        try {
            converter.convert();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

Before :

After :

As you can see, it creates a set of JPG files, one for each page of your PDF. You can also convert to PNG or GIF instead: just change "jpg" to "png" or "gif" in two places, the file extension in new File("bbb" + i + ".jpg") and the format name passed to ImageIO.write.
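The format switch can also be tried in isolation, without pdf-renderer. The following is only a minimal sketch, not code from this project: the class name ImageFormatSketch, the helper write and the "page-" file prefix are hypothetical names picked for illustration, and a blank white image stands in for a rendered PDF page. It relies on ImageIO.write returning false when no writer is registered for the requested format name, and turns that into an exception so a failed write does not pass silently.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of pdf-renderer or the original project.
final class ImageFormatSketch {

    // Writes the image in the given format ("jpg", "png" or "gif") and
    // fails loudly if ImageIO has no writer for that format name.
    static void write(BufferedImage image, String format, File target) throws IOException {
        if (!ImageIO.write(image, format, target)) {
            throw new IOException("No ImageIO writer found for format: " + format);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a rendered PDF page: a small white RGB image.
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, page.getWidth(), page.getHeight());
        g.dispose();

        // The format name and the file extension change together.
        for (String format : new String[] {"jpg", "png", "gif"}) {
            File target = File.createTempFile("page-", "." + format);
            write(page, format, target);
            System.out.println("Wrote " + target.getAbsolutePath());
        }
    }
}
```

Keeping the image as TYPE_INT_RGB, as the converter does, means there is no alpha channel, which avoids trouble with the JPEG writer when the same image is written in several formats.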

This is my NetBeans 6.9 project structure.

Cheers. (B)

19 Comments

Chinnaswamy

about 5 years ago

Hi Edwin, you have provided an excellent example of how to convert PDF files to images in Java. While exploring it, I hit a puzzle. I tried to convert an A3-size PDF file (created on another computer) to an image using your code, but the resulting image did not cover the full PDF page and was also distorted (the X scale differed from the Y scale). The size of the PDFBox Rectangle was 1684 x 1191. After brainstorming for some hours, I guessed that the cause might be a difference in screen resolution between the computer that created the PDF and the one converting it to an image. Requesting your advice. Chinnaswamy

Chinnaswamy

about 5 years ago

Hi All, this is a follow-up to my help request posted on 17th June 2012. When reading a PDF file created by any CAD software where the scale is the same in both the X and Y directions, Edwin's code works fine, just by making the height of the rect created for each PDF page the same as its width. Thanks and regards, Chinnaswamy Singapore

edwin

about 5 years ago

Hi Chinnaswamy, glad your code works. If you have some spare time, could you please share your workaround on your blog? Who knows, perhaps your write-up could help other people :)

ash

about 5 years ago

Hi Edwin, a very good solution. Though I am running into a problem when I try converting a PDF with two pages into two images. It throws this exception:

java.lang.ArrayIndexOutOfBoundsException: 14
    at com.sun.pdfview.decode.CCITTFaxDecoder.decodeT6(CCITTFaxDecoder.java:1088)
    at com.sun.pdfview.decode.CCITTFaxDecode.decode(CCITTFaxDecode.java:55)
    at com.sun.pdfview.decode.CCITTFaxDecode.decode(CCITTFaxDecode.java:17)
    at com.sun.pdfview.decode.PDFDecoder.decodeStream(PDFDecoder.java:138)
    at com.sun.pdfview.PDFObject.decodeStream(PDFObject.java:347)
    at com.sun.pdfview.PDFObject.getStream(PDFObject.java:261)
    at com.sun.pdfview.PDFObject.getStream(PDFObject.java:255)
    at com.sun.pdfview.PDFObject.getStream(PDFObject.java:298)
    at com.sun.pdfview.PDFImage.getImage(PDFImage.java:306)
    at com.sun.pdfview.PDFRenderer.drawImage(PDFRenderer.java:274)
    at com.sun.pdfview.PDFImageCmd.execute(PDFPage.java:665)
    at com.sun.pdfview.PDFRenderer.iterate(PDFRenderer.java:577)
    at com.sun.pdfview.BaseWatchable.run(BaseWatchable.java:101)
    at com.sun.pdfview.BaseWatchable.execute(BaseWatchable.java:263)
    at com.sun.pdfview.BaseWatchable.go(BaseWatchable.java:197)
    at com.sun.pdfview.PDFPage.getImage(PDFPage.java:239)
    at com.ag.test.PDFToImageConverter.convert(PDFToImageConverter.java:45)
    at com.ag.test.PDFToImageConverter.main(PDFToImageConverter.java:69)

Any solutions for that? PS: not very great at PDF or image knowledge

edwin

about 5 years ago

Hi ash, I've converted PDFs with multiple pages before and never ran into that kind of error. What is the content of your PDF?

saurav das

about 5 years ago

I use your code and it works perfectly, but some PDF files do not work. The problem is that after creating the PDFFile object, pdffile.getNumPages() returns zero (0), even though the PDF file has one page. I checked the version of the PDF and it is [format-PDF-1.3]. Then I found that format-PDF-1.3 does not work, but format-PDF-1.1, format-PDF-1.2 and format-PDF-1.4 do. I updated the pdf-renderer jar to 1.0.5, but the problem is the same. Can anyone help me, please?

edwin

about 5 years ago

Hi saurav das, thank you for your question. Unfortunately, my knowledge of PDF formats is not very deep. Perhaps you could find a way to convert your PDF to PDF 1.4 and then re-run your application. Sorry I could not do much to help you.

raja

about 5 years ago

I got an error like this: "package com.sun.pdfview does not exist". How do I resolve it?

edwin

about 5 years ago

Hi Raja, I think you need to include PDFRenderer.jar in your classpath. :)

jojo

about 4 years ago

Hi! Thanks for the example. Can you help me get a better-quality image when converting, so that I can still read it even if I zoom in a little?

edwin

about 4 years ago

Hi Jojo, it's been a while since I last used pdf-renderer, so I can't tell whether there is a way to increase the PDF-to-image conversion quality. But you can try exporting your PDF to PNG instead of JPG; perhaps that would improve your image quality.

Taylor

about 3 years ago

Hi Jojo, I know this is months late but I had to solve that exact issue myself today. The key to getting a higher-resolution image is to use larger dimensions than the image or bounding box might indicate when capturing the bitmap. This scaling applies to both the page.getImage() call and the BufferedImage constructor (because you'll need a BufferedImage that's big enough to hold the larger image you've captured). e.g. (Note how I use a float variable called "scalingFactor" below. I'm using a value of 2.0 as a default and it works for me.)

[code language="java"]
Rectangle rect = new Rectangle(
        0, // x
        0, // y
        (int) (page.getBBox().getWidth() / page.getAspectRatio()),
        (int) page.getBBox().getHeight()
);

Image img = page.getImage(
        (int) (scalingFactor * rect.width), // width
        (int) (scalingFactor * rect.height), // height
        rect, // clip rectangle
        null, // null for the ImageObserver
        false, // fill background with white
        true // block until drawing is done
);

BufferedImage bufferedImage = new BufferedImage(
        (int) (scalingFactor * page.getWidth()),
        (int) (scalingFactor * rect.height),
        BufferedImage.TYPE_INT_RGB
);

// For my image, for some reason I had to shift the contents of the image.
// You'll need to scale any shifts as well.
Graphics g = bufferedImage.createGraphics();
g.drawImage(img, (int) (-scalingFactor * (rect.width - page.getWidth())), 0, null);
g.dispose();
[/code]

Apologies if the formatting gets mangled. There's not much I can do about it if it does. Good luck!

Mounka

about 3 years ago

Hi! I tried this code as a sample Java class and it works fine, but when I tried to run the same code in a servlet I got the following error:

java.lang.ClassNotFoundException: com.sun.pdfview.PDFFile
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1713)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1558)
    at com.incresol.PDFPOC.PdfToImageConverter.convert(PdfToImageConverter.java:76)
    at com.incresol.PDFPOC.PdfToImageConverter.doGet(PdfToImageConverter.java:45)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Please help me.

edwin

about 3 years ago

Hi Mounka, I think you need to add pdf-renderer.jar to your classpath.

Diya

about 3 years ago

Do you have any experience with PDFRenderer? I'm trying to use it to view PDF files.

edwin

about 3 years ago

Hi Diya, I've only worked with pdfrenderer for a short while, so I don't have deep knowledge of it.

Dinesh Sekar

about 2 years ago

Hi, I can't run this code. Can anyone help me? Compile error:

run:
E:\Users\dinesh.sekar\Documents\tif_input.pdf
java.lang.RuntimeException: Uncompilable source code - incompatible types: void cannot be converted to org.pdfbox.pdmodel.PDPageNode
    at pdftoimageconverter.PDFToImageConverter.convertPDFToJPG(PDFToImageConverter.java:66)
    at pdftoimageconverter.PDFToImageConverter.main(PDFToImageConverter.java:36)
BUILD SUCCESSFUL (total time: 10 seconds)

How do I solve this? Please help me!

edwin

about 2 years ago

Looks like there is something wrong with your code, Dinesh.

wiem

about 12 months ago

Unfortunately, the output of the PDF viewer shows a loss of resolution.
