Fix: ImportError: cannot import name PDFDocument (when using slate)

I spent over an hour fixing this problem. I’m writing this post so I don’t have to search again, and to share the solution I used. As of this writing the solution still works perfectly, but I don’t know whether it will keep working in the future, since the owner of the slate package said he’s going to change the code.

After installing slate, you type in

import slate

and you get this error:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/slate/__init__.py", line 48, in <module>
    from slate import PDF
  File "/usr/local/lib/python2.7/dist-packages/slate/slate.py", line 3, in <module>
    from pdfminer.pdfparser import PDFParser, PDFDocument
ImportError: cannot import name PDFDocument

This post suggests many ways to solve the problem, but not all of them can be applied. The one that worked for me is the solution from rdpickard. It’s not quite elegant, but it fixes shit.

If you’re afraid of clicking the link above, here’s what he wrote:

Not sure if editing the slate.py is an option for people’s environment, but if you change

line 3

from pdfminer.pdfparser import PDFParser, PDFDocument

to

from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfpage import PDFPage

line 38

        self.doc = PDFDocument()

to

        self.doc = PDFDocument(self.parser)

comment out lines 40 & 41

line 49

            for page in self.doc.get_pages():
                self.append(self.interpreter.process_page(page))

to

            for page in PDFPage.create_pages(self.doc):
                self.append(self.interpreter.process_page(page))

it works.
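
If you would rather not edit the file by hand, the same changes can be sketched as a small script. This is only a rough sketch, not something from rdpickard or the slate author: the function name patch_slate_source is made up, and it assumes your slate.py matches the slate 0.3 layout quoted above (the old import on line 3, and lines 40 and 41 being the ones to comment out). Back up slate.py before trying it.

```python
# The old import quoted above, and its three-line replacement.
OLD_IMPORT = "from pdfminer.pdfparser import PDFParser, PDFDocument"
NEW_IMPORT = (
    "from pdfminer.pdfparser import PDFParser\n"
    "from pdfminer.pdfdocument import PDFDocument\n"
    "from pdfminer.pdfpage import PDFPage"
)


def patch_slate_source(source):
    """Return the text of slate.py with the quoted edits applied.

    Hypothetical helper: it assumes the unpatched slate 0.3 line numbering.
    """
    # If the old import is gone, the file was already patched (or is a
    # different version), so leave it alone.
    if OLD_IMPORT not in source:
        return source

    lines = source.split("\n")

    # Comment out lines 40 and 41 first, while the original numbering holds.
    for number in (40, 41):
        index = number - 1
        if index >= len(lines):
            continue
        line = lines[index]
        stripped = line.lstrip()
        if stripped and not stripped.startswith("#"):
            indent = len(line) - len(stripped)
            lines[index] = line[:indent] + "# " + stripped

    patched = "\n".join(lines)
    # Line 3: split the import across the new pdfminer modules.
    patched = patched.replace(OLD_IMPORT, NEW_IMPORT)
    # Line 38: PDFDocument now takes the parser as an argument.
    patched = patched.replace(
        "self.doc = PDFDocument()", "self.doc = PDFDocument(self.parser)"
    )
    # Line 49: pages now come from PDFPage.create_pages.
    patched = patched.replace(
        "self.doc.get_pages()", "PDFPage.create_pages(self.doc)"
    )
    return patched
```

To use it, read slate.py into a string, pass it to patch_slate_source, and write the result back to the same file. On my machine the file is /usr/local/lib/python2.7/dist-packages/slate/slate.py, as shown in the traceback; yours may be somewhere else.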

Here are the versions of the libraries I am using:

cssselect==0.9.1
lxml==3.6.0
pdfminer==20140328
pyquery==1.2.13
slate==0.3
wheel==0.24.0

That’s it.

If you can’t change the code inside slate.py, it probably means you don’t have permission. If you’re using Linux, type this line into a terminal

sudo chown yourusername:yourusername path_to_file_or_folder_you_want_to_get_permission

and then enter your password.
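
Before changing ownership, you can check from Python whether the file is really not writable for you. This is a minimal sketch; can_edit is a name I made up for illustration, and it only uses os.path.isfile and os.access from the standard library.

```python
import os


def can_edit(path):
    """Return True if path is an existing file the current user may write to."""
    return os.path.isfile(path) and os.access(path, os.W_OK)
```

Call it with the path to your slate.py, for example the one shown in the traceback. If it returns False for a file that exists, the chown command above is the next step.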
