xhtml2pdf
        
            HTML/CSS to PDF converter based on Python
            
I'm trying to convert HTML to PDF with the pisa utility. Please check the code below. I'm getting an error that I can't figure out.
Traceback (most recent call last):
  File "dewa.py", line 27, in <module>
    html = html.encode(enc, 'replace')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd9 in position 203: ordinal not in range(128)
Here is the code:
from cStringIO import StringIO
from grab import Grab
from grab.tools.lxml_tools import drop_node, render_html
from grab.tools.text import remove_bom
from lxml import etree
import grab.error
import inspect
import lxml
import os
import sys
import xhtml2pdf.pisa as pisa
enc = 'utf-8'
filePath = '~/Desktop/dewa'
##############################
g = Grab()
g.go('http://www.dewa.gov.ae/arabic/aboutus/dewahistory.aspx')
html = g.response.body
html = html.replace('bgcolor="EDF389"', 'bgcolor="#EDF389"')
''' clear page '''
html = html.encode(enc, 'replace')
print html
f = file(filePath + '.html' , 'wb')
f.write(html)
f.flush()
f.close()
''' Save PDF '''
pdfresult = StringIO()
pdf = pisa.pisaDocument(StringIO(html), pdfresult, encoding = enc)
f = file(filePath + '.pdf', 'wb')
f.write(pdfresult.getvalue())
f.flush()
f.close()
pdfresult.close()
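One plausible reading of the traceback, assuming the script runs on Python 2: g.response.body is already a byte string, so html.encode(enc, 'replace') first decodes it implicitly as ASCII and fails on the first non-ASCII byte (0xd9 here). A minimal sketch of the usual remedy is to decode the bytes to text explicitly and only then encode. The helper name to_utf8_bytes and its source_encoding parameter are made up for illustration, and the page's real encoding should be checked rather than assumed. The sketch uses Python 3 syntax.

```python
def to_utf8_bytes(raw, source_encoding="utf-8"):
    """Return UTF-8 encoded bytes for `raw`, which may be bytes or text."""
    if isinstance(raw, bytes):
        # Decode explicitly instead of relying on an implicit ASCII decode.
        text = raw.decode(source_encoding, "replace")
    else:
        text = raw
    return text.encode("utf-8", "replace")


# 0xd9 0x85 is the UTF-8 encoding of the Arabic letter MEEM (U+0645).
page_bytes = b'<td bgcolor="EDF389">\xd9\x85</td>'
fixed = to_utf8_bytes(page_bytes).replace(b'bgcolor="EDF389"', b'bgcolor="#EDF389"')
```

The resulting bytes could then be wrapped in a file-like object and handed to pisa.pisaDocument with encoding set to 'utf-8', as in the question.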
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I generate PDFs with the xhtml2pdf Python package. The output is not optimal. I use floating divs to place images and text on the page. In HTML this works, but after PDF rendering, images and text are placed underneath each other, which is not what I want. From searching the web I learned that the ReportLab package used by xhtml2pdf cannot handle floating divs. Does a workaround exist? I have tried WebKit rendering via Qt, but the resulting PDFs are of low quality, i.e. character spacing is completely wrong.
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I have followed exactly the code here: Convert HTML into PDF using Python, but my images are still not showing up. They have absolute URLs, in any case.
xhtml2pdf and reportlab are both placed in my app folder as modules, so no import errors pop up or anything. The PDF renders fine, except that images are not being displayed. I tried to remove HTML and CSS width/height attributes as well to no avail.
Any pointers?
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I'm using xhtml2pdf (formerly pisa, or is it vice versa? :)) to generate a PDF from a Django template. The template is rendered OK, but the PDF I get from that template is corrupted in a very weird manner: text in table cells is lifted to the top of the cell, so capital letters touch the upper border of the cell:

In the browser, however, it looks like this:

I've tried:
- Applying vertical-align: it looks like it's just ignored; at least I didn't notice any changes in the PDF, even though they were present in the generated HTML
- Applying padding-top: it moves the text down, but increases the cell height as well
- Wrapping text in a span with margin-top: same effect as padding-top
I think the reason is that text is rendered by xhtml2pdf at the very top of the line, while browsers tend to render it somewhere in the middle of the block. In other words the text block occupies the very same position both in pdf and html, but the text inside the block is shifted. But that's just my speculation.
So, has anyone faced the same issue? Am I doing something wrong? Any workarounds possible?
Pieces of code:
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
ReportLab/xhtml2pdf have worked perfectly until now when it crashes at this style bit in HTML:
<p style="border-style: initial; border-color: initial; border-image: initial; 
 font-family: Ubuntu-R; font-size: small; border-width: 0px; padding: 0px; 
 margin: 0px;">Done:</p>
with this error:
File "/usr/local/lib/python2.7/dist-packages/reportlab/lib/colors.py",
line 850, in __call__
    raise ValueError('Invalid color value %r' % arg)
ValueError: Invalid color value 'initial'
I use it typically like this:
pdf = pisa.pisaDocument(StringIO.StringIO(html.encode('UTF-8')),
                        result, encoding='UTF-8', link_callback=fetch_resources)
Is there a way to overcome this other than patching its original code?
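One workaround that avoids patching the library is to preprocess the HTML and drop the declarations whose value is the CSS keyword initial before the markup reaches pisa. The following is only a sketch: strip_initial_declarations is a hypothetical helper, it handles double-quoted inline style attributes only (not style blocks or single-quoted attributes), and it assumes that removing those declarations is acceptable for the layout.

```python
import re

# A double-quoted inline style attribute: style="..."
_STYLE_ATTR = re.compile(r'(style\s*=\s*")([^"]*)(")', re.IGNORECASE)
# One declaration whose whole value is the keyword `initial`.
_INITIAL_DECL = re.compile(r"[\w-]+\s*:\s*initial(?![\w-])\s*;?\s*", re.IGNORECASE)


def strip_initial_declarations(html):
    """Remove `property: initial;` declarations from inline style attributes."""
    def clean(match):
        return match.group(1) + _INITIAL_DECL.sub("", match.group(2)) + match.group(3)
    return _STYLE_ATTR.sub(clean, html)


sample = ('<p style="border-style: initial; border-color: initial; '
          'border-image: initial; font-family: Ubuntu-R; font-size: small; '
          'border-width: 0px; padding: 0px; margin: 0px;">Done:</p>')
cleaned = strip_initial_declarations(sample)
```

The cleaned string would then be encoded and passed to pisa.pisaDocument in place of the original html.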
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I am using xhtml2pdf for converting html to pdf. 
For some reason it does not detect the width of any div. I have tried setting the width with a style rule, but it still does not work. What am I doing wrong?
<html>
<head>
</head>
<body>
<style>
    div{
        width:100pt;
        height:100pt;
        border:Solid red 1pt;
    }
</style>
<div>
    WOw a pdf
</div>
</body>
</html>
In the above code the div does not have a width of 100px or 100pt. 
def myview(request):
    options1 = ReportPropertyOption.objects.all()
    for option in options1:
        option.exterior_images = ReportExteriorImages.objects.filter(report = option)  
        option.interior_images = ReportInteriorImages.objects.filter(report = option)
        option.floorplan_images = ReportFloorPlanImages.objects.filter(report = option)
    html  = render_to_string('report/export.html', { 'pagesize' : 'A4', }, context_instance=RequestContext(request,{'options1':options1}))
    result = StringIO.StringIO()
    pdf = pisa.pisaDocument(StringIO.StringIO(html.encode("UTF-8")), dest=result, link_callback=fetch_resources )
    if not pdf.err:
        return HttpResponse(result.getvalue(), mimetype='application/pdf')
    return HttpResponse('Gremlins ate your pdf! %s' % cgi.escape(html))
def fetch_resources(uri, rel):  
    path = os.path.join(settings.MEDIA_ROOT, uri.replace("/media/", ""))
    return path.replace("\\","/")
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I use Pisa/xhtml2pdf in my Django apps to generate pdf from an HTML source. That is:
- I generate an HTML file formatted with all the print-specific features (e.g. page breaks, header, footer, etc.)
- I convert this HTML into a PDF using Pisa
This process works, but it is slow (especially when dealing with long tables), and I must write HTML/CSS that fits Pisa's features and limitations.
The question is: is this the right way to generate a PDF from a web application (i.e. create HTML and then convert it to PDF), or is there a more direct way, that is, to "write" the PDF with a more suitable language?
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I installed xhtml2pdf using pip for use with Django. I am getting the following ImportError:
Reportlab Toolkit Version 2.2 or higher needed
But I have reportlab 3.0
>>> import reportlab
>>> print reportlab.Version
3.0
I found this try catch block in the __init__.py of xhtml2pdf:
REQUIRED_INFO = """
****************************************************
IMPORT ERROR!
%s
****************************************************
The following Python packages are required for PISA:
- Reportlab Toolkit >= 2.2 <http://www.reportlab.org/>
- HTML5lib >= 0.11.1 <http://code.google.com/p/html5lib/>
Optional packages:
- pyPDF <http://pybrary.net/pyPdf/>
- PIL <http://www.pythonware.com/products/pil/>
""".lstrip()
log = logging.getLogger(__name__)
try:
    from xhtml2pdf.util import REPORTLAB22
    if not REPORTLAB22:
        raise ImportError, "Reportlab Toolkit Version 2.2 or higher needed"
except ImportError, e:
    import sys
    sys.stderr.write(REQUIRED_INFO % e)
    log.error(REQUIRED_INFO % e)
    raise
There's also another error in the util.py:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"):
Shouldn't that read something like:
if not (reportlab.Version[:3] >="2.1"):
What gives?
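Going by the util.py line quoted above, the check requires Version[0] to equal "2", which is false for "3.0" and would explain the error. The slice comparison proposed in the question is also fragile, because it compares strings character by character. A small sketch of a numeric comparison follows; version_tuple and reportlab_is_recent_enough are hypothetical helper names, not part of xhtml2pdf or ReportLab, and pre-release suffixes such as "b1" are not handled specially.

```python
import re


def version_tuple(version):
    """Turn a version string such as '3.0' or '2.5.1' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def reportlab_is_recent_enough(version, minimum=(2, 1)):
    """Compare numerically, so that '10.0' counts as newer than '2.1'."""
    return version_tuple(version) >= minimum


# String slicing gets this wrong: "10." sorts before "2.1" as text.
string_check = "10.0"[:3] >= "2.1"
tuple_check = reportlab_is_recent_enough("10.0")
```

With reportlab installed, the call would look like reportlab_is_recent_enough(reportlab.Version).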
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
At the moment we use xhtml2pdf to generate PDFs dynamically and send them to the browser whenever required. Our requirements have now changed: the PDF should be generated only once and stored on the server, and the user should be shown a link to view it. Could you please point out any resources or snippets to achieve this?
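One way to approach this, sketched under assumptions: write the PDF to a file the first time it is requested and reuse that file afterwards. get_or_create_pdf is a hypothetical helper, and render stands for any callable that writes the PDF into an open binary file, for example a thin wrapper around pisa.CreatePDF. How the stored file is exposed as a link (media URL, view, etc.) depends on the application and is not shown.

```python
import os


def get_or_create_pdf(pdf_path, render):
    """Return pdf_path, calling render(file_obj) only if the file is missing."""
    if not os.path.exists(pdf_path):
        tmp_path = pdf_path + ".tmp"
        with open(tmp_path, "wb") as out:
            render(out)
        # Rename only after a complete write so readers never see a partial file.
        os.replace(tmp_path, pdf_path)
    return pdf_path


# Hypothetical usage with xhtml2pdf (path and variable names are illustrative):
#   from xhtml2pdf import pisa
#   path = get_or_create_pdf("/srv/media/reports/42.pdf",
#                            lambda out: pisa.CreatePDF(html, dest=out))
```

Writing to a temporary name first is a design choice to avoid serving a half-written PDF if two requests arrive at the same time; it does not prevent the PDF from being rendered twice in that case.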
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
First of all, I'm new to python, reportlab, xhtml2pdf. 
I've already done my first pdf files with reportlab, but I ran into the following problem.
I need a large text in two columns.
First I create my canvas, create my story, append my large text as a paragraph to the story, create my Frame and finally add the story to the frame. 
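from reportlab.lib.styles import getSampleStyleSheet
from reportlab.pdfgen.canvas import Canvas
from reportlab.platypus import Frame, Paragraph

# styleText is not defined in the original snippet; a stock style is assumed here.
styleText = getSampleStyleSheet()["Normal"]

c = Canvas("local.pdf")
storyExample = []
textExample = (""" This is a very large text Lorem Ipsum ... """)
storyExample.append(Paragraph(textExample, styleText))
frameExample = Frame(0, 0, 50, 50,showBoundary=0)
frameExample.addFromList(storyExample,c)
c.showPage()
c.save()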
Works like a charm. But I need to show the text in a two-column representation.
Now the text just flows through my frame like:
|aaaaaaaaaaaaaaaaaaaa|
|bbbbbbbbbbbbbbbbbbbb|
|cccccccccccccccccccc|
|dddddddddddddddddddd|
But I need it like this:
|aaaaaaaaa  bbbbbbbbbb|
|aaaaaaaaa  cccccccccc|
|bbbbbbbbb  cccccccccc|
|bbbbbbbbb  dddddddddd|
I hope you understood what I am trying to say.   
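A common way to get this layout in ReportLab is to define two frames side by side in a page template and let the story flow from the first frame into the second. The sketch below is an illustration rather than the only approach: column_boxes and build_two_column_pdf are hypothetical names, and the margin and gutter values are arbitrary. The ReportLab import sits inside the function so that the geometry helper can be used on its own.

```python
def column_boxes(page_width, page_height, margin, gutter, columns=2):
    """Return (x, y, width, height) for each column, left to right."""
    usable = page_width - 2 * margin - gutter * (columns - 1)
    col_width = usable / columns
    height = page_height - 2 * margin
    return [
        (margin + i * (col_width + gutter), margin, col_width, height)
        for i in range(columns)
    ]


def build_two_column_pdf(filename, text):
    # Imported here so the geometry helper above works without ReportLab.
    from reportlab.lib.pagesizes import A4
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import BaseDocTemplate, Frame, PageTemplate, Paragraph

    boxes = column_boxes(A4[0], A4[1], margin=50, gutter=20)
    frames = [Frame(x, y, w, h, id="col%d" % i)
              for i, (x, y, w, h) in enumerate(boxes)]
    doc = BaseDocTemplate(filename, pagesize=A4)
    doc.addPageTemplates([PageTemplate(id="two_columns", frames=frames)])
    # The text fills the left frame first, then the right one, then a new page.
    doc.build([Paragraph(text, getSampleStyleSheet()["Normal"])])


boxes = column_boxes(600, 800, margin=50, gutter=20)
```

Calling build_two_column_pdf("local.pdf", textExample) would replace the manual Canvas and Frame handling from the question.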
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I'm having some trouble getting a footer to appear as one frame on the first page of a Pisa document, and as another frame on every other page. I have attempted to adapt the lastPage idea from here, but with no luck.
Is it possible to do this? <pdf:nextpage /> doesn't seem to be the right thing here since the document has a long table that may (or may not) flow over multiple pages. <pdf:nextframe /> plus a first-page-only frame looks promising, though I'm not sure how to use this exactly.
Currently I have (snipped for brevity):
<style type="text/css">
  @page {
    margin: 1cm;
    margin-bottom: 2.5cm;
    @frame footer {
      -pdf-frame-content: footerFirst;
      -pdf-frame-border: 1;
      bottom: 2cm;
      margin-left: 1cm;
      margin-right: 1cm;
      height: 1cm;
   }
   @frame footer {
      -pdf-frame-content: footerOther;
      bottom: 2cm;
      margin-left: 1cm;
      margin-right: 1cm;
      height: 1cm;
}
</style>
<body>
  <table repeat="1">
    <!-- extra long table here -->
  </table>
  <div id="footerContent">This is a footer</div>
  <!-- what goes here to switch frames after the first page? -->
  <div id="footerOther"></div>
</body>
This places the same footer on each page. I need the same space reserved on each subsequent page, but with no content in the frame.
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I can't seem to find a working tutorial or howto document for this module. Does one exist somewhere?
The "To be completed" section here:
https://github.com/chrisglass/xhtml2pdf/blob/master/doc/usage.rst
is buggy, and doesn't seem to contain working code. After corrections, this code sequence:
from xhtml2pdf import pisa as pisa
filename = u'test.pdf'
pdf = pisa.CreatePDF("Hello <strong>World</strong>",file(filename, "wb"))
pisa.startViewer(filename)
produces an empty test.pdf file (well, not exactly empty, it's a pdf file without content)
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I'm trying to build a view which would render itself to PDF. 
Each time I accessed the view, I ran into seemingly random issues with the structure of the rendered document/table.
Tracking down the error, I reduced the view to rendering completely static HTML, and found that the size of the resulting document differs on each request:
    template = get_template(self.get_report_template_name())
    html = template.render(Context({}))
    strobj = StringIO.StringIO()
    pisa.CreatePDF(html.encode("UTF-8"), strobj, encoding='UTF-8')
    return HttpResponse('len: %d' % strobj.len);
As you can see, the very same template is rendered each time with an empty context, to make sure nothing changes. In any case, the template doesn't use the Django template language at all.
The above code returns a slightly different result each time I refresh the page:
len: 2573,
len: 2595,
len: 2234,
len: 2601,
len: 2244,
len: 2632,
etc. (some of the values are repeated multiple times).
When saved and displayed, these documents contain a "broken" page structure, such as an incorrectly displayed table cell. Only one of them looks correct.
Any suggestions where to find the problem? 
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
Good day. I am trying to use xhtml2pdf to save web pages as PDF files on my local disk. I found the example below.
It runs without returning an error. However, it doesn't convert the web page, only a single sentence: in this case, just 'http://www.yahoo.com/' is written into the PDF file.
How can I actually convert the web page into a PDF? Thanks.
from xhtml2pdf import pisa
sourceHtml = 'http://www.yahoo.com/'
outputFilename = "test.pdf"
def convertHtmlToPdf(sourceHtml, outputFilename):
    resultFile = open(outputFilename, "w+b")
    pisaStatus = pisa.CreatePDF(sourceHtml,resultFile)
    resultFile.close()
    return pisaStatus.err
if __name__=="__main__":
    pisa.showLogging()
    convertHtmlToPdf(sourceHtml, outputFilename)
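The observed output is consistent with the URL string itself being treated as the HTML source, since pisa.CreatePDF is given the text 'http://www.yahoo.com/' rather than the page behind it. A sketch of one fix, under the assumption that downloading the page first is acceptable: fetch_html and convert_url_to_pdf are hypothetical helpers, written for Python 3, and a complex site may still render poorly because of CSS and scripts that xhtml2pdf does not support.

```python
from urllib.request import urlopen


def fetch_html(url, fallback_encoding="utf-8"):
    """Download `url` and return its body as text."""
    with urlopen(url) as response:
        # Use the charset from the response headers when one is given.
        charset = response.headers.get_content_charset() or fallback_encoding
        return response.read().decode(charset, "replace")


def convert_url_to_pdf(url, output_filename):
    from xhtml2pdf import pisa  # imported lazily; needs xhtml2pdf installed

    with open(output_filename, "w+b") as result_file:
        # Pass the downloaded markup, not the URL string, as the source.
        status = pisa.CreatePDF(fetch_html(url), dest=result_file)
    return status.err
```

In the question's script, convertHtmlToPdf(sourceHtml, outputFilename) would become convert_url_to_pdf(sourceHtml, outputFilename).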
        Source: (StackOverflow)
                  
                 
            
                 
                
                
            
            
I am trying to export an html document to pdf using the xhtml2pdf python library.
I think the <img> tag is supported - however the docs are not clear on this matter - there are a couple of test cases using the tag.
Following the example in the docs, with an image added, I did this:
from xhtml2pdf import pisa
sourceHtml = "<html><body><div><img src ='testimage.jpg'></div><p>Some text output for testing...<p></body></html>"
outputFilename = "test.pdf"
resultFile = open(outputFilename, "w+b")
pisa.CreatePDF(sourceHtml,dest=resultFile)
resultFile.close()
However no image was included in the resulting pdf. Reading around, I see that this might be to do with the PIL package - which appears to be installed OK on my system.
My question is should I be expecting the above code to work with xhtml2pdf or does it ignore the <img> tag?
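If the missing image comes down to path resolution (an assumption; a missing PIL install is another possibility, as noted above), a link_callback can map relative URIs such as 'testimage.jpg' to absolute paths. pisa.CreatePDF accepts a link_callback argument that is called with the URI and the base it is relative to. make_link_callback is a hypothetical factory name, and the base directory is whatever folder holds the images.

```python
import os


def make_link_callback(base_dir):
    """Build a callback that maps relative URIs to paths under base_dir."""
    def link_callback(uri, rel):
        if "://" in uri or os.path.isabs(uri):
            return uri  # leave remote URLs and absolute paths untouched
        return os.path.join(base_dir, uri)
    return link_callback


# Hypothetical usage with the snippet from the question:
#   pisa.CreatePDF(sourceHtml, dest=resultFile,
#                  link_callback=make_link_callback(os.getcwd()))
```

Passing the directory explicitly keeps the callback independent of the process's working directory, which often differs between a shell and a web server.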
        Source: (StackOverflow)