
Generate PDF documents from HTML pages with WebKit

tl;dr
Use wkhtmltopdf to generate PDFs of your web pages.

 

A frequent need we have is to be able to generate PDF documents from our website content. A savvy user can always use a print-driver-based solution like PrimoPDF to print a web page to PDF, but that’s not really user friendly. There are also PDF creation libraries that let you generate PDF output programmatically, but in my experience they produce only very basic-looking output (and only with a lot of work).

That leads me to the best solution we’ve found from both user and developer standpoints.

wkhtmltopdf

Simple shell utility to convert html to pdf using the webkit rendering engine, and qt.

Using this tool, creating a PDF from a website is as simple as running the following at a command prompt:

wkhtmltopdf http://www.myhomepage.com myhomepage.pdf

wkhtmltopdf is a great little piece of software engineering that uses the WebKit HTML rendering engine to render web pages and capture the output as PDF. WebKit is the open source rendering engine that, at the time of writing, powers the Google Chrome and Safari web browsers. wkhtmltopdf uses WebKit in the form of the QtWebKit widget that was released with Qt 4.4. If you’ve ever tried working with the WebKit codebase, you’ll know that using the Qt wrapper is a huge productivity booster.
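
If you want to call the tool from a script rather than from a command prompt, something like the following should work. This is only a sketch: it assumes wkhtmltopdf is installed and on the PATH, and the helper names build_command and html_to_pdf are made up for illustration. It uses nothing beyond the two-argument invocation shown above.

```python
import shutil
import subprocess


def build_command(url, output_path, binary="wkhtmltopdf"):
    """Return the argument list for: wkhtmltopdf <url> <output.pdf>."""
    return [binary, url, output_path]


def html_to_pdf(url, output_path):
    """Render `url` to a PDF at `output_path` by shelling out to wkhtmltopdf."""
    if shutil.which("wkhtmltopdf") is None:
        raise FileNotFoundError("wkhtmltopdf is not on the PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(url, output_path), check=True)
```

Calling html_to_pdf("http://www.myhomepage.com", "myhomepage.pdf") is then the equivalent of the command line above. Passing the arguments as a list (instead of one shell string) avoids quoting problems with URLs that contain characters like & or ?.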

For Drupal people, there’s a Print module that can be configured to use this tool to create PDF versions of content.

Plenty of Fish Hacked: Sloppy security

Just heard that the biggest free dating site in North America, Plenty of Fish, has been hacked. There seems to be some drama behind it, as there are stories about an extortion attempt, but what interested me was the claim that PoF stores user passwords as plaintext! This is never a good idea. While you’re at it, sending user input to the server as plaintext is also a bad idea; use SSL.

To safely store passwords, you don’t store passwords. Instead, you store a hash of each password. This way, the server doesn’t actually know what the passwords are. To verify a password, run it through the same hash function and compare the result with the hash in your database. Caveat: don’t use a fast general-purpose hash function, because its output can be brute-forced quickly; use one that can be configured to run slowly.

Solution: use bcrypt
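
To make the store-a-hash, verify-by-rehashing idea concrete, here is a rough Python sketch. Note the substitution: bcrypt is not in Python’s standard library, so this uses PBKDF2-HMAC-SHA256 from hashlib instead, which is likewise a deliberately slow function with a configurable work factor. The function names and the iteration count are my own assumptions, not a recommendation for any particular site.

```python
import hashlib
import hmac
import os

# Work factor (assumed value): higher means slower hashing and slower brute force.
ITERATIONS = 200_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, iterations, digest); store all three, never the password."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected_digest):
    """Re-hash the candidate with the stored salt and compare the digests."""
    _, _, digest = hash_password(password, salt, iterations)
    # compare_digest takes the same time wherever the first mismatch occurs.
    return hmac.compare_digest(digest, expected_digest)
```

The per-password salt means two users with the same password end up with different stored hashes, and keeping the iteration count next to each hash lets you raise the work factor later without invalidating old records.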

P.S. Encryption isn’t the best solution here because all it takes is for your encryption key to be compromised to open the floodgates.

Developing Google Chrome Extensions Notes

Typical Google Chrome Extensions are made from the same kinds of files you would see in your average web page: HTML, CSS, PNG, JavaScript. Another kind of extension is a Packaged App (*.crx), which cannot have a browser action or page action. Instead, at least one HTML file must be provided to serve as the UI.
A special manifest.json file contains metadata about the main entry points of your extension.
There are several ways Google Chrome can present a UI for your extension.
Browser Actions are icons that appear next to the address bar.
Page Actions are icons that appear inside the address bar.
Standalone page in the case of Packaged Apps.
For security and stability reasons, access to the current page’s DOM is limited to:
1. content-scripts
2. Tabs API: the executeScript method.
Background.html is a special file that contains privileged code that remains active throughout the life of your extension.
To communicate between your extension (background.html) and the unprivileged code (content-scripts, popup page) you will need to use message passing.
To use jQuery in your html pages you can just include it. To use jQuery in content-scripts you must include it in your manifest.json file (content_scripts[‘js’]).
You can control where your extension is active using the “permissions” field of the manifest.json.
You can control where your content-scripts will be injected using the content_scripts[‘matches’] field.
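
To tie the manifest fields above together, here is a sketch that builds a manifest in Python and serializes it to JSON. It assumes the older manifest format these notes describe (a background page and a browser action); newer Chrome versions use a different layout. The extension name, file names and the example.com URL pattern are placeholders I made up.

```python
import json

# Hypothetical extension: every name, file and URL pattern is a placeholder.
manifest = {
    "manifest_version": 2,
    "name": "Example Extension",
    "version": "0.1",
    # Browser Action: the icon next to the address bar, with a popup page.
    "browser_action": {
        "default_icon": "icon.png",
        "default_popup": "popup.html",
    },
    # Privileged code that stays alive for the life of the extension.
    "background": {"page": "background.html"},
    # Where the extension is allowed to be active.
    "permissions": ["tabs", "http://example.com/*"],
    # Where content scripts are injected, and which files are loaded.
    "content_scripts": [
        {
            "matches": ["http://example.com/*"],
            # jQuery is listed first so it loads before the script using it.
            "js": ["jquery.js", "content.js"],
        }
    ],
}


def manifest_json():
    """Return the manifest as the text you would save as manifest.json."""
    return json.dumps(manifest, indent=2)
```

Saving the output of manifest_json() as manifest.json in the extension folder gives Chrome the entry points described in the notes.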

What is a CouchApp?

CouchDB is a nifty document datastore with awesome replication and recovery capabilities. Replicating your data to another site has never been easier; it just works. Another cool feature is that you can create a web application directly in CouchDB with no additional software.

In addition to regular documents CouchDB can store what are known as “Design Documents”. Design Documents are designated by the _id: _design/docname (they are the only documents that allow a / in the id). These documents must follow a specific schema that CouchDB uses to serve your application.

Typically, a design document defines special functions called “views” in its “views” field; CouchDB runs these functions to generate views of your data.

You can do all of your development manually, but it’s a pain writing JavaScript code as standard JSON strings. Instead, there are tools available that allow you to write your app logic in standard JavaScript; the tool will JSON-stringify it before uploading it to your CouchDB.
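
To show what that stringify step amounts to, here is a small Python sketch that wraps a JavaScript map function in a design document. The design document name, the view name and the doc.type field are placeholders, and build_design_doc is not part of any CouchApp tool, just an illustration.

```python
import json

# A view's map function is JavaScript source stored as a plain JSON string.
MAP_BY_TYPE = """
function (doc) {
  if (doc.type) {
    emit(doc.type, null);
  }
}
""".strip()


def build_design_doc(name, views):
    """Build a design document; `views` maps view names to map-function source."""
    return {
        "_id": "_design/" + name,
        "views": {view: {"map": source} for view, source in views.items()},
    }


design_doc = build_design_doc("example", {"by_type": MAP_BY_TYPE})
# json.dumps does the escaping (quotes, newlines) you would otherwise do by hand.
payload = json.dumps(design_doc, indent=2)
```

The resulting payload is what you would PUT to your database under the _design/example id, for instance with curl against a local CouchDB.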

Along with tools for creating CouchApps, there are also server-side JavaScript libraries that you can use to speed up your development. Node, Step and Mustache are some popular ones. Node is the core runtime for writing server-side JavaScript code. Step is a Node package (see npm) that makes code written for Node easier to understand. Mustache is a templating library for rendering output to the client.

All this is pretty cool. I like the idea of having the ability to easily replicate your app to users. It’s really THAT easy.

Here’s a cool tutorial that I found: http://couchtim.github.com/clubhouse/

Ruby on Rails with RVM on Ubuntu: no such file to load -- openssl error

The following was adapted from: http://cjohansen.no/en/ruby/ruby_version_manager_ubuntu_and_openssl

RVM (Ruby Version Manager) is a must if you are working with different versions of Ruby. You can also create “gemsets”, which are sets of gems that you can easily switch between. It isn’t perfect, however: on Ubuntu, you may hit errors with Ruby 1.8.7 when trying to use the openssl library from any Ruby version that was not provided by the system.

Fixing openssl for RVM-provided Rubies

RVM maintains the various versions of Ruby in their own self-contained environments (folders) and will have trouble accessing some system-installed packages. On Ubuntu, apt is most commonly used to install and manage Ruby. With apt, openssl is a separate package (libopenssl-ruby). This installs openssl for your apt-provided Ruby, most likely in /usr/lib/ruby/1.8/openssl. RVM-provided Rubies can’t access this, however, so you need to build openssl for each RVM-provided Ruby you install.

The trick is to get libssl-dev from apt before building openssl; otherwise you’ll have trouble configuring it. So, to get openssl on an RVM-provided Ruby, simply do:

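# Install the OpenSSL development headers needed to build the extension
sudo apt-get install libssl-dev
# Build the openssl extension from the RVM source tree, using the RVM Ruby
# (select it first, e.g. with: rvm use 1.8.7)
cd ~/.rvm/src/ruby-1.8.7/ext/openssl
ruby extconf.rb
make && make install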

SharePoint ISVs

Retrieved from my old SharePoint blog: http://vspug.com/kwanl/2010/08/29/sharepoint-isvs/

After a long hiatus, I have decided to return to SharePoint blogging. The SharePoint marketplace has exploded since I was actively working with the toy web parts we made at my old employer. Here is a list of the big SharePoint vendors I know about. If you know of more, please suggest them in the comments.

Report Viewer 2008 error: Unable to load client print control

One of the updates in the Report Viewer 2008 control is the ability to print from your browser in local mode. Frustratingly, the first time your users try it they will be greeted by an error message: “Unable to load client print control”. I’ve read that this is caused by a Microsoft Security Update.

To resolve this, I ran the following registry script. Copy the text below into a file with a .reg extension and double-click it. The leading minus sign inside the brackets tells the Registry Editor to delete that key.

Windows Registry Editor Version 5.00

[-HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\ActiveX Compatibility\{FA91DF8D-53AB-455D-AB20-F2F023E498D3}]

Other ways to fix this issue: Reporting Services Client-Side Printing and Silent Deployment of RSClientPrint.cab ActiveX file.

Resources:
Unable to load client print control

Brian Hartman’s Report Viewer Blog