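Capturing Web Pages with wkhtmltopdf on FreeBSD
by admin on August 04, 2010, under FreeBSD, PHP & MYSQL
Programmatically exporting a web page to PDF, or capturing a page as an image, used to be a major undertaking.
This post shows how to do it easily from the FreeBSD command line, without the heavyweight X Window System, which makes it suitable for running on a server.
The software's own introduction follows.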
wkhtmltopdf
Convert html to pdf using webkit (qtwebkit)
Description
Simple shell utility to convert html to pdf using the webkit rendering engine, and qt.
Introduction
Searching the web, I have found several command line tools that allow you to convert a HTML-document to a PDF-document, however they all seem to use their own, and rather incomplete rendering engine, resulting in poor quality. Recently QT 4.4 was released with a WebKit widget (WebKit is the engine of Apples Safari, which is a fork of the KDE KHtml), and making a good tool became very easy.
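Because the tool is built on WebKit, everything except Flash renders correctly, including JavaScript.
Before installing, make sure the linux-base package is installed and working on your FreeBSD system, and that the ports tree is up to date.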
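A quick preflight check can save a failed build. The sketch below is not part of the original instructions: it assumes that the two files the port logs in the steps below report as dependencies (/compat/linux/etc/fedora-release and /usr/local/bin/rpm2cpio) are reasonable indicators that linux-base and rpm2cpio are in place. The function name check_prereqs and its optional root-prefix argument are hypothetical.

```shell
#!/bin/sh
# check_prereqs [ROOT]
# Hypothetical helper: reports whether the files the Linux ports depend on exist.
# ROOT is an optional prefix (default: empty, meaning the real filesystem root),
# which makes it possible to try the check against a scratch directory.
check_prereqs() {
    root="${1:-}"
    missing=0
    for f in /compat/linux/etc/fedora-release /usr/local/bin/rpm2cpio; do
        if [ -e "${root}${f}" ]; then
            echo "found:   ${f}"
        else
            echo "missing: ${f}"
            missing=1
        fi
    done
    # Exit status 0 only when every file was found.
    return "$missing"
}
```

Calling check_prereqs with no argument checks the live system; a non-zero exit status means at least one file is missing.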
1. Install linux-expat
# cd /usr/ports/textproc/linux-f10-expat;make install clean;
===> License check disabled, port has not defined LICENSE
=> expat-2.0.1-5.i386.rpm doesn't seem to exist in /usr/ports/distfiles/rpm/i386/fedora/10.
=> Attempting to fetch from http://ftp.tw.freebsd.org/pub/FreeBSD/distfiles/rpm/i386/fedora/10/.
expat-2.0.1-5.i386.rpm 100% of 82 kB 244 kBps
===> Extracting for linux-f10-expat-2.0.1
=> MD5 Checksum OK for rpm/i386/fedora/10/expat-2.0.1-5.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/expat-2.0.1-5.i386.rpm.
===> linux-f10-expat-2.0.1 depends on file: /usr/local/bin/rpm2cpio - found
===> Patching for linux-f10-expat-2.0.1
===> Configuring for linux-f10-expat-2.0.1
===> Installing for linux-f10-expat-2.0.1
===> linux-f10-expat-2.0.1 depends on file: /compat/linux/etc/fedora-release - found
===> Generating temporary packing list
===> Checking if textproc/linux-f10-expat already installed
cd /usr/ports/textproc/linux-f10-expat/work && /usr/bin/find * -type d -exec /bin/mkdir -p "/compat/linux/{}" \;
cd /usr/ports/textproc/linux-f10-expat/work && /usr/bin/find * ! -type d | /usr/bin/cpio -pm -R root:wheel /compat/linux
367 blocks
===> Running linux ldconfig
/compat/linux/sbin/ldconfig -r /compat/linux
===> Registering installation for linux-f10-expat-2.0.1
===> Cleaning for linux-f10-expat-2.0.1
2. Install linux-fontconfig
# cd /usr/ports/x11-fonts/linux-f10-fontconfig; make install clean;
===> License check disabled, port has not defined LICENSE
=> fontconfig-2.6.0-3.fc10.i386.rpm doesn't seem to exist in /usr/ports/distfiles/rpm/i386/fedora/10.
=> Attempting to fetch from http://ftp.tw.freebsd.org/pub/FreeBSD/distfiles/rpm/i386/fedora/10/.
fontconfig-2.6.0-3.fc10.i386.rpm 100% of 182 kB 241 kBps
===> Extracting for linux-f10-fontconfig-2.6.0
=> MD5 Checksum OK for rpm/i386/fedora/10/fontconfig-2.6.0-3.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/fontconfig-2.6.0-3.fc10.i386.rpm.
===> linux-f10-fontconfig-2.6.0 depends on file: /usr/local/bin/rpm2cpio - found
===> Patching for linux-f10-fontconfig-2.6.0
===> Configuring for linux-f10-fontconfig-2.6.0
===> Installing for linux-f10-fontconfig-2.6.0
===> linux-f10-fontconfig-2.6.0 depends on file: /compat/linux/etc/fedora-release - found
===> linux-f10-fontconfig-2.6.0 depends on file: /compat/linux/lib/libexpat.so.1 - found
===> Generating temporary packing list
===> Checking if x11-fonts/linux-f10-fontconfig already installed
cd /usr/ports/x11-fonts/linux-f10-fontconfig/work && /usr/bin/find * -type d -exec /bin/mkdir -p "/compat/linux/{}" \;
cd /usr/ports/x11-fonts/linux-f10-fontconfig/work && /usr/bin/find * ! -type d | /usr/bin/cpio -pm -R root:wheel /compat/linux
617 blocks
===> Running linux ldconfig
/compat/linux/sbin/ldconfig -r /compat/linux
===> Registering installation for linux-f10-fontconfig-2.6.0
===> Cleaning for linux-f10-fontconfig-2.6.0
3. Install linux-xorg-libs
# cd /usr/ports/x11/linux-f10-xorg-libs; make install clean;
===> Extracting for linux-f10-xorg-libs-7.4_1
=> MD5 Checksum OK for rpm/i386/fedora/10/libICE-1.0.4-4.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libICE-1.0.4-4.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libFS-1.0.1-2.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libFS-1.0.1-2.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libSM-1.1.0-2.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libSM-1.1.0-2.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libX11-1.1.5-4.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libX11-1.1.5-4.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXScrnSaver-1.1.3-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXScrnSaver-1.1.3-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXTrap-1.0.0-6.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXTrap-1.0.0-6.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXau-1.0.4-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXau-1.0.4-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXaw-1.0.4-3.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXaw-1.0.4-3.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXcomposite-0.4.0-5.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXcomposite-0.4.0-5.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXcursor-1.1.9-3.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXcursor-1.1.9-3.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXdamage-1.1.1-4.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXdamage-1.1.1-4.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXdmcp-1.0.2-6.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXdmcp-1.0.2-6.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXevie-1.0.2-4.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXevie-1.0.2-4.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXext-1.0.4-1.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXext-1.0.4-1.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXfixes-4.0.3-4.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXfixes-4.0.3-4.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXfont-1.3.3-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXfont-1.3.3-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXft-2.1.13-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXft-2.1.13-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXi-1.1.3-4.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXi-1.1.3-4.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXinerama-1.0.3-2.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXinerama-1.0.3-2.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXmu-1.0.4-1.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXmu-1.0.4-1.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXp-1.0.0-11.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXp-1.0.0-11.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXpm-3.5.7-4.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXpm-3.5.7-4.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXrandr-1.2.3-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXrandr-1.2.3-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXrender-0.9.4-3.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXrender-0.9.4-3.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXres-1.0.3-5.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXres-1.0.3-5.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXt-1.0.5-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXt-1.0.5-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXtst-1.0.3-3.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXtst-1.0.3-3.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXv-1.0.4-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXv-1.0.4-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXvMC-1.0.4-5.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXvMC-1.0.4-5.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXxf86dga-1.0.2-3.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXxf86dga-1.0.2-3.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXxf86misc-1.0.1-6.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXxf86misc-1.0.1-6.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libXxf86vm-1.0.2-1.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libXxf86vm-1.0.2-1.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libfontenc-1.0.4-6.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libfontenc-1.0.4-6.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libxcb-1.1.91-5.fc10.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libxcb-1.1.91-5.fc10.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/libxkbfile-1.0.4-5.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/libxkbfile-1.0.4-5.fc9.i386.rpm.
=> MD5 Checksum OK for rpm/i386/fedora/10/mesa-libGLw-6.5.1-5.fc9.i386.rpm.
=> SHA256 Checksum OK for rpm/i386/fedora/10/mesa-libGLw-6.5.1-5.fc9.i386.rpm.
===> linux-f10-xorg-libs-7.4_1 depends on file: /usr/local/bin/rpm2cpio - found
===> Patching for linux-f10-xorg-libs-7.4_1
===> Configuring for linux-f10-xorg-libs-7.4_1
===> Installing for linux-f10-xorg-libs-7.4_1
===> linux-f10-xorg-libs-7.4_1 depends on file: /compat/linux/etc/fedora-release - found
===> linux-f10-xorg-libs-7.4_1 depends on file: /compat/linux/lib/libexpat.so.1 - found
===> linux-f10-xorg-libs-7.4_1 depends on file: /compat/linux/usr/lib/libfontconfig.so.1.3.0 - found
===> Generating temporary packing list
===> Checking if x11/linux-f10-xorg-libs already installed
cd /usr/ports/x11/linux-f10-xorg-libs/work && /usr/bin/find * -type d -exec /bin/mkdir -p "/compat/linux/{}" \;
cd /usr/ports/x11/linux-f10-xorg-libs/work && /usr/bin/find * ! -type d | /usr/bin/cpio -pm -R root:wheel /compat/linux
12139 blocks
===> Running linux ldconfig
/compat/linux/sbin/ldconfig -r /compat/linux
===> Registering installation for linux-f10-xorg-libs-7.4_1
===> SECURITY REPORT:
This port has installed the following files which may act as network
servers and may therefore pose a remote security risk to the system.
/compat/linux/usr/lib/libICE.so.6.3.0
/compat/linux/usr/lib/libXdmcp.so.6.0.0
If there are vulnerabilities in these programs there may be a security
risk to the system. FreeBSD makes no guarantee about the security of
ports included in the Ports Collection. Please type 'make deinstall'
to deinstall the port if this is a concern.
For more information, and contact details about the security
status of this software, see the following webpage:
http://x.org
===> Cleaning for linux-f10-xorg-libs-7.4_1
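The three port builds above follow the same pattern, so they can be run in one loop that stops at the first failure. This is only a sketch: install_linux_ports and the DRY_RUN variable are names invented here, and the port paths are simply those used in steps 1 to 3.

```shell
#!/bin/sh
# Port directories from steps 1-3, in dependency order: per the build logs,
# fontconfig depends on libexpat, and xorg-libs depends on both.
PORTS="textproc/linux-f10-expat x11-fonts/linux-f10-fontconfig x11/linux-f10-xorg-libs"

# install_linux_ports
# Builds and installs each port, stopping at the first error.
# With DRY_RUN=1 it only prints the commands it would run.
install_linux_ports() {
    for p in $PORTS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "cd /usr/ports/$p && make install clean"
        else
            (cd "/usr/ports/$p" && make install clean) || return 1
        fi
    done
}
```

Running it once with DRY_RUN=1 shows exactly what will be executed before anything is changed on the system.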
4. Install the cwttf Chinese fonts
# wget http://cle.linux.org.tw/fonts/cwttf/cwttf-v1.0.tar.gz
# cp * /usr/local/lib/X11/fonts/TTF
# fc-cache -f -v
/usr/local/lib/X11/fonts: caching, new cache contents: 0 fonts, 12 dirs
/usr/local/lib/X11/fonts/100dpi: caching, new cache contents: 398 fonts, 0 dirs
/usr/local/lib/X11/fonts/75dpi: caching, new cache contents: 398 fonts, 0 dirs
/usr/local/lib/X11/fonts/OTF: caching, new cache contents: 23 fonts, 0 dirs
/usr/local/lib/X11/fonts/TTF: caching, new cache contents: 31 fonts, 0 dirs
/usr/local/lib/X11/fonts/Type1: caching, new cache contents: 29 fonts, 0 dirs
/usr/local/lib/X11/fonts/bitstream-vera: caching, new cache contents: 10 fonts, 0 dirs
/usr/local/lib/X11/fonts/cyrillic: caching, new cache contents: 0 fonts, 0 dirs
/usr/local/lib/X11/fonts/encodings: caching, new cache contents: 0 fonts, 1 dirs
/usr/local/lib/X11/fonts/encodings/large: caching, new cache contents: 0 fonts, 0 dirs
/usr/local/lib/X11/fonts/lfpfonts-fix: caching, new cache contents: 71 fonts, 0 dirs
/usr/local/lib/X11/fonts/local: caching, new cache contents: 2 fonts, 0 dirs
/usr/local/lib/X11/fonts/misc: caching, new cache contents: 59 fonts, 0 dirs
/usr/local/lib/X11/fonts/util: caching, new cache contents: 0 fonts, 0 dirs
/root/.fonts: skipping, no such directory
/var/db/fontconfig: cleaning cache directory
/root/.fontconfig: not cleaning non-existent cache directory
fc-cache: succeeded
# fc-list :lang=zh-tw
文鼎PL中楷,AR PL KaitiM Big5:style=Regular
AR PL UMing TW:style=Light
AR PL UMing HK:style=Light
cwTeX 粗黑體,cwTeXHeiBold:style=Medium
AR PL UMing CN:style=Light
文鼎PL新宋,AR PL New Sung:style=Regular
AR PL UKai TW MBE:style=Book
cwTeX 仿宋體,cwTeXFangSong:style=Medium
cwTeX 明體,cwTeXMing:style=Medium
AR PL UKai CN:style=Book
AR PL UKai HK:style=Book
cwTeX 楷書,cwTeXKai:style=Medium
AR PL UKai TW:style=Book
文鼎PL細上海宋,AR PL Mingti2L Big5:style=Regular,Reguler
AR PL UMing TW MBE:style=Light
cwTeX 圓體,cwTeXYen:style=Medium
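The listing above goes straight from wget to cp, so the archive presumably has to be unpacked in between. The following is a hedged sketch of that missing step: it assumes the tarball contains .ttf files somewhere inside (its layout is not shown in the original), and install_ttf_fonts is a hypothetical helper name.

```shell
#!/bin/sh
# install_ttf_fonts SRC_DIR DEST_DIR
# Copies every .ttf/.TTF file found under SRC_DIR into DEST_DIR, then
# refreshes the fontconfig cache for DEST_DIR if fc-cache is available.
install_ttf_fonts() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    find "$src" -type f \( -name '*.ttf' -o -name '*.TTF' \) \
        -exec cp {} "$dest"/ \; || return 1
    if command -v fc-cache >/dev/null 2>&1; then
        # A cache refresh failure is reported but does not undo the copy.
        fc-cache -f "$dest" || echo "warning: fc-cache failed" >&2
    fi
}
```

After unpacking cwttf-v1.0.tar.gz into a scratch directory with tar -xzf, call install_ttf_fonts with that directory and /usr/local/lib/X11/fonts/TTF, then confirm the result with fc-list :lang=zh-tw.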
5. Download the wkhtmltopdf Linux static binary (i386)
# wget http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.10.0_beta4-static-i386.tar.bz2
--2010-08-03 20:13:15-- http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.10.0_beta4-static-i386.tar.bz2
Resolving wkhtmltopdf.googlecode.com... 64.233.183.82
Connecting to wkhtmltopdf.googlecode.com|64.233.183.82|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11712708 (11M) [application/x-bzip2]
Saving to: `wkhtmltopdf-0.10.0_beta4-static-i386.tar.bz2'
100%[======================================>] 11,712,708 1.08M/s in 13s
2010-08-03 20:13:29 (881 KB/s) - `wkhtmltopdf-0.10.0_beta4-static-i386.tar.bz2' saved [11712708/11712708]
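The original does not show the unpacking step. A minimal sketch, assuming the archive unpacks to the wkhtmltopdf-i386 binary used in the next step: first compare the file size with the length wget reported (11712708 bytes), then extract. check_size and unpack_wkhtmltopdf are hypothetical helper names.

```shell
#!/bin/sh
# check_size FILE EXPECTED_BYTES
# Succeeds only if FILE exists and is exactly EXPECTED_BYTES long.
check_size() {
    [ -f "$1" ] || return 1
    actual=$(wc -c < "$1" | tr -d ' ')
    [ "$actual" = "$2" ]
}

# unpack_wkhtmltopdf ARCHIVE
# Verifies the download against the size from the wget log above,
# then unpacks the bzip2-compressed tarball into the current directory.
unpack_wkhtmltopdf() {
    check_size "$1" 11712708 || { echo "size mismatch: $1" >&2; return 1; }
    tar -xjf "$1"
}
```

A size check only catches truncated downloads; it is not a substitute for a checksum, which the original post does not provide.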
6. Run it
# ./wkhtmltopdf-i386
You need to specify atleast one input file, and exactly one output file
Use - for stdin or stdout
Name:
  wkhtmltopdf 0.10.0 beta4
Synopsis:
  wkhtmltopdf [GLOBAL OPTION]... [OBJECT]...
example:
# ./wkhtmltopdf-i386 http://tw.yahoo.com/ test.pdf
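To capture several pages in one go, the example above can be wrapped in a loop. The sketch below derives an output file name from each URL. url_to_pdf_name and capture_urls are hypothetical names, and WKHTMLTOPDF defaults to the binary path used above.

```shell
#!/bin/sh
# Path to the binary; override by setting WKHTMLTOPDF beforehand.
WKHTMLTOPDF="${WKHTMLTOPDF:-./wkhtmltopdf-i386}"

# url_to_pdf_name URL
# Strips the scheme and trailing slashes, replaces any character outside
# [A-Za-z0-9._-] with "_", and appends ".pdf".
url_to_pdf_name() {
    name=$(printf '%s' "$1" |
        sed -e 's|^[a-zA-Z]*://||' -e 's|/*$||' -e 's|[^A-Za-z0-9._-]|_|g')
    printf '%s.pdf\n' "$name"
}

# capture_urls URL...
# Converts each URL to a PDF in the current directory and keeps going
# after a failure, reporting the failed URL on stderr.
capture_urls() {
    for url in "$@"; do
        "$WKHTMLTOPDF" "$url" "$(url_to_pdf_name "$url")" ||
            echo "failed: $url" >&2
    done
}
```

For instance, capture_urls http://tw.yahoo.com/ would write tw.yahoo.com.pdf.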

Update, 2011-03-04:
To produce a TIFF suitable for faxing, combine it with ImageMagick:
wkhtmltoimage-i386 test.html test.png; convert test.png -colorspace HWB -monochrome -compress Fax test.tif
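The one-liner above runs convert even if the screenshot step fails. A slightly more defensive sketch chains the two steps with && and removes the intermediate PNG afterwards. html_to_fax_tif is a hypothetical name, the tool paths are overridable assumptions, and the convert options are taken unchanged from the command above.

```shell
#!/bin/sh
# Tool names; override by setting these variables beforehand.
WKHTMLTOIMAGE="${WKHTMLTOIMAGE:-wkhtmltoimage-i386}"
CONVERT="${CONVERT:-convert}"

# html_to_fax_tif INPUT OUTPUT.tif
# Renders INPUT to a temporary PNG next to OUTPUT, converts it to a
# fax-compressed monochrome TIFF, and always deletes the PNG.
html_to_fax_tif() {
    png="${2%.tif}.png"
    "$WKHTMLTOIMAGE" "$1" "$png" &&
        "$CONVERT" "$png" -colorspace HWB -monochrome -compress Fax "$2"
    status=$?
    rm -f "$png"
    return "$status"
}
```

For example, html_to_fax_tif test.html test.tif reproduces the command above, but returns a non-zero status if either step fails.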

November 10th, 2010 on 06:08:17
Hi,
First of all, thanks for this great install tutorial.
Installing the same ports in PCBSD8.1 (PCBSD8.1 = FreeBSD8.1 + KDE) doesn’t allow me to run wkhtmltopdf.
I got the following error when running the Linux statically compiled version "wkhtmltopdf-0.10.0-beta5" (i386):
# file ./wkhtmltopdf-i386
./wkhtmltopdf-i386: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, stripped
This is how it fails:
# brandelf -t Linux wkhtmltopdf-i386
# ./wkhtmltopdf-i386
PROT_EXEC|PROT_WRITE failed.
# truss ./wkhtmltopdf-i386
linux_mmap(0xbfbfed28,0x1000,0xc01000,0x16b6bc6,0x0,0x6) ERR#22 'Invalid argument'
write(2,"PROT_EXEC|PROT_WRITE failed.\n",29) ERR#9 'Bad file descriptor'
process exit, rval = 127
# kdump
29785 ktrace RET ktrace 0
29785 ktrace CALL execve(0xbfbfee1b,0xbfbfed04,0xbfbfed0c)
29785 ktrace NAMI "./wkhtmltopdf-i386"
29785 wkhtmltopdf-i386 RET execve 0
29785 wkhtmltopdf-i386 CALL dup2(0xbfbfed3c)
29785 wkhtmltopdf-i386 RET dup2 -1 errno 22 Invalid argument
29785 wkhtmltopdf-i386 CALL write(0x2,0x16b6b4e,0x1d)
29785 wkhtmltopdf-i386 GIO fd 2 wrote 29 bytes
"PROT_EXEC|PROT_WRITE failed. "
29785 wkhtmltopdf-i386 RET write 29/0x1d
29785 wkhtmltopdf-i386 CALL exit(0x7f)
Help is appreciated!
Thanks in advance
Zabby
November 16th, 2010 on 14:55:01
Same as you, only on FreeBSD 8.1...
I'm waiting for a fix...