TeX4ht and fontspec sample

Michal Hoftich

June 6, 2013

1 Introduction

This is a sample document that uses fontspec, converted to HTML using TeX4ht. It can be converted with either htxelatex or htlualatex.

Normally, TeX4ht doesn’t support OpenType fonts used with fontspec, because it depends on the tfm files that accompany traditional TeX fonts. This package tries to patch fontspec so that it does not load OpenType fonts. A better way would be to add OpenType support to the TeX4ht binary, but that probably won’t happen in the near future, so I tested whether this approach is feasible.

Because we suppress font processing, we must take care of input and output encoding ourselves. For UTF-8 support, we load configuration files depending on the script attributes of the font-declaring commands. The script loaded by default is Latin, which should support all diacritics used in languages written in the Latin script:

Příliš žluťoučký kůň úpěl ďábelské ódy

Another supported script is Greek:

ελληνική γλώσσα

All fonts with different scripts must be defined in the preamble:

\newfontfamily\greekfont[Script=Greek]{Gentium}  
\begin{document}  
Some text  
...  
\greekfont  
ελληνική γλώσσα

2 Installation

This is just a proof of concept for now, so it may not be a good idea to install this package into your local texmf tree. Just run these commands in some directory:

git clone git@github.com:michal-h21/fontspec.git  
cd fontspec  
pdftex fontspec.dtx

3 Unicode support

To add support for a new script, go to http://www.utf8-chartable.de/unicode-utf8-table.pl, select the Unicode block containing the script of interest, select the decimal display format for the UTF-8 encoding, and copy the resulting table to a file. Then run the included script f4ht-utftable.lua:

texlua f4ht-utftable.lua < filewithtable > utf-scriptname-4ht.tex

The f4ht-utftable.lua script has a command-line switch e, which outputs HTML entities instead of direct UTF-8 values. For complex scripts, such as Devanagari or other Asian scripts, this option is the better choice.
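To make the two representations concrete, here is a small Python sketch. It is not the code of f4ht-utftable.lua and does not reproduce its output format. It only shows, for a single character, what "decimal UTF-8 values" and a numeric HTML entity look like. The function names utf8_decimal and html_entity are hypothetical and chosen for this illustration only.

```python
# Illustration only: NOT the logic or output format of f4ht-utftable.lua.
# The helper names below are hypothetical.

def utf8_decimal(ch: str) -> str:
    """Return the UTF-8 bytes of one character as space-separated decimals."""
    return " ".join(str(byte) for byte in ch.encode("utf-8"))


def html_entity(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"


if __name__ == "__main__":
    # One Latin character with a diacritic and one Greek character.
    for ch in "řα":
        print(ch, "| UTF-8 bytes:", utf8_decimal(ch), "| entity:", html_entity(ch))
```

For example, ř (U+0159) is stored as the two bytes 197 153 in UTF-8, while its numeric entity is &#345;.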

See the file utf-latin-4ht.tsv for an example input file and utf-latin-4ht.tex for the resulting output file.

4 Issues