Well, today I also got to tackle the compression of the fonts in SynPdf.
In my previous post I thought I had found a possible solution with the TTFCFP_FLAGS_COMPRESS flag. But it turns out I put that flag in the wrong place: I passed it in usSubsetFormat and not in usFlags, where it belongs. In usSubsetFormat the value $0002 acts as TTFCFP_DELTA, which creates a delta (incremental) font package. It did show, however, how much extra data was getting into the PDF, because the actual characters make up just 15KB (instead of 60KB).
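To make the mix-up concrete, here is a small sketch of the two constant families involved. The values are the ones from the Windows SDK header fontsub.h; the program itself is only an illustration and does not call CreateFontPackage.

```pascal
program FontSubFlags;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // usFlags is a bit mask
  TTFCFP_FLAGS_SUBSET    = $0001;
  TTFCFP_FLAGS_COMPRESS  = $0002;
  TTFCFP_FLAGS_TTC       = $0004;
  TTFCFP_FLAGS_GLYPHLIST = $0008;
  // usSubsetFormat is a separate enumeration, not a bit mask
  TTFCFP_SUBSET  = 0;
  TTFCFP_SUBSET1 = 1;
  TTFCFP_DELTA   = 2;

begin
  // The same number means something different in each parameter
  WriteLn('usFlags:        TTFCFP_FLAGS_COMPRESS = ', TTFCFP_FLAGS_COMPRESS);
  WriteLn('usSubsetFormat: TTFCFP_DELTA          = ', TTFCFP_DELTA);
end.
```

Both lines print 2, which is why the compiler could not catch the flag landing in the wrong parameter.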
So I went to investigate the TTF data that CreateFontPackage returns for Segoe Script. It contains A LOT of garbage: copyright notices, certificates, etc. I don't think these are really needed for embedding. So I needed to see what makes a TTF tick.
As I see it now, TTF fonts are made up of tables. This is the table directory of the CreateFontPackage result for the Segoe Script subset ("Hello World"):
==== Table directory
Version: 1.0, number of tables: 21
Name    Offset  Length
DSIG    160736    7620
GDEF     79464      90
GPOS     79556   26766
GSUB    106324   54340
LTSH      5800    1944
OS/2       472      96
cmap     54408     608
cvt      57892     476
fpgm     55016    2384
gasp     79448      16
glyf     58368   14488
hdmx      7744   46664
head       348      54
hhea       404      36
hmtx       568    5232
loca     72856    3882
maxp       440      32
meta    160664      72
name     76740    2675
post     79416      32
prep     57400     490
For embedding a font in a PDF we actually only need the following 10 tables:
cvt (476), fpgm (2384), prep (490), head (54), hhea (36), maxp (32), hmtx (5232), cmap (608), loca (3882) and glyf (14488).
That's 27,682 bytes of table data instead of 167,997 (before compression). I got this information from here. (I hope that is all correct.)
I think 'name' and 'post' are not required for embedding. They ARE required for saving a .ttf file (but we are not doing that).
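The two byte totals can be checked against the table directory listed earlier. The following is a minimal sketch that just adds up the lengths; the record type and the Keep flag are names made up for this illustration.

```pascal
program TableSizes;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TTableInfo = record // hypothetical helper record
    Tag: string;
    Len: Integer;
    Keep: Boolean; // True for the 10 tables kept for embedding
  end;

const
  // Tags and lengths copied from the table directory listing
  Tables: array [0 .. 20] of TTableInfo = (
    (Tag: 'DSIG'; Len: 7620;  Keep: False),
    (Tag: 'GDEF'; Len: 90;    Keep: False),
    (Tag: 'GPOS'; Len: 26766; Keep: False),
    (Tag: 'GSUB'; Len: 54340; Keep: False),
    (Tag: 'LTSH'; Len: 1944;  Keep: False),
    (Tag: 'OS/2'; Len: 96;    Keep: False),
    (Tag: 'cmap'; Len: 608;   Keep: True),
    (Tag: 'cvt '; Len: 476;   Keep: True),
    (Tag: 'fpgm'; Len: 2384;  Keep: True),
    (Tag: 'gasp'; Len: 16;    Keep: False),
    (Tag: 'glyf'; Len: 14488; Keep: True),
    (Tag: 'hdmx'; Len: 46664; Keep: False),
    (Tag: 'head'; Len: 54;    Keep: True),
    (Tag: 'hhea'; Len: 36;    Keep: True),
    (Tag: 'hmtx'; Len: 5232;  Keep: True),
    (Tag: 'loca'; Len: 3882;  Keep: True),
    (Tag: 'maxp'; Len: 32;    Keep: True),
    (Tag: 'meta'; Len: 72;    Keep: False),
    (Tag: 'name'; Len: 2675;  Keep: False),
    (Tag: 'post'; Len: 32;    Keep: False),
    (Tag: 'prep'; Len: 490;   Keep: True)
  );

var
  i, Kept, Total: Integer;
begin
  Kept := 0;
  Total := 0;
  for i := Low(Tables) to High(Tables) do
  begin
    Inc(Total, Tables[i].Len);
    if Tables[i].Keep then
      Inc(Kept, Tables[i].Len);
  end;
  WriteLn('kept:  ', Kept);  // 27682
  WriteLn('total: ', Total); // 167997
end.
```

These are sums of the table lengths only; the table directory itself and the alignment padding come on top in the actual file.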
So... let's go stripping. We don't need to do the subsetting ourselves because the CreateFontPackage has already done that. We just need to remove the unwanted tables.
This is my final result. Calling code:
// subset was created successfully -> save to PDF file
SetString(TTF, SubSetData, SubSetSize);
FreeMem(SubSetData);
if fDoc.fEmbeddedSubsetCleanup then // I added this one to the interface
  CleanUpSubsetTTFTables(TTF);
// this is from the other topic to mark the fonts as subset correctly
Prefix := '';
if System.RandSeed = 0 then
  Randomize; // only call when needed
for i := 1 to 6 do
  Prefix := Prefix + Chr(65 + Random(26));
Prefix := Prefix + '+';
if fFontDescriptor.ValueByName('FontName') <> nil then
  TPdfName(fFontDescriptor.ValueByName('FontName')).Value := Prefix + TPdfName(fFontDescriptor.ValueByName('FontName')).Value;
if Data.ValueByName('BaseFont') <> nil then
  TPdfName(Data.ValueByName('BaseFont')).Value := Prefix + TPdfName(Data.ValueByName('BaseFont')).Value;
The CleanUpSubsetTTFTables looks like this. It takes the TTF string, reads it into a TMemoryStream and only outputs the tables we actually want back into TTF.
// ============================================
type
  TByte2 = array [0 .. 1] of byte; // 16-bit
  TByte4 = array [0 .. 3] of byte; // 32-bit

// TTF stores all numbers big-endian, so convert byte by byte
function WordToBytes(const Data: word): TByte2;
begin
  Result[0] := (Data shr 8) and 255;
  Result[1] := Data and 255;
end;

function CardinalToBytes(const Data: Cardinal): TByte4;
begin
  Result[0] := (Data shr 24) and 255;
  Result[1] := (Data shr 16) and 255;
  Result[2] := (Data shr 8) and 255;
  Result[3] := Data and 255;
end;

function BytesToWord(const Data: TByte2): word;
begin
  Result := (Data[0] * 256) + Data[1];
end;

function BytesToCardinal(const Data: TByte4): Cardinal;
begin
  Result := (Data[0] * 16777216) + (Data[1] * 65536) + (Data[2] * 256) + Data[3];
end;
type
  recTableDirectory = record
    sfntVersion: TByte4; // 0x00010000 for version 1.0
    numTables: TByte2; // number of tables
    searchRange: TByte2; // (maximum power of 2 <= NumTables) x 16
    entrySelector: TByte2; // Log2(maximum power of 2 <= NumTables)
    rangeShift: TByte2; // NumTables x 16 - SearchRange
  end;

  recTableEntry = record
    Tag: array [0 .. 3] of AnsiChar; // table identifier
    CheckSum: TByte4; // checksum for this table
    offset: TByte4; // offset from start of font file
    length: TByte4; // length of this table
  end;

  recTableData = TBytes;
procedure CleanUpSubsetTTFTables(var TTF: PDFString);
const
  TablesWeWant: array [0 .. 9] of AnsiString =
    ('cvt ', 'fpgm', 'prep', 'head', 'hhea', 'maxp', 'hmtx', 'cmap', 'loca', 'glyf');
  // 'name', 'post' are not needed for embedding, they are needed for a .ttf file
var
  Input: TMemoryStream;
  Output: TMemoryStream;
  TD: recTableDirectory;
  FontEntries: array of recTableEntry;
  FontData: array of recTableData;
  numTables: word;
  i, j: integer;
  Off, Len: Cardinal;
begin
  Input := TMemoryStream.Create;
  Output := TMemoryStream.Create;
  try
    // load the subset font and read the header plus all directory entries
    Input.Write(TTF[1], length(TTF));
    Input.Position := 0;
    Input.Read(TD, SizeOf(TD));
    numTables := BytesToWord(TD.numTables);
    SetLength(FontEntries, numTables);
    SetLength(FontData, numTables);
    Input.Read(FontEntries[0], numTables * SizeOf(recTableEntry));
    // read the raw bytes of every table
    for i := 0 to numTables - 1 do
    begin
      Off := BytesToCardinal(FontEntries[i].offset);
      Len := BytesToCardinal(FontEntries[i].length);
      Input.Position := Off;
      SetLength(FontData[i], Len);
      if Len > 0 then
        Input.Read(FontData[i][0], Len); // pass the first byte, not the dynamic-array variable
    end;
    // drop every table that is not in TablesWeWant
    for i := numTables - 1 downto 0 do
    begin
      if not MatchStr(FontEntries[i].Tag, TablesWeWant) then
      begin
        for j := i + 1 to numTables - 1 do FontEntries[j - 1] := FontEntries[j];
        for j := i + 1 to numTables - 1 do FontData[j - 1] := FontData[j];
        dec(numTables);
        SetLength(FontEntries, numTables);
        SetLength(FontData, numTables);
      end;
    end;
    // write the remaining tables after the space reserved for header and directory
    Output.Position := SizeOf(TD) + numTables * SizeOf(recTableEntry); // always on 4 byte boundary
    for i := 0 to numTables - 1 do
    begin
      Off := Output.Position;
      FontEntries[i].offset := CardinalToBytes(Off);
      Len := BytesToCardinal(FontEntries[i].length);
      if Len > 0 then
        Output.Write(FontData[i][0], Len);
      Off := 0;
      while (Output.Position mod 4 <> 0) do Output.Write(Off, 1); // align on 4 bytes boundary
    end;
    // fill in header and directory; searchRange, entrySelector and rangeShift are left unchanged
    TD.numTables := WordToBytes(numTables);
    System.Move(TD, (PByte(Output.Memory))^, SizeOf(TD));
    System.Move(FontEntries[0], (PByte(Output.Memory) + SizeOf(TD))^, numTables * SizeOf(recTableEntry));
    SetString(TTF, PAnsiChar(Output.Memory), Output.size);
  finally
    Output.Free;
    Input.Free;
  end;
end;
// ============================================
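The routine above updates numTables but keeps the old searchRange, entrySelector and rangeShift values in the header. Following the formulas in the recTableDirectory comments, they could be recomputed roughly like this. It is only a sketch: CalcSearchFields is a hypothetical helper name, and it assumes at least one table remains.

```pascal
program SfntHeaderFields;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: derives the three binary-search fields of the
// sfnt header from the table count (assumes NumTables >= 1)
procedure CalcSearchFields(NumTables: Word;
  var SearchRange, EntrySelector, RangeShift: Word);
var
  Pow2: Word;
begin
  Pow2 := 1;
  EntrySelector := 0;
  // find the largest power of 2 that is <= NumTables
  while Pow2 * 2 <= NumTables do
  begin
    Pow2 := Pow2 * 2;
    Inc(EntrySelector); // ends up as Log2(Pow2)
  end;
  SearchRange := Pow2 * 16;
  RangeShift := NumTables * 16 - SearchRange;
end;

var
  SR, ES, RS: Word;
begin
  CalcSearchFields(10, SR, ES, RS); // the 10 tables kept for embedding
  WriteLn(SR, ' ', ES, ' ', RS);    // 128 3 32
  CalcSearchFields(21, SR, ES, RS); // the original 21 tables
  WriteLn(SR, ' ', ES, ' ', RS);    // 256 4 80
end.
```

The results would still have to be stored big-endian, for example with WordToBytes, before the header is written.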
Testing code:
procedure MakePdfSynPdf;
var
  FileTemp: string;
  Doc: TPdfDocumentGDI;
  // Page: TPdfPage;
begin
  // if CheckC39 then; // For testing I installed this font for current users
  // first run: subset only, no table cleanup
  FileTemp := 'C:\Temp\Test2.pdf';
  Doc := TPdfDocumentGDI.Create;
  try
    Doc.GeneratePDF15File := true; // smaller
    Doc.EmbeddedTTF := true;
    Doc.EmbeddedTTFIgnore.Text := MSWINDOWS_DEFAULT_FONTS;
    Doc.EmbeddedWholeTTF := false;
    Doc.EmbeddedSubsetCleanup := false;
    Doc.Root.PageLayout := plSinglePage;
    Doc.NewDoc;
    { Page := } Doc.AddPage;
    Doc.VCLCanvas.TextOut(40, 40, 'Test1');
    Doc.VCLCanvas.TextOut(60, 60, 'Test2');
    Doc.VCLCanvas.Font.Name := 'Code 3 de 9';
    Doc.VCLCanvas.Font.size := 24;
    Doc.VCLCanvas.TextOut(80, 80, '*123456789*'); // blocks
    Doc.VCLCanvas.Font.Name := 'Code 128';
    Doc.VCLCanvas.Font.size := 24;
    Doc.VCLCanvas.TextOut(120, 120, '*123456789*'); // blocks
    Doc.VCLCanvas.Font.Name := 'KIX Barcode';
    Doc.VCLCanvas.Font.size := 12;
    Doc.VCLCanvas.TextOut(160, 160, '5569LB33'); // correct
    Doc.VCLCanvas.Font.Name := 'Segoe Script';
    Doc.VCLCanvas.Font.size := 14;
    Doc.VCLCanvas.TextOut(190, 190, 'Hello World'); // correct
    Doc.SaveToFile(FileTemp);
    // ExecAssociatedApp(FileTemp);
  finally
    Doc.Free;
  end;
  // second run: same content, with table cleanup enabled
  FileTemp := 'C:\Temp\Test3.pdf';
  Doc := TPdfDocumentGDI.Create;
  try
    Doc.GeneratePDF15File := true; // smaller
    Doc.EmbeddedTTF := true;
    Doc.EmbeddedTTFIgnore.Text := MSWINDOWS_DEFAULT_FONTS;
    Doc.EmbeddedWholeTTF := false;
    Doc.EmbeddedSubsetCleanup := true;
    Doc.Root.PageLayout := plSinglePage;
    Doc.NewDoc;
    { Page := } Doc.AddPage;
    Doc.VCLCanvas.TextOut(40, 40, 'Test1');
    Doc.VCLCanvas.TextOut(60, 60, 'Test2');
    Doc.VCLCanvas.Font.Name := 'Code 3 de 9';
    Doc.VCLCanvas.Font.size := 24;
    Doc.VCLCanvas.TextOut(80, 80, '*123456789*'); // blocks
    Doc.VCLCanvas.Font.Name := 'Code 128';
    Doc.VCLCanvas.Font.size := 24;
    Doc.VCLCanvas.TextOut(120, 120, '*123456789*'); // blocks
    Doc.VCLCanvas.Font.Name := 'KIX Barcode';
    Doc.VCLCanvas.Font.size := 12;
    Doc.VCLCanvas.TextOut(160, 160, '5569LB33'); // correct
    Doc.VCLCanvas.Font.Name := 'Segoe Script';
    Doc.VCLCanvas.Font.size := 14;
    Doc.VCLCanvas.TextOut(190, 190, 'Hello World'); // correct
    Doc.SaveToFile(FileTemp);
    // ExecAssociatedApp(FileTemp);
  finally
    Doc.Free;
  end;
end;
Result with EmbeddedSubsetCleanup false is 60KB (thanks to the fixed EmbeddedWholeTTF; otherwise it was 352KB).
Result with EmbeddedSubsetCleanup true is 15KB.
(Both PDFs seem to be correct in Adobe Reader.)
I hope this is all correct. It would need to be tested thoroughly (with multiple fonts), but with the EmbeddedSubsetCleanup option defaulting to false it couldn't hurt either.
This is great work!
I have refactored your proposal to include a proper trailing header, and also recompute the checksum.
It works just fine on my side.
Please try https://github.com/synopse/mORMot/commi … b00bf31efe
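For readers who have not met the table checksum before: the TrueType/OpenType specification defines it as the sum of the table's big-endian 32-bit words, modulo 2^32, with the last word zero-padded. A minimal sketch of that calculation follows; CalcTableChecksum is a name chosen for this illustration and is not taken from the linked commit.

```pascal
program TableChecksum;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$Q-} // overflow checks off: the sum is defined modulo 2^32
{$R-}

// Hypothetical helper: sums the data as big-endian 32-bit words,
// treating missing bytes of the last word as zero
function CalcTableChecksum(const Data: array of Byte): Cardinal;
var
  i: Integer;
  W: Cardinal;
begin
  Result := 0;
  i := 0;
  while i < Length(Data) do
  begin
    W := Cardinal(Data[i]) shl 24;
    if i + 1 < Length(Data) then W := W or (Cardinal(Data[i + 1]) shl 16);
    if i + 2 < Length(Data) then W := W or (Cardinal(Data[i + 2]) shl 8);
    if i + 3 < Length(Data) then W := W or Data[i + 3];
    Result := Result + W;
    Inc(i, 4);
  end;
end;

begin
  // words: $00000001 + $00000002 + $FF000000 (zero-padded) = $FF000003
  WriteLn(CalcTableChecksum([0, 0, 0, 1, 0, 0, 0, 2, $FF])); // 4278190083
end.
```

Removing whole tables does not change the bytes of the tables that remain, so their individual checksums stay valid; what changes is any checksum that covers the file as a whole.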
Yes. ReduceTTF does a good job as far as I can see for now.
Thanks.