#151 PDF Engine » Embedded fonts with Subset are not marked as Subset in PDF » 2022-04-28 14:23:32

rvk
Replies: 12

I'm trying to find out why SynPdf files don't show embedded subset fonts as Subset in Adobe Reader.
The pdffonts output from xpdf-tools shows that for Ghostscript (Test0.pdf) the fonts are embedded and also marked as subset and Unicode.

According to the official PDF specs, the name of an embedded subset font needs to be preceded by a tag of 6 arbitrary uppercase letters followed by a + sign.
The fonts in Test1.pdf from SynPdf are not marked as subset (and are also not embedded as Unicode).

5.5.3 Font Subsets
PDF 1.1 permits documents to include subsets of Type 1 and TrueType fonts. The font and font descriptor that describe a font subset are slightly different from those of ordinary fonts. These differences allow an application to recognize font subsets and to merge documents containing different subsets of the same font. (For more information on font descriptors, see Section 5.7, “Font Descriptors.”) For a font subset, the PostScript name of the font —the value of the font’s BaseFont entry and the font descriptor’s FontName entry— begins with a tag followed by a plus sign (+). The tag consists of exactly six uppercase letters; the choice of letters is arbitrary, but different subsets in the same PDF file must have different tags. For example, EOODIA+Poetica is the name of a subset of Poetica®, a Type 1 font. (See implementation note 63 in Appendix H.)

https://ghostscript.com/~robin/pdf_reference17.pdf

This one is from ghostscript:

S:\pdfs\xpdf-tools-win-4.02\bin32>pdffonts -loc c:\temp\Test0.pdf
name                                           type              emb sub uni prob object ID location
---------------------------------------------- ----------------- --- --- --- ---- --------- --------
RDZRPI+Code128                                 TrueType          yes yes yes          12  0 embedded
UFQSLH+KIXBarcode                              TrueType          yes yes yes          14  0 embedded
ZRSKVS+SegoeScript                             TrueType          yes yes yes          16  0 embedded
UFQSLH+Tahoma                                  TrueType          yes yes yes           8  0 embedded
RDZRPI+Code3de9                                TrueType          yes yes yes          10  0 embedded

This one is from SynPdf (note the "no" in the sub column; Adobe Reader also doesn't show the Subset keyword):

S:\pdfs\xpdf-tools-win-4.02\bin32>pdffonts -loc c:\temp\Test1.pdf
name                                           type              emb sub uni prob object ID location
---------------------------------------------- ----------------- --- --- --- ---- --------- --------
Tahoma                                         TrueType          no  no  no            6  0 external: C:\WINDOWS\Fonts\tahoma.ttf
Code3de9                                       TrueType          yes no  no            8  0 embedded
Code128                                        TrueType          yes no  no           10  0 embedded
KIXBarcode                                     TrueType          yes no  no           12  0 embedded
SegoeScript                                    TrueType          yes no  no           14  0 embedded

So I hacked the code a little to add the random characters. Of course the tags should not collide with those of other fonts in the same file, and I haven't implemented that check, but with this code the fonts do show as Subset.

var
    Prefix: AnsiString;
    i: Integer; // loop counter for building the 6-letter tag
//...
if CreateFontPackage(pointer(ttf),ttfSize,
    SubSetData,SubSetMem,SubSetSize,
    usFlags,ttcIndex,TTFMFP_SUBSET,0,
    TTFCFP_MS_PLATFORMID,TTFCFP_DONT_CARE,
    pointer(Used.Values),Used.Count,
    @lpfnAllocate,@lpfnReAllocate,@lpfnFree,nil)=0 then begin
  // subset was created successfully -> save to PDF file
  SetString(ttf,SubSetData,SubSetSize);
  FreeMem(SubSetData);

  // CleanUpSubsetTTFTables(TTF); // working on this, see future topic

  //---------
  Prefix := '';
  if System.RandSeed = 0 then Randomize; // only call when needed
  for i := 1 to 6 do Prefix := Prefix + Chr(65 + Random(26));
  Prefix := Prefix + '+';
  if fFontDescriptor.ValueByName('FontName') <> nil then
    TPdfName(fFontDescriptor.ValueByName('FontName')).Value := Prefix + TPdfName(fFontDescriptor.ValueByName('FontName')).Value;
  if Data.ValueByName('BaseFont') <> nil then
    TPdfName(Data.ValueByName('BaseFont')).Value := Prefix + TPdfName(Data.ValueByName('BaseFont')).Value;
  //---------

end;

Result (Adobe Reader also shows it correctly now):

S:\pdfs\xpdf-tools-win-4.02\bin32>pdffonts -loc "c:\temp\test3.pdf"
name                                           type              emb sub uni prob object ID location
---------------------------------------------- ----------------- --- --- --- ---- --------- --------
Tahoma                                         TrueType          no  no  no            6  0 external: C:\WINDOWS\Fonts\tahoma.ttf
OQRQQB+Code3de9                                TrueType          yes yes no            8  0 embedded
UXMNNE+Code128                                 TrueType          yes yes no           10  0 embedded
VXAITD+KIXBarcode                              TrueType          yes yes no           12  0 embedded
CAOSJQ+SegoeScript                             TrueType          yes yes no           14  0 embedded

I'm sure this bit of code can be much improved upon when officially integrated (or done in a completely different way) ;)
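The missing collision check could look roughly like the sketch below. It is not SynPdf code: NewSubsetPrefix and the UsedTags list are hypothetical names, and the assumption is that the caller keeps one such list per PDF document and has called Randomize once beforehand.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils, Classes;

// Hypothetical helper, not part of SynPdf: returns a tag of six uppercase
// letters followed by '+', making sure the tag is not yet in UsedTags,
// and records the new tag in that list.
function NewSubsetPrefix(UsedTags: TStrings): string;
var
  i: Integer;
  Tag: string;
begin
  repeat
    Tag := '';
    for i := 1 to 6 do
      Tag := Tag + Chr(Ord('A') + Random(26));
  until UsedTags.IndexOf(Tag) < 0; // retry until the tag is unused
  UsedTags.Add(Tag);
  Result := Tag + '+';
end;
```

The returned prefix would then be put in front of both the FontName and the BaseFont value, as in the hack above, so that both entries carry the same tag.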

#152 Re: PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-28 14:04:29

rvk

I have implemented some improvements for embedding a subset of a default Microsoft font. I saw that those fonts include a LOT of information that is not needed in a PDF, so the TTF data can be made much smaller. If I have some more time I'll post the cleaned-up changes/code in a separate topic.

I also made an effort to mark embedded subset fonts correctly in a PDF. According to the official documentation, subsets need 6 uppercase characters plus a + sign in front of the name. I'll also post this in a separate topic.

#153 Re: PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-27 17:59:24

rvk

Yikes,
According to the PDF Tahoma should be replaced and not EMBEDDED.
But if I use pdftops to extract the pdf I see that Tahoma is STILL embedded.
That's the reason the PDF is so large.

So even though the font is not supposed to be embedded (and is not used as embedded in the PDF), the font information is still inside the PDF (in the /sfnts part).

If I have some time tonight I'll find out why it's still in there.
Maybe it's an artifact from pdftops but I'll find out.

Edit: Probably a false alarm. With another stream debugger I see 4 fonts, of which SegoeScript is the largest one.
So Tahoma is indeed not embedded (and pdftops probably extracts the local Windows font into the ps).
You can ignore this post.

#154 Re: PDF Engine » Problem with Code 128 and Code2of5interleaved » 2022-04-27 17:16:17

rvk
SvenJ wrote:

but in the SynPDF version from 2022 both Fonts are not recognized as UNICODE and both Fonts displayed only as empty squares in any PDF Viewer.

I had the same issue. See topic https://synopse.info/forum/viewtopic.php?pid=37219
This has been fixed in revision 211.
Maybe it works for you now too.

#155 Re: PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-27 17:15:02

rvk

Yes. It works fine now with TTFCFP_DONT_CARE. Glad to hear using "DONT_CARE" is no problem.

Using PDF 1.5 gives a slightly smaller file but not much.
PDF 1.3 gives 61,69 KB while PDF 1.5 gives 59,76 KB
Ghostscript gives 17,28 KB for PDF 1.3.

But at least the subset embedding is much better now (the file was 352 KB).
Still, SynPDF gives "Embedded" without the "Subset" keyword after it. Not sure why that is.

If I get any ideas about further compression I will let you know ;)

(I'll also post a small note in the other topic to let Sven know.)

Thanks.

Edit: The pdffonts output from xpdf-tools shows that for Ghostscript (Test0.pdf) the fonts are embedded and also marked as subset and Unicode.
The names of subset fonts need to be preceded by 6 uppercase characters followed by a + sign.
The fonts in Test1.pdf from SynPdf are not marked as subset and are also not embedded as Unicode.
Maybe it doesn't really matter (it seems to work fine) but I reckon this is not strictly according to the rules ;)

S:\pdfs\xpdf-tools-win-4.02\bin32>pdffonts -loc c:\temp\Test0.pdf
Config Error: No display font for 'Symbol'
Config Error: No display font for 'ZapfDingbats'
name                                           type              emb sub uni prob object ID location
---------------------------------------------- ----------------- --- --- --- ---- --------- --------
RDZRPI+Code128                                 TrueType          yes yes yes          12  0 embedded
UFQSLH+KIXBarcode                              TrueType          yes yes yes          14  0 embedded
ZRSKVS+SegoeScript                             TrueType          yes yes yes          16  0 embedded
UFQSLH+Tahoma                                  TrueType          yes yes yes           8  0 embedded
RDZRPI+Code3de9                                TrueType          yes yes yes          10  0 embedded

S:\pdfs\xpdf-tools-win-4.02\bin32>pdffonts -loc c:\temp\Test1.pdf
Config Error: No display font for 'Symbol'
Config Error: No display font for 'ZapfDingbats'
name                                           type              emb sub uni prob object ID location
---------------------------------------------- ----------------- --- --- --- ---- --------- --------
Tahoma                                         TrueType          no  no  no            6  0 external: C:\WINDOWS\Fonts\tahoma.ttf
Code3de9                                       TrueType          yes no  no            8  0 embedded
Code128                                        TrueType          yes no  no           10  0 embedded
KIXBarcode                                     TrueType          yes no  no           12  0 embedded
SegoeScript                                    TrueType          yes no  no           14  0 embedded

Edit #2: Yes. Doing this
    FFontDescriptor.AddItem('FontName','ABCDEF+' + FName);
with an ugly hack again, the fonts are marked as Subset.
Of course they all need to be randomized and only for subsetted fonts but you get the point.

Is the fact that they are embedded as Ansi a problem?
(Not sure why the Ghostscript variant is encoded as "Built-in" or what built-in actually means.)

(I hope you don't get sick of me :) )

#156 Re: PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-27 08:56:18

rvk

BTW, the documentation says TTFCFP_FLAGS_COMPRESS isn't implemented yet.
But if I use TTFMFP_SUBSET or $0002 { TTFCFP_FLAGS_COMPRESS } the resulting PDF goes from 60KB down to 15KB.

So TTFCFP_FLAGS_COMPRESS is definitely implemented in Windows 10. The only question is whether this compression can be used in the PDF, because the result isn't entirely correct. But that's for another time ;)

YNPjyto.png

Edit: It's not TTFCFP_FLAGS_COMPRESS that is implemented (it still does nothing). I inadvertently put the value in the usSubsetFormat parameter, where $0002 stands for TTFCFP_DELTA. It does show that the added characters themselves do not take much space. It's the initial TTF data with copyright info, certificates, etc. that takes up so much space. Other generators strip this information (running the PDF through an optimizer does the same).
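To make the mix-up concrete: the same numeric value means different things in the two parameters. The sketch below lists the constant values as I recall them from Microsoft's fontsub.h (worth checking against the SDK header); SubsetFormatName is a hypothetical helper that exists only for illustration.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
const
  // Bit values for the usFlags parameter of CreateFontPackage
  // (values as recalled from fontsub.h, verify before relying on them)
  TTFCFP_FLAGS_SUBSET   = $0001;
  TTFCFP_FLAGS_COMPRESS = $0002;
  // Values for the usSubsetFormat parameter
  TTFCFP_SUBSET  = 0;
  TTFCFP_SUBSET1 = 1;
  TTFCFP_DELTA   = 2;

// Hypothetical helper: names the value as usSubsetFormat would interpret it.
function SubsetFormatName(usSubsetFormat: Word): string;
begin
  case usSubsetFormat of
    TTFCFP_SUBSET:  Result := 'TTFCFP_SUBSET';
    TTFCFP_SUBSET1: Result := 'TTFCFP_SUBSET1';
    TTFCFP_DELTA:   Result := 'TTFCFP_DELTA';
  else
    Result := 'unknown';
  end;
end;
```

With these values, an expression such as `0 or $0002` passed as usSubsetFormat is read as TTFCFP_DELTA, not as a compression flag, which matches the behaviour described in the edit above.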

#157 Re: PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-27 08:47:38

rvk

Calling CreateFontPackage() with the $FFFF { = TTFCFP_DONT_CARE } flag does fix it.

But that feels like a really ugly hack (although I'm not sure, because I don't know how important the exact value of that flag is).

if CreateFontPackage(pointer(ttf),ttfSize,
   SubSetData,SubSetMem,SubSetSize, usFlags,ttcIndex,TTFMFP_SUBSET,0,
   TTFCFP_MS_PLATFORMID, { TTFCFP_UNICODE_CHAR_SET } { TTFCFP_SYMBOL_CHAR_SET } $FFFF { = TTFCFP_DONT_CARE },
    pointer(Used.Values),Used.Count, @lpfnAllocate,@lpfnReAllocate,@lpfnFree,nil)=0 then begin

The result with EmbeddedWholeTTF true is 352KB. With EmbeddedWholeTTF false it is 62KB :)

(Still, the ghostscript version is 17KB)

#158 Re: PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-27 08:39:27

rvk

I may have found the problem (or partially).

In SynPdf.pas there is a call to CreateFontPackage().
But that call is always made with TTFCFP_UNICODE_CHAR_SET.

When I change this to TTFCFP_SYMBOL_CHAR_SET, Code128 and Code3de9 get embedded correctly.
But if I do that, KIXBarcode and SegoeScript are now embedded incorrectly.
So the code needs to check which kind of font is used in order to embed it correctly.

(I'm not sure how to do that :) )

It's also strange that Code128 and Code3de9 are not detected as 'symbol' fonts (whatever those are), but I'm not sure that matters as long as the correct flag is passed to CreateFontPackage().

Documentation for CreateFontPackage():
https://docs.microsoft.com/en-us/window … ontpackage
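One possible way to check the font, sketched under assumptions: in a TrueType file the 'cmap' table lists encoding records, and for the Microsoft platform (ID 3) encoding ID 0 means Symbol and 1 means Unicode BMP, which are the same numbers as TTFCFP_SYMBOL_CHAR_SET and TTFCFP_UNICODE_CHAR_SET. GuessSubsetEncoding below is a hypothetical helper, not SynPdf code; it only handles a single-font TTF image and returns TTFCFP_DONT_CARE whenever it is unsure.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  SysUtils;

const
  // Values for CreateFontPackage's usSubsetEncoding parameter
  TTFCFP_SYMBOL_CHAR_SET  = 0;
  TTFCFP_UNICODE_CHAR_SET = 1;
  TTFCFP_DONT_CARE        = $FFFF;

// Read big-endian 16-bit and 32-bit values from the font image.
function BE16(const B: TBytes; Ofs: Integer): Cardinal;
begin
  Result := (Cardinal(B[Ofs]) shl 8) or B[Ofs + 1];
end;

function BE32(const B: TBytes; Ofs: Integer): Cardinal;
begin
  Result := (BE16(B, Ofs) shl 16) or BE16(B, Ofs + 2);
end;

// Hypothetical helper: looks at the 'cmap' encoding records of platform 3
// and suggests which encoding value to pass to CreateFontPackage.
function GuessSubsetEncoding(const Ttf: TBytes): Word;
var
  NumTables, NumEnc, i, Rec, Cmap: Integer;
  HasUnicode, HasSymbol: Boolean;
begin
  Result := TTFCFP_DONT_CARE;
  if Length(Ttf) < 12 then Exit;
  // Table directory: 12-byte header followed by 16-byte table records
  NumTables := BE16(Ttf, 4);
  Cmap := -1;
  for i := 0 to NumTables - 1 do
  begin
    Rec := 12 + i * 16;
    if Rec + 16 > Length(Ttf) then Exit;
    if (Ttf[Rec] = Ord('c')) and (Ttf[Rec + 1] = Ord('m')) and
       (Ttf[Rec + 2] = Ord('a')) and (Ttf[Rec + 3] = Ord('p')) then
    begin
      Cmap := Integer(BE32(Ttf, Rec + 8)); // offset of the cmap table
      Break;
    end;
  end;
  if (Cmap < 0) or (Cmap + 4 > Length(Ttf)) then Exit;
  // cmap header: version, numTables, then 8-byte encoding records
  NumEnc := BE16(Ttf, Cmap + 2);
  HasUnicode := False;
  HasSymbol := False;
  for i := 0 to NumEnc - 1 do
  begin
    Rec := Cmap + 4 + i * 8;
    if Rec + 8 > Length(Ttf) then Exit;
    if BE16(Ttf, Rec) = 3 then // Microsoft platform
      case BE16(Ttf, Rec + 2) of
        0: HasSymbol := True;
        1: HasUnicode := True;
      end;
  end;
  if HasUnicode then
    Result := TTFCFP_UNICODE_CHAR_SET
  else if HasSymbol then
    Result := TTFCFP_SYMBOL_CHAR_SET;
end;
```

Whether this matches what CreateFontPackage expects for every font is an assumption; I have only reasoned it through for the cmap layout described above, not tested it against the barcode fonts.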

#159 PDF Engine » EmbeddedWholeTTF false with barcode fonts gives blocks » 2022-04-27 08:19:24

rvk
Replies: 7

I'm now trying to tackle the reason why I can't use EmbeddedWholeTTF := false when using barcode fonts.
I already found this topic but the suggested hack to set isSymbolFont to true for that font doesn't work for me.
https://synopse.info/forum/viewtopic.php?id=6197

(the topic was not concluded with a solution)

*) First one little question. Embedding partial fonts (even fonts that work correctly like 'Segoe Script') are marked as "Embedded" and Encoding "Ansi" in the PDF. When I create this PDF with another creator I get "Embedded Subset" and encoding "Built-in". Could it be that subset embedding and/or encoding 'flag' isn't set correctly?

*) Then the barcode font. When using EmbeddedWholeTTF := true it works perfectly. When using EmbeddedWholeTTF := false I get blocks for Code3de9 and Code128. KIXBarcode and SegoeScript work correctly (but as stated above, all are marked as "Embedded" without the word "Subset" and all have encoding "Ansi").

Where can I look to debug this problem? Embedding the barcode fonts fully doesn't add much to the PDF, but for SegoeScript it does. For which fonts should isSymbolFont be set (what is it used for)? (But as stated, forcing it to true doesn't work for me.)

(it somehow feels like the wrong letters are embedded.)

Used fonts: Tahoma (default in Windows), SegoeScript (default in Windows but not in EmbeddedTTFIgnore), Code128 and Code3de9, KIXBarcode.

procedure MakePdfSynPdf;
var
  FileTemp: string;
  Doc: TPdfDocumentGDI;
  Page: TPdfPage;
begin
  // if CheckC39 then; // For testing I installed this font for current users
  FileTemp := 'C:\Temp\Test2.pdf';
  Doc := TPdfDocumentGDI.Create;
  try
    Doc.AddTrueTypeFont('Code 3 de 9');
    Doc.EmbeddedTTF := true;
    Doc.EmbeddedTTFIgnore.Text := MSWINDOWS_DEFAULT_FONTS;

    Doc.EmbeddedWholeTTF := false; // why does this need to be true?

    Doc.Root.PageLayout := plSinglePage;
    Doc.NewDoc;
    Page := Doc.AddPage;
    Doc.VCLCanvas.TextOut(40, 40, 'Test1');
    Doc.VCLCanvas.TextOut(60, 60, 'Test2');

    Doc.VCLCanvas.Font.Name := 'Code 3 de 9';
    Doc.VCLCanvas.Font.size := 24;
    Doc.VCLCanvas.TextOut(80, 80, '*123456789*');    // blocks

    Doc.VCLCanvas.Font.Name := 'Code 128';
    Doc.VCLCanvas.Font.size := 24;
    Doc.VCLCanvas.TextOut(120, 120, '*123456789*');  // blocks

    Doc.VCLCanvas.Font.Name := 'KIX Barcode';
    Doc.VCLCanvas.Font.size := 12;
    Doc.VCLCanvas.TextOut(160, 160, '5569LB33');     // correct

    Doc.VCLCanvas.Font.Name := 'Segoe Script';
    Doc.VCLCanvas.Font.size := 14;
    Doc.VCLCanvas.TextOut(190, 190, 'Hello World');  // correct

    Doc.SaveToFile(FileTemp);
    // ExecAssociatedApp(FileTemp);
  finally
      Doc.Free;
  end;
end;

with EmbeddedWholeTTF := false;

XZ2luC4.png

with EmbeddedWholeTTF := true;

WvscxCR.png

#160 Re: PDF Engine » Embedding and using font added with AddFontMemResourceEx » 2022-04-25 15:21:26

rvk
ab wrote:

I have just added a new TPdfDocument.AddTrueTypeFont() method.
It should help you in your case.
See https://synopse.info/fossil/info/aedd978136

Yes, that works perfectly.
Thank you very much.

I added some extra lines in mORMotReport.pas because I use that to more closely emulate the printer-code (so I hacked it a bit and maybe in the future I will use more native code).

And it doesn't matter if nonexistent fonts are added here because if they are not used for selecting, they are not used in SynPdf to get embedded.

// rvk
PDF.AddTrueTypeFont('KIX Barcode');
PDF.AddTrueTypeFont('Code EAN13');
PDF.AddTrueTypeFont('Code 3 de 9');
PDF.AddTrueTypeFont('Code 128');
PDF.EmbeddedTTF := true; // could also use ExportPDFEmbeddedTTF
PDF.EmbeddedTTFIgnore.Text := MSWINDOWS_DEFAULT_FONTS; // but these 2 are hidden
PDF.EmbeddedWholeTTF := true; // needed for barcode fonts
PDF.Root.PageLayout := plSinglePage; // always force show whole page

#161 Re: PDF Engine » Embedding and using font added with AddFontMemResourceEx » 2022-04-25 14:00:18

rvk

Yes, it's a TTF font. It's the Code 3 de 9 (code39.ttf) from here https://grandzebu.net/informatique/codbar-en/code39.htm

The "Code 3 de 9" (or code39) font is not in Doc.fTrueTypeFonts after TPdfDocumentGDI.Create (so it is also not reported by EnumFontsProcW).

It's also not in Printer.Fonts.Text but for the Printer it works so for printers it doesn't need to be in there to be selected.

According to the documentation fonts added with AddFontMemResourceEx are always private and not enumerable.

This function allows an application to get a font that is embedded in a document or a webpage. A font that is added by AddFontMemResourceEx is always private to the process that made the call and is not enumerable.

But that doesn't seem to stop the Printer unit from being able to use it.
So it is selectable.

(As a workaround I'm now using AddFontResourceEx with a temporary disk-font (with FR_PRIVATE set but without FR_NOT_ENUM) and that works for now but I would rather have direct memory resource fonts.)

Is there an option to force this font to be accepted (selected) without being in fTrueTypeFonts[] ?

#162 PDF Engine » Embedding and using font added with AddFontMemResourceEx » 2022-04-25 08:58:52

rvk
Replies: 5

Is it possible for SynPdf to use fonts which are added with AddFontMemResourceEx from a resource in memory?

For example, fonts loaded with the following snippet in Delphi are not used in TPdfDocumentGDI.

const
  C39CodeName = 'Code 3 de 9';
var
  hC39FontRes: Cardinal;

function CheckC39: Boolean;
var
  ResS1: TResourceStream;
  FontCount1: Cardinal;
  FontId: Integer;
begin
  Result := true;
  FontId := Screen.Fonts.IndexOf(C39CodeName);
  if FontId >= 0 then exit; // IndexOf returns -1 when not found; 0 is a valid index
  if hC39FontRes > 0 then exit;
  Result := false;
  FontCount1 := 0;
  try
    ResS1 := TResourceStream.Create(hInstance, 'C39_FONT', 'RT_FONT');
    try
      if ResS1.Size > 14 then
          hC39FontRes := AddFontMemResourceEx(ResS1.Memory, ResS1.Size, nil, @FontCount1);
    finally
      ResS1.Free;
      Result := (FontCount1 = 1);
    end;
  except
    on E: Exception do ; // ShowException(E, 'Error loading C39font');
  end;
end;

Using this code:

procedure MakePdf;
var
  FileTemp: string;
  Doc: TPdfDocumentGDI;
  Page: TPdfPage;
begin
  FileTemp := 'C:\Temp\Test.pdf';
  Doc := TPdfDocumentGDI.Create;
  try
    Doc.EmbeddedTTF := true;
    Doc.EmbeddedTTFIgnore.Text := MSWINDOWS_DEFAULT_FONTS;
    Doc.EmbeddedWholeTTF := true;
    Doc.Root.PageLayout := plSinglePage;
    Doc.NewDoc;
    Page := Doc.AddPage;
    Doc.VCLCanvas.TextOut(100, 100, 'Test1');
    Doc.VCLCanvas.TextOut(200, 200, 'Test2');
    Doc.VCLCanvas.TextOut(300, 300, 'Test3');
    if CheckC39 then
    begin
      Doc.VCLCanvas.Font.Name := C39CodeName;
      Doc.VCLCanvas.Font.size := 24;
      Doc.VCLCanvas.TextOut(400, 400, '*123456789*');
    end;
    Doc.VCLCanvas.Font.Name := 'Segoe Script';
    Doc.VCLCanvas.Font.size := 14;
    Doc.VCLCanvas.TextOut(500, 500, 'Hello World');
    Doc.SaveToFile(FileTemp);
  finally
      Doc.Free;
  end;
end;

This works fine if I use a printer-based PDF creator.
With SynPdf, LucidaSansUnicode is used and embedded instead.

Is it possible to make this work in SynPdf?

Edit: Sidenote: Using AddFontResource() just before TPdfDocumentGDI.Create to include a font from a temp-file does work.

Doing the AddFontMemResourceEx() before the TPdfDocumentGDI.Create also doesn't work.
