#1 Re: PDF Engine » Having trouble displaying Chinese Characters » 2013-03-23 23:20:07

Thanks folks. No more problem now. Even my unrooted Android phone can display testunicode.pdf correctly.

BTW, I modified SynPdf.pas so that I can change the default fallback font at run time, and found that the following preinstalled fonts can all handle Chinese/Japanese Unicode: @MS Gothic, MS Gothic, @MS Mincho, MS Mincho, @Batang, Batang, @Gulim, Gulim, @Dotum, Dotum, etc.; the last three pairs are Korean-capable as well. In fact, it seems that all fonts whose names start with @ can handle Asian Unicode, although the @ prefix should be removed when setting the font name. Thus I can use one of those fonts and disable embedding to reduce the PDF file size if the file is intended for PCs only.
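
As a minimal sketch of the "remove the @ prefix" step: as far as I know, Windows lists the vertical-writing variant of a CJK font under the same face name with a leading @, so stripping it gives the name to set. NormalizeFaceName is a hypothetical helper of my own, not part of SynPdf, and the program uses no SynPdf calls at all.

```pascal
program StripFacePrefix;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper (not part of SynPdf): returns the face name
// without a leading '@', leaving any other name unchanged.
function NormalizeFaceName(const AName: string): string;
begin
  Result := AName;
  if (Result <> '') and (Result[1] = '@') then
    Delete(Result, 1, 1);
end;

begin
  WriteLn(NormalizeFaceName('@MS Gothic')); // prints MS Gothic
  WriteLn(NormalizeFaceName('Batang'));     // prints Batang
end.
```

The normalized name is what would then be assigned to the font name before generating the PDF.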

It puzzles me why I was unable to render Chinese characters before installing the Arial Unicode MS font, even though I had tried all the above fonts. I suspect the method TPdfWrite.AddUnicodeHexTextNoUniScribe may not handle Unicode well. I modified it as follows (my changes are marked with (*** ... ***) comments):

procedure TPdfWrite.AddUnicodeHexTextNoUniScribe(PW: PWideChar;
  TTF: TPdfFontTrueType; NextLine: boolean; Canvas: TPdfCanvas);
var Ansi: integer;
    TTFsav: TPdfFontTrueType; (*** in case non-WinANSI ***)
begin
  Ansi := WideCharToWinAnsi(cardinal(PW^));
  TTFsav := TTF; (*** preserve it for AddGlyphFromChar ***)
  if TTF<>nil then
    TTF := TTF.WinAnsiFont else // we expect the WinAnsi font in the code below
    if Ansi<0 then
      Ansi := ord('?'); // WinAnsi only font shows ? glyph for unicode chars
  while Ansi<>0 do begin
    if Ansi>0 then begin
      // add WinAnsi-encoded chars as such
      if (TTF<>nil) and (Canvas.FPage.Font<>TTF) then
        Canvas.SetPDFFont(TTF,Canvas.FPage.FontSize);
      Add('(');
      repeat
        case Ansi of
          40,41,92: Add('\');   // see PDF 2nd ed. p. 290
          160: Ansi := 32; // fixed space is written as normal space
        end;
        TTF.AddUsedWinAnsiChar(AnsiChar(Ansi));
        Add(AnsiChar(Ansi));
        Inc(PW);
        Ansi := WideCharToWinAnsi(cardinal(PW^));
        if (TTF=nil) and (Ansi<0) then
          Ansi := ord('?'); // WinAnsi only font shows ? glyph for unicode chars
      until Ansi<=0;
      Add(')').Add(SHOWTEXTCMD[NextLine]);
      NextLine := false; // MoveToNextLine only once
    end;
    if Ansi=0 then
      break;
    // here we know that PW^ is not a Win-Ansi glyph, and that TTF exists
    repeat
      AddGlyphFromChar(PW^,Canvas,TTFsav,@NextLine); (*** TTFsav instead of TTF ***)
      inc(PW);
      Ansi := WideCharToWinAnsi(cardinal(PW^));
      if Ansi=160 then
        Ansi := 32;
      if Ansi=32 then
        if WideCharToWinAnsi(cardinal(PW[1]))<0 then
          continue; // we allow one space inside Unicode text
    until Ansi>=0;
    AddGlyphFlush(Canvas,TTFsav,@NextLine); (*** TTFsav instead of TTF ***)
  end;
end;

The method AddGlyphFromChar also has to be modified slightly, by dropping the WinAnsiFont check from its assertion:

procedure TPdfWrite.AddGlyphFromChar(Char: WideChar; Canvas: TPdfCanvas;
  TTF: TPdfFontTrueType; NextLine: PBoolean);
var aChanged: boolean;
    aTTF: TPdfFontTrueType;
    Glyph: word;
begin
  assert((TTF<>nil) (*** and (TTF=TTF.WinAnsiFont) ***));

The above modification seems to work: the rendered East Asian Unicode characters now use the specified font rather than the fallback font.

I also note that TPdfDocument.canvas.font may be set to the fallback font after a ShowText call, so I need to call SetFont several times in unit1 to ensure the desired font is used.

Wai Wong

#2 Re: PDF Engine » Having trouble displaying Chinese Characters » 2013-03-23 12:45:57

I downloaded and installed Arial Unicode MS.ttf, and the generated PDF displays OK on my computer. However, when the PDF file is sent to another computer, chances are high that the latter doesn't have Arial Unicode MS.ttf installed, and none of the Unicode characters, including Greek, Arabic, etc., will be displayed. None of my Windows XP, Windows 7 and Windows 8 computers has that font preinstalled.

On the other hand, all the Chinese PDF files I have received could be viewed properly on my PCs before I installed Arial Unicode MS.ttf. For example, on an XP PC without that font, I created a PDF file by printing Arabic.uni (containing Asian characters) to a virtual printer (PDF Lite), and that PDF file displays properly on all my PCs. I opened that PDF file in Notepad and found that it contains the following lines:

%PDF-1.4%...

<</BaseFont/GMTXSU+Arial/FontDescriptor 14 0 R/ToUnicode 19 0 R/Type/Font/FirstChar 1/LastChar 24/Widths[ 375...

<</BaseFont/IVLJLU+MSUIGothic-WinCharSetFFFF-H/ToUnicode 20 0 R/Type/Font/Encoding /Identity-H/DescendantFonts[11...

Another downloaded Chinese PDF file contains:

%PDF-1.4%

<</BaseFont/NCNLIB+ArialUnicodeMS/Subtype/Type0/DescendantFonts[34 0 R]/ToUnicode 20 0 R>>

After googling, I still can't find the meaning of those six-letter prefixes (GMTXSU+, IVLJLU+, NCNLIB+) and am not sure whether they are significant.
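
For reference, the PDF specification's section on font subsets describes exactly this pattern: when only a subset of a font is embedded, its BaseFont name begins with a tag of six arbitrary uppercase letters followed by a plus sign. If that is what these prefixes are, the fonts in those files are embedded subsets, which would explain why the files display on PCs without the font installed. Below is a minimal sketch of splitting such a name; SplitSubsetTag is a hypothetical helper written for illustration, not a SynPdf function.

```pascal
program SubsetTagDemo;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: splits 'NCNLIB+ArialUnicodeMS' into the tag and
// the font name. Returns false (and FontName = BaseFont) when the name
// does not start with six uppercase letters followed by '+'.
function SplitSubsetTag(const BaseFont: string;
  out Tag, FontName: string): boolean;
var i: integer;
begin
  Tag := '';
  FontName := BaseFont;
  Result := false;
  // six tag letters, the '+', and at least one character of font name
  if (Length(BaseFont) < 8) or (BaseFont[7] <> '+') then
    exit;
  for i := 1 to 6 do
    if not (BaseFont[i] in ['A'..'Z']) then
      exit;
  Tag := Copy(BaseFont, 1, 6);
  FontName := Copy(BaseFont, 8, MaxInt);
  Result := true;
end;

var Tag, Face: string;
begin
  if SplitSubsetTag('NCNLIB+ArialUnicodeMS', Tag, Face) then
    WriteLn(Tag, ' -> ', Face);         // prints NCNLIB -> ArialUnicodeMS
  if not SplitSubsetTag('ArialMT', Tag, Face) then
    WriteLn('no subset tag: ', Face);   // prints no subset tag: ArialMT
end.
```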

Thanks

Wai Wong

#3 Re: PDF Engine » Having trouble displaying Chinese Characters » 2013-03-22 22:35:55

I copied and pasted the new LoadUnicodeStrings procedure into unit1 and the result is still the same, just like 5c4f394a's post.

The Chinese characters have always been displayed correctly inside the paintbox, suggesting that both the new and old LoadUnicodeStrings procedures are able to load the Chinese unicode characters correctly.

Thanks folks

Wai Wong

#4 Re: PDF Engine » Having trouble displaying Chinese Characters » 2013-03-22 09:08:57

I downloaded http://synopse.info/fossil/zip/mORMot%2 … 1b822c.zip dated 2013-3-21, extracted the five mentioned Syn*.pas files, and rebuilt my test programs, including synpdfunicode's Project1, but still got the same problem.

Thanks

Wai Wong

#5 Re: PDF Engine » Having trouble displaying Chinese Characters » 2013-03-21 05:40:55

Can't figure out which file contains the latest unstable version of SynPdf. I will wait until the stable one is released. Thanks very much anyway!

Wai Wong

#6 PDF Engine » Having trouble displaying Chinese Characters » 2013-03-20 23:38:33

wywong
Replies: 19

I have been trying various methods to display Unicode Chinese characters, but the characters are always displayed as blanks (not boxes), while other languages (Greek, English, etc.) show no problem. I have tried TPdfDocument.canvas.showtext() and windows.TextoutW(TPdfDocumentGDI.VCLcanvas.handle,..) in D6/D7 under XP/Win7/Win8, and the result is the same.

The following steps are one of the attempts I made:

Extract the content of http://synopse.info/files/pdf/synpdfunicode.zip and replace the content of Arabic.uni with a single line of Unicode characters "中文Chinese" (the two Chinese characters are common to both Japanese and Chinese, and mean "Chinese"). When project1.dpr is compiled and run, the Chinese characters are correctly displayed inside the paintbox, but in the generated PDF I get

...
Arabic
  Chinese
Greek text...

My computer can display Chinese PDF files without problem. I suspected that the font used by the PDF canvas was the problem, so I added a FontDialog to allow the font.name property to be selected at run time. I have tried almost all the available fonts but still had no luck. I changed UseUniscribe to false and it made no difference. I peeked inside a Chinese PDF file and found that it only uses normal-looking fonts like Verdana, NCNLIB+ArialUnicodeMS, ArialMT, and NCNMEN+SymbolMT.

Can anyone help?

Many TIAs

Wai Wong
