#1 2013-03-20 23:38:33

wywong
Member
Registered: 2013-03-20
Posts: 6

Having trouble displaying Chinese Characters

I have tried various methods to display Unicode Chinese characters, but they always come out as blanks (not boxes), while other languages (Greek, English, etc.) display without problems. I have tried TPdfDocument.canvas.showtext() and windows.TextoutW(TPdfDocumentGDI.VCLcanvas.handle,..) in D6/D7 under XP/Win7/Win8, and the result is always the same.

The following steps are one of the attempts I made:

Extract the content of http://synopse.info/files/pdf/synpdfunicode.zip and replace the content of Arabic.uni with a single line of Unicode characters "中文Chinese" (the two Chinese characters are common to both Japanese and Chinese, and mean "Chinese"). When project1.dpr is compiled and run, the Chinese characters are correctly displayed inside the paintbox, but in the generated pdf, I get

...
Arabic
  Chinese
Greek text...

My computer can display Chinese pdf files without problem. I suspected that the font used by pdfcanvas was the problem, so I added a Fontdialog to allow the font.name property to be selected at run time. I have tried almost all available fonts but still no luck. I changed UseUniscribe to false and it made no difference. I looked inside a Chinese pdf file and found that it only uses normal-looking fonts like Verdana, NCNLIB+ArialUnicodeMS, ArialMT, and NCNMEN+SymbolMT.

Can anyone help?

Many TIAs

Wai Wong

Offline

#2 2013-03-21 04:52:58

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

Try the latest unstable version.

This has been fixed.

Offline

#3 2013-03-21 05:40:55

wywong
Member
Registered: 2013-03-20
Posts: 6

Re: Having trouble displaying Chinese Characters

Can't figure out which file contains the latest unstable version of SynPdf. I will wait until the stable one is released. Thanks very much anyway!

Wai Wong

Offline

#4 2013-03-21 07:06:45

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

As stated by the main doc page http://synopse.info/fossil/wiki?name=PDF+Engine
see http://synopse.info/fossil/wiki?name=Get+the+source
and get expected units, i.e. SynCommons.pas / SynCommons.ini / SynPDF.pas / SynGDIPlus.pas / SynLZ.pas / SynZip.pas (at least).

Offline

#5 2013-03-22 09:08:57

wywong
Member
Registered: 2013-03-20
Posts: 6

Re: Having trouble displaying Chinese Characters

I downloaded http://synopse.info/fossil/zip/mORMot%2 … 1b822c.zip dated 2013-3-21, extracted the 5 mentioned Syn*.pas, rebuilt my test programs including synpdfunicode's Project1 but still got the same problem.

Thanks

Wai Wong

Offline

#6 2013-03-22 09:39:55

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

Problem is not in our library, but in the way you are using this sample program.
This sample program is reading the content from the internal resource (.res), not directly the .uni files.

You did not update the resource.
If you read the content from the files, it works as expected.

procedure LoadUnicodeStrings(Name: string; var Strings: array of SynUnicode);
// Loads the Unicode strings from the FILE (not resource!)
var Stream: TMemoryStream;
    Head, Tail, Finish: PWideChar;
    I: Integer;
begin
  Stream := TMemoryStream.Create;
  try
    Stream.LoadFromFile(Name+'.uni');
    Head := Stream.Memory;
    // Stream.Size is in bytes, but PWideChar arithmetic counts WideChars.
    Finish := Head+(Stream.Size div SizeOf(WideChar));
    // Skip byte order mark.
    Inc(Head);
    Tail := Head;
    for I := 0 to High(Strings) do
    begin
      Head := Tail;
      // Compare explicitly: a Pascal set cannot hold WideChar values.
      while (Tail<Finish) and (Tail^<>#0) and (Tail^<>#13) do
        Inc(Tail);
      SetString(Strings[I], Head, Tail - Head);
      // Skip carriage return and linefeed.
      Inc(Tail, 2);
    end;
  finally
    Stream.Free;
  end;
end;

Offline

#7 2013-03-22 18:57:08

5c4f394a
Member
Registered: 2013-03-22
Posts: 11

Re: Having trouble displaying Chinese Characters

synpdfunicode_project1.png

synpdfunicode_testunicode.png

synpdfunicode_test.rar (broken)

Check-in [25426a6933]
Date: 2013-03-21 14:50:03

D7(SynopseRTL), XE3-UP2,
WinXP-SP3(Korean),

Last edited by 5c4f394a (2018-08-10 04:57:32)

Offline

#8 2013-03-22 22:35:55

wywong
Member
Registered: 2013-03-20
Posts: 6

Re: Having trouble displaying Chinese Characters

I copied and pasted the new LoadUnicodeStrings procedure into unit1 and the result is still the same, just like 5c4f394a's post.

The Chinese characters have always been displayed correctly inside the paintbox, suggesting that both the new and old LoadUnicodeStrings procedures are able to load the Chinese unicode characters correctly.

Thanks folks

Wai Wong

Offline

#9 2013-03-23 08:52:20

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

See the documentation of the "font fallback" related properties:

    /// used to define if the PDF document will handle "font fallback" for
    // characters not existing in the current font: it will avoid rendering
    // block/square symbols instead of the correct characters (e.g. for Chinese text)
    // - will use the font specified by FontFallBackName property to add any
    // Unicode glyph not existing in the currently selected font
    // - default value is TRUE
    property UseFontFallBack: boolean read fUseFontFallBack write fUseFontFallBack;
    /// set the font name to be used for missing characters
    // - used only if UseFontFallBack is TRUE
    // - default value is 'Arial Unicode MS', if existing
    property FontFallBackName: string read GetFontFallBackName write SetFontFallBackName;

You need to set the FontFallBackName property to a font name containing all the needed glyphs.
I suspect your computer does not have 'Arial Unicode MS' font installed.
When it is available, it works as expected.
Note that my 'Arial Unicode MS' seems not able to display Korean characters, which seems weird.
But no problem for Chinese or Japanese:
1364028637286.png

Try to change the FontFallBackName property to a font name containing all the needed glyphs.

Perhaps it could be a good idea for the library to allow a list of font names, and not just one font name.
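
As a rough illustration of the two properties documented above, here is a minimal sketch. It assumes SynPdf.pas and its dependencies are on the search path, that a font named 'MS Gothic' is installed (the font name is only an example), and that the AddPage, SetFont, TextOutW and SaveToFile calls match your SynPdf revision. It has not been checked against any particular check-in.

```pascal
program FallbackDemo;
// Sketch only: 'MS Gothic' and the output file name are assumptions,
// replace them with a font that is installed and contains the needed glyphs.
{$APPTYPE CONSOLE}
uses
  SynPdf;
var
  Pdf: TPdfDocument;
  Line: WideString;
begin
  Line := #$4E2D#$6587'Chinese'; // the two CJK characters from the first post
  Pdf := TPdfDocument.Create;
  try
    Pdf.UseFontFallBack := true;         // documented default is already TRUE
    Pdf.FontFallBackName := 'MS Gothic'; // used for glyphs missing in the current font
    Pdf.AddPage;
    Pdf.Canvas.SetFont('Arial', 12, []); // Arial itself has no CJK glyphs
    Pdf.Canvas.TextOutW(50, 700, PWideChar(Line));
    Pdf.SaveToFile('fallback.pdf');
  finally
    Pdf.Free;
  end;
end.
```

If the fallback font is not installed, the CJK characters would be expected to stay blank, which matches the symptom described in the first post.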

Offline

#10 2013-03-23 10:48:41

5c4f394a
Member
Registered: 2013-03-22
Posts: 11

Re: Having trouble displaying Chinese Characters

synpdfunicode_testunicode_arialuni.png

download@Arial Unicode MS - Version 1.01 at SourceForge.net
* ARIALUNI.TTF (23,275,812 bytes)

info@Arial Unicode MS - Version 1.01 | Microsoft Typography - Fonts and Products

Last edited by 5c4f394a (2018-08-10 04:58:07)

Offline

#11 2013-03-23 11:04:21

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

Happy we found a solution.

Thanks for the feedback.

Offline

#12 2013-03-23 12:45:57

wywong
Member
Registered: 2013-03-20
Posts: 6

Re: Having trouble displaying Chinese Characters

I downloaded and installed Arial Unicode MS.ttf and the generated pdf displays OK on my computer. However, when the PDF file is sent to another computer, chances are high that the latter doesn't have Arial Unicode MS.ttf installed, and all the unicode characters, including Greek, Arabic, etc. will not be displayed. All my Windows XP, Windows 7 & Windows 8 computers don't have that font preinstalled.

On the other hand, all Chinese PDF files I got can be viewed properly on my PCs before I installed Arial Unicode MS.ttf. For example, on an XP PC without that ttf, I created a PDF file by printing Arabic.uni (containing Asian characters) to a virtual printer (PDF Lite) and that PDF file displays properly on all my PCs. I use notepad to view that PDF file and found it contains the following lines:

%PDF-1.4%...

<</BaseFont/GMTXSU+Arial/FontDescriptor 14 0 R/ToUnicode 19 0 R/Type/Font/FirstChar 1/LastChar 24/Widths[ 375...

<</BaseFont/IVLJLU+MSUIGothic-WinCharSetFFFF-H/ToUnicode 20 0 R/Type/Font/Encoding /Identity-H/DescendantFonts[11...

In another downloaded Chinese PDF file, it contains:

%PDF-1.4%

<</BaseFont/NCNLIB+ArialUnicodeMS/Subtype/Type0/DescendantFonts[34 0 R]/ToUnicode 20 0 R>>

After googling I still can't find the meaning of those 6 letter prefixes and am not sure if they are significant.

Thanks

Wai Wong

Offline

#13 2013-03-23 14:54:01

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

You can embed a subset of the font in the file.

So it will work perfectly even if the font is not available.

Offline

#14 2013-03-23 16:59:34

5c4f394a
Member
Registered: 2013-03-22
Posts: 11

Re: Having trouble displaying Chinese Characters

wywong wrote:

I downloaded and installed Arial Unicode MS.ttf and the generated pdf displays OK on my computer. However, when the PDF file is sent to another computer, chances are high that the latter doesn't have Arial Unicode MS.ttf installed, and all the unicode characters, including Greek, Arabic, etc. will not be displayed. All my Windows XP, Windows 7 & Windows 8 computers don't have that font preinstalled.

ab wrote:

You can embed a subset of the font in the file.

with TPdfDocument.Create do
try
  EmbeddedTTF := True;
  ...
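
For readers who want the fragment above in a complete form, the following is a hedged sketch rather than the sample project's actual code. The procedure name SaveWithEmbeddedFont, the font 'Arial Unicode MS', and the coordinates are hypothetical; only EmbeddedTTF comes from this thread, and the other calls are assumed to exist with these signatures in your SynPdf revision.

```pascal
unit EmbedDemo;
// Sketch only: writes one line of Unicode text to a PDF with the used
// glyphs embedded, so the viewer does not need the font installed.
interface

uses
  SysUtils, SynPdf;

// Hypothetical helper, not part of SynPdf.
procedure SaveWithEmbeddedFont(const Line: WideString; const FileName: TFileName);

implementation

procedure SaveWithEmbeddedFont(const Line: WideString; const FileName: TFileName);
var
  Pdf: TPdfDocument;
begin
  Pdf := TPdfDocument.Create;
  try
    Pdf.EmbeddedTTF := true; // set before any font is selected
    Pdf.AddPage;
    // Assumption: this font is installed on the machine creating the PDF.
    Pdf.Canvas.SetFont('Arial Unicode MS', 12, []);
    Pdf.Canvas.TextOutW(50, 700, PWideChar(Line));
    Pdf.SaveToFile(FileName);
  finally
    Pdf.Free;
  end;
end;

end.
```

The trade-off is file size: embedding makes the PDF portable, while leaving it off keeps the file small but relies on the reader's machine having a suitable font.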

Offline

#15 2013-03-23 23:20:07

wywong
Member
Registered: 2013-03-20
Posts: 6

Re: Having trouble displaying Chinese Characters

Thanks folks. No more problem now. Even my unrooted Android phone can display testunicode.pdf correctly.

BTW, I modified Synpdf.pas so that I can change the default fallback font at run time, and found that the following preinstalled fonts can all handle Chinese/Japanese unicode: @MS Gothic, MS Gothic, @MS Mincho, MS Mincho, @Batang, Batang, @Gulim, Gulim, @Dotum, Dotum etc., with the last 3 pairs Korean capable as well. In fact it seems all fonts whose name starts with @ are capable of Asian unicode, although the @ prefix should be removed when setting font name. Thus I can use one of those fonts and disable embedding to reduce the pdf file size if the file is intended for PC only.

It puzzles me why I was unable to render Chinese characters before installing Arial Unicode MS font even though I had tried all the above fonts. I suspect the method TPdfWrite.AddUnicodeHexTextNoUniScribe may not handle unicode well. I modified it thus:

procedure TPdfWrite.AddUnicodeHexTextNoUniScribe(PW: PWideChar;
  TTF: TPdfFontTrueType; NextLine: boolean; Canvas: TPdfCanvas);
var Ansi: integer; TTFsav: TPdfFontTrueType; (*** in case non-WinANSI ***)
begin
  Ansi := WideCharToWinAnsi(cardinal(PW^));
  TTFsav := TTF; (*** preserve it for AddGlyphFromChar ***)
  if TTF<>nil then
	TTF := TTF.WinAnsiFont else // we expect the WinAnsi font in the code below
	if Ansi<0 then
	  Ansi := ord('?'); // WinAnsi only font shows ? glyph for unicode chars
  while Ansi<>0 do begin
	if Ansi>0 then begin
	  // add WinAnsi-encoded chars as such
	  if (TTF<>nil) and (Canvas.FPage.Font<>TTF) then
		Canvas.SetPDFFont(TTF,Canvas.FPage.FontSize);
	  Add('(');
	  repeat
		case Ansi of
		  40,41,92: Add('\');   // see PDF 2nd ed. p. 290
		  160: Ansi := 32; // fixed space is written as normal space
		end;
		TTF.AddUsedWinAnsiChar(AnsiChar(Ansi));
		Add(AnsiChar(Ansi));
		Inc(PW);
		Ansi := WideCharToWinAnsi(cardinal(PW^));
		if (TTF=nil) and (Ansi<0) then
		  Ansi := ord('?'); // WinAnsi only font shows ? glyph for unicode chars
	  until Ansi<=0;
	  Add(')').Add(SHOWTEXTCMD[NextLine]);
	  NextLine := false; // MoveToNextLine only once
	end;
	if Ansi=0 then
	  break;
	// here we know that PW^ is not a Win-Ansi glyph, and that TTF exists
	repeat
	  AddGlyphFromChar(PW^,Canvas,TTFsav,@NextLine); (*** TTFsav instead of TTF ***)
	  inc(PW);
	  Ansi := WideCharToWinAnsi(cardinal(PW^));
	  if Ansi=160 then
		Ansi := 32;
	  if Ansi=32 then
		if WideCharToWinAnsi(cardinal(PW[1]))<0 then
		  continue; // we allow one space inside Unicode text
	until Ansi>=0;
	AddGlyphFlush(Canvas,TTFsav,@NextLine); (*** TTFsav instead of TTF ***)
  end;
end;

and method AddGlyphFromChar has to be modified a bit:

procedure TPdfWrite.AddGlyphFromChar(Char: WideChar; Canvas: TPdfCanvas;
  TTF: TPdfFontTrueType; NextLine: PBoolean);
var aChanged: boolean;
	aTTF: TPdfFontTrueType;
	Glyph: word;
begin
  assert((TTF<>nil)(*** and (TTF=TTF.WinAnsiFont)***));

The above modification seems to work - the rendered Eastern unicode characters correspond to the specified font rather than the fallback font.

I also note that TPdfDocument.canvas.font may be set to the fallback font after a ShowText call, so that I need to call SetFont several times in unit1 to ensure the desired font is used.
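
The workaround just described could be wrapped in a small helper so the font is re-selected before every line. This is only a sketch: ShowLineWithFont is a hypothetical name, and the SetFont and TextOutW signatures are assumed to match the SynPdf revision in use.

```pascal
unit PdfLineHelper;
// Sketch only: re-selects the wanted font before each line, in case a
// previous text call left the fallback font selected on the canvas.
interface

uses
  SynPdf;

// Hypothetical helper, not part of SynPdf.
procedure ShowLineWithFont(Canvas: TPdfCanvas; const FontName: AnsiString;
  Size, X, Y: Single; const Line: WideString);

implementation

procedure ShowLineWithFont(Canvas: TPdfCanvas; const FontName: AnsiString;
  Size, X, Y: Single; const Line: WideString);
begin
  Canvas.SetFont(FontName, Size, []);    // always restore the desired font first
  Canvas.TextOutW(X, Y, PWideChar(Line));
end;

end.
```

A call such as ShowLineWithFont(Pdf.Canvas, 'MS Gothic', 12, 50, 700, Line) would then replace each separate SetFont and text output pair.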

Wai Wong

Last edited by wywong (2013-03-24 21:57:12)

Offline

#16 2015-03-31 06:08:00

leepk
Member
Registered: 2015-03-31
Posts: 2

Re: Having trouble displaying Chinese Characters

Hi,

I have a problem: Hiragana/Katakana… are not shown with the font MS PGothic.
Please tell me what to change.

Thanks.

Last edited by leepk (2015-03-31 06:08:24)

Offline

#17 2015-03-31 07:52:45

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

@wywong
I'm still not convinced by the patch...
AFAIR we should use the WinAnsi in TPdfWrite.AddGlyphFromChar(), as expected by the TPdfFontTrueType.fUsedWide[] array.

Offline

#18 2015-04-06 08:20:28

leepk
Member
Registered: 2015-03-31
Posts: 2

Re: Having trouble displaying Chinese Characters

Hi All,

I have fixed the error where Hiragana/Katakana… were not shown with the font MS PGothic.

The fix is in Synpdf.pas, in this function:
function GetTTCIndex(const FontName: RawUTF8; var ttcIndex: Word;
  const FontCount: LongWord): Boolean;

Offline

#19 2015-04-06 11:44:07

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

@leepk

What do you mean?

Offline

#20 2015-04-18 08:11:11

ab
Administrator
From: France
Registered: 2010-06-21
Posts: 14,237
Website

Re: Having trouble displaying Chinese Characters

Please try the latest SynPDF commit proposed by "nosa".
See http://synopse.info/forum/viewtopic.php?id=2515

It includes a lot of improvements for Uniscribe, especially for Right To Left languages.

Offline
