PMDF System Manager's Guide
6.2 Character Set Conversion
If the first probe indicates that the message is to be reformatted, PMDF
proceeds to check each part of the message. For each text part it finds,
it uses the part's character set parameter to generate a second probe.
The second probe is performed only after PMDF has determined that
conversions may be needed. The input string for this second probe looks
like this:
IN-CHAN=in-channel;OUT-CHAN=out-channel;IN-CHARSET=in-char-set
The in-channel and out-channel are the same as before, and the in-char-set
is the name of the character set associated with the particular part in
question. If no match occurs for this second probe, no character set
conversion is performed (although message reformatting, e.g.,
changes to MIME structure, may be performed in accordance with the
keyword matched on the first probe). If a match does occur it should
produce a string of the form:
OUT-CHARSET=out-char-set
Here the out-char-set specifies the name of the character set to which the in-char-set should be converted. Note that both of these character sets must be defined in the character set definition table, charsets.txt, located in the PMDF table directory. No conversion will be done if the character sets are not properly defined in this file. This is not usually a problem, since the file defines several hundred character sets, including most of those in use today. See the description of the PMDF CHBUILD (OpenVMS) or pmdf chbuild (UNIX and NT) utility in Chapter 29 and Chapter 30 for further information on the charsets.txt file.
If all the conditions are met, PMDF will proceed to build the character
set mapping and do the conversion. The converted message part will be
relabelled with the name of the character set to which it was converted.
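The two-probe procedure described above can be modelled in a few lines of Python. This is only an illustrative sketch, not PMDF's implementation. It assumes that entries are scanned top to bottom with the first match winning, that `*` behaves like a shell-style wildcard, that the first probe has the form `IN-CHAN=...;OUT-CHAN=...;CONVERT` (inferred from the example tables in this section), and that only a `Yes` result leads to the second probe. The names `TABLE`, `lookup`, and `target_charset` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of a few CHARSET-CONVERSION entries,
# stored as (pattern, template) pairs in table order.
TABLE = [
    ("IN-CHAN=l;OUT-CHAN=tcp_*;CONVERT", "Yes"),
    ("IN-CHAN=*;OUT-CHAN=*;CONVERT", "No"),
    ("IN-CHAN=l;OUT-CHAN=tcp_*;IN-CHARSET=DEC-MCS", "OUT-CHARSET=ISO-8859-1"),
]


def lookup(probe, table=TABLE):
    """Return the template of the first matching pattern, or None."""
    for pattern, template in table:
        # Assumption: '*' is treated as a shell-style wildcard here;
        # PMDF's own pattern matcher is not reproduced.
        if fnmatchcase(probe, pattern):
            return template
    return None


def target_charset(in_chan, out_chan, in_charset, table=TABLE):
    """Model both probes for one text part; return the new charset or None."""
    first = lookup(f"IN-CHAN={in_chan};OUT-CHAN={out_chan};CONVERT", table)
    if first != "Yes":  # simplification: only "Yes" triggers the second probe
        return None
    second = lookup(
        f"IN-CHAN={in_chan};OUT-CHAN={out_chan};IN-CHARSET={in_charset}", table
    )
    prefix = "OUT-CHARSET="
    if second is None or not second.startswith(prefix):
        return None  # no match on the second probe: no charset conversion
    return second[len(prefix):]


print(target_charset("l", "tcp_local", "DEC-MCS"))
print(target_charset("l", "tcp_local", "US-ASCII"))
```

In this model a DEC-MCS part travelling from the l channel to tcp_local is mapped to ISO-8859-1, while a part in a character set with no matching entry is left unconverted. The check against charsets.txt is not modelled.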
Suppose that DEC-MCS is used locally, but this needs to be converted to ISO-8859-1 for use on the Internet. In particular, suppose the connection to the Internet is via a set of TCP channels (including but not limited to tcp_local), and l and d channels are in use locally. The table shown in Example 6-1 brings such conversions about.
Example 6-1 Converting DEC-MCS to and from ISO-8859-1
CHARSET-CONVERSION

  IN-CHAN=l;OUT-CHAN=tcp_*;CONVERT                Yes
  IN-CHAN=d;OUT-CHAN=tcp_*;CONVERT                Yes
  IN-CHAN=tcp_*;OUT-CHAN=l;CONVERT                Yes
  IN-CHAN=tcp_*;OUT-CHAN=d;CONVERT                Yes
  IN-CHAN=*;OUT-CHAN=*;CONVERT                    No
  IN-CHAN=l;OUT-CHAN=tcp_*;IN-CHARSET=DEC-MCS     OUT-CHARSET=ISO-8859-1
  IN-CHAN=d;OUT-CHAN=tcp_*;IN-CHARSET=DEC-MCS     OUT-CHARSET=ISO-8859-1
  IN-CHAN=tcp_*;OUT-CHAN=l;IN-CHARSET=ISO-8859-1  OUT-CHARSET=DEC-MCS
  IN-CHAN=tcp_*;OUT-CHAN=d;IN-CHARSET=ISO-8859-1  OUT-CHARSET=DEC-MCS
The table shown in Example 6-2 specifies a conversion between local
usage of DEC Kanji and the ISO 2022 based JP code used on the Internet.
Example 6-2 Converting DEC-Kanji to and from ISO-2022-JP
CHARSET-CONVERSION

  IN-CHAN=l;OUT-CHAN=l;CONVERT                 No
  IN-CHAN=l;OUT-CHAN=d;CONVERT                 No
  IN-CHAN=d;OUT-CHAN=l;CONVERT                 No
  IN-CHAN=d;OUT-CHAN=d;CONVERT                 No
  IN-CHAN=l;OUT-CHAN=*;CONVERT                 Yes
  IN-CHAN=d;OUT-CHAN=*;CONVERT                 Yes
  IN-CHAN=*;OUT-CHAN=l;CONVERT                 Yes
  IN-CHAN=*;OUT-CHAN=d;CONVERT                 Yes
  IN-CHAN=l;OUT-CHAN=*;IN-CHARSET=DEC-KANJI    OUT-CHARSET=ISO-2022-JP
  IN-CHAN=d;OUT-CHAN=*;IN-CHARSET=DEC-KANJI    OUT-CHARSET=ISO-2022-JP
  IN-CHAN=*;OUT-CHAN=l;IN-CHARSET=ISO-2022-JP  OUT-CHARSET=DEC-KANJI
  IN-CHAN=*;OUT-CHAN=d;IN-CHARSET=ISO-2022-JP  OUT-CHARSET=DEC-KANJI
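The order of the entries in Example 6-2 appears to matter: the four specific local-to-local entries come before the wildcard entries. The sketch below illustrates why, under the assumption that entries are scanned top to bottom and the first match wins, with `*` treated as a shell-style wildcard. It models only the first (CONVERT) probe. The names `ENTRIES` and `first_match` are hypothetical and are not part of PMDF.

```python
from fnmatch import fnmatchcase

# The CONVERT entries of Example 6-2, in table order.
ENTRIES = [
    ("IN-CHAN=l;OUT-CHAN=l;CONVERT", "No"),
    ("IN-CHAN=l;OUT-CHAN=d;CONVERT", "No"),
    ("IN-CHAN=d;OUT-CHAN=l;CONVERT", "No"),
    ("IN-CHAN=d;OUT-CHAN=d;CONVERT", "No"),
    ("IN-CHAN=l;OUT-CHAN=*;CONVERT", "Yes"),
    ("IN-CHAN=d;OUT-CHAN=*;CONVERT", "Yes"),
    ("IN-CHAN=*;OUT-CHAN=l;CONVERT", "Yes"),
    ("IN-CHAN=*;OUT-CHAN=d;CONVERT", "Yes"),
]


def first_match(in_chan, out_chan, entries=ENTRIES):
    """Return the template of the first entry matching the CONVERT probe."""
    probe = f"IN-CHAN={in_chan};OUT-CHAN={out_chan};CONVERT"
    for pattern, template in entries:
        # Assumption: first match wins, '*' matches any run of characters.
        if fnmatchcase(probe, pattern):
            return template
    return None  # no entry matched


# Channel pairs chosen for illustration only.
for pair in [("l", "d"), ("l", "tcp_local"),
             ("tcp_local", "d"), ("tcp_local", "tcp_local")]:
    print(pair, first_match(*pair))
```

Under these assumptions, mail between the l and d channels matches a `No` entry first and is left alone, mail crossing between a local channel and any other channel matches a `Yes` entry, and mail that touches neither local channel matches nothing. If the wildcard entries were moved ahead of the specific ones, the l-to-d case would match `Yes` instead.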