Content for  TS 23.038  Word version:  18.0.0


5  CBS Data Coding Scheme

The CBS Data Coding Scheme indicates the intended handling of the message at the MS, the character set/coding, and the language (when applicable). Any reserved codings shall be assumed to be the GSM 7 bit default alphabet (the same as codepoint 00001111) by a receiving entity. The octet is used according to a coding group which is indicated in bits 7..4. The octet is then coded as follows:
Coding Group (bits 7..4)  Use of bits 3..0
0000 Language using the GSM 7 bit default alphabet
Bits 3..0 indicate the language:
0000  German
0001  English
0010  Italian
0011  French
0100  Spanish
0101  Dutch
0110  Swedish
0111  Danish
1000  Portuguese
1001  Finnish
1010  Norwegian
1011  Greek
1100  Turkish
1101  Hungarian
1110  Polish
1111  Language unspecified
0001 Bits 3..0:
0000  GSM 7 bit default alphabet; message preceded by language indication.
      The first 3 characters of the message are a two-character representation of the language encoded according to ISO 639 [12], followed by a CR character. The CR character is then followed by 90 characters of text (NOTE 1).
0001  UCS2; message preceded by language indication.
      The message starts with a two GSM 7-bit default alphabet character representation of the language encoded according to ISO 639 [12]. This is padded to the octet boundary with two bits set to 0 and then followed by 40 characters of UCS2-encoded message (NOTE 1).
      An MS not supporting UCS2 coding will present the two character language identifier followed by improperly interpreted user data.
0010..1111  Reserved
0010 Bits 3..0 indicate the language:
0000  Czech
0001  Hebrew (NOTE 2)
0010  Arabic (NOTE 2)
0011  Russian (NOTE 2)
0100  Icelandic
0101..1111  Reserved for other languages using the GSM 7 bit default alphabet, with unspecified handling at the MS
0011 Bits 3..0:
0000..1111  Reserved for other languages using the GSM 7 bit default alphabet, with unspecified handling at the MS
01xx General Data Coding indication
Bits 5..0 indicate the following:
Bit 5, if set to 0, indicates the text is uncompressed
Bit 5, if set to 1, indicates the text is compressed using the compression algorithm defined in TS 23.042
Bit 4, if set to 0, indicates that bits 1 to 0 are reserved and have no message class meaning
Bit 4, if set to 1, indicates that bits 1 to 0 have a message class meaning:
Bit 1  Bit 0  Message Class:
0      0      Class 0
0      1      Class 1  Default meaning: ME-specific.
1      0      Class 2  (U)SIM specific message.
1      1      Class 3  Default meaning: TE-specific (see TS 27.005)
Bits 3 and 2 indicate the character set being used, as follows:
Bit 3  Bit 2  Character set:
0      0      GSM 7 bit default alphabet
0      1      8 bit data
1      0      UCS2 (16 bit) [10]
1      1      Reserved
1000 Reserved coding groups
1001 Message with User Data Header (UDH) structure:
Bit 1  Bit 0  Message Class:
0      0      Class 0
0      1      Class 1  Default meaning: ME-specific.
1      0      Class 2  (U)SIM specific message.
1      1      Class 3  Default meaning: TE-specific (see TS 27.005)
Bits 3 and 2 indicate the alphabet being used, as follows:
Bit 3  Bit 2  Alphabet:
0      0      GSM 7 bit default alphabet
0      1      8 bit data
1      0      UCS2 (16 bit) [10]
1      1      Reserved
1010..1100 Reserved coding groups
1101 I1 protocol message defined in TS 24.294
1110 Defined by the WAP Forum [15]
1111 Data coding / message handling
Bit 3 is reserved, set to 0.
Bit 2  Message coding:
0      GSM 7 bit default alphabet
1      8 bit data
Bit 1  Bit 0  Message Class:
0      0      No message class.
0      1      Class 1  user defined.
1      0      Class 2  user defined.
1      1      Class 3  default meaning: TE specific (see TS 27.005)
NOTE 1: The language indication shall appear at the start of each Message Information Page (see TS 23.041) and the language indication on each Message Information Page shall be for the same language.
NOTE 2: Message text in Hebrew, Arabic and Russian cannot be encoded in the GSM 7-bit default alphabet. For these languages UCS2 encoding shall be used.
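The coding table above can be read as a small decision procedure over the octet. The following Python sketch is one possible reading of it, not a reference implementation: the record type `CbsDcs`, its field names, the label `"OPAQUE"` for coding groups 1101 and 1110, and the choice to fall back to the default for the whole octet when bits 3..2 select the reserved character set are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

# Bits 3..0 of coding group 0000, in table order; index 15 is "language unspecified".
GROUP_0000_LANGUAGES = (
    "German", "English", "Italian", "French", "Spanish", "Dutch", "Swedish",
    "Danish", "Portuguese", "Finnish", "Norwegian", "Greek", "Turkish",
    "Hungarian", "Polish", None,
)
# Bits 3..0 of coding group 0010; all other values are reserved.
GROUP_0010_LANGUAGES = {0: "Czech", 1: "Hebrew", 2: "Arabic", 3: "Russian", 4: "Icelandic"}
# Bits 3..2 in coding groups 01xx and 1001; index 3 is reserved.
CHARSETS = ("GSM7", "8BIT", "UCS2", None)


class CbsDcs(NamedTuple):
    """Hypothetical result record; the field names are not taken from the specification."""
    charset: str
    language: Optional[str] = None
    message_class: Optional[int] = None
    compressed: bool = False
    language_prefix: bool = False
    udh: bool = False


# Reserved codings are assumed to be the GSM 7 bit default alphabet (as codepoint 0000 1111).
DEFAULT = CbsDcs("GSM7")


def decode_cbs_dcs(dcs: int) -> CbsDcs:
    group, low = (dcs >> 4) & 0x0F, dcs & 0x0F
    message_class = low & 0b11
    if group == 0b0000:
        return CbsDcs("GSM7", language=GROUP_0000_LANGUAGES[low])
    if group == 0b0001:
        if low == 0b0000:
            return CbsDcs("GSM7", language_prefix=True)
        if low == 0b0001:
            return CbsDcs("UCS2", language_prefix=True)
        return DEFAULT
    if group == 0b0010:
        return CbsDcs("GSM7", language=GROUP_0010_LANGUAGES.get(low))
    if group in (0b0100, 0b0101, 0b0110, 0b0111, 0b1001):
        charset = CHARSETS[(low >> 2) & 0b11]
        if charset is None:
            return DEFAULT  # simplification: a reserved character set discards the whole octet
        if group == 0b1001:
            return CbsDcs(charset, message_class=message_class, udh=True)
        has_class = bool(group & 0b0001)   # bit 4 of the octet
        compressed = bool(group & 0b0010)  # bit 5 of the octet
        return CbsDcs(charset,
                      message_class=message_class if has_class else None,
                      compressed=compressed)
    if group in (0b1101, 0b1110):
        return CbsDcs("OPAQUE")  # payload format is defined outside this clause
    if group == 0b1111:
        charset = "8BIT" if low & 0b0100 else "GSM7"
        return CbsDcs(charset, message_class=message_class or None)  # 00 means no class
    return DEFAULT  # coding groups 0011, 1000 and 1010..1100
```

For example, `decode_cbs_dcs(0x01)` reports English text in the GSM 7 bit default alphabet, and `decode_cbs_dcs(0x11)` reports UCS2 text preceded by a language indication.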
These codings may also be used for USSD and MMI/display purposes.
The message length specified in this subclause is not applicable for UTRAN, E-UTRAN, and NG-RAN, but only applicable for GSM.
See TS 24.090 for specific coding values applicable to USSD for MS originated USSD messages and MS terminated USSD messages. USSD messages using the default alphabet are coded with the GSM 7-bit default alphabet given in clause 6.2.1. The message can then consist of up to 182 user characters.
Cell Broadcast messages using the default alphabet are coded with the GSM 7-bit default alphabet given in clause 6.2.1. The Message Information Page then consists of 93 user characters.
If the GSM 7 bit default alphabet extension mechanism is used, then the number of displayable characters will reduce by one for every instance where the GSM 7 bit default alphabet extension table is used.
Cell Broadcast Message Information Pages using 8-bit data have user-defined coding, and will each be 82 octets in length.
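The character counts quoted in this clause follow from the 82-octet page length. The Python sketch below reproduces that arithmetic under the assumption that characters are packed as consecutive 7-bit septets; the function names are illustrative and do not come from the specification.

```python
PAGE_OCTETS = 82  # octets in one CBS Message Information Page, as stated in the text


def septets_in(octets: int) -> int:
    """Number of whole 7-bit characters that fit in the given number of octets."""
    return (octets * 8) // 7


def gsm7_displayable_chars(extension_chars: int = 0, language_prefix: bool = False) -> int:
    """Displayable characters in one page coded with the GSM 7 bit default alphabet.

    extension_chars: how many of the displayed characters come from the
    extension table; each of those costs one additional septet.
    language_prefix: True when the page starts with two language characters and a CR.
    """
    budget = septets_in(PAGE_OCTETS)
    if language_prefix:
        budget -= 3  # two-character language code plus CR
    if extension_chars < 0 or 2 * extension_chars > budget:
        raise ValueError("extension characters do not fit in one page")
    return budget - extension_chars
```

With no extension characters this gives 93 characters per page, and 90 once the three-character language indication is subtracted, matching the figures in the text. Two extension-table characters reduce the plain case to 91.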
UCS2 character set indicates that the message is coded in UCS2 [10]. The General notes specified in clause 6.1.1 override any contrary specification in UCS2, so for example even in UCS2 a <CR> character will cause the MS to return to the beginning of the current line and overwrite any existing text with the characters which follow the <CR>. Cell Broadcast Message Information Pages encoded in UCS2 consist of 41 characters each.
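For the coding 0001 0001 (UCS2 preceded by a language indication), the page layout can be illustrated as follows. This Python sketch rests on three assumptions that are not stated in this clause: septets are packed with the first character in the low-order bits of the first octet, the letters of an ISO 639 code have the same values in the GSM 7 bit default alphabet as in ASCII, and UCS2 characters are sent high octet first. The function names are hypothetical.

```python
def pack_language_prefix(lang: str) -> bytes:
    """Pack a two-letter language code as two septets plus two zero padding bits."""
    if len(lang) != 2 or not (lang.isascii() and lang.isalpha()):
        raise ValueError("expected a two-letter language code")
    first, second = ord(lang[0]), ord(lang[1])
    octet0 = first | ((second & 0x01) << 7)  # 7 bits of first, lowest bit of second
    octet1 = second >> 1                     # remaining 6 bits; top 2 bits stay 0
    return bytes([octet0, octet1])


def build_ucs2_page(lang: str, text: str) -> bytes:
    """Language prefix followed by at most 40 UCS2 characters (no padding added)."""
    if len(text) > 40:
        raise ValueError("at most 40 UCS2 characters fit after the language prefix")
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the 16-bit UCS2 range")
    return pack_language_prefix(lang) + text.encode("utf-16-be")
```

The prefix occupies 2 octets and 40 UCS2 characters occupy 80, which together fill the 82-octet page. Without the prefix, 82 octets hold the 41 characters mentioned above.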
When a CBS message received by the MS is message class 0 and the MS has the capability of displaying CBS messages, the MS shall display the message immediately. The message shall not be automatically stored in the (U)SIM or ME.
The ME may make provision through MMI for the user to selectively prevent the message from being displayed immediately.
If the ME is incapable of displaying CBS messages or if the immediate display of the message has been disabled through MMI then the ME shall treat the CBS message as though there was no message class, i.e. it will ignore bits 0 and 1 in the TP-DCS but may store the message either on the ME or on the (U)SIM.
Class 1 and Class 2 messages may be routed by the ME to user-defined destinations, but the user may override any default meaning and select their own routing.
Class 3 messages will normally be selected for transfer to a TE, in cases where a ME supports an SMS/CBS interface to a TE, and the TE requests "TE-specific" cell broadcast messages (see TS 27.005). The user may be able to override the default meaning and select their own routing.
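The class handling described above can be summarised as a routing decision. The Python sketch below is only an illustration of those paragraphs: the function, its parameters and the returned labels are all hypothetical, and a real ME has latitude (for example in where it stores messages) that a single return value does not capture.

```python
from typing import Optional


def route_cbs_message(
    message_class: Optional[int],
    *,
    can_display: bool = True,
    immediate_display_disabled: bool = False,
    te_requests_te_specific: bool = False,
    user_route: Optional[str] = None,
) -> str:
    """Return a label describing how the ME might handle a received CBS message."""
    if message_class == 0:
        if can_display and not immediate_display_disabled:
            return "display-immediately"  # and do not store automatically
        return "handle-as-no-class"       # class bits are ignored; storing is optional
    if message_class in (1, 2, 3):
        if user_route is not None:
            return user_route             # the user may override the default meaning
        if message_class == 3 and te_requests_te_specific:
            return "transfer-to-te"
        return "me-default"
    return "handle-as-no-class"
```

For instance, a Class 0 message on an ME whose user has disabled immediate display is handled as though it carried no message class.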
Messages with a User Data Header Structure are encoded as described in TS 23.040 for SMS, in subclauses 3.10 and 9.2.3.24.
The use of Cell Broadcast DCS values for messages with a User Data Header structure implies that the 82-byte CB payload has a User Data Header structure.
The CBS message information field will contain the IEs as described in TS 23.040. The concatenation IEs will not be used, as CB concatenation will rely in that case on the existing CB mechanism. Note that IEs that cannot be split and are too large to fit in one CB segment cannot be transmitted using this mechanism. Also, some IEs as defined for SMS are not applicable for CB:
VALUE (hex)  MEANING
00           Concatenated short messages, 8-bit reference number
01           Special SMS Message Indication
06           SMSC Control Parameters
08           Concatenated short message, 16-bit reference number
20           RFC 822 E-Mail Header
23           Enhanced Voice Mail Information
70-7F        (U)SIM Toolkit Security Headers
80-89        SME to SME specific use
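A receiver or test tool could use the table above to flag IEs that are not applicable for CB. The Python sketch below assumes the User Data Header layout of TS 23.040 (a header length octet followed by repeated identifier, length and data fields); that layout is not restated in this clause, and the function names are hypothetical.

```python
from typing import Iterator, List, Tuple

# IE identifiers listed in the table above as not applicable for CB.
NOT_APPLICABLE_FOR_CB = frozenset(
    {0x00, 0x01, 0x06, 0x08, 0x20, 0x23, *range(0x70, 0x80), *range(0x80, 0x8A)}
)


def iter_information_elements(payload: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (identifier, data) pairs from a payload that starts with a UDH."""
    if not payload:
        raise ValueError("empty payload")
    header_length = payload[0]
    header = payload[1:1 + header_length]
    if len(header) != header_length:
        raise ValueError("truncated User Data Header")
    pos = 0
    while pos < len(header):
        if pos + 2 > len(header):
            raise ValueError("truncated information element")
        identifier, length = header[pos], header[pos + 1]
        data = header[pos + 2:pos + 2 + length]
        if len(data) != length:
            raise ValueError("information element overruns the header")
        yield identifier, data
        pos += 2 + length


def inapplicable_ies(payload: bytes) -> List[int]:
    """Identifiers present in the header that the table marks as not applicable for CB."""
    return [iei for iei, _ in iter_information_elements(payload)
            if iei in NOT_APPLICABLE_FOR_CB]
```

A header carrying an 8-bit concatenation IE (identifier 00) alongside an identifier that is not in the table would yield `[0]` from `inapplicable_ies`.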