M66 - AT+CSCS some problem

ice_m · March 23, 2022, 10:01am

Hi, I use the Quectel M66 module and until now there has been no big problem with character coding. The device is configured with AT+CSCS="GSM" and AT+QSMSCODE=0, and this works in many countries: for example, _ is sent and received as _, $ is sent and received as $, and so on. The problem is in Armenia (Vivacell), where the receiver gets, for example, ż or ¤. When we try to send the received SMS containing ż and ¤ to a phone with a Vivacell SIM card, we get the message: NOTICE: Cannot convert 104. character (C5BC)ż to GSM. Thanks for any help.

snowgum · March 23, 2022, 8:21pm

Your character ż has no representation in the GSM default alphabet, so it can only be sent as UCS2. The ¤ character is in that alphabet, at position 24 hex, which is where ASCII has $.

It’s not clear to me whether the M66 supports UCS2 in text-mode.

If it does, you’ll need to set the text-mode Data Coding Scheme to UCS2 with the AT+CSMP command and convert the whole text payload to numeric UCS2 yourself.

So the text “Hello ż” becomes “00480065006C006C006F0020017C”.
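The conversion above can be sketched in a few lines of Python. This is only an illustration: `to_ucs2_hex` is a made-up helper name, and it relies on the fact that UCS2 matches big-endian UTF-16 for characters in the Basic Multilingual Plane, so anything outside that plane is rejected.

```python
def to_ucs2_hex(text: str) -> str:
    """Return the text as an uppercase hex string of 16-bit UCS2 code units."""
    for ch in text:
        # UCS2 has no surrogate pairs, so refuse characters above U+FFFF.
        if ord(ch) > 0xFFFF:
            raise ValueError(f"{ch!r} is outside the range UCS2 can represent")
    # For BMP characters, big-endian UTF-16 and UCS2 are byte-identical.
    return text.encode("utf-16-be").hex().upper()


print(to_ucs2_hex("Hello ż"))  # 00480065006C006C006F0020017C
```

Each character becomes four hex digits, so the payload is always four times the character count.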

You can, of course, send any text in any character set by using PDU-mode instead of text-mode.

ice_m · March 23, 2022, 9:28pm

Hi, I’ll explain. The program in the device with the M66 does not send ż or ¤. The device is configured as I wrote: AT+CSCS="GSM", AT+QSMSCODE=0. It sends A-Z, a-z, 0-9, space, characters from 21h (!) to 2Fh (/), characters from 3Ah (:) to 40h (@), and characters from 5Bh ([) to 60h (`), all of them ASCII. There is no problem sending A-Z, a-z, 0-9 and space, but there is a problem with some of the other special (ASCII) characters. In some cases the receiving terminal shows ż or ¤. It is possible that the receiving terminal will show other “different” characters which were not sent (or were sent in another format).

snowgum · March 23, 2022, 10:07pm

As I understand it now, you are having problems with decoding received SMSs, and not encoding.

In that case, it becomes important to see exactly how the received messages are encoded. Are they using the GSM default alphabet? Are they using a “National Language Shift Table”? Something else?

You could put the modem into PDU-mode with AT+CMGF=0 and read the received PDUs with AT+CMGL=4

Copy individual PDUs and decode them using this tool: Online SMS PDU Decoder/Converter | Diafaan SMS Server

Is that decode different from the decode the M66 does in text-mode?
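For orientation, here is a minimal Python sketch of the septet-unpacking step that such a decoder applies to the user-data part of a PDU when the 7-bit default alphabet is used. It is not a full PDU parser. The function name `unpack_septets` is illustrative, and it assumes the user data has no user-data header and that you already know the septet count from the PDU's length field.

```python
def unpack_septets(user_data: bytes, count: int) -> list[int]:
    """Unpack `count` 7-bit values from GSM-packed user data.

    The first septet sits in the low 7 bits of the first octet, so the
    octets can be read as one little-endian integer and sliced 7 bits
    at a time.
    """
    bits = int.from_bytes(user_data, "little")
    return [(bits >> (7 * i)) & 0x7F for i in range(count)]


# "hellohello" packed into 9 octets (only lowercase letters, whose GSM
# codes equal their ASCII codes, so chr() is safe for this example).
packed = bytes.fromhex("E8329BFD4697D9EC37")
septets = unpack_septets(packed, 10)
print("".join(chr(s) for s in septets))  # hellohello
```

The raw septet values are what matter for this problem: they show which code was actually transmitted, before any character-set translation by the modem or the phone.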

Have you tried setting the modem’s character set to “IRA” (the International Reference Alphabet) instead of GSM with AT+CSCS="IRA"

I know Quectel modems default to “GSM”, but 3GPP 27.007 recommends a default of “IRA”.

snowgum · March 24, 2022, 2:33am

Testing I’ve done with my RM500Q-AE reveals that the dollar sign “$” gets displayed incorrectly in text-mode with AT+CSCS="GSM", but is correct with AT+CSCS="IRA"

This is a standard incoming SMS using the 7-bit default alphabet and DCS=“00”.

I had not noticed this before, because I always use PDU-mode, so the decoding is done outside the modem.

ice_m · March 24, 2022, 8:34am

Hi, thanks a lot for the help. I use text mode and I don’t think it can be changed to PDU mode, because that would need many changes in the software. I will try AT+CSCS="IRA" and tell the person who has the receiver which mode needs to be set on the receiver. I should add that the software in my device sends SMS text as ASCII character codes: if I want to send, for example, “0”, I send 48 decimal.

snowgum · March 24, 2022, 9:40am

Yes, most printable characters in the GSM default alphabet are the same as 7-bit ASCII. But there are a few exceptions.

That alphabet is given in 3GPP 23.038, section 6.2.1.

There are some extensions in section 6.2.1.1. Characters from the extension table use two character positions.
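To make the exceptions concrete, here is a small Python sketch that lists the printable ASCII characters whose code means something else in the GSM default alphabet. The names `GSM_DEFAULT` and `ascii_gsm_mismatches` are illustrative, the table is transcribed by hand and should be checked against section 6.2.1, and the extension table is not covered.

```python
# GSM 7-bit default alphabet, indexed by code 0x00..0x7F.
# 0x1B is the escape to the extension table, shown here as "\x1b".
GSM_DEFAULT = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)


def ascii_gsm_mismatches() -> dict[str, str]:
    """Map each printable ASCII char to the GSM char sharing its code,
    for the codes where the two alphabets disagree."""
    result = {}
    for code in range(0x20, 0x7F):
        if chr(code) != GSM_DEFAULT[code]:
            result[chr(code)] = GSM_DEFAULT[code]
    return result


for ascii_char, gsm_char in ascii_gsm_mismatches().items():
    print(f"{ord(ascii_char):02X}h  ASCII {ascii_char}  ->  GSM {gsm_char}")
```

If a raw ASCII code is passed through without conversion, code 24h ($) shows up as ¤ on the receiving side. The character ż does not appear anywhere in this table, so this mismatch alone does not explain it.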

The text-mode default is DCS=0 (zero) and that uses this alphabet.

You can show the text-mode parameters with the AT+CSMP? command. The last parameter is the DCS:

+CSMP: <fo>,<vp>,<pid>,<dcs>

AT+CSMP?
+CSMP: 17,167,0,0
OK
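If you log modem responses, a response line like the one above can be picked apart with a short Python sketch. The helper names `parse_csmp` and `dcs_alphabet` are made up for illustration. The sketch assumes all four fields are plain integers, as in this example, and it only interprets DCS values in the general coding group where the top two bits are zero.

```python
def parse_csmp(line: str) -> dict[str, int]:
    """Split a '+CSMP: <fo>,<vp>,<pid>,<dcs>' response into named fields."""
    prefix = "+CSMP:"
    if not line.startswith(prefix):
        raise ValueError("not a +CSMP response")
    values = [int(field) for field in line[len(prefix):].split(",")]
    if len(values) != 4:
        raise ValueError("expected four fields")
    return dict(zip(("fo", "vp", "pid", "dcs"), values))


def dcs_alphabet(dcs: int) -> str:
    """Name the alphabet for a DCS whose bits 7..6 are 00 (assumption)."""
    if dcs & 0xC0:
        raise ValueError("DCS coding group not handled by this sketch")
    # Bits 3..2 select the alphabet in this coding group.
    return ["GSM 7-bit default", "8-bit data", "UCS2", "reserved"][(dcs >> 2) & 0x03]


params = parse_csmp("+CSMP: 17,167,0,0")
print(params["dcs"], dcs_alphabet(params["dcs"]))  # 0 GSM 7-bit default
```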

In text-mode the modem converts ASCII to the GSM default alphabet when DCS=0.

ice_m · March 25, 2022, 6:56pm

Hi, thanks :), I will check this. Could you help me work out which ASCII codes could correspond to the received ż and ^? I’m sure my program sends only ASCII codes >= 32 (space), so ^ and ż must come from ASCII codes of 32 or above. Thanks for help.

snowgum · March 25, 2022, 8:51pm

If I understand you, you have received an SMS containing the “ż” character. As you know, this is not an ASCII character. It is not a character in the GSM default alphabet either.

When you read the SMS in text mode with AT+CSCS="IRA" does it still render as “ż”?

If so, we need to look at the incoming SMS in PDU-mode to see the exact encoding. The 3GPP standards require the receiving device to store the incoming SMS PDU exactly as it is received.

Do you want to send the character “ż” in an outgoing SMS, so that it is received as “ż”? If you do, you’ll need to use UCS2 as I said in my first reply to you.

If I have not understood you, please try again.

ice_m · March 25, 2022, 9:08pm

Hi, my device doesn’t send ż and ^, and I don’t want to send or receive those characters.
I can’t make any changes to devices which have already been produced and are in service. They are unavailable to me. Is there some way to check which ASCII code corresponds to the received ż or ^?

snowgum · March 25, 2022, 9:19pm

Just by discussing and using the “ż” character, we are not using ASCII for it. We are using variable-width UTF-8 encoding here on this forum.

It is character 380 decimal (17C hex). In binary that is 101111100, which is nine bits, so it cannot fit in a single byte and needs two.

UTF-8 encoding is given by RFC3629: https://www.rfc-editor.org/rfc/rfc3629.txt

So in UTF-8 the character “ż” becomes the two-byte sequence 0xC5BC.
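The two-byte case of that encoding can be worked through in Python. This is a sketch of the RFC 3629 bit layout for code points from U+0080 to U+07FF only, and `utf8_two_byte` is an illustrative name rather than a library function.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte UTF-8 range is handled here")
    # Layout: 110xxxxx 10yyyyyy, where xxxxx are the top 5 bits
    # and yyyyyy the low 6 bits of the code point.
    lead = 0b11000000 | (code_point >> 6)
    continuation = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, continuation])


encoded = utf8_two_byte(ord("ż"))  # ord("ż") is 380, i.e. 0x17C
print(encoded.hex().upper())       # C5BC
```

The result agrees with Python's built-in `"ż".encode("utf-8")`.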

Notice the reference to C5BC in your initial post?

snowgum · March 27, 2022, 1:29am

Concentrating on the “ż” character. If you see it in a received SMS, then the whole SMS may have been sent with UCS2 encoding.

Or the SMS might have been sent with some other character using the GSM default alphabet, and using AT+CSCS="GSM" has displayed it incorrectly.

To be sure we would need to look at the incoming PDU. SMSs are always sent and received PDU-encoded.

The character “ż” can not be represented in ASCII. It is not an ASCII character.

It encodes in UCS2 as 017C hex and in UTF-8 as C5BC hex.