PolarSSL is now part of ARM (official announcement) and has been rebranded as mbed TLS.

Converting to Hardware accelerators


Oct 31, 2017 06:45
H

Hi,

I am using a crypto microcontroller that supports most crypto accelerators in HW. I am following the guidelines at https://tls.mbed.org/kb/development/hw_acc_guidelines to move to HW-based crypto operations.

As a starting point, I am trying to test the AES part of my demo during the SSL handshake. After defining the _ALT files, I can see that the AES self test passes with the built-in AES crypto module.

For my demo the Ciphersuite is TLS-ECDHE-ECDSA-WITH-AES-256-GCM-SHA384

With AES HW, the SSL handshake fails with error code -0x7780 (MBEDTLS_ERR_SSL_FATAL_ALERT_MESSAGE) during the MBEDTLS_SSL_SERVER_CHANGE_CIPHER_SPEC step.

My AES HW block does not support AES-GCM mode. However, as I debug the mbedtls code, I see that gcm.c calls the AES routines in ECB mode. So the lack of GCM support in HW should not cause a failure, should it?

The question is: can I use the AES part of my HW block and do the GCM part in software to make it work? I am not very knowledgeable about GCM.

My issue is somewhat similar to https://github.com/ARMmbed/mbed-os/issues/4928, but I don't have multiple AES contexts, just one.

Any suggestion is welcome.

regards

 
Oct 31, 2017 12:13
Ron Eldor

Hi H,
GCM is basically AES combined with GMAC in the cryptographic modules, so you can use HW-accelerated AES for the AES part of GCM. The error you are receiving is, as you mentioned, MBEDTLS_ERR_SSL_FATAL_ALERT_MESSAGE, but you haven't stated the alert message. From your description, I assume it is MBEDTLS_SSL_ALERT_MSG_BAD_RECORD_MAC. It is quite possible that your HW acceleration doesn't protect the operation with a mutex, so that after you load the IV, but before the cryptographic operation runs, a new IV is loaded into the HW for the next operation, causing unexpected output. Of course, it could be another parameter, such as the key, but it is most likely the IV.
I recommend protecting your alternative implementation with a mutex around the operations that load data into the HW and run the HW operation.
Regards,
Mbed TLS Team member
Ron

 
Oct 31, 2017 13:25
H

Hi Ron,

Thanks for your reply. I will verify whether the IV/key is getting corrupted.

However, I have a couple of questions:

  1. Since there is no multithreading, why is there a possibility of the old IV (or key) being replaced with a new IV (or key) before the encryption begins?

  2. The only functions that update HW registers for my crypto block are mbedtls_aes_encrypt() and mbedtls_aes_decrypt() in ECB mode. As far as I can see, all the AES mode-specific functions (ecb, cbc, cfb) call these APIs in ECB mode, and the IV handling is done in the mode-specific function, so I doubt that such corruption is taking place.

regards

 
Oct 31, 2017 14:46
Ron Eldor

Hi H,
Since you are replacing only mbedtls_aes_encrypt() and mbedtls_aes_decrypt(), I agree with you that it is unlikely that the IV is getting corrupted. I was under the assumption that you had replaced the whole module, using MBEDTLS_AES_ALT. Perhaps a different HW register is getting corrupted. I suggest you run the AES test suites in addition to the self test, since they cover more cases.
As for the multithreading question, I agree it's unlikely, but it is worth a check.
In addition, I recommend you work with our sample applications on both server and client. Use the server with the working configuration (e.g. SW AES) and your client with the HW AES. With this configuration, you will be able to debug both peers and understand how the key material is derived on both sides. Perhaps this post could help.
Regards,
Mbed TLS Team member
Ron

 
Nov 2, 2017 05:05
H

Hi Ron,

AES HW + GCM SW is working fine. The issue was a misaligned data address: the crypto HW expects aligned pointers. The data pointers in the self test were aligned, but those in the actual application were not.
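One common way to cope with this kind of alignment requirement is to copy each block through aligned bounce buffers whenever the caller's pointers are not aligned. The sketch below is only an illustration under stated assumptions: the 4-byte alignment requirement, `hw_aes_ecb_block()` and `aes_ecb_block_any_alignment()` are hypothetical names and values, not part of Mbed TLS or of any particular vendor driver, and the HW stand-in merely copies data so that the sketch can be compiled on a PC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AES_BLOCK_SIZE 16

/* Returns nonzero when p sits on a 4-byte boundary (assumed HW requirement). */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(uint32_t)) == 0;
}

/* Hypothetical stand-in for the vendor AES call. A real driver would program
 * the AES engine here; this stub only copies the block so the sketch runs. */
static int hw_aes_ecb_block(const unsigned char *in, unsigned char *out)
{
    assert(is_word_aligned(in) && is_word_aligned(out));
    memmove(out, in, AES_BLOCK_SIZE);
    return 0;
}

/* Process one 16-byte block for callers that may pass unaligned pointers. */
static int aes_ecb_block_any_alignment(const unsigned char *input,
                                       unsigned char *output)
{
    /* uint32_t arrays are word-aligned by construction. */
    uint32_t in_words[AES_BLOCK_SIZE / sizeof(uint32_t)];
    uint32_t out_words[AES_BLOCK_SIZE / sizeof(uint32_t)];
    int ret;

    /* Fast path: both pointers already satisfy the HW requirement. */
    if (is_word_aligned(input) && is_word_aligned(output))
        return hw_aes_ecb_block(input, output);

    /* Slow path: bounce through aligned temporaries. */
    memcpy(in_words, input, AES_BLOCK_SIZE);
    ret = hw_aes_ecb_block((const unsigned char *)in_words,
                           (unsigned char *)out_words);
    if (ret == 0)
        memcpy(output, out_words, AES_BLOCK_SIZE);
    return ret;
}
```

The extra copies cost a few cycles per block, but they keep the alignment handling in one place instead of requiring every caller to align its buffers.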

I will now continue converting the other crypto modules (RSA, ECC) and get back to you if I run into any problems.

Thanks a lot

regards

 
Nov 8, 2017 13:10
H

Hi Ron,

I am now working on converting the RSA SW part to use HW accelerators. Presently I am working on the following two functions:

mbedtls_rsa_public & mbedtls_rsa_private.

As before I am using self test function in rsa.c to verify my changes

I see that these APIs use keys stored in the MPI format. My HW APIs expect plain hex values. I tried to convert the MPI into hex using mbedtls_mpi_write_string, but I am not getting the expected output. Below is my code snippet:

    uint8_t return_sts = 0;
    volatile uint16_t nBytes = 0;

    uint8_t rsaPublicModulus[0x100];
    uint8_t rsaPrivateExponent[0x100];
    uint8_t rsaPublicExponent[0x5] = {1, 0, 0, 0, 1};

    const BUFF8_T rsaPublicModulus_str = {
        .len = 0x80,
        .pd  = (uint8_t *)&rsaPublicModulus[0],
    };

    const BUFF8_T rsaPrivateExponent_str = {
        .len = 0x80,
        .pd  = (uint8_t *)&rsaPrivateExponent[0],
    };

    const BUFF8_T rsaPublicExponent_str = {
        .len = 5,
        .pd  = (uint8_t *)&rsaPublicExponent[0],
    };

    int mbedtls_rsa_public( mbedtls_rsa_context *ctx,
                            const unsigned char *input,
                            unsigned char *output )
    {
        uint8_t pkeEncOutput[128], pkeDecOutput[128], k;

        BUFF8_T pkeEncOutput_str = {
            .len = 128,
            .pd  = pkeEncOutput,
        };

        BUFF8_T pkeDecOutput_str = {
            .len = 128,
            .pd  = pkeDecOutput,
        };

        BUFF8_T pkeRSAInput_str = {
            .len = 0x16,
            .pd  = 0,
        };

        uint8_t rsa_plaintext[24];

        memcpy( rsa_plaintext, RSA_PT, PT_LEN );

        pkeRSAInput_str.pd = (uint8_t *)&rsa_plaintext[0];

        /* Power PKE Block */
        pke_power(true);

        if (!pke_busy())
        {
            mbedtls_mpi_write_string(&ctx->N, 16, (char*)rsaPublicModulus_str.pd, 1024, (size_t*)&k);
            mbedtls_mpi_write_binary(&ctx->N, rsaPublicModulus_str.pd, 1024);
            mbedtls_mpi_write_string(&ctx->D, 16, (char*)rsaPrivateExponent_str.pd, 1024, (size_t*)&k);
            //mbedtls_mpi_write_string(&ctx->E, 16, (char*)rsaPublicExponent_str.pd, 64, (size_t*)&k);

            return_sts = rsa_load_key(1024, &rsaPrivateExponent_str, &rsaPublicModulus_str, &rsaPublicExponent_str, false);

            if (return_sts == PKE_RET_OK)
            {
                /* RSA Encryption with Public key */
                return_sts = rsa_encrypt(1024, &pkeRSAInput_str, 0x0);
                pke_start(0);

                if (return_sts == PKE_RET_OK)
                {
                    while (pke_busy() == true);

                    /* Data Will be stored in slot 5, read data from Shared Crypto memory */
                    nBytes = pke_read_scm(&pkeEncOutput_str.pd[0], 128, 5, false);
                }
            }
        }

        pke_power(false);
        return 0;
    }

RSA_N is a string of 256 bytes, so I am not clear on how to retrieve the 1024 bit keys in hex format

Also, it seems the input pointer does not point directly to the input data; the data seems to start at an offset of 103.

There is no _ALT macro similar to AES/HASH to convert to HW accelerators.

Please suggest

Thanks

regards

 
Nov 8, 2017 14:17
Ron Eldor

Hi H,

> There is no _ALT macro similar to AES/HASH to convert to HW accelerators.

You are correct, but there is a Pull Request that adds support for this feature. Please follow this PR to see when it is merged.

> RSA_N is a string of 256 bytes

Are you sure it's a string and not a binary buffer?

I am guessing that your HW implementation receives a binary buffer and not a string. In that case, you should use mbedtls_mpi_write_binary(), which exports the data into a binary buffer, instead of mbedtls_mpi_write_string(), which exports the data into an ASCII string.
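To make the string/binary distinction concrete: two ASCII hex characters encode one byte, so a 256-character hex string describes a 128-byte (1024-bit) number. The following is a minimal, library-independent sketch of that relationship. `hex_digit()` and `hex_to_bytes()` are hypothetical helper names written for this illustration, not Mbed TLS functions; in real code the MPI export functions mentioned above would do the conversion.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode an even-length ASCII hex string into raw bytes.
 * Returns the number of bytes written, or -1 on malformed input
 * or when out_size is too small. */
static int hex_to_bytes(const char *hex, unsigned char *out, size_t out_size)
{
    size_t hex_len = strlen(hex);
    size_t i;

    if (hex_len % 2 != 0 || hex_len / 2 > out_size)
        return -1;

    for (i = 0; i < hex_len / 2; i++) {
        int hi = hex_digit(hex[2 * i]);
        int lo = hex_digit(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return (int)(hex_len / 2);
}
```

For example, decoding the four characters "9292" yields the two bytes 0x92 0x92, which is the kind of buffer a binary-oriented HW API would expect.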

I hope this hint helps
Regards,
Mbed TLS Team member
Ron

 
Nov 8, 2017 14:54
H

Thanks for the quick reply. Yes, my HW accepts binary data.

When I say RSA_N is a string of 256 bytes, it is actually from the mbedtls test code. This is how it is defined in rsa.c for the self test function:

    #define RSA_N   "9292758453063D803DD603D5E777D788" \
                    "8ED1D5BF35786190FA2F23EBC0848AEA" \
                    "DDA92CA6C3D80B32C4D109BE0F36D6AE" \
                    "7130B9CED7ACDF54CFC7555AC14EEBAB" \
                    "93A89813FBF3C4F8066D2D800F7C38A8" \
                    "1AE31942917403FF4946B0A83D3D3E05" \
                    "EE57C6F5F5606FB5D4BC6CD34EE0801A" \
                    "5E94BB77B07507233A0BC7BAC8F90F79"

I assume this is converted to hex form internally by the mpi functions. I have already tried mpi_write_binary, but it gives me an empty buffer.

Below fills the rsaPublicModulus_str.pd with data in string format.

 mbedtls_mpi_write_string(&ctx->N, 16, (char*)rsaPublicModulus_str.pd, 1024, (size_t*)&k);

Below fills the rsaPublicModulus_str.pd with all 0

 mbedtls_mpi_write_binary(&ctx->N, rsaPublicModulus_str.pd, 1024); 

Also, what about the input pointer being passed to this function (mbedtls_rsa_public), is the actual plaintext located at certain offset which I see to be 103?

regards

 
Nov 8, 2017 16:32
Ron Eldor

Hi H,
Note that mbedtls_mpi_write_binary() receives the buflen in bytes , not bits, which I believe is 0x100 (256) in your case. Since you are giving 1024 as input, I assume that there is some memory overflow.
Note in the code:

    memset( buf, 0, buflen );

    for( i = buflen - 1, j = 0; n > 0; i--, j++, n-- )
        buf[i] = (unsigned char)( X->p[j / ciL] >> ((j % ciL) << 3) );

The binary data is exported in big endian, so the data is written from the end of the buffer. Since you are writing the key starting at position 1023, this explains why you get a buffer of all zeros (and a big overflow). BTW, mbedtls_mpi_write_string() also receives the buffer length in bytes, but that has less effect on the string output in this case.
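The right-aligned behaviour of the loop quoted above can be reproduced with a small standalone sketch. It does not use Mbed TLS; `export_big_endian()` is a hypothetical helper that takes the value as least-significant-byte-first input (roughly how the limbs are walked in the loop) and writes it big endian at the end of the output buffer, zero-padding the front.

```c
#include <stddef.h>
#include <string.h>

/* Write an n-byte value, given least significant byte first, into buf as a
 * big-endian number that ends at buf[buflen - 1]. Leading bytes are zero.
 * Returns 0 on success, -1 if buf is too small for the value. */
static int export_big_endian(const unsigned char *le_bytes, size_t n,
                             unsigned char *buf, size_t buflen)
{
    size_t i;

    if (buflen < n)
        return -1;

    memset(buf, 0, buflen);
    for (i = 0; i < n; i++)
        buf[buflen - 1 - i] = le_bytes[i];
    return 0;
}
```

With a 3-byte value and an 8-byte buffer, the first 5 bytes stay zero and the value occupies the last 3 bytes, so the first significant byte sits at offset buflen - n. Passing a buflen larger than the real buffer, as in the 1024 case, moves the data past the end of the array in the same way.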

> Also, what about the input pointer being passed to this function (mbedtls_rsa_public), is the actual plaintext located at certain offset which I see to be 103?

I'm sorry, I don't follow your question. What function? What exact pointer?

Regards,
Mbed TLS Team member
Ron

 
Nov 8, 2017 18:41
H

Hi Ron,

With your suggestion I was able to use mpi_write_binary to get public modulus & private exponent in binary format

For the public exponent which is defined in string as "10001" what should be the length passed to mpi_write_binary? I tried 3, 5, & 8 but encryption output is incorrect

Please note that I am still using the mbedtls context and have not modified it with anything HW-specific. Also, I am using the keys and plaintext from the self-test example.

The prototype for mbedtls_rsa_public function is as below

int mbedtls_rsa_public( mbedtls_rsa_context *ctx,
            const unsigned char *input,
            unsigned char *output )

When I look at the contents of the address pointed to by input, the plaintext to be encrypted (defined as RSA_PT) is not located at input[0] but at some other offset. So if I pass this "input" pointer directly to my HW API as the pointer to the plaintext, the encryption output will be wrong. I was trying to understand this pointer.

 
Nov 9, 2017 08:01
Ron Eldor

Hi H,
I suggest you use mbedtls_mpi_size() to determine the actual byte size of the mbedtls_mpi value instead of using hard-coded values. Of course, it must not be larger than the size of the buffer you want to fill.

Thanks for clarifying your question. You shouldn't get an offset. Perhaps it is because you used a hard-coded value, or your output buffer size, instead of mbedtls_mpi_size(). If you used 128 (1024 bits), then it doesn't make sense that you get the offset. If you used any other value bigger than 128, then it's reasonable that you get an offset equal to the difference between the value you passed and 128, as mbedtls_mpi_write_binary() exports the data in big endian.
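As an illustration of sizing the export from the value itself, consider the public exponent. The string "10001" is hexadecimal, i.e. the number 0x010001 (65537), which needs 3 bytes: 0x01 0x00 0x01. The sketch below mimics that computation without Mbed TLS. `min_byte_size()` and `write_be()` are hypothetical helpers operating on a 64-bit integer rather than on an mbedtls_mpi, and whether a given HW block wants the exponent in its minimal length or padded to some fixed width is an assumption that has to be checked against the HW documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest number of bytes that can hold v (0 for v == 0). This plays the
 * role that mbedtls_mpi_size() plays for an MPI. */
static size_t min_byte_size(uint64_t v)
{
    size_t n = 0;

    while (v != 0) {
        n++;
        v >>= 8;
    }
    return n;
}

/* Write v as a big-endian number into exactly len bytes.
 * Returns 0 on success, -1 if len is too small to hold v. */
static int write_be(uint64_t v, unsigned char *buf, size_t len)
{
    size_t i;

    if (len < min_byte_size(v))
        return -1;

    for (i = 0; i < len; i++) {
        buf[len - 1 - i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    return 0;
}
```

Exporting 0x10001 with the length returned by min_byte_size() gives the bytes 0x01 0x00 0x01 with no leading padding, whereas a 5-byte array holding 1, 0, 0, 0, 1 would encode the different number 0x0100000001.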
Regards,
Mbed TLS Team member
Ron

 
Nov 9, 2017 09:46
H

Great support, Ron, much appreciated.

I am able to test encrypt and decrypt functions successfully with my crypto HW

mbedtls_rsa_public() 
mbedtls_rsa_private()

Since I am only modifying these two functions, the drawback is that they are not passed the length of the input data (plaintext). Right now I have hard-coded it, since I already know the length of the test pattern.

I see that these functions are called from other functions, such as sign and verify, and that the calling functions do some manipulation of the input data pointer.

Is there any way I can find out the length of the input data in these two functions, so that I don't have to do extra housekeeping in the upper-layer APIs?

thanks

 
Nov 9, 2017 10:28
Ron Eldor

Hi H,
The length of the input should be the key length, 128 bytes in your case (1024 bits). Note that a 1024-bit key length is considered insecure.
As you can see from the Mbed TLS implementation of mbedtls_rsa_public() and mbedtls_rsa_private():

MBEDTLS_MPI_CHK( mbedtls_mpi_read_binary( &T, input, ctx->len ) );

the input length (key length) is given as part of the mbedtls_rsa_context.
Note that you will need to set the length, similar to the example:

rsa.len = ( mbedtls_mpi_bitlen( &rsa.N ) + 7 ) >> 3;

prior to calling mbedtls_rsa_public() and mbedtls_rsa_private(). This is done in the higher application layer, though.
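The expression `( bitlen + 7 ) >> 3` rounds a bit length up to whole bytes. The standalone sketch below shows the arithmetic and how a HW wrapper could take its input length from the context instead of a hard-coded constant. `struct demo_rsa_ctx` is a hypothetical stand-in with only the two fields needed here, not the real mbedtls_rsa_context.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for the RSA context: only the modulus bit
 * length and the derived byte length are modelled. */
struct demo_rsa_ctx {
    size_t n_bits; /* bit length of the modulus N */
    size_t len;    /* size of N in bytes, used as the input/output length */
};

/* Round a bit count up to the number of bytes needed to hold it. */
static size_t bits_to_bytes(size_t bits)
{
    return (bits + 7) >> 3;
}

/* Derive the byte length once, before any public/private operation. */
static void demo_set_len(struct demo_rsa_ctx *ctx)
{
    ctx->len = bits_to_bytes(ctx->n_bits);
}
```

For a 1024-bit modulus this gives a length of 128 bytes and for a 2048-bit modulus 256 bytes, so a wrapper that reads the length from the context keeps working when the key size changes.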

Regards,
Mbed TLS Team member
Ron

 
Nov 12, 2017 07:55
Ron Eldor

Hi H,
One clarification about what I said previously: these functions expect the length to be set in the mbedtls_rsa_context input parameter. So it should either be assigned by the user, as shown in the example I referenced in my previous post, when the key pair has already been generated, or it is set as part of mbedtls_rsa_gen_key() when a key pair needs to be generated.
Regards,
Mbed TLS Team member
Ron