Mbed TLS is now part of TrustedFirmware.org.

Parsing negative integers (ASN.1)


Jan 12, 2018 23:06
Wolf Smith

I was working with mbedtls-2.4.2 on a project, and it seems to me that the ASN.1 parsing code is unable to parse negative integers.

Specifically, as shown below, in mbedtls_asn1_get_int() there seems to be a check that if the high order bit of the first byte is set to 1, an error is returned.

If I understand correctly, in DER, the Value octets (after Tag and Length) of an INTEGER are supposed to be the 2's complement of the integer value to be encoded. Hence, when the high order bit of the first byte is 1, the integer is meant to be negative, which the parsing code below seems to consider to be an error.

This is perhaps not very significant for handling X.509 certificates, as negative numbers seldom appear (though there are reports of certs containing negative serial numbers), but if one wants to use the ASN.1 parser in mbedTLS for other applications, this might cause potential problems.

int mbedtls_asn1_get_int( unsigned char **p,
                  const unsigned char *end,
                  int *val )
{
    int ret;
    size_t len;

    if( ( ret = mbedtls_asn1_get_tag( p, end, &len, MBEDTLS_ASN1_INTEGER ) ) != 0 )
        return( ret );

    if( len == 0 || len > sizeof( int ) || ( **p & 0x80 ) != 0 )
        return( MBEDTLS_ERR_ASN1_INVALID_LENGTH );
 
Jan 13, 2018 10:14
Per Westermark

Note that the line in your quote verifies the length and not the value. It's quite reasonable that the test doesn't accept a negative length.

 
Jan 13, 2018 11:29
Wolf Smith

Is it though? If the check is indeed about negative length then I'd agree with you that it's quite reasonable.

However, if I understand correctly, at then end of mbedtls_asn1_get_tag() it calls mbedtls_asn1_get_len(), and after both returns (hence Tag and Length are both read), *p then points to the first byte of Value.

Perhaps let me illustrate my point with the following code:

#include <stdio.h>
#include "mbedtls/asn1.h"

int main(){

    unsigned char content1[3] = {0x02, 0x01, 0x03};
    unsigned char content2[3] = {0x02, 0x01, 0x81};

    unsigned char *p1   = &(content1[0]);
    unsigned char *end1 = &(content1[2]) + 1;
    int   v1   = 0;
    int   res1 = 0;

    unsigned char *p2   = &(content2[0]);
    unsigned char *end2 = &(content2[2]) + 1;
    int   v2   = 0;
    int   res2 = 0;

    res1 = mbedtls_asn1_get_int(&p1, end1, &v1);      
    res2 = mbedtls_asn1_get_int(&p2, end2, &v2);

    printf("res1 = %d, v1 = %d\n", res1, v1);
    printf("res2 = %d, v2 = %d\n", res2, v2);

    return 0;
}

which outputs

res1 = 0, v1 = 3
res2 = -100, v2 = 0

where res2 = -100 is also -0x64 which corresponds to MBEDTLS_ERR_ASN1_INVALID_LENGTH.

Looking at other ASN.1 parsers, for example ASN.1 JavaScript decoder (decoding 0x02 0x01 0x81) says 0x02 0x01 0x81 decodes to INTEGER -127.

Which OpenSSL seems to agree:

$ echo -n -e '\x02\x01\x81' > /tmp/testhex
$ hexdump -C /tmp/testhex 
00000000  02 01 81                                          |...|
00000003
$ openssl asn1parse -inform DER -in /tmp/testhex 
    0:d=0  hl=2 l=   1 prim: INTEGER           :-7F
 
Jan 14, 2018 12:57
Ron Eldor

Hi Wolf,
Mbed TLS is a small footprint TLS and crypto library, and not meant for use as a general purpose library. As such, negative integers are not part of the scope of the ASN.1 parser.
Regards,
Mbed TLS Team member
Ron

 
Jan 14, 2018 15:41
Wolf Smith

Hi Ron,

Regarding your comment about mbedTLS is not meant for use as a general purpose library, the comments inside config.h seem to indicate otherwise. For example, the description for the MBEDTLS_ASN1_PARSE_C option says this would "Enable the generic ASN1 parser".

The descriptions in asn1.h also don't seem to suggest that the ASN.1 parser is meant for mbedTLS internal use only.

From an embedded system engineering perspective, a generic ASN.1 parser can be very useful outside of TLS and crypto, (for example, many other telecom standards are also specified in ASN.1), so if a project is already using mbedTLS, one can potentially re-use its ASN.1 parser to save some precious resources.

Looking at the API doc on mbedtls_asn1_get_int(), however, it doesn't say anything about not handling negative integers.

While I understand that mbedTLS is aimed at maintaining a small footprint due to its target of resource constrained platforms, is it really that devastating to parse negative integers? After all, if I understand correctly, it's only simple 2's compliment.

And perhaps this could be offered as a configurable option, so that people who do not care about negative integers can stick to the old behaviour, and those who need this capability can enable it in config.h.

Here is a simple patch that I made to test out the idea. MBEDTLS_ASN1_PARSE_NEGINT is the option that can be used to turn on/off negative integer parsing. At the cost of an extra pointer, 2 ifs and some maths, this seems to work.

int mbedtls_asn1_get_int( unsigned char **p,
                  const unsigned char *end,
                  int *val )
{
    int ret;
    size_t len;

#if defined(MBEDTLS_ASN1_PARSE_NEGINT)
    unsigned char *neg = 0;
#endif /* MBEDTLS_ASN1_PARSE_NEGINT */

    if( ( ret = mbedtls_asn1_get_tag( p, end, &len, MBEDTLS_ASN1_INTEGER ) ) != 0 )
        return( ret );

    if( len == 0 || len > sizeof( int ) )
        return( MBEDTLS_ERR_ASN1_INVALID_LENGTH );

    if ((**p & 0x80) != 0)
#if defined(MBEDTLS_ASN1_PARSE_NEGINT)
    {
        neg = *p;
        **p &= ~(0x80);
    }
#else
    {
        return( MBEDTLS_ERR_ASN1_INVALID_DATA );
    }
#endif /* MBEDTLS_ASN1_PARSE_NEGINT */

    *val = 0;

    while( len-- > 0 )
    {
        *val = ( *val << 8 ) | **p;
        (*p)++;
    }

#if defined(MBEDTLS_ASN1_PARSE_NEGINT)
    if (neg)
        *val = ( ~(0) << ((*p - neg) * 8 - 1) ) | *val;
#endif /* MBEDTLS_ASN1_PARSE_NEGINT */

    return( 0 );
}

And testing it with my earlier toy example,

#include <stdio.h>

#include "mbedtls/asn1.h"

int main()
{

    unsigned char content1[6] = {0x02, 0x04, 0xa9, 0xaa, 0x18, 0x7e};
    unsigned char content2[6] = {0x02, 0x04, 0x56, 0x55, 0xe7, 0x82};

    unsigned char *p1   = &(content1[0]);
    unsigned char *end1 = &(content1[5]) + 1;
    int   v1   = 0;
    int   res1 = 0;

    unsigned char *p2   = &(content2[0]);
    unsigned char *end2 = &(content2[5]) + 1;
    int   v2   = 0;
    int   res2 = 0;

    res1 = mbedtls_asn1_get_int(&p1, end1, &v1);

    res2 = mbedtls_asn1_get_int(&p2, end2, &v2);

    printf("res1 = %d, v1 = %d\n", res1, v1);
    printf("res2 = %d, v2 = %d\n", res2, v2);

    return 0;
}

The output below agrees with Lapo's JS-based decoder Decoding 0x02, 0x04, 0xa9, 0xaa, 0x18, 0x7e and Decoding 0x02, 0x04, 0x56, 0x55, 0xe7, 0x82

res1 = 0, v1 = -1448470402
res2 = 0, v2 = 1448470402

and that of OpenSSL, where 5655E782 (hex) is 1448470402 (dec).

$ echo -n -e '\x02\x04\xa9\xaa\x18\x7e' > /tmp/testhex
$ openssl asn1parse -inform DER -in /tmp/testhex 
    0:d=0  hl=2 l=   4 prim: INTEGER           :-5655E782

$ echo -n -e '\x02\x04\x56\x55\xe7\x82' > /tmp/testhex
$ openssl asn1parse -inform DER -in /tmp/testhex 
    0:d=0  hl=2 l=   4 prim: INTEGER           :5655E782

On a side note, the original code returns the error MBEDTLS_ERR_ASN1_INVALID_LENGTH when a negative integer is encountered, which is a bit misleading with respect to the root cause of the problem. Perhaps a MBEDTLS_ERR_ASN1_INVALID_DATA would be semantically more accurate to describe the case of encountering a negative integer that the library is unable to handle.

 
Jan 15, 2018 09:23
Ron Eldor

HI Wolf,
I agree with you that the documentation should be improved, and the returned error code as well.
As I mentioned, Mbed TLS is intended for TLS and cryptography usage. It CAN be used for other needs, but it is not the primary intention.
Supporting negative integers can open the cryptographic operations which we simply want to avoid.
Regards,
Mbed TLS Team member
Ron

 
Jan 15, 2018 13:45
Mark Butcher

Hi

"Mbed TLS is a small footprint TLS and crypto library, and not meant for use as a general purpose library."

I don't know whether OpenSSL fulfills the requirement as "general purpose library" of not, but I do see the ARM MBED statement https://tls.mbed.org/openssl-alternative that states that the goal is to "overtake" OpenSSL

"OpenSSL

OpenSSL is still used in many places as the SSL library for applications. Is it because there is great OpenSSL documentation? No. Is it because the OpenSSL API is intuitive? No. Is it because there is no good replacement for OpenSSL? No. Then why? Because it is most used SSL library in education, books and examples. We are trying to change that!"

If the goal is to replace OpenSSL for applications (if this doesn't mean TLS and crypto library applications) then I would think that known restrictions would need to be resolved in the process.

Regards

Mark

uTasker project developer

 
Jan 15, 2018 14:25
Ron Eldor

Hi Mark,
Thank you for your quote. Even in your quote it mentions:
"OpenSSL is still used in many places as the SSL library for applications".
Perhaps it is not implicit, but it suggests that Mbed TLS can replace Open SSL as the SSL \ TLS library, not as as general purpose library. Negative Integer is not part of TLS standards.
As I previously mentioned, I agree the documentation dhould probably be more implicit on this issue.
Regards,
Mbed TLS Team member
Ron

 
Jan 15, 2018 15:14
Wolf Smith

Hi Ron,

You mentioned "Supporting negative integers can open the cryptographic operations which we simply want to avoid.". Out of curiosity, may I ask what kind of cryptographic operations are you referring to?

It is not immediately clear to me what would be the problems if the parser is made "more generic". After all, from a modular design perspective, a parser should be more about dealing with input at a syntactic input, and not enforcing semantic correctness, which is dependent on the specific application in hand (for example, the requirement of a version number of a certain format must be non-negative is semantic, but its ASN.1 data type being INTEGER is syntactic).

Based on the v2.4.2 source code I have on my hand, I can see that mbedtls_asn1_get_int() gets used in parsing other formats as well but it seems to me that negative integers are not going to break the existing code (of course more analysis and testing would give a higher confidence). For example, in pkcs12.c and pkcs5.c it's used to parse the number of iterations of certain cryptographic operations, which, because of how for-loops are written, a negative iteration count would mean the loop is simply not iterated. Other uses of mbedtls_asn1_get_int() in for example pkparse.c are for getting version numbers, which seem to have explicit boundary checks anyway.

If the behaviour of rejecting all negative integers is really a necessary prerequisite for the other portion of mbedTLS to function correctly (which I doubt), then may I suggest to at least offer my patch as a separate function, say mbedtls_asn1_get_int2c(), so that projects interfacing directly with mbedTLS's ASN.1 parser would get to make their own choices?

 
Jan 16, 2018 06:50
Ron Eldor

Hi Wolf,
I'm sorry, I missed a few words in my sentence. I meant, ""Supporting negative integers can open the cryptographic operations to possible security vulnerabilities which we simply want to avoid."
We will consider this feature request in our quarterly review for feature requests.
Regards,
Mbed TLS Team member
Ron

 
Jan 17, 2018 14:14
Wolf Smith

Hi Ron,

Thanks for your reply. I think it makes sense to have this reviewed and thoroughly tested. I will click through the CLA as you have suggested in my pull request soon.

 
Jan 21, 2018 01:47
Simon Butcher

Can you expand a little on the potential use cases for this? I'd like to understand what it is that this feature enables. Some concrete examples or citing other standards would really help.

Simon Butcher

Mbed TLS - Lead Maintainer

 
Feb 14, 2018 06:10
Wolf Smith

Hi Simon,

I apologize for my late reply; was previously busy with something else.

This feature would allow developers to potentially use the ASN.1 parser that comes with mbedTLS for other generic purpose. Interestingly, I felt like that was perhaps the original intention of mbedTLS as well, as suggested by the comments in asn1.h and asn1parse.c that says Generic ASN.1 parsing. In that case, it would be a pity to have to load two different ASN.1 parsers instead of one.

I understand that this probably means "Generic ASN.1 DER parser", as dealing with other encoding rules, some of which are quite flexible and sometimes ambiguous, is too much to ask for a small footprint library.

Depending on what the applications are trying to do, it is possible that some of them might use negative values. I cannot predict what exactly will developers put in their ASN.1 definitions, as there are infinitely many possibilities.

One example that I can think of, is perhaps Kerberos [RFC 4120], which also mandate the use of DER, and many message types use negative values to notate special meanings. ITU-T has a list of application fields of ASN.1, including some famous examples like UMTS and LTE, as well as in bioinformatics and biometrics, though I cannot guarantee that all of them are using DER for their messages.

After all, ASN.1 is meant to be generic. The details regarding encoding ASN.1 integers can be found in Section 8.3.3 of https://www.itu.int/rec/T-REC-X.690-201508-I/en, where it specifically talks about the handling of 2's complement. It is perhaps a nice coincidence that most of the crypto we use today don't seem to make much of a use of negative numbers (e.g. version numbers, serial numbers, and various algorithm parameters) yet.

I also understand that the genericness of an implementation is upper-bounded by realities, since resources are often limited, if not scarce. However, it seems in this case the 2's complement can be implemented somewhat efficiently (see my previous replies), so I don't see a very strong reason for not consider implementing it. The only concern is that what if somewhere hidden inside mbedTLS, some code makes a questionable assumption of integers being non-negative. That is not a question I can answer and I will leave it for you and your team to discuss and analyze. I said questionable assumption because rejecting negative integers at the parser level to fit the application on hand seems to be the opposite of a modular design, as parser should be more focused on syntactic instead of semantic issues.

A conservative compromise is to perhaps offer this negative integer parsing capability as an experimental configurable feature for the time being, similar to what I did in my pull request. Another alternative would be to offer it in a separate function.

One might still argue that the ASN.1 parser in mbedTLS is meant for "internal use" only and should not be used for generic purposes. That is also a fine stance, but then I'd like to suggest 1) update the comments and documentation to reflect this; 2) use MBEDTLS_ERR_ASN1_INVALID_DATA instead of MBEDTLS_ERR_ASN1_INVALID_LENGTH in asn1parse.c when the parser encounters a negative integer.