U8_VALIDATE(3C) | Standard C Library Functions | U8_VALIDATE(3C) |
u8_validate - validate UTF-8 characters and calculate the byte length
#include <sys/u8_textprep.h> int u8_validate(char *u8str, size_t n, char **list, int flag,
int *errnum);
u8str
n
list
flag
U8_VALIDATE_ENTIRE
When this option is used, u8_validate() will check up to n bytes from u8str and possibly more than a character before returning the result.
U8_VALIDATE_CHECK_ADDITIONAL
When this option is supplied with a list of character strings, u8_validate() additionally validates u8str against the character strings supplied with list and returns EBADF in errnum if u8str has any one of the character strings in list.
U8_VALIDATE_UCS2_RANGE
When this option is specified, the valid Unicode coding space is smaller to U+0000 to U+FFFF.
errnum
EBADF
EILSEQ
EINVAL
ERANGE
The u8_validate() function validates u8str in UTF-8 and determines the number of bytes constituting the character(s) pointed to by u8str.
If u8str is a null pointer, u8_validate() returns 0. Otherwise, u8_validate() returns either the number of bytes that constitute the characters if the next n or fewer bytes form valid characters, or -1 if there is an validation failure, in which case it may set errnum to indicate the error.
Example 1 Determine the length of the first UTF-8 character.
#include <sys/u8_textprep.h> char u8[MAXPATHLEN]; int errnum; . . . len = u8_validate(u8, 4, (char **)NULL, 0, &errnum); if (len == -1) {
switch (errnum) {
case EILSEQ:
case EINVAL:
return (MYFS4_ERR_INVAL);
case EBADF:
return (MYFS4_ERR_BADNAME);
case ERANGE:
return (MYFS4_ERR_BADCHAR);
default:
return (-10);
} }
Example 2 Check if there are any invalid characters in the entire string.
#include <sys/u8_textprep.h> char u8[MAXPATHLEN]; int n; int errnum; . . . n = strlen(u8); len = u8_validate(u8, n, (char **)NULL, U8_VALIDATE_ENTIRE, &errnum); if (len == -1) {
switch (errnum) {
case EILSEQ:
case EINVAL:
return (MYFS4_ERR_INVAL);
case EBADF:
return (MYFS4_ERR_BADNAME);
case ERANGE:
return (MYFS4_ERR_BADCHAR);
default:
return (-10);
} }
Example 3 Check if there is any invalid character, including prohibited characters, in the entire string.
#include <sys/u8_textprep.h> char u8[MAXPATHLEN]; int n; int errnum; char *prohibited[4] = {
".", "..", "\\", NULL }; . . . n = strlen(u8); len = u8_validate(u8, n, prohibited,
(U8_VALIDATE_ENTIRE|U8_VALIDATE_CHECK_ADDITIONAL), &errnum); if (len == -1) {
switch (errnum) {
case EILSEQ:
case EINVAL:
return (MYFS4_ERR_INVAL);
case EBADF:
return (MYFS4_ERR_BADNAME);
case ERANGE:
return (MYFS4_ERR_BADCHAR);
default:
return (-10);
} }
See attributes(7) for descriptions of the following attributes:
ATTRIBUTE TYPE | ATTRIBUTE VALUE |
Interface Stability | Committed |
MT-Level | MT-Safe |
u8_strcmp(3C), u8_textprep_str(3C), attributes(7), u8_strcmp(9F), u8_textprep_str(9F), u8_validate(9F)
The Unicode Standard (http://www.unicode.org)
September 18, 2007 | OmniOS |