curl_url_get(3) | Introduction to Library Functions | curl_url_get(3) |
curl_url_get - extract a part from a URL
#include <curl/curl.h>

CURLUcode curl_url_get(const CURLU *url,
                       CURLUPart part,
                       char **content,
                       unsigned int flags);
Given a url handle of a URL object, this function extracts an individual piece or the full URL from it.
The part argument specifies which part to extract (see list below) and content points to a 'char *' to get updated to point to a newly allocated string with the contents.
The flags argument is a bitmask with individual features.
The returned content pointer must be freed with curl_free(3) after use.
The flags argument is zero, one or more bits set in a bitmask.
When the CURLU_URLDECODE bit is set, the query component also gets plus-to-space conversion.
Note that this URL decoding is charset unaware and you get a zero terminated string back with data that could be intended for a particular encoding.
If there are byte values lower than 32 in the decoded string, the get operation returns an error instead.
Note that even when not asking for URL encoding, the '%' (byte 37) is URL encoded to make sure the hostname remains valid.
If libcurl is built without IDN capabilities, using the CURLU_PUNYCODE bit makes curl_url_get(3) return CURLUE_LACKS_IDN if the hostname contains anything outside the ASCII range.
(Added in curl 7.88.0)
If libcurl is built without IDN capabilities, using the CURLU_PUNY2IDN bit makes curl_url_get(3) return CURLUE_LACKS_IDN if the hostname is using punycode.
(Added in curl 8.3.0)
An empty query part is one where there is nothing following the question mark (before the possible fragment). An empty fragment part is one where there is nothing following the hash sign.
(Added in curl 8.8.0)
Using this flag when getting CURLUPART_SCHEME if the scheme was set as the result of a guess makes curl_url_get() return CURLUE_NO_SCHEME.
Using this flag when getting CURLUPART_URL if the scheme was set as the result of a guess makes curl_url_get() return the full URL without the scheme component. Such a URL can then only be parsed with curl_url_set() if CURLU_GUESS_SCHEME is used.
(Added in curl 8.9.0)
We advise using the CURLU_PUNYCODE option to get the URL as "normalized" as possible since IDN allows hostnames to be written in many different ways that still end up the same punycode version.
Zero-length queries and fragments are excluded from the URL unless CURLU_GET_EMPTY is set.
IPv6 names are normalized when set, which should make them as short as possible while maintaining correct syntax.
A not-present query returns part set to NULL.
A zero-length query returns part as NULL unless CURLU_GET_EMPTY is set.
The query part gets pluses converted to space when asked to URL decode on get with the CURLU_URLDECODE bit.
A not-present fragment returns part set to NULL.
A zero-length fragment returns part as NULL unless CURLU_GET_EMPTY is set.
This functionality affects all supported protocols.
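#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURLUcode rc;
  CURLU *url = curl_url();
  if(!url)
    return 1;
  rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
  if(!rc) {
    char *scheme;
    rc = curl_url_get(url, CURLUPART_SCHEME, &scheme, 0);
    if(!rc) {
      printf("the scheme is %s\n", scheme);
      curl_free(scheme); /* returned strings are released with curl_free() */
    }
  }
  curl_url_cleanup(url); /* release the handle even if parsing failed */
  return rc ? 1 : 0;
}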
Added in curl 7.62.0
Returns a CURLUcode error value, which is CURLUE_OK (0) if everything went fine. See the libcurl-errors(3) man page for the full list with descriptions.
If this function returns an error, no URL part is returned.
CURLOPT_CURLU(3), curl_url(3), curl_url_cleanup(3), curl_url_dup(3), curl_url_set(3), curl_url_strerror(3)
2024-10-05 | libcurl |