/****************************************************************************
** Copyright (C) 2012 Nokia Corporation and/or its subsidiary(-ies).
** Copyright (C) 2012 Intel Corporation.
** Contact: http://www.qt-project.org/
** This file is part of the QtCore module of the Qt Toolkit.
** $QT_BEGIN_LICENSE:LGPL$
** GNU Lesser General Public License Usage
** This file may be used under the terms of the GNU Lesser General Public
** License version 2.1 as published by the Free Software Foundation and
** appearing in the file LICENSE.LGPL included in the packaging of this
** file. Please review the following information to ensure the GNU Lesser
** General Public License version 2.1 requirements will be met:
** http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html.
** In addition, as a special exception, Nokia gives you certain additional
** rights. These rights are described in the Nokia Qt LGPL Exception
** version 1.1, included in the file LGPL_EXCEPTION.txt in this package.
** GNU General Public License Usage
** Alternatively, this file may be used under the terms of the GNU General
** Public License version 3.0 as published by the Free Software Foundation
** and appearing in the file LICENSE.GPL included in the packaging of this
** file. Please review the following information to ensure the GNU General
** Public License version 3.0 requirements will be met:
** http://www.gnu.org/copyleft/gpl.html.
** Alternatively, this file may be used in accordance with the terms and
** conditions contained in a signed written agreement between you and Nokia.
****************************************************************************/
\brief The QUrl class provides a convenient interface for working
with URLs.

It can parse and construct URLs in both encoded and unencoded
form. QUrl also has support for internationalized domain names
(IDNs).

The most common way to use QUrl is to initialize it via the
constructor by passing a QString. Otherwise, setUrl() and
setEncodedUrl() can also be used.

URLs can be represented in two forms: encoded or unencoded. The
unencoded representation is suitable for showing to users, but
the encoded representation is typically what you would send to
a web server. For example, the unencoded URL
"http://b\uuml\c{}hler.example.com/List of applicants.xml" would be
sent to the server as
"http://xn--bhler-kva.example.com/List%20of%20applicants.xml".

A URL can also be constructed piece by piece by calling
setScheme(), setUserName(), setPassword(), setHost(), setPort(),
setPath(), setEncodedQuery() and setFragment(). Some convenience
functions are also available: setAuthority() sets the user name,
password, host and port. setUserInfo() sets the user name and
password at once.

Call isValid() to check if the URL is valid. This can be done at
any point during the construction of a URL.

Constructing a query is particularly convenient through the use
of setQueryItems(), addQueryItem() and removeQueryItem(). Use
setQueryDelimiters() to customize the delimiters used for
generating the query string.

For the convenience of generating encoded URL strings or query
strings, there are two static functions called
fromPercentEncoding() and toPercentEncoding() which deal with
percent encoding and decoding of QStrings.
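
As a rough illustration of what percent encoding and decoding involve, the
following standalone C++ sketch encodes every byte outside the RFC 3986
"unreserved" set and turns "%XX" sequences back into bytes. The names
percentEncodeSketch() and percentDecodeSketch() are hypothetical and are not
part of QUrl; the real toPercentEncoding() additionally takes lists of
characters to exclude from or include in the encoding, which this sketch
does not model.

```cpp
#include <cstddef>
#include <string>

// RFC 3986 "unreserved" characters never need percent-encoding.
static bool isUnreservedSketch(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= '0' && c <= '9')
        || c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical helper (not a Qt function): encode every other byte as "%XX".
std::string percentEncodeSketch(const std::string &input)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : input) {
        if (isUnreservedSketch(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}

// Value of one hexadecimal digit, or -1 if c is not a hex digit.
static int hexValueSketch(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode "%XX" sequences; malformed ones are kept as-is.
std::string percentDecodeSketch(const std::string &input)
{
    std::string out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            const int hi = hexValueSketch(input[i + 1]);
            const int lo = hexValueSketch(input[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        out += input[i];
    }
    return out;
}
```

With these helpers, "List of applicants.xml" encodes to
"List%20of%20applicants.xml", and decoding that string gives back the
original text.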
Calling isRelative() will tell whether or not the URL is
relative. A relative URL can be resolved by passing it as argument
to resolved(), which returns an absolute URL. isParentOf() is used
for determining whether one URL is a parent of another.

fromLocalFile() constructs a QUrl by parsing a local
file path. toLocalFile() converts a URL to a local file path.

The human readable representation of the URL is fetched with
toString(). This representation is appropriate for displaying a
URL to a user in unencoded form. The encoded form however, as
returned by toEncoded(), is for internal use, passing to web
servers, mail clients and so on.

QUrl conforms to the URI specification from
\l{RFC 3986} (Uniform Resource Identifier: Generic Syntax), and includes
scheme extensions from \l{RFC 1738} (Uniform Resource Locators). Case
folding rules in QUrl conform to \l{RFC 3491} (Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)).

\section2 Character Conversions

Follow these rules to avoid erroneous character conversion when
dealing with URLs and strings:

\list
\li When creating a QString to contain a URL from a QByteArray or a
char*, always use QString::fromUtf8().
\li Favor the use of QUrl::fromEncoded() and QUrl::toEncoded() instead of
QUrl(string) and QUrl::toString() when converting a QUrl to or from
a string.
\endlist
\enum QUrl::ParsingMode

The parsing mode controls the way QUrl parses strings.

\value TolerantMode QUrl will try to correct some common errors in URLs.
                    This mode is useful when processing URLs entered by
                    users.

\value StrictMode Only valid URLs are accepted. This mode is useful for
                  general URL validation.

In TolerantMode, the parser corrects the following invalid input:

\list

\li Spaces and "%20": If an encoded URL contains a space, this will be
replaced with "%20". If a decoded URL contains "%20", this will be
replaced with a single space before the URL is parsed.

\li Single "%" characters: Any occurrences of a percent character "%" not
followed by exactly two hexadecimal characters (e.g., "13% coverage.html")
will be replaced by "%25".

\li Reserved and unreserved characters: An encoded URL should only
contain a few characters as literals; all other characters should
be percent-encoded. In TolerantMode, these characters will be
automatically percent-encoded where they are not allowed:
space / double-quote / "<" / ">" / "[" / "\" /
"]" / "^" / "`" / "{" / "|" / "}"

\endlist
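
The correction of single "%" characters described above can be pictured with
the following standalone C++ sketch. It is an illustration under stated
assumptions, not the TolerantMode implementation: escapeLonePercentsSketch()
is a hypothetical name, and the sketch treats a "%" as valid whenever at
least two hexadecimal digits follow it.

```cpp
#include <cstddef>
#include <string>

// True for 0-9, a-f and A-F.
static bool isHexDigitSketch(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')
        || (c >= 'A' && c <= 'F');
}

// Hypothetical helper: replace every "%" that does not start a "%XX"
// escape with "%25", leaving well-formed escapes untouched.
std::string escapeLonePercentsSketch(const std::string &input)
{
    std::string out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const bool startsEscape = input[i] == '%'
            && i + 2 < input.size()
            && isHexDigitSketch(input[i + 1])
            && isHexDigitSketch(input[i + 2]);
        if (input[i] == '%' && !startsEscape)
            out += "%25";
        else
            out += input[i];
    }
    return out;
}
```

Applied to "13% coverage.html" the sketch produces "13%25 coverage.html",
while an input such as "a%20b" is returned unchanged.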
\enum QUrl::FormattingOption

The formatting options define how the URL is formatted when written out
as text.

\value None The format of the URL is unchanged.
\value RemoveScheme The scheme is removed from the URL.
\value RemovePassword Any password in the URL is removed.
\value RemoveUserInfo Any user information in the URL is removed.
\value RemovePort Any specified port is removed from the URL.
\value RemoveAuthority
\value RemovePath The URL's path is removed, leaving only the scheme,
                  host address, and port (if present).
\value RemoveQuery The query part of the URL (following a '?' character)
                   is removed.
\value RemoveFragment
\value PreferLocalFile If the URL is a local file according to isLocalFile()
and contains no query or fragment, a local file path is returned.
\value StripTrailingSlash The trailing slash is removed if one is present.

Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl
conforms to, require host names to always be converted to lower case,
regardless of the QUrl::FormattingOptions used.

\fn uint qHash(const QUrl &url)

Computes a hash key from the normalized version of \a url.
#include "qplatformdefs.h"
#include "qstringlist.h"
#include "qdir.h"         // for QDir::fromNativeSeparators
#include "qtldurl_p.h"
#include "private/qipaddress_p.h"
#include "qurlquery.h"
#if defined(Q_OS_WINCE_WM)
#pragma optimize("g", off)

inline static bool isHex(char c)
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');

static inline char toHex(quint8 c)
    return c > 9 ? c - 10 + 'A' : c + '0';

static inline QString ftpScheme()
    return QStringLiteral("ftp");

static inline QString httpScheme()
    return QStringLiteral("http");

static inline QString fileScheme()
    return QStringLiteral("file");
QUrlPrivate::QUrlPrivate()
      errorCode(NoError), errorSupplement(0),
      sectionIsPresent(0), sectionHasError(0)

QUrlPrivate::QUrlPrivate(const QUrlPrivate &copy)
    : ref(1), port(copy.port),
      userName(copy.userName),
      password(copy.password),
      fragment(copy.fragment),
      errorCode(copy.errorCode),
      errorSupplement(copy.errorSupplement),
      sectionIsPresent(copy.sectionIsPresent),
      sectionHasError(copy.sectionHasError)

void QUrlPrivate::clear()
    sectionIsPresent = 0;
// From RFC 3986, Appendix A Collected ABNF for URI
//    URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
//    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
//    authority     = [ userinfo "@" ] host [ ":" port ]
//    userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
//    host          = IP-literal / IPv4address / reg-name
//    reg-name      = *( unreserved / pct-encoded / sub-delims )
//    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
//    query         = *( pchar / "/" / "?" )
//    fragment      = *( pchar / "/" / "?" )
//    pct-encoded   = "%" HEXDIG HEXDIG
//    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
//    reserved      = gen-delims / sub-delims
//    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
//    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
//                  / "*" / "+" / "," / ";" / "="
// the path component has a complex ABNF that basically boils down to
// slash-separated segments of "pchar"

// The above is the strict definition of the URL components and it is what we
// return encoded as FullyEncoded. However, we store the equivalent to
// PrettyDecoded internally, as that is the default formatting mode and most
// likely to be used. PrettyDecoded decodes spaces, unicode sequences and
// unambiguous delimiters.

// An ambiguous delimiter is a delimiter that, if it appeared decoded, would be
// interpreted as the beginning of a new component. From last to first
// component, they are:
//  - fragment: none, since it's the last.
//  - query: the "#" character is ambiguous, as it starts the fragment. In
//    addition, the "+" character is treated specially, as should be both
//    intra-query delimiters. Since we don't know which ones they are, we
//    keep all reserved characters untouched.
//  - path: the "#" and "?" characters are ambiguous. In addition to them,
//    the slash itself is considered special.
//  - host: completely special, see setHost() below.
//  - password: the "#", "?", "/", and ":" characters are ambiguous
//  - username: the "#", "?", "/", ":", and "@" characters are ambiguous
//  - scheme: doesn't accept any delimiter, see setScheme() below.

// list the recoding table modifications to be used with the recodeFromUser
// function, according to the rules above
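
// Illustration only: the generic component split described by the ABNF
// above can be tried out with the regular expression given in RFC 3986,
// Appendix B. The sketch below is standalone C++ and is not how this file
// parses URLs (QUrlPrivate::parse() scans for the delimiters by hand);
// UriPartsSketch and splitUriSketch() are hypothetical names.

```cpp
#include <regex>
#include <string>

// Hypothetical container for the five generic URI components.
struct UriPartsSketch {
    std::string scheme;
    std::string authority;
    std::string path;
    std::string query;
    std::string fragment;
};

// Split a URI reference with the regular expression from RFC 3986,
// Appendix B. No validation or percent-decoding is performed.
UriPartsSketch splitUriSketch(const std::string &uri)
{
    static const std::regex pattern(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    UriPartsSketch parts;
    std::smatch match;
    if (std::regex_match(uri, match, pattern)) {
        parts.scheme = match[2].str();
        parts.authority = match[4].str();
        parts.path = match[5].str();
        parts.query = match[7].str();
        parts.fragment = match[9].str();
    }
    return parts;
}
```

// For "http://user@example.com:8080/a/b?x=1#frag" the sketch yields the
// scheme "http", the authority "user@example.com:8080", the path "/a/b",
// the query "x=1" and the fragment "frag".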
#define decode(x) ushort(x)
#define leave(x) ushort(0x100 | (x))
#define encode(x) ushort(0x200 | (x))

static const ushort encodedUserNameActions[] = {
    // first field, everything must be encoded, including the ":"
    //    userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
static const ushort * const prettyUserNameActions = encodedUserNameActions;
static const ushort * const decodedUserNameActions = 0;

static const ushort encodedPasswordActions[] = {
    // same as encodedUserNameActions, but decode ":"
    //    userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
static const ushort * const prettyPasswordActions = encodedPasswordActions;
static const ushort * const decodedPasswordActions = 0;

static const ushort encodedPathActions[] = {
    //    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
static const ushort * const prettyPathActions = encodedPathActions + 2; // allow decoding "[" / "]"
static const ushort * const decodedPathActions = encodedPathActions + 4; // equivalent to leave('/')

static const ushort encodedFragmentActions[] = {
    //    fragment      = *( pchar / "/" / "?" )
    // gen-delims permitted: ":" / "@" / "/" / "?"
    //   ->   must encode: "[" / "]" / "#"
    // HOWEVER: we allow "#" to remain decoded
static const ushort * const prettyFragmentActions = 0;
static const ushort * const decodedFragmentActions = 0;

// the query is handled specially, since we prefer not to transform the delims
static const ushort * const encodedQueryActions = encodedFragmentActions + 4; // encode "#" / "[" / "]"
static inline QString
recode(const QString &input, const ushort *actions, QUrl::ComponentFormattingOptions encoding,
    const QChar *begin = input.constData() + from;
    const QChar *end = input.constData() + iend;
    if (qt_urlRecode(output, begin, end, encoding, actions))
    return input.mid(from, iend - from);

static inline QString
recodeFromUser(const QString &input, const ushort *actions, int from, int end)
    return recode(input, actions,
                  QUrl::DecodeUnicode | QUrl::DecodeAllDelimiters | QUrl::DecodeSpaces,

void QUrlPrivate::appendAuthority(QString &appendTo, QUrl::FormattingOptions options) const
    if ((options & QUrl::RemoveUserInfo) != QUrl::RemoveUserInfo) {
        appendUserInfo(appendTo, options);
            appendTo += QLatin1Char('@');
    appendHost(appendTo, options);
    if (!(options & QUrl::RemovePort) && port != -1)
        appendTo += QLatin1Char(':') + QString::number(port);
void QUrlPrivate::appendUserInfo(QString &appendTo, QUrl::FormattingOptions options) const
    // when constructing the authority or user-info, we never decode the ambiguous delimiters
    options &= ~(QUrl::DecodeAllDelimiters & ~QUrl::DecodeUnambiguousDelimiters);

    appendUserName(appendTo, options);
    if (options & QUrl::RemovePassword || !hasPassword()) {
        appendTo += QLatin1Char(':');
        appendPassword(appendTo, options);
// appendXXXX functions:
// the internal value is already stored in PrettyDecoded form, so that case is easy.
// DecodeUnicode and DecodeSpaces are handled by qt_urlRecode.
// That leaves these functions to handle three cases related to delimiters:
//  1) encoded                           encodedXXXX tables
//  2) DecodeUnambiguousDelimiters       prettyXXXX tables
//  3) DecodeAllDelimiters               decodedXXXX tables
static inline void appendToUser(QString &appendTo, const QString &value, QUrl::FormattingOptions options,
                                const ushort *encodedActions, const ushort *prettyActions, const ushort *decodedActions)
    if (options == QUrl::PrettyDecoded) {
    const ushort *actions = 0;
    if ((options & QUrl::DecodeAllDelimiters) == QUrl::DecodeUnambiguousDelimiters) {
        actions = prettyActions;
    } else if (options & QUrl::DecodeAllDelimiters) {
        actions = decodedActions;
    } else if ((options & QUrl::DecodeAllDelimiters) == 0) {
        actions = encodedActions;
    if (!qt_urlRecode(appendTo, value.constData(), value.constData() + value.length(),
inline void QUrlPrivate::appendUserName(QString &appendTo, QUrl::FormattingOptions options) const
    appendToUser(appendTo, userName, options, encodedUserNameActions, prettyUserNameActions, decodedUserNameActions);

inline void QUrlPrivate::appendPassword(QString &appendTo, QUrl::FormattingOptions options) const
    appendToUser(appendTo, password, options, encodedPasswordActions, prettyPasswordActions, decodedPasswordActions);

inline void QUrlPrivate::appendPath(QString &appendTo, QUrl::FormattingOptions options) const
    appendToUser(appendTo, path, options, encodedPathActions, prettyPathActions, decodedPathActions);

inline void QUrlPrivate::appendFragment(QString &appendTo, QUrl::FormattingOptions options) const
    appendToUser(appendTo, fragment, options, encodedFragmentActions, prettyFragmentActions, decodedFragmentActions);

inline void QUrlPrivate::appendQuery(QString &appendTo, QUrl::FormattingOptions options) const
    // almost the same code as the previous functions
    // except we prefer not to touch the delimiters
    if (options == QUrl::PrettyDecoded) {
    const ushort *actions = 0;
    if ((options & QUrl::DecodeAllDelimiters) == QUrl::DecodeUnambiguousDelimiters) {
        // reset to default qt_urlRecode behaviour (leave delimiters alone)
        options &= ~QUrl::DecodeAllDelimiters;
    } else if ((options & QUrl::DecodeAllDelimiters) == 0) {
        actions = encodedQueryActions;
    if (!qt_urlRecode(appendTo, query.constData(), query.constData() + query.length(),
bool QUrlPrivate::setScheme(const QString &value, int len, bool decoded)
    // schemes are strictly RFC-compliant:
    //    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    // but we need to decode any percent-encoding sequences that fall on
    // we also lowercase the scheme

    sectionIsPresent |= Scheme;
    sectionHasError |= Scheme; // assume it has errors, we'll clear before returning true
    errorCode = SchemeEmptyError;
    errorCode = InvalidSchemeError;
    int needsLowercasing = -1;
    const ushort *p = reinterpret_cast<const ushort *>(value.constData());
    for (int i = 0; i < len; ++i) {
        if (p[i] >= 'a' && p[i] <= 'z')
        if (p[i] >= 'A' && p[i] <= 'Z') {
            needsLowercasing = i;
        if (p[i] >= '0' && p[i] <= '9' && i > 0)
        if (p[i] == '+' || p[i] == '-' || p[i] == '.')
            // found a percent-encoded sign
            // if we haven't decoded yet, decode and try again
            errorSupplement = '%';
            QString decodedScheme;
            if (qt_urlRecode(decodedScheme, value.constData(), value.constData() + len, 0, 0) == 0)
            return setScheme(decodedScheme, decodedScheme.length(), true);
        // found something else
        errorSupplement = p[i];
    scheme = value.left(len);
    sectionHasError &= ~Scheme;
    if (needsLowercasing != -1) {
        // schemes are ASCII only, so we don't need the full Unicode toLower
        QChar *schemeData = scheme.data(); // force detaching here
        for (int i = needsLowercasing; i >= 0; --i) {
            register ushort c = schemeData[i].unicode();
            if (c >= 'A' && c <= 'Z')
                schemeData[i] = c + 0x20;
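
// Illustration only: the scheme rule quoted at the top of setScheme(),
//    scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// together with the ASCII lowercasing step, can be sketched in standalone
// C++ as below. normalizeSchemeSketch() is a hypothetical name. Unlike
// setScheme(), the sketch does not retry after percent-decoding and does
// not record error codes, and it applies the ABNF literally by rejecting
// "+", "-" and "." in the first position.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: validate `input` against the RFC 3986 scheme rule
// and store the lowercased result in `scheme`. Returns false, leaving
// `scheme` untouched, if the input is empty or contains a character that
// the rule does not allow at that position.
bool normalizeSchemeSketch(const std::string &input, std::string &scheme)
{
    if (input.empty())
        return false;

    std::string result;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (c >= 'a' && c <= 'z') {
            result += c;
        } else if (c >= 'A' && c <= 'Z') {
            // schemes are ASCII only, so adding 0x20 is a full toLower
            result += static_cast<char>(c + 0x20);
        } else if (i > 0 && ((c >= '0' && c <= '9')
                             || c == '+' || c == '-' || c == '.')) {
            result += c;
        } else {
            return false;
        }
    }
    scheme = result;
    return true;
}
```

// Under these assumptions "HTTP" is accepted and stored as "http",
// "svn+ssh" is accepted unchanged, and "1http" is rejected.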
bool QUrlPrivate::setAuthority(const QString &auth, int from, int end)
    sectionHasError &= ~Authority;
    sectionIsPresent &= ~Authority;
    sectionIsPresent |= Host;
    int userInfoIndex = auth.indexOf(QLatin1Char('@'), from);
    if (uint(userInfoIndex) < uint(end)) {
        setUserInfo(auth, from, userInfoIndex);
        from = userInfoIndex + 1;
    int colonIndex = auth.lastIndexOf(QLatin1Char(':'), end - 1);
    if (colonIndex < from)
    if (uint(colonIndex) < uint(end)) {
        if (auth.at(from).unicode() == '[') {
            // check if colonIndex isn't inside the "[...]" part
            int closingBracket = auth.indexOf(QLatin1Char(']'), from);
            if (uint(closingBracket) > uint(colonIndex))
    if (colonIndex == end - 1) {
        // found a colon but no digits after it
        sectionHasError |= Port;
        errorCode = PortEmptyError;
    } else if (uint(colonIndex) < uint(end)) {
        for (int i = colonIndex + 1; i < end; ++i) {
            ushort c = auth.at(i).unicode();
            if (c >= '0' && c <= '9') {
                sectionHasError |= Port;
                errorCode = InvalidPortError;
                x = ulong(-1); // x != ushort(x)
        if (x == ushort(x)) {
            sectionHasError |= Port;
            errorCode = InvalidPortError;
    return setHost(auth, from, qMin<uint>(end, colonIndex)) && !(sectionHasError & Port);
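
// Illustration only: the port handling in setAuthority() accepts decimal
// digits after the last ":" and requires the value to fit in 16 bits (the
// "x == ushort(x)" check). A standalone C++ sketch of that rule follows;
// parsePortSketch() is a hypothetical name, and returning -1 for every
// failure is a simplification of the PortEmptyError / InvalidPortError
// distinction made above.

```cpp
#include <string>

// Hypothetical helper: parse the text after the ":" of an authority.
// Returns the port number (0..65535), or -1 if the text is empty,
// contains a non-digit, or does not fit in 16 bits.
int parsePortSketch(const std::string &digits)
{
    if (digits.empty())
        return -1;

    unsigned long value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9')
            return -1;
        value = value * 10 + static_cast<unsigned long>(c - '0');
        if (value > 0xFFFF)   // bail out early so `value` cannot overflow
            return -1;
    }
    return static_cast<int>(value);
}
```

// With this sketch "8080" parses to 8080 and "65535" to 65535, while
// "65536", "80a" and the empty string are all rejected.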
void QUrlPrivate::setUserInfo(const QString &userInfo, int from, int end)
    int delimIndex = userInfo.indexOf(QLatin1Char(':'), from);
    setUserName(userInfo, from, qMin<uint>(delimIndex, end));

    if (delimIndex == -1) {
        sectionIsPresent &= ~Password;
        sectionHasError &= ~Password;
        setPassword(userInfo, delimIndex + 1, end);

inline void QUrlPrivate::setUserName(const QString &value, int from, int end)
    sectionIsPresent |= UserName;
    sectionHasError &= ~UserName;
    userName = recodeFromUser(value, prettyUserNameActions, from, end);

inline void QUrlPrivate::setPassword(const QString &value, int from, int end)
    sectionIsPresent |= Password;
    sectionHasError &= ~Password;
    password = recodeFromUser(value, prettyPasswordActions, from, end);

inline void QUrlPrivate::setPath(const QString &value, int from, int end)
    // sectionIsPresent |= Path; // not used, save some cycles
    sectionHasError &= ~Path;
    path = recodeFromUser(value, prettyPathActions, from, end);
    // check for the "path-noscheme" case
    // if the path contains a ":" before the first "/", it could be misinterpreted

inline void QUrlPrivate::setFragment(const QString &value, int from, int end)
    sectionIsPresent |= Fragment;
    sectionHasError &= ~Fragment;
    fragment = recodeFromUser(value, prettyFragmentActions, from, end);

inline void QUrlPrivate::setQuery(const QString &value, int from, int iend)
    sectionIsPresent |= Query;
    sectionHasError &= ~Query;

    // use the default actions for the query
    static const ushort decodeActions[] = {
    const QChar *begin = value.constData() + from;
    const QChar *end = value.constData() + iend;
    if (qt_urlRecode(output, begin, end, QUrl::DecodeUnicode | QUrl::DecodeSpaces,
    query = value.mid(from, iend - from);
// The RFC says the host is:
//    host          = IP-literal / IPv4address / reg-name
//    IP-literal    = "[" ( IPv6address / IPvFuture  ) "]"
//    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
//  [a strict definition of IPv6Address and IPv4Address]
//     reg-name      = *( unreserved / pct-encoded / sub-delims )
//
// We deviate from the standard in all but IPvFuture. For IPvFuture we accept
// and store only exactly what the RFC says we should. No percent-encoding is
// permitted in this field, so Unicode characters and space aren't either.
//
// For IPv4 addresses, we accept broken addresses like inet_aton does (that is,
// less than three dots). However, we correct the address to the proper form
// and store the corrected address. After correction, we comply with the RFC and
// it's exclusively composed of unreserved characters.
//
// For IPv6 addresses, we accept addresses including trailing (embedded) IPv4
// addresses, the so-called v4-compat and v4-mapped addresses. We also store
// those addresses like that in the hostname field, which violates the spec.
// IPv6 hosts are stored with the square brackets in the QString. It also
// requires no transformation in any way.
//
// As for registered names, it's the other way around: we accept only valid
// hostnames as specified by STD 3 and IDNA. That means everything we accept is
// valid in the RFC definition above, but there are many valid reg-names
// according to the RFC that we do not accept in the name of security. Since we
// do accept IDNA, reg-names are subject to ACE encoding and decoding, which is
// specified by the DecodeUnicode flag. The hostname is stored in its Unicode form.
inline void QUrlPrivate::appendHost(QString &appendTo, QUrl::FormattingOptions options) const
    // this is the only flag that matters
    options &= QUrl::DecodeUnicode;
    if (host.at(0).unicode() == '[') {
        // IPv6Address and IPvFuture address never require any transformation
        // this is either an IPv4Address or a reg-name
        // if it is a reg-name, it is already stored in Unicode form
        if (options == QUrl::DecodeUnicode)
            appendTo += qt_ACE_do(host, ToAceOnly);
// the whole IPvFuture is passed and parsed here, including brackets
static int parseIpFuture(QString &host, const QChar *begin, const QChar *end)
    //    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    static const char acceptable[] =
            "!$&'()*+,;=" // sub-delims
            "-._~"; // unreserved

    // the brackets and the "v" have been checked
    if (begin[3].unicode() != '.')
        return begin[3].unicode();
    // the version character must be a hex digit (upper bound is 'F', so compare with <=)
    if ((begin[2].unicode() >= 'A' && begin[2].unicode() <= 'F') ||
            (begin[2].unicode() >= 'a' && begin[2].unicode() <= 'f') ||
            (begin[2].unicode() >= '0' && begin[2].unicode() <= '9')) {
        // this is so unlikely that we'll just go down the slow path
        // decode the whole string, skipping the "[vH." and "]" which we already know to be there
        host += QString::fromRawData(begin, 4);
        if (qt_urlRecode(decoded, begin, end, QUrl::FullyEncoded, 0)) {
            begin = decoded.constBegin();
            end = decoded.constEnd();
        for ( ; begin != end; ++begin) {
            if (begin->unicode() >= 'A' && begin->unicode() <= 'Z')
            else if (begin->unicode() >= 'a' && begin->unicode() <= 'z')
            else if (begin->unicode() >= '0' && begin->unicode() <= '9')
            else if (begin->unicode() < 0x80 && strchr(acceptable, begin->unicode()) != 0)
                return begin->unicode();
        host += QLatin1Char(']');
    return begin[2].unicode();
// ONLY the IPv6 address is parsed here, WITHOUT the brackets
static bool parseIp6(QString &host, const QChar *begin, const QChar *end)
    QIPAddressUtils::IPv6Address address;
    if (!QIPAddressUtils::parseIp6(address, begin, end)) {
        // IPv6 failed parsing, check if it was a percent-encoded character in
        // the middle and try again
        if (!qt_urlRecode(decoded, begin, end, QUrl::FullyEncoded, 0)) {
            // no transformation, nothing to re-parse
        // if the parsing fails again, the qt_urlRecode above will return 0
        return parseIp6(host, decoded.constBegin(), decoded.constEnd());
    host.reserve(host.size() + (end - begin));
    host += QLatin1Char('[');
    QIPAddressUtils::toString(host, address);
    host += QLatin1Char(']');
bool QUrlPrivate::setHost(const QString &value, int from, int iend, bool maybePercentEncoded)
    const QChar *begin = value.constData() + from;
    const QChar *end = value.constData() + iend;

    const int len = end - begin;
    sectionIsPresent |= Host;
    sectionHasError &= ~Host;
    if (begin[0].unicode() == '[') {
        // IPv6Address or IPvFuture
        // smallest IPv6 address is      "[::]"   (len = 4)
        // smallest IPvFuture address is "[v7.X]" (len = 6)
        if (end[-1].unicode() != ']') {
            sectionHasError |= Host;
            errorCode = HostMissingEndBracket;
        if (len > 5 && begin[1].unicode() == 'v') {
            int c = parseIpFuture(host, begin, end);
                sectionHasError |= Host;
                errorCode = InvalidIPvFutureError;
                errorSupplement = short(c);
        if (parseIp6(host, begin + 1, end - 1))
        sectionHasError |= Host;
        errorCode = begin[1].unicode() == 'v' ?
                        InvalidIPvFutureError : InvalidIPv6AddressError;
    // check if it's an IPv4 address
    QIPAddressUtils::IPv4Address ip4;
    if (QIPAddressUtils::parseIp4(ip4, begin, end)) {
        QIPAddressUtils::toString(host, ip4);
        sectionHasError &= ~Host;
    // This is probably a reg-name.
    // But it can also be an encoded string that, when decoded becomes one
    // of the types above.
    //
    // Two types of encoding are possible:
    //  percent encoding (e.g., "%31%30%2E%30%2E%30%2E%31" -> "10.0.0.1")
    //  Unicode encoding (some non-ASCII characters case-fold to digits
    //                    when nameprepping is done)
    //
    // The qt_ACE_do function below applies nameprepping and the STD3 check.
    // That means a Unicode string may become an IPv4 address, but it cannot
    // produce a '[' or a '%'.

    // check for percent-encoding first
    if (maybePercentEncoded && qt_urlRecode(s, begin, end, QUrl::MostDecoded, 0)) {
        // something was decoded
        // anything encoded left?
        if (s.contains(QChar(0x25))) { // '%'
            sectionHasError |= Host;
            errorCode = InvalidRegNameError;
        return setHost(s, 0, s.length(), false);
    s = qt_ACE_do(QString::fromRawData(begin, len), NormalizeAce);
        sectionHasError |= Host;
        errorCode = InvalidRegNameError;
    if (QIPAddressUtils::parseIp4(ip4, s.constBegin(), s.constEnd())) {
        QIPAddressUtils::toString(host, ip4);
void QUrlPrivate::parse(const QString &url, QUrl::ParsingMode parsingMode)
    //   URI-reference = URI / relative-ref
    //   URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    //   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
    //   hier-part     = "//" authority path-abempty
    //                 / other path types
    //   relative-part = "//" authority path-abempty
    //                 /  other path types here

    sectionIsPresent = 0;
    // find the important delimiters
    const int len = url.length();
    const QChar *const begin = url.constData();
    const ushort *const data = reinterpret_cast<const ushort *>(begin);

    for (int i = 0; i < len; ++i) {
        register uint uc = data[i];
        if (uc == '#' && hash == -1) {
            // nothing more to be found
        if (question == -1) {
            if (uc == ':' && colon == -1)
    // check if we have a scheme
    if (colon != -1 && setScheme(url, colon)) {
        hierStart = colon + 1;
        // recover from a failed scheme: it might not have been a scheme at all
        sectionIsPresent = 0;
    int hierEnd = qMin<uint>(qMin<uint>(question, hash), len);
    if (hierEnd - hierStart >= 2 && data[hierStart] == '/' && data[hierStart + 1] == '/') {
        // we have an authority, it ends at the first slash after these
        int authorityEnd = hierEnd;
        for (int i = hierStart + 2; i < authorityEnd ; ++i) {
            if (data[i] == '/') {
        setAuthority(url, hierStart + 2, authorityEnd);

        // even if we failed to set the authority properly, let's try to recover
        pathStart = authorityEnd;
        setPath(url, pathStart, hierEnd);
        pathStart = hierStart;
        if (hierStart < hierEnd)
            setPath(url, hierStart, hierEnd);
    if (uint(question) < uint(hash))
        setQuery(url, question + 1, qMin<uint>(hash, len));
        setFragment(url, hash + 1, len);
    if (sectionHasError || parsingMode == QUrl::TolerantMode)
    // The parsing so far was tolerant of errors, so the StrictMode
    // parsing is actually implemented here, as an extra post-check.
    // We only execute it if we haven't found any errors so far.

    // What we need to look out for, that the regular parser tolerates:
    //  - percent signs not followed by two hex digits
    //  - forbidden characters, which should always appear encoded
    //    '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' / DEL
    //    control characters
    //  - delimiters not allowed in certain positions
    //    . scheme: parser is already strict
    //    . user info: gen-delims (except for ':') disallowed
    //    . host: parser is stricter than the standard
    //    . port: parser is stricter than the standard
    //    . path: all delimiters allowed
    //    . fragment: all delimiters allowed
    //    . query: all delimiters allowed
    //    We would only need to check the user-info. However, the presence
    //    of the disallowed gen-delims changes the parsing, so we don't
    //    actually need to do anything
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    for (uint i = 0; i < uint(len); ++i) {
        register uint uc = data[i];
        if ((uc == '%' && (uint(len) < i + 2 || !isHex(data[i + 1]) || !isHex(data[i + 2])))
                || uc <= 0x20 || strchr(forbidden, uc)) {
            errorSupplement = uc;
        if (i > uint(hash)) {
            errorCode = InvalidFragmentError;
            sectionHasError |= Fragment;
        } else if (i > uint(question)) {
            errorCode = InvalidQueryError;
            sectionHasError |= Query;
        } else if (i > uint(pathStart)) {
            // pathStart is never -1
            errorCode = InvalidPathError;
            sectionHasError |= Path;
            // It must be in the authority, since the scheme is strict.
            // Since the port and hostname parsers are also strict,
            // the error can only have happened in the user info.
            int pos = url.indexOf(QLatin1Char(':'), hierStart);
            if (i > uint(pos)) {
                errorCode = InvalidPasswordError;
                sectionHasError |= Password;
                errorCode = InvalidUserNameError;
                sectionHasError |= UserName;
    From http://www.ietf.org/rfc/rfc3986.txt, 5.2.3: Merge paths

    Returns a merge of the current path with the relative path passed

    Note: \a relativePath is relative (does not start with '/').
QString QUrlPrivate::mergePaths(const QString &relativePath) const
    // If the base URI has a defined authority component and an empty
    // path, then return a string consisting of "/" concatenated with
    // the reference's path; otherwise,
    if (!host.isEmpty() && path.isEmpty())
        return QLatin1Char('/') + relativePath;

    // Return a string consisting of the reference's path component
    // appended to all but the last segment of the base URI's path
    // (i.e., excluding any characters after the right-most "/" in the
    // base URI path, or excluding the entire base URI path if it does
    // not contain any "/" characters).
    if (!path.contains(QLatin1Char('/')))
        newPath = relativePath;
        newPath = path.leftRef(path.lastIndexOf(QLatin1Char('/')) + 1) + relativePath;
1113 From http://www.ietf.org/rfc/rfc3986.txt, 5.2.4: Remove dot segments
1115 Removes unnecessary ../ and ./ from the path. Used for normalizing
1118 static void removeDotsFromPath(QString *path)
1120 // The input buffer is initialized with the now-appended path
1121 // components and the output buffer is initialized to the empty
1123 QChar *out = path->data();
1124 const QChar *in = out;
1125 const QChar *end = out + path->size();
1127 // If the input buffer consists only of
1128 // "." or "..", then remove that from the input
1130 if (path->size() == 1 && in[0].unicode() == '.')
1132 else if (path->size() == 2 && in[0].unicode() == '.' && in[1].unicode() == '.')
1134 // While the input buffer is not empty, loop:
1137 // otherwise, if the input buffer begins with a prefix of "../" or "./",
1138 // then remove that prefix from the input buffer;
1139 if (path->size() >= 2 && in[0].unicode() == '.' && in[1].unicode() == '/')
1141 else if (path->size() >= 3 && in[0].unicode() == '.'
1142 && in[1].unicode() == '.' && in[2].unicode() == '/')
1145 // otherwise, if the input buffer begins with a prefix of
1146 // "/./" or "/.", where "." is a complete path segment,
1147 // then replace that prefix with "/" in the input buffer;
1148 if (in <= end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
1149 && in[2].unicode() == '/') {
1152 } else if (in == end - 2 && in[0].unicode() == '/' && in[1].unicode() == '.') {
1153 *out++ = QLatin1Char('/');
// otherwise, if the input buffer begins with a prefix
// of "/../" or "/..", where ".." is a complete path
// segment, then replace that prefix with "/" in the
// input buffer and remove the last segment and its
// preceding "/" (if any) from the output buffer;
1163 if (in <= end - 4 && in[0].unicode() == '/' && in[1].unicode() == '.'
1164 && in[2].unicode() == '.' && in[3].unicode() == '/') {
1165 while (out > path->constData() && (--out)->unicode() != '/')
1167 if (out == path->constData() && out->unicode() != '/')
1171 } else if (in == end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
1172 && in[2].unicode() == '.') {
1173 while (out > path->constData() && (--out)->unicode() != '/')
1175 if (out->unicode() == '/')
1181 // otherwise move the first path segment in
1182 // the input buffer to the end of the output
1183 // buffer, including the initial "/" character
1184 // (if any) and any subsequent characters up
1185 // to, but not including, the next "/"
1186 // character or the end of the input buffer.
1188 while (in < end && in->unicode() != '/')
1191 path->truncate(out - path->constData());
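The steps of RFC 3986, section 5.2.4 can also be written with two separate buffers, which is easier to follow than the in-place pointer version above. This is a standalone sketch using std::string; the names removeDotSegments and dropLastSegment are chosen for the example, and it follows the RFC text rather than reproducing the code above line for line.

```cpp
#include <cassert>
#include <string>

// Removes the last segment and its preceding '/' (if any) from the output.
static void dropLastSegment(std::string &out)
{
    const std::string::size_type slash = out.rfind('/');
    out.erase(slash == std::string::npos ? 0 : slash);
}

// Sketch of RFC 3986, section 5.2.4 (remove dot segments).
std::string removeDotSegments(std::string in)
{
    std::string out;
    while (!in.empty()) {
        if (in.compare(0, 3, "../") == 0) {
            in.erase(0, 3);                       // step A
        } else if (in.compare(0, 2, "./") == 0) {
            in.erase(0, 2);                       // step A
        } else if (in.compare(0, 3, "/./") == 0) {
            in.erase(0, 2);                       // step B: "/./" becomes "/"
        } else if (in == "/.") {
            in = "/";                             // step B
        } else if (in.compare(0, 4, "/../") == 0) {
            in.erase(0, 3);                       // step C: "/../" becomes "/"
            dropLastSegment(out);
        } else if (in == "/..") {
            in = "/";                             // step C
            dropLastSegment(out);
        } else if (in == "." || in == "..") {
            in.clear();                           // step D
        } else {
            // Step E: move the first segment, including a leading '/'.
            const std::string::size_type next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

The in-place version above avoids the second buffer because the output can never grow past the input position.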
1195 void QUrlPrivate::validate() const
QUrlPrivate *that = const_cast<QUrlPrivate *>(this);
1198 that->encodedOriginal = that->toEncoded(); // may detach
1201 QURL_SETFLAG(that->stateFlags, Validated);
1206 QString auth = authority(); // causes the non-encoded forms to be valid
1208 // authority() calls canonicalHost() which sets this
1212 if (scheme == QLatin1String("mailto")) {
1213 if (!host.isEmpty() || port != -1 || !userName.isEmpty() || !password.isEmpty()) {
1214 that->isValid = false;
that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "expected empty host, username, "
                                                      "port and password"),
1219 } else if (scheme == ftpScheme() || scheme == httpScheme()) {
1220 if (host.isEmpty() && !(path.isEmpty() && encodedPath.isEmpty())) {
1221 that->isValid = false;
1222 that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "the host is empty, but not the path"),
1228 const QByteArray &QUrlPrivate::normalized() const
1230 if (QURL_HASFLAG(stateFlags, QUrlPrivate::Normalized))
1231 return encodedNormalized;
1233 QUrlPrivate *that = const_cast<QUrlPrivate *>(this);
1234 QURL_SETFLAG(that->stateFlags, QUrlPrivate::Normalized);
1236 QUrlPrivate tmp = *this;
1237 tmp.scheme = tmp.scheme.toLower();
1238 tmp.host = tmp.canonicalHost();
// ensure the encoded parts of the URL are populated before normalizing them
1241 tmp.ensureEncodedParts();
1242 if (tmp.encodedUserName.contains('%'))
1243 q_normalizePercentEncoding(&tmp.encodedUserName, userNameExcludeChars);
1244 if (tmp.encodedPassword.contains('%'))
1245 q_normalizePercentEncoding(&tmp.encodedPassword, passwordExcludeChars);
1246 if (tmp.encodedFragment.contains('%'))
1247 q_normalizePercentEncoding(&tmp.encodedFragment, fragmentExcludeChars);
1249 if (tmp.encodedPath.contains('%')) {
1250 // the path is a bit special:
1251 // the slashes shouldn't be encoded or decoded.
1252 // They should remain exactly like they are right now
1254 // treat the path as a slash-separated sequence of pchar
1256 result.reserve(tmp.encodedPath.length());
1257 if (tmp.encodedPath.startsWith('/'))
1260 const char *data = tmp.encodedPath.constData();
1265 nextSlash = tmp.encodedPath.indexOf('/', lastSlash);
1267 if (nextSlash == -1)
1268 len = tmp.encodedPath.length() - lastSlash;
1270 len = nextSlash - lastSlash;
1272 if (memchr(data + lastSlash, '%', len)) {
1273 // there's at least one percent before the next slash
1274 QByteArray block = QByteArray(data + lastSlash, len);
1275 q_normalizePercentEncoding(&block, pathExcludeChars);
1276 result.append(block);
1278 // no percents in this path segment, append wholesale
1279 result.append(data + lastSlash, len);
1282 // append the slash too, if it's there
1283 if (nextSlash != -1)
1286 lastSlash = nextSlash;
1287 } while (lastSlash != -1);
1289 tmp.encodedPath = result;
1292 if (!tmp.scheme.isEmpty()) // relative test
1293 removeDotsFromPath(&tmp.encodedPath);
1295 int qLen = tmp.query.length();
1296 for (int i = 0; i < qLen; i++) {
1297 if (qLen - i > 2 && tmp.query.at(i) == '%') {
1299 tmp.query[i] = qToLower(tmp.query.at(i));
1301 tmp.query[i] = qToLower(tmp.query.at(i));
1304 encodedNormalized = tmp.toEncoded();
1306 return encodedNormalized;
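The query loop above lowercases the two hex digits that follow each '%'. A minimal standalone sketch of that single step, using std::string; the function name lowerPercentHex is hypothetical, and the sketch does not decode or re-encode anything.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Lowercases the two characters after every '%' that has at least two
// characters following it. Everything else is left untouched.
std::string lowerPercentHex(std::string s)
{
    for (std::size_t i = 0; i + 2 < s.size(); ++i) {
        if (s[i] == '%') {
            s[i + 1] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i + 1])));
            s[i + 2] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i + 2])));
            i += 2; // skip the digits that were just handled
        }
    }
    return s;
}
```

After this step, two queries that differ only in the case of their percent-encoded triplets compare equal.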
1311 \macro QT_NO_URL_CAST_FROM_STRING
1314 Disables automatic conversions from QString (or char *) to QUrl.
1316 Compiling your code with this define is useful when you have a lot of
1317 code that uses QString for file names and you wish to convert it to
1318 use QUrl for network transparency. In any code that uses QUrl, it can
1319 help avoid missing QUrl::resolved() calls, and other misuses of
1320 QString to QUrl conversions.
1323 url = filename; // probably not what you want
1325 url = QUrl::fromLocalFile(filename);
1326 url = baseurl.resolved(QUrl(filename));
1329 \sa QT_NO_CAST_FROM_ASCII
1334 Constructs a URL by parsing \a url. \a url is assumed to be in human
1335 readable representation, with no percent encoding. QUrl will automatically
1336 percent encode all characters that are not allowed in a URL.
1337 The default parsing mode is TolerantMode.
1339 Parses the \a url using the parser mode \a parsingMode.
1343 \snippet doc/src/snippets/code/src_corelib_io_qurl.cpp 0
1345 To construct a URL from an encoded string, call fromEncoded():
1347 \snippet doc/src/snippets/code/src_corelib_io_qurl.cpp 1
1349 \sa setUrl(), setEncodedUrl(), fromEncoded(), TolerantMode
1351 QUrl::QUrl(const QString &url, ParsingMode parsingMode) : d(0)
1353 setUrl(url, parsingMode);
1357 Constructs an empty QUrl object.
1364 Constructs a copy of \a other.
1366 QUrl::QUrl(const QUrl &other) : d(other.d)
1373 Destructor; called immediately before the object is deleted.
1377 if (d && !d->ref.deref())
1382 Returns true if the URL is non-empty and valid; otherwise returns false.
1384 The URL is run through a conformance test. Every part of the URL
1385 must conform to the standard encoding rules of the URI standard
1386 for the URL to be reported as valid.
1388 \snippet doc/src/snippets/code/src_corelib_io_qurl.cpp 2
1390 bool QUrl::isValid() const
1392 if (isEmpty()) return false;
1393 return d->sectionHasError == 0;
1397 Returns true if the URL has no data; otherwise returns false.
1399 bool QUrl::isEmpty() const
1401 if (!d) return true;
1402 return d->isEmpty();
1406 Resets the content of the QUrl. After calling this function, the
1407 QUrl is equal to one that has been constructed with the default
1412 if (d && !d->ref.deref())
1418 Parses \a url using the parsing mode \a parsingMode.
\a url is assumed to be in Unicode format, with no percent encoding.
1423 Calling isValid() will tell whether or not a valid URL was constructed.
1427 void QUrl::setUrl(const QString &url, ParsingMode parsingMode)
1430 d->parse(url, parsingMode);
1435 Sets the scheme of the URL to \a scheme. As a scheme can only
1436 contain ASCII characters, no conversion or encoding is done on the
The scheme describes the type (or protocol) of the URL. It's
represented by one or more ASCII characters at the start of the URL,
and is followed by a ':'. The following example shows a URL where
the scheme is "ftp":
1444 \img qurl-authority2.png
The scheme can also be empty, in which case the URL is interpreted
as relative.
1449 \sa scheme(), isRelative()
1451 void QUrl::setScheme(const QString &scheme)
1454 if (scheme.isEmpty()) {
1455 // schemes are not allowed to be empty
1456 d->sectionIsPresent &= ~QUrlPrivate::Scheme;
1457 d->sectionHasError &= ~QUrlPrivate::Scheme;
1460 d->setScheme(scheme, scheme.length());
1465 Returns the scheme of the URL. If an empty string is returned,
1466 this means the scheme is undefined and the URL is then relative.
1468 \sa setScheme(), isRelative()
1470 QString QUrl::scheme() const
1472 if (!d) return QString();
1478 Sets the authority of the URL to \a authority.
1480 The authority of a URL is the combination of user info, a host
1481 name and a port. All of these elements are optional; an empty
1482 authority is therefore valid.
1484 The user info and host are separated by a '@', and the host and
1485 port are separated by a ':'. If the user info is empty, the '@'
1486 must be omitted; although a stray ':' is permitted if the port is
1489 The following example shows a valid authority string:
1491 \img qurl-authority.png
1493 void QUrl::setAuthority(const QString &authority)
1496 d->setAuthority(authority, 0, authority.length());
1497 if (authority.isNull()) {
1498 // QUrlPrivate::setAuthority cleared almost everything
1499 // but it leaves the Host bit set
1500 d->sectionIsPresent &= ~QUrlPrivate::Authority;
1505 Returns the authority of the URL if it is defined; otherwise
1506 an empty string is returned.
1510 QString QUrl::authority(ComponentFormattingOptions options) const
1512 if (!d) return QString();
1515 d->appendAuthority(result, options);
Sets the user info of the URL to \a userInfo. The user info is an
optional part of the authority of the URL, as described in
setAuthority().
1524 The user info consists of a user name and optionally a password,
1525 separated by a ':'. If the password is empty, the colon must be
1526 omitted. The following example shows a valid user info string:
1528 \img qurl-authority3.png
1530 \sa userInfo(), setUserName(), setPassword(), setAuthority()
1532 void QUrl::setUserInfo(const QString &userInfo)
1535 QString trimmed = userInfo.trimmed();
1536 d->setUserInfo(trimmed, 0, trimmed.length());
1537 if (userInfo.isNull()) {
1538 // QUrlPrivate::setUserInfo cleared almost everything
1539 // but it leaves the UserName bit set
1540 d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
1545 Returns the user info of the URL, or an empty string if the user
1548 QString QUrl::userInfo(ComponentFormattingOptions options) const
1550 if (!d) return QString();
1553 d->appendUserInfo(result, options);
1558 Sets the URL's user name to \a userName. The \a userName is part
1559 of the user info element in the authority of the URL, as described
1562 \sa setEncodedUserName(), userName(), setUserInfo()
1564 void QUrl::setUserName(const QString &userName)
1567 d->setUserName(userName, 0, userName.length());
1568 if (userName.isNull())
1569 d->sectionIsPresent &= ~QUrlPrivate::UserName;
1573 Returns the user name of the URL if it is defined; otherwise
1574 an empty string is returned.
1576 \sa setUserName(), encodedUserName()
1578 QString QUrl::userName(ComponentFormattingOptions options) const
1580 if (!d) return QString();
1583 d->appendUserName(result, options);
1588 Sets the URL's password to \a password. The \a password is part of
1589 the user info element in the authority of the URL, as described in
1592 \sa password(), setUserInfo()
1594 void QUrl::setPassword(const QString &password)
1597 d->setPassword(password, 0, password.length());
1598 if (password.isNull())
1599 d->sectionIsPresent &= ~QUrlPrivate::Password;
1603 Returns the password of the URL if it is defined; otherwise
1604 an empty string is returned.
1608 QString QUrl::password(ComponentFormattingOptions options) const
1610 if (!d) return QString();
1613 d->appendPassword(result, options);
1618 Sets the host of the URL to \a host. The host is part of the
1621 \sa host(), setAuthority()
1623 void QUrl::setHost(const QString &host)
1626 if (d->setHost(host, 0, host.length())) {
1628 d->sectionIsPresent &= ~QUrlPrivate::Host;
1629 } else if (!host.startsWith(QLatin1Char('['))) {
1630 // setHost failed, it might be IPv6 or IPvFuture in need of bracketing
1631 ushort oldCode = d->errorCode;
1632 ushort oldSupplement = d->errorSupplement;
1633 if (!d->setHost(QLatin1Char('[') + host + QLatin1Char(']'), 0, host.length() + 2)) {
1634 // failed again: choose if this was an IPv6 error or not
1635 if (!host.contains(QLatin1Char(':'))) {
1636 d->errorCode = oldCode;
1637 d->errorSupplement = oldSupplement;
1644 Returns the host of the URL if it is defined; otherwise
1645 an empty string is returned.
1647 QString QUrl::host(ComponentFormattingOptions options) const
1649 if (!d) return QString();
1652 d->appendHost(result, options);
1653 if (result.startsWith(QLatin1Char('[')))
1654 return result.mid(1, result.length() - 2);
1659 Sets the port of the URL to \a port. The port is part of the
1660 authority of the URL, as described in setAuthority().
1662 \a port must be between 0 and 65535 inclusive. Setting the
1663 port to -1 indicates that the port is unspecified.
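The range rule can be sketched with a small standalone type. PortField and its members are hypothetical names for this example, and resetting the stored port to -1 on an out-of-range value is an assumption, since that line is not visible in the code below.

```cpp
#include <cassert>

// Hypothetical stand-in for the port state kept by the URL.
struct PortField {
    int port = -1;         // -1 means "unspecified"
    bool hasError = false;

    void set(int p)
    {
        if (p < -1 || p > 65535) {
            port = -1;     // assumption: an invalid port is stored as unspecified
            hasError = true;
        } else {
            port = p;
            hasError = false;
        }
    }

    // Returns the stored port, or defaultPort if none was specified.
    int get(int defaultPort = -1) const
    {
        return port == -1 ? defaultPort : port;
    }
};
```

Passing the scheme's well-known port as the default lets a caller treat explicit and implicit ports uniformly.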
1665 void QUrl::setPort(int port)
1669 if (port < -1 || port > 65535) {
1670 qWarning("QUrl::setPort: Out of range");
1672 d->sectionHasError |= QUrlPrivate::Port;
1673 d->errorCode = QUrlPrivate::InvalidPortError;
1675 d->sectionHasError &= ~QUrlPrivate::Port;
1684 Returns the port of the URL, or \a defaultPort if the port is
1689 \snippet doc/src/snippets/code/src_corelib_io_qurl.cpp 3
1691 int QUrl::port(int defaultPort) const
1693 if (!d) return defaultPort;
1694 return d->port == -1 ? defaultPort : d->port;
1698 Sets the path of the URL to \a path. The path is the part of the
1699 URL that comes after the authority but before the query string.
1701 \img qurl-ftppath.png
1703 For non-hierarchical schemes, the path will be everything
1704 following the scheme declaration, as in the following example:
1706 \img qurl-mailtopath.png
1710 void QUrl::setPath(const QString &path)
1713 d->setPath(path, 0, path.length());
1715 // optimized out, since there is no path delimiter
1716 // if (path.isNull())
1717 // d->sectionIsPresent &= ~QUrlPrivate::Path;
1721 Returns the path of the URL.
1725 QString QUrl::path(ComponentFormattingOptions options) const
1727 if (!d) return QString();
1730 d->appendPath(result, options);
Returns true if this URL contains a query (i.e., if a '?' was seen in it).
1739 \sa hasQueryItem(), encodedQuery()
1741 bool QUrl::hasQuery() const
1743 if (!d) return false;
1744 return d->hasQuery();
1748 Sets the query string of the URL to \a query. The string is
1749 inserted as-is, and no further encoding is performed when calling
1752 This function is useful if you need to pass a query string that
1753 does not fit into the key-value pattern, or that uses a different
1754 scheme for encoding special characters than what is suggested by
Passing a value of QString() to \a query (a null QString) unsets
the query completely. However, passing a value of QString("")
1759 will set the query to an empty value, as if the original URL
1762 \sa encodedQuery(), hasQuery()
1764 void QUrl::setQuery(const QString &query)
1768 d->setQuery(query, 0, query.length());
1770 d->sectionIsPresent &= ~QUrlPrivate::Query;
1773 void QUrl::setQuery(const QUrlQuery &query)
1777 // we know the data is in the right format
1778 d->query = query.toString();
1779 if (query.isEmpty())
1780 d->sectionIsPresent &= ~QUrlPrivate::Query;
1782 d->sectionIsPresent |= QUrlPrivate::Query;
1786 Returns the query string of the URL in percent encoded form.
1788 QString QUrl::query(ComponentFormattingOptions options) const
1790 if (!d) return QString();
1793 d->appendQuery(result, options);
1794 if (d->hasQuery() && result.isNull())
1800 Sets the fragment of the URL to \a fragment. The fragment is the
1801 last part of the URL, represented by a '#' followed by a string of
1802 characters. It is typically used in HTTP for referring to a
1803 certain link or point on a page:
1805 \img qurl-fragment.png
1807 The fragment is sometimes also referred to as the URL "reference".
1809 Passing an argument of QString() (a null QString) will unset the fragment.
1810 Passing an argument of QString("") (an empty but not null QString)
1811 will set the fragment to an empty string (as if the original URL
1814 \sa fragment(), hasFragment()
1816 void QUrl::setFragment(const QString &fragment)
1820 d->setFragment(fragment, 0, fragment.length());
1821 if (fragment.isNull())
1822 d->sectionIsPresent &= ~QUrlPrivate::Fragment;
1826 Returns the fragment of the URL.
1830 QString QUrl::fragment(ComponentFormattingOptions options) const
1832 if (!d) return QString();
1835 d->appendFragment(result, options);
1836 if (d->hasFragment() && result.isNull())
Returns true if this URL contains a fragment (i.e., if a '#' was seen in it).
1846 \sa fragment(), setFragment()
1848 bool QUrl::hasFragment() const
1850 if (!d) return false;
1851 return d->hasFragment();
Returns the TLD (Top-Level Domain) of the URL (e.g. .co.uk, .net).
1858 Note that the return value is prefixed with a '.' unless the
1859 URL does not contain a valid TLD, in which case the function returns
1862 QString QUrl::topLevelDomain(ComponentFormattingOptions options) const
1864 QString tld = qTopLevelDomain(host());
1865 if ((options & DecodeUnicode) == 0) {
1866 return qt_ACE_do(tld, ToAceOnly);
1872 Returns the result of the merge of this URL with \a relative. This
1873 URL is used as a base to convert \a relative to an absolute URL.
1875 If \a relative is not a relative URL, this function will return \a
1876 relative directly. Otherwise, the paths of the two URLs are
1877 merged, and the new URL returned has the scheme and authority of
1878 the base URL, but with the merged path, as in the following
1881 \snippet doc/src/snippets/code/src_corelib_io_qurl.cpp 5
1883 Calling resolved() with ".." returns a QUrl whose directory is
1884 one level higher than the original. Similarly, calling resolved()
1885 with "../.." removes two levels from the path. If \a relative is
1886 "/", the path becomes "/".
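For illustration, the reference resolution described here can be sketched along the lines of RFC 3986, section 5.2.2. This is a standalone version using std::string and std::optional; UrlParts, removeDots, merge and resolveReference are names invented for the example. It follows the strict RFC algorithm, so it does not reproduce the lenient handling of a relative URL that repeats the base scheme.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical container for the five RFC 3986 components.
struct UrlParts {
    std::optional<std::string> scheme, authority, query, fragment;
    std::string path;
};

static void dropLastSegment(std::string &out)
{
    const std::string::size_type slash = out.rfind('/');
    out.erase(slash == std::string::npos ? 0 : slash);
}

// RFC 3986, section 5.2.4.
static std::string removeDots(std::string in)
{
    std::string out;
    while (!in.empty()) {
        if (in.compare(0, 3, "../") == 0) in.erase(0, 3);
        else if (in.compare(0, 2, "./") == 0) in.erase(0, 2);
        else if (in.compare(0, 3, "/./") == 0) in.erase(0, 2);
        else if (in == "/.") in = "/";
        else if (in.compare(0, 4, "/../") == 0) { in.erase(0, 3); dropLastSegment(out); }
        else if (in == "/..") { in = "/"; dropLastSegment(out); }
        else if (in == "." || in == "..") in.clear();
        else {
            const std::string::size_type next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}

// RFC 3986, section 5.2.3.
static std::string merge(const UrlParts &base, const std::string &rel)
{
    if (base.authority && base.path.empty())
        return "/" + rel;
    const std::string::size_type slash = base.path.rfind('/');
    return (slash == std::string::npos ? std::string()
                                       : base.path.substr(0, slash + 1)) + rel;
}

// RFC 3986, section 5.2.2 (strict variant).
UrlParts resolveReference(const UrlParts &base, const UrlParts &ref)
{
    UrlParts t;
    if (ref.scheme) {                 // not relative: use the reference as-is
        t = ref;
        t.path = removeDots(ref.path);
        return t;
    }
    t.scheme = base.scheme;
    if (ref.authority) {
        t.authority = ref.authority;
        t.path = removeDots(ref.path);
        t.query = ref.query;
    } else {
        t.authority = base.authority;
        if (ref.path.empty()) {
            t.path = base.path;
            t.query = ref.query ? ref.query : base.query;
        } else {
            t.path = removeDots(ref.path[0] == '/' ? ref.path
                                                   : merge(base, ref.path));
            t.query = ref.query;
        }
    }
    t.fragment = ref.fragment;
    return t;
}
```

With a base of http://a/b/c/d;p?q, a reference path of "../g" resolves to the path "/b/g", and a reference of "/" resolves to "/".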
1890 QUrl QUrl::resolved(const QUrl &relative) const
1892 if (!d) return relative;
1893 if (!relative.d) return *this;
1896 // be non strict and allow scheme in relative url
1897 if (!relative.d->scheme.isEmpty() && relative.d->scheme != d->scheme) {
1900 if (relative.d->hasAuthority()) {
1903 t.d = new QUrlPrivate;
1905 // copy the authority
1906 t.d->userName = d->userName;
1907 t.d->password = d->password;
1908 t.d->host = d->host;
1909 t.d->port = d->port;
1910 t.d->sectionIsPresent = d->sectionIsPresent & QUrlPrivate::Authority;
1912 if (relative.d->path.isEmpty()) {
1913 t.d->path = d->path;
1914 if (relative.d->hasQuery()) {
1915 t.d->query = relative.d->query;
1916 t.d->sectionIsPresent |= QUrlPrivate::Query;
1917 } else if (d->hasQuery()) {
1918 t.d->query = d->query;
1919 t.d->sectionIsPresent |= QUrlPrivate::Query;
1922 t.d->path = relative.d->path.startsWith(QLatin1Char('/'))
1924 : d->mergePaths(relative.d->path);
1925 if (relative.d->hasQuery()) {
1926 t.d->query = relative.d->query;
1927 t.d->sectionIsPresent |= QUrlPrivate::Query;
1931 t.d->scheme = d->scheme;
1933 t.d->sectionIsPresent |= QUrlPrivate::Scheme;
1935 t.d->sectionIsPresent &= ~QUrlPrivate::Scheme;
1937 t.d->fragment = relative.d->fragment;
1938 if (relative.d->hasFragment())
1939 t.d->sectionIsPresent |= QUrlPrivate::Fragment;
1941 t.d->sectionIsPresent &= ~QUrlPrivate::Fragment;
1943 removeDotsFromPath(&t.d->path);
1945 #if defined(QURL_DEBUG)
1946 qDebug("QUrl(\"%s\").resolved(\"%s\") = \"%s\"",
1948 qPrintable(relative.url()),
1949 qPrintable(t.url()));
Returns true if the URL is relative; otherwise returns false. A
URL is relative if its scheme is undefined and its path does not
start with '/'.
1959 bool QUrl::isRelative() const
1961 if (!d) return true;
1962 return !d->hasScheme() && !d->path.startsWith(QLatin1Char('/'));
1966 Returns a string representation of the URL.
1967 The output can be customized by passing flags with \a options.
1969 The resulting QString can be passed back to a QUrl later on.
1971 Synonym for toString(options).
1973 \sa FormattingOptions, toEncoded(), toString()
1975 QString QUrl::url(FormattingOptions options) const
1977 return toString(options);
1981 Returns a string representation of the URL.
1982 The output can be customized by passing flags with \a options.
1984 \sa FormattingOptions, url(), setUrl()
1986 QString QUrl::toString(FormattingOptions options) const
1988 if (!d) return QString();
1990 // return just the path if:
1991 // - QUrl::PreferLocalFile is passed
1992 // - QUrl::RemovePath isn't passed (rather stupid if the user did...)
1993 // - there's no query or fragment to return
1994 // that is, either they aren't present, or we're removing them
1995 // - it's a local file
1996 // (test done last since it's the most expensive)
1997 if (options.testFlag(QUrl::PreferLocalFile) && !options.testFlag(QUrl::RemovePath)
1998 && (!d->hasQuery() || options.testFlag(QUrl::RemoveQuery))
1999 && (!d->hasFragment() || options.testFlag(QUrl::RemoveFragment))
2001 return path(options);
2006 if (!(options & QUrl::RemoveScheme) && d->hasScheme())
2007 url += d->scheme + QLatin1Char(':');
2009 bool pathIsAbsolute = d->path.startsWith(QLatin1Char('/'));
2010 if (!((options & QUrl::RemoveAuthority) == QUrl::RemoveAuthority) && d->hasAuthority()) {
2011 url += QLatin1String("//");
2012 d->appendAuthority(url, options);
2013 } else if (isLocalFile() && pathIsAbsolute) {
2014 url += QLatin1String("//");
2017 if (!(options & QUrl::RemovePath)) {
2018 // check if we need to insert a slash
2019 if (!pathIsAbsolute && !d->path.isEmpty() && !url.isEmpty() && !url.endsWith(QLatin1Char(':')))
2020 url += QLatin1Char('/');
2022 d->appendPath(url, options);
2023 // check if we need to remove trailing slashes
2024 while ((options & StripTrailingSlash) && url.endsWith(QLatin1Char('/')))
2028 if (!(options & QUrl::RemoveQuery) && d->hasQuery()) {
2029 url += QLatin1Char('?');
2030 d->appendQuery(url, options);
2032 if (!(options & QUrl::RemoveFragment) && d->hasFragment()) {
2033 url += QLatin1Char('#');
2034 d->appendFragment(url, options);
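The assembly order used above (scheme, authority, path, query, fragment) matches the recomposition in RFC 3986, section 5.3. A minimal standalone sketch follows, using std::string and std::optional. UrlParts and recompose are hypothetical names, and the formatting options, the local-file special case and trailing-slash stripping are left out.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical container for the five RFC 3986 components.
struct UrlParts {
    std::optional<std::string> scheme, authority, query, fragment;
    std::string path;
};

std::string recompose(const UrlParts &u)
{
    std::string url;
    if (u.scheme)
        url += *u.scheme + ':';
    if (u.authority)
        url += "//" + *u.authority;

    // Insert a separating slash when a relative path would otherwise run
    // into the authority, but not directly after the scheme's ':'.
    const bool pathIsAbsolute = !u.path.empty() && u.path[0] == '/';
    if (!pathIsAbsolute && !u.path.empty() && !url.empty() && url.back() != ':')
        url += '/';
    url += u.path;

    if (u.query)
        url += '?' + *u.query;
    if (u.fragment)
        url += '#' + *u.fragment;
    return url;
}
```

Using std::optional keeps "absent" and "present but empty" apart, which is the same distinction the presence flags make in the code above.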
2043 Returns a human-displayable string representation of the URL.
2044 The output can be customized by passing flags with \a options.
2045 The option RemovePassword is always enabled, since passwords
2046 should never be shown back to users.
2048 With the default options, the resulting QString can be passed back
to a QUrl later on, but any password that was present initially will
be lost.
2052 \sa FormattingOptions, toEncoded(), toString()
2055 QString QUrl::toDisplayString(FormattingOptions options) const
2057 return toString(options | RemovePassword);
2061 Returns the encoded representation of the URL if it's valid;
2062 otherwise an empty QByteArray is returned. The output can be
2063 customized by passing flags with \a options.
2065 The user info, path and fragment are all converted to UTF-8, and
2066 all non-ASCII characters are then percent encoded. The host name
2067 is encoded using Punycode.
2069 QByteArray QUrl::toEncoded(FormattingOptions options) const
2071 QString stringForm = toString(options);
2072 if (options & DecodeUnicode)
2073 return stringForm.toUtf8();
2074 return stringForm.toLatin1();
2078 \fn QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode parsingMode)
2080 Parses \a input and returns the corresponding QUrl. \a input is
2081 assumed to be in encoded form, containing only ASCII characters.
2083 Parses the URL using \a parsingMode.
2085 \sa toEncoded(), setUrl()
2087 QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode mode)
2089 return QUrl(QString::fromUtf8(input.constData(), input.size()), mode);
2093 Returns a decoded copy of \a input. \a input is first decoded from
2094 percent encoding, then converted from UTF-8 to unicode.
2096 QString QUrl::fromPercentEncoding(const QByteArray &input)
2098 return QString::fromUtf8(QByteArray::fromPercentEncoding(input));
2102 Returns an encoded copy of \a input. \a input is first converted
to UTF-8, and all ASCII characters that are not in the unreserved group
are percent encoded. To prevent characters from being percent encoded,
pass them to \a exclude. To force characters to be percent encoded, pass
them to \a include.
2108 Unreserved is defined as:
2109 ALPHA / DIGIT / "-" / "." / "_" / "~"
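The encoding rule can be sketched without Qt as follows. This standalone version takes a std::string that is assumed to already hold UTF-8; the name percentEncode is hypothetical, and the precedence (a byte listed in include is always encoded, even if it is unreserved or listed in exclude) is the interpretation used for this example.

```cpp
#include <cassert>
#include <string>

// Percent-encodes every byte that is neither unreserved nor listed in
// 'exclude'. Bytes listed in 'include' are always encoded.
std::string percentEncode(const std::string &utf8,
                          const std::string &exclude = std::string(),
                          const std::string &include = std::string())
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : utf8) {
        const unsigned char c = static_cast<unsigned char>(ch);
        const bool unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
        const bool keep = (unreserved || exclude.find(ch) != std::string::npos)
            && include.find(ch) == std::string::npos;
        if (keep) {
            out += ch;
        } else {
            out += '%';
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}
```

For example, encoding "a b/c" with "/" in the exclude list keeps the slash and yields "a%20b/c".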
2111 \snippet doc/src/snippets/code/src_corelib_io_qurl.cpp 6
2113 QByteArray QUrl::toPercentEncoding(const QString &input, const QByteArray &exclude, const QByteArray &include)
2115 return input.toUtf8().toPercentEncoding(exclude, include);
2119 \fn QByteArray QUrl::toPunycode(const QString &uc)
Returns \a uc in Punycode encoding.
2123 Punycode is a Unicode encoding used for internationalized domain
2124 names, as defined in RFC3492. If you want to convert a domain name from
2125 Unicode to its ASCII-compatible representation, use toAce().
2129 \fn QString QUrl::fromPunycode(const QByteArray &pc)
2131 Returns the Punycode decoded representation of \a pc.
2133 Punycode is a Unicode encoding used for internationalized domain
2134 names, as defined in RFC3492. If you want to convert a domain from
2135 its ASCII-compatible encoding to the Unicode representation, use
2142 Returns the Unicode form of the given domain name
2143 \a domain, which is encoded in the ASCII Compatible Encoding (ACE).
2144 The result of this function is considered equivalent to \a domain.
2146 If the value in \a domain cannot be encoded, it will be converted
2147 to QString and returned.
2149 The ASCII Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491
2150 and RFC 3492. It is part of the Internationalizing Domain Names in
2151 Applications (IDNA) specification, which allows for domain names
2152 (like \c "example.com") to be written using international
2155 QString QUrl::fromAce(const QByteArray &domain)
2157 return qt_ACE_do(QString::fromLatin1(domain), NormalizeAce);
2163 Returns the ASCII Compatible Encoding of the given domain name \a domain.
2164 The result of this function is considered equivalent to \a domain.
2166 The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491
2167 and RFC 3492. It is part of the Internationalizing Domain Names in
2168 Applications (IDNA) specification, which allows for domain names
2169 (like \c "example.com") to be written using international
This function returns an empty QByteArray if \a domain is not a valid
2173 hostname. Note, in particular, that IPv6 literals are not valid domain
2176 QByteArray QUrl::toAce(const QString &domain)
2178 QString result = qt_ACE_do(domain, ToAceOnly);
2179 return result.toLatin1();
2185 Returns true if this URL is "less than" the given \a url. This
2186 provides a means of ordering URLs.
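The comparison below walks the components in a fixed order and stops at the first difference. The same ordering can be sketched with std::tie on a plain struct; UrlKey and lessThan are hypothetical names, and std::string comparison stands in for QString::compare here.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical flat view of the components that take part in the ordering.
struct UrlKey {
    std::string scheme, userName, password, host;
    int port = -1;
    std::string path, query, fragment;
};

// Lexicographic comparison, component by component, in the order
// scheme, user name, password, host, port, path, query, fragment.
bool lessThan(const UrlKey &a, const UrlKey &b)
{
    return std::tie(a.scheme, a.userName, a.password, a.host, a.port,
                    a.path, a.query, a.fragment)
         < std::tie(b.scheme, b.userName, b.password, b.host, b.port,
                    b.path, b.query, b.fragment);
}
```

An ordering of this kind is what allows URLs to be used as keys in sorted containers such as QMap or std::map.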
2188 bool QUrl::operator <(const QUrl &url) const
2191 bool thisIsEmpty = !d || d->isEmpty();
2192 bool thatIsEmpty = !url.d || url.d->isEmpty();
2194 // sort an empty URL first
2195 return thisIsEmpty && !thatIsEmpty;
2199 cmp = d->scheme.compare(url.d->scheme);
2203 cmp = d->userName.compare(url.d->userName);
2207 cmp = d->password.compare(url.d->password);
2211 cmp = d->host.compare(url.d->host);
2215 if (d->port != url.d->port)
2216 return d->port < url.d->port;
2218 cmp = d->path.compare(url.d->path);
2222 cmp = d->query.compare(url.d->query);
2226 cmp = d->fragment.compare(url.d->fragment);
2231 Returns true if this URL and the given \a url are equal;
2232 otherwise returns false.
2234 bool QUrl::operator ==(const QUrl &url) const
2239 return url.d->isEmpty();
2241 return d->isEmpty();
2242 return d->scheme == url.d->scheme &&
2243 d->userName == url.d->userName &&
2244 d->password == url.d->password &&
2245 d->host == url.d->host &&
2246 d->port == url.d->port &&
2247 d->path == url.d->path &&
2248 d->query == url.d->query &&
2249 d->fragment == url.d->fragment;
2253 Returns true if this URL and the given \a url are not equal;
2254 otherwise returns false.
2256 bool QUrl::operator !=(const QUrl &url) const
2258 return !(*this == url);
2262 Assigns the specified \a url to this object.
2264 QUrl &QUrl::operator =(const QUrl &url)
2273 qAtomicAssign(d, url.d);
2281 Assigns the specified \a url to this object.
2283 QUrl &QUrl::operator =(const QString &url)
2285 if (url.isEmpty()) {
2289 d->parse(url, TolerantMode);
2295 \fn void QUrl::swap(QUrl &other)
2298 Swaps URL \a other with this URL. This operation is very
2299 fast and never fails.
2309 d = new QUrlPrivate;
2317 bool QUrl::isDetached() const
2319 return !d || d->ref.load() == 1;
2324 Returns a QUrl representation of \a localFile, interpreted as a local
2325 file. This function accepts paths separated by slashes as well as the
2326 native separator for this platform.
2328 This function also accepts paths with a doubled leading slash (or
2329 backslash) to indicate a remote file, as in
2330 "//servername/path/to/file.txt". Note that only certain platforms can
2331 actually open this file using QFile::open().
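The path handling can be sketched as a standalone function. FileUrlParts and the free function fromLocalFile are hypothetical names for the example, and the sketch converts backslashes on every platform, which is an assumption; the real conversion is delegated to QDir::fromNativeSeparators().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: the host and path a file URL would carry.
struct FileUrlParts {
    std::string host;
    std::string path;
};

FileUrlParts fromLocalFile(std::string p)
{
    // Assumption: treat '\\' as a separator regardless of platform.
    std::replace(p.begin(), p.end(), '\\', '/');

    FileUrlParts r;
    if (p.size() > 1 && p[1] == ':' && p[0] != '/') {
        // Drive letter: "C:/dir" becomes "/C:/dir".
        p.insert(p.begin(), '/');
    } else if (p.compare(0, 2, "//") == 0) {
        // Shared drive: "//server/share" splits into host and path.
        const std::size_t slash = p.find('/', 2);
        r.host = p.substr(2, slash == std::string::npos ? slash : slash - 2);
        p = (slash != std::string::npos && slash > 2) ? p.substr(slash)
                                                      : std::string();
    }

    // A literal '%' in a file name must not be read as percent encoding.
    for (const char c : p) {
        if (c == '%')
            r.path += "%25";
        else
            r.path += c;
    }
    return r;
}
```

Escaping '%' last ensures that a file named "100%.txt" survives a round trip through the URL.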
2333 \sa toLocalFile(), isLocalFile(), QDir::toNativeSeparators()
2335 QUrl QUrl::fromLocalFile(const QString &localFile)
2338 url.setScheme(fileScheme());
2339 QString deslashified = QDir::fromNativeSeparators(localFile);
2341 // magic for drives on windows
2342 if (deslashified.length() > 1 && deslashified.at(1) == QLatin1Char(':') && deslashified.at(0) != QLatin1Char('/')) {
2343 deslashified.prepend(QLatin1Char('/'));
2344 } else if (deslashified.startsWith(QLatin1String("//"))) {
2345 // magic for shared drive on windows
2346 int indexOfPath = deslashified.indexOf(QLatin1Char('/'), 2);
2347 url.setHost(deslashified.mid(2, indexOfPath - 2));
2348 if (indexOfPath > 2)
2349 deslashified = deslashified.right(deslashified.length() - indexOfPath);
2351 deslashified.clear();
2354 url.setPath(deslashified.replace(QLatin1Char('%'), QStringLiteral("%25")));
2359 Returns the path of this URL formatted as a local file path. The path
2360 returned will use forward slashes, even if it was originally created
2361 from one with backslashes.
2363 If this URL contains a non-empty hostname, it will be encoded in the
2364 returned value in the form found on SMB networks (for example,
2365 "//servername/path/to/file.txt").
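The reverse mapping can be sketched in the same style. The free function toLocalFile is a hypothetical standalone version that takes the host and path directly and skips the isLocalFile() check. Dropping the leading slash in front of a drive letter is an assumption, since that statement is not visible in the code below.

```cpp
#include <cassert>
#include <string>

std::string toLocalFile(const std::string &host, const std::string &path)
{
    if (!host.empty()) {
        // Shared drive: "//host" followed by the path, which must start
        // with a slash.
        const bool needsSlash = !path.empty() && path[0] != '/';
        return "//" + host + (needsSlash ? "/" + path : path);
    }
    // Assumption: "/C:/dir" is turned back into "C:/dir".
    if (path.size() > 2 && path[0] == '/' && path[2] == ':')
        return path.substr(1);
    return path;
}
```

Together with the construction of file URLs from local paths, this gives a round trip for drive-letter and shared-drive paths.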
2367 \sa fromLocalFile(), isLocalFile()
2369 QString QUrl::toLocalFile() const
2371 // the call to isLocalFile() also ensures that we're parsed
2376 QString ourPath = path();
2378 // magic for shared drive on windows
2379 if (!d->host.isEmpty()) {
2380 tmp = QStringLiteral("//") + d->host + (ourPath.length() > 0 && ourPath.at(0) != QLatin1Char('/')
2381 ? QLatin1Char('/') + ourPath : ourPath);
2384 // magic for drives on windows
2385 if (ourPath.length() > 2 && ourPath.at(0) == QLatin1Char('/') && ourPath.at(2) == QLatin1Char(':'))
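
// Illustration only, not part of QtCore: a minimal sketch of the two branches
// of toLocalFile() on std::string. The name joinLocalFile is hypothetical, and
// the host and path are passed in directly instead of being read from a
// parsed URL. Percent-decoding is deliberately left out.

```cpp
#include <cassert>
#include <string>

inline std::string joinLocalFile(const std::string &host, const std::string &path)
{
    if (!host.empty()) {
        // Share: prefix "//host" and make sure the path part starts with '/'.
        const bool needsSlash = !path.empty() && path[0] != '/';
        return "//" + host + (needsSlash ? "/" + path : path);
    }
    // Drive letter: "/C:/dir" loses its leading slash and becomes "C:/dir".
    if (path.size() > 2 && path[0] == '/' && path[2] == ':')
        return path.substr(1);
    return path;
}
```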
/*!
    Returns true if this URL is pointing to a local file path. A URL is a
    local file path if the scheme is "file".

    Note that this function considers URLs with hostnames to be local file
    paths, even if the eventual file path cannot be opened with
    QFile::open().

    \sa fromLocalFile(), toLocalFile()
*/
bool QUrl::isLocalFile() const
{
    if (!d) return false;

    if (d->scheme != fileScheme())
        return false;   // not file
    return true;
}
/*!
    Returns true if this URL is a parent of \a childUrl. \a childUrl is a child
    of this URL if the two URLs share the same scheme and authority,
    and this URL's path is a parent of the path of \a childUrl.
*/
bool QUrl::isParentOf(const QUrl &childUrl) const
{
    QString childPath = childUrl.path();

    if (!d)
        return ((childUrl.scheme().isEmpty())
            && (childUrl.authority().isEmpty())
            && childPath.length() > 0 && childPath.at(0) == QLatin1Char('/'));

    QString ourPath = path();

    return ((childUrl.scheme().isEmpty() || d->scheme == childUrl.scheme())
            && (childUrl.authority().isEmpty() || authority() == childUrl.authority())
            && childPath.startsWith(ourPath)
            && ((ourPath.endsWith(QLatin1Char('/')) && childPath.length() > ourPath.length())
                || (!ourPath.endsWith(QLatin1Char('/'))
                    && childPath.length() > ourPath.length() && childPath.at(ourPath.length()) == QLatin1Char('/'))));
}
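
// Illustration only, not part of QtCore: a minimal sketch of the path portion
// of the isParentOf() test, on std::string. The name isParentPath is
// hypothetical; the scheme and authority comparisons are left out.

```cpp
#include <cassert>
#include <string>

// `child` must extend `parent` by at least one character, and the extension
// must begin at a '/' boundary.
inline bool isParentPath(const std::string &parent, const std::string &child)
{
    if (child.size() <= parent.size() || child.compare(0, parent.size(), parent) != 0)
        return false;
    if (!parent.empty() && parent.back() == '/')
        return true;                        // "/a/" is a parent of "/a/b"
    return child[parent.size()] == '/';     // "/a" is a parent of "/a/b", not of "/ab"
}
```

// The boundary check is what keeps "/ab" from counting as a child of "/a"
// even though the strings share a prefix.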
#ifndef QT_NO_DATASTREAM
/*!
    Writes url \a url to the stream \a out and returns a reference
    to the stream.

    \sa \link datastreamformat.html Format of the QDataStream operators \endlink
*/
QDataStream &operator<<(QDataStream &out, const QUrl &url)
{
    QByteArray u = url.toString(QUrl::FullyEncoded).toLatin1();
    out << u;
    return out;
}

/*!
    Reads a url into \a url from the stream \a in and returns a
    reference to the stream.

    \sa \link datastreamformat.html Format of the QDataStream operators \endlink
*/
QDataStream &operator>>(QDataStream &in, QUrl &url)
{
    QByteArray u;
    in >> u;
    url.setUrl(QString::fromLatin1(u));
    return in;
}
#endif // QT_NO_DATASTREAM
#ifndef QT_NO_DEBUG_STREAM
QDebug operator<<(QDebug d, const QUrl &url)
{
    d.maybeSpace() << "QUrl(" << url.toDisplayString() << ')';
    return d.space();
}
#endif
/*!
    Returns a text string that explains why a URL is invalid, if that is
    the case; otherwise returns an empty string.
*/
QString QUrl::errorString() const
{
    if (!d)
        return QString();

    if (d->sectionHasError == 0)
        return QString();

    // check if the error code matches a section with error
    if ((d->sectionHasError & (d->errorCode >> 8)) == 0)
        return QString();

    QChar c = d->errorSupplement;
    switch (QUrlPrivate::ErrorCode(d->errorCode)) {
    case QUrlPrivate::NoError:
        return QString();

    case QUrlPrivate::InvalidSchemeError: {
        QString msg = QStringLiteral("Invalid scheme (character '%1' not permitted)");
        return msg.arg(c);
    }
    case QUrlPrivate::SchemeEmptyError:
        return QStringLiteral("Empty scheme");

    case QUrlPrivate::InvalidUserNameError:
        return QString(QStringLiteral("Invalid user name (character '%1' not permitted)"))
                .arg(c);

    case QUrlPrivate::InvalidPasswordError:
        return QString(QStringLiteral("Invalid password (character '%1' not permitted)"))
                .arg(c);

    case QUrlPrivate::InvalidRegNameError:
        if (d->errorSupplement)
            return QString(QStringLiteral("Invalid hostname (character '%1' not permitted)"))
                    .arg(c);
        else
            return QStringLiteral("Hostname contains invalid characters");
    case QUrlPrivate::InvalidIPv4AddressError:
        return QString(); // doesn't happen yet
    case QUrlPrivate::InvalidIPv6AddressError:
        return QStringLiteral("Invalid IPv6 address");
    case QUrlPrivate::InvalidIPvFutureError:
        return QStringLiteral("Invalid IPvFuture address");
    case QUrlPrivate::HostMissingEndBracket:
        return QStringLiteral("Expected ']' to match '[' in hostname");

    case QUrlPrivate::InvalidPortError:
    case QUrlPrivate::PortEmptyError:
        return QStringLiteral("Invalid port or port number out of range");

    case QUrlPrivate::InvalidPathError:
        return QString(QStringLiteral("Invalid path (character '%1' not permitted)"))
                .arg(c);
    case QUrlPrivate::PathContainsColonBeforeSlash:
        return QStringLiteral("Path component contains ':' before any '/'");

    case QUrlPrivate::InvalidQueryError:
        return QString(QStringLiteral("Invalid query (character '%1' not permitted)"))
                .arg(c);

    case QUrlPrivate::InvalidFragmentError:
        return QString(QStringLiteral("Invalid fragment (character '%1' not permitted)"))
                .arg(c);
    }
    return QStringLiteral("<unknown error>");
}
2551 \typedef QUrl::DataPtr
2556 \fn DataPtr &QUrl::data_ptr()
/*! \fn uint qHash(const QUrl &url)

    Returns the hash value for the \a url.
*/
uint qHash(const QUrl &url)
{
    if (!url.d)
        return qHash(-1); // the hash of an unset port (-1)

    return qHash(url.d->scheme) ^
            qHash(url.d->userName) ^
            qHash(url.d->password) ^
            qHash(url.d->host) ^
            qHash(url.d->port) ^
            qHash(url.d->path) ^
            qHash(url.d->query) ^
            qHash(url.d->fragment);
}
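
// Illustration only, not part of QtCore: a minimal sketch of XOR-combining
// per-field hashes in the style of qHash(const QUrl &), using std::hash in
// place of Qt's qHash overloads. The Parts struct and xorHash are
// hypothetical. The sketch is meant to show one property of this way of
// combining: XOR is commutative, so the result does not depend on which field
// a given string sits in.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical three-field record standing in for the URL components.
struct Parts {
    std::string scheme;
    std::string host;
    std::string path;
};

inline std::size_t xorHash(const Parts &p)
{
    std::hash<std::string> h;
    return h(p.scheme) ^ h(p.host) ^ h(p.path);
}
```

// Equal records always hash equally, which is all a hash container requires.
// Records whose fields are permutations of each other also collide, and two
// identical fields cancel each other out.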
// The following code has the following copyright:
/*
   Copyright (C) Research In Motion Limited 2009. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of Research In Motion Limited nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY Research In Motion Limited ''AS IS'' AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL Research In Motion Limited BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/*!
    Returns a valid URL from a user-supplied \a userInput string if one can be
    deduced. If that is not possible, an invalid QUrl() is returned.

    Most applications that can browse the web allow the user to input a URL
    in the form of a plain string. This string can be manually typed into
    a location bar, obtained from the clipboard, or passed in via command
    line arguments.

    When the string is not already a valid URL, a best guess is performed,
    making various web-related assumptions.

    If the string is an absolute file path, a file:// URL is constructed
    using QUrl::fromLocalFile().

    If that is not the case, an attempt is made to turn the string into an
    http:// or ftp:// URL. The latter is used when the part of the string
    before the first dot is 'ftp'. The result is then passed through QUrl's
    tolerant parser; on success a valid QUrl is returned, otherwise a
    default-constructed QUrl().

    \list
    \li qt.nokia.com becomes http://qt.nokia.com
    \li ftp.qt.nokia.com becomes ftp://ftp.qt.nokia.com
    \li hostname becomes http://hostname
    \li /home/user/test.html becomes file:///home/user/test.html
    \endlist
*/
QUrl QUrl::fromUserInput(const QString &userInput)
{
    QString trimmedString = userInput.trimmed();

    // Check first for files, since on Windows drive letters can be interpreted as schemes
    if (QDir::isAbsolutePath(trimmedString))
        return QUrl::fromLocalFile(trimmedString);

    QUrl url = QUrl(trimmedString, QUrl::TolerantMode);
    QUrl urlPrepended = QUrl(QStringLiteral("http://") + trimmedString, QUrl::TolerantMode);

    // Check the most common case of a valid url with scheme and host
    // We check if the port would be valid by adding the scheme to handle the case host:port
    // where the host would be interpreted as the scheme
    if (url.isValid()
        && !url.scheme().isEmpty()
        && (!url.host().isEmpty() || !url.path().isEmpty())
        && urlPrepended.port() == -1)
        return url;

    // Else, try the prepended one and adjust the scheme from the host name
    if (urlPrepended.isValid() && (!urlPrepended.host().isEmpty() || !urlPrepended.path().isEmpty()))
    {
        int dotIndex = trimmedString.indexOf(QLatin1Char('.'));
        const QString hostscheme = trimmedString.left(dotIndex).toLower();
        if (hostscheme == ftpScheme())
            urlPrepended.setScheme(ftpScheme());
        return urlPrepended;
    }

    return QUrl();
}
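
// Illustration only, not part of QtCore: a minimal sketch of how the scheme is
// guessed for scheme-less input, on std::string. The names guessScheme and
// withGuessedScheme are hypothetical. The file-path check, the trimming, and
// the tolerant parsing done by fromUserInput() are all left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// The text before the first '.' (or the whole input when there is no dot) is
// lower-cased and compared with "ftp"; everything else gets "http".
inline std::string guessScheme(const std::string &input)
{
    std::string label = input.substr(0, input.find('.'));
    std::transform(label.begin(), label.end(), label.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return label == "ftp" ? "ftp" : "http";
}

inline std::string withGuessedScheme(const std::string &input)
{
    return guessScheme(input) + "://" + input;
}
```

// Under these assumptions "ftp.qt.nokia.com" becomes "ftp://ftp.qt.nokia.com",
// while "qt.nokia.com" and "hostname" both get the "http://" prefix.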