4.10.18.3 Association of controls and forms
A form-associated element can have a relationship
with a form element, which is called the element's
form owner. If a form-associated element is
not associated with a form element, its form
owner is said to be null.
A form-associated element is, by default, associated
with its ancestor
form element, but may have a form attribute specified to
override this.
This feature allows authors to work around the lack
of support for nested form elements.
If a form-associated element has a form attribute specified, then that
attribute's value must be the ID of a form element in
the element's owner Document.
- element . form
  Returns the element's form owner.
  Returns null if there isn't one.
4.10.19 Attributes common to form controls
4.10.19.1 Naming form controls: the name attribute
The name content
attribute gives the name of the form control, as used in form
submission and in the form element's elements object. If the attribute
is specified, its value must not be the empty string.
Any non-empty value for name
is allowed, but the names "_charset_" and "isindex" are special:
- isindex: This value, if used as the name of a Text control that is the first control in a form that is submitted using the application/x-www-form-urlencoded mechanism, causes the submission to only include the value of this control, with no name.
- _charset_: This value, if used as the name of a Hidden control with no value attribute, is automatically given a value during submission consisting of the submission character encoding.
4.10.19.2 Submitting element directionality: the dirname attribute
The dirname attribute
on a form control element enables the submission of the
directionality of the element, and gives the name of the
field that contains this value during form submission.
If such an attribute is specified, its value must not be the empty
string.
In this example, a form contains a text field and a submission button:
<form action="addcomment.cgi" method=post>
 <p><label>Comment: <input type=text name="comment" dirname="comment.dir" required></label></p>
 <p><button name="mode" type=submit value="add">Post Comment</button></p>
</form>
When the user submits the form, the user agent includes three fields, one called "comment", one called "comment.dir", and one called "mode"; so if the user types "Hello", the submission body might be something like:
comment=Hello&comment.dir=ltr&mode=add
If the user manually switches to a right-to-left writing direction and enters "مرحبًا", the submission body might be something like:
comment=%D9%85%D8%B1%D8%AD%D8%A8%D9%8B%D8%A7&comment.dir=rtl&mode=add
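The effect of dirname on the submission body can be sketched in JavaScript as follows. The helper name buildSubmissionBody and the shape of the control objects are assumptions made for illustration only, and encodeURIComponent is used as an approximation of the form encoding rather than the exact algorithm.

```javascript
// Hypothetical helper: each control is { name, value, dirname?, dir? }.
// When dirname is present, an extra field named by dirname carries "ltr" or "rtl".
function buildSubmissionBody(controls) {
  // Approximation of the URL-encoded format: spaces become "+".
  const encode = (s) => encodeURIComponent(s).replace(/%20/g, "+");
  const fields = [];
  for (const control of controls) {
    fields.push([control.name, control.value]);
    if (control.dirname) {
      fields.push([control.dirname, control.dir === "rtl" ? "rtl" : "ltr"]);
    }
  }
  return fields.map(([n, v]) => encode(n) + "=" + encode(v)).join("&");
}

const body = buildSubmissionBody([
  { name: "comment", value: "Hello", dirname: "comment.dir", dir: "ltr" },
  { name: "mode", value: "add" },
]);
console.log(body); // comment=Hello&comment.dir=ltr&mode=add
```

In a real user agent the directionality comes from the element itself; here it is passed in as the assumed dir property.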
4.10.19.3 Limiting user input length: the maxlength attribute
A form control maxlength attribute, controlled by a dirty value flag, declares a limit on the number of
characters a user can input.
If an element has its form
control maxlength attribute specified,
the attribute's value must be a valid non-negative
integer. If the attribute is specified and applying the
rules for parsing non-negative integers to its value
results in a number, then that number is the element's maximum
allowed value length. If the attribute is omitted or parsing
its value results in an error, then there is no maximum
allowed value length.
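A minimal sketch of how the maximum allowed value length might be derived from the attribute value is given below. The function name is hypothetical, and the parsing behaviour shown (leading whitespace skipped, an optional sign accepted, trailing characters ignored) is an assumption about the rules for parsing non-negative integers, which are defined elsewhere.

```javascript
// Returns the maximum allowed value length, or null when there is none.
// Assumption: the parser skips leading whitespace, accepts an optional sign,
// and ignores anything after the run of digits.
function maximumAllowedValueLength(attributeValue) {
  if (attributeValue === null || attributeValue === undefined) {
    return null; // attribute omitted: no maximum
  }
  const match = /^[ \t\n\f\r]*([+-]?)([0-9]+)/.exec(attributeValue);
  if (!match) {
    return null; // parse error: no maximum
  }
  const number = parseInt(match[2], 10);
  if (match[1] === "-" && number !== 0) {
    return null; // negative numbers are an error for non-negative integers
  }
  return number;
}

console.log(maximumAllowedValueLength("256")); // 256
console.log(maximumAllowedValueLength("abc")); // null
```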
4.10.19.4 Enabling and disabling form controls: the disabled attribute
The disabled
content attribute is a boolean attribute.
A form control is disabled
if its disabled attribute is
set, or if it is a descendant of a fieldset element
whose disabled attribute
is set and is not a descendant of that
fieldset element's first legend element
child, if any.
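The rule above, including the exception for the first legend child, can be illustrated with a small tree of plain objects. The node helper and the property names are assumptions for this sketch and are not the DOM API.

```javascript
// Hypothetical minimal tree node: { tag, disabled, parent, children }.
function node(tag, props = {}, children = []) {
  const n = { tag, disabled: false, parent: null, children, ...props };
  for (const child of children) child.parent = n;
  return n;
}

function isDescendantOf(el, ancestor) {
  for (let p = el.parent; p; p = p.parent) {
    if (p === ancestor) return true;
  }
  return false;
}

function isDisabled(control) {
  if (control.disabled) return true;
  for (let a = control.parent; a; a = a.parent) {
    if (a.tag !== "fieldset" || !a.disabled) continue;
    // Controls inside the fieldset's first legend child are exempt.
    const firstLegend = a.children.find((c) => c.tag === "legend");
    if (!firstLegend || !isDescendantOf(control, firstLegend)) return true;
  }
  return false;
}

const inLegend = node("input");
const inBody = node("input");
node("fieldset", { disabled: true }, [
  node("legend", {}, [inLegend]),
  node("p", {}, [inBody]),
]);
console.log(isDisabled(inLegend), isDisabled(inBody)); // false true
```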
4.10.19.5 Form submission
Attributes for form submission can be specified both
on form elements and on submit buttons (elements that
represent buttons that submit forms, e.g. an input
element whose type attribute is
in the Submit Button
state).
The attributes for form submission that may be
specified on form elements are action, enctype, method, novalidate, and target.
The corresponding attributes for form submission
that may be specified on submit
buttons are formaction, formenctype, formmethod, formnovalidate, and formtarget. When omitted, they
default to the values given on the corresponding attributes on the
form element.
The action and
formaction
content attributes, if specified, must have a value that is a
valid non-empty URL potentially surrounded by
spaces.
The action of an element is
the value of the element's formaction attribute, if the
element is a submit
button and has such an attribute, or the value of its
form owner's action
attribute, if it has one, or else the empty string.
The method and
formmethod
content attributes are enumerated
attributes with the following keywords and states:
- The keyword get, mapping to the state GET, indicating the HTTP GET method.
- The keyword post, mapping to the state POST, indicating the HTTP POST method.
- The keyword dialog, mapping to the state dialog, indicating that submitting the form is intended to close the dialog box in which the form finds itself, if any, and otherwise not submit.
The invalid value default for these attributes is the GET state. (There is no missing value default.)
The method of an element is
one of those states. If the element is a submit button and has a formmethod attribute, then the
element's method is that
attribute's state; otherwise, it is the form owner's
method attribute's state.
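As an illustration, the method of an element could be resolved roughly as follows. The object shapes (isSubmitButton, formmethod, method) are hypothetical stand-ins for the element and its form owner, and treating a form with no method attribute as GET is an assumption of this sketch.

```javascript
const METHOD_STATES = new Map([
  ["get", "GET"],
  ["post", "POST"],
  ["dialog", "dialog"],
]);

// Maps an attribute value to its state; unrecognised values fall back to GET
// (the invalid value default).
function methodState(attributeValue) {
  return METHOD_STATES.get(attributeValue.toLowerCase()) || "GET";
}

function resolveMethod(element, formOwner) {
  if (element.isSubmitButton && element.formmethod !== undefined) {
    return methodState(element.formmethod);
  }
  // Assumption: a form owner without a method attribute is treated as GET.
  return formOwner.method !== undefined ? methodState(formOwner.method) : "GET";
}

console.log(resolveMethod({ isSubmitButton: true, formmethod: "POST" }, { method: "get" })); // POST
console.log(resolveMethod({ isSubmitButton: true }, { method: "bogus" })); // GET
```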
The enctype and
formenctype
content attributes are enumerated
attributes with the following keywords and states:
- The "application/x-www-form-urlencoded" keyword and corresponding state.
- The "multipart/form-data" keyword and corresponding state.
- The "text/plain" keyword and corresponding state.
The invalid value default for these attributes is the
application/x-www-form-urlencoded
state. (There is no missing value default.)
The enctype of an element
is one of those three states. If the element is a submit button and has a formenctype attribute, then the
element's enctype is that
attribute's state; otherwise, it is the form owner's
enctype attribute's state.
The target and
formtarget
content attributes, if specified, must have values that are valid browsing
context names or keywords.
The target of an element is
the value of the element's formtarget attribute, if the
element is a submit
button and has such an attribute; or the value of its
form owner's target
attribute, if it has such an attribute; or, if the
Document contains a base element with a
target attribute, then the
value of the target attribute
of the first such base element; or, if there is no such
element, the empty string.
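The fallback chain for the target can be written out as a short function. The parameter shapes are assumptions for illustration: element and formOwner are plain objects, and baseElements is the list of base elements in document order.

```javascript
function resolveTarget(element, formOwner, baseElements) {
  if (element.isSubmitButton && element.formtarget !== undefined) {
    return element.formtarget;
  }
  if (formOwner.target !== undefined) {
    return formOwner.target;
  }
  // First base element that has a target attribute, if any.
  const base = baseElements.find((b) => b.target !== undefined);
  return base ? base.target : "";
}

console.log(resolveTarget({ isSubmitButton: true }, {}, [{}, { target: "_blank" }])); // _blank
```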
The novalidate
and formnovalidate
content attributes are boolean
attributes. If present, they indicate that the form is not to
be validated during submission.
The no-validate state of
an element is true if the element is a submit button and the element's
formnovalidate attribute
is present, or if the element's form owner's novalidate attribute is present,
and false otherwise.
This attribute is useful to include "save" buttons on forms that have validation constraints, to allow users to save their progress even though they haven't fully entered the data in the form. The following example shows a simple form that has two required fields. There are three buttons: one to submit the form, which requires both fields to be filled in; one to save the form so that the user can come back and fill it in later; and one to cancel the form altogether.
<form action="editor.cgi" method="post">
 <p><label>Name: <input required name=fn></label></p>
 <p><label>Essay: <textarea required name=essay></textarea></label></p>
 <p><input type=submit name=submit value="Submit essay"></p>
 <p><input type=submit formnovalidate name=save value="Save essay"></p>
 <p><input type=submit formnovalidate name=cancel value="Cancel"></p>
</form>
4.10.19.6 Autofocusing a form control: the autofocus attribute
The autofocus
content attribute allows the author to indicate that a control is to
be focused as soon as the page is loaded or as soon as the
dialog within which it finds itself is shown, allowing
the user to just start typing without having to manually focus the
main control.
The autofocus attribute is
a boolean attribute.
An element's nearest ancestor autofocus scoping root
element is the element itself if the element is a
dialog element, or else is the element's nearest
ancestor dialog element, if any, or else is the
element's root element.
There must not be two elements with the same nearest
ancestor autofocus scoping root element that both have the
autofocus attribute
specified.
In the following snippet, the text control would be focused when the document was loaded.
<input maxlength="256" name="q" value="" autofocus> <input type="submit" value="Search">
4.10.19.7 Input modalities: the inputmode attribute
The inputmode
content attribute is an enumerated attribute that
specifies what kind of input mechanism would be most helpful for
users entering content into the form control.
The possible keywords and states for the attributes are listed in the following table. The keywords are listed in the first column. Each maps to the state given in the cell in the second column of that keyword's row, and that state has the fallback state given in the cell in the third column of that row.
| Keyword | State | Fallback state | Description |
|---|---|---|---|
| verbatim | Latin Verbatim | Default | Alphanumeric Latin-script input of non-prose content, e.g. usernames, passwords, product codes. |
| latin | Latin Text | Latin Verbatim | Latin-script input in the user's preferred language(s), with some typing aids enabled (e.g. text prediction). Intended for human-to-computer communications, e.g. free-form text search fields. |
| latin-name | Latin Name | Latin Text | Latin-script input in the user's preferred language(s), with typing aids intended for entering human names enabled (e.g. text prediction from the user's contact list and automatic capitalisation at every word). Intended for situations such as customer name fields. |
| latin-prose | Latin Prose | Latin Text | Latin-script input in the user's preferred language(s), with aggressive typing aids intended for human-to-human communications enabled (e.g. text prediction and automatic capitalisation at the start of sentences). Intended for situations such as e-mails and instant messaging. |
| full-width-latin | Full-width Latin | Latin Prose | Latin-script input in the user's secondary language(s), using full-width characters, with aggressive typing aids intended for human-to-human communications enabled (e.g. text prediction and automatic capitalisation at the start of sentences). Intended for Latin text embedded inside CJK text. |
| kana | Kana | Default | Kana or romaji input, typically hiragana input, using full-width characters, with support for converting to kanji. Intended for Japanese text input. |
| katakana | Katakana | Kana | Katakana input, using full-width characters, with support for converting to kanji. Intended for Japanese text input. |
| numeric | Numeric | Default | Numeric input, including keys for the digits 0 to 9, the user's preferred thousands separator character, and the character for indicating negative numbers. Intended for numeric codes, e.g. credit card numbers. (For numbers, prefer "<input type=number>".) |
| tel | Telephone | Numeric | Telephone number input, including keys for the digits 0 to 9, the "#" character, and the "*" character. In some locales, this can also include alphabetic mnemonic labels (e.g. in the US, the key labeled "2" is historically also labeled with the letters A, B, and C). Rarely necessary; use "<input type=tel>" instead. |
| email | E-mail | Default | Text input in the user's locale, with keys for aiding in the input of e-mail addresses, such as that for the "@" character and the "." character. Rarely necessary; use "<input type=email>" instead. |
| url | URL | Default | Text input in the user's locale, with keys for aiding in the input of Web addresses, such as that for the "/" and "." characters and for quick input of strings commonly found in domain names such as "www." or ".co.uk". Rarely necessary; use "<input type=url>" instead. |
The last three keywords listed above are only provided for completeness, and are rarely necessary, as dedicated input controls exist for their usual use cases (as described in the table above).
User agents all support the Default input mode state, which corresponds to the user agent's default input modality. The missing value default is the default input mode state.
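The fallback relationship between states can be sketched as a walk along the fallback chain until a supported state is reached. The function name and the supportedStates parameter are hypothetical, the state name "E-mail" is taken to be the state for the email keyword, and mapping unrecognised keywords to Default is an assumption of this sketch.

```javascript
// [keyword, state, fallback state], following the table above.
const INPUT_MODE_ROWS = [
  ["verbatim", "Latin Verbatim", "Default"],
  ["latin", "Latin Text", "Latin Verbatim"],
  ["latin-name", "Latin Name", "Latin Text"],
  ["latin-prose", "Latin Prose", "Latin Text"],
  ["full-width-latin", "Full-width Latin", "Latin Prose"],
  ["kana", "Kana", "Default"],
  ["katakana", "Katakana", "Kana"],
  ["numeric", "Numeric", "Default"],
  ["tel", "Telephone", "Numeric"],
  ["email", "E-mail", "Default"],
  ["url", "URL", "Default"],
];
const stateForKeyword = new Map(INPUT_MODE_ROWS.map(([k, s]) => [k, s]));
const fallbackForState = new Map(INPUT_MODE_ROWS.map(([, s, f]) => [s, f]));

// supportedStates is a Set of state names; Default is always supported.
function resolveInputMode(attributeValue, supportedStates) {
  // Assumption: a missing or unrecognised keyword maps to Default.
  let state = attributeValue == null
    ? "Default"
    : stateForKeyword.get(attributeValue.toLowerCase()) || "Default";
  while (state !== "Default" && !supportedStates.has(state)) {
    state = fallbackForState.get(state);
  }
  return state;
}

console.log(resolveInputMode("full-width-latin", new Set(["Latin Text"]))); // Latin Text
console.log(resolveInputMode("tel", new Set())); // Default
```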
4.10.19.8 Autofilling form controls: the autocomplete attribute
User agents sometimes have features for helping users fill forms
in, for example prefilling the user's address based on earlier user
input. The autocomplete content
attribute can be used to hint to the user agent how to, or indeed
whether to, provide such a feature.
The attribute, if present, must have a value that is a set
of space-separated tokens consisting of either a single token
that is an ASCII case-insensitive match for the string
"off", or a single
token that is an ASCII case-insensitive match for the
string "on", or the
following, in the order given below:
-
Optionally, a token whose first eight characters are an ASCII case-insensitive match for the string "section-", meaning that the field belongs to the named group.
For example, if there are two shipping addresses in the form, then they could be marked up as:
<fieldset>
 <legend>Ship the blue gift to...</legend>
 <p> <label> Address: <input name=ba autocomplete="section-blue shipping street-address"> </label>
 <p> <label> City: <input name=bc autocomplete="section-blue shipping locality"> </label>
 <p> <label> Postal Code: <input name=bp autocomplete="section-blue shipping postal-code"> </label>
</fieldset>
<fieldset>
 <legend>Ship the red gift to...</legend>
 <p> <label> Address: <input name=ra autocomplete="section-red shipping street-address"> </label>
 <p> <label> City: <input name=rc autocomplete="section-red shipping locality"> </label>
 <p> <label> Postal Code: <input name=rp autocomplete="section-red shipping postal-code"> </label>
</fieldset>
-
Optionally, a token that is an ASCII case-insensitive match for one of the following strings:
  - "shipping", meaning the field is part of the shipping address or contact information
  - "billing", meaning the field is part of the billing address or contact information
-
Either of the following two options:
  -
  A token that is an ASCII case-insensitive match for one of the following autofill field strings:
    - "name"
    - "honorific-prefix"
    - "given-name"
    - "additional-name"
    - "family-name"
    - "honorific-suffix"
    - "nickname"
    - "organization-title"
    - "organization"
    - "street-address"
    - "address-line1"
    - "address-line2"
    - "address-line3"
    - "locality"
    - "region"
    - "country"
    - "postal-code"
    - "cc-name"
    - "cc-given-name"
    - "cc-additional-name"
    - "cc-family-name"
    - "cc-number"
    - "cc-exp"
    - "cc-exp-month"
    - "cc-exp-year"
    - "cc-csc"
    - "language"
    - "bday"
    - "bday-day"
    - "bday-month"
    - "bday-year"
    - "sex"
    - "url"
    - "photo"
  (See the table below for descriptions of these values.)
  -
  The following, in the given order:
    -
    Optionally, a token that is an ASCII case-insensitive match for one of the following strings:
      - "home", meaning the field is for contacting someone at their residence
      - "work", meaning the field is for contacting someone at their workplace
      - "mobile", meaning the field is for contacting someone regardless of location
      - "fax", meaning the field describes a fax machine's contact details
      - "pager", meaning the field describes a pager's or beeper's contact details
    -
    A token that is an ASCII case-insensitive match for one of the following autofill field strings:
      - "tel"
      - "tel-country-code"
      - "tel-national"
      - "tel-area-code"
      - "tel-local"
      - "tel-local-prefix"
      - "tel-local-suffix"
      - "tel-extension"
      - "email"
      - "impp"
    (See the table below for descriptions of these values.)
The "off" keyword
indicates either that the control's input data is particularly
sensitive (for example the activation code for a nuclear weapon); or
that it is a value that will never be reused (for example a
one-time-key for a bank login) and the user will therefore have to
explicitly enter the data each time, instead of being able to rely
on the UA to prefill the value for them; or that the document
provides its own autocomplete mechanism and does not want the user
agent to provide autocompletion values.
The "on"
keyword indicates that the user agent is allowed to provide the user
with autocompletion values, but does not provide any further
information about what kind of data the user might be expected to
enter. User agents would have to use heuristics to decide what
autocompletion values to suggest.
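The token grammar described above can be sketched as a small validator. This is an illustrative reading of the rules, not a normative algorithm; the function name isValidAutocomplete is hypothetical, and splitting on space characters is an assumption about what counts as a token separator.

```javascript
const FIELD_NAMES = new Set([
  "name", "honorific-prefix", "given-name", "additional-name", "family-name",
  "honorific-suffix", "nickname", "organization-title", "organization",
  "street-address", "address-line1", "address-line2", "address-line3",
  "locality", "region", "country", "postal-code", "cc-name", "cc-given-name",
  "cc-additional-name", "cc-family-name", "cc-number", "cc-exp",
  "cc-exp-month", "cc-exp-year", "cc-csc", "language", "bday", "bday-day",
  "bday-month", "bday-year", "sex", "url", "photo",
]);
const CONTACT_KINDS = new Set(["home", "work", "mobile", "fax", "pager"]);
const CONTACT_FIELD_NAMES = new Set([
  "tel", "tel-country-code", "tel-national", "tel-area-code", "tel-local",
  "tel-local-prefix", "tel-local-suffix", "tel-extension", "email", "impp",
]);

function isValidAutocomplete(value) {
  const tokens = value.toLowerCase().split(/[ \t\n\f\r]+/).filter(Boolean);
  if (tokens.length === 0) return false;
  if (tokens.length === 1 && (tokens[0] === "on" || tokens[0] === "off")) return true;
  let i = 0;
  if (tokens[i].startsWith("section-")) i++;                      // optional group
  if (tokens[i] === "shipping" || tokens[i] === "billing") i++;   // optional address kind
  if (FIELD_NAMES.has(tokens[i])) return i === tokens.length - 1; // first option
  if (CONTACT_KINDS.has(tokens[i])) i++;                          // optional contact kind
  return CONTACT_FIELD_NAMES.has(tokens[i]) && i === tokens.length - 1;
}

console.log(isValidAutocomplete("section-blue shipping street-address")); // true
console.log(isValidAutocomplete("home street-address")); // false
```

The second call is rejected because the contact kinds ("home", "work", and so on) only combine with the telephone, e-mail, and instant messaging field names.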
The autofill field names
listed above indicate that the user agent is allowed to provide the
user with autocompletion values, and specify what kind of value is
expected. The keywords relate to each other as described in the
table below. Each field name listed on a row of this table
corresponds to the meaning given in the cell for that row in the
column labeled "Meaning". Some fields correspond to subparts of
other fields; for example, a credit card expiry date can be
expressed as one field giving both the month and year of expiry
("cc-exp"), or as
two fields, one giving the month ("cc-exp-month") and
one the year ("cc-exp-year"). In
such cases, the names of the broader fields cover multiple rows, in
which the narrower fields are defined.
Generally, authors are encouraged to use the broader fields rather than the narrower fields, as the narrower fields tend to expose Western biases. For example, while it is common in some Western cultures to have a given name and a family name, in that order (and thus often referred to as a first name and a surname), many cultures put the family name first and the given name second, and many others simply have one name (a mononym). Having a single field is therefore more flexible.
| Field name | Meaning | Example |
|---|---|---|
| "name" | Full name | Sir Timothy John Berners-Lee, OM, KBE, FRS, FREng, FRSA |
| "honorific-prefix" | Prefix or title (e.g. "Mr.", "Ms.", "Dr.", "Mlle") | Sir |
| "given-name" | Given name (in some Western cultures, also known as the first name) | Timothy |
| "additional-name" | Additional names (in some Western cultures, also known as middle names, forenames other than the first name) | John |
| "family-name" | Family name (in some Western cultures, also known as the last name or surname) | Berners-Lee |
| "honorific-suffix" | Suffix (e.g. "Jr.", "B.Sc.", "MBASW", "II") | OM, KBE, FRS, FREng, FRSA |
| "nickname" | Nickname, screen name, handle: a typically short name used instead of the full name | Tim |
| "organization-title" | Job title (e.g. "Software Engineer", "Senior Vice President", "Deputy Managing Director") | Professor |
| "organization" | Company name corresponding to the person, address, or contact information in the other fields associated with this field | World Wide Web Consortium |
| "street-address" | Street address (as one line) | 32 Vassar Street; MIT Room 32-G524 |
| "address-line1" | Street address (as multiple lines) | 32 Vassar Street |
| "address-line2" | Street address (as multiple lines) | MIT Room 32-G524 |
| "address-line3" | Street address (as multiple lines) | |
| "locality" | City, town, village, or other locality within which the relevant street address is found | Cambridge |
| "region" | Province such as a state, county, or canton within which the locality is found | MA |
| "country" | Country | USA |
| "postal-code" | Postal code, post code, ZIP code | 02139 |
| "cc-name" | Full name as given on the payment instrument | Tim Berners-Lee |
| "cc-given-name" | Given name as given on the payment instrument (in some Western cultures, also known as the first name) | Tim |
| "cc-additional-name" | Additional names given on the payment instrument (in some Western cultures, also known as middle names, forenames other than the first name) | |
| "cc-family-name" | Family name given on the payment instrument (in some Western cultures, also known as the last name or surname) | Berners-Lee |
| "cc-number" | Code identifying the payment instrument (e.g. the credit card number, bank account number) | 4114360123456785 |
| "cc-exp" | Expiration date of the payment instrument | 2014-12 |
| "cc-exp-month" | Month component of the expiration date of the payment instrument | 12 |
| "cc-exp-year" | Year component of the expiration date of the payment instrument | 2014 |
| "cc-csc" | Security code for the payment instrument (also known as the card security code (CSC), card validation code (CVC), card verification value (CVV), signature panel code (SPC), credit card ID (CCID), etc) | 419 |
| "language" | Preferred language | English |
| "bday" | Birthday | 1955-06-08 |
| "bday-day" | Day component of birthday | 8 |
| "bday-month" | Month component of birthday | June |
| "bday-year" | Year component of birthday | 1955 |
| "sex" | Gender identity (e.g. Female, Fa'afafine) | Male |
| "url" | Home page or other Web page corresponding to the company, person, address, or contact information in the other fields associated with this field | http://www.w3.org/People/Berners-Lee/ |
| "photo" | Photograph, icon, or other image corresponding to the company, person, address, or contact information in the other fields associated with this field | http://www.w3.org/Press/Stock/Berners-Lee/2001-europaeum-eighth.jpg |
| "tel" | Full telephone number, including country code | +1 617 253 5702 |
| "tel-country-code" | Country code component of the telephone number | +1 |
| "tel-national" | Telephone number without the country code component | 617 253 5702 |
| "tel-area-code" | Area code component of the telephone number | 617 |
| "tel-local" | Telephone number without the country code and area code components | 2535702 |
| "tel-local-prefix" | First part of the component of the telephone number that follows the area code, when that component is split into two components | 253 |
| "tel-local-suffix" | Second part of the component of the telephone number that follows the area code, when that component is split into two components | 5702 |
| "tel-extension" | Telephone number internal extension code | 1000 |
| "email" | E-mail address | timbl@w3.org |
| "impp" | URL representing an instant messaging protocol endpoint (for example, "aim:goim?screenname=example" or "xmpp:fred@example.net") | irc://example.org/timbl,isuser |
If the autocomplete
attribute is omitted, the default value corresponding to the state
of the element's form owner's autocomplete attribute is used
instead (either "on" or
"off"). If there is no
form owner, then the value "on" is used.
4.10.20 APIs for the text field selections
The input and textarea elements define
the following members in their DOM interfaces for handling their
selection:
void select();
attribute unsigned long selectionStart;
attribute unsigned long selectionEnd;
attribute DOMString selectionDirection;
void setRangeText(DOMString replacement);
void setRangeText(DOMString replacement, unsigned long start, unsigned long end, optional SelectionMode selectionMode);
void setSelectionRange(unsigned long start, unsigned long end, optional DOMString direction = "preserve");
The setRangeText method
uses the following enumeration:
enum SelectionMode {
"select",
"start",
"end",
"preserve",
};
These methods and attributes expose and control the selection of
input and textarea text fields.
- element . select()
  Selects everything in the text field.
- element . selectionStart [ = value ]
  Returns the offset to the start of the selection.
  Can be set, to change the start of the selection.
- element . selectionEnd [ = value ]
  Returns the offset to the end of the selection.
  Can be set, to change the end of the selection.
- element . selectionDirection [ = value ]
  Returns the current direction of the selection.
  Can be set, to change the direction of the selection.
  The possible values are "forward", "backward", and "none".
- element . setSelectionRange(start, end [, direction] )
  Changes the selection to cover the given substring in the given direction. If the direction is omitted, it will be reset to be the platform default (none or forward).
- element . setRangeText(replacement [, start, end [, selectionMode ] ] )
  Replaces a range of text with the new text. If the start and end arguments are not provided, the range is assumed to be the selection.
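The final argument determines how the selection should be set after the text has been replaced. The possible values are:
- "select": Selects the newly inserted text.
- "start": Moves the selection to just before the inserted text.
- "end": Moves the selection to just after the inserted text.
- "preserve": Attempts to preserve the selection. This is the default.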
Characters with no visible rendering, such as U+200D ZERO WIDTH JOINER, still count as characters. Thus, for instance, the selection can include just an invisible character, and the text insertion cursor can be placed to one side or another of such a character.
To obtain the currently selected text, the following JavaScript suffices:
var selectionText = control.value.substring(control.selectionStart, control.selectionEnd);
To add some text at the start of a text control, while maintaining the text selection, the three attributes must be preserved:
var oldStart = control.selectionStart;
var oldEnd = control.selectionEnd;
var oldDirection = control.selectionDirection;
var prefix = "http://";
control.value = prefix + control.value;
control.setSelectionRange(oldStart + prefix.length, oldEnd + prefix.length, oldDirection);
4.10.21 Constraints
4.10.21.1 Definitions
4.10.21.2 The constraint validation API
- element . willValidate
  Returns true if the element will be validated when the form is submitted; false otherwise.
- element . setCustomValidity(message)
  Sets a custom error, so that the element would fail to validate. The given message is the message to be shown to the user when reporting the problem to the user.
  If the argument is the empty string, clears the custom error.
- element . validity.valueMissing
  Returns true if the element has no value but is a required field; false otherwise.
- element . validity.typeMismatch
  Returns true if the element's value is not in the correct syntax; false otherwise.
- element . validity.patternMismatch
  Returns true if the element's value doesn't match the provided pattern; false otherwise.
- element . validity.tooLong
  Returns true if the element's value is longer than the provided maximum length; false otherwise.
- element . validity.rangeUnderflow
  Returns true if the element's value is lower than the provided minimum; false otherwise.
- element . validity.rangeOverflow
  Returns true if the element's value is higher than the provided maximum; false otherwise.
- element . validity.stepMismatch
  Returns true if the element's value doesn't fit the rules given by the step attribute; false otherwise.
- element . validity.badInput
  Returns true if the user has provided input in the user interface that the user agent is unable to convert to a value; false otherwise.
- element . validity.customError
  Returns true if the element has a custom error; false otherwise.
- element . validity.valid
  Returns true if the element's value has no validity problems; false otherwise.
- valid = element . checkValidity()
  Returns true if the element's value has no validity problems; false otherwise. Fires an invalid event at the element in the latter case.
- element . validationMessage
  Returns the error message that would be shown to the user if the element was to be checked for validity.
In the following example, a script checks the value of a form
control each time it is edited, and whenever it is not a valid
value, uses the setCustomValidity() method
to set an appropriate message.
<label>Feeling: <input name=f type="text" oninput="check(this)"></label>
<script>
 function check(input) {
   if (input.value == "good" ||
       input.value == "fine" ||
       input.value == "tired") {
     // input is fine -- reset the error message
     input.setCustomValidity('');
   } else {
     // input is not one of the accepted values -- set an error message
     input.setCustomValidity('"' + input.value + '" is not a feeling.');
   }
 }
</script>
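To make the relationship between several of the validity flags concrete, the following sketch computes them for a plain object standing in for a control. The object shape and the function name computeValidity are assumptions for illustration, and the checks are deliberately simplified: typeMismatch, stepMismatch, and badInput are omitted, and the dirty value flag is ignored.

```javascript
// Hypothetical control description:
// { value, required, maxLength, pattern, min, max, customMessage }
function computeValidity(control) {
  const {
    value = "", required = false, maxLength = null,
    pattern = null, min = null, max = null, customMessage = "",
  } = control;
  const number = value === "" ? NaN : Number(value);
  const flags = {
    valueMissing: required && value === "",
    tooLong: maxLength !== null && value.length > maxLength,
    // The pattern must match the whole value, so it is anchored here.
    patternMismatch: pattern !== null && value !== "" &&
      !new RegExp("^(?:" + pattern + ")$").test(value),
    rangeUnderflow: min !== null && number < min,
    rangeOverflow: max !== null && number > max,
    customError: customMessage !== "",
  };
  // valid is true only when none of the problem flags is set.
  flags.valid = !Object.values(flags).some(Boolean);
  return flags;
}

console.log(computeValidity({ value: "", required: true }).valueMissing); // true
console.log(computeValidity({ value: "abc", pattern: "[a-z]+" }).valid); // true
```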
4.10.21.3 Security
Servers should not rely on client-side validation. Client-side validation can be intentionally bypassed by hostile users, and unintentionally bypassed by users of older user agents or automated tools that do not implement these features. The constraint validation features are only intended to improve the user experience, not to provide any kind of security mechanism.
4.10.22 Form submission
When a form is submitted, the data in the form is converted into the structure specified by the enctype, and then sent to the destination specified by the action using the given method.
For example, take the following form:
<form action="/find.cgi" method=get>
 <input type=text name=t>
 <input type=search name=q>
 <input type=submit>
</form>
If the user types in "cats" in the first field and "fur" in the
second, and then hits the submit button, then the user agent will
load /find.cgi?t=cats&q=fur.
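A rough sketch of how the resulting URL could be assembled is shown below. The function name buildGetUrl is hypothetical, URLSearchParams is used as a convenient approximation of the URL-encoded serialisation, and dropping any query or fragment already present in the action is a simplifying assumption.

```javascript
// fields is a list of [name, value] pairs in form order.
function buildGetUrl(action, fields) {
  const query = new URLSearchParams(fields).toString();
  // Simplification: discard any existing query or fragment on the action.
  const base = action.split(/[?#]/)[0];
  return base + "?" + query;
}

console.log(buildGetUrl("/find.cgi", [["t", "cats"], ["q", "fur"]])); // /find.cgi?t=cats&q=fur
```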
On the other hand, consider this form:
<form action="/find.cgi" method=post enctype="multipart/form-data">
 <input type=text name=t>
 <input type=search name=q>
 <input type=submit>
</form>
Given the same user input, the result on submission is quite different: the user agent instead does an HTTP POST to the given URL, with as the entity body something like the following text:
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="t"

cats
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="q"

fur
------kYFrd4jNJEgCervE--
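The structure of such an entity body can be reproduced with a few lines of JavaScript. This is a simplified sketch: the function name buildMultipartBody is hypothetical, only plain text fields are handled, and field names are assumed not to need escaping. Each part is introduced by "--" followed by the boundary, and the body ends with the boundary followed by "--".

```javascript
// fields is a list of [name, value] pairs; boundary is chosen by the sender.
function buildMultipartBody(fields, boundary) {
  const CRLF = "\r\n";
  let body = "";
  for (const [name, value] of fields) {
    body += "--" + boundary + CRLF;
    body += 'Content-Disposition: form-data; name="' + name + '"' + CRLF + CRLF;
    body += value + CRLF;
  }
  return body + "--" + boundary + "--" + CRLF;
}

// The boundary in the example above is "----kYFrd4jNJEgCervE".
const multipartBody = buildMultipartBody([["t", "cats"], ["q", "fur"]], "----kYFrd4jNJEgCervE");
console.log(multipartBody);
```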
4.10.22.1 URL-encoded form data
This form data set encoding is in many ways an aberrant monstrosity, the result of many years of implementation accidents and compromises leading to a set of requirements necessary for interoperability, but in no way representing good design practices. In particular, readers are cautioned to pay close attention to the twisted details involving repeated (and in some cases nested) conversions between character encodings and byte sequences.
To decode application/x-www-form-urlencoded
payloads, the following algorithm should be used. This
algorithm uses as inputs the payload itself, payload, consisting of a Unicode string using only
characters in the range U+0000 to U+007F; a default character
encoding encoding; and optionally an isindex flag indicating that the payload is to be
processed as if it had been generated for a form containing an isindex control. The output of
this algorithm is a sorted list of name-value pairs. If the isindex flag is set and the first control really was
an isindex control, then
the first name-value pair will have as its name the empty
string.
1. Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&).
2. If the isindex flag is set and the first string in strings does not contain a U+003D EQUALS SIGN character (=), insert a U+003D EQUALS SIGN character (=) at the start of the first string in strings.
3. Let pairs be an empty list of name-value pairs.
4. For each string string in strings, run these substeps:
   1. If string contains a U+003D EQUALS SIGN character (=), then let name be the substring of string from the start of string up to but excluding its first U+003D EQUALS SIGN character (=), and let value be the substring from the first character, if any, after the first U+003D EQUALS SIGN character (=) up to the end of string. If the first U+003D EQUALS SIGN character (=) is the first character, then name will be the empty string. If it is the last character, then value will be the empty string.
      Otherwise, string contains no U+003D EQUALS SIGN characters (=). Let name have the value of string and let value be the empty string.
   2. Replace any U+002B PLUS SIGN characters (+) in name and value with U+0020 SPACE characters.
   3. Replace any escape in name and value with the character represented by the escape. This replacement must not be recursive.
      An escape is a U+0025 PERCENT SIGN character (%) followed by two ASCII hex digits.
      The character represented by an escape is the Unicode character whose code point is equal to the value of the two characters after the U+0025 PERCENT SIGN character (%), interpreted as a hexadecimal number (in the range 0..255).
      So for instance the string "A%2BC" would become "A+C". Similarly, the string "100%25AA%21" becomes the string "100%AA!".
   4. Convert the name and value strings to their byte representation in ISO-8859-1 (i.e. convert the Unicode string to a byte string, mapping code points to byte values directly).
   5. Add a pair consisting of name and value to pairs.
5. If any of the name-value pairs in pairs have a name component consisting of the string "_charset_" encoded in US-ASCII, and the value component of the first such pair, when decoded as US-ASCII, is the name of a supported character encoding, then let encoding be that character encoding (replacing the default passed to the algorithm).
6. Convert the name and value components of each name-value pair in pairs to Unicode by interpreting the bytes according to the encoding encoding.
7. Return pairs.
Parameters on the
application/x-www-form-urlencoded MIME type are
ignored. In particular, this MIME type does not support the charset parameter.
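The decoding algorithm above can be sketched in JavaScript as follows. This is an illustrative reading, not a reference implementation: the function names are hypothetical, TextDecoder is assumed to be available, and "supported character encoding" is approximated by whether TextDecoder accepts the label.

```javascript
// Replace "+" with space, unescape %XX in a single non-recursive pass,
// then map code points directly to byte values (ISO-8859-1).
function toBytes(text) {
  const unescaped = text
    .replace(/\+/g, " ")
    .replace(/%([0-9A-Fa-f]{2})/g, (_, hex) => String.fromCharCode(parseInt(hex, 16)));
  return Uint8Array.from(unescaped, (ch) => ch.charCodeAt(0));
}

function isSupportedEncoding(label) {
  try {
    new TextDecoder(label);
    return true;
  } catch (e) {
    return false;
  }
}

function decodeUrlEncoded(payload, encoding = "utf-8", isindex = false) {
  const strings = payload.split("&");
  if (isindex && !strings[0].includes("=")) strings[0] = "=" + strings[0];
  const pairs = strings.map((string) => {
    const eq = string.indexOf("=");
    const name = eq === -1 ? string : string.slice(0, eq);
    const value = eq === -1 ? "" : string.slice(eq + 1);
    return { name: toBytes(name), value: toBytes(value) };
  });
  // A _charset_ pair naming a supported encoding replaces the default.
  const ascii = (bytes) => String.fromCharCode(...bytes);
  const charsetPair = pairs.find((p) => ascii(p.name) === "_charset_");
  if (charsetPair && isSupportedEncoding(ascii(charsetPair.value))) {
    encoding = ascii(charsetPair.value);
  }
  const decoder = new TextDecoder(encoding);
  return pairs.map((p) => ({ name: decoder.decode(p.name), value: decoder.decode(p.value) }));
}

console.log(decodeUrlEncoded("a=A%2BC&b=100%25AA%21&c"));
```

Because the plus signs are replaced before the escapes, an escaped "%2B" survives as a literal "+" rather than becoming a space.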
For details on how to interpret multipart/form-data
payloads, see RFC 2388. [RFC2388]
4.10.22.2 Plain text form data
Payloads using the text/plain format are intended to
be human readable. They are not reliably interpretable by computer,
as the format is ambiguous (for example, there is no way to
distinguish a literal newline in a value from the newline at the end
of the value).
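The ambiguity can be demonstrated with a short sketch. It assumes that each field is serialised as its name, an equals sign, its value, and a CRLF line break; the function name encodeTextPlain is hypothetical.

```javascript
// Assumption: each pair is written as name=value followed by CRLF.
function encodeTextPlain(pairs) {
  return pairs.map(([name, value]) => name + "=" + value + "\r\n").join("");
}

// One field whose value contains a newline...
const oneField = encodeTextPlain([["note", "first\r\nmode=add"]]);
// ...and two separate fields produce the same payload.
const twoFields = encodeTextPlain([["note", "first"], ["mode", "add"]]);
console.log(oneField === twoFields); // true
```

A receiver therefore cannot tell from the payload alone which of the two form data sets was submitted.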