From 0e0f0835b5eca2377b130714ebb797e625f8e4e9 Mon Sep 17 00:00:00 2001 From: Michael Breen Date: Wed, 9 Feb 2011 10:31:03 +0000 Subject: [PATCH] perl #82278: Overload documentation: many changes This is a partial rewrite to address long standing issues and comments that this document is confusing, in addition to accurately documenting the behaviour of fallback and nomethod in the context of overloaded operands of different types, as implemented in the fix for bug #71286. Fixes many mistakes, ambiguities, and omissions. Most of the early part of the document has been restructured and revised. For the full list of changes, see bug #82278. --- lib/overload.pm | 1123 +++++++++++++++++++++++++++++++------------------------ 1 file changed, 643 insertions(+), 480 deletions(-) diff --git a/lib/overload.pm b/lib/overload.pm index 37ce236..c538177 100644 --- a/lib/overload.pm +++ b/lib/overload.pm @@ -1,6 +1,6 @@ package overload; -our $VERSION = '1.12'; +our $VERSION = '1.13'; sub nil {} @@ -200,7 +200,7 @@ overload - Package for overloading Perl operations package main; $a = SomeThing->new( 57 ); - $b=5+$a; + $b = 5 + $a; ... if (overload::Overloaded $b) {...} ... @@ -211,224 +211,289 @@ overload - Package for overloading Perl operations This pragma allows overloading of Perl's operators for a class. To overload built-in functions, see L instead. -=head2 Declaration of overloaded functions +=head2 Fundamentals -The compilation directive +=head3 Declaration - package Number; - use overload - "+" => \&add, - "*=" => "muas"; - -declares function Number::add() for addition, and method muas() in -the "class" C (or one of its base classes) -for the assignment form C<*=> of multiplication. - -Arguments of this directive come in (key, value) pairs. Legal values -are values legal inside a C<&{ ... }> call, so the name of a -subroutine, a reference to a subroutine, or an anonymous subroutine -will all work. 
Note that values specified as strings are -interpreted as methods, not subroutines. Legal keys are listed below. - -The subroutine C will be called to execute C<$a+$b> if $a -is a reference to an object blessed into the package C, or if $a is -not an object from a package with defined mathemagic addition, but $b is a -reference to a C. It can also be called in other situations, like -C<$a+=7>, or C<$a++>. See L. (Mathemagical -methods refer to methods triggered by an overloaded mathematical -operator.) - -Since overloading respects inheritance via the @ISA hierarchy, the -above declaration would also trigger overloading of C<+> and C<*=> in -all the packages which inherit from C. - -=head2 Calling Conventions for Binary Operations - -The functions specified in the C directive are called -with three (in one particular case with four, see L) -arguments. If the corresponding operation is binary, then the first -two arguments are the two arguments of the operation. However, due to -general object calling conventions, the first argument should always be -an object in the package, so in the situation of C<7+$a>, the -order of the arguments is interchanged. It probably does not matter -when implementing the addition method, but whether the arguments -are reversed is vital to the subtraction method. The method can -query this information by examining the third argument, which can take -three different values: - -=over 7 - -=item FALSE - -the order of arguments is as in the current operation. - -=item TRUE - -the arguments are reversed. - -=item C - -the current operation is an assignment variant (as in -C<$a+=7>), but the usual function is called instead. This additional -information can be used to generate some optimizations. Compare -L. - -=back - -=head2 Calling Conventions for Unary Operations - -Unary operation are considered binary operations with the second -argument being C. 
Thus the functions that overloads C<{"++"}> -is called with arguments C<($a,undef,'')> when $a++ is executed. - -=head2 Calling Conventions for Mutators - -Two types of mutators have different calling conventions: - -=over - -=item C<++> and C<--> - -The routines which implement these operators are expected to actually -I their arguments. So, assuming that $obj is a reference to a -number, - - sub incr { my $n = $ {$_[0]}; ++$n; $_[0] = bless \$n} +Arguments of the C<use overload> directive are (key, value) pairs. +For the full set of legal keys, see L below. -is an appropriate implementation of overloaded C<++>. Note that +Operator implementations (the values) can be subroutines, +references to subroutines, or anonymous subroutines +- in other words, anything legal inside a C<&{ ... }> call. +Values specified as strings are interpreted as method names. +Thus - sub incr { ++$ {$_[0]} ; shift } - -is OK if used with preincrement and with postincrement. (In the case -of postincrement a copying will be performed, see L.) + package Number; + use overload + "-" => "minus", + "*=" => \&muas, + '""' => sub { ...; }; -=item C and other assignment versions +declares that subtraction is to be implemented by method C<minus()> +in the class C<Number> (or one of its base classes), +and that the function C<Number::muas()> is to be used for the +assignment form of multiplication, C<*=>. +It also defines an anonymous subroutine to implement stringification: +this is called whenever an object blessed into the package C<Number> +is used in a string context (this subroutine might, for example, +return the number as a Roman numeral). -There is nothing special about these methods. They may change the -value of their arguments, and may leave it as is. The result is going -to be assigned to the value in the left-hand-side if different from -this value. +=head3 Calling Conventions and Magic Autogeneration -This allows for the same method to be used as overloaded C<+=> and -C<+>.
Note that this is I, but not recommended, since by the -semantic of L<"Fallback"> Perl will call the method for C<+> anyway, -if C<+=> is not overloaded. +The following sample implementation of C<minus()> (which assumes +that C<Number> objects are simply blessed references to scalars) +illustrates the calling conventions: -=back - -B Due to the presence of assignment versions of operations, -routines which may be called in assignment context may create -self-referential structures. Currently Perl will not free self-referential -structures until cycles are C broken. You may get problems -when traversing your structures too. + package Number; + sub minus { + my ($self, $other, $swap) = @_; + my $result = $$self - $other; # * + $result = -$result if $swap; + ref $result ? $result : bless \$result; + } + # * may recurse once - see table below + +Three arguments are passed to all subroutines specified in the +C<use overload> directive (with one exception - see L). +The first of these is the operand providing the overloaded +operator implementation - +in this case, the object whose C<minus()> method is being called. + +The second argument is the other operand, or C<undef> in the +case of a unary operator. + +The third argument is set to TRUE if (and only if) the two +operands have been swapped. Perl may do this to ensure that the +first argument (C<$self>) is an object implementing the overloaded +operation, in line with general object calling conventions. +For example, if C<$x> and C<$y> are C<Number>s: + + operation | generates a call to + ============|====================== + $x - $y | minus($x, $y, '') + $x - 7 | minus($x, 7, '') + 7 - $x | minus($x, 7, 1) + +Perl may also use C<minus()> to implement other operators which +have not been specified in the C<use overload> directive, +according to the rules for L described later. +For example, the C<use overload> above declared no subroutine +for any of the operators C<-->, C<neg> (the overload key for +unary minus), or C<-=>.
Thus + + operation | generates a call to + ============|====================== + -$x | minus($x, 0, 1) + $x-- | minus($x, 1, undef) + $x -= 3 | minus($x, 3, undef) + +Note the C<undef>s: +where autogeneration results in the method for a standard +operator which does not change either of its operands, such +as C<->, being used to implement an operator which changes +the operand ("mutators": here, C<--> and C<-=>), +Perl passes undef as the third argument. +This still evaluates as FALSE, consistent with the fact that +the operands have not been swapped, but gives the subroutine +a chance to alter its behaviour in these cases. + +In all the above examples, C<minus()> is required +only to return the result of the subtraction: +Perl takes care of the assignment to $x. +In fact, such methods should I<not> modify their operands, +even if C<undef> is passed as the third argument +(see L). + +The same is not true of implementations of C<++> and C<-->: +these are expected to modify their operand. +An appropriate implementation of C<--> might look like + + use overload '--' => "decr", + # ... + sub decr { --${$_[0]}; } + +=head3 Mathemagic, Mutators, and Copy Constructors + +The term 'mathemagic' describes the overloaded implementation +of mathematical operators. +Mathemagical operations raise an issue. +Consider the code: + + $a = $b; + --$a; + +If C<$a> and C<$b> are scalars then after these statements + + $a == $b - 1 + +An object, however, is a reference to blessed data, so if +C<$a> and C<$b> are objects then the assignment C<$a = $b> +copies only the reference, leaving C<$a> and C<$b> referring +to the same object data. +One might therefore expect the operation C<--$a> to decrement +C<$b> as well as C<$a>. +However, this would not be consistent with how we expect the +mathematical operators to work. + +Perl resolves this dilemma by transparently calling a copy +constructor before calling a method defined to implement +a mutator (C<-->, C<+=>, and so on).
+In the above example, when Perl reaches the decrement +statement, it makes a copy of the object data in C<$a> and +assigns to C<$a> a reference to the copied data. +Only then does it call C<decr()>, which alters the copied +data, leaving C<$b> unchanged. +Thus the object metaphor is preserved as far as possible, +while mathemagical operations still work according to the +arithmetic metaphor. + +Note: the preceding paragraph describes what happens when +Perl autogenerates the copy constructor for an object based +on a scalar. +For other cases, see L. -Say, +=head2 Overloadable Operations - use overload '+' => sub { bless [ \$_[0], \$_[1] ] }; +The complete list of keys that can be specified in the C<use overload> +directive is given, separated by spaces, in the values of the +hash C<%overload::ops>: -is asking for trouble, since for code C<$obj += $foo> the subroutine -is called as C<$obj = add($obj, $foo, undef)>, or C<$obj = [\$obj, -\$foo]>. If using such a subroutine is an important optimization, one -can overload C<+=> explicitly by a non-"optimized" version, or switch -to non-optimized version if C (see -L). + with_assign => '+ - * / % ** << >> x .', + assign => '+= -= *= /= %= **= <<= >>= x= .=', + num_comparison => '< <= > >= == !=', + '3way_comparison'=> '<=> cmp', + str_comparison => 'lt le gt ge eq ne', + binary => '& &= | |= ^ ^=', + unary => 'neg ! ~', + mutators => '++ --', + func => 'atan2 cos sin exp abs log sqrt int', + conversion => 'bool "" 0+ qr', + iterators => '<>', + filetest => '-X', + dereferencing => '${} @{} %{} &{} *{}', + matching => '~~', + special => 'nomethod fallback =' -Even if no I assignment-variants of operators are present in -the script, they may be generated by the optimizer. Say, C<",$obj,"> or -C<',' . $obj . ','> may be both optimized to +Most of the overloadable operators map one-to-one to these keys. +Exceptions, including additional overloadable operations not +apparent from this hash, are included in the notes which follow.
- my $tmp = ',' . $obj; $tmp .= ','; +=over 5 -=head2 Overloadable Operations +=item * C<not> -The following symbols can be specified in C directive: +The operator C<not> is not a valid key for C<use overload>. +However, if the operator C<!> is overloaded then the same +implementation will be used for C<not> +(since the two operators differ only in precedence). -=over 5 +=item * C<neg> -=item * I +The key C<neg> is used for unary minus to disambiguate it from +binary C<->. - "+", "+=", "-", "-=", "*", "*=", "/", "/=", "%", "%=", - "**", "**=", "<<", "<<=", ">>", ">>=", "x", "x=", ".", ".=", +=item * C<++>, C<--> -For these operations a substituted non-assignment variant can be called if -the assignment variant is not available. Methods for operations C<+>, -C<->, C<+=>, and C<-=> can be called to automatically generate -increment and decrement methods. The operation C<-> can be used to -autogenerate missing methods for unary minus or C. +Assuming they are to behave analogously to Perl's C<++> and C<-->, +overloaded implementations of these operators are required to +mutate their operands. -See L<"MAGIC AUTOGENERATION">, L<"Calling Conventions for Mutators"> and -L<"Calling Conventions for Binary Operations">) for details of these -substitutions. +No distinction is made between prefix and postfix forms of the +increment and decrement operators: these differ only in the +point at which Perl calls the associated subroutine when +evaluating an expression. -=item * I +=item * I - "<", "<=", ">", ">=", "==", "!=", "<=>", - "lt", "le", "gt", "ge", "eq", "ne", "cmp", + += -= *= /= %= **= <<= >>= x= .= + &= |= ^= -If the corresponding "spaceship" variant is available, it can be -used to substitute for the missing operation. During Cing -arrays, C is used to compare values subject to C. +Simple assignment is not overloadable (the C<'='> key is used +for the L). +Perl does have a way to make assignments to an object do whatever +you want, but this involves using tie(), not overload - +see L and the L examples below.
-=item * I +The subroutine for the assignment variant of an operator is +required only to return the result of the operation. +It is permitted to change the value of its operand +(this is safe because Perl calls the copy constructor first), +but this is optional since Perl assigns the returned value to +the left-hand operand anyway. - "&", "&=", "^", "^=", "|", "|=", "neg", "!", "~", +An object that overloads an assignment operator does so only in +respect of assignments to that object. +In other words, Perl never calls the corresponding methods with +the third argument (the "swap" argument) set to TRUE. +For example, the operation -C stands for unary minus. If the method for C is not -specified, it can be autogenerated using the method for -subtraction. If the method for C is not specified, it can be -autogenerated using the methods for C, or C<"">, or C<0+>. + $a *= $b -The same remarks in L<"Arithmetic operations"> about -assignment-variants and autogeneration apply for -bit operations C<"&">, C<"^">, and C<"|"> as well. +cannot lead to C<$b>'s implementation of C<*=> being called, +even if C<$a> is a scalar. +(It can, however, generate a call to C<$b>'s method for C<*>). -=item * I +=item * I - "++", "--", + + - * / % ** << >> x . + & | ^ -If undefined, addition and subtraction methods can be -used instead. These operations are called both in prefix and -postfix form. +As described L, +Perl may call methods for operators like C<+> and C<&> in the course +of implementing missing operations like C<++>, C<+=>, and C<&=>. +While these methods may detect this usage by testing the definedness +of the third argument, they should in all cases avoid changing their +operands. +This is because Perl does not call the copy constructor before +invoking these methods. 
-=item * I +=item * C - "atan2", "cos", "sin", "exp", "abs", "log", "sqrt", "int" +Traditionally, the Perl function C rounds to 0 +(see L), and so for floating-point-like types one +should follow the same semantic. -If C is unavailable, it can be autogenerated using methods -for "E" or "E=E" combined with either unary minus or subtraction. +=item * I -Note that traditionally the Perl function L rounds to 0, thus for -floating-point-like types one should follow the same semantic. If -C is unavailable, it can be autogenerated using the overloading of -C<0+>. + "" 0+ bool -=item * I +These conversions are invoked according to context as necessary. +For example, the subroutine for C<'""'> (stringify) may be used +where the overloaded object is passed as an argument to C, +and that for C<'bool'> where it is tested in the condition of a flow +control statement (like C) or the ternary C operation. - 'bool', '""', '0+', 'qr' +Of course, in contexts like, for example, C<$obj + 1>, Perl will +invoke C<$obj>'s implementation of C<+> rather than (in this +example) converting C<$obj> to a number using the numify method +C<'0+'> (an exception to this is when no method has been provided +for C<'+'> and L is set to TRUE). -If one or two of these operations are not overloaded, the remaining ones -can be used instead. C is used in the flow control operators -(like C) and for the ternary C operation; C is used for -the RHS of C<=~> and when an object is interpolated into a regexp. - -C, C<"">, and C<0+> can return any arbitrary Perl value. If the -corresponding operation for this value is overloaded too, that operation -will be called again with this value. C must return a compiled -regexp, or a ref to a compiled regexp (such as C returns), and any -further overloading on the return value will be ignored. +The subroutines for C<'""'>, C<'0+'>, and C<'bool'> can return +any arbitrary Perl value. 
+If the corresponding operation for this value is overloaded too, +the operation will be called again with this value. As a special case if the overload returns the object itself then it will be used directly. An overloaded conversion returning the object is probably a bug, because you're likely to get something that looks like C. -=item * I + qr - "<>" +The subroutine for C<'qr'> is used wherever the object is +interpolated into or used as a regexp, including when it +appears on the RHS of a C<=~> or C operator. -If not overloaded, the argument will be converted to a filehandle or -glob (which may require a stringification). The same overloading -happens both for the I syntax C$varE> and +C must return a compiled regexp, or a ref to a compiled regexp +(such as C returns), and any further overloading on the return +value will be ignored. + +=item * I + +If CE> is overloaded then the same implementation is used +for both the I syntax C$varE> and I syntax C${var}E>. B Even in list context, the iterator is currently called only @@ -436,26 +501,19 @@ once and with scalar context. =item * I - "-X" - -This overload is used for all the filetest operators (C<-f>, C<-x> and -so on: see L for the full list). Even though these are -unary operators, the method will be called with a second argument which -is a single letter indicating which test was performed. Note that the -overload key is the literal string C<"-X">: you can't provide separate -overloads for the different tests. +The key C<'-X'> is used to specify a subroutine to handle all the +filetest operators (C<-f>, C<-x>, and so on: see L for +the full list); +it is not possible to overload any filetest operator individually. +To distinguish them, the letter following the '-' is passed as the +second argument (that is, in the slot that for binary operators +is used to pass the second operand). Calling an overloaded filetest operator does not affect the stat value associated with the special filehandle C<_>. 
It still refers to the result of the last C, C or unoverloaded filetest. -If not overloaded, these operators will fall back to the default -behaviour even without C<< fallback => 1 >>. This means that if the -object is a blessed glob or blessed IO ref it will be treated as a -filehandle, otherwise string overloading will be invoked and the result -treated as a filename. - -This overload was introduced in perl 5.12. +This overload was introduced in Perl 5.12. =item * I @@ -463,9 +521,9 @@ The key C<"~~"> allows you to override the smart matching logic used by the C<~~> operator and the switch construct (C/C). See L and L. -Unusually, overloading of the smart match operator does not automatically -take precedence over normal smart match behaviour. In particular, in the -following code: +Unusually, the overloaded implementation of the smart match operator +does not get full control of the smart match behaviour. +In particular, in the following code: package Foo; use overload '~~' => 'match'; @@ -490,283 +548,345 @@ details of when overloading is invoked. =item * I - '${}', '@{}', '%{}', '&{}', '*{}'. - -If not overloaded, the argument will be dereferenced I, thus -should be of correct type. These functions should return a reference -of correct type, or another object with overloaded dereferencing. - -As a special case if the overload returns the object itself then it -will be used directly (provided it is the correct type). - -The dereference operators must be specified explicitly they will not be passed to -"nomethod". + ${} @{} %{} &{} *{} + +If these operators are not explicitly overloaded then they +work in the normal way, yielding the underlying scalar, +array, or whatever stores the object data (or the appropriate +error message if the dereference operator doesn't match it). +Defining a catch-all C<'nomethod'> (see L) +makes no difference to this as the catch-all function will +not be called to implement a missing dereference operator. 
+ +If a dereference operator is overloaded then it must return a +I of the appropriate type (for example, the +subroutine for key C<'${}'> should return a reference to a +scalar, not a scalar), or another object which overloads the +operator: that is, the subroutine only determines what is +dereferenced and the actual dereferencing is left to Perl. +As a special case, if the subroutine returns the object itself +then it will not be called again - avoiding infinite recursion. =item * I - "nomethod", "fallback", "=". + nomethod fallback = -see L>. +See L>. =back -See L<"Fallback"> for an explanation of when a missing method can be -autogenerated. - -A computer-readable form of the above table is available in the hash -%overload::ops, with values being space-separated lists of names: - - with_assign => '+ - * / % ** << >> x .', - assign => '+= -= *= /= %= **= <<= >>= x= .=', - num_comparison => '< <= > >= == !=', - '3way_comparison'=> '<=> cmp', - str_comparison => 'lt le gt ge eq ne', - binary => '& &= | |= ^ ^=', - unary => 'neg ! ~', - mutators => '++ --', - func => 'atan2 cos sin exp abs log sqrt', - conversion => 'bool "" 0+ qr', - iterators => '<>', - filetest => '-X', - dereferencing => '${} @{} %{} &{} *{}', - matching => '~~', - special => 'nomethod fallback =' +=head2 Magic Autogeneration + +If a method for an operation is not found then Perl tries to +autogenerate a substitute implementation from the operations +that have been defined. + +Note: the behaviour described in this section can be disabled +by setting C to FALSE (see L). + +In the following tables, numbers indicate priority. +For example, the table below states that, +if no implementation for C<'!'> has been defined then Perl will +implement it using C<'bool'> (that is, by inverting the value +returned by the method for C<'bool'>); +if boolean conversion is also unimplemented then Perl will +use C<'0+'> or, failing that, C<'""'>. + + operator | can be autogenerated from + | + | 0+ "" bool . 
x + =========|========================== + 0+ | 1 2 + "" | 1 2 + bool | 1 2 + int | 1 2 3 + ! | 2 3 1 + qr | 2 1 3 + . | 2 1 3 + x | 2 1 3 + .= | 3 2 4 1 + x= | 3 2 4 1 + <> | 2 1 3 + -X | 2 1 3 + +Note: The iterator (C<'EE'>) and file test (C<'-X'>) +operators work as normal: if the operand is not a blessed glob or +IO reference then it is converted to a string (using the method +for C<'""'>, C<'0+'>, or C<'bool'>) to be interpreted as a glob +or filename. + + operator | can be autogenerated from + | + | < <=> neg -= - + =========|========================== + neg | 1 + -= | 1 + -- | 1 2 + abs | a1 a2 b1 b2 [*] + < | 1 + <= | 1 + > | 1 + >= | 1 + == | 1 + != | 1 + + * one from [a1, a2] and one from [b1, b2] + +Just as numeric comparisons can be autogenerated from the method +for C<< '<=>' >>, string comparisons can be autogenerated from +that for C<'cmp'>: + + operators | can be autogenerated from + ====================|=========================== + lt gt le ge eq ne | cmp + +Similarly, autogeneration for keys C<'+='> and C<'++'> is analogous +to C<'-='> and C<'--'> above: + + operator | can be autogenerated from + | + | += + + =========|========================== + += | 1 + ++ | 1 2 + +And other assignment variations are analogous to +C<'+='> and C<'-='> (and similar to C<'.='> and C<'x='> above): + + operator || *= /= %= **= <<= >>= &= ^= |= + -------------------||-------------------------------- + autogenerated from || * / % ** << >> & ^ | + +Note also that the copy constructor (key C<'='>) may be +autogenerated, but only for objects based on scalars. +See L. + +=head3 Minimal Set of Overloaded Operations -=head2 Inheritance and overloading - -Inheritance interacts with overloading in two ways. - -=over - -=item Strings as values of C directive - -If C in - - use overload key => value; - -is a string, it is interpreted as a method name. 
+Since some operations can be automatically generated from others, there is +a minimal set of operations that need to be overloaded in order to have +the complete set of overloaded operations at one's disposal. +Of course, the autogenerated operations may not do exactly what the user +expects. The minimal set is: -=item Overloading of an operation is inherited by derived classes + + - * / % ** << >> x + <=> cmp + & | ^ ~ + atan2 cos sin exp log sqrt int + "" 0+ bool + ~~ -Any class derived from an overloaded class is also overloaded. The -set of overloaded methods is the union of overloaded methods of all -the ancestors. If some method is overloaded in several ancestor, then -which description will be used is decided by the usual inheritance -rules: +Of the conversions, only one of string, boolean or numeric is +needed because each can be generated from either of the other two. -If C inherits from C and C (in this order), C overloads -C<+> with C<\&D::plus_sub>, and C overloads C<+> by C<"plus_meth">, -then the subroutine C will be called to implement -operation C<+> for an object in package C. +=head2 Special Keys for C -=back +=head3 C -Note that since the value of the C key is not a subroutine, -its inheritance is not governed by the above rules. In the current -implementation, the value of C in the first overloaded -ancestor is used, but this is accidental and subject to change. +The C<'nomethod'> key is used to specify a catch-all function to +be called for any operator that is not individually overloaded. +The specified function will be passed four parameters. +The first three arguments coincide with those that would have been +passed to the corresponding method if it had been defined. +The fourth argument is the C key for that missing +method. -=head1 SPECIAL SYMBOLS FOR C +For example, if C<$a> is an object blessed into a package declaring -Three keys are recognized by Perl that are not covered by the above -description. 
+ use overload 'nomethod' => 'catch_all', # ... -=head2 Last Resort +then the operation -C<"nomethod"> should be followed by a reference to a function of four -parameters. If defined, it is called when the overloading mechanism -cannot find a method for some operation. The first three arguments of -this function coincide with the arguments for the corresponding method if -it were found, the fourth argument is the symbol -corresponding to the missing method. If several methods are tried, -the last one is used. Say, C<1-$a> can be equivalent to + 3 + $a - &nomethodMethod($a,1,1,"-") +could (unless a method is specifically declared for the key +C<'+'>) result in a call -if the pair C<"nomethod" =E "nomethodMethod"> was specified in the -C directive. + catch_all($a, 3, 1, '+') -The C<"nomethod"> mechanism is I used for the dereference operators -( ${} @{} %{} &{} *{} ). +See L. +=head3 C -If some operation cannot be resolved, and there is no function -assigned to C<"nomethod">, then an exception will be raised via die()-- -unless C<"fallback"> was specified as a key in C directive. +The value assigned to the key C<'fallback'> tells Perl how hard +it should try to find an alternative way to implement a missing +operator. +=over -=head2 Fallback +=item * defined, but FALSE -The key C<"fallback"> governs what to do if a method for a particular -operation is not found. Three different cases are possible depending on -the value of C<"fallback">: + use overload "fallback" => 0, # ... ; -=over 16 +This disables L. =item * C -Perl tries to use a -substituted method (see L). If this fails, it -then tries to calls C<"nomethod"> value; if missing, an exception -will be raised. +In the default case where no value is explicitly assigned to +C, magic autogeneration is enabled. =item * TRUE -The same as for the C value, but no exception is raised. Instead, -it silently reverts to what it would have done were there no C -present. 
- -=item * defined, but FALSE +The same as for C, but if a missing operator cannot be +autogenerated then, instead of issuing an error message, Perl +is allowed to revert to what it would have done for that +operator if there had been no C directive. -No autogeneration is tried. Perl tries to call -C<"nomethod"> value, and if this is missing, raises an exception. +Note: in most cases, particularly the L, +this is unlikely to be appropriate behaviour. =back -B C<"fallback"> inheritance via @ISA is not carved in stone -yet, see L<"Inheritance and overloading">. - -=head2 Copy Constructor +See L. -The value for C<"="> is a reference to a function with three -arguments, i.e., it looks like the other values in C. However, it does not overload the Perl assignment -operator. This would go against Camel hair. +=head3 Copy Constructor -This operation is called in the situations when a mutator is applied -to a reference that shares its object with some other reference, such -as +As mentioned L, +this operation is called when a mutator is applied to a reference +that shares its object with some other reference. +For example, if C<$b> is mathemagical, and C<'++'> is overloaded +with C<'incr'>, and C<'='> is overloaded with C<'clone'>, then the +code - $a=$b; - ++$a; + $a = $b; + # ... (other code which does not modify $a or $b) ... + ++$b; -To make this change $a and not change $b, a copy of C<$$a> is made, -and $a is assigned a reference to this new object. This operation is -done during execution of the C<++$a>, and not during the assignment, -(so before the increment C<$$a> coincides with C<$$b>). This is only -done if C<++> is expressed via a method for C<'++'> or C<'+='> (or -C). Note that if this operation is expressed via C<'+'> -a nonmutator, i.e., as in +would be executed in a manner equivalent to - $a=$b; - $a=$a+1; + $a = $b; + # ... 
+ $b = $b->clone(undef, ""); + $b->incr(undef, ""); -then C<$a> does not reference a new copy of C<$$a>, since $$a does not -appear as lvalue when the above code is executed. +Note: -If the copy constructor is required during the execution of some mutator, -but a method for C<'='> was not specified, it can be autogenerated as a -string copy if the object is a plain scalar or a simple assignment if it -is not. +=over -=over 5 +=item * -=item B +The subroutine for C<'='> does not overload the Perl assignment +operator: it is used only to allow mutators to work as described +here. (See L above.) -The actually executed code for +=item * - $a=$b; - Something else which does not modify $a or $b.... - ++$a; +As for other operations, the subroutine implementing '=' is passed +three arguments, though the last two are always C and C<''>. -may be +=item * - $a=$b; - Something else which does not modify $a or $b.... - $a = $a->clone(undef,""); - $a->incr(undef,""); +The copy constructor is called only before a call to a function +declared to implement a mutator, for example, if C<++$b;> in the +code above is effected via a method declared for key C<'++'> +(or 'nomethod', passed C<'++'> as the fourth argument) or, by +autogeneration, C<'+='>. +It is not called if the increment operation is effected by a call +to the method for C<'+'> since, in the equivalent code, -if $b was mathemagical, and C<'++'> was overloaded with C<\&incr>, -C<'='> was overloaded with C<\&clone>. + $a = $b; + $b = $b + 1; -=back +the data referred to by C<$a> is unchanged by the assignment to +C<$b> of a reference to new object data. -Same behaviour is triggered by C<$b = $a++>, which is consider a synonym for -C<$b = $a; ++$a>. +=item * -=head1 MAGIC AUTOGENERATION +The copy constructor is not called if Perl determines that it is +unnecessary because there is no other reference to the data being +modified. 
-If a method for an operation is not found, and the value for C<"fallback"> is -TRUE or undefined, Perl tries to autogenerate a substitute method for -the missing operation based on the defined operations. Autogenerated method -substitutions are possible for the following operations: +=item * -=over 16 +If C<'fallback'> is undefined or TRUE then a copy constructor +can be autogenerated, but only for objects based on scalars. +In other cases it needs to be defined explicitly. +Where an object's data is stored as, for example, an array of +scalars, the following might be appropriate: -=item I + use overload '=' => sub { bless [ @{$_[0]} ] }, # ... -C<$a+=$b> can use the method for C<"+"> if the method for C<"+="> -is not defined. +=item * -=item I +If C<'fallback'> is TRUE and no copy constructor is defined then, +for objects not based on scalars, Perl may silently fall back on +simple assignment - that is, assignment of the object reference. +In effect, this disables the copy constructor mechanism since +no new copy of the object data is created. +This is almost certainly not what you want. +(It is, however, consistent: for example, Perl's fallback for the +C<++> operator is to increment the reference itself.) -String, numeric, boolean and regexp conversions are calculated in terms -of one another if not all of them are defined. +=back -=item I +=head2 How Perl Chooses an Operator Implementation -The C<++$a> operation can be expressed in terms of C<$a+=1> or C<$a+1>, -and C<$a--> in terms of C<$a-=1> and C<$a-1>. +Which is checked first, C or C? +If the two operands of an operator are of different types and +both overload the operator, which implementation is used? +The following are the precedence rules: -=item C +=over -can be expressed in terms of C<$aE0> and C<-$a> (or C<0-$a>). +=item 1. -=item I +If the first operand has declared a subroutine to overload the +operator then use that implementation. -can be expressed in terms of subtraction. +=item 2. 
-=item I +Otherwise, if fallback is TRUE or undefined for the +first operand then see if the +L +allows another of its operators to be used instead. -C and C can be expressed in terms of boolean conversion, or -string or numerical conversion. +=item 3. -=item I +Unless the operator is an assignment (C<+=>, C<-=>, etc.), +repeat step (1) in respect of the second operand. -can be expressed in terms of string conversion. +=item 4. -=item I +Repeat Step (2) in respect of the second operand. -can be expressed in terms of its "spaceship" counterpart: either -C<E<lt>=E<gt>> or C<cmp>: +=item 5. - <, >, <=, >=, ==, != in terms of <=> - lt, gt, le, ge, eq, ne in terms of cmp +If the first operand has a "nomethod" method then use that. -=item I +=item 6. - <> in terms of builtin operations +If the second operand has a "nomethod" method then use that. -=item I +=item 7. - ${} @{} %{} &{} *{} in terms of builtin operations +If C<fallback> is TRUE for both operands +then perform the usual operation for the operator, +treating the operands as numbers, strings, or booleans +as appropriate for the operator (see note). -=item I +=item 8. -can be expressed in terms of an assignment to the dereferenced value, if this -value is a scalar and not a reference, or simply a reference assignment -otherwise. +Nothing worked - die. =back -=head1 Minimal set of overloaded operations +Where there is only one operand (or only one operand with +overloading) the checks in respect of the other operand above are +skipped. -Since some operations can be automatically generated from others, there is -a minimal set of operations that need to be overloaded in order to have -the complete set of overloaded operations at one's disposal. -Of course, the autogenerated operations may not do exactly what the user -expects. See L above. 
The minimal set is: +There are exceptions to the above rules for dereference operations +(which, if Step 1 fails, always fall back to the normal, built-in +implementations - see Dereferencing), and for C<~~> (which has its +own set of rules - see L). - + - * / % ** << >> x - <=> cmp - & | ^ ~ - atan2 cos sin exp log sqrt int +Note on Step 7: some operators have a different semantic depending +on the type of their operands. +As there is no way to instruct Perl to treat the operands as, e.g., +numbers instead of strings, the result here may not be what you +expect. +See L. -Additionally, you need to define at least one of string, boolean or -numeric conversions because any one can be used to emulate the others. -The string conversion can also be used to emulate concatenation. - -=head1 Losing overloading +=head2 Losing Overloading The restriction for the comparison operation is that even if, for example, `C' should return a blessed reference, the autogenerated `C' @@ -782,7 +902,43 @@ When you chop() a mathemagical object it is promoted to a string and its mathemagical properties are lost. The same can happen with other operations as well. -=head1 Run-time Overloading +=head2 Inheritance and Overloading + +Overloading respects inheritance via the @ISA hierarchy. +Inheritance interacts with overloading in two ways. + +=over + +=item Method names in the C<use overload> directive + +If C<value> in + + use overload key => value; + +is a string, it is interpreted as a method name - which may +(in the usual way) be inherited from another class. + +=item Overloading of an operation is inherited by derived classes + +Any class derived from an overloaded class is also overloaded +and inherits its operator implementations. +If the same operator is overloaded in more than one ancestor +then the implementation is determined by the usual inheritance +rules. 
+ +For example, if C<A> inherits from C<B> and C<C> (in that order), +C<B> overloads C<+> with C<\&D::plus_sub>, and C<C> overloads +C<+> by C<"plus_meth">, then the subroutine C<D::plus_sub> will +be called to implement operation C<+> for an object in package C<A>. + +=back + +Note that since the value of the C<fallback> key is not a subroutine, +its inheritance is not governed by the above rules. In the current +implementation, the value of C<fallback> in the first overloaded +ancestor is used, but this is accidental and subject to change. + +=head2 Run-time Overloading Since all C<use> directives are executed at compile-time, the only way to change overloading during run-time is to @@ -795,7 +951,7 @@ You can also use though the use of these constructs during run-time is questionable. -=head1 Public functions +=head2 Public Functions Package C provides the following public functions: @@ -818,7 +974,7 @@ Returns C or a reference to the method that implements C. =back -=head1 Overloading constants +=head2 Overloading Constants For some applications, the Perl parser mangles constants too much. It is possible to hook into this process via C @@ -920,84 +1076,14 @@ packages acquire a magic during the next C<bless>ing into the package. This magic is three-words-long for packages without overloading, and carries the cache table if the package is overloaded. -Copying (C<$a=$b>) is shallow; however, a one-level-deep copying is -carried out before any operation that can imply an assignment to the -object $a (or $b) refers to, like C<$a++>. You can override this -behavior by defining your own copy constructor (see L<"Copy Constructor">). - It is expected that arguments to methods that are not explicitly supposed to be changed are constant (but this is not enforced). -=head1 Metaphor clash - -One may wonder why the semantic of overloaded C<=> is so counter intuitive. -If it I counter intuitive to you, you are subject to a metaphor -clash. 
- -Here is a Perl object metaphor: - -I< object is a reference to blessed data> - -and an arithmetic metaphor: - -I< object is a thing by itself>. - -The I
problem of overloading C<=> is the fact that these metaphors -imply different actions on the assignment C<$a = $b> if $a and $b are -objects. Perl-think implies that $a becomes a reference to whatever -$b was referencing. Arithmetic-think implies that the value of "object" -$a is changed to become the value of the object $b, preserving the fact -that $a and $b are separate entities. - -The difference is not relevant in the absence of mutators. After -a Perl-way assignment an operation which mutates the data referenced by $a -would change the data referenced by $b too. Effectively, after -C<$a = $b> values of $a and $b become I. - -On the other hand, anyone who has used algebraic notation knows the -expressive power of the arithmetic metaphor. Overloading works hard -to enable this metaphor while preserving the Perlian way as far as -possible. Since it is not possible to freely mix two contradicting -metaphors, overloading allows the arithmetic way to write things I. The -way it is done is described in L. - -If some mutator methods are directly applied to the overloaded values, -one may need to I other values which references the -same value: - - $a = Data->new(23); - ... - $b = $a; # $b is "linked" to $a - ... - $a = $a->clone; # Unlink $b from $a - $a->increment_by(4); - -Note that overloaded access makes this transparent: - - $a = Data->new(23); - $b = $a; # $b is "linked" to $a - $a += 4; # would unlink $b automagically - -However, it would not make - - $a = Data->new(23); - $a = 4; # Now $a is a plain 4, not 'Data' - -preserve "objectness" of $a. But Perl I a way to make assignments -to an object do whatever you want. It is just not the overload, but -tie()ing interface (see L). Adding a FETCH() method -which returns the object itself, and STORE() method which changes the -value of the object, one can reproduce the arithmetic metaphor in its -completeness, at least for variables which were tie()d from the start. 
- -(Note that a workaround for a bug may be needed, see L<"BUGS">.) - -=head1 Cookbook +=head1 COOKBOOK Please add examples to what follows! -=head2 Two-face scalars +=head2 Two-face Scalars Put this in F in your Perl library directory: @@ -1021,7 +1107,7 @@ numeric value.) This prints: seven=vii, seven=7, eight=8 seven contains `i' -=head2 Two-face references +=head2 Two-face References Suppose you want to create an object which is accessible as both an array reference and a hash reference. @@ -1145,7 +1231,7 @@ overloaded operations. =back -=head2 Symbolic calculator +=head2 Symbolic Calculator Put this in F in your Perl library directory: @@ -1160,8 +1246,8 @@ Put this in F in your Perl library directory: } This module is very unusual as overloaded modules go: it does not -provide any usual overloaded operators, instead it provides the L operator C. In this example the corresponding +provide any usual overloaded operators, instead it provides an +implementation for L>. In this example the C subroutine returns an object which encapsulates operations done over the objects: C<< symbolic->new(3) >> contains C<['n', 3]>, C<< 2 + symbolic->new(3) >> contains C<['+', 2, ['n', 3]]>. @@ -1334,11 +1420,13 @@ the tables of operations, and change the code which fills %subr to $subr{$op} = eval "sub {$op shift()}"; } -Due to L, we do not need anything -special to make C<+=> and friends work, except filling C<+=> entry of -%subr, and defining a copy constructor (needed since Perl has no -way to know that the implementation of C<'+='> does not mutate -the argument, compare L). +Since subroutines implementing assignment operators are not required +to modify their operands (see L above), +we do not need anything special to make C<+=> and friends work, +besides adding these operators to %subr and defining a copy +constructor (needed since Perl has no way to know that the +implementation of C<'+='> does not mutate the argument - +see L). 
To implement a copy constructor, add C<< '=' => \&cpy >> to C line, and code (this code assumes that mutators change things one level @@ -1400,7 +1488,7 @@ note: due to the explicit recursion num() is more fragile than sym(): we need to explicitly check for the type of $a and $b. If components $a and $b happen to be of some related type, this may lead to problems. -=head2 I symbolic calculator +=head2 I Symbolic Calculator One may wonder why we call the above calculator symbolic. The reason is that the actual calculation of the value of expression is postponed @@ -1428,7 +1516,7 @@ the numeric value of $c becomes 13. There is no doubt now that the module symbolic provides a I calculator indeed. To hide the rough edges under the hood, provide a tie()d interface to the -package C (compare with L). Add methods +package C. Add methods sub TIESCALAR { my $pack = shift; $pack->new(@_) } sub FETCH { shift } @@ -1468,8 +1556,8 @@ Ilya Zakharevich EFE. =head1 SEE ALSO -The L pragma can be used to enable or disable overloaded -operations within a lexical scope. +The C pragma can be used to enable or disable overloaded +operations within a lexical scope - see L. =head1 DIAGNOSTICS @@ -1505,17 +1593,91 @@ to a subroutine. =back -=head1 BUGS +=head1 BUGS AND PITFALLS + +=over + +=item * + +No warning is issued for invalid C keys. +Such errors are not always obvious: + + use overload "+0" => sub { ...; }, # should be "0+" + "not" => sub { ...; }; # should be "!" + +(Bug #74098) + +=item * + +A pitfall when fallback is TRUE and Perl resorts to a built-in +implementation of an operator is that some operators have more +than one semantic, for example C<|>: + + use overload '0+' => sub { $_[0]->{n}; }, + fallback => 1; + my $x = bless { n => 4 }, "main"; + my $y = bless { n => 8 }, "main"; + print $x | $y, "\n"; + +You might expect this to output "12". 
+In fact, it prints "<": the ASCII result of treating "|" +as a bitwise string operator - that is, the result of treating +the operands as the strings "4" and "8" rather than numbers. +The fact that numify (C<0+>) is implemented but stringify +(C<"">) isn't makes no difference since the latter is simply +autogenerated from the former. -Because it is used for overloading, the per-package hash %OVERLOAD now -has a special meaning in Perl. The symbol table is filled with names -looking like line-noise. +The only way to change this is to provide your own subroutine +for C<'|'>. + +=item * + +Magic autogeneration increases the potential for inadvertently +creating self-referential structures. +Currently Perl will not free self-referential +structures until cycles are explicitly broken. +For example, + + use overload '+' => 'add'; + sub add { bless [ \$_[0], \$_[1] ] }; + +is asking for trouble, since + + $obj += $y; + +will effectively become + + $obj = add($obj, $y, undef); + +with the same result as + + $obj = [\$obj, \$foo]; + +Even if no I assignment-variants of operators are present in +the script, they may be generated by the optimizer. +For example, + + "obj = $obj\n" + +may be optimized to + + my $tmp = 'obj = ' . $obj; $tmp .= "\n"; + +=item * + +Because it is used for overloading, the per-package hash +C<%OVERLOAD> now has a special meaning in Perl. +The symbol table is filled with names looking like line-noise. + +=item * For the purpose of inheritance every overloaded package behaves as if C is present (possibly undefined). This may create interesting effects if some package is not overloaded, but inherits from two overloaded packages. +=item * + Relation between overloading and tie()ing is broken. Overloading is triggered or not basing on the I class of tie()d value. @@ -1527,10 +1689,11 @@ coincides with the current one. B a way to fix this without a speed penalty. +=item * + Barewords are not covered by overloaded string constants. 
-This document is confusing. There are grammos and misleading language -used in places. It would seem a total rewrite is needed. +=back =cut -- 2.7.4