Variables and Scoping
Randal L. Schwartz
Programs push data around. In Perl, this data lives in variables,
and the variables can be associated with various scopes. Let's
take a look at Perl's peculiar scoping rules.
To begin, let's define a term, lexical scope. A lexical
scope provides the boundaries of some property of the program associated
with the text of the program itself, as opposed to properties that
are associated with the runtime state of the program. Lexical properties
might include variable declarations, compiler directives, exceptions
being caught, and so on.
In Perl, the largest lexical scope is the source file itself.
Lexically scoped items never affect anything larger than a file.
Additionally, nearly all blocks also introduce a nested lexical
scope that ends where the block ends. Because blocks are nested
and not overlapping, the lexical scopes also nest. This will become
clearer in the examples that follow.
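As a small sketch of a lexically scoped property, consider the standard use integer compiler directive. It applies only to the text of the block that contains it, and stops applying at the closing brace. The variable names here are made up for illustration.

```perl
# "use integer" is a compiler directive, so it is lexically scoped:
# it affects only the source text between the enclosing braces.
$outside_before = 7 / 2;        # ordinary division: 3.5
{
    use integer;                # in effect until the closing brace
    $inside = 7 / 2;            # integer division: 3
}
$outside_after = 7 / 2;         # the directive has ended: 3.5 again
print "$outside_before $inside $outside_after\n";   # 3.5 3 3.5
```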
Some of the variables in a Perl program are package variables
(also called symbol-table variables). A package variable's
full name consists of a package prefix followed by the specific
identifier for the variable. The prefix is separated from the identifier
by a double colon.
For example, in $Animal::count, Animal is the package
prefix, while count is the variable within the package. Both
the package and the identifier contain one or more alphanumerics
and/or underscores. Additionally, packages can have multiple, double-colon
separated parts, as in $Animal::Dog::count. Again, count
is the variable, and Animal::Dog is the package prefix. There's
no necessary relationship between Animal and Animal::Dog,
although people tend to give related names to related packages.
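A minimal sketch of that independence, with arbitrary values chosen for illustration: the two count variables share a spelling but nothing else.

```perl
$Animal::count      = 3;   # "count" in package Animal
$Animal::Dog::count = 1;   # "count" in package Animal::Dog, a separate variable
$Animal::count++;          # changes only $Animal::count
print "Animal: $Animal::count, Animal::Dog: $Animal::Dog::count\n";
# Animal: 4, Animal::Dog: 1
```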
Although package variables are formally named with colons, you
won't see many colons in most uses of package variables.
That's because by default variable names without colons are
automatically placed into the current package. The initial
current package is package main, so the following two code
snippets are identical:
print "What is your name? ";
chomp($input = <STDIN>);
$length = length $input;
print "Your name $input is $length characters long.\n";
and
print "What is your name? ";
chomp($main::input = <STDIN>);
$main::length = length $main::input;
print "Your name $main::input is $main::length characters long.\n";
It's a good thing we don't have to type main:: all
over the place.
So, why do we have packages, if everything already defaults to
package main? Well, it's so that we can have multiple
portions of code brought together into one program. Suppose the
code above were to be added into a program that already had a meaning
for $main::input or $main::length. We'd have
a collision of names. But we can fix that by using a different package
prefix:
print "What is your name? ";
chomp($Query::input = <STDIN>);
$Query::length = length $Query::input;
print "Your name $Query::input is $Query::length characters long.\n";
Now $Query::input has nothing to do with $main::input,
so we no longer have a naming collision. Of course, this is a lot
of typing, and we can shorten this by changing the current package,
using the package directive:
package Query;
print "What is your name? ";
chomp($input = <STDIN>);
$length = length $input;
print "Your name $input is $length characters long.\n";
Wow, that's easier to type, and yet the $input variable
there is really $Query::input, and won't conflict with
$main::input used elsewhere.
The package directive is lexically scoped (thought we forgot about
that term, eh?). This means that the package directive stays in
effect until the end of the current scope, or until another package
directive changes the current package again. For example, we could
put that piece of code into the middle of the rest of our program
as:
# initial package main
...
$input = "Hey"; # $main::input
package Query; # now in package Query
print "What is your name? ";
chomp($input = <STDIN>); # $Query::input
$length = length $input;
print "Your name $input is $length characters long.\n";
package main; # back to package main
print $input; # $main::input again
print "that length was $Query::length\n"; # reference prior value
However, we have to remember to reset the package back to what it
was before. This is error-prone, and perhaps not easy to maintain,
especially if we're not sure what the prior package might be.
But, since the package directive is lexically scoped, we can introduce
a block to limit the directive's influence:
# initial package main
...
$input = "Hey"; # $main::input
{ # start scope
    package Query; # now in package Query
    print "What is your name? ";
    chomp($input = <STDIN>); # $Query::input
    $length = length $input;
    print "Your name $input is $length characters long.\n";
} # end scope
# automatically back to package main
print $input; # $main::input again
print "that length was $Query::length\n"; # reference prior value
Ahh, that's a bit simpler.
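One way to watch the current package change and revert is the built-in __PACKAGE__ token, which yields the name of the package being compiled at that point. This is only a sketch; the @seen array is a name invented for the illustration.

```perl
push @seen, __PACKAGE__;            # @main::seen receives "main"
{
    package Query;
    # A bare @seen here would mean @Query::seen, so spell out main::
    push @main::seen, __PACKAGE__;  # receives "Query"
}
push @seen, __PACKAGE__;            # the block ended, so "main" again
print "@seen\n";                    # main Query main
```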
As that last example showed, we can access any package variable
from any location in our program, much as we can spell out the full
path to any accessible file in a UNIX filesystem regardless of our
current directory, even though the files at or below the current
directory are easier to type. But these global variables can lead
to global headaches, since we can't really know at a glance
about all the code that can examine or modify the variable.
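A sketch of the kind of headache meant here, assuming a hypothetical Logger package that is not part of the article: code far from the variable's home can change it simply by spelling out its full name.

```perl
$input = "Hey";                  # $main::input

package Logger;                  # hypothetical package, for illustration only
sub scrub { $main::input = "" }  # reaches into main by using the full name

package main;
Logger::scrub();                 # nothing at the call site hints that $input changes
print "length of input is now ", length($input), "\n";   # 0
```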
Like most modern programming languages, Perl also includes the
notion of a lexical variable. Lexical variables do not belong to
a package, so they cannot be referenced outside the lexical scope
in which they are declared. Their names also cannot contain colons,
because they do not have a package prefix.
Lexical variables are introduced with the my keyword:
print "What is your name? ";
chomp(my $input = <STDIN>); # lexical $input
my $length = length $input; # lexical $length
print "Your name $input is $length characters long.\n";
Because these variables are introduced outside any block in this example,
they are lexically scoped to the file in which they appear. If this
code is part of a file being included with eval, do,
require, or use, there's no chance that this $input
will conflict with any other use of $input. There's also
no syntax that would let any other code outside of this code access
those variables, so we can be assured that our variables won't
be changing mysteriously.
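To illustrate, here is a short sketch in which a lexical and a package variable share the name input. They are two unrelated variables, and the package-qualified name never reaches the lexical one.

```perl
my $input = "lexical";             # file-scoped lexical, in no package
$main::input = "package";          # the package variable of the same name
print "$input / $main::input\n";   # lexical / package
```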
Besides file-scoped lexical variables, another common appearance
is in the block that belongs to a subroutine:
sub get_name_length {
    print "What is your name? ";
    chomp(my $input = <STDIN>); # lexical $input
    my $length = length $input; # lexical $length
    print "Your name $input is $length characters long.\n";
}
When the subroutine returns, the lexical variables are discarded,
automatically recycling the memory that had been used. Additionally,
any outer declaration of $input or $length is temporarily
shadowed within the subroutine, protecting the outer variables from
accidental alteration.
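The shadowing can be seen in a small sketch. The name_length subroutine is a hypothetical variant that takes its input as an argument instead of reading STDIN, so that it runs unattended.

```perl
my $length = 42;                       # outer lexical

sub name_length {                      # hypothetical helper, for illustration
    my ($name) = @_;
    my $length = length $name;         # new lexical, shadows the outer $length
    return $length;
}

my $result = name_length("Randal");
print "inner: $result, outer: $length\n";   # inner: 6, outer: 42
```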
We can also create temporary variables this way:
{ # start temporary scope
    print "What is your name? ";
    chomp(my $input = <STDIN>); # lexical $input
    my $length = length $input; # lexical $length
    print "Your name $input is $length characters long.\n";
} # end temporary scope
The variables declared and used in this block will be recycled at
the end of the block, just as if we had placed this code into a subroutine.
A frequent admonition in the Perl literature is "Always use
strict!". What does this do, precisely? Well, among other things,
use strict disables the automatic prepending of the package
to a variable name. Once use strict is in effect, a name
without colons must have been declared, either as a lexical variable,
or as a specially noted package variable.
The primary purpose of use strict is to catch any random
erroneous variations of a variable name:
print "What is your name? ";
chomp($input = <STDIN>); # $main::input
$length = length $input; # $main::length
print "Your name $input is $lenth characters long.\n"; # broken
Oops! That's $main::lenth, not $main::length. But
by turning on use strict, we no longer get main:: in
front of anything we mention, and thus we must declare the variables
lexically at first use instead:
use strict;
print "What is your name? ";
chomp(my $input = <STDIN>); # lexical $input
my $length = length $input; # lexical $length
print "Your name $input is $lenth characters long.\n"; # caught
The compiler will abort at that last line, because we can't just
turn $lenth into $main::lenth any more.
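To see the rejection without stopping a whole script, one option is to compile the faulty code from a string. This is only a sketch: a string eval is compiled under the use strict that is in effect around it, and the exact wording of the error message varies between Perl versions.

```perl
use strict;

# The misspelled name sits in a string so this script itself still compiles;
# eval compiles the string under the "use strict" in effect here.
my $code = 'my $length = 5; print "$lenth characters\n"; 1';
my $ok   = eval $code;
print "rejected at compile time: $@" unless $ok;
```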
To refer to package variables under use strict, we can simply use the
full name, package prefix included:
use strict;
print "$Animal::Dog::count dogs were seen!\n";
print "$main::length characters in that name.\n";
If we want to refer to a package variable without the package prefix,
we can use the use vars compiler directive:
use strict;
use vars qw($length); # now permits $length to mean $main::length
print "$length characters\n"; # $main::length
Any name in the use vars list can be referenced in the current
package as if it were fully specified. Once seen, the directive is
in effect for that variable name as long as the current package is
the same as the package in which the use vars appeared. So,
this is an error:
use strict;
use vars qw($length); # $length is $main::length in main
{ package Query;
    print $length; # COMPILE ERROR... $Query::length not permitted
}
print $length; # would have been ok, back to $main::length
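A runnable sketch of the same restriction, using a string eval so that the failed compile is reported instead of aborting the script. The $ok and $copy names are invented for the illustration, and the error text differs between Perl versions.

```perl
use strict;
use vars qw($length);      # permits a bare $length while in package main
$length = 5;               # $main::length

# Compile the same bare name from inside another package. The string
# eval keeps the failure from stopping this script.
my $ok = eval q{ package Query; my $copy = $length; 1 };
print "in main: $length\n";
print "in Query: ", ($ok ? "compiled\n" : "rejected: $@");
```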
In Perl 5.6 and later, the our keyword is available
as a parallel to my. It functions similarly to use vars,
but the declaration of the package variable is lexically scoped, not
dependent on the current package.
use strict;
our $length; # $length is $main::length in this scope
{ package Query;
    our $input; # $input is $Query::input in this scope
    print $length; # permitted access to $main::length here
    print $input; # permitted access to $Query::input here
} # end of scope, so $input goes out of scope
print $length; # still $main::length
print $input; # COMPILE ERROR, no access to $main::input permitted
As you can see, use vars and our are not precisely the
same thing, but in general, they both serve to permit selected package
variables to be used without colons.
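The difference shows up most clearly when the package changes without a block. In this sketch, with values invented for illustration, each our declaration stays visible for the rest of the file scope, whichever package happens to be current.

```perl
use strict;
our $length = 5;            # $main::length, usable by its short name from here on

package Query;              # switch packages without opening a block
our $input = "Randal";      # $Query::input
print "$length\n";          # still $main::length, because our follows the scope

package main;
print "$input\n";           # still $Query::input, for the same reason
print "$Query::input / $main::length\n";   # Randal / 5
```

With use vars in place of our, both of the short-name print statements would fail to compile, since each bare name is used outside the package that declared it.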
I hope this brief overview of package and lexical variables and
scoping has been useful. Until next time, enjoy!
Randal L. Schwartz is a two-decade veteran of the software
industry -- skilled in software design, system administration,
security, technical writing, and training. He has coauthored the
"must-have" standards: Programming Perl, Learning
Perl, Learning Perl for Win32 Systems, and Effective
Perl Programming. He's also a frequent contributor to the
Perl newsgroups, and has moderated comp.lang.perl.announce since
its inception. Since 1985, Randal has owned and operated Stonehenge
Consulting Services, Inc.