Naming and Scope in Programs.

This is an automatically translated post by LLM. The original post is in Chinese. If you find any translation errors, please leave a comment to help me improve the translation. Thanks!

Recently, I have been reading "The Essence of Code", which explains the concepts behind many programming languages. This note mainly refers to Chapter 7 of the book: Names and Scopes.

Reposted from: 米游社 @战小医仙

Naming

When a computer executes a program, it works with memory addresses and has no need to know the name of any variable. Early programming therefore had no concept of naming. Names for variables and functions were introduced later so that programmers could remember what each value means and what type it has. For example, "China Western Science and Technology Innovation Port" is easier to understand and remember than "Longitude: 108.654519, Latitude: 34.255314". Similarly, the description "a non-existent website" is more memorable than "8.7.198.46".

So how does a computer associate names with variables and functions? The answer is by using a lookup table. This is similar to finding a book in a library by searching the database, converting the book title into a classification number, and then using the classification number to locate the desired book. Taking the library of the Innovation Port as an example:

"The Essence of Code" => TP312/159, "The First Volume of Ancient and Modern Mathematical Thoughts" => O11/510-1/C.1

With such a lookup table, when the computer processes the command "find 'The Essence of Code'", it can be translated into "find the book with the number TP312/159" for execution.
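
The library analogy can be sketched as a plain dictionary. This is only an illustration: the names `catalogue` and `find` are hypothetical, and the entries are the two call numbers quoted above.

```python
# Hypothetical catalogue: book title -> call number (values from the example above).
catalogue = {
    "The Essence of Code": "TP312/159",
    "The First Volume of Ancient and Modern Mathematical Thoughts": "O11/510-1/C.1",
}


def find(title):
    """Translate a title into the call number that is used to locate the book."""
    return catalogue[title]


print(find("The Essence of Code"))  # TP312/159
```

The caller only ever uses the title; the translation into a call number happens inside the lookup, which is roughly the role a name table plays for variables.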

Taking Python as an example:

x = 'python'
y = 'hello'
z = x

id(x) # => 2256008235312
id(y) # => 2256008254512
id(z) # => 2256008235312

In the above program, the computer establishes the following lookup table:

graph LR
x[x]-->a[2256008235312]
y[y]-->b[2256008254512]
z[z]-->a

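In this lookup table, the names x and z point to the same object (in CPython, id() returns the object's memory address). If that object were modified through z, the change would also be visible through x. Strings in Python are immutable, however, so this cannot happen here: assigning a new value to z only rebinds the name z and leaves x untouched.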
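
The difference between modifying a shared object and rebinding a name is easier to see with a mutable value. The following is a minimal sketch using a list; the names `a` and `b` are chosen only for illustration.

```python
a = [1, 2]
b = a            # b is bound to the same list object as a
b.append(3)      # modify the shared object in place
print(a)         # [1, 2, 3]
print(a is b)    # True

b = [0]          # rebinding: only b's entry in the lookup table changes
print(a)         # [1, 2, 3]
print(a is b)    # False
```

In-place modification is visible through every name bound to the object, while assignment changes only which object a single name refers to.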

Naming Conflicts

In early program design, the lookup table was shared by the entire program. This can lead to some problems, such as the following C++ code:

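#include <iostream>
using namespace std;

int i = 0;      // global: visible to every function in the program

void print();   // defined below; declared here so main() can call it

int main()
{
    for (i = 0; i < 10; i++) {
        print();
    }
}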

This loop should end after running 10 times. However, what if the variable i is modified in the function print:

void print() {
    cout << "ok" << endl;
    i--;   // modifies the same global i that main() uses as its loop counter
}

Since the same variable i is used both inside and outside the function, the loop does not end after the expected 10 iterations. Every i-- in print cancels the i++ of the for loop, so i never reaches 10 and the program loops indefinitely.
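
The same conflict can be reproduced in Python as a rough sketch. The names `print_ok` and `run`, and the `max_steps` cap that keeps the example from running forever, are assumptions added for illustration and are not part of the C++ program above.

```python
i = 0  # one name shared by the whole program


def print_ok():
    global i
    i -= 1  # same mistake as in print(): touches the shared loop counter


def run(max_steps=25):
    """Mimic the for loop, but give up after max_steps iterations."""
    global i
    i = 0
    steps = 0
    while i < 10 and steps < max_steps:
        print_ok()
        i += 1
        steps += 1
    return steps, i


print(run())  # (25, 0)
```

The loop only stops because of the artificial cap: after 25 iterations the counter is still 0.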

How can such naming conflicts be avoided? One way is to use longer names that state what the variable is used for, such as i_in_print and i_in_main, or to agree on a variable naming policy in collaborative development.

Another way is to introduce scopes. The following text will mainly focus on scopes.

Scopes

Scope refers to the region of a program in which a name is valid. To prevent the changes that the print function makes to i from affecting the outside, the value of i can be saved when the function is entered and restored when the function returns. This mechanism is called dynamic scoping.

Dynamic Scoping

Taking the Perl language as an example, the variable value is saved at the entrance of the function and written back at the end of the function.

sub shori {
    $old_i = $i;   # save the caller's value on entry
    # do something
    $i = $old_i;   # write it back on exit
}

However, this approach has a problem. When a function calls another function before the old value has been written back, the called function sees the caller's modified value. Take the following two functions as an example:

$x = "global";

sub yobu {
    local $x = "yobu";   # saves the old $x and restores it when yobu returns
    &yobarebu();
}

sub yobarebu {
    print "$x\n";
}

&yobu();   # prints "yobu", not "global"

When the function yobu is called, the variable x is set to yobu. The function yobarebu is then called before yobu returns and the old value is written back, so the value of x that yobarebu reads is yobu. The modification made in yobu therefore affects the execution of the function it calls.

The reason is that, under dynamic scoping, the lookup tables are shared by all functions. On top of the global lookup table, a new table is created each time a function is entered, and until that function returns, this table is visible to every function it calls.

In summary, under dynamic scoping a name is resolved by searching these tables in order, from the most recently created to the oldest. This is why functions influence each other when one calls another. To solve this problem, each function can be given its own lookup table for its variables. This is called static scoping.
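
The nearest-to-farthest search can be modeled in Python with an explicit stack of dictionaries. This is a simplified model and not how Perl is actually implemented; the `tables` list and the `lookup` helper are hypothetical names introduced for the sketch.

```python
# Stack of lookup tables; index 0 is the global table.
tables = [{"x": "global"}]


def lookup(name):
    """Search the tables from the most recently entered to the global one."""
    for table in reversed(tables):
        if name in table:
            return table[name]
    raise NameError(name)


def yobarebu():
    return lookup("x")


def yobu():
    tables.append({"x": "yobu"})  # entering yobu: plays the role of `local $x`
    try:
        return yobarebu()
    finally:
        tables.pop()              # leaving yobu: the old value is visible again


print(yobu())      # yobu
print(yobarebu())  # global
```

Because `yobarebu` searches whatever tables happen to be on the stack when it runs, its result depends on who called it.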

Static Scoping

Static scoping uses a separate variable table for each function. When the computer searches for a variable, it first searches in the variable table of the current function, and if it is not found, it searches in the global variable table. This effectively solves the problem of mutual influence when functions are nested in dynamic scoping.
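
Python itself is statically scoped, so the earlier Perl example behaves differently when translated. The following sketch reuses the function names from that example.

```python
x = "global"


def yobarebu():
    # x is not in this function's own table, so the global table is used.
    return x


def yobu():
    x = "yobu"  # lives only in yobu's own table
    return yobarebu()


print(yobu())  # global
```

The local x of `yobu` is invisible to `yobarebu`, because the lookup follows where a function is defined rather than who called it.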

However, static scoping also has some problems. One is the problem of nested functions. Taking Python as an example:

x = "global"
def foo():
    x = "foo"
    def bar():
        print(x)
    bar()
foo()

In this nested function, at first glance, bar should print foo. However, in early versions of Python (2.0 and earlier), when the bar function looked up the variable x, it searched its own local lookup table first and, if the name was not found there, went straight to the global lookup table. The output was therefore global, which caused a lot of confusion. Nested scopes were added in Python 2.1 (PEP 227) and became the default in Python 2.2, so current versions print foo.
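
On a current Python 3 interpreter, the enclosing function's table is searched before the global one. A small sketch makes this visible by returning the inner function and inspecting it; returning `bar` instead of calling it inside `foo` is a change made here for illustration.

```python
x = "global"


def foo():
    x = "foo"

    def bar():
        return x  # resolved in foo's scope, not in the global scope

    return bar


bar = foo()
print(bar())                             # foo
print(bar.__code__.co_freevars)          # ('x',)
print(bar.__closure__[0].cell_contents)  # foo
```

`co_freevars` lists the names that `bar` takes from an enclosing function, and `__closure__` holds the cells that store their values.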

The second problem is rebinding a name in an outer scope. When a function assigns to a name that belongs to an outer scope, a new local variable is created instead, so the value in the outer scope cannot be modified. For example:

x = "global"
def foo():
    x = "foo"
    def bar():
        x = "bar"
        # A new x is created
        # Unable to modify the outer x
    bar()
foo()

In Python 3.0, the nonlocal keyword was introduced to solve this problem:

x = "global"
def foo():
    x = "foo"
    def bar():
        nonlocal x
        x = "bar"
        # No new x is created
        # The x that belongs to foo is rebound
    bar()
foo()
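
To observe the effect of nonlocal, the two variants can be placed side by side and made to return the outer value. The function names `without_nonlocal` and `with_nonlocal` are hypothetical and used only for this sketch.

```python
def without_nonlocal():
    x = "foo"

    def bar():
        x = "bar"  # creates a new local x inside bar

    bar()
    return x


def with_nonlocal():
    x = "foo"

    def bar():
        nonlocal x
        x = "bar"  # rebinds the x that belongs to with_nonlocal

    bar()
    return x


print(without_nonlocal())  # foo
print(with_nonlocal())     # bar
```

Only the second variant changes the value seen by the enclosing function.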

Summary

The development of computer technology has brought more powerful computing capabilities, and with them increasingly complex programs. The evolution of naming and scoping reflects the tension between what humans need and what computers need when variables and functions are named in large-scale programs. No feature of a programming language is created out of thin air; each one appears in order to solve a particular problem.