[CS143-PA3] Semantic Analyzer

[CS143-PA3] Semantic Analyzer

令和きれいだ

full file link https://github.com/Aria461863631/cs143/tree/master/PA4


used files

  • semant.c: write semantic analysis in semant() function
  • semant.h: header file
  • cool-tree.h or cool-tree.handcode.h: add additional declarations
  • symtab_example.cc: example for using SymbolTable


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
void program_class::semant(){
    initialize_constants();

    /* ClassTable constructor may do some semantic analysis */
    ClassTable *classtable = new ClassTable(classes);

    /* some semantic analysis code may go here (3-passes total) */
    installMethods(Object);
    checkClassesType(Object);

    if (classtable->errors()) {
	cerr << "Compilation halted due to static semantic errors." << endl;
	exit(1);
    }
}


Inheritance

start implementing ClassTable for representing the inheritance graph.

1
2
3
4
5
6
ClassTable::ClassTable(Classes classes) : semant_errors(0) , error_stream(cerr) {
    /* check inheritance when building inheritance graph 
     * If the inheritance graph is not well-defined, 
     * it is acceptable to abort compilation
     */
}

requirements

  • All class names are globally visible.

  • inheritance graph be acyclic

  • basic classes
    It is an error to inherit from or redefine Int, Str, Bool.
    It is an error to redefine the IO class.
    A class can make use of the methods in the IO class by inheriting from IO.
  • It is also an error if class A inherits from class B but class B is not defined.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
ClassTable::ClassTable(Classes classes) : semant_errors(0), error_stream(cerr) {

  install_basic_classes();

  /*
   * in semant.h
   * std::map<Symbol, Class_> ClassTable::m_classes
   * a map from Symbol to Class_
   * ==============================================
   * std::map<Symbol, std::list<Symbol>> edges;
   * a map from Symbol to its child Symbols
   * building inheritance tree
   */

  // dfs use
  std::map<Class_, int> visited;

  log << "Start checking inheritance" << std::endl;

  /* Fill this in */
  for (int i = classes->first(); classes->more(i); i = classes->next(i)) {
    curclass = classes->nth(i);
    Symbol curname = curclass->get_name();

    if (curname == SELF_TYPE || curname == Int || curname == IO ||
        curname == Bool || curname == Str || curname == Object) {
      semant_error(curclass)
          << "Error! Redefinition of basic class " << curname << std::endl;
      return;
    }

    if (mp_class.find(curname) != mp_class.end()) {
      semant_error(curclass)
          << "Error! Class " << curname << " was previously defined\n";
      return;
    }

    // install map;
    mp_class.insert(std::make_pair(curname, curclass));
    visited.insert(std::make_pair(curclass, 0));
    edges[curclass->get_parent()].push_back(curname);
  }

  if (mp_class.find(Main) == mp_class.end()) {
    semant_error() << "Class Main is not defined.\n";
    return;
  }

  log << "DFS begin\n";

  // dfs for asyclic
  bool err = false, cycle = false;
  for (int i = classes->first(); classes->more(i); i = classes->next(i)) {
    curclass = classes->nth(i);

    if (visited[curclass] == 0) {
      DFS(curclass, classes, visited, err, cycle);
      if (err)
        return;
    }
  }
}

/*
 * find cycles in inheritance path
 * using dfs topologic sort
 */
void ClassTable::DFS(Class_ cur, Classes classes, std::map<Class_, int> &vis,
                     bool &err, bool &cycle) {

  log << "DFS class " << cur->get_name() << "\n";

  if (vis[cur] == 1) {
    cycle = true;
    err = true;
    return;
  }

  Symbol parent_name = cur->get_parent();
  if (mp_class.find(parent_name) == mp_class.end()) {
    semant_error(cur) << "Error! Class " << cur->get_name()
                      << " inherits from an undefined class " << parent_name
                      << std::endl;
    err = true;
    return;
  }

  if (parent_name == Int || parent_name == Str || parent_name == Bool ||
      parent_name == SELF_TYPE) {
    semant_error(cur) << "Error! Class " << cur->get_name()
                      << " cannot inherit " << parent_name << std::endl;
    err = true;
    return;
  }

  vis[cur] = 1;
  if (parent_name != Object)
    DFS(mp_class[parent_name], classes, vis, err, cycle);
  if (cycle) {
    // print all classes involved in a cycle
    semant_error(cur) << "Error! Class " << cur->get_name()
                      << " is involved in an inheritance cycle\n";
  }
  if (err)
    return;
  vis[cur] = 2;
}


Naming and Scoping

The type environment consists of three parts:

  • the name of the current class $ C $

  • an object environment $ O $ is a function of the form $$ O(v) = T $$which assigns the type $ T $ to object identifier $ v $.

  • a method environment $ M $ it is a function of the form $$ M (C, f ) = (T_1 , . . . , T _{n−1} , T _n ) $$

    • $ C $ is a class name (a type), $ f $ is a method name, and $ t_1 , . . . , t_n $ are types.
    • The tuple of types is the signature of the method.
    • The interpretation of signatures is that
      in class $ C $ the method $ f $ has formal parameters of types $ (t_1 , . . . , t_{n−1} ) $—in that order—and a return type $ t_n $


using symbol table to represent type environment

1
2
3
4
5
6
7
static Class_ curclass = NULL;
// ObjectId to Class(Type)
static SymbolTable<Symbol, Symbol> objectEnv;
// methods map along inheritance path
static SymbolTable<Symbol, Feature_class> inheritEnv;
// ClassName to methodName to methods
static std::map<Symbol, SymbolTable<Symbol, Feature_class>> methodEnv;


install symbol tables

In Cool all attributes have scope local to the class, and all methods have global scope.

first install global method table

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
/*
 * utils fuction for AddMethod
 * compare two formal list of methods in different classes
 */
Symbol method_class::matchFormals(Formals actual) {
  Symbol type = return_type;
  for (int i = actual->first(); actual->more(i); i = actual->next(i)) {
    Symbol actual_type = actual->nth(i)->get_typedecl();
    if (formals->more(i) == false) {
      return No_type;
    }
    Symbol declar_type = formals->nth(i)->get_typedecl();
    if (declar_type != actual_type) {
      return No_type;
    }
  }
  return type;
}

/*
 * If a class C inherits a method f from an ancestor class P, then C may
 * override the inherited definition of f provided the number of arguments, the
 * types of the formal parameters, and the return type are exactly the same in
 * both definitions.
 */
void method_class::AddMethod() {
  log << "addMethod: " << name << " in class " << curclass->get_name()
      << std::endl;
  if (inheritEnv.lookup(name) != NULL) {
    Feature oldmethod = inheritEnv.lookup(name);
    Symbol oldret_type = oldmethod->matchFormals(formals);
    if (oldret_type == No_type || oldret_type != return_type) {
      classtable->semant_error(curclass) << "Invalid override\n";
    }
  }

  inheritEnv.addid(name, copy_Feature());
  methodEnv[curclass->get_name()].addid(name, copy_Feature());
}

/*
 * DFS from Object to all child classes
 * use inheritEnv to record methods along current inheritance path
 * use methodEnv to record methods in each class (without parent methods)
 */
void program_class::installMethods(Symbol classname) {

  curclass = classtable->getClassByName(classname);
  inheritEnv.enterscope();
  methodEnv[classname].enterscope();
  Features fs = curclass->get_features();
  for (int i = fs->first(); fs->more(i); i = fs->next(i)) {
    Feature curFeature = fs->nth(i);
    curFeature->AddMethod();
  }

  //dfs to children
  std::list<Symbol>::iterator iter;
  std::list<Symbol> nexClass = classtable->getChildClasses(classname);
  for (iter = nexClass.begin(); iter != nexClass.end(); ++iter)
    installMethods(*iter);

  inheritEnv.exitscope();
}

then install object table

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
void attr_class::AddAttribute() {
  log << "addAttribute: " << name << " " << type_decl << std::endl;
  if (name == self)
    classtable->semant_error(curclass)
        << "'self' cannot be the name of an attribute\n";
  else if (objectEnv.lookup(name) != NULL)
    classtable->semant_error(curclass)
        << "Attribute " << name << " is an attribute of an inherited class\n"
        << std::endl;
  objectEnv.addid(name, new Symbol(type_decl));
}

/*
 * DFS from Object as above
 */
void program_class::checkClassesType(Symbol classname) {

  curclass = classtable->getClassByName(classname);
  objectEnv.enterscope();
  objectEnv.addid(self, new Symbol(SELF_TYPE));//add self symbol
  Features fs = curclass->get_features();
  for (int i = fs->first(); fs->more(i); i = fs->next(i)) {
    Feature curFeature = fs->nth(i);
    curFeature->AddAttribute();
  }

  /*
   * start type checking under local object environment
   */
  if (!classtable->isBasicClass(classname)) {

    for (int i = fs->first(); fs->more(i); i = fs->next(i)) {
      Feature curFeature = fs->nth(i);
      curFeature->CheckFeatureType();//type check in arrtibute and method
    }
  }

  //dfs to child
  std::list<Symbol>::iterator iter;
  std::list<Symbol> nexClass = classtable->getChildClasses(classname);
  for (iter = nexClass.begin(); iter != nexClass.end(); ++iter)
    checkClassesType(*iter);

  //close local environment
  objectEnv.exitscope();
}


Type Checking

implemented in a single traversal over AST

type environment is passed down the tree: from parent to child

types are passed up the tree: from child to parent

method

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
void formal_class::AddFormal() {
  if (name == self) {
    classtable->semant_error(curclass)
        << "'self' cannot be the name of a formal parameter\n";
  }
  if (type_decl == SELF_TYPE) {
    classtable->semant_error(curclass)
        << "Formal parameter " << name << " cannot have type SELF_TYPE\n";
  }

  log << "addFormal: " << name << " " << type_decl << std::endl;
  if (objectEnv.probe(name) != NULL)
    classtable->semant_error(curclass)
        << "Formal parameter " << name << " is multiply defined\n";
  objectEnv.addid(name, new Symbol(type_decl));
}

Symbol method_class::CheckFeatureType() {
  objectEnv.enterscope();
  for (int i = formals->first(); formals->more(i); i = formals->next(i)) {
    Formal f = formals->nth(i);
    f->AddFormal();
  }

  Symbol expr_type = expr->CheckExprType();
  Symbol type = expr_type;

  log << "check Method: " << name << "  " << expr_type << "  " << return_type
      << std::endl;

  if (classtable->isSubtype(expr_type, return_type) == false) {
    classtable->semant_error(curclass)
        << "Inferred return type " << expr_type << " of method " << name
        << " does not conform to declared return type " << return_type
        << std::endl;
    type = Object;
  }

  objectEnv.exitscope();

  return type;
}

attribute

1
2
3
4
5
6
7
8
9
10
11
12
13
Symbol attr_class::CheckFeatureType() {
  log << "\t attr_class \n";
  Symbol expr_type = init->CheckExprType();//recursively check expressions type
  if (expr_type == No_type)
    return type_decl;

  Symbol type = expr_type;
  if (classtable->isSubtype(expr_type, type_decl) == false) {
    classtable->semant_error(curclass) << "Error! feature assign invalid\n";
    type = Object;
  }
  return type;
}


Code Generator Interface

For every expression node, its type field must be set to the Symbol naming the type inferred by your type checker.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
//in cool-tree.handcode.h
#define Expression_EXTRAS                                                      \
  Symbol type;                                                                 \
  Symbol get_type() { return type; }                                           \
  Expression set_type(Symbol s) {                                              \
    type = s;                                                                  \
    return this;                                                               \
  }                                                                            \
  virtual void dump_with_types(ostream &, int) = 0;                            \
  void dump_type(ostream &, int);                                              \
  Expression_class() { type = (Symbol)NULL; }                                  \
  virtual Symbol CheckExprType() = 0;

#define Expression_SHARED_EXTRAS                                               \
  void dump_with_types(ostream &, int);                                        \
  Symbol CheckExprType();


static vs dynamic

The dynamic type of an object is the class $ C $ that is used in the new C expression that created it

Soundness theorem for Cool type system;

$ \forall E, dynamic\_type(E) \le static\_type(E) $

all operations that can be used on an object of type $ C $ can also be used on an object of type $ C^{‘} \le C $

1
2
3
4
5
6
7
8
class A {...}
class B inherits A {...}
class Main{
	//x has static type A and dynamic type A
	x:A <- new A;
	//x has static type A and dynamic type B
	x <- new B;
}


static-dispatch

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
Symbol method_class::matchFormals(Expressions actual) {

  Symbol type = return_type;
  for (int i = actual->first(); actual->more(i); i = actual->next(i)) {
    Symbol actual_type = actual->nth(i)->CheckExprType();
    if (formals->more(i) == false) {
      return Object;
    }
    Symbol declar_type = formals->nth(i)->get_typedecl();
    if (classtable->isSubtype(actual_type, declar_type) == false) {
      classtable->semant_error(curclass) << "Error! unmatched formal type\n";
      type = Object;
    }
  }
  return type;
}

Symbol static_dispatch_class::CheckExprType() {
  log << "\t static dispatch\n";
  Symbol expr_type = expr->CheckExprType();//check e0 type

  //static dispatch to type T, so e0 must be a subtype
  if (classtable->isSubtype(expr_type, type_name) == false) {
    classtable->semant_error(curclass)
        << "Expression type " << expr_type
        << " does not conform to declared static dispatch type " << type_name
        << std::endl;
    type = Object;
    return type;
  }

  //find method in class T's ancestor, return ancestor class name
  Symbol ancestor = classtable->findMethodInAncestor(type_name, name);

  if (ancestor == No_class) {
    classtable->semant_error(curclass)
        << "Error! cannot find method " << name << std::endl;
    type = Object;
    return type;
  }

  //return method from method table
  Feature_class *method = methodEnv[ancestor].lookup(name);

  //match formals of expression types and formal types
  type = method->matchFormals(actual);

  if (type == SELF_TYPE)
    type = expr_type;
  return type;
}

if

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Symbol cond_class::CheckExprType() {
  log << "\t cond(if else)\n";

  Symbol pred_type = pred->CheckExprType();
  if (pred_type != Bool)
    classtable->semant_error(curclass)
        << "Error! condition has non-bool predicate\n";

  Symbol then_type = then_exp->CheckExprType();
  Symbol else_type = else_exp->CheckExprType();
  type = then_type;
  //return lca of types, because we don't know the predicate result now
  if (else_type != No_type)
    type = classtable->CommonAncestor(then_type, else_type);//return lca of types
  return type;
}

case

  • Case expressions provide runtime type tests on objects.
  • First, $ e_0 $ is evaluated and its dynamic type C noted.
  • Next, from among the branches the branch with the least type $ T_k $ such that $ C ≤ T_k $ is selected. The identifier $x_k$ is bound to the value of $ e_0 $ and the expression $ e_k $ is evaluated.
  • The result of the case is the value of $ e_k $

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
Symbol typcase_class::CheckExprType() {
  log << "\t cases \n";

  expr->CheckExprType();

  std::set<Symbol> s;

  int i = cases->first();
  type = cases->nth(i)->CheckBranchType();
  for (int i = cases->first(); cases->more(i); i = cases->next(i)) {
    Case oneBranch = cases->nth(i);

    if (s.find(oneBranch->get_typedecl()) != s.end()) {
      classtable->semant_error(curclass) << "Error! branch has same type\n";
    }
    s.insert(oneBranch->get_typedecl());

    Symbol branch_type = oneBranch->CheckBranchType();
    type = classtable->CommonAncestor(type, branch_type);
  }
  return type;
}

Symbol branch_class::CheckBranchType() {
  log << "\t branch\n";

  objectEnv.enterscope();
  objectEnv.addid(name, new Symbol(type_decl));

  Symbol type = expr->CheckExprType();

  objectEnv.exitscope();
  return type;
}

let-init

  • Typing a multiple let
    let $ x_1 : T_1 [← e_1 ], x_2 : T_2 [← e_2], . . . , x_n : T_n [← e_n ] $ in $ e $
    is defined to be the same as typing
    let $ x_1 : T_1 [← e_1 ] $ in (let_expr…)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Symbol let_class::CheckExprType() {
  log << "\t let\n";

  objectEnv.enterscope();
  if (identifier == self)
    classtable->semant_error(curclass)
        << "'self' cannot be bound in a 'let' expression\n";
  objectEnv.addid(identifier, new Symbol(type_decl));//add one symbol

  Symbol ini_type = init->CheckExprType();
  type = body->CheckExprType();

  if (ini_type != No_type)
    if (classtable->isSubtype(ini_type, type_decl) == false)
      classtable->semant_error(curclass) << "Error! let init val not child\n";
  objectEnv.exitscope();

  return type;
}

new

1
2
3
4
5
6
7
8
9
10
Symbol new__class::CheckExprType() {
  log << "\t new_class \n";
  type = type_name;
  if (type != SELF_TYPE && classtable->getClassByName(type_name) == NULL) {
    classtable->semant_error(curclass)
        << " Undefined return type " << type_name << std::endl;
    type = Object;
  }
  return type;
}

var

1
2
3
4
5
6
7
8
9
10
11
12
13
14
Symbol object_class::CheckExprType() {
  log << "\t object class\n";

  Symbol *found_type = objectEnv.lookup(name);

  if (found_type == NULL) {
    classtable->semant_error(curclass)
        << "Undeclared identifier " << name << std::endl;
    type = Object;
  } else
    type = *found_type;

  return type;
}


SELF_TYPE

The type SELF_TYPE is used to refer to the type of the self variable.

Note that the meaning of SELF_TYPE is not fixed, but depends on the class in which it is used.

  • $ SELF\_TYPE_C$ to refers to an occurrence of SELF_TYPE in the body of C

cannot return self with a concrete type

1
2
3
4
5
6
7
8
9
10
11
12
class Count{
	i:Int <- 0;
    //must return SELF_TYPE, so that inc method works in child classes
	inc():Count{{i<-i+1;self;}};
};
class Stock inherits Count{
	name:Str;
}
class Main{
	//if inc return Count, than assignment failed
	Stock a <- (new Stock).inc();
}


inheritance rule

Let $ T $ and $T^{‘} $ be any types but SELF_TYPE

$ SELF\_TYPE_C \le SELF\_TYPE_C$

  • In Cool we never compare SELF_TYPE coming from different classes

$ SELF\_TYPE_C \le T $ if $ C \le T $

  • $ SELF\_TYPE_C $ can be any subtype of C, including C itself

$ T \le SELF\_TYPE_C $ always false


least upper bound rule

lub($ SELF\_TYPE_C $, $SELF\_TYPE_C$) = $ SELF\_TYPE_C $

lub($ SELF\_TYPE_C $,$ T $) = lub($ C $, $ T $)

lub($ T $,$ SELF\_TYPE_C $) = lub($ C $, $ T $)


Finally, SELF TYPE may be used in the following places:

  • new SELF_TYPE

  • the return type of a method

    1
    
    m(x:T):SELF_TYPE{...}
    
  • the declared type of a let variable

    1
    
    let x:SELF_TYPE in E
    
  • the declared type of an attribute.

    1
    
    x:SELF_TYPE
    
  • actual arguments of methods ( ≤ declared arguments type )

  • type of dispatch expression

    1
    2
    
    e0 : SELF_TYPE
    e0.f(e1,e2..en)
    


No other uses of SELF_TYPE are permitted.

  • classname

    1
    2
    
    //T, T' cannot be SELF_TYPE
    class T inherits T' {...}
    
  • static dispatch

    1
    2
    
    //T cannot be SELF_TYPE
    [email protected](E1,E2...En)
    
  • formal parameters


functions

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
bool ClassTable::isSubtype(Symbol child, Symbol father) {
  if (child == SELF_TYPE && father == SELF_TYPE)
    return true;
  if (father == SELF_TYPE)
    return false;
  if (child == SELF_TYPE)
    child = curclass->get_name();

  log << "\t\t\t isSubtype";
  while (child != father && child != No_class) {
    child = mp_class[child]->get_parent();
  }
  if (child == father)
    return true;

  return false;
}

Symbol ClassTable::findMethodInAncestor(Symbol classname, Symbol methodname) {
  if (classname == SELF_TYPE)
    classname = curclass->get_name();

  if (classname == No_class)
    return No_class;
  if (methodEnv[classname].lookup(methodname) != NULL)
    return classname;

  return findMethodInAncestor(mp_class[classname]->get_parent(), methodname);
}