Inferring types in Smalltalk

  • Authors:
  • Norihisa Suzuki

  • Affiliations:
  • Xerox Palo Alto Research Center, Palo Alto, CA

  • Venue:
  • POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
  • Year:
  • 1981

Quantified Score

Hi-index 0.02

Abstract

Smalltalk is an object-oriented language designed and implemented by the Learning Research Group of the Xerox Palo Alto Research Center [2, 5, 14]. Some features of this language are: abstract data classes, information inheritance by a superclass-subclass mechanism, message-passing semantics, extremely late binding, no type declarations, and automatic storage management. Experience has shown that large, complex systems can be written in Smalltalk in quite a short period of time; it is also used quite effectively to teach programming to children. Object-oriented languages like Smalltalk have begun to be accepted as friendly languages for novice programmers on personal computers.

However, Smalltalk has some drawbacks, too. Smalltalk programs are inefficient compared with Lisp or Pascal. Late binding is a major reason for this inefficiency; every time a procedure is called, its implementation in the current context has to be found.

Because of late binding, whether a procedure call has an implementation can be determined only at run time. This may be convenient in the early stages of system development; one can run a partially completed system and, on discovering a run-time error caused by an unimplemented procedure, write the procedure body and resume the computation from the point where the error was discovered. However, there is no way to guarantee that there will be no run-time errors. We found many "completed" systems that still had such run-time errors.

Another problem is that it is hard for a novice to read Smalltalk programs written by other people. The absence of type declarations and the lateness of the bindings are major causes of unreadability. All Smalltalk procedures are so-called generic procedures. Each procedure name is associated with several procedure bodies declared in different classes. Depending on the classes of the arguments of a procedure call, different procedure bodies are invoked.
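
The generic-procedure lookup described here can be pictured with a small Python model. This is only an illustrative sketch, not Smalltalk-76's actual implementation: the names SmalltalkClass, Instance, send, and MessageNotUnderstood are hypothetical, and the receiver plays the role of the first argument whose class selects the procedure body.

```python
class MessageNotUnderstood(Exception):
    """Raised when no class in the superclass chain implements a selector."""


class SmalltalkClass:
    """Hypothetical model of a class: a method table plus a superclass link."""

    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = dict(methods or {})

    def lookup(self, selector):
        # Walk the superclass chain at call time; this is the late binding.
        cls = self
        while cls is not None:
            if selector in cls.methods:
                return cls.methods[selector]
            cls = cls.superclass
        return None


class Instance:
    """Hypothetical model of an object: its class plus named fields."""

    def __init__(self, cls, **fields):
        self.cls = cls
        self.fields = fields


def send(receiver, selector, *args):
    """Find the procedure body for `selector` in the receiver's class and run it."""
    body = receiver.cls.lookup(selector)
    if body is None:
        raise MessageNotUnderstood(
            f"{receiver.cls.name} does not understand {selector}"
        )
    return body(receiver, *args)


def describe(self):
    # Which `name` body runs depends on the class of `self` at run time.
    return "a " + send(self, "name")


def square_area(self):
    return self.fields["side"] ** 2


Shape = SmalltalkClass("Shape", methods={"describe": describe})
Circle = SmalltalkClass("Circle", Shape, {"name": lambda self: "circle"})
Square = SmalltalkClass(
    "Square", Shape, {"name": lambda self: "square", "area": square_area}
)

print(send(Instance(Circle), "describe"))       # prints "a circle"
print(send(Instance(Square, side=3), "area"))   # prints 9
```

In this model, sending "area" to a Circle raises MessageNotUnderstood only when the call is executed, which mirrors the point that a missing implementation is discovered at run time.
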
Since the classes of the arguments may differ according to the context, it is impossible to statically predict the behavior of the procedure calls.

We observed that both inefficiency and unreadability are attributable to late binding; however, early binding can be effectively accomplished if we can tell the classes of the procedure arguments at compile time. In the long run, Smalltalk probably needs "type" declarations, likely not the rigid declarations of Pascal but rather hints to compilers and programmers. Even without changing the language, it would be useful to have a tool that supplies "type" declarations to current Smalltalk or to partially specified Smalltalk. This will also lead to efficient compilation.

We thus concluded that we need to introduce "types" to Smalltalk. The introduction of types is more promising in Smalltalk than in the similarly declarationless language Lisp, since Smalltalk has a rich set of user-defined abstract classes. Therefore, the most straightforward way to introduce types is to associate the types of variables with the classes that the variables denote, and the types of procedures with mappings from classes to classes. Since a variable may denote objects of different classes, we define the type of a variable to be the union of the classes that the variable will ever denote.

The aim of this research is not to implement compilers for Smalltalk with type declarations. We intend to design tools that supply type declarations to current Smalltalk programs. Complete type determination is neither possible nor desirable; people do write Smalltalk programs that take advantage of late binding. We are, therefore, interested in finding a relatively efficient method that can find the types of expressions in a large number of cases.

The problem of statically assigning types to type-declarationless programs is called the type-inference problem.
There is a body of work on type inference [3, 4, 7, 9, 11, 15]; these techniques are, however, either too restrictive or too inefficient for our purpose. The only technique that has been implemented, proven to work for non-trivial cases, and used extensively was developed by Milner [7] to determine types for the ML language of LCF. Even though ML is much simpler than Smalltalk, the existence of an efficient, versatile algorithm encouraged us to investigate whether we could extend the method.

The LCF type checker produces a set of equations from procedure declarations and solves them by unification [12] to obtain the types of the procedures; it can run in linear time thanks to a recently invented fast unification algorithm [10]. We extended Milner's method so that we can treat unions of types; in our method, we create a set of equations and inequalities and solve them by unification and a transitive closure algorithm. This technique is general and can be applied to other data-flow problems.

The advantage of Milner's method and our method is that they reduce the problem to a purely mathematical domain, so that we can apply various formula manipulation algorithms without considering the execution order or side effects. Another advantage is that these methods can handle functions with polymorphic types.

In section 2 we review earlier work on type inference. Section 3 briefly introduces the syntax and semantics of Smalltalk. We then introduce "types" into Smalltalk in section 4. Section 5 discusses the first part of our algorithm: how to extend the LCF type-checking algorithm to liberal unions of types. Section 6 presents the whole algorithm. Section 7 is concerned with the implementation and experience.

Smalltalk has four major versions of the language and its implementations. The version we used for our experiments is Smalltalk-76.
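
The idea of solving equations and inequalities over union types can be illustrated with a toy solver in Python. This is a simplified sketch under stated assumptions, not the algorithm of the paper: the function infer_union_types, the UnionFind helper, and the three constraint encodings are all hypothetical. Equations are restricted to "these two variables have the same type" and are merged with a union-find structure, standing in for full unification over structured type terms. Inequalities mean "the type of src is included in the type of dst" and are closed by propagating classes along the inclusion graph until nothing changes, which has the same effect as taking the transitive closure of the inclusions.

```python
from collections import defaultdict


class UnionFind:
    """Merges variables that an equation forces to have the same type."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b


def infer_union_types(equations, inclusions, lower_bounds):
    """Return a dict mapping each variable to a frozenset of class names.

    equations:    pairs (a, b)       meaning type(a) = type(b)
    inclusions:   pairs (src, dst)   meaning type(src) is a subset of type(dst)
    lower_bounds: pairs (cls, var)   meaning class cls belongs to type(var)
    """
    uf = UnionFind()
    variables = set()

    # Step 1: solve the equations by merging variables.
    for a, b in equations:
        variables.update((a, b))
        uf.union(a, b)

    # Step 2: build the inclusion graph over merged representatives.
    edges = defaultdict(set)
    for src, dst in inclusions:
        variables.update((src, dst))
        edges[uf.find(src)].add(uf.find(dst))

    types = defaultdict(set)
    for cls, var in lower_bounds:
        variables.add(var)
        types[uf.find(var)].add(cls)

    # Step 3: propagate classes along inclusions until a fixed point.
    worklist = list(types)
    while worklist:
        node = worklist.pop()
        for succ in edges[node]:
            if not types[node] <= types[succ]:
                types[succ] |= types[node]
                worklist.append(succ)

    return {v: frozenset(types[uf.find(v)]) for v in variables}


# Constraints one might collect from: x := 3.  y := 'abc'.  z := x.  z := y.
# plus an equation saying w and z must have the same type.
result = infer_union_types(
    equations=[("w", "z")],
    inclusions=[("x", "z"), ("y", "z")],
    lower_bounds=[("Integer", "x"), ("String", "y")],
)
print(sorted(result["z"]))  # prints ['Integer', 'String']
```

Here z receives the union of Integer and String because both x and y flow into it, while x and y keep their single classes. The order in which the constraints are listed does not matter, which reflects the point that the constraint view ignores execution order.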