[Ohrrpgce] Gradual (optional static) typing for HamsterSpeak

Tue Aug 23 10:45:57 PDT 2016

I'm finally resuming work on designing and implementing the major new
version of HamsterSpeak (which I call HS4) which adds types and object
orientation, while being backwards-compatible.  I'm looking for
feedback on the design of the language. This email discusses the type
system and specifically how type annotations might work.

May as well state that the main goals of HS4 are:
- Easy to learn. Should be:
-- as simple as possible
-- similar to popular languages
-- consistent, including with existing HS idioms
- Easy to use:
-- features you expect in modern languages, like arrays, dicts and
   strings with various handy methods
-- easy to avoid bugs and to debug
-- features useful for scripting games (hooks and extensible builtin
   objects, script fibres, signalable and pauseable scripts (e.g. tied
   to an npc or map), and live-reloading of scripts)
- Backwards compatibility
-- it's OK if a couple features are toggled off with backcompat
   bitsets or other means, most notably necessary for script
   'multitasking', but existing scripts should not need modifying
   except in exceptional cases (e.g. new keywords)

For years, the plan has been for HS4 to be a dynamically typed
language, but in the meantime I've been using Python and
other dynamic languages heavily, and as a result now really appreciate
what a great thing static typing is, for catching bugs and improving
code readability.

If we add OO methods as alternatives to existing functions then
 set slice velocity(sl, 10, 10, 5)
could be written
 sl.set velocity(10, 10, 5)
Then what happens if you typo the function/method name or pass invalid
arguments?  Currently HSpeak can catch typo'd names and at least check
the number of arguments.  But if everything is dynamic, the typo
wouldn't be detected until you run the script. This would be a huge
step backwards. The goal is to help users eliminate bugs (people
frequently pass invalid arguments to commands, like mixing up hero IDs
and party slots), but fully dynamic typing would encourage bugs instead.
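
As a rough sketch of the kind of compile-time check meant here (written
in Python purely for illustration; the method table and function names
are made up and are not HSpeak's actual definitions or code):

```python
# Hypothetical static table: type name -> method name -> number of arguments.
# In the proposal this information would come from an hss/hsd file.
TYPE_METHODS = {
    "slice": {"setvelocity": 3, "setx": 1},
}


def normalise(name):
    """HamsterSpeak names ignore spaces and letter case."""
    return name.replace(" ", "").lower()


def check_method_call(type_name, method, num_args):
    """Return an error message, or None if the call looks valid."""
    methods = TYPE_METHODS.get(normalise(type_name))
    if methods is None:
        return f"unknown type '{type_name}'"
    expected = methods.get(normalise(method))
    if expected is None:
        return f"type '{type_name}' has no method '{method}'"
    if num_args != expected:
        return f"'{method}' expects {expected} arguments, got {num_args}"
    return None


# sl.set velocity(10, 10, 5) passes; a typo'd name is caught before running.
print(check_method_call("slice", "set velocity", 3))
print(check_method_call("slice", "set velocty", 3))
```

With an 'any' receiver none of this is possible, because the compiler
does not know which table to look in.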

Therefore I propose adding syntax to declare the types of variables
(including locals, globals, and builtin command and script arguments
and return values), and that builtin types/classes and user defined
types (UDTs) be statically defined in a hss/hsd file.  Then HSpeak
will be able to verify that accessed methods and members actually
exist, and would also be able to check that sensible arguments are
passed to scripts, which will catch a large class of bugs.  There will
also be an 'any' type, which prevents type checking but lets you do
whatever you want at runtime.  I recommend that variables that aren't
annotated with a type default to 'number' instead of 'any', so that
you have to explicitly ask for dynamic typing. Otherwise people will
default to not declaring types for anything, and thus would be
susceptible to bugs.
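
To illustrate why the default matters, here is a minimal Python model
(the member table and helper names are assumptions made up for this
example): an unannotated variable becomes a 'number', so a bogus member
access on it is flagged, while an 'any' variable is left for run-time.

```python
# Hypothetical member table for two types; 'number' has no members.
MEMBERS = {"attack": {"name", "damage"}, "number": set()}


def declare(variables, name, annotation=None):
    """Record a variable's type; unannotated variables default to 'number'."""
    variables[name] = (annotation or "number").lower()


def check_member(variables, var, member):
    """Statically check var.member, unless var was declared 'any'."""
    type_name = variables[var]
    if type_name == "any":
        return "deferred to runtime"
    if member in MEMBERS.get(type_name, set()):
        return "ok"
    return f"error: type '{type_name}' has no member '{member}'"


variables = {}
declare(variables, "atk", "attack")
declare(variables, "idx")            # no annotation, so a number
declare(variables, "tmp", "any")     # explicitly dynamic
print(check_member(variables, "atk", "name"))
print(check_member(variables, "idx", "name"))
print(check_member(variables, "tmp", "name"))
```

If the default were 'any' instead, the second check would silently be
deferred as well.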

I'm not proposing a rich, strict type system that lets you be type-safe
all the time.  That means no "algebraic" nested type expressions such as
templates/generics (parametrised types) or C's function pointer types
which specify argument types.  Well, as an exception, it would be nice
to also support a more refined 'array(attack)' type, although the
'array' type would simply default to an array of 'any'.  Calling
callbacks (function types) is uncommon, so typing them seems unnecessary.

I also propose being lenient with type conversions so that there's no
need to use explicit casts (or to even add them to the language):
 - Clearly all types should automatically convert to an 'any' variable
 - An 'any' value could automatically convert to anything else
 - All existing script commands that expect ID numbers or handles work
   with objects too, and vice-versa.
Maybe we could even also implicitly convert integers into types for
builtin objects with ID numbers, e.g.:
  variable([attack] atk)   # See below about declaration syntax
  atk := 3
  show string(atk.name)
This looks a bit evil, but a more realistic example why you might want
such a thing is:
  script, npc dances, [npc] who, ( ... )
  ...
  npc dances(3)  # First NPC with ID 3
Or if not, just require an explicit conversion like "atk := get attack(3)".
I'm leaning towards that, but also want the autoconversion in the
second example, and don't know how to reconcile them.
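
The conversion rules above could be modelled roughly like this (a
Python sketch, not a design decision: the set of ID-bearing types and
the 'int_to_object' switch are assumptions, and the switch stands in
for the still-undecided integer-to-object conversion):

```python
# Hypothetical set of builtin types whose objects have ID numbers.
ID_TYPES = {"attack", "npc", "item", "hero"}


def can_convert(src, dst, int_to_object=False):
    """Would a value of type src be accepted where dst is expected?"""
    if src == dst:
        return True
    if dst == "any" or src == "any":
        return True                      # 'any' converts both ways
    if src in ID_TYPES and dst == "number":
        return True                      # object passed where an ID is expected
    if src == "number" and dst in ID_TYPES:
        return int_to_object             # the open question: atk := 3
    return False


print(can_convert("attack", "any"))
print(can_convert("npc", "number"))
print(can_convert("number", "npc"))
print(can_convert("number", "npc", int_to_object=True))
```

Reconciling the two examples would amount to deciding when that switch
is on, e.g. for script arguments but not for assignments.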

If type definitions are static then polymorphism requires inheritance
(probably explicitly declared ("nominal" rather than "structural")).
Unless you just use 'any', of course. It could also be possible to
allow adding new members and methods to objects via an 'any'
variable. Of course, then a typo'd member name would just create a new
member, so you have no error checking at run-time either, and something
more explicit than "x.y := z" could maybe be required.
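
For reference, "nominal" here means a type is only a subtype of the
parents it explicitly declares, so the check is a walk up the declared
chain. A minimal Python sketch, with a made-up hierarchy:

```python
# Hypothetical declared hierarchy: type -> explicitly declared parent.
PARENTS = {"textslice": "slice", "slice": None, "npc": None}


def is_subtype(type_name, ancestor):
    """Nominal check: follow declared parents until we hit ancestor."""
    current = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = PARENTS.get(current)
    return False


print(is_subtype("textslice", "slice"))   # declared child of slice
print(is_subtype("npc", "slice"))         # unrelated, whatever its members
```

A structural check would instead compare the members of the two types
and ignore the declarations.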

==Syntax==

I can't actually decide on the syntax for declaring types, which is
hard to choose because of white-space insensitivity.  I also want
to avoid any use of commas or parentheses, which would be confusing
and likely ambiguous in script argument lists.  Also, note that we
will want builtin types like 'npc', 'item', 'hero', etc.  Since these
are all very common variable names, type names would have to live in a
separate namespace (like in FreeBasic) and the type ought to stand out
when written. Alternatively (although not entirely different) type
names could start/end with a special character, such as "!Hero".
Capitalising type names by convention would also help.

Broadly, some different options are:
(recall I suggested 'Number' be the default, hence optional)
- variablename separator typename
   script, dup item, item !Item, times !Number = 1, begin
   # Julia uses the following
   script, dup item, item ::Item, times ::Number = 1, begin
   script, dup item, item *Item, times *Number = 1, begin
   script, dup item, item -> Item, times -> Number = 1, begin
- typename separator variablename
-- e.g.
   script, dup item, Item! item, Number! times = 1, begin
   script, dup item, Item:: item, Number:: times = 1, begin
   script, dup item, Item/ item, Number/ times = 1, begin
   script, dup item, Item.item, Number.times = 1, begin
- bracketed-typename variablename  (or the reverse)
-- e.g.
   script, dup item, [item] item, [number] times = 1, begin
   script, dup item, /item/ item, /number/ times = 1, begin
   script, dup item, <item> item, <number> times = 1, begin
   script, dup item, "item" item, "number" times = 1, begin
   Note that in this case we could allow putting the type before or
   after as you wish (then you don't need to remember where it goes)!
- something that looks like a keyword
-- e.g.
   script, dup item, item .is Item, times .is Number = 1, begin
     As a bonus, "is XYZ" could be actual methods/functions for checking type:
     if (x.is number) then (...)  # Seems to imply everything-is-an-object
                                  # even if that's not the case.
     if (is number(x)) then (...)
   script, dup item, item 'as item, times 'as Number = 1, begin
   script, dup item, item "is item", times "is number" = 1, begin
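
The bracketed option, where the type may go before or after the name,
is simple to parse. A Python sketch (the function name is made up, and
default values like "= 1" are deliberately not handled):

```python
import re

BRACKETED = re.compile(r"\[([^\]]*)\]")


def parse_param(text):
    """Parse '[type] name', 'name [type]' or plain 'name' into (name, type)."""
    types = BRACKETED.findall(text)
    if len(types) > 1:
        raise ValueError(f"more than one type given in '{text}'")
    # Whatever is left after removing the bracketed part is the name.
    # Spaces and case are insignificant in names, so normalise both.
    name = BRACKETED.sub("", text).replace(" ", "").lower()
    type_name = types[0].replace(" ", "").lower() if types else "number"
    return name, type_name


print(parse_param("[item] item"))
print(parse_param("times [number]"))
print(parse_param("from what"))       # unannotated, so a number
```

Because the brackets delimit the type on both sides, spaces in the
variable name cause no ambiguity in either order.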

':' is already heavily used as part of identifiers so unfortunately
couldn't be used as a separator, unless it was something like "::".
It might look like "item:Hammer" includes the type, but the "item:" part
is part of the name so an actual typed declaration would look something like
  variable(item::item:Hammer)  # yuck
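
Ugly as it is, "::" would at least be unambiguous to split on, since a
single ':' stays part of the name. A Python sketch, assuming the
typename-first order used in the example above (the function name is
made up):

```python
def split_typed_name(token):
    """Split 'type::name' at the first '::'; a lone ':' stays in the name."""
    type_part, separator, name_part = token.partition("::")
    if not separator:
        return token.strip(), "number"   # no annotation, so a number
    return name_part.strip(), type_part.strip().lower()


print(split_typed_name("item::item:Hammer"))
print(split_typed_name("item:Hammer"))
```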

==Syntax for script return types==

The syntax for declaring locals and globals could be the same, e.g.
  variable([any] tmp, [Slice] sl1, idx)  # idx is a Number, not a Slice
but declaring the return value of a script is a different matter. E.g.
  # Don't like this
  script, [Slice] create shadow, [Slice] from what, length, begin
  # Putting it afterwards demands consistency
  script, create shadow [Slice], from what [Slice], length, begin
  script, create shadow -> Slice, from what -> Slice, length, begin
  # Unless we allow any order
  script, create shadow "Slice", "Slice" from what, length, begin
The return type could be put somewhere else, e.g.
  script -> Slice, create shadow, from what -> Slice, length, begin
Or there could be a pseudo-argument named "return" that goes in the
argument list
  script, create shadow, return "Slice", from what "Slice", length, begin
  script, create shadow, [Slice] from what, length, [Slice] return, begin
Or we could change the syntax for script definitions:
  script, create shadow ([Slice] from what, length) -> Slice, begin

-> X is especially common syntax for the return type of a
function. (But note that in type theory and functional languages "X -> Y"
is notation for the type of a function taking an X and returning a Y.)