Wednesday, 1 December 2021

Early Machine Learning: "Animal" from David Ahl's BASIC Games


I teach a philosophy course on AI.  One of the things we look at is historical programs from the beginning of AI research. One of these is the "Guess the Animal" game included in David Ahl's classic BASIC Games compendium from 1978.  However, when I went looking for easily accessible and playable versions of the original program, I was surprised to discover how hard they were to find.  As with so many of the programs on the list of Ahl classics, you can easily find copies of the source and scans of the original book.


But finding an easily playable version of the original was tough.  I did find some modern remakes, including one that is trying to compile an ever-increasing list of inputs from people on the Net:

https://www.animalgame.com/play/faq.php

The reason this site is so interesting is that Animal is one of the early, simple, and accessible examples of a program involving "machine learning", which is a big buzzword these days in the field of AI. The intelligence of the program relies on the input of its users and grows as they add to it, slowly training the program and expanding its abilities.  The program tries to guess the animal you are currently thinking of simply by asking questions, and using the answers, supplied by prior users.  It starts with only a limited number of questions; the users must supply all the rest.
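To make that learning step concrete, here is a minimal sketch in Python (not the BASIC from the book). It assumes the usual way of describing Animal as a yes/no decision tree in which a wrong guess is replaced by a new question. The names Node, learn and guess are hypothetical, as are the starting animals FISH and BIRD and the example question about barking.

```python
class Node:
    """Either a question (with yes/no children) or a leaf holding an animal."""

    def __init__(self, text, yes=None, no=None):
        self.text = text
        self.yes = yes
        self.no = no

    def is_leaf(self):
        return self.yes is None and self.no is None


def guess(root, answers):
    """Follow a list of 'Y'/'N' answers down the tree and return the node reached."""
    node = root
    for answer in answers:
        if node.is_leaf():
            break
        node = node.yes if answer == "Y" else node.no
    return node


def learn(leaf, new_animal, question, answer_for_new):
    """Replace a wrongly guessed leaf with a question that separates
    the old animal from the new one supplied by the user."""
    old = Node(leaf.text)
    new = Node(new_animal)
    leaf.text = question
    if answer_for_new == "Y":
        leaf.yes, leaf.no = new, old
    else:
        leaf.yes, leaf.no = old, new


# Assumed starting knowledge: one question and two animals.
root = Node("DOES IT SWIM", Node("FISH"), Node("BIRD"))

# The player is thinking of a dog: it does not swim, so the program guesses BIRD.
wrong = guess(root, ["N"])

# The player then teaches the program a distinguishing question.
learn(wrong, "DOG", "DOES IT BARK", "Y")
```

After this one round the tree holds two questions and three animals, and every further wrong guess adds one more question and one more animal in the same way.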

Obviously, on the early 8-bit home computers this kind of program was limited by a number of factors.  The first was the reliance on slow and awkward forms of long-term storage such as cassette tapes.  The second was the limited RAM available for holding any growing body of input data.  The program was originally developed for networked minicomputer systems, which allowed data to be compiled more easily and more users to contribute:

https://www.animalgame.com/play/misc/source.php

However, the Microsoft BASIC version for Ahl's book dispenses with the "save" and "load" features.  As a result, in its accessible type-in form, it is easy to understand why people might not have been particularly interested in the program, with its data limited to single sessions on an un-networked 8-bit home computer.

Another possible reason for the relative lack of prominence on the Net among retrocomputing enthusiasts is that the Ahl version seems to have some bugs.  In the version I worked from, the question printing routine didn't properly parse the questions, which are saved as text strings with numeric values tacked onto the end of them for directing the Y and N responses. The result is that the computer poses questions with some of this Y/N numeric info tacked on at the end.  It was more an aesthetic nuisance than a fatal error.

The Original Question Print Routine:

400 Q$=A$(K)

410 FOR Z=3 TO LEN(Q$)

415 IF MID$(Q$,Z,1)<>"\" THEN PRINT MID$(Q$,Z,1);

417 NEXT Z

420 INPUT C$

My Question Print Routine:

400 Q$=A$(K)

410 FOR Z=3 TO LEN(Q$)

415 IF MID$(Q$,Z,1)<>"^" THEN 417

416 M$=MID$(Q$,3,Z-3)+"?":GOSUB1:Z=255:REM JUMP TO WORD WRAP PRINT ROUTINE

417 NEXT Z
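For readers who don't speak BASIC, the idea behind the fix can be sketched in Python: print only the text up to the second delimiter, and treat what follows as branch numbers. This is an illustration under my own assumptions, not a translation of either listing. The function name parse_question is hypothetical, and the separator is a parameter because the original uses "\" while the revised routine uses "^".

```python
def parse_question(record, sep="\\"):
    """Split a question record into its text and its Y/N branch numbers.

    A record looks like: sep + "Q" + text + sep + "Y2" + sep + "N3" + sep
    """
    fields = [field for field in record.split(sep) if field]
    if not fields or not fields[0].startswith("Q"):
        raise ValueError("not a question record")
    text = fields[0][1:]  # drop the leading "Q" marker
    # int() tolerates a leading space, much as VAL does in BASIC
    branches = {field[0]: int(field[1:]) for field in fields[1:]}
    return text, branches


# The doubled backslashes are only Python string escaping;
# the record itself is \QDOES IT SWIM\Y2\N3\
text, branches = parse_question("\\QDOES IT SWIM\\Y2\\N3\\")
print(text + "?")
```

Only the question text reaches the screen; the branch numbers stay in the dictionary for deciding where to go next.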

However, there may have been other, more serious problems in the creation of new data items, caused by improper conversion of the numeric values into strings for concatenation onto new question strings.

370 A$(K)="\Q"+X$+"\"+A$+STR$(Z1+1)+"\"+B$+STR$(Z1)+"\"

In Microsoft BASIC, the lingua franca of Ahl's publications, the leading space that precedes unsigned positive numbers is preserved when the STR$ function is used to convert those values to a string.  I think the original program, based on some version of minicomputer BASIC, probably stripped all spaces from numbers automatically and just returned the number alone. But the program was re-written for Microsoft BASIC for Ahl's publication, and the conversion seems to have been a little rushed. I decided to add some MID$ treatments to the STR$ function used, in order to make sure the spaces were stripped.

370 A$(K)="^Q"+X$+"^"+A$+MID$(STR$(Z1+1),2)+"^"+B$+MID$(STR$(Z1),2)+"^"
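The effect of the MID$ wrapper is easy to see in a small Python emulation. This is a sketch under assumptions: basic_str only mimics the leading-space behaviour of STR$ for integers, make_record is a hypothetical helper, and I am assuming that B$ holds the opposite answer to A$.

```python
def basic_str(n):
    """Mimic Microsoft BASIC STR$ for integers: non-negative values
    get a leading space where a minus sign would otherwise go."""
    return (" " if n >= 0 else "") + str(n)


def make_record(question, answer, z1, strip=True, sep="^"):
    """Build a question record the way line 370 does.

    With strip=True the first character of each STR$ result is dropped,
    which is what MID$(STR$(n),2) does.
    """
    other = "N" if answer == "Y" else "Y"  # assumed meaning of B$

    def num(n):
        s = basic_str(n)
        return s[1:] if strip else s

    return (sep + "Q" + question + sep + answer + num(z1 + 1)
            + sep + other + num(z1) + sep)


unstripped = make_record("DOES IT BARK", "Y", 4, strip=False)
stripped = make_record("DOES IT BARK", "Y", 4, strip=True)
print(repr(unstripped))
print(repr(stripped))
```

The unstripped form carries a space inside each branch field, while the stripped form matches the layout of the built-in DATA records.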

I think the program might otherwise have misinterpreted the numeric data in those generated strings, although perhaps not, since the final result is processed by the VAL function, which simply ignores leading spaces.  But I wanted to make sure that the data was in the same non-spaced format as the original DATA statements for questions: DATA "\QDOES IT SWIM\Y2\N3\"

I also added word wrap so that the output of the program, regardless of the length of questions and animal names, would be nicely formatted for a 32 character screen.  I suspect that some questions could easily go over the 32 character limit, or even the more standard 40 character limit, of most home computers of the day.
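The word wrap routine itself isn't listed here, so the following is only a sketch of the general greedy technique in Python, not the BASIC subroutine at line 1. The function name wrap and the sample question are made up for illustration.

```python
def wrap(text, width=32):
    """Greedy word wrap: keep adding words to the current line until the
    next word would push it past `width`, then start a new line.
    A single word longer than `width` is left on a line of its own."""
    lines = []
    current = ""
    for word in text.split():
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width:
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


question = "DOES IT HAVE A LONG NECK AND EAT LEAVES FROM TALL TREES?"
for line in wrap(question, 32):
    print(line)
```

Breaking only at spaces means no word is split across the right edge of the screen, which is the whole point on a 32 column display.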

So maybe, because of its obvious limitations and bugs, relatively few people using home computers ever actually typed in this program and played around with it.  Hence, it is not widely recalled or discussed by people on the Net today.  That is sad, because it is an important historical forerunner of modern AI developments, and particularly of machine learning techniques.

