This document describes the COMET/CASL implementation on the Casio OM-54A card,
which may differ from the original specification. It is based on the Japanese
Wikipedia article <http://ja.wikipedia.org/wiki/CASL> and on an analysis of
the ROM disassembly.


Overview
--------
COMET is a virtual computer specially designed for educational purposes and
programming ability testing in the Japanese Information Technology Standards
Examination (JITSE). CASL is an assembly language for this computer.

The revised versions of COMET and CASL, called COMET II and CASL II, are not
supported by the PB-1000C and are therefore outside the scope of this document.


COMET specification
-------------------
COMET is a virtual machine with a von Neumann architecture. It operates on
words of a fixed length of 16 bits. The processing is sequential. Negative
numbers are represented in two's complement format.

The following data types are supported:

1. arithmetic, refers to signed integers in range -32768 to 32767
2. logical, refers to unsigned integers in range 0 to 65535
3. character, using the 8-bit Japanese standard JIS X 0201, which defines an
   encoding for Latin and Katakana characters; one character is stored per
   word in the lower 8 bits, and the upper 8 bits are filled with zeros
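
The three data types are different readings of the same 16-bit word. The
following Python sketch illustrates this; the function names are hypothetical,
and only the 7-bit half of JIS X 0201 is handled to keep the example short.

```python
MASK = 0xFFFF  # COMET words are 16 bits wide


def as_logical(word):
    """Read a word as an unsigned (logical) value in the range 0..65535."""
    return word & MASK


def as_arithmetic(word):
    """Read a word as a signed two's complement value, -32768..32767."""
    word &= MASK
    return word - 0x10000 if word & 0x8000 else word


def char_word(ch):
    """Store one character per word: code in the lower 8 bits, upper 8 zero.

    Assumption: only 7-bit character codes are accepted in this sketch.
    """
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("only 7-bit characters are handled in this sketch")
    return code
```

For instance, the word #FFFF reads as 65535 when treated as logical data and
as -1 when treated as arithmetic data.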

The registers are as follows:

1. General purpose 16-bit registers GR0, GR1, GR2, GR3, GR4
These registers contain one of the operands and store results of the
arithmetic, logical and shift operations. The other operand is a memory
location referenced by the effective address, specified either directly by
an absolute address, or by a sum of an absolute address and the contents of
an index register (XR). GR1 to GR4 can be used as index registers. 

GR4 is used as a stack pointer. It holds the address of the top of the stack.
When a value is pushed onto the stack, GR4 is decremented by one, then the
value is placed at the memory location pointed to by it.
When a value is popped off the stack, the contents of the memory location
pointed to by GR4 is transferred, then GR4 is incremented by one.

An address range from #FF80 to #FFFF is allocated for the stack, but actually
the stack and the object code occupy different address spaces. Therefore it is
not possible to access the object code memory with the commands PUSH or POP,
nor the stack area through an effective address.
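
The push and pop rules above can be sketched in Python as follows. The class
name, the initial value of GR4 and the dictionary used as stack memory are
assumptions made for illustration; only the decrement-then-store and
load-then-increment order and the #FF80 to #FFFF range come from the
description above.

```python
class Stack:
    """Hypothetical model of the COMET stack as a separate address space."""

    LOW, HIGH = 0xFF80, 0xFFFF  # address range allocated for the stack

    def __init__(self, gr4=0x0000):
        # Assumption: GR4 starts at #0000 so that the first push lands at #FFFF.
        self.gr4 = gr4
        self.cells = {}

    def _check(self):
        if not (self.LOW <= self.gr4 <= self.HIGH):
            raise RuntimeError("ST error: stack overflow/underflow")

    def push(self, value):
        # GR4 is decremented first, then the value is stored where it points.
        self.gr4 = (self.gr4 - 1) & 0xFFFF
        self._check()
        self.cells[self.gr4] = value & 0xFFFF

    def pop(self):
        # The value is read first, then GR4 is incremented.
        self._check()
        value = self.cells.get(self.gr4, 0)
        self.gr4 = (self.gr4 + 1) & 0xFFFF
        return value
```

Keeping the stack cells in their own container mirrors the point that the
stack and the object code occupy different address spaces.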

2. Program counter PC
This register holds the memory address of the instruction currently being
executed. After completing the instruction it is incremented so as to point
to the next one, except on branches, subroutine calls and subroutine returns
which load it with a new value.

3. Flag register FR
After an arithmetic, logical or shift operation, the FR is set to 10 (binary)
if the result is negative, 00 if it is positive, and 01 if it is zero.
Comparison instructions set it in the same way, according to the comparison
result.
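
As a minimal sketch, the flag encoding can be written as a Python function of
the 16-bit result. The function name is hypothetical.

```python
def flags_for(result):
    """Return the 2-bit FR value for a 16-bit result."""
    result &= 0xFFFF
    if result == 0:
        return 0b01  # zero
    if result & 0x8000:
        return 0b10  # negative: the sign bit is set
    return 0b00      # positive
```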

Instruction format:

All instructions have a fixed length of two 16-bit words. These 32 bits are
divided into the following fields:
1. The OP field (8 bits) is the instruction opcode that specifies the operation
   to be performed.
2. The GR field (4 bits) specifies the number of the register to be used in
   the operation. It is ignored for the branch and PUSH instructions.
3. The XR field (4 bits) specifies the number of the register whose contents
   are added to the adr field to form an effective address. A value of 0 does
   not mean GR0, but that no address modification is performed.
4. The adr field (16 bits) specifies the memory address, optionally modified
   by the XR. Both the adr and XR fields are ignored for the POP and RET
   instructions.

bit #     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
        -------------------------------------------------
word 1  |       OP field        | GR field  | XR field  |
        -------------------------------------------------
word 2  |                   adr field                   |
        -------------------------------------------------
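
The field layout can be illustrated with a short Python sketch that packs and
unpacks an instruction and computes the effective address. It assumes that
bit 0 in the diagram is the most significant bit, so that the OP field
occupies the upper byte of word 1, and that the address sum wraps around
modulo 65536. The function names are hypothetical, and the opcode value in
the usage example below is a placeholder, not an actual COMET opcode.

```python
def encode(op, gr, xr, adr):
    """Pack the four fields into two 16-bit words."""
    word1 = ((op & 0xFF) << 8) | ((gr & 0xF) << 4) | (xr & 0xF)
    word2 = adr & 0xFFFF
    return word1, word2


def decode(word1, word2):
    """Split two 16-bit words back into (op, gr, xr, adr)."""
    return (word1 >> 8) & 0xFF, (word1 >> 4) & 0xF, word1 & 0xF, word2 & 0xFFFF


def effective_address(xr, adr, regs):
    """XR = 0 means no modification; otherwise add the contents of GR<xr>."""
    if xr == 0:
        return adr & 0xFFFF
    return (adr + regs[xr]) & 0xFFFF  # wrap-around is an assumption
```

With the placeholder opcode #12, encode(0x12, 3, 1, 0x0100) gives the words
#1231 and #0100. If GR1 holds 5, the effective address is #0105.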

Instruction set summary:

The items within brackets [] are optional and can be omitted.

LD GR, adr [, XR] - LoaD
  Load the contents of the effective address to the specified GR register.

ST GR, adr [, XR] - STore
  Store the contents of the GR register at the effective address.

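LEA GR, adr [, XR] - Load Effective Address
  Calculate the effective address and store it in the GR register. The
  example programs in this document branch on the FR immediately after LEA,
  so the FR is evidently set according to the stored value as well.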

ADD GR, adr [, XR] - ADD arithmetic
  Adds the contents of the effective address to the contents of the GR and
  stores the result in the GR. The FR is set according to the result of the
  operation.

SUB GR, adr [, XR] - SUBtract arithmetic
  Subtracts the contents of the effective address from the contents of the GR
  and stores the result in the GR. The FR is set according to the result of
  the operation.

AND GR, adr [, XR]
  Performs a bitwise AND operation between the contents of the GR and the
  contents of the effective address. The result is stored in the GR. In
  other words, the operation clears those bits of the GR whose corresponding
  bits in the contents of the effective address are cleared.
  The FR is set according to the result of the operation.

OR GR, adr [, XR]
  Performs a bitwise inclusive OR operation between the contents of the GR
  and the contents of the effective address. The result is stored in the GR.
  In other words, the operation sets those bits of the GR whose corresponding
  bits in the contents of the effective address are set. The FR is set
  according to the result of the operation.

EOR GR, adr [, XR] - Exclusive OR
  Performs a bitwise exclusive OR operation between the contents of the GR
  and the contents of the effective address. The result is stored in the GR.
  In other words, the operation toggles those bits of the GR whose
  corresponding bits in the contents of the effective address are set.
  The FR is set according to the result of the operation.

CPA GR, adr [, XR] - ComPare Arithmetic
  Compare the contents of the GR with the contents of the effective address.
  The FR is set to 00 if the contents of GR is larger, 01 if equal, and 10 if
  smaller. The operands are treated as signed values.

CPL GR, adr [, XR] - ComPare Logical
  Similar to the CPA except that the operands are treated as unsigned values.
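
The difference between CPA and CPL only shows up when the sign bit of an
operand is set. A minimal Python sketch, with a hypothetical function name:

```python
def compare(gr, operand, signed):
    """Return FR for CPA (signed=True) or CPL (signed=False)."""
    a, b = gr & 0xFFFF, operand & 0xFFFF
    if signed:
        # Reinterpret both words as two's complement values.
        a = a - 0x10000 if a & 0x8000 else a
        b = b - 0x10000 if b & 0x8000 else b
    if a > b:
        return 0b00  # GR is larger
    if a == b:
        return 0b01  # equal
    return 0b10      # GR is smaller
```

Comparing #FFFF with 1 yields 10 under CPA (-1 is smaller than 1) but 00
under CPL (65535 is larger than 1).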

SLL GR, adr [, XR] - Shift Left Logical
  The contents of the GR are shifted to the left by the number of bit
  positions given by the effective address. The shifted-out bits are
  discarded and the vacated bits are filled with zeros. The FR is set
  according to the result of the operation.

SLA GR, adr [, XR] - Shift Left Arithmetic
  The contents of the GR, except for the sign bit, are shifted to the left by
  the number of bit positions given by the effective address. The shifted-out
  bits are discarded and the vacated bits are filled with zeros. The FR is
  set according to the result of the operation.

SRL GR, adr [, XR] - Shift Right Logical
  Right shift version of SLL.

SRA GR, adr [, XR] - Shift Right Arithmetic
  Right shift version of SLA. The vacated bits are filled with the sign bit
  instead of zeros.
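
The four shifts can be modeled in Python as shown below. This is a sketch of
the behaviour described above, not of the ROM code: the function names are
hypothetical, the shift count n is assumed to be a small non-negative
integer, and how the actual implementation treats very large counts is not
covered.

```python
def sll(value, n):
    """Logical left shift: all 16 bits move, zeros enter on the right."""
    return ((value & 0xFFFF) << n) & 0xFFFF


def srl(value, n):
    """Logical right shift: all 16 bits move, zeros enter on the left."""
    return (value & 0xFFFF) >> n


def sla(value, n):
    """Arithmetic left shift: the sign bit stays, the lower 15 bits move."""
    value &= 0xFFFF
    return (value & 0x8000) | ((value << n) & 0x7FFF)


def sra(value, n):
    """Arithmetic right shift: vacated bits are copies of the sign bit."""
    value &= 0xFFFF
    signed = value - 0x10000 if value & 0x8000 else value
    return (signed >> n) & 0xFFFF
```

For example, shifting #8001 left by one bit gives #0002 with SLL but #8002
with SLA, because SLA preserves the sign bit.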

JPZ adr [, XR] - Jump on Plus or Zero
  Branch to the effective address (i.e. load the PC with the effective
  address itself) when the value of FR is 00 or 01.

JMI adr [, XR] - Jump on MInus
  Branch to effective address when the value of FR is 10.

JNZ adr [, XR] - Jump on Non Zero
  Branch to effective address when the value of FR is 10 or 00.

JZE adr [, XR] - Jump on ZEro
  Branch to effective address when the value of FR is 01.

JMP adr [, XR] - unconditional JuMP
  Branch to effective address unconditionally.
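
Using the FR encoding given for the flag register (10 negative, 00 positive,
01 zero), the branch conditions can be summarized in a small Python sketch.
The function name is hypothetical.

```python
def should_branch(mnemonic, fr):
    """Decide whether a jump instruction is taken for a given FR value."""
    negative = fr == 0b10
    zero = fr == 0b01
    conditions = {
        "JPZ": not negative,  # plus or zero: FR is 00 or 01
        "JMI": negative,      # minus: FR is 10
        "JNZ": not zero,      # non-zero: FR is 10 or 00
        "JZE": zero,          # zero: FR is 01
        "JMP": True,          # unconditional
    }
    return conditions[mnemonic]
```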

PUSH adr [, XR] - PUSH effective address
  Calculate the effective address and push it onto the stack.

POP GR - POP a value
  Pop the value at the top of the stack into the GR.

CALL adr [, XR] - CALL subroutine
  Push the address of the subsequent instruction (=PC+2) onto the stack, then
  transfer control to the specified effective address.

RET - RETurn from subroutine
  Branch to the address popped from the stack.
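
The interplay of CALL and RET can be sketched in Python as follows. For
brevity the stack is modeled as a plain list rather than through GR4, and the
state dictionary and function names are hypothetical.

```python
def call(state, target):
    """CALL: push the address of the next instruction, then jump."""
    # Every instruction is two words long, so the return address is PC + 2.
    state["stack"].append((state["pc"] + 2) & 0xFFFF)
    state["pc"] = target & 0xFFFF


def ret(state):
    """RET: pop the return address into the PC."""
    state["pc"] = state["stack"].pop()
```

A CALL executed at address #0010 therefore leaves #0012 on the stack, and the
matching RET resumes execution there.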


CASL specification
------------------
A CASL program consists of a sequence of statements. Each statement is written
on a single line and consists of up to four fields:
[label] [instruction] [operands] [;comment]

A label is an identifier that is assigned the address of the first word of the
instruction. Labels are limited to 6 characters.
A label must start at the first column and begin with an upper case letter,
followed by upper case letters or digits.
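
The label rules above translate directly into a regular expression. The
Python sketch below checks only the rules stated here; whether the assembler
additionally rejects reserved names is not covered, and the function name is
hypothetical.

```python
import re

# One upper case letter followed by up to five upper case letters or digits.
LABEL_RE = re.compile(r"[A-Z][A-Z0-9]{0,5}")


def is_valid_label(name):
    """Return True if name satisfies the CASL label rules described above."""
    return LABEL_RE.fullmatch(name) is not None
```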

An address in an instruction operand may be specified by a decimal number or
by a label.

General purpose registers may be specified using a shorthand notation. The GR
part may be omitted, so for example 0 is equivalent to GR0.

CASL supports the following pseudo instructions:

label START [optional entry point]
  This instruction begins a program block. The preceding label is mandatory
  and specifies the name of the block. The block name is assigned the address
  of the entry point, which is given by a label defined within the block. If
  the entry point is omitted, the block name is assigned the address of the
  beginning of the block.
  A CASL program can consist of multiple blocks. The block names are global,
  while the labels defined in a block are local to this block.

END
  Marks the end of a program block.

DC ... - Define Constant
  Allocates a word (or words) of memory with initialized values. The operand
  may be a numeric constant or a string of characters.
  Numeric operands may be specified in decimal or hexadecimal notation,
  or by a label. Decimal constants may be signed or unsigned. Hexadecimal
  constants are unsigned only and preceded with a # character. The value is
  truncated to 16 bits and stored in a single word of the object program.
  String operands must be surrounded by apostrophes.
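
As an illustration, a single DC operand could be converted to object words
along the following lines. This Python sketch makes several assumptions: the
function name and the labels dictionary are hypothetical, string characters
are limited to 8-bit codes, and apostrophes inside a string are not handled
because the escaping rule is not described here.

```python
def dc_words(operand, labels=None):
    """Convert one DC operand into a list of 16-bit words."""
    labels = labels or {}
    if len(operand) >= 2 and operand.startswith("'") and operand.endswith("'"):
        # String: one character per word, code in the lower 8 bits.
        return [ord(ch) & 0xFF for ch in operand[1:-1]]
    if operand.startswith("#"):
        # Hexadecimal constant, unsigned, truncated to 16 bits.
        return [int(operand[1:], 16) & 0xFFFF]
    if operand in labels:
        # Label: the word holds the address assigned to the label.
        return [labels[operand] & 0xFFFF]
    # Decimal constant, signed or unsigned, truncated to 16 bits.
    return [int(operand, 10) & 0xFFFF]
```

For example, the operands -1 and #FFFF both produce the word #FFFF, and the
string 'Hi' occupies two words, #0048 and #0069.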

DS n - Define Storage
  Allocates the required number of words without initialization. The operand
  is a decimal number.

EXIT
  Terminates the program execution.

Also, CASL includes macro instructions for input and output:

IN input buffer, input length
  When this instruction is encountered during program execution, the program
  halts and waits for the user to enter a string of characters. When the user
  presses the EXE key, the string is stored in the input buffer, its length
  is stored in the word at the input length label, and program execution
  continues.
  Both IN operands are specified by label names. The size of the input buffer
  must be at least 80 words.

OUT output buffer, output length
  The contents of the output buffer are displayed as characters. The word at
  the output length label holds the number of characters to display.
  After displaying the string, the program execution pauses until any key is
  pressed.
  Both OUT operands are specified by label names.


Error messages
--------------
Errors detected during assembly (CASL):
  OM    out of memory
  LA    label undefined or multiply defined
  OC    operation error
  OR    operand error
  SO    block definition error, for example missing START or END
Run-time errors (COMET):
  ST    stack overflow/underflow
  CD    illegal opcode
  AD    illegal address


CASL menu
---------
[asmbl ]
  Assemble the selected sequential file.

[source]
  View and edit the sequential file with an empty name. If no such file
  exists, it is created.

[edit  ]
  View and edit the selected sequential file.

[LST SW]
  Select whether to output the assembly listing to a printer.

key EXE
  Assemble the selected sequential file, then execute the resulting object
  code from the beginning (i.e. at the entry point of the first block)
  without asking the user any questions.


COMET menu
----------
[go    ]
  Run the object code at the specified address.

[dump  ]
  Invokes the following submenu:

  [object]
    Display the memory contents starting from the specified address. The
    screen can be scrolled with the up/down arrow keys. The value in the top
    row can be modified by pressing the left or right arrow key.

  [regist]
    Display and edit the contents of the registers.

  [bpoint]
    Specify a breakpoint address. The breakpoint can be cleared by typing an
    address outside the allowed range, for example -1.

  key EXE
    Invokes the same function as the menu entry [object], but sets the starting
    address to #0000 without asking the user.

[PRT SW]
  Select the output device - screen or printer.

[TR SW ]
  Select the trace mode allowing single-stepping through the code.

key EXE
  Run the object code from the beginning.


Example programs
----------------
; Traditional first "Hello, World" example, displays a message on the screen
MAIN  START
      OUT   MSG,LEN
      EXIT
MSG   DC    'Hello, World!'
LEN   DC    13
      END

-----------------------------------------------------------------------------

; Program to solve the Tower of Hanoi puzzle using recursive calls,
; taken from the Japanese Wikipedia
; http://ja.wikipedia.org/wiki/CASL
MAIN  START
      LD    GR0,N
      LD    GR1,A
      LD    GR2,B
      LD    GR3,C
      CALL  HANOI    ;hanoi(3,A,B,C)
      EXIT

; hanoi(N,X,Y,Z)
HANOI CPA   GR0,ONE  ;if N==1 then
      JZE   DISP     ;move it, return
      SUB   GR0,ONE  ;N-1
      PUSH  0,GR2    ;swap GR2 GR3
      LEA   GR2,0,GR3
      POP   GR3
      CALL  HANOI    ;hanoi(N-1,X,Z,Y)
      ST    GR1,MSG1
      ST    GR2,MSG2 ;now GR2 holds Z
      OUT   MSG,LNG  ;'from X to Z'
      PUSH  0,GR2    ;rotate GR1-GR3
      LEA   GR2,0,GR1
      LEA   GR1,0,GR3
      POP   GR3
      CALL  HANOI    ;hanoi(N-1,Y,X,Z)
      PUSH  0,GR2    ;restore registers
      LEA   GR2,0,GR1
      POP   GR1
      ADD   GR0,ONE  ;also restore N
      RET

DISP  ST    GR1,MSG1 ;'from X to Z'
      ST    GR3,MSG2
      OUT   MSG,LNG
      RET

ONE   DC    1
N     DC    3        ;number of disks
LNG   DC    11       ;message length
A     DC    'A'
B     DC    'B'
C     DC    'C'
MSG   DC    'from '
MSG1  DS    1
      DC    ' to '
MSG2  DS    1
      END

; Executing this code yields the following result (where "from A to C" means
; to move the disk at the top of A to C):
;
; from A to C
; from A to B
; from C to B
; from A to C
; from B to A
; from B to C
; from A to C

-----------------------------------------------------------------------------

; Program to solve the eight queens puzzle,
; taken from the Calculator Benchmark web page
; http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/articles.cgi?read=700
BGN  START
     LEA  GR0,8
     ST   GR0,DIM
     LEA  GR0,0
     LEA  GR1,0
L00  CPA  GR1,DIM
     JZE  L05
     LEA  GR1,1,GR1
     LD   GR3,DIM
     ST   GR3,ARY,GR1
L01  ADD  GR0,ONE
     ST   GR1,TMP
     LD   GR2,TMP
L02  LEA  GR2,-1,GR2
     JZE  L00
     LD   GR3,ARY,GR1
     SUB  GR3,ARY,GR2
     JZE  L04
     JPZ  L03
     EOR  GR3,FFH
     LEA  GR3,1,GR3
L03  ST   GR2,TMP
     ADD  GR3,TMP
     ST   GR1,TMP
     SUB  GR3,TMP
     JNZ  L02
L04  LD   GR3,ARY,GR1
     LEA  GR3,-1,GR3
     ST   GR3,ARY,GR1
     JNZ  L01
     LEA  GR1,-1,GR1
     JNZ  L04
L05  EXIT
ONE  DC   1
FFH  DC   #FFFF
DIM  DS   1
TMP  DS   1
ARY  DS   9
     END

; The result is stored in the array ARY. Also the register GR0 contains the
; number of iterations (876).

-----------------------------------------------------------------------------

; This program calculates and displays a square root of an integer number
; entered by the user. It illustrates the usage of multiple blocks.
MAIN  START
      IN    BUF1,SIZE1
      LEA   GR1,BUF1
      LD    GR2,SIZE1
      CALL  ATOI
      ST    GR0,TEMP
      LEA   GR1,BUF3
      CALL  ITOA
      LD    GR0,TEMP
      CALL  SQRT
      LEA   GR0,0,GR1
      LEA   GR1,BUF4
      CALL  ITOA
      OUT   BUF2,SIZE2
      EXIT
BUF1  DS    80
SIZE1 DS    1
BUF2  DC    'SQRT ('
BUF3  DS    5
      DC    ') = '
BUF4  DS    5
SIZE2 DC    20
TEMP  DS    1
      END

; convert a string to an unsigned integer in GR0
; string address in GR1, length in GR2
ATOI  START
      LEA   GR0,0
L01   LEA   GR2,-1,GR2
      JMI   L02
      LD    GR3,0,GR1
      LEA   GR3,-48,GR3
      JMI   L03
      ST    GR3,TEMP1
      LEA   GR3,-10,GR3
      JPZ   L03
      SLL   GR0,1
      ST    GR0,TEMP2
      SLL   GR0,2
      ADD   GR0,TEMP2
      ADD   GR0,TEMP1
      LEA   GR1,1,GR1
      JMP   L01
L02   LEA   GR2,1,GR2
L03   RET
TEMP1 DS    1
TEMP2 DS    1
      END

; convert an unsigned integer GR0 to decimal
; result at the address GR1
ITOA  START
      LEA   GR2,4
L01   LD    GR3,ZERO
L02   CPL   GR0,TENS,GR2
      JMI   L03
      SUB   GR0,TENS,GR2
      LEA   GR3,1,GR3
      JMP   L02
L03   ST    GR3,0,GR1
      LEA   GR1,1,GR1
      LEA   GR2,-1,GR2
      JNZ   L01
      ADD   GR0,ZERO
      ST    GR0,0,GR1
      RET
ZERO  DC    '0'
TENS  DC    1
      DC    10
      DC    100
      DC    1000
      DC    10000
      END

; square root of an unsigned integer
; radicand = GR0, root = GR1
SQRT  START
      LEA   GR1,0       ;root
      LEA   GR2,0       ;remainder
      LEA   GR3,8       ;number of root bits
; shift 2 bits from the radicand to the remainder
L01   SLL   GR2,2
      ST    GR0,TEMP1
      SRL   GR0,14
      ST    GR0,TEMP2
      ADD   GR2,TEMP2
      LD    GR0,TEMP1
      SLL   GR0,2
; try to subtract 4*root+1 from the remainder
      SLL   GR1,2
      LEA   GR1,1,GR1
      ST    GR1,TEMP2
      SRL   GR1,1
      CPL   GR2,TEMP2
      JMI   L02
      SUB   GR2,TEMP2
      LEA   GR1,1,GR1
; next bit of the root
L02   LEA   GR3,-1,GR3
      JNZ   L01
      RET
TEMP1 DS    1
TEMP2 DS    1
      END
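
-----------------------------------------------------------------------------

The SQRT block above uses the bit-by-bit square root method: each pass moves
two radicand bits into the remainder and tries to subtract 4*root+1. The
Python sketch below restates the same loop so that it can be checked on a
desktop machine. The function name is hypothetical, and the code is a model
of the listing, not output of the PB-1000C.

```python
import math


def sqrt16(n):
    """Integer square root of a 16-bit value, mirroring the SQRT block."""
    root, rem = 0, 0
    for _ in range(8):                      # 8 root bits
        # shift 2 bits from the radicand to the remainder
        rem = ((rem << 2) + (n >> 14)) & 0xFFFF
        n = (n << 2) & 0xFFFF
        # try to subtract 4*root+1 from the remainder
        trial = ((root << 2) + 1) & 0xFFFF
        root = trial >> 1                   # root is now doubled
        if rem >= trial:
            rem -= trial
            root += 1                       # next bit of the root is 1
    return root


# Compare against the standard library for every 16-bit radicand.
assert all(sqrt16(n) == math.isqrt(n) for n in range(0x10000))
```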
