Thursday, February 18, 2010

Week 4: How to write itoa, and the trouble with tech interviews

Source: rm.js
Test page: register-machine-test-2010-02-18.html
Language reference: assembly-language-reference.html

Changes:
- Fixed bug in division function
- All word characters (A-Z,a-z,0-9,_) can be used in labels, not just lowercase letters
- All assembly functions can use labels as the address part
- Added inp command to read from STDIN textarea
- Added a demo of subroutines, an itoa algorithm, and a STDIN reversal algorithm to the test suite
- Added right-shift and left-shift commands (lft, rgt)
- Got rid of "cor" command, replaced with unique commands for each math function (inc, dec, add, sub, mul, div)

This last change presents a question: Rework how commands are stored, or do something different with the address part of math commands? I decided for now to use the address part as an optional destination for the result of the math function. If the address is zero, the function works like its "cor" counterpart; if it is non-zero, memset is called and the result is copied to that address. This removes the need to follow up math functions with "cpo" commands to store the results. For example, add two numbers, store them in address 50, the old and the new way:

Old:

mov 0 50
mov 1 75
cor 0 2 // Add
cpo 0 50 // Copy reg 0 to mem 50
end

New:

mov 0 50
mov 1 75
add 0 50 // Add reg 0 and 1, result to mem 50
end

Additionally, for the add and subtract commands, Register 1 is added to or subtracted from the register you specify. Previously this was hardcoded to just register 0.

My "Phase 1" builds have problems that keep creeping up, and I've been fighting code and feature bloat as much as I can. For example, I started out with two registers, and quickly bumped that up to four before the first build, and now I'm up to four user-addressable registers, another register for the jump-to address, one for the STDIN pointer, three flags used in math functions, and an overflow flag... that I don't really do anything with yet.

I've reached the point where the project is becoming fleshed out enough that I've started researching other assembly environments. Interestingly, some of what I came up with independently matches what exists out in the wild, such as my algorithm for incrementing binary values, my solution for an itoa (Integer to ASCII) algorithm, and my method of creating pseudo-subroutines, which matches the way Knuth's MIX environment solves the same problem. Specifically, an executable byte contains both a command and an address, and exists in a memory location that is user-writable. If the command is "jump", I can overwrite just the address part, and change where it jumps to from within the program. This can be used to tell a "subroutine" where to return to when it's finished.

Sticking to the "only low-level bit stuff" goal, I accomplished this by using a bitmask that is all ones by the command part of a byte, and all zeroes by the address part, using the following JavaScript command:
mem[address] = (mem[address] & mask) | value; // update word part only

Imagine I have a three-bit command, 101, that corresponds to "JMP", and a four-bit address, 0011, that I want to replace with a new address of 0110. My bitmask will be 1110000. I start out by ANDing the byte and the mask:
  1 0 1 0 0 1 1
& 1 1 1 0 0 0 0
---------------
1 0 1 0 0 0 0

This retains the "command" part of the byte, and zeroes out the "address" part. Then I OR the byte with the new address, in this case 0110:
  1 0 1 0 0 0 0
| 0 0 0 0 1 1 0
---------------
1 0 1 0 1 1 0

So the old byte was 1010011, and by ANDing the mask then ORing the new value, I have a new byte 1010110, and I have updated where it will jump to.

This feature of being able to update just the address part of a byte can be extended to other functions as well. I found a use for it when trying to create a better itoa function than my first attempt. The itoa algorithm is a common function in many languages that takes a number and converts it to characters, one digit at a time, and prints each digit out in turn. This may sound like so much fluff and wasted effort, but programs commonly need to do this kind of conversion. For example, in plain JavaScript:
var ans = 5000 / 20;
alert("The answer to five-thousand divided by twenty is " + ans + ".");



The "ans" variable was a number stored as a numeric value. What is being alerted is a string, and each digit of "250" must be extracted from the number and displayed, one at a time, as ASCII. The extraction of each digit and conversion to ASCII was fantastically painful... and fun!

My first attempt at this in my assembly language was only semi-functional, and I wasn't happy with it. The principle was starting out dividing the number by 10,000 (since 65535 is the largest number that can fit in my 16 bit word length, there won't be more than 5 digits in a number). The integer portion of the quotient is the first digit in the number. Take that number and add 48 to it (ASCII 48 being where "0" is located), and print that character. Then step the divisor down to 1,000, and divide the remainder by that. Repeat, divide by 100, then 10, then 1. For example, divide 12,345 by 10,000 and you get 1 with a remainder of 2,345. Print out the 1, divide 2,345 by 1000, and you get 2 r 345. Etc.

The first algorithm was not optimal, as you have to adjust the hardcoded 10,000 divisor if the word_length changes from 16 to something else, and it behaves poorly whenever 0 is encountered. Also, the output will always be 5 bytes, with leading zeroes on small numbers. Boo! After pondering for a while, I found a better solution and implemented it in my assembly language.

The new algorithm works by starting from the small side. Divide by 10, and the remainder is your last digit. Divide that quotient by 10, and you get your next-to-last digit. Repeat until you get 0, and then print out the characters you've extracted in reverse order. To do that in my assembly language, I had to modify cpy and cpo addresses during the program execution using the same trick used above to create pseudo-subroutines, which necessitated adding the "update value only" feature I mentioned above to all functions, not just to jumps.

The working algorithm is available on the test page linked to above. An unexpected bonus to this implementation is you can change what number you're dividing by to change the base of the result. Change it to eight and it converts the number to octal, two and it prints out binary. Neat.

So after I designed and tested this, I did a Google search for "itoa algorithm", and found out that this is basically the standard solution for rolling your own itoa function, and also that this is a common interview question for coders. In fact, there appears to be a multitude of fora dedicated to helping potential interviewees perform better when quizzed in tech interviews. This irritates me for two basic reasons: First, random algorithm tests for candidates is a completely crappy interview method. Second, most of the posts I came across appeared to be pleas for help in gaming the system.

Job seekers seem to be building lists of answers to potential interview questions rather than building their skill at coding and problem-solving. So take my two approaches to the itoa problem. I hadn't "studied for the test", so my first implementation at an interview would have been scoffed at, and I wouldn't have gotten the job. I realized the problems with my algorithm, and attacked it again and wrote the optimal solution. This is my main approach to tech problem solving: take a stab at it to better learn the problem, then start over, using what you've learned to build a better solution. But I wouldn't have gotten the job, the unqualified flunky who cheated on the test would have. What a mess. Fortunately, I currently have a coding job I'm happy with, and I don't plan on trying to pass any arbitrary interview tests in the near future.

End rant.

Next steps:
- Add a "jump if flag is 1" command to branch on math and overflow flags
- Add logic commands such as xor, and, or, equals, not
- Work on functions that don't yet follow "bit manipulation only" rule
- Update test suite programs to read STDIN lines as integers for input instead of hardcoding the inputs.

Thursday, February 11, 2010

Week 3: Schoolboy Math Goes Binary

Source: rm.2010021js
Test page: register-machine-test-2010-02-11.html
Language reference: assembly-language-reference.html

Changes:
This was Speed Week. I made three major improvements, two that affect the speed of running programs, one that affects the speed of writing them.

- run_program() now dumps all output once at the end of the program (or when an error is thrown), rather than one line at a time. This dramatically speeds up output when multiple lines are being written (e.g., the ASCII chart and increment functions)
- Line-number based compiling has been replaced with label-based ("jmp 0 label:" vs. "5 jmp 0 15"). Look at the sample programs on the test page to see how this simplifies program creation. Now if you want to adjust a program, you don't need to go through and renumber everything and adjust your jump steps to match.
- Increment-based arithmetic has been replaced with real binary arithmetic. This is the largest speed improvement.
- The sample programs have been updated to the new format, and a couple new ones (factorials and the Fibonacci sequence) have been added.
- To illustrate the speed improvements, the word_length has been raised to 16 bits.

I learned a lot this week about arithmetic, which is a strange thing for a 38 year old computer programmer to say. Yes, I know how to do math by hand, they taught me all about it in elementary school (good old Mrs. Spake at Kernersville Elementary prevented me from a life of crime, all things considered). The new functions that do arithmetic schoolboy-style were interesting to write, and ended up being pretty concise, if a little odd. The exception to the fun was division, however; that was downright painful.

Here is an examination of the four basic arithmetic functions, and how I am currently treating them.

Addition

Here is my old increment-based add function:

// Increment co_0 by co_1, set co_1 to 0
add = function() {
while(co_read(1) != 0) {
co_inc(0);
co_dec(1);
}
}

Until register 1 returns a 0 value, decrement it, and increment register 0, which will end up being the sum of both registers. Nothing to it. Move Suzy's apples to John's pile, and then John has all the apples.

The function is correct, but the problem it has is with speed. The function always goes through regs[1] iterations: when register 1 starts out set to 250, the function takes 250 iterations. The increment function that add() calls, while not very expensive, iterates through each bit, least significant first, until it finds a zero. A little function analysis shows incrementing from 0 to 255 iterates through a total of 502 bits. This problem was compounded by my old multiplication function, which used a similar principle: adding the multiplicand to itself, then decrementing the multiplier, and repeating until the multiplier is 0. The total number of iterations over bits is roughly twice the product, e.g., 255x255 iterates over 129,540 bits!

It was clear after some basic testing that if I'm going to work with numbers that aren't tiny, I had to do math more efficiently, and that's what I've taken steps to do during this week's build.

Here is the revised add function:

co_add = function() {
// a = register 0's current bit, b = register 1's current bit, c = carry bit;
c = 0;
for (var x in bit) {
a = (regs[0] >> x) & 1;
b = (regs[1] >> x) & 1;
if (c != b) { regs[0] ^= bit[x] };
if (a == b && a != c) { c = !c }
}
if (c != 0) { overflow = true }
}

First, binary addition works the same way it does in base 10, except you only have ones and zeroes. You can subtotal by iterating right to left over the digits, carries work the same, everything's the same. Here's binary 011 (decimal 3) + binary 010 (decimal 2) to illustrate:

Step 1   Step 2   Step 3

1 1
011 011 011
+ 010 + 010 + 010
----- ----- -----
1 01 101 (binary 101 = decimal 5)

Just like you remember from school, right? All the possibilities for subtotaling each column can be layed out in a simple matrix of eight options: Two possibilities for each of the number being added, and two for whether there was a carry from the previous column. Here is the matrix of possibilities, the subtotals they produce, and whether or not the next column will need a carry:

Carry  Row 1  Row 2     Subtotal   New Carry
0 0 0 0 0
0 0 1 1 0
0 1 0 1 0
0 1 1 0 1
1 0 0 1 0
1 0 1 0 1
1 1 0 0 1
1 1 1 1 1

The matrix has two properties I stumbled across that I found suprising. First, If the current column's carry and row 2 values are the same, the subtotal will be the same as row 1, otherwise it will be the opposite. Second, If both rows have the same digit, the next column's carry is that digit, otherwise it remains whatever it was before. Look for yourself. It's pretty strange.

Those stumbled-upon properties combine to make it simple to calculate a sum in one pass through the digits, as well as to update row 1's values in place to make it the sum, rather than using a temporary variable. The meat happens here:

if (c != b) { regs[0] ^= bit[x] };
if (a == b && a != c) { c = !c }

The first line says "If the current value of carry doesn't equal row 2's bit, flip row 1's bit." The second says "If row one and two are equal, but do not match the current carry, flip the carry." Concise, but abstracted from what I would consider math, even though it effectively is adding the same way a third-grader does.

Subtraction:

The rewrite for subtraction works much the same way as addition. In fact, the function is identical except for the line that manages the "borrow" flag (the carry flag from addition). Here is the matrix for subtraction:

Borrow Row 1  Row 2     Subtotal   New Borrow
0 0 0 0 0
0 0 1 1 1
0 1 0 1 0
0 1 1 0 0
1 0 0 1 1
1 0 1 0 1
1 1 0 0 0
1 1 1 1 1

Subtotaling worked exactly the way it did for addition. Whether borrowing continued or not took a fair bit of analysis, though. It eventually jumped out at me that borrow tends to continue as it was, and in both cases where it switches, the previous value and row 1's bit matched. And when it changed, it changed to row 2's bit.

So with "c" being our borrow bit, and "a" and "b" being the row bits, managing the borrow flag can be accomplished with this simple statement:
if ( c == a ) { c = b };

Madness! Anyway, that makes the final subtract function:
co_subtract = function() {
c = 0; // Borrow flag
for (var x in bit) {
a = (regs[0] >> x) & 1;
b = (regs[1] >> x) & 1;
if (c != b) { regs[0] ^= bit[x] };
if ( c == a ) { c = b };
}
if (c != 0) { overflow = true }
}

Multiplication

Binary multiplication also works like decimal multiplication, but it has a fantastic simplicity to it that is really easy to code for. Take this example, multiplying 9 x 5 (binary 1001 x 101):
    1 0 0 1
x 1 0 1
-----------
1 0 0 1
0 0 0 0
1 0 0 1
-----------
1 0 1 1 0 1

First, binary 101101 is indeed 45. Second, for all the subtotal rows, one of two things happens: either the multiplicand comes down exactly as it was, or all zeroes comes down. We need only initialize a subtotal to 0, then iterate over the multiplier. For every 1, we add in the multiplicand, left shifted as appropriate, to the subtotal.

Here is what the basic rewrite looks like:
co_multiply = function() {
regs[2] = regs[1];
regs[1] = regs[0];
regs[0] = 0;
for (var x in bit) {
if ( (regs[2] >> x) & 1 ) { co_add() }
regs[1] <<= 1;
}
}

All the register shifting at the top is done to capitalize on the pre-existing addition function. It always adds register 1 to register 0, so reg[0] needs to start out empty, then get reg[1] added to it where appropriate. Inside the for loop, reg[1] gets left-shifted once per iteration.

This would work as-is, but I put in a couple additional safeguards. If either initial value is 0, we know the answer already, so there is no work to do. Also, I set the overflow flag if I'm about to left-shift off a 1, as the answer won't fit in the number of bits I'm working with. Here's the final function:
co_multiply = function() {
if (regs[0] == 0) { return }
if (regs[1] == 0) { regs[0] = 0; return }
regs[2] = regs[1];
regs[1] = regs[0];
regs[0] = 0;
for (var x in bit) {
if ( (regs[2] >> x) & 1 ) { co_add() }
if ( (regs[1] & highbit) != 0 ) { overflow = true }
regs[1] <<= 1;
}
}

I've pre-defined highbit as the highest value in the bit array, to speed up lookups. I'm considering rewriting that line as:
if ( (regs[1] >> word_length) & 1 ) { overflow = true }

Some very unscientific test cases have shown that right-shifting followed by a bitwise AND with 00000001 is slightly faster than a bitwise AND with a higher value. Plus it looks cooler. And, you know, that's important, too.

Division

There are no two ways about it: Long division brings back bad memories from school for everyone. After tackling it in code on a low level, my belief is now that it is being taught completely wrong to our kids. Yes, they are taught a method that produces correct answers, but it is abstracted from the real calculations going on so much that it might as well be a raindance that magically gives answers. Consider the steps in the standard long division technique of dividing 2825 by 25:
Step 1:   Step 2:   Step 3:   Step 4:   Step 5:
__1__ __1__ __11_ __11_ __113
25)2825 25)2825 25)2825 25)2825 25)2825
25 - 25 - 25 - 25 - 25
3 32 32 32
- 25 - 25 - 25
7 75
- 75
0

The first few things that jump out at me: In step 1, so far my answer is 1. My adult understanding of numbers can see that this is a 1 in the hundreds place. But my 8 year-old self had trouble with that.

In step 2, what exactly does that 3 represent? Or in step 3, what does 32 represent? By the time we get to step 4, there are a bunch of floating numbers abstracted from reality, and we just have to have faith in the magic system that gives us the answer.

What do I think would help? Well, let's start by filling in the blanks:
Step 1:   Step 2:   Step 3:   Step 4:   Step 5:
__100 __100 __110 __110 __113
25)2825 25)2825 25)2825 25)2825 25)2825
2500 - 2500 - 2500 - 2500 - 2500
325 325 325 325
- 250 - 250 - 250
75 75
- 75
0

So there it is. The "3" represents the first digit of "325", the difference of 2825 and 2500. So by step 2 we've said 2825 / 25 is 100, with a remainder of 325. What's left to do is to try to whittle down the remainder.

In the method with the blanks filled in, there is no mystic bringing down of digits for the next step; instead you do the complete subtraction, and what's left is the complete remainder. Maybe one day I'll flesh out what I think is a better approach to teaching division to kids, but for now let's stick to binary division in JavaScript, and what is *really* going on behind the scenes in long division.

In the binary world we are again aided by the fact that there are just ones and zeroes. When building our answer, we need only determine whether or not to put a one in the current bit. Consider 50 / 5, or binary 110010 / 101:
   ___1010
101)110010
- 101000 (101 + 3 zeroes)
01010
- 00000 (000 + 2 zeroes)
1010
- 1010 (101 + 1 zero)
000
- 000 (000 + no zeroes)

And binary 1010 is indeed decimal 10. The approach I took to tell JavaScript to do this breaks down into two groups of steps, here calling the dividend N (numerator), the divisor D (demoninator), the quotient Q, and x is a counter for how far left-shifted D is.

Group A:
A1: Set Q and x to 0.
A2: If D is greater than N, go to step B1
A3: Left-shift D
A4: Increment x
A5: Go to step A2

Group A left-shift's N until it is greater than D, incrementing x along the way. Group B does the real division work.

Group B:
B1: If D is greater than N, go to step B4
B2: Subtract D from N
B3: Add 1 left-shifted x times to Q
B4: If x is 0 or N is 0, Q is quotient, N is remainder, exit
B5: right-shift D
B6: decrement x
B7: Go to step B1

Group B starts out with D greater than N, so it skips down to right-shifting it over once. Then it iterates through subtracting from the dividend, and adding 1 bits to the quotient, and right-shifting D until the remainder is 0, or D has been right-shifted back to where it started.

The drawback to this function is that it can iterate through 2*word_length bits if the dividend is large and the divisor small. I'm sure it is possible to simplify the function and reduce its average bit iterations, but this second attempt at division is an order of magnitude faster than its predecessor. Here is the current function:
co_divide = function() {
// If either input register is 0, stop and return zeroes or throw divide by zero error
if(regs[1] == 0) { throw("Can't divide by 0") }
if(regs[0] == 0) { return }

regs[3] = 0; // Answer, ultimately. We build it up one bit at a time
var x; // current bit

// Left-shift denominator until it is >= numerator, or left-shifting will lose a 1
for (x in bit) {
a = greater(regs[0], regs[1]);
if (a == null) { // registers are equal
regs[0] = bit[x];
regs[1] = 0;
return;
}
if (a == false) { break } // Stop if left-shifted denominator is greater than numerator
if ( (regs[1] & highbit) != 0 ) { overflow = true }
regs[1] <<= 1;
}

// Subtract, right-shift, repeat until remainder or x = 0
while (regs[0] != 0) {
if(a != false) {
co_subtract();
regs[3] |= bit[x];
}
if (x == 0) { break }
x = decrement(x);
regs[1] >>= 1;
a = greater(regs[0], regs[1]);
}

regs[1] = regs[0]; // Remainder
regs[0] = regs[3]; // Quotient
}

Next steps:
I still need to look at floats, but I'm more interested in cleaning up the compiler to make it more like the other functions, and not rely on JavaScript text matching. I'm not sure how much progress I'll make on either of those by next week, but I'm having fun whipping up small programs using the new language style, so I'll at least add a few more of those to the test suite.

Thursday, February 4, 2010

Week 2: Bit Twiddling

Source: rm20100203.js
Test page: register-machine-test-2010-02-03.html

Changes:
- Increment/decrement now using bit twiddling of integers, not strings
- Test suite page redesigned to display output in textarea rather than alert window
- Working memory handling, program executor, and compiler
- Test suite contains sample assembly programs for arithmetic and displaying an ASCII table

There is an inherent flaw in this project that is unavoidable: JavaScript does not provide direct access to memory. No matter what my design, at some point I'm going to have to emulate something a chip does to memory by using a JavaScript-provided construct. My first draft from last week was using strings of zeroes and ones to be virtual memory, and I set out to have the chip functions do all math through my wacky search/replace increment functions. As I built on last week's code and provided the interface to memory and assembly instructions, I could not seem to avoid doing this:
var memsize = Math.pow(2,word_length)-1;

...and a lot of this:
for (var x = 0; x < word_length; x++) ...

I ended up with chunks of code considering addresses as strings of zeroes and ones, and other chunks considering it with integer values. I also was letting more native JavaScript math slip in everywhere. I decided, then, to rewrite everything to use JavaScript integers to store values, and to be as much of a purist as possible in avoiding JavaScript math or high level compare functions. I also abandoned the idea of registers as objects, and just store them now as arrays of integers.

After some heavy rewriting, the only maths my math and core functions now use are logical bitwise operators, and testing for 0 and null. The program executor and compiler do no math, but use switch statements to check for more values than just 0. Future builds will be more true to my "purist" idea, but rather than taking weeks to build the perfect framework, I wanted to build something quickly that would work.

And work it does. The goal of being able to type in assembly commands, and have them interpreted by a virtual machine and executed, and send output to a textarea works famously, as shown in this screenshot:


The assembly language I invented is very simple, comprised of only 8 commands, all following a standard format: [line] [command] [register] [word]. These get turned by the compiler into 13 bit integers (3 for the command, 2 for the register, 8 for the word) and inserted in my memory array, using "line" as the index. The executor then iterates over memory, and extracts from the integers the command, register, and word, and then acts on them.

Here is an overview of the commands, where "r" is the specified register, "w" the word:
jmp - Jump to address w, unless register r is less than register 0 (If r = 0, always jump)
jmz - Jump to address w only if register r's value is 0
cpy - Copy memory address w register r
cpo - Copy register r to address w
mov - Move the literal value w to register r
cor - Run core math function w. w = 0, increment register r. 1 = decrement r. 2 = add regsters 0 and 1. 3 = subtract, 4 = multiply, 5 = divide
out - Copy address w to output device (getElementById("op").value, for now). Copy decimal value plus newline if r = 0, ASCII value otherwise
end - Stop program. Register and word are ignored, but must be included

This is a very limiting language, naturally, but I've been able to accomplish some fun stuff with it during my first week of playing with it. Here, for example, is a simple 11 line program to display an ASCII table, an improvement over the one that comes in the built-in test suite, as it won't get stuck in an infinite loop if you try to show all 256 ASCII characters:

0 mov 0 0 // Start
1 mov 1 255 // End
2 mov 2 32 // Space
3 cpo 2 51
4 cpo 0 50
5 out 1 50 // Character
6 out 1 51 // Space
7 out 0 50 // Number
8 cor 0 0 // Increment to next character
9 jmz 0 11 // Exit if counter overflows
10 jmp 1 4 // Go to loop if End >= current
11 end 0 0


You can cut and paste that directly into the "input" window on the test page I link to above, and click "Run program" to watch it work. It may take a few seconds, but my engine is still sort of slow. A last note about the program, I said "all 256 ASCII characters" above, however the JavaScript conversion function I'm using to extract characters is String.fromCharCode, which displays unicode values. I tested this using 16 bit words and displaying characters greater than 256, and successfully output unicode values.

My functions now conceptually think of memory and registers as bits, my increment/decrement functions use xor-ing and flags to manipulate one bit at a time, and I assemble my cpu instructions using register shifting. Here are my old and new increment functions, to illustrate:

Old:
function increment(input) {
// Last index of 0 becomes 1, everything after becomes 0
var l = input.lastIndexOf("0") + 1;
var a = input.substring(0,l);
var b = input.substring(l);
return a.replace(/0$/, "1") + b.replace(/1/g, "0");
}

New:
// The bit[] array has flags used in XOR/AND operations, starting at 1, then doubling
// i.e., bit[0] = binary 00000001, bit[1] = 00000010, bit[2] = 00000100, etc.
var bit = new Array(1, 2, 4, 8, 16, 32, 64, 128);
function increment(num) {
for (var x in bit) {
var tmp = num & bit[x]; // tmp will be 0 if current bit being examined is 0
num = num ^ bit[x]; // This flips the current bit
if (tmp == 0) { return num; } // Stop if we just flipped a 0 to a 1
}
overflow = 1;
return num;
}

So, the new incrementing function works on same basic principle, except it's more explicit, whereas the old function was more "tricky" with its use of regular expressions. The new function conceptually thinks of the number as binary, and starts at the least significant bit and works backwards. Here is an example of how it works, incrementing 3 (binary 00000011) to 4 (00000100).

Step 1:
num    00000011                 00000011
bit[0] 00000001 & (AND) 00000001 ^ (XOR)
-------- --------
tmp 00000001 new num 00000010

tmp does not equal 0, so continue with new num 00000010 (decimal 2, yes I realize so far we're going the wrong way)

Step 2:
num    00000010                 00000010
bit[1] 00000010 & 00000010 ^
-------- --------
tmp 00000010 new num 00000000

tmp still isn't 0, so continuing with new num being 0. Looks bad, but have faith...

Step 3:
num    00000000                 00000000
bit[2] 00000100 & 00000100 ^
-------- --------
tmp 00000000 new num 00000100

OK, tmp is zero, so stop and return the new num value of 00000100 (decimal 4). Ta-da! increment(3) = 4, just like it is supposed to, and no actual arithmetic was done, just logical operations on bits and testing for 0. I'm building all arithmetic the register machine does on top of that. Up from first principles, so to speak, or rather, I'm pretending my increment function does what my theoretical cpu would do with bits. Decrementing works the same way, and I won't expound on it. The only difference is we stop when tmp does not equal 0.

Enjoy toying with the test page, modifying my sample programs, and making your own. Watch out for infinite loops. I've had to kill my browser dozens of times while experimenting this week due to exceptions I didn't see. Remember, assembly is a low-level and completely unsafe environment to code in.

Next steps:
- Write assembly language reference and more sample programs
- Speed up binary arithmetic functions instead of relying on increment/decrement only
- Add label support to assembly copiler, to eleminate reliance on line numbers. (Making small changes to a program is painful, as you currently have to renumber everything.)
- Experiment with floating point numbers

Friday, January 29, 2010

Week 1: Welcome

Welcome to the JavaScript Register Engine Project! My name is Curtis Autery, a software developer currently working at AEP, where I mainly bang out perl, webMethods flow, and java code, and perform the virtual analogy of tying helium balloons to a leaning Jenga puzzle. Most of the time, nothing could make me happier, but lately I've been trying to flex and grow my tech muscles, and so spend some downtime at home working on crazy experiments. This blog is the home of my latest one.

This project will chronicle my attempts at writing a computer emulator (of sorts) in JavaScript. I plan to write a new blog entry weekly on Friday nights with a discussion of updates to the project, what my next steps are, and sample code to talk about. All of this project's code is released under the GNU LGPL version 3. Working code for my first week's entry is available here, and a simple HTML interface to invoke predefined tests on the functions is available here. My email address in included in the source code, so if you have any questions, please fire away.

Conception

My wife of 22 months, who has no formal technical training, but has both taught a college class on Photoshop and is as skilled at disabling and removing malware as the people paid to do so at my office, bought me a book this Christmas that triggered everything. The book is "Coders at Work", by Peter Seibel. In it he interviews heavy hitters in the tech world who have laid the foundation of modern compilers, operating systems, and programming languages. I've been glued to the book, praising my wife for her genius in buying it for me, and spending many hours on Wikipedia reading about concepts discussed in the book that I am not versed in.

The old-timers in the book all tell the same story: "I originally was on another career path, and saw this behemoth computer at my college, with buzzing vacuum tubes and punchcards, and lots of important-looking people in labcoats who I schmoozed into letting me play with it, and then I was hooked. Years later, I was struggling over arcane problems with something low-level (large assembly code, memory lock contention on parallel processes, code that should work, but exposed an undiagnosed processor or OS bug) and spent my days stewing over my debugger, adding trace steps in my code, taking printouts home with me to mark up with a pencil. After making my bones, I drew on the pool of knowledge I built from my experiences to accomplish something great (Erlang, Haskell, optimizations to the GCC compiler, JavaScript, Lisp, Ghostscript, XEmacs, Unix itself).

I was in heaven reading their stories, and was also reminded of the love I used to have for the tech world that has waned over the years. I was also reminded of something I had never done before: write code in assembly language. The reason I never investigated that was completely silly: the guy in my high school who I first saw debugging assembly code on an Apple // was smug and aloof, and I was too annoyed with him to look over his shoulder and ask questions. I got busy learning other technology, and never came back to it. An interesting example of the butterfly effect, and yet another lost opportunity for personal growth in my teen years for no good reason.

I wasn't completely ignorant of assembly, however. I borrowed a volume of Don Knuth's "The Art of Computer Programming" from a library last year, and struggled through as much as I could of it for 30 days. It discusses low-level coding and program optimization via his invented "MIX" architecture, a theoretical hybrid of real machines. I enjoyed those chapters, with their talk of manipulating registers, jumping, overflow, bit shifting, including numerous examples of assembly code to accomplish sorting, random number generation, etc. It was heavy reading, and I'd love to revisit the whole series of books some day without time constraints. The book was a good second introduction to assembly language, and the type of low-level thinking needed to wield it effectively.

After "Coders at Work" rekindled my love of things tech, I had a desire to create something. I spun up some perl and JavaScript experiments and blogged about them (perl recursion and JavaScript closures, from earlier this month on my "Medicine for the Sky" blog), and concurrently did some thought experiments about assembly language. What would I include in a language I designed? I need to be able to read and write, display output, move in memory, compare values, do math... Would I want my machine to think in decimal, or in binary? Should it output in ASCII or just numbers? Should it manipulate on the byte or bit level? Should the commands be sort of high level, or should you be required to build up macros from simpler commands?

After a day of brainstorming, I crystallized the idea to code an engine to emulate hardware that can support assembly commands. I would code it in JavaScript, in pieces, grow it over time, and blog about the process. The Wikipedia articles on register machines and Turing machines (the latter Knuth mentions in "The Art of Computer Programming") describe roughly the level of complexity I'm aiming for.

After completing a first draft of code to create registers, an incrementer/decrementer, and some basic math functions that ride on top of it, I'm happy about the project, my code, and the prospect of having a usable final product out of all this. I'm only at the first baby steps, however. I have no abstraction yet of memory, output, a character set, or a command interpreter, just some functions to do basic arithmetic and counting.

Here I'm including the first draft source code, and some brief commentary of the major pieces.

The Register Engine provides a global function to create registers of arbitrary bit length, and a second global function that receives a register and a value to set it to as inputs, and sanitizes by making sure a real register has been passed, and a real binary value (for now this is actually a string containing ones and zeroes) is what it should be set to. The intention here is to have an immutable definition of what a register is that all other functions inherit. For now, registers just have data bits and a name. No error flag. Doing some basic thought experiments after playing with my initial version's test suite and thinking about overflow and multiplication, I see that this is probably a mistake that I'll need to correct later.
// JavaScript-based register machine
// Written by Curtis Autery
// (email address in .js file linked to above)
// Released under the GNU LGPL version 3 license

// The registers created by this script may only contain binary values, expressed
// as strings containing 0s and 1s. This is a regular expression to make sure only
// 0s and 1s are included in values being set to input registers
var re_word = /^[01]+$/;

// This is a regular expression for testing if a register contains only zeroes, useful
// in functions that decrement a counter to 0, the "jump on zero" assembler command,
// and in multiplication and division tests
var is_zero = /^0+$/;

// This global function returns a simple register object and initializes it to all 0s
function register(bits,name) {
this.bits = bits;
this.name = name;
this.value = "";
while (bits-- > 0) this.value += "0";
}

// This global function attempts to set a register to a specific word value. There are
// checks to make sure that an input register is being referenced, and that the word
// value is binary and is of the correct length for the register
function set_register(r, word) {
try {
if (r instanceof register == false) { throw "Not a register" }
if (re_word.test(word) == false) { throw ("Invalid word: " + word) }
if (word.length != r.bits) { throw (word + " needs " + r.bits + " bits, has " + word.length ) }
r.value = word;
} catch(e) {
alert (e);
}
}

Next, a "core" function is provided to be the main data manipulator. This is my first attempt to abstract a cpu, and I suspect this function will undergo major surgery in future versions. Four private registers are created, and closure functions are provided to set them to specific binary values, set them to all zeroes, and increment and decrement them. The plan is for all math to be broken down to this level. Incrementing deserves a special blurb here, as I have to get it right since all math will be based on it. I'm not incrementing in the traditional way, I'm doing a pair of search/replaces based on the last occurrence of zero in the string.

There are three types of binary strings: ones that end with zero, ones that contain zero but do not end with it, and ones that do not contain zero. In all three cases, we can increment to the next binary value by searching for the last zero, replace it with one, and replace all further digits with zero.

For values ending in zero, (e.g., 000, 010, 100), the last zero is also the last digit. Replace it with one, and there are no further digits to replace. For values containing zero but not ending with it (e.g., 001, 011), the final zero becomes one, the remaining digits (all ones) change to 0, simulating an arithmetic carry. For values containing no zeros (e.g., 111), the first match fails and returns null. Then "all further digits" encompasses the entire string, and everything rolls back to 0. In future versions, I may raise an error or overflow flag when this happens.
// This function creates the "core" register machine, creating four private registers,
// and providing closure functions to set, increment, and decrement them, and to alert
// all or return individual values.
function co_init() {
var regs = new Array();
regs[0] = new register(8, "co_0");
regs[1] = new register(8, "co_1");
regs[2] = new register(8, "co_2");
regs[3] = new register(8, "co_3");

function increment(input) {
// Last index of 0 becomes 1, everything after becomes 0
var l = input.lastIndexOf("0") + 1;
var a = input.substring(0,l);
var b = input.substring(l);
return a.replace(/0$/, "1") + b.replace(/1/g, "0");
}

function decrement(input) {
// Last index of 1 becomes 0, everything after becomes 1
var l = input.lastIndexOf("1") + 1;
var a = input.substring(0,l);
var b = input.substring(l);
return a.replace(/1$/, "0") + b.replace(/0/g, "1");
}

co_set = function(r,word) {
// Possible todo: add leading zeroes if necessary
set_register(regs[r],word);
}

co_set_zero = function(r) {
var bits = regs[r].bits;
var word = "";
while (bits-- > 0) word += "0";
set_register(regs[r],word);
}

// This is just a debug function to iterate through all the registers and pop up
// an alert window with their current values.
co_show_registers = function() {
var return_value = "Name\tValue\n";
for (var n in regs) {
return_value += regs[n].name + "\t" + regs[n].value + "\n";
}
alert (return_value);
}

co_inc = function(r) {
co_set(r,increment(regs[r].value));
}

co_dec = function(r) {
co_set(r,decrement(regs[r].value));
}

co_read = function(r) { return regs[r].value; }
}

Last in the major functions, a math processor contains abstracted macros to do the four basic arithmetic operations in terms of incrementing and decrementing binary values. For addition and subtraction, the first two registers are loaded up with values to be processed. The second register is decremented down to 0 (decrementing functions the same way as incrementing, just with opposite numbers) while the first register is incremented or decremented to the answer. Multiplication and division are more complicated, requiring extra registers as temp variables. In all cases, nothing is returned from the math functions, the answer is always left in the first register.
// Basic addition, subtraction, and multiplication routines using the core register
// machine functions to manipulate registers. Note that none of these functions returns
// any values, but rather leaves the "answer" in co_0, decrementing co_1 to 0 (or
// leaving the division remained there), and using co_2 and co_3 as temp variables as
// needed in the multiplication and division functions
function math_init() {

// Incremember co_0 by co_1, set co_1 to 0
add = function() {
while(is_zero.test(co_read(1)) == false) {
co_inc(0);
co_dec(1);
}
}

// Decrement co_0 by co_1, set co_1 to 0
subtract = function() {
while(is_zero.test(co_read(1)) == false) {
co_dec(0);
co_dec(1);
}
}

// Multiple co_0 by co_1, using co_2 and co_3 as temp space, return co_0
multiply = function() {
// If either input register is 0, stop and return zeroes
if(is_zero.test(co_read(0))) { return }
if(is_zero.test(co_read(1))) { co_set_zero(0); return }

co_set(2,co_read(0)); //co_2 will be original co_0 value
co_set(3,co_read(1)); //co_3 will be our counter to be decremented down to 0
co_dec(3);

while(is_zero.test(co_read(3)) == false) { // Repeat until co_3 is all zeroes
co_dec(3); // Decrement counter
co_set(1,co_read(2)); // Load co_1 back up with original co_0 value
add();
}
}

divide = function() {
try {
// If either input register is 0, stop and return zeroes or throw divide by zero error
if(is_zero.test(co_read(1))) { throw("Can't divide by 0") }
if(is_zero.test(co_read(0))) { return }

co_set_zero(2); // Counter;
co_set(3,co_read(1)); // Original divisor
while(co_read(0) >= co_read(1)) {
subtract();
co_inc(2);
co_set(1,co_read(3));
}
co_set(1,co_read(0)); // What's left after the last subtraction is the remainder
co_set(0,co_read(2)); // And the counter that we've been incrementing is the quotient
} catch(e) {
alert(e);
}
}

}

Additionally, I have provided a test suite function to run some sample values through incrementing, decrementing, and the arithmetic functions. They are testable from the link I provided above.
function test_suite_init() {
test_inc = function() {
var output = "Set co_0 to 0, increment 10 times\n";
co_set_zero(0);
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_inc(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_dec = function() {
var output = "Set co_0 to 1s, decrement 10 times\n";
co_set(0,"11111111");
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_dec(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_inc_overflow = function() {
var output = "Set co_0 to 11111101, increment 10 times\n";
co_set(0,"11111101");
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_inc(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_dec_overflow = function() {
var output = "Set co_0 to 00000010, decrement 10 times\n";
co_set(0,"00000010");
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_dec(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_addition = function() {
var output = "Set co_0 to 00001110, co_1 to 00001111, add them (14 + 15)\n";
co_set(0,"00001110");
co_set(1,"00001111");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
add();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "(Binary 29 is\t00011101)";
alert (output);
}

test_subtraction = function() {
var output = "Set co_0 to 01011101 (93), co_1 to 00001100 (12), subtract them\n";
co_set(0,"01011101");
co_set(1,"00001100");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
subtract();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "(Binary 81 is\t01010001)";
alert (output);
}

test_multiplication = function() {
var output = "Set co_0 to 00001110, co_1 to 00001111, multiply them (14 * 15)\n";
co_set(0,"00001110");
co_set(1,"00001111");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
multiply();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "(Binary 210 is\t11010010)";
alert (output);
}

test_division = function() {
var output = "Set co_0 to 01011101 (93), co_1 to 00001100 (12), divide them\n";
co_set(0,"01011101");
co_set(1,"00001100");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
divide();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "Answer should be\t00000111\t00001001\n(7 remainder 9)";
alert (output);
}

}

co_init();
math_init();
test_suite_init();

Plans for the next version:
1) Tweak the increment/decrement, and division functions to make them simpler.
2) Write a shell for computer memory, and an interface to perform I/O on it.
3) Frown about uglier functions, and play with them until I like them more.
4) Anxiously check email for any feedback.