Friday, January 29, 2010

Week 1: Welcome

Welcome to the JavaScript Register Engine Project! My name is Curtis Autery, a software developer currently working at AEP, where I mainly bang out perl, webMethods flow, and java code, and perform the virtual analogy of tying helium balloons to a leaning Jenga puzzle. Most of the time, nothing could make me happier, but lately I've been trying to flex and grow my tech muscles, and so spend some downtime at home working on crazy experiments. This blog is the home of my latest one.

This project will chronicle my attempts at writing a computer emulator (of sorts) in JavaScript. I plan to write a new blog entry weekly on Friday nights with a discussion of updates to the project, what my next steps are, and sample code to talk about. All of this project's code is released under the GNU LGPL version 3. Working code for my first week's entry is available here, and a simple HTML interface to invoke predefined tests on the functions is available here. My email address in included in the source code, so if you have any questions, please fire away.

Conception

My wife of 22 months, who has no formal technical training, but has both taught a college class on Photoshop and is as skilled at disabling and removing malware as the people paid to do so at my office, bought me a book this Christmas that triggered everything. The book is "Coders at Work", by Peter Seibel. In it he interviews heavy hitters in the tech world who have laid the foundation of modern compilers, operating systems, and programming languages. I've been glued to the book, praising my wife for her genius in buying it for me, and spending many hours on Wikipedia reading about concepts discussed in the book that I am not versed in.

The old-timers in the book all tell the same story: "I originally was on another career path, and saw this behemoth computer at my college, with buzzing vacuum tubes and punchcards, and lots of important-looking people in labcoats who I schmoozed into letting me play with it, and then I was hooked. Years later, I was struggling over arcane problems with something low-level (large assembly code, memory lock contention on parallel processes, code that should work, but exposed an undiagnosed processor or OS bug) and spent my days stewing over my debugger, adding trace steps in my code, taking printouts home with me to mark up with a pencil. After making my bones, I drew on the pool of knowledge I built from my experiences to accomplish something great (Erlang, Haskell, optimizations to the GCC compiler, JavaScript, Lisp, Ghostscript, XEmacs, Unix itself).

I was in heaven reading their stories, and was also reminded of the love I used to have for the tech world that has waned over the years. I was also reminded of something I had never done before: write code in assembly language. The reason I never investigated that was completely silly: the guy in my high school who I first saw debugging assembly code on an Apple // was smug and aloof, and I was too annoyed with him to look over his shoulder and ask questions. I got busy learning other technology, and never came back to it. An interesting example of the butterfly effect, and yet another lost opportunity for personal growth in my teen years for no good reason.

I wasn't completely ignorant of assembly, however. I borrowed a volume of Don Knuth's "The Art of Computer Programming" from a library last year, and struggled through as much as I could of it for 30 days. It discusses low-level coding and program optimization via his invented "MIX" architecture, a theoretical hybrid of real machines. I enjoyed those chapters, with their talk of manipulating registers, jumping, overflow, bit shifting, including numerous examples of assembly code to accomplish sorting, random number generation, etc. It was heavy reading, and I'd love to revisit the whole series of books some day without time constraints. The book was a good second introduction to assembly language, and the type of low-level thinking needed to wield it effectively.

After "Coders at Work" rekindled my love of things tech, I had a desire to create something. I spun up some perl and JavaScript experiments and blogged about them (perl recursion and JavaScript closures, from earlier this month on my "Medicine for the Sky" blog), and concurrently did some thought experiments about assembly language. What would I include in a language I designed? I need to be able to read and write, display output, move in memory, compare values, do math... Would I want my machine to think in decimal, or in binary? Should it output in ASCII or just numbers? Should it manipulate on the byte or bit level? Should the commands be sort of high level, or should you be required to build up macros from simpler commands?

After a day of brainstorming, I crystallized the idea to code an engine to emulate hardware that can support assembly commands. I would code it in JavaScript, in pieces, grow it over time, and blog about the process. The Wikipedia articles on register machines and Turing machines (the latter Knuth mentions in "The Art of Computer Programming") describe roughly the level of complexity I'm aiming for.

After completing a first draft of code to create registers, an incrementer/decrementer, and some basic math functions that ride on top of it, I'm happy about the project, my code, and the prospect of having a usable final product out of all this. I'm only at the first baby steps, however. I have no abstraction yet of memory, output, a character set, or a command interpreter, just some functions to do basic arithmetic and counting.

Here I'm including the first draft source code, and some brief commentary of the major pieces.

The Register Engine provides a global function to create registers of arbitrary bit length, and a second global function that receives a register and a value to set it to as inputs, and sanitizes by making sure a real register has been passed, and a real binary value (for now this is actually a string containing ones and zeroes) is what it should be set to. The intention here is to have an immutable definition of what a register is that all other functions inherit. For now, registers just have data bits and a name. No error flag. Doing some basic thought experiments after playing with my initial version's test suite and thinking about overflow and multiplication, I see that this is probably a mistake that I'll need to correct later.
// JavaScript-based register machine
// Written by Curtis Autery
// (email address in .js file linked to above)
// Released under the GNU LGPL version 3 license

// The registers created by this script may only contain binary values, expressed
// as strings containing 0s and 1s. This is a regular expression to make sure only
// 0s and 1s are included in values being set to input registers
var re_word = /^[01]+$/;

// This is a regular expression for testing if a register contains only zeroes, useful
// in functions that decrement a counter to 0, the "jump on zero" assembler command,
// and in multiplication and division tests
var is_zero = /^0+$/;

// This global function returns a simple register object and initializes it to all 0s
function register(bits,name) {
this.bits = bits;
this.name = name;
this.value = "";
while (bits-- > 0) this.value += "0";
}

// This global function attempts to set a register to a specific word value. There are
// checks to make sure that an input register is being referenced, and that the word
// value is binary and is of the correct length for the register
function set_register(r, word) {
try {
if (r instanceof register == false) { throw "Not a register" }
if (re_word.test(word) == false) { throw ("Invalid word: " + word) }
if (word.length != r.bits) { throw (word + " needs " + r.bits + " bits, has " + word.length ) }
r.value = word;
} catch(e) {
alert (e);
}
}

Next, a "core" function is provided to be the main data manipulator. This is my first attempt to abstract a cpu, and I suspect this function will undergo major surgery in future versions. Four private registers are created, and closure functions are provided to set them to specific binary values, set them to all zeroes, and increment and decrement them. The plan is for all math to be broken down to this level. Incrementing deserves a special blurb here, as I have to get it right since all math will be based on it. I'm not incrementing in the traditional way, I'm doing a pair of search/replaces based on the last occurrence of zero in the string.

There are three types of binary strings: ones that end with zero, ones that contain zero but do not end with it, and ones that do not contain zero. In all three cases, we can increment to the next binary value by searching for the last zero, replace it with one, and replace all further digits with zero.

For values ending in zero, (e.g., 000, 010, 100), the last zero is also the last digit. Replace it with one, and there are no further digits to replace. For values containing zero but not ending with it (e.g., 001, 011), the final zero becomes one, the remaining digits (all ones) change to 0, simulating an arithmetic carry. For values containing no zeros (e.g., 111), the first match fails and returns null. Then "all further digits" encompasses the entire string, and everything rolls back to 0. In future versions, I may raise an error or overflow flag when this happens.
// This function creates the "core" register machine, creating four private registers,
// and providing closure functions to set, increment, and decrement them, and to alert
// all or return individual values.
function co_init() {
var regs = new Array();
regs[0] = new register(8, "co_0");
regs[1] = new register(8, "co_1");
regs[2] = new register(8, "co_2");
regs[3] = new register(8, "co_3");

function increment(input) {
// Last index of 0 becomes 1, everything after becomes 0
var l = input.lastIndexOf("0") + 1;
var a = input.substring(0,l);
var b = input.substring(l);
return a.replace(/0$/, "1") + b.replace(/1/g, "0");
}

function decrement(input) {
// Last index of 1 becomes 0, everything after becomes 1
var l = input.lastIndexOf("1") + 1;
var a = input.substring(0,l);
var b = input.substring(l);
return a.replace(/1$/, "0") + b.replace(/0/g, "1");
}

co_set = function(r,word) {
// Possible todo: add leading zeroes if necessary
set_register(regs[r],word);
}

co_set_zero = function(r) {
var bits = regs[r].bits;
var word = "";
while (bits-- > 0) word += "0";
set_register(regs[r],word);
}

// This is just a debug function to iterate through all the registers and pop up
// an alert window with their current values.
co_show_registers = function() {
var return_value = "Name\tValue\n";
for (var n in regs) {
return_value += regs[n].name + "\t" + regs[n].value + "\n";
}
alert (return_value);
}

co_inc = function(r) {
co_set(r,increment(regs[r].value));
}

co_dec = function(r) {
co_set(r,decrement(regs[r].value));
}

co_read = function(r) { return regs[r].value; }
}

Last in the major functions, a math processor contains abstracted macros to do the four basic arithmetic operations in terms of incrementing and decrementing binary values. For addition and subtraction, the first two registers are loaded up with values to be processed. The second register is decremented down to 0 (decrementing functions the same way as incrementing, just with opposite numbers) while the first register is incremented or decremented to the answer. Multiplication and division are more complicated, requiring extra registers as temp variables. In all cases, nothing is returned from the math functions, the answer is always left in the first register.
// Basic addition, subtraction, and multiplication routines using the core register
// machine functions to manipulate registers. Note that none of these functions returns
// any values, but rather leaves the "answer" in co_0, decrementing co_1 to 0 (or
// leaving the division remained there), and using co_2 and co_3 as temp variables as
// needed in the multiplication and division functions
function math_init() {

// Incremember co_0 by co_1, set co_1 to 0
add = function() {
while(is_zero.test(co_read(1)) == false) {
co_inc(0);
co_dec(1);
}
}

// Decrement co_0 by co_1, set co_1 to 0
subtract = function() {
while(is_zero.test(co_read(1)) == false) {
co_dec(0);
co_dec(1);
}
}

// Multiple co_0 by co_1, using co_2 and co_3 as temp space, return co_0
multiply = function() {
// If either input register is 0, stop and return zeroes
if(is_zero.test(co_read(0))) { return }
if(is_zero.test(co_read(1))) { co_set_zero(0); return }

co_set(2,co_read(0)); //co_2 will be original co_0 value
co_set(3,co_read(1)); //co_3 will be our counter to be decremented down to 0
co_dec(3);

while(is_zero.test(co_read(3)) == false) { // Repeat until co_3 is all zeroes
co_dec(3); // Decrement counter
co_set(1,co_read(2)); // Load co_1 back up with original co_0 value
add();
}
}

divide = function() {
try {
// If either input register is 0, stop and return zeroes or throw divide by zero error
if(is_zero.test(co_read(1))) { throw("Can't divide by 0") }
if(is_zero.test(co_read(0))) { return }

co_set_zero(2); // Counter;
co_set(3,co_read(1)); // Original divisor
while(co_read(0) >= co_read(1)) {
subtract();
co_inc(2);
co_set(1,co_read(3));
}
co_set(1,co_read(0)); // What's left after the last subtraction is the remainder
co_set(0,co_read(2)); // And the counter that we've been incrementing is the quotient
} catch(e) {
alert(e);
}
}

}

Additionally, I have provided a test suite function to run some sample values through incrementing, decrementing, and the arithmetic functions. They are testable from the link I provided above.
function test_suite_init() {
test_inc = function() {
var output = "Set co_0 to 0, increment 10 times\n";
co_set_zero(0);
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_inc(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_dec = function() {
var output = "Set co_0 to 1s, decrement 10 times\n";
co_set(0,"11111111");
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_dec(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_inc_overflow = function() {
var output = "Set co_0 to 11111101, increment 10 times\n";
co_set(0,"11111101");
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_inc(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_dec_overflow = function() {
var output = "Set co_0 to 00000010, decrement 10 times\n";
co_set(0,"00000010");
output += "Initial value:\t" + co_read(0) + "\n";
for (var x = 1; x <= 10; x++) {
co_dec(0);
output += "Interation " + x + "\t" + co_read(0) + "\n";
}
alert (output);
}

test_addition = function() {
var output = "Set co_0 to 00001110, co_1 to 00001111, add them (14 + 15)\n";
co_set(0,"00001110");
co_set(1,"00001111");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
add();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "(Binary 29 is\t00011101)";
alert (output);
}

test_subtraction = function() {
var output = "Set co_0 to 01011101 (93), co_1 to 00001100 (12), subtract them\n";
co_set(0,"01011101");
co_set(1,"00001100");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
subtract();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "(Binary 81 is\t01010001)";
alert (output);
}

test_multiplication = function() {
var output = "Set co_0 to 00001110, co_1 to 00001111, multiply them (14 * 15)\n";
co_set(0,"00001110");
co_set(1,"00001111");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
multiply();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "(Binary 210 is\t11010010)";
alert (output);
}

test_division = function() {
var output = "Set co_0 to 01011101 (93), co_1 to 00001100 (12), divide them\n";
co_set(0,"01011101");
co_set(1,"00001100");
output += "Initial values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
divide();
output += "Final values:\t" + co_read(0) + "\t" + co_read(1) + "\n";
output += "Answer should be\t00000111\t00001001\n(7 remainder 9)";
alert (output);
}

}

co_init();
math_init();
test_suite_init();

Plans for the next version:
1) Tweak the increment/decrement, and division functions to make them simpler.
2) Write a shell for computer memory, and an interface to perform I/O on it.
3) Frown about uglier functions, and play with them until I like them more.
4) Anxiously check email for any feedback.