« More Sparkfun geiger counter explorations | Main | Propeller based geiger counter »

Saturday, February 16, 2013

Propeller Programming stream of consciousness

For those of you who think this is a serious discussion of Propeller PASM, it's not. It started off as a comment on PASM when suddenly it seemed to take on a life of its own. So, it's probably best described as a sort of rant combined with stream of consciousness after a week spent programming that wonderful Propeller chip which is my favorite computer architecture at the moment. Chip Gracey, you rock.

While trying to write propeller subroutines on paper upstairs, suddenly a few things became clear; the reason for the jmpret instruction and the reason for movs and movd instructions. I made some stupid mistakes in my Prop code when first came home as I made the unforgivable error of assuming that #<variable> would give me indirect addressing (No, No, No, No ---- repeat*1000(No)).

There is no indirect addressing instruction in the Prop instruction set. Remember that. To move a variable to a buffer, one does this as follows:

mov BuffAddr, #Buff

movd #ins1, BuffAddr

ins1 mov #1, Data

So, instead of the nice PDP-11 method of:

MOV Data, @BuffAddr

which takes 2 bytes, one has the 12 byte instruction sequence above. The primary difference is that the PDP11 instruction would take about 4 microseconds to execute as it involved a memory based indirect address whereas the Prop instruction takes 12 clocks to execute or 0.15 microseconds.

The other thing is subroutine calls. The PDP8 was more advanced than the Prop when it came to subroutine calls. However, with the JumpRet instruction, one can return from anywhere as long as one allocates a return address for every subroutine, ie

jmpret Sub1Ret, Sub1

rs1 <code to execute after routine>

Sub1 <do something>

jmp Sub1Ret

The jmpret instruction writes #rs1 to variable Sub1Ret and calls Sub1. Have to just ensure that do a jmp Sub1Ret to get back to the caller. Kind of neat and also finicky but have to play the hand I've been dealt.

There's something about the Prop chip which is intensely appealing and I think that it's an architecture that DEMANDS self modifying code is what makes it such an attractive machine for me. Self-modifying code is something that will get one flunked out of a CompSci course. Mention self-modifying code to a CompSci graduate and they will suddenly turn pale, make the sign of the cross and back slowly away from you saying very slowly, "just stay calm and no-one gets hurt". If you want to then induce a vagal freeze response in them, say the words "Go To". That will overload their wetware, create potentially irreperable cognitive dissonance making them maybe capable of employment at a bottle recycling depot where their ability to count rapidly will be useful.

Self-modifying code seems to be the divide between those who want contact with the bare silicon and those who prefer to live in a world of virtual machines that all talk Java to one another and keep their variables private. Mention a global variable to these people and it has the same effect as if one pulled out ones dick at a Victorian tea party and asked "does anyone here have a longer one?"

Well I happen to like living dangerously which is far preferable to living a life of boredom. Having to restrain myself from trying out all the neat meatware hacks I come up with in my other life as a doctor makes me far more experimental when it comes to my hacker alter-ego. Basically, if there's a limit, I want to see if it can be broken or stretched.

When I first entered the micromedic contest, it was with some trepidation given the 496 longs/cog. This is a hard limit and I don't have access to a silicon foundry so I don't try to fit in 1 Mb of code into a cog. However, when I started writing PASM, I suddenly realized that I was using far fewer instructions that I thought would be needed. That's because data acquisition software is, when one looks at it in detail, really very simple. One either takes a value from one location and stores it in another, or one generates an SPI/I2C clock sequence which is used to either send or recieve serial data. This made me realize that most of the programs which I write in a HLL really aren't doing anything useful as far as the fundamental problem is concerned. All of the routines which I'm especially proud of are very terse and, when I was a young hacker wanabee, the head hacker reminded us of the importance of being terse. That was in the tiny 12 Kword memory space of the TSS/8. Verbosity in PDP8 code was met with a trembling, angry, outstretched hand pointing to the evil empires hulking 360/65 and a command to "go play with your bloatware".

Today I can't think of a single hacker that would consider the IBM 360's OS to be "bloatware" as the usual 360 installation was lucky to have a single megabyte of core storage. The OS was written in assembly language, but was very arcane and if one threw in 2 or 3 instructions that really weren't needed, one didn't notice given that there was a full MEGABYTE of core. That type of profligacy didn't go over very well in the minimalist atmosphere of the PDP8 head hacker's lair. There would be instruction sequences on a blackboard to be simplified and the best one could expect from him if one came up with a particularly novel simplification was "now that's a neat hack".

The Prop has way more RAM than the TSS/8, 2048 longs in the cogs and 8192 longs in hub RAM. That works out to 40,960 bytes of RAM in comparison to the 18432 bytes (although the unit of storage in a PDP8 was a 12 bit word) on the TSS/8. The TSS/8 ran 8 slave teletype machines, each one capable of downloading paper tape programs into the TSS/8 at the blinding speed of 110 baud. When I got an account on the TSS/8, I was in Nirvana. Now, to be fair to the TSS/8, it did have a hard disk drive which I believe was 2 Megabytes or so in size. OTOH, my Propeller board of education (BOE) has a mini-SD card socket and will handle cards up to 32 Gb in size. The TSS/8 also had it's own large room dedicated to it at the U of Calgary and, those of us who had proved especially worthy in the head hacker's eyes, were allowed to come into the inner sanctum and actually touch the PDP-8 machine! We were also allowed to reboot it, but the only PDP-8 we were allowed direct contact with was a bare machine with, I believe 1024 or 2048 words of core.

That's where I first learned assembly language programming. Spending much of my classroom time in highschool writing PDP8 software in octal and getting angry glares from teachers who assumed that their trivial subject was much more deserving of my time than programming. Once I had the sequence of octal instruction codes perfected, on yellow sheets of paper often with holes from multiple erasures of incorrect code sequences, I'd head over to the UofC Data Center and sign up for an hours time on the PDP8. Then, trembling with anticipation I'd key in my program on the switch panel; key in the address, load the address into the address register and then key in the sequence of 12 bit words hitting deposit address after every word was entered. It was one of the most exhilerating things I'd ever done at that point in my life. If I was right, then the panel lights would flash and the results of the mathematical operation I'd coded would be available in RAM and they were read out by setting the base address and then reading out the contents of all of the variables that were modified. Then I learned to use the PAL assembler which was a multiple pass assembler which required one to read both the assembler and ones paper tape program in multiple times before the binary code was punched out on paper tape. This was what one then loaded into the PDP8 <sorry, caught in a reminiscence loop, but when you remember your first computer far more clearly than your first lay after 40+ years, you must be a hacker>

I guess that's why I like the Propeller so much as it's triggered this gush of computer related memories of a much simpler time in the computational age. That's where I'm coming from and why I hate Java. When I finally got into HLL's, FORTRAN was my language of choice as real men used FORTRAN and just suits used COBOL. Well, actually real men used assembly language but I never did get an IBM 360 assembler program to run although I did generate thousands of pages of memory dumps attempting to get one to run (memo to self - don't start with trying to write re-entrant subroutines on an architecture you are just starting to learn).

Let's get away from this nostalgic haze now. What surprised me with the Prop was how short my routines were and that suddenly 496 instructions and variables was starting to look like a large amount of memory. As I'm writing this, I have a Propeller program which is sampling a 3 axis accelerometer and timing each event to msec precision, is reading out a Polar HR monitor that I'm wearing and is storing the data to a SD card based datalogger. There's an LED that's flashing with my HR and it's going fast now as I'm so excited to finally have an ambulatory physiologic monitor (APM) (which has been a dream of mine for 3 years now) on a Prop project board inside a little Sparkfun components box. It's running off 6 volts of AA batteries and survived a shopping outing earlier today stuffed in my jacket pocket.

The first program that I wrote for the APM was a 1 msec clock (there's something very reassuring about being able to create a precise timebase for a processor architecture and many of the development boards that I've gotten nowhere so far with do have mice millisecond clocks on them). The nice thing about the Prop is that it has very deterministic instruction timing. The vast majority of instructions execute in 4 clocks which, at an 80 MHz clock speed, works out to 50 nsec/instruction. 20 MIPS/cog is rather powerful when I was used to the 0.5 MIPS PDP-11/23 as my previous favorite machine. Thus, when one combines short programs with high clock speeds, one gets damn fast execution times. Also, way more leftover RAM in each cog than I had expected.

The clock cog generates a 1 msec clock which is output on Pin0 and the longword clock resides at hub RAM location $30 (why no-one has ever considered a Propeller page0 is beyond me but that's way beyond the scope of this already meandering text). That way other cogs can grab the timer when they need it. Because of all the room I had left over in the clock cog, I figured I might as well add a real time clock (RTC) to it and the RTC needs to be told if a year is a leap year, but it has a better knowledge of how many days are in a particular month than I do. It keeps excellent time which it should as the Prop system clock is generated by a 5.00 MHz crystal oscillator. Then, when I got the Polar HR receiver and chest strap, I decided why not throw in the HR monitoring chunk of the project into the clock cog?

Just a few more instructions to locate the 15 msec long high-going Polar HR pulse and also time the duration of the pulse so can distinguish noise from actual Polar HR events. The time of the HR event and duration of the pulse are saved to a hub RAM buffer.

Then, found some code on the Propeller object exchange (OBEX) which had assembler code to read out the 3 axis accelerometer which is accessed via SPI. Hacked the code till it did what I wanted and put it in another cog. Brutally chopped up a demo program someone had written to put accelerometer code into a spreadsheet and ran wild with code in the massive 32 Kb memory space which are the realms of Spin and the hub. Most of this code is just endless calls to the serial driver to print out various variables and test out subroutines in cogs and is due for a very serious pruning in the near future.

Then, of course, there is the serial I/O cog which uses some very cool multiprocessing techniques which let it reliably emulate an 115,200 baud modem. When I'm finished interacting with the program, I switch the serial port output to pin4 where it just streams data to the SD datalogger; another really cool Sparkfun device which just lets you throw text at it and it stores it to text files on the SD card.

I'm steeling myself for the next step which is to write the VB6 code needed to analyze the huge amount of accelerometry an HR data I've recorded today. Normally if I was told I could do some VB6 coding rather than some terminally boring activity like dealing with my hospitalist billings, I'd jump at the chance. However, after spending a week in the wilds of Propeller assembly language (PASM), I feel like someone who's just spent the afternoon on Wreck beach playing in mud and forgets to dress or clean up before they go to a black tie cocktail party. Now I face numerous restrictions on what I can and can't do. Don't get me wrong, VB6 is my favorite HLL, but it's just that unique feeling one has from intimate contact with the hardware that's missing. when one moves to a windoze platform from a hardcore minimalist development system. M$ is the new evil empire and IBM is like a now senile dictator who's discovered philanthropy at some point during his decline. (IBM's heavily into open source software whereas if an asteroid hit Redmond, I'd be deleriously happy; gotta teach those alien asteroid tossers better English - "Russia or Redmond - what the hell, they're probably the same")

Well, I'm not writing the analysis code in Spin - that's for damn sure. For an architecture that's totally unique and aimed at the true hacker, Spin is a letdown. It's almost as Spin was thrown in as an afterthought -- "most people can't program in PASM - we need a HLL on this sucker if it's going to go anywhere". Spin is annoying, but I can write Spin programs. The most heinous sin that the authors of Spin committed was omitting a "Go To" from their language. May they burn forever in the fires of hell for such insolence. You DON'T write a HLL language for a microprocessor without a Go To!!!!!

It would get way beyond the point of this essay which has the direction of a drunkards Brownian walk (Disclaimer - I'm celebrating the first successful outing of the APM with a bottle of "Stump Jump" Australian Shiraz, and all those who find this essay in some way offensive should send their complaints to the Osbourne family located somewhere near McLaren Vale in South Australia - its all their fault) if I were to start going on a rant about object oriented languages (OOP). To me OOP is a gimmick and nothing's going to change the fact that there's a 100 fold or greater difference between the average programmer and a great programmer. I've known lots of great programmers and they can code well in any language that suits their purpose. I much prefer the engineering approach to programming which is get the job done and who gives a shit if the code is ugly. The CompSci approach is: "lets make sure that the indentation is perfect and we've fully followed all of the rules setout by whoever happens to be the current guru of programming etiquette at the moment". Often this results in very nice looking code, every comma where it should be, impeccable indentation but, more often than not, doesn't work. If it does work, it uses 100x the resources that the product of a "just get the job done" programmer would have written.

We could go on forever about whether my use of "magic numbers" in code is a valid shortcut or an unpardonable sin that deserves at least a week in the stocks in a public square. For a very slow programmer, portability is a key concern because if they're faced with a new architecture, it's nice if they can just tweak a few variables and have it running on that architecture. For a very fast programmer who's totally grokked the problem they're dealing with and simulates/debugs in wetware, they can effortlessly bang off another version of a program for a new architecture, magic numbers and goto's included, and have it working before the slow programmer has yet to come up with a code indentation scheme.

For a while I was seduced by the "we have RAM to burn" paradigm that made me temporarily part ways with my previous perniciously parsimonious programming style. I've now rerealized that bloatware is evil even if one is running code on a 2.9 GHz i7 processor with 16 Gb of RAM. That doesn't mean that I'm going to worry about wasting the odd byte or two in a data structure (and I no longer have a fixation about data structure sizes absolutely needing to be the smallest power of 2 one can get away with) but I have a very dim view of XML and other "human readable" POS that seem to be the products of psychosis (and that's being charitable). (Sorry, wandered off again from the increasingly hard to follow train of though).

I've been told in the past that I get weird when I program and, if I find that patients are suddenly getting up and unexpectedly leaving the exam room, then there may be a grain of truth to this assertion which was made by a female. One week ago, I gave myself the goal of having a functioning APM before the end of the 7 day period. Considering that I totally crashed and burned when I proposed to do the same project on an ARM based Stelleris system in a Circuit Cellar design contest in 2010 (it came with a fully "modern" ARM IDE trial version which took up hundreds of megabytes on my HDD and I wasn't even able to get it to generate a "Hello world" program despite days of trying), I reassured myself last Friday that the contest deadline is July 31. Then something funny happened; the code just started to flow. One of the best things that I did was to write an assembly language routine for an individual on the Parallax Forums who needed to time pulses. This seemed trivial and then I was somewhat surprised when another commenter (apparently one of the forum's uber hackers) poked numerous holes in my code. My initial reaction was irritation - after all I'd run the code numerous times in my wetware Propeller model and it was ***Perfect***. Then, I reread his comments and had one of those WTF moments where I wondered how I could have missed something so blatently obvious and why did it take someone else to point it out. The commenter on the Prop Forums site was quite right - I had buggy code; just because it worked 99.9% of the time was insufficient. It had to be perfect and I had failed in that task.

So, I went back to the Prop documentation, reread everything, updated my wetware model and found a solution to the problem. I'll post the problem and the solution some other time as this isn't designed to be a technical exposition - instead it's just a stream of consciosness rendering of how neat it feels to have finally gotten an APM together - it may be V0.01 of the device, and it would take a very dedicated volunteer to carry the sucker around for a day, but it works!!! So, I feel like celebrating a bit and have decided to post a blog entry as my cat just went to sleep when I started telling her about the device. Can't post stuff to the Prop forums as, after all, it is a contest that I'm in and even though my chances of winning are slimmer than finding a 16 year old virgin in Kamloops, it's the deadline that makes me stop puttering around and really apply myself. I don't care about the $5 K prize at the end of the contest; I've given up considerably more money than that by chosing to play with my electronic toys rather than practicing medicine 1 week/month. What's important to me is the feeling I get when I come up with an idea and see it realized in physical reality. There's nothing like it. Of course, as expected, all of my code and hardware will be open source as of 31/7/2013. Don't get me started on what I think of software patents.

It's very reassuring to see the flashing red LED and blue SD card LED brightly lit to the left of me which lets me know that data is being streamed to the SD card. What I can't understand is why more people don't program in PASM??? To me it's a no brainer - there's the instruction set, there's a pad of paper - start writing code. I'd say that 95% of the submissions to the Parallax Forums are in Spin which is an ugly language (and I believe I'm mentioned the no GO TO heresy) but it's a pseudo-OOP language. Purists will look down on it as it doesn't meet some arcane criteria that a language must posess before it becomes OOP.

To be fair to the adherents of Spin, it's fast to use if you haven't done assembly language programming in the past. Right now, taking one of the coddled progeny of the modern age and asking them to do assembly language programming would be the equivalent of telling them to fix their vehicle with a screwdriver after it's broken down in July at noon in Death Valley. They haven't a clue where to start or what might possibly be wrong and, a very substantial proportion of the population haven't the foggiest notion of how an internal combustion engine works. As far as they're concerned, it's just magic - press one button and the door unlocks, put the key in the ignition and drive away. Most of the current population can be trained to read a gas gauge and become attentive to that most idiotic piece of automotive engineering ever invented, the "check engine" light. Beyond that, all they know is that Henry Ford was visited by Athena in the middle of the night and, the next morning, aside from a few drying wet spots on the sheets, he had the inspiration for the automobile. Automobiles are produced in the same factory that produces unicorns and free cell phones for the masses.

When you're a relic of the dinosaur age, like I am (and I still remember women in fur bikinis distracting tyranosaurs while the men closed in on them from the sides) you know how to start with a bunch of TTL chips, a soldering iron and a large quantity of wire and build a computer from the ground up. It's simple and very tedious and a few bits into a memory address register you start wondering do I really need that many bytes of RAM. Of course, when I was younger I was told - you kids have it easy, back when I was your age there was no such thing as an IC and if you wanted a logic gate, that was one tube, two for a flip flop and lots of resistors and capacitors. Of course you could use relays if you didn't mind your computer being a bit slower.

So, it's not so much what you start with, but how much you understand the basics. I've built tube logic gates and I've used relay logic. I've also played with an abacus and taken apart mechanical calculators. These are all significant aspects of our civilizations computational history. The key to this is trying to figure out how things work. When I used to watch TV ages ago, my favorite shows were How It's Made and Junkyard Wars. I could watch how things are made forever as the first thing I wonder about when I pick up a manufactured object is exactly that - how was it put together, where did the pieces come from, who makes the various pieces, what ingredients go into it, etc. Junkyard wars is an indictment of our society in which we throw away so many absolutely useful items which can be disassembled to make other new things. To me, a junkyard has always been a treasure trove of neat items and I loved spending time in junkyards when I was a kid (my mother wasn't too happy with this).

So, programming in Spin is the easy way out. It's a no brainer. Find that your program is too slow, launch the Spin interpreter in another cog - after all the Propeller has 8 independent cogs. And, yes, Spin is fast but PASM is much much faster. The minimum delay time in a WaitCnt instruction in Spin is 371 clocks whereas in PASM it's 8 clocks. 371 clocks is a glacial 4.348 microseconds whereas 8 clocks is a more reasonable 100 nanoseconds. Exchange your 5 MHz crystal for a 6.25 MHz crystal on the Prop BOE and you're at 10 nsec/clock. I happen to like doing things fast and why would I accept a language that's 43 times slower than PASM?

Presumably any reader who's stuck it out so far might wonder what the point of this piece is and all I can say is it's in the text. Read it over myself and I like it (but then I'm on glass #5 or so of the Stump Jump wine) so I might have differing opinions tomorrow morning. However, it does encapsulate a lot of the thoughts that have crossed my mind during a week of programming and that's why it's going on the Electronics Blog. Next week is back to meatware hacking which I'm not really looking forward to. Meatware has many more degrees of freedom than hardware and, from a debugging standpoint, is a far far greater intellectual challenge than trying to figure out why a light doesn't come on when, as far a you know, your program should have turned it on. Meatware also responds in unexpected ways to fixed stimulus paradigms and that's what makes it interesting. Now on to some VB6 APM analysis program construction.

Posted by Boris Gimbarzevsky at 11:20 PM
Categories: Embedded systems, Rants