Friday, March 08, 2013
Windoze sucks bigtime
I promise this is my last entry on my Geiger counter (GC) explorations for the indefinite future. The most important point in this post is that windoze sucks bigtime. Now I don't know if the problems I've unearthed belong to the windoze serial driver or MSComm VB6 control, but they raise some very serious issues with using windoze for any form of data acquisition.
M$ doesn't advertise windoze as a real-time OS, and the 10 msec quantum used by WinXP certainly relegates it to non-realtime status. What I naively assumed was that on a WinXP system with a 10 msec quantum and most threads idle, I'd be able to get 20 msec temporal precision. The windoze scheduling algorithms are fairly efficient: they run through the list of runnable threads in priority order and, if a thread has nothing to do, it immediately relinquishes its quantum. Thus one would expect a process assigned real-time priority to be able to time events with a precision of 1-2 quanta.
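One rough way to probe an OS's effective timer granularity is to request a short sleep over and over and see how far the actual suspension overshoots the request. A quick sketch (in Python rather than VB6; the function name and parameters are my own invention):

```python
import time

def median_sleep_overshoot(n=200, request_s=0.001):
    """Request a 1 msec sleep repeatedly and record how much longer the
    OS actually suspends us; the excess reflects the scheduler's
    effective timer granularity."""
    overshoots = []
    for _ in range(n):
        t0 = time.perf_counter()
        time.sleep(request_s)
        overshoots.append(time.perf_counter() - t0 - request_s)
    overshoots.sort()
    return overshoots[n // 2]  # median overshoot, in seconds

overshoot = median_sleep_overshoot()
print(f"median sleep overshoot: {overshoot * 1e3:.3f} msec")
```

On a quantum-based scheduler one would expect the overshoot to sit near the quantum length rather than near zero.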
Windoze does have near-realtime aspects, as when one samples the sound card or video card; if it didn't, one would get distorted sound or video. It may be that this is a driver issue and that M$ assumed all one needs for serial data is a large enough buffer, and who cares about the temporal spacing of serial events. I don't know if this is the case, since windoze is a closed-source system and the serial driver is a black box. In such a setting I can't tell whether the driver black box is defective or the OS black box is the culprit; it may even be that the driver is fine and the problem lies in the MSComm control black box. That's what happens when one deals with a closed OS -- the source of the problems is not at all obvious, and one can't simply peruse the source code to find and fix bugs like this. That's why I'm switching to Linux.
In the Propeller pulse timing routine that I discussed in a previous blog posting, I had deterministic timing of all GC events as well as measuring the duration of the GC output pulse. The resolution of this timing was 2 microseconds; if windoze allowed one the freedom to write interface code that I have on the Propeller chip, that temporal resolution would be 100 nsec or less, since my laptop has a 1.6 GHz Pentium processor. As it turns out, my estimate of 20 msec temporal precision for windoze was wildly optimistic.
The Propeller program that I wrote grabbed the 2 microsecond clock time and width of each GC pulse and sent them via a serial line to the Teraterm program, which logged them to a disk file. As expected, when I plotted a histogram of the GC inter-event intervals at a 50 msec bin width, the relative scatter around the best-fit exponential decreased steadily as 1/sqrt(# of intervals). When one samples the same GC under windoze, however, something totally different happens. The histogram contains regularly spaced peaks which don't decrease as one accumulates more and more intervals. These peaks are quite periodic, with periods ranging from 200 to 400 msec.
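That shrinking scatter is just Poisson counting statistics, and it's easy to check with a simulation (a sketch assuming a Poisson source; the 2 Hz rate and the function names are my own):

```python
import math
import random

def interval_histogram(rate_hz, n_events, bin_s=0.05, n_bins=40, seed=1):
    """Draw exponential inter-event intervals (a Poisson source) and
    bin them at 50 msec, as in the Propeller-logged histograms."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_events):
        dt = rng.expovariate(rate_hz)
        b = int(dt / bin_s)
        if b < n_bins:
            counts[b] += 1
    return counts

def rms_residual(counts, rate_hz, n_events, bin_s=0.05):
    """RMS deviation of the binned counts from the ideal exponential."""
    resid = 0.0
    for i, c in enumerate(counts):
        # probability of an interval falling in bin i for an exponential law
        p = math.exp(-rate_hz * i * bin_s) - math.exp(-rate_hz * (i + 1) * bin_s)
        resid += (c - n_events * p) ** 2
    return math.sqrt(resid / len(counts))

for n in (1_000, 100_000):
    h = interval_histogram(rate_hz=2.0, n_events=n)
    print(n, rms_residual(h, 2.0, n) / n)  # relative scatter shrinks ~ 1/sqrt(n)
```

With 100x more intervals, the relative scatter should drop by roughly a factor of 10 -- which is what the Propeller data did and the windoze data did not.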
The output of the Sparkfun GC is a single character for each event: either a 0 or a 1, depending on whether the current inter-event interval was less than or greater than the previous one. I don't know the exact figure, but there is a very short latency between a sufficiently energetic photon hitting the GM tube and the Sparkfun GC outputting a serial byte. This latency can be easily computed by perusing the GC MCU source code, which Sparkfun freely supplies. My VB6 program to time GC events sets the MSComm buffer size to 1, which means that an event is generated as soon as a single character is received on the serial input. This event is used to grab the time on the windoze msec timer, and the event data is then written to a file. The VB6 portion of the code executes in < 1 microsecond.
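That one-event-per-byte scheme translates directly to Linux. A sketch in Python (the function name is my own; with real hardware the stream would be the serial device opened unbuffered):

```python
import io
import time

def timestamp_bytes(stream, clock=time.monotonic):
    """Read one byte at a time and pair each byte with the clock time at
    which the read returned -- the same scheme as MSComm with a receive
    threshold of 1: one event, and one timestamp, per byte."""
    events = []
    while True:
        b = stream.read(1)
        if not b:
            break
        events.append((clock(), b))
    return events

# with real hardware this would be, e.g., open("/dev/ttyUSB0", "rb", buffering=0)
log = timestamp_bytes(io.BytesIO(b"0110"))
print([ch for _, ch in log])
```

Of course, the timestamp is only as honest as the OS's delivery of the byte -- which is exactly where windoze falls down.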
I don't know how windoze prioritizes drivers, but it seems that the USB serial driver is of very little importance to M$. The only way these 200-400 msec spaced peaks can be produced is if the serial driver is ignored for such long periods of time. Presumably the wizards of Redmond assumed that a 4096 byte buffer meant that ignoring the serial driver for 400 msec while data accumulated in the buffer was a perfectly acceptable thing to do. They likely never envisioned that anyone would want to precisely measure the exact time at which a particular byte was received. So, in this particular application, windoze is completely defective. The peaks in the inter-event histogram suggest bunching of bytes in the buffer, which are then released as a group when windoze decides it's time to deal with obsolete serial connections. I haven't run into anything about the serial driver in Mark Russinovich's very detailed reverse engineering of windoze, but W7 has the same problem as WinXP. Only Win 3.1 would be able to deliver bytes precisely as they arrive at the serial driver.
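To convince myself that buffer bunching really produces peaks like these, one can simulate a driver that only delivers bytes on a periodic flush (a sketch; the 300 msec flush period, 2 Hz event rate, and names are assumptions of mine):

```python
import random

def batched_delivery(n_events, rate_hz=2.0, flush_s=0.3, seed=7):
    """Simulate a driver that timestamps bytes only when it flushes its
    buffer every flush_s seconds: true exponential arrival times get
    rounded up to the next flush, so the measured inter-event intervals
    collapse onto multiples of flush_s."""
    rng = random.Random(seed)
    t = 0.0
    seen = []
    for _ in range(n_events):
        t += rng.expovariate(rate_hz)
        # the byte is only "seen" at the next buffer flush
        flush = flush_s * (int(t / flush_s) + 1)
        seen.append(flush)
    return [b - a for a, b in zip(seen, seen[1:])]

intervals = batched_delivery(10_000)
# every measured interval is now an exact multiple of the flush period
multiples = {round(dt / 0.3) for dt in intervals}
print(sorted(multiples)[:5])
```

A histogram of these intervals shows exactly the kind of periodic spikes I saw, instead of a smooth exponential.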
So, the only conclusion one can draw is that, if one wants to do precise timing, stay as far away from windoze as possible (there's good reason to do so on ideological grounds as well). Most of my data acquisition applications involve timing events to sub-millisecond precision. My Commodore 64 was more than capable of doing so, as was the PDP-11. However, a 3 GHz superscalar processor, when infected with windoze, behaves more like an ancient mechanical calculator than a machine capable of timing events with microsecond precision. Thus, my transition from windoze to Linux is in progress. Linux is not a real-time OS either, but it's possible to launch Linux as a process under a real-time kernel which can exploit the full speed of a Pentium processor. Also, Linux is open source, so if one doesn't like the way something works, it can be changed. If one has to work with windoze, then the only option is to do data acquisition with hard real-time MCUs such as the Propeller chip, which can then send data into the senile windoze OS to be handled in its doddering fashion.
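For the record, the clocks themselves are not the problem: even from user space on a stock (non-realtime) Linux kernel, the high-resolution counter steps at well under a millisecond. A quick check (the function name is mine):

```python
import time

def clock_granularity(samples=100_000):
    """Smallest observed positive step of the high-resolution counter;
    an upper bound on the clock's usable resolution."""
    best = float("inf")
    last = time.perf_counter()
    for _ in range(samples):
        now = time.perf_counter()
        if now > last:
            best = min(best, now - last)
        last = now
    return best

print(f"clock step: {clock_granularity() * 1e9:.0f} nsec")
```

The precision is there in the hardware and the clock API; it's the path from the serial port to the application that squanders it.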
Sorry, no nice graphs in this post as all the data from GC2 is being acquired on my 64 bit ASUS laptop which has an 1820x1024 screen resolution and I'm too lazy to copy the graphs over to my HP tablet PC and render them in a smaller size for posting to this blog entry.
While on the subject of Geiger Counters, the new version of the Sparkfun GC doesn't suffer from the excess of short intervals which plagued the first version of their GC. This redesign was in response to a negative user comment on the flakiness of the initial GC high voltage supply. Clearly, if the short interval excess was a power supply problem, it has been fixed on the current Sparkfun GC. And thus endeth the GC topics (unless of course I detect a supernova in my basement).