PCS500.exe now runs on X64 systems. See end of article for details. The given solution should work with all Velleman products connected to a parallel port; PCS500.exe serves as the example here. Moreover, it should work with any other I/O software with compiled-in I/O instructions, like avrdude (tested).
The assembly code, translated to C, would look like this:
void readsamples(DWORD p2[4096+4096], WORD lptbase, BYTE inv) {
  DWORD*k0=p2;       // was: ESI, first half of the buffer
  DWORD*k1=p2+4096;  // was: EDI, second half of the buffer
  outportb(lptbase+2,inportb(lptbase+2)|2); // enable 74HC244
  outportb(lptbase+0,0xFF);
  outportb(lptbase+0,0xFB); // Counter Clock
  for (int i=0; i<4096; i++) {
    BYTE b;
    outportb(lptbase+0,0xFF); // open BD2, enable !OE, CH2 upper nibble
    b=inportb(lptbase+1)<<1 & 0xF0; // read high nibble of A/D data
    outportb(lptbase+0,0xFE); // next read delivers the low nibble
    b|=inportb(lptbase+1)>>3 & 0x0F; // read low nibble of A/D data
    *k1++=(BYTE)~b; // save sample (inverted)
    outportb(lptbase+0,0xFD); // other channel: high nibble
    b=inportb(lptbase+1)<<1 & 0xF0;
    outportb(lptbase+0,0xFC); // other channel: low nibble
    b|=inportb(lptbase+1)>>3 & 0x0F;
    *k0++=(BYTE)(~b^inv); // save sample (inverted, then XORed with inv)
    outportb(lptbase+0,0xFB); // another counter clock
  }
  outportb(lptbase+0,0xFF); // Maximum power supply
  outportb(lptbase+2,inportb(lptbase+2)&~2); // disable 74HC244
}
Weird code: it puts the samples into two distinct DWORD arrays instead of leaving them interleaved, which would be compatible with 8-bit stereo sound data.
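As a minimal sketch of the interleaved alternative mentioned above: the nibble arithmetic is taken from the readsamples() listing, while the function names assemble_sample() and store_interleaved() and the channel order (CH1 first) are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Combine two status-port reads into one sample byte, using the same
   arithmetic as the listing: status bits 6..3 of the first read become
   sample bits 7..4, those of the second read become sample bits 3..0.
   The result is inverted and then XORed with inv. */
static uint8_t assemble_sample(uint8_t status_hi, uint8_t status_lo, uint8_t inv)
{
    uint8_t b = (uint8_t)((status_hi << 1) & 0xF0);
    b |= (uint8_t)((status_lo >> 3) & 0x0F);
    return (uint8_t)(~b ^ inv);
}

/* Store sample pair number i interleaved (CH1, CH2, CH1, CH2, ...),
   i.e. in the layout of 8-bit stereo sound data.
   Assumption: CH1 takes the first slot of each pair. */
static void store_interleaved(uint8_t *buf, size_t i, uint8_t ch1, uint8_t ch2)
{
    buf[2 * i]     = ch1;
    buf[2 * i + 1] = ch2;
}
```

With such a layout, a single 8192-byte buffer would hold both channels, and no DWORD widening would be needed.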
The first attempt
To get InpOut32.dll to work, a debugger-like extension has to be introduced that runs the scope software as a debuggee in a sandbox, with hardware breakpoints on the I/O base address. It will be interesting to see whether WoW64 emulates this feature too. USB-based LPT emulators can benefit from knowing this basic routine, because the target buffer for the data is known and can be patched after all 4096x2 samples have been read.
It seems that (at least some) I/O instructions are not implemented as native assembly instructions inside PCS500.EXE but by calling DLPORTIO.DLL’s entries DlPortReadPortUchar() and DlPortWritePortUchar().
Unfortunately, PCS500.EXE tries to open a “\\.\dlportio” device, which fails when using InpOut32.dll. I’m working on a solution to this problem.
It’s correct, DriverLINX is the same as dlportio. It is the Service Name (FriendlyName) for dlportio.sys.
You may crash PCS500.EXE while running by executing this command:
net stop dlportio
In this case, an exception will be thrown, caught, and reported in a custom message box.
The trial debugger software did run but had some disadvantages:
[ul]
[li] Requires PCS500.EXE to be patched so that it “thinks” it is running on Windows 9x/Me[/li]
[li] Performance is incredibly poor[/li][/ul]
Some thoughts on better solutions for the coders
It would be MUCH BETTER if Velleman changed its source code to rely on InpOut32.DLL instead of DlPortIo.dll.
Advantages:
[ul]
[li] Runs in Win64[/li]
[li] No installation and initialization code at all (it’s all “outsourced” to InpOut32.dll)[/li]
[li] No need for LoadLibrary / GetProcAddress[/li][/ul]
Disadvantages:
[ul]
[li] “in” and “out” assembly instructions must be replaced by Inp32() / Out32() function calls[/li]
[li] Somewhat slower data transfer rate[/li][/ul]
The BEST way would be for Velleman to split PCS500.EXE into the core .EXE and a hardware access DLL with the following entry points:
[ul]
[li] SetData(WORD portbase, LONGLONG bits) // fill shift register chain[/li]
[li] GetData(WORD portbase, BYTE*data) // get sample buffer data, 8192 bytes[/li]
[li] GetStatus(WORD portbase) // query status lines to see whether sampling is complete[/li][/ul]
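A rough sketch of how the core .EXE might use such a DLL. Only the three entry point names and their parameters come from the list above; the return types, the meaning of the GetStatus() result (nonzero = sampling complete), the ScopeBackend table, the acquire() helper, and the fake backend are all assumptions for illustration.

```c
#include <stdint.h>

typedef uint16_t WORD;
typedef uint8_t  BYTE;
typedef int64_t  LONGLONG;

#define SAMPLE_BYTES 8192   /* sample buffer size named in the list above */

/* Hypothetical function table for the three proposed entry points. */
typedef struct {
    void (*SetData)(WORD portbase, LONGLONG bits);  /* fill shift register chain */
    void (*GetData)(WORD portbase, BYTE *data);     /* fetch SAMPLE_BYTES bytes */
    int  (*GetStatus)(WORD portbase);               /* assumed: nonzero = done */
} ScopeBackend;

/* Core-side acquisition: program the settings, poll until sampling is
   complete, then fetch the buffer. Returns 0 on success, -1 on timeout. */
static int acquire(const ScopeBackend *hw, WORD portbase, LONGLONG settings,
                   BYTE *out, long max_polls)
{
    hw->SetData(portbase, settings);
    while (max_polls-- > 0) {
        if (hw->GetStatus(portbase)) {
            hw->GetData(portbase, out);
            return 0;
        }
    }
    return -1;
}

/* --- Fake backend, for illustration and testing only --- */
static LONGLONG fake_bits;
static int fake_polls_left = 3;   /* reports "done" on the 4th poll */

static void fake_SetData(WORD portbase, LONGLONG bits)
{
    (void)portbase;
    fake_bits = bits;
}

static void fake_GetData(WORD portbase, BYTE *data)
{
    (void)portbase;
    for (int i = 0; i < SAMPLE_BYTES; i++)
        data[i] = 0x80;           /* constant dummy value */
}

static int fake_GetStatus(WORD portbase)
{
    (void)portbase;
    return fake_polls_left-- <= 0;
}

static const ScopeBackend FAKE_BACKEND = { fake_SetData, fake_GetData, fake_GetStatus };
```

The point of the split is that the core never executes IN or OUT itself, so a backend for InpOut32.dll, for a USB adapter, or for a test fake could be swapped in without touching the .EXE.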
A quite well-working solution, suitable for end users, with no patch at all
I have now found a way to install a process-wide exception hook in front of the stack-based exception handlers: the AddVectoredExceptionHandler() function. This enables handling of exceptions without context switching. However, exceptions are still thrown massively, because IN and OUT instructions have 1-byte opcodes and cannot usefully be patched into something else. A test routine doing 1000000 IN instructions took 7 seconds with exception catching, and more than 30 seconds with context switching (between debugger and debuggee; this fallback is needed ONLY for Win2k, where the AddVectoredExceptionHandler() API does not exist).
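To put these figures into perspective, here is a back-of-the-envelope estimate for one pass of the readsamples() routine shown at the top. It assumes that an OUT costs as much as the measured IN (7 µs and 30 µs per instruction, derived from the 1000000-instruction test) and that the real routine performs exactly the port accesses of the C translation; both are assumptions, and the function names are made up for this sketch.

```c
#include <stdint.h>

enum {
    SAMPLES         = 4096, /* loop iterations in readsamples() */
    IO_PER_SAMPLE   = 9,    /* 5 OUT + 4 IN inside the loop */
    IO_OUTSIDE_LOOP = 7     /* 3 IN/OUT pairs' worth before, 3 accesses after:
                               2+2 before the loop, 1+2 after it */
};

/* Total number of trapped I/O instructions for one buffer read. */
static uint32_t io_count(void)
{
    return (uint32_t)SAMPLES * IO_PER_SAMPLE + IO_OUTSIDE_LOOP;
}

/* Estimated duration of one buffer read, given the cost per trapped
   I/O instruction in microseconds. */
static double buffer_read_seconds(double us_per_io)
{
    return io_count() * us_per_io / 1e6;
}
```

Under these assumptions, one buffer read traps 36871 instructions, which comes to roughly 0.26 s at 7 µs each and at least 1.1 s at 30 µs each.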
So I modified my InpOut32.dll to install exactly this exception handler on startup. Moreover, it contains a rundll32.exe-callable entry point that injects this DLL (i.e. itself) into the process. This trick makes it possible to fake the Windows version without patching the .EXE image. As a net result, PCS500.exe now works on Win64 with acceptable performance, using the following procedure:
[ul]
[li] Download www-user.tu-chemnitz.de/~heha/vi … pout32.dll[/li]
[li] Rename it to DlPortIo.dll and overwrite the same file name in the PCS500.exe directory[/li]
[li] Create a shell link with this command line: “rundll32 DlPortIo,CatchIo PCS500”[/li]
[li] Execute it with elevated privileges, and be happy![/li]
[li] DlPortIo.sys is useless (i.e. it can never be loaded on X64) and can be removed.[/li][/ul]
[color=#4040BF]Under the hood[/color]
[ul]
[li] The rundll32.exe loads the renamed InpOut32.dll and calls the entry point CatchIoW with the command line argument L"PCS500"
In case you have started the 64-bit version of rundll32.exe, loading the DLL fails silently, and that 64-bit process automatically launches the 32-bit version of rundll32.exe, which can then load the 32-bit DLL.[/li]
[li] The CatchIoW entry creates a suspended process using the PCS500.exe image. It is important that the launcher does not act as a debugger, to avoid costly exception routing between debuggee and debugger.[/li]
[li] A remote thread is created that loads the renamed InpOut32.dll into the PCS500.exe process. This is called DLL injection. This injection is the reason why the DLL must be of the same bitness as PCS500.exe, i.e. 32 bit.[/li]
[li] Its initialization routine unpacks, installs, and loads the signed inpoutx64.sys driver. This step requires administrative privileges. As X64 Windows (running in the so-called “long mode”) offers no usable I/O permission map, no clean driver can make the I/O address space transparent to user-mode applications. Therefore, an exception mechanism has to be used; see the next steps. As a side effect, InpOut32.dll can re-route accesses to unusual port addresses and to some USB devices. The necessary initialization and device/address detection is done at DLL startup.[/li]
[li] The initialization routine installs the process-wide exception filter routine.[/li]
[li] Another remote thread is created that redirects the address of the kernel32.dll routine GetVersionEx() to a faking routine, which reports that the process is running on Win9x/Me. This is called API hooking. The reason for that second thread is that API hooking should not occur when the DLL is loaded and used regularly (by LoadLibrary() or static binding).[/li]
[li] The PCS500.exe main thread is released and runs. It “detects” Win9x/Me and does not try to install and load DlPortIo.sys.[/li]
[li] PCS500.exe loads DlPortIo.dll using LoadLibrary(). This call does nothing as this library is already injected.[/li]
[li] PCS500.exe collects all entry points of DlPortIo using GetProcAddress() - for nothing but detecting that there are no NULL pointers. None of these entries are used. All this behaviour comes from the DlPortIo.dpr Delphi unit, which is open-source and obviously used by Velleman without changes.[/li]
[li] Then the communication is carried out by hand-made assembly IN and OUT instructions. Each such instruction raises an exception that goes to kernel mode, I believe via interrupt vector 13 (general protection fault, #GP). The handler there detects that the fault was caused by a privileged instruction and delegates handling to ntdll:KiUserExceptionDispatcher, which in turn executes the installed process-wide exception filter first (before the stack-based handlers using try/except).[/li]
[li] The exception filter (which resides in the renamed InpOut32.dll) in turn detects that the exception comes from an IN or OUT instruction. The faulting instruction is skipped, and the inpoutx64.sys driver is called using DeviceIoControl() to perform the actual operation. For IN instructions, the register AL is modified in the process’ context. Then, the exception handler returns with the information “Do not seek further, the exception is handled”.[/li]
[li] To save time, contiguous series of equal I/O instructions, originally inserted to create some µs delay, are skipped automatically. The exception handling itself consumes several microseconds.[/li]
[li] After this lengthy detour, the next assembly instruction is executed.[/li][/ul]
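The decoding step of such an exception filter can be sketched in portable C. This is not the code inside InpOut32.dll; it is a minimal illustration that assumes 32-bit code, handles only the eight plain IN/OUT opcodes plus an optional 0x66 operand-size prefix, and ignores string I/O (INS/OUTS) and other prefixes. The names IoInsn, decode_io() and merge_in_result() are made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int     is_out;      /* 1 = OUT, 0 = IN */
    int     width;       /* access width in bytes: 1, 2 or 4 */
    int     port_in_dx;  /* 1 = port number in DX, 0 = 8-bit immediate */
    uint8_t imm_port;    /* port number if port_in_dx == 0 */
    size_t  length;      /* instruction length = bytes to skip */
} IoInsn;

/* Decode the instruction at the faulting address.
   Returns 1 and fills *out if it is an IN or OUT, else returns 0. */
static int decode_io(const uint8_t *code, size_t avail, IoInsn *out)
{
    size_t pos = 0;
    int opsize16 = 0;

    if (avail > 0 && code[0] == 0x66) {   /* operand-size prefix */
        opsize16 = 1;
        pos = 1;
    }
    if (pos >= avail)
        return 0;

    uint8_t op = code[pos++];
    /* E4..E7 = IN/OUT with imm8 port, EC..EF = IN/OUT with port in DX */
    if ((op & 0xF4) != 0xE4)
        return 0;

    out->is_out     = (op >> 1) & 1;
    out->width      = (op & 1) ? (opsize16 ? 2 : 4) : 1;
    out->port_in_dx = (op >> 3) & 1;
    out->imm_port   = 0;
    if (!out->port_in_dx) {
        if (pos >= avail)
            return 0;
        out->imm_port = code[pos++];
    }
    out->length = pos;
    return 1;
}

/* For IN: merge the value delivered by the driver into EAX,
   leaving the untouched upper bits as they were. */
static uint32_t merge_in_result(uint32_t eax, uint32_t value, int width)
{
    switch (width) {
    case 1:  return (eax & 0xFFFFFF00u) | (value & 0xFFu);
    case 2:  return (eax & 0xFFFF0000u) | (value & 0xFFFFu);
    default: return value;
    }
}
```

In a real vectored handler, the code bytes would presumably be read at the Eip of the exception's context record, Eip would be advanced by length, and the handler would return EXCEPTION_CONTINUE_EXECUTION; anything that does not decode as IN or OUT would be passed on with EXCEPTION_CONTINUE_SEARCH.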
[color=#404000]Can dlportio.sys ever be made compatible with X64?[/color]
This is not entirely clear. The AMD64 architecture supports an I/O permission map in the TSS (Task State Segment), compatible with the IA-32 architecture. While it is possible for a kernel-mode driver to access and patch this table, there are some annoyances:
[ul]
[li] There is no API to access the I/O map[/li]
[li] The map is no longer task-specific. (This AFAIK applied only to DOS boxes and Win16 emulation in 32-bit Windows versions.)[/li]
[li] The I/O permission map is AFAIK set to zero length; therefore, space must be allocated and the TSS must be moved somewhere, which may disturb Windows heavily[/li]
[li] A kernel code guard watches for changes to critical memory areas such as the IDT (interrupt descriptor table). If it detects a change, a bluescreen will end your Windows session some seconds or minutes later.[/li]
[li] There are tools to defeat the guard, and Windows updates that defeat the tools. In the end, Microsoft will win.[/li][/ul]
In conclusion: theoretically yes, practically experimental at best, and the solution above is much safer.
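For readers unfamiliar with the I/O permission map, here is a simplified model of the check the processor performs for user-mode IN/OUT when the I/O privilege level does not allow direct access: one bit per port, 0 = allowed, 1 = trap, and every port beyond the end of the map is denied. The function names are hypothetical, and details such as the terminating 0xFF byte are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if an access of 'width' bytes starting at 'port' is allowed.
   All bits covered by the access must be 0. A map that is too short
   (including a zero-length map) denies the access. */
static int iopb_allows(const uint8_t *bitmap, size_t bitmap_bytes,
                       uint16_t port, unsigned width)
{
    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;
        size_t byte_index = p >> 3;
        if (byte_index >= bitmap_bytes)
            return 0;
        if (bitmap[byte_index] & (1u << (p & 7)))
            return 0;
    }
    return 1;
}

/* Clear the bits for 'count' consecutive ports starting at 'first'.
   This models what a driver like dlportio.sys would have to do.
   Ports beyond the end of the map are silently skipped. */
static void iopb_grant(uint8_t *bitmap, size_t bitmap_bytes,
                       uint16_t first, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        uint32_t p = (uint32_t)first + i;
        size_t byte_index = p >> 3;
        if (byte_index < bitmap_bytes)
            bitmap[byte_index] &= (uint8_t)~(1u << (p & 7));
    }
}
```

The zero-length case is the one that matters here: with no map at all, every IN and OUT traps, which is exactly what the exception-based solution above relies on.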
[color=#804000]What about USB adapters?[/color]
Unfortunately, some disassembly of PCS500.exe revealed that the assembly routines are implemented multiple times and are somewhat more complicated than explained at the top of the article. Therefore, emulating these routines in USB devices is almost impossible, and it is much better to write tailored (i.e. new) scope software for the PCS500 device than to fiddle with Velleman’s executable.
I wrote suitable universal scope software some years ago, see http://www.tu-chemnitz.de/~heha/hs/. It features a standard plugin for sound cards, a scalable screen, and in-screen trace scaling/shifting operations without control knobs that consume valuable screen space. But to write a suitable plugin, I need a real PCS500 device lent by someone.
Can someone post performance reports? I have no PCS500 hardware to check.
henni, 130902 - 131209