Bitmap methods for filing
decompress: bm fromByteArray: ba at: index
    "Decompress the body of a byteArray encoded by compressToByteArray (qv)...
    The format is simply a sequence of run-coded pairs, {N D}*.
        N is a run-length * 4 + data code.
        D, the data, depends on the data code...
            0   skip N words, D is absent
                (could be used to skip from one raster line to the next)
            1   N words with all 4 bytes = D (1 byte)
            2   N words all = D (4 bytes)
            3   N words follow in D (4N bytes)
        S and N are encoded as follows (see decodeIntFrom:)...
            0-223     0-223
            224-254   (0-30)*256 + next byte (0-7935)
            255       next 4 bytes"
    "NOTE: If fed with garbage, this routine could read past the end of ba, but it should fail before writing past the end of bm."
    | i code n anInt data end k pastEnd |
    <primitive: 'primitiveDecompressFromByteArray' module: 'MiscPrimitivePlugin'>
    <var: #bm declareC: 'int *bm'>
    <var: #ba declareC: 'unsigned char *ba'>
    i := index. "byteArray read index"
    end := ba size.
    k := 1. "bitmap write index"
    pastEnd := bm size + 1.
    [i <= end] whileTrue:
        ["Decode next run start N"
        anInt := ba at: i. i := i+1.
        anInt <= 223 ifFalse:
            [anInt <= 254
                ifTrue: [anInt := (anInt-224)*256 + (ba at: i). i := i+1]
                ifFalse: [anInt := 0.
                    1 to: 4 do: [:j | anInt := (anInt bitShift: 8) + (ba at: i). i := i+1]]].
        n := anInt >> 2.
        (k + n) > pastEnd ifTrue: [^ self primitiveFail].
        code := anInt bitAnd: 3.
        code = 0 ifTrue: ["skip"].
        code = 1 ifTrue: ["n consecutive words of 4 bytes = the following byte"
            data := ba at: i. i := i+1.
            data := data bitOr: (data bitShift: 8).
            data := data bitOr: (data bitShift: 16).
            1 to: n do: [:j | bm at: k put: data. k := k+1]].
        code = 2 ifTrue: ["n consecutive words = 4 following bytes"
            data := 0.
            1 to: 4 do: [:j | data := (data bitShift: 8) bitOr: (ba at: i). i := i+1].
            1 to: n do: [:j | bm at: k put: data. k := k+1]].
        code = 3 ifTrue: ["n consecutive words from the data..."
            1 to: n do:
                [:m | data := 0.
                1 to: 4 do: [:j | data := (data bitShift: 8) bitOr: (ba at: i). i := i+1].
                bm at: k put: data. k := k+1]]]
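The integer encoding described in the method comment can be tried out from a workspace. This is only a small sketch: it assumes a Squeak-style image in which Bitmap class>>decodeIntFrom: reads from a stream exactly as that comment describes, and the byte values are made up for illustration.

```smalltalk
"Hypothetical byte sequences; evaluate each line with 'print it'."
Bitmap decodeIntFrom: (ReadStream on: #[100]).          "0-223: the byte is the value itself"
Bitmap decodeIntFrom: (ReadStream on: #[225 4]).        "224-254: (225 - 224) * 256 + 4"
Bitmap decodeIntFrom: (ReadStream on: #[255 0 1 0 0]).  "255: value is the next 4 bytes, most significant first"

"A run header N packs run length and data code: 16r27 = 39 = 9 * 4 + 3."
16r27 >> 2.        "run length"
16r27 bitAnd: 3.   "data code"
```

Under those assumptions the three decodes give 100, 260 and 65536, and the header 16r27 splits into a run length of 9 with data code 3 (9 words follow literally).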
From my VM simulator I printed out the ByteArray that was being decompressed incorrectly by the jitted code generated by my new code generator (the new code was invoking primitiveFail in the above). So now I wanted to trace through execution of the above in the debugger so I could see what the values of local variables should be in a correct execution to identify the location of the error as opposed to the location of its symptom. So I naïvely debug evaluated the following:
Bitmap decompressFromByteArray: #[ 16r10 16r27 16rC0 16r0 16r0 16r0 16rE0 16r0 16r0 16r0 16rF0 16r0 16r0 16r0 16rF8 16r0 16r0 16r0 16rFC 16r0 16r0 16r0 16rFE 16r0 16r0 16r0 16rFF 16r0 16r0 16r0 16rFF 16r80 16r0 16r0 16rFF 16r0 16r0 16r0 16rA 16rFE 16r0 16r0 16r0 16rB 16rCF 16r0 16r0 16r0 16rF 16r0 16r0 16r0 16rA 16r7 16r80 16r0 16r0 16r7 16r3 16r0 16r0 16r0]
and stepped into the activation of the following method:
Bitmap class methods for instance creation
decompressFromByteArray: byteArray
    | s bitmap size |
    s := ReadStream on: byteArray.
    size := self decodeIntFrom: s.
    bitmap := self new: size.
    bitmap decompress: bitmap fromByteArray: byteArray at: s position+1.
    ^ bitmap
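As an aside, the run headers of the byte array being debugged can be listed without decompressing anything, which gives a reference for what n and code should be on each pass through the loop. The following is a workspace sketch rather than part of the original session: it assumes Bitmap class>>decodeIntFrom: is available as used in the method above, and the temporary names are chosen here for illustration.

```smalltalk
| bytes s size total |
bytes := #[ 16r10 16r27 16rC0 16r0 16r0 16r0 16rE0 16r0 16r0 16r0 16rF0 16r0 16r0 16r0
	16rF8 16r0 16r0 16r0 16rFC 16r0 16r0 16r0 16rFE 16r0 16r0 16r0 16rFF 16r0 16r0 16r0
	16rFF 16r80 16r0 16r0 16rFF 16r0 16r0 16r0 16rA 16rFE 16r0 16r0 16r0 16rB 16rCF 16r0
	16r0 16r0 16rF 16r0 16r0 16r0 16rA 16r7 16r80 16r0 16r0 16r7 16r3 16r0 16r0 16r0].
s := ReadStream on: bytes.
size := Bitmap decodeIntFrom: s.	"the leading integer is the bitmap size in words"
total := 0.
Transcript show: 'size = ', size printString; cr.
[s atEnd] whileFalse:
	[| anInt n code dataBytes |
	anInt := Bitmap decodeIntFrom: s.
	n := anInt >> 2.			"run length in words"
	code := anInt bitAnd: 3.	"data code, 0 to 3"
	"Bytes of data D that follow this header: 0, 1, 4 or 4*n."
	dataBytes := #(0 1 4) at: code + 1 ifAbsent: [4 * n].
	s skip: dataBytes.
	total := total + n.
	Transcript show: 'n = ', n printString, ' code = ', code printString; cr].
Transcript show: 'total words = ', total printString; cr.
```

Working through the bytes by hand, the size is 16 and the runs are (n 9, code 3), (2, 2), (2, 3), (2, 2) and (1, 3), which also total 16 words, so a correct execution should never reach primitiveFail on this input.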
So far so good. But when I tried to step into the Bitmap>>#decompress:fromByteArray:at: method the debugger of course evaluated the primitive that is in the VM, primitiveDecompressFromByteArray (which just happens to get generated from the above by the VMMaker, but that’s a different story). What the debugger didn’t do, quite correctly, is evaluate the non-primitive method, starting at i := index. "byteArray read index", the actual Smalltalk code whose execution I wanted to observe.
This is where things turn awesome. Since Smalltalk has first-class activation records I can actually create an activation of the above method poised to start execution at the first bytecode, hence after the evaluation of the primitive. Here’s how. In the debugger window I evaluated
thisContext swapSender: (MethodContext
    sender: ThisContext
    receiver: bitmap
    method: (Bitmap>>#decompress:fromByteArray:at:)
    arguments: {bitmap. byteArray. s position + 1 }).
self halt
What’s going on here? First, the debugger allows me to access the local variables of the activation I’m debugging, from within a script evaluated in the debugger (neat!). If you look at the script in the debugger you see it translated into:
DoItIn: ThisContext
    thisContext swapSender: (MethodContext
        sender: ThisContext
        receiver: (ThisContext namedTempAt: 3)
        method: Bitmap >> #decompress:fromByteArray:at:
        arguments: {ThisContext namedTempAt: 3. ThisContext namedTempAt: 1. (ThisContext namedTempAt: 2) position + 1}).
    ^ self halt
But the clever bit is what the script does. ThisContext with capitals is the debugger’s name for the activation of decompressFromByteArray: I had stepped into; it’s simply the argument to the DoItIn: method created to run my script. thisContext without capitals is the name for the current activation (any activation) and so refers to the activation of DoItIn:. The (MethodContext sender:…) expression creates an activation record on the method (Bitmap>>#decompress:fromByteArray:at:), and initializes the activation at the start of the bytecodes for the method, effectively after the primitive it contains. The thisContext swapSender: … expression substitutes this activation as the one to return to from thisContext. That is, my script "thisContext swapSender: (MethodContext…" etc. will now return into the activation of (Bitmap>>#decompress:fromByteArray:at:) I just created, instead of back to the compiler that compiled and evaluated the script. The "self halt." brings up the debugger. So now I simply step out of the halt, into my script, stepping until it returns into the method I wanted to evaluate. Awesome!
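The claim that a freshly created activation is poised at the method's first bytecode can be checked in isolation. This is a hedged workspace sketch, assuming an image of the same vintage as this post, where the class is still called MethodContext and the primitive lives in the method header rather than in a bytecode; the arguments are placeholders, because the context is only inspected and never run.

```smalltalk
| method bitmap ctx |
method := Bitmap >> #decompress:fromByteArray:at:.
bitmap := Bitmap new: 16.
"Placeholder arguments: this context is never resumed."
ctx := MethodContext
	sender: nil
	receiver: bitmap
	method: method
	arguments: {bitmap. #[16r10]. 2}.
Transcript
	show: 'receiver is the bitmap: ', (ctx receiver == bitmap) printString; cr;
	show: 'pc is the initial pc: ', (ctx pc = method initialPC) printString; cr.
```

In such an image one would expect both tests to answer true. Resuming a context in that state starts interpreting the Smalltalk body and never attempts the primitive, which is exactly what the swapSender: trick relies on.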
Carl Gundel | 01-Dec-10 at 12:52 pm | Permalink
Can you do this in VisualWorks, or just Squeak/Pharo?
Eliot Miranda | 01-Dec-10 at 12:58 pm | Permalink
Absolutely. You can do it in any Smalltalk that supports contexts. In VW it is slightly different because primitives actually have a bytecode, so (IIRC) a primitive invocation looks like
0: call prim #N
3: primFail
4: first non-primitive bytecode
Hence you may have to set the initial PC differently on creating the activation to skip over the primitive call; I’m not sure. But the principle is the same.
Carl Gundel | 01-Dec-10 at 1:34 pm | Permalink
Thanks. BTW, is your presentation at Smalltalks 2010 going to be generally available? Was it recorded? 🙂
Eliot Miranda | 01-Dec-10 at 1:35 pm | Permalink
Apparently all talks were recorded and will be coming on line soon.
David | 07-Dec-10 at 6:12 pm | Permalink
Good to see you’re still alive and kicking, Eliot. You hadn’t posted in so long that we were beginning to wonder … 😉
Mana | 30-Nov-15 at 3:31 pm | Permalink
This is a great tip. But I was caught more by the PSD parser you mentioned. I wish I was more proficient with byte arrays. I’ve been looking for a starting point for writing/reading similar data structures. Specifically, compressing files and folders to simulate a file type. Do you know of any libraries or resources for writing and extracting compressed files? Thanks again
admin | 30-Nov-15 at 4:20 pm | Permalink
Hi Mana, take a look at InflateStream, DeflateStream and subclasses. These provide convenient wrappers for dealing with zip, gzip and zlib compressed data. HTH