This Week’s Program: Jan 8 - Jan 12
This week, I was working in the intersection of three areas:
- GStreamer (through Racket FFI bindings)
- Cocoa internals and the implementation of
I realized that I am probably the only person in the world working in that space.
I also expanded my toolkit a bit, finally reaching for lldb. Admittedly, I had little success debugging this situation, but I feel like it’s a good tool to have close at hand.
The two challenges I was working through this week both involve GStreamer and Cocoa, and how Racket does some things to subvert or avoid the usual Cocoa routines.
Here’s the first challenge, illustrated with a basic Overscan program:
#lang overscan
(broadcast (videotestsrc)
           (element-factory%-make "osxvideosink"))
This program connects a test video source to an osxvideosink, a GStreamer element that will draw a window preview of the video in macOS. This program, when run through the racket command line tool or REPL, works just fine.
The snag is the same program, but with racket/gui/base required:
#lang overscan
(require racket/gui/base)
(broadcast (videotestsrc)
           (element-factory%-make "osxvideosink"))
This program crashes. Why does it crash? I pored over several bits of code. The code for the osxvideosink itself:
Something I learned through a fair bit of random googling is that
some programs are very persnickety about running on the “main thread”
or “main runloop” of an application. The osxvideosink is one such
program. There’s a function that checks whether or not the program is
running in the main
This function runs a line of code to do a quick check to see if the main runloop is running:
[[NSRunLoop mainRunLoop] currentMode]
Through some experiments with my repl and
this appears to be false.
I start up lldb and, in another shell, start up the Racket repl. I attach lldb to the repl with process attach --pid (with continue to ensure that control returns to the repl). I run the program and it crashes, as expected. Now within lldb I can type thread backtrace to see where this failed. Something in the Cocoa runloop is calling back into Racket and finding a big fat NULL.
At this point, I decide to dig into some Racket internals.
This is the code that implements the Racket GUI in Cocoa. I don’t understand most of it, but what I am able to glean is that Racket doesn’t create a Cocoa application in the traditional sense. It instead kind of, sort of fakes an application using some runtime tricks.
So what looks like is happening is that the osxvideosink, unable to detect that it is running in the main runloop, tries to start its own NSApplication, and something goes very wrong when the two collide. Interesting and intellectually stimulating, with very little I can do about it.
Challenge number two involves this program:
#lang overscan
(require racket/gui/base)
(broadcast (videotestsrc)
           (element-factory%-make "glimagesink"))
Very similar to the previous example, this time with a glimagesink element. The program runs, opens a window, and I can see the first frame of the test video signal. But the window does not continue to animate; it’s just a frozen frame.
I run two different shells with some GStreamer logging turned on. The first like so:
GST_DEBUG='gl*:5' gst-launch-1.0 -e videotestsrc ! glimagesink
This turns on logging for every element that begins with “gl” to the
DEBUG level, and starts up
gst-launch — a command line tool for
testing out GStreamer pipelines. This time, the glimagesink performs
as expected. In the second shell I run this command:
GST_DEBUG='gl*:5' racket -I overscan
This sets up logging the same way, but starts up an Overscan repl
where I run the above program. I’m looking closely at the output and
notice this very small distinction between the two. The working
gst-launch log has a block like this each time a frame renders:
glcontext gstglcontext.c:749:gst_gl_context_activate:<glcontextcocoa0> activate:1
glcontext gstglcontext.c:749:gst_gl_context_activate:<glcontextcocoa0> activate:1
glcontext gstglcontext.c:749:gst_gl_context_activate:<glcontextcocoa0> activate:1
glcontext gstglcontext.c:749:gst_gl_context_activate:<glwrappedcontext0> activate:1
glcontext gstglcontext.c:749:gst_gl_context_activate:<glwrappedcontext0> activate:0
glcontext gstglcontext.c:749:gst_gl_context_activate:<glcontextcocoa0> activate:1
glcontext gstglcontext.c:749:gst_gl_context_activate:<glcontextcocoa0> activate:1
glcontext gstglcontext.c:749:gst_gl_context_activate:<glcontextcocoa0> activate:1
glcolorbalance gstglcolorbalance.c:231:gst_gl_color_balance_before_transform:<glcolorbalance0> sync to 0:00:00.100000000
There are several calls to activate on these glcontexts. When I look at the logs from the Racket run, the two outliers, the activate calls on glwrappedcontext, aren’t there.
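Comparing the two logs by eye is error-prone, so here is a minimal sketch of how the activate calls could be tallied per GL context instead. It assumes each run's debug output was saved to a file with color codes disabled (GStreamer's GST_DEBUG_FILE and GST_DEBUG_NO_COLOR environment variables). The function name count_activations and the log file names are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Capture each run to its own file first, for example (file names are made up):
#   GST_DEBUG_NO_COLOR=1 GST_DEBUG_FILE=launch.log GST_DEBUG='gl*:5' \
#       gst-launch-1.0 -e videotestsrc ! glimagesink
#   GST_DEBUG_NO_COLOR=1 GST_DEBUG_FILE=racket.log GST_DEBUG='gl*:5' \
#       racket -I overscan

# count_activations LOGFILE
# Prints one "<context-name> <count>" line for every GL context object
# that appears in a gst_gl_context_activate log entry.
count_activations() {
    # Pull the object name out of "gst_gl_context_activate:<name>",
    # then count how often each name occurs.
    sed -n 's/.*gst_gl_context_activate:<\([^>]*\)>.*/\1/p' "$1" \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

Running count_activations on both files and comparing the output should make a context that never activates in one of the runs stand out, since it will simply be missing from that run's tally.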
Then, I repeat this with a different debugging configuration: GST_DEBUG='glcaopengllayer:7'. This sets up logging for the underlying Core Animation layer up to the TRACE level (level 7; LOG is level 6). On the working version I see this line logged for each frame:
-[GstGLCAOpenGLLayer drawInCGLContext:pixelFormat:forLayerTime:displayTime:]: CAOpenGLLayer drawing with cgl context 0x7fcecd063200
In the Racket version, this line is only displayed once! Digging deeper into the code, I think I’m facing an issue similar to the one with osxvideosink: something about this drawing operation is happening in a thread that is assumed to be running, but because of the Racket GUI implementation details, might not be!
Finally, I turn to the Racket community with a post on the mailing list: https://groups.google.com/d/msg/racket-users/wH_OU3haWgk/MDeRcnuNAAAJ
And who better to answer my incredibly esoteric question than Professor Matthew Flatt, one of the core members of the Racket team? He gives me this handy little proc and expands my mind in the process. Running broadcast within this proc prevents the crash from happening, because I’m instructing Cocoa to run my broadcast function from the main runloop!
Now, my problem with
glimagesink still persists, but I’ve narrowed
down some of this odd behavior
to this Cocoa function:
I think banging my head into this particular brick wall might actually get me closer to my goal of a working cross-platform(ish) solution.