Nov 8, 2012 at 6:28 PM
Edited Nov 8, 2012 at 11:20 PM
I'm also interested in using SharpMod in a Unity project. Unity recently added support for playing MOD/XM files natively, but the project I'm working on requires interaction with (and run-time composition of) chiptunes - SharpMod seems like a perfect fit!
Unfortunately, Unity doesn't support audio engines like NAudio; instead, it exposes a callback (OnAudioFilterRead) that gets called every time a chunk of audio is routed through the main filter. It fires roughly every 20ms, depending on the sample rate and platform. The buffer passed in is an array of floats, consistently sized to 2048 elements; I'm also given the number of channels.
Because it expects interleaved stereo data, I write 1024 frames (2048 floats), each sample in the range [-1.0f, 1.0f], every time the callback fires.
Because the callback runs on a separate thread, I avoid retrieving bytes from the SharpMod player during the callback. Instead, I keep a ring buffer with separate read and write pointers: I write to it during the script's normal execution and read from it in the OnAudioFilterRead callback.
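Roughly, the arrangement looks like this (just a sketch: the class and field names are placeholders of my own, and only OnAudioFilterRead's name and signature come from Unity):

```csharp
using System;

// Sketch: the main thread writes decoded samples into a ring buffer,
// and Unity's audio thread drains it in OnAudioFilterRead.
// Everything except the callback signature is a placeholder of my own.
class RingFeed
{
    readonly float[] ring = new float[16384]; // interleaved stereo samples
    int writePtr, readPtr;

    // Called during the script's normal execution on the main thread.
    public void Write(float sample)
    {
        ring[writePtr] = sample;
        writePtr = (writePtr + 1) % ring.Length;
    }

    // Called by Unity on the audio thread; data is typically 2048 floats
    // (1024 frames at 2 channels), each in [-1.0f, 1.0f].
    public void OnAudioFilterRead(float[] data, int channels)
    {
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = ring[readPtr];
            readPtr = (readPtr + 1) % ring.Length;
        }
    }
}
```

A real version would also track how many samples are actually available, so the reader can't overtake the writer and replay stale data.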
What I have working at the moment sounds very, very close to what I'd expect to hear, except with a lot more noise and some unusual-looking values when I graph the waveform. I think I may be doing something incorrectly when converting the 16-bit PCM data to floats.
Here is my short-to-float conversion function:
private float ShortToFloat(byte[] bytes, int index)
{
    // Combine two little-endian bytes into an unsigned short,
    // then center on zero and normalize
    float f = (bytes[index]) | (bytes[index + 1] << 8);
    f -= 32768f;
    f /= 32768f;
    return f;
}
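For reference, here's the alternative I've been considering, which treats the data as signed 16-bit little-endian PCM instead of unsigned (just a sketch; I haven't confirmed whether SharpMod's output is actually signed):

```csharp
using System;

static class PcmConvert
{
    // Assumes the player emits *signed* 16-bit little-endian PCM.
    // Casting the combined value to short reinterprets the top bit as
    // a sign bit, so no 32768 offset is needed afterwards.
    public static float SignedShortToFloat(byte[] bytes, int index)
    {
        short s = (short)(bytes[index] | (bytes[index + 1] << 8));
        return s / 32768f;
    }
}
```

With this version, bytes 0x00 0x80 map to -1.0f and 0xFF 0x7F map to just under +1.0f, whereas the unsigned version above maps a signed zero sample (0x00 0x00) all the way down to -1.0f.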
And here is how I read data from the SharpMod player and store it in my circular buffer:
// Declared earlier:
// byte[] tmpBuffer = new byte[...];
// float[] latencyBuffer = new float[LATENCY_BUFFER_SIZE];

int bytesPerSample = Player.MixCfg.Is16Bits ? 2 : 1;
int channels = (Player.MixCfg.Style == SharpMod.Player.RenderingStyle.Mono) ? 1 : 2;

int read = tmpBuffer.Length;
read = Player.GetBytes(tmpBuffer, read);

for (int i = 0; i < read; i += bytesPerSample * channels)
{
    float lSample = ShortToFloat(tmpBuffer, i);
    float rSample = ShortToFloat(tmpBuffer, i + 2);

    latencyBuffer[writePtr] = lSample;
    latencyBuffer[writePtr + 1] = rSample;
    writePtr = (writePtr + 2) % LATENCY_BUFFER_SIZE;
}
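In case it matters, here's a frame-aware sketch of the same loop I've been experimenting with. It strides by the computed frame size rather than hardcoding the stereo offsets; the mono branch (duplicating the single channel into both outputs) is my own guess, and it still assumes 16-bit samples:

```csharp
using System;

// Frame-aware variant of the copy loop. Names mirror my snippet above;
// ShortToFloat is the same (unsigned-style) conversion, and the
// mono-duplication branch is an assumption of mine.
class CopyLoopSketch
{
    public const int LATENCY_BUFFER_SIZE = 8192;
    public float[] latencyBuffer = new float[LATENCY_BUFFER_SIZE];
    public int writePtr;

    static float ShortToFloat(byte[] bytes, int index)
    {
        float f = bytes[index] | (bytes[index + 1] << 8);
        return (f - 32768f) / 32768f;
    }

    // read = number of valid bytes in tmpBuffer; assumes bytesPerSample == 2.
    public void CopyFrames(byte[] tmpBuffer, int read, int bytesPerSample, int channels)
    {
        int frameSize = bytesPerSample * channels;
        for (int i = 0; i + frameSize <= read; i += frameSize)
        {
            float l = ShortToFloat(tmpBuffer, i);
            float r = (channels == 2)
                ? ShortToFloat(tmpBuffer, i + bytesPerSample)
                : l; // duplicate mono into both output channels
            latencyBuffer[writePtr] = l;
            latencyBuffer[writePtr + 1] = r;
            writePtr = (writePtr + 2) % LATENCY_BUFFER_SIZE;
        }
    }
}
```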
And here is what the contents of the buffer look like. The top and bottom grey lines mark the +1/-1 boundaries. The sound itself is recognizably the chiptune I'm testing, but everything is heavily amplified, with periodic pops and clicks.
If I normalize the PCM data by scaling down by 65,535 instead of 32,768, I wind up with
something that looks like this instead. It's much closer to what I'd expect the waveform to look like, but it's still quite wrong.
I feel like I'm overlooking something very obvious, but I'm not quite sure what. If you could provide any insight, it'd be greatly appreciated.