Hacker News | Stenzel's comments

Processors with saturating capabilities usually clip at 0x7FFFFFFF and 0x80000000, so the clipping thing has no relation to the original topic. But in fixed-point DSP, 0x8000 (add trailing zeros if you like) is often the only possibility to achieve a multiplicative identity: since 0x8000 represents -1 in Q15, a = -b * 0x8000 gives a = b, with the negation part of the assembly instruction.
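A minimal sketch of the trick in Python integer arithmetic (the function name and the Q15 framing are my own illustration; real DSPs do this in a single fused negate-multiply instruction):

```python
def q15_mul_neg(a: int, b: int) -> int:
    """Compute -(a * b) in Q15, i.e. a fused negate-multiply as found in
    many DSP instruction sets. Values are 16-bit two's-complement
    integers in -32768 .. 32767."""
    product = (a * b) >> 15          # Q15 * Q15 -> Q15
    result = -product
    # saturate to the 16-bit range
    return max(-32768, min(32767, result))

MINUS_ONE_Q15 = -32768               # bit pattern 0x8000, i.e. -1.0 in Q15

x = 12345
# -(x * -1) == x: 0x8000 acts as the multiplicative identity
assert q15_mul_neg(x, MINUS_ONE_Q15) == x
```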


Title should say screen design; for general UI these are quite useless, especially if your UI is physical.


I think these guidelines are pretty translatable for any kind of user interface, including physical.

Machines should have consistent physical controls that make it readily apparent whether it's a button to press or a knob to turn. They should prevent errors like an exposed blade cutting my hand off.


OK for beginners, but lots of nice voicings are missing, including those where the root note is not played at all.


It looks like it is fresh out of the oven, and it's a pretty good starting point to flesh out; the hard work has been done.


This is indeed an excellent way of filtering if your audio is cyclic and fits into a single FFT, like a periodic waveform, a drum loop or an Optigan track. Just make sure not to apply any window.
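A tiny sketch of the idea using a plain DFT (stdlib Python for clarity; a real implementation would use an FFT library): transform exactly one period, zero the unwanted bins, transform back - and indeed, no window:

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)) / N for n in range(N)]

# one exact period of a cyclic waveform: fundamental plus 3rd harmonic
N = 64
x = [math.sin(2 * math.pi * n / N) + 0.5 * math.sin(2 * math.pi * 3 * n / N)
     for n in range(N)]

# "filter" by zeroing the 3rd-harmonic bins (k = 3 and its mirror N - 3);
# no window, because the signal is cyclic and fits the transform exactly
X = dft(x)
X[3] = X[N - 3] = 0
y = [c.real for c in idft(X)]

# what remains is the pure fundamental, to numerical precision
expected = [math.sin(2 * math.pi * n / N) for n in range(N)]
assert all(abs(a - b) < 1e-9 for a, b in zip(y, expected))
```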


You also need to make sure that the Gibbs phenomenon does not cause the filtered signal to leave the range of the representation. A signal that is below 1 before filtering won't necessarily stay below 1 afterwards, meaning if 1 is the cap for your audio output, then enjoy a brand new category of distortion. But it's a terrible way in general, of course; just optimize an appropriate zero-phase filter with an unaffected passband and minimum distortion. It's trivially easy, and there is no excuse to just use the terribly shitty short "classic" filters common in audio processing or graphics, as implemented by skilled programmers who don't know the difference between the DFT and the FFT (which is always easy to tell, as they use the term FFT as if it were synonymous with frequency transform).


This is the case for any sharp filter. It is not unique to the FFT approach. It doesn't matter if you use a linear phase FIR; any time you "remove" frequencies you can increase your peak levels. Try graphing sin(x) + 0.2sin(3x) and then try removing/filtering out the 3x component.
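A quick numerical check of that example in plain Python (dense sampling over one period):

```python
import math

# sample one period densely
xs = [2 * math.pi * i / 10000 for i in range(10000)]
full = max(abs(math.sin(x) + 0.2 * math.sin(3 * x)) for x in xs)
filtered = max(abs(math.sin(x)) for x in xs)

# the 3rd harmonic shaves the tops off the peaks, so removing it
# *raises* the peak level: full is about 0.87, filtered is 1.0
assert filtered > full
```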

It's even true for reconstruction. A digital waveform can represent peak levels far above "digital peak", in between samples.

This is why, if you're mastering songs, you'd better keep your peak levels at -0.5 dB or -1 dB (so the filtering from lossy compression won't make it clip), and why you'd better use an oversampling limiter. Especially if you're doing loudness-war-style brutal limiting, because that's the stuff that really creates inter-sample peaks. But you shouldn't be doing that, because Spotify and YouTube will just turn your song down to -14 LUFS anyway and all you'll have accomplished is making it sound shitty :-)


not quite; designing minimum-passband-distortion filters with no amplitude increase anywhere is slightly harder, but it's not impossible, even if you are strictly removing some specific frequencies, as you can design the filter in such a way that the spectrum's amplitude ringing strikes zero at them. In practice though, simply reducing these frequencies by a factor of 100 is good enough, and that's possible for bands without needing any amplitude above 1.


You didn't understand my example. This isn't about spectral ringing. It doesn't matter if you have zero spectral ringing, and no amplitudes above 1. There is no way to have a sharp filter that removes (or almost removes) certain frequencies, even if it has zero spectral ringing, while guaranteeing it doesn't increase peak levels in the time domain. The filter will decrease the total energy of the signal, but a decrease in signal energy can still cause an increase in peak levels. This is because the addition of a frequency component can decrease peak levels by lining up with the existing peaks in such a way, and thus removing it can conversely increase peak levels.

Just punch sin(x) + 0.2sin(3x) into a graphing calculator, then remove the 0.2sin(3x) component and look at peak levels increase. No filter can fix that without also decreasing the sin(x) component significantly to compensate.


Your proposed zero phase filter won't work in realtime, and there are many use cases for what you consider shitty.


Modern computers are ridiculously powerful: processing a 4k image with a 50x50 filter runs in real time (>60 Hz) on a CPU using a decent FFT solution, and on a decent GPU you could do many times more at >60 Hz.

Oh sure, there are lots of use cases for fast-but-bad. But what bugs me is that people aren't limiting them to those cases. Worse still, people are using repeated fast-but-bad, and then more fast-and-bad to fix the problems that caused, effectively creating slow-and-bad filters.


I'd assumed fft was just a dft with O(n log(n)) performance - am I missing something?


You're not. The FFT is just a particular way of implementing the DFT.


Quite. So I'm puzzled by what mistake the 'skilled programmers' are making, when confusing DFT and FFT. Implementing DFT in quadratic time?


It is possible to get around the cyclic properties of convolution via FFT by zero padding your array prior to transforming though.
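A sketch of that zero-padding trick (plain DFT in stdlib Python for clarity; with an FFT the usual complexity argument applies for long signals):

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)) / N for n in range(N)]

def conv_direct(a, b):
    """Reference linear convolution."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def conv_via_dft(a, b):
    # zero-pad both inputs to the full linear-convolution length first,
    # so the cyclic wrap-around falls entirely onto zeros
    N = len(a) + len(b) - 1
    A = dft(a + [0.0] * (N - len(a)))
    B = dft(b + [0.0] * (N - len(b)))
    return [c.real for c in idft([p * q for p, q in zip(A, B)])]

a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, -1.0, 0.5]
assert all(abs(p - q) < 1e-9
           for p, q in zip(conv_direct(a, b), conv_via_dft(a, b)))
```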


There is no illusion of pitch, and it is a common misconception that the fundamental frequency must be present in a tone. Pitch is the perceived periodicity of a tone, which is roughly the greatest common divisor of the harmonics. If perceived pitch without a fundamental were an auditory illusion, common pitch detection techniques should fail when the fundamental is not present, but they work quite well in its absence. So either there is no illusion of pitch, or algorithms have illusions too.
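For illustration, a toy autocorrelation pitch detector in plain Python, fed only the 2nd and 3rd harmonics (the 400 Hz / 600 Hz mixture and all parameters are my own example):

```python
import math

fs = 8000
f0 = 200                           # fundamental: NOT present in the signal
N = 800
# only the 2nd and 3rd harmonics (400 Hz and 600 Hz)
x = [math.sin(2 * math.pi * 400 * n / fs) +
     math.sin(2 * math.pi * 600 * n / fs) for n in range(N)]

def best_lag(x, lo, hi):
    """Plain autocorrelation pitch detector: pick the lag that
    correlates best with the signal itself."""
    def r(lag):
        return sum(x[n] * x[n + lag] for n in range(len(x) - lag))
    return max(range(lo, hi), key=r)

lag = best_lag(x, 20, 80)          # search roughly 100..400 Hz
# 40 samples -> 200 Hz: the detector finds the GCD of the harmonics,
# even though there is no energy at 200 Hz at all
assert lag == fs // f0
```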


the answer is: algorithms have illusions too.

consider 2 ideal harmonic notes with a frequency ratio of 3:2, say 3kHz and 2kHz ... The brain / algorithm must choose between interpreting the collection of frequency peaks at m * 3kHz and n * 2kHz as (occasionally overlapping) harmonics of 2 notes at 2kHz and 3kHz, OR as harmonics of a single note at 1kHz (as you say, the GCD of the frequencies).

There is inherent ambiguity between interpreting as 2 notes of each a timbre, vs interpreting as 1 note with another timbre...

One could physically construct 3 bowed strings with modekilling on the 1kHz string, such that these could make perceptually identical sounds whether the 2kHz and 3kHz strings are played simultaneously vs the 1kHz string.

at that point, from the sound alone one can not discern in an ABX test which is the case, neither a human brain nor any algorithm. The doubt forces a guess (deterministic or not).

The sound is a projection of properties occurring in reality, and loses information.


True, but ambiguity does not imply that one possible interpretation must necessarily be an illusion.


Tones and harmonics get clustered into pitches, e.g. mistuned harmonics as found on bass guitar or piano still get decoded into pitches via some sort of best match, as long as the mistuning does not exceed a certain percentage. And it works even if some harmonics disappear and reappear.

The pitch is higher level than purely perceptual.


this is correct, and the reason we are tolerant is dispersion: even though the different harmonics are present on the same string of the same length, the resonant frequencies need not be integer multiples of the fundamental, since waves of different frequency have different propagation speeds on the string.

in the case of bowed strings, mode-locking ensures the phases of all the harmonics are reset each cycle (the bow sticks and slips), so that bowed instruments can be played harmonically to parts per billion.

since a lot of sounds are plucked, we must be tolerant of frequency-dependent propagation speeds in regular strings / media.


In the case of either the 2 strings being bowed vs the 1 string, there is an actual underlying reality, that can not be deduced from the limited information available in the sound, so any guess risks being an illusion (with probability 50%).

Assuming we agree that "illusion" merely means mismatch between interpretation and reality.


Mixing two frequencies leads to 'sum' and 'difference' frequencies.

https://en.wikipedia.org/wiki/Frequency_mixer

It doesn't quite work that way with audio but the effects are close enough.


yes that is a second type of ambiguity, and it does occur in audio as well:

a lower-frequency sinusoid amplitude-modulating a higher-frequency sinusoid can be indistinguishable from 2 constant-amplitude sinusoids at the sum and difference of the frequencies.
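This is just the product-to-sum identity 2 cos(B) sin(A) = sin(A+B) + sin(A-B); a quick numerical check in plain Python (the carrier/modulator frequencies are arbitrary):

```python
import math

fc, fm, fs = 1000.0, 100.0, 8000.0   # carrier, modulator, sample rate

for n in range(400):
    t = n / fs
    # amplitude-modulated carrier ...
    am = 2 * math.cos(2 * math.pi * fm * t) * math.sin(2 * math.pi * fc * t)
    # ... equals the sum of two constant-amplitude sinusoids at fc +- fm
    sidebands = (math.sin(2 * math.pi * (fc + fm) * t) +
                 math.sin(2 * math.pi * (fc - fm) * t))
    assert abs(am - sidebands) < 1e-9
```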

see the article by Plomp and Levelt for the determination of the bandwidth of the auditory frequency bins (or filter bank)


I think the psychoacoustic term for this is "combination tones":

https://en.wikipedia.org/wiki/Combination_tone


I think you’re kinda right ... but maybe a better way to say it is something like:

Pitch can’t be termed an illusion, because it is a perception, not a fact of physics. So it is either always an illusion or never an illusion.

Frequency is a fact in the realm of physics. Pitch is a label our mind assigns to what it hears.


The missing fundamental illusion persists when the harmonics are split across different ears, and whether it is perceived or not is highly subjective.

That's why it's widely accepted to be an illusion that arises in the brain's auditory center.


Sadly most source code follows the exact opposite convention - starting with legalese, followed by boring initialisation. I wish more source code would come to the point right at the start, and I am thankful for the new acronym BLUF (bottom line up front) that expresses this concept quite clearly.


Maybe in languages without declaration hoisting, you could go with BLOB: Bottom Line On Back.


Much faster if constraints are represented by single bits in an integer mask and binary masking operations are used - even the hardest Sudoku solves instantly.
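A minimal sketch of the bitmask approach in Python (my own illustration, not anyone's posted solver; a fast one would also pick the most constrained cell first):

```python
def solve(grid):
    """Backtracking Sudoku solver; each row/column/box constraint is one
    integer whose bits 1..9 mark the digits already used there."""
    rows, cols, boxes = [0] * 9, [0] * 9, [0] * 9
    empties = []
    for r in range(9):
        for c in range(9):
            b = (r // 3) * 3 + c // 3
            if grid[r][c]:
                bit = 1 << grid[r][c]
                rows[r] |= bit; cols[c] |= bit; boxes[b] |= bit
            else:
                empties.append((r, c, b))

    def backtrack(i):
        if i == len(empties):
            return True
        r, c, b = empties[i]
        used = rows[r] | cols[c] | boxes[b]   # one OR yields all taken digits
        for v in range(1, 10):
            bit = 1 << v
            if not used & bit:
                rows[r] |= bit; cols[c] |= bit; boxes[b] |= bit
                grid[r][c] = v
                if backtrack(i + 1):
                    return True
                rows[r] ^= bit; cols[c] ^= bit; boxes[b] ^= bit
        grid[r][c] = 0
        return False

    return backtrack(0)

puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
assert solve(puzzle)   # puzzle is filled in place
```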


The transposed direct form II allows a biquad to be calculated with two scalar-by-vector multiplications and some shuffling, which should be faster than the proposed matrix solution, I believe.
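For reference, a scalar sketch of the transposed direct form II update in plain Python (a SIMD version would compute the x*(b0,b1,b2) and y*(a1,a2) products as the two scalar-by-vector multiplies; the direct difference equation is included as a cross-check):

```python
def biquad_tdf2(x, b0, b1, b2, a1, a2):
    """Transposed direct form II: two state variables; per sample the work
    is x times (b0, b1, b2), y times (a1, a2), plus state shuffling."""
    s1 = s2 = 0.0
    out = []
    for xn in x:
        yn = b0 * xn + s1
        s1 = b1 * xn - a1 * yn + s2
        s2 = b2 * xn - a2 * yn
        out.append(yn)
    return out

def biquad_direct(x, b0, b1, b2, a1, a2):
    """Reference: the raw difference equation."""
    out = []
    for n, xn in enumerate(x):
        yn = b0 * xn
        if n >= 1:
            yn += b1 * x[n - 1] - a1 * out[n - 1]
        if n >= 2:
            yn += b2 * x[n - 2] - a2 * out[n - 2]
        out.append(yn)
    return out

coeffs = (0.2, 0.4, 0.2, -0.5, 0.25)   # an arbitrary stable biquad
x = [1.0] + [0.0] * 15                  # impulse
ya = biquad_tdf2(x, *coeffs)
yb = biquad_direct(x, *coeffs)
assert all(abs(p - q) < 1e-12 for p, q in zip(ya, yb))
```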


I'd be happy to see benchmarks of that. The problem is that the "shuffling" creates serial data dependencies, while the matrix form doesn't. Sure, the number of multiplications is smaller for direct forms, but that's not what has the most effect on performance.


MIDI is great, and I am so thankful that they had the foresight to require a mandatory opto-coupler in the input to avoid ground loops of interconnected gear. Even today, very few devices have a comparable insulation in their USB connection.


On my guitar pedalboard, I tried to be clever once and hooked all of my digital pedals via a USB hub to a main connector. Horrible ground loops and digital noise were introduced. Does anyone make an electrically isolated USB hub?


A bit harder for USB, because of the power it supplies. USB also isn't connected in loops, whereas MIDI will form loops with audio connections.


This book contains some wrong and oversimplified statements, the explanation of the sampling theorem is awkward and suggests that the author has not fully understood it himself.

Examples of wrong claims: "The heart of digital noise generation is the random number generator." Not true, many digital noise generators use LFSRs.
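For instance, a maximal-length 16-bit Galois LFSR, sketched in Python (tap mask 0xB400 is the standard maximal-length choice; a noise generator would emit the low bit or the whole state each step):

```python
def lfsr16(state: int) -> int:
    """One step of a maximal-length 16-bit Galois LFSR
    (taps 16, 14, 13, 11 -> mask 0xB400)."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB400
    return state

# as a noise source it only repeats after every nonzero state is visited:
# period 2**16 - 1
state, period = 1, 0
while True:
    state = lfsr16(state)
    period += 1
    if state == 1:
        break
assert period == 65535
```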

"Just as analog filters are designed using the Laplace transform, recursive digital filters are developed with a parallel technique called the z-transform." Hell no, there are a gazillion ways to design filters - analog or digital - without those transforms.

"The frequency domain becomes attractive whenever the complexity of the Fourier Transform is less than the complexity of the convolution. This isn't a matter of which you like better; it is a matter of which you hate less."

Clearly I hate both domains less than this book. It might serve as an introduction to DSP, but please remain suspicious: if some claim herein seems oversimplified, it probably is.


I don't understand your issue with his statement on PRNGs... LFSRs are functionally pseudorandom number generators in this context, so you haven't invalidated his statement.

