Pure intonation vs. 12 equal temperament musical intervals

Given a musical sound X, its pitch is characterized by the frequency \mathrm{freq}(X) of its sound wave. For example, if we regard X as a musical note, then the pitch of frequency 2\times\mathrm{freq}(X) is the note one octave above X; two frequencies that differ by a factor of 2 sound consonant to us. The thesis here is that our ears are musically interested in the ratios of frequencies, and this is the basis of the usual octave division in Western music: if you are going to divide the pitches between the note X and the note an octave above it into 12 notes, then it makes sense to put these notes in a geometric progression, i.e. the k-th note has frequency \mathrm{freq}(X) \times 2^{\frac{k-1}{12}}, with k=1 being X itself. In this manner, the ratio between any two consecutive notes is the same constant, 2^{1/12}. Pianos, classical guitars, saxophones, and many other instruments follow this tuning, known as 12 equal temperament (12-ET).
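To make the geometric progression concrete, here is a minimal Python sketch of the formula \mathrm{freq}(X) \times 2^{\frac{k-1}{12}}. The function name `equal_temperament_frequencies` and the base frequency of 261.63 Hz (a commonly quoted value for middle C) are my own choices for illustration, not something fixed by the discussion above.

```python
def equal_temperament_frequencies(base_freq, divisions=12):
    """Frequencies of notes k = 1 .. divisions + 1, from base_freq up to its octave.

    Note k has frequency base_freq * 2 ** ((k - 1) / divisions),
    so k = 1 is the base note and k = divisions + 1 is one octave above it.
    """
    return [base_freq * 2 ** ((k - 1) / divisions) for k in range(1, divisions + 2)]


# Assumption: 261.63 Hz is used as the base note purely as an example value.
freqs = equal_temperament_frequencies(261.63)

# Ratio between each pair of consecutive notes; all should equal 2 ** (1 / 12).
ratios = [higher / lower for lower, higher in zip(freqs, freqs[1:])]

for k, f in enumerate(freqs, start=1):
    print(f"k={k:2d}  {f:8.2f} Hz")
print("consecutive ratio:", ratios[0])
```

The list has 13 entries because it includes both X and the note an octave above it; the 12 steps between them all have the same ratio.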

Yet you may also have heard about the consonance of the perfect fifth, e.g. when you play the notes C and G of the same octave. The usual explanation is that G has frequency \frac{3}{2}\times \mathrm{freq}(C), and that it is because of this simple ratio \frac{3}{2} that the two notes sound so resonant to our ears (I checked with Google Gemini just now, and it said so too). But is the ratio actually equal to \frac{3}{2}? Doing the math from the above paragraph: G is 7 divisions (officially called semitones) higher than C, which makes it the 8th note counting from C, so we have k=8 in the above formula and

\mathrm{freq}(G)=\mathrm{freq}(C)\times 2^{7/12} \simeq \mathrm{freq}(C) \times 1.498

so the actual ratio is not 1.5! What’s happening here?
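The size of the mismatch can be checked numerically. The sketch below compares the 12-ET fifth 2^{7/12} with the ratio \frac{3}{2}. It also expresses the gap in cents, a logarithmic unit I am bringing in for convenience (1200 cents per octave, so 100 cents per 12-ET semitone); the variable names and the helper `cents` are illustrative only.

```python
import math

et_fifth = 2 ** (7 / 12)  # 12-ET fifth: 7 semitones above the base note (k = 8)
just_fifth = 3 / 2        # the "simple ratio" fifth


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


# Positive gap means the 3/2 fifth is slightly wider than the 12-ET fifth.
gap_in_cents = cents(just_fifth) - cents(et_fifth)

print(f"12-ET fifth ratio: {et_fifth:.6f}")
print(f"3/2 ratio:         {just_fifth:.6f}")
print(f"gap:               {gap_in_cents:.3f} cents")
```

The 12-ET fifth comes out slightly below 1.5, and the gap is only about 2 cents, a small fraction of one semitone.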
