Digital Audio Interfaces
Many of the advantages of digital signal processing are lost if signals are repeatedly converted back and forth between the digital and analogue domains. To keep the number of conversions to a minimum, manufacturers began, as early as the 1970s, to introduce proprietary digital interface standards enabling various pieces of digital audio hardware to pass digital audio information directly, without recourse to standard analogue connections. Unfortunately, each manufacturer adopted its own standard, and the Sony digital interface (SDIF) and the Mitsubishi interface both bear witness to this early epoch in digital audio technology, when compatibility between different pieces of equipment was very poor. It wasn’t long before customers were demanding an industry-standard interface so that they could “mix and match” equipment from different manufacturers to suit their own particular requirements. This pressure led to the introduction of widespread, standard interfaces for the connection of both consumer and professional digital audio equipment.
The requirements for standardizing a digital interface go beyond those for an analogue interface in that, as well as defining the voltage levels and connector style, it is necessary to define the data format the interface will employ. The two digital audio interface standards described here are:
(1) The two-channel, serial, balanced, professional interface (the so-called AES/EBU or IEC958 type 1 interface).
(2) The two-channel, serial, unbalanced, consumer interface (the so-called SPDIF or IEC958 type 2 interface).
In fact, both these interfaces are very similar, with the variations due more to electrical differences than to differences in data format.
AES/EBU or IEC958 Type 1 Interface
This electrically balanced version of the standard digital interface was originally defined in documents produced by the Audio Engineering Society (AES) and the European Broadcasting Union (EBU) and is, consequently, usually referred to as the AES/EBU standard. This is the standard adopted mainly by professional and broadcast installations. Mechanically, this interface employs the ubiquitous XLR connector and adopts the normal convention of female and male versions for inputs and outputs, respectively. Electrically, pin 1 is specified as the shield and pins 2 and 3 carry the balanced signal. One of the advantages of the digital audio interface over its analogue predecessor is that polarity is not important, so it is not necessary to specify which of pins 2 and 3 is “hot.” The balanced signal is intended to be carried by balanced, twisted-pair, screened microphone-style cable, and voltage levels are allowed to be between 3 and 8 V pk-pk (EMF, measured differentially). Both inputs and outputs are specified as transformer coupled and earth free. The output impedance of the interface is defined as 110 ohms, and a standard input must always terminate in 110 ohms. A drawing of the electrical standard for this interface is given in Figure 18.1.
The SPDIF or IEC958 Type 2 Interface
This consumer version of the two-channel, serial digital interface is very different electrically from the AES/EBU interface described earlier. It is a 75-ohm, matched-termination interface intended for use with coaxial cable. It therefore has more in common with an analogue video signal interface than with any analogue audio counterpart. Mechanically, the connector style recommended for the SPDIF interface is the RCA-style phono, with sockets always being of the isolated type. Voltage levels are
defined as 1 V pk-pk when unterminated. Transformer coupling is by no means always used with this interface but it is recommended on at least one end. Figure 18.2 is a drawing of a common implementation of the SPDIF interface.
Data
Despite the very considerable electrical differences between the AES/EBU interface and the SPDIF interface, their data formats are very similar. Both interfaces have capacity for the real-time communication of 20 bits of stereo audio information at sampling rates between 32 and 48 kHz, as well as provision for extra information, which may indicate to the receiving device various important parameters about the data being transferred (such as whether preemphasis was used on the original analogue signal prior to digitization). There is also a small overhead for limited error checking and for synchronization.
Some of the earlier digital audio interfaces, such as Sony’s SDIF and the Mitsubishi interface, sent digital audio data and synchronizing clocks on separate wires. Such standards obviously require multicore cable and multiway connectors, which looked completely different from any analogue interface that had gone before. The intention of the designers of the AES/EBU and SPDIF interfaces was to create standards that caused as little “culture shock” as possible in both the professional and the consumer markets, and they therefore chose connector styles that were both readily available and operationally convenient. This ruled out the use of multicore cable and multiway connectors and resulted in the use of a digital coding scheme that buries the synchronizing signals in with the data signal. Such a code is known as “serial and self-clocking.” The type of code adopted for AES/EBU and SPDIF is biphase mark coding. This scheme is a close relative of Manchester code, and it is the same type of self-clocking serial code used for SMPTE and EBU time code. Put at its simplest, such a code represents the “ones and noughts” of a digital signal by two different frequencies, where frequency Fn represents a zero and 2Fn represents a one. Such a signal eliminates almost all DC content, enabling it to be transformer coupled, and also tolerates phase inversion, since it is only a frequency (and not its phase) that needs to be detected. The resulting signal has much in common with an analogue FM signal and, since the two frequencies are harmonically related (an octave apart), it is a simple matter to extract the bit clock from the composite incoming data stream.
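The encoding rule just described can be sketched in a few lines of Python. This is an illustrative sketch, not production code, and the function names are my own; it shows why a decoder needs only to detect transitions, making the signal immune to polarity inversion:

```python
def biphase_mark_encode(bits, level=0):
    """Biphase mark coding: the line level toggles at every bit boundary;
    a '1' toggles again mid-bit (twice the frequency of a '0')."""
    half_cells = []
    for bit in bits:
        level ^= 1              # transition at every bit boundary
        half_cells.append(level)
        if bit:
            level ^= 1          # extra mid-bit transition for a '1'
        half_cells.append(level)
    return half_cells

def biphase_mark_decode(half_cells, prev_level=0):
    """Decode by looking for a mid-bit transition; absolute polarity
    is irrelevant, only the presence of transitions matters."""
    cells = [prev_level] + list(half_cells)
    return [1 if cells[i] != cells[i + 1] else 0
            for i in range(1, len(cells) - 1, 2)]
```

A quick round trip (including a deliberately inverted signal) confirms that decoding is unaffected by which pin is “hot.”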
In data format terms the digital audio signal is divided into frames. Each digital audio frame contains a complete digital audio sample for both left and right channels. If 48-kHz sampling is used, 48 thousand frames pass over the link every second, leading to a final bit rate of 3.072 Mbit/s. If 44.1-kHz sampling is employed, 44,100 frames are transmitted every second, leading to a final bit rate of 2.8224 Mbit/s. The lowest allowable transfer rate is 2.048 Mbit/s, when 32-kHz sampling is used. Just as each complete frame contains a left and a right channel sample, so each frame may be further divided into two subframes, each carrying the sample for one channel. A diagram of a complete frame consisting of two subframes is given in Figure 18.3.
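These rates follow directly from the frame structure: two 32-bit subframes per frame and one frame per sample period. A short Python check (names are mine) reproduces the figures:

```python
# Each frame carries two 32-bit subframes (left and right) = 64 bits,
# and one frame is transmitted per sample period.
BITS_PER_FRAME = 2 * 32

def link_bit_rate(sample_rate_hz: int) -> int:
    """Serial bit rate of the AES/EBU or SPDIF link at a given sample rate."""
    return sample_rate_hz * BITS_PER_FRAME

for fs in (32_000, 44_100, 48_000):
    print(f"{fs} Hz -> {link_bit_rate(fs):,} bit/s")
# prints 2,048,000 / 2,822,400 / 3,072,000 bit/s respectively
```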
Any piece of equipment receiving the digital audio signal shown in Figure 18.3 must know where the boundaries between frames and subframes lie. That is the purpose of the “sync preamble” section of each frame and subframe. The sync preamble differs from all other data sent over the digital interface in that it violates the rules of a biphase mark encoded signal. In terms of the FM analogy given earlier, you can think of the sync preamble as containing a third, nonharmonically related frequency which, when detected, establishes the start of each subframe. There exists a family of three slightly different sync preambles: one marks the beginning of a left-channel subframe and another marks the start of a right-channel subframe. The third sync preamble pattern is used only once every 192 frames (or once every 4 ms in the case of 48-kHz sampling) and establishes a 192-bit repeating pattern in the channel-status bit labeled C in Figure 18.3.
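The three preambles are usually labeled X, Y, and Z in the AES3 documents. The half-cell patterns below are those published in the specification for the case where the line rests low beforehand; a receiver must equally accept the polarity-inverted forms. The Python names are my own, as a sketch of how a receiver might classify them:

```python
# Sync preambles as 8 half-cells (4 bit periods). Each pattern violates
# the biphase mark rule, so it can never occur in ordinary encoded data.
PREAMBLES = {
    (1, 1, 1, 0, 1, 0, 0, 0): "Z",  # left subframe, start of 192-frame block
    (1, 1, 1, 0, 0, 0, 1, 0): "X",  # left-channel subframe
    (1, 1, 1, 0, 0, 1, 0, 0): "Y",  # right-channel subframe
}

def classify_preamble(half_cells):
    """Return 'X', 'Y', 'Z', or None. Polarity inversion is tolerated
    by also checking the complemented pattern."""
    key = tuple(half_cells)
    inverted = tuple(1 - c for c in key)
    return PREAMBLES.get(key) or PREAMBLES.get(inverted)
```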
The 192-bit repeat pattern of the C bit builds up into a table of 24 bytes of channel-status information for the transmitted signal. It is in this one bit of data every subframe that the
difference between the AES/EBU data format and the SPDIF data format is at its most significant. The channel-status bits in both the AES/EBU format and the SPDIF format communicate to the receiving device such important parameters as sample rate, whether preemphasis was used on the recording, and so on. Channel-status data are normally the most troublesome aspect of practical interfacing using the SPDIF and AES/EBU interfaces, especially where users attempt to mix the two interface standards. This is because the usage of channel status in consumer and professional equipment is almost entirely different. It must be understood that the AES/EBU interface and the SPDIF interface are thus strictly incompatible in data format terms, and the only correct way to transfer data from SPDIF to AES/EBU, or AES/EBU to SPDIF, is through a properly designed format converter that will decode and recode digital audio data to the appropriate standard.
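As a sketch of how a receiver might accumulate the C bits into the 24-byte channel-status table, consider the Python below. The LSB-first bit-to-byte packing is an illustrative convention and the function names are mine; the one substantive detail shown, that the very first channel-status bit distinguishes professional from consumer format, is as defined in the standards:

```python
def assemble_channel_status(c_bits):
    """Pack the 192 C bits collected over one block (one bit per frame)
    into 24 channel-status bytes, first-received bit as LSB of byte 0."""
    if len(c_bits) != 192:
        raise ValueError("a channel-status block is exactly 192 bits")
    return bytes(
        sum(bit << i for i, bit in enumerate(c_bits[n * 8:(n + 1) * 8]))
        for n in range(24)
    )

def is_professional(status: bytes) -> bool:
    # First channel-status bit: 1 = professional (AES/EBU),
    # 0 = consumer (SPDIF).
    return bool(status[0] & 0x01)
```

A converter between the two standards must decode this whole table and rebuild it to the other convention, which is exactly why a simple electrical adaptor is not a format converter.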
Other features of the data format remain fairly constant across the two interface standards. The validity bit, labeled V in Figure 18.3, is set to 0 in every subframe if the signal over the link is suitable for conversion to an analogue signal. The user bit, labeled U, has a multiplicity of uses defined by particular users and manufacturers; it is used most often over the domestic SPDIF interface. The parity bit, labeled P, is set such that the number of ones in a subframe is always even. It may be used to detect individual bit errors, but not to correct or conceal them.
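The even-parity rule is easy to state in code. A minimal sketch (function names are mine; in the actual format the parity covers the subframe bits following the preamble, the preamble itself being excluded):

```python
def parity_bit(subframe_bits):
    """P bit that makes the total count of ones (including P) even."""
    return sum(subframe_bits) % 2

def parity_ok(subframe_bits_with_p):
    """True if the received subframe contains an even number of ones;
    a single flipped bit makes the count odd and is thus detectable."""
    return sum(subframe_bits_with_p) % 2 == 0
```

Note that any even number of bit errors passes unnoticed, which is consistent with the interface's assumption of an essentially error-free link.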
It’s important to point out that both the AES/EBU interface and its SPDIF brother are designed to be used in an error-free environment. Errors are not expected over digital links and there is no way of correcting for them.
Practical Digital Audio Interface
There are many ways of constructing a digital audio interface, and variations abound from different manufacturers. Probably the simplest consists of an HC-family inverter IC, biased at its midpoint with a feedback resistor and protected with diodes across the input to prevent damage from static or overvoltage conditions. (About the only real merit of this circuit is simplicity!) Transformer coupling is greatly preferred. Happily, while analogue audio transformers are complex and expensive items, digital audio, containing no DC component and very little low-frequency component, can be coupled via transformers that are tiny and inexpensive. It therefore represents a false economy indeed to omit them in the design of digital interfaces. Data-bus isolators manufactured by Newport are very suitable; two or four transformers are contained within one IC-style package, and each transformer costs about $2, a long way from the $20 or so required for an analogue transformer. Remember too that, “in digits,” only one transformer is required to couple both channels of the stereo signal. You will notice, looking at the circuit diagrams (Figure 18.4), that RS422 (RS485) receiver chips buffer and reslice the digital audio data. The SN75173J is a quad receiver in a single 16-pin package costing a few dollars. The part has the added advantage that, to adapt the interface between SPDIF and AES/EBU, all that is required is to change the value of the terminating resistor on the secondary side of the input transformer. SPDIF digital output can be derived from inverters driving in tandem. If AES/EBU output is required, it is best provided by an RS422 driver IC.
TOSLink Optical Interface
In many ways an optical link seems the ideal solution for joining two pieces of digital audio equipment together. Obviously, a link that has no electrical contact cannot introduce ground-loop hum problems. Also, because the bandwidth of an optical link is so high, a superficial inspection suggests that an optical link would provide the very fastest (and therefore “cleanest”) signal path possible. However, the optical TOSLink is widely regarded as sounding a little less crisp than its coaxial, electrical counterpart. There are a number of possible reasons for this. In the first place, the speed of the link is compromised by the relatively slow light-emitting diode transmitter and phototransistor receiver housed within the connector shells. Second, inexpensive optical fibers, which allow the optical signal more than one direct path between transmitter and receiver (such fibers are termed multimode), cause a temporal smearing of the audio pulses, an effect known as modal dispersion. This can cause a degree of timing instability in digital audio circuits (jitter) and can affect sound quality. The only advantage the optical link confers, therefore, is its inherent freedom from ground-path-induced interference signals such as hum and RF noise. Yet at digital audio frequencies, ground isolation, if it is required, is obtained much better by means of a transformer. If you want to modify a piece of equipment with an optical interface to include SPDIF coaxial output, a modification is shown in Figure 18.5.
Transmission of AES3-Formatted Data by Unbalanced Coaxial Cable
In October 1995, the AES produced an information document (AES-3id-1995) relating to the transmission of digital audio information (utilizing the professional data format)
over an electrical interface that has much in common with the interconnection standards employed in analogue video. Limitations of AES3 data traveling on twisted pairs terminated in XLRs include poor RF radiation performance and a maximum transmission distance of 100 m. The proposed unbalanced interface is suitable for transmission distances of up to 1000 m. Furthermore, by a prudent choice of impedance and voltage operating level, coupled with a sensible specification of minimum rise time, the signal is suitable for routing through existing analogue video cables, switchers, and distribution amplifiers.
The salient parts of the signal and interface specification are given in Table 18.1.