Theatre

The ability to localize a reinforced voice to a performer and have that image follow the performer around the stage automatically has long been a dream for theatre sound designers and directors.

The TiMax SoundHub-S source-oriented reinforcement (SOR) system is available with an optional radar-assisted TiMax Tracker package that locates multiple performers, each to within 6” in any direction, and provides convincing image localization for upwards of 90 percent of the audience, no matter where in the house they are sitting.

In a 2RU package, the SoundHub-S contains all audio and control inputs and outputs, DSP, delay matrix and mix circuitry, and random access audio players. The Soundhub is programmed via the TiMax control software GUI, which runs on both PCs and Macs; once programmed, the unit can be operated from its front-panel controls with the computer disconnected.

The tracking system is modular and expandable, based on the size and complexity of the show. Each performer wears a small, 1”-square plastic TT tag, about a quarter of an inch thick, that emits ultra-wideband radar pulses to, typically, four to six TiMax Tracker radar sensors. The sensors, usually mounted out of sight on a lighting truss, pass performer location data to the Soundhub via MIDI.


Widespread Adoption

More and more theatres are adopting this approach, including New York’s City Center and the UK’s Royal Shakespeare Company. A number of Raymond Gubbay productions of opera-in-the-round at the notoriously difficult Royal Albert Hall—including Aida, Tosca, The King and I, La Bohème and Madam Butterfly—as well as Carmen at the O2 Arena, have benefited from source-oriented reinforcement, as have recent productions of Les Miserables, Jesus Christ Superstar, Into the Woods, Beggar’s Opera, Marie Antoinette, Andromache, Tanz der Vampire, Lord of the Flies, Fela!, and many others at venues around the world.

Veteran West End sound designer Gareth Fry employed the technique at the Barbican Theatre for The Master and Margarita, making it possible for all audience members to continuously localize to the actors’ voices as they moved around the Barbican’s very wide stage. He noted that, in a three-hour show with a number of parallel story threads, this helped greatly with intelligibility, ensuring the audience’s total immersion in the show’s complex plot lines.

Based on the experience, Fry said, “I’m quite sure that in the coming years, SOR will be the most common way to do vocal reinforcement in drama.”

In addition to the Soundhub and tracking system, of course, a sound reinforcement loudspeaker system is called for. Rather than the LCR clusters or line arrays typically seen in live sound production, however, this usually comprises eight or 16 channels of relatively controlled-dispersion loudspeakers arrayed to cover separate seating areas, plus a number of front-fills, all fed with preprogrammed, variably delay-matrixed audio from the Soundhub.


It’s About Time

The objective is to ensure that every audience member receives an acoustic wavefront from each performer about 10-20 milliseconds before receiving the reinforcing energy from the loudspeakers. Within this short window, the brain fuses the two arrivals into a single sound, but the listener instinctively localizes to the slightly earlier wavefront arriving directly from the performer. This psychoacoustic phenomenon is often referred to as the Haas, or precedence, effect.
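
To make the timing arithmetic concrete, here is a minimal sketch of the feed-delay calculation for one seat and one loudspeaker, assuming a speed of sound of 343m/s and a 15ms precedence offset; the positions, names, and offset value are illustrative, not TiMax’s actual algorithm:

    import math

    SPEED_OF_SOUND = 343.0      # m/s in air at roughly 20 degrees C
    PRECEDENCE_OFFSET = 0.015   # s; keeps the speaker 10-20ms behind the direct sound

    def speaker_feed_delay(performer, seat, speaker):
        """Electronic delay for one loudspeaker feed so that this seat hears
        the performer's direct sound first, then the reinforcement.
        All positions are (x, y, z) tuples in metres."""
        t_direct = math.dist(performer, seat) / SPEED_OF_SOUND
        t_speaker = math.dist(speaker, seat) / SPEED_OF_SOUND
        # Make up the path difference, then add the offset that triggers
        # the precedence effect; never go negative.
        return max(0.0, t_direct - t_speaker + PRECEDENCE_OFFSET)

    # Example: performer mid-stage, seat 20m out, speaker nearer the seat.
    delay = speaker_feed_delay((0, 0, 1.7), (0, 20, 1.2), (3, 12, 5))
    print(f"feed delay: {delay * 1000:.1f} ms")   # ~46 ms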

TiMax achieves this by setting up multiple unique delay relationships between every performer’s wireless microphone and each loudspeaker reinforcing it. These relationships are changed every time a performer moves to a different location on stage, in order to maintain the acoustic precedence that makes the audience localize to the performer and not to the loudspeakers.

The TiMax software simplifies the process by allowing on-stage localization zones to be pre-defined as image definitions, which are simply tables of level and delay instructions preprogrammed into the Soundhub, instructing it to place the performer’s audio image in the appropriate zone on stage.

“There’s no black magic involved. TiMax is based on psychoacoustics and the physics of sound, primarily the Haas effect that allows us to localize a sound source in space using interaural cues based on time difference of arrival of a sound at each of our two ears,” says Dave Haydon, one of two directors of Out Board, the U.K. company behind TiMax.

The TiMax Two Soundhub contains a 16-input x 16-output delay matrix, expandable to 64 x 64 in 2RU via additional DSP and I/O cardsets. The different delays to the various channels are preprogrammed for multiple onstage zones using Smaart software and laser measuring devices, the outputs of which are entered into a custom spreadsheet that yields the requisite level and delay data for each audio channel. During a show, a performer’s position in three dimensions can be tracked by the TiMax Tracker radar tracking system. As the performer moves from one zone to another, the I/O matrix is switched with soft crossfades, so that audio levels and delays remain appropriate to the performer’s position for accurate psychoacoustic localization according to the Haas effect.
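
As a rough picture of what an image definition amounts to in data terms, here is a hypothetical sketch; TiMax’s internal format is not public, so the zone names, channel count, and numbers below are invented:

    import math

    # Hypothetical image definitions: for each stage zone, a level (dB) and a
    # delay (ms) per output channel (eight channels shown). Real TiMax data
    # comes from measurement; these values are invented.
    IMAGE_DEFINITIONS = {
        "downstage_left":  {"levels_db": [0, -3, -6, -9, -12, -15, -18, -21],
                            "delays_ms": [15, 22, 31, 38, 44, 52, 60, 67]},
        "downstage_right": {"levels_db": [-21, -18, -15, -12, -9, -6, -3, 0],
                            "delays_ms": [67, 60, 52, 44, 38, 31, 22, 15]},
    }

    def crossfade_gains(t):
        """Equal-power gain pair for a soft crossfade between the outgoing and
        incoming zones, with t running 0.0 -> 1.0. Each zone keeps its own
        fixed delay set, so no delay value is swept (which would pitch-shift
        the audio); only the levels fade."""
        return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)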

The success of the system depends to a large degree on a performer’s ability to project direct sound adequately throughout the house, providing a strong direct-sound anchor that the audience can unconsciously correlate with the delayed reinforcement emanating from the loudspeakers.

“You must have a good anchor. In an opera, the anchor is putting out 130dB SPL at 1m—an opera singer is as loud as a 12-and-a-horn box. That’s a perfect anchor, but if you or I walked on stage in a big domed building like the Royal Albert Hall wearing a lav mic, and walked around talking at our normal speech level, it would be very hard to get the fader on the desk anywhere near that level without hearing the room, because although the system is distributed, it’s still exciting the room a bit. But opera singers have so much power to their voices acoustically that you’re getting a very strong direct anchor,” Haydon explains.

“But for weaker voices, one of the tricks you can use is to build first-wavefront reinforcing speakers into the stage or the set to create an artificial time zero somewhere near the performer,” Haydon continues. “While the ideal would be to strap a loudspeaker to the performer’s chest, in reality you can accomplish almost the same thing by positioning a loudspeaker above the head, halfway back up the stage, which is virtually at time zero as far as the audience is concerned. So, as well as feeding the voice through the PA with appropriate delays, you run it undelayed through the speakers above their heads, which helps the audience hear it as the anchor reference. That’s been done in a number of theatre spaces, including the Royal Danish Theatre [in Copenhagen], as well as in large outdoor venues, where it has helped localize the first arriving wavefront.”

Clearly, delay imaging or actor tracking doesn't suit every environment. High-energy rock musicals do not have strong enough anchors to Haas-localize the vocals, and simply level-panning them in the PA would be annoying, so more conventional sound reinforcement arrays are more appropriate for such productions.

“It’s sometimes suggested that you can do a version of source-oriented reinforcement with a conventional sound system for a conventional stage musical,” says Haydon. “The trouble is, if you have a loud rock musical like we have in London now—Dirty Dancing, Grease, Mamma Mia!, for example—they are treated quite differently from the operas in the Royal Albert Hall. In opera, you are looking for realism and unobtrusive reinforcement, whereas in rock opera, you’re looking for energy and excitement, and to send people home on a buzz. So it’s common to go for a center cluster for the vocals and left-right for the band, with a decent amount of subs so everyone has a good ride. There’s usually too much band level coming out of the pit and band level in the PA to be able to use the opera- or theatre-style acoustic reinforcement for vocals, so you just stick them in the center cluster. It is sometimes suggested you could use a tracking system like a pan-pot controller to pan voices across the LCR system using just level control, but if you continually level-pan performers every time they cross the stage, the audience will get seasick—you’ll end up with people vomiting in the aisles.”


Soundhub-S

The TiMax Two Soundhub is available in two formats. Soundhub-S, the subject of this article, is used for audio show control and playback in live shows and events. The Soundhub-R format is aimed more at systems integrators and contractors who need a comprehensive set of audio routing, mixing, processing, and playback tools for fixed installations, with a wide variety of remote control options. Both formats come in the same compact chassis; differences lie in the control software.

The Soundhub comes standard with a single 16 x 16 matrix card and 16 channels of random access audio playback. Three additional card slots are available for system expansion up to 64 x 64, with 64 channels of playback. Multiple Soundhubs can be cascaded to scale beyond this for operation in multi-zoned theme parks and other large venues.

Each matrix input can be selected from among three sets: analog or AES-3 digital audio, CobraNet or EtherSound audio, and random access audio playback from the system’s internal hard disk drives. The three inputs can be mixed together onto a matrix input, or crossfaded one to another, using presets. This effectively turns a 16 x 16 matrix into a 48 x 16.
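
In data terms, that input stage might be sketched like this; the names and structure are invented for illustration and do not reflect the Soundhub’s actual firmware:

    # Each of the 16 matrix inputs draws on three candidate sources, which
    # is what turns a 16 x 16 matrix into an effective 48 x 16.
    SOURCES = ("analog_aes3", "network_audio", "disk_playback")

    def submix(gains, blocks):
        """Mix the three candidate sources onto one matrix input.
        gains: linear gain per source name; blocks: equal-length sample lists.
        A preset that sets one gain to 1.0 and the rest to 0.0 is a source
        select; ramping two gains against each other is a crossfade."""
        n = len(blocks[SOURCES[0]])
        return [sum(gains[s] * blocks[s][i] for s in SOURCES) for i in range(n)]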

“We expect most people to select between these submix inputs, but it’s not unusual for some live shows to have playback backup, which could be used to thicken up the backing vocals if the band is having a slack night,” Haydon says.

Balanced analog audio interconnect is via groups of eight channels on DB25 connectors in the now-familiar Yamaha pinout. Digital I/O with sample rate conversion is provided for 16 channels on DB25 connectors, with an option to sync automatically to embedded word clock, or to lock to external word clock on a BNC connector. Digital audio on CobraNet or EtherSound networks is provided in pairs of 32 on Cat 5 cable. MIDI, SMPTE, and GPIO connectors round out the rear panel. Optional dual power supplies, dual cooling fans, dual mirrored audio hard disk or flash drives, and input relay bypass provide for fail-safe operation.

In a multi-user environment, single or multiple Soundhub units can be programmed from one or more computers in any mix of PCs and Macs on 100Base-T Ethernet, while a front-panel control pad and color LCD screen with simple push-button switches and menus permit stand-alone operation of the system for show and cue recall. Alternatively, presets and cues can be recalled remotely from AMX/Crestron, MIDI, SMPTE, GPIO, or TCP/IP controllers.

Adjustable level, delay, and parametric equalization are provided for every input and output, with eight bands of EQ and output delay (in addition to the crosspoint delay) available on each output channel to allow for conventional loudspeaker system alignment. “If you need to ‘move’ a particular loudspeaker a little further back, the additional output delay allows you to do that without having to rewrite all the image definitions,” says Out Board director Robin Whittaker, who is responsible for the technical development of TiMax. All levels, EQs, delays, and routing paths can be stored in libraries and recalled to additional channels as required.
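
The arithmetic behind that trick is simply path length over the speed of sound; a worked example (my numbers, not Out Board’s):

    \Delta t = \frac{\Delta d}{c} \approx \frac{1\,\mathrm{m}}{343\,\mathrm{m/s}} \approx 2.9\,\mathrm{ms}

In other words, roughly 3ms of additional output delay “moves” a loudspeaker about 1m further away for every listener it covers.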

Each cue or preset can be crossfaded between different routing, level, delay, and EQ settings for seamless operation as performers move from one zone to another during a show. Operation of the TiMax GUI is extremely intuitive, with drag-and-drop functionality, and familiar console features such as control group faders, signal meters, and EQ displays.

A list of preprogrammed cues or presets is stored as a show or configuration, a number of which can be stored on hard disk or flash drives in the Soundhub. Via a series of simple front-panel screens, soft switches, and a rotary encoder, the user can recall a show and execute its cues. In addition, the user has access to pre-assigned level and mute groups to make adjustments across multiple zones and sources. I/O metering, solos and mutes permit localized zone control, source switching, and diagnostics. Needless to say, access is password protected.


TiMax Tracker

For many years, TiMax imaging cues were executed manually to alter the sound localization as performers moved from zone to zone around the stage. This is still appropriate for certain types of shows but, for others, changes by the director during rehearsal, missed cues, and complex movements by large numbers of actors have the potential to create havoc in the sound department. With the addition of TiMax Tracker, sound imaging automatically follows up to 60 actors at a time as they cross the stage. Considering that it could easily take dozens of cues to achieve this manually, integrating radar tracking substantially reduces preprogramming effort and obviates the need for operator intervention during a performance.

The ultra-wideband (UWB) RF tracking technology, developed by Ubisense in Cambridge, uses a combination of angle-of-arrival (AOA) and time-difference-of-arrival (TDOA) measurement techniques to locate performers wearing TT emitter tags to within 6” in any dimension. In a production of Tosca staged in-the-round at the Royal Albert Hall last year, the TiMax Tracker system automatically tracked the heroine, Tosca, as she delivered her final aria at the top of the castle walls 20’ above the stage before taking her famous tragic plunge over the parapet. The location data transmitted to the Soundhub enabled the system to switch to the appropriate preprogrammed loudspeaker delay and level parameters for the audience to localize her voice accurately at that height.

Typically four to six sensors are recommended for accurate localization and to provide sufficient redundancy, because the UWB pulse emissions from the TT tags do not penetrate objects such as the human body or metallic threads and sequins in costumes; under optimum conditions, however, as few as two sensors are sufficient to determine a precise 3-D location. A major advantage of TiMax Tracker is that adding one or two extra sensors is often all that is needed to overcome a problem encountered during rehearsals due to such scenography or costume issues. The sensors are fairly unobtrusive, measuring just 8 x 6”. Each interlinked group of sensors is called a cell, and multiple cells can be deployed for performances in-the-round and in larger venues or outdoors.
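
To get a feel for how time-difference-of-arrival positioning works in principle, here is a minimal multilateration sketch built on a generic least-squares solver; it is illustrative only, and Ubisense’s actual location engine also folds in angle-of-arrival data and far more robust filtering:

    import numpy as np
    from scipy.optimize import least_squares

    C = 0.2998  # UWB pulse propagation speed, metres per nanosecond

    def locate(sensors, toa_ns, guess=(0.0, 0.0, 1.5)):
        """Estimate a tag's (x, y, z) position from pulse arrival times at
        several sensors. Only arrival-time *differences* matter, so each
        residual compares sensor i against sensor 0.
        sensors: (N, 3) positions in metres; toa_ns: N arrival times in ns."""
        sensors = np.asarray(sensors, dtype=float)
        toa_ns = np.asarray(toa_ns, dtype=float)

        def residuals(x):
            ranges = np.linalg.norm(sensors - x, axis=1)
            return (ranges[1:] - ranges[0]) - C * (toa_ns[1:] - toa_ns[0])

        return least_squares(residuals, guess).x

    # Four sensors on a truss, tag centre-stage at head height.
    sensors = [(-8, 0, 7), (8, 0, 7), (-8, 12, 7), (8, 12, 7)]
    tag = np.array([1.0, 5.0, 1.6])
    toa = [np.linalg.norm(np.array(s) - tag) / C for s in sensors]
    print(locate(sensors, toa))  # recovers roughly [1.0, 5.0, 1.6]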

The TT tags worn by performers transmit at a rate of 0.01Hz-20Hz, dynamically speeding up or slowing down their pulse rate depending on a performer’s rapidity of movement. Stationary or sluggish performers or objects thereby refresh their positions more slowly, freeing up bandwidth for faster pulse rates for dancers, performers on roller skates, and moving objects.
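
The adaptive pulse-rate behavior can be imagined along these lines; the 0.01-20Hz limits come from the published range above, but the speed thresholds in between are invented for illustration:

    def update_rate_hz(speed_m_per_s):
        """Choose a tag pulse rate from its measured movement speed.
        The 0.01-20Hz limits are the published range; the thresholds in
        between are invented for illustration."""
        if speed_m_per_s < 0.05:    # effectively stationary
            return 0.01
        if speed_m_per_s < 0.5:     # slow walking
            return 2.0
        if speed_m_per_s < 2.0:     # brisk movement
            return 10.0
        return 20.0                 # dancers, roller skates, flown performers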

The tags employ a dual-radio architecture, using the license-free 6-8GHz range for the radar pulses, and a bidirectional 2.4GHz radio link for control and telemetry, allowing the system to manage the tags and dynamically vary update rates, send self-identification commands to illuminate tally LEDs, and monitor battery life. In case of failure, tags can be hot-swapped during a show.

A software location engine analyzes the data from the sensors, generating a 3-D animated image of performers moving around the stage and sending location information to the TiMax Two Soundhub matrix. The sensors themselves are networked via Cat 5 cable back to a PoE Ethernet switch that supplies their power. The switch is linked to the host computer and location engine, which is, in turn, connected to the Soundhub via MIDI. A stream of MIDI messages contains the tag numbers that correspond to Soundhub inputs, and their stage locations corresponding to the preprogrammed image definitions. In this way, the mapping and crossfading of inputs to outputs in the matrix is slaved to the location data provided by the tracking system. At the same time, the system can run other playlist imaging or effects cues triggered manually, without dropping a cue.
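
The MIDI link might be pictured as follows. The actual TiMax message scheme is proprietary, so the control-change mapping, controller number, zone list, and port name below are purely assumptions, sketched with the mido library:

    import mido

    # Assumed scheme: one control-change per location fix, where the channel
    # carries the tag/input number and the value selects the image definition.
    ZONES = ["downstage_left", "downstage_right", "upstage_centre"]

    def send_fix(port, tag_number, zone_name):
        """Tell the matrix which preprogrammed image definition to engage
        for the input associated with this tag."""
        msg = mido.Message("control_change",
                           channel=tag_number,          # 0-15
                           control=1,                   # invented controller number
                           value=ZONES.index(zone_name))
        port.send(msg)

    # port = mido.open_output("TiMax")   # name depends on the installation
    # send_fix(port, 3, "downstage_left")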

Events such as playback of prerecorded sound effects or music can be triggered automatically from the system’s internal drives when a performer enters a particular zone. Obviously useful for theatre production, this feature can also be employed in interactive exhibits, for example; if patrons are given TT tags to wear on entering, then their movement through a zoned exhibit will trigger various preprogrammed events—great for haunted houses or museum tours.
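
A zone-entry trigger reduces to an edge detector on the location stream; a toy sketch, with invented zone names and cue files:

    # Remember each tag's last zone so a cue fires only on entry, not while
    # the wearer lingers inside.
    last_zone = {}

    def on_location(tag, zone, play):
        """Call with each location fix; play() starts a cue when a tag
        first enters a zone, e.g. play('creaky_door.wav')."""
        if last_zone.get(tag) != zone:
            last_zone[tag] = zone
            cue = {"crypt": "creaky_door.wav",
                   "gallery": "narration_3.wav"}.get(zone)
            if cue:
                play(cue)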


Loudspeakers

“For TiMax or any other source-oriented reinforcement system to work well, your sound system design—your speaker choice and deployment—needs to be sympathetic to what you’re trying to achieve,” says Haydon. “It’s also pretty pointless trying to do it with only a couple of massive line arrays. If you try to set delay times in the hall for everybody sitting within the incredibly wide dispersion characteristic of a line array, there is no one delay setting that will work for all of those people once you get halfway back into the hall.

“But if you distribute lots of speakers and essentially make it so that within the bounds of the Haas effect and within quite small regions you’re just pointing individual speakers at people, then you can set the delay times on those speakers to focus back to the positions on stage that you want the imaging to locate to,” he adds. “As in any production, the sound system design has to suit the application. If you look at the Albert Hall in-the-round operas, where Robin has implemented TiMax SOR to its purest degree, we have vertical slices of PA hanging in the ceiling, helped along by little ground fills right underneath, and these vertical slices slice into the audience, pointing at a set of seats within a very narrow dispersion angle of 40-60°—there is no PA covering a very wide area, except for a separate end-on orchestra system.”

“And then, for each of those stage zones, you set up the focusing delays for every loudspeaker that’s hitting a slice of the audience, every one of them with different delays that make the different seating areas they cover reference back to each particular zone on the stage,” he continues. “You repeat that for every zone on the stage, and as you might expect, it gets to be a very large number of parameters, but the great thing is you only do it once and thereafter, the TiMax software works out which settings to switch to as performers move around on stage.

“Ideally, you want loudspeaker dispersion characteristics within 60°—we think a 50-60° box is perfect for us, but if you’re not firing it very far, you can have wider dispersion; for example, if you have a 90° box, and it’s only projecting a very short distance, it’s still covering quite a narrow width.”
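
The geometry behind that point is straightforward (my arithmetic, not a quote from Out Board): a box with dispersion angle θ covers a seating width of roughly

    w \approx 2d \tan(\theta/2)

at throw distance d, so a 90° box at a 3m throw covers about 6m of seating, much the same as a 60° box covers at a 5m throw.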

Whittaker likens the process of zoning the audience to slicing a cake—or at least half a cake for shows that are not in-the-round, which he terms “end-on” shows. “From almost all audience perspectives for almost all positions on the stage, the loudspeaker should not deviate too much from a coincident angle of visual perspective and sound perception. You almost always want to be reinforcing the sound with a loudspeaker that’s pointing more or less along the line of sight. Then you set up the appropriate delay and let precedence do the work of localizing the sound source on the stage,” he explains.

“For end-on shows, we have come to the conclusion that the most effective loudspeaker arrangement is a center, an inner left-right pair pointing slightly outward, and an outer left-right pair pointing slightly inward, which gives me the ability to localize with the three inner loudspeakers any action that’s more or less center-stage, and to localize action on either extreme side of the stage with one or the other of the outer pair. I can deliver sound to the entire auditorium with loudspeakers pointing more or less in the right direction for action anywhere on the stage. It’s best to position the loudspeakers within the width of the stage, but, if that’s not possible, it’s not too disastrous. And if you have the luxury of onstage anchors, that helps as well,” Whittaker says.

Haydon recognizes that TiMax may be—and frequently is—used in a system with line arrays, “in which case you’re working between the theoretical ideal—and when do we ever get that?—and practical reality, which is about where management will let you place loudspeakers. Where it’s a big space, such as an outdoor stadium with huge line arrays, we apply a lot more compromises to the delay settings and use level localization a bit more, but we don’t move away from the core principle of Haas-effect localization. With the wide horizontal coverage angles inherent in line arrays, you have to be more careful about where you center the delays to ensure that you’re not creating echoes for half the people who are listening to each array.”

So how long and involved is the process of setting up a system for a performance with something like 16 zones on stage?

“With Smaart and a spreadsheet to handle the numbers, we can set up the Royal Albert Hall for opera in about two or three hours. It’s very simple—it helps, of course, if we show you how to do it first. We set up a lot of static zones for an orchestra in the Coliseum in Rome in less time than it took the Meyer guys to EQ with the SIM system. It probably takes less time than it takes the lighting guys to set up. An extra bonus to using Smaart is that you can eliminate any errors resulting from latency in all the digital gear that’s used these days,” Haydon notes.

“The setup is not razor-sharp, because there’s a window within which you work—it’s not going to be down to half a millisecond, but more like within 10 to 15ms. The Haas effect works over a range of zero to 25ms, or about 25’, so there is some latitude there within which the precedence effect still works for accurate localization,” he says.

“We tend to be reasonably generous with over-delays so that we can be reasonably generous with the size of stage zones, front-to-back,” adds Whittaker. “With spoken word and the singing voice, we’ve got a threshold of echo perception of about 25ms, which means we can have a stage zone 5-6m in diameter with a little bit of over-delay in the front area. But tightening it up to 3-4m makes the sound imaging a bit more coherent.”
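
The zone sizes follow directly from that echo threshold (again, my arithmetic): the worst-case extra path length anywhere within a zone must stay inside the perception window, and

    c \cdot t = 343\,\mathrm{m/s} \times 0.025\,\mathrm{s} \approx 8.6\,\mathrm{m}

so a 5-6m zone leaves a comfortable margin of over-delay before any listener hears the reinforcement as a discrete echo.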


Delay Panning vs. Level Panning

To this day, there remains a legacy requirement for mono compatibility in multichannel sound systems, from two-channel stereo upwards, particularly in matrixed systems where multichannel signals may be inadvertently collapsed or intentionally downmixed to mono. When multichannel signals that are not coincident in time are downmixed to mono, undesirable audible anomalies occur, including partial or complete phase cancellation, comb filtering, and flanging effects. It is primarily for this reason that simple level panning is preferred over delay panning in sound reproduction systems, since the only significant anomaly that occurs in mono mixdowns of level-panned material is a modest and generally tolerable increase in center-channel buildup. Once the current technological migration away from matrixed multichannel sound formats, such as FM stereo and Dolby Surround, to discrete formats is complete, however, the historic requirement for mono compatibility will likely disappear, along with the concomitant bias against delay panning.
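
The comb filtering is easy to quantify: summing a signal with a copy of itself delayed by Δt yields the standard magnitude response

    |H(f)| = \left|1 + e^{-j2\pi f\Delta t}\right| = 2\left|\cos(\pi f\Delta t)\right|

with complete nulls at f = (2k+1)/(2Δt); even a 1ms offset collapsed to mono puts the first null at 500Hz, squarely in the vocal range.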

“Unless you’re sitting absolutely mid-way between two loudspeakers, it’s very hard to get a decent level-pan law from left to right,” says Whittaker. “As soon as you’re sitting closer to one loudspeaker, the other has to get 10dB or 12dB hotter before you hear the sound moving in its direction. Whereas with delay panning, you are electronically moving one loudspeaker closer to you and the other further away. In the tests that we’ve done panning sounds around a multichannel surround sound system, when we do it just with level panning, you tend to hear the loudspeakers and the spaces in between them—you hear the sound hopping from one loudspeaker to the other. But when we delay-pan that same signal, or pan it with a combination of level and delay, we get a much smoother transition. If you don’t take control of sound in the time domain, it will mess with you and get in your way.” He issues the caveat that, in order to execute delay panning properly, you need information about the performance space and the spatial relationships between the various loudspeaker systems in it:

“Our take on it—and we have filed patent applications on the subject—is that when you mix a multichannel stem, metadata should be embedded in the audio stream to steer a delay matrix at the point of delivery, and allow for correct steering on playback in rooms of widely varying sizes and geometry. The home for this sort of technique is definitely in the performance audio space, whether it be a boardroom with a conferencing system where you want to be able to provide directional cues for who is talking on the other end of the line, or arena opera or musical theatre—the benefits are always significant.”


Increased Intelligibility at Lower Volume

Most people find it fairly easy to follow a conversation at an animated cocktail party, because directional cues greatly increase intelligibility in noisy environments. This “cocktail party effect” is equally applicable to performance sound.

“If you mix multiple sound sources to mono in a large listening space such as a theatre—the idea of stereo in a large listening space is a bit of a misnomer—all of the different elements of the mix are sitting in the same place spatially,” explains Whittaker. “That limits the ability of the auditory cortex to discriminate based on binaural cues relative to spatial position, and hence limits your ability to focus on what is of interest to you. Whereas if sound is presented in a distributed, multiple-mono way, as it is in nature, where every sound is essentially monophonic with a single path to each ear, your binaural system enables your auditory cortex to focus on each part as you wish. This means that in a source-oriented reinforcement system, we don’t have to boost the level of a solo to make it stand out, as we would in a conventional mix. The solo has its own spatial signature, and by processes of cross-correlation between time differences of arrival at your two ears, you can filter out almost everything else.”

He related the story of a sound mixer who, having mixed the same show on a conventional system for five years, confessed that, after hearing it once through TiMax, he heard details in the orchestration that he never knew existed. Perhaps not coincidentally, filmmaker Michael Gibson told me, after attending a recent demonstration of new 3-D cinema technology featuring the rock band U2, that he was able to distinguish details in the set—specifically a bandana lying on the stage—that would have gone unnoticed in a regular 2-D version of the movie. I suspect the same sort of thing is going on in the brain in both cases: a kind of selective foreground-background discrimination process, made possible by two eyes or two ears working together.

“In the large, outdoor shows from which we’ve garnered a bit of a reputation, where you’ve got a stage that’s 80m (250’) wide and 5,000 spectators on bleachers, ‘Who is speaking now?’ is incredibly important. You need your ears to guide your eyes. Without that auditory cue, it’s hard to know where to look in this great big space with a 180° visual panorama in front of you that is full of performers,” Whittaker says.

Other benefits may be less obvious, but no less important. With a distributed loudspeaker system, an SPL of 100dBA can be delivered 30m (about 100’) from the stage without requiring 130dBA in the first row. This enables better conformance to health and safety criteria where applicable, and leads to reduced annoyance and stress levels among members of the audience. In addition, the lower overall SPL leads to fewer problems with reflections from the walls, balcony fronts, and other architectural features of the room. Loudspeakers placed closer to the audience do not excite the space to the same extent as a large and distant flown system. Lower-powered amplifiers and loudspeakers tend to be smaller and easier to mount and conceal, and may be less expensive than high-powered units of equivalent quality.
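
The inverse-square law supplies the numbers behind that comparison (my worked example): a single distant source would have to produce

    L_{1\mathrm{m}} = 100\,\mathrm{dBA} + 20\log_{10}\!\left(\tfrac{30\,\mathrm{m}}{1\,\mathrm{m}}\right) \approx 129.5\,\mathrm{dBA}

at 1m to deliver 100dBA at a listener 30m away, which is precisely why distributing many closer, quieter loudspeakers spares the front rows.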

Prior to its general market rollout this summer, TiMax with the new TiMax Tracker automation was put through a few real-world tests in 2008. In addition to Tosca in-the-round at the Royal Albert Hall, a production of Tellspiele, the story of William Tell, was staged in the Uri Theatre in Altdorf, Switzerland, from August through October, under the direction of Volker Hesse.

The theatre’s interior was almost completely refashioned to create a 35m (100’) long stage platform stretching all the way from the rear upstage wall right to the far back of the auditorium, above the original audience seating area. New audience seating was established in tiers, atop and at a 90° angle to the original seating, on both sides and facing across the new central stage.

The audience’s visual scope spilled beyond this wide 180° plane, however, because a significant portion of the action took place behind both seating areas. The potential for audience confusion was, therefore, fairly high, but TiMax Tracker’s inherent 3-D capability was invaluable in maintaining localization in the performance areas high above the main stage platform.

The stage was broken down into 16 different zones: six across the expanse of the central stage, four on each side at the back of the audience, and one each at the extreme ends of the stage. Sixteen actors, each with an individual Sennheiser wireless microphone, were tracked as the action took them through the full span of the established TiMax zones across the central stage and to the far rear of the audience area.

A distributed loudspeaker system, comprising 21 Klein & Hummel Pro-X6N enclosures, was hung as seven outward-facing pairs along a central truss above the stage, with a further three cabinets positioned at each end of the stage and pointing towards the opposite stage end. A single box covered the sound-mix position.

“We knew that vocal localization from TiMax was going to be very advantageous to the production but, in hindsight, Tellspiele would have been almost impossible to comprehend without it,” said sound designer Tom Strebel. “The action in many places is so fast—the audience must be able to instantly place which actor is speaking. We also used the TiMax Tracker system for the first time, and its reliability and accuracy far exceeded my expectations, even knowing the quality of the TiMax product. It is fascinating to watch the software depict the movement of the actors between the zones and hear the results. It also gives the actors more freedom and makes watching even more pleasurable for the audience.”