Pro Tools, OMFs, and the Audio Post Workflow
There are as many audio post production workflows as there are systems to run them on, but thanks to industry conventions and Digidesign's long track record, Pro Tools has become the de facto standard for film and video audio alike.
Over the course of this tutorial, I’ll explain the process of working with audio for video and describe the tools that help along the way. I’ll also introduce you to a container format called OMF, which is not only a standard but also very helpful in getting your project ready to mix.
Before we begin, I should say that this is a Pro Tools article. These same practices will most likely hold true for other DAWs, but this tutorial describes them using Digidesign’s (now Avid’s) software.
Note: In order to follow along you must of course have Pro Tools, but more importantly a software package called DigiTranslator, which is available by itself or bundled into the DV or Complete Production Toolkits. If you’re on LE, I’d highly recommend at least the DV Toolkit, as it also enables the timecode and feet+frames rulers as well as some other features that are essential when working in post.
OK, so now that you’re eager to edit, let’s begin.
Step 1: A Little Background
The first step towards a finished mix is getting materials from the client. This will most likely be the picture editor, but might just as well be the director or producer. In any case, before you’re handed said materials, it’s important to meet with and get to know the production heads. You’ll also want to speak with the editor, because you’ll be asking for a few things to help you get set up.
Working in Pro Tools means that you have many tools at your disposal. Knowing how to utilize the power of Pro Tools is of course your first task, but working in the audio post environment brings along another set of issues to deal with. For one, post production deals in timecode, not minutes/seconds. Another point of interest is session setup. You’ll most likely be working with 24-bit / 48kHz files, and where video is concerned, QuickTime is king.
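To make the timecode point concrete, here is a minimal Python sketch that converts between HH:MM:SS:FF timecode and an absolute frame count. It assumes non-drop-frame timecode at a whole-number frame rate (24, 25 or 30 fps); drop-frame 29.97 needs extra handling that is not shown. The function names are hypothetical and are not part of Pro Tools.

```python
def timecode_to_frames(tc: str, fps: int) -> int:
    """Convert non-drop-frame 'HH:MM:SS:FF' to a frame count from 00:00:00:00."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    if ff >= fps:
        raise ValueError(f"frame field {ff} is out of range for {fps} fps")
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def frames_to_timecode(frames: int, fps: int) -> str:
    """Convert a frame count back to non-drop-frame 'HH:MM:SS:FF'."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


if __name__ == "__main__":
    # Program start at the hour mark, at 24 fps.
    print(timecode_to_frames("01:00:00:00", 24))
    print(frames_to_timecode(36, 24))
```

The point of the frame field is that every position on the timeline lands on a picture frame, which minutes and seconds alone cannot express.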



We’ll go over requirements in a bit, but first let’s talk about what an OMF is. The Open Media Framework Interchange, or OMF for short, is a standard developed by Avid to aid in transferring sessions between editors. The file format has been around for quite a while now, and while its reputedly better-suited cousin AAF has been getting attention recently, editors still mostly use OMFs to get material from one person to another.
The OMF will open the session timeline in Pro Tools just as it was exported from the NLE. In it you’ll find session info, track layout, pan and automation, timecode, bit depth and sample rate and of course the tracks the editor transferred. This container format, while exhaustive, will only hold 2GB of data. When dealing with short films you should be fine, but on long-form projects you should have the editor break the film into reels of 20 minutes or less.
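As a rough way to judge whether a reel will fit, the sketch below estimates the worst-case size of embedded audio, treating every track as if it were full of uncompressed PCM for the whole reel. It reads "2GB" as 2 GiB, which is an assumption, and it ignores handles and file overhead. The function names are hypothetical.

```python
# Assumption: the "2GB" container limit is treated as 2 GiB.
OMF_LIMIT_BYTES = 2 * 1024**3


def embedded_audio_bytes(tracks, seconds, sample_rate=48000, bit_depth=24):
    """Worst-case bytes of uncompressed mono PCM if every track is full."""
    bytes_per_sample = bit_depth // 8
    return tracks * seconds * sample_rate * bytes_per_sample


def fits_in_omf(tracks, seconds, sample_rate=48000, bit_depth=24):
    """True if the worst-case estimate stays under the assumed limit."""
    size = embedded_audio_bytes(tracks, seconds, sample_rate, bit_depth)
    return size < OMF_LIMIT_BYTES


if __name__ == "__main__":
    reel = 20 * 60  # a 20-minute reel, in seconds
    for tracks in (8, 16):
        size = embedded_audio_bytes(tracks, reel)
        print(tracks, "tracks:", size, "bytes, fits:", fits_in_omf(tracks, reel))
```

Real timelines are rarely wall-to-wall audio on every track, so actual files tend to be smaller than this estimate, while generous handles push the size back up.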
You should also ask the editor to include handles, which can be vital extra audio when editing dialogue, building fades and creating room tone. These ‘handles’ are extra audio on either side of each region that you can pull out if need be. While in the asking mood, it's also a good idea to ask the editor to embed the audio in the OMF, which avoids the problem of unlinked audio and missing regions.
Once you’ve discussed the OMF, you can move on to asking for the video in a specified format as well. These days people are shooting with so many different cameras and while SD is still around, HD is becoming the standard. You shouldn’t concern yourself with the full-res video, instead ask the editor to transcode you a copy in one of the following formats.
- Standard Definition
  - DV
  - DV-25
  - DV-50
  - Motion JPEG A (MJPEG A)
- High Definition
  - Apple ProRes
  - DNxHD (download the free codecs from Avid)
Note: I’m assuming you’ll be working with video internally, so we'll skip the discussion of peripherals and the MXF format.
While Pro Tools only works with QuickTime, this “container format” can hold many different codecs. Each codec (short for COmpression/DECompression) has its advantages and disadvantages, and each falls into one of two families: INTRA-frame or INTER-frame compression. Either way, the video is compressed into a smaller file for viewing and delivery (think MP3 for audio).
Important to audio editors is 'INTRA-FRAME' compression, that is, a codec that compresses each frame individually and then assembles them in line, frame edge to frame edge. The other type, 'INTER-FRAME', denotes a style of compression that uses a Group Of Pictures (GOP) approach, i.e. compressing the delta (change) between frames rather than each frame individually. Without being too specific, the first option creates larger files and is frame-edge accurate, whereas the second option is meant to shrink file size without sacrificing video quality (think H.264). From now on, try to ask project editors for the former. Your region spotting will thank you.
Step 2: Working With the OMF
Ok, now with the details out of the way, we’re ready to begin the import. You’ve probably obtained the project materials on either DVD or HDD. In either case, transfer them to your hard drive and, if at all possible, place the video on a separate drive from the session. Even better would be to have separate drives for video, the session and Pro Tools. Handling your sessions this way enables higher track counts, faster I/O and fewer DAE errors.
Once the files reside on your system, navigate to the OMF file and double click it. The first screen you’re presented with is the new session window. Here you can change the bit depth and sample rate of the audio files. Let’s leave them as is, retaining the project settings.



After clicking through you're presented with the save dialogue window, which will create your new project parent folder containing the session and audio files.



Step 2b
The Session setup window.



This window is the heart of the OMF import process. Here you'll find many options, each of which I'll go over in greater detail.
Step 2c



This is the first pane on the top left. This area contains the source properties, like the name and information for the OMF, and thus the editor's project. If you saved through the new session window, your project will remain as stated here. You'll notice that even though I've spoken about post projects as 24-bit/48kHz, this project was created as 16/48, which is perfectly fine. There's no need to change the bit depth or sample rate here, since these are the settings the production chose. You'll find that large pockets of the industry still use 16/48, which can in part be attributed to the widespread use of Sony's DigiBeta decks and older Avid systems. These only support a bit depth of 16 bits.
Step 2d



The next pane down is the media import options. While you could choose several options here, you'll notice that I've selected "Copy from source media" in the audio section. The reason for this is twofold. One, it's good to create a self-contained session with all media files residing in either the parent or audio files folder; this makes it easy to transport and back up. Two, Pro Tools has a hard time linking to OMFs. If you selected "Link to source media", Pro Tools would attempt to read from the container and would most likely run into errors. The system is happiest working in its default folder structure.
The video on the other hand is different. Since you've copied the video to a location on a hard drive, linking is perfectly ok, and actually the norm. To get access to the video, you'll be going through another import dialogue window anyhow, so for now just leave this setting alone. Hopefully when you talked with the editor, you stipulated that you wanted the video separate from the OMF!
Step 2e

This pane refers to the automation written by the editor. This is one of the inherent benefits in the OMF format. During the edit, our editor has probably constructed cuts and fades to align audio to picture. He's also most likely administered basic volume automation to get the mix in check.
Since we're not in the business of over-working ourselves, we'll click the "Ignore rendered audio effects" and select "Convert clip-based gain to automation". Highlighting these two carries over that mix to our DAW and kills any clip processing with regard to gain, leaving the automation to dictate the volume. This is important because most likely the director has sat with the editor and semi-finalized the rough mix, so this is a hint at where they want levels.
It's also important to have this automation data because you'll probably re-write most of it. In the NLE environment, all edits are done at the frame edge, but in audio we're able to work much more granularly, going all the way down to the sample level. Because of this, it's beneficial to re-fade and re-automate some regions to shape the volume and clip boundaries more accurately.
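To put a number on the difference between frame-edge and sample-level editing, this small sketch computes how many audio samples fit inside one picture frame. It uses exact fractions so that 29.97 fps (30000/1001) is not rounded. The function name is hypothetical.

```python
from fractions import Fraction


def samples_per_frame(sample_rate, fps):
    """Exact number of audio samples that span one video frame."""
    return Fraction(sample_rate) / Fraction(fps)


if __name__ == "__main__":
    rates = {
        "24 fps": Fraction(24),
        "25 fps": Fraction(25),
        "29.97 fps": Fraction(30000, 1001),
    }
    for name, fps in rates.items():
        print(name, float(samples_per_frame(48000, fps)))
```

At 48kHz and 24 fps an NLE cut can only land every 2000 samples, so a fade or edit that clicks at the frame edge can usually be cleaned up by moving it a few samples in the DAW.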
Step 2f



The main dialogue across the middle shows the track layout/count and enables you to import with several options. The default is to create a new track for every track in the OMF. If you're new to this process, I'd suggest leaving this as is.
However, if you've been working with projects like this for some time, you've probably created a template session with routing and plugins instantiated. If this is the case, then you'd have already opened the template and set the session settings to the project.
The following picture shows you where to go to import the OMF when the session is already open. When you import this way, each drop down menu per track reveals the tracks existent in the session, thus enabling you to route the OMF tracks to the correct designation in your template.



Step 2g



The last pane is where you decide on what track data to import. There are many options here, most of them fine to leave alone. For example, you probably want to keep volume automation, I/O labels, comments and pan settings.
One setting I DO like to uncheck is the bottom one, mix and edit groups. I group my own tracks into logical stems and therefore have no need to carry over any grouping done earlier. You'll also take note that most of these settings are similar to functions in Pro Tools, and indeed they are.
Most of the options are for DAW to DAW transfer and thus won't really show up in an NLE to DAW transfer. Most of these you can leave alone.
Step 3: Session Window



Now on to session setup. Let's say you elected to do a vanilla import, not allocating tracks to an existing template. You'd be left with something quite like what's above. You'll notice the track names take on numbers and, more importantly, the session is set up exactly as the editor left it, cuts/fades and all.



The picture above illustrates a common workflow for editors, but one that doesn't quite work for audio engineers. You'll notice that the top four tracks are very similar in length. This isn't a coincidence; it's how editors are forced to work: in stereo. It doesn't matter whether you're working in stereo or surround, dialogue is always mono!
We need to do a little investigation here and decide which tracks to ditch. It would be easy to just look at the waveforms and delete visibly identical ones, and most of the time you'd be right. However, during production, sound recordists utilize two channels in very different ways. Some record boom on 1, lavalier on 2. Some do close mic on 1, room on 2, and others simply record stereo. It's important to ask which technique was used.
I've even found that some editors just duplicate a track they like to make it stereo for use in the timeline. In any case, it's important to understand what you're getting rid of. Even though the audio still resides in the region bin, it's nice to have a choice on the timeline without searching through hundreds of similarly named regions. It's also important to note that these top tracks could have been anything, but most editors stick to convention, placing dialogue in the uppermost tracks.
Step 4: Video
Now it's time to import that video. I'll assume you'll just be working with video in the same window, or if you're lucky enough to own two monitors, the second window.



Click through here to find the video where you initially saved it.



You're presented with the options window above. Here you'll select "New Track" and "Import audio from file". You want a copy of the embedded (guide) audio in case you need to spot sync issues or just need a reference as to what the editor was going for. In most cases it'll be a hidden track, but you want it in your session. The option in the middle toggles between session start and spot. Since I don't know the timecode yet, I'll keep it as session start and show you how to spot next.



Spotting picture is very easy. Essential to the process, however, is a window burn (or timecode burn), which means asking the editor to include a visible timecode display somewhere on the picture. Having this makes your job SO much easier. To spot video, click anywhere in the video timeline and write down the timecode you see. For ease of use this could be program start at 01:00:00:00, but any time will do. Now the important part: at that visible frame, drop a sync point.



For this to be accurate, you need to spot that sync point to the timecode you wrote down. To recap: find a point in the video timeline and write down or remember the timecode reading (on the video), then drop a sync point at that point. You should see a little arrow on the video track at the exact point where you dropped it.
Now switch to SPOT mode in the mode selector in the upper left part of the edit window, and select the HAND tool. Now, click and try to drag the video region. Instantly you'll be presented with the spot dialogue window.



Now enter the timecode you wrote down (or remembered) into the sync point pane and hit enter. Voilà! You've just spotted the video.
Now if you set your grid and nudge settings to timecode and 1 frame, you can nudge or move your cursor anywhere in the video window and visually check the Pro Tools timecode display against the video burn-in; they should match. Importantly, you should lock the video region (Command + L / Control + L). With the video spotted, the rest of the regions in the session should be frame accurate with the picture.
For help in matching the guide audio to the video, you can either set the edit window to grid and visually match the two regions, or highlight the video region and, with the HAND tool selected, Control + Click the guide audio. It should snap to the highlighted (video) region.
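The arithmetic behind spotting by sync point can be sketched as follows: the region start ends up at the sync timecode minus the distance from the start of the clip to the sync point. This is an illustration only, assuming non-drop-frame timecode at a whole-number frame rate, with hypothetical function names; Pro Tools does this for you in the spot dialogue.

```python
def tc_to_frames(tc, fps):
    """Non-drop-frame 'HH:MM:SS:FF' to an absolute frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def frames_to_tc(frames, fps):
    """Absolute frame count back to non-drop-frame 'HH:MM:SS:FF'."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def region_start_for_sync(sync_tc, sync_offset_frames, fps):
    """Where the region must start so its sync point lands on sync_tc.

    sync_offset_frames is how far into the clip the sync point was dropped.
    """
    start = tc_to_frames(sync_tc, fps) - sync_offset_frames
    if start < 0:
        raise ValueError("region would start before 00:00:00:00")
    return frames_to_tc(start, fps)


if __name__ == "__main__":
    # Sync point dropped 240 frames (10 seconds at 24 fps) into the clip,
    # on the frame whose burn-in reads 01:00:00:00.
    print(region_start_for_sync("01:00:00:00", 240, 24))
```

In this example the clip carries ten seconds of leader before program start, so the region begins at 00:59:50:00 and the first program frame sits on the hour.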
Step 5: Session Setup
Before going too far in setting up the tracks, I like to drop a few markers to aid in selecting regions and help in bouncing final audio. The first thing I do is drop a marker at each end of the video region. After that's done, I drop a marker at the beginning of the 2pop, and another after the tail pop. I also drop a marker at the beginning of program (most likely the hour mark, 01:00:00:00).

Let's talk a bit about these markers and areas I've just mentioned. The editor might have made the video region exactly the length of the program, but in most cases this isn't true. More often than not, they've included bars and tone, a 2pop and hopefully a tail pop. Bars and tone are a leftover from tape and analog systems and don't really have a place in the NLE to DAW world, but nevertheless they still get added. Bars were a color gamut test of the video playback window and tone was used to calibrate audio levels from system to system. In our current digital realm, systems are most often speaking the same language, and until you need to lay back to tape, these two aren't as important.
The 2pop (also called a flash or blip) is both a visible and audible marker that tells both editors that program starts in exactly 2 seconds. These pops are very helpful because they mark exact points relative to the program's beginning and end. Although in some programs it's easy to tell where the first and last frames are, most film projects contain at least one fade up or down, so it can be hard to judge where they actually begin or end. Plus, unless the project is broken into reels, program should always start at the hour. This is convention and should be followed. You'll notice that the video I'm using for demonstration doesn't start at the hour; it actually starts at 0. This can be used as well, but it doesn't fall in line with the industry. I had a talk with the editor about it.
The editor probably already created a 2pop at the head of the program, but might not have created a tail pop. If not, and the picture has a visible end, create one. Otherwise, either ask for the last frame of program or leave it off. I also always create my own pops. Usually the ones included are either too loud, not the right tone, or worse, go on longer than expected. It's easy to create your own. Find the frame exactly 2 seconds from program and highlight exactly one frame. This is either 2 seconds before program start (head pop) or 2 seconds after program end (tail pop).

Select the frame with your cursor.



Then instantiate a signal generator Audiosuite plugin.



Leave the settings in the signal generator at SINE wave, 1000Hz (1kHz) and -20dBFS, as these are the industry standard.

Now you're left with a 1 frame region containing a 1kHz audio sine wave. You can lock the region to keep it there.
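If you'd like to see what that one-frame pop contains, the sketch below builds the same thing outside Pro Tools using only the Python standard library: a 1kHz sine at -20dBFS lasting one frame. It assumes 48kHz audio, a 24 fps picture, 16-bit mono output and a peak (not RMS) level reading; the function names and the output filename are hypothetical.

```python
import math
import struct
import wave


def one_frame_tone(sample_rate=48000, fps=24, freq_hz=1000.0, level_dbfs=-20.0):
    """Return 16-bit samples for a sine tone exactly one video frame long."""
    n_samples = round(sample_rate / fps)          # 2000 samples at 48k / 24 fps
    peak = (10 ** (level_dbfs / 20.0)) * 32767    # assumption: peak level
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_samples)
    ]


def write_mono_wav(path, samples, sample_rate=48000):
    """Write 16-bit mono PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    write_mono_wav("2pop.wav", one_frame_tone())  # hypothetical filename
```

At other frame rates the sample count changes, and at 29.97 fps it is not a whole number, so the rounding here is a simplification.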
Step 6: Track Setup
This is always a point of contention. Many different workflows apply, and each house/engineer handles this according to ease of use or layout with regard to hardware patching and I/O. I'm going to take you through my preferred setup, the one I've been using for several years now.
I group my tracks based on the final deliverables I'll hand to the client when done. It's industry standard to provide the client with the following tracks upon completion.
- Composite (Full Mix / Comp)
- Dialogue Stem (Dx)
- Effects Stem (Fx)
- Music Stem (Mx)
These completed tracks get created as a function of grouping and routing in your mix window. I'm on an LE Complete Production Toolkit system, but this will work just as well in HD and regular LE.



I always start with 4 dialogue (Dx) tracks, always mono. In the effects (Fx) stem I use 6 stereo Fx tracks, 4 mono Fx tracks and 4 stereo ambience tracks. For Music (Mx) I usually create 2 stereo tracks to begin with. I also create a master AUX and a print track (audio track) for each stem. Of course this is a beginning point. It's common for me to create more Fx tracks as I need them. If you've followed this setup, you could save this as a template for future use.
With these tracks created, you'll now have to drag the regions from the OMF up into your template. Be absolutely sure you drag with the Control key pressed; this constrains the move to vertical, so the regions stay in sync. As a tip, drag the dual mono Fx and Mx tracks to a stereo destination track, so they're easier to manage and automate.
Now it's up to you to mix the film. Keep in mind that you should maintain audio levels designated by the standards organizations for TV and Film. Do not just mix to digital zero!
Step 7: Deliverables
With the mix nearing completion and the director happy with your work, it's time to begin the export of audio regions. First you'll need full length regions for all the stems. The way most editors handle this is by printing the finished audio back into the session. This is easy.

You'll notice above that output has been designated to 1-2. To print back into the session you'll need to select tracks and hold the CONTROL key while selecting output. Once pressed, select an unused bus and let go. The output selector will now display a '+' next to the reading, designating it for dual outputs. You'll want to do this for each of your stems (Comp, Dx, Fx, Mx). Each of these should go out to a separate bus (keep the Dx mono).
Once each stem aux has both outputs routed to a separate bus, head over to the print tracks you created. Set the input of these tracks to the output of each corresponding aux stem. So, in essence here's the signal flow for Dx. 4 channels Dx -> Dx Aux Stem -> Master + Dx Print Track. Once this is completed, set each print track to a hardware output you don't use, this will prevent double monitoring and feedback. Next, arm each print track and highlight both the lead and tail 2pop markers and hit record. You're left with a full mix and stems including pops to mark sync.
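The Dx signal flow described above can be pictured as a small routing graph. The sketch below is only a model for checking your own patching on paper; the track and bus names are hypothetical labels, and nothing here talks to Pro Tools.

```python
# Hypothetical names: each key sends its output to every destination listed.
ROUTING = {
    "Dx 1": ["Dx Aux"],
    "Dx 2": ["Dx Aux"],
    "Dx 3": ["Dx Aux"],
    "Dx 4": ["Dx Aux"],
    "Dx Aux": ["Master", "Dx Print"],   # dual outputs (the '+' in the selector)
    "Dx Print": ["Unused Output"],      # avoids double monitoring
}


def destinations(source, routing):
    """Return every track, bus or output that audio from `source` reaches."""
    seen = set()
    stack = [source]
    while stack:
        node = stack.pop()
        for dest in routing.get(node, []):
            if dest not in seen:
                seen.add(dest)
                stack.append(dest)
    return seen


if __name__ == "__main__":
    print(sorted(destinations("Dx 1", ROUTING)))
```

If a dialogue track does not reach both "Master" and "Dx Print", the stem aux is missing one of its two outputs.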
Printing in the style above enables you to stop and revisit any issues during the bounce. If this is the case, don't forget to re-consolidate the regions to full length at the end. When you're left with the printed regions, double click and name them appropriately and export them (Shift + Command + K) to a newly created folder (maybe called Final) inside your Pro Tools session folder structure, this enables you to keep everything in one parent folder. Now burn these files to a DVD, hand them to an editor and go get some congratulatory Champagne.