Syncing audio to video of a grabbed VHS video

In grabbed VHS videos, sound and picture often drift apart.
This page explains how to fix the audio track.

Why audio and video can get out of sync
Find out the exact delay time between video and audio
How to sync audio and video
Delay of mp3 audio in videos

Why audio and video can get out of sync

Symptom

When playing the movie you notice a delay between the sound and the picture. In most cases the offset grows with the running time of the video. Sometimes the properties window of the video file also shows different total running times for the audio and video tracks.

Causes of audio delay

  1. Audio delay as a result of the automatic video frame synchronization of the VHS player.
    VHS players compensate for geometric variations in the mechanical tape transport by slightly adjusting the playback speed until the image is properly synchronized. This is done automatically or manually via a dial. As a result, the actual frame rate during VHS playback is never exactly 25 fps; it always differs slightly.
    If the grabbed recording is played on a computer at exactly 25 fps, the video track gets an incorrect duration. The audio track, however, is played at the same sampling frequency at which it was recorded. As a result, the audio and video tracks diverge.
    A deviation of only 0.1% in video speed already leads to an audio delay of about 1.2 s after 20 minutes of running time.
    This audio delay grows steadily with the running time of the video.
  2. Sudden offset due to image dropouts (Dropped Frames).
    Wrinkled VHS tape leads to destroyed frames. This can result in dropped frames in the video file, because the grabber sometimes cannot sync to destroyed frames. The audio, however, is recorded without interruption. The biggest source of this problem is E-300 cassettes, because they use the thinnest video tape.
    Debut Video Capture software displays the number of dropped frames at the end of the recording.

    Another cause of dropped frames is that some codecs handle video streams containing NULL frames incorrectly.
    In this case the raw captured video still seems fine, but as soon as it is edited further, sudden jumps between video and audio occur, and they keep growing. (The video gets further and further ahead of the sound.)
    How to avoid this is described in Avoiding Dropped Frames during video import.

    Audio offset caused by lost frames occurs in steps: at the faulty points of the video stream the sound shifts abruptly and picks up a significant delay.
  3. Sudden dropouts due to processor overload (Dropped Frames).
    In the past, dropped frames also arose from high CPU load during capture, when the computer could not compress and store the frames in time.
    This error should not occur if you follow the advice given in part 1.
  4. In rare cases, stronger fluctuations in playback speed (frame rate) occur, e.g. due to stretched video tape. In this case the pitch of the audio track fluctuates as well.
    This can only be corrected with extreme effort.
    The procedure would be:
    1. Correct the pitch fluctuations with special software such as Celemony Capstan.
    2. Correct the audio delay as shown on this page.
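
The arithmetic behind causes 1 and 2 can be sketched in a few lines of Python. This is only an illustration and not part of any tool named on this page. The function names are made up for this example. It assumes a constant speed deviation, and it assumes that each dropped frame shifts the sound by exactly one frame duration (40 ms at 25 fps).

```python
def drift_ms(speed_deviation_percent, running_time_min):
    """Audio offset in ms accumulated by a constant speed deviation."""
    running_time_ms = running_time_min * 60 * 1000
    return running_time_ms * speed_deviation_percent / 100


def dropped_frame_offset_ms(dropped_frames, fps=25.0):
    """Sudden offset in ms, assuming one frame duration per dropped frame."""
    return dropped_frames * 1000 / fps


# Steady drift: 0.1 % speed deviation after 20 minutes of running time
print(f"drift: {drift_ms(0.1, 20):.0f} ms")

# Stepwise offset: 5 dropped frames at 25 fps (assumed count)
print(f"step:  {dropped_frame_offset_ms(5):.0f} ms")
```

The first value grows with the running time, the second one appears all at once at the damaged spot of the tape.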

An example

Audio delay in VHS video
Audio delay vs. running time of a grabbed VHS video (frame rate not 25 fps, and dropped frames starting from 0:50)

The figure shows the actual audio delay of a fairy tale I digitized from VHS. Until the 50th minute there is a steadily increasing offset, which arises solely because the tape does not run at exactly 25 frames per second. The frame rate differs by only 0.026%. But that is enough: from about the 10th minute, when the shift exceeds 100 ms, you notice that the audio track lags behind the picture. At 50 minutes the offset has grown to 800 ms, almost one second.

From the 50th minute the audio delay suddenly increases sharply. In that range I could see stepwise movements in the digitized video, so there were dropped frames or (in this case) delayed frames. The audio track also contained some small clicks and crackling noises. I therefore conclude that the dropped frames result from disturbances in the VHS tape transport and were not caused by excessive computer load during digitizing.

I then performed the time corrections of the audio track in Wavelab, correcting each section of the graph between two points separately. The correction value used as the offset is always the difference between the offset at the end of the section and the offset at its beginning. For example, between 1:05 and 1:07 the delay grows by 200 ms. After these adjustments, sound and picture were in sync.
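
The rule for the correction values can be written down as a small Python sketch. The readings below are hypothetical and were not taken from the figure. The function name is also made up for this illustration.

```python
def section_corrections(measurements):
    """Correction per section from (timestamp_min, offset_ms) readings.

    The readings must be sorted by time. Each correction is the offset at
    the end of a section minus the offset at its beginning.
    """
    result = []
    for (t0, off0), (t1, off1) in zip(measurements, measurements[1:]):
        result.append((t0, t1, off1 - off0))
    return result


# Hypothetical readings: (minute, measured audio offset in ms)
points = [(0, 0), (50, 800), (65, 1500), (67, 1700)]

for start, end, correction in section_corrections(points):
    print(f"{start:>3} to {end:>3} min: lengthen audio by {correction} ms")
```

Each printed value is the amount by which the audio of that section alone has to be lengthened.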

Step 1: Finding out the exact delay time between video and audio

Before you can correct the audio offset, you must first find out exactly how big it is.

a) Estimation by reading the file information of your video

Unfortunately this only works with MPG files.
In AVI files recorded with Debut Video Capture software and the PICVideo MJPEG encoder, the reported running times for picture and sound were not exact.

Open the video file in Avidemux: File > File information

Audio and Video delay
0.01381% delay between audio and video. This is about 0.5 s offset per hour of video running time.

You see the length of the video stream (based on 25.00 fps) and the length of the audio stream (which corresponds to the actual recording time). To express the delay in percent, convert both values into seconds, divide the audio length by the video length, subtract 1 and multiply by 100. In the example, this gives an offset of 0.01381%.
You can enter this value directly as the audio length correction in Wavelab.
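
As a minimal sketch, the conversion looks like this in Python. The two durations are hypothetical and only stand in for the values you read from the file information. The function names are made up for this example.

```python
def to_seconds(hours, minutes, seconds):
    """Convert a running time given as h:m:s into seconds."""
    return hours * 3600 + minutes * 60 + seconds


def offset_percent(audio_s, video_s):
    """Relative length difference of the audio track, in percent."""
    return (audio_s / video_s - 1) * 100


# Hypothetical running times as read from the file information
video = to_seconds(1, 30, 0.0)    # video stream, based on 25.00 fps
audio = to_seconds(1, 30, 0.75)   # audio stream, actual recording time

print(f"{offset_percent(audio, video):.5f} %")
```

A positive result means the audio track is longer than the video track, a negative one that it is shorter.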

For grabbing, however, we do not use MPG files, because their compressed picture causes problems when deinterlacing.

b) Estimation by calibrating your recording set up

Make a test recording of 10…30 minutes of your video as an MPG file (you may use the WinAVI Video Grabber software). Then apply method a) to this recording to get the exact percentage of audio delay.

If all of your videos run at exactly the same playback speed, they all need the same correction value.
You can then record the video with a different codec. The percentage offset between sound and picture stays the same.

The calibration method works only if your VHS player is playing the videos with the same constant synchronization setting. If your player uses an automatic synchronization adjustment, you cannot ensure that the speed has not changed. Most likely you have to apply method c) in this case.

c) More accurate estimation: the trial-and-error method

With a little practice you will be able to estimate the sound offset to about 100 ms.

Open the video in a video player (VLC media player or Media Player Classic, MPC).

If your video does not have too many frame dropouts, you can assume that the time offset between the audio and video tracks increases steadily. Consequently, the offset is 0 at the beginning of the video and reaches its maximum at the end.

Go to a part of the video where you can easily compare the sound delay with the corresponding images. Sequences with clearly visible lip movements are favorable.

In your video player, call the menu item where you can set an audio offset.

The VLC media player is well suited. Via Tools > Track Synchronization a separate window opens, where you can adjust the offset of the running video without closing that dialog.

VLC with settings window for audio offset
Adjusting the audio offset in VLC.

In Media Player Classic you reach the audio settings via: Right-click > Audio > Options.
Activate [x] Audio time shift.

Enter an estimated value. Both positive and negative values are allowed.
Play the scene again and check whether picture and sound are better synced.
Gradually refine the value for Audio time shift. Once the audio is lip-synced, note the exact value along with the time stamp.

For videos with many picture dropouts, you may need to determine the audio offset after each disturbed section, as it can change.

MPC audio time shift
The example shows an audio time offset of +1100 ms set in Media Player Classic.

Note: MPC will remember the last used value for audio time offset. Therefore, you should set the value to zero afterwards.

Finally, you will have a table that lists the correct audio offset for different time stamps of your video.

If we plot these values as a curve, it is much easier to see whether there are sudden changes (each of which must be corrected individually) or whether it is sufficient to change only the total duration of the audio track.

For this purpose download the Excel datasheet Video-Audioversatz.xls.
Enter your values into the table.
The result will be a curve similar to the one in the following example.

Video-Audioversatz.xls
Curve of the measured audio offset and a linear interpolation in an Excel sheet

The example shows the following:

In the table, a start offset of 100 ms was entered (which already exists at the beginning of the video), and the last value is an offset of 2500 ms (= 2.5 s) at the end of the video.

From this, Excel calculates an average deviation over the running time.
The measured audio offset sometimes lies above and sometimes below the linearly interpolated deviation.
This could also be due to measurement inaccuracies.

Since an audio offset of about 100 ms or more is clearly perceptible, I have marked the range of allowed deviation with the two yellow lines.
You can see that the actual offset (except for the part at the end) stays just within this range.

For this reason, I stretched the audio only once in Wavelab, lengthening its running time by 2400 ms, and additionally inserted 100 ms of silence at the beginning of the file.
In the resulting video some small deviations were still visible (an offset of 100 ms is perceptible), but this was acceptable for this film.
However, in a second run it was necessary to extend the part from the 60th minute to the end of the film by another 400 ms.
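
If you prefer a script to the Excel sheet, the same check can be sketched in Python. This is not a port of Video-Audioversatz.xls, only an illustration of the idea: interpolate linearly between the first and the last offset and list every reading that deviates by more than 100 ms. The start and end offsets are those of the example. The running time of 90 minutes and the readings in between are hypothetical.

```python
def interpolated_offset_ms(t_min, start_ms, end_ms, total_min):
    """Offset expected at minute t_min if the drift were perfectly linear."""
    return start_ms + (end_ms - start_ms) * t_min / total_min


def outside_tolerance(measurements, start_ms, end_ms, total_min,
                      tolerance_ms=100):
    """Readings that deviate from the straight line by more than the tolerance."""
    flagged = []
    for t_min, measured_ms in measurements:
        expected = interpolated_offset_ms(t_min, start_ms, end_ms, total_min)
        if abs(measured_ms - expected) > tolerance_ms:
            flagged.append((t_min, measured_ms, round(expected)))
    return flagged


# Hypothetical readings: (minute, measured audio offset in ms)
readings = [(0, 100), (30, 950), (60, 1650), (80, 2450), (90, 2500)]

for t_min, measured, expected in outside_tolerance(readings, 100, 2500, 90):
    print(f"minute {t_min}: measured {measured} ms, expected about {expected} ms")
```

Readings that are listed mark sections that probably need their own correction. If nothing is listed, a single stretch of the whole audio track should be enough.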

How to sync audio and video (repair digital recordings from VHS tape)

Save audio

In order to process the audio track, you will need to extract it from the video file. To do this:

In Virtualdub:
Open video file
Virtualdub > File > Save WAV...

In Avidemux:
Open video file
Avidemux > Audio > Save Audio track...

If you have recorded audio as AC3 stream (Dolby digital), you can export it as PCM stream in this way:

In Avidemux:
Open video
Video: Copy
Audio: PCM
File > Save as...
A new video file is written, where only the audio track is newly rendered.
Open the new video in Avidemux and save the audio track.

Alternatively, you can save the AC-3 audio stream directly from Avidemux.
You then need the tool "BeSweet", which converts AC-3 data into WAV files.

Correcting audio track length with Wavelab

Open the audio file in Wavelab.

Correct the volume first.

If necessary, you should also filter the sound, de-click it, carefully de-noise or de-hum it, and possibly refresh the treble range.
If you grabbed the video with the USB Video Grabber, you should apply a 60 Hz high-pass filter.

Save the result.
You may have to return to this point if you make a mistake during the following runtime correction.

Time correction with Wavelab:
Wavelab > Perform > Time correction
You can specify either the planned duration, or the percentage of the desired correction.
I recommend the latter, since the numerical value will be the same for most of your videos.
In the options select Quality: Good, [x] maintain pitch
You should use the Dirac processor only if your video contains music. A time correction with the Dirac processor can take several hours (!); without it, just a few minutes.

Wavelab time correction
Settings for time correction in Wavelab

Attention: the Wavelab time correction dialog opens with the last used settings. Therefore, first click the Ratio: Source button to reset the destination length to 100% of the input length.
Then just add the necessary time offset to this value (in most cases a few hundred ms).
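
If you would rather enter the correction as a percentage, the value follows from the source length and the offset to be added. The following Python lines are only a sketch of that arithmetic. The function names are made up, and the 90-minute source length is an assumed example value.

```python
def destination_length_s(source_length_s, added_ms):
    """Planned duration after the correction, in seconds."""
    return source_length_s + added_ms / 1000


def stretch_percent(source_length_s, added_ms):
    """Destination length as a percentage of the source length."""
    return 100 * destination_length_s(source_length_s, added_ms) / source_length_s


source = 90 * 60   # assumed audio length: 90 minutes
added = 2400       # offset to add, in ms

print(f"planned duration: {destination_length_s(source, added):.1f} s")
print(f"ratio:            {stretch_percent(source, added):.4f} %")
```

Both numbers describe the same correction, so enter only one of them in the dialog.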

At the beginning of the file you can - if necessary - insert an initial offset.
This is done via the menu item Edit > Silence....
- Set the position pointer at the beginning of the file
- Edit > Insert Silence
- [ ] like selection range (deselect), and specify the length of silence you want to add

Wavelab - Insert silence
Settings for adding silence in Wavelab

Save the finished file as a new file with a different name so you can take a step back if necessary.

Correct audio run time with other audio software

Please submit suggestions to me: sven@engon.de

Testing

Direct test in MPC:
File > Open file > Dub: Browse, select corrected audio file
Play > Audio: select the corrected audio track
Play the video. Audio time offset should be set to 0.

Direct Test in Virtualdub:
Virtualdub > Audio > Audio from other file...: choose your corrected audio file.
Play different positions of the video to check whether the audio is now in sync with the running video. When everything is okay, you can filter the video and do the final encoding.

Indirect Test with a software video player and Virtualdub:
Virtualdub > Audio > Audio from other file...: choose your corrected audio file.
Video: Direct stream copy
Save the new AVI video file under a different name (this is very fast, because the video is not rendered again)
Load this file into a video player and check if the corrected audio track is in sync with the video stream.

When everything is fine, you can filter the video and do the final encoding.

Version b) Sync video to audio by correcting the video data stream

In principle, one could also change the duration of the video stream by inserting or deleting intermediate images. This may lead to jerky movements, so I do not recommend this.

For this reason I have removed the rest of this section.

Delay of mp3 audio in videos

AVI files can use MP3 audio, but in many cases this results in a delay between audio and video.

I had the best synchronisation results with the following workflow:
Produce the input audio as a WAV file. Start Avidemux and open the input video file. Load the WAV file as an external audio track. In the audio output settings select MP3 with a constant (!) bit rate (CBR). Save the resulting video. The audio track is now converted and interleaved into the video data stream.
You must not select a variable bit rate! The amount of data per unit of time would then vary, so the audio would not be firmly synced to the video frames.
Attention: Convert MP3 audio for a video track only with Avidemux. Do not use any other software. I got a lot of out-of-sync audio tracks with Wavelab and other programs, even when they used the same LAME MP3 encoder with identical settings!