What is GOP (group of pictures)?

A GOP (group of pictures) is a sequence of video frames that starts with a keyframe and continues with frames encoded as differences from other frames in the sequence. GOP length determines seek granularity and stream resilience. YouTube typically uses 5-second GOPs (one keyframe every ~150 frames at 30 fps).

Also called: group of pictures · I-frame · GOP length

Modern video codecs store full pictures only occasionally. Most frames are described as "a nearby frame, but with these differences", which compresses extremely well because adjacent frames are usually nearly identical. The full frames are called keyframes (or I-frames); the partial ones are P-frames, which reference earlier frames, and B-frames, which can reference both earlier and later frames.

A GOP is everything from one keyframe up to (but not including) the next. Inside a GOP, every non-keyframe depends, directly or indirectly, on other frames in the same GOP. Across GOPs there is no dependency (assuming closed GOPs), so decoding can start cleanly at any keyframe. This is why a video player can't begin decoding at an arbitrary timestamp: it jumps to the nearest preceding keyframe and decodes forward from there.
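As a rough illustration of "everything from one keyframe up to the next", the sketch below splits a string of frame types into GOPs. The function name `split_into_gops` and the one-letter-per-frame encoding are assumptions made for this example; real streams would be inspected with a tool such as ffprobe, and this simplification ignores frame reordering and open GOPs.

```python
def split_into_gops(frame_types: str) -> list[str]:
    """Split a string of frame types ('I', 'P', 'B') into GOPs.

    Each GOP starts at an 'I' frame and runs until just before the next one.
    Frames that appear before the first keyframe are dropped, because they
    cannot be decoded without the keyframe they depend on.
    """
    gops: list[str] = []
    current = ""
    for frame in frame_types:
        if frame == "I":
            if current:
                gops.append(current)  # close the previous GOP
            current = "I"             # a keyframe opens a new GOP
        elif current:
            current += frame          # P/B frames join the open GOP
    if current:
        gops.append(current)
    return gops


print(split_into_gops("IPBPIPPBIP"))
```

Each returned string is one independently decodable unit, which is why a decoder can start at the beginning of any of them.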

A short GOP (1-2 seconds) gives fine-grained seeking and faster recovery from packet loss, at the cost of compression efficiency (more keyframes means more bytes). A long GOP (10 seconds) compresses better but makes seeking coarser. YouTube's typical 5 seconds is a reasonable balance.
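The trade-off can be put into numbers with a little arithmetic. The sketch below assumes a constant frame rate and a fixed GOP length; the function name `gop_tradeoff` and the 10-minute, 30 fps example video are illustrative choices, not measurements of any real stream.

```python
import math


def gop_tradeoff(gop_seconds: float, fps: float, duration_seconds: float) -> dict:
    """Rough numbers for one GOP length on a constant-frame-rate video."""
    return {
        # Frames from one keyframe up to the next.
        "frames_per_gop": round(gop_seconds * fps),
        # One keyframe opens each GOP, including a final partial GOP.
        "keyframes": math.ceil(duration_seconds / gop_seconds),
        # A requested time can lie up to (just under) one GOP after
        # the keyframe that precedes it.
        "max_seek_offset_seconds": gop_seconds,
    }


for gop in (2, 5, 10):
    print(gop, gop_tradeoff(gop, fps=30, duration_seconds=600))
```

For a 10-minute video at 30 fps, this gives 300 keyframes with a 2-second GOP, 120 with a 5-second GOP, and 60 with a 10-second GOP. Fewer keyframes means fewer bytes, but a seek can land up to a full GOP away from the requested time.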

Common questions

Why do clips of YouTube videos start at slightly different timestamps than I asked?
Clips cut without re-encoding can only start at GOP boundaries. If you ask for a clip starting at 0:23 and the nearest preceding keyframe is at 0:21, the clip starts at 0:21. To start exactly at 0:23, the clip would need to be re-encoded.
Is GOP the same as a keyframe interval?
In practice, yes. GOP length and keyframe interval both refer to the number of frames (or seconds) between two keyframes.
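The clip-start behavior from the first question can be sketched as a lookup for the latest keyframe at or before the requested time. The keyframe timestamps below are hypothetical, chosen to match the 0:23 to 0:21 example, and `snap_to_keyframe` is an illustrative name rather than any tool's actual API.

```python
from bisect import bisect_right


def snap_to_keyframe(requested: float, keyframe_times: list[float]) -> float:
    """Return the latest keyframe timestamp at or before `requested`.

    `keyframe_times` must be sorted in ascending order (seconds).
    """
    if not keyframe_times:
        raise ValueError("stream has no keyframes")
    # Index of the first keyframe strictly after the requested time.
    i = bisect_right(keyframe_times, requested)
    if i == 0:
        # Requested time precedes the first keyframe; start there instead.
        return keyframe_times[0]
    return keyframe_times[i - 1]


# Hypothetical keyframe positions, in seconds.
keyframes = [0.0, 5.5, 11.0, 16.0, 21.0, 26.0]
print(snap_to_keyframe(23.0, keyframes))  # 21.0
```

A request that falls exactly on a keyframe is returned unchanged, so only off-boundary requests get moved earlier.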
