qtdemux: Provide a 2 frames lead-in for audio decoders
AAC and various other audio codecs need a couple frames of lead-in to
decode it properly. The parser elements like aacparse take care of it
via gst_base_parse_set_frame_rate, but when inside a container, the
demuxer is doing the seek segment handling and never gives lead-in
data downstream.
Handle this similar to going back to a keyframe with video, in the
same place. Without a lead-in, the start of the segment is silence,
when it shouldn't, which becomes especially evident in NLE use cases.