--- /dev/null
+This document contains some ideas that can be used to implement
+interactivity in media content.
+
+Possible applications: DVD navigation, Flash, ...
+
+Requirements
+------------
+
+- capture mouse clicks, mouse position and mouse movement occurring on
+ a video display plugin
+- transport these events to the interested plugins
+- allow for automation (i.e., the technique should work without
+ a video plugin too)
+- the core doesn't care: it should need no navigation-specific code
+
+Capturing events
+----------------
+
+- the videosink element captures mouse events
+ - event is encapsulated into a generic data structure
+ describing the event (need to define a caps?)
+ - event is signalled to the app?
+ - event is sent upstream?
+
+ * videosink has to add something to the main_loop to
+ be able to grab events
+ * thread issues?
+ * does the app need to know about the events?
+
+- app captures mouse events
+ - no idea if that's possible
+ - app sends events upstream
+
+Sending events to plugins
+-------------------------
+
+- events are sent upstream using the event methods
+ * more generic
+ * less app control
+
+- events are sent to the appropriate plugin by the app
+ * less generic: the app needs to know which plugins are
+ interested
+ * more app control
+
+Automation will always work: the app can construct navigation
+events and insert them into the pipeline.
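+
+The first option, together with the automation remark, can be
+sketched in plain C. This is only an illustration under
+assumptions: Element, NavEvent, send_upstream and dvd_handle_nav
+are made-up names, not GStreamer API, and a real pipeline is not
+necessarily one linear chain.
+
+```c
+#include <assert.h>
+#include <stddef.h>
+#include <string.h>
+
+/* Hypothetical event: mirrors the "type"/"x_pos"/"y_pos" idea. */
+typedef struct {
+  const char *type;             /* "click" or "move" */
+  int x_pos;
+  int y_pos;
+} NavEvent;
+
+/* Hypothetical element: a node in a linear pipeline. */
+typedef struct Element Element;
+struct Element {
+  const char *name;
+  Element *upstream;            /* next element towards the source */
+  /* returns non-zero when the event was consumed; NULL if the
+   * element is not interested in navigation events at all */
+  int (*handle_nav) (Element *self, const NavEvent *event);
+};
+
+/* Walk upstream from the sink until an element consumes the event.
+ * Returns that element, or NULL if nobody was interested. */
+static Element *
+send_upstream (Element *sink, const NavEvent *event)
+{
+  Element *e;
+
+  assert (sink != NULL && event != NULL);
+  for (e = sink->upstream; e != NULL; e = e->upstream) {
+    if (e->handle_nav != NULL && e->handle_nav (e, event))
+      return e;
+  }
+  return NULL;
+}
+
+/* Example handler: a DVD-like source that consumes clicks only. */
+static int
+dvd_handle_nav (Element *self, const NavEvent *event)
+{
+  (void) self;
+  return strcmp (event->type, "click") == 0;
+}
+```
+
+An app that wants automation would fill in a NavEvent itself and
+call send_upstream () on the last element; no video plugin has to
+produce the event.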
+
+What about flushing to minimize latency?
+
+
+Defining an event
+-----------------
+
+some ideas:
+
+GST_CAPS_NEW (
+ "videosink_event",
+ "application/x-gst-navigation",
+ "type", GST_PROPS_STRING ("click"),
+ "x_pos", GST_PROPS_INT (30),
+ "y_pos", GST_PROPS_INT (40)
+ )
+
+GST_CAPS_NEW (
+ "videosink_event",
+ "application/x-gst-navigation",
+ "type", GST_PROPS_STRING ("move"),
+ "x_pos", GST_PROPS_INT (30),
+ "y_pos", GST_PROPS_INT (40)
+ )
+
+...
+
+Do we need a library for this?
+
+Do we use custom events and use the mime type to detect the
+type? Or do we create a GST_EVENT_NAVIGATION?
+
+Can we encapsulate all events into a GstCaps? I would think so.
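+
+To make the "use the mime type to detect the type" variant
+concrete, here is a small plain-C sketch. It assumes the caps
+have already been flattened into a struct; EventCaps, NavType,
+NAV_MIME and nav_event_type are hypothetical names, not GStreamer
+API.
+
+```c
+#include <assert.h>
+#include <stddef.h>
+#include <string.h>
+
+#define NAV_MIME "application/x-gst-navigation"
+
+typedef enum { NAV_UNKNOWN, NAV_CLICK, NAV_MOVE } NavType;
+
+/* Hypothetical flattened view of the caps sketched in this section. */
+typedef struct {
+  const char *mime;             /* mime type of the custom event */
+  const char *type;             /* "click", "move", ... */
+  int x_pos;
+  int y_pos;
+} EventCaps;
+
+/* Classify a custom event. Anything that does not carry the
+ * navigation mime type, or has an unrecognised "type" property,
+ * is reported as NAV_UNKNOWN so the caller can ignore it. */
+static NavType
+nav_event_type (const EventCaps *caps)
+{
+  assert (caps != NULL);
+  if (caps->mime == NULL || strcmp (caps->mime, NAV_MIME) != 0)
+    return NAV_UNKNOWN;
+  if (caps->type == NULL)
+    return NAV_UNKNOWN;
+  if (strcmp (caps->type, "click") == 0)
+    return NAV_CLICK;
+  if (strcmp (caps->type, "move") == 0)
+    return NAV_MOVE;
+  return NAV_UNKNOWN;
+}
+```
+
+A dedicated GST_EVENT_NAVIGATION would make the mime comparison
+unnecessary; only the "type" lookup would remain.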
+
+Random thoughts
+---------------
+
+- we're basically defining an event model; I don't think there is
+ anything wrong with that.
+- how is our coordinate system going to work? Do we use
+ normalized values (integers in 0-1000000, or floats) or real
+ pixel values? Real pixel values require scalers to adjust
+ the values (I don't think I like that).
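+
+A minimal sketch of the normalized variant, assuming integer
+coordinates in 0-1000000 where 0 is the first pixel and 1000000
+the last one. NAV_COORD_MAX, nav_normalize and nav_denormalize
+are hypothetical names used only for illustration.
+
+```c
+#include <assert.h>
+
+#define NAV_COORD_MAX 1000000
+
+/* Map a pixel position in [0, size-1] to [0, NAV_COORD_MAX].
+ * The sink would do this with its own display size. */
+static int
+nav_normalize (int pixel, int size)
+{
+  assert (size > 1 && pixel >= 0 && pixel < size);
+  return (int) (((long long) pixel * NAV_COORD_MAX) / (size - 1));
+}
+
+/* Map a normalized value back to a pixel position in [0, size-1].
+ * The interested plugin would do this with its own frame size.
+ * Half the divisor is added so the result rounds to nearest. */
+static int
+nav_denormalize (int norm, int size)
+{
+  assert (size > 1 && norm >= 0 && norm <= NAV_COORD_MAX);
+  return (int) (((long long) norm * (size - 1) + NAV_COORD_MAX / 2)
+      / NAV_COORD_MAX);
+}
+```
+
+With this scheme a scaler between the plugin and the sink does
+not have to touch the event: each end converts using its own
+size.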
+
+
+