
Beyond HTML5: Peer-to-Peer Conversational Video

In a previous blog post we showed our work on conversational voice and video using "beyond HTML5" solutions. In that work we used WebSockets and a media relay to route streams between peers. Now we would like to show how we have extended this to use peer-to-peer streaming.

Peer-to-peer streaming means that voice/video frames are streamed directly between peers, without any server in between. The effect is lower latency and more efficient network utilization. Up until now, however, web browsers have lacked the capability to communicate peer-to-peer. Instead, communication has traditionally relied on a shared relay server in the network.

The attached video includes a quick demo and a brief explanation of the connection establishment procedure. Below we explain this in more detail.

The ConnectionPeer API

There is an existing proposal for an API called ConnectionPeer for establishing a direct connection between two peers (web browsers). ConnectionPeer is a very minimalistic API, and leaves most of the signaling (logging in, inviting friends, and so on) to be performed using traditional HTTP techniques. For example, the EventSource API (which we submitted to WebKit in 2009) comes in handy for receiving invitations.

The API is presented, along with a brief example, in the HTML specification at the WhatWG site, http://www.whatwg.org/specs/web-apps/current-work/multipage/commands.html#peer-to-peer-connections.

ConnectionPeer is responsible for the minimal functionality needed for establishing peer-to-peer connectivity, using the following steps:

  1. Each peer collects information about how it can be reached from the outside. Typically this is one or more IP address/port combinations, along with some additional information. This is done by the getLocalConfiguration method.
  2. Each peer adds the corresponding information about the other peer (obtained "out of band", that is, typically over HTTPS from the chat server). This is done by the addRemoteConfiguration method.
  3. The peer-to-peer connection is established, allowing streaming data to be exchanged between the peers. This happens implicitly once both methods above have been called. Once the connection has been established, an onconnect event is generated so that the application can react.

In addition to connection establishment, ConnectionPeer also includes methods for streaming data over the connection. These are used to add real-time voice and video streams. Here is a sketch of the above:

<script>
var serverConfig = ...; // provided by server to handle, e.g., TURN
var local = new ConnectionPeer(serverConfig);
 
window.onload = function() {
 
  local.onconnect = function() {
    // executed when we're connected to the other peer:
    // from now on, we can start adding streams
  }
 
  local.onstream = function() {
    // executed when the other peer adds a stream, e.g., video or voice
    var remoteView = document.getElementById("remoteView");
    remoteView.src = local.remoteStreams[0].url;
  }
 
  var videoDevice = document.getElementById("videoDevice");
  videoDevice.onchange = function() {
    // executed when the user selects a video source in the <device> element
    var localStream = videoDevice.data;
    var selfView = document.getElementById("selfView");
 
    // display the selected video source (self view)
    selfView.src = localStream.url;
 
    // ... and show it to the remote peer by adding it to the connection
    local.addStream(localStream);
  }
}
 
// listen to an EventSource for invitation events
var invitationEvents = new EventSource(...);
invitationEvents.addEventListener("message", function(event) {
  // request the local connectivity configuration (step 1 above)
  local.getLocalConfiguration(function (peer, configuration) {
    // include the local configuration in an invitation response
    // to the server (step 2 above) using some "out-of-band" mechanism,
    // such as an XHR
  });
});
</script>
 
<video width="320" height="240" id="selfView" autoplay="true"></video>
<video width="320" height="240" id="remoteView" autoplay="true"></video>
 
<device id="videoDevice" type="media">
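
The sample above stops short of step 2 on the receiving side. To make the ordering of the three steps concrete, here is a minimal sketch that can be run outside a browser. Everything in it is an assumption made for illustration: MockConnectionPeer is a hypothetical in-memory stand-in that does no networking, the configuration strings are placeholders, and signal() replaces the out-of-band exchange through the server. Only the method and event names (getLocalConfiguration, addRemoteConfiguration, onconnect) come from the proposal.

```javascript
// Hypothetical in-memory stand-in for ConnectionPeer, for illustration only.
// It models just the call order described above; it does no networking.
class MockConnectionPeer {
  constructor(name) {
    this.name = name;
    this.onconnect = null;
    this.remoteConfiguration = null;
    this.localRequested = false;
    this.remoteAdded = false;
    this.connected = false;
  }

  // Step 1: report how this peer can be reached (placeholder value).
  getLocalConfiguration(callback) {
    this.localRequested = true;
    callback(this, "config-of-" + this.name);
    this.maybeConnect();
  }

  // Step 2: store the other peer's configuration, received out of band.
  addRemoteConfiguration(configuration) {
    this.remoteConfiguration = configuration;
    this.remoteAdded = true;
    this.maybeConnect();
  }

  // Step 3: connect implicitly once both methods have been called.
  maybeConnect() {
    if (this.localRequested && this.remoteAdded && !this.connected) {
      this.connected = true;
      if (this.onconnect) this.onconnect();
    }
  }
}

// Hypothetical signaling channel. In a real application this would be an
// XHR to the chat server plus an EventSource on the receiving side.
function signal(toPeer, configuration) {
  toPeer.addRemoteConfiguration(configuration);
}

const alice = new MockConnectionPeer("alice");
const bob = new MockConnectionPeer("bob");
const log = [];
alice.onconnect = () => log.push("alice connected");
bob.onconnect = () => log.push("bob connected");

// The inviter sends its configuration; the invitee answers with its own.
alice.getLocalConfiguration((peer, config) => signal(bob, config));
bob.getLocalConfiguration((peer, config) => signal(alice, config));

console.log(log);
```

Neither peer reports a connection until it has both produced its own configuration and received the other's, which is the implicit behaviour described in step 3.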

We became interested in this API and wanted to learn more about it, so we went ahead and implemented it (or at least a subset of it). We are still in a Linux environment, using WebKit GTK+ and GStreamer, and are reusing the implementations of the device element and Stream API as well as large parts of the MediaStreamTransceiver (see the previous post).

NAT traversal and ICE

Most networks use some type of NAT (Network Address Translation), which complicates peer-to-peer connections like this. The ICE (Interactive Connectivity Establishment; RFC 5245) procedure allows for establishing connectivity even in the presence of NATs, using STUN/TURN servers. This means that step 1 above results in a set of addresses, including both local ones and NATed ones. It also means that a prioritization is made in step 3 that values local addresses higher than NATed ones, to make sure latency is kept as low as possible.
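
ICE expresses this preference through candidate priorities. The sketch below applies the recommended priority formula and type preferences from RFC 5245 to three candidates. The addresses, ports, local preference, and component ID are made-up example values, and the code is not taken from our implementation or from libnice; it only illustrates why a local (host) candidate ranks above a NATed (server reflexive) one, which in turn ranks above a relayed one.

```javascript
// Recommended type preferences from RFC 5245, section 4.1.2.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// priority = 2^24 * type preference
//          + 2^8  * local preference
//          + (256 - component ID)
function candidatePriority(type, localPreference, componentId) {
  return TYPE_PREFERENCE[type] * 2 ** 24
       + localPreference * 2 ** 8
       + (256 - componentId);
}

// Made-up candidates for a single component (example values only).
const candidates = [
  { type: "relay", address: "203.0.113.7", port: 50000 },
  { type: "host", address: "192.168.1.10", port: 40000 },
  { type: "srflx", address: "198.51.100.4", port: 40000 },
];

// Assume one network interface (local preference 65535) and component 1.
const ranked = candidates
  .map((c) => ({ ...c, priority: candidatePriority(c.type, 65535, 1) }))
  .sort((a, b) => b.priority - a.priority);

for (const c of ranked) {
  console.log(c.type, c.address + ":" + c.port, c.priority);
}
```

With these inputs the host candidate sorts first, then the server reflexive one, then the relayed one.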

We thus use ICE to implement the native parts of ConnectionPeer. In our modified WebKit GTK+, we use libnice (http://nice.freedesktop.org/wiki/) for the ICE implementation, which integrates rather nicely with GStreamer and the GTK+ main loop.

It seems to us that the functionality of ICE matches the ConnectionPeer API rather well; however, we have some comments on the finer details, and we plan to bring those comments up with the WhatWG community.

Summary

Although ConnectionPeer is a rather small API, it provides something fundamentally different from the traditional web: peer-to-peer connections without an intermediate relay. Among the efforts to standardize peer-to-peer support in browsers (through activities in the IETF and W3C), aimed at real-time voice and video communication without plug-ins, the ConnectionPeer API is the most concrete proposal so far. Our tests indicate that it is (with some minor changes) a good starting point.

If you would like to discuss more, please add a topic in the Web Real-Time Communication community!

--Patrik Persson, Xing Fan, Yuan Song, Stefan Håkansson

Comments

sdsharp's picture

Hi there,
Quite interesting.

Is this something that is NOT YET available?
Is this something internal to Ericsson Labs?
Is this something we (external developers) can use?

Regards,
FP

stefan_h's picture

Hi FP!

I'm glad you're interested. For the time being this is not available to Labs users; it is an internal prototype only. However, we are in the process of contributing parts of the implementation to the WebKit project.

Br,
Stefan

kcarter80's picture

Does this API use UDP?

stefan_h's picture

Yes, in our implementation media is transported using RTP over UDP.

KarolP's picture

Looks promising :)

WebKit, GTK+, gstreamer -- looks to me like a good field for Vala language ;)
http://live.gnome.org/Vala

I wonder whether you use plain C++ or that neat language?

Best regards
Karol

P.S. By the way, there are some errors about a mixed SSL/non-SSL connection.
I think YouTube is messing up the whole transmission.

- although YouTube runs fine over SSL, it first presents a certificate for *.Google.com

aviflax's picture

This is fantastic! I'm very glad and appreciative that you're exploring the implementation of this functionality, it's valuable and important. Even better that you plan to contribute the code to WebKit! Thank you!

nog_lorp's picture

Does this allow generic data, or only a/v streams?

stefan_h's picture

The ConnectionPeer API allows sending texts, bitmaps and files as well. We've only implemented support for streams so far though.

mohan.chinnappan's picture

Great work!
Do you have plans to show this outside Ericsson Labs? That way others can contribute to your nice work and make it available to WebKit-based browsers like Chrome and Safari. Please let us know.

microsoft's picture

Another great app using the same HTML5 principle can be found here:
http://www.rcpsecure.com/govisiochat/ It is a must for all P2P developers.

Brian.

redXDevil's picture

Hi ... The WebKit project is a great idea... I'd like to use HTML5 for a program to play video files (mpeg, mp4, avi) and also to allow webcam and video conferencing .. I need a code sample to do it in Windows. I appreciate any help that can get me moving forward with this. Thanks

redXDevil's picture

Hello,
I'm new to the whole system and would like some help if possible. I come from C# visual studio environment and totally new to the HTML5 technology all together.
I just finished installing Suse linux 12.1 so I can learn your project and might benefit you with some suggestions as well.
I'd appreciate it if anyone could give me some step-by-step instructions on video conferencing with Suse for beginners like me.
Thanks a lot
