Context aware communication

Beyond HTML5 - Audio Capture in Web Browsers

HTML5, the next-generation standard for web browsers, is in Last Call at the WHATWG and is currently being implemented by browser vendors. Work on the specification has taken more than five years and is still in progress. HTML5 takes a great leap toward making the browser a more powerful application platform by adding several new features. To name a few: WebSockets and Server-Sent Events enable more flexible networked applications, and the media elements and the canvas element can be used to seamlessly embed new types of content into web pages. So what is the next step in HTML?

One answer to that question is the ability to control new types of input devices. The device element represents a device selector that allows the user to grant a web page access to input devices such as a microphone or web camera. The code snippets below demonstrate how the device element and the Stream API can be used to record a short audio clip. Note that the device element and the related APIs are not yet available in any browser, and the APIs may change before they are. With that caveat, here is the code.

<p>Select device: <device type="audio_capture" id="media_device"></p>
<input type="button" id="record_ctl_but" value="Record" disabled>

The page content simply consists of a device selector, represented by the device tag, and a button. The type attribute of the device element is set to "audio_capture" to narrow the list of devices to those capable of recording audio.

// in window.onload
document.getElementById("media_device").onchange = function () {
    // ready to record: the data property holds the Stream for the selected device
    audioStream = this.data;
    recordCtlBut.disabled = false;
};

When the page loads, we attach a listener to the device element to monitor changes. When a device is selected, i.e. when the change event is triggered, the data property of the device element holds the Stream object connected to the selected device.

// in window.onload
recordCtlBut = document.getElementById("record_ctl_but");
recordCtlBut.onclick = function () {
    if (!recorder) {
        // start recording
        recordCtlBut.value = "Stop";
        recorder = audioStream.record();
        // set the maximum audio clip length to 10 seconds
        recordTimer = setTimeout(stopRecording, 10000);
    } else {
        // a recording is in progress, so this click stops it
        stopRecording();
    }
};

A single button both starts and stops the recording, and its label alternates to show the recording state. Calling the record method on the Stream object starts the recording and returns a StreamRecorder object. A timer is started to limit the maximum length of a recorded audio clip to 10 seconds.

function stopRecording() {
    // cancel the 10-second limit in case the button was clicked first
    clearTimeout(recordTimer);
    var audioFile = recorder.stop();
    // reset to allow new recording session
    recorder = null;
    recordCtlBut.value = "Record";
}

The recording is stopped when the Stop button is clicked or when the 10-second timer expires. The recorded audio data is retrieved from the StreamRecorder object by calling the stop method, which returns it as a File object (W3C File API). It is then up to you what to do with the recorded clip; perhaps publish it on a web server.
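As one possible way to publish the clip, here is a minimal sketch that wraps the recorded File in multipart form data and POSTs it to a server. It assumes a browser that provides FormData and fetch, which are not part of the device element proposal. The helper names buildUploadForm and uploadClip, the form field name "clip", and the file name "recording" are all hypothetical choices made for illustration.

```javascript
// Hypothetical helper: wraps a recorded clip (a File or Blob) in multipart form data.
// The field name "clip" is an assumption; use whatever your server expects.
function buildUploadForm(audioFile, fileName) {
    var form = new FormData();
    form.append("clip", audioFile, fileName);
    return form;
}

// Hypothetical helper: POSTs the clip to a server endpoint of your choosing.
// Returns a promise that rejects if the server answers with an error status.
function uploadClip(audioFile, url) {
    return fetch(url, { method: "POST", body: buildUploadForm(audioFile, "recording") })
        .then(function (response) {
            if (!response.ok) {
                throw new Error("Upload failed with status " + response.status);
            }
            return response;
        });
}
```

With these helpers, stopRecording could end with a call such as uploadClip(audioFile, "/upload"), where "/upload" stands in for an endpoint on your own server.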

As mentioned above, the device element is not limited to audio devices. It could similarly be used to select other types of devices, such as a web camera, with the video element displaying what the camera sees. The next step could be to share live audio and video with others in a web-based video conferencing system.

To conclude, the device element and related APIs will bring devices into the web experience that could previously be accessed only through browser plug-ins. The browser is becoming a powerful application platform while keeping its advantage of superior application portability. It will be interesting to see the specification evolve.


amarcello's picture


Sorry for reopening an old post. I'm working on a thesis related to HTML5, and it would be awesome if I could use the WebKit version that you used for your tests. Would that be possible?

Thanks in advance,

pererik's picture

The patched WebKit version that we're using is an internal prototype, but we're working on contributing the code to WebKit.
Best regards

ubuntourist's picture

The link to the <device> element in the second paragraph leads to the WHATWG spec, but nothing about the <device> element appears there... Should it? (or did it? or will it?) ;-) And if not, is there a newer alternative? Thanks.

StefanAlund's picture

The device element has been removed from the specification; instead, there is a new JavaScript API for getting access to the mic and camera:

This post is from last year. I think you will be happy to learn that we have now made our experimental browser engine available for download. Follow this link to learn more about how you can start building web apps that support real-time audio/video communication:

jonasl's picture


The device element is no longer available in the specification since it has been replaced with a JavaScript API, getUserMedia.

BR Jonas
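For readers looking for the newer alternative mentioned in this thread, here is a rough sketch of the same record-and-stop flow using getUserMedia. It assumes a current browser that exposes navigator.mediaDevices.getUserMedia and the MediaRecorder API, neither of which is covered by the original post. The helper names collectClip and recordClip are hypothetical.

```javascript
// Hypothetical helper: gathers the chunks a MediaRecorder emits and,
// when recording stops, hands the finished clip (a Blob) to onClip.
function collectClip(recorder, onClip) {
    var chunks = [];
    recorder.ondataavailable = function (event) {
        // skip empty chunks
        if (event.data && event.data.size > 0) {
            chunks.push(event.data);
        }
    };
    recorder.onstop = function () {
        onClip(new Blob(chunks, { type: recorder.mimeType }));
    };
}

// Hypothetical helper: asks for microphone access and records for at most maxMs
// milliseconds. Resolves with the recorder so the caller can stop it earlier.
function recordClip(maxMs, onClip) {
    return navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
        var recorder = new MediaRecorder(stream);
        collectClip(recorder, function (clip) {
            // release the microphone once the clip is complete
            stream.getTracks().forEach(function (track) { track.stop(); });
            onClip(clip);
        });
        recorder.start();
        // enforce the maximum clip length, unless the caller already stopped
        setTimeout(function () {
            if (recorder.state === "recording") {
                recorder.stop();
            }
        }, maxMs);
        return recorder;
    });
}
```

Calling recordClip(10000, handleClip) would mirror the 10-second limit used in the post, with handleClip standing in for whatever you want to do with the resulting Blob.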

annieguan's picture

Does the Ericsson WebKit support capturing Bluetooth audio input?

StefanAlund's picture
shashank8960's picture

I want to record my voice and send it directly to the server, but I am unable to use your application. I want to use an HTML5 getUserMedia application for audio recording, but in my case it is not working. I have the latest Google Chrome, version 22, but it still doesn't support it. Please give me some guidance on how to overcome this problem.
