Using Web Audio API to decode and play an MP3 file (part 2)

Hello again! Just an FYI: this sample builds on the previous post, Using Web Audio API to decode and play an MP3 file (part 1). I recommend checking that post out before you continue with this one!

Last time we created a basic page that allows you to play an MP3 file. This sample adds a layer of simple audio processing to the pipeline, allowing you to adjust the playback volume.

Audio context effect nodes

The Web Audio API would be pretty boring if all it did was simply play audio out to the speakers. To add a little spice, the Web Audio API uses a processing chain that sits between the audio source and the computer's speakers. There are numerous “effect nodes” that you can place between the start and end points.

[Diagram: the AudioContext processing chain, with effect nodes sitting between the audio source and the destination.]

There are quite a number of different effect nodes, including ones that let you manipulate samples in real time, but in order to get to grips with things we'll start with a simple gain node (volume) and move on to better things in another article.

The Gain node

This node gives you control over the output volume and is dead simple to add to the processing chain.

// First instantiate an AudioContext (some WebKit-based browsers still require a prefix)
var audioContext = new (window.AudioContext || window.webkitAudioContext)(); 
// The gain node is created by calling the "createGain()" method of the audio context
var gainNode = audioContext.createGain(); 

To connect the gain node to the processing chain we do the following:

var audioContext = new (window.AudioContext || window.webkitAudioContext)(); // Our audio context
var audioSourceNode = // ... some code to create the AudioNode that provides the input audio here ...
var gainNode = audioContext.createGain(); // Create a gain node 

// Pipe the output from audioSourceNode into the gainNode
audioSourceNode.connect(gainNode); 

// Pipe the processed output from gainNode to the output destination (probably your speakers)
gainNode.connect(audioContext.destination);

The above code is sufficient to connect a gain node to your playback chain, but it's pretty useless if we can't also adjust the gain. To adjust the gain we do the following:

gainNode.gain.value = 1.0; // 100% volume
gainNode.gain.value = 0.5; // 50% volume 
gainNode.gain.value = 0; // muted 
gainNode.gain.value = parseFloat('0.75'); // Parsed from string

The gain value is unitless. In this example I am setting it using a floating point value between 0 and 1.0, where 0 is muted and 1.0 is 100% volume (values above 1.0 amplify the signal).
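
If you want to let the user drive the volume, you can wire a slider up to the gain value. Here's a minimal sketch; it assumes a hypothetical <input type="range"> element with the id 'volumeSlider' (min="0", max="1", step="0.01") on the page, which isn't part of the sample above:

// Wire a hypothetical range input to the gain node so the user can adjust playback volume
var volumeSlider = document.getElementById('volumeSlider');
volumeSlider.oninput = function () {
    gainNode.gain.value = parseFloat(volumeSlider.value);
};

// To avoid audible clicks when the value jumps, you can also schedule the change
// via the AudioParam scheduling methods instead of setting the value directly, e.g.:
// gainNode.gain.setTargetAtTime(0.5, audioContext.currentTime, 0.05);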

Platform support

The GainNode is supported in the latest versions of Mozilla Firefox, Google Chrome, Microsoft Edge and Apple Safari.

Sample code

You can view the sample that I’ve prepared here:

In conclusion

I hope you now have a little insight into how to create an audio node and connect it to the processing chain. The gain node is just one of many types of audio node that you can create; for more node types check the AudioContext article on MDN.

In the next sample I’ll be showing you how to create a volume level meter using the AudioContext.createScriptProcessor() method.

Using Web Audio API to decode and play an MP3 file (part 1)

Hi there! Thanks for coming!

This article is the first in a mini three-part series where I will use the Web Audio API to decode and play an MP3 file. We'll then build on this sample to introduce a Gain node to adjust the volume of your audio, and finally a script processing node in order to get access to the raw audio samples and do a volume calculation.

By the end of this article I hope you’ll appreciate how powerful, yet simple the Web Audio API is for playback.

If you haven’t already, please consider reading the ‘Introduction to the Web Audio API‘ article that I wrote before proceeding.

What the sample does

Starting with the basics I’d like to show you how to play an AudioBuffer directly to your output device (which is more than likely your speakers).

First we’ll obtain the MP3 bytes from file system using the File API to read an MP3 file from the computer’s local file system.

In order to obtain the audio buffer object we will need to use the AudioContext.decodeAudioData() method that will use your browser (or system) audio decoder to transform the Mp3 data into a buffer of PCM samples.

After we’ve obtained this buffer we’ll then create an ‘AudioBufferSourceNode‘ object that will consume the audio samples and play them out at the correct speed to your audio output device.

The full code

Below you’ll find the completed sample source code, the comments should pretty much explain everything that you need to know, but I’ll also explain a few things below.

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <title>Decode an MP3 using Web Audio API</title>
        <style>
            body {
                font-family: sans-serif;
                font-size: 9pt;
            }
        </style>
    </head>
    <body>
        <form id="playerForm">
            <h1>Decode an MP3 using Web Audio API</h1>
            <p>This example will show you how to decode an MP3 file using the Web Audio API.</p>
            <input type="file" id="mp3FileSelect" />
            <input type="button" id="playButton" value="Play" />
        </form>

        <script>


            var audioContext = new (window.AudioContext || window.webkitAudioContext)(); // Our audio context
            var source = null; // This is the BufferSource containing the buffered audio
            
            
            // Used the File API in order to asynchronously obtain the bytes of the file that the user selected in the 
            // file input box. The bytes are returned using a callback method that passes the resulting ArrayBuffer. 
            function obtainMp3BytesInArrayBufferUsingFileAPI(selectedFile, callback) {

                var reader = new FileReader(); 
                reader.onload = function (ev) {
                    // The FileReader returns us the bytes from the computer's file system as an ArrayBuffer  
                    var mp3BytesAsArrayBuffer = reader.result; 
                    callback(mp3BytesAsArrayBuffer); 
                }
                reader.readAsArrayBuffer(selectedFile);
                
            }
              
                        
            function decodeMp3BytesFromArrayBufferAndPlay(mp3BytesAsArrayBuffer) {
                
                // The AudioContext will asynchronously decode the bytes in the ArrayBuffer for us and return us
                // the decoded samples in an AudioBuffer object.  
                audioContext.decodeAudioData(mp3BytesAsArrayBuffer, function (decodedSamplesAsAudioBuffer) {
                        
                    // Clear any existing audio source that we might be using
                    if (source != null) {
                        source.disconnect(audioContext.destination);
                        source = null; // Leave existing source to garbage collection
                    } 
                    
                    // In order to play the decoded samples contained in the audio buffer we need to wrap them in  
                    // an AudioBufferSourceNode object. This object will stream the audio samples to any other 
                    // AudioNode or AudioDestinationNode object. 
                    source = audioContext.createBufferSource();
                    source.buffer = decodedSamplesAsAudioBuffer; // set the buffer to play to our audio buffer
                    source.connect(audioContext.destination); // connect the source to the output destination 
                    source.start(0); // tell the audio buffer to play from the beginning
                }); 
                
            }
            
            
            // Assign event handler for when the 'Play' button is clicked
            playerForm.playButton.onclick = function (event) {
                
                event.stopPropagation();
                
                // I've added two basic validation checks here, but in a real world use case you'd probably be a little more stringent. 
                // Be aware that Firefox uses 'audio/mpeg' as the MP3 MIME type, Chrome uses 'audio/mp3'. 
                var fileInput = document.forms[0].mp3FileSelect; 
                if (fileInput.files.length > 0 && ["audio/mpeg", "audio/mp3"].includes(fileInput.files[0].type)) {
                    
                    // We're using the File API to obtain the MP3 bytes, here but they could also come from an XMLHttpRequest 
                    // object that has downloaded an MP3 file from the internet, or any other ArrayBuffer containing MP3 data. 
                    obtainMp3BytesInArrayBufferUsingFileAPI(fileInput.files[0], function (mp3BytesAsArrayBuffer) {
                       
                        // Pass the ArrayBuffer to the decode method
                        decodeMp3BytesFromArrayBufferAndPlay(mp3BytesAsArrayBuffer);  
                                          
                    });
                    
                } 
                else alert("Error! No attached file or attached file was of the wrong type!");
                                    
            }


        </script>
        
    </body>
</html>

Explaining a few things

Mime type differences between browsers

Unfortunately we don't live in a perfect world, and these imperfections extend to how different browsers register MIME types for audio files. In the following line you'll notice that we test the selected file's MIME type against “audio/mpeg” and “audio/mp3”.

["audio/mpeg", "audio/mp3"].includes(fileInput.files[0].type)

This is because Chrome regards an MP3 file as “audio/mp3” whereas Firefox and Edge regard an MP3 file as “audio/mpeg“.
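
On some platforms the browser can also report an empty MIME type for a perfectly good MP3 file. If you wanted a more forgiving check, a sketch like the following (the 'looksLikeMp3' helper is purely illustrative, not part of the sample) would fall back to the filename extension in that case:

// Accept the known MP3 MIME types, or an empty type with an .mp3 extension
function looksLikeMp3(file) {
    if (["audio/mpeg", "audio/mp3"].includes(file.type)) return true;
    return file.type === "" && /\.mp3$/i.test(file.name);
}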

Usage of the File API

Why I separated obtaining the MP3 bytes and decoding them into two methods

So in this example I am using the File API in order to obtain the MP3 file bytes.

You’ll notice that I split the code into two sections, one to obtain the MP3 bytes and the other is to decode those bytes using the AudioContext.

The reason for this is that the AudioContextdecodeAudioData()‘ method doesn’t care how you obtain the MP3 bytes, so it makes sense to separate these two functions to better illustrate that. In a real world app you might not separate these two methods.

Briefly on the File API

The File API is a very neat way to obtain the data from a file on the user’s file system. The File API is granted access to a particular file on the file system when the user selects it in a file input element.

Processing the file contents in JavaScript before they are sent to the server has its advantages. For example, assume you run an image gallery web site: instead of blocking a user when they select a larger-than-allowed image, you could automatically process it by resizing it and converting it to the correct format before sending it to the server.

Likewise, here we're able to play the audio before it is ever sent to a server. In practice you could do something cool like render the file's waveform and use it as a progress bar while it uploads to a remote server.
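
If you're curious what that waveform idea would look like, the starting point is reading the raw samples out of the decoded AudioBuffer. Here's a rough sketch (the 'computePeaks' helper is hypothetical and the canvas drawing is omitted) that reduces channel 0 to a handful of peak values:

// Reduce the first channel of an AudioBuffer to 'bucketCount' peak values
function computePeaks(audioBuffer, bucketCount) {
    var samples = audioBuffer.getChannelData(0); // Float32Array of channel 0 samples
    var bucketSize = Math.floor(samples.length / bucketCount);
    var peaks = [];
    for (var b = 0; b < bucketCount; b++) {
        var peak = 0;
        for (var i = b * bucketSize; i < (b + 1) * bucketSize; i++) {
            peak = Math.max(peak, Math.abs(samples[i]));
        }
        peaks.push(peak);
    }
    return peaks; // each value is between 0 and 1, ready to be drawn as a bar
}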

Disconnecting existing audio nodes

if (source != null) {
    source.disconnect(audioContext.destination);
    source = null;
} 

Just to explain the above code snippet: if we don't include this code, then every time we click the play button whatever is already playing will continue to play and our new audio will be mixed in with it.

In order to stop the existing audio from being heard we disconnect the existing audio source if one exists. By setting our reference to the existing audio source to null, the garbage collector will automatically clean it up for us on its next run.
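
If you prefer to halt the old audio explicitly rather than rely on disconnection alone, a variation like the following should also work (this is a sketch, not part of the original sample; AudioBufferSourceNodes are one-shot objects, so a stopped source is simply thrown away):

// Stop and detach the previous source before creating a new one
if (source != null) {
    source.stop(0);      // stop playback immediately
    source.disconnect(); // detach it from the destination
    source = null;       // let the garbage collector reclaim it
}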

About the AudioBufferSourceNode object

source = audioContext.createBufferSource();
source.buffer = decodedSamplesAsAudioBuffer; 
source.connect(audioContext.destination); 
source.start(0); 

In this part you can see that we ask the AudioContext to create an AudioBufferSourceNode object. This object streams the contents of an AudioBuffer to an AudioDestinationNode.

The sequence of this code can be explained as follows:

  1. Ask our AudioContext to create an AudioBufferSourceNode object.
  2. Assign our AudioBuffer of decoded MP3 samples as the buffer to be played.
  3. Connect the output of our AudioBufferSourceNode object to the AudioDestinationNode of our AudioContext.
  4. Tell the AudioBufferSourceNode object to start streaming the buffer from the start.

Chrome and support for the “buffer” property of AudioBufferSourceNode

For some reason the MDN documentation states “[1] The buffer property was removed in Chrome 44.0.” for AudioBufferSourceNode (link), and on other pages it states that the property has been deprecated (link).

While writing this article I had no problem using this property in Chrome 47, and I've found no information about the property being deprecated in the latest Web Audio API draft on the W3C web site (see here).

However I did find these two links here and here that state that the ability to set the buffer property more than once should be deprecated. So in English, it appears that Chrome may have just removed the ability to set the “buffer” property more than once.

Therefore I’ve concluded that the otherwise excellent MDN documentation is wrong in this case and I’ve suggested the correction to them.

What’s next?

Hopefully this sample gives you an idea of how to decode an MP3 file into a buffer and play that buffer using the Web Audio API. So far we've only connected directly to the playback destination node. In the next example I'll show you how to insert an AudioNode in the middle to transform your audio slightly.

Simple microphone Web Audio API / WebRTC example

Hey, a simple use case is the best place to start, right?

This example will show you how to use the Web Audio API and WebRTC in order to connect to your microphone and play the audio back from your speakers.

What we’ll be doing

In English the process is as follows:

  1. Create an AudioContext.
  2. Use the WebRTC “navigator.getUserMedia” method to connect to your microphone.
  3. Tell the AudioContext object to create a source stream from the returned microphone WebRTC stream and connect that stream as the source.
  4. Connect the microphone source node directly to the output destination.

The code

// Create our audio context object 
// - Webkit still requires the audio context to be prefixed.
var audioContext = new (window.AudioContext || window.webkitAudioContext)();

// This variable will hold the source node that we create from the WebRTC stream
var webRtcSource; 

// Handler for 'Start' button 'onclick' event
function handle_startMonitoring() {
    // At the time of writing browsers still require prefixing to use 'navigator.getUserMedia'  
    navigator.getUserMedia = (navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia);
    // Call 'getUserMedia' asking for access to an audio source 
    navigator.getUserMedia(
        { audio: true, video: false }, 
        function (mediaStream) {
            // On success we return a WebRTC media stream
            // createMediaStreamSource will create an audio source node that wraps the MediaStream 
            webRtcSource = audioContext.createMediaStreamSource(mediaStream);
            // Connect this source directly to the audio destination (your speakers)
            webRtcSource.connect(audioContext.destination);
        }, 
        function (error) {
            console.log("There was an error when getting microphone input: " + err);
        }
    );
}

// Handler for 'Stop' button 'onclick' event
function handle_stopMonitoring() {
    // Close off the audio context by calling disconnect on the input source
    // The browser will dispose disconnected input sources
    webRtcSource.disconnect(); 
    webRtcSource = null; 
}

You’ll find the full code available below and read below for some notes about prefixing and Chrome.

Here’s the code without comments

var audioContext = new (window.AudioContext || window.webkitAudioContext)();
var webRtcSource; 

function handle_startMonitoring() {
    navigator.getUserMedia = (navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia);
    navigator.getUserMedia(
        { audio: true, video: false }, 
        function (mediaStream) {
            webRtcSource = audioContext.createMediaStreamSource(mediaStream);
            webRtcSource.connect(audioContext.destination);
        }, 
        function (error) {
            console.log("There was an error when getting microphone input: " + err);
        }
    );
}

function handle_stopMonitoring() {
    webRtcSource.disconnect(); 
    webRtcSource = null; 
}

Some notes on prefixing

Audio context

Chrome and WebKit-based browsers require prefixing in order to create an audio context. I've noticed that Firefox and Edge do not have this requirement.

Create a new audio context as follows:

var audioContext = new (window.AudioContext || window.webkitAudioContext)();

WebRTC

The WebRTC API is still in draft, and most browsers require prefixing in order to use it. In this example I ensure the ‘navigator.getUserMedia‘ method is set by assigning it whichever prefixed version is available.

navigator.getUserMedia = (navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia);
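
Newer browsers also expose a promise-based version of this API at navigator.mediaDevices.getUserMedia (the one I mention again in the conclusion). Where it exists, a sketch like this could be used instead of the prefixed callback version:

// Prefer the promise-based API when the browser provides it
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
    navigator.mediaDevices.getUserMedia({ audio: true, video: false })
        .then(function (mediaStream) {
            webRtcSource = audioContext.createMediaStreamSource(mediaStream);
            webRtcSource.connect(audioContext.destination);
        })
        .catch(function (error) {
            console.log("There was an error when getting microphone input: " + error);
        });
}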

Avoiding prefixing

You can use libraries such as Modernizr to get around prefixing, but in this example I am avoiding extraneous libraries.

Chrome gotchas

I never had any luck testing this locally using Chrome 47… In fact I spent an hour trying to debug, only to find that Chrome recently added a security policy around WebRTC. It seems that you can now only use these features over HTTPS or on local connections.

Make sure you consider this when testing!
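
A quick sanity check before calling getUserMedia can save you that hour. This is only a sketch; newer browsers expose window.isSecureContext, and the location check is a rough fallback for ones that don't:

// Warn early if the page is unlikely to be allowed to use getUserMedia
var isSecure = window.isSecureContext ||
               location.protocol === "https:" ||
               location.hostname === "localhost";
if (!isSecure) {
    console.warn("getUserMedia will likely be blocked: serve this page over HTTPS or from localhost.");
}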

Read more by following this link.

Conclusion

This was a really simple example of how to use both WebRTC and the Web Audio API. I am preparing another article which shows you how to access audio samples in code, add some processing effects, and use the new promise-style API to access WebRTC media devices.

If you have any comments or corrections then please leave them below.

Stay tuned!

Follow @nicdoescode on Twitter.

Introduction to the Web Audio API

The Web Audio API is a wonderful addition to the array of APIs introduced for the modern web. It is a versatile API that allows you to do some pretty exciting things with audio, such as recording, applying effects, mixing, and conversion. The API runs completely in the browser; you do not need external plugins such as Adobe Flash or Microsoft Silverlight in order to use it.

This article takes a high level look at the Web Audio API and I will be going into further detail in coming articles. At the bottom of the article you’ll find some practical code samples so that you can get started quickly with the Web Audio API.

Web Audio API in a nutshell

The API itself follows a very straightforward chain-of-responsibility pattern (meaning we daisy chain a series of nodes off each other).

You can describe a simple use case as follows:

  1. Instantiate an AudioContext object.
  2. Define the input source for the AudioContext (e.g. a microphone).
  3. Optionally create one or more effect nodes, connect the first to the input source node, and then daisy chain any additional ones off the last effect node.
  4. Connect the last node in the chain to a destination node (for example the AudioContext's destination, which is usually your speakers).
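
Putting those four steps together, here's a minimal sketch that plays a quiet 440 Hz tone through a gain node (nothing here comes from a particular sample, it's just the chain described above made concrete):

// Oscillator -> gain -> speakers
var audioContext = new (window.AudioContext || window.webkitAudioContext)();

var oscillator = audioContext.createOscillator(); // steps 1 & 2: context plus an input source
oscillator.frequency.value = 440;                 // a 440 Hz tone

var gainNode = audioContext.createGain();         // step 3: an effect node
gainNode.gain.value = 0.25;                       // keep it quiet

oscillator.connect(gainNode);                     // daisy chain source -> effect
gainNode.connect(audioContext.destination);       // step 4: effect -> destination (speakers)

oscillator.start(0);                              // start generating the tone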

AudioContext

The AudioContext object

Think of this as the container for your audio processing chain. In addition it provides useful functions such as decoding audio, creating buffers, and so on.

Audio inputs

The input node can be an HTML5 <audio> element, a stream such as the microphone via WebRTC, or an oscillator node (a built-in node that generates an artificial tone).

The audio input node provides an endpoint that you can connect effect nodes, or a destination node, to.
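
As a quick sketch of the first option, you can wrap an existing <audio> element as an input node. This assumes a hypothetical element with the id 'myAudioElement' on the page:

// Route an <audio> element's output through the audio context
var audioContext = new (window.AudioContext || window.webkitAudioContext)();
var audioElement = document.getElementById('myAudioElement');
var elementSource = audioContext.createMediaElementSource(audioElement);

// The element's audio now flows through the context rather than playing directly
elementSource.connect(audioContext.destination);
audioElement.play();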

Effect nodes

These take in audio data from a node, optionally apply some processing, and then provide an endpoint so that you can connect other nodes, or a destination node, to them.

The API provides a number of pre-built effect nodes such as channel splitters/mergers, gain adjustment, linear convolution, compression and delay.

You can write your own nodes by using the ScriptProcessorNode.
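
To give a flavour of that, here's a sketch of a ScriptProcessorNode that simply halves the volume of whatever flows through it (the commented 'someSourceNode' at the end is a placeholder for whichever node feeds it):

var audioContext = new (window.AudioContext || window.webkitAudioContext)();

// Create a processor with a 4096-sample buffer, one input channel and one output channel
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
scriptNode.onaudioprocess = function (audioProcessingEvent) {
    var input = audioProcessingEvent.inputBuffer.getChannelData(0);
    var output = audioProcessingEvent.outputBuffer.getChannelData(0);
    for (var i = 0; i < input.length; i++) {
        output[i] = input[i] * 0.5; // halve every sample
    }
};

// It connects into the chain like any other node:
// someSourceNode.connect(scriptNode);
// scriptNode.connect(audioContext.destination);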

Destinations

The last link in your audio chain is the destination. This can take the form of your speakers, an audio buffer, or a MediaStreamAudioDestinationNode that creates a MediaStream which can be used like any WebRTC stream.
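
For the MediaStream case, a sketch looks like this (I'm using an oscillator as the source purely so the snippet stands on its own; any chain could end at the stream destination instead of the speakers):

// Route the end of a chain into a MediaStream instead of the speakers
var audioContext = new (window.AudioContext || window.webkitAudioContext)();
var oscillator = audioContext.createOscillator();
var streamDestination = audioContext.createMediaStreamDestination();

oscillator.connect(streamDestination); // end the chain at the stream destination
oscillator.start(0);

// streamDestination.stream is a regular MediaStream, usable wherever WebRTC expects one
var outgoingStream = streamDestination.stream;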

Audio buffers

The AudioBuffer is an object that represents audio in memory. You can create one with an AudioContext or obtain one from various audio sources.

After you have created the AudioBuffer object you can use it to read and write the audio samples directly using JavaScript.

When finished you can wrap the buffer in an AudioBufferSourceNode that can be consumed by an AudioContext as an input source.
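
Here's a small sketch of that whole round trip: create a one-second mono buffer, write a 440 Hz sine wave into it by hand, then wrap it in a source node and play it:

// Create, fill, and play an AudioBuffer
var audioContext = new (window.AudioContext || window.webkitAudioContext)();
var buffer = audioContext.createBuffer(1, audioContext.sampleRate, audioContext.sampleRate); // 1 channel, 1 second

var samples = buffer.getChannelData(0); // a Float32Array we can write to directly
for (var i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(2 * Math.PI * 440 * (i / audioContext.sampleRate)) * 0.25;
}

var bufferSource = audioContext.createBufferSource(); // wrap the buffer in a source node
bufferSource.buffer = buffer;
bufferSource.connect(audioContext.destination);
bufferSource.start(0);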

Code samples

I am in the process of creating a number of code samples for the Web Audio API. As I create these samples I’ll update them here on this page.

In closing

I hope you enjoyed this article. If you have any feedback or corrections for me then please drop me a comment 🙂

Follow @nicdoescode on Twitter.

So, what of TypeScript?

I’ve come to really like this language and I hope to publish articles about interesting things that I do and learn in TypeScript.

This post will talk a little about TypeScript at a high, abstract level.

Firstly, what is TypeScript?

In a nutshell, it is a language that transpiles to JavaScript, targeting either ES5 or ES6.

TypeScript implements a lot of ES6 features and supersets the language with its own range of features.

So why would you use it in place of ES6?

First of all, I love JavaScript and I am super excited about ES6; I really want to spend some more one-on-one time with it. TypeScript fits in nicely with me, and with ES6, because it implements lots of ES6 features, it can compile to ES6, and it allows you to apply OO thinking to JavaScript.

You should always use the right tool for the right job. I find TypeScript great for applications where you're working with data, especially in the case of a REST API, where you can define constraints on what you send to and receive from the API.

The module based approach works well with single page apps, or anything that’s componentised.

If you’ve done something stupid in your code, then the compiler will throw an error, alerting you to mistakes before you even run your code. But remember that constraints are only applied at compile time, and when running your code is JavaScript.

I wouldn’t use TypeScript in a situation where you’re mostly manipulating the DOM or not working with a lot of data structures.

Future proof

TypeScript is an open source project spearheaded by Microsoft and supported by a vast range of companies including Google who’ve written Angular 2.0 in TypeScript.

I would expect TypeScript to be maintained for years to come.

Your code base is also future-proofed; as ES6 features are implemented, your existing code can be re-transpiled to target them without any changes.

Some cool features

These are some of the features of TypeScript that I enjoy:

Generics

var result: Array<NinjaTurtle> = []; 

Enums

enum TurtleColour { Blue, Purple, Orange, Red };
interface NinjaTurtle {
    bandannaColour: TurtleColour;
}
var newTurtle: NinjaTurtle = { bandannaColour: TurtleColour.Purple };

Easy classes and modules

module MathStuff {

    // 'export' makes the class visible outside the module 
    export class Addition {

        // Static methods are supported 
        static addTwoNumbers(first: number, second: number): number {
            return first + second;
        }

        // Return type omitted, the compiler infers it 
        static multiplyTwoNumbers(first: number, second: number) {
            return first * second; 
        }

    }

}

class Totaller {

    private total: number; 

    constructor(initialValue: number) {
        this.total = initialValue; 
    }

    // Methods are public by default
    getCurrentTotal() {
        return this.total; 
    }

    // No 'return' call, therefore it returns void 
    addToTotal(amount: number) {
        this.total += amount; 
    }

    // Use the module we defined above 
    addToTotal_usingModule(amount: number) {
        this.total = MathStuff.Addition.addTwoNumbers(this.total, amount);
    }

}

Get / Set accessors

class StringKeeper {

    private stringToKeep: string; 

    constructor() {
        this.stringToKeep = "Default";
    }

    // Getter / setter backed by the private field above
    // (accessing 'this.currentValue' inside these would recurse forever)
    get currentValue() {
        return this.stringToKeep; 
    }
    set currentValue(value: string) {
        this.stringToKeep = value; 
    }

}

Can you use libraries such as JQuery or React with TypeScript?

Yes you can! In fact you can use any JavaScript library with TypeScript.

The cheat way is to cast the object to ‘any’, which tells the TypeScript compiler to not enforce typings on that object.

var someJsObject = <any>fantasticLibrary.returnSomeObject({ ingredient: "custard" });

Of course, to take full advantage of TypeScript and its typings you'll need a ‘definition file’, which ends with the extension ‘.d.ts’ (example: jquery.d.ts). You can write your own or you can take advantage of the wonderful library of typings available at DefinitelyTyped.

DefinitelyTyped maintains a high quality library of TypeScript typings for many JavaScript libraries, both common and obscure.

Where can I get started with TypeScript?

At the very least I’ve made you aware of TypeScript, but hopefully I’ve piqued your interest enough for you to want to find out more.

Check out typescriptlang.org which is the official home of TypeScript. You can get some great insight into the language by checking out their TypeScript handbook.

Nothing is stopping the hardcore among you from coding TypeScript in Notepad, but for the rest of us, most popular IDEs support TypeScript.

Thanks for reading, and I hope you enjoy TypeScript.