Web Real Time Communication (WebRTC)
Enabling human communication via voice and video (Real-Time Communication) was a major challenge for the web. WebRTC allows web browsers not only to request resources from backend servers, but also to exchange real-time information directly with other users' browsers. This enables applications such as video conferencing, file transfer, chat and desktop sharing without the need for internal or external plugins. Put simply, WebRTC enables peer-to-peer communication.
Historically, RTC has been corporate and complex, requiring expensive audio and video technologies to be licensed or developed in house. Integrating RTC technology with existing content, data and services has been difficult and time consuming, particularly on the web.
Gmail video chat became popular in 2008, and in 2011 Google introduced Hangouts, which used the Google Talk service (as did Gmail). Google bought GIPS, a company that had developed many components required for RTC, such as codecs and echo cancellation techniques. Google open sourced the technologies developed by GIPS and engaged with the relevant standards bodies at the IETF and W3C to ensure industry consensus. In May 2011, Ericsson built the first implementation of WebRTC.
Why do we need WebRTC?
- There has been no free, high-quality, complete solution that enables real-time communication in the browser; WebRTC provides one.
- Many web services already use RTC, but need downloads, native apps or plugins. These include Skype, Facebook (which uses Skype) and Google Hangouts (which uses the Google Talk plugin). Downloading, installing and updating plugins can be complex, error prone and annoying. WebRTC does not require any plugins.
- WebRTC is already integrated with best-of-breed voice and video engines that have been deployed on millions of endpoints over the last eight-plus years, and Google does not charge royalties for WebRTC.
WhatsApp, Facebook Messenger, appear.in and platforms such as TokBox now use WebRTC, and it is supported by Google Chrome, Firefox, Opera and Microsoft Edge.
WebRTC applications need to do several things:
- Get streaming audio, video or other data.
- Get network information such as IP addresses and ports, and exchange this with other WebRTC clients (known as peers) to enable connection, even through NATs and firewalls (a sketch of this candidate exchange follows this list).
- Coordinate signaling communication to report errors and initiate or close sessions.
- Exchange information about media and client capability, such as resolution and codecs.
- Communicate streaming audio, video or data.
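As a rough sketch of the second and third points: the IP address/port combinations discovered for a peer surface as ICE candidate events on an RTCPeerConnection (described below) and must be relayed to the other peer over whatever signaling channel the application chooses. The signalingChannel below is only an illustrative stand-in (here a WebSocket to a hypothetical server); WebRTC itself does not prescribe the signaling transport.

// Signaling transport chosen by the application; the URL is a placeholder.
const signalingChannel = new WebSocket('wss://example.com/signaling');

// A STUN server lets the browser discover its public address behind a NAT.
const peerConnection = new RTCPeerConnection({
  iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
});

// Each discovered IP/port combination (ICE candidate) is handed to the
// application, which forwards it to the remote peer via signaling.
peerConnection.onicecandidate = event => {
  if (event.candidate) {
    signalingChannel.send(JSON.stringify({ candidate: event.candidate }));
  }
};

// Candidates received from the remote peer are added to the connection.
signalingChannel.onmessage = message => {
  const data = JSON.parse(message.data);
  if (data.candidate) {
    peerConnection.addIceCandidate(data.candidate);
  }
};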
To communicate streaming data, WebRTC implements three APIs:
1. MediaStream (getUserMedia)
- Get access to data streams, such as from the user's camera and microphone.
- Available in Chrome, Firefox, Opera and Edge.
2. RTCPeerConnection
- Enables audio or video calling, with facilities for encryption and bandwidth management.
- Supported by Chrome (on desktop and on Android), Opera (on desktop and in the latest Android beta) and Firefox.
3. RTCDataChannel
- Enables peer-to-peer communication of generic data.
- Supported by Chrome, Opera and Firefox.
MediaStream (getUserMedia)
The getUserMedia() method prompts the user for permission to use a media input which produces a MediaStream with tracks containing the requested types of media. That stream can include a video track (produced by either a hardware or virtual video source such as a camera, video recording device, screen sharing service, and so forth), an audio track (similarly, produced by a physical or virtual audio source like a microphone, A/D converter, or the like), and possibly other track types.
It returns a Promise that resolves to a MediaStream object. If the user denies permission, or matching media is not available, then the promise is rejected with PermissionDeniedError or NotFoundError respectively.
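For illustration, a minimal video-only call to the promise-based navigator.mediaDevices.getUserMedia() might look like the sketch below; the #preview element id is an assumption for the example, not part of the API.

// Request a video-only stream; the browser prompts the user for camera access.
navigator.mediaDevices.getUserMedia({ video: true, audio: false })
  .then(stream => {
    console.log(stream.getAudioTracks()); // [] - no audio was requested
    console.log(stream.getVideoTracks()); // one MediaStreamTrack from the webcam
    // Show a local preview in a <video> element assumed to exist in the page
    // as <video id="preview" autoplay>.
    document.querySelector('#preview').srcObject = stream;
  })
  .catch(error => {
    // Rejected if the user denies permission or no matching device exists.
    console.error('getUserMedia failed:', error.name);
  });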
In the above example there is no audio, so stream.getAudioTracks() returns an empty array and stream.getVideoTracks() returns an array of one MediaStreamTrack representing the stream from the webcam. Each MediaStreamTrack has a kind ('video' or 'audio') and a label (something like 'FaceTime HD Camera (Built-in)'), and represents one or more channels of either audio or video. In this case, there is only one video track and no audio, but it is easy to imagine use cases where there are more: for example, a chat application that gets streams from the front camera, rear camera, microphone and a 'screenshared' application.
Each MediaStream has an input, which might be a MediaStream generated by navigator.mediaDevices.getUserMedia(), and an output, which might be passed to a video element or an RTCPeerConnection.
RTCPeerConnection
The RTCPeerConnection interface represents a WebRTC connection between the local computer and a remote peer. It provides methods to connect to a remote peer, maintain and monitor the connection, and close the connection once it's no longer needed.
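A sketch of the caller's side of that process follows. The startCall and handleAnswer helpers, the remoteVideo element and the signalingChannel are illustrative assumptions (the channel could be the WebSocket from the earlier sketch); the answering peer would run the mirror-image code, creating an answer instead of an offer.

// Caller side: attach local media, create an offer and send it via signaling.
async function startCall(peerConnection, signalingChannel, remoteVideo) {
  // Add local camera and microphone tracks so they can be sent to the peer.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  stream.getTracks().forEach(track => peerConnection.addTrack(track, stream));

  // Render whatever media the remote peer sends back.
  peerConnection.ontrack = event => {
    remoteVideo.srcObject = event.streams[0];
  };

  // The offer describes this client's media capabilities (codecs, resolutions, ...).
  const offer = await peerConnection.createOffer();
  await peerConnection.setLocalDescription(offer);
  signalingChannel.send(JSON.stringify({ offer }));
}

// When the remote peer's answer arrives over signaling, apply it to complete the handshake.
async function handleAnswer(peerConnection, answer) {
  await peerConnection.setRemoteDescription(answer);
}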
RTCDataChannel
The RTCDataChannel interface represents a network channel which can be used for bidirectional peer-to-peer transfers of arbitrary data. Every data channel is associated with an RTCPeerConnection, and each peer connection can have up to a theoretical maximum of 65,534 data channels (the actual limit may vary from browser to browser).
To create a data channel and ask a remote peer to join you, call the RTCPeerConnection's createDataChannel() method. The peer being invited to exchange data receives a datachannel event (which has type RTCDataChannelEvent) to let it know the data channel has been added to the connection.
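A minimal sketch of both roles, assuming an RTCPeerConnection has already been set up on each side ('chat' is just an arbitrary channel label chosen for the example):

// Offering peer: create the channel before the offer/answer exchange.
const chatChannel = peerConnection.createDataChannel('chat');
chatChannel.onopen = () => chatChannel.send('hello from the offerer');
chatChannel.onmessage = event => console.log('received:', event.data);

// Answering peer: the new channel is announced via the datachannel event.
peerConnection.ondatachannel = event => {
  const channel = event.channel;
  channel.onopen = () => channel.send('hello back');
  channel.onmessage = e => console.log('received:', e.data);
};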
Security
There are several ways a real-time communication application or plugin might compromise security.
- Unencrypted media or data might be intercepted en route between browsers, or between a browser and a server.
- An application might record and distribute video or audio without the user knowing.
- Malware or viruses might be installed alongside an apparently innocuous plugin or application.
WebRTC has several features to avoid these problems:
- WebRTC implementations use secure protocols such as DTLS and SRTP.
- Encryption is mandatory for all WebRTC components, including signaling mechanisms.
- WebRTC is not a plugin: its components run in the browser sandbox rather than in a separate process, do not require separate installation, and are updated whenever the browser is updated.
- Camera and microphone access must be granted explicitly, and when the camera or microphone is running this is clearly shown by the user interface.