The leading authority on WebRTC

Scalability, VP9, and what it means for WebRTC

Thu, 04/14/2016 - 12:00

Why and where do we use SVC exactly?

[When Alex Eleftheriadis, Ph.D., the Chief Scientist & Co-founder of Vidyo, approached me about writing a piece about SVC and WebRTC – how could I refuse? Someone had to give the explanation, and what better person than Alex to do that?]

Just when the infamous WebRTC video codec debate appears to have been settled, with both H.264 and VP8 being set as mandatory-to-implement by browsers, VP9 has started making inroads into the WebRTC software stack and into browsers themselves. Indeed, Chrome 48 includes, for the first time, VP9 support for WebRTC. Firefox also includes support for it in WebRTC in the Developer Version of Firefox 46.
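For developers who want to try VP9 where it is available, one common approach is to move it to the front of the codec list before negotiating. The sketch below is only an illustration: `CodecSketch`, `preferCodec`, and the sample list are hypothetical names and data. In a browser that exposes them, the real list would come from `RTCRtpSender.getCapabilities("video")` and the reordered result would be handed to `RTCRtpTransceiver.setCodecPreferences()`.

```typescript
// Minimal shape of a codec entry. It mirrors the mimeType and clockRate
// fields found in browser codec capability objects (assumption: only
// these two fields matter for the illustration).
interface CodecSketch {
  mimeType: string;
  clockRate: number;
}

// Move every entry whose mimeType matches `preferred` to the front,
// keeping the relative order of everything else.
function preferCodec(codecs: CodecSketch[], preferred: string): CodecSketch[] {
  const wanted = preferred.toLowerCase();
  const first = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...first, ...rest];
}

// Hypothetical capability list, for illustration only.
const sampleCodecs: CodecSketch[] = [
  { mimeType: "video/VP8", clockRate: 90000 },
  { mimeType: "video/H264", clockRate: 90000 },
  { mimeType: "video/VP9", clockRate: 90000 },
];

const ordered = preferCodec(sampleCodecs, "video/VP9");
console.log(ordered.map((c) => c.mimeType).join(", "));
```

If VP9 is missing from the list, the order is left untouched, so the call simply falls back to whatever the browser already prefers.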

Why is this relevant for the WebRTC community – users and developers? First off, VP9 offers significantly better compression efficiency compared with H.264, and even more so compared with VP8. This translates to better quality for the same bit rate, or a lower bit rate for the same quality (as low as 50%). This by itself is a big plus, but it does not tell even half of the story.

The Need for Scalability

When using WebRTC beyond two-way, peer-to-peer calls, or in networks with significant quality problems, system architects are encountering the same design issues that the videoconferencing industry has been dealing with for a long time now. It is not accidental then that WebRTC solutions designed for multi-point video gravitate towards those offered in videoconferencing, or that videoconferencing companies are adapting their systems to become WebRTC solutions. For the latter, this typically entails aligning with transport-level, security, and NAT traversal specifications, and of course providing a JavaScript library that enables WebRTC-enabled browsers to use their system’s facilities.

If we look at today’s architectural landscape for high-quality multi-point video, there are two main designs. One is based on transmission of a single stream of scalable coded video. Scalable means that the same bitstream contains subsets, called layers, that allow you to reconstruct the original at different resolutions. If you get the lowest, or base, layer you can decode the video at a certain resolution, whereas if you also get a higher, or enhancement layer, you can decode the video at a higher resolution. This is great for robustness and adaptability, because you do not need to process the video at all to get at the different resolutions.
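To make the "no processing" point concrete, here is a minimal sketch of what extracting a lower resolution from a scalable stream amounts to. The packet shape, the layer numbering, and the byte counts are all assumptions made for the illustration; real scalable bitstreams carry the layer information in codec- and RTP-specific headers.

```typescript
// One packet of a scalable stream, tagged with the layer it belongs to.
// Layer 0 is the base layer; higher numbers are enhancement layers.
interface LayeredPacket {
  seq: number;
  layer: number;
  bytes: number;
}

// Keep only the packets needed to decode up to `maxLayer`.
// Nothing is decoded or re-encoded: packets are either kept or dropped.
function extractLayers(stream: LayeredPacket[], maxLayer: number): LayeredPacket[] {
  return stream.filter((p) => p.layer <= maxLayer);
}

// Hypothetical two-layer stream: a base layer and one enhancement layer.
const scalableStream: LayeredPacket[] = [
  { seq: 1, layer: 0, bytes: 300 },
  { seq: 2, layer: 1, bytes: 700 },
  { seq: 3, layer: 0, bytes: 320 },
  { seq: 4, layer: 1, bytes: 680 },
];

const baseOnly = extractLayers(scalableStream, 0);   // lower resolution
const fullStream = extractLayers(scalableStream, 1); // full resolution
console.log(baseOnly.map((p) => p.seq), fullStream.length);
```

The filter is the whole operation, which is why a server or a receiver can adapt the stream without touching the video signal itself.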

The second design is based on simulcast transmission of two separate streams that encode the same video at different resolutions. Contrary to the scalable design, here we have two encoding passes rather than one, with the associated streams requiring a higher bitrate compared with scalable coding. It is also less error resilient. On the plus side, however, simulcast allows the use of older, non-scalable decoders. This has been an important consideration for systems that interface with legacy devices (not relevant for WebRTC).

Single Layer, Scalable, and Simulcast Coding of Video. In scalable coding the various layers (“a” and “A”) are multiplexed in a single stream. In simulcast two or more independently encoded streams are produced and are transmitted separately.

Both of these designs utilize a special type of server for which I have coined the term “Selective Forwarding Unit” (SFU). This type of server was not known when the original RTP Topologies RFC was published in 2008 (RFC 5117), but it is now included in its 2015 update, RFC 7667.

The operation of the SFU, using the VidyoRouter as an example. In the diagram the SFU receives three scalable streams, and it selects to forward the full resolution for the blue participant (base and enhancement layers), but only the base layer for the green and yellow participants.

The SFU works in the following way: it receives scalable or simulcast video, and it decides which layer or which stream to forward to a receiving participant. There is no signal processing involved, and the operation incurs very little delay (less than 10 ms is typical). If we contrast this with the traditional architectures that are still being used and involve transcoding of multiple videos, the advantages are obvious – both in terms of processing complexity and in terms of delay (150 ms delays would be typical for the traditional architectures). Minimizing delay is hugely important for perfecting the end-user experience.
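The forwarding decision described above can be sketched as a simple selection per receiver. This is a toy model, not how any particular SFU is implemented: the `LayerOption` and `Receiver` types, the bandwidth figures, and the rule "highest option that fits, otherwise the lowest" are all assumptions for the illustration. The same function works whether the options are layers of one scalable stream or separate simulcast streams.

```typescript
// One thing the SFU could forward for a given sender.
interface LayerOption {
  name: string;        // e.g. "base" or "base+enhancement"
  bitrateKbps: number; // total bitrate needed to receive this option
}

// A receiving participant and the bandwidth the SFU believes it has.
interface Receiver {
  id: string;
  availableKbps: number;
}

// Pick the highest-bitrate option that fits the receiver's bandwidth.
// If nothing fits, fall back to the lowest option.
function selectOption(options: LayerOption[], availableKbps: number): LayerOption {
  if (options.length === 0) throw new Error("sender offers no options");
  const sorted = [...options].sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  return sorted.reduce((best, o) => (o.bitrateKbps <= availableKbps ? o : best));
}

// Hypothetical numbers, for illustration only.
const senderOptions: LayerOption[] = [
  { name: "base", bitrateKbps: 300 },
  { name: "base+enhancement", bitrateKbps: 1200 },
];
const receivers: Receiver[] = [
  { id: "blue", availableKbps: 2000 },
  { id: "green", availableKbps: 500 },
  { id: "yellow", availableKbps: 200 },
];

const plan = receivers.map((r) => ({
  receiver: r.id,
  forward: selectOption(senderOptions, r.availableKbps).name,
}));
console.log(plan);
```

Note that the decision only looks at metadata. No frame is decoded, which is where the low delay comes from.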

What is interesting is also how the receiving endpoint operates. Contrary to legacy videoconferencing systems, it receives multiple streams that it has to individually decode, compose, and display on the screen. This multi-stream architecture perfectly matches WebRTC’s design.

The multi-stream architecture of an SFU endpoint – the endpoint receives multiple video streams that it has to individually decode, and composite on the user’s screen.
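As a rough illustration of the compositing step, the sketch below computes where each received stream would be placed on the screen. The grid rule and all names (`Tile`, `layoutGrid`) are assumptions for the example. In a browser, each remote stream would typically arrive through the peer connection's `track` event and be attached to its own video element; that part is left out so the layout logic stands on its own.

```typescript
// Position and size of one participant's video on the screen.
interface Tile {
  streamId: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Lay out the received streams in a near-square grid of equal tiles.
function layoutGrid(streamIds: string[], canvasWidth: number, canvasHeight: number): Tile[] {
  if (streamIds.length === 0) return [];
  const cols = Math.ceil(Math.sqrt(streamIds.length));
  const rows = Math.ceil(streamIds.length / cols);
  const width = canvasWidth / cols;
  const height = canvasHeight / rows;
  return streamIds.map((streamId, i) => ({
    streamId,
    x: (i % cols) * width,
    y: Math.floor(i / cols) * height,
    width,
    height,
  }));
}

// Three hypothetical remote participants on a 1280x720 canvas.
const tiles = layoutGrid(["blue", "green", "yellow"], 1280, 720);
console.log(tiles);
```

Because the endpoint owns the layout, it can also ask for a higher layer only for the tile that is displayed large, which ties back to the SFU's selection logic.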

To appreciate the significance of these architectures it suffices to point out that both Skype for Business and Google+ Hangouts use simulcasting (of H.264 and VP8, respectively). So does the open source VideoBridge by Jitsi. Vidyo, which first introduced the concept in its VidyoRouter product in 2008, is using scalability (with H.264 SVC). Simulcast support is now in the scope of the WebRTC 1.0 specification and it is being actively worked upon. Scalable coding is already supported by the ORTC specification, and will be addressed in WebRTC-NV (post 1.0).
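For a feel of what simulcast looks like from the sending side, here is a hedged sketch of a three-stream configuration. The field names `rid`, `scaleResolutionDownBy`, and `maxBitrate` follow the encoding parameters as they later took shape in the WebRTC specification. The `EncodingSketch` type, the `buildSimulcastEncodings` helper, and the way the bitrate is split are assumptions made for the illustration.

```typescript
// Local stand-in for the browser's encoding parameters, so the sketch
// runs outside a browser. Only three fields are modeled.
interface EncodingSketch {
  rid: string;                   // identifier of this simulcast stream
  scaleResolutionDownBy: number; // 1 = full size, 2 = half, 4 = quarter
  maxBitrate: number;            // bits per second
}

// Build three encodings of the same video: quarter, half and full size.
// The bitrate fractions are arbitrary choices for the example.
function buildSimulcastEncodings(topBitrateBps: number): EncodingSketch[] {
  return [
    { rid: "q", scaleResolutionDownBy: 4, maxBitrate: Math.round(topBitrateBps / 8) },
    { rid: "h", scaleResolutionDownBy: 2, maxBitrate: Math.round(topBitrateBps / 3) },
    { rid: "f", scaleResolutionDownBy: 1, maxBitrate: topBitrateBps },
  ];
}

const encodings = buildSimulcastEncodings(1200000);
console.log(encodings);

// In a browser, the array would be passed when adding the track, e.g.:
//   pc.addTransceiver(track, { direction: "sendonly", sendEncodings: encodings });
```

The sender encodes the same video three times here, which is the extra encoding cost and bitrate that scalable coding avoids.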

Scalability, SVC and VP9

Now we can turn back to our original question regarding scalability and VP9. If you want to be able to use an SFU architecture with scalable coding, the codec itself must support scalability. That’s why back in 2013 Vidyo announced that it would be collaborating with Google to develop a scalable extension for the VP9 codec. This effort is now bearing fruit.

One may ask, “why care about VP9, I will just use whatever stock codec my browser has and be done with it.” The answer is that you do want to care, when quality matters. Depending on the codec used, and the type of multi-point server architecture deployed, the end user will get a vastly different quality of experience.

We can think of the WebRTC endpoint as a kitchen that has a bunch of ingredients. If your expectations are low, you can go for the raw vegetables and have a meal in no time. If you want a fine meal, you will want both the right ingredients as well as the right recipe. The standardization process will ensure that the WebRTC kitchen has all the right ingredients. The recipe and, in fact, the cook, are all part of whoever is offering the service. By taking into account all the realities of imperfect network transmission, heterogeneous clients, mobility, etc., they make sure that the users enjoy a great experience. If you go with a proprietary solution, you can then add plenty of secret sauce.

Endpoint Quality Scale: One ordering of relative quality of different codec and endpoint engine combinations.

Taking into account the different combinations of video codecs and endpoint engines, I put together an “Endpoint Quality Scale” diagram, shown above. You can think of it as the skeleton of the multi-point video kitchen menu. Vidyo is vigorously trying to be the three Michelin star restaurant; its proprietary engine uses a lot of secret sauce in addition to the standard ingredients. But together with the industry as a whole we want to make sure that the menu, especially when it comes to WebRTC, offers something for all tastes and price ranges.

Bottom line, when people select platform providers for their WebRTC-based solutions they need to be aware of these differences and, especially when quality matters, make an educated and well-informed choice. Bon appetit.

The post Scalability, VP9, and what it means for WebRTC appeared first on

You Won’t Find Guesstimates in My WebRTC PaaS Report

Mon, 04/11/2016 - 12:00

Forecasts are overrated.

I’ve been asked time and again things related to the market sizing of WebRTC, and I’ve tried to shy away from it all the time. The Dilbert strip above explains why…

This whole notion of estimating the size of a market that is, to be frank, hard to define, lacking solid numbers and too new, leads to one question: why bother?

What are you looking for? The market size of WebRTC contact centers? Is that only for the WebRTC piece of it? Greenfield ones? With or without call widgets in tiny WordPress sites? How do you place the monetary value on it? Is it the WebRTC part or the whole contact center you’re interested in? Do you want the number to amount to a billion $ and go backwards from there so it fits your desired strategy?

All useless.

2 billion users. X% CAGR. 15% YoY growth.


Any day.

With something like WebRTC, such things are close to impossible as far as I can tell, and probably not really worth it. Need to throw a number in the air? Generate it randomly. If it’s good enough for the TSA, then why not for you?

Which leads me to something you won’t find in my WebRTC PaaS report – the one that deals with the WebRTC API market and assists developers in understanding whether they should use a vendor, and in picking the right vendor (if that’s the course selected). These estimates don’t help in such a case. They are worse than useless.

Need estimates? Find some other report online. They will happily share their guesstimates in press releases out there (seen a few lately) so you can decide if it is worth paying to get that “validation” you need for your management.

Need to make real decisions on how and what to implement? That probably won’t be in these reports.


Microsoft, Apple and WebRTC in 2016

Thu, 04/07/2016 - 12:00

There’s progress, but the real action will be in 2017.

There has been a lot of chatter lately around Apple’s snail-like progress in supporting WebRTC and Microsoft’s announcements at their BUILD conference. I am still left under-impressed but positive and confident. Here’s why.

Apple and WebRTC

Let’s start with Apple. The only official statement we will get from Apple will be “we have WebRTC”. Question is when, for what and if at all.

We have indications of progress in Apple, and Alex is keeping us updated on the goings with Apple and WebRTC.

I think Itay Rosenfeld is making a good case why Apple needs WebRTC more than WebRTC needs Apple.

So we know WebRTC is of interest to Apple and we know it is being added to Safari.

We know one more thing. Apple is actually trying to refresh and update its Safari browser. Dare I say “modernize” it. They even recently started a Technology Preview for Safari, joining the rest of the gang of browser vendors to showcase their upcoming plans and intentions. That doesn’t include WebRTC, but WebKit indicates WebRTC as “in development” – and WebKit is the rendering engine used by Safari.

Will Safari include WebRTC? Yes.

When? My guess is end of 2016.

What will it include? WebRTC. H.264. No VPx “nonsense”.

Where? On Mac OS X, but not on iOS. That one will come in 2017.

Microsoft’s Romance with xRTC

Microsoft added ORTC to Edge. I shared my view about Edge already. To sum it up – great browser. No adoption.

To date, there has been little adoption of Edge/ORTC by vendors. If my memory serves me right, adopters include Twilio, &yet and Frozen Mountain. That’s less than impressive. And Microsoft knows that.

The problem here isn’t ORTC. It is Edge. And Microsoft seems to miss that minor detail.

At the recent Microsoft BUILD conference, a few announcements were made (thanks @hcornflower for the tip):

"We now have more than 150 million monthly active devices using Microsoft Edge" — @morris_charles #EdgeWebSummit

— Kenneth Auchenberg (@auchenberg) April 4, 2016

So. “150 million” monthly active devices. But no monthly minutes as in their last disclosure. I wonder what monthly active means and how many of them open it up just to get to IE when Chrome doesn’t work. I know that’s how I use it to get to a Silverlight site that my kid wants to use.

I guess this number was high and positive enough that the managers at Microsoft decided to focus on it instead of the more important number of average use time per user. This led them to this decision:

MS announces new WebRTC goodies coming to Edge:
H.264/AVC, VP8, MediaRecorder, DTLS 1.2, ECDSA certs

— Justin Uberti (@juberti) April 4, 2016

  1. Adding WebRTC and not ignoring it with an implementation of only ORTC. At long last, they came to their senses and decided to make it easier for developers to support Edge instead of having developers look at the abysmal market share of Edge. The challenge they face is that people who switch to Windows 10 opt for Chrome over Edge more often than not, and in the enterprise die-hards stick with IE. Until this trend changes, there’s a real issue here
  2. Supporting H.264, and not only their internal H.264UC proprietary codec. This makes sense, as Chrome is adding H.264 and Firefox is already supporting it
  3. VP8 is “under consideration”, so we will have it after H.264. Probably somewhere into 2017. Too late
  4. No VP9. With current development speeds, I can see VP9 getting adopted en masse in many use cases and Microsoft Edge staying behind, with no Edge

It would have been better to just add WebRTC to IE11 in parallel than to entice users to switch altogether to Chrome.

When will vendors need to revisit Edge when it comes to WebRTC? Not before Q4 2016.

Microsoft Skype

Skype is interesting. Late to the market. 300 million active users. A lot, but unimpressive if you compare to the leading consumer communication services that are out there.

Skype for Web is what Google Hangouts did the first two or three years of WebRTC’s existence – took components of the WebRTC implementation, modified them to fit their needs and made a plugin for Hangouts out of them. Until they just made it “native” to the browser when they could.

As written in a recent comment I’ve read – they should have done this 5 years ago, but better late than never.

The more interesting part here is the newly minted Skype SDK. I think this is Skype’s third attempt at an SDK – there may have been more. Previous ones were failures. Not because of lack of adoption, but rather because of the way developers were treated. This doesn’t bode well for this round. Especially not if you couple it with the current numbers and the size of Skype.

That said, I can easily see Lync/Skype for Business enterprises adopting the SDK to deal with customer support related requirements, taking a bit of the market from WebRTC PaaS vendors. To go beyond this use case, it will take more effort from Microsoft.

The Microsoft Skype for Web and SDK initiatives need to be viewed in the light of other players as well.

Cisco Spark

Cisco Spark (along with their Telepresence and UC offering) goes head to head against Lync/Skype for Business.

Cisco made several interesting moves lately:

  • Acquired Tropo, to beef its Communication APIs and integrate them with Spark
  • Acquired Acano, which fits nicely into high-end paying customers who use WebRTC
  • Acquired Synata, to offer better search capabilities to Spark, to better compete with Slack
  • Created the Cisco Spark Innovation Fund, and placed $150 million to developers building use cases on top of Spark

That’s a lot of mileage to go against Skype for Web and the Skype SDK.

You can easily say that when it comes to publicizing and marketing their investment in communication services and enablers, Cisco is ahead of Microsoft.

Google Hangouts

Google Hangouts is a shadow of what it can be when it comes to usage.

As a platform, it has it all. Everything you need to communicate, at a fraction of the cost of other solutions or for free. We use it daily at testRTC – both internally and to host meetings with customers and potential customers. We have no incentive to switch to anything else.

Hangouts adopted WebRTC from the beginning. First by embedding the WebRTC stack into the Hangouts plugin, using the components of WebRTC that it could, until it was able to just use WebRTC natively in Chrome. It still runs as a plugin on other browsers, but I assume that will change when WebRTC will be supported with all of the nuances of Hangouts.

What Hangouts is lacking is the traffic and the APIs to go along with its service. I am assuming Google are aware of it.

Apple FaceTime

Apple has FaceTime. Its proprietary service that should have been standardized at some point.

I’ll be surprised if Apple did anything interesting or serious when it comes to connecting FaceTime to WebRTC or adding an SDK to it. Or god forbid, let the poor people of the world who use Android – or a 5 year old Windows PC, connect to FaceTime.


Slack

Slack just added voice support with WebRTC and intends to add video. I’ve written about Slack a few times before, and how WebRTC is a logical investment for them. If they add integration points in their API that can access their real time communication capabilities it might become a very interesting player in the SDK/API space.

The real question in this case: Will a vendor using Slack continue using Skype in the long run?

Facebook and WhatsApp

Facebook Messenger uses WebRTC. WhatsApp somewhat uses it.

Skype has 300 million monthly active users. That’s way smaller than WhatsApp’s billion and Messenger’s 800 million. I am assuming there’s more voice and video calling happening on Skype on average per user than on either Messenger or WhatsApp, but the trend is probably towards Facebook and not Microsoft here.

The reason Facebook is so strong here is their new initiatives towards enabling businesses to connect with their user base – the Facebook user base directly, which is the largest social network at the moment. If they want, they can throw in voice or video interactions with an SDK on top of it.

WeChat, LINE, ooVoo and Viber

All have integration points. All heading in multiple directions for monetization. Be it businesses connecting to their user base, market places, digital currency or bots.

Leveraging Skype as an SDK means you want their reach and user base. But all of these messaging platforms have their user bases in the hundreds of millions of active users as well. They essentially compete over similar mind share and budgets of enterprises.

What’s in store for us in 2016?

More chatter and talks about Apple and Microsoft, but little in the way of progress by developers making use of Edge or Safari WebRTC capabilities. That will wait for 2017.

For Skype, there’s a challenge here, but also an opportunity. They can leverage WebRTC, focus on developers and come with use cases and success stories that will be hard to compete against. Microsoft is doing a lot already in this space, but there’s a lot more they need to be doing when you look at the competition they have.


Want to make the best decision on the right WebRTC platform for your company? Now you can! Check out my WebRTC PaaS report, written specifically to assist you with this task.

Get your Choosing a WebRTC Platform report at a $700 discount. Valid until the beginning of May.


Messaging is Migrating from the Browser to the Desktop

Tue, 04/05/2016 - 12:00

Messaging is used too much to stay only in the browser.

There seem to be a few conflicting trends going on at the moment:

  • Most people on mobile consume their services through apps they install on the device
  • People don’t install apps on their desktops and laptops anymore. They expect everything (most things?) to be available in the browser
  • A new approach of Progressive Web Applications is on the rise, while at the same time, frameworks like React are becoming trendy
  • Chrome is dropping support for its app launcher due to limited use by users – at least from their own experience, people prefer launching their services from inside the browser to letting them live on their desktop
  • Mobile (and to some extent web) based messaging is leaving the browsers in favor of apps on desktops and laptops

This last trend is what I want to focus on here. When all of the apps we use are now browser web apps on the PC, there are generally two types of apps I still install on my laptop:

  1. Microsoft Office
    • I use Google apps whenever I can, but official documents to most of my customers still necessitate a Word file
    • Oh, and Powerpoint is a lot faster for me to create presentations with than the Google alternative
  2. Developer tools
    • There’s something about development that still doesn’t fit in the browser
    • Oftentimes, I just need things locally or more responsiveness. Can’t explain why. Maybe I am just old fashioned

When it comes to communications, though, I prefer pinning tabs to the browser for the most common tasks I have – or just leave it to my phone. WhatsApp, Slack, Gmail – all get a pinned tab on Chrome for me. Whenever I need to use messaging in other domains (Facebook, LinkedIn, Meetup, Upwork, etc) – I just open a new tab in Chrome “on demand” and then close it once done.

I assume others install apps locally on Windows for things they want to use frequently. Which brings me to two interesting developments from the last year or so:


So we are now taking HTML5 web apps, wrapping them as Windows apps and installing them locally.

It probably makes sense for a lot of the enterprise messaging apps – instead of just living inside the browser, be part of the installed set of apps on the desktop. Purists of WebRTC will complain that this is not how it’s done. Detractors of WebRTC will say it isn’t WebRTC at all. I’ll say it is just another way of using the technology.

If you want to take your own communication web apps and make a desktop application out of them, then the most popular approach these days that I know of is CEF – Chromium Embedded Framework. It takes your web app, and packages it with Chromium so that they both get downloaded and installed together.

I assume that this is what Slack used. I am not sure about the Facebook Messenger one though – the addition of Windows tiles is a complication, but probably solvable.

In a way, web and HTML5 have already taken over our desktop. Even in apps what you get is HTML5 these days.

I wonder if and when this trend will hit mobile, and if so, whether it will be achieved via the new Progressive Web Apps approach.


Which WebRTC PaaS Vendor is Investing in His Platform?

Mon, 04/04/2016 - 12:00

Not all of them.

Who is investing in its platform?

Twilio. Added a slew of services in 2015.

TokBox. Got a new Spotlight live broadcast service. But not only.

VoxImplant. Added HD to its audio conferencing.

The rest? Not really sure?

Most of the time, when people talk to me about their use case, and the need to pick a specific platform, it boils down to a shopping list of features. They want everything. Usually more than any single vendor can offer. When prodded further, they reduce the need to a small set of requirements. But then again, they do see in their future this added set of features.

In many cases, selecting a vendor means understanding which of them might have what you need on their roadmap – not necessarily in their service today, but they will get there by the time you will.

Guess what – this is another factor that needs to be included in the list of requirements you need to look at when selecting a vendor to work with.

This is why in the latest release of my “WebRTC PaaS report”, I am adding a new section, which will give a quick indication of which vendors made changes to their platform (and if these changes were serious or not). The information there will date back two years, giving some perspective.

If you are thinking of starting to use one of the WebRTC API platforms out there and are not sure which one to use, then this report may come in handy. Until this next updated release, I’ve taken the price down considerably – if you purchase now you pay $1250 instead of $1950 and you get a year of updates (so that the updated version will be yours next month the moment it gets published).

Check the WebRTC PaaS report page to decide if you need it.


DIY or SaaS for Your In-App Messaging?

Tue, 03/29/2016 - 12:00

No easy answer.

What route should your messaging implementation take?

If there’s something I like, it is writing code. I haven’t done so in years, but it still is my passion. A year or two ago, I did a small coding project for something I needed. After a whole day of coding it dawned on me that I hadn’t checked my email, social networks or notifications the whole time – and didn’t even miss them. The only thing these days that can focus me on a single task at a time is programming.

When I did develop, and manage developers, there was always that tension of NIH in the air – the Not Invented Here syndrome that we developers are so good at. We want to develop stuff on our own and not “outsource” it to others. Hell – if I wrote a piece of code a year ago it was crap the next year and had to be rewritten.

I had the chance to listen in to Apigee’s recent webcast on Build vs Buy API Management. See it here:

This webcast goes over a lot of the reasoning I see going on in any development project when the decision needs to be made between build and buy.

The funny thing is that I don’t hear this kind of a discussion enough when it comes to messaging. Somehow, people think it is trivial.

I took a few of the concepts in this webcast, and “translated” them into the realm of build vs buy for messaging.

Limited view of the scope

When a project starts, it seems that adding messaging isn’t that hard. You have a bunch of people. Maybe some presence indication. Pass around a few WebSocket messages for the text involved in the conversation and you’re done.

But is it really true, or is there more to messaging? It is far from trivial. Even simple things like delivering messages while disconnected or handling push notifications are notoriously hard to get right – even for those who should be the experts in it.
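To show how quickly the "simple" part grows, here is a deliberately naive sketch of delivering messages to users who are disconnected. Everything in it is hypothetical: the `Mailbox` class, its methods, and the in-memory storage are made up for the illustration. It keeps undelivered messages in memory and flushes them on reconnect, and it ignores persistence, acknowledgements, ordering across devices, and push notifications, which is where the real difficulty tends to be.

```typescript
interface ChatMessage {
  id: number;
  to: string;
  text: string;
}

// Naive store-and-forward: deliver immediately if the recipient is
// online, otherwise queue in memory until the recipient connects.
class Mailbox {
  private online = new Set<string>();
  private pending = new Map<string, ChatMessage[]>();
  private delivered: ChatMessage[] = [];

  send(msg: ChatMessage): void {
    if (this.online.has(msg.to)) {
      this.delivered.push(msg);
      return;
    }
    const queue = this.pending.get(msg.to) ?? [];
    queue.push(msg);
    this.pending.set(msg.to, queue);
  }

  // Mark the user online and hand over everything queued for them.
  connect(user: string): ChatMessage[] {
    this.online.add(user);
    const queue = this.pending.get(user) ?? [];
    this.pending.delete(user);
    this.delivered.push(...queue);
    return queue;
  }

  disconnect(user: string): void {
    this.online.delete(user);
  }

  deliveredCount(): number {
    return this.delivered.length;
  }
}

const mailbox = new Mailbox();
mailbox.send({ id: 1, to: "bob", text: "hi" });           // bob is offline: queued
const flushed = mailbox.connect("bob");                   // queued message handed over
mailbox.send({ id: 2, to: "bob", text: "still there?" }); // bob is online: immediate
console.log(flushed.length, mailbox.deliveredCount());
```

A server restart loses every queued message here, and a client that drops mid-delivery never finds out. Closing those gaps is the part that rarely makes it into the initial scope.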

When you define what it is you need to build for your messaging, more often than not, you’ll be doing it with the following “mistakes”:

  • You will have a narrow scope of what is really needed
  • You will focus on the functional part of messaging, but probably a lot less of the other requirements (such as a good backend to understand what your system is doing and how people end up using it)

With limited scope comes the challenge of not comparing the right things when deciding between build or buy.


Every development project is risky. Purchasing an off the shelf solution usually mitigates the risk by having it done by someone else where the payment and deliverables are known in advance.

Developers tend to ignore risk – especially if the project is interesting enough to build. And yes. A distributed, low latency, high efficiency, large scale messaging backend written in Lua or Go is highly interesting.

You are not WhatsApp. Or Netflix

Building your own messaging system is hard. It takes a lot of effort. WhatsApp seems so easy, but getting there is hard.

This shift towards in-app messaging that is occurring means that in most cases, messaging is becoming part of an IT project and not exactly an R&D project. As a company, this means the focus is elsewhere and that messaging is considered a commodity or a non-core technology.

In such cases, there is no real funding for ongoing development, support and maintenance of an in-house DIY messaging framework.

Can open source help?

Sure, but is it at the right level of maturity?

There are a few dozen open source messaging frameworks out there. They probably do the work, but barely.

And the main challenge is that messaging is rapidly changing, which means that whatever is out there today is probably somewhat obsolete or out of sync with what you need anyway – and getting it to where you need it means more investment on your end. Probably.

To top it all, with most of these open source initiatives, what you’ll find is that they have one main contributor behind them. That contributor is most probably a vendor who is offering support and proprietary modules to take care of commercializing the open source offering. Things like reporting, scaling, maintenance, etc. – all these will fall in the domain of proprietary and payment.

So if the idea from the start was to use open source to refrain from having to negotiate and work with a vendor, where does that lead you down the road? Isn’t it better to acknowledge the fact from the onset and find a suitable solution out of a larger set of available vendors?

Time To Market

I know. I know.

If you write your own messaging system, it will take you the better part of a weekend. Adding a bit of code and stability around it clocks it at a month. Nothing can beat that.

But what is it you are comparing here? Are you concerned about your prototype implementation or is it production grade we’re talking about?

Getting something to production requires a lot more time.

Why are you even going DIY?

Is it because it will be cheaper?

Because you’ll have more control over your future and destiny?

DIY is going to cost you in time and effort which you don’t necessarily have.

If and when this project of yours is going to succeed, you’ll find out that with it more requirements and maintenance work are necessary. But what you’ll also find out is that the budget might not be there for you to handle that extra load in development. You promised the organization a working messaging system, and now that it is working – why are you asking for more funding exactly?



Easy? Hard? Core? Commodity?

I guess in most cases, deciding to develop your own messaging system requires a very good reason.

At testRTC we had that same need, though slightly different. We needed a way to communicate with the browser machines we’re running. It was all fine and well when the number of machines was rather small and their locations were simple. It became a real headache when we grew bigger and when customers started connecting machines in locations with flaky internet connections. We ended up integrating one of the realtime messaging players for that purpose – and haven’t looked back since.

Messaging might seem easy, but it is pretty hard once you get to the details.

So why not outsource it and be done with it?


Everyone and His Dog is Fixing WebRTC

Mon, 03/28/2016 - 12:00

Enhanced. Fixing. Solving. Enterprise grade. Improving. Completing.

I’ve been seeing this too much lately.

Companies decide to market their product as a way to “fix” WebRTC. The gall.

I understand where this comes from. Marketing is a lot about FUD. How to put fear in your potential customer until the only thing left for him to do is buy.

If you look closely, though, none of them really “fixes” WebRTC. The only thing they are doing is using WebRTC in a way that may fit you as a customer.

An example?

Companies who “fix” WebRTC by adding signalling to it. Or adding authorization. Or having it connect to PSTN.

This isn’t about “fixing”. This is about supporting a specific scenario or feature in a product – not even related to WebRTC itself.

Others “fix” WebRTC by having it work on IE (forcing a plugin on the user or using Flash). Again, less about WebRTC, and more about the use case.

And you know what? WebRTC doesn’t offer notifications either – I am sure you can go ahead and “fix” WebRTC by adding push notifications to your app on top of WebRTC!

WebRTC is a very powerful building block, but that’s about all it is – a building block. You’ll need to add additional building blocks to create a solution with it, so no – you aren’t fixing it – you are just implementing your use case with it.


Stop fixing WebRTC. It isn’t broken.

Just focus on solving a real world problem for a real customer and be done with it.

The post Everyone and His Dog is Fixing WebRTC appeared first on

Standards are for Losers

Mon, 03/21/2016 - 12:00

They really truly are.

Whenever someone whines to me that WebRTC isn’t a standard yet, so it isn’t ready, it makes me laugh. Who the hell cares about such a thing anymore?

The standard is whoever’s got the clout and strength in the market. Ask any marketer – would they want to be able to interact with the carrier’s standardized, federated (and almost non-existent) RCS client to send a message – or would they rather be able to interact with WhatsApp users? The answer, for countries where WhatsApp is popular, will be WhatsApp. Marketers don’t care about the standard. The users don’t care about the standard. And most developers don’t care either – as long as the interface is adequately documented.

Enter WebRTC.

No. The IETF hasn’t gone through the motions and finalized the spec yet.

Yes. It might change.

No. I couldn’t care less.

You see, there are already billions of users available to me via WebRTC. There’s source code I can take, compile and run anywhere I want. There’s a vibrant ecosystem of developers and vendors ready to assist. There’s a large and growing number of companies and use cases that make use of WebRTC.

Who am I to say that WebRTC doesn’t exist because someone didn’t put their “standard” stamp on it?

For the last 3 years I’ve been using WebRTC almost daily to communicate with others using various services. I didn’t think for once that this isn’t working because there’s no standard.

Whenever companies band together to create a standard, I begin to question their motive. These days, it usually comes from a point of weakness – a place where there is one (or more) vendors who are strong in a domain and the only way the smaller kids can have a go at it is by specifying a standard to rally all small players to fight the dominant force.

Whenever you see a standard being announced – ask who isn’t there – that’s the one with the power.

In the case of codecs, MPEG-LA asserts its power and dominance over H.264 and H.265/HEVC for video codecs. Which is why AOMedia (the Alliance for Open Media) was created and announced – to find an alternative codec and win the market back.

The examples are countless.

In the domain of real time communications, everyone was using H.323 or SIP. Then Skype came out, ignoring standards altogether. The industry tried its best to explain that Skype isn’t federated. There’s no standard there. To no avail. So companies (the same ones) tried connecting to Skype, to offer that as part of their service.

The same is happening today with WhatsApp and other social networks. They are so big, that they are the standard.

WebRTC is making the same distinction. It is taking away the hegemony on VoIP from VoIP vendors and putting the weight of this industry on the browser vendors. And now, these vendors are complaining that WebRTC isn’t interoperable. Doesn’t fit their needs. They don’t understand that they are neither in control here nor influencers. They lost control over that part of technology.

This isn’t to say that WebRTC won’t stabilize or get standardized – it is just that it doesn’t matter when it comes to adoption.

Standards? They are for the losers to run after to make sure they get to play the game. The winners don’t really need them.

Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.

The post Standards are for Losers appeared first on

WebRTC is a Distraction

Mon, 03/14/2016 - 12:00

Had to take this one out of my system.

Just in time for Enterprise Connect, Dave Michels decided to write a post to attract readers. The title? WebRTC is a distraction. It is hard to pinpoint what’s wrong with the arguments in this one, but most of them are just lacking in knowledge or understanding of this market and how it operates, which is sad – especially coming from Dave, whom I value very much.

The 4 main reasons why it is a distraction for Dave?

  1. Limited support
  2. Mobile is what really matters
  3. Why bother?
  4. WebRTC is dangerous

Let’s try to dismantle each of these so-called arguments one by one. Shall we?

#1 – Limited Support

WebRTC today runs on Chrome and Firefox. Microsoft went for ORTC (=WebRTC) and is now “considering” WebRTC as well.

Apple isn’t there, but frankly – I almost never hear complaints about Safari not having WebRTC. For some reason, Mac users have been trained to use Chrome when needed. Furthermore, there’s work being done at Apple on WebRTC, if you care about rumors.

Add to that the fact that no other solution runs on a browser. No other. None. Zilch. They are all getting thrown out from browsers who are stopping support for plugins, Java and probably Flash in the future. And what else has this amount of support anyway?

Now, you can use WebRTC as a desktop app, using a plugin, through Java – or in whatever other manner people use their comms today – so that limited support is wider than any other alternative to date.

Doesn’t work for you? Don’t use it. But don’t complain that others are using it and are happy about it.

#2 – Mobile is what really matters

To whom?

And while at it, using WebRTC inside an app makes a lot of sense. You shouldn’t care about the technology – just your customers. If they want apps, give them apps. Wrap WebRTC and be done with it.

There’s no other serious media engine for mobile that can be considered – the price point for it will be too prohibitive as well as the investment made.

Mobile is what really matters, which is why Facebook Messenger uses WebRTC. In both mobile and desktop. And is probably larger in deployment, users, minutes, seconds and engagement than anything else the unified communications market has to show for its huge success in its 10+ years of existence.

You know what? I am tired of waiting for unified communications to happen. It is time we take matters into our own hands (with WebRTC) instead of waiting for these large stale companies to move at a reasonable pace and come up with a workable solution.

#3 – Why bother?

Dave says Google no longer cares or invests in WebRTC. I’d say this can’t be further away from the truth.

Google are heavily invested in WebRTC today, based on the number of new features and changes they bring with every new version of Chrome (which happens every 6-8 weeks as opposed to 12-18 months of the slow vendors Dave asks us to put our trust in).

The pace of change for WebRTC is staggering. Nothing comes close to it.

In the span of a year, we’ve seen the echo canceler getting replaced in WebRTC, VP9 introduced, H.264 is underway, ORTC related APIs getting added and that’s just what I can remember off the top of my head (and really took place in the last couple of months only).

Will Google continue at this breakneck speed? Who knows? For now, I’ll take what I am given – especially for free.

#4 – WebRTC is dangerous

Not sure where to start here.

With Unified Communications and its current cadre of vendors, the issues raised by Dave (things you don’t understand and control coupled with hard to patch and upgrade) are a lot more dangerous.

Do you know when your PBX was upgraded last for that critical security issue it had? Do you even know if it was upgraded at all? What about the router you have at home? This FUD about security in WebRTC reeks of misunderstanding of the technology.

We are living in a world where we move everything to the cloud and our mobile devices. In such a world, security needs to be taken seriously. Not by introducing stupid proprietary solutions that are hard to manage or maintain, but rather by introducing cloud based solutions that can upgrade and update automatically. Ones where security is taken into account from the ground up and not as a bolt on feature to show the buyer.

WebRTC has all that and more, so if you think WebRTC is dangerous – sure it is. To anyone who is trying to compete against the companies using it. In the long run, resistance is futile.

The truth of it

Google doesn’t care about the unified communication market when it comes to WebRTC.

They just couldn’t care less if this causes headaches for Cisco or Polycom or anyone else in this market. The way vendors are bitching about WebRTC shows how they view VoIP and UC as their own, as if they are entitled to what goes on there and as if someone needs to think about their business models and legacy deployments so they don’t get hurt.

Get over it.

WebRTC is a huge distraction to those who aren’t built to embrace it. They are going to fade away. Just a matter of time. And Dave – you won’t need to wait much longer for it to happen.



The post WebRTC is a Distraction appeared first on

Developer Ecosystem Acquisitions Makes Build vs Buy Decisions Harder

Thu, 03/10/2016 - 12:00

Who do you go to with your WebRTC needs?

That moment you realized you selected the wrong vendor

There are now over 20 vendors out there offering WebRTC APIs in the cloud.


How the hell do you decide which one to pick for your service?

This question was rather “simple” to answer, but it is getting harder.

Two months ago, Facebook decided to shut down Parse. This is something that should not be taken lightly.

In 2013, Facebook acquired Parse. Parse was an MBaaS (mobile backend as a service) platform. If you want to build a mobile app, you’ll most likely need some backend – a place to store account information, maybe sync data between users, etc. MBaaS does exactly that, and in this domain, Parse was one of the bigger platforms. They had around 60,000 applications on their platform at the time of acquisition – not something to take lightly.

Facebook didn’t acquire Parse for its great technology but rather for its developer ecosystem – for its popularity. In the two years since, Facebook invested more in the platform – just so it can close it.

In the context of communication API platforms with WebRTC capabilities, what we’ve seen so far are two kinds of acquisitions:


  1. Acquiring a technology: Snapchat acquiring AddLive and Requestec getting acquired by Blackboard are such examples. So is the acquisition of Crocodile RCS by Acision, with Acision then wrapped into Xura
  2. Acquiring a developer ecosystem: TokBox’s acquisition by Telefonica and the recent Cisco acquisition of Tropo

Will Cisco decide in a year or two to shut down Tropo if it doesn’t bring the traction it wants or if it serves its purpose of getting enterprises to adopt Cisco Spark?

Would Telefonica stop investing in TokBox? Highly unlikely after 3 years, but who knows? I wouldn’t have bet on Facebook shedding Parse.

The thing about Parse is that Facebook didn’t even spin it off again – or sell it. It just closed the service. More akin to how Snapchat treated its own acquisition of AddLive.

Kin Lane explains nicely the false expectations people had from Facebook and Parse:

There is no basis for believing a platform or API will ALWAYS be there, no matter what you are promised. Companies go out of business, get acquired, and in this fast paced tech climate, companies are always looking to deliver the latest product, and features. Everything in the space points to disruption, change, and evolution, where the hell did we get the idea these services shouldn’t go away?

What can we deduce?

  1. Platforms with large ecosystems aren’t impervious to being taken off market. TokBox may get shuttered. Twilio might get acquired
  2. In the build vs buy decision of WebRTC, using a platform doesn’t mean write once and forget. You may need to update your code, switch vendors, etc. – be ready for it

As I start working on another update for my Choosing a WebRTC API Platform report, I will take the time to research the reasons for vendors selecting the less popular API platforms – what makes them take that plunge. If you are such a vendor – contact me.

Until this new update gets released (April-May timeframe), there’s a $700 USD discount on the report (which includes a 1-year update period).

The post Developer Ecosystem Acquisitions Makes Build vs Buy Decisions Harder appeared first on

WebRTC Multiparty Video Alternatives, and Why SFU is the Winning Model

Mon, 03/07/2016 - 12:00

It’s the money stupid.

We all love to hate the model of an MCU (besides those who sell MCUs that is).

There are in general 3 main models of deploying a multiparty video conference:

  1. Mesh – where each participant sends his media to all other participants
  2. MCU – where a participant is “speaking” to a central entity who mixes all inputs and sends out a single stream towards each participant
  3. SFU – where a participant sends his media to a central entity, who routes all incoming media as he sees fit to participants – each one of them receiving usually more than a single stream
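The three models above can be put into numbers with a rough sketch. It counts how many media streams a single participant sends and receives in an n-party call. The function name is made up for illustration, and the SFU case assumes the server forwards every other participant's stream, which is an upper bound rather than what every SFU does.

```python
def streams_per_participant(model: str, n: int) -> tuple[int, int]:
    """Return (sent, received) stream counts for one participant in an n-party call.

    Illustrative only: the SFU case assumes every other participant's
    stream is forwarded, which is the upper bound.
    """
    if n < 2:
        raise ValueError("a call needs at least two participants")
    if model == "mesh":
        # Each participant sends to, and receives from, every other participant.
        return n - 1, n - 1
    if model == "mcu":
        # One stream up to the mixer, one mixed stream back.
        return 1, 1
    if model == "sfu":
        # One stream up to the router, one forwarded stream per other participant.
        return 1, n - 1
    raise ValueError(f"unknown model: {model}")


for model in ("mesh", "mcu", "sfu"):
    sent, received = streams_per_participant(model, 5)
    print(f"{model}: sends {sent}, receives {received}")
```

The uplink is where mesh hurts: it grows with every participant added, while MCU and SFU keep it at a single stream.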

I’ve taken the time to use testRTC to show the differences on the network between the 3 multiparty video alternatives.

To sum things up:

  • Mesh fails miserably relatively fast. Anything beyond 3 isn’t usable anywhere in a commercial product if you ask me
  • MCU seems the best approach when it comes to load on the network
  • SFU is asymmetric in nature – similar to how ADSL is (though this can be reduced, just not in Jitsi in the specific scenario I tried)

This being the case, how can I even say that SFU is the winning model for WebRTC?

It all comes down to the cost of operating the service.

Here’s what an MCU does in front of each participant:

How media gets processed by an MCU

Here’s what an SFU does in front of each participant:

How media gets processed by an SFU

To make things easy for you, I’ve marked with colors varying from green to red the amount of effort it puts on a CPU to deal with it.

The most taxing activity in an MCU is the encoding and decoding of the video. With the current and upcoming changes in video and displays, this isn’t going to lessen any time soon:

  • Google just switched to VP9, which takes up more CPU
  • 4K displays and cameras are becoming a reality. 8K is being discussed already. 4K alone means 4 times the pixel count of full HD

If anything – things are going to get worse here before they get any better.

It is no surprise then that MCUs scale on single machines in the 10’s of ports or low 100’s at best; while SFUs scale on single machines in the 1,000’s of ports or low 10,000’s.

Which brings us to two very important aspects of this:

  1. Price per port, where an SFU will ALWAYS be lower than MCU – by several factors
  2. Deployment complexity

People usually answer the first point by saying that if you want quality – you need to pay for it. Which is always true. Until you start reminding yourself that video calling today is priced at zero for the most part.

The second reason isn’t as easy to ignore. If you aim for cloud based services needing to serve multiple customers, your aim is to go to 10,000 or more parallel sessions. Sometimes millions or more. Here would be a good time to remind you that WhatsApp crossed the billion monthly active users and most messaging services become interesting when they cross 100 million monthly active users.

With such numbers, placing 100 times more machines to support an MCU architecture instead of an SFU one is… prohibitive. There are more costs that need to be factored in, such as power consumption, rack space and higher administration costs.
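To see where the "100 times more machines" comes from, here is a back-of-the-envelope sketch. The per-machine capacities are assumptions picked from the ranges mentioned above (low 100’s of ports for an MCU, low 10,000’s for an SFU), and the total of one million ports is a hypothetical target, not a measured figure.

```python
def machines_needed(total_ports: int, ports_per_machine: int) -> int:
    """Number of machines required, rounding up to whole machines."""
    return -(-total_ports // ports_per_machine)  # integer ceiling division


TOTAL_PORTS = 1_000_000          # hypothetical service size
MCU_PORTS_PER_MACHINE = 100      # assumption: "low 100's at best"
SFU_PORTS_PER_MACHINE = 10_000   # assumption: "low 10,000's"

mcu_machines = machines_needed(TOTAL_PORTS, MCU_PORTS_PER_MACHINE)
sfu_machines = machines_needed(TOTAL_PORTS, SFU_PORTS_PER_MACHINE)

print(f"MCU: {mcu_machines} machines, SFU: {sfu_machines} machines")
print(f"ratio: {mcu_machines // sfu_machines}x")
```

Change the assumed capacities and the ratio moves with them, but the gap between the two models stays wide across the ranges quoted above.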

The end result?

An SFU model is by far the most popular deployment today for WebRTC services.

Does it fit all use cases? No

Will it fit your use case? Maybe

Do customers care? No


Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.

The post WebRTC Multiparty Video Alternatives, and Why SFU is the Winning Model appeared first on

Stop Whining about WebRTC Security Threats

Thu, 03/03/2016 - 12:00

It is a waste of time.

I’ve heard it more than once. Security threats in WebRTC make it a bad alternative. You have MITM (man in the middle) attacks on it. It leaks IP addresses. You can screen share without the user’s knowledge. The list goes on.

It isn’t the first time I write about WebRTC security and it still pisses me off when I see such answers on Quora:

The WebRTC plugin (which means Web Real-Time Communication) allows to conduct audio and video teleconferencing just in a browser without any additional software installed. However, it reveals the true IP address. How to disable WebRTC in various browsers.

A few things about that one:

  1. WebRTC isn’t a plugin…
  2. Why would you want to disable it?

If you trust Skype or any other VoIP or messaging app more, then you are in for a big surprise.

I read the above Quora answer on the same day I read Troy Hunt’s piece on controlling a Nissan remotely – one that… well… isn’t YOUR Nissan.

The things Nissan got wrong here include:

  • Having cars get sequential serial numbers, so they are easy to guess
  • Having an undocumented backend API that controls cars remotely – with no authentication on it

I don’t want to go into additional measures they could have added such as geolocation for the origination of the command or throttling to bar hackers from going berserk on their car fleet.

What would a leaked IP address on a WebRTC session in a browser do exactly compared to such stupidity?

The bane of security is developers and processes.

IOT (Internet of Things) is going to bring us many more such stories. That’s because it is based on developers and they make mistakes. Increase that a thousand fold, put it in a heating market where features and gadgets take center role, pushing back privacy and security – and you get hackable cars.

Telephony and video conferencing systems of old are devices sitting in networks. They need to “interoperate”. They have IT people who like controlling how things get deployed and updated. Are you sure these have been configured to work encrypted? (I am sure most deployments aren’t.) Are you sure the IT person really upgraded to the latest version that patches a bunch of security flaws?

And while we are talking about communications. The router you have at home that gives you WiFi on one end and connects you to the internet via ADSL or whatever on the other end – when did you last upgrade its firmware? Did you ever update its password from the default? Is your service provider taking care of these things for you by any chance?

WebRTC is different. Here’s why:

  • It is encrypted. By default. And there’s no way to remove that encryption from occurring (people complain about that one as well – go figure)
  • It gets updated every 6-8 weeks with your browser. That update includes security patches when they are found
  • It now forces (at least on Chrome) the sites using it to run over HTTPS instead of HTTP (did we say encryption?)
  • It has permission mechanisms around camera and microphone access
  • It has stricter permission mechanisms around screen sharing (white listing and extensions)
  • Whenever someone peeps about security – it gets discussed and potentially updated in the implementation. Which gets to your browser in… 6-8 weeks
  • Being a part of Chrome and other browsers means security gets front row and is prioritized properly

Yes. Developers can still do stupid things on top of WebRTC and botch it all, but that’s true about that snazzy new car you just bought or the smart TV that looks at you and hears what you say.

What more do you want?

If I wanted to hack you, WebRTC would be the last place I’d start.

The post Stop Whining about WebRTC Security Threats appeared first on

Does Google’s Support of RCS Changes Anything for WebRTC?

Mon, 02/29/2016 - 12:00


Now that we got that one out of the way, let’s see why the recent announcement from Google and the GSMA isn’t relevant to WebRTC.

On February 22, the GSMA issued a press release titled Global Operators, Google and the GSMA Align Behind Adoption of Rich Communications Services. The subheading sums up the message:

Operators align on universal RCS profile; Google to provide RCS messaging client in Android

I was asked if this kills WebRTC – and the efforts of companies invested in WebRTC already.

There are two ways to view these questions:

  1. People don’t understand what WebRTC (or RCS) is
  2. People are just afraid of Google deciding on a whim to close WebRTC as just another experiment (think Google Reader, Wave, Buzz and a lot of other technologies and services in the Google graveyard)

Nothing really changed

I’ve written about Google’s acquisition of Jibe. Nothing changed since then. I then assumed that Telcos would accept this and adopt it.

The recent press release shows that that has happened – at least by the GSMA. Time will tell which of the carriers will join this initiative.

I am not sure it will save RCS, but I still believe it is the only alternative that gives RCS any future.

How is that different than WebRTC?

When I think about RCS, I think signaling, messaging and federation. It is about serving all people with a mobile device.

When I think about WebRTC, I think about media processing, business enablement, business processes and customization.

RCS isn’t about to take the world by storm. It won’t beat WhatsApp or Facebook Messenger or WeChat or any of these other players any time soon. And if it does, it won’t be useful for most use cases I’ve seen with WebRTC anyway.

While both RCS and WebRTC can now be said to be promoted by Google, they aren’t serving the same needs in Google.

Will Google stop supporting WebRTC?

I don’t think that’s a possibility in the foreseeable future. How much it will invest in WebRTC is another topic.

WebRTC is now part of HTML5. It is implemented by Google, Mozilla and Microsoft (don’t start with me on ORTC here please). Rumors abound about Apple, but I don’t really care at this point.

Google dropping WebRTC means back to plugin realm for things like Google Hangouts. And for things like RCS.

When you want to implement an RCS client on a browser, and initiate a voice call through it. From inside the browser. What are you going to use for it? Flash?

Google needs to continue its investment in WebRTC as long as it feels it needs Hangouts as part of its strategy. Messaging is important to Google – check out their investments and acquisitions around messaging vendors. To that end, it can’t just drop WebRTC.

If, on the other hand, WebRTC gets to a point where it is good enough for Google, its investment in it may change. Until all browsers support WebRTC reasonably – there’s no threat of this happening.

The post Does Google’s Support of RCS Changes Anything for WebRTC? appeared first on

Join me in London for WebRTC Global Summit

Sun, 02/28/2016 - 14:00

Why don’t we meet in London in April?

It is that time of year. Informa is doing their annual WebRTC Global Summit in London in April.

This year, there are three tracks going on: Telecom, Developer and Enterprise

As with last year, if you arrive early (=for the weekend), you can also attend the TADHack event that is taking place.

I am chairing the developer day along with Chris Khoencke. We’ve worked hard to bring you some interesting topics and fresh new content.

While the developer day is free to attend, the rest of the conference is something I am waiting for as well.

When? 11-12 April

Where? Cavendish Conference Centre, London, UK

Free registration here

I will speak about two topics during the event:

  1. Video codecs and WebRTC
  2. Testing challenges with WebRTC

If you plan on attending or are just in town, then make sure to contact me in advance or just come say hi when you see me at the conference.


The post Join me in London for WebRTC Global Summit appeared first on

SoftBank’s Adoption of WebRTC Should be a Wake Up Call to Video Conferencing Vendors

Thu, 02/25/2016 - 12:00

Wake up and smell the ashes?

This week, as part of the slew of announcements of MWC, there was this one – SoftBank Deploys Large-Scale WebRTC-Based Conferencing Application Enabled by Dialogic. From the press release:

SoftBank Corp. has selected Dialogic® PowerMedia™ XMS software media server as a core network element of their new multimedia web conferencing solution, supporting SoftBank’s enterprise collaboration needs for video conferencing and chat room capabilities. The WebRTC-based web conferencing application will replace aging legacy video equipment and services for employees across their various divisions and brands.

The emphasis is mine, so let’s unravel it a bit.

  • Dialogic PowerMedia XMS is a media server for developers
  • Video conferencing in enterprises was something you purchase, not something you develop
  • But something is changing
  • Fidelity in the US acquired Vidtel a few years ago to get in-house the ability to build their own video conferencing capabilities
  • SoftBank is doing the same now by licensing PowerMedia XMS and probably some other tools from other vendors
  • To top it off, it is transitioning from “legacy video equipment” (=video conferencing vendors) to an in-house solution

Microsoft Skype? Cisco Telepresence? Or Spark? Polycom?

No. Just WebRTC. With their own logic and implementation.

It is not only verticals

If you asked me in 2015, I’d have said that video conferencing has its place, but it is now limited to the enterprise. Finance, Retail, Contact centers, healthcare, education – all these now have their own specialized vendors offering WebRTC solutions that are a lot more focused on the business of the vertical than a generic video conferencing vendor can ever be. It was easy to see why these verticals are heading away from video conferencing towards WebRTC vendors.

But video conferencing?

And without even a vendor?


Unheard of!

But SoftBank is now doing it.

Why is it important?

The value of video conferencing in its generic unified communications form is diluting.

It is no wonder that Polycom closed its office in Israel and many of the other players of this market are struggling to grow. The future ahead of a legacy video conferencing vendor is murky. If I were working in that market – I’d be worried. Very worried.

SoftBank is just another instance of the tectonic shift taking place – the changing of the guard in communications that is happening all around us.



The post SoftBank’s Adoption of WebRTC Should be a Wake Up Call to Video Conferencing Vendors appeared first on

The Biggest Risk of Building a Business over Messaging Platforms

Tue, 02/23/2016 - 12:00

Do you really want to trust a messaging platform to be there tomorrow as well?

Building a house of cards on top of Facebook?

Facebook just killed Parse. A successful mobile BaaS platform they acquired in 2013. There’s a nice round up of feedback about it on Business Insider.

Inside the span of the same year, Facebook also announced the ability for businesses to integrate with its messaging platforms (both Messenger and WhatsApp).

It is funny somehow. The Business Insider article indicates Orbitz being one of Parse’s customers. I wonder how willing they will be to use another Facebook API to drive their messaging in front of their own users.

Here’s the thing. Messaging platforms are about messaging platforms. Most of them don’t really care about the ecosystem of developers being built around them.

Twitter is famous for closing doors on developers. In 2012, it changed its rules around APIs, limiting access in a way that virtually killed any possibility to develop alternative Twitter clients.

What are we left with? The simple fact that relying on a single messaging platform and its API access for your service and business model is risky at best. Probably suicidal.

There’s a shift happening in the world. It started somewhere in the dot com bubble, morphing every couple of years:

  • Websites
  • Mobile Apps
  • Messaging

Websites were easy. With access to the internet, everyone could be doing anything. There were no real gatekeepers, besides Google and its search engine – but that’s a rather “soft” sort of a gatekeeper – you could succeed without it (ask Facebook or Twitter).

Then we started the great migration towards mobile and applications. We were left with two gatekeepers – Apple and Google. Apple with its inconsistent and somewhat puritan approval rules, and again Google. Now if you want to reach out to users, you go through these companies, who hold the keys to that kingdom.

Recently, it started changing, with a migration happening towards messaging apps. With billions of users interacting through messaging, these are turning into platforms of interaction – places where businesses, virtual assistants and bots can interact with the users of the platform.

The difference now is that these messaging platforms have a lot more control over the users who end up using them – and by extension, over the enterprises who integrate with their service.

My suggestion?

If you need messaging in your service, build it on your own unless “socializing” and communicating directly with specific social networks add some huge benefit to you. The risks are just too great to be worth it.


Kranky Geek India takes place in Bangalore on 19 March 2016. Register to join us!

The post The Biggest Risk of Building a Business over Messaging Platforms appeared first on

Different Requirements of Scaling real time video

Mon, 02/22/2016 - 12:00

There’s scaling and then there’s scaling.

The post from last week about the future of WebRTC live broadcast left some interesting impressions. There were comments on that post and on Facebook. Red5 even did a follow-up post on it.

One thing that was missing from these comments is an understanding of what scale means. Or rather the different types of scaling that are required when it comes to real time video.

Here are a few different aspects of scaling real time video.

#1 – Streams per machine

This is something that was raised on one of the comments on Facebook:

Most of the SFUs out there can actually handle 100’s and even 1000’s of connections (our data is not public but look at JVB: and with most of them it should be possible without much effort to configure multiple SFUs in cascade to scale almost without any limit in my opinion.

That answers the question: how many parallel sessions can you conduct on a single machine?

What is this one good for?

When you know how many sessions / streams you plan on having, you can then calculate how many machines you’ll need to run that scenario. From there, it is easier to extrapolate costs.

But that’s not our only vector of scale.

#2 – Streams per session

How many streams can we “bundle” per session?

In the comment above, what wasn’t mentioned is that these tests of 100’s and 1000’s of connections were run with no more than 33 streams in each session. So if what I want is to live broadcast a singer to 1000’s of viewers in real time – this SFU solution won’t be suitable for my need.

It is nice to be able to do multiparty video or to broadcast live with low latency, but always ask yourself – what’s the upper limit here for this single session? How many participants can I cram into that session without making things impossible on my infrastructure?

There are, in general, two critical challenges here:

  1. When the number of users per session grows, the amount of communications between peers should be limited. At the extreme, a broadcaster should not be harassed by viewers directly (which is where the SFU starts breaking at scale and why I assume Jitsi preferred not to check above 33 participants)
  2. When the number of users per session grows beyond a single machine, how does that compute? You’ll need to be able to distribute the session somehow either by cascading or using some other means of architectural magic
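To get a feel for what cascading implies, here is a minimal sketch that sizes a relay tree for one large session. It assumes, purely for illustration, that every machine can feed at most `fanout` downstream connections, whether those are viewers or other relays. The function name and the numbers are hypothetical.

```python
def cascade_tiers(viewers: int, fanout: int) -> list[int]:
    """Return the number of machines per tier, origin tier first.

    Assumes each machine can send one stream to at most `fanout`
    downstream endpoints (viewers or other relay machines).
    """
    if viewers < 1:
        raise ValueError("need at least one viewer")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    tiers = []
    # Edge tier: the machines that talk to viewers directly.
    nodes = -(-viewers // fanout)
    tiers.append(nodes)
    # Keep adding relay tiers until a single origin machine remains.
    while nodes > 1:
        nodes = -(-nodes // fanout)
        tiers.append(nodes)
    return tiers[::-1]


if __name__ == "__main__":
    print(cascade_tiers(1_000_000, 100))
```

Every extra tier adds a network hop, so a deeper tree trades latency for reach.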

It is also worth pointing out that the larger the group, the more fragmentation issues you’ll have across parallel sessions – if the size of a session is dynamic, then on what kind of a machine should you start it? One which is free or one which is already somewhat busy? Can you dynamically route a session to other machines when the need arises? How do you load balance this?
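The "free or somewhat busy" question can be sketched as two placement strategies. This is only an illustration: the machine names, the capacity figure and the "spread" and "pack" labels are assumptions, and a real scheduler would also have to cope with sessions that grow after they start.

```python
from typing import Optional


def pick_machine(loads: dict[str, int], capacity: int,
                 expected_size: int, strategy: str) -> Optional[str]:
    """Pick a machine for a new session, or None if nothing fits.

    loads maps a machine name to its current stream count.
    "spread" picks the least loaded machine that fits the session,
    "pack" picks the busiest machine that still fits it.
    """
    candidates = {name: load for name, load in loads.items()
                  if load + expected_size <= capacity}
    if not candidates:
        return None
    if strategy == "spread":
        return min(candidates, key=candidates.get)
    if strategy == "pack":
        return max(candidates, key=candidates.get)
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    loads = {"a": 100, "b": 600, "c": 950}
    print(pick_machine(loads, 1000, 100, "spread"))
    print(pick_machine(loads, 1000, 100, "pack"))
```

Spreading leaves each session room to grow, while packing keeps whole machines free for the next large session. Neither one answers the rerouting question.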

#3 – Failure diffusion

This one is related because the higher the scale and capacity, the more of an issue this will be.

Let’s assume we can get a machine to run 10,000 streams in parallel. I am optimistic today. Let’s also assume that this all happens in a single process running on our machine.

What happens if there’s a bug somewhere (and believe me – there already is), which happens to cause the system to crash? Whenever we hit the bug, 10,000 streams get disconnected.

Now let’s further assume that each session holds 10 streams on average. And the bug was invoked due to one of these streams doing something slightly unorthodox. Now we have one session causing the disconnection of 999 more sessions on that machine.

Which leads us to the question –

Can I run multiple processes on the same machine, each catering to a smaller number of sessions? Maybe even only a single session? How does that impact memory and performance? Is it even desirable?
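The trade-off can be put in numbers using the figures from the example: 10,000 streams per machine and 10 streams per session. The sketch below assumes sessions are spread evenly across processes and that a crash only takes down the process that hit the bug. The function name is made up for illustration.

```python
def collateral_sessions(streams_per_machine: int, streams_per_session: int,
                        processes: int) -> int:
    """Return how many OTHER sessions drop when one session crashes its process.

    Assumes sessions are spread evenly across processes and that a
    crash is contained within a single process.
    """
    if processes < 1 or streams_per_session < 1:
        raise ValueError("processes and streams_per_session must be positive")
    sessions = streams_per_machine // streams_per_session
    # Sessions sharing a process with the faulty one (worst case, rounded up).
    per_process = -(-sessions // processes)
    return max(per_process - 1, 0)


if __name__ == "__main__":
    for procs in (1, 10, 1000):
        print(procs, collateral_sessions(10_000, 10, procs))
```

What the sketch leaves out is the other half of the question: the memory and scheduling overhead that each extra process adds.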

For some, this might be necessary in their architecture – and it is very far from how telecom services are architected…

When Talking About Scaling…

Make sure you refer to the specific aspects you wish to scale.


Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.


The post Different Requirements of Scaling real time video appeared first on

The Future of WebRTC Live Broadcast

Thu, 02/18/2016 - 12:00

It is on the viewer side.

Live broadcast is all the rage when it comes to WebRTC. In 2015 it grew 3-fold. It is a hard nut to crack, but there are solutions out there already – including the new Spotlight service from TokBox.

WebRTC Live Broadcast Today

If you look closely, most of the deployments today for live broadcast using WebRTC look somewhat like the following diagram:

How you live broadcast using WebRTC today

What happens today is that WebRTC is used for the presenter – the acquisition of the initial video happens using WebRTC, which carries it straight to the broadcast server. There, the media gets transcoded and changes format to the dialects used for broadcasting – Flash, HLS and/or MPEG-DASH.

The problem is that these broadcast dialects add latency – check this explanation about HLS to understand.

With our infatuation with real time and the drive to move every type of workload and use case towards real time, it is no wonder that the above architecture isn’t good enough. In my discussions, many entrepreneurs have said they would love to see this obstacle removed, with live broadcasts having latency of mere seconds (if not less).

The current approaches won’t work, because they rely heavily on the ability to buffer content before playing it, and that buffering adds latency.
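A back-of-the-envelope sketch shows how that buffering turns into latency. It assumes a segment-based player that waits for a few whole segments before it starts playing, plus some fixed overhead for encoding and delivery. The segment lengths, buffer depth and overhead below are illustrative assumptions, not measurements.

```python
def segmented_latency_seconds(segment_seconds: float, buffered_segments: int,
                              overhead_seconds: float = 0.0) -> float:
    """Estimate glass-to-glass latency of a segment-based stream.

    Assumes the player buffers `buffered_segments` complete segments
    before playback starts. overhead_seconds is a hypothetical lump
    sum for encoding, packaging and delivery.
    """
    return segment_seconds * buffered_segments + overhead_seconds


if __name__ == "__main__":
    for seg in (10.0, 6.0, 2.0):
        print(seg, segmented_latency_seconds(seg, 3, overhead_seconds=2.0))
```

Shorter segments reduce the estimate, but as long as whole segments are buffered the latency stays a multiple of the segment length.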

WebRTC Live Broadcast Tomorrow

This is why a new architecture is needed – one where low latency and real time are imperatives and not an afterthought.

Since standardization and deployment take time, the best alternative out there today is utilizing WebRTC, which is already available in most browsers.

What WebRTC live broadcast will look like tomorrow

The main difference here? The broadcast server needs to be able to send WebRTC at scale and not only handle it on its ingress.

To do this, we need a totally different server side WebRTC media implementation than the alternatives on the market today (both open source and commercial).

What happens today is that WebRTC implementations on the server are designed to work almost back-to-back – they simulate a full WebRTC client per connection. That’s all nice and well, but it can’t scale to 100’s, 1000’s or millions of connections.

To get there, the server will first need to break its dependency on the presenter – it will need to be able to process media by itself, but do that in a way that optimizes for large scale sessions.

This, in turn, means rethinking how a WebRTC media stack is architected and built. Someone will need to rebuild WebRTC from the ground up with this single use case in mind.

I am leaving a lot of the details out of this article for two reasons:

  1. While I am certain it can be done, I don’t have the whole picture in my mind at the moment
  2. I have a different purpose here, which we are now getting to

A Skillset Issue

To build such a thing, one cannot just say he wants low latency broadcast capabilities. Especially not if he is new to video processing and WebRTC.

The only teams that can get such a thing built are ones who have experience with video streaming, video conferencing and WebRTC – that’s three different domains of expertise. While such people exist, they are scarce.

Is it worth it?

Optimizing down from 20 seconds latency to 2 seconds latency. That’s what we’re talking about.

Is investing in it worth the effort? I don’t have a good answer for this one.



The post The Future of WebRTC Live Broadcast appeared first on

Are WhatsApp and Messenger competitors or partners in Facebook?

Tue, 02/16/2016 - 12:00

Two messaging services. Focused on consumers. Doing practically the same thing. Do they compete or cooperate under Facebook’s roof?

Messenger and WhatsApp are the biggest messaging platforms today. Messenger announced 800M monthly active users recently, while WhatsApp celebrated hitting the 1 billion mark. As they both strive to continue with this rapid growth, I have to question – are they joining forces or competing fiercely between themselves?

The reason I raise it stems from how they implemented web support and VoIP:

  • Messenger unbundled from Facebook, opening its own independent site, which acts as a full messenger client. If you want to make calls, you use WebRTC for that
  • WhatsApp created a web frontend tethered to the phone app. It cannot work without the phone nearby. And when it comes to VoIP, it might be using the same codecs as WebRTC, but not the vanilla implementation

They are taking different architectural approaches. But they end up implementing the same feature set.

WhatsApp in 2015

Here’s what WhatsApp did or was rumored to be working on in the last year:

Messenger in 2015

Here’s what Messenger did in the last year:


Not much of a difference…

Running such a thing at a scale of 100’s of millions of people is painfully hard. Doing that twice under the same roof is even harder:

  • It seems like they develop everything twice, on separate infrastructure and architecture.
  • There’s no federation between the two – you can’t send a message from a Messenger user to a WhatsApp user – even though both belong to the same company

Where would each of these services go next for growth?

The above slide from eMarketer shows how in some countries, the main competitor of WhatsApp is Facebook Messenger – and vice versa. I think each of them tries independently to grow its user base – with no real regard for the other’s footprint at any given location.

This one from Activate goes to show how growth for both these platforms comes from the same areas – and where they overlap or compete on the same set of users.

Something doesn’t work out here for me, though it is hard to put a finger on it.

WhatsApp is probably still a strange bird in Facebook, far from the rest of the company and its DNA. Getting it in line with Facebook will take considerably more time.


The post Are WhatsApp and Messenger competitors or partners in Facebook? appeared first on

Would WebRTC be as Big a Thing if it Didn’t Run in a Web Browser?

Mon, 02/15/2016 - 12:00

Probably not.

I wrote about Peer-to-Peer and WebRTC recently, which prompted this interesting question from Fabian Bernhard on LinkedIn:

Without arguing about the quality of a specific Open Source media stack, would you say that WebRTC was as big a thing if it didn’t run in a web browser?

I guess the answer is no, it wouldn’t be that big a thing.

Here’s what I am getting at. There are two popular slides I usually use:

The one above explains that WebRTC sits at an intersection – it appeals both to VoIP people as well as to Web people.

The second slide above is about what makes WebRTC so transformative – it is about the fact that it is Free, but also because it is available for Web people.

Without the web browser part, we would have been left with only Free.

We’ve had open source media engines before. GStreamer is a popular one. Codecs were a bit harder to come by – especially those that don’t require patent payments (royalty free). It wasn’t the best thing out there, but it worked – people still use it today.

WebRTC made the open source version of a media engine as good as a commercial one – it came out of an acquisition of a commercial media engine vendor after all.

But that’s where it stops – it wouldn’t have made such a transformation in the market – it would be more of the same with a small evolutionary step. Nothing to write home about.

The browser bit, though… that made VoIP available and open to everyone with some HTML and JS experience – a much larger pool of talent – and one dabbling a lot in experimentation. This is what got us so many use cases.

Mobile might be different

For mobile only use cases, WebRTC would have made all the difference – same as it does today. The idea behind it in mobile isn’t that it offers a browser experience or that it is available in the browser (it isn’t on iOS). The idea is that it would have been a cheaper route to a product than anything else out there. And with the trend of communications moving in-app, that would still make the impact it does there relevant.

Which brings us full circle.

Let’s assume mobile is eating up the world. Let’s assume it is only a matter of time until content creation and not only content consumption moves from the PC to mobile. Once that happens – who cares about what happens in the browser?

It will all be in-app anyway.

And there – WebRTC is making a difference.


Kranky Geek India takes place in Bangalore on 19 March 2016. Register to join us!

The post Would WebRTC be as Big a Thing if it Didn’t Run in a Web Browser? appeared first on

