The leading authority on WebRTC

Discount on the Advanced WebRTC Architecture Course ends tomorrow

Thu, 09/22/2016 - 12:00

If you haven’t yet enrolled in my Advanced WebRTC Architecture course – then why wait?

I just noticed that I haven’t written any specific post here about the upcoming course, so consider this one that announcement. In my defense – I sent it out a few days ago in my monthly newsletter.

Why a course on WebRTC architecture?

I’ve been talking with entrepreneurs, developers, product managers and people in general about their WebRTC products for quite some time. But somehow I failed to notice that in many such discussions there were large gaps between what people thought about WebRTC and what WebRTC really is.

There’s lots of beginner’s information out there for WebRTC, but somehow it always focuses on how to use the WebRTC APIs in the browser, or what the meaning of a specific feature in the standard is. There is also a large set of walk-throughs of different frameworks that you can use, but no one seems to offer a path for a developer to decide on his architecture. To answer the question of “what should I be choosing for my service?”

So I set out to put together a course that answers that specific question. It gives the basics of what WebRTC is, and then dives into what it means to put an architecture in place:

  • How to analyze the real requirements of your scenarios?
  • What are the various components you will need?
  • Go through common design patterns that crop up in popular service archetypes

What’s in the course?

The easiest way is to go through the course syllabus. It is available online here and also in PDF form.

When will the course take place?

The course is all conducted online, but not live.

It starts on October 24, and I am now in final preparation of recording the materials after creating them in the past two months.

The course is designed to be:

  • Built out of 7 modules
  • Have 40 lessons, give or take, each of which should take you 30 minutes on average
  • This means if you take a lesson on every working day, you should complete this in 2 months
  • You can do it at a faster pace if you wish
  • Course materials are available online for students for a period of 2 months. This can be extended to 4 months for those who wish to add Office Hours on top of the course

Any discount for friends and family?

Enrolling in the course costs $247 USD. Adding Office Hours on top of it means an additional $150 USD.

Until tomorrow, there’s a $50 USD discount – so enroll now if you’re already certain you want to.

There are discounts for those who want to enroll as a larger group – contact me for that.

Have more questions?

Check the FAQ. I’ll be updating it as more questions come in.

If you can’t find what you need there – just contact me.


Twilio Acquires Kurento. Who will Acquire Janus?

Wed, 09/21/2016 - 12:00

Open source media frameworks in WebRTC are all the rage these days.

Jitsi got acquired by Atlassian early last year and now Twilio grabs Kurento.

What to expect in the coming days

Yesterday Twilio announced several interesting initiatives:

  1. Country specific guidelines on using SMS
  2. A new Voice Insights service
  3. The Kurento acquisition

Add to that their recent announcement on their new Enterprise offering and the way they seem to be adding more number choices in countries. What we get is too much work to cover a single vendor in this industry.

Twilio is enhancing its services in breadth and depth at the same time, doing so while trying to reach out to new customer types. I will be covering all of these issues soon enough. Some of it here, some on other blogs where I write. Customers with an active subscription for my WebRTC PaaS report will receive a longform written analysis separately covering all these aspects later this month.

What I want to cover in this article

What I want to cover in this part of my analysis of the recent Twilio announcements is their acquisition of Kurento.

The things I’ll touch on are why Kurento – how it will further Twilio’s goal – and also what will happen to the many users of Kurento.

I’ll also touch on the open source media server space, and the fact that the next runner up in the acquisition roulette of our industry should be Janus.

But first things first.

What is Kurento?

Kurento is an open source WebRTC server-side media framework implemented on top of GStreamer. While it may not be limited to WebRTC, my guess is that most if not all of its users make use of WebRTC with it.

What does that mean exactly?

  • Open source – anyone can download and use Kurento. And many do
    • There’s a vibrant community around it: developers that use it independently, outsourcing development shops that use it in their projects for customers, and the Kurento team itself, offering free and paid support for it
    • It is distributed under the Apache license which is quite lenient and enterprise-friendly
  • server-side media framework – when you want to process media in WebRTC for recording, multiparty or other processes, a server-side media framework is necessary
  • GStreamer – another popular open source project for media processing. Just another tidbit you may want to remember

I am seeing Kurento everywhere I go. Every couple of meetings I have with companies, they indicate that they make use of Kurento or when you look at their service it is apparent it uses Kurento. Somehow, it has become one of these universal packages that developers turn to when they need stuff done.

The Kurento team is running multiple activities/businesses (I might be making a few mistakes here – it is always hard to follow such internal structures):

  1. Kurento, the open source project itself
    • Assisted by research done at the Universidad Rey Juan Carlos located in Madrid, Spain
    • Funding raised through the European Commission
    • Money received by selling support and customization services
  2. NUBOMEDIA
    • A new initiative focused on scaling and an open source PaaS offering on top of Kurento
    • You can read more about it in a guest post by Luis Lopez (the face of Kurento)
  3. elasticRTC
    • Another new initiative, but a commercial one
    • Focused at getting scalable Kurento running on AWS
  4. Naevatec / Tikal Technologies SL
    • The business side of the Kurento project, where customization and support is done for a price

Kurento have a busy team…

What did Twilio acquire exactly?

This is where things get complicated. From my understanding, reading the materials online and through a briefing held with Twilio, this is what you can expect:

  • Kurento as an open source project is left open source, untouched and un-acquired. That said, the bulk of the team maintaining Kurento (the Naevatec developers) will be moving to be Twilio employees
  • Naevatec was not acquired and will live on. A new team will need to be hired and trained. During the transition period, the Twilio team will work on the Kurento project fulfilling any existing obligations. After that, Naevatec will supposedly have the internal manpower to take charge of that part of the business
  • elasticRTC was acquired. They will not be onboarding any new customers, but will continue supporting existing customers
    • This sounds like the story of AddLive and Snapchat (they waited for support contracts to expire and worked diligently but legally to get customers off the AddLive service)
    • That said, it seems like Twilio wants to leverage these early adopters of elasticRTC to design and build their own Twilio API offering around that domain (more on that later)
    • As I don’t believe elasticRTC has many customers, I don’t see this as a real blow to anyone
  • NUBOMEDIA was not mentioned in any of the announcements of the acquisition
    • I forgot to prod about it in my briefing…
    • Twilio are probably unhappy about this one, but could do nothing about it
    • NUBOMEDIA is funded by multiple European projects, so was either impossible to acquire or too expensive for what Twilio had an appetite for
    • It might also have had more partners than just the Kurento team(s)
    • How the acquisition will affect the NUBOMEDIA project, and how much zeal Twilio’s new employees from Naevatec will have for it, is an open question

To sum things up:

Twilio acqui-hired the team behind the Kurento project and took their elasticRTC offering out of the market before it became too popular.

How will Twilio use Kurento?

I’d like to split this one into short term and long term.

Short term – multiparty calling

Twilio needed an SFU. Desperately.

In April 2015 the Twilio Video initiative was announced. Almost 18 months later and that service is still in beta. It is also still 1:1 calling or mesh for multiparty.

Something had to be done. While I am sure Twilio has been working for quite some time on a solid multiparty option, they probably had a few roadblocks, which got them to start using Kurento – or decide they need to buy that technology instead of build it internally.

Which got them to the point of the acquisition. Twilio will probably embed Kurento into their Twilio Video offer, adding three new capabilities to their platform with it:

  1. Multiparty calling, in an SFU model, and maybe an MCU one
  2. Video recording capability – a popular Kurento use case
  3. PSTN connectivity for video calling – Kurento has a SIP-Gateway component that can be used for that purpose

Long term – generic media server

In the long term, Twilio can employ the full power of Kurento and offer it in the cloud with a flexible API that pipelines media in real time.

This can be used in our new brave world of AI, Bots, IOT and AR – all them acronyms people love talking about.

It will be interesting to see how Twilio ends up implementing it and what kind of an API and an offering they will put in place, as there are many challenges here:

  • How do you do something so generic but still maintain low resource consumption?
  • How do you price it in an attractive way?
  • How do you decide which use cases to cover and which to ignore?
  • How do you design it for scale, especially if you are as big as Twilio?
  • How do you design a simple yet flexible and powerful API for something so generic in nature?

This is one of the most interesting projects in our industry at the moment, and if Twilio is working towards that goal, then I envy their product managers and developers.

What will be left of the Kurento project?

That’s the big unknown. Luis Lopez, project lead of Kurento details the official stance of Kurento and Twilio on the Kurento blog. It is an expected positive looking write up, but it leaves the hard questions unanswered.

Maintaining the Kurento project

Twilio is known for their openness and the way they work with developers. While that is true, the Twilio github has little in the way of projects that aren’t samples written on top of the Twilio platform or open sourced projects that touch the core of Twilio. While that is understandable and expected, the question is how will Twilio treat the Kurento open source project?

Now that most of the workforce that is leading Kurento are becoming Twilio employees, will they work on the open source Kurento build or on internal needs and builds of Twilio? Here are a few hard questions that have no real answers to them:

  • What will be contributed back to the Kurento project besides stability and bug fixes?
  • If Twilio work on optimizing Kurento for higher capacities or add horizontal scalability modules to Kurento, will that be open sourced or left inside Twilio?
  • How will Twilio prioritize bugs and requests coming from the large Kurento community versus handling their own internal roadmap?

With Kurento, the answer in many cases would have been that Naevatec could just as well limit access to higher level modules to paying customers – but at least there was someone you could talk to when you wanted to purchase such modules. Now with Twilio, that route is closed. Twilio are not in the business of paid support and customization of open source projects – they are in the business of cloud APIs.

There will be ongoing friction inside Twilio over the decision between investing in the open source Kurento platform versus using it internally. If you thought that was bad with Atlassian acquiring Jitsi – it is doubly so here, where Twilio may have to compete with the build vs buy decisions of companies where “build” is done on top of Kurento.

I assume Twilio doesn’t have the answers to these questions yet either.

Maintaining the business model

Kurento has customers. Not only users and developers.

These customers pay Naevatec. They pay for support hours or for customization work.

Will this be allowed moving forward?

Can the yet-to-be-hired new team at Naevatec handle the support?

What happens when someone wants to pay a large sum of money to Naevatec in order to deploy a scalable Kurento service in the cloud? Will Naevatec pick that project? If said customer also wants to build an API platform on top of it, will that be something Naevatec will still do?

What will others who see themselves as Twilio competitors do if they made use of Kurento up until now? Especially if they were a Naevatec paying customer…

The good thing is, that many of the Kurento users ended up getting paid support and customization by third party vendors. Now if you only could know which one of them does a decent job…

Should TokBox be worried?

Yes and no.

Yes, because it means Twilio will be getting their multiparty story, and by that competing with TokBox. Twilio has a wider set of features as well, making them more attractive in some cases.

No, because there’s room for more players, and for video calling services at the moment, TokBox is the go-to vendor. I wonder if they can maintain their lead.

What about Janus?

I recently compared Jitsi to Kurento.

Little did I know then that Twilio decided on Kurento and was in the process of acquiring it.

I also raised the question about Janus.

To some extent, Janus is next-in-line:

  • Those I know who use the project are happy with it and its architecture. A lot more than other smaller open source media framework projects
  • Slack has been using Janus for a while now
  • Other vendors, some got acquired recently, also make use of it

Whether Meetecho, the company behind Janus, is willing to sell it isn’t important. It is a matter of price points.

We’ve seen the larger vendors veer towards acquiring the technology that they are using.

Will Slack go after Janus? Maybe Vonage/Nexmo? Oracle, to beef up their own WebRTC offering?

Open source media frameworks have proven to be extremely effective in churning out commercial services on top of them. WebRTC made that happen by being its own open source initiative.

It is good to see Kurento finding a new home and growing up. Kudos to the Kurento team.


Learn how to design the best architecture for your WebRTC service in this new Advanced WebRTC Architecture course.



How Media and Signaling flows look like in WebRTC?

Mon, 09/19/2016 - 12:00

I hope this will clear up some of the confusion around WebRTC media flows.

I guess this is one of the main reasons why I started with my new project of an Advanced WebRTC Architecture Course. In too many conversations I’ve had recently it seemed like people didn’t know exactly what happens with that WebRTC magic – what bits go where. While you can probably find that out by reading the specifications and the explanations around the WebRTC APIs or how ICE works, they all fail to consider the real use cases – the ones requiring media engines to be deployed.

So here we go.

In this article, I’ll be showing some of these flows. I made them part of the course – a whole lesson. If you are interested in learning more – then make sure to enroll in the course.

#1 – Basic P2P Call

[Figure: Direct WebRTC P2P call]

We will start off with the basics and build on that as we move along.

Our entities will be colored in red. Signaling flows in green and media flows in blue.

What you see above is the classic explanation of WebRTC. Our entities:

  1. Two browsers, connected to an application server
  2. The application server is a simple web server that is used to “connect” both browsers. It can be something like the Facebook website, an ecommerce site, your healthcare provider or my own site with its monthly virtual coffee sessions
  3. Our STUN and TURN server (yes. You don’t need two separate servers. They almost always come as a single server/process). And we’re not using it in this case, but we will in the next scenarios

What we have here is the classic VoIP (or WebRTC?) triangle. Signaling flows vertically towards the server but media flows directly across the browsers.
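The signaling leg of that triangle can be sketched in a few lines. This is a minimal illustration, assuming an in-memory stand-in for the application server; `SignalingRelay`, `SignalMessage` and the peer names are hypothetical and not part of any WebRTC API. The point it shows is that the server only forwards opaque messages and never touches the media.

```typescript
// Messages an application server relays between two browsers.
// The payloads are opaque to the server: it never touches media.
type SignalKind = "offer" | "answer" | "candidate";

interface SignalMessage {
  kind: SignalKind;
  from: string;
  to: string;
  payload: string; // SDP or ICE candidate text, passed through untouched
}

type Deliver = (message: SignalMessage) => void;

// Hypothetical in-memory stand-in for the application server.
class SignalingRelay {
  private readonly peers = new Map<string, Deliver>();

  register(peerId: string, deliver: Deliver): void {
    this.peers.set(peerId, deliver);
  }

  // Forward a message to its recipient; returns false if they are not connected.
  send(message: SignalMessage): boolean {
    const deliver = this.peers.get(message.to);
    if (deliver === undefined) {
      return false;
    }
    deliver(message);
    return true;
  }
}

// Usage: Alice's offer and Bob's answer both travel via the server.
const relay = new SignalingRelay();
const aliceInbox: SignalMessage[] = [];
const bobInbox: SignalMessage[] = [];
relay.register("alice", (m) => { aliceInbox.push(m); });
relay.register("bob", (m) => { bobInbox.push(m); });

relay.send({ kind: "offer", from: "alice", to: "bob", payload: "<SDP offer>" });
relay.send({ kind: "answer", from: "bob", to: "alice", payload: "<SDP answer>" });
```

In a real service the `deliver` callbacks would be WebSocket or HTTP connections to each browser; the media itself would still flow directly between the two peers.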

BTW – there’s some signaling going off from the browsers towards the STUN/TURN server for practically all types of scenarios. This is used to find the public IP address of the browsers at the very least. And almost always, we don’t draw this relationship (until you really need to fix a bug, STUN seems obvious and too simple to even mention).


Summing this one up: nothing to write home about.

Moving on…

#2 – Basic Relay Call

[Figure: Basic WebRTC relay call]

This is probably the main drawing you’ll see when ICE and TURN get explained.

In essence, the browsers couldn’t (or weren’t allowed to) reach each other directly with their media, so a third party needs to facilitate that for them and route the media. This is exactly why we use TURN servers in WebRTC (and other VoIP protocols).

This means that WebRTC isn’t necessarily P2P and P2P can’t be enforced – it is just a best effort thing.
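One way to tell which of the two flows a given call ended up with is to look at the ICE candidate types of the selected pair. The sketch below assumes the standard ICE candidate line format, where the token after `typ` is one of `host`, `srflx`, `prflx` or `relay`; the function names and the sample candidate lines are made up for illustration.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

const KNOWN_TYPES: readonly CandidateType[] = ["host", "srflx", "prflx", "relay"];

// Extract the candidate type from an ICE candidate line
// (the token that follows "typ").
function candidateType(candidateLine: string): CandidateType | null {
  const tokens = candidateLine.trim().split(/\s+/);
  const typIndex = tokens.indexOf("typ");
  if (typIndex < 0) {
    return null;
  }
  const value = tokens[typIndex + 1];
  const match = KNOWN_TYPES.find((t) => t === value);
  return match ?? null;
}

// A call is relayed (scenario #2) when either end of the selected
// candidate pair is a relay candidate allocated on a TURN server.
function isRelayed(localLine: string, remoteLine: string): boolean {
  return candidateType(localLine) === "relay" || candidateType(remoteLine) === "relay";
}

// Illustrative candidate lines using documentation-only addresses.
const hostCandidate = "candidate:1 1 udp 2122260223 192.168.1.5 54321 typ host";
const relayCandidate =
  "candidate:3 1 udp 41885439 198.51.100.9 61000 typ relay raddr 203.0.113.7 rport 46154";

const directCall = isRelayed(hostCandidate, hostCandidate);
const relayedCall = isRelayed(hostCandidate, relayCandidate);
```

In a browser, the candidate lines would come from the peer connection’s statistics or from the `icecandidate` events rather than from hard-coded strings.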

So far so good. But somewhat boring and expected.

Let’s start looking at more interesting scenarios. Ones where we need a media server to handle the media:

#3 – WebRTC Media Server Direct Call, Centralized Signaling

[Figure: WebRTC Media Server Direct Call, Centralized Signaling]

Now things start to become interesting.

We’ve added a new entity into the mix – a media server. It can be used to record the calls, manage multiparty scenarios, gateway to other networks, do some other processing on the media – whatever you fancy.

To make things simple, we’ve dropped the relay via TURN. We will get to it in a moment, but for now – bear with me please.


The media now needs to flow through the media server. This may look like the previous drawing, where the media was routed through the TURN server – but it isn’t.

Where the TURN server relays the media without looking at it – and without being able to look at it (it is encrypted end-to-end); the Media Server acts as a termination point for the media and the WebRTC session itself. What we really see here is two separate WebRTC sessions – one from the browser on the left to the media server, and a second one from the media server to the browser on the right. This one is important to understand – since these are two separate WebRTC sessions – you need to think and treat them separately as well.

Another important note to make about media servers is that putting them on a public IP isn’t enough – you will still need a TURN server.


On the signaling front, most assume that signaling continues as it always has. In which case, the media server needs to be controlled in some manner, presumably using a backend-to-backend signaling with the application server.

This is a great approach that keeps things simple with a single source of truth in the system, but it doesn’t always happen.

Why? Because we have APIs everywhere. Including in media servers. And these APIs are sometimes used (and even abused) by clients running browsers.

Which leads us to our next scenario:

#4 – WebRTC Media Server Direct Call, Split Signaling

[Figure: WebRTC Media Server Direct Call, Split Signaling]

This scenario is what we usually get to when we add a media server into the mix.

Signaling will more often than not be done between the browser and the media server while at the same time we will have signaling between the browser and the application server.

This is easier to develop and start running, but comes with a few drawbacks:

  1. Authorization now needs to take place between multiple different servers written in different technologies
  2. It is harder to get a single source of truth in the system, which means it is harder for the application server to know what is really going on
  3. Doing such work from a browser opens up vulnerabilities and attack vectors on the system – as the code itself is wide open and exposes more of the backend infrastructure

Skip it if you can.

Now let’s add back that STUN/TURN server into the mix.

#5 – WebRTC Media Server Call Relay

[Figure: WebRTC Media Server Call Relay]

This scenario is actually #3 with one minor difference – the media gets relayed via TURN.

It will happen if the browsers are behind firewalls, or in special cases when this is something that we enforce for our own reasons.

Nothing special about this scenario besides the fact that it may well happen when your intent is to run scenario #3 – hard to tell your users which network to use to access your service.

#6 – WebRTC Media Server Call Partial Relay

[Figure: WebRTC Media Server Call Partial Relay]

Just like #5, this is also a derivative of #3 that we need to remember.

The relay may well happen only in one side of the media server – I hope you remember that each side is a WebRTC session on its own.

If you notice, I decided here to have signaling direct to the media server, but could have used the backend to backend signaling.
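Because each browser-to-media-server leg is a WebRTC session of its own, scenarios #3, #5 and #6 differ only in which legs happen to be relayed. A rough sketch of that idea follows; the `Leg` type and `classifyCall` function are hypothetical names used purely for illustration.

```typescript
// Each browser-to-media-server leg is its own WebRTC session.
interface Leg {
  browser: string;
  viaTurn: boolean; // true when this leg's media is relayed through TURN
}

type Scenario =
  | "#3 direct to media server"
  | "#5 relayed on all legs"
  | "#6 partial relay";

// Decide which of the drawings a call matches by counting relayed legs.
function classifyCall(legs: Leg[]): Scenario {
  const relayedLegs = legs.filter((leg) => leg.viaTurn).length;
  if (relayedLegs === 0) {
    return "#3 direct to media server";
  }
  if (relayedLegs === legs.length) {
    return "#5 relayed on all legs";
  }
  return "#6 partial relay";
}

// One user on an open network, one behind a restrictive firewall.
const mixedCall = classifyCall([
  { browser: "left", viaTurn: false },
  { browser: "right", viaTurn: true },
]);
```

The same call can move between these categories from one session to the next, since it depends on the network each user connects from.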

#7 – WebRTC Media Server and TURN Co-location

[Figure: WebRTC Media Server and TURN Co-location]

This scenario shows a different type of a decision making point. The challenge here is to answer the question of where to deploy the STUN/TURN server.

While we can put it as an independent entity that stands on its own, we can co-locate it with the media server itself.

What do we gain by this? Less moving parts. Scales with the media server. Less routing headaches. Flexibility to get media into your infrastructure as close to the user as possible.

What do we lose? Two different functions in one box – at a time when micro services are the latest tech fad. We can’t scale them separately and at times we do want to scale them separately.

Know Your Flows

These are some of the decisions you’ll need to make if you go to deploy your own WebRTC infrastructure; and even if you don’t do that and just end up going for a communication API vendor – it is worthwhile understanding the underlying nature of the service. I’ve seen more than a single startup go work with a communication API vendor only to fail due to specific requirements and architectures that had to be put in place.

One last thing – this is 1 of 40 different lessons in my Advanced WebRTC Architecture Course. If you find this relevant to you – you should join me and enroll in the course. There’s an early bird discount valid until the end of this week.


IMTC: Supporting WebRTC Interoperability

Thu, 09/15/2016 - 12:00

Where is the IMTC focusing its efforts when it comes to WebRTC?

[Bernard Aboba, who is IMTC Director and Principal Architect for Microsoft wanted to clarify a bit what the IMTC is doing in the WebRTC Activity Group. I was happy to give him this floor, clarifying a bit the tweet I shared in an earlier post]

One of the IMTC’s core missions is to enhance interoperability in multimedia communications, with real-time video communications having been a focus of the organization since its inception. With IMTC’s membership including many companies within the video industry, IMTC has over the years dealt with a wide range of video interoperability issues, from simple 1:1 video scenarios to telepresence use cases involving multiple participants, each with multiple cameras and screens.

With WebRTC browsers now adding support for H.264/AVC as well as VP9, and support for advanced video functionality such as simulcast and scalable video coding (SVC) becoming available, the need for WebRTC video protocol and API interoperability testing has grown, particularly in scenarios implemented by video conferencing applications. As a result, the IMTC’s WebRTC Activity Group has been working to further interoperability testing between WebRTC browsers.

In the past, the IMTC has sponsored development of test suites, including a test suite for SIP over IPv6, and most recently a tool for testing interoperability of HEVC/H.265 scalable video coding. For SuperOp 2016, the WebRTC AG took on testing of WebRTC audio and video interoperability. So a logical next step was to work on development of automated WebRTC interoperability tests. Challenges include:

  1. Developing basic audio and video tests that can run on all browsers without rewriting the test code for each new browser to be supported.
  2. Developing tests covering not only basic use cases (e.g. peer-to-peer audio/video), but also advanced use cases requiring a central conferencing server (e.g. conferencing scenarios involving multiple participants, simulcast, scalable video coding, screen sharing, etc.)

For its initial work, IMTC decided to focus on the first problem. To enable interoperability testing of the VP9 and H.264/AVC implementations now available in browsers, the IMTC supported Philipp Hancke (known to the community as “fippo”) in enhancing automated WebRTC interoperability tests, which are now available online, as is the sample code used in the automated tests.

The interoperability tests depend on adapter.js, a Javascript “shim” library originally developed by the Chrome team to enable tests to be run on Chrome and Firefox. Support for VP9 and H.264/AVC has been rolled into adapter.js 2.0, as well as support for Edge (first added by fippo in October 2015). The testbed also depends on a merged fix (not yet released) in version 2.0.2. The latest adapter.js release as well as ongoing fixes are available online.

With the enhancements rolled into adapter.js 2.0, the shim library enables WebRTC developers to ship audio and video applications running across browsers using a single code base. At ClueCon 2016, Anthony Minessale of Freeswitch demonstrated the Verto client written to the WebRTC 1.0 API supporting audio and video interoperability between Chrome, Firefox and Edge.
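To give a feel for what a “shim” does, here is a deliberately simplified sketch. It is not adapter.js code; it only illustrates the general idea of mapping vendor-prefixed names to the standard one, assuming a window-like object. The prefixed names are the ones browsers historically used, while `WindowLike`, `findPeerConnectionName` and `shimPeerConnection` are hypothetical.

```typescript
// A window-like object: older browsers exposed the constructor
// under different names. Values are typed loosely on purpose.
type WindowLike = Record<string, unknown>;

// Property names to look for, in order of preference.
const PEER_CONNECTION_NAMES = [
  "RTCPeerConnection",
  "webkitRTCPeerConnection",
  "mozRTCPeerConnection",
] as const;

// Return the first name under which a peer connection constructor is exposed.
function findPeerConnectionName(win: WindowLike): string | null {
  for (const name of PEER_CONNECTION_NAMES) {
    if (typeof win[name] === "function") {
      return name;
    }
  }
  return null;
}

// Install the unprefixed name so application code can use a single code path.
// Returns false when no implementation is available at all.
function shimPeerConnection(win: WindowLike): boolean {
  const name = findPeerConnectionName(win);
  if (name === null) {
    return false;
  }
  win["RTCPeerConnection"] = win[name];
  return true;
}

// Simulated browser that only has the webkit-prefixed constructor.
const fakeWindow: WindowLike = { webkitRTCPeerConnection: function () {} };
const shimmed = shimPeerConnection(fakeWindow);
```

A real shim library has to do far more than rename constructors, since behavior and API details differ between browsers, which is exactly why automated interoperability tests are valuable.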

Got questions or want to learn more about the IMTC and its involvement with WebRTC? Email the IMTC directly.


Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.


Do you still need TURN if your media server has a public IP address?

Mon, 09/12/2016 - 12:00

Yes you do. Sorry.

This is something I bumped into recently and was quite surprised it wasn’t obvious, which led me to the conclusion that the WebRTC Architecture course I am launching is… mandatory. This was a company that had their media server on a public IP address, thinking that this should remove their need to run a TURN server. Apparently, the only thing it did was remove their connection rate.

It is high time I write about it here, as over the past year I actually saw 3 different ways in which vendors break their connectivity:

  1. They don’t put a TURN server at all, relying on media servers with public IP addresses
  2. They don’t put a TURN server at all, assuming STUN is enough for a peer to peer based service (!)
  3. They don’t configure the TURN server they use for TCP and TLS connectivity, assuming UDP relay is more than enough
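These three mistakes can be caught with a quick check of the ICE server list an application hands to its peer connections. The sketch below assumes the usual TURN URI forms (`turn:` with an optional `?transport=tcp`, and `turns:` for TLS); `IceServerEntry` and `auditIceServers` are hypothetical names, and the check inspects URLs only, so it cannot tell whether the server itself actually accepts those transports.

```typescript
interface IceServerEntry {
  urls: string[];
  username?: string;
  credential?: string;
}

type Problem = "no TURN server" | "no TURN over TCP" | "no TURN over TLS";

// Report which connectivity options are missing from an ICE server list.
function auditIceServers(servers: IceServerEntry[]): Problem[] {
  const turnUrls: string[] = [];
  for (const server of servers) {
    for (const url of server.urls) {
      const lower = url.toLowerCase();
      if (lower.startsWith("turn:") || lower.startsWith("turns:")) {
        turnUrls.push(lower);
      }
    }
  }
  // Mistakes 1 and 2: only STUN, or nothing at all.
  if (turnUrls.length === 0) {
    return ["no TURN server"];
  }
  // Mistake 3: TURN is there, but only over UDP.
  const problems: Problem[] = [];
  const hasTcp = turnUrls.some((u) => u.startsWith("turn:") && u.includes("transport=tcp"));
  const hasTls = turnUrls.some((u) => u.startsWith("turns:"));
  if (!hasTcp) {
    problems.push("no TURN over TCP");
  }
  if (!hasTls) {
    problems.push("no TURN over TLS");
  }
  return problems;
}

// A STUN-only setup and a UDP-only TURN setup, using placeholder hostnames.
const stunOnly = auditIceServers([{ urls: ["stun:stun.example.com:3478"] }]);
const udpOnly = auditIceServers([
  { urls: ["turn:turn.example.com:3478?transport=udp"], username: "u", credential: "c" },
]);
```

An empty result means the list at least advertises UDP, TCP and TLS relay options; it says nothing about whether they work, which still has to be tested on real networks.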


I digress though. I want to explain why the first alternative is broken:

Why a public IP address for your media server isn’t enough

With WebRTC, traffic goes peer to peer. Or at least it should:

But this doesn’t always work because one or both of the browsers are on private networks, so they don’t really have a public address to use – or don’t know it. If one of them has a public IP, then things should be simpler – the other end will direct traffic to that address, and from that “pinhole” that gets created, traffic can flow the other way.

The end result? If you put your media server on a public IP address – you’re set for success.

But the thing is you really aren’t.

There’s this notion of IT and security people that you should only open ports that need to be used. And since all traffic to the internet flows over HTTP(S); and HTTP(S) flows over TCP – you can just block UDP and be done with it.

Now, something that usually gets overlooked is that WebRTC uses UDP for its media traffic. Unless TURN relay over TCP/TLS is configured and necessary. Which sometimes it is. I asked a colleague of mine about the traffic they see, and got something similar to this distribution table:

With up to 20% of the sessions requiring TURN with TCP or TLS – it is no wonder a public IP configured on a media server just isn’t enough.

Oh, and while we’re talking security – I am not certain that in the long run, you really want your media server on the internet with nothing in front of it to handle nasty stuff like DDoS.

What should you do then?

  1. Make sure you have TURN configured in your service
    • But make sure you have TCP and TLS enabled in it and found in your peer connection’s configuration
    • I don’t care if you do that as part of your media server (because it is sophisticated), using a TURN server you cobbled up or through a third party service
  2. Check out my new WebRTC Architecture course
    • It covers other aspects of TURN servers, IP addresses and things imperative for a production deployment
    • The images used in this article come from the materials I’ve newly created for it
  3. Test the configuration you have in place
    • Limit UDP on your test machines, do it on live networks
    • Or just use testRTC – we have in this service simple mechanisms in place to run these specific scenarios
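As a rough illustration of the first and third points, the sketch below builds a peer connection configuration that lists TURN over UDP, TCP and TLS, plus a test-only variant that forces relay over TLS. The object shape follows the browser’s `RTCConfiguration` (`iceServers`, `iceTransportPolicy`); the hostname, credentials, port numbers and function names are assumptions made for the example.

```typescript
// Shapes mirror the RTCConfiguration fields used here.
interface TurnServer {
  urls: string[];
  username: string;
  credential: string;
}

interface PeerConfig {
  iceServers: TurnServer[];
  iceTransportPolicy: "all" | "relay";
}

// Ports are assumptions: 3478 is the usual TURN port, and 443 is a common
// choice for TLS because restrictive firewalls tend to allow it.
function productionConfig(host: string, username: string, credential: string): PeerConfig {
  return {
    iceServers: [
      {
        urls: [
          `turn:${host}:3478?transport=udp`,
          `turn:${host}:3478?transport=tcp`,
          `turns:${host}:443?transport=tcp`,
        ],
        username,
        credential,
      },
    ],
    iceTransportPolicy: "all",
  };
}

// Test-only variant: offer nothing but TURN over TLS and force relay,
// so a call connects only if that path really works.
function tlsOnlyTestConfig(host: string, username: string, credential: string): PeerConfig {
  return {
    iceServers: [{ urls: [`turns:${host}:443?transport=tcp`], username, credential }],
    iceTransportPolicy: "relay",
  };
}

// Placeholder host and credentials.
const prodConfig = productionConfig("turn.example.com", "demo-user", "demo-secret");
const tlsTestConfig = tlsOnlyTestConfig("turn.example.com", "demo-user", "demo-secret");
```

In a browser, either object could be passed to `new RTCPeerConnection(...)`. Forcing relay in a test build complements, but does not replace, testing from networks where UDP is actually blocked.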

Whatever you do though, don’t rely on a public IP address in your media server to be enough.


Should you use Kurento or Jitsi for your multiparty WebRTC video conference product?

Mon, 09/05/2016 - 12:00

Kurento or Jitsi; Kurento vs Jitsi – is this the ultimate head to head comparison for open source media servers in WebRTC?

Yes and no. And if you want an easy answer of “Kurento is the way to go” or “Jitsi will solve all of your headaches” then you’ve come to the wrong place. As with everything else here, the answer depends a lot on what it is you are trying to achieve.

Since this is something that gets raised quite often these days by the people I chat with, I decided to share my views here. To do that, the best way I know is to start by explaining how I compartmentalized these two projects in my mind:

Jitsi Videobridge

The Jitsi Videobridge is an SFU. It is an open source one, which is currently owned and maintained by Atlassian.

The acquisition of the Jitsi Videobridge serves Atlassian in two ways:

  1. Integrating Jitsi Videobridge into HipChat while owning the technology (it took the better part of the last 18 months)
  2. Showing some open source love – they did change the license of Jitsi from LGPL to APL

Here’s the intro of Jitsi from its github page:

Jitsi Videobridge is an XMPP server component that allows for multiuser video communication. Unlike the expensive dedicated hardware videobridges, Jitsi Videobridge does not mix the video channels into a composite video stream, but only relays the received video channels to all call participants. Therefore, while it does need to run on a server with good network bandwidth, CPU horsepower is not that critical for performance.

I emphasized the important parts for you. Here’s what they mean:

  • XMPP server component – a decision was made as to the signaling of Jitsi. It was made years ago, where the idea was to “compete” head-to-head with Google Hangouts. So the choice was made to use XMPP signaling. This means that if you need/want/desire anything else, you are in for a world of pain – doable, but not fun
  • does not mix the video channels – it doesn’t look into the media at all, nor can it process raw video in any way
  • only relays the received video – it is an SFU

Put simply – Jitsi is an SFU with XMPP signaling.
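To make the bandwidth-versus-CPU trade-off in that description concrete, here is a rough back-of-the-envelope sketch of server load for an SFU compared to a mixing MCU. It assumes every participant sends one video stream and watches everyone else, and that the MCU encodes one composite per participant in the worst case. The function names are illustrative, and real deployments (simulcast, showing only the last few speakers, and so on) will differ.

```typescript
interface ServerLoad {
  incomingStreams: number; // streams the server receives
  outgoingStreams: number; // streams the server sends
  decodes: number;         // video decodes the server runs
  encodes: number;         // video encodes the server runs
}

// SFU: relay every received stream to every other participant, untouched.
function sfuLoad(participants: number): ServerLoad {
  return {
    incomingStreams: participants,
    outgoingStreams: participants * (participants - 1),
    decodes: 0,
    encodes: 0,
  };
}

// Mixing MCU: decode everything, compose, and send one stream back to each.
// Assumption: worst case of one dedicated composite encode per participant.
function mcuLoad(participants: number): ServerLoad {
  return {
    incomingStreams: participants,
    outgoingStreams: participants,
    decodes: participants,
    encodes: participants,
  };
}
```

For a 5-party call this gives the SFU 20 outgoing streams and no codec work, while the MCU sends only 5 streams but runs 5 decodes and up to 5 encodes. That is the "network bandwidth matters, CPU horsepower less so" point from the Jitsi description.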

If this is what you’re looking for then this baby is for you. If you don’t want/need an SFU or have another signaling protocol, better start elsewhere.

You can find outsourcing vendors who are happy to use Jitsi and have it customized or integrated to your use case.


Kurento

Kurento is a kind of media server framework. This too is an open source one, but one that is maintained by Kurento Technologies.

With Kurento you can essentially build whatever you want when it comes to backend media processing: SFU, MCU, recording, transcoding, gateway, etc.

This is an advantage and a disadvantage.

An advantage because it means you can practically use it for any type of use case you have.

A disadvantage because there’s more work to be done with it than something that is single purpose and focused.

Kurento has its own set of vendors who are happy to support, customize and integrate it for you, one of which are the actual authors and maintainers of the Kurento code base.

Which one’s for you? Kurento or Jitsi?

Both frameworks are very popular, with each having at the very least tens of independent installations and integrations done on top of them and running in production services.

Kurento or Jitsi? Not always an easy choice, but here’s where I draw the line:

If what you need is a pure SFU with XMPP on top, then go with Jitsi. Or find some other “out of the box” SFU that you like.

If what you need is more complex, or necessitates more integration points, then you are probably better off using Kurento.

What about Janus?

Janus is… somewhat tougher to explain.

Their website states that it is a “general purpose WebRTC Gateway”. So in my mind it will mostly fit into the role of a WebRTC-SIP gateway.

That said, I’ve seen more than a single vendor using it in totally other ways – anything from an SFU to an IOT gateway.

I need to see more evidence of use cases where production services end up using it for multiparty as opposed to a gateway component to suggest it as a solid alternative.

Oh – and there are other frameworks out there as well – open source or commercial.

Where can I learn more?

Multiparty and server components are a small part of what is needed when going about building a WebRTC infrastructure for a communication service.

In the past few months, I’ve noticed a growing number of challenges and misunderstandings about what WebRTC really is. People tend to focus on the obvious side of the browser APIs that WebRTC has, and forget to think about the backend infrastructure for it – something that is just as important, if not more.

It is why I’ve decided to launch an online WebRTC Architecture course that tackles these types of questions.

Course starts October 24, priced at $247 USD per student. If you enroll before October 10, there’s a $50 discount – so why wait?

The post Should you use Kurento or Jitsi for your multiparty WebRTC video conference product? appeared first on

Will there ever be a decentralized web?

Mon, 08/29/2016 - 12:00

No. Yes. Don’t know.

I’ve recently read an article at iSchool@Syracuse featuring, for lack of a better term on my part, pundits opining about the decentralized web.

It is an interesting read. Going through the opinions there, you can divide the crowd into 3 factions:

  1. We want privacy. Also we hate governments and monopolies. This is the largest group
  2. There’s this great tech we can put in place to make the internet more robust
  3. We actually don’t know

I am… somewhat split across all of these three groups.

#1 – Privacy, Gatekeepers and Monopolies

Like any other person, I want privacy. On the other hand, I want security, which in many cases (and especially today) comes at the price of privacy. I also want convenience, and at the age of artificial intelligence and chat bots – this can easily mean less privacy.

As for governments and monopolies – I don’t think these will change due to a new protocol or a decentralized web. The web started as something decentralized and utopian to some extent. It degraded to what it is today because governments caught on and because companies grew inside the internet to become monopolies. Can we redesign it all in a way that will not allow for governments to rule over the data going into them or for monopolies to not exist? I doubt it.

I am taking part now in a few projects where location matters. Where you position your servers, how you architect your network, and even how you communicate your intent with governments – all these can make or break your service. I just can’t envision how protocols can change that on a global scale – and how the powers that be that need to promote and push these things will actively do so.

I think it is a good thing to strive for, but something that is going to be very challenging to achieve:

  • Most powerful services today rely on big data = no real privacy (at least not in front of the service you end up using). This will always cause tension between our desire for privacy and our desire for personalization and automation
  • Most governments can enforce rules in the long run in ways that catch up with protocols – or simply abuse weaknesses in products
  • Popular services bubble to the top, in the long run making them into monopolies and gatekeepers by choice – no one forces us to use Google for search, and yet most of us view search on the web and Google as synonymous

#2 – Tech

Yes. Our web is client-server for the most part, with browsers getting their data fix from backend servers.

We now have technologies that can work differently (WebRTC’s data channel is one of them, and there are others still).

We can and should work on making our infrastructure more robust. More impregnable to malicious attackers and less prone to errors. We should make it scale better. And yes. Decentralization is usually a good design pattern to achieve these goals.

But if at the end of the day, the decentralized web is only about maintaining the same user experience, then this is just a slow evolution of what we’re already doing.

Tech is great. I love tech. Most people don’t really care.

#3 – We just don’t know

As with many other definitions out there, there’s no clear definition of what the decentralized web is or should be. Just a set of opinions by different pundits – most with an agenda for putting out that specific definition.

I really don’t know what that is or what it should be. I just know that our web today is centralized in many ways, but in other ways it is already rather decentralized. The idea that I have this website hosted somewhere (I am clueless as to where), while I write these words from my home in Israel, it is being served either directly or from a CDN to different locations around the globe – all done through a set of intermediaries – some of which I specifically selected (and pay for or use for free) – to me that’s rather decentralized.

At the end of the day, the work being done by researchers for finding ways to utilize our existing protocols to offer decentralized, robust services or to define and develop new protocols that are inherently decentralized is fascinating. I’ve had my share of it in my university days. This field is a great place to research and learn about networks and communications. I can’t wait to see how these will evolve our every day networks.



The post Will there ever be a decentralized web? appeared first on

Are WebRTC room systems interesting again?

Mon, 08/22/2016 - 12:00

I get a feeling that the room system is actually about to change. And that’s probably a good thing.

For many years, video conferencing was defined by the “codec”. The “codec” in this case wasn’t H.264 or any other specification of a video compression standard. It was the term given to the grey box sitting inside a meeting room connected to a camera. For me, a better term for it was always the “room system”. The first ones started as designed, proprietary hardware, running proprietary embedded operating systems. They were connected to a specific camera that was either a part of the box or connected to the box externally – but in most cases was again a proprietary camera.

There have been attempts in the past to replace the room system with something less expensive. I even remember GIPS (remember them? Google acquired them 6 years ago and made WebRTC out of them) writing a post on their blog on how to build your own video conferencing system from an Intel machine and a Logitech webcam. It was nice, but it really didn’t change the industry.

Little has changed in the video conferencing room system. When I stopped following that industry closely, which was a few years ago, things were still on the same trajectory:

  • Use proprietary hardware (the industry leaned towards the TI DSP at the time)
  • Use Embedded Linux as the OS (at the time, this was actually a refreshing sidestep from VxWorks)
  • Use an external proprietary camera (sourced from Sony if you wanted expensive highend or from another vendor if you wanted expensive “lowend”)

Software was taking the same design concepts of embedded platforms and closed systems at the time. You wrote ugly proprietary code from scratch with specialized UI frameworks. No fun at all.

When I decided to write my first posts about WebRTC, I wanted to share my views of what WebRTC will do to the video conferencing room system. I noted three changes we will see:

So how will we handle it now?

  1. Commodity hardware, probably still with proprietary cameras
  2. Android operating system
  3. WebRTC multimedia and a web browser for signaling and everything else

I wrote it more than 4 years ago. And it still hasn’t happened. What I did fail to see, was how two additional changes are going to affect this industry:

  1. Migration towards cloud based deployments, services and business models (specifically in the video conferencing industry)
  2. Open hardware. Or at the very least, the constant grind of Moore’s Law and the stupidly capable hardware we have today

Hardware is cool again. IoT (the Internet of Things) made sure of that. Everything from wristbands, to drones, to self driving cars. Somehow, hardware startups had to also look at the video conferencing system.

Highfive was an early indication of that. A company conceived in 2012, just about the time I wrote my own thoughts on the video conferencing room system. To some extent, also Double Robotics, who made use of an iPad and a Segway-like device. Both employed cloud for their distribution, selling a service around their devices. They were pioneers in selling their own video “codec” (=room system) coupled with a service they host and manage.

In the past month, things seem to be progressing in this same trajectory. Three items on the news recently caught my attention:

#1 – HELLO

HELLO is a video conferencing room system created by Solaborate. Solaborate is a social business/collaboration platform that has been around for several years now. Their CEO, Labinot Bytyqi, was interviewed here a few years ago about Solaborate. I am not sure how they have been faring since then, but they must have been busy.

It seems that they are now adding a hardware component to the Solaborate platform in the form of HELLO. And what better place to go about doing that than a Kickstarter campaign?

HELLO Kickstarter

The thing I liked most is the image they shared of their first prototype:

For the uninitiated, that’s the Logitech C920 webcam, cut from its plastic contraption and glued together to something that looks like one of them Linux or Android-in-a-stick devices. Probably what holds the quad core ARM processor. Commodity hardware at its best.

Solaborate took a low goal for their Kickstarter campaign, passing it and then some. They will probably end up below the million dollar mark, but with a rather solid number of backers considering this is at the end of the day an enterprise product.

Oh – and did I mention they use WebRTC?

#2 – Pluot

Pluot is a new startup I came across on TechCrunch when they reported that Pluot raised $2.5 million.

The idea isn’t any different than the previous set of vendors. You get a small box and a camera, connected to the Pluot service.

From a hardware standpoint, it isn’t much different than the HELLO box. The camera from the picture is a Logitech C920 one.

The box, if you ask me, is too similar to an Intel NUC.

And it is actually running on off-the-shelf Intel commodity hardware:

The Pluot device is an Intel NUC running Ubuntu Core. […]

All the WebRTC media streams are peer-to-peer. […] That’s why we’re using an Intel Core i3 instead of a cheaper ARM option.

And yes. It is using WebRTC. And guess what? As with Skype, Pluot is also based on Electron (and Chromium as an extension of it):

So we scratched our own itch and built a little appliance, using WebRTC and atom-shell (which is now electron).

Pluot took a different business model approach – one used extensively by mobile operators: the box is free and you pay for the monthly subscription service only.

Commodity hardware, commodity software, commodity video conferencing core inside a Chromium shell, powering the whole video conferencing service.

#3 – Cisco trimming its workforce

In seemingly unrelated news, Cisco is trimming down its workforce. Everywhere in the news that this is mentioned, it also comes with an indication that the cuts are mainly on the hardware side of the house. There’s a need to focus more on software these days.

As one of the biggest players in video conferencing room systems, I wonder what that means. Is it a move towards leaner, more software focused room systems? Are the room systems at Cisco considered hardware or software in essence? Will we see a shift in business models?

The room system is slowly starting to change and take a new shape.

This change isn’t just a technical one in the specification of the hardware and software, but goes a lot deeper than that. These changes come with a change in how the room system is built, which parts are developed and which are “sourced” from open source alternatives (or paid third parties), who offers the service and what the business model looks like.


Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.

The post Are WebRTC room systems interesting again? appeared first on

Microsoft Acquires Beam, Showing the Value of WebRTC to Interactive Live Streaming

Mon, 08/15/2016 - 12:00

Low latency is critical for interactive live streaming.

Last week Microsoft acquired Beam, a company focused on a gamer interactive live streaming service.

According to CrunchBase, Beam has been around for almost 2 years before getting plucked by Microsoft. The investment in them has been smaller than 0.5M USD.

For some reason unknown to me, there are people who love watching other people play games. I guess it is similar to some extent to people sitting down to watch a soccer game. Another thing I can’t really understand. It is the reason why Twitch was acquired by Amazon for almost a billion dollars – a month prior to Beam’s founding.

What Beam worked on was a way to enable viewers to be a part of the game and up their engagement. You do this by allowing viewers to push feedback to the gamers – add challenges to them, buy virtual goods for them, etc. From Beam’s website:

We make it possible for streamers to involve viewers in their gameplay, no matter what game they’re playing.

Want to let your viewers choose your weapon, make quests for you, or even fly a drone around your room? You can do that, all in realtime. Our SDK allows developers to create interactive experiences for existing games with as few as 25 lines of code.

In the console world, there are two major players – Microsoft Xbox and Sony PlayStation. With the acquisition of Beam, Microsoft is trying to build an ecosystem of viewers around the gamers and games offered in Xbox. Will they share the SDK and platform with Sony? It is too soon to tell, especially now that Microsoft is opening up and trying to build large ecosystem around its services as opposed to its operating systems. It might just be that Microsoft is trying to become a big player in gaming in general – not just console ones but also mobile.

Back to Beam and video streaming.

To enable higher and richer interactions between viewers and gamers, latency higher than a second is detrimental. This makes HLS and MPEG-DASH protocols irrelevant. Flash is on its way out the window. The only other technology that can get to sub-second latency for real time video streaming, then, is WebRTC.


WebRTC is exactly what Beam has been using in their “protocol” dubbed FTL. It used WebRTC to stream video to the viewers instead of the more traditional mechanism of Flash.

I have been a believer in WebRTC for live streaming and broadcast for over a year now. It is just another place where WebRTC makes a lot of sense, but it will take time for us to get there. The main reason for that is that current implementations are too focused on video chat scenarios – trying to leverage the WebRTC implementation found in Chrome and hooking it up to backend media servers that are again geared towards video chat use cases.

There are 4 different ways in which WebRTC can be leveraged in interactive live streaming (or streaming at all):

  1. Use WebRTC’s data channel as a replacement for HTTP(S) to send video packets
    • Theoretically, this should be faster than HTTP and enables optimization to buffering
    • No one has taken that route yet as far as I can tell
  2. Build a kind of P2P CDN on top of WebRTC’s data channel
    • Think BitTorrent inside the browser
    • Peer5 and a few other vendors are doing just that
  3. Use WebRTC in its full glory – voice and video channels opened and streamed
    • Acquire the original live stream using WebRTC or some other mechanism, and then use WebRTC to connect the viewers via a VOD like architecture to the broadcast
    • Probably the most wasteful of all approaches
    • And the one I am guessing Beam is currently employing
  4. Optimize on (3) to offer something akin to a Flash/HLS streamer
    • Handle multiple bitrates and resolutions
    • Be able to get high density of streams in a single machine

Options (1) and (2) require knowledge of networking.

Option (2) requires knowledge of P2P networks.

Option (3) requires WebRTC knowledge at its basic level.

Option (4) means you practically implement a WebRTC stack of your own with a focus on live streaming.

My guess is that with time, we will see vendors implementing options (2) and (4) which will be the winning architectures for live streaming.

Option (2) will be deployed to support today’s use cases, while option (4) will be deployed to support future use cases, where interactivity between viewer and broadcaster is important.
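Options (1) and (2) both come down to moving video segments over WebRTC’s data channel. Data channel messages are best kept small, and an unordered channel may deliver them out of order, so a segment has to be split on the sending side and reassembled on the receiving side. Here is a minimal sketch of that idea. The 16 KiB payload size is an assumed conservative value and the 8-byte header is a made-up format for illustration; this is not Beam’s or Peer5’s actual protocol.

```typescript
const CHUNK_PAYLOAD_BYTES = 16 * 1024; // assumed conservative message size
const HEADER_BYTES = 8; // made-up header: 4 bytes chunk index + 4 bytes chunk count

// Split one video segment into messages small enough for a data channel.
function splitSegment(
  segment: Uint8Array,
  payloadSize: number = CHUNK_PAYLOAD_BYTES
): Uint8Array[] {
  const total = Math.max(1, Math.ceil(segment.length / payloadSize));
  const chunks: Uint8Array[] = [];
  for (let index = 0; index < total; index++) {
    const payload = segment.subarray(index * payloadSize, (index + 1) * payloadSize);
    const chunk = new Uint8Array(HEADER_BYTES + payload.length);
    const view = new DataView(chunk.buffer);
    view.setUint32(0, index);
    view.setUint32(4, total);
    chunk.set(payload, HEADER_BYTES);
    chunks.push(chunk);
  }
  return chunks;
}

// Rebuild the segment from chunks received in any order.
// Returns null while chunks are still missing.
function reassembleSegment(chunks: Uint8Array[]): Uint8Array | null {
  const parts: Uint8Array[] = []; // indexed by chunk number, may be sparse
  let total = -1;
  let received = 0;
  let length = 0;
  for (const chunk of chunks) {
    const view = new DataView(chunk.buffer, chunk.byteOffset, chunk.byteLength);
    const index = view.getUint32(0);
    total = view.getUint32(4);
    if (parts[index] === undefined) {
      parts[index] = chunk.subarray(HEADER_BYTES);
      received++;
      length += chunk.length - HEADER_BYTES;
    }
  }
  if (received !== total) return null;
  const segment = new Uint8Array(length);
  let offset = 0;
  for (let index = 0; index < total; index++) {
    const part = parts[index];
    if (part === undefined) return null;
    segment.set(part, offset);
    offset += part.length;
  }
  return segment;
}
```

On the sending side each chunk would go out through `RTCDataChannel.send()`. The receiver collects incoming messages per segment and hands the result of `reassembleSegment` to the player once it stops returning null.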

Beam took the right challenge on itself. That got it acquired in a short timespan, and in a way redefined live streaming and low latency.

For Microsoft, this is yet another acquisition in the WebRTC space, and another area in which it now relies on this technology – even without supporting it on IE.



The post Microsoft Acquires Beam, Showing the Value of WebRTC to Interactive Live Streaming appeared first on

WebRTC Plugin? An Electron WebRTC app is the only viable fallback

Mon, 08/08/2016 - 12:00

I was meaning to write something about Skype, Linux and WebRTC. But never got around to it. Until now.

The reason why I decided to write about it eventually? This tweet by Alex:

IMTC (Microsoft, Cisco, polycom, unify, sonus, …) to provide free (no cost) and free (do what you want) webrtc plugin for I.E. And Safari.

— Dr. Alex. Gouaillard (@agouaillard) August 3, 2016

Hmm. The IMTC is planning to offer a FREE plugin for IE and Safari.

Sounds like Temasys, and from the person who worked at Temasys at the time of releasing their plugin – now a commercial one rather than a free offering.

While some like this plugin, others don’t. They tried it and decided that the warning messages it pops up when being installed aren’t worth the effort.

The Electron WebRTC app approach

What did catch my eye was the Skype for Linux announcement. This is an alpha release of the Skype app for Linux – something that Microsoft has been neglecting for quite some time now.

The interesting bit isn’t that Microsoft is actively investing in a Linux version for Skype and acknowledging this part of the user base, but rather how they did that and the stance they have.

Here are a few lines from the announcement on the Skype community site:

The new version of Skype for Linux is a brand new client using WebRTC, the launch of which ensures we can continue to support our Linux users in the years to come.

[…] you’ll be using the latest, fastest and most responsive Skype UI, so you can share files, photos, videos and a whole new range of new emoticons with your friends.

The highlighted text is my own addition.

Here are my thoughts:

  • This is implemented on top of WebRTC and not ORTC. In a way, we’ve gone full circle with Microsoft – from ORTC, to adding WebRTC support in Edge to using WebRTC to develop their own products where needed
  • Microsoft gives the best reasoning behind using WebRTC in its own development: to ensure continued support for Linux
    • For the most part, using WebRTC equates to better support for more devices and platforms than any other technology out there today
    • Yes. You still need to put some effort into getting it working on some platforms – but with a lot less of a hassle than any other technology and at a lower cost
  • Responsive Skype UI = HTML5. So there’s some browser engine / rendering engine for HTML in there somewhere
  • Latest and fastest…

It turns out Microsoft decided to use Electron.

What is Electron? It is a framework around Chromium that can be used to create desktop apps from web apps. And it is the most popular platform for doing it these days.

The irony.

Microsoft. Who owns, develops and promotes IE and Edge. Who was against WebRTC and for ORTC. That Microsoft used Chromium (effectively Chrome) to bring its Linux Skype app to market.

A few years ago, that would have been unheard of. Today? It makes too much sense – it actually increased the value of Microsoft in my eyes. Making the most practical decision of all and putting the ego aside.

Back to a WebRTC Plugin


The IMTC is now investing its time and effort in a WebRTC plugin. Call me a skeptic, but I can’t see this heading in the right direction.

Here’s why:

  • The IMTC is an interoperability group. Its strength lies in getting multiple vendors into the same room and having them test their products against each other. “their products” being products that follow the same specification and end up being deployed in the same network and service
  • Companies put their money into the IMTC to get access to these testing services
  • The problem with WebRTC and the IMTC is that WebRTC doesn’t really require interoperability per se – besides that between browser vendors. And browser vendors aren’t exactly the type of audience the IMTC caters for. To be exact, Microsoft is the only browser vendor who is part of the IMTC – and that’s probably for their Skype for Business product and not Edge or IE
  • Writing and maintaining a WebRTC plugin is hard work. It gets updated too frequently to be considered a one-time effort, so maintaining it comes at a cost – a type of cost that is new to the IMTC and its member companies

I believe it will be hard for the IMTC to maintain such a plugin on their own, and if the idea is to open source it to the larger community so the external community can take it up and continue to work and maintain it for the IMTC then that’s just wishful thinking. Open source projects are not synonymous with community development – they don’t all get picked up, adopted, used and maintained by the masses. The webrtc-everywhere project on github shows that – 2 contributors, a few forks, but not much of a collaboration or community around it.

Since the IMTC is a group of vendors who all seek reaching interoperability of the spec while maintaining a technical advantage on the rest of the vendors (I was there once), I can’t see them cooperating for a long term development of such a thing and putting the resources into it while contributing back to the community.

Furthermore, do we really need a WebRTC plugin?

Yes. I know. Safari. Important. IE. All those poor enterprise guys forced to use it. You can’t live without it and such.

But guess what? That same target market? How receptive do you think it will be for a plugin? What will be the install rate and usage rate for a plugin in such environments?

I have a warm place in my heart for the IMTC, but I think it is losing its way when it comes to WebRTC. I can’t see how a free plugin for WebRTC today will make a change. There are better things to focus on.

What to do in 2016 with WebRTC on IE/Safari?

There are two use cases here:

  1. I need to use the service daily
  2. I just want to get on a URL and do whatever needs to be done (call a doctor for example)

The first one can be solved with an installed PC app. A quaint choice maybe, but one which seems to be popular with comms vendors who started from the web. Think Slack or even Whatsapp – they both have a PC app. If you are using a service daily, the idea goes, you might as well just have it somewhere handy in the background of your PC instead of having to have it opened in a browser tab all the time.

The second one is where things get nasty. Asking for a plugin installation for it is just like asking for an app installation for it. Maybe worse if the installer of the plugin comes with a large set of browser warnings (because browsers now hate plugins). So you might just rethink the app option – or just ask the user to come back with a better browser.

My suggestion?

Explore the option of using Electron instead of a plugin.
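The decision between the two use cases above can be sketched as a small feature check. This is an illustration only: the `ClientChoice` labels and the function names are made up here, and the prefixed constructor names are the ones browsers exposed around 2016.

```typescript
type Usage = "daily" | "one-off";
type ClientChoice = "browser" | "desktop-app" | "suggest-better-browser-or-app";

// Detect native WebRTC support by looking for a peer connection constructor
// on the global object (standard name plus the old vendor-prefixed ones).
function supportsWebRTC(globalScope: Record<string, unknown>): boolean {
  const names = ["RTCPeerConnection", "webkitRTCPeerConnection", "mozRTCPeerConnection"];
  return names.some((name) => typeof globalScope[name] === "function");
}

// Pick a client strategy: stay in the browser when WebRTC is available,
// otherwise offer an installed (for example Electron based) app to daily
// users and point one-off visitors to a better browser or the app.
function chooseClient(globalScope: Record<string, unknown>, usage: Usage): ClientChoice {
  if (supportsWebRTC(globalScope)) return "browser";
  return usage === "daily" ? "desktop-app" : "suggest-better-browser-or-app";
}
```

In a page you would call it with the real global object, for example `chooseClient(window as unknown as Record<string, unknown>, "daily")`, and show the matching download or "please switch browser" prompt.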



The post WebRTC Plugin? An Electron WebRTC app is the only viable fallback appeared first on

Surprise: Free Video Calling is no Guarantee for Success (or Adoption)

Mon, 08/01/2016 - 12:00

Guess what? Mozilla is removing Hello from Firefox.

It will still be available as an add-on, but it seems to have degraded in its importance to Mozilla, which is understandable.

Goodbye Hello

What is/was Hello?

Hello was Mozilla’s attempt to build a video calling service. Something that is baked right into the browser, but can be used by any browser supporting WebRTC. Think FaceTime or Hangouts but without the app or even a website.

Mozilla partnered for Hello with TokBox (a Telefonica company), which provided the backend to the service – mainly NAT traversal as far as I can tell.

When Hello was announced, I had my doubts and questions about it.

What went wrong?

A few things were wrong from the onset in Firefox Hello:

  1. While it debuted on a desktop browser, its main purpose was mobile. The problem is that Firefox OS got scrapped/pivoted, leaving Hello with no real use
  2. It came at a low point in Mozilla’s history. Mozilla partnered during 2014 with 3 vendors, trying to reduce Google’s hold on it: Yahoo, Cisco and Telefonica
    • Yahoo is all but dead – it just got acquired by Verizon
    • Telefonica needed Firefox OS on mobile, and now that that hasn’t matured, my guess is that its interests lie elsewhere these days, so having Telefonica/TokBox as part of Hello probably isn’t helping too much today
    • Cisco only wanted to protect its H.264 investments, which it succeeded in doing
    • This cost Mozilla in focus and diluted its brand from being a pure open alternative
  3. Firefox has no real network effect or user base to rely on. It doesn’t connect users to one another but rather it connects viewers to web pages. Having hundreds of millions of viewers doesn’t equate to monthly active users for a personal communication tool that is baked into the same product
  4. Hello was simple, but offered nothing interesting/innovative/new/needed. People who used apps continued to use apps. Those that wanted to meet over URLs used URLs. Having the button in the browser wasn’t enough to make people leap for the opportunity to use it
  5. While available in all WebRTC supporting browsers (=Chrome & Firefox), it was really a Firefox thing. This limited the user base, and especially the ability to start or to really receive a call over a mobile device

The main issue though is that a free video calling service isn’t that much of a deal these days (if this surprises you – just ask Google).

So Mozilla started by embedding Hello right into the browser. Then making it into a system add-on. And now it is making it into just another add-on. I assume it has a lot to do with the usage they’ve seen over the past year for Hello (and its non-adoption). It makes no sense to continue investing the time and effort in it if no one is using it – and having it officially released with the browser once every few months is a waste. Better throw it out of the browser and simplify the browser releases.

The next step might be to sunset the add-on/service altogether and say goodbye to Hello.

Is this predictive of Google’s Duo app?

Google announced Duo and is about to release it. Simplifying things a bit (and dumbing it down), Duo is a FaceTime clone. I covered Allo/Duo a few months back.

On face value, there’s no reason why Google Duo won’t meet a similar fate as Mozilla Hello.

That said, there are a few notable differences:

  • Duo is a mobile only app, whereas Hello focused on desktop browsers
  • Duo will probably be released on Android and iOS, covering 100% of the mobile market from day one
  • Google has a large user base on Android and the ability to get Duo in front of users. It also has the social graph of these people – via the phone’s address book
  • While Google kept Duo simple, it did bake two features into it:
    • Speed of connectivity, taking it to the extreme by adding QUIC into the mix
    • Caller’s video sent even before you accept the call

Will this be enough for Google Duo to get the adoption? I don’t know.

Where do we go from here?

In 2016 there should be no doubt anymore:

If you plan to monetize a video calling service, you need a serious business plan.

Most services I see launched have no business plan. They attempt to grow to millions of users. There’s a lot of dumb luck involved in it.

I’ve had my doubts about the viability of Wire as a company due to the same reasons. The only progress made by Wire is open sourcing their app – this doesn’t strike me as a business plan or a signal of strength and healthy growth.



The post Surprise: Free Video Calling is no Guarantee for Success (or Adoption) appeared first on

VP9 Hardware Acceleration is Real

Mon, 06/20/2016 - 12:00

Hardware acceleration for video codecs is almost mandatory.

VP9 is getting a performance boost

There are three things that keep VP8 in the game when compared to H.264:

  1. It was the only video codec in Chrome for WebRTC in the last 5 years, giving it a headstart in deployments
  2. H.264, while available in mobile chipsets, isn’t always accessible to the developer (nor does it always work as it should when it is accessible)
  3. VP8 and H.264 are rather old now, so software implementations of them are quite decent


With VP9, the main worry was that it would be left behind and not get the love and attention from chipset vendors – leading it to the same fate as VP8 – abysmal, if any, hardware acceleration support. It is probably why Google went to great lengths to get it running on YouTube so soon and is publicizing its stats all the time.

This worry is now rather behind us. Recent signs show some serious adoption from the companies that we should really care about:

#1 – ARM


Without checking stats, I’d say that 99% or more of all smartphones sold in the past 5 years are based on ARM.

If and when ARM decides to support a feature directly, that brings said feature very close to world domination in future smartphones.

Which is somewhat what happened last week – ARM announced its Mali Egil Video Processor with VP9 acceleration.

Here’s a deck they shared:

ARM Mali "Egil" technical preview from Phil Hughes

Being farther away from chipsets than I was 5 years ago, it is hard for me to say if this is an integral part of an ARM processor, but I believe that it isn’t. It is an add-on component that takes care of video processing that chipset vendors add next to their ARM core. They can source the design from ARM or other suppliers – or they can develop their own.

Not sure how popular the ARM alternative is for video processing, but they have the advantage of being the first alternative for any chipset vendor (hell – they already source the ARM core itself, so why not bundle?). Which also means every other vendor needs to match up to their feature set – and improve on it.

Now that VP9 encode/decode capabilities are front and center in the ARM Mali Egil, it has become a mandatory checkmark for everyone else as well.

#2 – Intel

If ARM is the king of mobile, then Intel rules the desktop.

As with ARM, I haven’t been following up on Intel CPU acceleration lately. And as with ARM, it was Fippo who got my attention with this link here: the new Intel Media SDK.

For those who don’t know, Intel is providing several interesting software packages that make direct use of its chipset capabilities, especially when it comes to optimizing different types of workloads. The Intel IPP and Media SDKs handle media related processing, and are quite popular with low level developers who need access to such facilities.

From the release page itself:

With this release we are happy to announce new full hardware accelerated support for HEVC and VP9.

  • HEVC Main 10 (10-bit) encoder and decoder support
  • VP9 8-bit and 10-bit decoder support

So… HEVC (=H.265) has encode and decode while VP9 only has decode support.

Probably because HEVC has been in the works for a lot longer than VP9, but there’s hope still.

#3 – Alliance of Open Media

The Alliance of Open Media. I’ve published a recent update on the alliance.

Intel was there from the start. The recent additions include ARM, AMD and NVIDIA.

I am sure additional chipset vendors will be joining in the coming months – there seems to be a ramp up in memberships there, with Ateme and Adobe added to their logos just last week.

While the alliance is about what comes after VP9, it is easy to see how these vendors may sway to using VP9 in the interim.

The Future

The future is most definitely one of royalty free video codecs. We’ve got there with voice, now that we have OPUS (though Speex and SILK were there before to pave the way). We will get there with video as well.

Coding technologies need to be accessible and available to everyone – freely – if we are to achieve Benedict Evans’ latest claims: Video is the new HTML. But for that, I’ll need another post.



The post VP9 Hardware Acceleration is Real appeared first on

Will Microsoft’s Acquisition of LinkedIn Change the WebRTC Landscape?

Tue, 06/14/2016 - 12:00

It’s good to have Fippo when there’s lack of ideas in your head.

While there are synergies abound, a flawless execution is necessary

Yap. Fippo again prodded me about a topic, so here comes the post for it.

If you missed it, yesterday Microsoft acquired LinkedIn. $26.2B.

In some ways, Microsoft now rules the enterprise space – communication, collaboration and creation:

  • Microsoft Office suite (Excel, PowerPoint and Word as the main pillars)
  • Microsoft Outlook and the Exchange server (Email)
  • Yammer (Enterprise communications)
  • Skype (Voice and video communications)
  • LinkedIn (User identities and profiles)

Dean Bubley puts it nicely:

The @microsoft / @linkedin deal has nailed enterprise comms federation. Complete map of who knows whom. Add Skype4B & goodbye telephony

— Dean Bubley (@disruptivedean) June 13, 2016

There’s a longform here, but I am less convinced.

I am more inclined to how Radio Free Mobile sees this:

However, for all of this to work, LinkedIn’s systems and data has to become deeply integrated with those of Microsoft which with the companies remaining independent, will be orders of magnitude more difficult.

Microsoft of late has an issue with the ability to execute and follow through.

Skype, while huge, isn’t growing since Microsoft’s acquisition. It is actually letting others take its place.

Same with Yammer. Have you heard anything about it in the last few years? The news is all about Slack, and worse still – it is about how Atlassian’s HipChat is struggling because of Slack – Yammer isn’t even mentioned as a competitor/contender in this space.

Which brings us to LinkedIn, Microsoft’s intents for it and its ability and willingness to follow through.

Back to LinkedIn

I wrote about LinkedIn exactly a year ago. It was about their acquisition at the time of Lynda, a learning company, and me griping on why LinkedIn isn’t doing anything about comms (and WebRTC).

The people at LinkedIn aren’t stupid. They are $26.2B smarter than I am. And frankly, that’s also $17.7B smarter than Skype.

What does that tell us?

  • LinkedIn saw no real value in real time communications
    • Not enough to invest in it and build something with WebRTC
    • Not enough to acquire someone outright
    • Not enough to partner and integrate someone like Skype (Facebook did that in the past for example)
  • That decision played well for LinkedIn – they just got acquired
  • Messaging isn’t that important to LinkedIn either
    • They have rudimentary messaging capability in their platform
    • But it is lacking in so many ways that it is hard to enumerate them
    • And you can’t call its messaging anything similar to… messaging. It feels more like email

If LinkedIn can’t find value in real time communications for its platform on its own, can Microsoft do a better job at it?

I don’t know.

Now let’s look at the Microsoft assets that can be integrated with LinkedIn.

Skype and LinkedIn

As Dean suggested, there is some synergy in Skype connecting to LinkedIn.

LinkedIn can slap a Skype button on its profiles, making it easy to connect to the people you’re connected with on LinkedIn.

While that’s great, most communication today happens OUTSIDE of LinkedIn. You reach out to people on it, connect with them, and then shift to email and other means of communications. Especially once you know a person to some extent.

To make a point – I wouldn’t send a message to Dean over LinkedIn – I’ll make it over email. Or just ping him on Skype, because that’s where he is.

When someone asks me for an introduction, it usually goes like this: “I saw you are connected to John Doe on LinkedIn. Can you send an intro email for me?”. It happens a lot less on LinkedIn even when it is driven from LinkedIn.

Getting the communication back to LinkedIn will be hard. Getting slightly more communications from LinkedIn directly to Skype is possible, though I am not sure it will be widely accepted.

Yammer and LinkedIn

Yammer isn’t best of breed in enterprise messaging. Not even sure if doing anything with it and LinkedIn is worth the effort.

My suggestion is to open the coffers and take out a few more billions of dollars and acquire Slack. Then throw out all voice integrations and bolt Skype in there. But that has nothing to do with LinkedIn.

Outlook/Exchange and LinkedIn

Email is what drives LinkedIn in the most effective way.

Having the ability to embed and merge profiles properly into Outlook – without any ugly add-ons – that’s great.

But nothing earth shattering that we haven’t seen before with Rapportive on Gmail.

Office and LinkedIn

I guess that having a tighter integration between PowerPoint and Slideshare would be great. But that isn’t the reason LinkedIn was acquired.

Sarah Perez of TechCrunch wrote about the integration of Office and LinkedIn. It includes Outlook. Focuses on Outlook.

And mostly goes one-way: how LinkedIn can enrich Office/Outlook related information. A bit on how Office can enrich LinkedIn data by adding more users. But nothing about how LinkedIn’s functionality can grow. A shame.

If this is where things are headed – growing Office but not growing LinkedIn, then I am afraid LinkedIn is expecting a similar fate to Yammer and Skype. Its days of greatness will be behind it and its level of innovation and introduction of powerful features that can compete in the market – will come to an end.

Other Domains

Cortana and Microsoft’s CRM are areas I missed. You can read more about them in Richard’s analysis on Radio Free Mobile.

The Corporate Structure

It seems that LinkedIn will sit as an independent entity within Microsoft under Satya Nadella directly.

I wonder how that will make things easy for the tight integrations envisioned for LinkedIn and the rest of Microsoft’s assets. How easy will it be to get the Skype team to cooperate and assist the LinkedIn team to integrate Skype for Web? What will the Office team want in return for the data they will be passing to LinkedIn? Will legal even authorize it?

There will be a lot of coordination taking place here, and I do hope that along the way, they won’t lose sight of what needs to be done – there are a lot of synergies and power here, but this will require a lot of agility from a huge company.

Back to WebRTC

This affects larger players in the UC space. If (and that’s a big if) Microsoft can connect the dots of Office, Exchange, Skype and LinkedIn – this makes for a very compelling offering. One that can differentiate and top Cisco and Google.

If Microsoft can make LinkedIn into the congregation point of people across enterprises – and not only a place to find CVs – it will be in a position to expand its offering towards real time communications in ways that others will find hard to compete against. LinkedIn lacked this vision. I wonder if Microsoft can follow through – or whether they too will see it as unnecessary.



The post Will Microsoft’s Acquisition of LinkedIn Change the WebRTC Landscape? appeared first on

The Alliance of Open Media – 10 Months in

Thu, 06/09/2016 - 12:00

How time flies.

About 10 months ago, the announcement of the creation of a new alliance caught me off guard.

Somehow, Google, Microsoft and a few other companies put their differences aside and decided to create the Alliance of Open Media. The intent – create a royalty free video codec to rival H.265/HEVC. I’ve written about the Alliance of Open Media. It is time to revisit the topic.

A few things happened these last few months that are worth mentioning:

  1. We’ve learned more about the alliance – Jan Ozer  wrote a good progress report
  2. AMD, ARM and Nvidia joined the alliance
  3. Ittiam joined the alliance
  4. Vidyo joined the alliance

I am told work is being done on the actual codec itself. From the report Jan Ozer wrote, the following is apparent:

  • Baseline for the codec is VP10 (Google)
  • Most contributions of technologies on top of it come from Mozilla and Cisco; though I assume Microsoft is contributing there as well
  • Hardware vendors are putting their weight to make sure the algorithms used are easy to place in a hardware design
  • There’s a focus on GPU acceleration, which is important
  • Intent is to have it integrated into a browser by the beginning of 2017 and have hardware acceleration a year later

All the right moves.

ARM and Nvidia

Adding ARM and Nvidia is quite a catch.

ARM is in charge of the architecture of most smartphones on the market today, along with many of the IOT devices out there. Having them on board means that the needs of mobile and low power devices are taken into consideration by the alliance – but also that the work of the alliance will find its way into future designs of ARM.

Nvidia is where you find GPU processing power. They complement the presence of Intel, bringing the important GPU players to the table. In a recent whitepaper I’ve written for Surf, I touched on the GPU issue briefly. I’ve done some research in that domain, and it does seem like the GPU is the best candidate to handle our future video coding – having GPUs relevant to this next generation codec from the start is an important catch for the alliance.


Ittiam is a recent addition to the alliance.

I’ve had the chance to know Ittiam a decade ago, while competing head to head with their VoIP software. They have expertise in the multimedia space and in video compression, but they still are the smallest (or least relevant) player in this alliance. Having them is required to fill in the ranks and grow in numbers.

It would be nice to see others join such as Imagination Technologies (who are larger and a lot more meaningful).


Vidyo just joined the alliance. On one hand, it surprised me. On the other hand, it shouldn’t have.

Vidyo has been collaborating with Google for a long time now on VPx and WebRTC. Recently it reiterated that with the work it is doing on VP9 SVC for WebRTC (you can find out more about it in a guest post Alex Eleftheriadis shared here on scalability and VP9).

Their addition to the alliance means several things:

  • Vidyo is making itself an integral part of every initiative related to future video codecs. This is a smart move, as it maintains its lead in the backend side and the smarts that is placed on top of SVC capabilities
  • This future codec will have SVC support in it, hopefully from the moment it is released to market
  • While a smaller company compared to the other members, the contribution of Vidyo to the alliance can be larger than many others of its members

Qualcomm is missing.

So is Samsung.

And a few other smaller mobile chipset vendors.

I think it is their loss, as well as a missed opportunity.

They both should have joined the alliance at its inception.


Apple being Apple, they aren’t a part of it. Putting ads in the App Store and changing subscription revenue sharing models were more important to them, which is understandable.

The thing I don’t understand here is that Apple has removed most of its support in H.265. What does it have to lose by joining the alliance?

There are three paths available to Apple:

  1. Go with H.265. The current reduction in its support of H.265 can only be explained as a negotiation tactic in such a case
  2. Go with the Alliance of Open Media. Which it could do at any point in time. But if that is the case, then why wait?
  3. Release its own unique iCodec. Apple knows best, and it is time to lock its customers a bit further anyways

I wonder which route they are taking here.

Content Creators and Service Providers

We’ve got YouTube, Netflix and Amazon already covered. The internet may rejoice.

But what about Game of Thrones? Or the next movie blockbuster? Are they staying on the route of H.265 or will they veer away from it towards the alliance?

Hard to tell, though for the life of me, I can’t understand a long term decision of staying with H.265.

It would be nice to see the large studios and even Bollywood join the alliance – or at the very least back it publicly.


If we look at the VP9 timeline, we have the following estimates:

  • 1 year – Chrome decoding, along with a small percentage of YouTube videos supported
  • 2 years – First chipsets and reference designs support. My bet is on Nvidia and Intel here
  • 2.5 years – Chrome official support of it for WebRTC

H.264 in WebRTC

H.264 is here to stay. More worrying – H.264 will grow in popularity in WebRTC services during 2016.

This progress and success of the alliance changes nothing in the current ecosystem and the current video technology.

The future of H.265

The future of H.265 does look grim. I do hope the alliance will kill it.

H.265 is on a collision course with VP9. It is still the more “popular” choice in legacy businesses, but that may change, as commercial deployments of it are small or non-existent.

The alliance simply means that a future codec is based on the VPx line of codecs instead of the H.26x ones. Now developers shifting from H.264 to a better codec will need to decide if they switch codec lines now or just later.

The royalty issues around H.265 along with the progress made in the alliance should tip the scales towards VP9 on this one.

What’s next?

Money time.

Where does that leave us all?

  • Vendors who handle codecs directly should join the alliance. The benefits outweigh the risks.
  • Consumers and users can continue not caring
  • Developers, especially those of backend media servers, need to decide if they shift towards VP9 or wait for the next generation to switch to a royalty free codec. They also need to decide if they want to use VP8 or H.264 today



The post The Alliance of Open Media – 10 Months in appeared first on

4 Reasons to Choose H.264 for your WebRTC Service (or why H.264 Just won over VP8)

Mon, 05/30/2016 - 12:00

H.264 is set to replace VP8 for WebRTC services.

You can thank Fippo for making me write this one.

Microsoft ended last week with an announcement of sorts on their Edge dev blog, indicating that H.264/AVC support for ORTC is now available in Edge.

  • Yes. It is ORTC and not WebRTC
  • Yes. It is only behind a runtime flag
  • Yes. It is only on Edge. No IE

But then again, it is the only way today (or at least tomorrow) to get a video call running cross browser between Firefox, Chrome and Edge. VP8 or VP9 gets you as far as Chrome and Firefox.

Which got me to this one over here. Edge support for H.264 in ORTC isn’t much. It isn’t even interesting in the bigger scheme of things (Edge has literally no market share compared to the other browsers, so why bother with it?). And still it marks a turning point – one in which we can all ask ourselves what video codec should we be leaning towards if we started developing a product that uses WebRTC today?

Last year, the answer would have been “VP8”.

A few months ago, it was, “it depends”.

Today, it will lean towards “H.264, unless you must use VP8”.

Here are 4 reasons why this is happening:

#1 – Browser interop baseline

If you want your service to get the most coverage on as many browsers as possible and you need video, then H.264 is the way to go. In a few months, H.264 will get official support by all of these vendors and that will be the end of it. Furthermore, you can expect Apple to use H.264 first and contemplate VP8 – same as Microsoft is doing now with Edge.
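
To make the interop argument concrete, here is a minimal Python sketch that intersects per-browser codec sets. The table reflects only what this post claims (VP8 and VP9 reach Chrome and Firefox; H.264 reaches Chrome, Firefox and Edge) and is an illustration, not a live capability query. The names `BROWSER_VIDEO_CODECS` and `common_codecs` are made up for this example.

```python
# Illustrative table based on the claims in this post, not queried from real browsers.
BROWSER_VIDEO_CODECS = {
    "Chrome": {"VP8", "VP9", "H.264"},
    "Firefox": {"VP8", "VP9", "H.264"},
    "Edge": {"H.264"},  # via ORTC, behind a runtime flag
}


def common_codecs(browsers):
    """Return the video codecs that every listed browser can handle."""
    codec_sets = [BROWSER_VIDEO_CODECS[name] for name in browsers]
    return set.intersection(*codec_sets) if codec_sets else set()


print(sorted(common_codecs(["Chrome", "Firefox"])))
print(sorted(common_codecs(["Chrome", "Firefox", "Edge"])))
```

With Edge in the list, the only codec left in the intersection is H.264, which is the whole point of the baseline argument.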

#2 – Mobile

Mobile devices like H.264 more than they like VP8. Video codecs take up a lot of resources. To overcome this, mobile handsets use hardware acceleration for video codecs. They all have H.264 video acceleration (though you can’t always gain access to it as a developer). Many of them don’t even know how to spell VP8. This boils down to WebRTC implementations on mobile needing to implement VP8 using software.

Some developers ended up replacing VP8 with H.264 on mobile just because of this reason. Especially for mobile only products.

While I am sure support for VP8 is improving in new chipsets, there’s this pesky issue of supporting the billion and more devices that are already out there. And now that all browsers support H.264 in one way or another, what incentive do developers needing to support mobile apps have to use VP8?

#3 – Legacy video systems

All them video conferencing systems? They use H.264. Most don’t have VP8. Not even in their latest released products. The way they end up supporting WebRTC until today is via a specialized gateway, on the MCU or not at all.

Transcoding was one of the main barriers to getting WebRTC to legacy video systems. It just costs a lot. It would have been easier to just go H.264 all the way. Which is what is now available.

It is one of the reasons why Cisco first worked on Firefox with Spark. It made a decision to use H.264 for WebRTC instead of transcoding from VP8.

#4 – Streaming

Over 60% of the Internet traffic is video. Most of it isn’t real time video, but rather the YouTube or Netflix kind. Passive consumption.

Video streaming today is predominantly H.264 based, and at times VP9 (=YouTube whenever possible).

To get video content on an iPhone device, HLS is required, and that again means H.264.

So again we are left with the alternative of either transcoding our WebRTC generated content to H.264 when we want to stream it out – or to create it using H.264 to begin with.

Do you even care?

If your service is a 1:1 calling service with no server side media processing, then you shouldn’t even care. In such a case, whatever the browsers end up negotiating will be good enough for you (and most probably the best alternative for that specific situation).
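
If you are curious what the browsers ended up agreeing on, the answer SDP tells you. Below is a rough Python sketch that lists the video codecs in the order the `m=video` line prefers them, using the standard `a=rtpmap` attribute. The sample SDP is hypothetical and heavily trimmed, and `video_codecs_in_order` is a name invented for this example; a real session description carries many more lines.

```python
import re

# Hypothetical, heavily trimmed SDP answer used only for illustration.
SAMPLE_ANSWER = """\
v=0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
m=video 9 UDP/TLS/RTP/SAVPF 100 96
a=rtpmap:96 VP8/90000
a=rtpmap:100 H264/90000
"""


def video_codecs_in_order(sdp):
    """List video codec names in the order the video m= line prefers them."""
    payload_types = []
    names = {}
    in_video = False
    for line in sdp.splitlines():
        if line.startswith("m="):
            in_video = line.startswith("m=video")
            if in_video:
                # m=video <port> <proto> <payload type> <payload type> ...
                payload_types = line.split()[3:]
        elif in_video:
            match = re.match(r"a=rtpmap:(\d+) ([^/]+)/", line)
            if match:
                names[match.group(1)] = match.group(2)
    return [names[pt] for pt in payload_types if pt in names]


print(video_codecs_in_order(SAMPLE_ANSWER))
```

The first entry is usually the codec that ends up on the wire, so for a pure peer-to-peer service this is mostly a debugging aid rather than something to act on.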

Those who invested in server side media processing, be it recording, mixing, routing –  have made investments that are targeted at VP8. Modifying these to work with H.264 as well may not be trivial. For them, the decision of switching to H.264 is a harder one to make, but one that needs to be addressed.

The Future of Video Coding in WebRTC

Once we step into the future, we see VP9. And the SVC flavor of VP9.

And then there’s the Alliance of Open Media and the work they are doing towards a widely accepted next gen royalty free video codec. I’ve touched on the progress they are making in my recent Virtual Coffee session.

For the record, I rather hate H.264 and what it stands for. But now I must accept that it is here to stay and grow with WebRTC.



The post 4 Reasons to Choose H.264 for your WebRTC Service (or why H.264 Just won over VP8) appeared first on

NUBOMEDIA: the first open source WebRTC PaaS

Wed, 05/25/2016 - 12:00

[Luis Lopez is the face in front of Kurento, one of the popular open source media servers that can handle WebRTC. He wanted to share here the story of the new open source WebRTC PaaS – NUBOMEDIA]

When I first heard about WebRTC in 2011, I was fascinated by the idea of standardized APIs and protocols enabling the creation of interoperable RTC applications for the Web. However, I noticed very soon that my peer-to-peer services were too limited and that, as a developer, I was hungry for further features that could only be provided by a WebRTC infrastructure. This is why I got involved in the Kurento project for creating a media server. Kurento got nice traction but, as it was maturing, we found an increasing number of feature requests related to its scalability. The message was quite clear: a cloudification of Kurento was necessary.

With this in mind, in 2014 we got down to work and, with the financial support of the European Commission, we worked hard for a couple of years in cooperation with some of the most remarkable cloud experts around Europe. The effort was worthwhile: NUBOMEDIA, the first open source WebRTC PaaS, is now a reality.

NUBOMEDIA: the first WebRTC PaaS

In the WebRTC ecosystem, scalable clouds for developers are not new. Providers such as Tokbox, Kandy, Twilio and many others offer them. These solutions are commonly called “WebRTC API PaaS”, “WebRTC Cloud APIs”, or just “Cloud APIs” as they expose a number of WebRTC capabilities through custom APIs that exhibit all the nice “-ilities” of cloud services (i.e. scalability, security, reliability, etc.)

For NUBOMEDIA we also considered this “Cloud API” concept as a solution. However, although APIs are the main building block developers use for creating applications, applications are more than just a set of API calls. After analyzing WebRTC developers’ needs, we found the concept of a platform more appealing than the concept of an API. A platform is more than an API in the sense that it provides all the required facilities for executing applications. These typically include an operating system, some programming-language-specific runtime environments and some service APIs. The cloud version of a platform is commonly called a PaaS, which is (literally) a platform that is offered “as a Service”.

There are many such PaaSes in the market including Heroku, the Google App Engine or AWS Elastic Beanstalk. All of them give developers the ability to upload, deploy, execute and manage applications written in different programming languages. These PaaS services are quite convenient as they let developers concentrate on creating their applications’ logic while all the complex aspects of provisioning, scaling and securing them are assumed by the PaaS. In spite of the wide range of PaaS services, we noticed that most common PaaS providers did not expose WebRTC capabilities as part of their APIs. Hence, WebRTC developers were not able to enjoy all the advantages of full PaaSes.

The main difference between a WebRTC cloud API and a full WebRTC PaaS is illustrated in the following figure. As can be observed, WebRTC Cloud API providers (left) do not host developers’ applications, but just expose some WebRTC capabilities through a network API that applications consume. On the other hand, full WebRTC PaaSes host applications and take responsibility for executing, scaling and managing them.

Based on these ideas, the NUBOMEDIA idea emerged clearly: instead of evolving Kurento into a cloud API we should rather create a full PaaS out of it, so that developers could enjoy the nice features of PaaSes (i.e. application deployment, execution, scaling, etc.) while consuming the Kurento APIs in a scalable and secure way.

Why NUBOMEDIA may be interesting for you

NUBOMEDIA is now a reality and it can be enjoyed openly by developers worldwide. Like solutions such as OpenShift, Cloud Foundry or Apprenda, NUBOMEDIA is a private PaaS in the sense that it consists of an open source software stack that can be downloaded, installed and executed on top of any OpenStack IaaS cloud.

If you are a developer, you may be interested in trying NUBOMEDIA for your next application as it combines the simplicity and ease of development of WebRTC Cloud APIs with the flexibility of full PaaSes. When doing so, consider that NUBOMEDIA is a Java PaaS. Hence, you will be able to leverage all the capabilities of the Java platform for creating your WebRTC application. The only difference from other Java PaaS services is that NUBOMEDIA will provide you with a specific SDK through which you will be able to access the complete feature set of Kurento in a scalable way.

From a practical perspective, the main differences between NUBOMEDIA and other WebRTC cloud solutions are illustrated in the next figure. As can be seen, there is a trade-off between flexibility and simplicity: the simpler the development, the less flexible the application is and the more difficult it is to adapt it to custom needs and requirements.

For example, most flexible solutions (IaaS on the bottom left corner of the image) require complex developments for creating fully operational WebRTC applications. On the other hand, SaaS solutions (top right corner) do not require much development efforts, but developers’ ability for customizing and adapting it to special requirements is typically very limited. For this reason, WebRTC developers tend to prefer WebRTC Cloud APIs that provide some flexibility but, at the same time, enable simple developments.

NUBOMEDIA also positions itself within this balance, but gives more weight to flexibility. This makes NUBOMEDIA more suitable for developments that must comply with special or rare requirements. Just for illustration, these are some of the things you can do with NUBOMEDIA that are complex to achieve using the common WebRTC Cloud APIs:

  • To use the signaling protocols you prefer (e.g. SIP, XMPP, custom, etc.)
  • To have special communication topologies. For example, imagine that you need a videoconferencing room with “spy participants” that can view others but should not be noticed by the rest; or imagine that you need simultaneous translators that are not viewed but need to listen to some participants while being listened to by others.
  • To have custom AAA (Authentication, Authorization and Accounting). For example, imagine that you wish to implement rules customizing who can access the media capabilities (e.g. recording, viewing a specific stream, etc.) so that they depend on some non-trivial logic (e.g. context information, time-of-day, time-in-call, etc.).
  • To go beyond calls. We may imagine lots of use-cases where WebRTC might be used beyond plain calls. For example, person-to-machine or machine-to-machine scenarios where you need cameras to connect to users or to other systems in a flexible way without restricting to the typical room videoconferencing models commonly exposed by WebRTC Cloud APIs.
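
As an illustration of the “spy participants” topology mentioned above, here is a minimal sketch in plain Python. It does not use the NUBOMEDIA SDK or the Kurento API; `Participant` and `forwarding_plan` are hypothetical names, and the sketch only computes who should receive whose media, which is the part of the logic your application would own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    hidden: bool = False  # a "spy": receives media but is never shown to others


def forwarding_plan(participants):
    """Map each receiver to the names of the senders whose media it should get."""
    plan = {}
    for receiver in participants:
        plan[receiver.name] = [
            sender.name
            for sender in participants
            if sender.name != receiver.name and not sender.hidden
        ]
    return plan


room = [Participant("alice"), Participant("bob"), Participant("auditor", hidden=True)]
for receiver, senders in forwarding_plan(room).items():
    print(receiver, "receives", senders)
```

In this room the hidden participant receives both alice and bob, while neither of them receives anything from it. A media server would then connect endpoints according to such a plan.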

As another interesting property, as NUBOMEDIA is a private PaaS, it can execute onto any OpenStack infrastructure. This means that the operational costs of an application running in NUBOMEDIA are fully under your control as you can decide in which IaaS to deploy the PaaS. This significantly reduces the operational costs with respect to an equivalent application consuming a Cloud API, as the Cloud API provider margins disappear.

The NUBOMEDIA Open Source Community

We have created NUBOMEDIA following the same open philosophy we used with Kurento. Currently, it is supported by an active and vibrant open source software community that is structured as an association of several projects providing different technological enablers including: the cloud orchestration mechanisms, the PaaS management technologies, the media server, many media processing modules and client SDKs for Android, iOS and Web.

If you are interested in knowing more about NUBOMEDIA you can check the community documentation where you will be able to find detailed information showing how to install and manage the platform and how to develop and deploy applications into the PaaS. You can also check the community YouTube channel and see one of the many videos with demos and tutorials illustrating how to develop and deploy NUBOMEDIA applications. If you want to know about the latest news of the NUBOMEDIA Community, you may follow it on Twitter.


Want to make the best decision on the right WebRTC platform for your company? Now you can! Check out my WebRTC PaaS report, written specifically to assist you with this task.

Get your Choosing a WebRTC Platform report at a $700 discount. Valid until the beginning of May.

The post NUBOMEDIA: the first open source WebRTC PaaS appeared first on

With WebRTC, Vendors Must Embrace True Agile

Mon, 05/23/2016 - 12:00

And not only the development.

For too many years now we’ve been enamored with Agile. Supposedly the successor of the waterfall development model, agile is all about short iterations and faster feedback.

In larger places, agile is usually just the next undertaking of the program manager – or whatever equivalent you have in the company that deals with processes. I remember hearing the term “we must be agile”. With the end result being… 18 to 24 months product release cycles.

That’s nice, but it isn’t really agile – at least not more than the Geek & Poke caricature above.

I had an interesting discussion with a consultant during the London WebRTC conference two months ago. He complained that browsers are moving too fast, making it hard for enterprises to follow suit and adopt WebRTC.

Here’s a quick reminder – WebRTC doesn’t care about enterprises. It cares about innovation and moving forward. If something breaks, then you’re just out of luck.

WebRTC today forces enterprises to think and act Agile

Why is this the case?

  • Browsers are updating at the speed of light – every 6 to 8 weeks
    • Each time they do, something gets deprecated
    • And other things can get broken
    • This is doubly so with WebRTC, which is essentially a perpetual work in progress
    • And will stay that way well into 2017
    • Enterprises need to be prepared for it and willing to update their own deployments to keep pace
  • WebRTC’s codecs are changing – and upgrading
    • VP9 is upon us
    • H.264 is here to stay
    • R&D teams need to adopt new codecs to keep their service pristine
    • Otherwise, competitors will do it and win the market simply by offering better user experience and media quality
  • New capabilities
    • Browser side recording?
    • Playing video from a canvas?
    • Pipelining media?
    • WebRTC has it all, and things are only improving
    • Do these affect your product? Do you need someone to define how this changes things for you?
What Needs to Change

Enterprises need to change their stance. They aren’t in control anymore. They should act accordingly.

This means having product managers, developers, testers, support and IT all working in concert in an agile way – thinking about launched products as living and breathing entities that must be updated continuously.

Thinking of launching a WebRTC based product? Especially if it is an on-premises one – you must make sure you understand the implications AND that your customers understand the implications as well.


Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.

The post With WebRTC, Vendors Must Embrace True Agile appeared first on

Allo, Duo, Hangouts or Jibe? Help…

Thu, 05/19/2016 - 12:00

Weren’t there enough complications already?

I use Hangouts all the time. At testRTC, we use it for most of our demos and customer meetings. As good and complete as Hangouts is in terms of the feature set that I need, it can be quite confusing at times. Something that probably stems from its dual use nature: Google Hangouts is both a consumer messaging app and an enterprise unified communications app. And while the two rely on the same technology – they are not the same.

If there is one other similar service that does this, it is Skype. Even there, the similarity is mostly in branding and not in the service itself (I am not sure how uniform the Skype and Skype for Business apps and infrastructure are, but they sure have been getting worse over the last year or two).

Can a single app rule them all? By the way things look today – no.

And yet this latest move by Google leaves me somewhat baffled.

At Google I/O’s keynote yesterday, Google came out with a slew of announcements. The ones interesting for me here are those related to messaging or to WebRTC:

  • Allo – a new messaging app to fend off Facebook Messenger
  • Duo – a new video chat app to fend off Apple FaceTime
  • Firebase – a new version which I won’t be covering here

Allo is Google’s “Smart Messaging App”.

It is yet-another-messaging-app – until you see the suggestions it gives you.

I use SwiftKey as my Android keyboard, and it “learns” what you type so future typing is shorter. The smart messaging replies in Allo are the next step for me – instead of doing it on the word level it does it on the conversation level.

The smarts in Allo seem to be split into two parts. The first is what Allo does on its own, which is suggestions inside the conversation. On top of that, Google added something they call Google Assistant, which goes “out” of the conversation to offer suggestions for external actions. The example in the I/O keynote was a restaurant reservation.

This competes directly with messaging and bots. Specifically Facebook. Maybe others.

Where can this lead us?

  • If I were Google, I’d make this into a bot or a layer that can be stitched into everything
  • Messaging services could use it directly, which will allow Google to sift every interaction and offer their suggestions and automation – no matter the app
  • Would messaging apps adopt it? I don’t know, but why shouldn’t they try it out?

Duo IS WebRTC. Or at least what you can do with it.

A note about Duo, WebRTC and purism – Duo is mobile only (for now), closed app, running on Android and iOS.

I’ll repeat that.

Duo is mobile only (for now), closed app, running on Android and iOS.

No web browser. No complaints about unsupportive Safari or IE browsers. And from Google.

To those who decide to skip WebRTC just because it doesn’t run on IE or isn’t supported by Safari (without really understanding what WebRTC means) – this should be the best wake up call. It comes directly from Google, the company that wants everything running in the browser.

Recognize anyone in the Duo app?

If tech media outlets taught me anything this time, it is that you should be suspicious of what they write.

Ingrid Lunden on TechCrunch did a nice write up on Duo, offering the gist of it:

  • 1:1 video chat app, like FaceTime
  • Focus is on super fast (responsiveness) and media quality
  • You see the caller’s video before you answer a call. A nice gimmick I guess
  • Based on WebRTC

This is where things fall apart a bit in her coverage:

The other thing that Duo is touting is the engineering that has gone into making the video in the app work. Google says it will work the same whether your network is superfast or patchy. This in itself, if it really bears out, would be amazing for anyone who has cursed his or her way through a bad Hangout or Skype call.

Duo was built by the same team that created WebRTC and it uses WebRTC, engineering director Erik Kay said today on stage at I/O. It was built using a new programming protocol, Quic, which Google unveiled last year as a route to speeding up data-heavy applications that travel over the web.

So Duo has this magic of working better than Hangouts and Skype. Great. So why didn’t Google just build it into Hangouts? Especially considering both use WebRTC…

As for that reference to the QUIC protocol: to be sure, it does NOTHING for the actual media. It only affects the time you wait until the smartphone “dials”. You shave a few hundred milliseconds there, but that won’t move the needle in the industry either way.

Mashable’s Raymond Wong explains QUIC and how it is a serious advantage:

Google says people don’t place as many video calls with their friends and family because connections can sometimes be spotty and drop. Duo uses a new protocol called QUIC that’s supposed to be more robust than any other video calling infrastructure out there.

QUIC won’t make the call more robust or make calls work better. It will just make the initial connection faster, or have the mute button appear QUICker on the other end’s device. QUIC is a nice example of how Google can go to extremes sometimes with optimizing the technology. Sometimes it makes a lot of sense, but other times less so. QUIC is definitely a step forward from TCP, but its effect on video calling isn’t huge.
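To put rough numbers on the "few hundred milliseconds", here is a back-of-the-envelope sketch. It assumes textbook handshake costs (1 round trip for TCP, 2 for a full TLS 1.2 handshake, 1 for a first QUIC connection, 0 for a resumed one) and an illustrative 100 ms round-trip time. None of these figures come from Google or from Duo itself, and the function names are made up for the example.

```python
# Round trips needed before the first application byte can be sent.
# These are textbook values, not measurements of Duo.
HANDSHAKE_ROUND_TRIPS = {
    "tcp+tls1.2": 1 + 2,   # TCP handshake + full TLS 1.2 handshake
    "quic-first": 1,       # first QUIC connection to a server
    "quic-resumed": 0,     # repeat connection using 0-RTT
}


def setup_delay_ms(stack: str, rtt_ms: float) -> float:
    """Time spent on handshakes before signaling can start."""
    return HANDSHAKE_ROUND_TRIPS[stack] * rtt_ms


def saving_ms(rtt_ms: float, quic_stack: str = "quic-first") -> float:
    """Setup time saved by QUIC compared with TCP + TLS 1.2."""
    return setup_delay_ms("tcp+tls1.2", rtt_ms) - setup_delay_ms(quic_stack, rtt_ms)


if __name__ == "__main__":
    rtt = 100.0  # assumed mobile round-trip time in milliseconds
    print(saving_ms(rtt))                  # 200.0
    print(saving_ms(rtt, "quic-resumed"))  # 300.0
```

Under these assumptions the saving is paid once, at call setup. It says nothing about how the audio and video behave once the call is running, which is the point made above.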

What do we have here? Apple FaceTime, done by Google, working on both Android and iOS. Nothing more and nothing less.


There’s also Jibe

An acquisition from last year, placing Google as a serious RCS player.

No mention of it at I/O. Probably because its focus is on “fixing”/“improving”/“popularizing” the basic Google Messenger app, which does SMS.

This being something that needs to be synchronized with carriers – it will take time to materialize.

The future of Hangouts

Is the enterprise.

With Allo and Duo, why should consumers even care about Hangouts from now on?

Can this succeed?

Can such an approach succeed for Google? Having multiple communication apps, two of them announced on the same day?

Can they reach mass adoption?

Google is taking the path of unbundling here, but doing it to what was until now the same service – communications. They split it into multiple smaller apps, tearing real time voice and video calling from current messaging apps. It feels somewhat like iMessage and FaceTime, but Allo is more capable than iMessage (sans SMS) and Duo is a bit more capable than FaceTime (the knock knock feature).

I can’t really decide if taking this unbundling approach is better or worse. Will it increase engagement of users with these services or hurt it? And where does Google Hangouts fit in here, if at all?

The post Allo, Duo, Hangouts or Jibe? Help… appeared first on

WebRTC Signaling Protocols and WebRTC Transport Protocols Demystified

Mon, 05/16/2016 - 12:00

A refresher on what I wrote in 2014 (here and here).

Can you guess the signaling and transport here?

WebRTC as a protocol comes without signaling. This means that you as a developer will need to take care of it.

The first step will be selecting the protocol for it. Or more accurately – two protocols: transport and signaling. In many cases, we don’t see the distinction (or just don’t care), but sometimes it is important. A recent question in the comments section of one of the two posts mentioned at the beginning got me to write this explanation. Probably yet again.

WebRTC Transport Protocols and Browsers

This actually fits any browser transport protocol.

A transport protocol is necessary for us to send a message from one device to another. I don’t care what is in that message or how the message is structured at this point – just that it can be sent – and then received.


HTTP/1.1

Five years ago browsers were simple when it came to transport protocols. We essentially had HTTP/1.1 and all the hacks on top of it, known as XHR, SSE, BOSH, Comet, etc. If you are interested in the exact mechanics of it, then leave a comment and I’ll do my best to explain in a future post (though there’s a lot of existing explanation around the internet already).

I call the group of solutions on top of HTTP/1.1 workarounds. They make use of HTTP/1.1 because there was no alternative at the time, but they do it in a way that makes no technical sense.

Oh – and you can even use REST to some extent, which is again a minor “detail” above HTTP/1.1.
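To give a feel for what such a workaround looks like, here is a minimal in-memory sketch of long polling, one of the Comet-style techniques. No real HTTP is involved: the `poll` call stands in for a request that the server holds open, and the class and method names are invented for this illustration.

```python
import queue
import threading
from typing import List, Optional


class LongPollMailbox:
    """Server-side mailbox that a client drains with long-poll requests."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[str]" = queue.Queue()

    def push(self, message: str) -> None:
        """Called by the server when it has something for the client."""
        self._pending.put(message)

    def poll(self, timeout_s: float) -> Optional[str]:
        """Simulates one HTTP request that is held open until a message
        arrives or the timeout expires (None stands for an empty response)."""
        try:
            return self._pending.get(timeout=timeout_s)
        except queue.Empty:
            return None


def demo() -> List[Optional[str]]:
    mailbox = LongPollMailbox()
    # The server pushes a message 50 ms after the client started waiting.
    threading.Timer(0.05, mailbox.push, args=("incoming-call",)).start()
    first = mailbox.poll(timeout_s=1.0)    # held open, then answered
    second = mailbox.poll(timeout_s=0.05)  # nothing pending: times out
    return [first, second]


if __name__ == "__main__":
    print(demo())
```

The client has to keep re-issuing `poll` in a loop just to hear from the server, which is the kind of contortion that a bidirectional transport removes.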

Since then, three more techniques materialized: WebSocket, WebRTC and recently HTTP/2.


WebSocket

WebSocket was added to do what HTTP/1.1 can’t: provide a bidirectional mechanism where both the client and the web server can send each other messages. What these messages are, what they mean and what type of format they follow was left to the implementer of the web page to decide.

There’s also or the less popular SockJS. Both offer client side implementations that simulate WebSocket in cases it cannot be used (browser or proxy doesn’t support it). If you hear that the transport is – for the most part you can just think about it as WebSocket.

When WebSockets work, they are great. But sometimes they don’t (more on that below, under the HTTP/2 part).

WebRTC’s Data Channel

To some extent, the Data Channel in WebRTC can be used for signaling.

Yes. You’ll need to negotiate IP addresses and use ICE first – and for that you’ll need an additional layer of signaling and transport (from the list in this post here), but once connected, you can use the data channel for it.

This can be done either directly between the two peers, or through intermediaries (for multiple reasons).

Where would you want to do that?

  1. To reduce latency in your signaling – this is theoretically the fastest you can go
  2. To reduce load on the server – now it won’t receive all messages just to route them around – you’ll be sending it things it really needs
  3. To increase privacy – not sending messages through the server means the server can’t be privy to their content – or even the fact there was communication

For the most part, this is quite rare as transport for signaling in WebRTC.
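The bootstrap-then-switch idea can be sketched without any WebRTC library at all. In the sketch below, two plain callables stand in for "send through the signaling server" and "send over the open data channel"; the class and method names are hypothetical and not part of any WebRTC API.

```python
from typing import Callable, List, Optional, Tuple

Send = Callable[[str], None]


class SignalingRouter:
    """Routes signaling via the server until a direct channel is open."""

    def __init__(self, send_via_server: Send) -> None:
        self._send_via_server = send_via_server
        self._send_direct: Optional[Send] = None  # set once the data channel opens

    def on_data_channel_open(self, send_direct: Send) -> None:
        self._send_direct = send_direct

    def on_data_channel_close(self) -> None:
        self._send_direct = None  # fall back to the server

    def send(self, message: str) -> None:
        if self._send_direct is not None:
            self._send_direct(message)
        else:
            self._send_via_server(message)


def demo() -> Tuple[List[str], List[str]]:
    server_log: List[str] = []
    peer_log: List[str] = []
    router = SignalingRouter(send_via_server=server_log.append)
    router.send("offer")                      # no peer connection yet
    router.on_data_channel_open(peer_log.append)
    router.send("renegotiate")                # goes peer to peer
    router.on_data_channel_close()
    router.send("bye")                        # back through the server
    return server_log, peer_log


if __name__ == "__main__":
    print(demo())
```

The server only sees the messages sent while no data channel is open, which is where the reduced load and the added privacy listed above come from.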


HTTP/2

I’ve written about HTTP/2 before. Since then, HTTP/2 has grown in its popularity and spread.

HTTP/2 fixes a lot of the limitations in HTTP/1.1, which can make it a good long term candidate for transport of signaling protocols.

A good read here would be Allan Denis’ writeup on how HTTP/2 may affect the need for WebSocket.


WebRTC Signaling Protocols

Signaling is where you express yourself. Or rather your service does. You want one user to be able to reach out to another one. Or a group of people to join a virtual room of sorts. To that end, you decide on what types of messages you need, what they mean, what they look like, etc.

That’s your signaling protocol.

As opposed to the transport protocol, you aren’t really limited by what the browser allows, but rather by what you are trying to achieve.

Here are the 3 main signaling protocols out there in common use with WebRTC:


SIP

I hate SIP.

Never really cared for it.

It has its uses, especially when it comes to telephony and connecting to legacy voice and video services.

Other than that, I find it too bloated, complex and unnecessary. At least for most of the use cases people approach me with.

SIP comes from the telephony world. Its main transport was UDP. Then TCP and TLS were added as transport protocols for it. Later on SCTP. You don’t care about any of these, as you can’t really access them directly with a browser. So what was done was to add WebSocket as a SIP transport and just call it “SIP over WebSocket”. Before WebRTC got standardized (it hasn’t yet), SIP over WebSocket got standardized and already has an RFC of its own. Why is it important? Because the only use of SIP over WebSocket is to enable it to use WebRTC.

So there’s SIP. And if you know it, like it or need it, you can use it for your WebRTC signaling protocol.
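For readers who have never seen one, here is a rough sketch of what a SIP request sent over WebSocket looks like on the wire. As far as I understand RFC 7118, the WebSocket connection is opened with the `sip` subprotocol and the `Via` header carries `WS` or `WSS` as its transport. The helper name, addresses and identifiers below are placeholders, and a real client would use a SIP library rather than string formatting.

```python
SIP_WS_SUBPROTOCOL = "sip"  # offered in the Sec-WebSocket-Protocol header


def build_options(user: str, domain: str, client_host: str,
                  call_id: str, tag: str, branch: str,
                  secure: bool = True) -> str:
    """Builds a minimal SIP OPTIONS request to send as one WebSocket message.

    `branch` is expected to start with the RFC 3261 magic cookie "z9hG4bK".
    """
    via_transport = "WSS" if secure else "WS"
    lines = [
        f"OPTIONS sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/{via_transport} {client_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{domain}>;tag={tag}",
        f"To: <sip:{domain}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and an empty line after the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_options(
        user="alice", domain="example.com",
        client_host="df7jal23ls0d.invalid",  # placeholder host for a browser
        call_id="a84b4c76e66710", tag="1928301774",
        branch="z9hG4bK776asdhds",
    ))
```

Even this bare-bones request carries seven headers before any session description is attached, which gives a sense of the overhead mentioned above.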


XMPP

I hate XMPP.

Not really sure why. Probably because any time I say something bad about it, a few hard core fans/followers/fanatics of XMPP come rushing in to its rescue in the comments section. It makes things fun.

XMPP has a worldview revolving around presence and instant messaging, and use cases that need it can really benefit from it – especially if the developer already knows XMPP and what he is doing.

If you like it enough – make sure to slam me in the comments – you’ll find their section at the end of this post…
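To make the "presence and instant messaging" worldview concrete, here is a small sketch that builds the two basic XMPP stanzas with Python's standard XML library. The addresses and texts are made up, and a real client would send these inside an authenticated XMPP stream using an XMPP library rather than build them by hand.

```python
import xml.etree.ElementTree as ET


def presence_stanza(show: str, status: str) -> str:
    """Presence update; XMPP defines `show` values such as "away" and "dnd"."""
    presence = ET.Element("presence")
    ET.SubElement(presence, "show").text = show
    ET.SubElement(presence, "status").text = status
    return ET.tostring(presence, encoding="unicode")


def chat_message(to_jid: str, body: str) -> str:
    """One-to-one instant message addressed to a JID (user@domain)."""
    message = ET.Element("message", {"to": to_jid, "type": "chat"})
    ET.SubElement(message, "body").text = body
    return ET.tostring(message, encoding="unicode")


if __name__ == "__main__":
    print(presence_stanza("away", "In a call"))
    print(chat_message("bob@example.com", "Call me?"))
```

If your service already thinks in terms of contacts, availability and chat, these primitives come for free. If it doesn't, they are extra baggage.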


Proprietary

I hate NIH. And yet a proprietary signaling protocol has a lot of benefits in my view.

In many cases, you just want to get the two darn users onto the “same page”. Not much more. I know I am dumbing it down, but the alternative is to carry around extra protocol messages you don’t need or intend to use.

In many other cases, you don’t really want to add another web server to handle signaling. You want your web server to host the whole site. So you resort to a proprietary signaling protocol. You might not even call it that, or think of it as a signaling protocol at all.
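As an illustration of how small a proprietary signaling protocol can be, here is a sketch of a room-based relay. It is one possible design, not a recommendation: the message types (`offer`, `answer`, `candidate`), the JSON layout and the class name are all assumptions made for this example, and delivery is a plain callback instead of a real transport.

```python
import json
from collections import defaultdict
from typing import Callable, Dict

Deliver = Callable[[str], None]

# The only message types this toy protocol relays.
RELAYED_TYPES = {"offer", "answer", "candidate"}


class RoomSignaling:
    """Relays signaling messages between the users in the same room."""

    def __init__(self) -> None:
        self._rooms: Dict[str, Dict[str, Deliver]] = defaultdict(dict)

    def join(self, room: str, user: str, deliver: Deliver) -> None:
        self._rooms[room][user] = deliver

    def handle(self, room: str, sender: str, raw: str) -> None:
        message = json.loads(raw)
        if message.get("type") not in RELAYED_TYPES:
            raise ValueError(f"unsupported message type: {message.get('type')!r}")
        message["from"] = sender  # stamped by the server, not trusted from the client
        encoded = json.dumps(message)
        for user, deliver in self._rooms[room].items():
            if user != sender:
                deliver(encoded)


if __name__ == "__main__":
    alice_inbox, bob_inbox = [], []
    signaling = RoomSignaling()
    signaling.join("demo", "alice", alice_inbox.append)
    signaling.join("demo", "bob", bob_inbox.append)
    signaling.handle("demo", "alice", json.dumps({"type": "offer", "sdp": "..."}))
    print(bob_inbox)    # one relayed offer, stamped with "from": "alice"
    print(alice_inbox)  # empty: senders do not get their own message back
```

The server never interprets the payload. It only checks the type and forwards the message, which is all "getting two users onto the same page" requires.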

How to Choose?

Always start from the signaling protocol.

If there’s reason to use SIP due to existing infrastructure or external systems you need to connect to – then use it. If there’s no such need, then my suggestion would be to skip it.

If you like XMPP, or need its presence and instant messaging capabilities – then go use it.

If the service you are adding WebRTC to already has some logic of its own, it probably has signaling in there. So you just add the relevant messages you need to that proprietary signaling.

In any other case, my advice would be to use a proprietary signaling solution that fits your exact need. If you’re fine with it, I’d even go as far as picking a SaaS vendor for signaling.
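The advice above can be summarized as a small decision function. The parameter names and the order in which the checks are applied are my reading of the text, so treat this as a memory aid rather than a rule.

```python
def choose_signaling(needs_sip_interop: bool,
                     wants_xmpp_presence_im: bool,
                     has_existing_signaling: bool) -> str:
    """Encodes the selection advice as a first-match decision list."""
    if needs_sip_interop:
        # Existing SIP infrastructure or external systems to connect to.
        return "SIP over WebSocket"
    if wants_xmpp_presence_im:
        # You like XMPP or need its presence and messaging capabilities.
        return "XMPP"
    if has_existing_signaling:
        # The service already exchanges messages of its own.
        return "extend existing proprietary signaling"
    return "new proprietary signaling (or a SaaS signaling vendor)"


if __name__ == "__main__":
    print(choose_signaling(False, False, True))
    print(choose_signaling(False, False, False))
```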



The post WebRTC Signaling Protocols and WebRTC Transport Protocols Demystified appeared first on

Last Chance to Enjoy a $700 Discount on my WebRTC PaaS Report

Fri, 05/13/2016 - 14:00

Grab your copy now.

I am in the last stretch of updates for my Choosing a WebRTC API Platform report. In the past month, the report has been available at a discounted price – from $1950 down to $1250. Purchasing the report includes 1 year of updates, which means that if you get your copy now – you’ll be receiving the new update next week.

What’s new in the report?

Things are in constant change in the WebRTC ecosystem, and the best place to see it is in the API space. Since the last update, we’ve experienced the rebranding of Comverse as XURA, which affected their Forge platform as well.

Here’s what you will find in the updated report, due next week:

  • Updated all vendor profiles and feature sets, so they now reflect the existing
  • Added a new vendor – QuickBlox. This brings us to 24 covered platforms in the report
  • I added a new KPI to the report – investment level – where I indicate for periods between updates how much investment was made in new features and capabilities in the platform. This can be an indicator to the level of commitment the vendor has to his platform and what to expect moving forward when it comes to new features being introduced
  • I’ve written a new Vendor Selection Blueprint. This document can assist you in the vendor selection process by guiding you through it. It includes an Excel sheet as well as a mockup example of such a process for an imaginary use case
  • Presentation deck of the visuals has been redesigned and improved, so now if you need visuals – they will be even more professional looking
What do you get when you purchase the report?

The report itself isn’t only a PDF file you print and put on your manager’s table. It includes a lot more than that:

  • The report, in PDF format (obviously)
  • 1 year of free updates, these will cover 1-2 more updates (I tend to publish them every 6-8 months or so)
  • Site membership access to additional materials
  • Online comparison matrix, to make quick comparisons easy to handle
  • Presentation visuals, which you can use in your own presentations
  • Vendor Selection Blueprint, to guide you through the vendor selection process
  • Access to the monthly Virtual Coffee sessions as well as the archived sessions
How to purchase?


  1. Go to the WebRTC PaaS report page
  2. Scroll down to the end of the page
  3. Select the Premium option and press the BUY NOW button
  4. Use your PayPal account or a credit card to make the purchase

If you do this in the next couple of days – you are guaranteed to enjoy the discounted price.

The post Last Chance to Enjoy a $700 Discount on my WebRTC PaaS Report appeared first on

