Yep. It is that time of the year for me.
In the past two months there have been lots and lots of summaries, predictions and even reports thrown around about different markets. That’s what you get at the passing of a year.
I had my share of such articles written recently as well:
- WebRTC market expected to see more gradual changes
- CPaaS and API trends: Forecasting the future of embedded communications
But it is now time for something that I hope will become a tradition here: the yearly WebRTC State of the Market infographic. I did one last year, so why not this time around as well?
Yes. I know. This one is almost a month into 2017, and I’ve written 2017 and not 2016 on it. In a sense, it was about practicality – we care about what comes next more than what happened. To some of us, 2016 is almost a long forgotten memory already.
About the Infographic
What’s different this year in the infographic?
- I decided to have a sponsor, and Vidyo were kind enough to oblige and assist. As time passes, it is becoming increasingly difficult to keep my whole WebRTC dataset fresh, so having vendors pitch in and help make it worthwhile is great. So thanks!
- It includes a webinar
- Last year I had a private Virtual Coffee session on this topic to my customers
- This year I am doing a public webinar with Vidyo about this topic and the findings
Vidyo and BlogGeek.me will be hosting a webinar on 2017 WebRTC Market Outlook. I’ll be joining Nicholas Reid, which will make this all the more fun (doing a webinar solo is… lonely).
We will be covering the various stats in the infographic, go over trends and see how we think they will develop into 2017. We will also discuss the just announced Vidyo.io CPaaS and see how it fits into this outlook.
2017 is bound to be interesting and dynamic, so join us.
2017 WebRTC Market Outlook
When? Tuesday, February 7, 2017; 3 pm eastern / noon pacific
Where? Online
The Numbers
- Over 1,100 vendors and projects using WebRTC, and just looking at the adoption numbers of January 2017, I can say we’re in for an interesting ride this year
- Largest markets using WebRTC? Customer Management and specific verticals (Healthcare and Education lead the way)
- Outsourcing vendors are cropping up like mushrooms after the rain, with growth of over 100% in 2016. With the pressure and challenge of finding experienced developers for WebRTC, it is no surprise that many outsourcing vendors are either adding WebRTC to their technology warchest or going all out and focusing on WebRTC projects
- Live streaming is going strong with 70% growth in 2016. This will continue into 2017 as well, taking a lot of the attention span of entrepreneurs in the social media space
- In CPaaS there are many ways to split the market. One of them is by horizontal and vertical players. The dynamics here are truly fascinating, along with the acquisitions in this specific domain (Xura, Twilio and Sinch)
That’s the big question, and one I’ll be focusing on a lot this year.
Here’s what you can do next:
- Register for the webinar and meet Nicholas and myself there
- Subscribe to my newsletter, so you don’t miss out. Lots of interesting announcements coming soon
Thanks again to Vidyo for sponsoring this infographic.
If you think WebRTC is free, then think again.
WebRTC is free
I see it every day and it is glaringly obvious. I probably haven’t made things any better myself with this slide of mine:
WebRTC is free. But let’s consider what exactly is free with this technology:
- The code. You can go download it from webrtc.org. And then… do with it whatever. It is licensed under BSD
- The codecs used – and yes – I know – there are people who feel entitled to them through some patents – but no one yet cares about it – almost everyone assumes it is free and uses it – for now
- It is readily available inside browsers. Well… at least some of them (Chrome, Firefox & Edge. Coming to Safari sometime)
- And that’s a wrap
It is a hell of a lot to give for free. Especially if you went to sleep ten years ago and just woke up.
What’s missing in WebRTC?
But that’s only a small piece of the puzzle. Or as another slide in my deck usually states:
There are loads of things in WebRTC that you need to do in order to get a service to production. Here’s a shortlist of things that just came to mind:
- Selecting and implementing signaling
- Writing the application
- Installing and deploying TURN servers (preferably at scale)
- Adding media servers if needed – and making them work for your scenario
- Testing it all
- Monitoring it in production
- Tweaking and upgrading as you go along
All these complementary solutions come in different shapes and sizes:
- Open source frameworks of various kinds you can use. Most will be half baked (=require more work to get them to production), or not exactly fit your needs
- Vendors offering consulting and outsourcing (check out a few of them in the WebRTC Index)
- Different vendors offering hosted and managed services. From signaling, to NAT traversal, testing & monitoring and complete CPaaS
The funny thing is, that whenever you talk to one of the companies developing with WebRTC, they believe everything in WebRTC should be free.
- STUN servers? Free. There are lists of free STUN servers you can use
- TURN servers? Free. Or more like “why can’t I find free TURN servers?” (mind you – you should NEVER use free TURN servers)
- Using a WebRTC PaaS vendor? That’s waaaay too expensive. We want to build it on our own to keep costs down
The thing is that building these things on your own will take time and money. Lots of both to be exact.
Same thing about how you end up testing it all or monitoring it. I’d say this is what the market looks like when it comes to testing WebRTC:
So what?
If you are planning a product that needs communications, you should definitely consider WebRTC first. Before anything else. It is probably going to be the cheapest technology for your needs but also the best one.
That said, you shouldn’t consider it all free. Plan it. Budget it. Write down your requirements. Decide on your architecture. Figure out who your partners should be on this road.
This is why I decided to start off the year with a giveaway. $1,000 worth of credits on VoxImplant. And all you have to do is sign up for it. It just might get you started on your road with WebRTC. And who knows what will happen later on?
If you got until here and haven’t entered your email yet, then you should definitely go back up.
I decided to kick off 2017 with a few interesting initiatives. And the first one is this giveaway – a first that I am doing on my website.
I’ve been following the CPaaS space for quite some time now, and have focused on WebRTC PaaS. CPaaS, PaaS and other XaaS acronyms are confusing. For the laymen, if you want to develop something that needs communication capabilities but don’t want to host the communication service itself – that’s what you end up using. And if this is your case, then why not try out one of the interesting vendors out there?
VoxImplant was kind enough to offer $1,000 in credits for one person.
To enter, all you need to do is place your email in the large giveaway box at the top – and that’s about it.
Be sure to share this giveaway with others using that URL you’ll be getting, as that will increase your chances of winning – and if enough people join the giveaway – will get you a nice bonus as well (even if you don’t win in this giveaway).
What do you have to lose?
Join now, and maybe it will get you a bit faster to your application goal with the help of VoxImplant.
The post Jumpstart Your WebRTC Development with $1000 on VoxImplant appeared first on BlogGeek.me.
How many WebRTC RTCPeerConnection objects should we be aiming for?
This is something that bothered me in recent weeks during some analysis we’ve done for a specific customer at testRTC.
It all started when a customer using Tokbox came to us. He was complaining he couldn’t get the product he built stabilized enough and due to that, couldn’t really get it launched. The reason behind it was partially his inability to decide how many users in parallel can fit into a single session.
So we took that as a side project at testRTC. It is rather easy for us to get 50, 100, 200 or more browsers to direct towards a single service and session and get the analysis we need. So it was easy to use once we’ve written the script necessary. While we have Tokbox customers using our platform, I never did try to go deeper into the analysis until now for such customers. This time, it was part of what the customer expected of us, so it got me looking closer at Tokbox and how they implement multiparty sessions.
In the past couple of weeks we’ve done our digging and reached conclusions of our own. I didn’t mean to write about them here, but a recent question on Stack Overflow compelled me to do so – Maximum number of RTCPeerConnection:
I know web browsers have a limit on the amount of simultaneous http requests etc. But is there also a limit on the amount of open RTCPeerConnection’s a web page can have?
And somewhat related: RTCPeerConnection allows to send multiple streams over 1 connection. What would be the trade-offs between combining multiple streams in 1 connection or setting up multiple connections (e.g. 1 for each stream)?
The answer I wrote there, slightly modified is this one:
Not sure about the limit. It was around 256, though I heard it was increased. If you try to open up such peer connections in a tight loop – Chrome will crash. You should also not assume the same limit on all browsers anyway.
Multiple RTCPeerConnection objects are great:
- They are easy to add and remove, so offer a higher degree of flexibility when joining or leaving a group call
- They can be connected to different destinations
That said, they have their own challenges and overheads:
- Each RTCPeerConnection carries its own NAT configuration – so STUN and TURN bindings and traffic takes place in parallel across RTCPeerConnection objects even if they get connected to the same entity (an SFU for example). This overhead is one of local resources like memory and CPU as well as network traffic (not huge overhead, but it is there to deal with)
- They clutter your webrtc-internals view on Chrome with multiple tabs (a matter of taste), and SSRC might have the same values between them, making them a bit harder to trace and debug (again, a matter of taste)
A single RTCPeerConnection object suffers from having to renegotiate it all whenever someone needs to be added to the list (or removed).
I’d like to take a step further here in the explanation and show a bit of the analysis. To that end, I am going to use the following:
- testRTC – the service I’ll use to collect the information, visualize and analyze it
- Tokbox’ Opentok demo – Tokbox demo, running a multiparty video call, and using a single RTCPeerConnection per user
- Jitsi meet demo/service – Jitsi Videobridge service, running a multiparty video, and using a shared RTCPeerConnection for all users
But first things first. What’s the relationship between these multiparty video services and RTCPeerConnection count?
WebRTC RTCPeerConnection and a multiparty video service
While the question on Stack Overflow can relate to many issues (such as P2P CDN technology), the context I want to look at it here is video conferencing that uses the SFU model.
The illustration above shows a video conferencing between 5 participants. I’ve “taken the liberty” of picking it up from my Advanced WebRTC Architecture Course.
What happens here is that each participant in the session is sending a single media stream and receiving 4 media streams from the other participants. These media streams all get routed through the SFU – the box in the middle.
So. Should the SFU box create 4 RTCPeerConnection objects in front of each participant, each such object holding the media of one of the other participants, or should it just cram all media streams into a single RTCPeerConnection in front of each participant?
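To make the two options concrete, here is a minimal sketch that counts how many RTCPeerConnection objects each strategy implies. The strategy labels and function names are made up for this illustration, and it assumes that in the per-stream case the participant's own outgoing media also gets a connection of its own.

```typescript
// Labels invented for this sketch; they are not WebRTC API terms.
type Strategy = "per-stream" | "shared";

// Number of RTCPeerConnection objects a single participant's browser holds
// in an SFU-routed call with `participants` people in the room.
function peerConnectionsPerClient(participants: number, strategy: Strategy): number {
  if (!Number.isInteger(participants) || participants < 1) {
    throw new RangeError("participants must be a positive integer");
  }
  if (strategy === "shared") {
    return 1; // all media is carried over one connection
  }
  // Assumption: one outgoing connection plus one per remote participant.
  return 1 + (participants - 1);
}

// The SFU terminates every client connection, so its total is the
// per-client count multiplied by the number of participants.
function peerConnectionsAtSfu(participants: number, strategy: Strategy): number {
  return participants * peerConnectionsPerClient(participants, strategy);
}

console.log(peerConnectionsPerClient(5, "per-stream")); // 5
console.log(peerConnectionsPerClient(5, "shared"));     // 1
console.log(peerConnectionsAtSfu(5, "per-stream"));     // 25
```

The per-stream count grows with the room size on every client, while the shared count stays at one no matter how many people join.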
Let’s start from the end: both options will work just fine. But each has its advantages and shortcomings.
Opentok: RTCPeerConnection per user
If you are following the series of articles Fippo wrote with me on testRTC about how to read webrtc-internals, then you should know a thing or two about its analysis.
Here’s what that session looks like when I join on my own and get testRTC to add the 4 additional participants into the room:
Here’s a quick screenshot of the webrtc-internals tab when used in a 5-way video call on the Opentok demo:
One thing that should pop up by now (especially with them green squares I’ve added) – TokBox’ Opentok uses a strategy of one RTCPeerConnection per user.
One of these tabs in the green squares is the outgoing media streams from my own browser while the other four are incoming media streams from the testRTC browser probes that are aggregated and routed through the TokBox SFU.
To understand the effect of having open RTCPeerConnections that aren’t used, I ran the same test scenario again, but this time, I had all participants mute their outgoing media streams. This is what the session looked like:
To achieve that with the Opentok demo, I had to use a combination of the onscreen mute audio button and having all participants mute their video when they join. So I added the following lines to the testRTC script – practically clicking on the relevant video mute button on the UI:
After this most engaging session, I looked at the webrtc-internals dump that testRTC collected for one of the participants.
Let’s start with what testRTC has to offer immediately by looking at the high level graphs of one of the probes that participated in this session:
- There is no incoming data on the channels
- There is some outgoing media, though quite low when it comes to bitrate
What we will be doing, is ignore the outgoing media and focus on the incoming one only. Remember – this is Opentok, so we have 5 peer connections here: 1 outgoing, 4 incoming.
A few things to note about Opentok:
- Opentok uses BUNDLE and rtcp-mux, so the audio and video share the same connection. This is rather typical of WebRTC services
- Opentok “randomly” picks SSRC values to be numbered 1, 2, … – probably to make it easy to debug
- Since each stream goes on a different peer connection, there will be one Conn-audio-1-0 in each session – the differences between them will be the indexed SSRC values.
For this test run that I did, I had “Conn-audio-1-0 (connection 363-1)” up to “Conn-audio-1-0 (connection 363-5)”. The first one is the sender and the rest are our 4 receivers. Since we are interested here in what happens in a muted peer connection, we will look into “Conn-audio-1-0 (connection 363-2)”. You can assume the rest are practically the same.
Here’s what the testRTC advanced graphs had to show for it:
I removed some of the information to show these two lines – the yellow one showing responsesReceived and the orange one showing requestsReceived. These are STUN related messages. On a peer connection where there’s no real incoming media of any type. That’s almost 120 incoming STUN related messages in total for a span of 3 minutes. As we have 4 such peer connections that are receive only and silent – we get to roughly 480 incoming STUN related messages for the 3 minutes of this session – 160 incoming messages a minute – 2-3 incoming messages a second. Multiply the number by 2 so we include also the outgoing STUN messages and you get this nice picture.
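The arithmetic above is easy to re-run for other session sizes. The sketch below simply encodes it; the function and field names are invented for this example, and doubling the incoming rate to cover outgoing messages is the same rough assumption made in the text, not a measurement.

```typescript
interface StunOverhead {
  totalIncoming: number;           // incoming STUN messages over the whole session
  incomingPerMinute: number;
  incomingPerSecond: number;
  bothDirectionsPerSecond: number; // assumes outgoing volume equals incoming
}

// messagesPerConnection: incoming STUN messages seen on one idle peer connection
// idleConnections: number of receive-only, muted peer connections
// durationMinutes: length of the session
function stunOverhead(
  messagesPerConnection: number,
  idleConnections: number,
  durationMinutes: number
): StunOverhead {
  const totalIncoming = messagesPerConnection * idleConnections;
  const incomingPerMinute = totalIncoming / durationMinutes;
  const incomingPerSecond = incomingPerMinute / 60;
  return {
    totalIncoming,
    incomingPerMinute,
    incomingPerSecond,
    bothDirectionsPerSecond: incomingPerSecond * 2,
  };
}

// Numbers from the Opentok run: ~120 messages per connection, 4 connections, 3 minutes.
const opentokRun = stunOverhead(120, 4, 3);
console.log(opentokRun.totalIncoming);                // 480
console.log(opentokRun.incomingPerMinute);            // 160
console.log(opentokRun.incomingPerSecond.toFixed(2)); // "2.67"
```

Plugging in 50 idle connections instead of 4 gives 6,000 incoming messages for the same 3 minutes, which is where this starts to feel crowded.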
There’s an overhead for a peer connection. It comes in the form of keeping that peer connection open and running for a rainy day. And that is costing us:
- Some small amount of bitrate for STUN messages
- Maybe some RTCP messages going back and forth for reporting purposes – I wasn’t able to see them in these streams, but I bet you’d find them with Wireshark (I just personally hate using that tool. Never liked it)
- This means we pay extra on the network for maintenance instead of using it for our media
- That’s CPU and memory
- We need to somewhere maintain that information in memory and then work with it at all times
- Not much, but it adds up the larger the session is going to be
Now, this overhead is low. 2-3 incoming messages a second is something we shouldn’t fret about when we get around 50 incoming audio packets a second. But it can add up. I got to notice this when a customer at testRTC wanted to have 50 or more peer connections with only a few of them active (the rest muted). It got kinda crowded. Oh – and it crashed Chrome quite a lot.
Jitsi Videobridge: Shared RTCPeerConnection
Now that we know what a 5-way video call looks like on Opentok, let’s see what it looks like with the Jitsi Videobridge.
For this, I again “hired” the help of testRTC and got a simple test script to bring 4 additional browsers into a Jitsi meeting room that I joined with my own laptop. The layout is somewhat different and resembles the Google Hangouts layout more:
What we are interested in here is actually the peer connections. Here’s what we get in webrtc-internals:
A single peer connection for all incoming media channels.
And again, as with the TokBox option – I’ll mute the video. For that purpose, I’ll need to get the participants to mute their media “voluntarily”, which is easy to achieve by a change in the testRTC script:
What I did was instruct each of my automated testRTC friends that are joining Jitsi to immediately mute their camera and microphone by clicking the relevant on-screen buttons based on their HTML id tags (#toolbar_button_mute and #toolbar_button_camera), causing them to send no media over the network towards the Jitsi Videobridge.
To some extent, we ended up with the same boring user experience as we did with the Opentok demo: a 5-way video call with everyone muted and no one sending any media on the network.
Let’s see if we can notice some differences by diving into the webrtc-internals data.
A few things we can see here:
- Jitsi Videobridge has 5 incoming video and audio channels instead of 4. Jitsi reserves and pre-opens an extra channel for future use of screen sharing
- Bitrates are 0, so all is quiet and muted
- Remember that all channels here share a single peer connection
To make sure we’ve handled this properly, here’s a view of the video channels’ bitrate values:
There’s the obvious initial spike – that’s the time it took us to mute the channels at the beginning of the session. Other than that, it is all quiet.
Now here’s the thing – when we look at the active connection, it doesn’t look much different than the ones we’ve seen in Opentok:
We end up with 140 incoming messages for the span of 3 minutes – but we don’t multiply it by 4 or 5. This happens once for ALL media channels.
Shared or per user RTCPeerConnection?
This is a tough question.
A single RTCPeerConnection means less overhead on the network and the browser resources. But it has its drawbacks. When someone needs to join or leave, there’s a need to somehow renegotiate the session – for everyone. And there’s also the extra complexity of writing the code itself and debugging it.
With multiple RTCPeerConnection we’ve got a lot more flexibility, since the sessions are now independent – each encapsulated in its own RTCPeerConnection. On the other hand, there’s this overhead we’re incurring.
Here’s a quick table to summarize the differences:
What’s Next?
Here’s what we did:
- We selected two seemingly “identical” services
- The free Jitsi Videobridge service and the Opentok demo
- We focused on doing a 5-way video session – the same one in both
- We searched for differences: Opentok had 5 RTCPeerConnections whereas Jitsi had 1 RTCPeerConnection
- We then used testRTC to define the test scripts and run our scenario
- Have 4 testRTC browser probes join the session
- Have them mute themselves
- Have me join as another participant from my own laptop into the session
- Run the scenario and collect the data
- Looked into the statistics to see what happens
- Saw the overhead of the peer connection
I have only scratched the surface here: There are other issues at play – creating a RTCPeerConnection is a traumatic event. When I grew up, I was told connecting TCP is hellish due to its 3-way handshake. RTCPeerConnection is a lot more time consuming and energy consuming than a TCP 3-way handshake and it involves additional players (STUN and TURN servers).
Download the RTCPeerConnection count deck here.
The post WebRTC RTCPeerConnection. One to rule them all, or one per stream? appeared first on BlogGeek.me.
There is no real peak telephony.
[Chad Hart is no stranger to my readers here. He runs webrtcHacks, part of the Kranky Geek team and works at Voxbone. This time, he takes a look at telephony and where it stands today – with and without WebRTC]
Back in April of 2015, I recall Google WebRTC Product Manager Serge Lachapelle talking about the WebRTC team’s focus on mobile and how they wanted to kick “VoLTE’s butt”. To be fair he was referencing call connection times, but reading between the lines I like to believe he has had ambitions well beyond that – namely beating VoLTE and the traditional telephony network in minutes.
— Chad Hart (@chadwallacehart) April 15, 2015
For many years I have tried to keep track of how the traditional telecoms has fared against the emerging VoIP application world (what they sometimes derogatorily call “OTT”). I have had two hypotheses for several years now:
- Traditional telecoms over the PSTN is past “peak” and will continue to decline
- Real time communications (RTC) in general has been on the decline but is poised to make a comeback thanks to better implementations and technologies like WebRTC
Let’s check the data to test these statements.
Peak Telephony? Maybe Not…
Digging into the statistics from various sources, I was surprised to find I was wrong about my first hypothesis on peak telephony.
The US market
Let’s start by taking a look at the situation in the US, one of the world’s largest communications markets. The Cellular Telecommunications Industry Association (CTIA) provides an annual update that sometimes includes Minutes of Use (MoU) and subscriber data. The data shows that mobile telephony usage already peaked on a per-subscriber basis in 2007.
However, there is a growing number of data-only subscriptions for our tablets and other devices counted as subscribers. This negatively skews the numbers. Looking at “minutes” on a per capita basis is a cleaner metric, so let’s divide the minute figures by the US population. This shows a much more interesting picture where mobile phone usage for traditional calling actually went up by 16%.
Total US cellular telephony minutes appear to be rising after stalling for many years (my calculations)
Checking the data against other FCC sources, this growth may be overstated but there is no clear evidence of decline. So what’s going on? Much of cellular’s continued volumes can be attributed to fixed-mobile substitution – both in terms of people dropping their fixed lines and as the FCC reports “A significant percentage of homes with both landline and wireless phone access received all or almost all calls on wireless telephones despite also having a landline telephone.” If we assumed total PSTN calling was flat, then according to my estimates, a 30% annual decline in fixed line minutes would be required to explain the decrease. This is possible, but way faster than past usage declines in fixed so it is more likely cellular usage did indeed have a very good year in 2015.
There is no clear evidence of peak PSTN telephony in the US, so let’s check some other sources.
The UK market
The UK’s Ofcom is generally a much better datasource than the FCC since they look at communications as a whole within the UK and compare it to other countries.
They are a lot more pessimistic when it comes to PSTN-based telephony. Their data is very definitive showing a continued, gradual decline in PSTN call volumes going back to 2010. With a -3% 5 year CAGR, no matter how you cut it, “operator” traffic is down.
Ofcom CMR 2016 report shows declines in operator voice usage
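For readers who don’t work with CAGR figures every day, here is a small sketch of what a -3% 5 year CAGR implies. The starting volume of 100 is a made-up index value for illustration, not an Ofcom figure, and the function names are invented for this example.

```typescript
// Compound annual growth rate between two values that are `years` apart.
function cagr(startValue: number, endValue: number, years: number): number {
  if (startValue <= 0 || endValue <= 0 || years <= 0) {
    throw new RangeError("values and years must be positive");
  }
  return Math.pow(endValue / startValue, 1 / years) - 1;
}

// Value reached after compounding `rate` for `years` years.
function project(startValue: number, rate: number, years: number): number {
  return startValue * Math.pow(1 + rate, years);
}

// Hypothetical index: call volume of 100 in the first year.
const afterFiveYears = project(100, -0.03, 5);
console.log(afterFiveYears.toFixed(1));                 // "85.9"
console.log(cagr(100, afterFiveYears, 5).toFixed(3));   // "-0.030"
```

In other words, a -3% CAGR over 5 years means volumes end up roughly 14% lower than where they started: gradual, but clearly down.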
They have not released their global figures for 2015, but their 2014 report showed similar trends with mature markets declining (US, Western Europe, Japan & Korea). However, emerging markets like China, India, and Russia show growth and just make up for declines in the mature markets in 2014.
Does anyone care about the PSTN anyway?
Outside of adding touch tone dialing and going cordless, the Public Switched Telephone Network (PSTN) telephony user experience hasn’t exactly changed a whole lot in a hundred years. The PSTN is only one way to make calls – now we have dedicated VoIP apps, messenger apps with voice, and a growing number of video communications options. Do these new forms of RTC give us any hope of reversing traditional telephony’s demise? The data here is more positive.
Ofcom’s data shows an increasing usage of VoIP for voice calls and a very definitive increase in video call usage. This is consistent with their international research from a year earlier:
VoIP Apps Save the Day
So where do newer RTC apps and features fit into all this? Using Ofcom’s methodology, the 18 countries they track produce somewhere around 10 Trillion minutes a year. Microsoft has previously claimed Skype does up to 3 Billion minutes a day – that’s a Trillion minutes a year if one assumes around a 3 Billion daily average. Even if the true annual value is half of that, clearly Skype alone is meaningful compared to PSTN volumes.
Apple does not release any figures for its FaceTime service introduced in 2011, but presumably its usage is substantial, although less than Skype’s based on Ofcom’s past user surveys. WeChat, Line, and Viber all have more than 200 million monthly active users with various VoIP features. WhatsApp now has more than 1 billion MAU. Its VoIP calling feature launched in April 2015 has more than 100 million voice calls a day. Taken together, these other VoIP services are easily more than a trillion minutes a year.
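The back-of-the-envelope comparison above can be written down in a few lines. This sketch uses only the round figures quoted in the text (10 trillion PSTN minutes a year, up to 3 billion Skype minutes a day, at least 1 trillion minutes a year for the other services); the constant and function names are invented for this example.

```typescript
const DAYS_PER_YEAR = 365;

// Round figures quoted in the text.
const pstnMinutesPerYear = 10e12;   // ~10 trillion, Ofcom's 18 tracked countries
const skypePeakMinutesPerDay = 3e9; // "up to 3 Billion minutes a day"
const otherVoipMinutesPerYear = 1e12; // "easily more than a trillion", used as a floor

function annualMinutes(minutesPerDay: number): number {
  return minutesPerDay * DAYS_PER_YEAR;
}

function shareOfPstn(minutesPerYear: number): number {
  return minutesPerYear / pstnMinutesPerYear;
}

const skypeHigh = annualMinutes(skypePeakMinutesPerDay); // if every day were a peak day
const skypeLow = skypeHigh / 2;                          // "even if the true value is half"

console.log((shareOfPstn(skypeHigh) * 100).toFixed(1)); // "11.0" (Skype alone, high estimate)
console.log((shareOfPstn(skypeLow) * 100).toFixed(1));  // "5.5"  (Skype alone, halved)
console.log((shareOfPstn(otherVoipMinutesPerYear) * 100).toFixed(1)); // "10.0" (other services, floor)
```

These are coarse estimates built on vendor claims, so treat the percentages as orders of magnitude rather than measurements.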
At 10 to 20% of the PSTN’s volume, clearly VoIP traffic has a ways to go before it dominates the PSTN, but there is no doubt its volumes are meaningful in comparison. Furthermore, these services are still growing. Certainly some of that growth will come at the expense of the PSTN, but it appears they are also encouraging more RTC use in general.
Does WebRTC matter?
WebRTC does not factor heavily into the services cited above, but that is poised to change. At only 5 years old, WebRTC has not had that much time to widely establish itself in relation to other VoIP technologies. Still, there are a few notable standouts – most notably Facebook Messenger. Facebook has stated it has more than 300 million monthly active users of Messenger’s VoIP features and just this week announced it had 245 million monthly video users. Other notable users include Snapchat and of course Google’s Hangouts and Duo services.
There are a lot of other WebRTC apps showing big user gains too such as Houseparty which reported it had 20 million minutes of usage a day last month – not bad for an app that only emerged from the ruins of Meerkat a few months ago. In addition, more traditional VoIP apps like Whatsapp and Skype are starting to use WebRTC, albeit in limited circumstances today but that will certainly grow too.
In aggregate, I estimate WebRTC-based services easily have over 500 million MAU this year across 2 billion devices. Comparing this to other VoIP technologies at the 5 year mark, WebRTC is way ahead. This bodes well for WebRTC to be an incremental driver of VoIP traffic and further accelerator of RTC.
Conclusions
I have been concerned that the desire of people to communicate in real time reached its pinnacle long ago. Why focus on RTC if the trend is clearly toward “messaging” and other forms of textual interaction? Has telephony peaked? The evidence suggests that is probably the case for the PSTN in developed markets, but there are plenty of pockets of growth. Where declines exist, they are gradual. Even better, there is a large body of evidence that VoIP services are more than making up the gaps of any declines and then some. This indicates that we are actually using real time communications more than ever. The recent and rapid rise of many WebRTC services further shows that this trend is very likely to continue, or perhaps even accelerate. That’s great news for the hundreds of WebRTC vendors out there and those that have yet to come.
WebRTC is the most secure technology for video communications. And yet – developers can screw this for you.
There is a rise in security breaches and data theft incidents in 2016. You see this from the amount of information out there. I’ve written about WebRTC and security for quite some time, but a recent post I’ve read compelled me to write about it again.
- Site is secure
- A contractor places database dump on the internet for backup
- And that gets found
It probably happens more often than not. You build a service. You take care of its security. And then, someone down the line screws you over with his maintenance processes. To some extent, this is just as bad as social engineering, where a hacker tries to gain access by fooling people into believing he is someone else.
Make sure to download the WebRTC Security checklist. Print it and stick it on the wall behind your monitor so you don’t forget.
WebRTC Security baseline
WebRTC comes with a few security concepts that are quite new and innovative in VoIP:
- In WebRTC, EVERYTHING is encrypted. Not only by default, but also in a way that can’t be modified – there is no way to send data over WebRTC in the clear
- WebRTC forces you to operate over HTTPS and WSS in your web application, so signaling gets encrypted as well
- Screensharing requires an additional layer of consent, be it whitelisting of your site or a creation of a browser extension
- Browsers today update frequently and automatically, so any security threat found gets patched faster than most enterprise and VoIP vendors react to their security breaches
The thing people forget is that WebRTC is just a piece of technology. A building block. It is up to the developers to decide how to use it in their own product. During that integration, security breaches can be created quite easily.
In the WebRTC course I launched two months ago, I’ve added a lesson dealing with WebRTC security. It goes through the mechanisms that exist in WebRTC and the areas that need to be further secured by the application.
Two big issues left to developers today are TURN passwords and access to backend server resources.
#1 – TURN passwords
TURN servers predate WebRTC. They are used by SIP (or at least are found in the spec), and there, the notion is that the user agent (=device/endpoint) is secure and “named”. So a username and password mechanism was created to get a TURN binding. The reason you want such a mechanism in the first place is because TURN servers are bandwidth hogs – they relay media, and by doing that they cost a lot in terms of bandwidth. So if you are paying for it, you don’t want others to piggyback on it.
The current approach out there is to use temporary passwords (I like calling them ephemeral – it makes me sound intelligent). Ones that become useless in an hour or two.
This means that someone in your backend randomly creates a password that is short-lived and shares it with both the TURN server and the client.
The above illustrates how this is done.
- The App Server, in charge of signaling in this case, creates a password. It updates the TURN server about said password and also gives that information to the User
- The User then creates a peer connection, configuring the TURN server in it with the relevant temporary password
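One common way to get such short-lived passwords is sketched below. Note that it is a variation on the flow above rather than the exact same thing: instead of the App Server pushing each random password to the TURN server, both sides share a long-term secret and the TURN server recomputes the password itself. This is the scheme coturn supports through its `use-auth-secret` option. The function names, the example URL and the secret are all placeholders made up for this sketch.

```typescript
import { createHmac } from "node:crypto";

interface TurnCredentials {
  username: string;   // "<unix expiry time>:<user id>"
  credential: string; // base64(HMAC-SHA1(shared secret, username))
}

// Runs on the App Server. `sharedSecret` must match the secret configured on
// the TURN server and must never be sent to the browser.
function makeTurnCredentials(
  sharedSecret: string,
  userId: string,
  ttlSeconds: number,
  nowMs: number = Date.now()
): TurnCredentials {
  const expiry = Math.floor(nowMs / 1000) + ttlSeconds;
  const username = `${expiry}:${userId}`;
  const credential = createHmac("sha1", sharedSecret).update(username).digest("base64");
  return { username, credential };
}

// Builds the entry the User places under `iceServers` when creating the
// RTCPeerConnection. Plain object, so it can be sent over signaling as JSON.
function toIceServer(turnUrl: string, creds: TurnCredentials) {
  return { urls: turnUrl, username: creds.username, credential: creds.credential };
}

// Placeholder values: a one-hour credential for a hypothetical user "alice".
const creds = makeTurnCredentials("replace-with-your-secret", "alice", 3600);
console.log(toIceServer("turn:turn.example.com:3478", creds));
```

Because the expiry time is part of the username, the TURN server can reject the credential once it is stale without any per-user state, which is what makes the one-to-two hour lifetime practical.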
Now let’s add a media server into the mix.
Who should be generating that password and passing it around to whom? Should the Media Server now be in charge of it, or is it up to the App Server still to take care of this?
Which leads me to the second important security aspect of WebRTC when it comes to your development – backend server resources you need to protect.
#2 – Backend server resources
In many cases, I find that when the work is outsourced, the end result tends to be a jumble of an architecture if things aren’t thought out properly from the beginning.
This usually causes the wrong servers to need to connect and communicate directly with the User. While not an issue on its own, it can easily turn into a headache:
- Not having a clear picture of the state in your backend means you lose control – this can turn ugly when issues arise
- Opening up more of your backend towards the internet means more points to secure against penetration
- And yes – I know there’s a trend to treat servers in the cloud as if they are always open to the internet
- Which means you need to think about how best to protect them in the first place anyway, which happens to be closing them as much as possible
What I suggest in many cases is:
- Media servers should never be controlled or accessed directly from the Internet
- Media servers should only pass media to and from the Internet
- Whenever they need to be controlled, you do that using backend-to-backend communication from other servers you have that are already managing the users on the Internet
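To make the last point a bit more concrete, here is a minimal sketch of a check a media server's control interface could run before accepting a command. It assumes your backend servers sit in a private network range; the `10.0.0.0/8` range and the function names are hypothetical. In a real deployment you would most likely enforce this with firewall rules or security groups rather than in application code, so take it as an illustration of the rule only.

```python
import ipaddress

# Hypothetical private networks where the app and signaling servers live.
BACKEND_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_backend_address(source_ip, networks=BACKEND_NETWORKS):
    """True when the caller's IP belongs to one of our backend networks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in networks)


def authorize_control_request(source_ip):
    """Accept media server control commands only from backend servers."""
    if not is_backend_address(source_ip):
        raise PermissionError(f"control request from {source_ip} rejected")
    return True
```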
I am not a security expert. I know a bit about it and try to stay informed, but I am by no means an expert in it.
You should make sure to take security into consideration when developing your service and don’t assume WebRTC does everything for you. It doesn’t, but it is the best starting point you’ll get.
If you want to learn more about WebRTC, I will be opening the course again for another round. Probably during April.
If you are a corporation looking for open access to the course materials throughout the year for your workforce – I am going to announce such a plan soon, but feel free to reach out to me before that happens.
Just do me a favor – don’t leave WebRTC security to chance.
Need a reminder? Download the WebRTC Security checklist. Print it and stick it on the wall behind your monitor so you don’t forget.
The post The Best WebRTC Security is Prone to the Stupidest Developer appeared first on BlogGeek.me.
Got a requirements document to write for WebRTC? Here’s a step by step guide to doing just that.
Here is something that I do with my customers quite often. In many cases, when I consult vendors, they are in the process of building a new product or integrating an existing product with some new communication capabilities. This involves using WebRTC and outsourcing the actual development.
More often than not, I find myself writing the baseline of the requirements document for the customer, to serve as a WebRTC RFP (Request For Proposals) that gets used to communicate the requirements to the potential outsourcing vendors.
I wanted to share the process that I use in writing the first draft of this document. To make this a bit more useful, let’s assume that what we want to do is build a webinars service, where a few people can join as the speakers in the webinar and people online can “listen in”.
I’ve created a WebRTC requirements template and a sample webinars requirements document that you can use when you need to write the requirements for your own product.
Get the WebRTC Requirements Template and Sample Webinars Requirements Document
Here’s step by step how I’d go about doing that.
Step #1: Structure your document
First things first. To make sure I don’t forget anything, I like to split my requirements document into 4 sections:
As you can see above, I place TBD for each section in the document. I do that for all subsections that I add to the document as well. This way, I can easily search for the areas that haven’t been filled in properly yet when I work on it. More often than not, writing these WebRTC requirements takes a couple of hours and spans a few days because the work is collaborative in nature.
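If you keep the draft as plain text, the search for leftover TBDs is easy to automate. Here is a small sketch of that; the function name and the sample draft (including its section names) are made up for illustration.

```python
import re


def find_tbds(document_text):
    """Return (line_number, line) pairs for every line still containing TBD."""
    pattern = re.compile(r"\bTBD\b")
    return [
        (number, line.strip())
        for number, line in enumerate(document_text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical draft with two sections still waiting to be filled in.
draft = "Overview\nA webinar service.\nFeatures\nTBD\nScalability\nTBD\n"
for number, line in find_tbds(draft):
    print(f"line {number}: {line}")
```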
I tend to leave out the mechanics around the project – such as the price model I am looking for, or the timeline of the project. These tend to change between companies and they are often better reflected elsewhere than in the technical requirements that I try to describe here.
Step #2: Write the overview
First thing I do once I have the template ready for my needs is write the overview part.
I try to keep the overview short and sweet, with a focus on making sure people understand what it is that I am trying to achieve in the service – what my challenges are and what I consider as success.
Usually, 2-3 paragraphs should be enough.
Step #3: Describe the architecture
Now it is time to start thinking about our architecture. By that, I don’t mean the architecture of the solution, what processes, servers and switches I want – I leave that for the vendor to fill in. What I mean is the entities I have in my service, trying to focus around the session – the types of media and signaling I want running there.
I do this by going analog, just jotting it down on my whiteboard and taking a picture of the end result. I find this more natural for me than using PowerPoint or Visio. Later on, I might redo it as a PowerPoint diagram, but more often than not, I just leave it as is.
Above is the drawing I just did to describe the BlogGeek.me Webinar I just invented.
After the visual, I explain the different entities that are in the drawing and the relations between them. This part is really important, as oftentimes, it will reveal entities or flows that I haven’t thought about earlier.
In the case of the BlogGeek.me Webinar, we’ve got multiple potential Speakers who interact using audio and video with each other in the Webinar, which then gets sent to multiple Viewers and also to an external Storage.
I try to keep things focused and to the bare minimum that is necessary for the understanding of the service.
Step #4: Fill in the features
To some extent, this step is the main chunk of what the product does. For me, this is a brain dump of the things a user should be able to do in the system.
There are different types of features you might be needing. I focus on those that relate to the communication part of the product and nothing else.
Here’s a checklist of what I usually go through when doing this:
- Is this a 1:1 service or multiparty?
- Audio? Video?
- Any screen sharing or other collaboration capabilities?
- How do users get authorized, authenticated and connected?
- Any in-session controls users need to have?
- Any indicators to show on their display?
- Do we need recording capabilities?
- Anything that we missed?
Make sure you answer all the questions above as requirements in the document if they make sense and add your own to the list.
Here are a few of the ones I’ve written for the webinars product:
Notice how I’ve indicated that connectivity via PSTN is optional in a future phase? This serves two purposes for me:
- It gives the vendor a hint of what architecture to put in place to support this later on down the road
- It also gives the vendor a feeling that this is a journey and not a one-off project. He will be more committed to its success if he knows you might call on him later on to improve and extend the service
Now it is time to go over the non-functional requirements. These are the boring and ugly details that can make or break a service, so spend enough time on this one.
What do I mean by non-functional? These will usually be things you will take for granted, but the vendor won’t. To reduce friction and arguments in the future, I add these. In all likelihood, if you don’t write these down, a vendor will ask about a few of these things anyway – so just write them down to reduce the unnecessary round trips and to make sure you and the vendor are on the same page.
I tend to split this section into 5 subsections, each with its own focus:
1. Devices
Here I list all the devices I want to support. Browsers, operating systems, mobile devices, etc.
Each gets its own special treatment. Things I usually look at here are:
- Which browsers to support natively?
- Do I need an Electron PC app? If I do, then on which operating systems?
- What versions of iOS to support? Which earliest devices?
- What versions of Android API to support? How many specific devices do I want tested?
In many ways, I derive the requirements here based on the WebRTC Device Cheat Sheet that I published.
NAT traversal is often overlooked. There are two areas where I cover NAT traversal – here and in the Security subsection below.
Here, I define who takes care of it – do I expect the vendor to bring a NAT traversal solution, will I be doing it, or should they use a third party hosted service (there are a few out there offering it).
The second part that I sometimes decide here, but not always, is where I want it deployed – along with the media servers or closer to the connecting user. It is a matter of architecture needs that I usually prefer leaving to the vendor to fill in, though I can’t give a definitive rule for when.
In my webinar example, I decided to make things easy and just use a third-party hosting service:
3. Scalability
For scalability I make sure I cover a few areas:
- What’s the scale of a single session? Just make sure that if there are different types of users, you indicate how each can scale
- What’s the scale of the service as a whole? How many sessions can exist concurrently?
- Do I need to address any geographical locations when it comes to scaling?
- How do the different parts of the system scale? Independently of each other?
Here’s how I fit it into our webinar example:
4. Security
The security part is slightly tricky. First, because I am not an expert. But also because almost nobody is.
What I usually place here is the basics of how I’d like to see the backend (encryption between the servers), but I do cover two important areas:
- Media servers. I prefer access to them to be limited to the application server only when it comes to control and have all signaling be routed through the application server or a signaling server. I don’t like giving access to my resource hog openly over the internet. Call me old fashioned
- TURN servers. Here I always state that I want ephemeral passwords. Otherwise, vendors usually do the short route of using a static username and password, which in WebRTC is like no password at all on the TURN server
The DevOps section deals with things required to run this product on a daily basis. I tend to fill in three main things here:
- Hosting – is this planned to be deployed on premise? In a specific cloud provider? Using Docker or some other container technology?
- Reporting – what type of information do you want to collect to generate reports on the use of the system? These should be for offline use – think a daily email or something similar
- Events and statistics collection – this is what you want to be able to collect to monitor the health of the service in realtime
Now that we’ve written it all, it is time to go over the whole document to make sure things aren’t missing:
- Clean up any leftover TBDs
- Add clarifications where necessary
- Add things not in the template
Here’s what I decided to add to the webinar example:
As you can see, for me, open source was really important.
Now that you are done – go share the document with your colleagues, and once approved internally, it is time to share with potential outsourcing vendors.
Why so short?
To some, this approach may seem a bit shallow. It doesn’t include all corner cases or describe in a lot of detail what goes on. The thing is, there is a balance between what you can effectively do and achieve as a small startup, or even as a big company with a new project, and what you’d do on a long-running, multi-year, multi-million-dollar project.
For me, this has proven to be a good way to capture the essence of what needs to be developed and to get replies from potential vendors about building the product. Once I get the replies, it is time to go over them and see who makes the most sense – a lot based on how they replied to the RFP in the first place.
What’s next?
So here’s how you should write your next WebRTC requirements document:
Step #1: Structure the document to make sure all bases are covered
Step #2: Focus on the overview – explain what your product needs to achieve
Step #3: Draw the architecture and explain it
Step #4: Write down your functional requirements
Step #5: Write down all non-functional requirements
Step #6: Do a one-over to make sure you didn’t miss anything
I’ve built a WebRTC Requirements Template document for you. You can copy it and fill it in with the requirements of your own product. It already holds many of the questions you’ll need to answer, so it can serve as a guide for you.
Now, to write this article, I also had to create a real-world example (remember our webinar service?). This example is also shared so you can see how I write things down.
WebRTC Requirements Template and Sample Webinars Requirements Document
Oh, and if you still need help – I do offer a consulting service, where a lot of the time invested is placed into writing these requirements documents, finding suitable potential vendors and going over their responses.
The post How to Write the WebRTC Requirements for Your New Product? appeared first on BlogGeek.me.
Want to learn more about WebRTC in education?
Next week, testRTC will be hosting a webinar titled How WebRTC ushers the next wave of e-Learning innovation. As a co-founder of testRTC, I am tasked with the actual creation and hosting of the webinar, which means I will be speaking about what vendors are doing with WebRTC when it comes to education and where I see their challenges.
I haven’t done a webinar in quite some time, so this is going to be fun for me.
We’ve decided to use Crowdcast as our webinars platform for it. Partially because it is a WebRTC based service, and I do love dogfooding. But also because I received some good reviews about it.
If I had to pick two very active verticals in the domain of WebRTC, these would be healthcare and education. We see this also at testRTC, where we help these vendors in testing and deploying their services to production.
So here’s what we’re doing next Wednesday – me and you:
- On December 14 at 14:30 EDT, we’re going to meet online
- I am going to give a few interesting examples of what education looks like when it meets WebRTC
- Then talk about some of the challenges involved
- You will have time to ask questions. I’ll answer them to the best of my ability
- And then I am going to give you a bonus
The examples part of the webinar is probably going to be the most interesting one.
I remember talking almost 3 years ago with a startup in India about their use case. It was related to education and it blew my mind. It was so starkly different than what I assumed a startup in India would do within their local market for education that I saw it as my own private lesson. Since then, I talked with tens of vendors in this space. Each doing his own thing. Each focusing on solving a problem in tutoring. They are so wide in variety that you can’t even look at them as a single market.
But this is exactly what we will try to do here. I am going to categorize them a bit – I wonder where you will find yourself in that categorization.
The challenges
Learning has its challenges for the student, the teacher and now also for the platform.
My intent is to look at the challenges of the platform – what are the things necessary to put these different education systems in production and how to make sure they work properly.
For the various types of education platforms, I’ll give you tips for where you should focus with your testing – what are the weak spots to look for – so you can find and deal with them before your customers do.
The bonus
I am not going to say what the bonus is now – it will ruin the surprise. I will say though, that this is something you’ll find immediately useful.
The bonus will be available only to those who will be with me during the webinar itself, so register now and save your place.
What’s next?
Register of course!
And feel free to write down your questions in advance – Crowdcast allows for that.
The post WebRTC and Education – the Webinar Edition (and a Bonus) appeared first on BlogGeek.me.
Different ways to do the same thing.
One of the biggest problems is choice. We don’t like having choice. Really. The fewer options you have in front of you, the easier it is to choose. The more options we have – the less inclined we are to make a decision. It might be this thing called FOMO – Fear Of Missing Out, or the fact that we don’t want to make a decision without having the whole information – something that is impossible to achieve anyway, or it might be just the fear of committing to something – commitment means owning the decision and its ramifications.
WebRTC comes with a huge set of options to select from if you are a developer. Heck – even as a user of this technology I can no longer say what service I am using:
- I use Drum for my Virtual Coffee sessions (haven’t done one in some time. Should do one next month)
- I now use Jitsi meet for my Office Hours
- Google Hangouts for testRTC meetings with customers
- Whatever a customer wants for my own consultation meetings, which varies between Hangouts, Skype, appear.in, talky, GoToMeeting, WebEx, … or the customer’s own service
In my online course, there’s a lesson discussing NAT traversal. One of the things I share there is the need to place the TURN server as close as possible to the edge – to the user with his WebRTC client. Last week, in one of my Office Hour sessions, a question was raised – how do you make that decision? And the answer isn’t clear cut. There are… a few options.
My guess is that in most cases, the idea or thought of taking a problem and scaling it out seems daunting. Taking that same scale out problem and spreading it across the globe to offer lower latency and geolocation support might seem paralyzing. At the end of the day, though, it isn’t that complex to get a decent solution going.
The idea is you’ve got a user that runs on a browser or a mobile device. He is trying to reach out to your infrastructure (to another person probably, but still – through your infrastructure). And since your infrastructure is spread all over the globe, you want him to get the closest box to him.
How do we know what’s closest? Here are two ways I’ve seen this go down with WebRTC based services:
Via DNS
When your browser tries to reach out to the server – be it the STUN or TURN server, the signaling server, or whatever – it ends up using DNS in most cases (you better use DNS than an IP address for these things in production – you are aware of it – right?).
Since the DNS knows where the request originated, it can make an informed decision as to which IP address to give back to the browser. That informed decision is made on the infrastructure side, but by the DNS itself.
One of the popular services for it is AWS Route 53. From their website:
Amazon Route 53 Traffic Flow makes it easy for you to manage traffic globally through a variety of routing types, including Latency Based Routing, Geo DNS, and Weighted Round Robin.
This means you can put a policy in place so that the Route 53 DNS will simply route the incoming request to a server based on its location (Latency Based Routing, Geo DNS) or based on load balancing (Weighted Round Robin).
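To get a feel for what these policy names mean, here is a toy sketch of the two decisions. It is only a conceptual illustration with made-up region names, latencies and weights. It does not use the Route 53 API and says nothing about how Route 53 works internally, and the weighted policy is approximated by a weighted random draw.

```python
import random


def pick_by_latency(latency_ms_by_region):
    """Latency-based routing: answer with the region that measured fastest."""
    return min(latency_ms_by_region, key=latency_ms_by_region.get)


def pick_by_weight(weight_by_server, rng=random):
    """Weighted round robin, approximated here by a weighted random draw."""
    servers = list(weight_by_server)
    weights = [weight_by_server[name] for name in servers]
    return rng.choices(servers, weights=weights, k=1)[0]


# Hypothetical measurements for one client, in milliseconds.
print(pick_by_latency({"us-east": 120, "eu-west": 45, "ap-south": 210}))
```

With the weighted policy, a server with weight 3 should get picked roughly three times as often as one with weight 1, which is handy when your boxes aren't the same size.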
Amazon Route 53 isn’t the only such service – there are others out there, and depending on the cloud provider you use and your needs, you may end up using something else.
Via Geo IP
Another option is to use a Geo IP type of a service. You give your public IP address – and get your location in return.
You can use this link for example to check out where you are. Here’s what I get:
A few things that immediately show up here:
- Yes. I live in Israel
- Yes. My ISP is Bezeq
- Not really… Tel-Aviv isn’t a state. It is just a city
- And I don’t live in Bat Yam. I live in Kiryat Ono – a 20km drive
That said, this is pretty close!
Now, this is a link, but you can also get this kind of a thing programmatically and there are vendors who offer just that. I’ve had the pleasure to use MaxMind’s GeoIP. It comes in two flavors:
- As a service – you shoot them an API request and get geo IP related information, priced per query
- As a database – you download their database and query it locally
There’s a kind of a confidence level to such a service, as the reply you get might not be accurate at all. We had a customer complaining that the testRTC servers jinxed his geolocation feature and added latency. His geo IP service thought the machine was in Europe while in truth it was located in the US.
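Once a geo IP lookup hands you coordinates, picking the closest server is mostly a distance calculation. Here is a minimal sketch, assuming the lookup has already returned a latitude and longitude (or nothing at all, in which case we fall back to a default). The region names and coordinates are hypothetical, and no real geo IP API is called here.

```python
import math

# Hypothetical TURN deployments: name -> (latitude, longitude).
TURN_REGIONS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_region(client_location, regions=TURN_REGIONS, default="us-east"):
    """Pick the nearest region, or the default when the lookup gave nothing."""
    if client_location is None:
        return default
    return min(regions, key=lambda name: haversine_km(client_location, regions[name]))
```

Keep in mind that geographic distance is only a stand-in for latency, and that a wrong geo IP answer simply sends the user to a farther server.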
The interesting thing is that different such services will give you different responses. Here’s where I am located based on a few of them (see here):
As you can see, there’s a real debate as to my exact whereabouts. They all feel I live in Israel, but the city thing is rather spread – and none of them is exact in my case.
There are many Geo IP services. They will differ in the results they give. And they are best used if you need an application level geolocation solution and a DNS one can’t be used directly.
Telemetry
When inside an app, or even from a browser when you ask permission, you can get better location information.
A mobile device has a GPS, so it will know the position of the device better than anything else most of the time. The browser can do something similar.
The problem with this type of location is that you need permission to use it, and asking for more permissions from the user means adding friction – decide if this is what you want to do or not.
What’s next?
I am sure the DNS option is similar in its accuracy level to the geo IP ones, though it might be a bit more up to date, or have some learning algorithm to handle latency based routing. At the end of the day, you should use one of these options and it doesn’t really matter which.
Assume that the solution you end up with isn’t bulletproof – it will work most of the time, but sometimes it may fail – in which case, latency will suffer a bit.
Planning on introducing WebRTC to your existing service? Schedule your free strategy session with me now.
The post WebRTC, TURN and Geolocation. How to Pick the Best Server to Work With? appeared first on BlogGeek.me.
Kranky Geek last week was quite a rush.
What can I say. Last week, our Kranky Geek event was so much fun.
I won’t bore you with the details. We’ve focused this time on WebRTC in mobile. Got the best speakers possible – really. And had a blast of an event. I received so much positive feedback that it warms my heart.
I’d like to thank our sponsors for this event: Google, Vidyo, Twilio and TokBox. Without them, this event wouldn’t have been possible.
The videos are available online, and below you’ll find the playlist of the event:
Tomorrow we’re doing another Kranky Geek event. This time in Sao Paulo, Brazil. Different theme. Different sessions. I am dead tired, but working hard with Chad and Chris to make that a huge success as well. See you soon!
All you wanted to know but didn’t know how to ask.
2 billion Chrome browsers? 7 billion WebRTC enabled devices by 2017? 50 billion IoT devices?
At the end of the day, who cares? What you are really interested in is to make sure that the WebRTC product you develop will end up working for YOUR target customers. If these customers end up running Windows XP with Internet Explorer 6 then you couldn’t care less about Apple, Safari and iOS support. But if what you are targeting is a mobile app, then which browser supports WebRTC is less of an issue for you.
To make things a bit simpler for you, I decided to create a quick Cheat Sheet. A one pager to focus you better on where you need to invest with your WebRTC efforts.
This cheat sheet includes all the various devices and browsers, and more importantly, how to get WebRTC to work on them.
So why wait? Grab your copy of the cheat sheet by filling out this form:
Time for a quick reality check when it comes to browsers and WebRTC.
I know you’ve been dying for Apple to support WebRTC in Safari. I am also aware that without WebRTC in the Microsoft Internet Explorer 6 that you have deployed in your contact center there is no way for WebRTC to become ubiquitous or widely adopted. But hear me out please.
Browsers market share
The recent update by NetMarketShare on the desktop browsers market share is rather interesting:
It shows the trend between the various desktop browsers for the last year or so.
Here are some things that come to mind immediately:
- Google Chrome now has 55% market share. Its rise has stalled somewhat in the last couple of months
- Microsoft Internet Explorer is still free falling. It will probably stop somewhere at 10% or so if you ask me
- While Chrome gained the most users from Internet Explorer, it seems that Firefox has picked up users from Internet Explorer in the past two months
- Microsoft Edge gained very little from the demise of Microsoft Internet Explorer. People who have adopted Windows 10 aren’t adopting Edge and are most probably opting to install and use Chrome or Firefox instead. I’ve mentioned it here in the past
What happens between Microsoft Edge and Apple Safari is even more interesting. Apple Safari is falling behind Microsoft Edge:
Something doesn’t add up here.
The Edge numbers should rise a lot higher, due to the successful upgrades we’ve seen for Windows 10 in the market. And they don’t. We already noticed how Chrome and to some extent Firefox enjoyed that switch to Windows 10.
I am not sure how the slip of Apple Safari market share from almost 5% in the beginning of this year to below 4% can be explained. Is it due to the slip in Mac sales in recent months or is it people who prefer using Chrome or Firefox on their Macs?
There’s one caveat here of course – these numbers are all statistics, and statistics do tend to lie. When going to specific countries, there will be a different spread across browsers, and to a similar extent, your service sees a different type of browser spread because your users are different. Here’s the stats from Google Analytics for this blog:
For me, it is tilted towards browsers supporting WebRTC, and Safari is way higher than Edge and Internet Explorer put together.
Back to WebRTC
Every once in a while, someone would stand up and ask: “But what about Internet Explorer?” when I talk about WebRTC. It is becoming one of these questions I now expect.
Here’s what you need to think about and address:
- Chrome is probably your go-to browser and the first one to support with your WebRTC product
- Firefox comes next, and is growing. So keep tabs on it to see how it “performs” with your product
- Edge. Useless for most. Add support to it if:
- You do voice only (should work nicely), and you want that extra market share
- You know for sure your users are on Edge
- Internet Explorer. Ignore
- Microsoft probably won’t invest in having WebRTC support in it, so don’t wait for them
- Use a plugin or whatever if you must
- Safari. Ignore for now. Nothing to do about it anyway
I am working on a quick cheat sheet for you. One which will enable you to make fast decisions for browser support. It will extend also into apps and mobile. Probably by next week.
Until then, if you plan on picking up browsers to support, think of your target audience first. Don’t come up with statements like “IE must be supported” or “Without Safari I can’t use this technology”. You are just hurting yourself this way.
The post Desktop browsers support in WebRTC – a reality check appeared first on BlogGeek.me.
Kranky Geek is coming to town!
WebRTC is maturing. We’re 5 years into this roller coaster and it seems most companies have already understood that they need to use WebRTC in one way or another. To many, this is going to be an excruciatingly painful journey. They will need to change their business model, think differently about how they develop products and even rewrite their core values.
One of the reasons we decided to launch Kranky Geek over two years ago was to have a place where developers can teach developers about WebRTC. Somewhere that isn’t already “tainted” with the telecom views of the world – not because they are bad – just because WebRTC can accomplish so much more. What we are going to do next with WebRTC takes place in November and will happen in two separate locations:
Kranky Geek San Francisco
San Francisco is where Kranky Geek started and where I feel at home when it comes to this event. We will be doing our 3rd Kranky Geek event in San Francisco (and 4th in total).
It will take place on November 18, at Google’s office on Spear street.
Our focus this time around is going to be mobile. We’ve got sessions lined up that should cover most of the aspects related to WebRTC and mobile. Things like using React, cross platform development, video compression, specific aspects in iOS as well as specific aspects in Android related to WebRTC.
If you are into mobile development with real time communications – then this is an event you don’t want to pass up.
There is also a new attendance fee that was added – $10 that gets donated to Girl Develop It. You may notice we don’t have a woman speaker this time – it is hard to find women speakers in this domain, so if you are one or know one – make sure to let us know for our future events.
I’d like to thank our sponsors who made this thing possible:
- Google – who brought us WebRTC in the first place and is instrumental to the success that is Kranky Geek
- TokBox – sponsoring both the San Francisco and São Paulo events. They will share their experiences with mobile aspects of WebRTC related to Android
- Twilio – sponsoring both the San Francisco and São Paulo events. Their session in San Francisco will cover WebRTC and the Internet of Things
- Vidyo – a new sponsor that is joining the Kranky Geek family, and probably the best one suited to talk about real time video compression technologies that make sense in mobile devices
This will be my first time in Brazil and also the first time we run Kranky Geek in Brazil. As with San Francisco, the event is hosted at Google’s office in São Paulo.
Our focus for São Paulo will be back to the basics of WebRTC. We are trying this time to fill in the gaps – share resources and insights that developers who use WebRTC in their daily activities need. This is why we have a few sessions that are targeted at debugging and troubleshooting WebRTC in this event.
Registration for the São Paulo event is free.
For the São Paulo event, we got the help of a few sponsors as well:
- TokBox – at São Paulo, TokBox will share with us how to deal with device and connectivity issues when it comes to WebRTC sessions
- Twilio – will be looking at the makeup of a WebRTC service, as the browser implementation of WebRTC is only the beginning of the journey
- WebRTC.ventures – who are sponsoring this event for the first time, will give the overview and introduction to WebRTC
- Callstats.io – will explain what you can find in getstats() and how to use it
I have my own session to prepare for the upcoming Kranky Geek, along with a lot of work to make these two events our best yet. There are also changes and modifications that need to make their way to the website – but rest assured – these events have great content lined up for you.
If you happen to be in the area, my suggestion is to come to the event – it is the best place to learn and interact with people who know WebRTC inside and out, way better than I do.
And if you want to meet me – just contact me. I’ll be “in town” for an extra day or so.
See you all at Kranky Geek!
The post Get Ready for Kranky Geek San Francisco AND São Paulo appeared first on BlogGeek.me.
No article today.
My course is launching today: Advanced WebRTC Architecture Course.
I’ve got some solid attendance for it, along with a good bulk of high quality material lined up.
Hopefully, this will be a success.
If you are taking the course – then good luck and please share your thoughts with me – I’ve built this course for you and I’d like you to benefit from it as much as possible.
If you aren’t taking it but still want to attend – feel free to enroll. I’ll be closing up course signups end of this week, with no clear indication if and when I’ll be running it next.
Now quiet please – there are people studying in here. Somewhere. Hopefully.
WebRTC course starts Monday next week.
At long last, the wait will be coming to an end and my recent sleepless nights as well. I’ve been working these past months to put up the content for the course, not knowing how it will end up.
Most of the materials have been recorded, uploaded and prepared already, waiting for me to just manually add all the people who enrolled. There’s a lot of material in that course, and a lot more that I am sure is still missing in there. Trying to cover WebRTC in its entirety isn’t easy.
Through the process of putting this stuff up and out there, I’ve learned a lot myself.
The course is split into 7 sections:
- The basics of WebRTC – explanation of what WebRTC is, a review of its APIs and call flows, and general knowledge. This should get you up to speed about what it is and will probably place you among the first 10,000 people in the world who know it at this depth. It will also enable you to read the stuff that is out there about WebRTC more critically
- Networking basics – while we all use the Internet, many of us don’t know the distinction between TCP and UDP, or what Websockets really is. This section tries to put these things in perspective and lay the groundwork for later sections. It will be super useful for VoIP developers but also great for web developers. It also covers the NAT traversal challenge and the solutions found in WebRTC for it
- WebRTC signaling – signaling isn’t part of WebRTC, but is often something to contemplate. This section dives into the alternatives of signaling that are available, different types of transport protocols, as well as a lesson on SDP. It also covers the security aspects relevant to WebRTC – and it sheds some light on FUD (fear, uncertainty and doubt) around WebRTC
- Codecs – I love codecs. I know little about them, but somehow, more than most. This section explains voice and video codecs, while focusing on what you need to know about them in the context of WebRTC. You won’t be able to implement a codec after this section (I never implemented a codec), but you will gain the understanding necessary for you to decide the codecs for your own scenarios
- Media processing – media processing is at the heart of most decisions you will take in your use case. In this section, I take the time to review how RTP and RTCP work, and then dive into different architectures and processes you might need in your back end. Things like mixing, routing and recording
- 3rd party frameworks and services – here we will be diverting from the beaten path of “normal course material”, and instead of talking about specifications, standards and capabilities, we will look into the various products and open source frameworks that are out there. We will review them and see which one fits what use case, and also gain an understanding of the various routes available to us, trying to match our company’s DNA and requirements to the alternatives at hand
- Common WebRTC design patterns – this is where we will take specific scenarios and challenges, from a list of those I see every day when people reach out to me, and analyze them. Go over the scenario, break it down to requirements and then map them into architectural alternatives. The idea here is to give you the tools to do such things on your own with your products
Most of the lessons are already ready. There are around 6 lessons that I still need to write. Hopefully, they will be available on launch day, but if not, then the following week.
I want to answer a few quick questions here – things I’ve been asked time and again in the past month:
Is this a one-time thing?
Yes and no.
The course takes place October 24 and lasts for 2 months. Those who enroll for office hours get an extended duration of 4 months (as well as office hours).
I don’t plan on doing this as an ongoing thing where you can enroll whenever and do the course. I will be taking the time throughout these two months to listen to the students and see if there’s anything that requires updating in “real time”. I can’t do it if this is an ongoing thing.
This might change in the future, but for now, there’s this timing.
I might do that some months from now, after I rest a bit from the effort and decide if it makes financial sense to run it again.
If you have your own timing issues, then understand that the course is self-paced. You can “leave” for a week or two and come back, do it faster or slower.
Is the course for me?
I can’t really say.
Here are a few types of students that I have already enrolled for the course:
- Developers who need to start using WebRTC, more often than not through a framework that was already selected. They know how it works, but are looking to gain deeper understanding so they can troubleshoot issues or add features to their product
- Product managers who want to learn and understand more about WebRTC. Mostly to give them the language necessary to talk with their developers. And mainly to keep the developers honest
- Teams who work with WebRTC, so they can get the experience together as a team and improve their proficiency
- Testers wanting to understand the technology better and find effective ways to design their test plans
The course doesn’t include too much code. There’s the occasional piece of code shown, but the idea isn’t to explain to you how to develop with WebRTC. In truth – most of you won’t develop with WebRTC directly anyway – you’ll end up using a framework or a third party for that.
The intent is to give you the understanding of the limits and capabilities of WebRTC. To know how to wield this amazing tool and how to use it effectively in your product.
How is the course conducted?
If you enrolled, then you will be receiving an email a day or two prior to the course.
I will be registering you to the course mini-site inside the BlogGeek.me website. Once you login, you will be able to access all course sections and lessons.
Each lesson has a page of its own in the site. Most lessons have a recorded video session as the main bulk of it, along with text and additional reading material. In most cases, that additional reading material is important.
You can “tune in” to any lesson you wish and learn it at your own pace and in your free time.
There is an online forum for the course. Students will be able to raise their questions, issues and feedback there. If things require changes on my end, I’ll try making the changes to the lessons as we move along, maybe even adding course materials and lessons if the need arises. I will also be using the forum to ask questions myself, and to check in on the progress of students.
For those taking office hours, these will take place twice a week at different times to accommodate different time zones. In there, I will answer questions as they come and basically make myself available to you “in the flesh”. I haven’t decided yet which WebRTC service to use for that – suggestions are welcome.
I am still debating whether I should use quizzes as part of the course, placing them at the end of each section. If you have an opinion – please voice it (even if you’re not going to attend the course).
Learn how to design the best architecture for our WebRTC service in this new Advanced WebRTC Architecture course.
The course starts next week.
There’s a Q&A page that may answer additional questions you might have.
Official course syllabus is also available in PDF form.
I’d be happy to meet you if you decide to enroll in the course. This is a new thing for me and I am quite excited about it.
If you are not sure about the course – email me. If there’s no fit – I’ll tell you immediately. If this might help you, I’ll explain to you what you will gain from it so you can make a better decision.
Until next Monday – have an awesome week.
The future of streaming includes WebRTC.
Disclaimer: I am an advisor for Peer5.
If you look at reports from Ericsson or Cisco, what you’ll notice is the growth of video as a large portion of what we do over the Internet. As video takes up an order of magnitude more data to pass than almost anything else we share today, this is no wonder. Here are a few numbers from Cisco’s forecast from Feb 2016:
- Mobile video traffic accounted for 55 percent of total mobile data traffic in 2015. Mobile video traffic now accounts for more than half of all mobile data traffic
- Three-fourths of the world’s mobile data traffic will be video by 2020. Mobile video will increase 11-fold between 2015 and 2020, accounting for 75 percent of total mobile data traffic by the end of the forecast period
I think there are a few reasons for this growth:
- While we’re continuously moving towards HD video resolutions, 4K is already being experimented with. The increase in resolution and frame rates is inevitable. We’ve seen this growth with the displays of our devices and with the cameras we hold in our pockets. Time to see it in the videos we play back
- The hegemony of content creators is broken. User generated content is growing rapidly. It started with YouTube, moving to services such as Vine and now live streaming services such as Periscope, Facebook Live, YouNow and others. More creators = more video sources
- Viewing habits are changing. We are no longer interested in TV series broadcast on air but rather pick and choose what we want to watch and when we want to watch, from an exponentially larger pool and variety of content
The challenge really begins when you look at the Internet technologies available to stream these massive amounts of content:
- Flash / RTMP. This is how we streamed video over the Internet for years, and that period is coming to an end. Google announced limiting its support for Flash by requiring users to opt in on sites that make use of it. This is causing large content sites to scurry towards HTML5 based streaming technologies
- HLS. HTTP Live Streaming – Apple’s mechanism used on iOS devices. And one that is enforced if you wish to stream to iOS devices. To some extent, this makes it “necessary” to support it elsewhere – so there’s also an HLS player for browsers
- MPEG-DASH – the standardized cousin of HLS
- Something else, not necessarily intended for video streaming
The challenge with HLS and MPEG-DASH is latency. While this might be suitable for many use cases, there are those who require low latency live streaming:
[Figure: from my course on WebRTC architecture]
For those who can use HLS and MPEG-DASH, there’s this nagging issue of needing to use CDNs and pay for expensive bandwidth costs (when you stream that amount of video, everything becomes expensive).
Which brings me to the recent deal between Peer5 and Dailymotion. To bring you up to speed:
- Dailymotion is huge
- Peer5 is a startup dealing with peer assisted delivery
- They offload video traffic and reduce strain on servers and CDNs by sending video data across peers
- They do this by using WebRTC’s data channel
- Some of the traffic of Dailymotion now flows via Peer5’s technology, and that’s now official
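To make the idea of peer assisted delivery a bit more concrete, here is a minimal sketch of the decision a player could make for each video segment: ask peers first, and only fall back to the CDN when no peer can help. This is not Peer5's implementation. The names `fetch_segment`, `PeerFetcher` and `from_cdn` are hypothetical, and in a browser the peer transport would be a WebRTC data channel rather than the plain Python callables used here.

```python
from typing import Callable, Iterable, Optional, Tuple

# A peer fetcher returns the segment bytes, or None if that peer
# does not hold the segment (hypothetical interface for illustration).
PeerFetcher = Callable[[str], Optional[bytes]]


def fetch_segment(
    segment_id: str,
    peers: Iterable[PeerFetcher],
    from_cdn: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Return (data, source) where source is "peer" or "cdn"."""
    for ask_peer in peers:
        try:
            data = ask_peer(segment_id)
        except Exception:
            # A failing peer must never break playback; try the next one.
            continue
        if data is not None:
            return data, "peer"
    # No peer could help, so pay for the CDN bandwidth.
    return from_cdn(segment_id), "cdn"


if __name__ == "__main__":
    cache = {"seg-1": b"video-bytes-1"}
    peer = cache.get  # a peer that only holds seg-1
    cdn = lambda seg: b"cdn-" + seg.encode()
    print(fetch_segment("seg-1", [peer], cdn))
    print(fetch_segment("seg-2", [peer], cdn))
```

Every segment that comes back with the "peer" source is a segment the CDN did not have to serve, which is where the offloading comes from.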
There are other startups with similar technologies to Peer5, but this is the first time any of them has publicized a customer win, and with such a high profile to top it off.
In a way, this validates the technology as well as the need for new mechanisms to assist in our current state of video streaming – especially in large scales.
WebRTC seems to fit nicely in here, and in more than one way. I am seeing more cases where companies use WebRTC either as a complementary technology or even as the main broadcast technology they use for their service.
It is also the reason I’ve added this important topic to my upcoming course – Advanced WebRTC Architecture. There is a lesson dedicated to low latency live broadcasting, where I explain the various technologies and how WebRTC can be brought into the mix in several different combinations.
If you would like to learn more about WebRTC and see how to best fit it into your scenario – this course is definitely for you. It starts October 24, so enroll now.
This is something I don’t get asked directly, but it does pop up from time to time. Especially when people come up with a specific language in mind and ask if it is suitable for WebRTC.
While the answer is almost always yes, I think a quick explanation of where programming languages meet WebRTC exactly is in order.
We will start with a small “diagram”, to show where we can find WebRTC related entities and move from there.
We’ve got both client and server entities with WebRTC, and I think the above depicts the main ones. There are more as your service gets more complicated, but that’s all an issue of scaling and pure development not directly related to WebRTC.
So what do we have here?
Web app
The web app is what most people think about when they think WebRTC.
This is what ends up running in the browser, loaded from an HTML and its derivatives.
What this means is that the language you end up with is JavaScript.
Mobile app
When it comes to the mobile domain, there are two ways to end up with WebRTC. The first is by having the web app served inside a mobile browser, which brings you back to JavaScript.
The more common approach though is to use WebRTC inside an app. You end up compiling and linking the WebRTC codebase as an SDK.
The languages here?
- C, C++ for the low level stuff that makes up WebRTC. In all likelihood, you won’t need to handle this (either because it will just work or because you’ll be outsourcing it to someone else)
- Java for native Android app development
- Objective-C and/or Swift for native iOS app development
Embedded is where things get interesting.
There are cases where you want devices to run WebRTC for one reason or another.
Two main approaches here will dictate the languages of choice:
- C, C++ if you port the webrtc.org code base and use it. And then whatever else you fancy on top of it
- Any language you wish (Java anyone?), while implementing what you need of the WebRTC protocol (=what goes on the network) on your own
In general, here you’ll be going to lower levels of abstraction, getting as close as possible to the machine language (but stopping at C most probably).
TURN server
STUN and TURN servers are also necessary. Most likely – you won’t be needing to do a thing about them besides compiling, configuring and running them.
So no programming languages here.
I would note that the popular open source alternatives are all written in C. Again – this doesn’t matter.
Media server
The programming languages here depend on the media server itself. Jitsi and Kurento are Java based. Janus is mostly C. In most cases – you wouldn’t care.
Media servers are usually entities that you communicate with via REST or WebSocket, so you can just use whatever language you like on the controlling side. It is a very popular choice to use Node.js (=JavaScript) in front of a Kurento server for example. It also brings us to the last entity.
App/Signaling server
The funny thing is that this is where the question is mostly targeted. The application and/or signaling server is what stitches everything together. It serves the web app and communicates with the mobile and embedded apps. It offers the details of the TURN server and handles any ephemeral passwords with it, and it controls the media servers.
And it is also where the bulk of the development happens since it holds the business logic of the application.
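As an illustration of the ephemeral TURN passwords mentioned above, here is a minimal sketch of how an application server could mint them. It assumes the TURN server is set up for the shared-secret scheme (often called the TURN REST API, which coturn supports through its `use-auth-secret` option). The function name, the user id and the secret are placeholders of mine, not part of any specific product.

```python
import base64
import hashlib
import hmac
import time
from typing import Dict, Optional


def ephemeral_turn_credentials(
    user_id: str,
    shared_secret: str,
    ttl_seconds: int = 3600,
    now: Optional[float] = None,
) -> Dict[str, str]:
    """Build time-limited TURN credentials from a shared secret.

    username   = "<unix expiry time>:<user id>"
    credential = base64(HMAC-SHA1(shared_secret, username))
    """
    current = time.time() if now is None else now
    expiry = int(current) + ttl_seconds
    username = f"{expiry}:{user_id}"
    digest = hmac.new(
        shared_secret.encode("utf-8"),
        username.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return {
        "username": username,
        "credential": base64.b64encode(digest).decode("ascii"),
    }


if __name__ == "__main__":
    # "alice" and the secret are placeholder values for illustration.
    print(ephemeral_turn_credentials("alice", "not-a-real-secret"))
```

The returned username and credential are what the signaling server would hand to the client for its ICE server configuration. The same few lines translate to any of the languages listed here.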
As for the language, the answer here is rather simple – use whatever you want.
- Node.js and JavaScript are a great and popular choice (there are good reasons for that)
- Java seems to be a thing in enterprises though for the life of me I just can’t understand why
- PHP works well. It is used by many WordPress plugins for WebRTC
- Erlang seems to be something that adventurous developers like to adopt – and like
- Ruby and Python are also good choices
- .NET is something I’ve seen used once or twice
In general, whatever you can use to build websites can be used to build a WebRTC service.
What’s your language?
Back to you. What are the programming languages you use with WebRTC?
If you are looking for developers, then what would be the languages you’d view as mandatory and which ones as preferable with applicants?
This as well as other topics are covered in my upcoming Advanced WebRTC Architecture course. Be sure to enroll if you wish to deepen your understanding in this topic.
The post What’s Your Preferred Language for WebRTC Development? appeared first on BlogGeek.me.
So far so good, but it is time to add some more options for you.
A selection of three different course packages
I am working to complete all lessons for the course. It takes time to work things through, go over the lessons, make sure everything is in order and record the sessions.
The interesting thing to me is the variety of people that enroll to this course – they come from all over the globe, varying from small startups to large companies. I found some interesting vendors who are looking at WebRTC that I wasn’t aware of.
A few updates about the course
There are a few minor updates that are taking place in the course:
- I will most probably add a forum to go along with the course. The forum will be open to all packages, and it will be a place where discussions and questions can take place between the students
- The FAQ page was updated, based on questions I received in the past several weeks – check it out
- The enrollment page now shows a pricing table, in an effort to make things clearer
- There are now 3 packages:
- Basic – access to the course for 2 months + forum
- Plus – access to the course for 4 months + forum + office hours
- Premium – a new package – see below for information
- For those who wish to enroll by wire transfer instead of PayPal – just contact me through my contact form
The course duration is 8 weeks, give or take a few days.
That said, if you want access to the recorded materials for a longer period, then you might want to consider going for the Plus or Premium packages.
The Plus package extends access to the course materials, including the forum and the office hours by an additional 2 months.
Office hours happen twice a week, at two different times to accommodate multiple time zones. During office hours I will be reviewing with the students their learning and understanding of WebRTC and assist in person in areas that will arise. I might even decide to hold a quick online lesson on relevant or timely topics during the office hours.
The Premium package extends access to the course materials up to a full year. More about the premium package below.
Groups
If you want to enroll multiple employees or just come join as a team, then just contact me directly.
For large enough groups, I can offer discounts. For others, just the service of proforma invoice and wire transfer (which can still be better than PayPal for you).
We will be having 3-4 medium sized groups in our course this time, which will make things interesting – especially during office hours.
The Premium Package
I decided to add a premium package to the offering.
The idea behind it is to serve those who want more access to my time, and in a more private way.
The premium package offers two substantial additions on top of the Plus package:
- Access to course materials for a full year (instead of 2 or 4 months)
- Two private consultation calls with me
In the past few months I’ve noticed a lot of small companies who end up wanting advice. A few hours of my time to explain to me what they are doing and chat about it, to see if there’s anything I can suggest. I decided to offer this service through this course as well, by bundling it as two consultation calls that go on top of the course itself.
We select together the agenda of these calls and what you want to achieve in them before we start. We then schedule the time and medium to use for the call (think something with WebRTC and a webcam, but not necessarily). And then we sit and chat.
If you already enrolled
If you already enrolled via PayPal and haven’t heard anything from me other than an order form and an invoice – don’t worry. I will be reaching out to all students a week or two before the course.
I am excited to do this, and really hope you are too.
See you next month!
The post Advanced WebRTC Architecture Course: Adding a Premium Package appeared first on BlogGeek.me.
Recording WebRTC? Definitely server side. But maybe client side.
This article is again taken partially from one of the lessons in my upcoming WebRTC Architecture Course. There, it is given in greater detail, and is recorded.
Recording is obviously not part of what WebRTC does. WebRTC offers the means to send media, but little more (which is just as it should be). If you want to record, you’ll need to take matters into your own hands.
Generally speaking, there are 3 different mechanisms that can be used to record:
- Server side recording
- Client side recording
- Media forwarding
Let’s review them all and see where that leads us.
#1 – Server side recording
This is the technique I usually suggest developers use. Somehow, it fits best in most cases (though not always).
What we do in server-side recording is route our media via a media server instead of directly between the browsers. This isn’t TURN relay – a TURN relay doesn’t get to “see” what’s inside the packets as they are encrypted end-to-end. What we do is terminate the WebRTC session at the server on both sides of the call – route the media via the server and at the same time send the decoded media to post processing and recording.
What do I mean by post processing?
- We might want to mix the inputs from all participants and combine it all to a single media file
- We might want to lower the filesize that we end up storing
- Change format (and maybe the codecs?), to prepare it for playback in other types of devices and mediums
There are many things that factor in to a recording decision besides just saying “I want to record WebRTC”.
If I had to put pros vs cons for server side media recording in WebRTC, I’d probably get to this kind of a table:
Pros (+):
- No change in client-side requirements
- No assumptions on client-side capabilities or behavior
- Can fit resulting recording to whatever medium and quality level necessary
Cons (–):
- Another server in the infrastructure
- Lots of bandwidth (and processing)
- Now we must route media
#2 – Client side recording
In many cases, developers will shy away from server-side recording, trying to solve the world’s problem on the client-side. I guess it is partially because many WebRTC developers tend to be JavaScript coders and not full stack developers who know how to run complex backends. After all, putting up a media server comes with its own set of headaches and costs.
So the basics of client-side recording lean towards the following flow:
We first record stuff locally – WebRTC allows that.
And then we upload what we recorded to the server. Here we don’t really use WebRTC – just pure file upload.
Great on paper, somewhat less in reality. Why? There are a few interesting challenges when you record locally on a machine you don’t know or control, via a browser:
- Do you even know how much available storage you have to use for the recording? Will it be enough for that full hour session you planned to do for your e-learning service?
- And now the session is done and you’re uploading a gigabyte of a file. Is the user just going to sit there and wait without closing his browser or the tab that is uploading the recording?
- Where and what do you record? If both sides record, then how do you synchronize the recordings?
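To put rough numbers on the storage and upload questions above, here is a back-of-the-envelope sketch. The 2.5 Mbps recording bitrate and the 5 Mbps uplink are assumptions picked for illustration, not measurements, and the function names are mine.

```python
def recording_size_bytes(bitrate_bps: float, duration_s: float) -> float:
    """Size of a recording at a constant average bitrate."""
    return bitrate_bps * duration_s / 8


def upload_time_s(size_bytes: float, uplink_bps: float) -> float:
    """Time to upload the file if the whole uplink is available."""
    return size_bytes * 8 / uplink_bps


if __name__ == "__main__":
    # Assumed numbers: 2.5 Mbps audio+video for one hour, 5 Mbps uplink.
    size = recording_size_bytes(2_500_000, 3600)
    wait = upload_time_s(size, 5_000_000)
    print(f"{size / 1e9:.3f} GB, {wait / 60:.0f} minutes to upload")
```

Under these assumptions an hour of recording lands at a bit over 1 GB, and the user would need to keep the tab open for about half an hour after the session ends.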
It all leads to the fact that at the end of the day, client side recording isn’t something you can use. Unless the recording is short (a few minutes) or you have complete control over the browser environment (and even then I would probably not recommend it).
There are things you can do to mitigate some of these issues too. Like upload fragments of the recording every few seconds or minutes throughout the session, or even do it in parallel to the session continuously. But somehow, they tend not to work that well and are quite sensitive.
Want the pros and cons of client side recording? Here you go:
Pros (+):
- No need to add a media server to the media flow
Cons (–):
- Client side logic is complex and quite dependent on the use case
- Requires more on the uplink of the user – or more wait time at the end of the session
- Need to know client’s device and behavior in advance
#3 – Media forwarding
This is a lesser known technique – or at least something I haven’t really seen in the wild. It is here because it is a possible alternative.
The idea behind this one is that you don’t get to record locally, but you don’t get to route media via a server either.
What is done here is that media is forwarded by one or both of the participants to a recording server.
The latest releases of Chrome allow forwarding incoming peer connection media, making this possible.
This is what I can say further about this specific alternative:
Pros (+):
- No need to add a media server into the flow – just an additional external recording server
Cons (–):
- Requires twice the uplink or more
- Do you want to be the first to try this technique?
Things to remember
Recording doesn’t end with how you record media.
There’s metadata to treat (record, playback, sync, etc).
And then there’s the playback part – where, how, when, etc.
There are also security issues to deal with and think about – both on the recording end and on the playback side.
These are covered in a bit more detail in the course.
What’s next?
If you are going to record, start by leaning towards server side recording.
Sit down and list all of your requirements for recording, archiving and playback – they are interconnected. Then start finding the solution that will fit your needs.
And if you feel that you still have gaps there, then why not enroll to the Advanced WebRTC Architecture course?
The post Recording WebRTC Sessions: client side or server side? appeared first on BlogGeek.me.
Analytics != Operation
Twilio just announced a new service to its growing cadre of services. This time – Voice Insights.
What to expect in the coming days
This week Twilio announced several interesting initiatives:
Add to that their recent announcement on their new Enterprise offering and the way they seem to be adding more number choices in countries. What we get is too much work to cover a single vendor in this industry.
Twilio is enhancing its services in breadth and depth at the same time, doing so while trying to reach out to new customer types. I will be covering all of these issues soon enough. Some of it here, some on other blogs where I write. Customers with an active subscription for my WebRTC PaaS report will receive a longform written analysis separately covering all these aspects later this month.
What I want to cover in this article
I already wrote about Twilio’s Kurento acquisition. This time, I want to focus on Voice Insights.
All the media outlets I’ve checked to read about Voice Insights were regurgitating the Twilio announcement with little to add. At most, they had callstats.io to refer to. I think a lot is missing from the current conversation. So let’s dig in.
What is Voice Insights?
Voice Insights is a set of tools that can be used to understand what’s going on under the rug. When you use a communications API platform – or build your own for that matter – the first thing to notice is that there’s a lack of understanding of what’s really happening.
Most dashboards focus on giving you the basics – what sessions you created, how long they were, how much money you owe us. Others add some indication of quality metrics.
The tools under the Voice Insights title at Twilio include:
- Collection of all network stats, so you can check them out in the Twilio console
- Real time triggers on the client, telling you when network issues arise or the volume is too low/high
- Pre-call network test on the client
- User feedback collection (the Skype “how was your call quality” nag)
Some of them were already available in some form or another in the Twilio offering – such as user feedback collection.
The features here can be split into two types:
- Client side – the real time triggers, pre-call network test
- Server side – collection of network stats
Twilio gave a good introduction to all of these capabilities, so I won’t be repeating them here.
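For a feel of what client side triggers like the ones listed above could look like when evaluated as rules on periodic stats readings, here is a small sketch. It is not Twilio's API. The field names, warning names and thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StatsSample:
    """One periodic reading; field names are illustrative only."""
    packet_loss_pct: float
    rtt_ms: float
    audio_level: float  # 0.0 (silence) to 1.0 (loudest)


def evaluate_rules(
    sample: StatsSample,
    max_loss_pct: float = 3.0,
    max_rtt_ms: float = 400.0,
    min_audio_level: float = 0.05,
    max_audio_level: float = 0.95,
) -> List[str]:
    """Return the names of the warnings this sample triggers."""
    warnings = []
    if sample.packet_loss_pct > max_loss_pct:
        warnings.append("packet_loss_high")
    if sample.rtt_ms > max_rtt_ms:
        warnings.append("rtt_high")
    if sample.audio_level < min_audio_level:
        warnings.append("volume_low")
    elif sample.audio_level > max_audio_level:
        warnings.append("volume_high")
    return warnings


if __name__ == "__main__":
    print(evaluate_rules(StatsSample(5.0, 120.0, 0.01)))
```

The same function could run on the device or on a backend that receives the samples. The rules are identical, only the place where the sample is evaluated changes.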
What is interesting is if and how they have decided to implement the real time triggers – do they get triggered from the backend or directly by running rules on the device? But I digress here.
How is it priced?
Interestingly, Voice Insights is priced separately from the calling service itself.
If you want insights into the voice minutes you use on Twilio, there’s an extra charge associated with it.
Prices start from $0.004 per minute, going down to ~$0.002 per minute for those who can commit to 1 million voice minutes a month. It eventually goes down to just above $0.001 a minute.
For comparison, SIP-to-SIP voice calling on Twilio starts at $0.005 per minute, making Voice Insights a rather expensive service.
Comparisons with callstats.io are necessary at this point. If you take a low tier of 10,000 voice minutes a month, callstats.io is priced at 19 EUR (based on their calculator – it can get higher or lower based on “data points”) whereas Twilio Voice Insights stands at 40 USD. How these two vendors offer lower rates at bulk is an exercise I’ll leave for others to make.
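The arithmetic behind the Twilio figures can be checked in a few lines. This sketch uses only the per-minute list prices quoted above and ignores volume discounts. The function and constant names are mine.

```python
from decimal import Decimal

# Per-minute list prices quoted above (USD).
VOICE_INSIGHTS_PER_MIN = Decimal("0.004")
SIP_TO_SIP_PER_MIN = Decimal("0.005")


def monthly_cost(minutes: int, price_per_min: Decimal) -> Decimal:
    """Flat per-minute cost, ignoring any volume discounts."""
    return price_per_min * minutes


if __name__ == "__main__":
    minutes = 10_000
    insights = monthly_cost(minutes, VOICE_INSIGHTS_PER_MIN)
    calling = monthly_cost(minutes, SIP_TO_SIP_PER_MIN)
    print(f"Voice Insights: ${insights:.2f}")
    print(f"SIP-to-SIP calling: ${calling:.2f}")
    print(f"Insights adds {insights / calling:.0%} on top of the calling bill")
```

At 10,000 minutes this gives 40 USD for Voice Insights against 50 USD for the SIP-to-SIP minutes themselves, so the insights add roughly 80% to the bill at list prices.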
Is this high? low? market price? I have no clue.
Voice Insights can take several directions with Twilio:
- Extend it to support video sessions as well
- Enhance and deepen the analytics capabilities, probably once enough feedback is received from customers on this feature
- Switch from a paid to free offering, again, based on customer feedback
- Unbundle it from Twilio and offer it as a stand-alone service to others – maybe to all the vendors that are using Kurento on premise?
With analytics, the sky usually isn’t the limit. It is just the beginning of the dreams and stories you can build upon a large data set. The problem is how you can take these dreams and make them come true.
Which brings us to the next issue.
The future of Analytics in Comm APIs
There’s a line drawn in the sand here. Between communications and analytics.
Analytics has a perceived value of its own – on top of enabling the interaction itself.
Will this hold water? Will other communication API vendors add such capabilities? Will they be charging extra for them?
I’ve had my share of stories around CEM (Customer Experience Management). Network equipment vendors and those handling video streaming are marketing it to their customers. Analytics on network data. This isn’t much different.
Time will tell if this is something that will become commonplace and desired, or just a failed attempt. I still don’t have an opinion where this will go.
Up next
Next in my quick series of articles on Twilio comes coverage of their new Enterprise plan, and how Twilio is trying to grow in breadth and depth at the same time.
Test and Monitor your WebRTC Service like a pro - check out how testRTC can improve your service's stability and performance.
The post Twilio’s Voice Insights for WebRTC – a line on the sand appeared first on BlogGeek.me.