News from Industry

Kamailio v4.4.7 Released

miconda - Mon, 02/26/2018 - 18:00
Kamailio SIP Server v4.4.7 stable is out – a minor release including fixes in code and documentation since v4.4.6. Configuration file and database schema compatibility is preserved, which means you don’t have to change anything to update.

Kamailio v4.4.7 is based on the latest version of GIT branch 4.4. We recommend that those running previous 4.4.x versions upgrade either to v4.4.7 or, even better, to the 5.0.x or 5.1.x series. When upgrading to v4.4.7, no change has to be made to the configuration file or database structure compared with the previous release of the v4.4 branch.

Important: Kamailio v4.4.7 is the last planned release in the 4.4.x series. From this moment, the maintained stable release series are 5.0.x and 5.1.x.

Resources for Kamailio version 4.4.7

Source tarballs are available at:

Detailed changelog:

Download via GIT:

# git clone kamailio
# cd kamailio
# git checkout -b 4.4 origin/4.4

Relevant notes, binaries and packages will be uploaded at:

Modules’ documentation:

What is new in the 4.4.x release series is summarized in the announcement of v4.4.0:

Note: the branch 4.4 is an old stable branch, going out of maintenance with the release of v4.4.7 – if no major regression is discovered, no future releases will be made out of branch 4.4. The latest stable branch is 5.1, with v5.1.1 being the latest release out of it at this time. The project officially maintains the last two stable branches, which are now 5.0 and 5.1. Therefore an alternative is to upgrade to the latest 5.1.x – be aware that you may need to change the configuration files and database structures from 4.4.x or 5.0.x to 5.1.x. See more details about it at:

We also hope to meet many of you at the next Kamailio World Conference, May 14-16, 2018, in Berlin, Germany. The details for a selection of speakers and sessions have already been published and registration is open. See more on the website of the event at:

Thanks for flying Kamailio!

“Open Source” SDK for SaaS and CPaaS are… Meh

bloggeek - Mon, 02/26/2018 - 12:00

Open Source SDKs from SaaS vendors aren’t interesting.

Every once in a while, I see a SaaS vendor boasting about having open source SDKs. The assumption is that saying “open source” about something you are doing immediately makes the thing free and open. The truth is far from it.

Planning on selecting a CPaaS vendor? Check out this shortlist of CPaaS vendor selection metrics:

Get the shortlist

Open Source Today

I want to start with an explanation of open source today.

Open source is a way for a vendor or a single developer to share his code with the “community” at large. There are many reasons why a vendor would do such a thing:

  1. To get others in the industry to assist in the effort of building and maintaining that code base (in most cases, such initiatives fail to meet their objective)
  2. To show technical savviness as a company. This is good for the brand’s name and when a company wants to attract top notch developers
  3. To showcase one’s technical abilities. An individual developer can use his github account to attract potential employers and projects
  4. To offer a reference implementation or a helper library for integrating with the company’s application

The above reasons are related to companies with proprietary software that they want protected. What they end up doing is sharing modules or parts of their codebase as open source, usually ones they assume won’t help a competitor copy and compete with them directly.

The other approach is to use open source as a full-fledged business model:

  1. Releasing a project as open source, then offering a non-open source license
  2. Or offering support and an SLA to it
  3. Or offering a hosted version of it
  4. Or offering customization work around it

A good example here is FreeSWITCH. They are offering support and customization work around this popular open source project. And now, there’s SignalWire, an upcoming hosted version of FreeSWITCH.

You see, for a company to employ open source, there needs to be an upside. Philanthropy isn’t a business model for most.

Cloud versus On-premise when Consuming Open Source

SaaS changes the equation a bit.

I tried placing different open source licenses on a kind of a graph, alongside different deployment models. Here’s what I got:

(if you’re interested here’s where to learn more about open source licenses)

CPaaS and SaaS in general are cloud deployments. They give the company more leeway in the type of open source licenses it can consume. An on-premise type of business had better beware of using GPL, whereas a cloud deployment one is just fine using GPL.

This isn’t to say that GPL can’t be used by on-premise deployments – just that it complicates things to a point that oftentimes the risks of doing so outweigh the potential reward.

CPaaS / SaaS vendors and Interfaces

On the other end of the equation you’ll find how customers interact with CPaaS vendors.

Towards that goal, the main approach today is by way of an API. And APIs today are almost always defined using REST.

In the illustration above, we have a SaaS or CPaaS vendor exposing a REST API. On top of that API, customers can build their own applications. The vendor wants to make life easier for them, to increase adoption, so he ends up implementing helper libraries. The helper libraries can be official ones or unofficial ones, either created by third parties or the vendor himself. They can just be reference implementations on top of the API, offered as starting points to customers with no real documentation or interface of their own.

For the most part, helper libraries are something I’d expect customers to deploy and run on their servers, to make it easier for them to connect from whatever language and framework they want to use to the vendor’s service.

On a client device, we have SDKs. In some ways, SDKs are just like helper libraries. They connect to the backend REST API, though sometimes they may have a more direct/optimized connection to the platform (proprietary, undocumented WebSocket connection for example).

SDKs are something you’ll find with most of the services where a state machine needs to be maintained on the client side. In the context of most of the things I write here, this includes CPaaS platforms deciding to offer VoIP calling (voice or video) by way of WebRTC or by other means over non-browser implementations. In many of these cases, the developers never actually implement REST calls – they just use the SDK’s interface to get things done.

Which is where the notion of open source SDKs sometimes comes up.

The Open Source SDK

If we’re talking about a SaaS platform, then having the source code of the SDK has its benefits, but none of them relate to “open source”. There’s no ecosystem or adoption at play for the open source code.

The reasons why we’d like to have the source code of an SDK are varied:

  1. Reading the code can give us better understanding of how the service works
  2. Being able to run the code step by step in a debugger makes it easier to troubleshoot stuff
  3. Stack traces are more meaningful in crashes

Here’s the thing though –

Trying to market the SDK as open source is kinda misleading as to what you’re getting out of your end of the deal.

When it comes to CPaaS and WebRTC, there’s this added complexity: vendors will “open source” or give the source code of their JS SDK (because there’s no real alternative today, at least not until WebAssembly becomes commonplace). As for the Android and iOS SDKs, I don’t remember seeing one that is offered in source code form – probably because all vendors are tweaking and modifying the baseline WebRTC code.

SaaS and Open Source

In a way, SaaS has changed the models and uses of open source. When open source was first introduced to the world, software was executed on premise only. There was no cloud, and SDKs and frameworks were commercially licensed. If you wanted something done, you either had to license it or build it yourself.

Open source came and changed all that by enabling vendors to build on top of open source code. Vendors came out with business models around dual licensing of code as well as support and customization models.

SaaS vendors today use open source in three different ways:

  1. They use it to build their platform. Due to their model, they are less restricted as to the type of open source licenses they can live with
  2. They open source code modules. Either by forking and sharing modified open source modules they use or by open sourcing specific modules
    1. Mostly because their developers push towards that goal
    2. And because they believe these modules won’t give away any of their competitive advantages
    3. Or to attract potential customers
  3. They may open source their whole platform. Not common, but it does happen. The idea here is to make revenue out of hosting the service at scale while giving away the baseline service for free (think WordPress for example)




Do I Need a Media Server for a One-to-Many WebRTC Broadcast?

bloggeek - Tue, 02/20/2018 - 12:00


Do I need a media server for a one-to-many WebRTC broadcast?

That’s the question I was asked on my chat widget this week. The answer was simple enough – yes.

Decided you need a media server? Here are a few questions to ask yourself when selecting an open source media server alternative.

Get the Selection Sheet

Then I received a follow up question that I didn’t expect:


That caught me off-guard. Not because I don’t know the answer. Because I didn’t know how to explain it in a single sentence that fits nicely in the chat widget. I guess it isn’t such a simple question either.

The simple answer is a limit in resources, along with the fact that we don’t control most of these resources.

The Hard Upper Limit

Whenever we want to connect one browser to another with a direct stream, we need to create and use a peer connection.

Chrome 65 introduces an upper limit on the number of peer connections, which is used for garbage collection purposes. Chrome is not going to allow more than 500 concurrent peer connections to exist.

500 is a really large number. If you plan on more than 10 concurrent peer connections, you should be one of those who know what they are doing (and don’t need this blog). Going above 50 seems like a bad idea for all the use cases that I can remember taking part in.

Understand that resources are limited. Free and implemented in the browser doesn’t mean that there aren’t any costs associated with it or a need for you to implement stuff and sweat while doing so.

Bitrates, Speeds and Feeds

This is probably the main reason why you can’t broadcast with WebRTC, or with any other technology.

We are looking at a challenging domain with WebRTC. Media processing is hard. Real time media processing is harder.

Assume we want to broadcast a video at a low VGA resolution. We checked and decided that 500kbps of bitrate offers good results for our needs.

What happens if we want to broadcast our stream to 10 people?


Broadcasting our stream to 10 people requires an uplink bitrate of 5Mbps (10 copies of the 500kbps stream).

If we’re on an ADSL connection, then we can find ourselves with only 1-3Mbps of uplink, so we won’t be able to broadcast the stream to our 10 viewers.

For the most part, we don’t control where our broadcasters are going to be. Over ADSL? WiFi? 3G network with poor connectivity? The moment we start dealing with broadcast we will need to make such assumptions.

That’s for 10 viewers. What if we’re looking for 100 viewers? A 1,000? A million?
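
The arithmetic above can be written down as a small sketch. The function names are made up for illustration, and the 3,000 kbps uplink is an assumption taken from the top of the ADSL range mentioned earlier; protocol overhead is ignored.

```python
def required_uplink_kbps(stream_kbps: int, viewers: int) -> int:
    """Uplink needed when the broadcaster sends one copy of the stream per viewer."""
    return stream_kbps * viewers


def can_broadcast_directly(stream_kbps: int, viewers: int, uplink_kbps: int) -> bool:
    """True if the broadcaster's uplink can carry every per-viewer copy."""
    return required_uplink_kbps(stream_kbps, viewers) <= uplink_kbps


if __name__ == "__main__":
    # 500 kbps stream, assumed 3,000 kbps ADSL uplink (illustrative figure).
    for viewers in (1, 10, 100, 1000):
        needed = required_uplink_kbps(500, viewers)
        fits = can_broadcast_directly(500, viewers, uplink_kbps=3000)
        print(f"{viewers:>5} viewers: {needed / 1000:g} Mbps needed, fits uplink: {fits}")
```

The required uplink grows linearly with the audience, which is why the broadcaster's own connection stops being enough well before the viewer count gets interesting.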

With a media server, we decide the network connectivity, the machine type of the server, etc. We can decide to cascade media servers to grow our scale of the broadcast. We have more control over the situation.

Broadcasting a WebRTC stream requires a media server.

Sender Uniformity

I see this one a lot in the context of a mesh group call, but it is just as relevant to broadcast.

When we use WebRTC for a broadcast type of service, a lot of decisions end up taking place in the media server. If a viewer has a bad network, this will result in packet loss being reported to the media server. What should the media server do in such a case?

While there’s no simple answer to this question, the alternatives here include:

  • Asking the broadcaster to send a new I-frame, which will affect all viewers and increase bandwidth use for the near future (you don’t want to do it too much as a media server)
  • Asking the broadcaster to reduce bitrate and media quality to accommodate the packet losses, affecting all viewers and not only the one on the bad network
  • Ignoring the issue of packet loss, sacrificing the user for the “greater good” of the other viewers
  • Using Simulcast or SVC, and moving the viewer to a lower “layer” with lower media quality, without affecting other users

You can’t do most of these in a browser. The browser will tend to use the same single encoded stream as is to send to all others, and it won’t do a good job at estimating bandwidth properly in front of multiple users. It is just not designed or implemented to do that.
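
The last alternative in the list above, moving a single viewer to a lower simulcast layer, is the kind of per-viewer decision a media server can make. Here is a minimal sketch, assuming three hypothetical layers and a per-viewer bandwidth estimate that the server already has. None of these names or bitrates come from a real media server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    bitrate_kbps: int


# Hypothetical simulcast layers; the broadcaster uploads each one once.
LAYERS = (
    Layer("low", 150),
    Layer("medium", 500),
    Layer("high", 1500),
)


def pick_layer(estimated_kbps: int, layers=LAYERS) -> Layer:
    """Return the highest-bitrate layer that fits the viewer's estimated bandwidth.

    Falls back to the lowest layer when nothing fits, so the viewer still
    receives something instead of being dropped.
    """
    ordered = sorted(layers, key=lambda layer: layer.bitrate_kbps)
    best = ordered[0]
    for layer in ordered:
        if layer.bitrate_kbps <= estimated_kbps:
            best = layer
    return best


if __name__ == "__main__":
    # Made-up bandwidth estimates (kbps) for three viewers.
    viewers = {"alice": 2000, "bob": 600, "carol": 100}
    for viewer, estimate in viewers.items():
        print(viewer, "->", pick_layer(estimate).name)
```

The viewer on the bad network gets a lower layer while everyone else keeps the quality they can handle, and the broadcaster is not asked to change anything.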

You Need a Media Server

In most scenarios, you will need a media server in your implementation at some point.

If you are broadcasting, then a media server is mandatory. And no. Google doesn’t offer such a free service or even open source code that is geared towards that use case.

It doesn’t mean it is impossible – just that you’ll need to work harder to get there.

Looking to learn more about WebRTC? In the coming weeks, I’ll be refreshing my online WebRTC training. Join now so you don’t miss out.

Enroll to the WebRTC course



Kamailio World 2018: Preview With A Selection Of Sessions

miconda - Mon, 02/19/2018 - 22:00
Less than 3 months till the start of the 6th edition of Kamailio World Conference, time is flying fast!

About one week ago we published the details for a group of accepted speakers; today we made a selection of sessions at Kamailio World 2018. We had more proposals than we could accommodate, and we are trying hard to fit in as many as possible, also taking into consideration the feedback from the participants at past editions.

For now you can head to the Schedule page and see the details of 15 sessions, from both workshops and conference days:

A very diverse range of topics: from using Kamailio for emergency services (112/911), scaling with a Redis backend, deploying in a containerized environment with Docker and Kubernetes, migrating the SIP routing logic to rich KEMI languages such as Lua, Python or Javascript, unit testing for Kamailio and test driven deployments, to blockchains in telephony, using Kamailio and FreeSwitch together, or the latest updates from Asterisk PBX.

The IMS/VoLTE workshop is going to show what you can do with the latest Kamailio in mobile networks. And, of course, we have the two very popular sessions that have never missed a Kamailio World edition: Dangerous Demos with James Body and VUC Visions with Randy Resnick.

The details for other speakers and sessions will be published in the near future, stay tuned!

Do not miss Kamailio World Conference 2018, it is going to be another great edition! You can register now!

Looking forward to meeting many of you at the next Kamailio World Conference, during May 14-16, 2018, in Berlin, Germany!

DB_REDIS – Kamailio Database Connector Module For Redis Server

miconda - Fri, 02/16/2018 - 21:18
Andreas Granig from Sipwise has recently pushed a new module for Kamailio, db_redis, which implements the database connector API. The readme of the module can be found at:

Practically, it should be possible to use the db_redis module instead of any other database connector module, such as db_mysql or db_postgres, for modules like usrloc, auth_db, and so on.

Redis is known to be a very fast key-value storage system, with very good replication and redundancy options, already popular in the Kamailio ecosystem – see also the ndb_redis or topos_redis modules.

Andreas is testing the performance of Kamailio with db_redis versus other popular database connectors, and the results so far look very promising in terms of a performance boost.

As a matter of fact, Andreas will give a presentation about this topic at Kamailio World Conference 2018, a session you should not miss if scalability is important for your VoIP/RTC service! See you there!

Thanks for flying Kamailio!

Testing Kamailio On RaspberryPi 3

miconda - Thu, 02/15/2018 - 21:17
Stefan Mititelu has shared some statistics about stressing Kamailio on a Raspberry PI 3 device. All the relevant details were made available at:

Here are the device’s characteristics:

An over-clocked Raspberry PI 3 running Raspbian Stretch with a U3 MicroSD card.

pi@raspberrypi:~ $ cat /etc/issue
Raspbian GNU/Linux 9 \n \l
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.9.59-v7+ #1047 SMP Sun Oct 29 12:19:23 GMT 2017 armv7l GNU/Linux

pi@raspberrypi:~ $ cat /boot/config.txt
sdram_over_voltage=2

His remarks on Kamailio’s sr-users mailing list:

The tests ran for 60 seconds, repeated a couple of times, and they were done in a LAN, using PI’s ethernet interface, running Kamailio 5.1.1.
  1. REGISTER/200, with db_text
    – at 900 cps test did finish: all UAC registered; pi htop threads were ~15-20%
    – at 950 cps test did NOT finish: got “Overload warning” on my UAC/UAS SIPp testing machine
  2. INVITE/180/200/PAUSE(3sec)/BYE/200, with no media
    – at 370 cps test did finish: all UAC->UAS calls completed; ~150 “180 Trying” Unexpected-Msg on UAC side; pi htop threads were ~50%
    – at 380 cps test did NOT finish: few(~5) UAC->UAS calls not completed; pi htop threads were ~50%
The results are really impressive (even if the used testing configs were really basic ones)!!!

Moreover, I think that I’ve reached the limit of my current SIPp testing machine, but not of PI’s.

Should you have something interesting to share about using Kamailio, do not hesitate to contact us, we will gladly publish an article on our website.

Thanks for flying Kamailio!

Transcoding With Kamailio And RTPEngine

miconda - Wed, 02/14/2018 - 21:11
The developers at Sipwise were very engaged and creative lately, bringing major features in the Kamailio ecosystem:
  • audio transcoding support in RTPEngine by Richard Fuchs
  • database API connector implementation for Redis by Andreas Granig (expect a post here about it very soon as well as a presentation at Kamailio World Conference 2018)
Sipwise is one of the oldest companies involved in the Kamailio project, since SER/OpenSER times. Very few in the community today are likely to have used (or even heard of) the OpenSER Configuration Wizard published by Andreas Granig around 2006-2007, but it helped many to start building Kamailio-based VoIP platforms back in those days. Andreas, the CTO and one of the founders of Sipwise, has been a member of the Kamailio management team for more than 10 years now.

Back to the topic of this article: RTPEngine recently introduced the capability of transcoding the audio channel for SIP/VoIP calls. It relies on the ffmpeg project, therefore it supports the relevant codecs out there, respectively:
  • G.711 (a-Law and µ-Law)
  • G.722
  • G.723.1
  • G.729
  • Speex
  • GSM
  • iLBC
  • Opus
  • AMR (narrowband and wideband)
Another feature added along with the transcoding was the support for repacketization of the RTP traffic, which can help in increasing QoS over long distance connections.

These features are immediately available even on old releases of Kamailio (such as v5.0.x or 5.1.x), the control protocol for RTPEngine being flexible enough to support such new commands. The commands are not yet documented inside Kamailio’s rtpengine module, but you can read more about them in the README of the RTPEngine application:

It is no wonder that this topic became a hot discussion on Kamailio’s sr-users mailing list.

Along with its old popular feature of gatewaying between WebRTC DTLS-SRTP and plain RTP (decryption/encryption), as well as the high throughput capacity with in-kernel RTP packet forwarding (useful for NAT traversal or QoS), RTPEngine is nowadays a must-have component in modern Kamailio-based RTC platforms.

Here we express our great appreciation for all these contributions by Sipwise and their continuous support for the Kamailio project over the years!

Exciting times ahead for the Kamailio project, a lot of new features are baking as you read this! Join us at the 6th edition of Kamailio World Conference, May 14-16, 2018, in Berlin, Germany, to meet the developers and learn more about using Kamailio and related projects. Registration is open!

Thanks for flying Kamailio!

The Internet of Things or Things on the Internet?

bloggeek - Mon, 02/12/2018 - 12:00

Time to stop playing things on the internet and start building the internet of things.

We’ve been using that stupid IOT acronym for quite some time. Probably a decade. The idea and notion that every object can be network enabled, share its collected data and receive its commands remotely is quite exciting. I think we’re far from that vision.

It isn’t that we’re not making progress. We are. The apartment building I now live in is 3 years old. It is more automated than the previous apartment building I lived in, which was 15 years old. I wouldn’t call it IOT or a smart building quite yet. And I don’t think there’s a simple way to turn a dumb building into a smart one either.

When we moved to our new apartment we renovated a bit. There was this opportunity to add smart-home capabilities into the apartment. There was just a teeny set of problems here:

  1. There’s no real business case for us yet. As a family, we really don’t need a smart-home, and frankly – I still haven’t seen one to appreciate the added benefit
  2. Since we’re in a highrise, the need for an apartment security/surveillance system seemed like overkill. The most we ended up with is a peephole camera for the door. Mainly to empower our kids to see who’s knocking (no IOT or smarts in it)
  3. Talking to the electrician who ended up dealing with our power outlets at home, I understood that there are not enough electricians available who know how to install a smart-home kit here in Israel

And to top it all, it felt like a one time undertaking that will be hard/impossible to upgrade or modify later on without a complete overhaul. That wasn’t what I was aiming for.

Mozilla just announced their Things Gateway that can be installed on a Raspberry Pi 3. It is a rather interesting project, especially since its learnings are then applied to the W3C Web of Things Interest Group with the intent of reducing the fragmentation of IOT. They’ve got their hands full of work.

IOT today is a patchwork of devices and companies, each trying to become a dominant player. The end result is that we’re living in a world where things can be placed on the internet, but they don’t amount to an internet of things.

Here are a few questions/hurdles that I think we’ll need to answer as an industry before we can reach that vision of IOT.


Security

I am putting security here first. Here’s why:

  1. We all know it is mandatory
  2. We all know it is left as a backlog item if it is considered at all

I’ve seen it happen with VoIP and it is definitely happening today with IOT.

Until this becomes a priority, IOT will not really happen.

Security has many different aspects to it:

  • Encryption of the communications, to maintain privacy and allow for authorization and authentication of it
  • Upgradability, which itself should be secure, straightforward and automated
  • Audit logs that are hard to tamper with, so we can investigate hacks

Most vendors won’t be able to get these done properly to begin with. And they don’t have any real incentive to do that either.


Standardization

There’s a need for standardization in this space. One that tackles all levels of the IOT food-chain.

Off the top of my head, here are a few areas:

  • Physical – Wi-Fi, Zigbee, Bluetooth – all are standards for the underlying network layer to be used. There’s also RFID and other types of connections that can be used. And we need to factor in 5G at some point. We’ve got wireless ones and wireline ones. A total mess. Just look at the Mozilla Things Gateway announcement for the set of connectors they support and how these get supported. Too much information to get things done easily
  • Transport – once we get communications, and assume (naively) that we have IP communications going, do we then run our data over TCP? Or TLS? Or maybe UDP? Or should we go for QUIC? Or HTTP/2? Should we do it over MQTT maybe? Over a WebSocket? There are too many alternatives here
  • Signaling – What are the types of messages we’re going to allow? What controls what sensor data? How do we describe it in a way that can be easily extendable and unambiguous? I’ve been there with VoIP and it was hard enough. Doing it for IOT is an order of magnitude harder (more players, more devices, more everything)
  • Processing – this relates to the next topic of automation. Once we can collect, control and make decisions over a single device, can we do it in aggregate, and in ways that won’t lock us in to a single vendor?

I don’t believe we’ll get this thing standardized properly in our industry for quite some time.


Automation

I’ve seen a lot of rules engines when it comes to IOT. You can program them to create sequences of events – if the density sensor indicates someone is at home, turn on the lights.

The problem is that you need to program them. This can’t scale.
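
To make the point concrete, here is a minimal sketch of such a rules engine. The sensor fields and action names are made up for illustration and do not come from any real product. Every behavior has to be written by hand as a rule, which is exactly the part that doesn't scale.

```python
from typing import Callable, Dict, List, Tuple

Reading = Dict[str, object]
Rule = Tuple[Callable[[Reading], bool], str]

# Each rule is a hand-written (condition, action) pair. Hypothetical names.
RULES: List[Rule] = [
    (lambda r: r.get("someone_home") is True and r.get("lights") == "off", "turn_lights_on"),
    (lambda r: r.get("someone_home") is False and r.get("lights") == "on", "turn_lights_off"),
]


def evaluate(reading: Reading, rules: List[Rule] = RULES) -> List[str]:
    """Return the actions whose conditions match the current sensor reading."""
    return [action for condition, action in rules if condition(reading)]


if __name__ == "__main__":
    print(evaluate({"someone_home": True, "lights": "off"}))
```

Two rules cover one sensor and one lamp. Every new device, room or exception means another hand-written line in that list.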

The other problem is the issue of what to do with all that sensor data? Someone needs to collect it, aggregate it, process it, analyze it and make decisions out of it.

Simple rule engines are nice, but they won’t get us far down the IOT path.

We also need to add machine learning and AI into the mix.

The end result? Probably similar in nature to AWS Deep Lens. The only problem is that it needs to be really generic and flexible.

Different Industries, Different Requirements and Ecosystems

There are different markets in IOT. They have different needs and different customers. They will have different ecosystems around them.

In broad strokes, we can split to consumer and enterprise. Enterprise here includes industrial, smart cities, etc. The consumer is all about the home, the car and the self.

Who will be the players here?

From Smartphones to Smart Speakers

This is where I think we made the most progress.

Up until a year ago, IOT was something you end up delivering to customers via apps on a smartphone. You purchase a lightbulb, you get an app. You get a new TV, there’s an app. Refrigerator? App.

Amazon Alexa did something miraculous. It moved the discussion over the home from an app towards a stationary home device with voice activation and control. No screen or touch screen needed.

Since then, Google and Apple have joined and voice assistants in the home are all the rage now.

In some ways, I expect this to find its way into the enterprise as well. First via conference rooms and later – who knows?

This is one more piece in the IOT puzzle.

Where do we go from here?

I have no clue.

To me, it seems that we’re still in the things-on-the-internet stage, and we will be there for a lot longer.


Kamailio World 2018 – First Group Of Speakers

miconda - Thu, 02/08/2018 - 14:06
The details for the first group of speakers at Kamailio World Conference 2018 have been published. So far they come from three continents: Europe, North America and Asia, many presenting for the first time at our event.

The two sessions present at all editions so far will be there also in 2018, at our 6th edition: Dangerous Demos with James Body and VUC Visions with Randy Resnick.

Besides covering various use cases for Kamailio, Asterisk or FreeSwitch, the sessions go into WebRTC, VoLTE/IMS, IoT, blockchains for telecommunications or scalability using NoSQL data storage systems. Definitely another edition with very interesting content – soon we will publish more details about the sessions as well.

See more about the speakers at:

You can register now to benefit from the early registration price:

Looking forward to meeting many of you at Kamailio World Conference, May 14-16, 2018, in Berlin, Germany!

Thanks for flying Kamailio!

Kamailio Administration Group

miconda - Wed, 02/07/2018 - 14:04
After several discussions at some of the past IRC devel meetings, we finally started to build a team to be involved more actively in the administration of Kamailio. The project has grown steadily, not only in terms of code, but also packaging, continuous integration, social networking interactions as well as participation in events worldwide.

For better coordination and the ability to handle related tasks, we invited the most active developers and community members to join the so-called Kamailio Administration Team. The initial details about it are published as part of the management page; more details about its rules and purpose:

It will still take some time to get it properly rolling. For now we are looking to see if the community has suggestions or improvements on what can be done in these aspects – you can just write to the sr-users mailing list.

Thanks for flying Kamailio!

Kamailio Management Group Updates

miconda - Tue, 02/06/2018 - 14:03
Markus Monka has just replaced Marcus Hunger in the Kamailio project management group. Marcus (still at sipgate) has moved to work more on frontend applications than backend, and is no longer interacting with the Kamailio project.

Markus Monka has managed the VoIP operations at sipgate for more than 15 years, helping the project over the years with various resources, mainly in respect of organizing events and testing infrastructure, sipgate being one of the oldest VoIP services using Kamailio (since the first releases of SER). The change is now reflected on the website:

A warm welcome to Markus and many thanks to both of them for what they have done so far for the Kamailio project!

Thanks for flying Kamailio!

5 Mistakes to Avoid When Developing WebRTC Applications

bloggeek - Mon, 02/05/2018 - 12:00

There are things you don’t want to do when you are NIH’ing your way to a stellar WebRTC application.

Here’s a true, sad story. This month, the unimaginable happened. Rain (!) dropped from the sky here in Israel. The end of it was that 6 apartments in my building are suffering from moisture due to a leakage from a balcony of the penthouse. Being a new building, we’re at the mercies of the contractor to fix it.

Nothing in the construction market moves fast in Israel – or without threats, so we had to start sending official sounding letters to the contractor about the leak. I took charge, and immediately said we need to lawyer up and have a professional assist us in writing a letter from us to the contractor. Others were of the opinion that we can do it on our own, as we need a lawyer only if he signs the document directly.

And then it hit me. The reason I wanted to lawyer up is that I see many smart people failing with WebRTC. They are making rookie mistakes, and I didn’t want to make rookie mistakes when it comes to the moisture problems in my apartment.

Why are we Failing with WebRTC?

I am not sure that smart people fail a lot more around WebRTC technology than they do with other technologies, but it certainly feels that way.

A famous Mark Twain quote goes like this:

“There is no such thing as a new idea. It is impossible. We simply take a lot of old ideas and put them into a sort of mental kaleidoscope. We give them a turn and they make new and curious combinations. We keep on turning and making new combinations indefinitely; but they are the same old pieces of colored glass that have been in use through all the ages.”

Many of the rookie mistakes people make with WebRTC stem from this. WebRTC is this kind of new. It is simply a lot of old ideas meshed into a new and curious combination. So we know it. And we assume we know how to handle ourselves around it.

Entrepreneurs? Skype is 14 years old. It shouldn’t be that hard to build something like Skype today.

VoIP developers? SIP we know. WebRTC is just SIP without the signaling. So we force SIP onto it and we’re done.

Web developers? WebRTC is part of HTML5. A few lines of JS code and we’re practically ready to go live.

Video developers? We can just take the WebRTC video feeds and put them on a CDN. Can’t we?

The result?

  1. Smart people decide they know enough to go it alone. And end up making some interesting mistakes
  2. People put their faith in one of the above personas… only to fail

My biggest gripe recently is people who decide in 2018 that peerJS is what they need for their WebRTC application. A project with 402 lines of code, last updated in 2015 (!). You can’t use such code with WebRTC. Code older than a year is stale or dead already. WebRTC is still too new and too dynamic.

That said, it isn’t as if you have a choice anymore. Flash is dying, and there’s no other serious alternative to WebRTC. If you’re thinking of adopting WebRTC, then here are five mistakes to avoid.

Mistake #1: Failing to Configure STUN/TURN

You wouldn’t believe how often developers fail to configure NAT traversal servers. Just yesterday I had someone ask me over the chat widget of my website how he could run his application by hosting his signaling and web servers on HostGator without any STUN/TURN servers. It just doesn’t work.

The simple answer is that you can’t – barring some esoteric use cases, you will definitely need STUN servers. And for most use cases, TURN servers will also be mandatory if you want sessions to connect.

In the past month, I found myself explaining quite a lot about NAT traversal:

  • You must use STUN and TURN servers
  • Don’t rely on free STUN servers, and definitely don’t use “free” TURN servers
  • Don’t force all sessions via TURN unless you absolutely know what you’re doing
  • Using TURN adds no security
  • You don’t need more than 1 STUN server and 3 TURN servers (UDP, TCP and TLS) in your servers configuration in WebRTC
  • Use temporary/ephemeral passwords in your TURN configuration
  • STUN doesn’t affect media quality
  • coturn or restund are great options for STUN/TURN servers

There’s more, but this should get you started.
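
A few of the points above (one STUN entry, three TURN entries, ephemeral passwords) can be sketched in code. This is a minimal sketch, assuming the time-limited credential scheme used by coturn when it runs with a shared secret (`use-auth-secret`): the username is an expiry timestamp plus a user id, and the password is the base64 HMAC-SHA1 of that username. The hostname `turn.example.com`, the secret, the ports and both function names are placeholders for illustration, not values from this post.

```python
import base64
import hashlib
import hmac
import time


def ephemeral_turn_credentials(shared_secret, user_id, ttl_seconds=3600, now=None):
    """Return a (username, password) pair that expires after ttl_seconds.

    Assumes the TURN server validates credentials with the same shared secret.
    """
    now = int(time.time()) if now is None else int(now)
    expiry = now + ttl_seconds
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    password = base64.b64encode(digest).decode()
    return username, password


def build_ice_servers(host, username, password):
    """Build one STUN entry and three TURN entries (UDP, TCP and TLS)."""
    auth = {"username": username, "credential": password}
    return [
        {"urls": f"stun:{host}:3478"},
        {"urls": f"turn:{host}:3478?transport=udp", **auth},
        {"urls": f"turn:{host}:3478?transport=tcp", **auth},
        {"urls": f"turns:{host}:5349?transport=tcp", **auth},
    ]


if __name__ == "__main__":
    user, pwd = ephemeral_turn_credentials("replace-with-your-secret", "alice")
    for entry in build_ice_servers("turn.example.com", user, pwd):
        print(entry)
```

The resulting list has the shape of the `iceServers` array that a browser expects in its peer connection configuration, so a signaling server could hand it to the client as JSON at the start of each session.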

Mistake #2: Selecting the WRONG Signaling Framework

PeerJS anyone? PeerJS feels like a tourist trap:

With 1,693 stars and 499 forks, PeerJS is one of the most popular WebRTC projects on github. What can go wrong?

Maybe the fact that it is older than the internet?

A WebRTC project that had its last commit 3 years ago can’t be used today.

Same goes for using Muaz Khan’s code snippets and expecting them to be commercial grade, stable, highly scalable products. They’re not. They’re just very useful code snippets.

Planning to use some open source project? Then:

  • Make sure it was updated recently (=the last couple of months)
  • Make sure it is popular enough
  • Make sure you can understand the framework’s code and can maintain it on your own if needed
  • Try to check if there’s someone behind it that can help you in times of trouble

Don’t take the selection process here lightly. Not when it comes to a signaling server and not when it comes to a media server.

Mistake #3: Not Using Media Servers When You Should

I know what you’re thinking. WebRTC is peer to peer so there’s no need for servers. Some think that even signaling and web servers aren’t needed – I hope they can explain how participants are going to find each other.

To some, this peer to peer concept also means that you can run these ridiculously large scale sessions with no servers that carry media.

Here are two such “architectures” I come across:

Mesh. It’s great. Don’t assume you can get it to run properly this year or the next. Move on.

Live broadcasting by forwarding content. It can be done, but most probably not the way you expect it to grow to a million users with no infrastructure and zero latency.

For many of the use cases out there, you will need a media server to process and route the media for you. Now that you are aware of it, go search for an open source media server. Or a commercial one.
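
To see why mesh stops scaling, it helps to count. In a full mesh every participant sends a separate copy of its media to every other participant. The sketch below only does that arithmetic; the 500 kbps per-stream bitrate in the usage line is an assumed figure for illustration.

```python
def mesh_load(participants):
    """Connection and stream counts for a full-mesh call."""
    if participants < 2:
        return {"connections": 0, "uploads_per_peer": 0, "downloads_per_peer": 0}
    others = participants - 1
    return {
        # Every pair of participants needs its own peer connection.
        "connections": participants * others // 2,
        # Each peer encodes and sends one stream per remote participant.
        "uploads_per_peer": others,
        "downloads_per_peer": others,
    }


def uplink_kbps(participants, stream_kbps):
    """Uplink bitrate each peer needs when sending one copy per remote peer."""
    return max(participants - 1, 0) * stream_kbps


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, mesh_load(n), uplink_kbps(n, 500), "kbps uplink")
```

With a media server in the middle, each participant uploads a single stream regardless of how many people are in the session, which is the main reason larger calls need one.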

Mistake #4: Thinking Short-Term

You get an outsourcing vendor. Write him a nice requirements doc. Pay him. Get something implemented. And you’re done.

Not really.

WebRTC is still in its infancy. The spec is changing. Browser implementations are changing. It is all in flux all the time. If you’re going to use WebRTC, either:

  1. Use some WebRTC API platform (here are a few), and you’ll be able to invest a bit less on an ongoing basis. There will be maintenance work, but not much
  2. Develop on your own or by outsourcing. In this case, you will need to continue investing in the project for at least the next 3 years or more

WebRTC code rots faster than most other HTML5 code. It will eventually change, but we’re not there yet.

It is also the reason I started with a few colleagues testRTC a few years ago. To help with the lifecycle of WebRTC applications, especially in the area of testing and monitoring.

Mistake #5: Failing to Understand WebRTC

They say assumption is the mother of all mistakes. Google seems to agree with it. Almost.

WebRTC isn’t trivial. It sits somewhere between VoIP and the web. It is new, and the information out there on the Internet about it is scattered and somewhat dynamic (which means lots of it isn’t accurate).

If you plan on using WebRTC, make sure you first understand it and its intricacies. Understand the servers that are needed to deploy a WebRTC application. Understand the signaling mechanisms that are built into WebRTC. Understand how media is processed and sent over the network. Understand the rich ecosystem of solutions that can be used with WebRTC to build a production ready system.

Lots of things to learn here. Don’t assume you know WebRTC just because you know web development or because you know VoIP or video processing.

If you are looking to seriously learn WebRTC, why not enroll in my Advanced WebRTC Architecture course?

What about my apartment? We’ve lawyered up, and now I have someone review and fix all the official sounding letters we’re sending out. Hopefully, it will get us faster to a resolution.


Using @kamailio on Twitter

miconda - Thu, 02/01/2018 - 14:01
With the involvement of Daniel-Constantin Mierla (me), Henning Westerholt, Fred Posner, Olle E. Johansson and the assistance of Twitter Support, the Kamailio SIP Server project is now able to use the @kamailio handle on Twitter.

So far, Olle E. Johansson used @kamailioproject for pushing news about the project, because @kamailio was not available. However, there were situations when even people close to the project mistakenly referred to @kamailio when they actually wanted to mention the Kamailio project.

Kamailio is a registered trademark in the European Union, a process completed by Henning Westerholt many years ago. The project's main domains are registered by Daniel-Constantin Mierla. There is no other relevant organization having the same name, therefore this situation was affecting the Kamailio brand. Twitter Support was open to listening to our story and finally we were assigned the @kamailio name.

The process concluded with renaming @kamailioproject to @kamailio; the old followers, tweets and discussions were kept intact. So if you followed @kamailioproject in the past, you should see you are following @kamailio now.

If you haven’t followed us yet, you can now be up to date with news about the project via @kamailio.

The @kamailio account is going to be managed jointly by several people; more details will be shared soon.

Thanks for flying Kamailio! We hope to see you at Kamailio World Conference 2018 (May 14-16, in Berlin, Germany).

WebRTC Electron Implementations are on 🔥

bloggeek - Mon, 01/29/2018 - 12:00

For WebRTC, Mobile and PC are moving in different directions. On the desktop, WebRTC Electron apps are gaining momentum.

In the good old days, people used to complain that WebRTC isn’t available on all browsers. Mobile was less of an issue for most as mobile application developers port WebRTC and use it natively on both iOS and Android.

How times change.

Need to know where WebRTC is available? Download this free WebRTC Device Cheat Sheet.

Get the Cheat Sheet

Today? All modern browsers support WebRTC. We’ve got Chrome, Firefox, Edge and Safari with official WebRTC implementations.

The challenge? None of the browsers are ready:

  • Chrome uses Plan B, switching to Unified Plan
  • Firefox is doing fine, but isn’t high on the priority list
  • Edge doesn’t support the data channel, and its market share isn’t that great
  • Safari doesn’t support VP8 and breaks a wee bit too often at the moment

What’s a developer to do?

Use adapter.js. Or go for a plugin. Or just ignore a few browsers.

Or maybe. Just maybe you should treat PCs and laptops the same way you do mobile? And build an app.

If that’s what you plan on doing then you’re not alone.

The most popular way to build an app for the desktop is by using Electron. There are other ways, like CEF and actual native development, but Electron is by far the most common approach.

Here are 3 vendors making use of Electron (and WebRTC) for their desktop application:

#1 – Slack

Slack is a popular team collaboration application. I’ve been using it in the browser for the last 3 years, but switched to their desktop Electron app on both my Ubuntu desktop and my Windows 10 laptop.

Why didn’t I use the app for so long? Because I don’t like installing things.

Why have I installed it now? Because I need to track 3+ slack accounts in parallel at all times now. This means a tab per slack account in my browser. On the desktop app, they don’t “eat up” multiple tabs. It isn’t a matter of memory or performance for me. Just one of “esthetics” – trying to preserve a tabs diet on my Chrome.

And that’s how Slack likes it. During the last Kranky Geek, the Slack team gave an interesting presentation about their current plans. About a minute of it was dedicated to Electron, starting at 2:30 into the session:

This recording lacks the Q&A part of the session. In an answer to a question regarding browser support, Andrew MacDonald of Slack said their focus is on their desktop app – not the browser. They make sure everything works on Chrome, invest less time and effort in the other browsers, and focus a lot on their Slack desktop application.

It was telling.

If you are looking for desktop-application-only-features in Slack, then besides having a single window for all projects, there’s the collaboration they offer during screen sharing that isn’t available in the browser (yet another reason for me to switch – to check it out).

At 2:30 into that session, Andrew explains why Electron is so useful to Slack: it comes down to cross platform development and time to market. With their team size, they can’t update as fast as Electron does, so they took it “as is”, including its built-in WebRTC implementation.

#2 – Discord

Discord is a kind of Slack but different. A social network targeting gamers. You can also find there non-gaming groups. Discord is doing all it can to get you from the comfort of your browser right into their native application.

Here’s what the homepage looks like:

From the get go their call to action is to either Open Discord (in the browser) or Download for your operating system. On mobile, if you’re curious, the only alternative is to download the app.

Here’s the interesting part, though.

Discord’s call to action suggests, by its use of green buttons, that you open Discord in the browser. That’s a lower friction action. You select a user name. Then pick an email and password (or use an unclaimed channel until you add your username and password). And now that you’re signed up for the service, it is time to suggest again that you use their app:

And… if you skip this one, you’ll get a top bar reminder as well (that orange strip at the top):

You can do with Discord almost anything inside the browser, but they really really really want to get you off that damn internet and into their desktop app.

And it is working for them!

#3 – TalkDesk

TalkDesk has its own reason for adopting Electron.

TalkDesk is a contact center solution that integrates with CRMs and third party systems. Towards that goal, you can:

  • Use the TalkDesk application (=browser web app)
  • Install the TalkDesk extension from Chrome, and have it latch on to other CRM systems
  • Install the Chrome Callbar app, so you can use it as a standalone without the need to have the browser opened at all

That third option is going the way of the dodo, along with Chrome apps. TalkDesk solved that by introducing Callbar Electron.

What we see here differs slightly from the previous two examples.

Where Slack and Discord try getting people off the web and into their desktop application, TalkDesk is just trying to be everywhere for them. Using HTML5 and Electron means they need not write yet-another-application for the desktop – they can reuse parts of their web app.

They are NOT Alone

There are other vendors I know of that are using Electron for their WebRTC applications. They do it for one of the following reasons:

  • It is an easy way to support Internet Explorer by not supporting it (or Safari)
  • They want a “native” app because they need more control than what a browser could ever offer, but still want to work with cross platform development, and HTML5/JS seems like the cleanest approach
  • Their users work in front of the service all day, so the browser isn’t the best interface for them
  • They don’t want to tether themselves or limit themselves to the browser. Using web technology is just how they want to develop
  • It brings with it “stability”, as it is up to you to decide when to push an update to your users as opposed to having browser vendors do it on their own timeframe. It is only a semblance of stability, as most would still support both browsers and applications in parallel

Add to that CPaaS vendors officially supporting Electron. TokBox is one such example. They do it not because they think it is nice, but because there’s customer demand for it.

This shift towards Electron apps makes it harder to estimate the real usage base of WebRTC. If most communication is shifting from the Chrome browser (let’s face it, most WebRTC comms happens in Chrome today if you only care about browsers) towards applications, then the statistics and trends collected by Google about WebRTC use are skewed. That said, it makes Chrome all the more dominant, as Electron use can be attributed back to Chromium.

Expect vendors to continue adopting Electron for their WebRTC applications. This trend is on 🔥.


Kamailio v5.1.1 Released

miconda - Mon, 01/22/2018 - 13:59
Kamailio SIP Server v5.1.1 stable is out – a minor release including fixes in code and documentation since v5.1.0. The configuration file and database schema compatibility is preserved, which means you don’t have to change anything to update.

Kamailio® v5.1.1 is based on the latest version of GIT branch 5.1. We recommend those running previous 5.1.x or older versions to upgrade. No change has to be made to the configuration file or database structure compared with the previous release of the v5.1 branch.

Resources for Kamailio version 5.1.1

Source tarballs are available at:

Detailed changelog:

Download via GIT:

# git clone kamailio
# cd kamailio
# git checkout -b 5.1 origin/5.1

Relevant notes, binaries and packages will be uploaded at:

Modules’ documentation:

What is new in 5.1.x release series is summarized in the announcement of v5.1.0:

Do not forget about the next Kamailio World Conference, taking place in Berlin, Germany, during May 14-16, 2018. The call for presentations is still open for a few weeks, but the first group of sessions and speakers will be announced very soon, stay tuned!

Thanks for flying Kamailio!

AWS DeepLens and the Future of AI Cameras and Vision

bloggeek - Mon, 01/22/2018 - 12:00

Are AI cameras in our future?

At last year’s AWS re:Invent event, which took place at the end of November, Amazon unveiled an interesting product: AWS DeepLens.

There’s decent information about this new device on Amazon’s own website but very little of anything else out there. I decided to put my own thoughts on “paper” here as well.

Interested in AI, vision and where it meets communications? I am going to cover this topic in future articles, so you might want to sign-up for my newsletter

Get my free content

What is AWS DeepLens?

AWS DeepLens is the combination of 3 components: hardware (camera + machine), software and cloud. These 3 come in a tight integration that I haven’t seen before in a device that is first and foremost targeting developers.

With DeepLens, you can handle inference of video (and probably audio) inputs in the camera itself, without shipping the captured media towards the cloud.

The hype words that go along with this device? Machine Vision (or Computer Vision), Deep Learning (or Machine Learning), Serverless, IoT, Edge Computing.

It is all these words and probably more, but it is also somewhat less. It is a first tentative step of what a camera module will look like 5 years from today.

I’d like to go over the hardware and software and see how they combine into a solution.

AWS DeepLens Hardware

AWS DeepLens hardware is essentially a camera that has been glued to an Intel NUC device:

Neither the camera nor the compute are on the higher end of the scale, which is just fine considering where we’re headed here – gazillions of low cost devices that can see.

The device itself was built in collaboration with Intel. Like all chipset vendors, Intel is plunging into AI and deep learning as well. More on AWS+Intel vs Google later.

Here’s what’s in this package, based on the AWS blog post on DeepLens:

  • 4 megapixel camera with the ability to capture 1080p video resolution
    • Nothing is said about the frame rate in which this can run. I’d assume 30 fps
    • The quality of this camera hasn’t been detailed either. In many cases, I’d say these devices will need to work in rather extreme lighting conditions
  • 2D microphone array
    • It is easy to understand why such a device needs a microphone; what is intriguing here is the choice of a 2D microphone array
    • This allows for better handling of things like directional sound and noise reduction algorithms to be used
    • None of the deep learning samples provided by Amazon seem to make use of the microphone inputs. I hope these will come later as well
  • Intel Atom X5 processor
    • This one has 4 cores and 4 threads
    • 8GB of memory and 16GB of storage – this is meant to run workloads and not store them for long periods of time
  • Intel Gen9 graphics engine (here)
    • If you are into numbers, then this does over 100 GFLOPS – quite capable for a “low end” device
    • Remember that 1080p@30fps produces more than 62 million pixels a second to process, so we get ~1600 operations per pixel here
    • You can squeeze out more “per pixel” by reducing frame rate or reducing resolution (both are probably done for most use cases)
  • Like most Intel NUC devices, it has Wi-Fi, USB and micro HDMI ports. There’s also a micro SD port for additional memory based on the image above
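
The per-pixel budget quoted in the list above is easy to reproduce. The sketch below takes the 100 GFLOPS and 1080p@30fps figures from the list; the lower resolution and frame rate in the second call are an assumed example of the "reduce resolution or frame rate" trade-off, not a documented DeepLens mode.

```python
GPU_FLOPS = 100e9  # "over 100 GFLOPS", as quoted in the list above


def pixels_per_second(width, height, fps):
    """Number of pixels the camera produces every second."""
    return width * height * fps


def ops_per_pixel(flops, width, height, fps):
    """Floating point operations available for each captured pixel."""
    return flops / pixels_per_second(width, height, fps)


if __name__ == "__main__":
    # 1080p at 30 fps: a bit over 62 million pixels a second
    print(pixels_per_second(1920, 1080, 30))
    # Roughly 1600 operations per pixel
    print(round(ops_per_pixel(GPU_FLOPS, 1920, 1080, 30)))
    # Assumed lighter workload: 720p at 15 fps leaves far more per pixel
    print(round(ops_per_pixel(GPU_FLOPS, 1280, 720, 15)))
```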

The hardware tries to look somewhat polished, but it isn’t. Although this isn’t written anywhere, this is:

  1. The first version of what will be an iterative process for Amazon
  2. A reference design. Developers are expected to build the proof of concept with this, later shifting to their own form factor – I don’t see this specific device getting sold to end customers as a final product

In a way, this is just a more polished hardware version of Google’s computer vision kit. The real difference comes with the available tooling and workflow that Amazon baked into AWS DeepLens.

AWS DeepLens Software

The AWS DeepLens software is where things get really interesting.

Before we get there, we need to understand a bit about how machine learning works. At its most basic, machine learning is about giving a “machine” a large dataset, letting it learn the data in one way or another, and then when you introduce similar new data, it will be able to classify it.

Dumbing down the whole process and theory, at the end of the day, machine learning is built out of two main steps:

  1. TRAINING: You take a large set of data and use it for training purposes. You curate and classify it so the training process has something to check itself against. Then you pass the data through a process that ends up generating a trained model. This model is the algorithm we will be using later
  2. DEPLOY: When new data comes in (in our case, this will probably be an image or a video stream), we use our trained model to classify that data or even to run an algorithm on the data itself and modify it
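
The two steps can be illustrated with a toy example. This is a deliberately tiny sketch using a nearest-centroid classifier in plain Python, not deep learning and not what DeepLens or MXNet actually run; the two-number "features" and the labels are made up for illustration. What it shows is the split: `train` turns labeled data into a model, and `classify` applies that model to new data without needing the training set.

```python
from math import dist


def train(samples):
    """TRAINING: turn labeled samples into a model (one centroid per label).

    samples is a list of (features, label) pairs with equal-length features.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def classify(model, features):
    """DEPLOY: label new data with the closest centroid in the trained model."""
    return min(model, key=lambda label: dist(model[label], features))


if __name__ == "__main__":
    # Hypothetical features: (average brightness, amount of motion)
    labeled = [
        ((0.9, 0.1), "day"),
        ((0.8, 0.2), "day"),
        ((0.1, 0.2), "night"),
        ((0.2, 0.1), "night"),
    ]
    model = train(labeled)             # done once, where compute is plentiful
    print(classify(model, (0.7, 0.3)))  # done per frame, on the device
```

The model is much smaller than the data it was trained on, which is what makes it practical to train in one place and ship only the result to a device.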

With AWS DeepLens, the intent is to run the training in the AWS cloud (obviously), and then run the deployment step for real time classification directly on the AWS DeepLens device. This also means that we can run this while being disconnected from the cloud and from any other network.

How does all this come to play in AWS DeepLens software stack?

On device

On the device, AWS DeepLens runs two main packages:

  1. AWS Greengrass Core SDK – Greengrass enables running AWS Lambda functions directly on devices. If Lambda is called serverless, then Greengrass can truly run serverless
  2. Device optimized MXNet package – an Apache open source project for machine learning

Why MXNet and not TensorFlow?

  • TensorFlow comes from Google, which makes it less preferable for Amazon, a direct cloud competitor. It is also preferable by Intel (see below)
  • MXNet is considered faster and more optimized at the moment. It uses less memory and less CPU power to handle the same task

In the cloud

The main component here is the new Amazon SageMaker:

SageMaker takes the effort away from the management of training machine learning, streamlining the whole process. That last step in the process of Deploy takes place in this case directly on AWS DeepLens.

Besides SageMaker, when using DeepLens you will probably make use of Amazon S3 for storage, Amazon Lambda when running serverless in the cloud, as well as other AWS services. Amazon even suggests using AWS DeepLens along with the newly announced Amazon Rekognition Video service.

To top it all, Amazon has a few pre-trained models and sample projects, shortening the path from getting a hold of an AWS DeepLens device to seeing it in action.

AWS+Intel vs Google

So we’ve got AWS DeepLens. With its set of on-device and cloud software tools. Time to see what that means in the bigger picture.

I’d like to start with the main players in this story. Amazon, Intel and Google. Obviously, Google wasn’t part of the announcement. Its TensorFlow project was mentioned in various places and can be made to work with AWS DeepLens. But that’s about it.

Google is interesting here because it is THE company today that is synonymous with AI. And there’s the increasing rivalry between Amazon and Google that seems to be going on across multiple fronts.

When Google came out with TensorFlow, it was with the intent of creating a baseline for artificial intelligence modeling that everyone will be using. It open sourced the code and let people play with it. That part succeeded nicely. TensorFlow is definitely one of the first projects developers would try to dabble with when it comes to machine learning. The problem with TensorFlow seems to be the amount of memory and CPU it requires for its computations compared to other frameworks. That is probably one of the main reasons why Amazon decided to place its own managed AI services on a different framework, ending up with MXNet which is said to be leaner with good scaling capabilities.

Google did one more thing though. It created its own special Tensor Processing Unit, calling it TPU. This is an ASIC type of chip, designed specifically for high performance machine learning calculations. In a research paper released by Google earlier last year, they show how their TPUs perform better than GPUs when it comes to TensorFlow machine learning workloads:

And if you’re wondering – you can get Cloud TPU on the Google Cloud Platform, albeit still in alpha stage.

This gives Google an advantage in hosting managed TensorFlow jobs, posing a threat to AWS when it comes to AI heavy applications (which is where we’re all headed anyway). So Amazon couldn’t really pick TensorFlow as its winning horse here.

Intel? They don’t sell TPUs at the moment. And like any other chip vendor, they are banking and investing heavily in AI. Which made working with AWS here on optimizing and working on end-to-end machine learning solutions for the internet of things in the form of AWS DeepLens an obvious choice.

Artificial Intelligence and Vision

These days, it seems that every possible action or task is being scrutinized to see if artificial intelligence can be used to improve it. Vision is no different. You will find it referred to as either computer vision or machine vision, and it covers a broad set of capabilities and algorithms.

Roughly speaking, there are two types of use cases here:

  1. Classification – with classification, the image or video stream is analyzed to find certain objects or things. This ranges from distinguishing certain objects, through person and face detection, to face recognition and on to recognizing activities and intents
  2. Modification – AWS DeepLens Artistic Style Transfer example is one such scenario. Another one is fixing the nagging direct eye contact problem in video calls (hint – you never really experience it today)

As with anything else in artificial intelligence and analytics, none of this is workable at the moment for a broad spectrum of classifications. You need to be very specific in what you are searching and aiming for, and this isn’t going to change in the near future.

On the other hand, there are many many cases where what you need is a camera to classify a very specific and narrow vision problem. The usual things include person detection for security cameras, counting people at an entrance to a store, etc. There are other areas you hear about today such as using drones for visual inspection of facilities and robots being more flexible in assembly lines.

We’re at a point where we already have billions of cameras out there. They are in our smartphones and are considered a commodity. These cameras and sensors are now headed into a lot of devices to power the IOT world and allow it to “see”. The AWS DeepLens is one such tool that just happened to package and streamline the whole process of machine vision.


On the price side, the AWS DeepLens is far from a cheap product.

The baseline cost of an AWS DeepLens camera? $249

But as with other connected devices, that’s only a small part of the story. The device is intended to be connected to the AWS cloud and there the real story (and costs) takes place.

The two leading cost centers after the device itself are going to be AWS Greengrass and Amazon SageMaker.

AWS Greengrass starts at $1.49 per year per device. Amazon SageMaker costs 20-25% on top of the usual AWS EC2 machine prices. To that, add the usual bandwidth and storage pricing of AWS, and higher prices for certain regions and discounts on large quantities.
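
To get a feel for how these numbers add up, here is a rough first-year estimate. It uses only the figures quoted above ($249 per device, $1.49 per device per year for Greengrass, a 20-25% SageMaker markup on EC2 prices). The EC2 hourly rate and the number of training hours are assumed inputs, and bandwidth, storage, regional pricing and volume discounts are left out.

```python
DEVICE_PRICE = 249.00              # per AWS DeepLens camera
GREENGRASS_PER_DEVICE_YEAR = 1.49  # starting price per device per year


def sagemaker_cost(ec2_hourly_rate, hours, markup=0.25):
    """Training cost: the plain EC2 price plus the SageMaker markup (20-25%)."""
    return ec2_hourly_rate * hours * (1 + markup)


def first_year_cost(devices, ec2_hourly_rate, training_hours, markup=0.25):
    """Hardware + Greengrass + training; excludes bandwidth and storage."""
    hardware = devices * DEVICE_PRICE
    greengrass = devices * GREENGRASS_PER_DEVICE_YEAR
    training = sagemaker_cost(ec2_hourly_rate, training_hours, markup)
    return round(hardware + greengrass + training, 2)


if __name__ == "__main__":
    # Assumed scenario: 10 cameras, a $1.00/hour instance, 100 hours of training
    print(first_year_cost(10, 1.00, 100))
```

In this assumed scenario the hardware dominates, but the balance shifts towards the cloud side as training hours and the number of retraining cycles grow.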

It isn’t cheap.

This is a new service that is quite generic and is aimed at tinkerers. Startups looking to try out and experiment with new ideas. It is also the first iteration of Amazon with such an intriguing device.

I, for one, can’t wait to see where this is leading us.

3 Different Compute Models for Machine Vision

AWS DeepLens is one of 3 different compute models that I see in this space of machine vision.

Here are all 3 of them:

#1 – Cloud

In a cloud based model, the expectation is that the actual media is streamed towards the cloud:

  • In real time
  • Or at some future point in time
  • When events occur; like motion being detected; or sound picked up on the mic

The data can be a video stream, or more often than not, it is just a set of captured images.

And that data gets classified in the cloud.

Here are two recent examples from a domain close to my heart – WebRTC.

At the last Kranky Geek event, Philipp Hancke shared how his team is trying to detect NSFW (Not Safe For Work) content:

The way this is done is by using Yahoo’s Open NSFW open source package. They had to resize images, send them to a server and there, using Python, classify each image to determine if it is safe for work or not. Watch the video – it offers real insight into how to tackle such a project in the real world.

The other one comes from Chad Hart, who wrote a lengthy post about connecting WebRTC to TensorFlow for machine vision. The same technique was used – one of capturing still images from the stream and sending them towards a server for classification.

These approaches are nice, but they have their challenges:

  1. They are gravitating towards still images and not video streams at the moment. This relates to the costs and bandwidth involved in shipping and then analyzing such streams on a server. To give you an understanding of the costs – using Amazon Rekognition for one minute of video stream analysis costs $0.12. For a single minute. It is high, and the reason is that it really does require some powerful processing to achieve
  2. Sometimes, you really need to classify and make faster decisions. You can’t wait those extra hundreds of milliseconds or more for the classification to take place. Think augmented reality type of scenarios
  3. At least with WebRTC, I haven’t seen anyone who figured how to do this classification on the server side in real time for a video stream and not still images. Yet
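
The cost argument in the first point is worth spelling out. The sketch below multiplies out the $0.12 per minute figure quoted above; the camera counts and durations in the usage lines are assumed examples.

```python
PRICE_PER_MINUTE = 0.12  # video stream analysis price quoted above, in USD


def stream_analysis_cost(minutes, cameras=1, price_per_minute=PRICE_PER_MINUTE):
    """Cost of continuously analyzing video streams in the cloud."""
    return round(minutes * cameras * price_per_minute, 2)


if __name__ == "__main__":
    print(stream_analysis_cost(60))                        # one camera, one hour
    print(stream_analysis_cost(24 * 60))                   # one camera, one day
    print(stream_analysis_cost(30 * 24 * 60, cameras=10))  # ten cameras, 30 days
```

At that rate a single always-on stream costs more per day than many cameras cost to buy, which explains why these projects tend to sample still images instead.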

#2 – In the Box

This alternative is what we have today in smartphones and probably in modern room based video conferencing devices.

The camera is just the optics, but the heavy lifting takes place in the main processor that is doing other things as well. Most modern CPUs today already have GPUs embedded as part of the SoC, and chip vendors are actively working on AI specific additions to chips (think Apple’s AI chip in the iPhone X or Google’s computational photography packed into the Pixel X phones).

The underlying concept here is that the camera is always tethered or embedded in a device that is powerful enough to handle the machine learning algorithms necessary.

They aren’t part of the camera but rather the camera is part of the device.

This works rather well, but you end up with a pricey device, which doesn’t always make sense. Remember that our purpose here is to have a larger number of camera sensors deployed, and attaching an expensive computing device to each of them won’t make sense for many of the use cases.

#3 – In the Camera

This is the AWS DeepLens model.


The computing power needed to run the classification algorithms is made part of the camera instead of taking place on another CPU.

We’re talking about $249 right now, but assuming this approach becomes popular, prices should go down. I can easily see such devices retailing at $49 on the low end in 2-3 technology cycles (5 years or so). And when that happens, the use cases developers will be able to create are endless.

Think about a home surveillance system that costs below $1,000 to purchase and install. It is smart enough to have a lot fewer false positives in alerting its users. AND can be upgraded in its classification as time goes by. There can be a service put in place behind it with a monthly fee that includes such things. You can add face detection and classification of certain people – alerting you when the kids come home or leave for example. Ignoring a stray cat that came into view of the camera. And this system is independent of an external network to run on a regular basis. You can update it when an external network is connected, but other than that, it can live “offline” quite nicely.

No Winning Model


All 3 models have their place in the world today. Amazon just made it a lot easier to get us to that third alternative of “in the camera”.

IoT and the Cloud

Edge computing. Fog computing. Cloud computing. You hear these words thrown in the air when talking about the billions of devices that will comprise the internet of things.

For IoT to scale, there are a few main computing concepts that will need to be decided sooner rather than later:

  • Decentralized – with so many devices, IoT services won’t be able to be centralized. It won’t be around scale out of servers to meet the demands, but rather on the edges becoming smarter – doing at least part of the necessary analysis. Which is why the concept of AWS DeepLens is so compelling
  • On net and off net – IoT services need to be able to operate without being connected to the cloud at all times. Think of an autonomous car that needs to be connected to the cloud at all times – a no go for me
  • Secured – it seems like the last thing people care about in IoT at the moment is security. The many data breaches and the ease with which devices can be hijacked point that out all too clearly. Something needs to be done there and it can’t be on the individual developer/company level. It needs to take place a lot earlier in the “food chain”

I was reading The Meridian Ascent recently. A science fiction book in a long series. There’s a large AI machine there called Big John which sifts through the world’s digital data:

“The most impressive thing about Big John was that nobody comprehended exactly how it worked. The scientists who had designed the core network of processors understood the fundamentals: feed sufficient information to uniquely identify a target, and then allow Big John to scan all known information – financial transactions, medical records, jobs, photographs, DNA, fingerprints, known associates, acquaintances, and so on.

But that’s where things shifted into another realm. Using the vast network of processors at its disposal, Big John began sifting external information through its nodes, allowing individual neurons to apply weight to data that had no apparent relation to the target, each node making its own relevance and correlation calculations.”

I’ve emphasized that sentence. To me, this shows the view of the same IoT network looking at it from a cloud perspective. There, the individual sensors and nodes need to be smart enough to make their own decisions and take their own actions.

All these words for a device that will only be launched April 2018…

We’re not there yet when it comes to IoT and the cloud, but developers are working on getting the pieces of the puzzle in place.

Interested in AI, vision and where it meets communications? I am going to cover this topic in future articles, so you might want to sign up for my newsletter.

Get my free content

The post AWS DeepLens and the Future of AI Cameras and Vision appeared first on

Upcoming Events In 2018

miconda - Thu, 01/18/2018 - 13:58
2018 just started, time to look at upcoming events during the next few months where you can meet with Kamailio folks.
  • Fosdem, Feb 3-4, 2018, in Brussels, Belgium – the yearly conference for free and open source developers in Europe, which has become a place to meet with many Kamailio friends, by now at a traditional dinner event. Daniel-Constantin Mierla will give a presentation as part of RTC Devroom on Sunday, Feb 4, 2018.
  • IT Expo, Feb 13-16, 2018, Fort Lauderdale, Florida, USA – meet with Fred Posner and other Kamailio friends as well as peers from Asterisk and FreeSwitch projects
  • Digium Asterisk World, Feb 14-16, 2018, Fort Lauderdale, Florida, USA – Fred Posner will give a presentation about Kamailio as part of the conference track
  • Mobile World Congress, Feb 26 – Mar 1, 2018, Barcelona, Spain – Carsten Bock and NG Voice will be there with their own stand in the expo area. Quobis will participate as well, once again part of the Spain pavilion. Barcelona is the home town of Voztelecom, so they can be met at the event as well.
  • Call Center World, Feb 26 – Mar 1, 2018, Berlin, Germany – Daniel-Constantin Mierla can be met on premises at the event
  • Kamailio Advanced Training, Mar 5-7, 2018, Berlin, Germany – the event to learn how to build and deploy professional VoIP and RTC services with Kamailio
  • Fossasia, Mar 22-25, 2018, Singapore – the yearly conference for free and open source software in Asia, Daniel-Constantin Mierla will give a presentation during this event
  • Kamailio World Conference, May 14-16, 2018, Berlin, Germany – two days and a half of workshops and conference sessions dedicated to Kamailio and related projects. The event to meet many of the Kamailio developers. Do not miss it!
Should you participate in or be aware of other events with sessions related to Kamailio, write to us and we will happily make a news article about them!

Thanks for flying Kamailio!

How Many Users Can Fit in a WebRTC Call?

bloggeek - Mon, 01/15/2018 - 12:00

As many as you like. You can cram anywhere from one to a million users into a WebRTC call.

You’ve been asked to create a group video call, and obviously, the technology selected for the project was WebRTC. It is almost the only alternative out there and certainly the one with the best price-performance ratio. Here’s the big question: How many users can we fit into that single group WebRTC call?

Need to understand your WebRTC group calling application backend? Take this free video mini-course on the untold story of WebRTC’s server side.

Enroll now

At least once a week I get approached by someone saying WebRTC is peer-to-peer and asking me if you can use it for larger groups, as the technology might not fit for such use cases. Well… WebRTC fits well into larger group calls.

You need to think of WebRTC as a set of technological building blocks that you mix and match as you see fit, and the browser implementation of WebRTC is just one building block.

The most common building block today in WebRTC for supporting group video calls is the SFU (Selective Forwarding Unit), a media router that receives media streams from all participants in a session and decides who to route that media to.

What I want to do in this article is review a few of the aspects and decisions you’ll need to make when trying to create applications that support large group video sessions using WebRTC.

Analyze the Complexity

The first step in our journey today will be to analyze the complexity of our use case.

With WebRTC, and real time video communications in general, it all boils down to speeds and feeds:

  1. Speeds – the resolution and bitrate we’re expecting in our service
  2. Feeds – the stream count of the single session

Let’s start with an example.

Assume you want to run a group calling service for the enterprise. It runs globally. People will join work sessions together. You plan on limiting group sessions to 4 people. I know you want more, but I am trying to keep things simple here for us.

The illustration above shows what a 4-participant conference would look like.

Magic Squares: 720p

If the layout you want for this conference is the magic squares one, we’re in the domain of:

You want high quality video. That’s what everyone wants. So you plan on having all participants send out 720p video resolution, aiming for WQHD monitors (that’s 2560×1440). Say that eats up 1.5Mbps (I am stingy here – it can take more), so:

  • Each participant in the session sends out 1.5Mbps and receives 3 streams of 1.5Mbps
  • Across 4 participants, the media server needs to receive 6Mbps and send out 18Mbps

Summing it up in a simple table, we get:

Resolution      720p
Bitrate         1.5Mbps
User outgoing   1.5Mbps (1 stream)
User incoming   4.5Mbps (3 streams)
SFU outgoing    18Mbps (12 streams)
SFU incoming    6Mbps (4 streams)

Magic Squares: VGA

If you’re not interested in resolution that much, you can aim for VGA resolution and even limit bitrates to 600Kbps:

Resolution      VGA
Bitrate         600Kbps
User outgoing   0.6Mbps (1 stream)
User incoming   1.8Mbps (3 streams)
SFU outgoing    7.2Mbps (12 streams)
SFU incoming    2.4Mbps (4 streams)


The thing you may want to avoid when going VGA is the need to upscale the resolution on the display – it can look ugly, especially on the larger 4K displays.

With crude back of the napkin calculations, you can potentially cram roughly 2.5 VGA conferences for the “price” of 1 720p conference.
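The speeds and feeds of the magic squares layout are easy to put into a few lines of Python. This is only a sketch: the function name `magic_squares_load` and the choice of integer Kbps are mine, and it assumes every participant sends a single stream and receives one stream from each of the others, as in the two examples above.

```python
def magic_squares_load(participants: int, bitrate_kbps: int) -> dict:
    """Load when everyone sends one stream and receives all the others."""
    streams_in = participants                        # one upload per participant
    streams_out = participants * (participants - 1)  # everyone gets everyone else
    return {
        "user_out_kbps": bitrate_kbps,
        "user_in_kbps": bitrate_kbps * (participants - 1),
        "sfu_in_kbps": bitrate_kbps * streams_in,
        "sfu_out_kbps": bitrate_kbps * streams_out,
        "sfu_in_streams": streams_in,
        "sfu_out_streams": streams_out,
    }


hd = magic_squares_load(4, 1500)   # 720p at 1.5Mbps
vga = magic_squares_load(4, 600)   # VGA at 600Kbps
ratio = hd["sfu_out_kbps"] / vga["sfu_out_kbps"]
print(hd["sfu_out_kbps"], vga["sfu_out_kbps"], ratio)
```

With these inputs the SFU sends out 18,000Kbps for 720p and 7,200Kbps for VGA, a ratio of 2.5.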

Hangouts Style

But what if our layout is a bit different? A main speaker and smaller viewports for the other participants:

I call it Hangouts style, because Hangouts is pretty known for this layout and was one of the first to use it exclusively without offering a larger set of additional layouts.

This time, we will be using simulcast. The plan is to have everyone send out high quality video and let the SFU decide which incoming stream is the dominant speaker, forwarding the higher resolution for that stream and the lower resolution for all the others.

You will be aiming for 720p, because after a few experiments, you decided that lower resolutions when scaled to the larger displays don’t look that good. You end up with this:

  • Each participant in the session sends out 2.2Mbps (that’s 1.5Mbps for the 720p stream and an additional 700Kbps or so for the other resolutions you’ll be simulcasting with it)
  • Each participant in the session receives 1.5Mbps from the dominant speaker and 2 additional incoming streams of ~300Kbps for the smaller video windows
  • Across 4 participants, the media server needs to receive 8.8Mbps and send out 8.4Mbps

Resolution      720p highest (in Simulcast)
Bitrate         150Kbps – 1.5Mbps
User outgoing   2.2Mbps (1 stream)
User incoming   1.5Mbps (1 stream) + 0.3Mbps (2 streams)
SFU outgoing    8.4Mbps (12 streams)
SFU incoming    8.8Mbps (4 streams)
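The same arithmetic can be sketched for the Hangouts style layout. The function and parameter names below are illustrative only. The 700Kbps for the lower simulcast layers is simply the 2.2Mbps upload minus the 1.5Mbps 720p stream, and the sketch makes the simplifying assumption that every participant receives one high quality stream plus a ~300Kbps thumbnail for each remaining participant.

```python
def simulcast_load(participants: int,
                   high_kbps: int = 1500,
                   extra_layers_kbps: int = 700,
                   thumbnail_kbps: int = 300) -> dict:
    """Load for a dominant-speaker layout with simulcast (simplified model)."""
    user_out = high_kbps + extra_layers_kbps   # all simulcast layers go up
    thumbnails = participants - 2              # everyone except self and the main speaker
    user_in = high_kbps + thumbnails * thumbnail_kbps
    return {
        "user_out_kbps": user_out,
        "user_in_kbps": user_in,
        "sfu_in_kbps": participants * user_out,
        "sfu_out_kbps": participants * user_in,
    }


load = simulcast_load(4)
print(load)
```

Note how the SFU now receives more than it sends: the extra simulcast layers cost upload bandwidth, but they let the server pick a cheap stream for every small window.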


This is what we have learned:

Different use cases of group video with the same number of users translate into different workloads on the media server.

And if it wasn’t mentioned specifically, simulcast works great and improves the effectiveness and quality of group calls (simulcast is what we used in our Hangouts Style meeting).

Across the 3 scenarios we depicted here for 4-way video call, we got this variety of activity in the SFU:

               Magic Squares: 720p   Magic Squares: VGA   Hangouts Style
SFU outgoing   18Mbps                7.2Mbps              8.4Mbps
SFU incoming   6Mbps                 2.4Mbps              8.8Mbps


Here’s your homework – now assume we want to do a 2-way session that gets broadcast to 100 people over WebRTC. Calculate the number of streams and the bandwidth you’ll need on the server side.

How Many Users Can be Active in a WebRTC Call?

That’s a tough one.

If you use an MCU, you can get as many users on a call as your MCU can handle.

If you are using an SFU, it depends on 3 different parameters:

  1. The level of sophistication of your media server, along with the performance it has
  2. The power you’ve got available on the client devices
  3. The way you’ve architected your infrastructure and worked out cascading

We’re going to review them in a sec.

Same Scenario, Different Implementations

Anything above 8-10 users in a single call becomes complicated. Here’s an example from a publicly available service.

The scenario:

  • 9 participants in a single session, magic squares layout
  • I use testRTC to get the users into the session, so it is all automated
  • I run it for a minute. After that, it kills the session since it is a demo
  • The service takes into account that there are 9 people on the screen and reduces the resolution of all streams to VGA, but it still allocates 1.3Mbps for that resolution
  • This leads to each browser receiving about 10Mbps of data to process

The media server decided here how to limit and gauge traffic.

And here’s another service with an online demo running the exact same scenario:

Now the incoming bitrate on average per browser was only 2.7Mbps – almost a fourth of the other service.

Same scenario. Different implementations.

What About Some Popular Services?

What about some popular services that do video conferencing in an SFU routed model? What kind of size restrictions do they put on their applications?

Here’s what I found browsing around:

  • Google Hangouts – up to 25 participants in a single session. It was 10 in the past. When I did my first-ever office hour for my WebRTC training, I maxed out at 10, which got me to start using other services
  • Hangouts Meet – placed its maximum number at 50 participants in a single session
  • Houseparty – decided on 8 participants
  • Skype – 25 participants
  • – their PRO accounts support up to 12 participants in a room
  • Amazon Chime – 16 participants on the desktop and up to 8 participants on iOS (no Android support yet)

Does this mean you can’t get above 50?

My take on it is that there’s an increasing degree of difficulty as the meeting size increases:

The CPaaS Limit on Size

When you look at CPaaS platforms, those supporting video and group calling often have limits to their meeting size. In most cases, they give out an arbitrary number they have tested against or are comfortable with. As we’ve seen, that number is suitable for a very specific scenario, which might not be the one you are thinking about.

In CPaaS, these numbers vary from 10 participants to hundreds of participants in a single session. Usually, if you can go higher, the additional participants will be view-only.

Key Points to Remember

A few things to keep in mind:

  • The higher the group size the more complicated it is to implement and optimize
  • The browser needs to run multiple decoders, which is a burden in itself
  • Mobile devices, especially older ones, can be brought down to their knees quite quickly in such cases. Test on the oldest, puniest devices you plan on supporting before determining the group size to support
  • You can build the SFU in a way that it doesn’t route all incoming media to everyone but rather picks partial data to send out. For example, maybe only a single speaker on the audio channels, or the 4 loudest streams
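The last point in the list above, forwarding only the loudest streams, can be sketched in a few lines. This is a toy illustration, not how any particular SFU does it: `audio_levels` is a hypothetical map from participant id to a loudness value (higher means louder), and a real media server would also smooth the levels over time so the selection does not flicker.

```python
import heapq


def pick_streams_to_forward(audio_levels: dict[str, float],
                            max_streams: int = 4) -> list[str]:
    """Return the ids of the loudest participants, loudest first."""
    return heapq.nlargest(max_streams, audio_levels, key=audio_levels.get)


levels = {"alice": 0.82, "bob": 0.10, "carol": 0.55,
          "dave": 0.03, "erin": 0.40, "frank": 0.61}
forwarded = pick_streams_to_forward(levels)
print(forwarded)
```

Only the selected streams get routed to everyone, so the per-user incoming stream count stays bounded by `max_streams` no matter how many people join.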
Sizing Your Media Server

Sizing and media servers is something I have been doing lately at testRTC. We’ve played a bit with Kurento in the past and are planning to tinker with other media servers. I get this question on every other project I am involved with:

How many sessions / users / streams can we cram into a single media server?

Given what we’ve seen above about speeds and feeds, it is safe to say that it really really really depends on what it is that you are doing.

If what you are looking for is group calling where everyone’s active, you should aim for 100-500 participants in total on a single server. The numbers will vary based on the machine you pick for the media server and the bitrates you are planning per stream on average.

If what you are looking for is a broadcast of a single person to a larger audience, all done over WebRTC to maintain low latency, 200-1,000 is probably a better estimate. Maybe even more.

Big Machines or Small Machines?

Another thing you will need to address is which machines you are going to host your media server on. Will that be the biggest, baddest machines available, or will you be comfortable with smaller ones?

Going for big machines means you’ll be able to cram larger audiences and sessions into a single machine, so the complexity of your service will be lower. If something crashes (media servers do crash), more users will be impacted. And when you need to upgrade your media server (and you will), that process can cost you more or become somewhat more complicated as well.

The bigger the machine, the more cores it will have. Which results in media servers that need to run in multithreaded mode. Which means they are more complicated to build, debug and fix. More moving parts.

Going for small machines means you’ll hit scale problems earlier and they will require algorithms and heuristics that are more elaborate. You’ll have more edge cases in the way you load balance your service.

Scale Based on Streams, Bandwidth or CPU?

How do you decide that your media server achieved full capacity? How do you decide if the next session needs to be crammed into a new machine or another one or be placed on the current media server you’re using? If you use the current one, and new participants want to join a session actively running in this media server, will there be room enough for them?

These aren’t easy questions to answer.

I’ve seen 3 different metrics used to decide on when to scale out from a single media server to others. Here are the general alternatives:

Based on CPU – when the CPU hits a certain percentage, it means the machine is “full”. It works best when you use smaller machines, as CPU would be one of the first resources you’ll deplete.

Based on Bandwidth – SFUs eat up lots of networking resources. If you are using bigger machines, you probably won’t hit the CPU limit, but you’ll end up eating too much bandwidth. So you’ll end up determining the capacity available by way of bandwidth monitoring.

Based on Streams – the challenge sometimes with CPU and Bandwidth is that the number of sessions and streams that can be supported may vary, depending on dynamic conditions. Your scaling strategy might not be able to cope with that and you may want more control over the calculations. Which will lead to you sizing the machine using either CPU or bandwidth, but putting rules in place that are based on the number of streams the server can support.
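To make the three metrics concrete, here is a minimal sketch of an admission check that combines them. Everything in it is an assumption for illustration: the class and field names are made up, and the thresholds are placeholders you would replace with numbers from your own sizing tests.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    cpu_percent: float    # current CPU utilization, 0-100
    egress_mbps: float    # current outgoing bandwidth
    streams: int          # streams currently routed (incoming + outgoing)


@dataclass
class Limits:
    # Illustrative thresholds only, not recommendations.
    max_cpu_percent: float = 70.0
    max_egress_mbps: float = 800.0
    max_streams: int = 1000


def has_room(stats: ServerStats, limits: Limits,
             extra_streams: int = 0, extra_egress_mbps: float = 0.0) -> bool:
    """True if the server can take the extra load without crossing any limit."""
    return (stats.cpu_percent < limits.max_cpu_percent
            and stats.egress_mbps + extra_egress_mbps <= limits.max_egress_mbps
            and stats.streams + extra_streams <= limits.max_streams)


limits = Limits()
# A new 4-way 720p magic squares session: 16 streams, 18Mbps outgoing.
fits = has_room(ServerStats(45.0, 650.0, 900), limits, 16, 18.0)
too_much_bandwidth = has_room(ServerStats(45.0, 790.0, 900), limits, 16, 18.0)
print(fits, too_much_bandwidth)
```

Whichever metric trips first decides when the next session goes to another machine.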

The challenge here is that whatever scenario you pick, sizing is something you’ll need to be doing on your own. I see many who come to use testRTC when they need to address this problem.

Cascading a Single Session

Cascading is the process of connecting one media server to another. The diagram below shows what I mean:

We have a 4-way group video call that is spread across 3 different media servers. The servers route the media between them as needed to get it connected. Why would you want to do this?

#1 – Geographical Distribution

When you run a global service and have SFUs as part of it, the question raised immediately for a new session is which SFU to allocate for it, and in which data center. Since we want our media servers as close as possible to the users, we either have pre-knowledge about the session and know where to allocate it, or decide by some reasonable means, like geolocation – we pick the data center closest to the user that created the meeting.

Assume 4 people are on a call. 3 of them join from New York, while the 4th person is from France. What happens if the French guy joins first?

The server will be hosted in France. 3 out of 4 people will be located far from the media server. Not the best approach…

One solution is to conduct the meeting by spreading it across servers closest to each of the participants:

We use more server resources to get this session served, but we have a lot more control over the media routes so we can optimize them better. This improves media quality for the session.
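A rough sketch of that idea: map each participant to the closest data center, and cascade whenever more than one data center ends up in the result. The data center names and coordinates below are hypothetical, and plain geographic distance is an assumption on my part; a production service might rely on measured latency instead.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical data centers: name -> (latitude, longitude)
DATA_CENTERS = {
    "us-east": (39.0, -77.5),
    "eu-west": (48.9, 2.4),
    "ap-southeast": (1.35, 103.8),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


def closest_data_center(location):
    return min(DATA_CENTERS, key=lambda name: distance_km(location, DATA_CENTERS[name]))


def allocate_session(participants):
    """Map each participant id to the data center closest to it."""
    return {user: closest_data_center(loc) for user, loc in participants.items()}


NEW_YORK = (40.7, -74.0)
PARIS = (48.86, 2.35)
session = {"ny-1": NEW_YORK, "ny-2": NEW_YORK, "ny-3": NEW_YORK, "fr-1": PARIS}
allocation = allocate_session(session)
needs_cascading = len(set(allocation.values())) > 1
print(allocation, needs_cascading)
```

In the New York and France example, the three New York users land on one server and the French user on another, with the two servers cascaded between them.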

#2 – Fragmented Allocations

Assume that we can connect up to 100 participants in a single media server. Furthermore, every meeting can hold up to 10 participants. Ideally, we won’t want to assign more than 10 meetings per media server.

But what if I told you the average meeting size is 2 participants? It can get us to this type of an allocation:

This causes a lot of wasted server resources. How can we solve that?

  1. By having people commit in advance to the maximum meeting size. Not something you really want to do
  2. Taking a risk: stop placing new meetings on a server once 50% of its capacity is allocated, leaving the rest of the capacity for existing meetings to grow into. You still have wasted resources, but to a lower degree. There will be edge cases where you won’t be able to fill out the meetings due to server resources
  3. Migrating sessions across media servers in an effort to “defragment” the servers. It is as ugly as it sounds, and probably just as disrupting to the users
  4. Cascade sessions. Allow them to grow across machines

That last one of cascading? You can do that by reserving some of a media server’s resources for cascading existing sessions to other media servers.
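To see the size of the problem: reserving 10 seats for each of 10 meetings that average 2 participants leaves a 100-seat server only 20% used. The sketch below illustrates the second option from the list (stop admitting new meetings at a load threshold and keep the rest for growth). The `MediaServer` class and its method names are hypothetical, and a `False` return simply stands for "place this on, or cascade to, another server".

```python
class MediaServer:
    """Toy allocator: new meetings are admitted only below a load threshold."""

    def __init__(self, capacity: int = 100, new_meeting_threshold: float = 0.5):
        self.capacity = capacity
        self.new_meeting_threshold = new_meeting_threshold
        self.meetings: dict[str, int] = {}   # meeting id -> participant count

    @property
    def used(self) -> int:
        return sum(self.meetings.values())

    def join(self, meeting_id: str) -> bool:
        """Add one participant; False means it has to go to another server."""
        if self.used >= self.capacity:
            return False                     # server is full
        is_new = meeting_id not in self.meetings
        if is_new and self.used >= self.capacity * self.new_meeting_threshold:
            return False                     # keep the remaining seats for growth
        self.meetings[meeting_id] = self.meetings.get(meeting_id, 0) + 1
        return True


server = MediaServer(capacity=10)
started = [server.join(f"meeting-{i}") for i in range(6)]   # six new meetings
grown = server.join("meeting-0")                            # an existing one grows
print(started, grown, server.used)
```

In this run the sixth new meeting is turned away once half the seats are taken, while the existing meeting is still allowed to grow.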

#3 – Larger Meetings

Assuming you want to create larger meetings than a single media server can handle, your only choice is to cascade.

If your media server can hold 100 participants and you want meetings at the size of 5,000 participants, then you’ll need to be able to cascade to support them. This isn’t easy, which explains why there aren’t many such solutions available, but it definitely is possible.

Mind you, in such large meetings, the media flow won’t be bidirectional. You’ll have fewer participants sending media and a lot more only receiving media. For the pure broadcasting scenario, I’ve written a guest post on the scaling challenges on Red5 Pro’s blog.
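As a first-order estimate of how many cascaded servers such a meeting needs, you can divide the meeting size by the usable capacity per server. The sketch below is my own simplification: `cascade_reserve` is a hypothetical number of participant slots set aside on each server for the server-to-server links, and it ignores the topology of the cascade itself.

```python
from math import ceil


def servers_needed(meeting_size: int, server_capacity: int = 100,
                   cascade_reserve: int = 0) -> int:
    """Lower bound on the number of media servers for a single meeting."""
    usable = server_capacity - cascade_reserve
    if usable <= 0:
        raise ValueError("cascade_reserve must be smaller than server_capacity")
    return ceil(meeting_size / usable)


no_reserve = servers_needed(5000)                         # 5,000 users, 100 per server
with_reserve = servers_needed(5000, cascade_reserve=10)   # 90 usable seats per server
print(no_reserve, with_reserve)
```

Since most participants in a meeting of this size only receive media, the real limit per server may well be higher than in an all-active call, so treat the result as a starting point for testing.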


We’ve touched a lot of areas here. Here’s what you should do when trying to decide how many users can fit in your WebRTC calls:

  1. Whatever meeting size you have in mind, it is possible to support it with WebRTC
    1. It will be a matter of costs and aligning it with your business model that will make or break that one
    2. The larger the meeting size, the more complex it will be to get it done right, and the more limitations and assumptions you’ll need to add to the equation
  2. Analyze the complexity you need to support
    1. Count the incoming and outgoing streams to each device and media server
    2. Decide on the video quality (resolution and bitrate) for each stream
  3. Define the media server you’ll be using
    1. Select a machine type to run the media server on
    2. Figure out the sizing needed before you reach scale out
    3. Check if the growth is linear on the server’s resources
    4. Decide if you scale out based on bandwidth, CPU, streams count or anything else
  4. Figure out how cascading fits into the picture
    1. Offer with it better geolocation support
    2. Assist in resource fragmentation on the cloud infrastructure
    3. Or use it to grow meetings beyond a single media server’s capacity

What’s the size of your WebRTC meetings?

The post How Many Users Can Fit in a WebRTC Call? appeared first on

7 CPaaS Trends to Follow in 2018

bloggeek - Mon, 01/08/2018 - 12:00

Here are CPaaS trends you should be expecting this year.

There’s no doubt about it. CPaaS is growing and it is doing so rapidly. It is a multibillion-dollar industry, and while still small, there’s no sign of its growth stopping anytime soon. You’ll see the numbers $4 billion and $8 billion a year appearing in different reports and estimates that are flying around when talking about the near future of the CPaaS market size and growth potential. I have no clue if the numbers are correct – I’ve never been one to play with estimates.

What I do know, is that we’ve got multiple CPaaS vendors now with ARR (Annual Run Rate) higher than $100 million. Most of it may still come from good old SMS and phone calls, but I think this will change along with how consumers communicate.

This change will make CPaaS a lot more interesting and diversified than the boring race to the bottom that seems to be prevalent in some of the players’ offering and messaging in this market. The problem with CPaaS today is twofold:

  1. SMS and voice are somewhat commoditized. There is a finite way in which you can send and receive SMS and phone calls over phone numbers, and we’ve exhausted them and how to express them in a simple API for developers to use years ago. Since then, the game we played was one of scalability, stability and price points
  2. Developers are resistant to paying for IP based communications services at the moment. They somehow believe that these are a lot easier to develop. While that is correct for the “hello world” implementation, once you need to provide long term maintenance and scalability capabilities this can grow into a huge headache – especially when you couple this with some of the trends in communication that are being introduced

Which brings me to what you can expect in 2018. Here are 7 CPaaS trends that will grow and become important this year – and more importantly – what they mean.

Planning on selecting a CPaaS vendor? Check out this shortlist of CPaaS vendor selection metrics:

Get the shortlist

#1 – Serverless

Serverless is also known as Functions.

You might know about serverless from AWS Lambda, Azure Functions, Google’s Cloud Functions and Apache’s OpenWhisk. The list here isn’t random – it goes to show that all big cloud platforms are now offering serverless capabilities.

This still isn’t prevalent in CPaaS, where for the most part, developers are expected to develop, maintain and operate their own servers that communicate with the CPaaS vendor’s infrastructure. But we do see signs of serverless making its way here.

I’ve covered that last year, when I took a deeper look into the Twilio Functions offering and what that means to the CPaaS market.

At the time, Twilio stated that Functions is already Twilio’s fastest growing product ever. Here’s where they explain what it does:

Twilio being the market leader in CPaaS, and Functions being a fast growing product of theirs means that other CPaaS vendors will follow. Simply because demand here is obvious.

#2 – Omnichannel

When SMS just isn’t enough.

Not sure when you last used SMS for personal reasons – I know that I rarely end up inside that app on my smartphone. The way things are going, SMS can be considered the spam channel of 2018. Or maybe the channel used by businesses who’ve been told that this is the best way to reach customers and interrupt them.

While I definitely see value in SMS, I also think that businesses should strive to communicate with their customers on other channels – channels their users are now focusing on with their social life. In Israel that would be Whatsapp. In the US probably a mixture of Facebook and iMessage will work better. Telegram would be the choice for Russia.

Whatever that channel is, to support it, someone needs to integrate with it. And then decide which channel to use for which customer and for what interaction. For CPaaS, that’s what Omnichannel is about. Enabling developers, and by extension businesses to communicate with their customers on the customer’s preferred channel.

2018 is going to be the year Omnichannel becomes a serious requirement.


Because now we can actually use it.

Apple’s own Business Chat service is planned to make its public debut this year.

Facebook has its own APIs already, and Whatsapp announced business accounts (=APIs).

That alone covers a large majority of customer bases.

Throw in SMS, mix and choose the ones you want. And voila! Omnichannel.

For businesses, relying on CPaaS for Omnichannel makes sense, as the hassle of adding all of these channels and maintaining them is expensive. Omnichannel CPaaS APIs will abstract that away.

For CPaaS vendors, this is a way to differentiate and make switching between vendors harder.

A win-win.

The ones offering that already? Nexmo with their Chat App and Twilio through their Engagement Cloud.

#3 – Visual / IDE

From code, to REST, to point-and-click.

We used to use DOS as an “operating system”. I worked at a small computer shop as a kid when I grew up. For a couple of years, my role was to go to people’s homes and explain to them how to use the new computer they just purchased. How to put the DOS disk inside the floppy drive, list the files in a floppy, run games and other applications.

Then came Windows (along with Mac and OS/2 and others) and we all just moved to using a visual operating system and a mouse.

As a kid, I programmed using Logo and Basic. Then Turbo Pascal – in a decent IDE for the first time. In the university, I got acquainted with Tcl/Tk. And then UI development seemed fun. Even if it was by writing code by hand. Then one day, vtcl came to life – a visual editor. Things got easier.

Developing communications is taking the same path now.

It started by needing to build your own stuff from scratch, then with open source frameworks and later CPaaS and REST (or god forbid SOAP) APIs.

In 2017, Twilio Studio was announced – a visual IDE to use on top of the Twilio functionality. In that corner, you can also count Amazon Connect, though not CPaaS but still in the domain of communications – it has a visual IDE of its own.

In a recent VoxImplant event I was invited to speak at in Russia, VoxImplant introduced a new service in beta called Smartcalls – a visual IDE on top of their CPaaS offering. Albeit… in Russian.

The concept of using visual tools requiring less coding can greatly increase productivity and the target audience of these tools. They are no longer restricted to developers “who code”. Hell – I can use these tools. I played with Twilio Studio a bit – it was fun and intuitive. It guides the way you think about what needs to be done. About the flow of the service.

I really can’t see how other CPaaS vendors are going to ignore this trend and not work on their own visual offerings during 2018.

#4 – Machine Learning and Artificial Intelligence

It is time to be smart about communications

When I worked at Amdocs some years ago, we looked into the area of Big Data Analytics. It was all about how you take the boatloads of information telecommunication companies have and do something with it. You start by analyzing and visualizing it, then move towards the domain of the actionable.

It frustrated the hell out of me to understand how little communication vendors are doing with their data compared to enterprises in other markets. Or at least that was my impression looking from inside a vendor.

Fast forward to today, and what you find with CPaaS vendors is that they are offering a well oiled machine that provides generic communications. You can do whatever you want with it, and the smart ones are adding analytics on top for their own needs.

But what about the CPaaS vendors themselves? Shouldn’t they be doing something about analytics? Or its better branded colleague known as machine learning?

Gustavo Garcia wrote a good article about it – improving real time communications with machine learning. This is where most CPaaS vendors are probably looking today, optimizing their network to offer a better service.

But it is just scratching the surface.

The obvious is adding things around NLP – speech to text, text to speech, translation. All those are being done by integrating with third parties today, and many of the CPaaS vendors offer these out of the box.

To move the needle and differentiate, more needs to be done:

  1. The internal structure of the CPaaS vendors should take into account the need for researching data. Data scientists and machine learning people have to be part of the development and product teams for this to ever happen
  2. CPaaS vendors need to start thinking on what they can offer by analyzing their own data (and their customer’s communications) beyond just optimizing it

If you are a CPaaS vendor and you don’t have at least a data scientist, a machine learning developer and a product manager savvy in this domain yet, then start recruiting.

#5 – AR/VR

Time to connect ARKit and ARCore to communications.

Augmented reality and virtual reality have been around for the better part of the last decade or two. But somehow, they are only now becoming interesting.

I guess the popularity of AR has grown a lot, and where it fits directly in smartphones today (rather than in bulky 3D headsets) is with things like Pokemon Go and camera filters (popularized by Snapchat and found everywhere today).

With the introduction of Apple ARKit and Google ARCore, this is only going to get more commonplace. And what we see now is CPaaS vendors finding their way around this technology.

The most interesting one yet is Twilio’s work with ARKit, which they showcased at last year’s Kranky Geek event.

With all the focus put in this domain, I am sure we’ll see more CPaaS vendors looking into it.

#6 – Bots

Omnichannel + Machine Learning + Automation = Bots

Chat bots are all the rage. Search the internet and you’ll be thinking that humans no longer talk to customers. It is all taken care of by bots.

I’ve added a chat widget to certain pages on my website. And every once in a while I get a question there asking if that’s a human they’re interacting with.

Bots require integration and APIs. They are also about communications. Which is probably why CPaaS vendors are taking a step in this direction as well. The ones adding Omnichannel offerings are in effect enabling bots to be created across multiple channels.

That’s a first step though, as the next would be to cater to this market better by enabling conversational interfaces and easing the packaging of bots for the various channels.

Expect to see a few announcements around bots made by CPaaS vendors this year. A lot of it will revolve around Amazon Alexa and Google Home.

#7 – GDPR

The governance headache we’ve all been waiting for.

GDPR stands for General Data Protection Regulation. It is a new set of EU rules that have been put in place to protect the data related to EU citizens that is collected and stored.

While it is easy to assume that CPaaS vendors store no data – they “live” in real time – that isn’t accurate.

Stored metadata and logs may fall into the GDPR black hole, and recording services definitely do. With the introduction of Omnichannel and Bots comes chat history storage.

Twilio jumped on this bandwagon last year with a GDPR program. Other vendors such as MessageBird indicated future support of GDPR. All global CPaaS vendors will need to support GDPR, and since these regulations come into force this year, 2018 will be the year GDPR gets more attention and focus from CPaaS vendors.

2018 – The Year CPaaS Vendors Differentiated

In the past few years, we’ve seen CPaaS vendors struggling in two directions:

  1. Increasing their customer base, mainly around SMS and voice offerings – which is where most of the revenue is these days
  2. Growing from a telecom focused player to a global player

That second point is important. Up until recently, CPaaS equated to running one or two data centers (or the equivalent of running from a small number of cloud based data centers), connecting developers via REST APIs to the telecom backend. With the introduction of IP based communications (and WebRTC), there was a growing need for client side SDKs along with more points of presence closer to the end user.

We seem to be past that hurdle for most CPaaS vendors. Most of them have grown their footprint to include a global infrastructure.

The next frontier is going to happen elsewhere:

  1. Serverless – in making the services easier for developers to adopt by reducing the requirement for customers to deploy their own machines
  2. Omnichannel – extending the reach beyond the telecom channels of SMS and voice into social networks
  3. Visual / IDE – grow the service beyond developers, making it easier to use and faster to deploy with
  4. Machine Learning and Artificial Intelligence – add intelligence and analytics based services
  5. AR/VR – capture the new world of augmented and virtual reality and enhance it with communications
  6. Bots – align with the A2P model of businesses communicating with customers through automation
  7. GDPR – provide support for the new EU initiative, adding governance and regulation as another added value of choosing CPaaS instead of in-house development

CPaaS will move at a rapid pace in the next few years. Vendors who won’t invest and grow their offerings and business will not stay with us for long.


The post 7 CPaaS Trends to Follow in 2018 appeared first on

New Developer: Paul Claudiu Boriga

miconda - Thu, 01/04/2018 - 13:56
Recently another person got commit access to the Kamailio git repository, namely Paul Claudiu Boriga. He is working for 1&1 Germany and in the past he has contributed valuable patches to several components, such as the ndb_redis, carrierroute and rtpengine modules. Claudiu joins other colleagues from 1&1 in the Kamailio development team to maintain modules contributed by the company over time, like carrierroute, memcached, pdb or userblacklist.

His Github profile is available at:

A warm welcome from us all, looking forward to more contributions from him in the future!

Thanks for flying Kamailio!

