News from Industry

What Does Machine Learning Have to do with MOS Scores?

bloggeek - Mon, 12/03/2018 - 12:00

Human subjectivity in MOS calculations doesn’t hold water when it comes to heterogeneous environments. That’s where machine learning comes into play.

MOS score. That Mean Opinion Score. You get a voice call. You want to know its quality. So you use MOS. It gives you a number between 1 and 5. 1 being bad. 5 being great. If you get 3 or above – be happy and move on, they say. If you get 4.something – you’re a god. If you don’t agree with my classification of the numbers then read on – there’s probably a good reason why we don’t agree.

Anyways, if you go down the rabbit hole of how MOS gets calculated, you’ll find out that there isn’t a single way of doing that. You can go now and define your own MOS scoring algorithm if you want, based on tests you’ll conduct. From that same Wikipedia link about MOS:

“a MOS value should only be reported if the context in which the values have been collected in is known and reported as well”

Phrased differently – MOS is highly subjective and you can’t really compare MOS scores produced on one device to MOS scores produced on another device.

This is why I really truly hate delving into these globally-accepted-but-somewhat-useless quality metrics (and why we ended up with a slightly different scoring system in testRTC for our monitoring and testing services).

What Goes into MOS Scoring Calculations?

Easy. Everything.

Or at least everything you have access to:

  • RTCP sender and receiver reports
  • Received RTP packets
  • Knowing the voice codec used
  • Actually decoding the audio stream and “listening” to it
  • Understanding what the end user is really going to hear

Here are a few examples:

Physical desk phone

A physical IP phone has access to EVERYTHING. All the software and all the hardware.

It even knows how the headset works and what quality it offers.

Theoretically then, it can provide an accurate MOS that factors in everything there is.

Android native app

Android apps have access to all the software. Almost. Mostly.

The low level device drivers are only as well known as the hardware the app is running on. The only problem is the number of potential devices. A few years back, these types of visualizations of the Android fragmentation were in fashion:

This one’s from OpenSignal. Different devices have different locations for their mics and speakers. They use different device drivers. Have different “flavors” of the Android OS. They act differently and offer slightly different voice quality as well.

What does measuring what an objective person thinks about the quality of a played audio stream mean in such a case? Do we need to test this objectivity per device?

Media server that routes voice around

Then we have the media server. It sends and receives voice. It might not even decode the audio (it could, and sometimes it does).

How does it measure MOS? What would it decide is good audio versus bad audio? It has access to all packets… so it can still be rather accurate. Maybe.

WebRTC inside a browser

And we have WebRTC. Can’t write an article without mentioning WebRTC.

Here though, it is quite the challenge.

How would a browser measure MOS of its audio? It can probably do as good a job as an Android device. But for some reason, MOS scoring isn’t part of the WebRTC bundle. At least not today.

So how would a JavaScript web application calculate MOS of the incoming audio? By using getStats? That has access to an abstraction on top of the RTCP sender and receiver reports. It correlates to these to some extent. But that’s about as much as it has at its disposal for such calculations, which doesn’t amount to much.
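To make that concrete, here is a minimal sketch of the kind of estimate a web app could derive from the round trip time, jitter and packet loss figures that getStats exposes. It is written in Python for brevity (the arithmetic ports directly to JavaScript). The R-factor to MOS mapping is the one from the ITU-T G.107 E-model; everything else, including the function names, the "effective latency" heuristic and its constants, is an assumption for illustration and not something WebRTC or any vendor prescribes.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def estimate_mos(rtt_ms: float, jitter_ms: float, loss_pct: float) -> float:
    """Rough MOS estimate from network statistics only.

    The heuristic below and all of its constants are illustrative
    assumptions, not a standardized calculation.
    """
    # Fold jitter into a single "effective latency" figure (assumed weights).
    effective_latency = rtt_ms / 2.0 + 2.0 * jitter_ms + 10.0

    # Penalize latency gently at first, more steeply beyond ~160 ms.
    if effective_latency < 160.0:
        r = 93.2 - effective_latency / 40.0
    else:
        r = 93.2 - (effective_latency - 120.0) / 10.0

    # Assumed penalty of 2.5 R-factor points per percent of packet loss.
    r -= 2.5 * loss_pct

    # Clamp to the range the G.107 mapping can produce.
    return round(min(4.5, max(1.0, r_factor_to_mos(r))), 2)


if __name__ == "__main__":
    print("clean network:", estimate_mos(rtt_ms=20, jitter_ms=2, loss_pct=0))
    print("congested network:", estimate_mos(rtt_ms=400, jitter_ms=60, loss_pct=10))
```

Note how little goes in: three network numbers. Nothing about the microphone, the room, the codec configuration or what the user actually hears.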

Back to MOS calculations

But what does MOS really calculate?

The quality of the voice I hear in a session?

Maybe the quality of voice the network is capable of supporting?

Or is it the quality of the software stack I use?

What about the issue with voice quality when the person I am speaking with is just standing in a crowded room? Would that affect MOS? Does the actual original content need to be factored into MOS scores to begin with?

I’ll leave these questions open, but say that in my opinion, whatever quality measurement you look at, it should offer some information about the things that are in your power to change – at least as a developer or product owner. Otherwise, what can you do with that information?

What Affects Audio Quality in Communications?

Everything.

  • The quality of the microphone used to record the original audio (though this usually gets neglected in discussions around MOS)
  • The location of the person speaking – a crowded room, airport, next to a working vacuum cleaner – or in a silent recording studio
  • The voice codec used, its configuration and the level and aggressiveness of the compression it is using for this session
  • The network conditions – in the last mile from both the sender and the receiver, of every hop along the way and the routers and servers it has to pass through
  • The media servers – and every possible aspect about them
  • The receiver’s software. Especially the jitter buffer and packet loss concealment algorithms
  • The sender’s acoustic echo cancellation implementation quality
  • The receiver’s voice decoder implementation
  • The receiver’s speakers

I am sure I missed a bullet or two. Feel free to add them in the comments.

The thing is, there are a lot of things that end up affecting audio quality when you make the decision of sending it through a network.

Is Machine Learning Killing MOS Scoring or Saving It?

So what did we have so far?

A scoring system – MOS, which is subjective and inaccurate. It is also widely used and accepted as THE quality measure of voice calls. Most of the time, it looks at network traffic to decide on the quality level.

At Kranky Geek 2018, one of the interesting sessions for me was the one given by Curtis Peterson of RingCentral:

He discussed that problem of having different MOS scores for the SAME call in each device the call passes through in the network. The solution was to use machine learning to normalize MOS scoring across the network.
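This post doesn’t go into how RingCentral’s models work, so the following is only a stand-in to show what “normalizing” could mean in its simplest form: fit a per-device linear calibration (ordinary least squares) that maps a device’s self-reported MOS onto a common reference scale. The function names and the sample numbers are hypothetical, and a real system would presumably use far richer features than a single score.

```python
from statistics import mean


def fit_calibration(device_scores, reference_scores):
    """Least-squares fit of: reference ~= slope * device + intercept."""
    if len(device_scores) != len(reference_scores) or len(device_scores) < 2:
        raise ValueError("need at least two paired scores")
    mx, my = mean(device_scores), mean(reference_scores)
    sxx = sum((x - mx) ** 2 for x in device_scores)
    if sxx == 0:
        raise ValueError("device scores must not all be identical")
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(device_scores, reference_scores))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def normalize(score, calibration):
    """Map a device-reported MOS onto the reference scale, clamped to 1..5."""
    slope, intercept = calibration
    return min(5.0, max(1.0, slope * score + intercept))


if __name__ == "__main__":
    # Hypothetical paired scores for the same calls: what one device
    # reported versus what a chosen reference point reported.
    device = [4.4, 4.0, 3.5, 2.9]
    reference = [4.1, 3.6, 3.0, 2.3]
    calibration = fit_calibration(device, reference)
    print("normalized 3.8 ->", round(normalize(3.8, calibration), 2))
```

The design choice here is to pick one measurement point as the reference and translate everyone else onto its scale, so that the same call gets the same number no matter which device reports it.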

This got me thinking further.

Let’s say one of these devices provides machine learning based noise suppression. It is SO good, that it is even employed on the incoming stream, as opposed to placing it traditionally on the outgoing stream. This means that after passing through the network, and getting scored for MOS by some entity along the way, the device magically “improves” the audio simply by reducing the noise.

Does that help or hurt MOS scoring? Or at least the ability to provide something that can be easily normalized or referenced.

Machine Learning and Media Optimization

We’ve had multiple vendors at Kranky Geek touching the domain of media optimizations. This year, their focus was mainly on video – both Agora.io and Houseparty gave eye-opening presentations on using machine learning to improve the quality of a received video stream. Each taking a different approach to tackling the problem.

While researching for the AI in RTC report, we’ve seen other types of optimizations being employed. The idea is always to “silently” improve the quality of the call, offering a better experience to the users.

Over the next couple of years, we will see this area grow fast, with proprietary algorithms and techniques based on machine learning added to the arms race of the various communication vendors.

The post What Does Machine Learning Have to do with MOS Scores? appeared first on BlogGeek.me.

Kamailio 5.2: Deb And RPM Repositories

miconda - Fri, 11/30/2018 - 12:05
The packages of Kamailio v5.2.0 for Debian/Ubuntu and RPM-based distributions (CentOS, RedHat, OpenSuse, Fedora) are available to use.

For Debian/Ubuntu, you can set the APT repository on your system to the links provided at:

For the RPM-based distributions, their repositories are listed at:

Enjoy!

Thanks for flying Kamailio!

Kamailio v5.2.0 Released

miconda - Wed, 11/28/2018 - 19:30
November 28, 2018

Kamailio v5.2.0 is out – a new major release, bringing new features and improvements added during nine months of development and about two months of testing.

In short, this major release brings 6 new modules and enhancements to more than 70 existing modules, plus components of the core and internal libraries as well as optimizations for embedded interpreters (KEMI framework). Detailed release notes are available at:

This is the third major release in the series of 5.x.y versions. Besides adding plenty of new features, a lot of development was directed to unify the exports structure for modules, enhance dispatcher (the load balancer module), tls, RTP processing and to make available more functions to KEMI interface.

Enjoy SIP routing in a secure, flexible and easier way with Kamailio v5.2.0!

Thank you for flying Kamailio and looking forward to meeting you at Kamailio World Conference 2019!

HELLO 2. Is Hardware Gear Finally Taking WebRTC Seriously?

bloggeek - Tue, 11/27/2018 - 12:00

It is about time for video room systems to adopt WebRTC native approaches.

When I first started this blog, I had no clue where it was going to take me. I wanted it to be about developers. To be interesting. I also decided early on to write three posts about WebRTC:

  1. What is WebRTC
  2. How WebRTC is going to affect signaling
  3. What a room system needs to look like in a WebRTC world

Somehow, I ended up covering a lot more ground since then when it comes to WebRTC…

Signaling came a long way since then. Most of you might not even know what H.323 is. SIP is still important, but a lot less these days. Proprietary signaling mechanisms are thriving – and that’s a good thing.

The thing that never did come into play was WebRTC in video room systems. When you went to purchase a room system, you were tethered to the vendor providing you that system, along with the signaling standards it supported. It is still painfully hard to connect room systems of different vendors. And if you factor in the need to integrate it with other services the enterprise uses, it becomes even worse.

What’s a Video Room System Anyway?

This is called a codec for some arcane reason.

A video room system is a device split into 4 parts in most cases:

  1. High end camera
  2. Speaker pod
  3. Remote control
  4. The brains (that’s the “codec”)

The TV display itself is almost never included in the package (unless you’re starting to look at the new touch boards).

Speaker pods are sometimes integrated into the camera itself. This is suitable for smaller meeting rooms, also known as huddle rooms.

Remote controls were always nasty. A meeting room will have at least 3 of those: one for the TV, one for the projector in the room and one for the video room system. The one for the video room system is somehow the most complex to use. The projector one is gone along with the projector, now that we all just use the TV(s) instead.

In many cases, an external touch panel will be used to control the gizmos in the room, including lighting and other moving parts. And today, in many cases, these room systems are capable of tethering themselves to apps on smartphones for the control, killing the need for the remote control altogether.

The brains? They are sometimes just wrapped into the same box as the camera, just to save on cabling and space.

It started off as an all customized solution. The hardware, the software – it was all proprietary and specific. DSPs made up the “brains”. High end cameras were purchased and branded from Sony. The software was written in embedded operating systems like VxWorks (anyone remember that painful thing?)

We’ve standardized some of it as time went by. Cameras have become somewhat of a commodity, now that we’re all carrying powerful ones in our pockets. Operating systems for these devices have moved on to be Linux based. DSPs are less common now that we can just use SoC (system on chip, packing the host operating system and the DSPs nicely together) or just rely on Intel chips.

What never happened is the standardization and commoditization of the software in the brains – the actual video software running the room system.

Let’s Talk UCaaS

That may finally be changing. As we head to the cloud, UCaaS (unified communication as a service) vendors are beefing up their offerings. Adding contact centers, APIs, video support and other trinkets to their battle chest.

In the past few months, we’ve seen:

Each of these vendors uses a third party for its video calling services today, but can now potentially displace it with its own technology stack.

While that solves their video software issues, how are they going to handle video room systems?

Let’s see what the other notable players have done in that domain:

  1. Microsoft, which has Teams and Skype, has been partnering with hardware vendors for years, getting these vendors to build their stack to the Microsoft spec in order to integrate with it and become official partners
  2. Cisco has its own hardware products, giving it the full spectrum of the solution
  3. Google has its Chromebox

Vonage, 8×8 and RingCentral aren’t hardware vendors. They aren’t going to start designing and manufacturing video room systems. When it comes to physical phones, they partner with multiple device manufacturers. This is hard work when it comes to integration, to adding more devices into the fold and to introducing new features. The choice of video room system devices is limited today. Polycom offers partner-friendly solutions. Logitech sells components/peripherals (mainly the cameras). Lifesize has its own cloud service. And again, integrating these video room systems with other features and capabilities is sometimes close to impossible.

On the other end of the spectrum, there’s the customer. Banking on one UCaaS supplier is fine, but if you invest in hardware devices, will they be usable when switching to another vendor? What if you want more than a single service to run on a room system? Let’s say you want to record and transcribe physical meetings taking place in a room – when not on a call. Does the UCaaS vendor or the video room system vendor need to add such a capability? Can you add it on your own by partnering with a totally different vendor while still using the same hardware?

Now, here’s the thing:

  • TokBox uses proprietary signaling
  • Jitsi uses proprietary signaling
  • Microsoft’s own use of the SIP standard is notoriously non-standard to some extent
  • Cisco puts its own “secret sauce” in all of its devices
  • And Google uses Meet, which runs… proprietary signaling

How can you partner with video room system vendors (even if there are ones) in a way that is relatively easy?

You Redefine What a Room System is

The one thing that is now changing is the software that is built into a video room system.

That is done by first changing the operating system. Instead of Linux – Android.

And Android means we can start thinking of a video room system as a device that can run multiple different applications by different vendors for different tasks.

Need to run Zoom? Why not?

Wanna switch to GoToMeeting? Fine.

How about attending a WebEx call? Sure.

Just install any of these apps – or better yet – try joining them from an integrated Chrome browser if they happen to support WebRTC.

But what if you want to show internal news for your company on that display connected to the video meeting room? Or give the ability to record and transcribe local meetings? Or connect to other internal or external services with ease? Not a problem. Just install that app on Android and you’re ready to go.

The difference here is that there is no integration work required from the video room system vendor. This is something the UCaaS vendor can do – or god forbid – the actual enterprise who is using the video room system.

I’ve been waiting for this level of commoditization and flexibility to take place.

Enter HELLO 2

One of the vendors in this space is Solaborate. I interviewed Labinot years ago on this blog. That was about his enterprise social network service. Since then, he’s added a hardware device called HELLO, which successfully launched on Kickstarter, and he is now running a Kickstarter campaign for HELLO 2.

The HELLO 2 is an “all in one” video room system capable of what I was hoping would happen:

  • The brains is built into the camera
  • It is based on a Qualcomm chipset, giving it most of what a high end phone can do (which is… a lot)
  • It has a 4K camera with zoom capabilities
  • Built-in mic array
  • And … AI capabilities (why not?)

The best though? It runs on Android, so you can either use the HELLO 2 / Solaborate applications or any other application you fancy using (that said, the applications may not be as polished on the big screen as they are on a phone or a tablet and that requires a bit of reworking on their end).

This gives some real flexibility:

  1. UCaaS vendors can now offer a hardware video room system running their own software applications, not needing to rely on the vendor doing the work and the integration. This gives full brandability along with the ability to integrate intimately with all of UCaaS vendor’s services and capabilities
  2. End customers can install and add the other services and apps that they use within their enterprise, without needing to beg the UCaaS vendor to support and integrate with them

One more thing – you can run Chrome directly on the HELLO 2, and it will successfully run any WebRTC based web page.

The Future

This is the model of the future when it comes to video room systems. Generic types of devices, packing all the needed hardware, letting other vendors and customers handle the software components.

And today, there’s no easier way to do that than using Android as the baseline operating system. Having a Chrome browser inside the device is just an added bonus to let you join with guest access to those pesky calls your suppliers and customers schedule on their own services.

The post HELLO 2. Is Hardware Gear Finally Taking WebRTC Seriously? appeared first on BlogGeek.me.

Announcing Next Kamailio World Conference, May 6-8, 2019, in Berlin

miconda - Mon, 11/26/2018 - 12:00
The next edition of Kamailio World Conference is planned to take place at the same location as the past editions, hosted by Fraunhofer Fokus and Forum in the city center of Berlin, Germany, during May 6-8, 2019.

The website of the event and the call for presentations will be launched in the near future, stay tuned!

Meanwhile, you can browse the website of the previous edition in order to get an idea about the type of event and its content:

Enjoy the upcoming winter or summer season!

Thanks for flying Kamailio!

Digital Ocean private LAN is totally useless

TXLAB - Thu, 11/22/2018 - 11:07

Digital Ocean is offering a private LAN for internal communication between the VMs, and they claim it’s isolated from other customers. You get some random addresses within 10.133.0.0/16 (or maybe some other range), and they can talk to each other on dedicated virtual NICs.

But that’s it. You cannot run OSPF because multicast packets are not let through. Even if you manage to configure direct neighbors in OSPF, it is rendered useless because the private LAN does not allow packets with destination IP addresses outside of the LAN range. So, any kind of routing with a next hop in the private LAN would not work.
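A small model of that behaviour shows why routing breaks. This is only a sketch of the filter as observed from the outside, not Digital Ocean’s actual implementation: the function name is made up, 10.133.0.0/16 is the range mentioned above, and the other addresses are arbitrary examples (224.0.0.5 being the standard OSPF all-routers multicast group).

```python
import ipaddress

# Range quoted above; your droplets may sit in a different one.
PRIVATE_LAN = ipaddress.ip_network("10.133.0.0/16")


def passes_private_lan(dst_ip: str, lan=PRIVATE_LAN) -> bool:
    """Assumed filter rule: only destination IPs inside the LAN range pass."""
    return ipaddress.ip_address(dst_ip) in lan


if __name__ == "__main__":
    # Direct VM-to-VM traffic: the destination is inside the range.
    print("10.133.4.7    ->", passes_private_lan("10.133.4.7"))

    # OSPF hello packets go to a multicast group, which is outside the range.
    print("224.0.0.5     ->", passes_private_lan("224.0.0.5"))

    # A routed packet: the next hop may be 10.133.0.5, but the IP header
    # still carries the final destination, which is outside the range.
    print("192.168.50.10 ->", passes_private_lan("192.168.50.10"))
```

The next hop only shows up at layer 2; the filter looks at the destination in the IP header, so anything that is merely passing through the LAN gets dropped.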

Too bad guys, very disappointed. So, we need to resort to Tinc VPN for internal routing, and this private LAN doesn’t make any sense.

Kranky Geek 2018. A post event post

bloggeek - Mon, 11/19/2018 - 12:00

For me, Kranky Geek 2018 was a tremendously fun experience.

We had our fourth Kranky Geek event in San Francisco last week. As usual, it is a nerve-wracking experience up until the point it ends. And it doesn’t start on the day of the event itself – we’ve been busy with content curation, handling presentation drafts and doing dry runs for a few weeks.

The result is quite satisfying. We’ve decided this time to dig even deeper into the domain of artificial intelligence and machine learning and its role in real time communications. As I’ve been saying, WebRTC is ready – so what would be the point of doing an event about WebRTC? We have a lot of WebRTC topics already covered from our past events – and they are all available in the Kranky Geek YouTube channel.

The way we see it, there are 4 domains we had to cover: speech analytics, voicebots, computer vision and RTC optimization.

So we went hunting for the event. In the end, we were able to cover all four domains and squeeze in a few WebRTC specific topics as well.

The Sessions

This year, we had the biggest number of sessions. The event has become a full day event from a shorter one over the years. The people I talked to noted that the day was long and tiring, but somehow, almost everyone stayed to the end. Here’s what we had this year:

Our own welcome

Kranky Geek SF 2018: AI in RTC from Tsahi Levent-levi

One thing to note here – our AI in RTC report got a promotional discount of ~33%, which will be available until the end of the month. If this space interests you, then definitely check it out.

Discord

Discord operates a large chat operation for gamers. Part of that service includes voice and video calling. At peak, they handle 2.8 million concurrent voice connections to their service.

What they shared were the changes they made to the vanilla WebRTC code base in order to fit their needs.

Facebook

Facebook were kind enough to give a presentation around Facebook Portal – their new home device that is capable of handling video calls (using WebRTC of course). The device uses machine learning to track the people in the room during a call. They talked about the challenges that come with automating the camera’s zoom and with connecting calls from Portal devices to mobile phones.

This was the first time they shared that information publicly at a conference.

Intel

Intel announced open sourcing their media server – the Intel Collaboration Suite for WebRTC – under the name of Open Media Streamer. They also shared information about svt-hevc, their open source HEVC encoder.

Voicebase

Voicebase talked about Paralinguistics – the way we speak as opposed to the words we are saying. They shared the path they took charting that space, and understanding what makes more sense or less sense in terms of value.

Voicera

Voicera discussed virtual assistants and how they need to understand transcriptions.

IBM

IBM explained the notion of voicebots and how they fit into contact centers. They explained the need to be able to hand off from a voicebot to a human agent.

Nexmo

Nexmo showed a demo using Dialogflow, connected to a voice service for ordering a pizza. It stressed the need to be able to connect communication services to various machine learning ones.

Dialpad

Dialpad explained how to take an open source speech to text engine and add some custom words into it in order to improve the accuracy of the transcription.

Callstats

Callstats clustered the sessions they are collecting, trying to figure out from that information the type of call and the root cause of issues it may have.

RingCentral

RingCentral normalized MOS scores of audio calls across its network and devices, to be able to give a clear indication of call quality – it appears that while there’s a standard specification for MOS, asking device manufacturers to follow it to the letter is rather challenging, so using machine learning they are “fixing” that issue.

Google

Google talked about the current status and efforts in getting Chrome’s WebRTC implementation to 1.0 specification. It also shared the work being done to improve audio stability and performance in Chrome (lots of architecture changes in how devices get accessed in order to reduce the number of threads used and get a stable delay model for its acoustic echo canceller). There was also a look at what comes after 1.0 – WebRTC NV and what role WebAssembly may play there (I’ll write more about it in the future).

Agora

Agora showed how they use super resolution to improve video quality in calls, and what it means to run super resolution on a mobile device.

Houseparty

Houseparty used machine learning to improve video quality as well, taking a different approach. They shared the work they are doing and the effort it takes to bring it to production.

Microsoft

Microsoft shared the work done on WebRTC on UWP and explained how AR/VR fits into the story and the enterprise use cases they are seeing in the market.

Session Recordings

As always, all the sessions were recorded and are available online.

Kranky Geek in 2019

Every year we’ve done a Kranky Geek event, we came in with the notion that this is the last one. Not sure why, but that was always the case. Then about 9 months after the event, we started discussing the next event with Google.

We’ve changed that this time. We are going to do an event in 2019, and we have a name for it:

Kranky Geek SF 2019

We have a tentative date for the event: November 15, 2019

Put it in your calendar.

We don’t yet know what the theme for next year will be, but I have a hunch that it will include WebRTC and machine learning.

If you want to speak – contact me

If you want to sponsor – contact me

If you have feedback on what we should improve – you know – contact me

Oh – and if you are interested in AI in WebRTC, check out our report – there’s a discount available for it until the end of the month.

The post Kranky Geek 2018. A post event post appeared first on BlogGeek.me.

Releasing Kamailio v5.2.0

miconda - Mon, 11/19/2018 - 11:59
We are considering releasing v5.2.0 (the first stable version out of branch 5.2) next week, likely on Wednesday, Nov 28, 2018.

It still allows a bit more than a week of testing, as well as time to prepare the online resources for it (documentation, wiki pages, upgrade guidelines, etc.).

If there is any issue you are aware of and have not yet reported to the github.com bug tracker, do it as soon as possible to give it a chance to be fixed in time for the next major release.

Thanks for flying Kamailio!

8×8 Acquires Jitsi From Atlassian. Winners and Losers

bloggeek - Thu, 11/08/2018 - 12:00

Jitsi was just acquired by 8×8, shifting hands from Atlassian. Here’s what to expect.

It seems that Jitsi has now switched hands, moving from Atlassian to 8×8.

Three months ago, Atlassian made a bold (desperate?) decision. It put up a white flag, decided to kill Stride after investing huge amounts of money and resources in it, throw Hipchat in along with it, and “sell” them to Slack, who “acquired” them.

The weird thing in this acquisition was that Jitsi was left behind.

Jitsi is an open source media framework. One of the most popular WebRTC frameworks out there. I wrote about that acquisition in 2015. The reason behind it was Atlassian’s need to own the video communications technology that powered Hipchat. And now that Hipchat is gone, what would Atlassian need Jitsi for?

The last 3 years

The last 3 years have been good for Jitsi in Atlassian.

The team of developers it had was big, considering its scope (and open-sourceness). Especially if you factor in the fact that everything that Hipchat (and Stride) needed from Jitsi was implemented directly inside Jitsi. Not on a private branch of the project available only to Atlassian.

Compare it to how Twilio treated Kurento after its acquisition… Atlassian did a great job at keeping Jitsi’s momentum and community. At the very least, it didn’t hurt the project, letting it grow and flourish, paying the salaries of its developers.

The interesting initiative that took place alongside the Jitsi open source project is Jitsi Meet – a free version of a group video calling service. One that wasn’t limited to a small number of participants or lower video resolutions.

Jitsi is in a better place than it was 3 years ago prior to its acquisition.

Leaving Atlassian

Leaving Atlassian was a matter of time.

There was no room in today’s Atlassian for an open source project like Jitsi that brings no added value to its commercial products.

Jitsi didn’t go to Slack as part of the Hipchat/Stride deal. Slack were already using Janus, and moving on to their own homegrown media server – something they shared with us at Kranky Geek 2017 (hint: come and join us this year at Kranky Geek 2018). There was no reason for them to further invest in yet another migration – or they might have wanted to migrate to Jitsi and acquihire the team but it didn’t pan out.

That left Atlassian with one of 3 alternatives:

  1. Kill the project and be done with it. Send the developers home or integrate them into some other parts of Atlassian. It would work nicely, but if the asset can be sold, then why not recoup some money?
  2. Spin out the project. Let the team go, giving them back ownership of the code, and have them go scrape for a livelihood around Jitsi. Probably by offering a commercial license, support and customization services, etc. – this isn’t that far out as an idea – it is how Janus (another open source media framework) operates today and how Jitsi operated prior to its acquisition by Atlassian
  3. Sell it to someone who’s interested in it. This is what it ended up doing. Given the other alternatives in front of them, I tend to agree with Andy’s statement that this is a mercy sale

Joining 8×8

8×8 acquiring Jitsi is an interesting choice.

Here’s where things get interesting:

8×8 already has a WebRTC based web conferencing solution called “8×8 Virtual Office Meetings Online”. Somewhere in 2016, this service got rewritten. At some point between then and now, guest access on Chrome was introduced. From the looks of it, based on WebRTC.

Why would 8×8 need/want Jitsi when it had a solution already?

I can think of three possible reasons for it:

  1. Their WebRTC solution isn’t that good, too expensive, and they were looking for a better alternative. Jitsi was a catch in such a case
  2. 8×8 is looking to own its video technology and not use third party software, commercial or open source
  3. They were using Jitsi for their 8×8 meetings thingy and Atlassian selling that asset was an opportunity for them to control the tech stack without relying on a third party – probably on the cheap

What would 8×8 do with Jitsi?

The obvious thing is to integrate the tech into its meetings service. If it is already there, then use the Jitsi team of developers to tweak and finetune the thing for the 8×8 use case.

If it isn’t there yet, then integrate it and replace its current WebRTC tech in the meetings app. This is a more challenging undertaking, as Jitsi will need to meet the current feature list of what 8×8 already has in that domain, along with integrating to an existing codebase of a service and an application.

Jitsi probably has most of the needed features to make this happen. It wouldn’t have been acquired otherwise.

In a different area, 8×8 has no real open source activity at the moment. Its github account is mostly forked repos. Searching for “8×8 open source” is dominated by the Jitsi acquisition news:

(the rest are comparisons to other vendors, who are leaning more heavily on open source)

If 8×8 is interested in embracing open source, then it just got an interesting opportunity to do just that. Which brings me to the last topic –

The future of Jitsi

What will become of Jitsi?

Here we need to look at Jitsi and Jitsi Meet separately.

Jitsi

The Jitsi Videobridge, along with its derivatives, add ons, plugins, extensions and client-side SDKs.

That’s the open source part of the project. At Atlassian, there was nothing kept for internal use of Hipchat/Stride. Everything found its way back to the open source project.

Will 8×8 continue in that path?

Their focus in the coming months is going to be the integration of Jitsi into their 8×8 meetings service. They are bound to use the resources of the Jitsi team to do that.

Managers may decide to implement some of the features in the 8×8 meetings service moving forward and not invest in adding it to the Jitsi open source project. Or they might decide to add everything via Jitsi.

8×8 might end up taking the extreme – ditching the Jitsi project as an open source one – embedding it into their meetings app and from there on, investing in that private branch only. I see that as a highly unlikely outcome in the next 2-3 years.

Time will tell which direction is taken.

Jitsi Meet

Jitsi Meet is a different story altogether.

It is a group video meeting service. One which doesn’t limit the users’ bitrate in sessions, doesn’t limit the number of users in a session, offers mobile apps, Slack and calendar integration and scales globally. All for free.

Would 8×8 see it as competition to their own 8×8 meetings app? If it grows in popularity and its maintenance costs increase, how happy would 8×8 be in paying the bills? Would it see Jitsi Meet as a sales tool for its other services? How would it measure the success of this service?

Whatsapp’s founders just left Facebook this year. It was over disputes about data, privacy and such. Most of all, it was probably a dispute around the future of Whatsapp and Facebook’s intent of monetizing the asset. The same (at a much smaller scale) can happen here at some point.

How would 8×8 monetize Jitsi Meet? Should it? If it doesn’t, should it kill it?

I don’t know the answers. I am sure 8×8 doesn’t either. It is just too early to tell.

Last Words

Jitsi is an open source success story in WebRTC. There’s no doubt about it.

It is now entering a new chapter in its life, under 8×8.

I wish the team the best of luck, and I hope that we as an industry keep the option to use Jitsi for our future projects.

Media Frameworks are part of the picture of the backend story of WebRTC. Care to learn the rest? Try out my free mini-video series on WebRTC backend servers:

Register to the video series

The post 8×8 Acquires Jitsi From Atlassian. Winners and Losers appeared first on BlogGeek.me.

Development Open For Kamailio v5.3

miconda - Wed, 11/07/2018 - 11:58
With the creation of branch 5.2 done yesterday, the master branch is from now on open for adding new features, to be part of future release series v5.3.x.

Based on the workflow used during the past years, the next future release v5.3.0 should be out after another 8-10 months of development, plus 1-2 months of testing, so sometime in the summer or autumn of 2019.

Even now there is a new pull request on its way to be merged in master branch that is adding a new module – the rtp_media_server.

So the new development cycle is starting very promising. Expect plenty of enhancements and new features during the development of v5.3 series.

Thanks for flying Kamailio!

Kamailio Git Branch 5.2 Created

miconda - Tue, 11/06/2018 - 19:56
The branch 5.2 has been created in the git repository of Kamailio, to be used for releasing v5.2.x series.

To check out this branch, the following commands can be used:

  git clone https://github.com/kamailio/kamailio kamailio-5.2
  cd kamailio-5.2
  git checkout -b 5.2 origin/5.2

Pushing commits in this branch:

  git push origin 5.2:5.2

Note that 5.2 is an official stable branch, so only bug fixes, missing kemi exports (to be discussed on sr-dev if something needs to be sorted out about the purpose of the exports) or improvements to documentation or helper tools will be pushed to this branch.

As usual, if there is a bug fixed, the commit will be pushed first to master branch and then cherry picked to 5.2 branch.

In a few weeks, the first release from branch 5.2 will be out, respectively Kamailio v5.2.0.

Thanks for flying Kamailio!

Meet me @ Kranky Geek San Francisco 2018

bloggeek - Mon, 11/05/2018 - 12:00

Kranky Geek is happening this year again, the date is Nov 16, and we’ve got the best lineup of speakers for you.

Kranky Geek started almost by mistake. Like most good things that happened to me. It wasn’t planned. The result though is becoming a tradition by now, where I get to work with Chris Koehncke and Chad Hart for a period of time that can be considered quite intense (we’re all too opinionated).

Google, along with our other sponsors make this event happen. We only curate the content to make sure the end result is great.

In last year’s event, we started looking at the domain of AI. You can find the recordings of that event on YouTube. The feedback we got was positive, so this year we’re taking a step further here. Many of the sessions will focus on machine learning and AI and its impact on real time communications.

What’s on the Agenda?

AI in RTC.

As always, our intent here is to focus as much as possible on services and applications that are running in production already. It won’t be theories about what can be done but what people are doing. Today.

The updated agenda can be found online. It might change a bit in its ordering, but it is mostly ready.

This year, we have some brand new speakers for you:

  • Discord will be giving a session about their service and what they had to do with WebRTC to make it work for their use case. My suggestion? Read their post to get ready for this session – it will be really interesting
  • Houseparty are joining us for the first time as well. Tinkering with machine learning on device. One of the main challenges these days is deciding where to run inference with machine learning – on device or in the cloud. We will see both options throughout the day
  • Agora will explain what they are doing to improve video quality in real time on mobile devices by using machine learning
  • Voicera will be talking about the challenges in speech recognition when it comes to handling meetings
  • Dialpad are there to talk custom vocabularies. Every company has that. How do you transcribe Kranky Geek? That’s a question I’ll ask in the Q&A of this session…
  • Intel will discuss newly open sourced visual processing tools to help you build out your application
  • RingCentral is joining us late in the game. We’re figuring out with them a stellar topic for the event

We also have some “repeat” speakers:

  • Facebook this year will give us a sneak peek at the technology (and AI) behind their new Facebook Portal device. What I am really keen on hearing is what decisions they made to get their “follow you around” feature to work
  • Voicebase will focus on paralinguistics this time. The nuances of speech that aren’t text – and how to capture their meaning
  • Callstats will be discussing this time the use of looking at ongoing call data using… machine learning
  • IBM will be all over voicebots and their uses in contact centers. We will get to look under the hood on how these get implemented
  • Nexmo are going to show us the complexity of connecting real time voice streams to cloud based speech to text engines. (technically, they are a new speaker, but I figured that now that TokBox is part of Vonage which also owns Nexmo, they are repeat speakers)
  • Google will give an update on Chrome’s implementation of WebRTC, with a focus on 1.0. They will also give a deep-dive into the upcoming architectural changes in Chrome’s audio processing engine
  • Microsoft is going to give us a demo of WebRTC, Mixed/Augmented Reality and HoloLens. And we’re saving this for last so you’ll stick around

We are expanding our family of Kranky Geek speakers and Kranky Geek companies, which is a true joy. I can’t wait to hear your feedback once the day is over.

Our sponsors this year

As always, the event is practically free to attend (there’s a $10 admission fee that gets donated to Girl Develop It).

The companies that made this event happen this year are Google, Intel, Agora.io and Nexmo who are our premium partners for the event; Callstats.io, Voicebase and RingCentral who are our silver partners for the event.

No fire drill

I am not sure if this is good or bad. We had a surprise fire drill last year. We knew about it about a week or two before the event. It caused so much headache for us. And a lot of worries.

It ended up pretty well, with our audience and speakers getting a one hour break outside on a beautiful sunny day. Almost all of them came back after the drill, which isn’t obvious or even expected.

Many were happy for the break – and the smalltalk that ensued during it.

Hopefully, there will only be pleasant surprises this year as well.

What are we looking for in Kranky Geek?

We had to turn down a few vendors who wanted to speak. This is a process that takes place every year.

There’s no specific set of rules of what we approve or don’t as a session in Kranky Geek, but for me it boils down to this:

  1. Something new that wasn’t discussed at Kranky Geek before
  2. Preference to something running in production at scale
  3. An interesting topic that would appeal to developers
  4. Related to real time communications
  5. A speaker that can “hold a room”

While the lineup of speakers for this year is full, if you want to speak in future Kranky Geek events – be sure to catch me during the event for a chat.

Should you travel just for this single day?

I got this question a few times in the past few weeks.

My guess is that if this is the only thing you’re doing in San Francisco and coming for, then skip it. Especially if you are traveling from abroad.

That said, if you want to feel where WebRTC is headed, talk to many of the people who deal with it daily in the real world, then this is the place to be. So many discussions take place during the breaks that it might be worth coming only for the breaks… I know a person or two that are coming only for that.

We try to make Kranky Geek special and unique. We work hard to select the speakers and work with them on their presentations. All to make it worth your travel, wherever you come from.

Can non-developers attend?

We received this question recently.

There is no easy answer to this one. On one hand, the event and its sessions are technical in nature as our focus is developers. On the other hand, the sessions are short (20 minutes all-in-all), so our speakers tend to focus on the essence and not dive too deep into the nitty gritty details. So a tough call.

My suggestion? Check out some of the session recordings on YouTube from past events and make your decision based on that.

Register now

Yes. There’s this minor detail.

You need to register to attend. There’s limited room capacity, and at some point, we will need to close the registration.

We’re already half full in our registration list, so save your spot now and don’t wait.

Register NOW

Do you want to meet me prior to the event?

I’ll be in San Francisco Nov 12-17. Nov 15-16 are reserved for Kranky Geek. The rest for meetings with people – around WebRTC, CPaaS, testRTC, my WebRTC course, consulting and just catching up.

If you want to meet me during that week, leave me a note.

The post Meet me @ Kranky Geek San Francisco 2018 appeared first on BlogGeek.me.

FOSDEM 2019 – RTC DevRoom – CFP And Volunteers

miconda - Mon, 10/29/2018 - 20:30
FOSDEM 2019 (the free and open source software developers meeting) takes place during the 2nd and the 3rd of February 2019 in Brussels, Belgium.

The application to host a Real Time Communication devroom has been accepted and the call for presentations and volunteers has been started. The announcement with all the relevant details has been sent to the mailing list.

Consider submitting a proposal if you have worked on something FOSS and interesting to share that is related to real time communications.

It is very likely that the Kamailio project will participate once again at the event with a consistent group of developers and community members, continuing our more than 10 years long tradition to meet for a dinner and catch up on what is new around the RTC world!

Thanks for flying Kamailio!

Are Embeddable Video Experiences Necessary?

bloggeek - Mon, 10/29/2018 - 12:00

There’s no one size fits all in communications. In video, that means that embeddable video experiences are necessary and they are here to stay – they aren’t a passing trend.

Source: Vidyo

Years ago, before WebRTC came into our lives, I worked at a video conferencing company. My role there at the time was CTO of the business unit dealing with licensing VoIP technology to others. The leading product at the time was a video conferencing client that could fit into a device and interoperate over SIP and H.323. As a CTO, I was given the initiative of getting us into the cloud, which ended up involving something that was meant to become a CPaaS (just not using that term as it didn’t exist). It never came to fruition since I left the company a bit after WebRTC was announced and I knew where the future was headed.

Anyway, one day I was asked to take a business trip to the US, to meet with customers and potential customers. One of these customers was a vendor involved in the prison industry (not sure what the whitewashed term for that is, so just using prison industry).

Video Conferencing in Prisons

To clarify: I am not taking a stand here around prisons, prisoners or video conferencing in prisons. Just sharing this as a requirement that I’ve seen in the past.

What they were doing was building “phone booths” for prisoners so they could call home and talk to friends and family. They were in the process of shifting towards video calling, and were using at the time one of the known brands – I don’t remember which. Think of Polycom or Cisco video conferencing systems for reference.

Source (somehow, the happy faces seem exaggerated for the use case)

The challenge was in the fact that these vendors and their solutions were geared towards video conferencing in the enterprise – what we now wrap under the term of unified communications. This meant that a lot of the features and requirements of a vendor developing a communications service for prisoners were hard or impossible to meet:

  • Full moderation of the call by a third party at all times
  • Ability to join the session as a silent or known participant (that’s the moderator)
  • Ability to manage and control session length
  • Knowing the identity of both people in the call, but having the system flexible enough to accommodate new users and guests in the system
  • Wrap the whole experience with other features (browsing) that prisoners might want to use

They ended up licensing our technology to build it all, at prices that today would seem ridiculously high, though they made sense in those days, when real time communications technology wasn’t a commodity and wasn’t open sourced.

If we’re at the domain of anecdotes, funnily enough, we’ve been using GIPS for the audio codecs at that time on PCs. The same company that Google acquired and built WebRTC out of.

Back to Embeddable Video Experiences

Prisons and prisoners aren’t the real story here.

Embeddable video is.

Communications between humans is something that can’t really be placed into a set of known rules.

Yes. We’ve had the telephone companies around for 120 years or so, explaining and educating us on how to communicate with each other remotely.

Unified communications has a gazillion of features dealing with telephony, trying to accommodate each and every eventuality that a customer may want and need. Which is nice, but from a certain point, it is really hard to scale across customers with different needs.

Video conferencing has been the hardest of all. Video is hard, so everything about it is hard as well.

This all meant that communications was always a service. Something you get “out of the box” as is. Or something you can customize if you are big enough, with enough money to pay.

WebRTC, cloud, virtualization, SaaS and a few other terms came into our lives. What they essentially did was reduce the barrier of entry for those who need video communications. This meant that scenarios that weren’t catered for with enterprise video conferencing were now possible to achieve at lower price points.

The end result?

We are now seeing video communications being embedded in places where it never really existed.

Are these new?

They are and they aren’t.

They aren’t because the need was always there.

They are because only now they can be satisfied commercially.

The only question that remains is where do you see embeddable video contributing to your business and how do you go about implementing it. In the last few months, I’ve been working with Vidyo on a research around this topic exactly.

Interested in the state of embedded video in 2018? Download the free report here. There’s also a joint webinar on the topic coming up – be sure to register to it:

Register to the free webinar

The post Are Embeddable Video Experiences Necessary? appeared first on BlogGeek.me.

How Zoom’s web client avoids using WebRTC

webrtchacks - Tue, 10/23/2018 - 10:30

Zoom has a web client that allows a participant to join meetings without downloading their app. Chris Koehncke was excited to see how this worked (watch him at the upcoming KrankyGeek event!) so we gave it a try. It worked, removing the download barrier. The quality was acceptable and we had a good chat for half an hour.

Opening chrome://webrtc-internals showed only getUserMedia being used for accessing camera and microphone but no RTCPeerConnection like a WebRTC call should have.

Continue reading How Zoom’s web client avoids using WebRTC at webrtcHacks.

WebRTC is Ready. Now What? (a look at the state of WebRTC in 2019)

bloggeek - Mon, 10/22/2018 - 12:00

There should be no doubt about WebRTC anymore. It is here and it is ready for everyone. The question is: “now what?” Where are we headed with WebRTC in 2019?

Is WebRTC Ready Yet?

That was the name of a website that tracked how well WebRTC was adopted by the various browser vendors.

Apparently, it is also the most common question on Google about WebRTC:

It is time we say it out loud (I don’t believe anyone has done that up until now):

WebRTC is READY

I was asked to speak at Apidays Amsterdam last week, which was a true joy. The topic I was tasked with was around WebRTC being a standard, and well… where are we headed next. So I decided to rephrase it a bit and ignore that tiny bit of a fact that WebRTC 1.0 still isn’t an official standard (nobody but those in standardization organizations and those opposed to adopting WebRTC seem to care either).

So I sat down to think what does it mean that WebRTC is ready. Which led to this question:

Why do I think that WebRTC is ready?

The best way for me to answer that question was to give 3 recent examples on things happening with WebRTC (and I don’t mean Uber doing VoIP using WebRTC):

#1 – VP8 Supported by Safari

I’ve been a critic about Apple’s non-support of WebRTC and then Apple’s non-support of VP8.

The fact that Apple decided at the time to support only the H.264, a royalty bearing video codec, and ignore VP8, the royalty free alternative, wasn’t a good sign.

In the past two weeks, tweets and webkit bug links have been flying around, indicating that if the mountain won’t come to Muhammad, then Muhammad must go to the mountain. Or more accurately, that Apple decided to do a Microsoft and support VP8.

Do a Microsoft because these are the same steps Microsoft took when going WebRTC. Starting with H.264 and only later adding VP8.

So Apple has started with H.264 and only now adding VP8.

When will this be available for all? Ask Apple.

What’s important is that ALL modern browsers now support both VP8 and H.264. More on that in a sec.

It doesn’t stop there either. Apple joined the Alliance for Open Media as a founding member. This alliance is behind the future video codec AV1, and now has 40 members in it.

#2 – H.264 Simulcast Support

The second example is H.264. It is now becoming a first class citizen.

H.264 on Chrome didn’t have simulcast support. The “fix” for that was available for quite some time, but was never incorporated into Chrome. Simulcast increases the quality of group video calls, so not supporting it in H.264 made H.264 useless for group video calls.
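To make the simulcast point concrete, here is a minimal sketch of the kind of decision simulcast enables on the server side: the sender uploads several encodings of the same video, and the media server forwards to each receiver the best one its downlink can carry. The layer names, bitrates and receiver bandwidths below are assumptions made up for illustration, not values from Chrome or any specific media server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    kbps: int


# Hypothetical simulcast ladder; real bitrates depend on the application.
LAYERS = [Layer("low", 150), Layer("medium", 500), Layer("high", 1500)]


def pick_layer(available_kbps, layers=LAYERS):
    """Return the highest-bitrate layer that fits, or the lowest layer as a fallback."""
    fitting = [layer for layer in layers if layer.kbps <= available_kbps]
    if fitting:
        return max(fitting, key=lambda layer: layer.kbps)
    return min(layers, key=lambda layer: layer.kbps)


# Hypothetical receivers with their estimated downlink bandwidth in kbps.
receivers = {"laptop_on_fiber": 4000, "phone_on_3g": 300, "tablet_on_wifi": 800}
choices = {name: pick_layer(kbps).name for name, kbps in receivers.items()}
print(choices)
```

Without simulcast there is only one encoding to forward, so either the weakest receiver degrades everyone or it gets a stream it cannot keep up with.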

There can be two reasons for this foot-dragging by Google:

  1. Timing and priorities. Google didn’t really care enough to add that in and deal with the headaches of pushing code from a third party with the fix and validating it
  2. The push towards VP8. Increasing the quality of H.264 would get more developers to adopt it, especially when Apple supports only H.264 on Safari

Since VP8 is coming to Safari, the reason to give it an edge over H.264 isn’t there anymore. Especially considering the healthy growth of the Alliance for Open Media.

The end result?

  • All modern browsers support VP8 (Safari support is imminent)
  • All modern browsers support H.264; and simulcast will soon be possible for it
  • VP9 is available only in Chrome and Firefox for WebRTC – but who cares? The future will be AV1. And ALL browser vendors are part of the Alliance for Open Media where AV1 is getting specified (YouTube is already testing AV1 decoding in Chrome and Firefox)

This media codecs disparity between browsers was the main challenge for the WebRTC community. It is now behind us.
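Codec parity is something you can check for yourself from the session descriptions two endpoints exchange. The sketch below pulls video codec names out of the `a=rtpmap` lines of an SDP blob and intersects them. The two SDP samples are hand-written and heavily abbreviated for illustration; they are not captured from real browsers, and the helper names are my own.

```python
import re

# Matches lines such as "a=rtpmap:96 VP8/90000" and captures the codec name.
RTPMAP = re.compile(r"a=rtpmap:\d+ ([^/]+)/\d+")

# Payload names that are retransmission/FEC helpers rather than video codecs.
HELPERS = {"RTX", "RED", "ULPFEC"}


def video_codecs(sdp):
    """Collect codec names from rtpmap lines inside the video m-section."""
    codecs = set()
    in_video = False
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("m="):
            in_video = line.startswith("m=video")
        elif in_video:
            match = RTPMAP.match(line)
            if match and match.group(1).upper() not in HELPERS:
                codecs.add(match.group(1).upper())
    return codecs


# Hypothetical, abbreviated offers from two different endpoints.
ENDPOINT_A = """v=0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 102
a=rtpmap:96 VP8/90000
a=rtpmap:97 rtx/90000
a=rtpmap:98 VP9/90000
a=rtpmap:102 H264/90000
"""

ENDPOINT_B = """v=0
m=video 9 UDP/TLS/RTP/SAVPF 96 102
a=rtpmap:96 H264/90000
a=rtpmap:102 VP8/90000
"""

common = video_codecs(ENDPOINT_A) & video_codecs(ENDPOINT_B)
print(sorted(common))
```

If the intersection is empty, the two sides cannot agree on a video codec without transcoding in between, which is exactly the headache that parity removes.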

#3 – Google Shifts Focus

That third reason why I believe WebRTC is ready?

Google is shifting focus. It is doing what is needed to support WebRTC and the migration to the 1.0 specification (unified plan for example), but its heart and mind is already elsewhere:

At the beginning of this month, Google announced Project Stream – a cloud based service that streams high end games from resource intensive cloud based machines to low end devices in real time.

There’s not a lot to go on about the technology, but it seems to be based on WebRTC.

Project Stream official gameplay capture: 1080p@60fps https://t.co/SjznbRCBAP

— Justin Uberti (@juberti) October 2, 2018

Why else would Justin Uberti from Google’s WebRTC team publish this? 1080p resolution at 60 frames per second with low latency for gaming. This type of a use case is different from real time communications. It requires a different focus and optimizations. And yet… the WebRTC team at Google have probably spent some cycles on supporting it.

Why is that a good thing?

Because for Google, WebRTC is ready when it comes to real time communications, and beyond optimizations and house keeping, it is time to move on and look at other use cases where WebRTC can be beneficial.

What’s Next?

So. WebRTC is here:

  1. Apple supports it now; and there’s codec parity across browsers
  2. H.264 is a first class citizen in WebRTC
  3. And Google has moved on to other use cases for WebRTC

What’s next for WebRTC?

The answer I gave in that presentation at Apidays was Machine Learning.

I like that slide above. I like it because you can take RTC out of it, replace it with whatever word/term/industry you want and it will STILL be true.

In the rest of that presentation, I went over the research report that Chad Hart and I have written, sharing some of our findings.

I went into the 4 domains we’ve mapped in our research, in each giving an example of the impact and use cases that are now possible:

  1. Speech analytics, and how we’re shifting from offline processing to real time
  2. Voicebots, and how work in that area is accelerating
  3. Computer vision, where use cases are vastly different between consumer and enterprise settings
  4. Media optimization, and the shift from heuristics to machine learning

That Deck from Amsterdam

That slide deck from Amsterdam is now available online as well. You can view it here:

WebRTC is READY. What's Next? from Tsahi Levent-levi Machine Learning and Real Time Comms

If you are interested to learn more about machine learning, to be able to make smart decisions in your own company about the use and introduction of machine learning and artificial intelligence in a communications application, then definitely check out our report: AI in RTC

The post WebRTC is Ready. Now What? (a look at the state of WebRTC in 2019) appeared first on BlogGeek.me.

Breaking Point: WebRTC SFU Load Testing (Alex Gouaillard)

webrtchacks - Fri, 10/19/2018 - 05:47

If you plan to have multiple participants in your WebRTC calls then you will probably end up using a Selective Forwarding Unit (SFU).  Capacity planning for SFU’s can be difficult – there are estimates to be made for where they should be placed, how much bandwidth they will consume, and what kind of servers you need.
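As a rough illustration of why the bandwidth estimate matters, here is a back-of-the-envelope sketch for a single room. It assumes the simplest possible model: every participant sends one stream of the same bitrate and receives every other participant's stream, with no simulcast, muting or pagination. The function name and the 1 Mbps figure are assumptions for the example, not numbers from the load test.

```python
def sfu_bandwidth_mbps(participants, stream_mbps):
    """Estimate SFU ingress and egress for one room.

    Ingress: each participant uploads one stream to the SFU.
    Egress: the SFU forwards each stream to the other participants.
    """
    ingress = participants * stream_mbps
    egress = participants * (participants - 1) * stream_mbps
    return ingress, egress


for n in (2, 5, 10):
    ingress, egress = sfu_bandwidth_mbps(n, 1.0)
    print(f"{n} participants: in {ingress:.1f} Mbps, out {egress:.1f} Mbps")
```

Ingress grows linearly with room size while egress grows quadratically, which is why outbound bandwidth tends to dominate the estimate as rooms get larger.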

To help network architects and WebRTC engineers make some of these decisions, webrtcHacks contributor Dr. Alex Gouaillard and his team at CoSMo Software put together a load test suite to measure load vs.

Continue reading Breaking Point: WebRTC SFU Load Testing (Alex Gouaillard) at webrtcHacks.

Kamcli v1.1.0 Released

miconda - Tue, 10/16/2018 - 21:30
Kamcli v1.1.0 has been released. It is a command line management tool for Kamailio deployments, aiming to be a modern alternative to the venerable kamctl.

Kamcli offers a set of subcommands for controlling Kamailio, among them:
  • subscriber – manage SIP subscribers
  • ul – manage user location records
  • address – manage permissions address records
  • aliasdb – manage database aliases
  • db – manage kamailio database content
  • dialog – manage active calls (dialog)
  • dialplan – manage dialplan records
  • dispatcher – manage load balancer (dispatcher)
  • group – manage group membership records (acl)
  • moni – continuous refresh of the values for a list of statistics
  • mtree – manage memory trees (mtree)
  • ps – print the details for kamailio running processes
  • rpc – interact with kamailio via jsonrpc control commands (alias of jsonrpc)
  • rpcmethods – return the list of available RPC methods (commands)
  • speeddial – manage speed dial records
  • srv – server management commands (sockets, aliases, …)
  • stats – get kamailio internal statistics
  • tls – management commands for TLS profiles and connections
  • uptime – print the uptime for kamailio instance
How to install kamcli and examples of usage can be found at:

This release has been tagged on Github repository at:

Enjoy! Thanks for flying Kamailio!

Can Google RCS Win the Messaging Game Through AI?

bloggeek - Mon, 10/15/2018 - 12:00

RCS is being brought from the dead by Google, and its next play will probably be with AI.

Carriers have a problem

SMS won’t stay here forever. In fact, most of the messaging traffic is happening on social networks now.

Voice is shifting as well. Migrating to these same social networks. With the ability to upgrade these calls to video calls. With stickers. And silly hats, cat lenses and whatnots.

Want to learn more about the use of silly hats and other AI features in communications? Check out our AI in RTC report preview

Download the preview

Their circuit switched network technology is decaying, left in its 80’s or probably 50’s. Most of what goes on there is spam or OTP passwords anyways. Nobody cares.

So much so that Google is planning on diverting incoming calls to its assistant (but more about it later).

The solution, in the form of IMS and later RCS (or call it Joyn or whatever other branding it was given throughout the years), is some 20 years in the making. And it doesn’t seem to be coming any time soon. At least not if left to the arduous processes of carriers and their suppliers.

Google has a problem

A VERY different problem.

Google has no messaging clout.

For consumers?

Apple iMessage wins on iOS. It acts as a Chameleon, catching up your messages and deciding if they should be demoted to SMS or use modern messaging via iMessage instead.

Facebook with Messenger and Whatsapp is ruling supreme in Android, and in many cases on iPhones as well. Where they aren’t as strong, you’ve got a slew of other social players with 100+ million monthly active users. None of them looks like a carrier. And none of them is Google.

Google has Allo, Duo, Chat, Meet, Hangouts, Messages and probably a few more apps that I’ve forgotten to mention. All in different states and capabilities; but none which is dominant compared to its competitors. Actual monthly active users and amount of real messages going between users? Not shared. Probably not stellar.

And Google has RCS.

For businesses?

Apple, Facebook and others are adding APIs. Introducing bot platforms. Building marketplaces. And they are doing it slowly, fearful of becoming the spam cesspit that is the good ol’ carrier communications tech today.

Slack is killing it. And the rest of the cadre of UCaaS and enterprise communications players are trying to move into their space.

Google has Meet and Hangouts Chat. Part of G Suite. Meet gets used. Hangouts Chat I don’t really know. But it seems that most just skip it and move on to Slack or some other tool.

Google also has nothing similar to a business angle to its consumer facing communications applications yet, or at least nothing popular enough.

What’s new in RCS land?

Nothing really.

I wrote in April about RCS being still dead. For some reason, Google is still hammering away at it. Similar to Google+ if I need something to compare it to.

A press release last month by Samsung and Google brings Samsung to the RCS graveyard. New Samsung devices, and maybe later older ones will come -gasp- with a Samsung Messages app that will work seamlessly with the Android Messages app using each other’s RCS technology!

This interoperability nightmare of the carriers will continue on, leaving RCS dead.

Adding new carriers or smartphones or chipset makers into the fold won’t help either.

And it isn’t as if Apple is making any noises of being interested in RCS, and why should they be?

That said, there are those who will be adopting RCS.

We are shifting towards an omnichannel world. No single protocol to rule them all. No single vendor to rule them all. You want to send your message as a business to a consumer?

You can use SMS. Or better do it over Messenger or Whatsapp or Apple Business Chat – there’s more context and richness in those, and consumers actually care about these channels. Which brings us to a place where businesses just need to support wherever their customers are with no decent common denominator.

And wouldn’t it be great if we could throw SMS and use RCS instead? At least where we can?
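The “RCS where we can, SMS otherwise” idea boils down to a fallback chain over whatever channels a given customer can be reached on. A minimal sketch, assuming a made-up preference order and channel names (this is not any CPaaS vendor's API):

```python
# Hypothetical preference order: richer channels first, SMS as the last resort.
PREFERENCE = ["whatsapp", "messenger", "apple_business_chat", "rcs", "sms"]


def pick_channel(reachable_on, preference=PREFERENCE):
    """Return the first preferred channel the customer is reachable on, or None."""
    for channel in preference:
        if channel in reachable_on:
            return channel
    return None


print(pick_channel({"sms", "rcs"}))       # RCS is preferred over SMS when both work
print(pick_channel({"sms"}))              # SMS remains the fallback
print(pick_channel({"sms", "whatsapp"}))  # a richer channel wins when available
```

The ordering itself is a business decision; the point is only that SMS stays at the end of the list rather than disappearing from it.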

So CPaaS vendors are adding support for RCS and announcing it in their arms race to world domination by collecting as many social messaging icons as they can.

That’s great, but not enough to save RCS.

Can Google change RCS’s predicament?

Not really.

There are just too many players and this is a domain where Google has been struggling to go it alone as it is.

Here’s what it takes to bring RCS properly to the masses:

Chipset vendors

Chipset vendors are at the bottom of the food chain, but they need to offer their support to make RCS happen.

Unlike other messaging services, RCS is “bolted” on to the identity of the user and his device. The SIM card. The ability to connect the end user, through an application, to the SIM card, and from there to the carrier network is what presumably makes RCS different. But for that to happen, chipset vendors need to pave the way, even if just a little bit.

Handset manufacturers

Handset manufacturers need to make sure that the RCS application is there: implemented, supported and pre-installed in the device.

Without being pre-installed, users will need to pick and choose between an RCS app from a handset manufacturer or a carrier (the word bloatware comes to mind) OR pick Whatsapp instead. The choice is a simple one for most.

They need to make the application attractive and sleek. Things they can’t really do. Competing with current successful social messaging apps requires a lot of investment. Nailing the user experience is a lot harder than it looks.

Carriers

Carriers need to actually support RCS. As a service. In their network. And have these things called mobile phones that support RCS. And enough people that have these devices so they can actually talk to each other.

Preferably, all carriers within a country should flip the switch on RCS simultaneously.

How likely is that to happen?

Single, very complex specification

And all of these players need to do so for a very complex IMS/RCS specification.

Testing the combinations of devices and networks is going to be hellish, especially for those who aren’t going to just select the default Google implementation of RCS client/server.

Which is exactly what Samsung decided to do. Have its own service and then interoperate it with Google’s. I can easily see other big players (chipset vendors, handset vendors and carriers) being either scared shitless of ceding control to Google or not magnanimous enough to let Google take control over that piece.

This headache also suggests something really important:

If RCS succeeds, it won’t move as fast as any of the other social networks in introducing new features, services and capabilities

There are too many moving parts, controlled by different players, some of which are doing the same things.

Network effects

Then there are the network effects.

When can I use RCS on my phone?

It needs to be installed there. Probably pre-installed.

The people I communicate with should have it as well.

Our networks should support it.

Oh – and there’s this minor detail of me actually going into that app to send a message.

How many times this week have you clicked on this icon on your Android phone?

What about these icons?

Enter Artificial Intelligence

I’ve been thinking about it for quite some time.

How can Google become relevant in messaging?

It is unlikely to come from features and capabilities at the core of social messaging. None of its services stick:

  • Google+ was “shut down” publicly this month. Google found a great excuse – a potential security flaw
  • Duo was supposed to compete head-on with Apple FaceTime, offering things like faster connections and a knock knock feature. But what have we seen from Duo since its launch? And are you using it at all?
  • Allo was interesting, but got no adoption. It got halted in April if you believe the news
  • Hangouts is being replaced by Meet, at least for the enterprise. Will it be shut down for consumers? Time will tell
  • Hangouts Chat is only getting started, though I haven’t heard anything at all since its public launch
  • Meet works just fine. For the enterprise. If you have a Google account
  • The Google Messages app is purely for SMS. And it is crappy to say the least. It doesn’t respond as fast or as fluidly as other social messaging apps, and frankly, I don’t really care about the technical reasons for it

The one thing Google has going for it is AI. In droves.

Which is probably why Google Duplex is reportedly rolling out next month, helping phone users book tables at restaurants – on their behalf.

It is also why Google is now adding to its Assistant the ability to screen spam calls.

These AI features have the potential to actually succeed. They don’t really relate to RCS or even messaging, but they are about telephony.

Allo was about messaging. As The Verge reported on the April Allo pause:

As part of that effort, Google says it’s “pausing” work on its most recent entry into the messaging space, Allo. It’s the sort of “pause” that involves transferring almost the entire team off the project and putting all its resources into another app, Android Messages.

Google won’t build the iMessage clone that Android fans have clamored for, but it seems to have cajoled the carriers into doing it for them. In order to have some kind of victory in messaging, Google first had to admit defeat.

That’s the Google RCS effort right there.

If you take the AI related features in Allo, and think of them as getting Google Assistant into Messages, the Google RCS app, then it makes sense in a way. But not enough sense.

The Google Assistant doesn’t feel like a product by now. It is a large set of features and capabilities that can be used to add smarts into phones. It is a window to the phone’s (and Google’s) AI for the consumer.

Limiting it to run for RCS only doesn’t seem like the right thing to do. Would it be enough to save RCS? Would it be enough for Google to gain back users from other messaging apps?

It is too early to say, as none of it has come to fruition in an app customers can use.

Google could have tried to do with Allo the same things it is doing with its Contact Center AI:

Provide the whole AI for communication part as an API, a set of building blocks for others to use and embed. It worked so well for them that it got many in the industry lining up to partner with it in contact centers. Launch partners for the Contact Center AI include Mitel, Genesys, Vonage, Cisco, RingCentral, Five9 and Twilio to name a few.

Would such a thing work with social messaging apps?

Apple wouldn’t touch it with a long stick for its iMessage.

Facebook wouldn’t either. So no Messenger or Whatsapp.

Telegram? I don’t see that happening.

WeChat? Chinese.

Who would they be left with? The smaller players, who might grow, but none seem to be rising above white noise level.

Which gets us back to Google itself. With Messages/RCS/Chat.

What Google needs to do is find the sticky features that will get users to use its app. Those that can get value out of it even when the other participant isn’t using the same app. Add smarts into SMS itself, while providing a rich experience to the user when interacting with others who have that app.

The real question is why limit this to RCS and carriers? Why not just offer it as the out-of-the-box Android experience to everyone? Have it there by default. Let people download and install it on older devices and on iPhones.

Probably because Google still believes it relies on carriers for its Android success. Which is what has been holding it back in mobile social messaging since Android came into our lives.

Want to learn more about the use of silly hats and other AI features in communications? Check out our AI in RTC report preview.


The post Can Google RCS Win the Messaging Game Through AI? appeared first on BlogGeek.me.
