How IP Media Changed the Voice Business

This post is about a critical technical development in the history of Voice over IP which had a wide-reaching impact on the development of voice and related communications solutions. I’m referring to IP media, which was introduced early in the 2000s and has been ramping up every since.

In my last two posts, we discussed two important technologies which were instrumental in the early and middle years of voice-based solutions. The first post covered the introduction of voice boards and the second post reviewed the impact of media gateways on voice solutions.

On the business side, the introduction of media gateways provided a stepping stone which encouraged the pioneers of Voice over IP and other voice solution providers to decide where they offered the most strategic value to their customers.  In particular, should they focus on applications or upon enabling technology which could either complement applications or be used to build underlying infrastructure. The introduction of IP media pushed companies further down this decision path.

Two directions emerged. Several manufacturers of voice boards began to kick the tires on creating software-based versions of their voice boards. In the post on voice boards, we noted how control of voice and media functions was controlled using Application Program Interfaces (APIs) and that private APIs tied to particular vendor product families gained much more market traction than attempts at standards-based APIs. Hence, an early product in the space, Host Based Media Processing (HMP) from Dialogic®, offered the value proposition of being software-based, but was still controlled using the same set of APIs that were used with Dialogic voice boards. In parallel, another movement emerged. Two startup companies, Snowshore and Convedia, introduced a new category of product called the Media Server. In the last post, I mentioned how the Session Initiation Protocol (SIP) started to gain traction early in the 2000s. The Media Server concept took SIP as a starting point and added the ability to manipulate media by using a markup language, which was typically based on the  Extensible Markup Language (XML) recently standardized by the World Wide Web Consortium (W3C). The implications were profound, both on the technical and business sides, but like many new innovations, the transition to using this new approach took many years to develop. For example, by the time Media Servers truly hit the mainstream, the two originating companies had both been acquired by larger organizations who were able to make the needed capital investment to build sustainable businesses for media servers.

Of the two approaches, IP Media controlled by APIs essentially was an incremental development and IP Media managed by Media Servers introduced radical change. Let’s consider why this was the case. IP Media controlled by APIs retained the API-based model for control of media. For existing voice application developers, this was great.  They could start the transition away from including voice board hardware in their solutions and thus vastly simplify their go-to-market strategies. As a result, many voice application developers now raised the flag and said they were now out of the hardware business and their solutions were totally software-based. In reality, this typically meant their application software would run on industry standard Commercial Off the Shelf (COTS) PCs or servers using Intel or compatible CPUs such as those offered by AMD. But by using IP Media, the solution providers could skip the step of adding voice boards to their computer-based solutions and eliminate all of the complications of integrating 3rd party hardware. They did have to be careful to have enough CPU horsepower to run both their applications and IP media software, but it represented a major step forward. Voice and multi-media application solutions had now become a separate business in the Voice over IP market.

I mentioned that the introduction of the IP-based Media Server was a more radical step. So, I’ll review a few points to back up that assertion.

  1. The need to have a private API controlled by a single vendor went away.  The new concept of “the protocol is the API” replaced the programmatic approaches which had required developers to use programming languages such as C, C++ or Java for media operations. Instead, simple operations like playing back voice prompts or collecting digits could be accomplished using the combination of SIP and an XML-based markup language, thus eliminating the need for a programmatic language to carry out these operations.
  2. The application developer could focus clearly on making their applications best-of-breed and partner with media server vendors, who would focus on creating world-class voice and multimedia solutions.
  3. The application developers no longer needed to include media processing in their applications at all, thus reducing the CPU cycles needed for those media operations, However, the application developers did need to partner closely with the media server vendors and ensure their SIP + XML commands would work correctly when issued over an IP network to the paired media server.
  4. The concepts of the standalone application server and the standalone media server got included in the new IP Multimedia Subsystem (IMS) architecture, which was being standardized by the Third Generation Partnership Project (3GPP) as a linchpin for the next generation of mobile networks.

So the move toward IP Media was a major step forward for the Voice over IP industry and encouraged further market segmentation. For the first time, companies could specialize in applications and include the ability to support voice, tones and other multimedia, and do all of this in software which would run on industry standard COTS servers. In turn, hardware component and appliance vendors were able to focus on more distinct market segments where they could utilize either embedded solutions technology or start making the move toward running  media on COTS servers.

In my next post, I’ll talk more about how the business models for voice and unified communications solutions have evolved due to the more wide spread use of server and appliance based technology for applications, signaling and media.

James Rafferty has been active in the worlds of telecommunications, standards and university teaching in a variety of roles. He's been a thought leader in areas such as Voice over IP and Internet fax through his consulting, product management, marketing, writing and standards activities, and he is currently teaching business at Northeastern University. He loves to write and talk about new connections, applications and business models as communications, related technologies and business concepts evolve.

Leave Comment

Your email address will not be published. Required fields are marked *