Tuesday, October 29, 2013

Big Data in Telecom - Is mediation the right starting point?

Mediation is the starting point for the revenue generation processes in the back-office systems of telecommunications companies or, more broadly, communications service providers (CSPs). The telecommunications industry is highly capital intensive, and investments typically take at least 7-10 years to become profitable. Realizing ROIC (return on invested capital) is the result of managing the network, keeping up with technological advancements, and making sure every bit of usage is translated into revenue.

The more effective the process of revenue recognition or generation, the better the return on the capital invested in the network. Translating subscribers' use of the network into revenue starts with mediation. Simply put, streams and files of data from switches, routers, and other gear across the network are collected, collated, formatted, and then processed through a myriad of billing sub-applications to generate bills and thus recognize revenue.

Today’s networks are a complex mesh of interconnected, intelligent devices and systems, which generate far more information than is needed for revenue recognition alone. Traditionally, billing systems have focused on three categories of information from the mediation feed: the charging attributes; the attributes that influence rates, also called qualitative attributes; and other non-charging attributes, which provide supplementary information. The rest of the information from the network devices, typically ignored and often discarded by the billing systems, says much more about the usage patterns and usage behavior of subscribers.
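To make the three categories concrete, here is a minimal sketch of how a mediation step might partition one usage record. The field names and the category sets are hypothetical, chosen only for illustration; real record layouts vary by switch vendor and by billing system.

```python
# Hypothetical field names; real usage records differ by vendor and network.
CHARGING = {"subscriber_id", "start_time", "duration_sec", "bytes"}
RATING = {"service_type", "roaming", "time_band"}   # qualitative attributes
SUPPLEMENTARY = {"called_number", "cell_id"}        # non-charging attributes


def split_record(record):
    """Partition one raw usage record into the three billing categories
    plus whatever a billing-centric mediation system would drop."""
    groups = {"charging": {}, "rating": {}, "supplementary": {}, "discarded": {}}
    for field, value in record.items():
        if field in CHARGING:
            groups["charging"][field] = value
        elif field in RATING:
            groups["rating"][field] = value
        elif field in SUPPLEMENTARY:
            groups["supplementary"][field] = value
        else:
            groups["discarded"][field] = value
    return groups


raw = {
    "subscriber_id": "S1",
    "duration_sec": 65,
    "service_type": "voice",
    "cell_id": "C7",
    "handset_model": "X100",      # not needed for billing
    "signal_strength_dbm": -87,   # not needed for billing
}
parts = split_record(raw)
```

In this sketch, `parts["discarded"]` holds the handset model and signal strength, which is exactly the kind of information the rest of this post argues is worth keeping.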

How often are particular services or features used? How does the subscriber use the services? What is the geographical usage pattern, and how mobile is the user? How do demographic attributes affect the usage of particular services and the adoption of new ones? How diverse is the use of the various products across the subscriber base? Many of these questions are not directly relevant to the billing process in the short run, and are thus often ignored in a billing-centric view of mediation data.

There are also system-level limitations or constraints that encourage a very billing-centric view of the data acquired by mediation devices. Mediation platforms are typically an integral part or subsystem of the billing platform. Billing platforms are designed as structured systems built on relational databases, where additional storage and additional attributes mean additional cost. The cost of change is very high for the monolithic billing systems in place today and requires thousands of man-hours of effort over many months of release cycles. Any change in the mediation feed structure, or the addition of new switches producing new data feeds, has a cascading effect on the mediation system. Mediation systems therefore avoid this cost by discarding, right at the door, the information that is not directly needed from a billing perspective, and focusing on the charging aspect of the data.

In the end, by relying on traditional billing systems built on relational databases, we lose a lot of meaningful information from the mediation feeds. Whatever we do capture is very billing centric and carries a high cost of storage. Communications service providers thus bear a high cost and at the same time are unable to realize the full potential of their network data.

Thinking out loud, what if we could store all kinds of usage data provided to the mediation platforms at a much lower cost and for a much longer retention period? What if we could accommodate all formats of usage data from switches (there is a fancy term for it: unstructured data) without having to invest in defining schemas and associated databases upfront? What if we could keep all this usage data and add further streams of data, such as diagnostic information, network outage information, incident information from CRM systems, and customer profiles, to create a mesh of meaningful information?
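The idea of keeping every format without an upfront schema can be sketched as "store the raw lines, interpret them only when read". The example below is a minimal illustration in plain Python; the three line formats, the field names, and the `parse_line` helper are all assumptions made up for this sketch, not any real switch output.

```python
import json


def parse_line(line):
    """Interpret one raw line at read time.
    Unknown formats are kept as-is rather than rejected."""
    line = line.strip()
    if line.startswith("{"):
        # JSON-style record
        return json.loads(line)
    if "=" in line:
        # key=value pairs separated by semicolons
        return dict(pair.split("=", 1) for pair in line.split(";") if pair)
    # Anything else is preserved untouched for later analysis
    return {"raw": line}


# Raw feed lines are stored exactly as they arrive, whatever their shape.
raw_store = [
    '{"subscriber_id": "S1", "bytes": 1200, "cell_id": "C7"}',
    "subscriber_id=S2;duration_sec=65;handset=X100",
    "ALARM link-down on R12",
]

records = [parse_line(line) for line in raw_store]
```

Nothing is thrown away at ingestion time: a new feed format only requires a new branch in the reader, not a schema change in a database.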

All of the above would create much higher value from the mediation data, part of which is discarded today because of the associated cost and the lack of any immediately known value. CSPs would be able to derive usage patterns, segment-level subscriber behavior, analytics on devices and their usage, revenue patterns for subscribers, and geographical usage patterns down to the level of devices, towers, and subscriber segments. The possibilities created by just the ability to store and process this unstructured mediation feed are numerous.

Fortunately, the technology to achieve the above is available today in the Big Data technologies of the Apache Hadoop ecosystem. The Hadoop Distributed File System (HDFS) supports unstructured data and can store the data feeds with a high level of redundancy on commodity hardware, requiring no database or schema definition upfront. At the storage layer there is actually no database at all, only files. Once the data feeds are ingested into the big data repository, the ‘Data Lake’, MapReduce applications can process the data to create insights and meaningful information when needed. MapReduce applications are distributed applications that run where the data resides, among the nodes of the ‘Data Lake’, and provide an extremely high level of horizontally scaled processing. They can also provide charging-specific information to the billing system, effectively replacing the billing-centric mediation systems.
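The map and reduce steps described above can be sketched in a few lines. This is a single-process illustration in plain Python, not Hadoop code: the function names and the `cell_id` and `bytes` fields are hypothetical, and a real job would run the map and reduce phases in parallel on the nodes that hold the data.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (key, value) pairs: here, bytes used keyed by cell tower."""
    yield record["cell_id"], record["bytes"]


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


usage = [
    {"cell_id": "C7", "bytes": 1200},
    {"cell_id": "C9", "bytes": 300},
    {"cell_id": "C7", "bytes": 500},
]
totals = run_job(usage)
```

Because each `map_phase` call looks at one record and each `reduce_phase` call looks at one key, the work can be spread across as many nodes as there are data blocks and keys, which is where the horizontal scaling comes from.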

By creating data lakes of mediation data and wiring in additional information feeds, CSPs can build meaningful datasets that can be analyzed and correlated to produce new insights that shape network planning, customer care, and product design. Insights into usage patterns and subscriber behavior can provide opportunities for personalized offerings. Segment-level usage analytics can create opportunities for targeted marketing and for network development to serve the targeted segments.
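As one small example of correlating feeds, the sketch below joins usage sessions with a network outage feed to find the sessions that started during an outage on the same cell. The record layouts, the integer timestamps, and the function name are assumptions made for illustration only.

```python
def sessions_during_outages(sessions, outages):
    """Return the sessions whose start time falls inside an outage
    window reported for the same cell."""
    # Index outage windows by cell so each session only checks its own cell.
    windows_by_cell = {}
    for outage in outages:
        windows_by_cell.setdefault(outage["cell_id"], []).append(
            (outage["start"], outage["end"])
        )

    affected = []
    for session in sessions:
        for start, end in windows_by_cell.get(session["cell_id"], []):
            if start <= session["start"] < end:
                affected.append(session)
                break  # one matching window is enough
    return affected


# Hypothetical data; timestamps are plain integers (for example, epoch seconds).
sessions = [
    {"subscriber_id": "S1", "cell_id": "C7", "start": 105},
    {"subscriber_id": "S2", "cell_id": "C7", "start": 300},
    {"subscriber_id": "S3", "cell_id": "C9", "start": 110},
]
outages = [{"cell_id": "C7", "start": 100, "end": 200}]

affected = sessions_during_outages(sessions, outages)
affected_subscribers = [s["subscriber_id"] for s in affected]
```

A list like `affected_subscribers` is the kind of output that could feed customer care, for instance to reach out proactively to subscribers who experienced the outage.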


Telcos also gain the advantage of having the usage information outside the billing systems, so they can tie the rest of their information systems to the ‘Data Lake’ at a much lower cost and without any dependency on the billing systems. With petabytes of information about the use of their network and the usage patterns of applications offered on top of the network layer, shouldn't communications service providers use this information to their competitive advantage, as Google and Yahoo have done for the World Wide Web?

2 comments:

  1. good big data telecom services provide your blog
    http://www.softql.com/

  2. Informative article, just what I was looking for. Thank you so much for taking the time for you personally to share such a nice info.