Mediation is the starting point for
the revenue generation processes in the back-office systems of telecommunications
providers or, more broadly, communications service providers (CSPs).
Telecommunications is a highly capital-intensive industry in which investments
typically take at least 7-10 years to become profitable. Realizing ROIC (return
on invested capital) is the result of a combination of managing the network,
keeping up with technological advancements, and making sure every bit of usage
is translated into revenue.
The more effective the process of
revenue recognition or generation, the better the return on invested capital on
the network. Translating subscribers' use of the network into revenue
starts with mediation. Simply put, streams and files of data from switches,
routers, and other gear across the network are collected, collated, formatted,
and then processed through a myriad of billing sub-applications to generate
bills and thus recognize revenue.
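
The collect, collate, and format steps described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the two feed layouts, the field names, and the function names are hypothetical and do not come from any particular switch vendor or mediation product.

```python
import csv
import io
import json

# Hypothetical raw feeds: one switch emits CSV, another emits JSON lines.
SWITCH_A_CSV = (
    "caller,callee,start,duration_sec\n"
    "15550001,15550002,2024-01-01T10:00:00,125\n"
)
SWITCH_B_JSONL = (
    '{"from": "15550003", "to": "15550004", '
    '"ts": "2024-01-01T10:05:00", "secs": 40}\n'
)


def collect():
    """Collect: read raw records from each (hypothetical) network element."""
    for row in csv.DictReader(io.StringIO(SWITCH_A_CSV)):
        yield "switch_a", row
    for line in SWITCH_B_JSONL.splitlines():
        yield "switch_b", json.loads(line)


def normalize(source, raw):
    """Format: map each vendor layout onto one common record layout."""
    if source == "switch_a":
        return {
            "subscriber": raw["caller"],
            "peer": raw["callee"],
            "start": raw["start"],
            "duration_sec": int(raw["duration_sec"]),
        }
    return {
        "subscriber": raw["from"],
        "peer": raw["to"],
        "start": raw["ts"],
        "duration_sec": int(raw["secs"]),
    }


def mediate():
    """Collate: merge all feeds into one time-ordered stream for billing."""
    records = [normalize(source, raw) for source, raw in collect()]
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return sorted(records, key=lambda record: record["start"])


billing_feed = mediate()
```

Here `billing_feed` holds both calls in one uniform layout, which is the shape a downstream rating or billing step would expect to consume.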
Today’s networks are a complex mesh
of interconnected, intelligent devices and systems, which generate far more
information than is needed for revenue recognition alone.
Traditionally, billing systems have focused on three categories of
information from the mediation feed: the charging attributes; the attributes
influencing rates, also called qualitative attributes; and other non-charging
attributes, which provide supplementary information. The rest of the information
from the network devices, typically ignored and often discarded by the billing
systems, says much more about the usage patterns and the usage
behavior of the subscribers.
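
To make the three categories concrete, here is a rough sketch of how a billing-centric mediation step might sort the attributes of one usage record. The attribute names and their assignment to categories are assumptions for illustration only; real feeds differ by vendor and switch type.

```python
# Hypothetical attribute names, chosen only to illustrate the three categories.
CHARGING = {"subscriber", "duration_sec", "bytes"}
QUALITATIVE = {"start", "roaming", "service_type"}  # these influence the rate
SUPPLEMENTARY = {"peer"}                            # non-charging extras


def split_for_billing(record):
    """Split one usage record into the three billing categories plus the rest."""
    buckets = {
        "charging": {},
        "qualitative": {},
        "supplementary": {},
        "discarded": {},
    }
    for name, value in record.items():
        if name in CHARGING:
            buckets["charging"][name] = value
        elif name in QUALITATIVE:
            buckets["qualitative"][name] = value
        elif name in SUPPLEMENTARY:
            buckets["supplementary"][name] = value
        else:
            # Everything billing does not need is dropped at this point.
            buckets["discarded"][name] = value
    return buckets


record = {
    "subscriber": "15550001",
    "peer": "15550002",
    "start": "2024-01-01T10:00:00",
    "duration_sec": 125,
    "roaming": False,
    "cell_id": "T-1042",
    "handset_model": "X1",
    "signal_dbm": -97,
}
parts = split_for_billing(record)
```

In this example the `discarded` bucket ends up holding the cell, handset, and signal attributes, which is exactly the kind of information that describes usage behavior rather than charges.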
How often are particular services
or features used? How does the
subscriber use the services? What is the geographical usage pattern, and how
mobile is the user? How do demographic attributes affect the usage of
particular services and the adoption of new services? How diverse is the use of
the various products across the subscriber base? Many of these questions are not
directly relevant to the billing process in the short run, and are thus often
ignored under a billing-centric view of mediation data.
There are also system-level
limitations or constraints that encourage a very billing-system-centric view of
the data acquired by mediation devices. Mediation platforms are typically an
integral part or subsystem of the billing platform. Billing platforms are designed
as structured, relational-database-based systems, where additional storage and
additional attributes mean additional cost. The cost of change is very high
for the monolithic billing systems in place today, requiring thousands of
man-hours of effort over many months of release cycles. Any change in the
mediation feed structure, or the addition of new switches resulting in new data
feeds, leads to a cascading change effect on the mediation system. Mediation
systems therefore avoid this cost by discarding, right at the door, the information
that is not directly needed from a billing perspective, and focusing on the
charging aspect of the data.
In the end, by using
traditional relational-database-based billing systems, we end up losing a lot of
meaningful information from the mediation feeds. Also, whatever we do capture is
very billing-centric and carries a high cost of storage. Communications service
providers thus bear a high cost and, at the same time, are not able to realize
the full potential of the network data.
Thinking out loud, what if we could
store all kinds of usage data provided to the mediation platforms at a much
lower cost and for a much longer retention period? What if we could accommodate all
formats of usage data (there is a fancy term for it: unstructured data) from
switches without having to invest in defining schemas and associated databases
upfront? What if we could keep all this usage data and add further streams
of data, such as diagnostic information, network outage information, incident
information from CRM systems, and customer profiles, to create a mesh of
meaningful information?
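
The idea of keeping every format without defining a schema upfront can be sketched as follows. This is an illustration under stated assumptions: a local JSON-lines file stands in for the low-cost store, and the function names, source tags, and fields are all hypothetical.

```python
import json
import os
import tempfile


def land_raw(path, source, payload):
    """Append a record as-is, tagged only with its source; no schema is enforced."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"source": source, "payload": payload}) + "\n")


def read_field(path, field):
    """Schema on read: pull a field only from the records that carry it."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            payload = json.loads(line)["payload"]
            if field in payload:
                yield payload[field]


# A temporary local file stands in for the shared, low-cost store.
lake_file = os.path.join(tempfile.mkdtemp(), "usage.jsonl")

# Three differently shaped records from three hypothetical sources.
land_raw(lake_file, "switch",
         {"subscriber": "15550001", "duration_sec": 125, "cell_id": "T-1042"})
land_raw(lake_file, "crm",
         {"subscriber": "15550001", "incident": "dropped call"})
land_raw(lake_file, "noc",
         {"cell_id": "T-1042", "outage_minutes": 12})

cells = list(read_field(lake_file, "cell_id"))
incidents = list(read_field(lake_file, "incident"))
```

The point of the design is that structure is decided when a question is asked, not when the data arrives, so a new feed with new attributes needs no change to what is already stored.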
All of the above would create much
higher value from the mediation data, part of which is discarded today because of
the associated cost and the lack of immediately known value. CSPs would be able
to derive usage patterns, segment-level subscriber behavior, analytics on devices
and their usage, revenue patterns for subscribers, and geographical usage patterns
down to the level of devices, towers, and subscriber segments. The possibilities
created by just the ability to store and process this unstructured mediation feed
are numerous.
Fortunately, the technology to
achieve the above is available today in the Big Data technologies of the Apache
Hadoop ecosystem. Technologies such as the Hadoop Distributed File System (HDFS)
support unstructured data and can store the data feeds with a high level of
redundancy on commodity hardware, requiring no database or schema definition
upfront. Storing the raw feeds does not
actually require a database at all. Once the data feeds are ingested into the
big data repository, the ‘Data Lake’, MapReduce applications can process the
data to create insights and meaningful information when needed. MapReduce
applications are distributed applications that run on the nodes of the
‘Data Lake’ where the data resides, and provide an extremely high level of
horizontally scaled processing. They can also provide charging-specific
information to the billing system, effectively replacing the
billing-centric mediation systems.
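
The map and reduce pattern mentioned above can be illustrated with a small single-process sketch. It is not Hadoop code and uses no Hadoop API; it only mimics the map, shuffle, and reduce phases in plain Python, with hypothetical records and field names, to total usage per cell tower.

```python
from collections import defaultdict

# Hypothetical usage records as they might sit in the data lake.
records = [
    {"cell_id": "T-1042", "duration_sec": 125},
    {"cell_id": "T-2001", "duration_sec": 40},
    {"cell_id": "T-1042", "duration_sec": 75},
]


def map_phase(record):
    """Map: emit a (key, value) pair for each record."""
    yield record["cell_id"], record["duration_sec"]


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


pairs = [pair for record in records for pair in map_phase(record)]
usage_by_cell = dict(
    reduce_phase(key, values) for key, values in shuffle(pairs).items()
)
```

In a real cluster the map step would run in parallel on the nodes holding each block of data and only the grouped pairs would travel over the network; here everything runs in one process purely to show the shape of the computation.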
By creating data lakes of mediation data and wiring
in additional information feeds, CSPs can create meaningful datasets, which can
be analyzed and correlated to produce new insights that shape network
planning, customer care, and product design. Insights into usage
patterns and subscriber behavior can provide opportunities for creating
personalized offerings. Segment-level usage analytics can create
opportunities for targeted marketing and for network development to serve the
targeted segments.
Telcos also gain the advantage
of having the usage information outside of the billing systems, so they
can tie the rest of their information systems to the ‘Data Lake’ at a much lower
cost and without dependency on the billing systems. With petabytes of information
about the use of their network and the usage patterns of applications offered
on top of the network layer, shouldn't communications service providers use
this information to their competitive advantage, as Google and Yahoo have done
for the World Wide Web?