Log Management & Maintenance in the Telecom Industry



Users’ business requirements

One of the largest provincial branches of China Mobile must archive and analyze log data with a daily peak throughput of about 300 million records per hour (about 300GB) and a data volume increment of 3TB. The customer needs to build a big data log management platform that meets the following specific requirements:

    - Tracing business issues back to their origin, with a response time within 3 s.

    - Log-based error analysis.

    - Log matching analysis.

    - Log-based issue discovery and early warning.

    - Aggregation and real-time display of multi-dimensional analysis results.
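For a sense of scale, the peak figures quoted above can be converted into per-second rates. The calculation below is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes) and a uniform arrival rate within the peak hour, neither of which is stated in the requirements.

```python
# Peak figures quoted in the requirements.
RECORDS_PER_HOUR = 300_000_000
BYTES_PER_HOUR = 300 * 10**9  # assumption: decimal gigabytes

SECONDS_PER_HOUR = 3600


def peak_rates(records_per_hour, bytes_per_hour):
    """Return (records per second, bytes per second, average bytes per record)."""
    records_per_second = records_per_hour / SECONDS_PER_HOUR
    bytes_per_second = bytes_per_hour / SECONDS_PER_HOUR
    avg_record_bytes = bytes_per_hour / records_per_hour
    return records_per_second, bytes_per_second, avg_record_bytes


rps, bps, avg = peak_rates(RECORDS_PER_HOUR, BYTES_PER_HOUR)
print(f"{rps:,.0f} records/s, {bps / 10**6:,.1f} MB/s, {avg:,.0f} bytes/record")
# prints: 83,333 records/s, 83.3 MB/s, 1,000 bytes/record
```

Under these assumptions the platform has to absorb roughly 83,000 records per second at peak, with an average record size of about 1 KB.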


As shown in the following figure, the original log monitoring system is based on Hadoop HDFS and Greenplum, and the original log files are stored in a NAS file system.


The stream processing engine then converts each transaction's log message, together with its location (offset) in the corresponding log file, into a formatted record, and the resulting data is stored in Greenplum (GP).
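The idea of turning each log message into a formatted record that remembers where the message sits in its file can be sketched as follows. This is a minimal illustration, not the customer's actual engine: the pipe-delimited line layout (`timestamp|txn_id|level|message`), the field names and the helper functions `format_records` and `read_original` are all hypothetical.

```python
import os
import tempfile


def format_records(path):
    """Yield one formatted record per log line, including its byte offset."""
    offset = 0
    with open(path, "rb") as f:  # binary mode so offsets are exact byte positions
        for raw in f:
            line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
            parts = line.split("|", 3)
            if len(parts) == 4:  # skip lines that do not match the assumed layout
                timestamp, txn_id, level, _message = parts
                yield {
                    "file": path,
                    "offset": offset,
                    "length": len(raw),
                    "timestamp": timestamp,
                    "txn_id": txn_id,
                    "level": level,
                }
            offset += len(raw)


def read_original(record):
    """Fetch the original log line back from the file using the stored offset."""
    with open(record["file"], "rb") as f:
        f.seek(record["offset"])
        return f.read(record["length"]).decode("utf-8").rstrip("\r\n")


# Usage with a small, made-up log file.
sample = (
    "2016-01-01T10:00:00|T1001|INFO|recharge accepted\n"
    "2016-01-01T10:00:01|T1002|ERROR|balance query timeout\n"
)
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write(sample)
    log_path = tmp.name

records = list(format_records(log_path))
original = read_original(records[1])
os.remove(log_path)

print(records[1]["txn_id"], records[1]["offset"])
print(original)
```

Only the small formatted record needs to go into the analytical store; the full message stays in the original file and can be fetched on demand through the stored offset.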


The original solution has several disadvantages:

-  Managing the stream processing engine, GP and NAS becomes more complex and difficult as log data accumulates in the system.

-  Log data in NAS and Hadoop is difficult to integrate, so it cannot be analyzed with a unified management tool (such as Hive).

Our solution

The intelligent log analysis system is based on the SequoiaDB + Hadoop framework. It adds intelligent log analysis to the existing log monitoring system, so the system can analyze application logs automatically and improve the ability to correct application errors.


In this system architecture, SequoiaDB plays several important roles:

-  Mass application log data storage: SequoiaDB stores the massive volume of original log data and distributes it across 5 servers by the time and transaction number fields. SequoiaDB’s backup mechanism also ensures data security and provides disaster recovery (DR) for the data.

-  Data source for Hadoop: SequoiaDB is deeply integrated with the Hadoop framework. It supports MapReduce jobs for fast computation, and the system can also use Hive SQL to query data.

-  Original data for Hadoop analysis: Because SequoiaDB is deeply integrated with Hadoop, MapReduce programs can use SequoiaDB directly as a data source for efficient iterative computation, and Hive SQL queries over the distributed log data run as fully parallel distributed computations.

-  Real-time SQL query: SequoiaDB also natively supports SQL access, which provides a direct, high-performance way to query the log data.
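The first role above, spreading log records over 5 servers by time and transaction number, can be illustrated with a minimal sketch. It assumes simple hash-based placement on the combined key; this is an illustration of the general technique, not SequoiaDB's actual partitioning algorithm, and the function name `pick_server` and the sample keys are hypothetical.

```python
import hashlib
from collections import Counter

NUM_SERVERS = 5  # the case study spreads log data over 5 servers


def pick_server(log_time, txn_no, num_servers=NUM_SERVERS):
    """Map a (time, transaction number) key to a server index in a stable way."""
    key = f"{log_time}|{txn_no}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce modulo the server count.
    return int.from_bytes(digest[:8], "big") % num_servers


# Made-up workload: 10,000 transactions logged within the same hour.
placement = Counter(
    pick_server("2016-01-01T10", f"T{n:06d}") for n in range(10_000)
)
for server in sorted(placement):
    print(f"server {server}: {placement[server]} records")
```

Hashing on both fields keeps records from the same busy time window from piling up on a single server, at the cost of having to contact every server for a query that filters on time alone.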

About Us

SequoiaDB is a financial-grade distributed database vendor and the first Chinese database listed in Gartner’s Magic Quadrant OPDBMS report. SequoiaDB has recently released version 3.0.
SequoiaDB is now expanding quickly in the financial industry and has more than 50 banking clients and hundreds of enterprise customers in industries including government, telecommunications, Internet and IoT.

Tower R, No.8 North Star East Road, Chaoyang District, Beijing, China
Tower A, No.22 Qinglan Street, Panyu District, Guangzhou, China
Tsing Hua Tech Park, Nanshan District, Shenzhen, China