Product Overview

Lenovo Enterprise Analytics Platform Hadoop Distribution (LeapHD) is an industry-leading, high-performance, one-stop analytics platform for enterprise-level big data scenarios. The LeapHD platform is easy to use, efficient to operate, easily scalable, secure and reliable. It helps an enterprise rapidly establish a unified data and computing platform, supports the acquisition and integration of internal and external data, stores mass data, and delivers strong capabilities for data computing, deep analysis and mining. Users can build analysis and mining applications on the platform to identify new business opportunities and potential risks in a timely manner and to improve competitiveness.

Functional architecture

[Figure: LeapHD functional architecture]

Product Advantages

Installation and deployment automation

  • Supports a deployment capacity of over 1,000 nodes;
  • Guided installation process; rolling upgrade of platform and components without the need to shut down and restart and without loss of data;
  • Centralized parameter maintenance and one-stop service for host, service and components;
  • Performance monitoring and early warning; supports multiple notification modes, including system, email and SMS;
  • Provides an automated, visual testing framework for the big data platform; allows the user to run batch performance and stress tests on the platform components and optimize their performance;
  • Provides a flexible monitoring and control dashboard whose content can be configured according to the needs of operation and maintenance personnel;
  • Manages the installation and operation overview, heat maps and configuration change records of the various services; enables visual monitoring and control of operating conditions and performance;
  • Supports device clusters, platform and service deployment, configuration and upgrade management;
  • Supports online addition, removal and migration of computing nodes and storage nodes in a cluster; enables automatic deployment of agents on nodes; and provides centralized monitoring and management of the physical servers or virtual hosts in a LeapHD cluster.

Big data computing platform

  • Continuously tracks community releases of each component and selects the best version through automated testing;
  • Continuously updates and tunes component parameters to optimize query and computing performance;
  • Supports an MPP database and a Spark-based high-performance framework, with CRUD operations and stored procedures;
  • Supports a Kylin-based MOLAP mode to meet multidimensional analysis needs in BI scenarios.

Metadata management and security

  • Supports the collection and management of metadata from Hadoop, Oracle, DB2, MS SQL Server, etc;
  • Supports automatic ingestion of Hive and HDFS metadata, with full and incremental updates;
  • Supports the collection and management of metadata from XML, Excel, CSV, TXT, log files, etc;
  • Provides a comprehensive metadata management tool that supports searching and querying metadata;
  • Supports end-to-end data lineage tracking and impact analysis across the data processing flow;
  • Supports a multi-layer, multi-level and multi-role data permission model for managing metadata, as well as sensitive data filtering, active detection and masking.

Multi-tenant management

  • Resource sharing and isolation, quota management;
  • Supports priority-based preemptive resource scheduling and fine-grained resource allocation;
  • Enables project creation, including naming the project, configuring its storage and computing resources and designating a project administrator;
  • Enables viewing the project list, previewing allocated resources and editing project information;
  • Supports keyword searches and fuzzy searches of project information.

LeapHD products

Data integration package

The big data platform provides real-time, batch and other data acquisition modes. It includes a development toolkit that supports different systems and devices, so it can be extended rapidly according to enterprise needs.

Big data computing platform

The Lenovo big data platform is based on Hadoop, Spark and other open-source ecosystems. It comprises multiple core features and components, integrates open-source technologies closely and optimizes their performance, and also deeply optimizes the infrastructure layer. It builds a generic resource scheduling and management system on top of the distributed storage system to efficiently support large-scale batch processing, interactive queries, stream computing and multiple computing engines.

Metadata management

Metadata management (metadata) is the data management tool of the big data platform. It manages the metadata owned by the enterprise and supports both business-view and physical-view management of data. Users can view basic metadata information, data location, data lineage and impact analysis, and manage the data life cycle.
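The lineage (consanguinity) and impact analysis views described above can be pictured as traversals over a graph of datasets. The following is a minimal sketch under stated assumptions, not LeapHD's implementation: the dataset names, the LINEAGE dictionary and the downstream and upstream helpers are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed as its value.
LINEAGE = {
    "ods.orders": ["dw.order_facts"],
    "ods.customers": ["dw.order_facts", "dw.customer_dim"],
    "dw.order_facts": ["report.daily_sales"],
    "dw.customer_dim": ["report.daily_sales", "report.churn"],
}


def downstream(dataset, lineage):
    """Impact analysis: every dataset that is derived from `dataset`."""
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def upstream(dataset, lineage):
    """Lineage tracking: every dataset that `dataset` is derived from."""
    # Reverse the edges, then reuse the same breadth-first traversal.
    reverse = {}
    for parent, children in lineage.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(dataset, reverse)
```

In this sketch, a change to ods.orders affects dw.order_facts and report.daily_sales, while report.churn traces back to dw.customer_dim and ods.customers.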

Task scheduling

Task scheduling (taskscheduler) is an efficient graphical platform for configuring and executing big data workflows, and it supports the visual construction of big data computing tasks. It encapsulates the complexity of the underlying technology and provides visual operation of many computing modules, such as SQL scripts, MapReduce, Spark (Scala), shell scripts, MySQL/Oracle and data import/export, so that developers can focus on the computation itself rather than on the underlying technical details.
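A workflow of this kind is commonly modeled as a directed acyclic graph in which each task runs only after the tasks it depends on. The sketch below illustrates that general idea with Python's standard graphlib module; it does not show the taskscheduler's actual configuration format, and the task names and the execution_order helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
WORKFLOW = {
    "import_orders": set(),
    "import_customers": set(),
    "clean_orders_sql": {"import_orders"},
    "join_spark": {"clean_orders_sql", "import_customers"},
    "export_report": {"join_spark"},
    "notify_shell": {"export_report"},
}


def execution_order(workflow):
    """Return one valid run order in which dependencies always come first."""
    # static_order() raises graphlib.CycleError if the workflow has a cycle.
    return list(TopologicalSorter(workflow).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(WORKFLOW), start=1):
        print(step, task)
```

The two import tasks have no dependencies, so their relative order is not fixed; a scheduler is free to run them in parallel.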

SQL search analyzer

SQL search analyzer is an online query tool built on the big data platform. It provides data resource management, customized data queries, scheduled tasks and file management. Users do not have to master complicated big data development technologies; familiarity with SQL syntax is enough to query mass data as they would a relational database and to display the results visually.

Multi-user management

Opens up data capabilities on demand and under control, and enables resource management for multiple users: allocating storage and computing resources, monitoring their use and calculating charges by item.
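The allocate, monitor and charge cycle described above can be illustrated with a small model. This is only a sketch under stated assumptions: the Tenant class, its field names and the unit prices in PRICES are hypothetical and do not reflect LeapHD's actual data model or billing rules.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    """Hypothetical tenant with storage and compute quotas."""
    name: str
    storage_quota_gb: float
    cpu_quota_hours: float
    storage_used_gb: float = 0.0
    cpu_used_hours: float = 0.0

    def consume(self, storage_gb=0.0, cpu_hours=0.0):
        """Record usage, rejecting requests that would exceed a quota."""
        if self.storage_used_gb + storage_gb > self.storage_quota_gb:
            raise ValueError(f"{self.name}: storage quota exceeded")
        if self.cpu_used_hours + cpu_hours > self.cpu_quota_hours:
            raise ValueError(f"{self.name}: compute quota exceeded")
        self.storage_used_gb += storage_gb
        self.cpu_used_hours += cpu_hours


# Hypothetical unit prices, chosen only for illustration.
PRICES = {"storage_gb": 0.5, "cpu_hour": 2.0}


def itemized_charges(tenant, prices):
    """Return the charge for each resource item separately."""
    return {
        "storage": tenant.storage_used_gb * prices["storage_gb"],
        "compute": tenant.cpu_used_hours * prices["cpu_hour"],
    }
```

Checking both quotas before updating either counter keeps a rejected request from leaving the tenant's usage partially recorded.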

System operation and maintenance monitoring

The operation and maintenance management system of Lenovo's big data platform enables convenient graphical monitoring of operation and maintenance. It is easy to use and provides automatic software deployment, real-time monitoring of the operating status of nodes, permission control for different types of users, centralized resource quota scheduling and automatic system alerts.

Description of big data platform components

The versions of the components installed with the official release can be viewed.

Lenovo Big Data case studies

  • Big data application case study: steel manufacturer

    Lenovo Big Data provides deep learning smart application platforms and comprehensive solutions for clients, including big data software and hardware, big data consulting, modeling and other services. It helps enterprises use big data to forecast market demand accurately and to build a technology-driven, self-learning big data analytics platform that is based on operating needs and covers the full data set.

  • Case study: big data platform construction in the medical industry

    Lenovo Big Data has established a centralized and unified big data operation platform for the hospital’s applications and has thus set up an end-to-end medical data application system covering data acquisition, data storage, data analysis and data application for the hospital.