What do you need in order to build a middle platform?
To begin with, building a middle platform (zhongtai) requires attention to three aspects: organization, supporting technology and methodology, often supplemented by consulting services.

The middle platform, as a * * * capability with business attributes, first of all needs a dedicated organization that understands the business and takes on business responsibilities. Whether to build a middle platform therefore depends first on whether leadership has the resolve to consolidate staff and set up a middle-platform organization. The existing platform department usually does not know the business, while the people who do know it are scattered across front-office business departments, so establishing a middle-platform organization often involves adjusting personnel, organizational structure and departmental responsibilities. For this reason, building a middle platform usually has to be a top-leadership project in order to succeed.

The key to a middle-platform organization is that it understands the business and takes business responsibility. For example, the team that builds and operates a big data platform is not a middle-platform organization. Even if a team builds excellent middle-platform products (such as an indicator management system, a data warehouse development system or a data quality management system), it still cannot be called a middle-platform organization if it only hands those products to business teams to use. Only when the team itself undertakes the construction and management of the indicator system, the design and implementation of the data warehouse, and the guarantee of data quality can it be called a middle-platform organization. To do this, the organization must be familiar with the business, and its objectives and performance assessment must be tied to the business rather than only to non-business metrics such as platform stability.

The level of the middle-platform organization should preferably match the level of the middle platform itself. A BU-level middle-platform organization should report directly to the BU head or a CXO, and an enterprise-level one should report directly to the CEO or a CXO.

There is one special case: if you are not building an online business middle platform but only adopting technologies such as microservices and cloud native, the transformation can be carried out inside the existing R&D department without large-scale organizational change. Generally speaking, this alone can still improve system availability, elasticity and R&D efficiency.

Supporting technologies for building a middle platform

Generally speaking, building a middle platform requires a set of supporting technologies.

First, supporting technologies for the online business middle platform

Building an online business middle platform generally requires the support of a cloud native, DevOps and microservice technology stack, because:

Microservice technology: The middle platform is run by an independent organization that is responsible for and serves multiple front-office businesses, so it needs standard service interfaces, mature service governance capabilities and efficient agile R&D technology. In the current technical environment, the most suitable choice is to use REST-style synchronous APIs and message-queue-based asynchronous communication as the standard service interfaces, and to use a service framework (such as Spring Cloud), an API gateway and APM as the standard service governance and agile R&D technologies. The traditional ESB-based service-oriented architecture (SOA) is no longer recommended, because ESB products tend to absorb too much business logic. As a result, changes in front-office business often require the cooperation of the middle-platform team, which defeats the purpose of building a middle platform to support efficient front-office innovation. In addition, centralized ESB software and the complex XML-based WS-xxx protocols hurt system availability and performance. See Martin Fowler's assessment in P of EAA (Patterns of Enterprise Application Architecture): web services are an application integration technology, not an application development technology.

DevOps technology: Without DevOps to give each microservice self-service deployment and updates, the agility promised by microservices cannot be realized. On the contrary, R&D efficiency will drop as the number of services grows. Adopting microservices therefore generally requires DevOps technologies such as continuous integration and continuous delivery.

Cloud native technology: Microservices and DevOps require the underlying infrastructure to be elastic and programmable. Otherwise, by Amdahl's law, as long as one necessary step remains inefficient, overall efficiency cannot improve much.
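The Amdahl's law argument can be made concrete with a small calculation. The sketch below is only an illustration: the function name and the 70%/30% split of delivery time are assumptions chosen for the example, not figures from any real project.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work becomes `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - accelerated_fraction  # the part that is not accelerated
    return 1.0 / (remaining + accelerated_fraction / factor)


# Hypothetical split: 70% of delivery time is build/test/release work that
# microservices and DevOps can speed up, 30% is waiting for manually
# provisioned infrastructure that stays as slow as before.
speedup_with_10x_pipeline = amdahl_speedup(0.7, 10.0)
upper_bound = 1.0 / (1.0 - 0.7)  # limit as the pipeline becomes infinitely fast

print(f"10x faster pipeline -> {speedup_with_10x_pipeline:.2f}x overall")
print(f"upper bound         -> {upper_bound:.2f}x overall")
```

Under these assumed numbers, even an arbitrarily fast pipeline cannot push the overall speedup past the bound set by the unimproved infrastructure step, which is why the infrastructure layer also has to become elastic and programmable.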

It should be emphasized that the middle platform itself must be agile. On the one hand, because the middle platform has business attributes and supports a rich set of front-office businesses, some of the front office's agility requirements are passed down to the middle-platform layer. On the other hand, the importance of the middle platform means it needs continuous optimization: even if the external service interface stays unchanged, the internal implementation will change frequently.

Distributed transaction technology: Once a system is split into microservices, complex business processes can no longer obtain ACID properties from the database's own transaction mechanism, so distributed transaction processing at the service layer is needed. Typical distributed transaction models include TCC, Saga and FMT. TCC and Saga require each service to implement custom rollback logic, which is more intrusive and has a higher barrier to use. In Java, the FMT model can provide a distributed transaction by adding a single annotation (such as @GlobalTransaction), with the rest handled automatically by the framework, which is much more convenient. The Saga model was proposed by two researchers at Princeton in 1987. It offers the best flexibility and concurrency, but needs careful design, such as semantic locks, to be used effectively.
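To illustrate why Saga requires each service to supply its own rollback logic, here is a minimal in-process sketch. It is not how any particular framework implements Saga: the `SagaStep` and `run_saga` names and the order/stock/payment steps are hypothetical, and real services would be called over the network with retries and persistence.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]       # forward operation in one service
    compensate: Callable[[], None]   # custom rollback written by that service


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_payment() -> None:
    # Hypothetical failure in the third service.
    raise RuntimeError("payment service rejected the charge")


order_saga = [
    SagaStep("create_order",
             lambda: log.append("order created"),
             lambda: log.append("order cancelled")),
    SagaStep("reserve_stock",
             lambda: log.append("stock reserved"),
             lambda: log.append("stock released")),
    SagaStep("charge_payment",
             fail_payment,
             lambda: log.append("payment refunded")),
]

succeeded = run_saga(order_saga)
print(succeeded, log)
```

Note that between "stock reserved" and "stock released" other requests could observe the intermediate state. This lack of isolation is the reason the Saga model needs additional design such as semantic locks.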

As can be seen, the technical support system for an online business middle platform is quite complex. Fortunately, leading Internet companies such as Netflix and Google have built many practical technical components for their own business needs, the open source community has also contributed a great deal, and CNCF has done a good job of collecting and standardizing them. By integrating these technologies, good products are already available, such as Netease canoe micro-service, a well-designed and full-featured set of supporting products for the online business middle platform.

Generally speaking, the front office will adopt the same technology stack as the online business middle platform, such as cloud native, because the front office needs agility even more. With a well-built middle platform behind it, the front office becomes lighter, and FaaS-style serverless technology can also be considered, although there is little practical experience in this area so far (especially in China) and the supporting technologies are not yet mature.

Second, supporting technologies for the data middle platform

Building a data middle platform usually requires a typical set of supporting technologies, as follows:

Indicator management system: Indicators are the most critical interface between the middle platform and the front office, and they are also the crux of building a data middle platform. Because indicators are the core business language, inconsistent indicators and frequent data errors are the most common starting points for a data middle platform project. Without a unified methodology and unified construction of the indicator system, it is hard to call the result a data middle platform. Generally speaking, the indicator management system should enforce a consistent methodology (such as atomic, derived and composite indicators, dimensions and modifiers), manage the business and technical definitions of each indicator, and support review and approval of indicators. The indicators of the data middle platform cannot be left for front-office business teams to build on a self-service basis.
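One way to picture the atomic/derived methodology and the approval requirement is the following sketch. The class names, fields and sample indicators are all hypothetical, and composite indicators are omitted for brevity. It only shows that a definition is registered once, carries both a business and a technical definition, and becomes usable only after approval.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class AtomicIndicator:
    name: str
    business_definition: str    # what the business means by it
    technical_definition: str   # how it is computed, e.g. a SQL expression


@dataclass(frozen=True)
class DerivedIndicator:
    name: str
    atomic: AtomicIndicator        # the atomic indicator it narrows down
    modifiers: Tuple[str, ...]     # e.g. channel or time-period restrictions
    dimensions: Tuple[str, ...]    # dimensions it can be grouped by


Indicator = Union[AtomicIndicator, DerivedIndicator]


class IndicatorRegistry:
    """Single source of truth: unique names, approval before use."""

    def __init__(self) -> None:
        self._pending: Dict[str, Indicator] = {}
        self._approved: Dict[str, Indicator] = {}

    def submit(self, indicator: Indicator) -> None:
        if indicator.name in self._pending or indicator.name in self._approved:
            raise ValueError(f"indicator {indicator.name!r} is already defined")
        self._pending[indicator.name] = indicator

    def approve(self, name: str) -> None:
        self._approved[name] = self._pending.pop(name)

    def lookup(self, name: str) -> Indicator:
        return self._approved[name]  # KeyError if not yet approved


registry = IndicatorRegistry()
pay_amount = AtomicIndicator(
    "pay_amount", "Total amount paid by buyers", "SUM(orders.pay_amount)")
pay_amount_app_7d = DerivedIndicator(
    "pay_amount_app_7d", pay_amount, ("channel=app", "last_7_days"), ("category",))

registry.submit(pay_amount)
registry.approve("pay_amount")
registry.submit(pay_amount_app_7d)  # still pending, not yet usable
```

Rejecting a second definition under an existing name is the simplest guard against the inconsistent indicators described above.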

Data service system: Just as the online business middle platform provides standardized services through an API gateway, the data middle platform also needs a standardized way to serve data. This is usually called the data service system, and can also be described as a data gateway or data portal. Like other gateway products, the data service system needs to provide authentication, log auditing, flow control and protocol conversion (such as conversion between SQL dialects). It should also offer extended functions such as multi-engine federated queries and logical models to improve the stability and flexibility of the service interface.
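As an illustration of the flow-control function of such a gateway, the following is a minimal token-bucket sketch. The token bucket is one common rate-limiting technique chosen here for the example; the source does not say which algorithm a data service system should use, and the `TokenBucket` class and its parameters are hypothetical. Time is passed in explicitly so that the behavior is deterministic.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity   # start full
        self.last = now          # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a query arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical policy: one query per second per caller, bursts of two.
bucket = TokenBucket(rate_per_second=1.0, capacity=2.0)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
print(decisions)
```

In a real gateway there would typically be one bucket per caller or per API, looked up after authentication.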

Metadata management system: Metadata management is the foundation and hub of the whole data middle platform, and all other systems depend on it. Good metadata management starts, of course, with managing the data schema or catalog, so that you at least know what data the middle platform holds. For complex data platforms, data lineage is also very important. Without lineage information, that is, the dependencies between datasets, data quality cannot be managed well, because you cannot tell where a quality problem in a dataset came from or what it will affect. Likewise, without lineage, data assets cannot be managed well, because you cannot tell which data is valuable and which is not, just as you cannot tell whether a function is dead code without knowing who calls it. A metadata management system usually also needs to provide a basic access interface, commonly called a data map.
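The two uses of lineage described above, impact analysis and finding "dead" data, can be sketched as graph traversals. The table names and the `LINEAGE` dictionary below are hypothetical sample data. A real metadata system would typically extract lineage from SQL jobs rather than have it written by hand.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage: each table maps to the tables that read from it.
LINEAGE: Dict[str, List[str]] = {
    "ods_orders": ["dwd_orders"],
    "ods_users": ["dwd_users"],
    "dwd_orders": ["dws_sales_daily"],
    "dwd_users": ["dws_sales_daily"],
    "dws_sales_daily": ["ads_sales_report"],
    "ads_sales_report": [],
    "tmp_orders_backup": [],
}


def downstream_of(table: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Impact analysis: every table affected if `table` has a quality problem."""
    seen: Set[str] = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def unused_tables(lineage: Dict[str, List[str]], consumed: Set[str]) -> List[str]:
    """Tables that are neither consumed directly nor feed any consumed table."""
    return sorted(
        table for table in lineage
        if table not in consumed and not (downstream_of(table, lineage) & consumed)
    )


impact = downstream_of("dwd_orders", LINEAGE)
dead = unused_tables(LINEAGE, consumed={"ads_sales_report"})
print(sorted(impact), dead)
```

This mirrors the dead-code analogy: a table with no path to anything that is actually consumed is a candidate for cleanup.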

Data warehouse development and management system: Alongside indicator management, data warehouse development is the core process that organizes large volumes of raw data into a well-structured data middle platform. Generally speaking, Kimball's dimensional modeling approach suits a data middle platform better than the approach advocated by Bill Inmon, the father of the data warehouse, because Inmon emphasizes top-down design while Kimball emphasizes bottom-up construction. If you need a data middle platform at all, it is because the front-office business is complex and fast-changing, and in that situation an emphasis on top-down design makes the middle platform slow to build and rigid. Although the middle platform must be driven by the organization's top management, its purpose is to support the front-office business, not to control it. Support rather than control: do not put the cart before the horse.

Data quality management system: Every complex system needs professional quality management. Online business systems have a whole series of means and tools for this, such as resilience design, monitoring and operations tooling, and APM, and the data middle platform needs professional quality management as well. A data quality management system is usually designed to support a rich set of audit, verification and comparison rules, to monitor whether data is accurate, timely and consistent, to raise alerts promptly, to analyze the scope of impact, and to provide means of quick repair. However, these measures can only detect and remedy problems, not prevent them. To prevent problems, you must reduce code bugs with testing tools, absorb performance fluctuations with elastic resources, and protect important business needs with priority scheduling. Relatively speaking, quality management in the data middle platform field is less mature than in the online business field. For example, testing methods for online business are far more sophisticated than those for data. The data field also lacks mature, practical counterparts to the circuit breaking, rate limiting and service degradation that are common in online business (although priority scheduling can be said to provide part of what service degradation does). As data middle platforms become more widespread and important, these technologies should and must keep developing, but the technical challenges are considerable.
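The audit/verification/comparison rules mentioned above can be pictured as named checks run against a batch of data. The sketch below is illustrative only: the `QualityRule` class, the sample rows and the thresholds are hypothetical, and a production system would run such rules inside the compute engine rather than on in-memory lists.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class QualityRule:
    name: str
    check: Callable[[List[Row]], bool]  # returns True when the data passes


def run_rules(rows: List[Row], rules: List[QualityRule]) -> List[str]:
    """Return the names of all rules that failed, to be raised as alerts."""
    return [rule.name for rule in rules if not rule.check(rows)]


YESTERDAY_ROW_COUNT = 4  # hypothetical baseline for the comparison rule

rules = [
    # Verification rule: the primary key must be unique.
    QualityRule("order_id is unique",
                lambda rows: len({r["order_id"] for r in rows}) == len(rows)),
    # Audit rule: amounts must not be negative.
    QualityRule("amount is non-negative",
                lambda rows: all(r["amount"] >= 0 for r in rows)),
    # Comparison rule: row count must stay within 50% of yesterday's.
    QualityRule("row count within 50% of yesterday",
                lambda rows: abs(len(rows) - YESTERDAY_ROW_COUNT) <= 0.5 * YESTERDAY_ROW_COUNT),
]

todays_rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": -5.0},
    {"order_id": 2, "amount": 7.5},
]

failed = run_rules(todays_rows, rules)
print(failed)
```

As the paragraph notes, rules like these only detect problems after the data has been produced. They do not prevent the underlying bug.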

Data security management system: Because the data middle platform gathers all of the organization's valuable data assets, good security management is essential. Fine-grained permissions and auditing are the foundation. Beyond that, technologies such as masking (desensitization) of private or sensitive data, data encryption (especially when data is hosted on a third-party platform) and data leakage prevention (for example, a common method is to limit how much data can be downloaded locally) are generally needed too. At an advanced stage, technologies such as federated learning and data sandboxes may even be required.
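Two simple forms of masking can be sketched as follows. The function names, the 3+4 digit convention for phone numbers and the sample key are assumptions made for the example. The keyed hash uses Python's standard `hmac` module so that the same input always maps to the same pseudonym without exposing the original value.

```python
import hashlib
import hmac


def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters, replace the middle with '*'."""
    if len(phone) < 7:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a sensitive value with a stable keyed hash (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would come from a secrets manager.
SECRET_KEY = b"example-key-not-for-production"

masked = mask_phone("13812345678")
user_token = pseudonymize("user@example.com", SECRET_KEY)
print(masked, user_token[:12])
```

Because the pseudonym is stable, masked tables can still be joined on it, which is one reason keyed hashing is often preferred to random replacement for identifier columns.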

Data asset management system: With data quality and security covered separately, data asset management is mainly responsible for data lifecycle management and for cost statistics, analysis and optimization.

At the same time, a data middle platform needs a powerful big data computing engine, a data integration/synchronization/exchange engine, and often an agile BI system:

Big data computing engine: The scale and complexity of the data a data middle platform has to manage are usually high (otherwise "data middle platform" is just a new term coined for its own sake), and traditional databases and data warehouses basically cannot cope. In the current technical environment, Hadoop-based MapReduce or Spark is almost the only choice, which of course includes Hive and Spark SQL. Use SQL wherever you can, since it is simple to maintain and makes data easier to collect. In addition, stream processing may require Flink, and interactive queries may call for Impala or Greenplum.

Data integration/synchronization/exchange engine: On the one hand, a data middle platform needs strong data integration and synchronization capabilities to take in data from all sources. Integration and synchronization are similar concepts, with synchronization placing more emphasis on real-time delivery. On the other hand, a data middle platform is often made up of several computing engines, so a synchronization or exchange engine is needed to move data between them.

Agile BI system: The most important purpose of building a data middle platform is to support business operations and decision-making, so data products have to be developed on top of it. An agile BI system is a fast, lightweight way to develop data products, and lets the data platform deliver value as early as possible.

In addition, for Internet services, a data middle platform usually needs a unified event-tracking engine. If the tracking logic is not unified, you will find when building the data middle platform that the data sources are chaotic and nothing downstream can be done properly. In other industries too, data collection is basic work that must be done first.

As can be seen, the technical support system needed to build a data middle platform is also large and complex. Fortunately, over the past decade, leading enterprises such as Google, open source communities such as Hadoop and Spark, and a large number of vendors have together worked out a feasible path, with a relatively unified methodology and technical route. On this basis, more mature supporting products for the data middle platform can be provided. For example, "Netease Mammoth v 6.0+ Netease Superior Tree", developed by Netease Aeronautical Research, is a relatively complete set of data middle platform products.