In mid-March, global technology giants once again raced onto the large language model (LLM) track.
Within a single week, OpenAI, the American startup that developed ChatGPT; Microsoft, the technology giant that has invested heavily in OpenAI; and Baidu, a leading Chinese internet company, each announced new developments in large language models, drawing global attention to the field once again.
On March 14, local time, OpenAI released GPT-4, the latest version of its large language model, which answers questions markedly better than GPT-3.5.
On the afternoon of March 16, Baidu opened testing of ERNIE Bot, its new-generation large language model and generative AI product, becoming the first Chinese company to join the competition.
At the press conference, Robin Li, founder, chairman and CEO of Baidu, used a question-and-answer session to demonstrate ERNIE Bot in five usage scenarios: literary creation, business copywriting, mathematical calculation, Chinese comprehension and multimodal generation. A few hours later, Microsoft announced that it would integrate GPT-4 across its entire Office suite under the new name "Microsoft 365 Copilot".
As Caijing E-Law reported on February 17 ("OpenAI exclusive response | Why doesn't ChatGPT open its registration to all users in China?"), mobile phone numbers from mainland China and Hong Kong, China cannot be used to register a ChatGPT account. In addition, although OpenAI's application programming interface (API) is open to 161 countries and regions, these do not include mainland China or Hong Kong, China.
On the one hand, the industry is widely asking who will be the next trendsetter in the sweeping wave of AIGC (AI-generated content, that is, generative AI). On the other hand, at a sensitive time in Sino-US technological co-opetition, all parties are watching the ripples caused by Baidu's first step and how Chinese companies should respond.
1 "Is it really ready?"
On March 16, Robin Li gave his speech in a white shirt and sneakers. He opened by addressing the question head-on: "Recently, many friends have asked me: why today? Are you really ready?"
Li's answer was that although Baidu has invested in AI research for more than ten years and prepared thoroughly for the release of ERNIE Bot, it cannot be called completely ready, because the bar set by ChatGPT, and even GPT-4, is very high, and ERNIE Bot still has "many imperfections". However, he stressed that "once there is real human feedback, ERNIE Bot will make great progress".
Li explained that he chose to launch that day because of market demand: both customers and partners hoped to start using the latest and most advanced large language model sooner.
How should one understand Li's remark that "the bar for benchmarking against GPT-4 is very high"?
It is worth noting that GPT-4, released by OpenAI on March 14, local time, is a large multimodal model: it accepts both image and text input, whereas GPT-3.5 accepts text only.
In the demo video, Greg Brockman, president and co-founder of OpenAI, sketched a website with pen and paper and fed a photo of the sketch into GPT-4. After only 1-2 seconds, GPT-4 generated the webpage code and produced a website closely resembling the sketch. According to experimental data released by OpenAI, GPT-4 has improved greatly on the previous-generation GPT-3.5 and outperforms most human test-takers on many professional exams.
Pan Helin, co-director of the Digital Economy and Financial Innovation Research Center at Zhejiang University's International Business School, believes that ERNIE Bot will need to be fully opened to users in the future. Whether it is offered through an enterprise-facing (B-end) API or opened directly to consumers (C-end), word of mouth about the user experience will have the final say. Because ChatGPT is currently not open to users in China, Baidu will have a first-mover advantage in the domestic market.
Zhang Yi, CEO and chief analyst of iiMedia Research, who has evaluated both OpenAI's and Baidu's products, said that the GPT series, including GPT-4, and ERNIE Bot are essentially the same kind of product; they differ in the domains their data covers and in how long they have been accumulating data and models. In the short term, OpenAI has had more time to prepare its product, and its intelligence is ahead for now. But for ERNIE Bot, training such a product in so short a time is also remarkable.
At the same time, Zhang Yi is confident that Baidu will go on to make better products. His reasoning is that China will have the advantage in its talent pool for artificial intelligence, big data and large models.
Chen Duan, director of the Center for Digital Economy Integration, Innovation and Development at the Central University of Finance and Economics, believes that compared with overseas competitors, Baidu's greatest advantage is the moat it has built in understanding the Chinese language and culture.
As a large language model product developed by a Chinese company, ERNIE Bot has attracted particular attention for its Chinese comprehension. An important reason is that many commentators found ChatGPT's question answering weaker in Chinese than in English.
Robin Li said that as a large language model rooted in the Chinese market, ERNIE Bot has the most advanced natural language processing capability for Chinese. In the live demonstration, ERNIE Bot correctly explained the meaning of the idiom "Luoyang zhi gui" (literally, "paper is expensive in Luoyang") and the economic theory behind it, and also wrote an acrostic poem using the idiom's four characters.
Li said that ERNIE Bot's training data includes trillions of web pages, billions of search and image records, tens of billions of daily voice calls, and a knowledge graph of 550 billion facts, which he said gives Baidu a unique position in Chinese language processing.
The experts interviewed also pointed out that, owing to the particular characteristics of Chinese, it is harder for Chinese companies to develop large models, but if they break through, they will have a greater advantage in providing local services.
Ding Wenzhao, a professor of artificial intelligence and business analytics at emlyon business school in France, told the media a few days ago that training a conversational language model requires making machines understand words, and that this is somewhat easier in English than in Chinese. Ding explained that the Chinese text processed by AI systems in China is largely pictographic, whereas English is an explanatory script whose vocabulary is not especially rich.
In addition, Lin Zhouhan, an assistant professor at the John Hopcroft Center for Computer Science at Shanghai Jiao Tong University, believes that large language models will develop in a multimodal and interactive direction, further integrating technologies from vision, speech and reinforcement learning. Robin Li made a similar point: "Multimodality is a clear development trend for generative AI. In the future, as Baidu's unified multimodal large model grows stronger, ERNIE Bot's multimodal generation ability will continue to improve."
On multimodal generation, Li demonstrated ERNIE Bot's ability to generate text, images, audio and video. On stage, ERNIE Bot read a passage aloud in Sichuan dialect and generated a video from text. However, Li disclosed that video generation is costly for ERNIE Bot, so it is not yet open to all users and will be rolled out gradually.
Around the press conference, Baidu's share price swung sharply. On March 16, Baidu's Hong Kong-listed shares fell by more than 10% intraday to HK$120.1 and closed down 6.36% at HK$125.1. Its US-listed shares, however, showed strong momentum: on the same day they opened lower and then climbed, with an intraday swing of more than 7%, closing at $138.16, up 3.8%. On March 17, Baidu's Hong Kong shares performed strongly, gaining more than 15% intraday and closing up 13.67% at HK$142.2.
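As a quick sanity check on the Hong Kong closing figures quoted above, the following minimal sketch backs out the previous close implied by a closing price and its percentage change, then applies the next day's gain. The function names are illustrative assumptions, and because the reported percentages are rounded, the results are approximate.

```python
def implied_previous_close(close: float, pct_change: float) -> float:
    """Back out the prior close from a closing price and its percentage change."""
    return close / (1 + pct_change / 100)


def apply_change(price: float, pct_change: float) -> float:
    """Apply a percentage change to a price."""
    return price * (1 + pct_change / 100)


# Figures quoted in the article (Hong Kong listing, HK$).
HK_CLOSE_MAR16 = 125.1   # closed down 6.36% on March 16
HK_CLOSE_MAR17 = 142.2   # closed up 13.67% on March 17

# Close on March 15 implied by the March 16 figures.
prev_close = implied_previous_close(HK_CLOSE_MAR16, -6.36)

# March 17 close implied by the March 16 close and the reported gain.
mar17_close = apply_change(HK_CLOSE_MAR16, 13.67)

print(f"Implied March 15 close: HK${prev_close:.1f}")
print(f"Implied March 17 close: HK${mar17_close:.1f} (reported: HK${HK_CLOSE_MAR17})")
```

The implied March 17 close agrees with the reported one to within rounding, which suggests the two days' closing figures are internally consistent.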
Within one hour of ERNIE Bot opening its invitation-only test, more than 30,000 enterprise users had queued to apply to test the API service of ERNIE Bot Enterprise Edition. The product-testing application page was overwhelmed several times, and traffic to the official Baidu AI Cloud website soared a hundredfold.
As market enthusiasm for ERNIE Bot keeps climbing, the capital market has revalued the company as well. Zhang Yi believes this reflects the public's mood of "expectation, then worry, then hope" toward large language models and generative AI.
2 A technological revolution that no one can afford to miss
In fact, "Is it really ready?" is a question aimed not only at Baidu; it has been asked widely by the public ever since the current ChatGPT craze began.
Robin Li observed that since 2021, artificial intelligence technology has been shifting from "discriminative" to "generative".
Kai-Fu Lee, chairman and CEO of Sinovation Ventures, said at a trend-sharing event on March 14 that the first phenomenal application of the AI 2.0 era is AIGC, also known as generative AI, as represented by GPT-4. Lee said that AI 2.0 is a revolution that must not be missed: it will be a huge platform opportunity, ten times larger than the mobile internet. He added that AI 2.0 is also China's first opportunity to compete at the platform level in AI.
The experts interviewed generally believe that AI companies around the world have long faced a major problem: despite deep technical reserves, AI applications have not brought them substantial returns. The reason is that AI products have been applied mainly at the B-end (enterprise users) and the G-end (government users). Bringing AI products into enterprises or institutions often involves a complicated process, which to some extent limits how quickly those products can expand in the market.
Zhang Yi therefore believes that AIGC applications are more likely to generate huge business opportunities at the C-end (consumers). In his analysis, the US consumer market had already been captured by Google, Amazon, Meta and others, so Microsoft was under great pressure and needed a product to win back a round. In the Chinese market, Baidu, like Google, has a strong search engine for capturing data, along with a foundation of storage, indexing and analysis capabilities. China itself is a huge market of more than one billion people, and Baidu can do very well there.
"Baidu, Microsoft and Google are essentially competing in two different markets, so I believe ERNIE Bot and its family of products will definitely make it," Zhang Yi said.
Robin Li insists that ERNIE Bot is not a "tool for Sino-US technological confrontation". But he also acknowledged that ChatGPT's success accelerated Baidu's launch of the product.
Wang Haifeng, CTO of Baidu, said that in the AI era the IT technology stack can be divided into four layers: chips, frameworks, models and applications. Baidu is one of the few AI companies in the world with a full-stack presence across all four layers, and it has industry-leading, self-developed technology at each of them: for example, the high-end Kunlun chip, the PaddlePaddle deep learning framework, the ERNIE (Wenxin) pre-trained large models, and applications such as search, Baidu AI Cloud, autonomous driving and Xiaodu. Wang believes the advantage of this full-stack layout is that it allows end-to-end optimization across the four layers and greatly improves efficiency.
Like ChatGPT, ERNIE Bot uses supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF) and prompting as its underlying techniques. In addition, ERNIE Bot adopts knowledge enhancement, retrieval enhancement and dialogue enhancement. Wang Haifeng said these three build further innovation on Baidu's existing technical strengths.
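To make "retrieval enhancement" more concrete, here is a toy sketch of the general idea: look up passages relevant to a question and prepend them to the prompt before it reaches the model. This is not Baidu's implementation. The word-overlap scoring, the function names and the sample passages are all assumptions made for illustration; a production system would use a real search index or embedding-based retrieval.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most word tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda p: len(query_tokens & tokenize(p)),
        reverse=True,  # sorted() is stable, so ties keep their original order
    )
    return ranked[:k]


def build_prompt(query: str, passages: list[str], k: int = 2) -> str:
    """Prepend the retrieved passages to the question as reference context."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer the question using the reference passages.\n"
        f"{context}\n"
        f"Question: {query}"
    )


# Hypothetical reference passages, standing in for a search index.
SAMPLE_PASSAGES = [
    "PaddlePaddle is a deep learning framework.",
    "The idiom Luoyang paper expensive describes a work in great demand.",
    "Kunlun is an AI chip.",
]

QUERY = "What does the idiom about Luoyang paper mean?"
prompt = build_prompt(QUERY, SAMPLE_PASSAGES, k=1)
print(prompt)
```

The point of the design is that the model answers with fresh, relevant text in front of it rather than relying only on what it memorized during training.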
Chen Duan believes that as technological innovation becomes ever more integrated, a single company with a full-stack layout has a comparative advantage both in coordinating internal R&D and in later commercialization.
Confidence is very important, but the gap cannot be ignored.
During the Two Sessions earlier this month, Wang Zhigang, China's Minister of Science and Technology, responded to questions about ChatGPT with a football analogy, noting that China still has a lot of work to do. "Playing football comes down to dribbling and shooting, but it is not easy to do it as well as Messi (football superstar Lionel Messi)."
Wang Zhigang pointed out that China has made many arrangements in this field, that research has been under way for many years, and that some achievements have been made. "But whether we can achieve results like OpenAI's at present remains to be seen," he added.
Wang Zhigang said that ChatGPT has drawn everyone's attention since its release. In terms of the underlying technology, it belongs to NLP and NLU, that is, natural language processing and natural language understanding. ChatGPT attracts attention because, as a large model, it effectively combines big data, massive computing power and strong algorithms, and its computational methods have improved. The same principle can be executed with very different results: everyone can build an engine, for example, but the quality differs.
Whether it is ChatGPT or ERNIE Bot, however, the large language model behind the product is the core of its competitiveness. Zhao Dongyan, a researcher at the Wangxuan Institute of Computer Technology at Peking University, told Caijing E-Law that a gap remains between domestic large models and OpenAI's in data, training methods and investment.
A person within the science and technology system pointed out that, objectively speaking, there is a large gap between China and the United States in basic research achievements in this field. These include natural language processing (NLP), databases and GPU products. "The United States cut off GP.