However, even such an extremely simple program raises many questions that are far from obvious. For example, consider a program written in C:
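The listing itself is not reproduced here; as an assumption, the kind of program the text has in mind is the classic Hello World:

```c
#include <stdio.h>

int main(void)
{
    /* A one-line program, yet it raises many questions: how is it
       compiled, linked, loaded, and finally executed by the CPU? */
    printf("Hello World\n");
    return 0;
}
```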
This article focuses on the questions raised above.
From a software developer's perspective, the three most critical components of computer hardware are the CPU, memory, and the I/O control chips.
Early CPUs ran at modest core frequencies, roughly the same as the memory, so the CPU and the memory were connected directly to the same bus. Because I/O devices such as displays, keyboards, floppy drives, and hard disks were still much slower than the CPU and memory, each device generally had a corresponding I/O controller so that its speed could be matched to the bus and the CPU could communicate with it. As shown in the figure below:
Later, as CPU core frequencies increased, memory could no longer keep up, so a system bus running at the memory's frequency was introduced, and the CPU communicated with the bus through a frequency multiplier.
Then, with the spread of graphical operating systems, and especially the development of 3D games and multimedia, graphics chips needed to exchange large amounts of data with the CPU and memory. To coordinate this, a dedicated high-speed North Bridge (PCI Bridge) chip was designed so that they could exchange data efficiently.
Because the North Bridge runs at a very high speed, if all the relatively slow devices were connected to it directly, it would have to handle both high-speed and low-speed devices, and its design would become very complicated. So a South Bridge chip was designed to handle the low-speed devices: disks, USB, keyboards, mice, and so on all connect to the South Bridge, which aggregates them and connects to the North Bridge. PCs of the 1990s used the PCI structure for the system bus and the ISA bus for low-speed devices. The hardware architecture built around PCI/ISA and the North/South Bridges is as follows:
In the middle is the North Bridge, which connects all the high-speed chips; like the human heart, it links and drives every part of the body. To its left is the CPU, responsible for all control and computation, like the human brain. The North Bridge also connects several other high-speed components, including the memory on the left and the PCI bus below.
The top speed of PCI, 133 MHz, still could not meet people's needs, so bus structures such as AGP and PCI Express, together with the corresponding control chips, were invented. Although the hardware structure looks more and more complicated, it never departs from the original basic structure of CPU, memory, and I/O. When we look at hardware from the perspective of program development, we can simply treat it as that original hardware model.
People always want computers to be faster. Over the past 50 years, CPU frequencies have risen from tens of kHz to 4 GHz, an increase of several hundred thousand times. Since then, however, there has been no qualitative improvement, because CPU manufacturing processes have reached physical limits; further gains would require a fundamental breakthrough in how CPUs are made.
With little room left to raise the frequency, the alternative way to increase speed is to increase the number of CPUs. Computers with multiple CPUs appeared long ago, and the most common form is symmetric multiprocessing (SMP). Simply put, every CPU has the same position and function in the system; they are symmetric to one another.
In theory, increasing the number of CPUs improves processing speed, and ideally the speedup is proportional to the number of CPUs. In practice this is not the case, because not every program can be decomposed into several completely independent subproblems. For example, one woman can give birth to a child in 10 months, but 10 women cannot give birth to a child in one month.
The most common uses of multiprocessors are commercial servers and environments that must handle heavy computation. In a personal computer, multiple processors are a luxury; after all, multiprocessor systems are expensive.
So processor manufacturers began to consider "bundling several processors together and selling them as one package." These "packaged" processors share the expensive cache components, keep only multiple cores, and are sold in the outer packaging of a single processor at a price only slightly higher than a single-core processor. This is the basic idea of the multi-core processor. A multi-core processor is essentially a simplified version of SMP; there are some differences in detail, but from the programmer's point of view the two differ very little and are logically the same. There are only subtle differences between multi-core and SMP, such as in cache sharing, which allow programs to be optimized in a targeted way. Simply put, unless you want to squeeze out every last drop of CPU performance, you can treat multi-core and SMP as the same concept.
The software used to manage the computer itself is called system software to distinguish it from ordinary applications.
System software falls into two categories: one is the platform, such as the operating system kernel, drivers, runtimes, and thousands of system tools; the other is for program development, such as compilers, assemblers, linkers, and other development tools and libraries.
The software in a computer system is organized in layers, and there is a famous saying: "Any problem in computer science can be solved by another level of indirection." This saying captures the design essentials of computer system software architecture: the whole architecture is designed as a strict top-down hierarchy of layers.
One function of the operating system is to provide abstract interfaces, and another main function is to manage hardware resources.
The capability of computer hardware is limited. For example, a CPU can execute only so many instructions per second, and 1 GB of memory can hold at most 1 GB of data at a time. Whether you use them or not, the resources are only ever that large.
The resources in a computer are mainly the CPU, storage (including main memory and disk), and I/O devices. Let us look at how to tap their potential from these three aspects.
As the layer above the hardware, the operating system manages and abstracts the hardware. The runtimes and applications above the operating system want to see a unified way of accessing hardware. As application developers, we do not want to read and write hardware ports or handle hardware interrupts directly when writing applications. For example, to draw a straight line on the monitor, a programmer only needs to call a unified LineTo() function; the actual implementation is handled by the operating system.
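As a rough sketch of what this looks like from the application side (assuming the Windows GDI LineTo() API, since LineTo() is the function the text names), the program merely asks for a line; how the pixels actually reach the screen is left to the operating system and the display driver:

```c
#include <windows.h>

int main(void)
{
    /* Obtain a device context for the screen; a real program would
       normally draw inside its own window while handling WM_PAINT. */
    HDC hdc = GetDC(NULL);
    if (hdc == NULL)
        return 1;

    MoveToEx(hdc, 100, 100, NULL); /* set the current drawing position */
    LineTo(hdc, 300, 200);         /* the OS and driver do the rest     */

    ReleaseDC(NULL, hdc);
    return 0;
}
```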
With mature operating systems, hardware was gradually abstracted into a set of concepts. In UNIX, hardware devices are accessed in the same way as ordinary files. The tedious hardware details are handled by the operating system, specifically by its device drivers. A driver can be regarded as part of the operating system; it usually runs at the same privilege level as the kernel, yet it is somewhat independent of the kernel, which makes drivers more flexible.
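A minimal sketch on a UNIX-like system: the device file /dev/urandom is opened and read with exactly the same open()/read()/close() calls used for ordinary files, while the driver behind it handles the hardware-specific details:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    unsigned char buf[8];

    /* A device is accessed through the same file interface. */
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    if (read(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf)) {
        for (size_t i = 0; i < sizeof(buf); i++)
            printf("%02x ", buf[i]);
        printf("\n");
    }

    close(fd);
    return 0;
}
```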
In early computers, programs ran directly on physical memory; the addresses a program accessed at run time were all physical addresses. Of course, if a computer runs only one program at a time, there is no problem as long as the memory the program needs does not exceed the size of physical memory. In practice, however, to use hardware resources more effectively we must run several programs at the same time, as in the earlier multiprogramming, time-sharing, and multitasking systems; when several programs run simultaneously, CPU utilization is higher. An obvious question then arises: how do we allocate the computer's limited physical memory among multiple programs?
The existing problems are:
Address space is not isolated: all programs directly access physical addresses, and the memory spaces used by programs are not isolated from each other.
Inefficient use of memory: because there is no effective memory management mechanism, when a program needs to be executed, the monitoring program will load the whole program into memory and then start execution.
The address where the program runs is uncertain: because every time the program needs to be loaded and run, we need to allocate a large enough free area from the memory, and the location of this free area is uncertain.
A virtual address space is a virtual, imagined address space; physically, it does not exist as such. Each process has its own independent virtual space and can access only its own address space, which effectively isolates processes from one another.
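A small illustration (not from the original text): run the following program twice, and each process prints the virtual address of its own copy of the global variable. The two processes may even print the same address (address-space layout randomization can vary it), yet they never touch each other's data, because each address is meaningful only inside that process's own virtual space:

```c
#include <stdio.h>

int global_value = 42; /* each process gets its own private copy */

int main(void)
{
    global_value++;
    /* The printed value is a virtual address, private to this process. */
    printf("global_value = %d at virtual address %p\n",
           global_value, (void *)&global_value);
    return 0;
}
```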
At first, people used a method called segmentation. The basic idea is to map a virtual space as large as the memory the program needs onto a region of the physical address space. For example, if program A needs 10 MB of memory, we assume a 10 MB virtual space from address 0x00000000 to 0x00A00000; we then allocate a region of the same size from actual physical memory, say from physical address 0x00100000 to 0x00B00000. These two equally sized address spaces are then mapped one to one, so that each byte of the virtual space corresponds to a byte of the physical space. The mapping is set up by software, for example by the operating system, while the actual address translation is done by hardware.
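As a back-of-the-envelope sketch using the example numbers above (the constants here are illustrative assumptions), the per-segment translation is just base-plus-offset arithmetic; conceptually, the hardware adds the segment's physical base to the offset within the virtual segment:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define VIRT_BASE 0x00000000u           /* start of program A's virtual segment */
#define PHYS_BASE 0x00100000u           /* where the 10 MB block actually lives */
#define SEG_SIZE  (10u * 1024u * 1024u)

/* Translate a virtual address inside A's segment to a physical address. */
static uint32_t translate(uint32_t vaddr)
{
    uint32_t offset = vaddr - VIRT_BASE;
    if (offset >= SEG_SIZE)
        return 0; /* out of range: a real MMU would raise a fault instead */
    return PHYS_BASE + offset;
}

int main(void)
{
    uint32_t v = 0x00001000u;
    printf("virtual 0x%08" PRIx32 " -> physical 0x%08" PRIx32 "\n",
           v, translate(v));
    return 0;
}
```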
Segmentation basically solves the first and third of the three problems above. First, it achieves address isolation, because program A and program B are mapped to two different regions of physical memory. But segmentation still does not solve the second problem, memory utilization efficiency, because it maps memory at the granularity of whole programs: if memory runs short, an entire program must be swapped out to disk and back in, causing a large number of disk accesses and severely hurting speed. The granularity is simply too coarse. In fact, by the principle of locality, a running program uses only a small part of its data frequently during any given period; a large part of its data will not be touched at all in that time. This naturally suggests a memory mapping method with a smaller granularity, one that exploits locality and greatly improves memory utilization. That method is paging.
The basic idea of paging is to divide the address space into fixed-size pages. The page size is determined by the hardware, or, if the hardware supports several page sizes, the operating system chooses one. For example, Intel Pentium-series processors support pages of 4 KB or 4 MB, so the operating system can choose either size, but only one at a time, so the page size of the whole system is fixed. Almost all operating systems on the PC currently use 4 KB pages. The PCs we use have a 32-bit virtual address space, that is, 4 GB; divided into 4 KB pages, that makes 1,048,576 pages. Physical space is divided in the same way.
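A quick sketch of the arithmetic for 4 KB pages (the sample address below is an arbitrary illustration): with a 32-bit address, the low 12 bits are the offset within the page, and the high 20 bits select one of the 1,048,576 pages:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE  4096u  /* 4 KB pages       */
#define PAGE_SHIFT 12u    /* log2(PAGE_SIZE)  */

int main(void)
{
    uint32_t vaddr = 0x12345678u;              /* an arbitrary example address */
    uint32_t page  = vaddr >> PAGE_SHIFT;      /* virtual page number          */
    uint32_t off   = vaddr & (PAGE_SIZE - 1u); /* offset within the page       */

    printf("address 0x%08" PRIx32 " -> page %" PRIu32 ", offset 0x%03" PRIx32 "\n",
           vaddr, page, off);
    printf("pages in a 4 GB address space: %" PRIu32 "\n",
           (uint32_t)(0x100000000ull / PAGE_SIZE));
    return 0;
}
```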
In this chapter, we reviewed the basic hardware and software structure of the computer, including how the CPU connects to peripheral components, SMP and multi-core, the layered architecture of software and hardware, how to make full use of the CPU, and basic concepts closely tied to system software such as device drivers, the operating system, virtual space, physical space, and page mapping.