A Discussion of Guest OS Timekeeping in Virtual Machines

  
 

Clock synchronization in a traditional operating system

A system has many hardware clocks, such as the RTC, PIT, HPET, ACPI PM timer, and TSC, each with its own characteristics. They fall into two types: periodic clocks that deliver interrupts, such as the RTC, PIT, and HPET; and incrementing counter clocks, such as the TSC, that exist only as a counter. A periodic clock keeps time by raising interrupts at a fixed rate, like a heartbeat. An incrementing counter clock raises no interrupt; software must actively read its counter register to get the time.
Therefore, an operating system typically uses a periodic clock as the tick source for timekeeping, and uses an incrementing counter clock such as the TSC for time calibration or performance measurement. This is true of both Linux and Windows.
The clock synchronization discussed here means that the periodic clock in the system must stay in step with the incrementing counter clock. For example, suppose the operating system uses the RTC as the periodic tick source for timekeeping at 100 Hz, the CPU frequency is 3000 MHz, and the TSC frequency equals the CPU frequency. Then each time an RTC interrupt is generated (system time advances by 1/100 second), the TSC counter should have increased by 1/100 * 3000M = 30,000,000. If this relationship holds steadily, we say the system clocks are synchronized.
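The arithmetic above can be written out as a small check. This is only an illustrative sketch using the figures from the example; the function names and the 1% tolerance are assumptions made for the illustration, not part of any operating system.

```python
# Figures from the example in the text.
TICK_HZ = 100                # periodic clock (RTC) interrupt rate
TSC_HZ = 3_000_000_000       # TSC frequency, equal to the 3000 MHz CPU clock


def expected_tsc_delta(ticks: int) -> int:
    """TSC cycles that should elapse over `ticks` periodic interrupts."""
    return ticks * TSC_HZ // TICK_HZ


def is_synchronized(ticks: int, tsc_delta: int, tolerance: float = 0.01) -> bool:
    """True if the observed TSC delta matches the tick count.

    `tolerance` is a hypothetical allowed relative error (1% by default).
    """
    expected = expected_tsc_delta(ticks)
    return abs(tsc_delta - expected) <= tolerance * expected


print(expected_tsc_delta(1))                  # 30000000
print(is_synchronized(100, 3_000_000_000))    # True
print(is_synchronized(100, 3_300_000_000))    # False
```

In the last call the TSC has advanced by 110 ticks' worth of cycles while only 100 interrupts were counted, which is the kind of mismatch the rest of this article is concerned with.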
So why do the TSC and RTC need to be synchronized? The timekeeping code of an operating system often uses TSC time to correct the RTC-maintained time (jiffies in Linux). Linux, at least, does this; Windows may not. The reason is that periodic RTC interrupts can occasionally be lost (for example, while interrupts are masked), whereas the TSC counter runs steadily and does not lose time. The system time maintained from the RTC therefore falls behind the TSC time, and it must be calibrated continually so that it stays consistent with the TSC. (For the specific calibration action, see timer_tsc, which cur_timer points to in timer_interrupt, in Linux.)
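The correction described above can be sketched roughly as follows. This is a simplified model of the idea, not the actual Linux timer_tsc code; the Timekeeper class, its fields, and the constants are hypothetical names chosen for the illustration, reusing the 100 Hz / 3000 MHz figures from the earlier example.

```python
TICK_HZ = 100
TSC_HZ = 3_000_000_000
CYCLES_PER_TICK = TSC_HZ // TICK_HZ   # 30,000,000 cycles per tick


class Timekeeper:
    """Hypothetical tick-based clock that uses the TSC to recover lost ticks."""

    def __init__(self, tsc_now: int) -> None:
        self.jiffies = 0
        self.last_tsc = tsc_now   # TSC value at the last accounted tick boundary

    def on_tick(self, tsc_now: int) -> int:
        """Handle one periodic interrupt; return the number of lost ticks recovered."""
        elapsed_ticks = (tsc_now - self.last_tsc) // CYCLES_PER_TICK
        lost = max(elapsed_ticks - 1, 0)      # anything beyond one tick was lost
        self.jiffies += 1 + lost              # account this tick plus the lost ones
        self.last_tsc += (1 + lost) * CYCLES_PER_TICK
        return lost


tk = Timekeeper(tsc_now=0)
tk.on_tick(30_000_000)      # on time: nothing lost
tk.on_tick(120_000_000)     # three tick periods elapsed: two ticks were lost
print(tk.jiffies)           # 4
```

After the second interrupt the tick count has been pulled forward to match the TSC, which is the sense in which the RTC-maintained time is "calibrated" against the TSC.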

In a traditional environment, both the TSC and the RTC are driven by hardware. As long as the hardware is stable, the TSC and RTC stay synchronized even though they run independently. Synchronization is therefore not a problem in a traditional environment!

Synchronization between emulated clocks in virtual machines
If clock synchronization is not a problem in a traditional environment, why is it a problem in a virtual environment?
The conditions under which the clocks are generated differ between the virtual and the traditional environment! Periodic clocks such as the RTC are not generated by the corresponding hardware but are emulated with software timers (see the previous article). TSC reads, however, are not emulated: the guest still reads the hardware TSC counter register to obtain the TSC value. (It is not emulated because software cannot emulate a steadily incrementing TSC counter, and reading the TSC directly is simpler and more reasonable.) We already know that in a virtual environment, software-emulated periodic clock interrupts such as the RTC's cannot be triggered and delivered to the guest OS (GOS) precisely: delivery is delayed and the trigger interval is unstable. As a result, the RTC-maintained time and the TSC time fall out of sync (the TSC runs ahead). On Linux, the guest prints many "Losing too many ticks!" messages (the corresponding code is in timer_tsc).
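To make the drift concrete, here is a toy model of it. It assumes, purely for illustration, that every emulated tick is delivered late by a known number of TSC cycles and that the guest counts exactly one tick per delivered interrupt; the function name and the delay values are hypothetical and do not describe any particular hypervisor.

```python
TICK_HZ = 100
TSC_HZ = 3_000_000_000
CYCLES_PER_TICK = TSC_HZ // TICK_HZ   # 30,000,000 cycles per tick


def drift_after(delays_cycles):
    """Return how many ticks the TSC time runs ahead of the tick-counted time.

    `delays_cycles` holds the extra delay (in TSC cycles) before each
    emulated tick is delivered to the guest.
    """
    tsc = 0
    ticks = 0
    for delay in delays_cycles:
        tsc += CYCLES_PER_TICK + delay   # the hardware TSC keeps running during the delay
        ticks += 1                       # the guest still counts only one tick
    return tsc / CYCLES_PER_TICK - ticks


print(drift_after([0, 0, 0]))             # 0.0
print(drift_after([15_000_000] * 4))      # 2.0
```

With each tick delivered half a period late, four ticks take six periods of TSC time, so the tick-based clock is already two ticks behind.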

For Linux guests, which use the TSC to correct the periodic clock time, we need a way to synchronize the TSC and the RTC. In fact, the virtualization extensions of Intel and AMD processors already provide one. When the GOS reads the TSC (x86 has a dedicated instruction, rdtsc), the value returned can be made the sum of the raw value (the hardware TSC counter) and a configurable offset, tsc_offset. We can therefore set tsc_offset so that the TSC seen by the GOS stays in sync with its RTC-maintained time. Concretely, at each clock interrupt injection a new tsc_offset is set according to the GOS time (as maintained by the RTC). In Xen, the statement hvm_set_guest_time(pt->vcpu, pt->last_plt_gtime) in pt_intr_post in vpt.c does this work. This keeps the clocks synchronized in the virtual environment.
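The offset mechanism can be sketched as follows. This is a minimal model of the idea only, not the Xen implementation of hvm_set_guest_time; the two function names are hypothetical, guest time is expressed in TSC cycles for simplicity, and the 64-bit wraparound of the real hardware counter is ignored.

```python
CYCLES_PER_TICK = 30_000_000   # 100 Hz ticks on a 3000 MHz TSC, as in the earlier example


def guest_rdtsc(host_tsc: int, tsc_offset: int) -> int:
    """Value the guest sees when it reads the TSC: hardware TSC plus the offset."""
    return host_tsc + tsc_offset


def offset_for_guest_time(host_tsc: int, guest_time_cycles: int) -> int:
    """Offset that makes the guest-visible TSC equal the guest's tick-maintained time."""
    return guest_time_cycles - host_tsc


# The fourth tick is injected when the host TSC is already at 180,000,000,
# but the guest's tick-maintained time is only 4 ticks = 120,000,000 cycles.
offset = offset_for_guest_time(host_tsc=180_000_000,
                               guest_time_cycles=4 * CYCLES_PER_TICK)
print(offset)                              # -60000000
print(guest_rdtsc(180_000_000, offset))    # 120000000
print(guest_rdtsc(190_000_000, offset))    # 130000000
```

Recomputing the offset at every injection means the guest-visible TSC is pulled back to the tick-based time each time, so the guest's TSC-based correction no longer detects lost ticks.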

Synchronization in an SMP environment
The synchronization described above applies to single-core systems. In a multi-core SMP environment, in addition to the above requirement, time must also be synchronized across cores: the TSCs on all cores must be synchronized with each other (each core has its own TSC), and the TSCs must be synchronized with the system's periodic clock.
Multi-core synchronization in a virtual environment is difficult. Full synchronization requires the cores to be scheduled synchronously (scheduled in or out at the same time). Otherwise, a core may be scheduled out just when a periodic clock interrupt is to be delivered to it; the clock interrupt is then lost and the system time falls behind. In the current Xen code, each core is scheduled independently to improve the performance of the whole system, so the possibility of such lost interrupts does exist.
To deal with this, Xen binds the periodic clock to the BSP core. Because the AP cores receive no clock interrupt, they perform no timekeeping (timekeeping runs only after a clock interrupt is received). They therefore neither advance the system time nor correct it, and they never report "Losing too many ticks!", even though the TSC on an AP core is indeed out of sync with the RTC-maintained time.
However, a virtual dual-core system cannot guarantee that the TSCs of the two cores are synchronized. This interferes with many performance testing tools (PCMark, WinSAT, and others) that use the TSC for performance measurement, so you will likely find that these tools score low or fail to run on a virtual dual-core system. Xen has not yet solved or paid attention to this problem.
Completely solving the multi-core clock problem requires a lot of additional work. For example, when a periodic clock interrupt occurs, an IPI could be triggered to notify the other cores to perform the corresponding synchronization; and when one core is scheduled out, the TSCs of the other cores should also be frozen.

Summary:
Clocks in a virtual environment still leave much to study and improve. Our two articles give a preliminary discussion of clock compensation and synchronization. The virtual clock implementation is an important part of virtualization technology and has a significant impact on the overall performance and stability of the system. The current Xen code still has many flaws in its clock virtualization, and we hope interested readers will continue to study it.
