Today PCs are involved in almost all production processes, from supporting the work of an office clerk to controlling industrial processes. The global trend toward smaller computers has produced a whole class of machines: embedded and mini-PCs. They are so ubiquitous that we have stopped noticing them.
Within a single information system, mini-PCs share a standard hardware configuration and a standard set of system and application software. They also follow an "as is" principle: the PC is simply powered on, peripherals and the network are plugged in, and it starts performing its functions immediately, with no additional configuration.
Continuous operation of server systems
In all these application scenarios, embedded and mini-PCs are integral elements of highly complex, critical production information systems that require continuous operation and whose interruption could lead to unwanted or irreparable consequences. Fault tolerance and high availability of information systems have been the focus of system architects, developers, and engineers for decades.
That focus has always been on the continuous operation of server systems, because they are the heart of any information system. Information technology is so tightly integrated into production processes that a failure of the information system is tantamount to stopping the process. Indeed, the failure of a large bank server or a large automated process control system will inevitably lead to downtime for the organization in question, and therefore to significant financial losses. If the business process does not run, no money is made, which amounts to losing money.
Over the decades, powerful server resiliency systems have been developed. They are based on duplicating hardware and using system software either to keep two or more nodes of a fault-tolerant system in near-synchronous (lockstep) operation or to fail over between them. Examples include VMware vSphere Fault Tolerance, Microsoft Windows Server Failover Clustering, Stratus everRun, Stratus ftServer, and others. They provide continuity of server systems with varying degrees of efficiency and availability.
But terminal devices, which are no less important elements of an information system, remain completely unprotected from hardware and software failures. Yes, the server and the information system as a whole will keep working. Yes, only one PC will fail. But if this device is controlling, for example, artificial blood circulation during surgery, what can a failure lead to? If this terminal device controls the traffic lights on a busy street, how many accidents can happen when it fails? Who would be relieved that only one device failed?
In any program, there is always at least one error
Polywell Computers believes that such a skew in the continuity of information systems towards servers is unacceptable. We have to think about terminal devices, user PCs, and embedded PCs as well. Yes, the reliability of hardware is constantly growing. Our own failure statistics confirm this: the failure rate does not exceed 0.01%. By using only industrial-grade components, we have been able to achieve and openly declare an MTBF of 100,000 hours. This is a unique offering in the market, one that even the bigger brands do not match.
However, this does not mean that hardware failures are completely eliminated. They occur and will continue to occur, often through no fault of the manufacturer. In many countries the quality of the power supply leaves much to be desired, and a single "good" power surge can be enough to put a PC out of action, or even to destroy it. Providing every PC with an uninterruptible power supply is very expensive and sometimes unfeasible. Moreover, as already mentioned, computers are being miniaturized to make them more convenient, and a bulky, heavy UPS would defeat that purpose. Instead, a number of Polywell Computers mini-PCs have a built-in battery that provides one hour of operation, which solves the described problem.
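To put the 100,000-hour MTBF quoted above in perspective, the following minimal sketch converts it into a yearly failure probability. It assumes a constant failure rate (the exponential reliability model) and round-the-clock operation. The model and the fleet size of 1,000 units are illustrative assumptions, not figures from Polywell Computers.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours of round-the-clock operation


def annual_failure_probability(mtbf_hours: float) -> float:
    """Probability of at least one failure in a year of continuous operation.

    Assumes a constant failure rate (exponential model), which is a
    simplification and not something stated by the vendor.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours: float, fleet_size: int) -> float:
    """Expected number of failures per year across a fleet of identical PCs."""
    return fleet_size * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    mtbf = 100_000  # hours, the figure quoted in the text
    fleet = 1_000   # hypothetical fleet size, chosen only for illustration
    p = annual_failure_probability(mtbf)
    n = expected_failures_per_year(mtbf, fleet)
    print(f"Annual failure probability per unit: {p:.1%}")
    print(f"Expected failures per year in a fleet of {fleet}: {n:.1f}")
```

Under these assumptions, even a high MTBF leaves a large fleet with a steady trickle of failed units each year, which is why the recovery procedure matters as much as the reliability figure itself.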
Hardware, however, is not the only source of failures. Software vendors have also "taken care" that they happen. As the well-known joke goes, any program, even the most thoroughly debugged one, contains at least one error. Judging by the number of monthly updates to Microsoft operating systems, they contain thousands of such errors, which are fixed all the time, probably generating new ones in the process. Undoubtedly, the continuity of terminal devices deserves the most serious attention.
This task is not as simple as it may seem at first sight. We cannot take the usual route of duplicating hardware, as is done for server systems. It is not difficult to imagine the enormous additional cost of installing a second PC along with the means to keep the two nodes synchronized. In addition, terminal devices are not subject to as strict a zero-downtime requirement as servers are.
A new class of embedded and mini-PCs, the SwapPC (replaceable PC)
Polywell Computers' SwapPC is based on the successful Open Pluggable Specification (OPS), created by Intel Corporation in 2010 to simplify the installation, use, maintenance, and upgrade of digital signage infrastructure.
This open standard includes the electrical, mechanical, and thermal characteristics of PCs connected via an 80-pin JAE connector that supports commonly used interfaces such as DisplayPort and USB. The overall goal is to allow manufacturers to deploy plug-in systems faster and at higher volumes, while reducing deployment and implementation costs.
To take advantage of these features, the company has designed a special frame (docking station) that contains the JAE connector on the inside and all necessary power and peripheral connections on the outside. In addition, the front side of the SwapPC has additional peripheral connectors that can be used if necessary.
Thus, to replace a PC that has failed for whatever reason, it is enough to slide it out of the docking station and insert a standby unit, which can be stored either at the point of use or in a nearby warehouse. The new PC starts automatically.
This scheme is especially important in places where no qualified technicians are available. The swap can be performed by any worker or cashier. When the IT staff later arrive on site, they take the defective PC away for repair or adjustment and leave a new standby computer for use in case of another failure.
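The effect of such a quick swap can be illustrated with the standard steady-state availability formula, A = MTBF / (MTBF + MTTR), where MTTR is the mean time to restore service. The sketch below compares two recovery scenarios. Both repair times (a six-minute swap by on-site staff and a 24-hour wait for a technician) are hypothetical values chosen for illustration, not measurements from the source.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours of round-the-clock operation


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the device is working."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected hours of downtime per year for a single device."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR


if __name__ == "__main__":
    mtbf = 100_000  # hours, the MTBF figure quoted in the text
    scenarios = {
        "swap by on-site staff": 0.1,  # hypothetical: about six minutes
        "wait for a technician": 24.0,  # hypothetical: one full day
    }
    for name, mttr in scenarios.items():
        a = availability(mtbf, mttr)
        d = annual_downtime_hours(mtbf, mttr)
        print(f"{name}: availability {a:.6f}, downtime {d:.3f} h/year")
```

With MTBF held fixed, expected downtime scales almost linearly with the time needed to restore service, so shortening the replacement step is the main lever available for a terminal device that cannot justify duplicated hardware.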