FAQ
To date, we have produced five kinds of prototype chips. The purpose and specification outline of each prototype
chip are as follows.
1.TOPSTREAM™ basic platform (for functional verification)
○Purpose: A prototype chip for functional verification of the TOPSTREAM™
heterogeneous multi-core platform.
○Summary: Heterogeneous multi-core with 5 cores (all 32-bit processors), Rohm 0.35μm CMOS technology,
5mm×5mm [Prototyped in 2001 with the support of VSAC]
2.TOPSTREAM™ basic platform (for performance verification)
○Purpose: A prototype chip for performance verification of the TOPSTREAM™
heterogeneous multi-core platform.
○Summary: Heterogeneous multi-core with 5 cores (all 32-bit processors), TSMC 0.18μm CMOS technology,
5mm×5mm [Prototyped in 2002 with the support of VSAC]. Confirmed to operate at 160 MHz.
3.COOL Interconnect Interface for 3D stacked-LSI (for function and power consumption evaluation)
○Purpose: Cool Interconnect was developed as an interface between 3D stacked-LSI chips using TSVs. This is a prototype chip
for evaluating its function, performance, and power consumption.
○Summary: 1,600 TSVs; the TSV connection area is 2mm×2mm at the center of the chip,
TSMC 0.25μm CMOS technology, 8.6mm×6.0mm. [Prototyped in 2010 with the support of NEDO]
4.3D stacked heterogeneous multi-core chip (for power consumption evaluation)
○Purpose: This prototype chip was created to evaluate the power consumption of the heterogeneous multi-core developed with the aim of a cool
microprocessor that can be stacked three-dimensionally.
○Summary: We fabricated two types of heterogeneous multi-core chip in order to reduce
power consumption.
- sC0 chip: Heterogeneous multi-core with 2 cores (32-bit processors, 2 types), Cool Interconnect with 1,600 TSVs
integrated at the center of the chip, TSMC 0.18μm LP CMOS, 7.0mm×4.0mm
[Prototyped in 2010 with the support of NEDO]
- C1 chip: Heterogeneous multi-core with 5 cores (64-bit processor×2, 128-bit processor×2, 256-bit processor×1; 3
types), Cool Interconnect with 1,600 TSVs integrated at the center of the chip, TSMC 0.18μm LP CMOS, 10.2mm×9.2mm
[Prototyped in 2010 with the support of NEDO]
Conventional microprocessors improved performance by raising the operating frequency.
In the 2000s, however, they reached the power consumption limit (the Power Wall), beyond which the frequency could not be raised further. TOPSTREAM™ therefore lowers
the clock frequency, which had climbed too high, and instead sharply increases the amount of computation performed per clock through architectural
innovations. To lower the frequency, "architecture-algorithm co-design", that is, the cooperative design of hardware and software, is very
important. It covers the algorithms, the software structure that realizes them, the microprocessor instruction set, the memory hierarchy, interprocessor
communication, the multi-core configuration, the micro-architecture, and the logic design. At every level, we optimize with priority on low power
consumption in order to realize an application system that meets the requirements.
Depending on the application, it typically achieves tens of GOPS or more at around 50-100 MHz. That is, it runs at about 1/10 the frequency of a
conventional processor while far exceeding its performance. However, a higher frequency is used for very
high-end microprocessors that require several hundred TFLOPS, as in real-time ray tracing. For real-time ray tracing, the frequency is 750
MHz.
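The relation between throughput and clock frequency can be made concrete with a little arithmetic. The sketch below computes how many operations must complete per clock cycle to reach a given throughput. The function name and the sample figures (20 GOPS at 100 MHz, 300 TFLOPS at 750 MHz) are illustrative assumptions chosen to fall within the ranges quoted above, not measured values.

```python
def ops_per_cycle(throughput_ops_per_s: float, clock_hz: float) -> float:
    """Average number of operations that must complete in each clock cycle."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return throughput_ops_per_s / clock_hz


# Illustrative figures only (assumptions, not measured values).
low_clock = ops_per_cycle(20e9, 100e6)      # 20 GOPS at 100 MHz
ray_tracing = ops_per_cycle(300e12, 750e6)  # 300 TFLOPS at 750 MHz
print(low_clock, ray_tracing)
```

The point of the calculation is that a low clock frequency only works if the architecture completes a large number of operations in every cycle.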
Basically, TOPSTREAM™ takes the approach of functionally decomposing the processing needed to realize an application system.
In general, processing can be divided into control processing and data processing. Control processing is assigned to the
MC (Master Controller), a 32-bit control microprocessor mounted on TOPSTREAM™. Data processing is assigned to DPEs
(Data Processing Engines), which have two built-in instruction sets (32-bit RISC and n-bit SIMD extension instructions). DPE is a general term for the
data-processing microprocessors mounted on TOPSTREAM™; the data width and instruction set are optimized for each application, and each variant is given
its own name. For example, the processor that performs low-level baseband signal processing in TOPSTREAM™ WLAN is a microprocessor
with a 128-bit data path called the WPE (Wireless Processing Engine).
The number and configuration of DPEs are determined to meet the requirements of the application system (cost, performance, power consumption,
and range of application, i.e. the application domain). Application software written as serial processing is first converted into message-passing
functional distributed processing, and then the architecture-algorithm co-design technique original to TOPS Systems is applied. A
heterogeneous multi-core processor and the distributed software that runs on it are then configured, performance and power consumption
are estimated by quantitative simulation, and the optimal hardware and software configuration is pursued. As a result, the kinds and number of DPEs that make up the
heterogeneous multi-core are decided.
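As a rough illustration of message-passing functional distributed processing, the sketch below splits a serial computation into stages that communicate only through message queues. It is a minimal sketch under stated assumptions: threads stand in for processor cores, Python queues stand in for the interprocessor communication, and the names `run_stage`, `run_pipeline`, and `STOP` are hypothetical and not part of TOPSTREAM™.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker


def run_stage(func, inbox, outbox):
    """One functional unit: receive a message, process it, send the result on."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(func(msg))


def run_pipeline(stages, items):
    """Run each stage in its own thread, connected only by message queues."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)

    results = []
    while True:
        msg = queues[-1].get()
        if msg is STOP:
            break
        results.append(msg)
    for t in threads:
        t.join()
    return results


# A serial computation (x + 1) * 2 decomposed into two stages.
print(run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3]))
```

Because the stages share no state and exchange only messages, each one can be mapped to a different kind of core without changing the others.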
Two approaches to software development are provided.
1.Using existing software as is
For example, if our Ultra-Android software platform is used, application software written in Java does not need to be changed.
2.Parallelization of software
Because this is multi-core-oriented software development, it is more difficult than conventional serial software development.
However, the message-passing functional distributed processing adopted by TOPSTREAM™ is easier to program than data-parallel processing.
Software development for TOPSTREAM™ is broadly performed in two steps: 1) functional decomposition, and 2) optimization of the individual processes
after the division (including streaming).
Because automation is difficult at present, functional decomposition in step (1) is carried out by hand after gaining an understanding of the application system. It is also necessary
to check for load imbalance by quantitative simulation. Optimization of each process in step (2) is the same as in conventional serial programming,
except that the developer needs to be conscious of streaming.
TOPS Systems is working on a compiler capable of automatic streaming.
In addition, multi-core-oriented software development services are provided by Cool Soft Corp., a wholly-owned subsidiary of TOPS Systems.
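The load-imbalance check mentioned in step (1) can be sketched as follows, assuming a simulation has already produced a cycle count per process. The function name, the process names, and the numbers are hypothetical; the ratio of the heaviest process to the mean is just one simple imbalance measure, not necessarily the one used by TOPS Systems.

```python
def load_imbalance(cycles_per_process: dict[str, int]) -> tuple[str, float]:
    """Return the heaviest process and its load relative to the mean load.

    A ratio near 1.0 means the decomposition is well balanced; a larger
    ratio means the named process limits overall throughput.
    """
    if not cycles_per_process:
        raise ValueError("no processes given")
    bottleneck = max(cycles_per_process, key=cycles_per_process.get)
    mean = sum(cycles_per_process.values()) / len(cycles_per_process)
    return bottleneck, cycles_per_process[bottleneck] / mean


# Hypothetical simulated cycle counts per data item.
simulated = {"parse": 100, "transform": 300, "output": 200}
print(load_imbalance(simulated))
```

When one process dominates, the usual remedy is to split it further or move part of its work to a neighbouring process, then simulate again.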
The development period is roughly the same as for an SoC: approximately 18-24 months from requirements definition to
chip prototyping.
It basically does not depend on the semiconductor vendor's technology. TOPSTREAM™ is licensed as RTL that can be implemented on a chip
using a general ASIC design flow. The libraries used by the RTL are standard logic gates (including tri-state drivers),
single-port clock-synchronous SRAM, and a register file (2R1W).
We raise productivity and quality through the following four design approaches.
①Highly scalable multi-core architecture (innovations in the on-chip bus and the instruction set architecture)
②Improved reusability through a platform-based SoC design approach
③More efficient design and verification through our own library for writing RTL
④Application of advanced functional verification (automatic generation of instruction-level tests, acceleration of
functional verification with a hardware emulator, instruction-level comparison and verification of the RTL against the ISS)
Two broad application areas are envisaged for the TOPSTREAM™ heterogeneous multi-core: control systems and image-processing systems.
As a control system processor, we believe TOPSTREAM™ can contribute to ECU integration, improved real-time performance, and improved
reliability. For image processing, where the required performance exceeds several hundred GOPS, we believe TOPSTREAM™ is suitable for
tasks such as image recognition that need both scalability and the flexibility to handle various algorithms.
TOPSTREAM™ is intended for use in embedded systems. In embedded systems, an SoC is designed for each product
group, for example smartphones, digital cameras, or digital televisions. It is therefore not meant to be a single general-purpose processor chip
used in every product. The idea behind TOPSTREAM™ is to make functions and performance scalable by building the SoC from multiple cores, and to reuse
the processor cores already developed and the software that runs on them as assets. For example, based on the TOPSTREAM™ architecture,
a heterogeneous multi-core SoC suited to next-generation digital television is configured and then used scalably across digital televisions
from high-end to low-end. The scalability provided by the TOPSTREAM™ multi-core platform can be used for such future functional extensions and performance
expansion.
As for the flexibility of each processor core, every core has the common 32-bit RISC instructions (for control) as well as n-bit SIMD instructions,
so a processor core can also run application software at quite high speed. However, if a processor with a fixed configuration is used and
functions are rapidly added in software, as on a general-purpose processor, performance may become insufficient.
Various techniques are used to make it a highly energy-efficient microprocessor.
Some of these techniques and their effects are introduced below.
- ①Dual instruction set architecture: Many data-processing instructions can be defined while keeping
the instruction length compact at 16 bits (256 kinds of two-operand instructions).
- ②Register bank: Up to 256 registers are available while keeping the instruction length
compact at 16 bits.
- ③Stream processing core: Reading of continuous memory data and writing to a continuous memory
region can be performed in parallel with operation instructions, which reduces the number of
clock cycles required to run a program.
- ④Composite instruction: Executing heavy, complex application-specific processing
as a single instruction that processes multiple operations together in one cycle
can significantly reduce the number of clock cycles required to run the program.
- ⑤Register bank sharing between processor cores: Because the register bank shared between cores
can be used for communication between processor cores without going through memory, the number of
memory accesses and the memory access time required for communication
between processors are reduced.
- ⑥Synchronization mechanism between processors: Because it has the instructions
required for synchronization between processors, communication between processors
can be performed in as few as zero cycles.
- ⑦Scalable on-chip bus: Changing the number of cores requires neither a change to
the bus design nor re-verification.
This is due to a scalable bus arbitration scheme of the distributed, multi-master, multi-slave type.
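To see how 256 two-operand instructions and up to 256 registers could fit within a 16-bit instruction, consider the following sketch. The source does not give the actual bit layout, so everything here is an assumption made for illustration: an 8-bit opcode, two 4-bit register fields, and a separate bank selector that maps a 4-bit field onto one of 16 banks of 16 registers. The function names are hypothetical.

```python
OPCODE_BITS = 8   # assumed: 2**8 = 256 two-operand instructions
REG_BITS = 4      # assumed: 16 registers addressable per operand field
BANK_SIZE = 1 << REG_BITS


def encode(opcode: int, rd: int, rs: int) -> int:
    """Pack an opcode and two register fields into one 16-bit word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode out of range")
    if not (0 <= rd < BANK_SIZE and 0 <= rs < BANK_SIZE):
        raise ValueError("register field out of range")
    return (opcode << (2 * REG_BITS)) | (rd << REG_BITS) | rs


def decode(word: int) -> tuple[int, int, int]:
    """Unpack a 16-bit word into (opcode, rd, rs)."""
    return (word >> 8) & 0xFF, (word >> 4) & 0xF, word & 0xF


def physical_register(bank: int, index: int) -> int:
    """Map a 4-bit register field plus a bank selector to one of 256 registers."""
    if not (0 <= bank < 16 and 0 <= index < BANK_SIZE):
        raise ValueError("bank or index out of range")
    return bank * BANK_SIZE + index


word = encode(0xAB, 3, 5)
print(hex(word), decode(word), physical_register(15, 15))
```

Under these assumptions the bank selector lives outside the instruction word, which is what lets the register count grow without lengthening the instruction.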
Because a TOPSTREAM™ heterogeneous multi-core processor is configured for the application field, and the number of cores depends on the required
performance and functions, the gate count varies with the multi-core configuration.
As specific examples, TOPSTREAM™ WLAN, a 4-core configuration for WiFi (MC=32bit, SPE=32bit, RPE=32bit, WPE=128bit), is about 600 Kgates. A 5-core configuration for
video decoding in digital TV (SCP=64bit×2, DCP=128bit×2, QCP=256bit) is 1.25 Mgates.
At present, the software developer performs the scheduling of stream processing. A compiler that performs this scheduling
is under development.
It is a project supported by the Ministry of Economy, Trade and Industry, carried out as joint research between TOPS Systems
and AIST since 2009.
To raise energy efficiency, it assumes the use of TOPSTREAM™ Ultra-Android (tentative name), a heterogeneous multi-core processor
with a low clock frequency of about 50-100 MHz that is being developed in parallel.
Because it is compatible with Android application software, application software does not need to be changed (there is no added development burden).
However, device drivers for the peripherals of a terminal equipped with the Ultra-Android software platform,
such as a smartphone or tablet, must be developed, just as for a terminal running the Android software platform.
In addition, Cool Soft Corp., a subsidiary of TOPS Systems Corporation, provides development services such as device drivers for customers.
It is no different from conventional Android application development.
At present, it supports only TOPSTREAM™. Support for other multi-core processors is undecided.
Conventional Android applications can be used. The Linux OS used by Ultra-Android is the same as Android's.
TOPS Systems is pursuing an open-innovation partner strategy. We are always looking for partners.
It is an Android software platform based on functional distributed processing. Functional distributed processing
takes advantage of heterogeneous multi-core performance to achieve real-time behavior, high speed, and low power consumption.
It is planned to support only major versions of Android. Moreover, when functional distributed processing is applied,
we think the parts that depend on the Android version are limited.
For example, we think that adopting the Ultra-Android software platform in network-connected embedded information devices with a display
can bring major differentiation in processing speed, power consumption, responsiveness, and so on.
This project aims at ultra-low power consumption. Therefore, next-generation digital television
is assumed first as an application field that contributes to society's energy saving and CO2 reduction through lower power consumption. In addition,
network switches and lower power consumption in data centers are in view as mid- to long-term application fields.