Vitruvius+: an area-efficient RISC-V decoupled vector coprocessor for high performance computing applications

F Minervini, O Palomar, O Unsal, E Reggiani, J Quiroga, J Marimon, C Rojas, R Figueras…
ACM Transactions on Architecture and Code Optimization, 2023 (dl.acm.org)
The maturity level of RISC-V and the availability of domain-specific instruction set extensions, like vector processing, make RISC-V a good candidate for supporting the integration of specialized hardware in processor cores for the High Performance Computing (HPC) application domain. In this article, we present Vitruvius+, the vector processing acceleration engine that represents the core of vector instruction execution in the HPC challenge that comes within the EuroHPC initiative. It implements the RISC-V vector extension (RVV) 0.7.1 and can be easily connected to a scalar core using the Open Vector Interface standard. Vitruvius+ natively supports long vectors: 256 double-precision floating-point elements in a single vector register. It is composed of a set of identical vector pipelines (lanes), each containing a slice of the Vector Register File and functional units (one integer, one floating point). The vector instruction execution scheme is hybrid in-order/out-of-order and is supported by register renaming and arithmetic/memory instruction decoupling. On a stand-alone synthesis, Vitruvius+ reaches a maximum frequency of 1.4 GHz in typical conditions (TT/0.80V/25°C) using GlobalFoundries 22FDX FD-SOI. The silicon implementation has a total area of 1.3 mm² and maximum estimated power of ~920 mW for one instance of Vitruvius+ equipped with eight vector lanes.
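
The figures quoted in the abstract can be combined into a few back-of-the-envelope numbers. The Python sketch below is only illustrative arithmetic, not a model from the paper. It assumes that the 256 elements of a register are divided evenly across the eight identical lanes, and that each lane's floating-point unit completes one element operation per cycle; neither assumption is stated in the abstract, and all variable names are hypothetical.

```python
# Figures quoted in the abstract.
VLEN_ELEMENTS = 256   # double-precision elements per vector register
ELEMENT_BITS = 64     # width of one double-precision element
LANES = 8             # vector lanes in the evaluated instance
FREQ_GHZ = 1.4        # maximum frequency (stand-alone synthesis)
AREA_MM2 = 1.3        # total silicon area
POWER_MW = 920        # maximum estimated power

# Assumption (not from the abstract): one floating-point element
# operation completes per lane per cycle.
FP_OPS_PER_LANE_PER_CYCLE = 1

# Size of a single vector register.
register_bits = VLEN_ELEMENTS * ELEMENT_BITS
register_bytes = register_bits // 8

# Assumption: elements are split evenly across the identical lanes.
elements_per_lane = VLEN_ELEMENTS // LANES

# Cycles needed to stream one full-length vector through the lanes.
cycles_per_full_vector = elements_per_lane // FP_OPS_PER_LANE_PER_CYCLE

# Peak element-operation rate under the assumption above (Gops/s).
peak_gops = LANES * FP_OPS_PER_LANE_PER_CYCLE * FREQ_GHZ

# Power density of the eight-lane instance (mW per mm^2).
power_density = POWER_MW / AREA_MM2

print(f"register size:        {register_bits} bits ({register_bytes} bytes)")
print(f"elements per lane:    {elements_per_lane}")
print(f"cycles per vector:    {cycles_per_full_vector}")
print(f"peak FP element rate: {peak_gops:.1f} Gops/s")
print(f"power density:        {power_density:.0f} mW/mm^2")
```

Under these assumptions a single register holds 2 KiB of data and each lane handles 32 of its elements. The throughput figure would scale with whatever per-cycle rate the functional units actually sustain.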