



## **INTEL® FPGA LINUX\* DRIVER SOLUTION AND UPSTREAMING STATUS**

Figo Zhang, Hao Wu





#### Background

- Intel® FPGAs and the Modern Datacenter
- Platform Options and the Acceleration Stack
- Open Programmable Acceleration Engine

#### Hardware Overview

- FPGA Interface Manager (FIM)

#### Intel FPGA Linux Driver solution

- Driver Architecture
- Partial Reconfiguration
- Virtualization
- Example: Simple DMA Operation
- Upstream Status





#### Background




### DATA MOVEMENT AND PROCESSING EXPLOSION



#### Markets

- Government
- Enterprise
- Cloud

Communications

Infrastructure

Network Storage

Compute








#### Workloads

- Security
- **Big Data Processing and Analytics**
- Video processing and transcode
- Artificial Intelligence & Machine Learning


Packet processing







The Intel® Xeon® processor with FPGA acceleration can reduce TCO and solve new problems



### INTEL® FPGA DATA CENTER FORM FACTORS OPTIONS

Enabled By The Acceleration Stack for Intel® Xeon® CPU with FPGAs







- System flexibility with Intel Xeon CPU SKU options
- Dedicated local memory
- Can be slotted into 1U servers

#### Server Platform Option with In-Package FPGA



- Coherent interface benefits software developers
- Superior performance for bandwidth- and latency-sensitive applications

Choose the Intel FPGA form factor matched to your application needs



### THE INTEL® APPLICATION DEVELOPER ADVANTAGE



Acceleration Stack for Intel<sup>®</sup> Xeon<sup>®</sup> CPU with FPGAs

Intel Environment Code re-use

#### IP Libraries

#### Acceleration Stack for Intel<sup>®</sup> Xeon<sup>®</sup> CPU with FPGAs – Enhanced Performance, Simplified

- Saves developer time to focus on unique value-add of their solution
- Enables unprecedented code re-use across multiple Intel FPGA form-factor products
- World's first common developer interface for Intel FPGA data center products
- Optimized and simplified hardware and software APIs provided by Intel
- Enables easier development and deployment of Intel FPGAs for workload optimization

The stable and optimized foundation for building your Intel FPGA-accelerated solution



### ACCELERATION STACK FOR INTEL® XEON® CPU WITH FPGAS



Enhanced Performance, Simplified



#### Intel® delivers a system-optimized solution stack for your data center workloads

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. Logos and names provided for illustrative purposes only. Current availability may be different.



#### Start developing for Intel FPGAs with OPAE today: <a href="http://01.org/OPAE">http://01.org/OPAE</a>

\*\*ASE requires Acceleration Functions written in RTL and a properly installed RTL simulator: Synopsys\* VCS-MX, Mentor Graphics\* ModelSim-SE\*/QuestaSim.

Supports: Red Hat Enterprise Linux\* 7.3 w/ kernel 4.7, Intel® Xeon® Processors v4 or newer.

#### Consistent cross-platform API

- Minimal software overhead and latency
- Supports virtual machines and bare metal platforms
- Open source code licensing and developer community

Simplified FPGA Programming Layer for Application Developers

OPEN PROGRAMMABLE ACCELERATION ENGINE (OPAE) TECHNOLOGY

- Intel FPGA drivers are being upstreamed to the Linux kernel

#### SDK includes:

- Guides, utilities and sample code
- AFU Simulation Environment (ASE)\*\*:
  - Develop and debug Accelerator Functions faster










### Hardware Overview



Network Platforms Group

### **ACCELERATION ENVIRONMENT**





### FPGA INTERFACE MANAGER (FIM) OVERVIEW

Device memory organized in Device Feature List data structure

Supported features exposed through Device Feature List







### **FPGA INTERFACE MANAGER (FIM) DETAILS**



#### FPGA Management Engine

- Provides: power and thermal management, error reporting, partial reconfiguration, performance reporting, and other infrastructure functions.
- Each FPGA has one FME, accessible through the physical function.

#### Port

- Interface between the static FPGA fabric (FIM) and a partially reconfigurable region containing an Accelerated Function Unit (Accelerator Function).
- Controls communication from SW and exposes features such as reset and debug.

#### Accelerated Function Unit

- Attached to a port and exposes an MMIO region for accelerator-specific control registers.



### FPGA INTERFACE MANAGER (FIM) - VIRTUALIZATION SUPPORT



Intel Hardware

Supports PCIe SR-IOV function to create virtual functions (VFs) which can be used to assign individual accelerators to virtual machines.







#### Intel<sup>®</sup> FPGA Linux Driver Solution



#### **INTEL® FPGA DRIVER ARCHITECTURE**






### **FPGA DRIVER COMPONENT – PCIE DEVICE DRIVER**

- Enumeration
  - Discover Feature Devices by walking through the Device Feature List.
  - Create Platform Device with associated resources for Feature Devices.
    - Port/AFU and FME are Feature Devices.
  - Feature Device Framework:
    - Helper functions to manage feature devices and sub-features.
- SR-IOV Support
  - Sysfs interface to enable/disable VFs.
  - Both PF and VFs share the same driver.
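
The enumeration step above can be illustrated with a minimal sketch of a Device Feature List walk. The slides do not give the header layout, so the bit fields below (feature ID in bits 11:0, next-header offset in bits 39:16, end-of-list flag in bit 40, type in bits 63:60) are an assumption; `dfl_walk` and `dfl_visit_fn` are hypothetical names, not driver functions. The sketch operates on an already-mapped region and only follows the chain of headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED 64-bit feature header layout; not taken from the slides. */
#define DFH_ID(h)    ((unsigned)((h) & 0xFFFu))           /* bits 11:0  */
#define DFH_NEXT(h)  ((size_t)(((h) >> 16) & 0xFFFFFFu))  /* bits 39:16 */
#define DFH_EOL(h)   ((int)(((h) >> 40) & 0x1u))          /* bit 40     */
#define DFH_TYPE(h)  ((unsigned)(((h) >> 60) & 0xFu))     /* bits 63:60 */

/* Hypothetical callback invoked once per discovered feature header. */
typedef void (*dfl_visit_fn)(size_t offset, uint64_t header, void *ctx);

/*
 * Follow the linked list of feature headers inside a mapped region.
 * Returns the number of features found, or -1 if the list is malformed
 * (zero or unaligned next offset, or no end-of-list flag within `size`).
 */
int dfl_walk(const volatile uint64_t *base, size_t size,
             dfl_visit_fn visit, void *ctx)
{
    size_t off = 0;
    int count = 0;

    assert(base != NULL);

    while (off + sizeof(uint64_t) <= size) {
        uint64_t hdr = base[off / sizeof(uint64_t)];
        size_t next = DFH_NEXT(hdr);

        if (visit)
            visit(off, hdr, ctx);
        count++;

        if (DFH_EOL(hdr))
            return count;               /* last feature in the list */
        if (next == 0 || next % sizeof(uint64_t) != 0)
            return -1;                  /* would loop forever or misalign */
        off += next;                    /* offset is relative to this header */
    }
    return -1;                          /* ran past the region without EOL */
}
```

In a driver, the visit callback would be the place to record each feature's MMIO resource before creating the platform device for the FME or Port.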






### **FPGA DRIVER COMPONENT - ACCELERATED FUNCTION UNIT**

- Platform Device Driver
- Expose Accelerated Function Unit MMIO Resource
  - MMIO region (mmap)
- Provide DMA buffer mapping service
- Implement other sub features
  - Error reporting
  - UMSG (Unordered Message)
  - Debug






### **ACCELERATED FUNCTION UNIT – DRIVER INTERFACES**

#### ioctl

- Get driver API version (FPGA\_GET\_API\_VERSION)
- Check for extensions (FPGA\_CHECK\_EXTENSION)
- Get port info (FPGA\_PORT\_GET\_INFO)
- Get MMIO region info (FPGA\_PORT\_GET\_REGION\_INFO)
- Map DMA buffer (FPGA\_PORT\_DMA\_MAP)
- Unmap DMA buffer (FPGA\_PORT\_DMA\_UNMAP)
- Reset AFU (FPGA\_PORT\_RESET)
- Enable UMsg (FPGA\_PORT\_UMSG\_ENABLE)
- Disable UMsg (FPGA\_PORT\_UMSG\_DISABLE)
- Set UMsg mode (FPGA\_PORT\_UMSG\_SET\_MODE)
- Set UMsg base address (FPGA\_PORT\_UMSG\_SET\_BASE\_ADDR)

#### mmap

- mmap() accelerator MMIO regions.

#### sysfs

- Path: /sys/class/fpga\_region/regionX/intel-fpga-afu.n/
- Read Accelerator GUID (afu\_id)
- Error Reporting (errors/)
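
Reading a sysfs attribute such as afu\_id needs nothing beyond ordinary file I/O. The following is a minimal sketch; `sysfs_read_attr` is a hypothetical helper name, and the caller supplies the concrete path with real values substituted for `regionX` and `n`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one sysfs attribute into buf as a NUL-terminated string and strip
 * the trailing newline. Returns the string length, or -1 on error.
 * Hypothetical helper, not part of the driver or of OPAE.
 */
int sysfs_read_attr(const char *path, char *buf, size_t len)
{
    FILE *f;
    size_t n;
    int failed;

    assert(path != NULL && buf != NULL);
    if (len == 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;

    n = fread(buf, 1, len - 1, f);
    failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;

    buf[n] = '\0';
    while (n > 0 && (buf[n - 1] == '\n' || buf[n - 1] == '\r'))
        buf[--n] = '\0';
    return (int)n;
}
```

A caller would pass the afu\_id path under the directory listed above and compare the returned string with the GUID of the accelerator it expects before opening the device.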





### **FPGA DRIVER COMPONENT - FPGA MANAGEMENT ENGINE**

- Platform Device Driver
- Implements management features
  - FPGA capability and status.
  - Thermal & Power management.
  - Partial Reconfiguration function.
  - Global Error reporting.
  - Global Performance reporting.
  - Port Management.




### **FPGA MANAGEMENT ENGINE - DRIVER INTERFACES**



#### ioctl

- Get driver API version (FPGA\_GET\_API\_VERSION)
- Check for extensions (FPGA\_CHECK\_EXTENSION)
- Assign port to PF (FPGA\_FME\_PORT\_ASSIGN)
- Release port from PF (FPGA\_FME\_PORT\_RELEASE)
- Program Bitstream (FPGA\_FME\_PORT\_PR)

#### sysfs

- Path: /sys/class/fpga\_region/regionX/intel-fpga-fme.n/
- Read bitstream ID/metadata (bitstream\_id / bitstream\_metadata)
- Read number of ports (ports\_num)
- Read socket ID (socket\_id)
- Read performance counters (perf/)
- Power management (power\_mgmt/)
- Thermal management (thermal\_mgmt/)
- Error reporting (errors/)



### **PARTIAL RECONFIGURATION (PR)**

- Accelerators reconfigured through partial reconfiguration.
- Other AFUs can keep running their workloads at the same time.
- Interface compatibility must be verified before starting PR.





#### VIRTUALIZATION



To enable access to an accelerator from within a virtual machine, AFU ports must be assigned to a VF.
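
As a rough sketch of the VF enablement step: the standard Linux PCI sysfs attribute for creating virtual functions is `sriov_numvfs` in the device's sysfs directory. It is an assumption here that the driver's "sysfs interface to enable/disable VFs" is this attribute, and that the port has first been released from the PF with FPGA\_FME\_PORT\_RELEASE. `sysfs_write_uint` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write an unsigned decimal value to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure. Hypothetical helper.
 */
int sysfs_write_uint(const char *path, unsigned int value)
{
    FILE *f;
    int ok;

    assert(path != NULL);

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%u\n", value) > 0;
    /* A rejected sysfs write may only surface when the stream is flushed. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Writing 1 to the FPGA PCIe device's `sriov_numvfs` would request one VF and writing 0 would remove the VFs again; the exact device path depends on the system and is left to the caller.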







#### **EXAMPLE: SIMPLE DMA OPERATION**

- Open AFU device file.
- Use AFU ioctl FPGA\_PORT\_GET\_REGION\_INFO to get AFU MMIO region information.
- Invoke AFU mmap to map AFU MMIO region for CSRs access.
- Allocate buffer and do DMA mapping via AFU ioctl FPGA\_PORT\_DMA\_MAP.
- Program the DMA address to related CSRs in mapped area.
- Start DMA by programming related CSRs in mapped area.
- Poll on CSRs for completion.
- Unmap DMA buffer via AFU ioctl FPGA\_PORT\_DMA\_UNMAP.
- Close AFU device file as task is done.
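
The steps above can be sketched in C as follows. This is an illustration under stated assumptions, not the driver's reference code: the ioctl numbers and structure layouts in the `#ifndef` block are placeholders so the sketch compiles without the driver's uapi header (use the real header in practice), and the region index, CSR offsets, and done bit are hypothetical because they depend on the accelerator.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef FPGA_PORT_DMA_MAP
/* ASSUMED placeholders; take the real definitions from the driver header. */
#define FPGA_MAGIC                 0xB6
#define FPGA_PORT_GET_REGION_INFO  _IO(FPGA_MAGIC, 0x42)
#define FPGA_PORT_DMA_MAP          _IO(FPGA_MAGIC, 0x43)
#define FPGA_PORT_DMA_UNMAP        _IO(FPGA_MAGIC, 0x44)
struct fpga_port_region_info {
    uint32_t argsz, flags, index, padding;
    uint64_t size, offset;
};
struct fpga_port_dma_map {
    uint32_t argsz, flags;
    uint64_t user_addr, length, iova;
};
struct fpga_port_dma_unmap {
    uint32_t argsz, flags;
    uint64_t iova;
};
#endif

/* HYPOTHETICAL accelerator-specific values. */
#define AFU_REGION_INDEX 0
#define CSR_DMA_ADDR     0x100
#define CSR_DMA_START    0x108
#define CSR_DMA_STATUS   0x110
#define DMA_DONE         0x1
#define POLL_LIMIT       1000000L

/* Returns 0 if the DMA completed, -1 on any failure or poll timeout. */
int simple_dma(const char *dev_path, size_t buf_len)
{
    struct fpga_port_region_info rinfo = { .argsz = sizeof(rinfo),
                                           .index = AFU_REGION_INDEX };
    struct fpga_port_dma_map map = { .argsz = sizeof(map) };
    struct fpga_port_dma_unmap unmap = { .argsz = sizeof(unmap) };
    volatile uint64_t *csr;
    void *mmio, *buf = NULL;
    long i, page = sysconf(_SC_PAGESIZE);
    int fd, ret = -1;

    assert(dev_path != NULL && buf_len > 0);

    fd = open(dev_path, O_RDWR);                    /* open AFU device file */
    if (fd < 0)
        return -1;

    /* Query the MMIO region and make sure it covers the CSRs used below. */
    if (ioctl(fd, FPGA_PORT_GET_REGION_INFO, &rinfo) < 0 ||
        rinfo.size < CSR_DMA_STATUS + sizeof(uint64_t))
        goto out_close;

    mmio = mmap(NULL, rinfo.size, PROT_READ | PROT_WRITE, MAP_SHARED,
                fd, (off_t)rinfo.offset);           /* map CSRs */
    if (mmio == MAP_FAILED)
        goto out_close;
    csr = mmio;

    /* Allocate a page-aligned buffer and ask the driver to map it for DMA. */
    if (page <= 0 || posix_memalign(&buf, (size_t)page, buf_len) != 0)
        goto out_munmap;
    map.user_addr = (uint64_t)(uintptr_t)buf;
    map.length = buf_len;
    if (ioctl(fd, FPGA_PORT_DMA_MAP, &map) < 0)
        goto out_free;

    csr[CSR_DMA_ADDR / 8] = map.iova;               /* program DMA address */
    csr[CSR_DMA_START / 8] = 1;                     /* start DMA */
    for (i = 0; i < POLL_LIMIT; i++) {              /* bounded poll */
        if (csr[CSR_DMA_STATUS / 8] & DMA_DONE) {
            ret = 0;
            break;
        }
    }

    unmap.iova = map.iova;                          /* unmap DMA buffer */
    if (ioctl(fd, FPGA_PORT_DMA_UNMAP, &unmap) < 0)
        ret = -1;
out_free:
    free(buf);
out_munmap:
    munmap(mmio, rinfo.size);
out_close:
    close(fd);                                      /* close AFU device file */
    return ret;
}
```

The buffer length should be a multiple of the page size, and the poll loop is bounded so that a stalled accelerator cannot hang the caller.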





### **UPSTREAMING STATUS**

- The first batch of patches has been submitted; version 2 is under review now.
  - Basic functions to enable AFU usage and Partial Reconfiguration.
    - Link: <u>https://marc.info/?l=linux-fpga&m=149844232609819&w=2</u>
  - Documentation/fpga/intel-fpga.txt:
    - <u>https://marc.info/?l=linux-fpga&m=149844234509825&w=2</u>
- More advanced features will be submitted in the next step.
  - SR-IOV support and other sub features for FME and PORT/AFU.



# START DEVELOPING FOR INTEL® FPGAS WITH OPAE TODAY

- Learn more about OPAE by visiting: <u>http://01.org/OPAE</u>
- Join the OPAE mailing list: <u>https://lists.01.org/pipermail/opae/</u>





## BACKUP



#### **FPGA API - SYSFS & ENUMERATION**







### **FPGA API – ENUMERATE, MANAGE & ACCESS**








#### **OPAE USERSPACE API ACCESS**

fpga\_result fpgaOpen(fpga\_token token, fpga\_handle \*handle, int flags);
fpga\_result fpgaClose(fpga\_handle handle);
fpga\_result fpgaReset(fpga\_handle handle);
fpga\_result fpgaPrepareBuffer(fpga\_handle handle, uint64\_t len, void \*\*buf\_addr, uint64\_t \*wsid, int flags);
fpga\_result fpgaReleaseBuffer(fpga\_handle handle, uint64\_t wsid);
fpga\_result fpgaGetIOVA(fpga\_handle handle, uint64\_t wsid, uint64\_t \*iova);

fpga\_result fpgaMapMMIO(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t \*\*mmio\_ptr);
fpga\_result fpgaUnmapMMIO(fpga\_handle handle, uint32\_t mmio\_num);
fpga\_result fpgaWriteMMIO32(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t offset, uint32\_t value);
fpga\_result fpgaReadMMIO32(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t offset, uint32\_t \*value);
fpga\_result fpgaReadMMIO64(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t offset, uint64\_t \*value);
fpga\_result fpgaWriteMMIO64(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t offset, uint64\_t value);
fpga\_result fpgaWriteMMIO(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t offset, uint64\_t value);
fpga\_result fpgaReadMMIO(fpga\_handle handle, uint32\_t mmio\_num, uint64\_t offset, uint64\_t value);

fpga\_result fpgaGetNumUmsg(fpga\_handle handle, uint64\_t \*value);
fpga\_result fpgaSetUmsgAttributes(fpga\_handle handle, uint64\_t value);
fpga\_result fpgaGetUmsgPtr(fpga\_handle handle, uint64\_t \*\*umsg\_ptr);








#### **Notices & Disclaimers**



Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at intel.com.

No computer system can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit <u>http://www.intel.com/performance</u>.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit <a href="http://www.intel.com/performance">http://www.intel.com/performance</a>.

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

© 2017 Intel Corporation. Intel, the Intel logo, Stratix, Arria, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. \*Other names and brands may be claimed as property of others.

