[Important, Must Read] FuriosaAI FAQ

dreameye98 · December 17, 2023, 3:00pm

Warboy Target Models

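Warboy supports inference only. Training is not supported.
Because Warboy supports only INT8 operations, model quantization is mandatory.
It varies by model, but Warboy is most efficient with input sizes from 512x512 to 768x768. Larger inputs are best handled by tiling: split the large input into smaller tiles, run inference on each tile, and merge the results.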
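
The tiling approach described above can be sketched as follows. This is a minimal illustration, not Furiosa SDK code: `infer_tile` is a hypothetical placeholder for the actual NPU inference call, and the sketch assumes a model with per-pixel output (for example segmentation) and image dimensions that are multiples of the tile size. Other model types, such as detectors, would need a different merge step.

```python
import numpy as np


def infer_tile(tile: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for one NPU inference call on a single tile."""
    return tile * 2.0


def tiled_inference(image: np.ndarray, tile: int = 512) -> np.ndarray:
    """Run infer_tile on non-overlapping tiles of an (H, W, C) image and stitch the results."""
    height, width = image.shape[:2]
    output = np.empty_like(image, dtype=np.float32)
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            patch = image[top:top + tile, left:left + tile]
            # Write each tile's result back to the position it came from.
            output[top:top + tile, left:left + tile] = infer_tile(patch)
    return output


# A 1024x1536 input is processed as a 2x3 grid of 512x512 tiles.
image = np.random.rand(1024, 1536, 3).astype(np.float32)
result = tiled_inference(image)
```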

Warboy Supported Operators

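Warboy is specialized for accelerating CNN-family models. You can view the list of accelerated operators.
Transformer operations are not supported.
Resize is not accelerated.
Softmax is best removed from the model or handled in post-processing.
Concat along the channel axis can affect accuracy.
Operators at the start or end of the model that are not accelerated are best moved into pre-processing or post-processing, respectively.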
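
As an example of moving an operator into post-processing, Softmax can be applied on the host with NumPy. This sketch assumes the model was exported without its final Softmax, so the NPU returns raw logits; the `logits` array here is made-up data standing in for that output.

```python
import numpy as np


def softmax(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax applied on the host as a post-processing step."""
    # Subtracting the maximum avoids overflow in exp without changing the result.
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


# Made-up logits standing in for the raw model output.
logits = np.array([[2.0, 1.0, 0.1]], dtype=np.float32)
probs = softmax(logits)
```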

Accuracy Drop

Depending on the model, quantizing an FP32 model and running it on Warboy can reduce accuracy.
- Try the calibration methods described in the documentation and pick the one that gives the highest accuracy.
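
The idea of trying several calibration methods and keeping the best one can be sketched in plain NumPy. This does not use the Furiosa quantizer API: min-max and percentile are generic examples of how a calibration method picks the INT8 range, all function names are hypothetical, and reconstruction error is only a stand-in for the real criterion, which would be the quantized model's accuracy on a validation set.

```python
import numpy as np


def quantize_dequantize(x: np.ndarray, clip: float) -> np.ndarray:
    """Simulate symmetric INT8 quantization over the range [-clip, clip]."""
    scale = clip / 127.0
    q = np.clip(np.round(x / scale), -127, 127)
    return (q * scale).astype(np.float32)


def minmax_clip(samples: np.ndarray) -> float:
    """Range covers every calibration sample."""
    return float(np.max(np.abs(samples)))


def percentile_clip(samples: np.ndarray, pct: float = 99.9) -> float:
    """Range ignores the most extreme samples."""
    return float(np.percentile(np.abs(samples), pct))


rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, 10_000).astype(np.float32)

# Each candidate method yields a different clipping range.
candidates = {
    "min_max": minmax_clip(samples),
    "percentile": percentile_clip(samples),
}

# Stand-in score: mean squared reconstruction error (lower is better).
errors = {
    name: float(np.mean((samples - quantize_dequantize(samples, clip)) ** 2))
    for name, clip in candidates.items()
}
best = min(errors, key=errors.get)
```

Which method wins depends on the data and the scoring criterion, which is why comparing several is worthwhile.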

Compile Error

Insufficient Instruction Memory
- Cause: the model contains so many operators that the compiled binary is larger than the instruction memory.
  - Instruction memory size: 256KB

Solution
- Add the use_program_loading option to the compiler config so that instructions can be loaded dynamically.
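```
from furiosa.runtime import session

# Enable dynamic loading of instructions.
compiler_config = {
    "use_program_loading": True
}
# quantized_model_path: path to your quantized model file
sess = session.create(
    str(quantized_model_path),
    compiler_config=compiler_config,
)
```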
- This option may increase inference time.

Runtime Error

Incompatible configuration

Cause: the SDK version used to build the binary differs from the SDK version used to run it.

libfuriosa_hal.so --- v2.0, built @ 9928508
Loading and compiling the model F.enf
Wrtting profiler output into /home/ubuntu/ds/tracing.json. Profiler API profile() disabled
Saving the compilation log into /home/ubuntu/.local/state/furiosa/logs/compile-20230211152218-zpq6hg.log
Using furiosa-compiler 0.8.0 (rev: d9f0d7728 built at 2022-11-01T03:39:34Z)
2023-02-11T06:22:18.727642Z  INFO nux::npu: Npu (npu0pe0-1) is being initialized
2023-02-11T06:22:18.730473Z  INFO nux: NuxInner create with pes: [PeId(0)]
2023-02-11T06:22:18.730497Z  INFO nux: Profiler enabled
2023-02-11T06:22:18.931755Z ERROR nux: ENF configuration is not compatible with NPU.
2023-02-11T06:22:18.936102Z  INFO nux::npu: NPU (npu0pe0-1) has been destroyed
2023-02-11T06:22:18.937053Z ERROR nux::capi: internal error: incompatible configuration
==========================================================================================================================================================================================================
Information Dump
==========================================================================================================================================================================================================
- Python version: 3.8.10 (default, Nov 14 2022, 12:59:47)  [GCC 9.4.0]
- furiosa-libnux path: libnux.so
- furiosa-libnux version: 0.8.0 (rev: d9f0d7728 built at 2022-11-01T03:39:34Z)
- furiosa-compiler version: 0.8.0 (rev: d9f0d7728 built at 2022-11-01T03:39:34Z)
- furiosa-runtime version: 0.8.2-release (rev: e97bb1ee)
 
Please check the compiler log at /home/ubuntu/.local/state/furiosa/logs/compile-20230211152218-zpq6hg.log.
If you have a problem, please report the log file to https://furiosa-ai.atlassian.net/servicedesk/customer/portals
with the information dumped above.
==========================================================================================================================================================================================================
Traceback (most recent call last):
  File "/home/ubuntu/furiosa-sdk/examples/inferences/random_input_inference.py", line 57, in <module>
    random_input_inference(model_path, num_inf=num_inf)
  File "/home/ubuntu/furiosa-sdk/examples/inferences/random_input_inference.py", line 20, in random_input_inference
    with session.create(str(model_path)) as sess:
  File "/home/ubuntu/.local/lib/python3.8/site-packages/furiosa/runtime/session.py", line 473, in create
    return Session(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/furiosa/runtime/session.py", line 142, in __init__
    raise into_exception(err)
furiosa.runtime.errors.InternalError: unknown (native error code: 15)

Solution
- Update the SDK to the latest version, then regenerate the binary and run it again.

Performance Issue

Model inference takes longer than expected
- Cause
  - The model uses an operator that the NPU cannot execute, so that operator runs on the CPU.

Solution
- If the operator that the NPU cannot execute is at the start or end of the model, cut it out of the model and run it in separate code.

Numerical Semantics

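Output values differ at the bit level after converting to ONNX
- This can be explained with the term Numerical Semantics (NS). NS refers to the mathematical meaning (semantics) of an algorithm. It is used in statements such as "TensorRT optimizes as much as possible under the condition that NS is preserved." The Nvidia TensorRT user guide also uses the term to describe this behavior.
- A typical case where results differ is the accumulation order in FMA. When partial sums have a finite number of bits, the FMA result inevitably varies with the accumulation order. As long as the difference stays within a range that does not harm what the deep learning algorithm intends to compute, the results are not bit-exact, but NS is said to be preserved.
- This can show up when comparing Torch and ONNX Runtime results. ONNX Runtime does not compile the model, but the Torch implementation includes various linear algebra algorithms and optimizations for speed. As a result, Torch too only preserves NS and does not always guarantee the same FMA accumulation order.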
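
The effect of accumulation order can be seen with a few floating-point additions. This sketch uses plain Python float addition rather than hardware FMA, but the principle is the same: with finite precision, changing the order of accumulation changes the low bits of the result while the value stays the same for practical purposes.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# Same three numbers, two accumulation orders.
left_to_right = (a + b) + c
right_to_left = a + (b + c)

# Bit-exact comparison fails, tolerance-based comparison succeeds.
bit_exact = left_to_right == right_to_left
within_tolerance = math.isclose(left_to_right, right_to_left, rel_tol=1e-9)

print(bit_exact, within_tolerance)  # prints: False True
```

For the same reason, outputs from two frameworks are better compared with a tolerance (for example `numpy.allclose`) than with exact equality.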

dreameye98 · December 18, 2023, 3:22am

After quantizing an ONNX file, how do I convert it to an enf file so that I can later just open a session and use it?

Use the command below.

furiosa compile foo.onnx -o foo.enf

For details, see the documentation below.
https://furiosa-ai.github.io/docs/latest/ko/software/compiler.html

dreameye98 · December 18, 2023, 3:22am

When two or more NPUs are installed, can I run on NPU 0 and NPU 1 separately by specifying each one?

Yes, you can run on different NPUs by specifying the device as shown below.
sess = session.create('model.enf', device="npu0pe0-1")
sess = session.create('model.enf', device="npu1pe0-1")

The code linked below covers this in detail.

https://github.com/furiosa-ai/furiosa-sdk/blob/v0.9.0/examples/notebooks/AdvancedTopicsInInferenceAPIs.ipynb

