Model Zoo batch inference error

jwson · July 9, 2024, 12:25am

I'm asking because batch inference with the YOLO pose model linked above is not working.

When I pass the argument like this:
`with create_runner(yolo_pose.model_source(), batch_size=4) as runner:`
I get the runtime error below.

```
{
	"name": "FuriosaRuntimeError",
	"message": "runtime error: Compilation error: Bad user command",
	"stack": "---------------------------------------------------------------------------
FuriosaRuntimeError                       Traceback (most recent call last)
Cell In[14], line 3
      1 yolo_pose = YOLOv7w6Pose()
----> 3 with create_runner(yolo_pose.model_source(), batch_size=4) as runner:
      4     pass

FuriosaRuntimeError: runtime error: Compilation error: Bad user command"
}
```

If I leave out the batch_size option and instead stack the inputs into a single [4, 3, 384, 640] tensor, I get a ValueError:

```bash
ValueError: 4 input tensor(s) are given, expected 1
```

What do I need to do to run batch inference? Do I have to compile a new .enf file?

luxroot · July 11, 2024, 2:46am

Hello, this is Myeong-geun Shin (신명근), an engineer at FuriosaAI.

The models currently provided are compiled for batch size 1 only, so any other batch size requires compiling the model again. For details, please see the compilation and optimization documentation.
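To illustrate the mismatch: a model compiled for batch size 1 expects a leading dimension of 1, so a [4, 3, 384, 640] array does not fit it. Until a batch-4 model has been compiled, one possible fallback is to split the batch and run the samples one at a time. The sketch below uses only NumPy. `run_one` is a hypothetical stand-in for an inference call on the batch-1 model and is not part of any Furiosa API; the expected shape is taken from the question.

```python
import numpy as np

# Input shape a batch-size-1 model would expect, based on the shape in the question.
EXPECTED_SHAPE = (1, 3, 384, 640)


def split_into_single_batches(batch):
    """Split an [N, C, H, W] array into N arrays of shape [1, C, H, W]."""
    return np.split(batch, batch.shape[0], axis=0)


def run_one_by_one(batch, run_one):
    """Feed each sample to `run_one` separately and collect the results.

    `run_one` is a hypothetical callable standing in for inference on the
    batch-1 model.
    """
    results = []
    for sample in split_into_single_batches(batch):
        if sample.shape != EXPECTED_SHAPE:
            raise ValueError(f"expected {EXPECTED_SHAPE}, got {sample.shape}")
        results.append(run_one(sample))
    return results


if __name__ == "__main__":
    batch = np.zeros((4, 3, 384, 640), dtype=np.uint8)
    # Dummy stand-in for the model: report the shape each call received.
    print(run_one_by_one(batch, lambda x: x.shape))
```

This keeps the batch-1 model usable but runs four separate inferences, so it gives none of the benefit a model compiled for batch size 4 might offer.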

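Below is a 4-batch inference example I wrote for the YOLO pose model. Please note that compiling the YOLO pose model can take about 30 minutes.
Also, a larger batch size is not necessarily faster, so I recommend trying several batch sizes and measuring the performance of each.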

Thank you.

```python
from copy import deepcopy
from time import perf_counter

import numpy as np
import onnx

from furiosa.models import vision
from furiosa.quantizer import ModelEditor, TensorType, get_pure_input_names, quantize
from furiosa.runtime.sync import create_runner

# Load the original float32 ONNX model that ships with the Model Zoo entry.
model = vision.YOLOv7w6Pose()
f32_onnx_model = onnx.load_from_string(model.origin)

# Work on a copy and convert every model input to uint8.
model_wo_input_quantize = deepcopy(f32_onnx_model)
editor = ModelEditor(model_wo_input_quantize)
for input_name in get_pure_input_names(model_wo_input_quantize):
    editor.convert_input_type(input_name, TensorType.UINT8)

# Quantize with the tensor ranges provided by the model.
quantized_onnx_wo_input_quantize = quantize(model_wo_input_quantize, model.tensor_name_to_range)

# Creating the runner compiles the model for batch size 4 (this can take about 30 minutes).
with create_runner(
    quantized_onnx_wo_input_quantize,
    batch_size=4,
    compiler_config={"lower_tabulated_dequantize": True},
) as runner:
    input_tensor_desc = runner.model.inputs()
    runner.model.print_summary()
    # Random uint8 inputs matching each input tensor's shape and dtype.
    fake_input = [
        np.random.randint(256, size=desc.shape, dtype=desc.dtype.numpy)
        for desc in input_tensor_desc
    ]
    # Time 100 batched runs and report the average time per run.
    starting_time = perf_counter()
    for _ in range(100):
        runner.run(fake_input)
    print("Average inference time:", (perf_counter() - starting_time) / 100, "s")
```
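The loop above reports the average time per run, and each run covers a batch of 4 images. To compare batch sizes fairly, it helps to convert that figure into images per second. The following is a minimal sketch, not tied to the Furiosa SDK: `time_batches` and `summarize` are hypothetical helper names, and the workload in the demo is a dummy stand-in for a real inference call.

```python
from time import perf_counter


def time_batches(run_batch, iterations):
    """Return total wall-clock seconds for `iterations` calls of `run_batch`."""
    start = perf_counter()
    for _ in range(iterations):
        run_batch()
    return perf_counter() - start


def summarize(total_seconds, iterations, batch_size):
    """Convert a total time into per-batch latency and images per second."""
    latency_per_batch = total_seconds / iterations
    images_per_second = batch_size * iterations / total_seconds
    return latency_per_batch, images_per_second


if __name__ == "__main__":
    iterations = 50
    for batch_size in (1, 2, 4):
        # Dummy workload that scales with batch size; replace it with a call
        # to a runner compiled for this batch size.
        total = time_batches(lambda: sum(range(10_000 * batch_size)), iterations)
        latency, throughput = summarize(total, iterations, batch_size)
        print(f"batch={batch_size} latency={latency:.6f}s throughput={throughput:.1f} img/s")
```

A larger batch is only worthwhile if images per second goes up; per-batch latency alone will almost always look worse as the batch grows.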
