Need help quantizing the ISNet model

Hello. I learned about FURIOSA through the high-performance computing support program. (_ _ )

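Thanks to the introductory training you provided, I have been working through the quantization process with the materials as best I can, but I am struggling because I lack experience in this area.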

Target model

I currently run a service built on the DIS (ISNet) model.

Problem

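While trying to quantize this model by following HowToUseFuriosaSDKFromStartToFinish.ipynb, one of the examples provided with furiosa-sdk, I hit an error at furiosa.runtime.session.create(graph). Because I was following the example step by step without fully understanding it, I am having trouble pinpointing where the problem starts.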

Error message

[1/5] 🔍   Compiling from dfg to ldfg
thread '<unnamed>' panicked at 'attempt to subtract with overflow', /rustc/dc1d9d50fba2f6a1ccab8748a0050cde38253f60/library/core/src/ops/arith.rs:219:1
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Any { .. }', crates/nux/src/session/mod.rs:151:14

Code

I am sharing the steps I took as a Jupyter notebook.
Notebook execution results


litmus results (obtained while writing this post)

If I had used litmus from the start, I could have found the problem without writing all the code. Is there any chance this model can be supported as well?

$ export RUST_BACKTRACE=full
$ furiosa litmus isnet.onnx
libfuriosa_hal.so --- v0.11.0, built @ 43c901f
INFO:furiosa.common.native:loaded native library libfuriosa_compiler.so.0.9.0 (0.9.1 d91490fa8)
furiosa-quantizer 0.9.1 (rev. a240782) furiosa-litmus 0.9.1 (rev. a240782)
[Step 1] Checking if the model can be loaded and optimized ...
[Step 1] Passed
[Step 2] Checking if the model can be quantized ...
[Step 2] Passed
[Step 3] Checking if the model can be saved as a file ...
[Step 3] Passed
[Step 4] Checking if the model can be compiled for the NPU family [warboy-2pe] ...
[1/5] 🔍   Compiling from dfg to ldfg
▸▹▹▹▹ [1/3] Splitting graph(LAS)...
thread '<unnamed>' panicked at 'attempt to subtract with overflow', /rustc/dc1d9d50fba2f6a1ccab8748a0050cde38253f60/library/core/src/ops/arith.rs:219:1
stack backtrace:
   0:     0x7f2e7ba2ae1a - <unknown>
   1:     0x7f2e7ba5277e - <unknown>
   2:     0x7f2e7ba279a5 - <unknown>
   3:     0x7f2e7ba2abe5 - <unknown>
   4:     0x7f2e7ba2c59f - <unknown>
   5:     0x7f2e7ba2c2db - <unknown>
   6:     0x7f2e7ba2cca9 - <unknown>
   7:     0x7f2e7ba2ca02 - <unknown>
   8:     0x7f2e7ba2b2cc - <unknown>
   9:     0x7f2e7ba2c752 - <unknown>
  10:     0x7f2e79c7bd03 - <unknown>
  11:     0x7f2e79c7bd9d - <unknown>
  12:     0x7f2e7b52555d - <unknown>
  13:     0x7f2e7b3ef606 - <unknown>
  14:     0x7f2e7b3edbe4 - <unknown>
  15:     0x7f2e79d1e850 - <unknown>
  16:     0x7f2e79d13933 - <unknown>
  17:     0x7f2e7a5ac241 - <unknown>
  18:     0x7f2e7a3fe09f - <unknown>
  19:     0x7f2e7a25dadd - <unknown>
  20:     0x7f2e7a169f51 - <unknown>
  21:     0x7f2e7a28dabf - <unknown>
  22:     0x7f2e7a263237 - <unknown>
  23:     0x7f2e7a10e259 - <unknown>
  24:     0x7f2e7a262151 - <unknown>
  25:     0x7f2e7a260f04 - <unknown>
  26:     0x7f2e7a26062d - <unknown>
  27:     0x7f2e7a60c267 - <unknown>
  28:     0x7f2e7a60ab13 - <unknown>
  29:     0x7f2e79fd63a4 - <unknown>
  30:     0x7f2e79f6e0c7 - <unknown>
  31:     0x7f2e7a28c153 - <unknown>
  32:     0x7f2e7a28a5c8 - <unknown>
  33:     0x7f2e79c9eec0 - fc_compile
  34:     0x7f2e91561052 - ffi_call_unix64
  35:     0x7f2e9155f925 - ffi_call_int
  36:     0x7f2e9156006e - ffi_call
  37:     0x7f2e915711e7 - _call_function_pointer
                               at /usr/local/src/conda/python-3.10.12/Modules/_ctypes/callproc.c:916:17
  38:     0x7f2e915711e7 - _ctypes_callproc
                               at /usr/local/src/conda/python-3.10.12/Modules/_ctypes/callproc.c:1262:15
  39:     0x7f2e9157a23e - PyCFuncPtr_call
                               at /usr/local/src/conda/python-3.10.12/Modules/_ctypes/_ctypes.c:4221:14
  40:           0x4f6c5b - _PyObject_MakeTpCall
                               at /usr/local/src/conda/python-3.10.12/Objects/call.c:215:18
  41:           0x4f2d26 - _PyObject_VectorcallTstate
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:112:16
  42:           0x4f2d26 - _PyObject_VectorcallTstate
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:99:1
  43:           0x4f2d26 - PyObject_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:123:12
  44:           0x4f2d26 - call_function
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5893:13
  45:           0x4f2d26 - _PyEval_EvalFrameDefault
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:4181:23
  46:           0x4fd90f - _PyEval_EvalFrame
                               at /usr/local/src/conda/python-3.10.12/Include/internal/pycore_ceval.h:46:12
  47:           0x4fd90f - _PyEval_Vector
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5067:24
  48:           0x4fd90f - _PyFunction_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Objects/call.c:342:16
  49:           0x4eecf3 - _PyObject_VectorcallTstate
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:114:11
  50:           0x4eecf3 - PyObject_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:123:12
  51:           0x4eecf3 - call_function
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5893:13
  52:           0x4eecf3 - _PyEval_EvalFrameDefault
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:4231:19
  53:           0x4fd90f - _PyEval_EvalFrame
                               at /usr/local/src/conda/python-3.10.12/Include/internal/pycore_ceval.h:46:12
  54:           0x4fd90f - _PyEval_Vector
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5067:24
  55:           0x4fd90f - _PyFunction_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Objects/call.c:342:16
  56:           0x4eecf3 - _PyObject_VectorcallTstate
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:114:11
  57:           0x4eecf3 - PyObject_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:123:12
  58:           0x4eecf3 - call_function
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5893:13
  59:           0x4eecf3 - _PyEval_EvalFrameDefault
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:4231:19
  60:           0x4fd90f - _PyEval_EvalFrame
                               at /usr/local/src/conda/python-3.10.12/Include/internal/pycore_ceval.h:46:12
  61:           0x4fd90f - _PyEval_Vector
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5067:24
  62:           0x4fd90f - _PyFunction_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Objects/call.c:342:16
  63:           0x4edc5f - _PyObject_VectorcallTstate
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:114:11
  64:           0x4edc5f - PyObject_Vectorcall
                               at /usr/local/src/conda/python-3.10.12/Include/cpython/abstract.h:123:12
  65:           0x4edc5f - call_function
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5893:13
  66:           0x4edc5f - _PyEval_EvalFrameDefault
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:4213:19
  67:           0x595062 - _PyEval_EvalFrame
                               at /usr/local/src/conda/python-3.10.12/Include/internal/pycore_ceval.h:46:12
  68:           0x595062 - _PyEval_Vector
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:5067:24
  69:           0x594fa7 - PyEval_EvalCode
                               at /usr/local/src/conda/python-3.10.12/Python/ceval.c:1134:12
  70:           0x5c5e17 - run_eval_code_obj
                               at /usr/local/src/conda/python-3.10.12/Python/pythonrun.c:1291:9
  71:           0x5c0f60 - run_mod
                               at /usr/local/src/conda/python-3.10.12/Python/pythonrun.c:1312:19
  72:           0x4595b6 - pyrun_file
                               at /usr/local/src/conda/python-3.10.12/Python/pythonrun.c:1208:15
  73:           0x5bb4ef - _PyRun_SimpleFileObject
                               at /usr/local/src/conda/python-3.10.12/Python/pythonrun.c:456:13
  74:           0x5bb253 - _PyRun_AnyFileObject
                               at /usr/local/src/conda/python-3.10.12/Python/pythonrun.c:90:15
  75:           0x5b800d - pymain_run_file_obj
                               at /usr/local/src/conda/python-3.10.12/Modules/main.c:357:15
  76:           0x5b800d - pymain_run_file
                               at /usr/local/src/conda/python-3.10.12/Modules/main.c:376:15
  77:           0x5b800d - pymain_run_python
                               at /usr/local/src/conda/python-3.10.12/Modules/main.c:591:21
  78:           0x5b800d - Py_RunMain
                               at /usr/local/src/conda/python-3.10.12/Modules/main.c:670:5
  79:           0x588299 - Py_BytesMain
                               at /usr/local/src/conda/python-3.10.12/Modules/main.c:1090:12
  80:     0x7f2e91dc0083 - __libc_start_main
  81:           0x58814e - <unknown>
  82:                0x0 - <unknown>
fatal runtime error: failed to initiate panic, error 5
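For readers skimming a long litmus log like the one above, a small script can report which steps passed and which step was still running when the compiler panicked. This is only a sketch: the helper name `summarize_litmus_log` is hypothetical, and it assumes the `[Step N] Checking ... ...` / `[Step N] Passed` line format seen in this log (litmus 0.9.1), which may differ in other releases. The sample text is an abridged excerpt of the log above.

```python
import re

# Line formats assumed from the litmus 0.9.1 output shown in this thread.
STEP_START = re.compile(r"^\[Step (\d+)\] (Checking .*?) \.\.\.$")
STEP_PASSED = re.compile(r"^\[Step (\d+)\] Passed$")


def summarize_litmus_log(log_text):
    """Return (sorted passed step numbers, first started-but-not-passed step or None)."""
    started = {}
    passed = set()
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = STEP_START.match(line)
        if match:
            started[int(match.group(1))] = match.group(2)
            continue
        match = STEP_PASSED.match(line)
        if match:
            passed.add(int(match.group(1)))
    unfinished = [number for number in sorted(started) if number not in passed]
    first_unfinished = (unfinished[0], started[unfinished[0]]) if unfinished else None
    return sorted(passed), first_unfinished


# Abridged excerpt of the log above.
SAMPLE_LOG = """\
[Step 1] Checking if the model can be loaded and optimized ...
[Step 1] Passed
[Step 2] Checking if the model can be quantized ...
[Step 2] Passed
[Step 3] Checking if the model can be saved as a file ...
[Step 3] Passed
[Step 4] Checking if the model can be compiled for the NPU family [warboy-2pe] ...
thread '<unnamed>' panicked at 'attempt to subtract with overflow'
"""

passed_steps, failed_step = summarize_litmus_log(SAMPLE_LOG)
print("passed:", passed_steps)
print("did not finish:", failed_step)
```

In this case the summary points at Step 4 (compilation for warboy-2pe), which matches the panic raised while compiling from dfg to ldfg.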

Hello, this is Jeong Yeongbeom (정영범) from FuriosaAI.
We are looking into the issue you reported. If possible, could you share the quantized model (as an ONNX file)? If posting it publicly is a problem, please send it to dreameye98@furiosa.ai.

Could you also run the following two commands and share the output?
$ apt list --installed | grep furiosa

$ pip list | grep furiosa

Here is the output of the two commands.

$ apt list --installed | grep furiosa

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

furiosa-driver-warboy/now 1.9.0-2 amd64 [installed,local]
furiosa-libcompiler/now 0.9.1-2 amd64 [installed,local]
furiosa-libhal-warboy/now 0.11.0-2 amd64 [installed,local]
furiosa-libnux/now 0.9.1-2 amd64 [installed,local]
furiosa-toolkit/now 0.10.2-2 amd64 [installed,local]
$ pip list | grep furiosa
furiosa-cli                              0.9.1
furiosa-common                           0.9.1
furiosa-litmus                           0.9.1
furiosa-models                           0.9.1
furiosa-native-postprocess               0.9.0
furiosa-optimizer                        0.9.1
furiosa-quantizer                        0.9.1
furiosa-quantizer-impl                   0.9.2
furiosa-registry                         0.9.1
furiosa-runtime                          0.9.1
furiosa-sdk                              0.9.2
furiosa-server                           0.9.2
furiosa-serving                          0.9.1
furiosa-tools                            0.9.1

[notice] A new release of pip is available: 23.1.2 -> 23.2.1
[notice] To update, run: pip install --upgrade pip
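The Python-side half of this check can also be done from inside the same interpreter or notebook kernel that runs the SDK, which avoids picking up a different `pip`. The sketch below uses only the standard library (`importlib.metadata`); the helper names `installed_packages` and `group_by_version` are hypothetical, and the sample data is a subset of the `pip list` output above. Grouping by version just makes differences easy to see; it does not by itself mean that a mixed set of versions is wrong.

```python
from importlib import metadata


def installed_packages(prefix="furiosa"):
    """Return sorted (name, version) pairs for installed distributions whose name starts with prefix."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith(prefix):
            found.add((name, dist.version))
    return sorted(found)


def group_by_version(packages):
    """Group package names by version string, e.g. {'0.9.1': ['furiosa-runtime', ...]}."""
    groups = {}
    for name, version in packages:
        groups.setdefault(version, []).append(name)
    return groups


# A subset of the pip output above, used as sample data.
SAMPLE = [
    ("furiosa-native-postprocess", "0.9.0"),
    ("furiosa-quantizer", "0.9.1"),
    ("furiosa-runtime", "0.9.1"),
    ("furiosa-sdk", "0.9.2"),
]

for version, names in sorted(group_by_version(SAMPLE).items()):
    print(version, names)

# On a machine with the SDK installed, this lists the real packages
# (it prints an empty list where none are installed).
print(installed_packages())
```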

I don't mind making it public! Here is the quantized model. (I hope it helps someone else too...)

isnet.onnx


Hello Seonghwan, thank you for your patience.

Please use NEAREST instead of LINEAR as the resize_mode of the Resize operators. The LINEAR mode has a bug, which we plan to fix in a future release.

Thank you.


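Hello Yeongbeom. I tried it the same day, but it did not quite work, and my reply is late because I kept trying other things.

If I understood correctly, should I pass InterpolationMode.NEAREST as the interpolation option of the Resize step in the preprocessing transforms?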

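from torchvision import transforms
from torchvision.transforms import InterpolationMode

# Before: Resize with the default interpolation (bilinear)
preprocess = transforms.Compose(
    [
        transforms.Resize((1024, 1024)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)

# After: Resize with nearest-neighbor interpolation
preprocess = transforms.Compose(
    [
        transforms.Resize((1024, 1024), interpolation=InterpolationMode.NEAREST),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)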

After making this change and completing Calibrate & Quantize, the following error occurs at furiosa.runtime.session.create(graph) during the inference test.

File ~/miniconda3/lib/python3.10/site-packages/furiosa/runtime/session.py:142, in Session.__init__(self, model, device, worker_num, batch_size, compiler_hints, compiler_config)
    140 if is_err(err):
    141     dump_info(log_path)
--> 142     raise into_exception(err)
    144 self.ref = sess
    145 self._as_parameter_ = self.ref

InternalError: unknown (native error code: 15)

I was referring to the Resize operators in the model graph.
Furiosa SDK 0.10.0 is scheduled for release this week. After the release, please try running the litmus command on the original FP32 model in ONNX format.
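Since the suggestion concerns Resize operators inside the model graph rather than the image preprocessing, one possible way to try it is to rewrite the `mode` attribute of those nodes in the exported ONNX file. The following is a rough sketch under assumptions, not a procedure confirmed in this thread: `set_resize_mode` is a hypothetical helper, it assumes the standard ONNX Resize `mode` attribute (a bytes-valued string such as `b"linear"`), and it leaves every other attribute untouched. Switching from linear to nearest changes the numerical output of the model, so accuracy should be re-checked; re-exporting the model from the training framework with nearest interpolation is an alternative. The demonstration uses a stand-in graph so that it runs without extra packages.

```python
from types import SimpleNamespace


def set_resize_mode(graph, new_mode="nearest", old_mode="linear"):
    """Rewrite `mode` on Resize nodes from old_mode to new_mode; return the names of changed nodes."""
    changed = []
    for node in graph.node:
        if node.op_type != "Resize":
            continue
        for attr in node.attribute:
            if attr.name == "mode" and attr.s == old_mode.encode("utf-8"):
                attr.s = new_mode.encode("utf-8")  # string attributes are stored as bytes
                changed.append(node.name)
    return changed


# Stand-in graph mimicking ONNX GraphProto fields; node names are made up.
linear_attr = SimpleNamespace(name="mode", s=b"linear")
sample_graph = SimpleNamespace(node=[
    SimpleNamespace(name="resize_a", op_type="Resize", attribute=[linear_attr]),
    SimpleNamespace(name="resize_b", op_type="Resize",
                    attribute=[SimpleNamespace(name="mode", s=b"nearest")]),
    SimpleNamespace(name="relu_0", op_type="Relu", attribute=[]),
])

print("changed:", set_resize_mode(sample_graph))
print("resize_a mode is now:", linear_attr.s)
```

With the `onnx` package installed, the same function could be applied to `model = onnx.load(path)` via `set_resize_mode(model.graph)`, followed by `onnx.checker.check_model(model)` and `onnx.save(model, new_path)` before running litmus on the result.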


Oh, thank you for replying so late at night!

I am looking forward to version 0.10.0 and rooting for you (_ _ )
