AIBOX-1684X: problem running Llama3
0. Running uname -v on the AIBOX reports: #3 SMP Tue Apr 23 10:23:57 CST 2024
1. Following the "Quick Deploy Llama3" guide (https://wiki.t-firefly.com/AIBOX-1684X/quick-llama3.html), deployment completed and the model ran normally. The model used was llama3-8b_int4_1dev_256.bmodel.
2. Because prompt engineering required a longer context, I switched to a newly compiled model, llama3-8b_int8_1dev_4096.bmodel, and got the following error:
Device [ 0 ] loading ....
INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init
INFO:Profile For arch=3
INFO:gdma=0, tiu=0, mcu=0
Model loading ....
INFO:Loading bmodel from . Thanks for your patience...
INFO:Bmodel loaded, version 2.2+v1.8.beta.0-89-g32b7f39b8-20240612
INFO:pre net num: 0, load net num: 69
INFO:loading firmare in bmodel
INFO: core_id=0, multi_fullnet_func_id=22
INFO: core_id=0, dynamic_fullnet_func_id=23
bm_alloc_gmem failed, dev_id = 0, size = 0xd05a000
BM_CHECK_RET fail /workspace/libsophon/bmlib/src/bmlib_memory.cpp: sg_malloc_device_byte_heap_mask: 729
FATAL:coeff alloc failed, size
python3: /home/linaro/llama3/Llama3/python_demo/chat.cpp:128: void Llama3::init(const std::vector<int>&, std::string): Assertion `true == ret' failed.
(Note: the log shows the int4 file name because the demo program hardcodes the model path, so llama3-8b_int8_1dev_4096.bmodel was renamed to llama3-8b_int4_1dev_256.bmodel; the model actually being loaded is llama3-8b_int8_1dev_4096.bmodel.)
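From the log, the failure is a device memory allocation: bm_alloc_gmem could not obtain 0xd05a000 bytes (about 218 MB) from the NPU heap, so the coefficient (weight) allocation aborted before the model finished loading. As a quick check that the heap, and not the model file, is the problem, the sketch below attempts the same allocation directly through bmlib. This is a minimal sketch, assuming libsophon's bmlib_runtime.h is installed (compile and link with -lbmlib); it is not part of the official demo.

// probe_gmem.cpp: try the same allocation size that failed in the log.
// Minimal sketch; assumes libsophon is installed (link with -lbmlib).
#include <cstdio>
#include "bmlib_runtime.h"

int main() {
    bm_handle_t handle;
    if (bm_dev_request(&handle, 0) != BM_SUCCESS) {  // device 0, as in the log
        printf("bm_dev_request failed\n");
        return 1;
    }
    bm_device_mem_t mem;
    unsigned int size = 0xd05a000;                   // ~218 MB, size from the log
    bm_status_t ret = bm_malloc_device_byte(handle, &mem, size);
    if (ret == BM_SUCCESS) {
        printf("allocation of %u bytes succeeded\n", size);
        bm_free_device(handle, mem);
    } else {
        printf("allocation of %u bytes failed (heap exhausted?)\n", size);
    }
    bm_dev_free(handle);
    return ret == BM_SUCCESS ? 0 : 1;
}

If this probe also fails while nothing else is running, the NPU heap itself is too small for the int8 weights. On SoC devices the memory split can usually be inspected with bm-smi; whether it can be enlarged on this image is something the Firefly/Sophon docs would have to confirm.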
3. Does the AIBOX-1684X support llama3-8b_int8_1dev_4096.bmodel? If not, does it at least support an int4 model with a 4096 context (llama3-8b_int4_1dev_4096.bmodel)?
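For reference, a back-of-the-envelope estimate shows why the int8 4096-context model is much heavier than the int4 256-context one. The sketch below computes a lower bound from the published Llama-3-8B dimensions (32 layers, 8 KV heads of dim 128 under GQA, about 8.03B parameters); the actual bmodel adds runtime buffers on top, so these numbers are an assumption-based floor, not the official requirement.

// llama3_mem_estimate.cpp: rough device-memory lower bound for Llama-3-8B.
// Architecture numbers taken from the public model card; bmodel runtime
// buffers are extra and not counted here.
#include <cstdio>

int main() {
    const double params   = 8.03e9;  // Llama-3-8B parameter count
    const int    layers   = 32;
    const int    kv_heads = 8;       // GQA: 8 KV heads
    const int    head_dim = 128;
    const int    seq_len  = 4096;    // target context length
    const int    kv_bytes = 2;       // fp16 KV cache

    double w_int8 = params * 1.0;    // int8: 1 byte per weight
    double w_int4 = params * 0.5;    // int4: 0.5 byte per weight
    double kv = 2.0 * layers * kv_heads * head_dim * seq_len * kv_bytes; // K and V

    printf("int8 weights : %.2f GB\n", w_int8 / 1e9);
    printf("int4 weights : %.2f GB\n", w_int4 / 1e9);
    printf("KV cache @%d : %.2f GB\n", seq_len, kv / 1e9);
    return 0;
}

By this estimate the int8 4096 model needs roughly 8.5 GB of NPU memory plus runtime buffers, versus roughly 4.5 GB for an int4 4096 model; whether either fits depends on how much gmem the AIBOX image reserves for the NPU.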