[Code]

###########################################
# 2-1. Zero-shot evaluation (English)
## About hellaswag, copa, boolq, mmlu

!lm_eval --model hf \
    --model_args pretrained=[...Custom_LLM...] \
    --tasks hellaswag,copa,boolq,mmlu \
    --device cuda:0 \
    --batch_size 8 \
    --num_fewshot 0
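
For reference, the same run can also be launched from Python instead of the shell. A minimal sketch, assuming the simple_evaluate API of recent lm-evaluation-harness releases (v0.4+); the pretrained id is the same placeholder used in the CLI call above.

# Minimal sketch: the same zero-shot evaluation via the Python API of
# lm-evaluation-harness (v0.4+). [...Custom_LLM...] is the placeholder
# from the CLI call above, not a real model id.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=[...Custom_LLM...]",
    tasks=["hellaswag", "copa", "boolq", "mmlu"],
    num_fewshot=0,
    batch_size=8,
    device="cuda:0",
)
print(results["results"])  # per-task metrics, the same numbers the CLI prints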

 

 

[Results]

hf (pretrained=cashbook/SOLAR-Platypus-10.7B-v1-kjw), gen_kwargs: (None), limit: None, num_fewshot: 0, batch_size: 8
|                 Tasks                 |Version|Filter|n-shot| Metric |Value |   |Stderr|
|---------------------------------------|-------|------|-----:|--------|-----:|---|-----:|
|mmlu                                   |N/A    |none  |     0|acc     |0.6304|±  |0.0038|
| - humanities                          |N/A    |none  |     0|acc     |0.5626|±  |0.0066|
|  - formal_logic                       |      0|none  |     0|acc     |0.3413|±  |0.0424|
|  - high_school_european_history       |      0|none  |     0|acc     |0.7818|±  |0.0323|
|  - high_school_us_history             |      0|none  |     0|acc     |0.8284|±  |0.0265|
|  - high_school_world_history          |      0|none  |     0|acc     |0.8101|±  |0.0255|
|  - international_law                  |      0|none  |     0|acc     |0.8099|±  |0.0358|
|  - jurisprudence                      |      0|none  |     0|acc     |0.7407|±  |0.0424|
|  - logical_fallacies                  |      0|none  |     0|acc     |0.7607|±  |0.0335|
|  - moral_disputes                     |      0|none  |     0|acc     |0.7312|±  |0.0239|
|  - moral_scenarios                    |      0|none  |     0|acc     |0.2413|±  |0.0143|
|  - philosophy                         |      0|none  |     0|acc     |0.7074|±  |0.0258|
|  - prehistory                         |      0|none  |     0|acc     |0.7500|±  |0.0241|
|  - professional_law                   |      0|none  |     0|acc     |0.4831|±  |0.0128|
|  - world_religions                    |      0|none  |     0|acc     |0.8129|±  |0.0299|
| - other                               |N/A    |none  |     0|acc     |0.7219|±  |0.0077|
|  - business_ethics                    |      0|none  |     0|acc     |0.7000|±  |0.0461|
|  - clinical_knowledge                 |      0|none  |     0|acc     |0.6981|±  |0.0283|
|  - college_medicine                   |      0|none  |     0|acc     |0.6474|±  |0.0364|
|  - global_facts                       |      0|none  |     0|acc     |0.3600|±  |0.0482|
|  - human_aging                        |      0|none  |     0|acc     |0.7175|±  |0.0302|
|  - management                         |      0|none  |     0|acc     |0.7961|±  |0.0399|
|  - marketing                          |      0|none  |     0|acc     |0.8932|±  |0.0202|
|  - medical_genetics                   |      0|none  |     0|acc     |0.7800|±  |0.0416|
|  - miscellaneous                      |      0|none  |     0|acc     |0.8340|±  |0.0133|
|  - nutrition                          |      0|none  |     0|acc     |0.7516|±  |0.0247|
|  - professional_accounting            |      0|none  |     0|acc     |0.5319|±  |0.0298|
|  - professional_medicine              |      0|none  |     0|acc     |0.7022|±  |0.0278|
|  - virology                           |      0|none  |     0|acc     |0.5241|±  |0.0389|
| - social_sciences                     |N/A    |none  |     0|acc     |0.7423|±  |0.0077|
|  - econometrics                       |      0|none  |     0|acc     |0.4737|±  |0.0470|
|  - high_school_geography              |      0|none  |     0|acc     |0.8131|±  |0.0278|
|  - high_school_government_and_politics|      0|none  |     0|acc     |0.8756|±  |0.0238|
|  - high_school_macroeconomics         |      0|none  |     0|acc     |0.6308|±  |0.0245|
|  - high_school_microeconomics         |      0|none  |     0|acc     |0.7269|±  |0.0289|
|  - high_school_psychology             |      0|none  |     0|acc     |0.8367|±  |0.0158|
|  - human_sexuality                    |      0|none  |     0|acc     |0.7786|±  |0.0364|
|  - professional_psychology            |      0|none  |     0|acc     |0.6667|±  |0.0191|
|  - public_relations                   |      0|none  |     0|acc     |0.7000|±  |0.0439|
|  - security_studies                   |      0|none  |     0|acc     |0.7388|±  |0.0281|
|  - sociology                          |      0|none  |     0|acc     |0.8507|±  |0.0252|
|  - us_foreign_policy                  |      0|none  |     0|acc     |0.8600|±  |0.0349|
| - stem                                |N/A    |none  |     0|acc     |0.5322|±  |0.0086|
|  - abstract_algebra                   |      0|none  |     0|acc     |0.3300|±  |0.0473|
|  - anatomy                            |      0|none  |     0|acc     |0.5926|±  |0.0424|
|  - astronomy                          |      0|none  |     0|acc     |0.7039|±  |0.0372|
|  - college_biology                    |      0|none  |     0|acc     |0.7708|±  |0.0351|
|  - college_chemistry                  |      0|none  |     0|acc     |0.4100|±  |0.0494|
|  - college_computer_science           |      0|none  |     0|acc     |0.5400|±  |0.0501|
|  - college_mathematics                |      0|none  |     0|acc     |0.3800|±  |0.0488|
|  - college_physics                    |      0|none  |     0|acc     |0.4118|±  |0.0490|
|  - computer_security                  |      0|none  |     0|acc     |0.7300|±  |0.0446|
|  - conceptual_physics                 |      0|none  |     0|acc     |0.5362|±  |0.0326|
|  - electrical_engineering             |      0|none  |     0|acc     |0.5655|±  |0.0413|
|  - elementary_mathematics             |      0|none  |     0|acc     |0.4339|±  |0.0255|
|  - high_school_biology                |      0|none  |     0|acc     |0.7742|±  |0.0238|
|  - high_school_chemistry              |      0|none  |     0|acc     |0.4975|±  |0.0352|
|  - high_school_computer_science       |      0|none  |     0|acc     |0.6200|±  |0.0488|
|  - high_school_mathematics            |      0|none  |     0|acc     |0.3593|±  |0.0293|
|  - high_school_physics                |      0|none  |     0|acc     |0.3974|±  |0.0400|
|  - high_school_statistics             |      0|none  |     0|acc     |0.5509|±  |0.0339|
|  - machine_learning                   |      0|none  |     0|acc     |0.4286|±  |0.0470|
|hellaswag                              |      1|none  |     0|acc     |0.6396|±  |0.0048|
|                                       |       |none  |     0|acc_norm|0.8310|±  |0.0037|
|copa                                   |      1|none  |     0|acc     |0.8700|±  |0.0338|
|boolq                                  |      2|none  |     0|acc     |0.8260|±  |0.0066|

|      Groups      |Version|Filter|n-shot|Metric|Value |   |Stderr|
|------------------|-------|------|-----:|------|-----:|---|-----:|
|mmlu              |N/A    |none  |     0|acc   |0.6304|±  |0.0038|
| - humanities     |N/A    |none  |     0|acc   |0.5626|±  |0.0066|
| - other          |N/A    |none  |     0|acc   |0.7219|±  |0.0077|
| - social_sciences|N/A    |none  |     0|acc   |0.7423|±  |0.0077|
| - stem           |N/A    |none  |     0|acc   |0.5322|±  |0.0086|

 

BART (Bidirectional Auto-Regressive Transformer)

# Developed by Facebook

# BART combines BERT and GPT into a single model

(It takes the existing sequence-to-sequence Transformer model and trains it with a new pre-training objective, unifying the two approaches in one model.)

 

# Key advantage:

Flexibility of noising

Any arbitrary corruption can be applied directly to the original text, and the corruption can even change the text's length.
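
As a concrete illustration, here is a toy sketch of the "text infilling" style of noise: an entire span is replaced by a single mask token, so the corrupted input can even be shorter than the original. This is only an illustration of the idea, not the paper's exact sampling scheme (the paper draws span lengths from a Poisson distribution with lambda = 3).

import random

# Toy text-infilling noise: replace a random span with a single <mask> token.
# Length-0 spans simply insert a mask; longer spans shorten the text.
def text_infilling(tokens, mask_token="<mask>", max_span=3):
    start = random.randrange(len(tokens))
    length = random.randint(0, max_span)
    return tokens[:start] + [mask_token] + tokens[start + length:]

original = "the quick brown fox jumps over the lazy dog".split()
print(text_infilling(original))
# e.g. ['the', 'quick', '<mask>', 'jumps', 'over', 'the', 'lazy', 'dog']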

 

# Model

Now let's look at the model architecture. BART is a denoising autoencoder that reconstructs the original document from a corrupted one. It is implemented as a seq2seq model: a bidirectional encoder (as in BERT) encodes the corrupted text, and a left-to-right autoregressive decoder (as in GPT) consumes that encoding. For pre-training, the paper says the usual negative log-likelihood of the original document is optimized. A small runnable sketch of this denoising step follows.
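
To see the denoising in action, here is a minimal sketch using the Hugging Face transformers implementation of BART. facebook/bart-large is a real public checkpoint, but the masked sentence is an arbitrary example: BART's <mask> token stands in for a corrupted span, which the decoder fills back in.

from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# <mask> marks the corrupted span; the bidirectional encoder reads the whole
# noisy input and the autoregressive decoder regenerates the clean sequence.
text = "The tower is 324 metres tall, about the same height as an <mask> building."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(inputs["input_ids"], max_length=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))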

 

# A note for paper writers:

How could a model that simply glues together existing models, whose architecture is in fact identical to the Transformer proposed by Vaswani et al., become a paper? Partly because BART achieved SOTA on several natural language benchmarks, but the careful analysis of multiple pre-training tasks also played a large role.

If you write papers yourself, take this as an example of how a paper can come together; if you are studying NLP, the takeaway is the importance of pre-training.

 

# A note on pre-training:

If you want to learn more about pre-training, I recommend reading the T5 paper. Highly recommended!!

 

 

[References]

https://chloelab.tistory.com/34


[Error] config.json


[Symptom]

Running the evaluation on Colab fails with the error below.

 

File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1238, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1631, in get_hf_file_metadata
    r = _request_wrapper(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 385, in _request_wrapper
    response = _request_wrapper(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 409, in _request_wrapper
    hf_raise_for_status(response)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 296, in hf_raise_for_status
    raise EntryNotFoundError(message, response) from e
huggingface_hub.utils._errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-65d2f58b-6bbe66047d0d87c019aa16ba;557e17ef-bc31-490e-bd8b-52d4921f268b)

Entry Not Found for url: https://huggingface.co/cashbook/SOLAR-Platypus-10.7B-v1-kjw/resolve/main/config.json.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/content/drive/MyDrive/FastCampus-LLM/lm-evaluation-harness/lm_eval/__main__.py", line 279, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/content/drive/MyDrive/FastCampus-LLM/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/content/drive/MyDrive/FastCampus-LLM/lm-evaluation-harness/lm_eval/evaluator.py", line 123, in simple_evaluate
    lm = lm_eval.api.registry.get_model(model).create_from_arg_string(
  File "/content/drive/MyDrive/FastCampus-LLM/lm-evaluation-harness/lm_eval/api/model.py", line 134, in create_from_arg_string
    return cls(**args, **args2)
  File "/content/drive/MyDrive/FastCampus-LLM/lm-evaluation-harness/lm_eval/models/huggingface.py", line 187, in __init__
    self._get_config(
  File "/content/drive/MyDrive/FastCampus-LLM/lm-evaluation-harness/lm_eval/models/huggingface.py", line 444, in _get_config
    self._config = transformers.AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1048, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 622, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 677, in _get_config_dict
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 481, in cached_file
    raise EnvironmentError(
OSError: cashbook/SOLAR-Platypus-10.7B-v1-kjw does not appear to have a file named config.json. Checkout 'https://huggingface.co/cashbook/SOLAR-Platypus-10.7B-v1-kjw/main' for available files.

 

 

[Fix]

The repo is missing config.json. Hugging Face will not treat a repo as a usable model without this file either.

Of the files needed when uploading a model, only config.json had been left out,

so I downloaded the copy sitting in Google Drive and manually uploaded just that one file, which fixed it. A programmatic version of that upload is sketched below.
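
For reference, the manual upload can also be done with the huggingface_hub Python client instead of the web UI. A minimal sketch, assuming you are already authenticated (e.g. via huggingface-cli login); the local file path is hypothetical, and the repo id is the one from the error message above.

from huggingface_hub import HfApi

api = HfApi()
api.upload_file(
    path_or_fileobj="config.json",  # hypothetical local path: the copy from Google Drive
    path_in_repo="config.json",
    repo_id="cashbook/SOLAR-Platypus-10.7B-v1-kjw",
    repo_type="model",
)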
