feat: TPU compatibility#8

Merged
monologg merged 2 commits into main from feat/tpu
Nov 24, 2021
Conversation

@monologg
Collaborator

@monologg monologg commented Nov 24, 2021

Details

  • Configured training to run correctly with PyTorch XLA on a TPU VM
  • In the actual code, resize_embedding was also applied alongside this; please add it in the next PR.
python3 xla_spawn.py --num_cores 8 pretrain_language_model.py \
  --output_dir test_ckpt \
  --per_device_train_batch_size 32 \
  --gradient_accumulation_steps 4 \
  --num_train_epochs 100 \
  --learning_rate 1e-4 \
  --lr_scheduler_type linear \
  --warmup_ratio 0.01 \
  --save_strategy epoch \
  --save_total_limit 5 \
  --seed 42
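As a sanity check on the flags above: with the Hugging Face Trainer, --per_device_train_batch_size applies to each of the 8 TPU cores, and --gradient_accumulation_steps multiplies the batch seen per optimizer step. A minimal sketch (the helper function is hypothetical, not part of this repo) of the implied effective global batch size:

```python
# Hypothetical helper illustrating the batch-size arithmetic for the
# command above: per-device batch x TPU cores x gradient accumulation.
def effective_batch_size(per_device: int, num_cores: int, grad_accum: int) -> int:
    return per_device * num_cores * grad_accum

# Flags from the command: 32 per device, 8 cores, 4 accumulation steps.
print(effective_batch_size(32, 8, 4))  # 1024 samples per optimizer step
```

So each optimizer step effectively trains on 1024 samples, which is worth keeping in mind when picking --learning_rate.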

Screenshot

Screenshot_20211124-142838

Related Issue

Closes #7

@monologg monologg added the enhancement New feature or request label Nov 24, 2021
@monologg monologg self-assigned this Nov 24, 2021
@monologg monologg merged commit 839ad5b into main Nov 24, 2021
@monologg monologg deleted the feat/tpu branch December 5, 2021 09:04

Labels

enhancement New feature or request


Development

Successfully merging this pull request may close these issues.

Verify that TPU + HF Trainer works correctly

2 participants