
Update FAQ.md

sekyonda, 1 year ago
Parent commit: aa032f1595

1 changed file with 3 additions and 2 deletions

docs/FAQ.md (+3, -2)

@@ -20,8 +20,9 @@ Here we discuss frequently asked questions that may occur and we found useful al
 
 
 5. What are the hardware SKU requirements for deploying these models?
 
 
-    Hardware requirements vary based on latency, throughput and cost constraints. For good latency, the models were split across multiple GPUs with tensor parallelism in a machine with NVIDIA A100s or H100s. But TPUs, other types of GPUs, or even commodity hardware can also be used to deploy these models (e.g. https://github.com/ggerganov/llama.cpp).
+    Hardware requirements vary based on latency, throughput and cost constraints. For good latency, the models were split across multiple GPUs with tensor parallelism in a machine with NVIDIA A100s or H100s. But TPUs, other types of GPUs like A10G, T4, L4, or even commodity hardware can also be used to deploy these models (e.g. https://github.com/ggerganov/llama.cpp).
+    If working on a CPU, it is worth looking at this [blog post](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html) from Intel for an idea of Llama 2's performance on a CPU.
 
 
 6. What are the hardware SKU requirements for fine-tuning Llama pre-trained models?
 
 
-    Fine-tuning requirements vary based on amount of data, time to complete fine-tuning and cost constraints. To fine-tune these models we have generally used multiple NVIDIA A100 machines with data parallelism across nodes and a mix of data and tensor parallelism intra node. But using a single machine, or other GPU types are definitely possible (e.g. alpaca models are trained on a single RTX4090: https://github.com/tloen/alpaca-lora).
+    Fine-tuning requirements vary based on the amount of data, the time to complete fine-tuning, and cost constraints. To fine-tune these models we have generally used multiple NVIDIA A100 machines with data parallelism across nodes and a mix of data and tensor parallelism intra-node. But using a single machine, or other GPU types such as the NVIDIA A10G or H100, is definitely possible (e.g. alpaca models are trained on a single RTX4090: https://github.com/tloen/alpaca-lora).
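
To make the deployment answer (question 5) concrete, a minimal sketch using Hugging Face `transformers` with `accelerate` could look like the following. The checkpoint name and generation settings are illustrative assumptions, and `device_map="auto"` spreads layers across the visible devices (model-parallel layer sharding) rather than performing true tensor parallelism as described for the A100/H100 setup:

```python
# Illustrative sketch: load a Llama 2 checkpoint across the available GPUs
# with Hugging Face transformers + accelerate. device_map="auto" places whole
# layers on different devices; it is not tensor parallelism.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; any Llama 2 model works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision helps fit smaller GPUs (A10G, L4, T4)
    device_map="auto",           # shard layers across all visible GPUs / offload to CPU
)

inputs = tokenizer("What hardware do I need to run Llama 2?", return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```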
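
For the single-machine fine-tuning case mentioned in question 6, a hedged sketch using parameter-efficient LoRA adapters via the `peft` library (similar in spirit to the linked alpaca-lora example) is shown below. The rank, target modules, and checkpoint are assumptions for illustration, not the recipe used for the official models:

```python
# Illustrative sketch: wrap a Llama 2 model with LoRA adapters via peft so it
# can be fine-tuned on a single GPU (e.g. an RTX 4090 or A10G).
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",            # assumed checkpoint
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,                                   # adapter rank (assumed value)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
# ...then run your usual Trainer / training loop on the fine-tuning data.
```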