Commit 5f75cab

gguf.md: add sharding to naming convention [no ci]
1 parent 9988298 commit 5f75cab

1 file changed: +15 -1 lines changed

docs/gguf.md

Lines changed: 15 additions & 1 deletion
@@ -20,7 +20,7 @@ The key difference between GGJT and GGUF is the use of a key-value structure for
 
 ### GGUF Naming Convention
 
-GGUF follow a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>.gguf`
+GGUF follow a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<EncodingScheme>-<Shard>.gguf`
 
 The components are:
 1. **Model**: A descriptive name for the model type or architecture.
@@ -34,6 +34,9 @@ The components are:
 - `M`: Million parameters.
 - `K`: Thousand parameters.
 5. **EncodingScheme**: Indicates the weights encoding scheme that was applied to the model. Content, type mixture and arrangement however are determined by user code and can vary depending on project needs.
+6. **Shard**: (Optional) Indicates and denotes that the model has been split into multiple shards, formatted as `<ShardNum>-of-<ShardTotal>`.
+- *ShardNum* : Shard position in this model. Must be at least 5 digits padded by zeros.
+- *ShardTotal* : Total number of shards in this model. Must be at least 5 digits padded by zeros.
 
 #### Parsing Above Naming Convention
 

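The `<Shard>` component added in the hunk above can be illustrated with a small sketch. The helper names `format_shard` and `parse_shard` are hypothetical and not part of the commit. The sketch also assumes that shard positions start at 1, which the diff does not state explicitly; only the zero padding to at least 5 digits comes from the text.

```python
import re
from typing import Tuple

# Pattern for `<ShardNum>-of-<ShardTotal>`: both fields are zero-padded
# to at least 5 digits, as the added documentation lines require.
SHARD_RE = re.compile(r"^(?P<num>[0-9]{5,})-of-(?P<total>[0-9]{5,})$")


def format_shard(num: int, total: int) -> str:
    """Build the shard suffix, e.g. 3 of 9 becomes '00003-of-00009'."""
    # Assumption: shard positions are 1-based (not stated in the diff).
    if not 1 <= num <= total:
        raise ValueError(f"shard {num} is outside 1..{total}")
    return f"{num:05d}-of-{total:05d}"


def parse_shard(text: str) -> Tuple[int, int]:
    """Return (shard_num, shard_total) from a shard suffix."""
    match = SHARD_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid shard suffix: {text!r}")
    return int(match["num"]), int(match["total"])
```

Under these assumptions, `format_shard(3, 9)` yields `00003-of-00009`, and a suffix with fewer than 5 digits per field, such as `3-of-9`, is rejected.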
@@ -47,13 +50,24 @@ For example:
 - Expert Count: 8
 - Parameter Count: 7B
 - Weight Encoding Scheme: KQ2
+- Shard: N/A
 
 * `Hermes-2-Pro-Llama-3-8B-F16.gguf`:
 - Model Name: Hermes 2 Pro Llama 3
 - Version Number: v0.0 (`<Version>-` missing)
 - Expert Count: 0 (`<ExpertsCount>x` missing)
 - Parameter Count: 8B
 - Weight Encoding Scheme: F16
+- Shard: N/A
+
+* `grok-v1.0-100B-Q4_0-00003-of-00009.gguf"`
+- Model Name: Grok
+- Version Number: v1.0
+- Expert Count: 0 (`<ExpertsCount>x` missing)
+- Parameter Count: 100B
+- Weight Encoding Scheme: Q4_0
+- Shard: 3 out of 9 total shards
+
 
 ### File Structure
 

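As a rough illustration of the parsing examples in the last hunk, the sketch below splits a filename into the six components of the updated convention. It is not the project's parser: `GGUFName` and `parse_gguf_name` are hypothetical names, the version is assumed to look like `v<digits>[.<digits>...]`, and only the `B`, `M` and `K` parameter suffixes visible in this diff are accepted. Missing version and expert count fall back to `v0.0` and `0`, mirroring the examples.

```python
import re
from typing import NamedTuple, Optional, Tuple


class GGUFName(NamedTuple):
    model: str
    version: str                      # "v0.0" when `<Version>-` is missing
    experts: int                      # 0 when `<ExpertsCount>x` is missing
    parameters: str                   # e.g. "8B"
    encoding: str                     # e.g. "F16", "Q4_0"
    shard: Optional[Tuple[int, int]]  # (ShardNum, ShardTotal) or None


# Assumed grammar for the naming convention; the model name is matched
# lazily so that it stops at the first token that looks like a version
# or a parameter count.
_NAME_RE = re.compile(
    r"^(?P<model>.+?)"
    r"(?:-(?P<version>v[0-9]+(?:\.[0-9]+)*))?"
    r"-(?:(?P<experts>[0-9]+)x)?"
    r"(?P<params>[0-9]+(?:\.[0-9]+)?[BMK])"
    r"-(?P<encoding>[A-Za-z0-9_]+)"
    r"(?:-(?P<num>[0-9]{5,})-of-(?P<total>[0-9]{5,}))?"
    r"\.gguf$"
)


def parse_gguf_name(filename: str) -> GGUFName:
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"does not follow the naming convention: {filename!r}")
    shard = None
    if match["num"] is not None:
        shard = (int(match["num"]), int(match["total"]))
    return GGUFName(
        model=match["model"],
        version=match["version"] or "v0.0",
        experts=int(match["experts"] or 0),
        parameters=match["params"],
        encoding=match["encoding"],
        shard=shard,
    )


sharded = parse_gguf_name("grok-v1.0-100B-Q4_0-00003-of-00009.gguf")
# sharded.shard == (3, 9), sharded.version == "v1.0"

unsharded = parse_gguf_name("Hermes-2-Pro-Llama-3-8B-F16.gguf")
# unsharded.model == "Hermes-2-Pro-Llama-3", unsharded.shard is None
```

The model field keeps the hyphens from the filename; turning `Hermes-2-Pro-Llama-3` into the display form `Hermes 2 Pro Llama 3` would be a separate `replace("-", " ")` step.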