Summary:
Signed-off-by: Ryan Russell <[email protected]>
Various readability fixes focused on `.md` files:
- Grammar
- Fix some incorrect command references to `distributed_kmeans.py`
- Style the markdown bash code snippets so they render with proper formatting
Attempted to put a lot of little things into one PR and commit; let me know if any mods are needed!
Best,
Ryan
Pull Request resolved: facebookresearch/faiss#2378
Reviewed By: alexanderguzhva
Differential Revision: D37717671
Pulled By: mdouze
fbshipit-source-id: 0039192901d98a083cd992e37f6b692d0572103a

benchs/distributed_ondisk/README.md: 17 additions, 17 deletions

@@ -10,15 +10,15 @@ Hopefully, changing to another type of scheduler should be quite straightforward

 ## Distributed k-means

-To cluster 500M vectors to 10M centroids, it is useful to have a distriubuted k-means implementation.
+To cluster 500M vectors to 10M centroids, it is useful to have a distributed k-means implementation.
 The distribution simply consists in splitting the training vectors across machines (servers) and have them do the assignment.
 The master/client then synthesizes the results and updates the centroids.

 The distributed k-means implementation here is based on 3 files:

 - [`distributed_kmeans.py`](distributed_kmeans.py) contains the k-means implementation.
 The main loop of k-means is re-implemented in python but follows closely the Faiss C++ implementation, and should not be significantly less efficient.
-It relies on a `DatasetAssign` object that does the assignement to centrtoids, which is the bulk of the computation.
+It relies on a `DatasetAssign` object that does the assignment to centroids, which is the bulk of the computation.
 The object can be a Faiss CPU index, a GPU index or a set of remote GPU or CPU indexes.

 - [`run_on_cluster.bash`](run_on_cluster.bash) contains the shell code to run the distributed k-means on a cluster.
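
As a rough illustration of the split this section describes (servers do the assignment, the client synthesizes the results and updates the centroids), here is a minimal sketch. It is not the actual `distributed_kmeans.py` code: the `LocalAssign` class, the `distributed_kmeans_sketch` function and the naive initialization are simplified stand-ins for the real `DatasetAssign` interface.

```python
import numpy as np
import faiss

class LocalAssign:
    """Stand-in for a DatasetAssign-style object: it owns one split of the
    training vectors and assigns them to the current centroids."""
    def __init__(self, x):
        self.x = np.ascontiguousarray(x, dtype="float32")

    def assign_to(self, centroids):
        # The bulk of the computation: nearest-centroid search. Here a flat CPU
        # index; the real object can be a GPU index or a set of remote indexes.
        index = faiss.IndexFlatL2(centroids.shape[1])
        index.add(centroids)
        _, assign = index.search(self.x, 1)
        assign = assign.ravel()
        # Return per-centroid sums and counts for the client-side update.
        k = centroids.shape[0]
        sums = np.zeros_like(centroids, dtype="float64")
        counts = np.zeros(k, dtype="int64")
        np.add.at(sums, assign, self.x)
        np.add.at(counts, assign, 1)
        return counts, sums


def distributed_kmeans_sketch(splits, k, niter=10):
    """Client-side loop: the splits assign, the client updates the centroids."""
    d = splits[0].x.shape[1]
    # Naive initialization from the first split (assumes it holds >= k points).
    centroids = splits[0].x[:k].copy()
    for _ in range(niter):
        counts = np.zeros(k, dtype="int64")
        sums = np.zeros((k, d), dtype="float64")
        for s in splits:  # in the real setup these calls go to remote servers
            c, sm = s.assign_to(centroids)
            counts += c
            sums += sm
        nonempty = counts > 0
        centroids[nonempty] = (sums[nonempty] / counts[nonempty, None]).astype("float32")
    return centroids
```

In the real setup one such object lives per server and the `assign_to` calls go over the rpc layer instead of running locally.
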
@@ -30,7 +30,7 @@ The file is also assumed to be accessible from all server machines with eg. a di

 ### Local tests

-Edit `distibuted_kmeans.py` to point `testdata` to your local copy of the dataset.
+Edit `distributed_kmeans.py` to point `testdata` to your local copy of the dataset.

 Then, 4 levels of sanity check can be run:
 ```bash
@@ -47,7 +47,7 @@ The output should look like [This gist](https://gist.github.com/mdouze/ffa01fe66

 ### Distributed sanity check

-To run the distributed k-means, `distibuted_kmeans.py` has to be run both on the servers (`--server` option) and client sides (`--client` option).
+To run the distributed k-means, `distributed_kmeans.py` has to be run both on the servers (`--server` option) and client sides (`--client` option).
 Edit the top of `run_on_cluster.bash` to set the path of the data to cluster.

 Sanity checks can be run with
@@ -56,7 +56,7 @@ Sanity checks can be run with
 bash run_on_cluster.bash test_kmeans_0
 # using all the machine's GPUs
 bash run_on_cluster.bash test_kmeans_1
-#distrbuted run, with one local server per GPU
+#distributed run, with one local server per GPU
 bash run_on_cluster.bash test_kmeans_2
 ```
 The test `test_kmeans_2` simulates a distributed run on a single machine by starting one server process per GPU and connecting to the servers via the rpc protocol.
@@ -67,10 +67,10 @@ The output should look like [this gist](https://gist.github.com/mdouze/5b2dc69b7
 ### Distributed run

 The way the script can be distributed depends on the cluster's scheduling system.
-Here we use Slurm, but it should be relatively easy to adapt to any scheduler that can allocate a set of matchines and start the same exectuable on all of them.
+Here we use Slurm, but it should be relatively easy to adapt to any scheduler that can allocate a set of machines and start the same executable on all of them.

 The command
-```
+```bash
 bash run_on_cluster.bash slurm_distributed_kmeans
 ```
 asks SLURM for 5 machines with 4 GPUs each with the `srun` command.
@@ -90,12 +90,12 @@ The output should look like [this gist](https://gist.github.com/mdouze/8d25e89fb
 For the real run, we run the clustering on 50M vectors to 1M centroids.
 This is just a matter of using as many machines / GPUs as possible in setting the output centroids with the `--out filename` option.
 0: writing centroids to /checkpoint/matthijs/ondisk_distributed/1M_centroids.npy
 ```
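
The centroids written above are the input to the next step, building the empty trained index (the hunk below touches the section that points to `make_trained_index.py` for this). A minimal sketch of what that step can look like, assuming a flat coarse quantizer and an arbitrary PQ configuration; this is not the actual `make_trained_index.py` code and the file names are placeholders:

```python
import numpy as np
import faiss

# Placeholder path; the real run writes /checkpoint/.../1M_centroids.npy as shown above.
centroids = np.load("1M_centroids.npy").astype("float32")
nlist, d = centroids.shape

# Pre-populate the coarse quantizer with the k-means centroids.
# Faiss skips coarse-quantizer training when the quantizer already holds nlist
# vectors, so train() below only fits the PQ codebooks.
quantizer = faiss.IndexFlatL2(d)
quantizer.add(centroids)
index = faiss.IndexIVFPQ(quantizer, d, nlist, 32, 8)  # 32 sub-quantizers x 8 bits (assumed)

xt = np.random.rand(100_000, d).astype("float32")  # stand-in for real training vectors
index.train(xt)
faiss.write_index(index, "trained.faissindex")
```
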
@@ -121,25 +121,25 @@ This is performed by the script [`make_trained_index.py`](make_trained_index.py)

 ## Building the index by slices

-We call the slices "vslices" as they are vertical slices of the big matrix, see explanation in the wiki section [Split across datanbase partitions](https://github.com/facebookresearch/faiss/wiki/Indexing-1T-vectors#split-across-database-partitions).
+We call the slices "vslices" as they are vertical slices of the big matrix, see explanation in the wiki section [Split across database partitions](https://github.com/facebookresearch/faiss/wiki/Indexing-1T-vectors#split-across-database-partitions).

 The script [make_index_vslice.py](make_index_vslice.py) makes an index for a subset of the vectors of the input data and stores it as an independent index.
 There are 200 slices of 5M vectors each for Deep1B.
 It can be run in a brute-force parallel fashion, there is no constraint on ordering.
 To run the script in parallel on a slurm cluster, use:
-```
+```bash
 bash run_on_cluster.bash make_index_vslices
 ```
 For a real dataset, the data would be read from a DBMS.
 In that case, reading the data and indexing it in parallel is worthwhile because reading is very slow.

-## Splitting accross inverted lists
+## Splitting across inverted lists

 The 200 slices need to be merged together.
 This is done with the script [merge_to_ondisk.py](merge_to_ondisk.py), that memory maps the 200 vertical slice indexes, extracts a subset of the inverted lists and writes them to a contiguous horizontal slice.
 We slice the inverted lists into 50 horizontal slices.
 This is run with
-```
+```bash
 bash run_on_cluster.bash make_index_hslices
 ```

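
For a sense of what merging sharded IVF indexes into an on-disk index involves, here is a hedged sketch built on the `faiss.contrib.ondisk` helper (described further down in `contrib/README.md`). It merges all shards into a single inverted-list file rather than extracting per-job subsets of lists, so it is a simplification of what `merge_to_ondisk.py` does, and all file names are placeholders.

```python
import faiss
from faiss.contrib.ondisk import merge_ondisk

# Start from the empty trained index and the 200 vslice shard files (names assumed).
index = faiss.read_index("trained.faissindex")
shard_fnames = ["vslice_%03d.faissindex" % i for i in range(200)]

# Move the inverted lists of all shards into one memory-mappable .ivfdata file;
# the written index then only stores pointers into that file.
merge_ondisk(index, shard_fnames, "merged.ivfdata")
faiss.write_index(index, "populated.faissindex")
```
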
@@ -150,11 +150,11 @@ The horizontal slices need to be loaded in the right order and combined into an
 This is done in the [combined_index.py](combined_index.py) script.
 It provides a `CombinedIndexDeep1B` object that contains an index object that can be searched.
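
Searching such an on-disk index from Python looks roughly like the sketch below; it is not the `CombinedIndexDeep1B` class itself, and the file name and `nprobe` value are assumptions.

```python
import numpy as np
import faiss

# IO_FLAG_ONDISK_SAME_DIR makes Faiss look for the .ivfdata file next to the
# index file instead of at the absolute path recorded inside it.
index = faiss.read_index("populated.faissindex", faiss.IO_FLAG_ONDISK_SAME_DIR)
index.nprobe = 16  # number of inverted lists visited per query (assumed)

xq = np.random.rand(5, index.d).astype("float32")  # placeholder queries
D, I = index.search(xq, 10)
print(I)
```
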

contrib/README.md: 2 additions, 2 deletions

@@ -19,7 +19,7 @@ A very simple Remote Procedure Call library, where function parameters and resul
 ### client_server.py

 The server handles requests to a Faiss index. The client calls the remote index.
-This is mainly to shard datasets over several machines, see [Distributd index](https://github.com/facebookresearch/faiss/wiki/Indexes-that-do-not-fit-in-RAM#distributed-index)
+This is mainly to shard datasets over several machines, see [Distributed index](https://github.com/facebookresearch/faiss/wiki/Indexes-that-do-not-fit-in-RAM#distributed-index)

 ### ondisk.py

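
A rough usage sketch for this client/server pair, assuming the module exposes `run_index_server` and `ClientIndex` as in recent Faiss versions; host names, ports and the vector dimension below are placeholders:

```python
import numpy as np
import faiss
from faiss.contrib.client_server import run_index_server, ClientIndex

# On each server machine, serve its shard of the data (blocking call):
#     index = faiss.read_index("shard_0.faissindex")
#     run_index_server(index, 12345)

# On the client, the set of remote shards is queried like a single index.
client = ClientIndex([("server0", 12345), ("server1", 12345)])
d = 96  # vector dimension of the shards (assumed)
xq = np.random.rand(8, d).astype("float32")
D, I = client.search(xq, 10)
```
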
@@ -52,7 +52,7 @@ A few functions to override the coarse quantizer in IVF, providing additional fl

 (may require h5py)

-Defintion of how to access data for some standard datsets.
+Definition of how to access data for some standard datasets.
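
As a pointer to how those dataset definitions are typically consumed, a short hedged example; `DatasetSIFT1M` is one of the datasets defined in `faiss.contrib.datasets`, and the sketch assumes the data files are already in the directory the module expects:

```python
from faiss.contrib.datasets import DatasetSIFT1M

# Loads vectors from the standard SIFT1M files (downloaded separately).
ds = DatasetSIFT1M()
xt = ds.get_train()        # training vectors
xb = ds.get_database()     # database vectors to index
xq = ds.get_queries()      # query vectors
gt = ds.get_groundtruth()  # ground-truth nearest neighbors for evaluation
print(xb.shape, xq.shape, gt.shape)
```
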