7 changes: 0 additions & 7 deletions Ingestion/Ingesting_Data.md

This file was deleted.

38 changes: 38 additions & 0 deletions Ingestor/IngestManual_ESS.md
@@ -0,0 +1,38 @@
# Ingest Instructions ESS
___This page needs to be updated___

![Kafka flow](screenshots/kafka.png)

As shown in the picture above, the detector and the data collection software write into Kafka topics. The topics are then read by a filewriter, which writes the dataset to storage and, once the file is written, sends back an event containing metadata about the file and the experiment. An ingestion program then parses the event from the filewriter and gathers additional information before triggering REST calls into the backend.

### How the ingestion program works

#### 1. Parse Event

The ingestor subscribes to a Kafka topic on which the filewriter publishes an event whenever a file has been written. This event contains the file location, the proposal id and the metadata that exists on the file in the form of NeXus data.
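
A minimal sketch of such a subscription in Python, assuming JSON-encoded events, a topic named `filewriter_events` and a local broker (all placeholders; the actual ESS deployment may use a binary serialization instead):

```
import json

from kafka import KafkaConsumer  # pip install kafka-python

# Topic name and broker address are illustrative placeholders.
consumer = KafkaConsumer(
    "filewriter_events",
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # The event is expected to carry the file location, the proposal id
    # and the NeXus metadata from the written file.
    print(event.get("file_path"), event.get("proposal_id"))
```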

#### 2. Login

Log in and get an access token that can be used for interfacing with the backend. We advise creating a dedicated ingestor account for automatic ingestion.
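
A sketch of the login step using the `/api/v3/Users/login` endpoint shown in this directory's README; the base URL and credentials are placeholders:

```
import requests

response = requests.post(
    "http://localhost:3000/api/v3/Users/login",
    json={"username": "ingestor", "password": "<your_password>"},
)
response.raise_for_status()
# Depending on the backend version the token field may be named "id"
# or "access_token"; adjust accordingly.
access_token = response.json()["id"]
```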

#### 3. Gather more metadata

The ingestor contacts the User Office and gets additional information regarding the experiment, the principal investigator and the sample.
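
A purely illustrative sketch; the User Office URL, endpoint path and response fields below are hypothetical, since the actual service and its schema are site specific:

```
import requests

proposal_id = "<proposal id from the filewriter event>"

# Hypothetical endpoint and field names, for illustration only.
proposal = requests.get(
    f"https://useroffice.example.org/api/proposals/{proposal_id}"
).json()
principal_investigator = proposal["principalInvestigator"]
sample_info = proposal.get("sample")
```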

#### 4. Create dataset

![dataset](screenshots/dataset.png)

Using the information from the filewriter event together with the additional information gathered from the User Office, a dataset request can be constructed and sent to the backend.
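
A sketch of the dataset request, using the same `/api/v3/Datasets` endpoint as the manual curl example in the README; the field values are illustrative:

```
import requests

access_token = "<token from step 2>"

dataset = {
    "creationLocation": "/ESS/instrument",         # illustrative
    "sourceFolder": "/data/proposals/123456/raw",  # from the filewriter event
    "type": "raw",
    "ownerGroup": "<proposal id>",
    "principalInvestigator": "<from the User Office>",
}
response = requests.post(
    "http://localhost:3000/api/v3/Datasets",
    params={"access_token": access_token},
    json=dataset,
)
response.raise_for_status()
dataset_pid = response.json()["pid"]
```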

#### 5. Create OrigDatablocks

![datablock](screenshots/datablock.png)

After a dataset has been created, the related files are attached to it. This is done by creating OrigDatablocks that are attached to the dataset.
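
A sketch of attaching the files; the `/api/v3/OrigDatablocks` route and the field names are assumptions that may differ between SciCat backend versions:

```
import requests

access_token = "<token from step 2>"
dataset_pid = "<pid returned in step 4>"

orig_datablock = {
    "datasetId": dataset_pid,
    "size": 42_000_000,  # total size of all files in bytes, illustrative
    "dataFileList": [
        {
            "path": "raw/run_0001.nxs",
            "size": 42_000_000,
            "time": "2023-01-01T00:00:00.000Z",
        }
    ],
}
requests.post(
    "http://localhost:3000/api/v3/OrigDatablocks",
    params={"access_token": access_token},
    json=orig_datablock,
).raise_for_status()
```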

#### 6. Create Sample

![sample](screenshots/sample.png)

If sample information is available, a sample record can be created and attached to the dataset. This is an important step, as it makes the data discoverable by a wider audience.
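
A sketch of creating and linking a sample; the `/api/v3/Samples` route, the `sampleId` field and the PATCH on the dataset are assumptions:

```
from urllib.parse import quote

import requests

access_token = "<token from step 2>"
dataset_pid = "<pid returned in step 4>"

sample = {
    "description": "<sample description from the User Office>",
    "ownerGroup": "<proposal id>",
}
response = requests.post(
    "http://localhost:3000/api/v3/Samples",
    params={"access_token": access_token},
    json=sample,
)
response.raise_for_status()
sample_id = response.json()["sampleId"]

# Link the sample to the dataset; pids contain slashes, so URL-encode.
requests.patch(
    f"http://localhost:3000/api/v3/Datasets/{quote(dataset_pid, safe='')}",
    params={"access_token": access_token},
    json={"sampleId": sample_id},
).raise_for_status()
```
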
45 changes: 45 additions & 0 deletions Ingestor/README.md
@@ -0,0 +1,45 @@
# Ingesting Data into SciCat

## Using pyscicat (recommended)
Pyscicat is a Python client for working with the SciCat API, which provides an easy mechanism to ingest data. See https://www.scicatproject.org/pyscicat/howto/ingest.html to get started.

For an example of the full workflow, please see the `pyscicat.ipynb` Jupyter notebook in SciCat Live: https://github.com/SciCatProject/scicatlive/blob/main/services/jupyter/config/notebooks/pyscicat.ipynb. It covers how to authenticate, create a dataset, add datablocks and upload an attachment.
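
A condensed sketch of the pattern from the linked howto; it assumes pyscicat's `from_credentials` helper and `upload_new_dataset` method, which may change between releases, so treat the howto as authoritative:

```
from datetime import datetime, timezone

from pyscicat.client import from_credentials
from pyscicat.model import RawDataset

client = from_credentials(
    "http://localhost:3000/api/v3", "ingestor", "<your_password>"
)

# Field values mirror the manual metadata.json example below.
dataset = RawDataset(
    owner="ingestor",
    contactEmail="ingestor@example.org",
    creationLocation="/PSI/SLS/TOMCAT",
    sourceFolder="/scratch/devops",
    creationTime=datetime.now(timezone.utc).isoformat(),
    type="raw",
    ownerGroup="p16623",
    principalInvestigator="unknown",
)
dataset_id = client.upload_new_dataset(dataset)
```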

## Manual ingestion

The following steps add a dataset to your system using the API and the command-line tool curl.

1. Log in to the backend. The default password for the `ingestor` user is `aman`. Running the command below in the terminal will yield an access token.

```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{"username":"ingestor", "password":"<your_password>"}' 'http://localhost:3000/api/v3/Users/login'
```

2. Create a JSON file named `metadata.json` with the contents below:
```
{
"creationLocation": "/PSI/SLS/TOMCAT",
"sourceFolder": "/scratch/devops",
"type": "raw",
"ownerGroup":"p16623"
}
```

3. Pipe the contents of `metadata.json` to a curl command. Insert your access token in the command below and run it in the terminal:

```
cat metadata.json | curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d @- 'http://localhost:3000/api/v3/Datasets?access_token=YOUR_TOKEN_HERE'
```

There should now be a dataset in your MongoDB instance.



## Site Specific Examples

For site-specific examples, see the following links:
* [ESS](IngestManual_ESS.md)
* [PSI](ingestManual.md) (in the future this will move to https://data-catalog-services.pages.psi.ch)