Hi there,
Small thing: I think it would be helpful to have some error checking on whether docs have been set when they are needed.
I was running through the example notebook and set the newsgroup embeddings directly with tmt.embeddings = embeddings rather than calculating them (I use them all the time, so I keep them saved), but I didn't set the documents anywhere.
When I got to tmt.visualizeEmbeddings(131,78).show(), it threw the following error, raised in _check_CS_SS:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[14], line 1
----> 1 tmt.visualizeEmbeddings(131,78).show()
File [c:\path\lib\site-packages\topictuner\basetuner.py:316](file:///C:/path/lib/site-packages/topictuner/basetuner.py:316), in BaseHDBSCANTuner.visualizeEmbeddings(self, min_cluster_size, min_samples, width, height, markersize, opacity)
310 VizDF["wrappedText"] = [
311 "Topic #: " + str(topic) + "
" + text
312 for topic, text in zip(VizDF["topics"], wrappedText)
313 ]
314 else:
315 VizDF["wrappedText"] = [
--> 316 "Topic #: " + str(topic) for topic in self.runHDBSCAN()
317 ]
318 for topiclabel in set(VizDF["topics"]):
319 topicDF = VizDF.loc[VizDF["topics"] == topiclabel]
File [c:\path\lib\site-packages\topictuner\basetuner.py:94](file:///C:/path/lib/site-packages/topictuner/basetuner.py:94), in BaseHDBSCANTuner.runHDBSCAN(self, min_cluster_size, min_samples)
88 def runHDBSCAN(self, min_cluster_size: int = None, min_samples: int = None):
89 """
90 Cluster the target embeddings (these will be the reduced embeddings when
91 run as a TMT instance. Per HDBSCAN, min_samples must be more than 0 and less than
92 or equal to min_cluster_size.
93 """
---> 94 min_cluster_size, min_samples = self._check_CS_SS(
...
--> 408 raise ValueError("Cannot set min_cluster_size==None")
409 if min_cluster_size == 1:
410 raise ValueError("min_cluster_size must be more than 1")
ValueError: Cannot set min_cluster_size==None
This wasn't very helpful, as the issue was that no docs had been set, not anything to do with min_cluster_size or min_samples.
Setting tmt.docs = docs resolves the issue.
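To illustrate the kind of check I mean, here is a minimal sketch. It does not use topictuner's actual code: the class TunerSketch, the helper _require, and the method visualize are all hypothetical names made up for this example. From the traceback, it looks like the branch taken when docs is unset calls self.runHDBSCAN() with no arguments, which would explain why the error mentions min_cluster_size, but I haven't confirmed that against the source.

```python
from typing import List, Optional, Sequence


class TunerSketch:
    """Hypothetical stand-in for the tuner; not topictuner's actual class."""

    def __init__(self) -> None:
        self.docs: Optional[List[str]] = None
        self.embeddings: Optional[Sequence[Sequence[float]]] = None

    def _require(self, attr: str, caller: str) -> None:
        # Fail early with a message that names the attribute that is missing.
        if getattr(self, attr) is None:
            raise ValueError(
                f"{caller} needs '{attr}' to be set; "
                f"assign it first, e.g. tuner.{attr} = ..."
            )

    def visualize(self, min_cluster_size: int, min_samples: int) -> str:
        # Check the inputs before any clustering or plotting happens.
        self._require("embeddings", "visualize")
        self._require("docs", "visualize")
        return (
            f"plot of {len(self.docs)} docs "
            f"(cs={min_cluster_size}, ss={min_samples})"
        )


tuner = TunerSketch()
tuner.embeddings = [[0.0, 1.0], [1.0, 0.0]]  # embeddings set, docs forgotten
try:
    tuner.visualize(131, 78)
except ValueError as err:
    print(err)  # the message points at 'docs', not at min_cluster_size
```

With a check like this, the error would point at the missing docs directly instead of at the clustering parameters.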