Commit 1b36ef8

[ENH, GRAPH] Experimental: A module for time-series graphs that relies on only networkx, and simulation and algorithms for time-series (#21)
* Adding API for time series graph for dodiscover
* Adding new time-series graph
* Working time-series graphs with forward and backwards homologous edge removal
* Adding set max lag method
* Adding pds algorithms for time-series graphs
* Adding sys info

Signed-off-by: Adam Li <[email protected]>
1 parent 5924622 commit 1b36ef8

50 files changed: +4533 -499 lines changed

.circleci/config.yml

Lines changed: 4 additions & 3 deletions
@@ -46,8 +46,8 @@ jobs:
  - run:
  name: Install the latest version of Poetry
  command: |
- curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | POETRY_UNINSTALL=1 python -
- curl -sSL https://install.python-poetry.org | python3 -
+ curl -sSL https://install.python-poetry.org | python3 - --version 1.3.0
+ poetry --version
  - run:
  name: Set BASH_ENV
  command: |
@@ -70,7 +70,7 @@ jobs:
  command: sudo apt update && sudo apt install -y pandoc optipng
  - python/install-packages:
  pkg-manager: poetry
- cache-version: "v2" # change to clear cache
+ cache-version: "v1" # change to clear cache
  args: "--with docs"
  - run:
  name: Check poetry package versions
@@ -140,6 +140,7 @@ jobs:
  - python/install-packages:
  pkg-manager: poetry
  cache-version: "v1" # change to clear cache
+ args: "--with docs"
  - run:
  name: make linkcheck
  command: |

.github/workflows/main.yml

Lines changed: 1 addition & 1 deletion
@@ -137,7 +137,7 @@ jobs:
  run: pip install poetry-dynamic-versioning
  - name: Install packages via poetry
  run: |
- poetry install --with test
+ poetry install --with test,ts
  # TODO: uncomment, when MixedEdgeGraph PRed into networkx
  # - name: Install Networkx (main)
  # if: "matrix.networkx == 'main'"

Makefile

Lines changed: 0 additions & 29 deletions
This file was deleted.

README.md

Lines changed: 5 additions & 4 deletions
@@ -12,11 +12,15 @@ Note: The API is subject to change without deprecation cycles due to the current

  ## Why?

- Representation of causal inference models in Python are severely lacking. Moreover, sampling from causal models is non-trivial. However, sampling from simulations is a requirement to benchmark different structural learning, causal ID, or other causal related algorithms.
+ Representation of causal graphical models in Python are severely lacking.

  PyWhy-Graphs implements a graphical API layer for ADMG, CPDAG and PAG. For causal DAGs, we recommend using the `networkx.DiGraph` class and
  ensuring acylicity via `networkx.is_directed_acyclic_graph` function.

+ Existing packages that aim to represent causal graphs either break from the networkX API, or only implement a subset of the relevant causal graphs. By keeping in-line with the robust NetworkX API, we aim to ensure a consistent user experience and a gentle introduction to causal graphical models.
+
+ Moreover, sampling from causal models is non-trivial, but a requirement for benchmarking many causal algorithms in discovery, ID, estimation and more. We aim to provide simulation modules that are easily connected with causal graphs to provide a simple robust API for modeling causal graphs and then simulating data.
+
  # Documentation

  See the [development version documentation](https://py-why.github.io/pywhy-graphs/dev/index.html).
@@ -47,9 +51,6 @@ To install the package from github, clone the repository and then `cd` into the

  poetry install

- # for time-series graph functionality
- poetry install --extras ts
-
  # for vizualizing graph functionality
  poetry install --extras viz

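Editorial note on the README hunk above: the recommendation to represent causal DAGs with `networkx.DiGraph` and to check acyclicity with `networkx.is_directed_acyclic_graph` can be sketched as follows. The variable names and the example edges are illustrative assumptions and do not come from the repository.

```python
import networkx as nx

# A small causal DAG: X -> Y -> Z, with X also a direct cause of Z.
dag = nx.DiGraph()
dag.add_edges_from([("X", "Y"), ("Y", "Z"), ("X", "Z")])
print(nx.is_directed_acyclic_graph(dag))

# Adding Z -> X closes a cycle, so the graph no longer qualifies as a DAG.
cyclic = dag.copy()
cyclic.add_edge("Z", "X")
print(nx.is_directed_acyclic_graph(cyclic))
```

The check is not enforced by `networkx.DiGraph` itself, so it would need to be repeated after any edit that could introduce a cycle.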

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+ {{ objname | escape | underline }}
+
+ .. currentmodule:: {{ module }}
+
+ .. auto{{ objtype }}:: {{ objname }}

docs/api.rst

Lines changed: 28 additions & 3 deletions
@@ -13,8 +13,8 @@ for classes (``CamelCase`` names) and functions
  (``underscore_case`` names) of pywhy-graphs, grouped thematically by analysis
  stage.

- Most-used classes
- =================
+ Causal graph classes
+ ====================
  These are the causal classes for Structural Causal Models (SCMs), or various causal
  graphs encountered in the literature.

@@ -78,7 +78,21 @@ welcome feedback.
  MixedEdgeGraph
  bidirected_to_unobserved_confounder
  m_separated
-
+
+ Timeseries
+ ==========
+ The following are useful functions that operate specifically on time-series graphs.
+
+ .. currentmodule:: pywhy_graphs.classes.timeseries
+ .. autosummary::
+ :toctree: generated/
+
+ complete_ts_graph
+ empty_ts_graph
+ get_summary_graph
+ has_homologous_edges
+ nodes_in_time_order
+
  Visualization of causal graphs
  ==============================
  Visualization of causal graphs is different compared to networkx because causal graphs
@@ -91,3 +105,14 @@ to perform modular visualization of nodes and edges.
  :toctree: generated/

  draw
+ timeseries_layout
+
+ Utilities for debugging
+ =======================
+ .. currentmodule:: pywhy_graphs
+
+ .. autosummary::
+ :toctree: generated/
+
+ sys_info
+
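
Editorial note on the `Timeseries` section added to `docs/api.rst` above: the diff lists helpers such as `get_summary_graph` without showing their signatures, so the sketch below does not call them. It only illustrates the underlying idea with plain `networkx`, under the assumption that each time-series node is a `(variable, lag)` tuple with lag 0 meaning the current time step. The function `summary_graph` is a hypothetical stand-in, not the pywhy-graphs implementation.

```python
import networkx as nx

# Time-series graph over two variables with maximum lag 1.
# Assumption: each node is a (variable, lag) tuple; lag -1 is one step back.
ts_graph = nx.DiGraph()
ts_graph.add_edges_from(
    [
        (("x", -1), ("x", 0)),  # x depends on its own past
        (("x", -1), ("y", 0)),  # lagged effect of x on y
        (("x", 0), ("y", 0)),  # contemporaneous effect of x on y
    ]
)


def summary_graph(graph: nx.DiGraph) -> nx.DiGraph:
    """Collapse lags: add u -> v if any lagged copy of u points to a copy of v."""
    summary = nx.DiGraph()
    summary.add_nodes_from({variable for variable, _lag in graph.nodes})
    for (src, _src_lag), (dst, _dst_lag) in graph.edges:
        summary.add_edge(src, dst)
    return summary


summary = summary_graph(ts_graph)
print(sorted(summary.edges))  # [('x', 'x'), ('x', 'y')]
```

This sketch keeps the self-loop on `x` that arises from auto-correlation; whether the library's own summary graph keeps or drops such loops is not stated in this diff.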

docs/conf.py

Lines changed: 39 additions & 10 deletions
@@ -40,7 +40,7 @@

  # If your documentation needs a minimal Sphinx version, state it here.
  #
- needs_sphinx = "4.0"
+ needs_sphinx = "5.0"

  # Add any Sphinx extension module names here, as strings. They can be
  # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
@@ -57,6 +57,7 @@
  "sphinx_gallery.gen_gallery",
  "sphinxcontrib.bibtex",
  "sphinx_copybutton",
+ # 'sphinx.ext.napoleon',
  "numpydoc",
  # "IPython.sphinxext.ipython_console_highlighting",
  ]
@@ -68,16 +69,21 @@
  # generate autosummary even if no references
  # -- sphinx.ext.autosummary
  autosummary_generate = True
- autodoc_default_options = {"inherited-members": None}
+ autodoc_default_options = {
+ "inherited-members": None,
+ }
+ autodoc_inherit_docstrings = True
  # autodoc_typehints = "signature"

  # -- numpydoc
  # Below is needed to prevent errors
- numpydoc_xref_param_type = True
+ # numpydoc_xref_param_type = True
+ numpydoc_show_inherited_class_members = False
+ numpydoc_show_class_members = False
  numpydoc_class_members_toctree = False
  numpydoc_attributes_as_param_list = True
  numpydoc_use_blockquotes = True
- numpydoc_validate = True
+ # numpydoc_validate = True

  numpydoc_xref_ignore = {
  # words
@@ -108,13 +114,16 @@
  "no",
  "attributes",
  "dictionary",
- "ArrayLike",
  "pywhy_nx.MixedEdgeGraph",
  # pywhy-graphs
  "causal",
  "Node",
  "circular",
  "endpoint",
+ "TsNode",
+ "tsdict",
+ "TimeSeriesGraph",
+ "TimeSeriesDiGraph",
  # networkx
  "node",
  "nodes",
@@ -136,6 +145,7 @@
  "Graph",
  "sets",
  "value",
+ 'edges is None', 'nodes is None', 'G = nx.DiGraph(D)',
  # shapes
  "n_times",
  "obj",
@@ -159,12 +169,15 @@
  "nx.MultiDiGraph": "networkx.MultiDiGraph",
  "NetworkXError": "networkx.NetworkXError",
  "pgmpy.models.BayesianNetwork": "pgmpy.models.BayesianNetwork",
- "ArrayLike": "numpy.ndarray",
+ "ArrayLike": "numpy.typing.ArrayLike",
  # pywhy-graphs
  "ADMG": "pywhy_graphs.ADMG",
  "PAG": "pywhy_graphs.PAG",
  "CPDAG": "pywhy_graphs.CPDAG",
  "pywhy_nx.MixedEdgeGraph": "pywhy_graphs.networkx.MixedEdgeGraph",
+ "TimeSeriesGraph": "pywhy_graphs.classes.timeseries.TimeSeriesGraph",
+ "TimeSeriesDiGraph": "pywhy_graphs.classes.timeseries.TimeSeriesDiGraph",
+ "TimeSeriesMixedEdgeGraph": "pywhy_graphs.classes.timeseries.TimeSeriesMixedEdgeGraph",
  # joblib
  "joblib.Parallel": "joblib.Parallel",
  # pandas
@@ -199,15 +212,18 @@

  intersphinx_mapping = {
  "python": ("https://docs.python.org/3", None),
- "numpy": ("https://numpy.org/devdocs", None),
- "scipy": ("https://scipy.github.io/devdocs", None),
+ "numpy": ("https://numpy.org/doc/stable/", None),
+ "neps": ("https://numpy.org/neps", None),
+ "scipy": ("https://docs.scipy.org/doc/scipy/reference", None),
  "networkx": ("https://networkx.org/documentation/latest/", None),
  "nx-guides": ("https://networkx.org/nx-guides/", None),
  "matplotlib": ("https://matplotlib.org/stable", None),
- "pandas": ("https://pandas.pydata.org/pandas-docs/dev", None),
- "pgmpy": ("https://pgmpy.org", None),
+ "pandas": ("https://pandas.pydata.org/pandas-docs/stable", None),
  "sklearn": ("https://scikit-learn.org/stable", None),
  "joblib": ("https://joblib.readthedocs.io/en/latest", None),
+ "pygraphviz": ("https://pygraphviz.github.io/documentation/stable/", None),
+ "graphviz": ("https://graphviz.readthedocs.io/en/stable/", None),
+ "sphinx-gallery": ("https://sphinx-gallery.github.io/stable/", None),
  }
  intersphinx_timeout = 5

@@ -315,10 +331,23 @@
  ("py:obj", "networkx.MixedEdgeGraph"),
  ("py:obj", "pywhy_graphs.networkx.MixedEdgeGraph"),
  ("py:obj", "pywhy_nx.MixedEdgeGraph"),
+ ("py:class", "optional"),
+ ("py:class", "array"),
+ ("py:class", "pywhy_nx.classes.timeseries.TimeSeriesGraph"),
+ ("py:class", "pywhy_nx.classes.timeseries.TimeSeriesDiGraph"),
+ ("py:class", "pywhy_nx.classes.timeseries.TimeSeriesMixedEdgeGraph"),
+ ("py:class", "pywhy_nx.classes.timeseries.StationaryTimeSeriesGraph"),
+ ("py:class", "pywhy_nx.classes.timeseries.StationaryTimeSeriesDiGraph"),
+ ("py:class", "pywhy_nx.classes.timeseries.StationaryTimeSeriesMixedEdgeGraph"),
+ ("py:class", "pywhy_graphs.classes.timeseries.base.tsdict"),
  ("py:class", "networkx.classes.mixededge.MixedEdgeGraph"),
  ("py:class", "numpy._typing._array_like._SupportsArray"),
  ("py:class", "numpy._typing._nested_sequence._NestedSequence"),
  ]
+ nitpick_ignore_regex = [
+ ('py:obj', r"pywhy_graphs\.classes\.timeseries*"),
+ ('py:obj', r"networkx*"),
+ ]


  # -- Warnings management -----------------------------------------------------
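
Editorial note on the `nitpick_ignore_regex` entries added in `docs/conf.py` above: Sphinx's documentation describes these patterns as having to match the whole target string. Under that assumption, a trailing `*` repeats only the preceding character, so `r"networkx*"` would not cover dotted names such as `networkx.DiGraph`, whereas `r"networkx.*"` would. The sketch below illustrates the difference with Python's `re` module only. The helper `is_ignored`, the `PREFIX_STYLE` pattern, and the sample targets are hypothetical, and whether the committed patterns were meant as prefixes is an assumption.

```python
import re

# Pattern as it appears in the diff, and a prefix-style variant for comparison.
AS_COMMITTED = r"networkx*"
PREFIX_STYLE = r"networkx.*"


def is_ignored(pattern: str, target: str) -> bool:
    """Hypothetical helper: whole-string match, as Sphinx's docs describe."""
    return re.fullmatch(pattern, target) is not None


targets = ["networkx", "networkx.DiGraph", "networkx.Graph.add_edge"]

for target in targets:
    # Columns: target, result with the committed pattern, result with ".*"
    print(target, is_ignored(AS_COMMITTED, target), is_ignored(PREFIX_STYLE, target))
```

The same reasoning would apply to `r"pywhy_graphs\.classes\.timeseries*"`, where the `*` repeats only the final `s`.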

docs/reference/classes/index

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+ .. _classes:
+
+ ***********
+ Graph types
+ ***********
+
+ Pywhy-graphs provides data structures and methods for storing causal graphs.
+
+ The classes heavily rely on NetworkX and follows a similar API.
+
+ The choice of graph class depends on the structure of the
+ graph you want to represent.
+
+ Which graph class should I use?
+ ===============================
+
+ +-------------------+----------------------------------+-----------------------+
+ | Pywhy_graph Class | Edge Types | Latent confounders |
+ +===================+==================================+=======================+
+ | ADMG | directed, bidirected, undirected | Yes |
+ +-------------------+------------+--------------------+------------------------+
+
+ We also represent common equivalence classes of causal graphs.
+
+ +-------------------+----------------------------------+-----------------------+
+ | Pywhy_graph Class | Edge Types | Latent confounders |
+ +===================+==================================+=======================+
+ | CPDAG | directed, undirected | No |
+ +-------------------+----------------------------------+-----------------------+
+ | PAG | directed, bidirected, undirected | Yes |
+ +-------------------+---------------------------------+------------------------+
+ | MultiDiGraph | directed | Yes | Yes |
+ +-------------------+------------+--------------------+------------------------+
+
+ Causal graph types
+ ==================
+
+ .. currentmodule:: pywhy_graphs.classes.timeseries
+ .. autoclass:: TimeSeriesGraph
+ :inherited-members:
+ .. autoclass:: TimeSeriesDiGraph
+ :inherited-members:
+ .. autoclass:: TimeSeriesMixedEdgeGraph
+ :inherited-members:
+ .. autoclass:: StationaryTimeSeriesCPDAG
+ :inherited-members:
+ .. autoclass:: StationaryTimeSeriesDiGraph
+ :inherited-members:
+ .. autoclass:: StationaryTimeSeriesGraph
+ :inherited-members:
+ .. autoclass:: StationaryTimeSeriesMixedEdgeGraph
+ :inherited-members:
+ .. autoclass:: StationaryTimeSeriesPAG
+ :inherited-members:
+
+ .. note:: NetworkX uses `dicts` to store the nodes and neighbors in a graph.
+ So the reporting of nodes and edges for the base graph classes may not
+ necessarily be consistent across versions and platforms; however, the reporting
+ for CPython is consistent across platforms and versions after 3.6.

docs/references.bib

Lines changed: 22 additions & 0 deletions
@@ -36,6 +36,24 @@ @article{Gamez2011
  doi = {10.1007/s10618-010-0178-6}
  }

+
+ @InProceedings{Malinsky18a_svarfci,
+ title = {Causal Structure Learning from Multivariate Time Series in Settings with Unmeasured Confounding},
+ author = {Malinsky, Daniel and Spirtes, Peter},
+ booktitle = {Proceedings of 2018 ACM SIGKDD Workshop on Causal Disocvery},
+ pages = {23--47},
+ year = {2018},
+ editor = {Le, Thuc Duy and Zhang, Kun and Kıcıman, Emre and Hyvärinen, Aapo and Liu, Lin},
+ volume = {92},
+ series = {Proceedings of Machine Learning Research},
+ month = {20 Aug},
+ publisher = {PMLR},
+ pdf = {http://proceedings.mlr.press/v92/malinsky18a/malinsky18a.pdf},
+ url = {https://proceedings.mlr.press/v92/malinsky18a.html},
+ abstract = {We present constraint-based and (hybrid) score-based algorithms for causal structure learning that estimate dynamic graphical models from multivariate time series data. In contrast to previous work, our methods allow for both “contemporaneous” causal relations and arbitrary unmeasured (“latent”) processes influencing observed variables. The performance of our algorithms is investigated with simulation experiments and we briefly illustrate the proposed approach on some real data from international political economy.}
+ }
+
+
  @article{Meek1995,
  author = {Meek, Christopher},
  year = {2013},
@@ -168,6 +186,10 @@ @article{Zhang2008
  abstract = {Causal discovery becomes especially challenging when the possibility of latent confounding and/or selection bias is not assumed away. For this task, ancestral graph models are particularly useful in that they can represent the presence of latent confounding and selection effect, without explicitly invoking unobserved variables. Based on the machinery of ancestral graphs, there is a provably sound causal discovery algorithm, known as the FCI algorithm, that allows the possibility of latent confounders and selection bias. However, the orientation rules used in the algorithm are not complete. In this paper, we provide additional orientation rules, augmented by which the FCI algorithm is shown to be complete, in the sense that it can, under standard assumptions, discover all aspects of the causal structure that are uniquely determined by facts of probabilistic dependence and independence. The result is useful for developing any causal discovery and reasoning system based on ancestral graph models.}
  }

+ @article{Zhang2008AncestralGraphs,
+ title = {Causal Reasoning with Ancestral Graphs},
+ }
+
  @inproceedings{Zhang2011,
  author = {Zhang, Kun and Peters, Jonas and Janzing, Dominik and Sch\"{o}lkopf, Bernhard},
  title = {Kernel-Based Conditional Independence Test and Application in Causal Discovery},

docs/whats_new/_contributors.rst

Lines changed: 1 addition & 1 deletion
@@ -20,5 +20,5 @@
  .. |API| replace:: :raw-html:`<span class="badge badge-warning">API Change</span>` :raw-latex:`{\small\sc [API Change]}`


- .. _Adam Li: https://py-why.github.io
+ .. _Adam Li: https://github.com/adam2392
  .. _Julien Siebert: https://github.com/siebert-julien
