Commit a1adc3a

Fixed Pandas read_csv bug by specifying engine
As mentioned by Kristi [1], the better solution to the Pandas read_csv bug is to specify the engine as "pyarrow" in the read_csv call, rather than keeping the loading and casting steps separate.

[1] nerc-project/coldfront-plugin-cloud#290 (review)
1 parent e590e8c commit a1adc3a

3 files changed

Lines changed: 6 additions & 8 deletions

process_report/process_report.py

Lines changed: 3 additions & 4 deletions
@@ -105,12 +105,11 @@ def merge_csv(files):
     for file in files:
         dataframe = pandas.read_csv(
             file,
-        )
-        dataframe = dataframe.astype(
-            {
+            engine="pyarrow",
+            dtype={
                 invoice.COST_FIELD: pandas.ArrowDtype(pyarrow.decimal128(21, 2)),
                 invoice.RATE_FIELD: str,
-            }
+            },
         )
         dataframes.append(dataframe)

process_report/processors/new_pi_credit_processor.py

Lines changed: 2 additions & 3 deletions
@@ -50,9 +50,8 @@ def _load_old_pis(old_pi_filepath) -> pandas.DataFrame:
         try:
             old_pi_df = pandas.read_csv(
                 old_pi_filepath,
-            )
-            old_pi_df = old_pi_df.astype(
-                {
+                engine="pyarrow",
+                dtype={
                     invoice.PI_INITIAL_CREDITS: pandas.ArrowDtype(
                         pyarrow.decimal128(21, 2)
                     ),

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 nerc-rates>=1.0.1,<2.0.0
-pandas
+pandas>=3.0,<4.0
 pyarrow
 boto3>=1.42.6,<2.0
 Jinja2
