-
Notifications
You must be signed in to change notification settings - Fork 2.3k
introducing read excel #1757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introducing read excel #1757
Conversation
adding unit tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to 44300f0 in 2 minutes and 44 seconds. Click for details.
- Reviewed
257lines of code in2files - Skipped
0files when reviewing. - Skipped posting
0draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
Workflow ID: wflow_bAi4ME3yq48bqHTg
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
bfd6fd2 to
a2400a7
Compare
|
@matteocacciola The current implementation breaks when the Excel sheet name contains spaces. I tested it with a file where the sheet name was "Display Tools". Ideally, the behavior should match pd.read_excel — when a single sheet is passed or detected, it should return a DataFrame. However, in our case, it always returns a dictionary, which leads to inconsistencies. |
I will check the bug with a space in the name. However, pd.read_excel returns a dictionary when the Excel file contains more than one sheet and no sheet name is provided |
I tested with excel file having one sheet. |
requested changes applied. I hope the current version is fine to you |
|
@matteocacciola Thanks a lot! Just one last thing, could you please remove openpyxl if it's not being used? Everything seems to be working fine without it, so it might be an unnecessary dependency. Then we are good to go. |
it looks like this check failed because of the missing library |
|
Hey @ArslanSaleem , can you please proceed with this PR? |
|
Thank you @matteocacciola for the improvement. |
Important
Introduces
read_excel()inpandasai/__init__.pyto read Excel files with support for multiple sheets and adds comprehensive tests intest_pandasai_read_excel.py.read_excel()inpandasai/__init__.pyto read Excel files, supporting bothstrandBytesIOfile paths.DataFrameor a dictionary ofDataFrames.sheet_nameparameter to specify a sheet or return all sheets if not specified.test_pandasai_read_excel.pywith tests forread_excel()covering single/multiple sheets,str/BytesIOpaths, and nonexistent sheets.sheet_nameand propagation of exceptions.read_csv()inpandasai/__init__.pyto acceptBytesIOin addition tostr.This description was created by
for 44300f0. You can customize this summary. It will automatically update as commits are pushed.