Improve error message for missing function parameters #6232

suavemint · 2023-09-11T19:11:58Z

The error message in the fingerprint module was missing the f-string prefix (f), so the message produced at fingerprint.py, line 469, was literally "function {func} is missing parameters {fingerprint_names} in signature."

This has been fixed.
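To illustrate the bug, here is a minimal sketch of the difference the prefix makes. The function transform and the values bound to func and fingerprint_names are hypothetical stand-ins, not the actual code in fingerprint.py; only the message text comes from this PR.

```python
# Hypothetical stand-in for a function whose signature is being checked.
def transform(dataset, new_fingerprint=None):
    return dataset

# Assumed values, for illustration only.
func = transform.__name__
fingerprint_names = ["old_fingerprint"]

# Without the f prefix, the braces are ordinary characters and nothing is interpolated.
broken = "function {func} is missing parameters {fingerprint_names} in signature."

# With the f prefix, the expressions inside the braces are evaluated and formatted.
fixed = f"function {func} is missing parameters {fingerprint_names} in signature."

print(broken)
print(fixed)
```

The first line printed still contains the raw placeholders, while the second names the function and lists the missing parameters.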


HuggingFaceDocBuilderDev · 2023-09-12T05:55:49Z

The documentation is not available anymore as the PR was closed or merged.

mariosasko

Good catch! I replaced func.__name__ with func.__qualname__ to align with the function's __repr__, which is roughly equivalent to def __repr__(self): return f"<function {self.__qualname__} at {hex(id(self))}>".

EDIT: I reverted the func.__name__/func.__qualname__ change to stay aligned with the log messages about unhashable transforms.
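For readers unfamiliar with the two attributes discussed here, the following sketch shows how they differ for a function defined inside a class. The class Pipeline and its method apply are hypothetical names chosen for illustration and are not part of the library.

```python
# Hypothetical class, used only to show the two attributes side by side.
class Pipeline:
    def apply(self, batch):
        return batch

func = Pipeline.apply

# __name__ is the bare function name; __qualname__ includes the enclosing scope.
short_name = func.__name__
qualified_name = func.__qualname__

print(short_name)
print(qualified_name)

# In CPython, the default repr of a function is built from __qualname__.
print(repr(func))
```

For a module-level function the two attributes are identical, so the choice only matters for methods and nested functions.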

mariosasko · 2023-09-15T17:58:48Z

CI errors are unrelated

github-actions · 2023-09-15T18:07:55Z


PyArrow==8.0.0


Benchmark: benchmark_array_xd.json

metric	read_batch_formatted_as_numpy after write_array2d	read_batch_formatted_as_numpy after write_flattened_sequence	read_batch_formatted_as_numpy after write_nested_sequence	read_batch_unformated after write_array2d	read_batch_unformated after write_flattened_sequence	read_batch_unformated after write_nested_sequence	read_col_formatted_as_numpy after write_array2d	read_col_formatted_as_numpy after write_flattened_sequence	read_col_formatted_as_numpy after write_nested_sequence	read_col_unformated after write_array2d	read_col_unformated after write_flattened_sequence	read_col_unformated after write_nested_sequence	read_formatted_as_numpy after write_array2d	read_formatted_as_numpy after write_flattened_sequence	read_formatted_as_numpy after write_nested_sequence	read_unformated after write_array2d	read_unformated after write_flattened_sequence	read_unformated after write_nested_sequence	write_array2d	write_flattened_sequence	write_nested_sequence
new / old (diff)	0.006681 / 0.011353 (-0.004672)	0.004132 / 0.011008 (-0.006876)	0.085045 / 0.038508 (0.046536)	0.077680 / 0.023109 (0.054571)	0.382042 / 0.275898 (0.106144)	0.412932 / 0.323480 (0.089452)	0.005339 / 0.007986 (-0.002646)	0.003408 / 0.004328 (-0.000921)	0.065280 / 0.004250 (0.061030)	0.055732 / 0.037052 (0.018680)	0.400231 / 0.258489 (0.141742)	0.432497 / 0.293841 (0.138656)	0.031532 / 0.128546 (-0.097014)	0.008721 / 0.075646 (-0.066925)	0.289612 / 0.419271 (-0.129660)	0.053089 / 0.043533 (0.009556)	0.383300 / 0.255139 (0.128161)	0.401204 / 0.283200 (0.118004)	0.023582 / 0.141683 (-0.118100)	1.493854 / 1.452155 (0.041699)	1.583497 / 1.492716 (0.090781)

Benchmark: benchmark_getitem_100B.json

metric	get_batch_of_1024_random_rows	get_batch_of_1024_rows	get_first_row	get_last_row
new / old (diff)	0.239163 / 0.018006 (0.221157)	0.469555 / 0.000490 (0.469065)	0.008325 / 0.000200 (0.008125)	0.000113 / 0.000054 (0.000059)

Benchmark: benchmark_indices_mapping.json

metric	select	shard	shuffle	sort	train_test_split
new / old (diff)	0.028975 / 0.037411 (-0.008436)	0.084195 / 0.014526 (0.069669)	0.189394 / 0.176557 (0.012837)	0.158010 / 0.737135 (-0.579125)	0.097502 / 0.296338 (-0.198837)

Benchmark: benchmark_iterating.json

metric	read 5000	read 50000	read_batch 50000 10	read_batch 50000 100	read_batch 50000 1000	read_formatted numpy 5000	read_formatted pandas 5000	read_formatted tensorflow 5000	read_formatted torch 5000	read_formatted_batch numpy 5000 10	read_formatted_batch numpy 5000 1000	shuffled read 5000	shuffled read 50000	shuffled read_batch 50000 10	shuffled read_batch 50000 100	shuffled read_batch 50000 1000	shuffled read_formatted numpy 5000	shuffled read_formatted_batch numpy 5000 10	shuffled read_formatted_batch numpy 5000 1000
new / old (diff)	0.383085 / 0.215209 (0.167876)	3.827030 / 2.077655 (1.749375)	1.872279 / 1.504120 (0.368159)	1.705808 / 1.541195 (0.164613)	1.833706 / 1.468490 (0.365216)	0.484744 / 4.584777 (-4.100033)	3.658221 / 3.745712 (-0.087491)	3.398462 / 5.269862 (-1.871399)	2.064974 / 4.565676 (-2.500703)	0.057740 / 0.424275 (-0.366535)	0.007926 / 0.007607 (0.000319)	0.465358 / 0.226044 (0.239314)	4.652951 / 2.268929 (2.384022)	2.328390 / 55.444624 (-53.116235)	2.000606 / 6.876477 (-4.875870)	2.268391 / 2.142072 (0.126318)	0.586537 / 4.805227 (-4.218690)	0.134749 / 6.500664 (-6.365915)	0.061276 / 0.075469 (-0.014193)

Benchmark: benchmark_map_filter.json

metric	filter	map fast-tokenizer batched	map identity	map identity batched	map no-op batched	map no-op batched numpy	map no-op batched pandas	map no-op batched pytorch	map no-op batched tensorflow
new / old (diff)	1.337913 / 1.841788 (-0.503875)	20.232122 / 8.074308 (12.157814)	14.478579 / 10.191392 (4.287187)	0.167545 / 0.680424 (-0.512878)	0.018745 / 0.534201 (-0.515456)	0.401209 / 0.579283 (-0.178074)	0.425748 / 0.434364 (-0.008616)	0.462539 / 0.540337 (-0.077798)	0.652446 / 1.386936 (-0.734490)

PyArrow==latest


Benchmark: benchmark_array_xd.json

metric	read_batch_formatted_as_numpy after write_array2d	read_batch_formatted_as_numpy after write_flattened_sequence	read_batch_formatted_as_numpy after write_nested_sequence	read_batch_unformated after write_array2d	read_batch_unformated after write_flattened_sequence	read_batch_unformated after write_nested_sequence	read_col_formatted_as_numpy after write_array2d	read_col_formatted_as_numpy after write_flattened_sequence	read_col_formatted_as_numpy after write_nested_sequence	read_col_unformated after write_array2d	read_col_unformated after write_flattened_sequence	read_col_unformated after write_nested_sequence	read_formatted_as_numpy after write_array2d	read_formatted_as_numpy after write_flattened_sequence	read_formatted_as_numpy after write_nested_sequence	read_unformated after write_array2d	read_unformated after write_flattened_sequence	read_unformated after write_nested_sequence	write_array2d	write_flattened_sequence	write_nested_sequence
new / old (diff)	0.007159 / 0.011353 (-0.004194)	0.004091 / 0.011008 (-0.006917)	0.066202 / 0.038508 (0.027694)	0.083096 / 0.023109 (0.059987)	0.402160 / 0.275898 (0.126261)	0.440565 / 0.323480 (0.117085)	0.005757 / 0.007986 (-0.002228)	0.003445 / 0.004328 (-0.000884)	0.065498 / 0.004250 (0.061248)	0.059787 / 0.037052 (0.022735)	0.407017 / 0.258489 (0.148528)	0.448270 / 0.293841 (0.154429)	0.033606 / 0.128546 (-0.094941)	0.008744 / 0.075646 (-0.066902)	0.072902 / 0.419271 (-0.346369)	0.050144 / 0.043533 (0.006611)	0.401069 / 0.255139 (0.145930)	0.426389 / 0.283200 (0.143189)	0.023297 / 0.141683 (-0.118386)	1.506152 / 1.452155 (0.053998)	1.570211 / 1.492716 (0.077495)

Benchmark: benchmark_getitem_100B.json

metric	get_batch_of_1024_random_rows	get_batch_of_1024_rows	get_first_row	get_last_row
new / old (diff)	0.235759 / 0.018006 (0.217753)	0.488410 / 0.000490 (0.487921)	0.004587 / 0.000200 (0.004387)	0.000115 / 0.000054 (0.000060)

Benchmark: benchmark_indices_mapping.json

metric	select	shard	shuffle	sort	train_test_split
new / old (diff)	0.034123 / 0.037411 (-0.003289)	0.102163 / 0.014526 (0.087638)	0.110892 / 0.176557 (-0.065664)	0.166000 / 0.737135 (-0.571135)	0.110845 / 0.296338 (-0.185494)

Benchmark: benchmark_iterating.json

metric	read 5000	read 50000	read_batch 50000 10	read_batch 50000 100	read_batch 50000 1000	read_formatted numpy 5000	read_formatted pandas 5000	read_formatted tensorflow 5000	read_formatted torch 5000	read_formatted_batch numpy 5000 10	read_formatted_batch numpy 5000 1000	shuffled read 5000	shuffled read 50000	shuffled read_batch 50000 10	shuffled read_batch 50000 100	shuffled read_batch 50000 1000	shuffled read_formatted numpy 5000	shuffled read_formatted_batch numpy 5000 10	shuffled read_formatted_batch numpy 5000 1000
new / old (diff)	0.431397 / 0.215209 (0.216188)	4.291540 / 2.077655 (2.213885)	2.298248 / 1.504120 (0.794128)	2.134752 / 1.541195 (0.593557)	2.207913 / 1.468490 (0.739423)	0.490607 / 4.584777 (-4.094170)	3.683078 / 3.745712 (-0.062635)	3.314266 / 5.269862 (-1.955596)	2.059488 / 4.565676 (-2.506188)	0.057876 / 0.424275 (-0.366399)	0.007696 / 0.007607 (0.000089)	0.512186 / 0.226044 (0.286142)	5.124071 / 2.268929 (2.855142)	2.803913 / 55.444624 (-52.640711)	2.428558 / 6.876477 (-4.447919)	2.655207 / 2.142072 (0.513135)	0.584589 / 4.805227 (-4.220638)	0.133518 / 6.500664 (-6.367146)	0.060729 / 0.075469 (-0.014740)

Benchmark: benchmark_map_filter.json

metric	filter	map fast-tokenizer batched	map identity	map identity batched	map no-op batched	map no-op batched numpy	map no-op batched pandas	map no-op batched pytorch	map no-op batched tensorflow
new / old (diff)	1.352916 / 1.841788 (-0.488872)	20.249632 / 8.074308 (12.175323)	15.283079 / 10.191392 (5.091686)	0.157601 / 0.680424 (-0.522823)	0.019650 / 0.534201 (-0.514551)	0.396398 / 0.579283 (-0.182885)	0.430111 / 0.434364 (-0.004252)	0.480627 / 0.540337 (-0.059710)	0.642165 / 1.386936 (-0.744771)

mariosasko added 2 commits September 14, 2023 00:07

Replace __name__ with __qualname__

2c18dc3

Merge branch 'main' into fixes-fingerprint-error-fstrings

4c09e16

mariosasko approved these changes Sep 13, 2023


mariosasko added 2 commits September 14, 2023 00:14

Revert

facd60f

Merge branch 'main' into fixes-fingerprint-error-fstrings

6db0388

mariosasko merged commit 9b21e18 into huggingface:main Sep 15, 2023
