Skip to content

Conversation

@bdewater
Copy link
Contributor

@bdewater bdewater commented Oct 14, 2021

This PR does a few things based on a couple of use cases that emerged in Shopify:

First it introduces a parser that is locale-aware and can properly parse amounts with 3 digits, for the currencies Bahraini dinar, Iraqi dinar, Jordanian dinar, Kuwaiti dinar, Libyan dinar, Omani rial and Tunisian dinar. This parser also normalizes fullwidth characters, this is useful for e.g. Japanese users where their phone keyboards might lead them to entering these characters. In the future normalizing other digits from the Unicode Character Category 'Number, Decimal Digit' could be possible but for now we're just solving the use case we've actually run into.

Second, it introduces a 'simple' parser for API cases (which can also handle 3 decimal currencies) in order to be used in conjunction Shopify's gem that adds type coercion to controller parameters. We had a recent issue where the Fuzzy parser was used in a scenario like this, multiplying amounts by a 1000 , and requiring an extensive data cleanup effort.

Money::Parser::Fuzzy.parse("123.456", "USD")
=> #<Money value:123456.00 currency:USD>

Money::Parser::Simple.parse("123.456", "USD")
=> #<Money value:123.46 currency:USD>

Copy link
Contributor

@cejaekl cejaekl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid to me. I tried poking holes in it with a bit of ad hoc testing, and failed to find any bugs. 😄

end

def last_digits_decimals?(digits, marks, currency)
# Thousands marks are always different from decimal marks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Grouping marks, not Thousands marks. In locales that adopt the "Chinese" 4-digit grouping system, one group is a myriad, not a thousand.

Copy link
Contributor Author

@bdewater bdewater Oct 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL! This was already present in the old code, GH diff might make it look like a new file but it's really just whitespace changes.

return unless currency

normalized_input = input
.tr('-0-9.,', '-0-9.,')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also convert (U+3001) and (U+FF64) to , (U+002C)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured I'd start small, fullwidth digits I've seen being used on Shopify Payments onboarding forms by Japanese merchants. Based on https://www.localeplanet.com/icu/ja-JP/index.html dot and comma seemed likely to be sent as fullwidth as well.

Anything else that was not 100% equivalent my hunch was to wait and see first how people are actually interacting with the system - but I will also readily admit I do not have a lot of cultural context to make a prediction. Do you think these characters to be likely used?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think these characters to be likely used?

Well, that will depend on how the user's IME behaves.
I gave it a try just now (Mac OS, Japanese hiragana input, into a Chrome web page's text input field). Typing

100,000,000

and then hitting Enter left this in the text field:

100,000、000

So, yes, I think it's likely you'll see or instead of , or , at least some of the time.

@bdewater bdewater force-pushed the parser-refactoring branch 2 times, most recently from 6a21786 to 4739937 Compare October 17, 2021 01:13
@elfassy
Copy link
Contributor

elfassy commented Oct 21, 2021

What do you think of moving the parsing functionality to a new gem? A bit like rubyMoney which split its parsing to https://github.com/RubyMoney/monetize

currency = Money::Helpers.value_to_currency(currency)
return unless currency

amount = BigDecimal(input, exception: false)
Copy link
Contributor

@elfassy elfassy Oct 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strings in this format are already handled by calling money directly, ex: Money.new("123.10", "USD"). Maybe we can make this slightly different by stripping white spaces and handling string with trailing dots "123."?
I would also not completely trust BigDecimal to raise exception, for example I don't think we should allow the exponential notation

> BigDecimal("10.10E2")
=> 0.101e4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling Money directly adds the complexity of an app potentially having opted into deprecated behaviour and returning 0 instead of nil for invalid input, this doesn't work with the expectations from our parameter type coercion gem that is used in Shopify core and other apps.

This parser is meant for well-formatted data so I am wary of normalizing here to paper over bugs from other people. We can loosen restrictions later if we learn it's needed (the opposite is much harder).

Good call on the scientific notation, I added something for that 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opted into deprecated behaviour and returning 0 instead of nil

If we're already expecting a well formatted string, could we get them to not opt-in to that deprecated behaviour, instead of adding a new parser? IMO nil -> 0 should already never be happening and is fairly easy to do manually if needed Money.new(amount || 0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I follow. Are you suggesting to remove the deprecation for invalid input for Money.new?

Assuming an app hasn't opted in to the deprecated path, Money.new accepts more input than we want for APIs. Money.new will raise on invalid input where our typed parameter code (see related PR) and GraphQL types need a nil returned. We could rescue for those cases but that's less performant for most inputs.

I don't understand how Money.new(amount || 0) is of use, but if you meant Money::Parser::Simple.parse(amount) || 0 then yes that's another reason to prefer returning nil.

Comment on lines 21 to 22
if SIGNED_DECIMAL_MATCHER.match?(coerced) && (amount = BigDecimal(coerced, exception: false))
Money.new(amount, currency)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should avoid using an assignment on the second part of the boolean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our style guide allows it. I'll look into adding our Rubocop rules to this repo because linters can do this better than humans 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a fair bit of work: #240 will circle back after this and some other outstanding PRs are merged to minimize disruption.

if SIGNED_DECIMAL_MATCHER.match?(coerced) && (amount = BigDecimal(coerced, exception: false))
Money.new(amount, currency)
elsif strict
raise ArgumentError, "unable to parse input=\"#{input}\" currency=\"#{currency}\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth adding a custom error IMO, would make this easier to rescue

@bdewater bdewater requested a review from kogulko January 19, 2022 23:57
@kogulko
Copy link

kogulko commented Feb 22, 2022

Left comment about additional test
LGTM (if we are bumping gem version)

Bart de Water added 5 commits April 17, 2022 08:13
Unlike the existing legacy MoneyParser, this one doesn't use complicated logic to try and guess if a comma or dot is used as a thousands or decimal separator. There are cases where it can never properly parse numbers, especially for currencies with 3 digits. This new parser fixes that.
We already had two and introduced two more (with different return semantics). It's better to make an informed choice for every callsite than implicitly relying on global configuration and making a change everywhere.

If a change everywhere is desired for simple apps, it's as simple as replacing 'Money.parse' with 'Money::Parser::Fuzzy.parse' to keep the same behaviour.
@bdewater bdewater force-pushed the parser-refactoring branch from 8387fe5 to de68c3a Compare April 17, 2022 12:44
Bart de Water added 3 commits April 17, 2022 08:54
Railties allow us to hook into Action View's loading process and do nothing if we're not in a Rails app.
Testing them in isolation is not trivial. The behaviour we setup here however is, and the core logic is already well tested, so skipping over a few lines of glue code is an acceptable tradeoff.
@bdewater bdewater force-pushed the parser-refactoring branch from de68c3a to 4b478f2 Compare April 17, 2022 12:54
@bdewater bdewater merged commit 22a5b62 into master Apr 17, 2022
@bdewater bdewater deleted the parser-refactoring branch April 17, 2022 13:48
@shopify-shipit shopify-shipit bot temporarily deployed to production April 17, 2022 14:39 Inactive
@shopify-shipit shopify-shipit bot temporarily deployed to production June 21, 2023 20:43 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants